The Yeppp! library is a collection of low-level functions optimized for modern processors using SIMD instructions.
- Available on all major platforms (see full list of platforms):
- Compatible with Windows, Linux, and Android.
- Maintains binary compatibility across Linux distributions.
- Supports x86, x86-64, ARM, and MIPS processors.
- Code once, high performance everywhere!
- Each library function has several versions optimized for different microarchitectures.
- The optimal function implementation is chosen at run time based on the processor microarchitecture and the available instruction extensions.
- Highly optimized vector mathematical functions (see benchmarks)
- High-performance polynomial evaluation functions (see benchmarks and usage example).
- Cross-platform access to CPU information:
- Detection of processor microarchitecture and instruction sets.
- Portable access to CPU cycle counter and high-resolution system timer.
- Supports reading CPU cycle counter on ARM and MIPS architectures.
- Easy to use right out of the box:
- Precompiled binaries for Windows, Linux, and Android.
- Comprehensive documentation (browse online).
- C and C++-compatible header files.
- Official bindings for Java (JNI), FORTRAN, and .NET.
- Usage examples for R.
- Free and open-source under a permissive license.
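The run-time selection described in the feature list can be illustrated with a small dispatch table. This is only a conceptual sketch in Python, not Yeppp! code: the kernel names (`add_generic`, `add_sse2`, `add_avx`), the feature strings, and `select_kernel` are all hypothetical, and the "optimized" kernels are stand-ins that compute the same result as the fallback.

```python
from typing import Callable, FrozenSet, List, Tuple

Vector = List[float]
AddKernel = Callable[[Vector, Vector], Vector]


def add_generic(x: Vector, y: Vector) -> Vector:
    """Portable fallback that needs no instruction extensions."""
    return [a + b for a, b in zip(x, y)]


def add_sse2(x: Vector, y: Vector) -> Vector:
    """Hypothetical stand-in for a kernel that requires SSE2."""
    return [a + b for a, b in zip(x, y)]


def add_avx(x: Vector, y: Vector) -> Vector:
    """Hypothetical stand-in for a kernel that requires AVX."""
    return [a + b for a, b in zip(x, y)]


# Candidates ordered from most to least specialized. Each entry pairs the
# instruction extensions a kernel needs with the kernel itself.
KERNELS: List[Tuple[FrozenSet[str], AddKernel]] = [
    (frozenset({"avx"}), add_avx),
    (frozenset({"sse2"}), add_sse2),
    (frozenset(), add_generic),
]


def select_kernel(cpu_features: FrozenSet[str]) -> AddKernel:
    """Return the first kernel whose requirements the CPU satisfies."""
    for required, kernel in KERNELS:
        if required <= cpu_features:  # subset test
            return kernel
    # Not reached: the generic kernel requires nothing.
    raise RuntimeError("no usable kernel")


# The choice is made once; later calls go straight to the chosen kernel.
add = select_kernel(frozenset({"sse2"}))
result = add([1.0, 2.0], [3.0, 4.0])
```

In this sketch the caller always uses `add` and never names a specific kernel, which is the sense in which the same code runs on every supported processor.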
Support
You are welcome to post questions or suggestions about the Yeppp! library to the yeppp mailing list.
Releases 
Next release: Beta 3 (library version 0.9.9.0). To be released in mid-June.
- Even faster polynomial evaluation function for Haswell (see Yeppp! vs GCC 4.8 vs Clang 3.3).
- Improved performance of vector exp function on Bulldozer and Haswell (see benchmark).
- Haswell-optimized version of log function (see benchmark)
- New software-pipelined implementations in yepCore module (Haswell versions included).
New features:
- Mac OS X support (x86 and x86-64 architectures).
- Additional data-processing functions: sum/max/min of all elements or of their absolute values, operations with constants (e.g. add a constant, subtract from a constant), and in-place operations.
Current release: Beta 2 (library version 0.9.8.0). Released on May 28, 2013; updated on May 29, 2013.
Improvements in vector functions:
- New function that evaluates a given polynomial on an array, with optimizations for all significant x64 microarchitectures (see benchmarks and usage example).
- Improved performance of vector sine function on AMD Bulldozer & Piledriver.
- Additional version of vector sine function with optimization for Haswell.
- Vector cosine function with optimizations for Nehalem, Bulldozer, Sandy Bridge, and Haswell.
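For readers unfamiliar with the operation, evaluating a polynomial on an array means computing the same polynomial at every element of the input. The sketch below uses Horner's rule, a standard textbook scheme, and is an assumption for illustration only: it does not claim to reflect how Yeppp! implements the function. The name `evaluate_polynomial` and the ascending-power coefficient order are likewise hypothetical choices.

```python
from typing import List, Sequence


def evaluate_polynomial(coefficients: Sequence[float],
                        xs: Sequence[float]) -> List[float]:
    """Evaluate c[0] + c[1]*x + c[2]*x**2 + ... at every x in xs.

    Coefficients are assumed to be in ascending order of power.
    """
    results = []
    for x in xs:
        acc = 0.0
        # Horner's rule: fold in coefficients from the highest power down,
        # using one multiply and one add per coefficient.
        for c in reversed(coefficients):
            acc = acc * x + c
        results.append(acc)
    return results


# 1 + 2x + 3x^2 evaluated at x = 0, 1, 2
values = evaluate_polynomial([1.0, 2.0, 3.0], [0.0, 1.0, 2.0])
```

Because each element is processed independently with the same sequence of multiply-add steps, this kind of loop is a natural fit for SIMD instructions.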
Improvements in language bindings:
- C# and FORTRAN bindings are now auto-generated and cover all functionality of the yepCore and yepMath modules.
- Improved Makefile for FORTRAN bindings to recognize both Intel and GNU compilers, and choose options accordingly.
Other improvements:
- Windows builds now have version information built-in
- Android builds and most Linux builds have separate files with library binary and debug information.
Previous release: Beta 1 (library version 0.9.7.0). Released on May 19, 2013; updated on May 20, 2013.
- Super fast exponential and logarithm functions (see benchmarks) optimized for all significant x64 architectures (Haswell (exp only), Sandy Bridge, Nehalem, Bulldozer, K10, Bobcat).
- Sin and Tan functions.
- Builds for ARM, MIPS, and Android/x86 are provided again.
- Project switched to the 3-clause BSD license for compatibility with other open-source code that might be used in future versions of Yeppp!
Previous release: Technology Preview 6 (library version 0.9.6.1). Released on April 30, 2013.
- New versions of vector exponent and logarithm functions optimized for AMD Bulldozer/Piledriver and Intel Sandy Bridge/Ivy Bridge.
- Refactored initialization code.
- Debug information files.
- Simplified building the library with new build system.
- Functions to retrieve cleanly formatted human-readable processor name (e.g. "Intel Xeon X5550", "AMD FX-6300", or "Samsung Exynos 5250").
- New yepAtomic subsystem with functions for atomic operations.
- New yepRandom subsystem for random number generation.
- New functions for reading energy and power counters on Sandy Bridge and Ivy Bridge.
- Support for Xeon Phi architecture
- Binaries for bindings are not provided with this release
- Binaries for ARM, MIPS, and Android/x86 are not provided
- No VERSIONINFO resource in Windows binaries
- Bug in dot product kernel on Sandy Bridge
Previous release: Technology Preview 5 (library version 0.9.4.0). Released on February 22, 2013.
Improved language bindings:
- All functions (including the yepLibrary module) are now accessible from Java.
- Complete documentation for Java bindings (browse online)
- Initial release of CLR (.Net/Mono via P/Invoke) and FORTRAN 2003 bindings.
- Examples of using Java and CLR bindings:
- All examples are ported to Java
- Entropy example ported to C#
- Vector exponent function added (with optimized kernel for Nehalem)
- Java bindings now allow processing an arbitrary slice of an array
Previous release: Technology Preview 4 (library version 0.9.3.0). Released on February 8, 2013.
- Added Makefile for R bindings
- Added ability to directly read processor cycle counter on ARM systems with enabled user-mode access to performance counters.
- The kernel-mode driver for Linux, which enables user-mode access to performance counters and configures the processor cycle counter, is now included with the library.
- Fixed build bugs which caused incompatibility issues in arm-linux-hardeabi-v7a, arm-linux-softeabi-v5t, arm-linux-softeabi-android and mips-linux-o32-android targets
- Fixed bug in access to processor cycle counter on MIPS
Previous release: Technology Preview 3 (library version 0.9.2.0). Released on February 6, 2013.
This release includes sources for all library components:
- Yeppp! library (optimized kernels, CPU dispatcher, and CPU information retrieval)
- PEACH-Py code-generator (generates kernels and support code)
- Yeppp! build framework (builds the library; unusable in this release)
- Yeppp! runtime (support functions typically provided by compiler runtime; uses MIT license)
- Bindings for Java, D, and R
- Build scripts
- Two new ARM platforms are supported:
- arm-linux-softeabi-android corresponds to "armeabi" in Android NDK
- arm-linux-softeabi-v5t provides binaries compatible with soft-float ARM EABI on Linux (e.g. Debian armel port)
- Sum-of-squares function (with optimized kernels for Nehalem, Sandy Bridge, and Bulldozer)
- Multiply-add vector function
- Additional optimized kernels
- Implemented detection of microarchitecture for quad-core Qualcomm Krait cores.
- Implemented detection of microarchitecture for Intel Cloverview cores (new Atom for Tablets).
- Fixed erroneous detection of Piledriver cores as Bulldozer cores.
- Implemented detection of MIPS R2, MIPS Paired-Single and MIPS DSP R2 extensions.
- Improved detection of VFPv3-D32 on ARM to work properly on misconfigured Linux kernels.
- Documentation lists optimized implementations for every function.
- Linux and Android binaries now provide version information in .version ELF section
Note: this version is not binary compatible with previous releases
Previous release: Technology Preview 2 (library version 0.9.1.0). Released on January 7, 2013.
- Vector logarithm function
- Implementation is optimized for Nehalem microarchitecture.
Limitations:
- Only x64 binaries for Windows and Linux are provided with this release
Previous release: Technology Preview 1 (library version 0.9.0.0). Released on November 28, 2012.
- Detection of processor vendor, architecture, microarchitecture, instruction set extensions, and other extended features.
- Cross-platform access to processor cycle counter and high-performance system timer.
- Five types of vector operations: addition, subtraction, multiplication, negation, and dot product.
- Addition, subtraction, dot product, and most multiplication functions have versions optimized for Nehalem on x64 architecture (both Linux and Windows).
- These optimized implementations are also used for other x64 microarchitectures when the required instruction sets are available.
- Windows: x86 and x64 versions.
- Linux: x86, x64, and ARM (gnueabihf) versions.
- Android: x86, ARM (armeabiv7a), and MIPS versions.
Author
Marat Dukhan