Yeppp!
About the project
Yeppp! library is a collection of low-level functions optimized for modern processors using SIMD instructions.
  • Available on all major platforms (see full list of platforms):
    • Compatible with Windows, Linux, and Android.
      • Maintains binary compatibility across Linux distributions.
    • Supports x86, x86-64, ARM, and MIPS processors.
  • Code once, high performance everywhere!
    • Each library function has several versions optimized for different microarchitectures.
    • The optimal function implementation is chosen at run time based on the processor microarchitecture and the available instruction set extensions.
    • Highly optimized vector mathematical functions (see benchmarks).
    • High-performance polynomial evaluation functions (see benchmarks and usage example).
  • Cross-platform access to CPU information:
    • Detection of processor microarchitecture and instruction sets.
    • Portable access to CPU cycle counter and high-resolution system timer.
      • Supports reading CPU cycle counter on ARM and MIPS architectures.
  • Easy to use right out of the box:
    • Precompiled binaries for Windows, Linux, and Android.
    • Comprehensive documentation (browse online).
    • C and C++-compatible header files.
    • Official bindings for Java (JNI), FORTRAN, and .NET.
    • Usage examples for R.
  • Free and open-source under a permissive license.
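The run-time selection of a function implementation mentioned in the list above can be pictured as a function pointer that is resolved from the detected processor features. The following is only a minimal sketch under stated assumptions, not the Yeppp! dispatcher: the names FEATURE_WIDE_SIMD, add_generic, add_unrolled, and select_add_kernel are hypothetical, and the "optimized" kernel is a plain unrolled loop standing in for a SIMD version.

```c
#include <stddef.h>

/* Hypothetical feature bits; a real library would derive them from the CPU. */
enum { FEATURE_NONE = 0, FEATURE_WIDE_SIMD = 1 };

/* Common signature shared by every version of the kernel. */
typedef void (*add_kernel_fn)(const double *x, const double *y,
                              double *sum, size_t n);

/* Baseline version that runs on any processor. */
void add_generic(const double *x, const double *y, double *sum, size_t n) {
    for (size_t i = 0; i < n; i++) {
        sum[i] = x[i] + y[i];
    }
}

/* Stand-in for a microarchitecture-specific version:
   main loop unrolled by four, followed by a scalar tail. */
void add_unrolled(const double *x, const double *y, double *sum, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        sum[i + 0] = x[i + 0] + y[i + 0];
        sum[i + 1] = x[i + 1] + y[i + 1];
        sum[i + 2] = x[i + 2] + y[i + 2];
        sum[i + 3] = x[i + 3] + y[i + 3];
    }
    for (; i < n; i++) {
        sum[i] = x[i] + y[i];
    }
}

/* Choose the kernel once, based on the (hypothetical) feature mask. */
add_kernel_fn select_add_kernel(unsigned features) {
    return (features & FEATURE_WIDE_SIMD) ? add_unrolled : add_generic;
}
```

A caller would resolve the pointer once, for example `add_kernel_fn add = select_add_kernel(features);`, and then call `add(x, y, sum, n)` without caring which version was picked. Every version must produce the same results, so the choice affects only speed.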
Support

You are welcome to post questions or suggestions about the Yeppp! library on the yeppp mailing list.

Releases

Next release: Beta 3 (library version 0.9.9.0).
To be released in mid-June.
Improvements:
  • Even faster polynomial evaluation function for Haswell (see Yeppp! vs GCC 4.8 vs Clang 3.3).
  • Improved performance of vector exp function on Bulldozer and Haswell (see benchmark).
  • Haswell-optimized version of the log function (see benchmark).
  • New software-pipelined implementations in yepCore module (Haswell versions included).
New features:
  • Mac OS X support (x86 and x86-64 architectures).
  • Additional data-processing functions: sum/max/min of all elements or of their absolute values, operations with constants (e.g. add a constant, subtract from a constant), and in-place operations.
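As a rough illustration of what the data-processing functions listed above compute, here are scalar reference loops for a few of them. This is a sketch only: the function names are hypothetical and are not the Yeppp! API, and the library versions are expected to be vectorized rather than written this way.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of all elements. */
double sum_f64(const double *x, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) acc += x[i];
    return acc;
}

/* Sum of the absolute values of all elements. */
double sum_abs_f64(const double *x, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) acc += (x[i] < 0.0) ? -x[i] : x[i];
    return acc;
}

/* Largest element; the array must not be empty. */
double max_f64(const double *x, size_t n) {
    assert(n > 0);
    double best = x[0];
    for (size_t i = 1; i < n; i++) if (x[i] > best) best = x[i];
    return best;
}

/* Smallest element; the array must not be empty. */
double min_f64(const double *x, size_t n) {
    assert(n > 0);
    double best = x[0];
    for (size_t i = 1; i < n; i++) if (x[i] < best) best = x[i];
    return best;
}

/* In-place "add constant": x[i] = x[i] + c. */
void add_const_inplace_f64(double *x, double c, size_t n) {
    for (size_t i = 0; i < n; i++) x[i] += c;
}

/* In-place "subtract from constant": x[i] = c - x[i]. */
void sub_from_const_inplace_f64(double *x, double c, size_t n) {
    for (size_t i = 0; i < n; i++) x[i] = c - x[i];
}
```

The in-place variants overwrite their input, which saves a separate output array when the original values are no longer needed.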
Current release: Beta 2 (library version 0.9.8.0).
Released on May 28, 2013; updated on May 29, 2013.
Improvements in vector functions:
  • New function for evaluating a given polynomial on an array of points, optimized for all significant x64 microarchitectures (see benchmarks and usage example).
  • Improved performance of vector sine function on AMD Bulldozer & Piledriver.
  • Additional version of vector sine function with optimization for Haswell.
  • Vector cosine function with optimizations for Nehalem, Bulldozer, Sandy Bridge, and Haswell.
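For readers unfamiliar with polynomial evaluation on an array, the operation can be sketched with Horner's scheme, which needs one multiplication and one addition per coefficient. This is a plain scalar reference, not the optimized Yeppp! implementation. The name evaluate_polynomial and the coefficient order (lowest degree first) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* For every x[i], computes
       y[i] = coef[0] + coef[1]*x[i] + ... + coef[m-1]*x[i]^(m-1)
   where m = coef_count, using Horner's scheme. */
void evaluate_polynomial(const double *coef, size_t coef_count,
                         const double *x, double *y, size_t n) {
    assert(coef_count > 0);
    for (size_t i = 0; i < n; i++) {
        /* Start from the highest-degree coefficient and fold downwards. */
        double acc = coef[coef_count - 1];
        for (size_t k = coef_count - 1; k-- > 0; ) {
            acc = acc * x[i] + coef[k];
        }
        y[i] = acc;
    }
}
```

For example, with coefficients {1, 2, 3} the function evaluates 1 + 2x + 3x^2 at each point. Because every point is processed independently, the outer loop is a natural candidate for SIMD and software pipelining.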
Improvements in language bindings:
  • C# and FORTRAN bindings are now auto-generated and cover all functionality of yepCore and yepMath modules.
  • Improved Makefile for FORTRAN bindings to recognize both Intel and GNU compilers, and choose options accordingly.
Other improvements:
  • Windows builds now have version information built in.
  • Android builds and most Linux builds have separate files with library binary and debug information.
Previous release: Beta 1 (library version 0.9.7.0).
Released on May 19, 2013; updated on May 20, 2013.
Improvements:
  • Very fast exponential and logarithm functions (see benchmarks) optimized for all significant x64 microarchitectures: Haswell (exp only), Sandy Bridge, Nehalem, Bulldozer, K10, and Bobcat.
  • Sin and Tan functions.
  • Builds for ARM, MIPS, and Android/x86 are provided again.
Other changes:
  • The project switched to the 3-clause BSD license for compatibility with other open-source code that might be used in future versions of Yeppp!
Previous release: Technology Preview 6 (library version 0.9.6.1).
Released on April 30, 2013.
Improvements:
  • New versions of vector exponent and logarithm functions optimized for AMD Bulldozer/Piledriver and Intel Sandy Bridge/Ivy Bridge.
  • Refactored initialization code.
  • Debug information files.
  • Simplified building the library with new build system.
New functionality:
  • Functions to retrieve cleanly formatted human-readable processor name (e.g. "Intel Xeon X5550", "AMD FX-6300", or "Samsung Exynos 5250").
  • New yepAtomic subsystem with functions for atomic operations.
  • New yepRandom subsystem for random number generation.
  • New functions for reading energy and power counters on Sandy Bridge and Ivy Bridge.
  • Support for the Xeon Phi architecture.
Regressions:
  • Binaries for bindings are not provided with this release.
  • Binaries for ARM, MIPS, and Android/x86 are not provided.
  • No VERSIONINFO resource in Windows binaries.
  • Bug in the dot product kernel on Sandy Bridge.
Previous release: Technology Preview 5 (library version 0.9.4.0).
Released on February 22, 2013.
Improved language bindings:
  • All functions (including the yepLibrary module) are now accessible from Java.
  • Complete documentation for Java bindings (browse online).
  • Initial release of CLR (.NET/Mono via P/Invoke) and FORTRAN 2003 bindings.
  • Examples of using Java and CLR bindings:
    • All examples are ported to Java.
    • Entropy example ported to C#.
New functionality:
  • Vector exponent function added (with an optimized kernel for Nehalem).
  • Java bindings now allow processing an arbitrary slice of an array.
Previous release: Technology Preview 4 (library version 0.9.3.0).
Released on February 8, 2013.
New functionality:
  • Added a Makefile for R bindings.
  • Added ability to directly read processor cycle counter on ARM systems with enabled user-mode access to performance counters.
    • A Linux kernel-mode driver that enables user-mode access to the performance counters and configures the processor cycle counter is now included with the library.
Bug fixes:
  • Fixed build bugs that caused incompatibility issues in the arm-linux-hardeabi-v7a, arm-linux-softeabi-v5t, arm-linux-softeabi-android, and mips-linux-o32-android targets.
  • Fixed a bug in access to the processor cycle counter on MIPS.

Previous release: Technology Preview 3 (library version 0.9.2.0).
Released on February 6, 2013.
This release includes sources for all library components:
  • Yeppp! library (optimized kernels, CPU dispatcher, and CPU information retrieval)
  • PEACH-Py code-generator (generates kernels and support code)
  • Yeppp! build framework (builds the library; unusable in this release)
  • Yeppp! runtime (support functions typically provided by compiler runtime; uses MIT license)
  • Bindings for Java, D, and R
  • Build scripts
New functionality:
  • Two new ARM platforms are supported:
    • arm-linux-softeabi-android corresponds to "armeabi" in Android NDK
    • arm-linux-softeabi-v5t provides binaries compatible with soft-float ARM EABI on Linux (e.g. Debian armel port)
  • Sum-of-squares function (with optimized kernels for Nehalem, Sandy Bridge, and Bulldozer)
  • Multiply-add vector function
  • Additional optimized kernels
Improved CPU detection:
  • Implemented detection of microarchitecture for quad-core Qualcomm Krait cores.
  • Implemented detection of microarchitecture for Intel Cloverview cores (new Atom for Tablets).
  • Fixed erroneous detection of Piledriver cores as Bulldozer cores.
  • Implemented detection of MIPS R2, MIPS Paired-Single and MIPS DSP R2 extensions.
  • Improved detection of VFPv3-D32 on ARM to work properly on misconfigured Linux kernels.
Other improvements:
  • Documentation lists optimized implementations for every function.
  • Linux and Android binaries now provide version information in .version ELF section.
Note: this version is not binary compatible with previous releases.

Previous release: Technology Preview 2 (library version 0.9.1.0).
Released on January 7, 2013.
New functionality:
  • Vector logarithm function
    • Implementation is optimized for Nehalem microarchitecture.
Limitations:
  • Only x64 binaries for Windows and Linux are provided with this release.
Previous release: Technology Preview 1 (library version 0.9.0.0).
Released on November 28, 2012.
Supported functionality:
  • Detection of processor vendor, architecture, microarchitecture, instruction set extensions, and other extended features.
  • Cross-platform access to processor cycle counter and high-performance system timer.
  • Five types of vector operations: addition, subtraction, multiplication, negation, and dot product.
    • Addition, subtraction, dot product, and most multiplication functions have versions optimized for Nehalem on x64 architecture (both Linux and Windows).
    • These optimized implementations are also used for other x64 microarchitectures when the required instruction sets are available.
Supported platforms:
  • Windows: x86 and x64 versions.
  • Linux: x86, x64, and ARM (gnueabihf) versions.
  • Android: x86, ARM (armeabiv7a), and MIPS versions.

Author
Marat Dukhan