Agenium Scale is an innovative company located at the Plateau de Saclay in France (European Silicon Valley) that provides software solutions for high performance computing and complex systems. Its expertise lies in various business areas as well as in the entire software development chain and includes knowledge of processors and computation architectures. At Scale we work in all business areas that require computations: automotive, space, rail, finance, telecoms, aeronautics, defence.
Computations are found in many technical areas such as:
We propose a wide variety of services including:
We also write libraries that help HPC engineers write well-optimized programs. You can find them on GitHub. Our main library is NSIMD, a vectorization library that abstracts SIMD programming. It was designed to exploit the full power of processors at a low development cost. NSIMD provides C89, C++98, C++11 and C++14 APIs, all of which allow writing generic code. Binary compatibility is guaranteed because only a C ABI is exposed: the C++ API merely wraps the C calls.
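To make the "C ABI plus thin C++ wrapper" idea concrete, here is a minimal sketch of the pattern. It is not NSIMD code: the names `demo_add_f32`, `demo_add_f64`, `demo::add` and `add_twice` are hypothetical, and the loops are plain scalar code standing in for whatever a real library would do internally.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical C ABI layer: unmangled symbols, one per element type.
// Only these symbols would be exported by the binary.
extern "C" {
void demo_add_f32(const float *a, const float *b, float *out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}
void demo_add_f64(const double *a, const double *b, double *out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}
}

// Hypothetical C++ layer: header-only overloads that only forward to C calls.
namespace demo {
inline void add(const float *a, const float *b, float *out, std::size_t n) {
  assert(a != nullptr && b != nullptr && out != nullptr);
  demo_add_f32(a, b, out, n);
}
inline void add(const double *a, const double *b, double *out, std::size_t n) {
  assert(a != nullptr && b != nullptr && out != nullptr);
  demo_add_f64(a, b, out, n);
}
} // namespace demo

// Generic user code: written once, resolved to the right C symbol per type.
template <typename T>
void add_twice(const T *a, const T *b, T *out, std::size_t n) {
  demo::add(a, b, out, n);   // out = a + b
  demo::add(out, b, out, n); // out = a + 2 * b
}
```

Because the C++ overloads live entirely in the header and the compiled code only exposes C symbols, a caller built with a different C++ compiler or standard version would still link against the same binary in this sketch.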
The list of supported SIMD instruction sets follows:
Support for the following architectures is on the way:
Part of the library is open source on GitHub (https://github.com/agenium-scale/nsimd) and can be downloaded and tested freely thanks to its MIT license.
A small part of it is a proprietary binary priced at 49.90 €/user, which can be purchased at https://store.agenium-scale.com/en/. It contains, among other things:
We have integrated NSIMD into GROMACS to demonstrate its potential. GROMACS is a versatile package for molecular dynamics, i.e. for simulating the Newtonian equations of motion for systems with hundreds to millions of particles. It is heavily used in the HPC community to benchmark supercomputers and has become a reference in this area.
As GROMACS is already fully optimized software, our goal is to obtain similar running times, and we do! This also proves the claims of NSIMD, namely low development cost for high performance and portability. We have replaced nearly 11,000 lines of GROMACS code with 4,700 lines of NSIMD code.
We also work for the French Army and use NSIMD as the base library for our neural network inference engine. Its C++ API allows us to write each layer kernel once and obtain better performance than Caffe on Intel workstations and Arm mobile devices (such as smartphones). We speed up neural networks using quantization and fixed-point arithmetic, both of which are supported by NSIMD.
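As an illustration of what quantization to fixed-point arithmetic means, here is a minimal scalar sketch. It does not use NSIMD or reproduce our inference engine: the helper names `to_fixed`, `to_float` and `dot_fixed`, and the choice of a signed 16-bit format, are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Quantize a float to signed 16-bit fixed point with `frac_bits` fractional
// bits, rounding to nearest and saturating on overflow.
inline std::int16_t to_fixed(float x, int frac_bits) {
  assert(frac_bits >= 0 && frac_bits <= 14);
  const float scaled = std::round(x * static_cast<float>(1 << frac_bits));
  if (scaled > 32767.0f) return 32767;
  if (scaled < -32768.0f) return -32768;
  return static_cast<std::int16_t>(scaled);
}

// Convert a fixed-point value with `frac_bits` fractional bits back to float.
inline float to_float(std::int64_t v, int frac_bits) {
  assert(frac_bits >= 0 && frac_bits <= 30);
  return static_cast<float>(v) / static_cast<float>(1 << frac_bits);
}

// Integer-only dot product. Each product carries twice the fractional bits
// of its operands, so the result must be read back with 2 * frac_bits.
inline std::int64_t dot_fixed(const std::int16_t *a, const std::int16_t *b,
                              std::size_t n) {
  std::int64_t acc = 0; // wide accumulator to avoid overflow
  for (std::size_t i = 0; i < n; ++i) {
    acc += static_cast<std::int64_t>(a[i]) * static_cast<std::int64_t>(b[i]);
  }
  return acc;
}
```

With 8 fractional bits, weights and activations are quantized once with `to_fixed(x, 8)`, the inner loop runs on integers only, and the accumulated result is converted back with `to_float(acc, 16)`.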
We have repeatedly encountered very well-optimized code written for one specific CPU using its vendor-specific API. This becomes a problem when upgrading hardware, even from the same vendor. Many people buy newer Xeons, which are AVX-512 capable, but have written their code for older Xeons that only support AVX. That is where our translator comes into play. It is a clang-based program that takes your C/C++ code as input and tracks down all vendor-specific code. The output is C/C++ code in which calls to vendor APIs have been replaced by portable NSIMD code. This program saves you roughly 80% of the translation time. The resulting code is portable and uses the latest SIMD capabilities.
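To show the kind of input such a translation targets, here is a small sketch of hand-written vendor-specific code. It is an illustration only, not output of the translator: the function name `add_arrays` is hypothetical, and the only vendor calls used are the standard Intel AVX intrinsics `_mm256_loadu_ps`, `_mm256_add_ps` and `_mm256_storeu_ps`.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h> // Intel intrinsics, only available on x86 targets
#endif

// Element-wise addition of two float arrays.
// The AVX branch hard-codes one vendor API and one register width (8 floats):
// it gains nothing on an AVX-512 machine and does not compile for Arm without
// the preprocessor guard.
inline void add_arrays(const float *a, const float *b, float *out,
                       std::size_t n) {
  assert(a != nullptr && b != nullptr && out != nullptr);
  std::size_t i = 0;
#if defined(__AVX__)
  for (; i + 8 <= n; i += 8) {
    const __m256 va = _mm256_loadu_ps(a + i);
    const __m256 vb = _mm256_loadu_ps(b + i);
    _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
  }
#endif
  // Scalar loop: handles the tail, or everything when AVX is unavailable.
  for (; i < n; ++i) out[i] = a[i] + b[i];
}
```

Every kernel written this way has to be revisited by hand for each new instruction set, which is the maintenance cost a portable SIMD layer is meant to remove.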