2. Simplify handling of special cases by taking advantage of the fact that the builtin vrsqrt approximation handles negative, zero and +inf arguments correctly. This speeds up the SSE and AVX implementations by ~20%.
3. Make the Newton-Raphson formula used for rsqrt more numerically robust:
   Before: y = y * (1.5 - x/2 * y^2)
   After: y = y * (1.5 - y * (x/2) * y)
   Forming y^2 can overflow for very small (denormalized) values of x and underflow for very large ones, while the intermediate product y * (x/2) ~= sqrt(x)/2 stays well within range. For AVX512, this makes it possible to compute accurate results for denormal inputs down to ~1e-42 in single precision.
4. Add a faster double precision implementation for Knights Landing using the vrsqrt28 instruction and a single Newton-Raphson iteration.

Benchmark results: https://bitbucket.org/snippets/rmlarsen/5LBq9o

| | | |
|---|---|---|
| bench | ||
| blas | ||
| cmake | ||
| debug | ||
| demos | ||
| doc | ||
| Eigen | ||
| failtest | ||
| lapack | ||
| scripts | ||
| test | ||
| unsupported | ||
| .hgeol | ||
| .hgignore | ||
| CMakeLists.txt | ||
| COPYING.BSD | ||
| COPYING.GPL | ||
| COPYING.LGPL | ||
| COPYING.MINPACK | ||
| COPYING.MPL2 | ||
| COPYING.README | ||
| CTestConfig.cmake | ||
| CTestCustom.cmake.in | ||
| eigen3.pc.in | ||
| INSTALL | ||
| README.md | ||
| signature_of_eigen3_matrix_library | ||
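
The reordering of the Newton-Raphson step described in the commit message can be illustrated with a scalar sketch. This is not Eigen's packet code: the function names are hypothetical, and `rsqrt_estimate` is only a stand-in for a hardware estimate such as vrsqrt, emulated here with a deliberate relative error of 2^-12. It assumes IEEE-754 single precision with denormals enabled and no fast-math or excess-precision evaluation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a hardware reciprocal-sqrt estimate:
// the exact value computed in double, perturbed by a relative error of 2^-12.
inline float rsqrt_estimate(float x) {
  assert(x > 0.0f);
  const double exact = 1.0 / std::sqrt(static_cast<double>(x));
  return static_cast<float>(exact * (1.0 + 1.0 / 4096.0));
}

// "Before" ordering: forms y^2 explicitly, which overflows to +inf
// when x is tiny (y is huge).
inline float rsqrt_step_before(float x, float y) {
  const float y2 = y * y;
  return y * (1.5f - (0.5f * x) * y2);
}

// "After" ordering: multiplies y by x/2 first, so the intermediate
// is about sqrt(x)/2 and the final product is about 0.5.
inline float rsqrt_step_after(float x, float y) {
  const float t = y * (0.5f * x);
  return y * (1.5f - t * y);
}
```

Under those assumptions, calling both steps with the denormal input `1e-42f` and the estimate above should show the difference: the "before" ordering produces a non-finite result because `y * y` exceeds the largest float, while the "after" ordering stays close to `1/sqrt(x)`. For ordinary inputs such as `4.0f` the two orderings agree to within rounding.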
Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
For more information go to http://eigen.tuxfamily.org/.
For pull requests, please use only the official repository at https://bitbucket.org/eigen/eigen.
For bug reports and feature requests go to http://eigen.tuxfamily.org/bz.