eigen

CFD/eigen

Rasmus Munk Larsen d5e2ec7447 Speed up tensor FFT by up ~25-50%.		2016-02-19 16:29:23 -08:00

Benchmark                        Base (ns)   New (ns)  Improvement
------------------------------------------------------------------
BM_tensor_fft_single_1D_cpu/8          132        134        -1.5%
BM_tensor_fft_single_1D_cpu/9         1162       1229        -5.8%
BM_tensor_fft_single_1D_cpu/16         199        195        +2.0%
BM_tensor_fft_single_1D_cpu/17        2587       2267       +12.4%
BM_tensor_fft_single_1D_cpu/32         373        341        +8.6%
BM_tensor_fft_single_1D_cpu/33        5922       4879       +17.6%
BM_tensor_fft_single_1D_cpu/64         797        675       +15.3%
BM_tensor_fft_single_1D_cpu/65       13580      10481       +22.8%
BM_tensor_fft_single_1D_cpu/128       1753       1375       +21.6%
BM_tensor_fft_single_1D_cpu/129      31426      22789       +27.5%
BM_tensor_fft_single_1D_cpu/256       4005       3008       +24.9%
BM_tensor_fft_single_1D_cpu/257      70910      49549       +30.1%
BM_tensor_fft_single_1D_cpu/512       8989       6524       +27.4%
BM_tensor_fft_single_1D_cpu/513     165402     107751       +34.9%
BM_tensor_fft_single_1D_cpu/999     198293     115909       +41.5%
BM_tensor_fft_single_1D_cpu/1ki      21289      14143       +33.6%
BM_tensor_fft_single_1D_cpu/1k      361980     233355       +35.5%
BM_tensor_fft_double_1D_cpu/8          138        131        +5.1%
BM_tensor_fft_double_1D_cpu/9         1253       1133        +9.6%
BM_tensor_fft_double_1D_cpu/16         218        200        +8.3%
BM_tensor_fft_double_1D_cpu/17        2770       2392       +13.6%
BM_tensor_fft_double_1D_cpu/32         406        368        +9.4%
BM_tensor_fft_double_1D_cpu/33        6418       5153       +19.7%
BM_tensor_fft_double_1D_cpu/64         856        728       +15.0%
BM_tensor_fft_double_1D_cpu/65       14666      11148       +24.0%
BM_tensor_fft_double_1D_cpu/128       1913       1502       +21.5%
BM_tensor_fft_double_1D_cpu/129      36414      24072       +33.9%
BM_tensor_fft_double_1D_cpu/256       4226       3216       +23.9%
BM_tensor_fft_double_1D_cpu/257      86638      52059       +39.9%
BM_tensor_fft_double_1D_cpu/512       9397       6939       +26.2%
BM_tensor_fft_double_1D_cpu/513     203208     114090       +43.9%
BM_tensor_fft_double_1D_cpu/999     237841     125583       +47.2%
BM_tensor_fft_double_1D_cpu/1ki      20921      15392       +26.4%
BM_tensor_fft_double_1D_cpu/1k      455183     250763       +44.9%
BM_tensor_fft_single_2D_cpu/8         1051       1005        +4.4%
BM_tensor_fft_single_2D_cpu/9        16784      14837       +11.6%
BM_tensor_fft_single_2D_cpu/16        4074       3772        +7.4%
BM_tensor_fft_single_2D_cpu/17       75802      63884       +15.7%
BM_tensor_fft_single_2D_cpu/32       20580      16931       +17.7%
BM_tensor_fft_single_2D_cpu/33      345798     278579       +19.4%
BM_tensor_fft_single_2D_cpu/64       97548      81237       +16.7%
BM_tensor_fft_single_2D_cpu/65     1592701    1227048       +23.0%
BM_tensor_fft_single_2D_cpu/128     472318     384303       +18.6%
BM_tensor_fft_single_2D_cpu/129    7038351    5445308       +22.6%
BM_tensor_fft_single_2D_cpu/256    2309474    1850969       +19.9%
BM_tensor_fft_single_2D_cpu/257   31849182   23797538       +25.3%
BM_tensor_fft_single_2D_cpu/512   10395194    8077499       +22.3%
BM_tensor_fft_single_2D_cpu/513  144053843  104242541       +27.6%
BM_tensor_fft_single_2D_cpu/999  279885833  208389718       +25.5%
BM_tensor_fft_single_2D_cpu/1ki   45967677   36070985       +21.5%
BM_tensor_fft_single_2D_cpu/1k   619727095  456489500       +26.3%
BM_tensor_fft_double_2D_cpu/8         1110       1016        +8.5%
BM_tensor_fft_double_2D_cpu/9        17957      15768       +12.2%
BM_tensor_fft_double_2D_cpu/16        4558       4000       +12.2%
BM_tensor_fft_double_2D_cpu/17       79237      66901       +15.6%
BM_tensor_fft_double_2D_cpu/32       21494      17699       +17.7%
BM_tensor_fft_double_2D_cpu/33      357962     290357       +18.9%
BM_tensor_fft_double_2D_cpu/64      105179      87435       +16.9%
BM_tensor_fft_double_2D_cpu/65     1617143    1288006       +20.4%
BM_tensor_fft_double_2D_cpu/128     512848     419397       +18.2%
BM_tensor_fft_double_2D_cpu/129    7271322    5636884       +22.5%
BM_tensor_fft_double_2D_cpu/256    2415529    1922032       +20.4%
BM_tensor_fft_double_2D_cpu/257   32517952   24462177       +24.8%
BM_tensor_fft_double_2D_cpu/512   10724898    8287617       +22.7%
BM_tensor_fft_double_2D_cpu/513  146007419  108603266       +25.6%
BM_tensor_fft_double_2D_cpu/999  296351330  221885776       +25.1%
BM_tensor_fft_double_2D_cpu/1ki   59334166   48357539       +18.5%
BM_tensor_fft_double_2D_cpu/1k   666660132  483840349       +27.4%

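
The Improvement column in the benchmark table of the latest commit appears to be the relative reduction in run time, (Base - New) / Base, expressed as a percentage with one decimal place. The commit message does not state the formula, so this is an inference from the figures. A minimal sketch of that arithmetic follows; `improvement_percent` is a hypothetical helper written for illustration, not part of Eigen or its benchmark tooling.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of Eigen): relative reduction in run time,
// in percent, rounded to one decimal place. A positive result means the new
// code is faster than the baseline; a negative result means it is slower.
inline double improvement_percent(double base_ns, double new_ns) {
  assert(base_ns > 0.0);  // a baseline of zero would divide by zero
  const double pct = 100.0 * (base_ns - new_ns) / base_ns;
  return std::round(pct * 10.0) / 10.0;
}
```

For example, `improvement_percent(132, 134)` evaluates to -1.5 and `improvement_percent(2587, 2267)` to 12.4, which agrees with the `BM_tensor_fft_single_1D_cpu/8` and `/17` rows.
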
bench	Add COD and BDCSVD in list of benched solvers.	2016-02-19 23:00:33 +01:00
blas	bug #1152: Fix data race in static initialization of blas	2016-01-26 11:44:16 -05:00
cmake	bug #1120: Make sure that SuperLU version is checked	2015-12-16 11:37:16 +01:00
debug	Make gdb pretty printer Python3-compatible (bug #800).	2014-04-28 14:10:22 +01:00
demos	Fixed compilation error due to obsolete internal::abs and internal::sqrt function calls	2014-03-26 22:02:48 -04:00
doc	Import wiki's paragraph: "I disabled vectorization, but I'm still getting annoyed about alignment issues"	2016-02-12 22:16:59 +01:00
Eigen	merge	2016-02-19 23:01:27 +01:00
failtest	Add unit tests for bug #981: valid and invalid usage of ternary operator	2015-09-09 11:38:25 +02:00
lapack	Removed deprecated header (unsupported/Eigen/BDCSVD is included in Eigen/SVD now)	2014-10-29 17:51:14 +01:00
scripts	Use @CMAKE_MAKE_PROGRAM@ instead of make in buildtests.sh	2015-02-28 16:51:53 +01:00
test	Extend unit test to stress smart_copy with empty input/output.	2016-02-19 22:59:28 +01:00
unsupported	Speed up tensor FFT by up ~25-50%.	2016-02-19 16:29:23 -08:00
.hgeol	Added a pattern which forces LF line endings for *.sh files.	2013-07-31 18:20:58 +02:00
.hgignore	Ignore automatically imported lapack source files	2014-10-17 15:34:39 +02:00
CMakeLists.txt	Improve handling of deprecated EIGEN_INCLUDE_INSTALL_DIR variable	2015-12-10 15:47:06 +01:00
COPYING.BSD	Intel(R) MKL support added.	2011-12-05 14:52:21 +07:00
COPYING.GPL	there's no reason why we should follow the FSF's stupid recommendation for the naming of these files, right? This could give the wrong impression that Eigen is only GPL-licensed.	2009-11-14 23:26:07 -05:00
COPYING.LGPL	Replace COPYING.LGPL by a copy of the LGPL 2.1 (instead of LGPL 3).	2012-09-10 13:27:44 -04:00
COPYING.MINPACK	add COPYING.MINPACK	2012-07-15 11:46:22 -04:00
COPYING.MPL2	add COPYING.MPL2	2012-07-15 10:20:59 -04:00
COPYING.README	Replace COPYING.LGPL by a copy of the LGPL 2.1 (instead of LGPL 3).	2012-09-10 13:27:44 -04:00
CTestConfig.cmake	swap 3.2 <-> default CTestConfig.cmake file	2014-03-05 10:07:44 +01:00
CTestCustom.cmake.in	Reduce maximum number of warnings/errors. (they took GBs even for limited period of time)	2013-06-20 17:39:15 +02:00
eigen3.pc.in	Further fixes for CMAKE_INSTALL_PREFIX correctness	2015-11-07 21:29:24 -05:00
INSTALL	finally, the right fix: set CTEST_BUILD_TARGET.	2009-10-04 20:27:44 -04:00
README.md	README.md edited online with Bitbucket	2014-05-21 14:08:04 +00:00
signature_of_eigen3_matrix_library	improve the scripts for building unit tests:	2009-11-25 21:26:37 -05:00

README.md

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

For more information go to http://eigen.tuxfamily.org/.