e9e5261664Fix a couple issues introduced in the previous commit: * removed DirectAccessBit from Part * use a template specialization in inverseProduct() to transform a Part xpr to a Flagged xpr
Gael Guennebaud
2008-07-26 23:05:44 +0000
e77ccf2928* Rewrite the triangular solver so that we can take advantage of our efficient matrix-vector products: => up to 6 times faster ! * Added DirectAccessBit to Part * Added an example of a cwise operator * Renamed perpendicular() => someOrthogonal() (geometry module) * Fix a weird bug in ei_constant_functor: the default copy constructor did not copy the imaginary part when the single member of the class is a complex...
Gael Guennebaud
2008-07-26 20:40:29 +0000
2940617e6fbugfix in some internal asserts of CacheFriendlyProduct
Gael Guennebaud
2008-07-26 12:26:27 +0000
f997a3e902update the inverse test a little; make use of static asserts in Map; fix 2 warnings in CacheFriendlyProduct: unused var 'Vectorized'
Benoit Jacob
2008-07-26 12:08:28 +0000
b466c266a0* Fix some complex alignment issues in the cache friendly matrix-vector products. * Minor update of the cores of the Cholesky algorithms to make them more friendly wrt matrix-vector products => speedup x5 !
Gael Guennebaud
2008-07-23 17:30:00 +0000
172000aaebAdd .perpendicular() function in Geometry module (adapted from Eigen1) Documentation: * add an overview for each module. * add an example for .all() and Cwise::operator<
Gael Guennebaud
2008-07-22 10:54:42 +0000
516db2c3b9Fix compilation issues with icc and g++ < 4.1. Those include: - conflicts with operator * overloads - discard the use of ei_pdiv for integer (g++ handles operators on __m128* types, this is why it worked) - weird behavior of icc in fixed size Block() constructor complaining the initializers of m_blockRows and m_blockCols were missing while we are in fixed size (maybe this hides a deeper problem since this is a recent one, but icc gives only little feedback)
Gael Guennebaud
2008-07-21 12:40:56 +0000
c10f069b6b* Merge Extract and Part to the Part expression. Renamed "MatrixBase::extract() const" to "MatrixBase::part() const" * Renamed static functions identity, zero, ones, random with an upper case first letter: Identity, Zero, Ones and Random.
Gael Guennebaud
2008-07-21 00:34:46 +0000
ce425d92f1Various documentation improvements, in particular in Cholesky and Geometry module. Added doxygen groups for Matrix typedefs and the Geometry module
Gael Guennebaud
2008-07-20 15:18:54 +0000
269f683902Add cholesky's members to MatrixBase Various documentation improvements including new snippets (AngleAxis and Cholesky)
Gael Guennebaud
2008-07-19 22:59:05 +0000
6e2c53e056Added an automatically generated list of selected examples in the documentation. Added the custom gemetry_module tag, and use it.
Gael Guennebaud
2008-07-19 20:36:41 +0000
05ad083467Added MatrixBase::Unit*() static function to easily create unit/basis vectors. Removed EulerAngles, added typedefs for Quaternion and AngleAxis, and added automatic conversions from Quaternion/AngleAxis to Matrix3 such that: Matrix3f m = AngleAxisf(0.2,Vector3f::UnitX) * AngleAxisf(0.2,Vector3f::UnitY); just works.
Gael Guennebaud
2008-07-19 13:03:23 +0000
7245c63067Complete rewrite of partial reduction according to mailing list discussions.
Gael Guennebaud
2008-07-19 11:36:32 +0000
8b4945a5a2add some static asserts, use them, fix gcc 4.3 warning in Product.h.
Benoit Jacob
2008-07-19 00:25:41 +0000
22a816ade8* Fix a couple of issues related to the recent cache friendly products * Improve the efficiency of matrix*vector in unaligned cases * Trivial fixes in the destructors of MatrixStorage * Removed the matrixNorm in test/product.cpp (twice faster and that assumed the matrix product was ok while checking that !!)
Gael Guennebaud
2008-07-19 00:09:01 +0000
62ec1dd616* big rework of Inverse.h: - remove all invertibility checking, will be redundant with LU - general case: adapt to matrix storage order for better perf - size 4 case: handle corner cases without falling back to gen case. - rationalize with selectors instead of compile time if - add C-style computeInverse() * update inverse test. * in snippets, default cout precision to 3 decimal places * add some cmake module from kdelibs to support btl with cmake 2.4
Benoit Jacob
2008-07-15 23:56:17 +0000
b970a9c8aatrivial fix in EulerAngles constructor
Gael Guennebaud
2008-07-15 22:42:55 +0000
c8cbc1665eenhancements of the plot generator: - removed the ugly X11 and PNG gnuplots terminals - use enhanced postscript terminal - use imagemagick to generate the png files (with compression) - disable the fortran impl by default since it is as meaningless as a "C impl" - update line settings
Gael Guennebaud
2008-07-13 11:46:36 +0000
99a625243fOptimization: added super efficient rowmajor * vector product (and vector * colmajor). It basically performs 4 dot products at once reducing loads of the vector and improving instructions scheduling. With 3 cache friendly algorithms, we now handle all product configurations with outstanding perf for large matrices.
Gael Guennebaud
2008-07-13 01:22:54 +0000
51e6ee39f0SVN_SILENT trivial fix
Benoit Jacob
2008-07-12 23:42:19 +0000
bd0183f850fix a cmake issue in FindTvmet and FindMKL
Gael Guennebaud
2008-07-12 23:34:42 +0000
e979e6485fanother occurrence of that little cmake fix
Benoit Jacob
2008-07-12 23:27:41 +0000
861d18d553* Optimization: added a specialization of Block for xpr with DirectAccessBit * some simplifications and fixes in cache friendly products
Gael Guennebaud
2008-07-12 22:59:34 +0000
1bbaea9885little cmake fix
Benoit Jacob
2008-07-12 22:13:03 +0000
10c4e36b39disable MKL check and fortran for cmake <2.6
Gael Guennebaud
2008-07-12 21:54:02 +0000
ed6e07b2f6various improvements of the plot generator in BTL
Gael Guennebaud
2008-07-12 21:41:32 +0000
8233de8b69various minor updates in the benchmark suite like non inlining of some functions as well as the experimental C code used to design efficient eigen's matrix vector products.
Gael Guennebaud
2008-07-12 12:14:08 +0000
b7bd1b3446Add a *very efficient* evaluation path for both col-major matrix * vector and vector * row-major products. Currently, it is enabled only if the matrix has DirectAccessBit flag and the product is "large enough". Added the respective unit tests in test/product.cpp.
Gael Guennebaud
2008-07-12 12:12:02 +0000
6f71ef8277resurrected tvmet, added mt4, intel's MKL and handcoded vectorized backends in the benchmark suite
Gael Guennebaud
2008-07-10 18:28:50 +0000
2b53fd4d53some performance fixes in Assign.h reported by Gael. Some doc update in Cwise.
Benoit Jacob
2008-07-10 16:15:55 +0000
7b4c6b8862in BTL: a specific bench/action can be selected at runtime, e.g.: BTL_CONFIG="-a ata" ctest -V -R eigen runs all the benchmarks having "ata" in their name for all libraries matching the regexp "eigen"
Gael Guennebaud
2008-07-09 22:35:11 +0000
c9b046d5d5* added optimized paths for matrix-vector and vector-matrix products (using either a cache friendly strategy or re-using dot-product vectorized implementation) * add LinearAccessBit to Transpose
Gael Guennebaud
2008-07-09 22:30:18 +0000
25904802bcraah, results were corrupted by overflow. Now slice vectorization is about a +25% speedup which is still nice as i expected zero or even negative benefit.
Benoit Jacob
2008-07-09 16:46:26 +0000
8f21a5e862add benchmark for slice vectorization... expected it to be little or zero benefit... turns out to be 20x speedup. Something is wrong.
Benoit Jacob
2008-07-09 16:43:11 +0000
28539e7597imported a reworked version of BTL (Benchmark for Templated Libraries). the modifications to initial code follow: * changed build system from plain makefiles to cmake * added eigen2 (4 versions: vec/novec and fixed/dynamic), GMM++, MTL4 interfaces * added "transposed matrix * vector" product action * updated blitz interface to use condensed products instead of hand coded loops * removed some deprecated interfaces * changed default storage order to column major for all libraries * new generic bench timer strategy which is supposed to be more accurate * various code clean-up
Gael Guennebaud
2008-07-09 14:04:48 +0000
5f55ab524c* added a lazyAssign overload skipping .lazy(), such that for c = (<xpr>).lazy() the lazyAssign overloads of <xpr> are automatically called (this also reduces assign instantiations)
Gael Guennebaud
2008-07-09 13:54:21 +0000
783eb6da9bI forgot that the previous commit needed minor changes outside the bench folder
Gael Guennebaud
2008-07-08 17:25:58 +0000
77a622f2bbadd Cholesky and eigensolver benchmark
Gael Guennebaud
2008-07-08 17:20:17 +0000
6f09d3a67d- many updates after Cwise change - fix compilation in product.cpp with std::complex - fix bug in MatrixBase::operator!=
Benoit Jacob
2008-07-08 07:56:01 +0000
f5791eeb70the big Array/Cwise rework as discussed on the mailing list. The new API can be seen in Eigen/src/Core/Cwise.h.
Benoit Jacob
2008-07-08 00:49:10 +0000
c910c517b3fix issues in previously added additional product tests
Gael Guennebaud
2008-07-06 19:02:03 +0000
a9d319d44f* do the ActualPacketAccesBit change as discussed on list * add comment in Product.h about CanVectorizeInner * fix typo in test/product.cpp
Benoit Jacob
2008-07-04 12:43:55 +0000
8463b7d3f4* fix compilation issue in Product * added some tests for product and swap * overload .swap() for dynamic-sized matrix of same size
Gael Guennebaud
2008-07-02 16:05:33 +0000
9433df83a7* resurrected Flagged::_expression used to optimize m+=(a*b).lazy() (equivalent to the GEMM blas routine) * added a GEMM benchmark
Gael Guennebaud
2008-07-01 16:20:06 +0000
95549007b3* fix error in divergence test, now it is even faster * add comments in render() in case anyone ever reads that :P
Benoit Jacob
2008-07-01 14:23:01 +0000
a356ebd47dinterleaved rendering balances the load better
Benoit Jacob
2008-07-01 14:12:32 +0000
56d03f181e* multi-threaded rendering * increased number of iterations, with more iterations done before testing divergence. results in x2 speedup from vectorization.
Benoit Jacob
2008-07-01 12:01:58 +0000
cacf986a7f- use double precision to store the position / zoom / other stuff - some temporary fix to get a +50% improvement from vectorization until we have vectorisation for comparisons and redux
Benoit Jacob
2008-06-30 07:33:08 +0000
37a50fa526* added an in-place version of inverseProduct which might be twice faster for small fixed size matrices * added a sparse triangular solver (sparse version of inverseProduct) * various other improvements in the Sparse module
Gael Guennebaud
2008-06-29 21:29:12 +0000
fbdecf09e1fix little bug in computation of max_iter
Benoit Jacob
2008-06-29 12:20:07 +0000
97a1038653improve greatly mandelbrot demo: - much better coloring - determine max number of iterations and choice between float and double at runtime based on zoom level - do draft renderings with increasing resolution before final rendering
Benoit Jacob
2008-06-29 12:04:00 +0000
027818d739* added innerSize / outerSize functions to MatrixBase * added complete implementation of sparse matrix product (with a little glue in Eigen/Core) * added an exhaustive bench of sparse products including GMM++ and MTL4 => Eigen outperforms in all transposed/density configurations !
Gael Guennebaud
2008-06-28 23:07:14 +0000
6917be9113add mandelbrot demo
Benoit Jacob
2008-06-28 20:33:47 +0000
55e08f7102fix breakage from my last commit
Benoit Jacob
2008-06-28 17:15:16 +0000
844f69e4a9* update CMakeLists, only build instantiations if TEST_LIB is defined * allow default Matrix constructor in dynamic size, defaulting to (1, 1), this is convenient in mandelbrot example.
Benoit Jacob
2008-06-27 10:53:30 +0000
6de4871c8cfix a couple of issues in the new Map.h
Benoit Jacob
2008-06-27 01:42:44 +0000
e27b2b95cf* rework Map, allow vectorization * rework PacketMath and DummyPacketMath, make these actual template specializations instead of just overriding by non-template inline functions * introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix * remove Matrix::map() methods, use Map constructors instead.
Benoit Jacob
2008-06-27 01:22:35 +0000
e5d301dc96various work on the Sparse module: * added some glue to Eigen/Core (SparseBit, ei_eval, Matrix) * add two new sparse matrix types: HashMatrix: based on std::map (for random writes) LinkedVectorMatrix: array of linked vectors (for outer coherent writes, e.g. to transpose a matrix) * add a SparseSetter class to easily set/update any kind of matrices, e.g.: { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix); for (...) wrapper->coeffRef(rand(),rand()) = rand(); } * automatic shallow copy for RValue * and a lot of mess ! plus: * remove the remaining ArrayBit related stuff * don't use alloca in product for very large memory allocation
Gael Guennebaud
2008-06-26 23:22:26 +0000
c5bd1703cbchange derived classes methods from "private:_method()" to "public:method()" i.e. reimplementing the generic method() from MatrixBase. improves compilation speed by 7%, reduces almost by half the call depth of trivial functions, making gcc errors and application backtraces nicer...
Benoit Jacob
2008-06-26 20:08:16 +0000
25ba9f377c* add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned) * introduce packet(int), make use of it in linear vectorized paths --> completely fixes the slowdown noticed in benchVecAdd. * generalize coeff(int) to linear-access xprs * clarify the access flag bits * rework api dox in Coeffs.h and util/Constants.h * improve certain expressions' flags, allowing more vectorization * fix bug in Block: start(int) and end(int) returned dyn*dyn size * fix bug in Block: just because the Eval type has packet access doesn't imply the block xpr should have it too.
Benoit Jacob
2008-06-26 16:06:41 +0000
5b0da4b778make use of ei_pmadd in dot-product: will further improve performance on architectures having a packed-mul-add assembly instruction.
Benoit Jacob
2008-06-24 18:08:35 +0000
3b94436d2f* vectorize dot product, copying code from sum. * make the conj functor vectorizable: it is just identity in real case, and complex doesn't use the vectorized path anyway. * fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size) should not be vectorizable, since in fixed-size we are assuming the size to be a multiple of packet size. (Or would you prefer Vector3d to be flagged "packetaccess" even though no packet access is possible on vectors of that type?) * rename: isOrtho for vectors ---> isOrthogonal isOrtho for matrices ---> isUnitary * add normalize() * reimplement normalized with quotient1 functor
Benoit Jacob
2008-06-24 15:13:00 +0000
c9560df4a0* add ei_pdiv intrinsic, make quotient functor vectorizable * add vdw benchmark from Tim's real-world use case
Benoit Jacob
2008-06-23 22:00:18 +0000
ac9aa47bbcoptimize linear vectorization both in Assign and Sum (optimal amortized perf)
Gael Guennebaud
2008-06-23 15:50:28 +0000
ea1990ef3dadd experimental code for sparse matrix: - uses the common "Compressed Column Storage" scheme - supports every unary and binary operators with xpr template assuming binaryOp(0,0) == 0 and unaryOp(0) == 0 (otherwise a sparse matrix does not make sense) - this is the first commit, so of course, there are still several shortcomings !
Gael Guennebaud
2008-06-23 13:25:22 +0000
03d19f3baequick temporary fix for a perf issue we just identified with vectorization.... now the sum benchmark runs 3x faster with vectorization than without.
Benoit Jacob
2008-06-23 11:23:05 +0000
32596c5e9eadd benchmark for sum
Benoit Jacob
2008-06-23 11:03:27 +0000
dc9206cec5split sum away from redux and vectorize it. (could come back to redux after it has been vectorized, and could serve as a starting point for that) also make the abs2 functor vectorizable (for real types).
Benoit Jacob
2008-06-23 10:32:48 +0000
8a967fb17c* implement slice vectorization. Because it uses unaligned packet access, it is not certain that it will bring a performance improvement: benchmarking needed. * improve logic choosing slice vectorization. * fix typo in SSE packet math, causing crash in unaligned case. * fix bug in Product, causing crash in unaligned case. * add TEST_SSE3 CMake option.
Benoit Jacob
2008-06-22 15:02:05 +0000
8cef541b5aforgot to add the unit test array.cpp
Gael Guennebaud
2008-06-21 17:28:07 +0000
32c5ea388ework on rotations in the Geometry module: - conversions are done through constructors and operator= - added an EulerAngles class
Gael Guennebaud
2008-06-21 15:01:49 +0000
574416b842Override MatrixBase::eval() since matrices don't need to be evaluated, it is enough to just read them.
Benoit Jacob
2008-06-20 15:26:39 +0000
54238961d6* added a pseudo expression Array giving access to: - matrix-scalar addition/subtraction operators, e.g.: m.array() += 0.5; - matrix/matrix comparison operators, e.g.: if (m1.array() < m2.array()) {} * fix compilation issues with Transform and gcc < 4.1
Gael Guennebaud
2008-06-20 12:38:03 +0000
e735692e37move "enum" back to "const int" in ei_assign_impl: in fact, casting enums to int is enough to get compile time constants with ICC.
Gael Guennebaud
2008-06-20 07:10:50 +0000
fb4a151982* more cleaning in Product * make Matrix2f (and similar) vectorized using linear path * fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4 (cannot get Transform to compile with gcc 3.3/3.4, see the FIXME)
Gael Guennebaud
2008-06-19 23:00:51 +0000
82c3cea1d5* refactoring of Product: * use ProductReturnType<>::Type to get the correct Product xpr type * Product is no longer instantiated for xpr types which are evaluated * vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix) * some cleaning * removed ArrayBase
Gael Guennebaud
2008-06-19 17:33:57 +0000
5dbfed1902fix two bugs discovered by the previous commit.
Gael Guennebaud
2008-06-16 16:39:58 +0000
bb1f4e44f1* Block: row and column expressions in the inner direction now have the Like1D flag.
Benoit Jacob
2008-06-16 14:54:31 +0000
9857764ae7aaargh.
Benoit Jacob
2008-06-16 11:20:29 +0000
478bfaf228fix bug in computation of unrolling limit: div instead of mul
Benoit Jacob
2008-06-16 11:18:59 +0000
c905b31b42* Big rework of Assign.h: ** Much better organization ** Fix a few bugs ** Add the ability to unroll only the inner loop ** Add an unrolled path to the Like1D vectorization. Not well tested. ** Add placeholder for sliced vectorization. Unimplemented.
Benoit Jacob
2008-06-16 10:49:44 +0000
bc0c7c57edAdded an extensible mechanism to support any kind of rotation representation in Transform via the template static class ToRotationMatrix. Added a lightweight AngleAxis class (similar to Rotation2D).
Gael Guennebaud
2008-06-15 17:22:41 +0000
0ee6b08128* split Product to a DiagonalProduct template specialization to optimize matrix-diag and diag-matrix products without making Product overcomplicated. * compilation fixes in Tridiagonalization and HessenbergDecomposition in the case of 2x2 matrices. * added a small Orientation2D class with an interface similar to Quaternion (used by Transform to handle 2D and 3D orientations seamlessly) * added a couple of features in Transform.
Gael Guennebaud
2008-06-15 11:54:18 +0000
fbbd8afe30Started a Transform class in the Geometry module to represent homography. Fix indentation in Quaternion.h
Gael Guennebaud
2008-06-15 08:33:44 +0000
4af7089ab8* Added a generalized eigen solver for the selfadjoint case. (as new members to SelfAdjointEigenSolver) The QR module now depends on Cholesky. * Fix Transpose to correctly preserve the *TriangularBit.
Gael Guennebaud
2008-06-14 19:42:12 +0000
f07f907810Add QR and Cholesky module instantiations in the lib. To try it with the unit tests set the cmake variable TEST_LIB to ON.
Gael Guennebaud
2008-06-14 13:02:41 +0000
53289a8b64* even though the _Flags default to the corrected value, still correct them in the ei_traits, so that they're guaranteed even if the user specified his own non-default flags (like before).
Benoit Jacob
2008-06-13 08:09:48 +0000
c90c77051f* make the _Flags template parameter of Matrix default to the corrected flags. This ensures that unless explicitly messed up otherwise, a Matrix type is equal to its own Eval type. This seriously reduces the number of types instantiated. Measured +13% compile speed, -7% binary size.
Benoit Jacob
2008-06-13 07:53:45 +0000
e3fac69f19Added a Hessenberg decomposition class for both real and complex matrices. This is the first step towards a non-selfadjoint eigen solver. Notes: - We might consider merging Tridiagonalization and Hessenberg together? - Or we could factorize some code into a Householder class (could also be shared with QR)
Gael Guennebaud
2008-06-08 15:03:23 +0000
4dd57b585d* rewrite of the QR decomposition: - works for complex - allows direct access to the matrix R * removed the scale by the matrix dimensions in MatrixBase::isMuchSmallerThan(scalar)
Gael Guennebaud
2008-06-07 22:47:11 +0000
eb7b7b2cfc* remove Cross product expression: MatrixBase::cross() now returns a temporary which is even better optimized by the compiler. * Quaternion no longer inherits MatrixBase. Instead it stores the coefficients using a Matrix<> and provides only relevant methods.
Gael Guennebaud
2008-06-07 13:18:29 +0000
6998037930* move some compile time "if" to their respective unroller (assign and dot) * fix a couple of compilation issues when unrolling is disabled * reduce default unrolling limit to a more reasonable value
Gael Guennebaud
2008-06-07 01:07:48 +0000
a172385720Updated fuzzy comparisons to use L2 norm as all my experiments tend to show L2 norm works very well here. (the legacy implementation is still available via a preprocessor token to allow further experiments if needed...)
Gael Guennebaud
2008-06-06 18:37:53 +0000
8769bfd9aafix a compilation issue in non debug mode
Gael Guennebaud
2008-06-06 14:11:26 +0000
869394ee8bfix some compile errors with gcc 4.3, some warnings, some documentation
Benoit Jacob
2008-06-06 13:10:00 +0000
2126baf9dcadd an optimized path for the tridiagonalization of a 3x3 matrix. (useful for plane fitting, and covariance analysis of 3D data)
Gael Guennebaud
2008-06-04 13:41:32 +0000
48262b9734added a static assertion mechanism (see notes in Core/util/StaticAssert.h for details)
Gael Guennebaud
2008-06-04 11:16:11 +0000
60726f91a9hack to make the nomalloc unit test compile with -pedantic
Gael Guennebaud
2008-06-04 10:15:48 +0000
42ad9c4352update of the eigensolver unit test to check complex
Gael Guennebaud
2008-06-03 18:04:36 +0000
a0cff1a295fix eigenvectors computations :)
Gael Guennebaud
2008-06-03 18:03:55 +0000