Commit Graph

  • e9e5261664 Fix a couple issues introduced in the previous commit: * removed DirectAccessBit from Part * use a template specialization in inverseProduct() to transform a Part xpr to a Flagged xpr Gael Guennebaud 2008-07-26 23:05:44 +0000
  • e77ccf2928 * Rewrite the triangular solver so that we can take advantage of our efficient matrix-vector products: => up to 6 times faster ! * Added DirectAccessBit to Part * Added an example of a cwise operator * Renamed perpendicular() => someOrthogonal() (geometry module) * Fix a weird bug in ei_constant_functor: the default copy constructor did not copy the imaginary part when the single member of the class is a complex... Gael Guennebaud 2008-07-26 20:40:29 +0000
  • 2940617e6f bugfix in some internal asserts of CacheFriendlyProduct Gael Guennebaud 2008-07-26 12:26:27 +0000
  • f997a3e902 update the inverse test a little make use of static asserts in Map fix 2 warnings in CacheFriendlyProduct: unused var 'Vectorized' Benoit Jacob 2008-07-26 12:08:28 +0000
  • b466c266a0 * Fix some complex alignment issues in the cache friendly matrix-vector products. * Minor update of the cores of the Cholesky algorithms to make them more friendly wrt to matrix-vector products => speedup x5 ! Gael Guennebaud 2008-07-23 17:30:00 +0000
  • 172000aaeb Add .perpendicular() function in Geometry module (adapted from Eigen1) Documentation: * add an overview for each module. * add an example for .all() and Cwise::operator< Gael Guennebaud 2008-07-22 10:54:42 +0000
  • 516db2c3b9 Fix compilation issues with icc and g++ < 4.1. Those include: - conflicts with operator * overloads - discard the use of ei_pdiv for integer (g++ handles operators on __m128* types, this is why it worked) - weird behavior of icc in fixed size Block() constructor complaining the initializer of m_blockRows and m_blockCols were missing while we are in fixed size (maybe this hides a deeper problem since this is a recent one, but icc gives only little feedback) Gael Guennebaud 2008-07-21 12:40:56 +0000
  • c10f069b6b * Merge Extract and Part to the Part expression. Renamed "MatrixBase::extract() const" to "MatrixBase::part() const" * Renamed static functions identity, zero, ones, random with an upper case first letter: Identity, Zero, Ones and Random. Gael Guennebaud 2008-07-21 00:34:46 +0000
  • ce425d92f1 Various documentation improvements, in particular in Cholesky and Geometry module. Added doxygen groups for Matrix typedefs and the Geometry module Gael Guennebaud 2008-07-20 15:18:54 +0000
  • 269f683902 Add cholesky's members to MatrixBase Various documentation improvements including new snippets (AngleAxis and Cholesky) Gael Guennebaud 2008-07-19 22:59:05 +0000
  • 6e2c53e056 Added an automatically generated list of selected examples in the documentation. Added the custom gemetry_module tag, and use it. Gael Guennebaud 2008-07-19 20:36:41 +0000
  • 05ad083467 Added MatrixBase::Unit*() static function to easily create unit/basis vectors. Removed EulerAngles, added typedefs for Quaternion and AngleAxis, and added automatic conversions from Quaternion/AngleAxis to Matrix3 such that: Matrix3f m = AngleAxisf(0.2,Vector3f::UnitX) * AngleAxisf(0.2,Vector3f::UnitY); just works. Gael Guennebaud 2008-07-19 13:03:23 +0000
  • 7245c63067 Complete rewrite of partial reduction according to mailing list discussions. Gael Guennebaud 2008-07-19 11:36:32 +0000
  • 8b4945a5a2 add some static asserts, use them, fix gcc 4.3 warning in Product.h. Benoit Jacob 2008-07-19 00:25:41 +0000
  • 22a816ade8 * Fix a couple of issues related to the recent cache friendly products * Improve the efficiency of matrix*vector in unaligned cases * Trivial fixes in the destructors of MatrixStorage * Removed the matrixNorm in test/product.cpp (twice faster and that assumed the matrix product was ok while checking that !!) Gael Guennebaud 2008-07-19 00:09:01 +0000
  • 62ec1dd616 * big rework of Inverse.h: - remove all invertibility checking, will be redundant with LU - general case: adapt to matrix storage order for better perf - size 4 case: handle corner cases without falling back to gen case. - rationalize with selectors instead of compile time if - add C-style computeInverse() * update inverse test. * in snippets, default cout precision to 3 decimal places * add some cmake module from kdelibs to support btl with cmake 2.4 Benoit Jacob 2008-07-15 23:56:17 +0000
  • b970a9c8aa trivial fix in EulerAngles constructor Gael Guennebaud 2008-07-15 22:42:55 +0000
  • c8cbc1665e enhancements of the plot generator: - removed the ugly X11 and PNG gnuplots terminals - use enhanced postscript terminal - use imagemagick to generate the png files (with compression) - disable the fortran impl by default since it is as meaningless as a "C impl" - update line settings Gael Guennebaud 2008-07-13 11:46:36 +0000
  • 99a625243f Optimization: added super efficient rowmajor * vector product (and vector * colmajor). It basically performs 4 dot products at once reducing loads of the vector and improving instructions scheduling. With 3 cache friendly algorithms, we now handle all product configurations with outstanding perf for large matrices. Gael Guennebaud 2008-07-13 01:22:54 +0000
  • 51e6ee39f0 SVN_SILENT trivial fix Benoit Jacob 2008-07-12 23:42:19 +0000
  • bd0183f850 fix a cmake issue in FindTvmet and FindMKL Gael Guennebaud 2008-07-12 23:34:42 +0000
  • e979e6485f another occurrence of that little cmake fix Benoit Jacob 2008-07-12 23:27:41 +0000
  • 861d18d553 * Optimization: added a specialization of Block for xpr with DirectAccessBit * some simplifications and fixes in cache friendly products Gael Guennebaud 2008-07-12 22:59:34 +0000
  • 1bbaea9885 little cmake fix Benoit Jacob 2008-07-12 22:13:03 +0000
  • 10c4e36b39 disable MKL check and fortran for cmake <2.6 Gael Guennebaud 2008-07-12 21:54:02 +0000
  • ed6e07b2f6 various improvements of the plot generator in BTL Gael Guennebaud 2008-07-12 21:41:32 +0000
  • 8233de8b69 various minor updates in the benchmark suite like non inlining of some functions as well as the experimental C code used to design efficient eigen's matrix vector products. Gael Guennebaud 2008-07-12 12:14:08 +0000
  • b7bd1b3446 Add a *very efficient* evaluation path for both col-major matrix * vector and vector * row-major products. Currently, it is enabled only if the matrix has DirectAccessBit flag and the product is "large enough". Added the respective unit tests in test/product/cpp. Gael Guennebaud 2008-07-12 12:12:02 +0000
  • 6f71ef8277 resurrected tvmet, added mt4, intel's MKL and handcoded vectorized backends in the benchmark suite Gael Guennebaud 2008-07-10 18:28:50 +0000
  • 2b53fd4d53 some performance fixes in Assign.h reported by Gael. Some doc update in Cwise. Benoit Jacob 2008-07-10 16:15:55 +0000
  • 7b4c6b8862 in BTL: a specific bench/action can be selected at runtime, e.g.: BTL_CONFIG="-a ata" ctest -V -R eigen runs all the benchmarks having "ata" in their name for all libraries matching the regexp "eigen" Gael Guennebaud 2008-07-09 22:35:11 +0000
  • c9b046d5d5 * added optimized paths for matrix-vector and vector-matrix products (using either a cache friendly strategy or re-using dot-product vectorized implementation) * add LinearAccessBit to Transpose Gael Guennebaud 2008-07-09 22:30:18 +0000
  • 25904802bc raah, results were corrupted by overflow. Now slice vectorization is about a +25% speedup which is still nice as i expected zero or even negative benefit. Benoit Jacob 2008-07-09 16:46:26 +0000
  • 8f21a5e862 add benchmark for slice vectorization... expected it to be little or zero benefit... turns out to be 20x speedup. Something is wrong. Benoit Jacob 2008-07-09 16:43:11 +0000
  • 28539e7597 imported a reworked version of BTL (Benchmark for Templated Libraries). the modifications to initial code follow: * changed build system from plain makefiles to cmake * added eigen2 (4 versions: vec/novec and fixed/dynamic), GMM++, MTL4 interfaces * added "transposed matrix * vector" product action * updated blitz interface to use condensed products instead of hand coded loops * removed some deprecated interfaces * changed default storage order to column major for all libraries * new generic bench timer strategy which is supposed to be more accurate * various code clean-up Gael Guennebaud 2008-07-09 14:04:48 +0000
  • 5f55ab524c * added a lazyAssign overload skipping .lazy() such that c = (<xpr>).lazy() such that lazyAssign overloads of <xpr> are automatically called (this also reduces assign instantiations) Gael Guennebaud 2008-07-09 13:54:21 +0000
  • 783eb6da9b I forgot that the previous commit needed minor changes outside the bench folder Gael Guennebaud 2008-07-08 17:25:58 +0000
  • 77a622f2bb add Cholesky and eigensolver benchmark Gael Guennebaud 2008-07-08 17:20:17 +0000
  • 6f09d3a67d - many updates after Cwise change - fix compilation in product.cpp with std::complex - fix bug in MatrixBase::operator!= Benoit Jacob 2008-07-08 07:56:01 +0000
  • f5791eeb70 the big Array/Cwise rework as discussed on the mailing list. The new API can be seen in Eigen/src/Core/Cwise.h. Benoit Jacob 2008-07-08 00:49:10 +0000
  • c910c517b3 fix issues in previously added additional product tests Gael Guennebaud 2008-07-06 19:02:03 +0000
  • a9d319d44f * do the ActualPacketAccesBit change as discussed on list * add comment in Product.h about CanVectorizeInner * fix typo in test/product.cpp Benoit Jacob 2008-07-04 12:43:55 +0000
  • 8463b7d3f4 * fix compilation issue in Product * added some tests for product and swap * overload .swap() for dynamic-sized matrix of same size Gael Guennebaud 2008-07-02 16:05:33 +0000
  • 9433df83a7 * resurrected Flagged::_expression used to optimize m+=(a*b).lazy() (equivalent to the GEMM blas routine) * added a GEMM benchmark Gael Guennebaud 2008-07-01 16:20:06 +0000
  • 95549007b3 * fix error in divergence test, now it is even faster * add comments in render() in case anyone ever reads that :P Benoit Jacob 2008-07-01 14:23:01 +0000
  • a356ebd47d interleaved rendering balances the load better Benoit Jacob 2008-07-01 14:12:32 +0000
  • 56d03f181e * multi-threaded rendering * increased number of iterations, with more iterations done before testing divergence. results in x2 speedup from vectorization. Benoit Jacob 2008-07-01 12:01:58 +0000
  • cacf986a7f - use double precision to store the position / zoom / other stuff - some temporary fix to get a +50% improvement from vectorization until we have vectorisation for comparisons and redux Benoit Jacob 2008-06-30 07:33:08 +0000
  • 37a50fa526 * added an in-place version of inverseProduct which might be twice faster for small fixed size matrix * added a sparse triangular solver (sparse version of inverseProduct) * various other improvements in the Sparse module Gael Guennebaud 2008-06-29 21:29:12 +0000
  • fbdecf09e1 fix little bug in computation of max_iter Benoit Jacob 2008-06-29 12:20:07 +0000
  • 97a1038653 improve greatly mandelbrot demo: - much better coloring - determine max number of iterations and choice between float and double at runtime based on zoom level - do draft renderings with increasing resolution before final rendering Benoit Jacob 2008-06-29 12:04:00 +0000
  • 027818d739 * added innerSize / outerSize functions to MatrixBase * added complete implementation of sparse matrix product (with a little glue in Eigen/Core) * added an exhaustive bench of sparse products including GMM++ and MTL4 => Eigen outperforms in all transposed/density configurations ! Gael Guennebaud 2008-06-28 23:07:14 +0000
  • 6917be9113 add mandelbrot demo Benoit Jacob 2008-06-28 20:33:47 +0000
  • 55e08f7102 fix breakage from my last commit Benoit Jacob 2008-06-28 17:15:16 +0000
  • 844f69e4a9 * update CMakeLists, only build instantiations if TEST_LIB is defined * allow default Matrix constructor in dynamic size, defaulting to (1, 1), this is convenient in mandelbrot example. Benoit Jacob 2008-06-27 10:53:30 +0000
  • 6de4871c8c fix a couple of issues in the new Map.h Benoit Jacob 2008-06-27 01:42:44 +0000
  • e27b2b95cf * rework Map, allow vectorization * rework PacketMath and DummyPacketMath, make these actual template specializations instead of just overriding by non-template inline functions * introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix * remove Matrix::map() methods, use Map constructors instead. Benoit Jacob 2008-06-27 01:22:35 +0000
  • e5d301dc96 various work on the Sparse module: * added some glue to Eigen/Core (SparseBit, ei_eval, Matrix) * add two new sparse matrix types: HashMatrix: based on std::map (for random writes) LinkedVectorMatrix: array of linked vectors (for outer coherent writes, e.g. to transpose a matrix) * add a SparseSetter class to easily set/update any kind of matrices, e.g.: { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix); for (...) wrapper->coeffRef(rand(),rand()) = rand(); } * automatic shallow copy for RValue * and a lot of mess ! plus: * remove the remaining ArrayBit related stuff * don't use alloca in product for very large memory allocation Gael Guennebaud 2008-06-26 23:22:26 +0000
  • c5bd1703cb change derived classes methods from "private:_method()" to "public:method()" i.e. reimplementing the generic method() from MatrixBase. improves compilation speed by 7%, reduces almost by half the call depth of trivial functions, making gcc errors and application backtraces nicer... Benoit Jacob 2008-06-26 20:08:16 +0000
  • 25ba9f377c * add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned) * introduce packet(int), make use of it in linear vectorized paths --> completely fixes the slowdown noticed in benchVecAdd. * generalize coeff(int) to linear-access xprs * clarify the access flag bits * rework api dox in Coeffs.h and util/Constants.h * improve certain expressions's flags, allowing more vectorization * fix bug in Block: start(int) and end(int) returned dyn*dyn size * fix bug in Block: just because the Eval type has packet access doesn't imply the block xpr should have it too. Benoit Jacob 2008-06-26 16:06:41 +0000
  • 5b0da4b778 make use of ei_pmadd in dot-product: will further improve performance on architectures having a packed-mul-add assembly instruction. Benoit Jacob 2008-06-24 18:08:35 +0000
  • 3b94436d2f * vectorize dot product, copying code from sum. * make the conj functor vectorizable: it is just identity in real case, and complex doesn't use the vectorized path anyway. * fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size) should not be vectorizable, since in fixed-size we are assuming the size to be a multiple of packet size. (Or would you prefer Vector3d to be flagged "packetaccess" even though no packet access is possible on vectors of that type?) * rename: isOrtho for vectors ---> isOrthogonal isOrtho for matrices ---> isUnitary * add normalize() * reimplement normalized with quotient1 functor Benoit Jacob 2008-06-24 15:13:00 +0000
  • c9560df4a0 * add ei_pdiv intrinsic, make quotient functor vectorizable * add vdw benchmark from Tim's real-world use case Benoit Jacob 2008-06-23 22:00:18 +0000
  • ac9aa47bbc optimize linear vectorization both in Assign and Sum (optimal amortized perf) Gael Guennebaud 2008-06-23 15:50:28 +0000
  • ea1990ef3d add experimental code for sparse matrix: - uses the common "Compressed Column Storage" scheme - supports every unary and binary operators with xpr template assuming binaryOp(0,0) == 0 and unaryOp(0) = 0 (otherwise a sparse matrix does not make sense) - this is the first commit, so of course, there are still several shortcomings ! Gael Guennebaud 2008-06-23 13:25:22 +0000
  • 03d19f3bae quick temporary fix for a perf issue we just identified with vectorization.... now the sum benchmark runs 3x faster with vectorization than without. Benoit Jacob 2008-06-23 11:23:05 +0000
  • 32596c5e9e add benchmark for sum Benoit Jacob 2008-06-23 11:03:27 +0000
  • dc9206cec5 split sum away from redux and vectorize it. (could come back to redux after it has been vectorized, and could serve as a starting point for that) also make the abs2 functor vectorizable (for real types). Benoit Jacob 2008-06-23 10:32:48 +0000
  • 8a967fb17c * implement slice vectorization. Because it uses unaligned packet access, it is not certain that it will bring a performance improvement: benchmarking needed. * improve logic choosing slice vectorization. * fix typo in SSE packet math, causing crash in unaligned case. * fix bug in Product, causing crash in unaligned case. * add TEST_SSE3 CMake option. Benoit Jacob 2008-06-22 15:02:05 +0000
  • 8cef541b5a forgot to add the unit test array.cpp Gael Guennebaud 2008-06-21 17:28:07 +0000
  • 32c5ea388e work on rotations in the Geometry module: - conversions are done through constructors and operator= - added an EulerAngles class Gael Guennebaud 2008-06-21 15:01:49 +0000
  • 574416b842 Override MatrixBase::eval() since matrices don't need to be evaluated, it is enough to just read them. Benoit Jacob 2008-06-20 15:26:39 +0000
  • 54238961d6 * added a pseudo expression Array giving access to: - matrix-scalar addition/subtraction operators, e.g.: m.array() += 0.5; - matrix/matrix comparison operators, e.g.: if (m1.array() < m2.array()) {} * fix compilation issues with Transform and gcc < 4.1 Gael Guennebaud 2008-06-20 12:38:03 +0000
  • e735692e37 move "enum" back to "const int" in ei_assign_impl: in fact, casting enums to int is enough to get compile time constants with ICC. Gael Guennebaud 2008-06-20 07:10:50 +0000
  • fb4a151982 * more cleaning in Product * make Matrix2f (and similar) vectorized using linear path * fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4 (cannot get Transform to compile with gcc 3.3/3.4, see the FIXME) Gael Guennebaud 2008-06-19 23:00:51 +0000
  • 82c3cea1d5 * refactoring of Product: * use ProductReturnType<>::Type to get the correct Product xpr type * Product is no longer instantiated for xpr types which are evaluated * vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix) * some cleaning * removed ArrayBase Gael Guennebaud 2008-06-19 17:33:57 +0000
  • 5dbfed1902 fix two bugs discovered by the previous commit. Gael Guennebaud 2008-06-16 16:39:58 +0000
  • bb1f4e44f1 * Block: row and column expressions in the inner direction now have the Like1D flag. Benoit Jacob 2008-06-16 14:54:31 +0000
  • 9857764ae7 aaargh. Benoit Jacob 2008-06-16 11:20:29 +0000
  • 478bfaf228 fix bug in computation of unrolling limit: div instead of mul Benoit Jacob 2008-06-16 11:18:59 +0000
  • c905b31b42 * Big rework of Assign.h: ** Much better organization ** Fix a few bugs ** Add the ability to unroll only the inner loop ** Add an unrolled path to the Like1D vectorization. Not well tested. ** Add placeholder for sliced vectorization. Unimplemented. Benoit Jacob 2008-06-16 10:49:44 +0000
  • bc0c7c57ed Added an extensible mechanism to support any kind of rotation representation in Transform via the template static class ToRotationMatrix. Added a lightweight AngleAxis class (similar to Rotation2D). Gael Guennebaud 2008-06-15 17:22:41 +0000
  • 0ee6b08128 * split Product to a DiagonalProduct template specialization to optimize matrix-diag and diag-matrix products without making Product over complicated. * compilation fixes in Tridiagonalization and HessenbergDecomposition in the case of 2x2 matrices. * added an Orientation2D small class with an interface similar to Quaternion (used by Transform to handle 2D and 3D orientations seamlessly) * added a couple of features in Transform. Gael Guennebaud 2008-06-15 11:54:18 +0000
  • fbbd8afe30 Started a Transform class in the Geometry module to represent homography. Fix indentation in Quaternion.h Gael Guennebaud 2008-06-15 08:33:44 +0000
  • 4af7089ab8 * Added a generalized eigen solver for the selfadjoint case. (as new members to SelfAdjointEigenSolver) The QR module now depends on Cholesky. * Fix Transpose to correctly preserve the *TriangularBit. Gael Guennebaud 2008-06-14 19:42:12 +0000
  • f07f907810 Add QR and Cholesky module instantiations in the lib. To try it with the unit tests set the cmake variable TEST_LIB to ON. Gael Guennebaud 2008-06-14 13:02:41 +0000
  • 53289a8b64 * even though the _Flags default to the corrected value, still correct them in the ei_traits, so that they're guaranteed even if the user specified his own non-default flags (like before). Benoit Jacob 2008-06-13 08:09:48 +0000
  • c90c77051f * make the _Flags template parameter of Matrix default to the corrected flags. This ensures that unless explicitly messed up otherwise, a Matrix type is equal to its own Eval type. This seriously reduces the number of types instantiated. Measured +13% compile speed, -7% binary size. Benoit Jacob 2008-06-13 07:53:45 +0000
  • e3fac69f19 Added a Hessenberg decomposition class for both real and complex matrices. This is the first step towards a non-selfadjoint eigen solver. Notes: - We might consider merging Tridiagonalization and Hessenberg together ? - Or we could factorize some code into a Householder class (could also be shared with QR) Gael Guennebaud 2008-06-08 15:03:23 +0000
  • 4dd57b585d * rewrite of the QR decomposition: - works for complex - allows direct access to the matrix R * removed the scale by the matrix dimensions in MatrixBase::isMuchSmallerThan(scalar) Gael Guennebaud 2008-06-07 22:47:11 +0000
  • eb7b7b2cfc * remove Cross product expression: MatrixBase::cross() now returns a temporary which is even better optimized by the compiler. * Quaternion no longer inherits MatrixBase. Instead it stores the coefficients using a Matrix<> and provides only relevant methods. Gael Guennebaud 2008-06-07 13:18:29 +0000
  • 6998037930 * move some compile time "if" to their respective unroller (assign and dot) * fix a couple of compilation issues when unrolling is disabled * reduce default unrolling limit to a more reasonable value Gael Guennebaud 2008-06-07 01:07:48 +0000
  • a172385720 Updated fuzzy comparisons to use L2 norm as all my experiments tend to show L2 norm works very well here. (the legacy implementation is still available via a preprocessor token to allow further experiments if needed...) Gael Guennebaud 2008-06-06 18:37:53 +0000
  • 8769bfd9aa fix a compilation issue in non debug mode Gael Guennebaud 2008-06-06 14:11:26 +0000
  • 869394ee8b fix some compile errors with gcc 4.3, some warnings, some documentation Benoit Jacob 2008-06-06 13:10:00 +0000
  • 2126baf9dc add an optimized path for the tridiagonalization of a 3x3 matrix. (useful for plane fitting, and covariance analysis of 3D data) Gael Guennebaud 2008-06-04 13:41:32 +0000
  • 48262b9734 added a static assertion mechanism (see notes in Core/util/StaticAssert.h for details) Gael Guennebaud 2008-06-04 11:16:11 +0000
  • 60726f91a9 hack to make the nomalloc unit test compile with -pedantic Gael Guennebaud 2008-06-04 10:15:48 +0000
  • 42ad9c4352 update of the eigensolver unit test to check complex Gael Guennebaud 2008-06-03 18:04:36 +0000
  • a0cff1a295 fix eigenvectors computations :) Gael Guennebaud 2008-06-03 18:03:55 +0000