Gael Guennebaud
3ada6e4bed
Merged hongkai-dai/eigen/tip into default (bug #1298 )
2016-09-19 22:08:06 +02:00
Benoit Steiner
c3ca9b1e76
Deleted some unecessary and confusing EIGEN_DEVICE_FUNC
2016-09-19 11:33:39 -07:00
Hongkai Dai
5dcc6d301a
remove ternary operator in euler angles
2016-09-19 10:30:30 -07:00
Luke Iwanski
c771df6bc3
Updated the owners of the file.
2016-09-19 14:09:25 +01:00
Luke Iwanski
b91e021172
Merged with default.
2016-09-19 14:03:54 +01:00
Luke Iwanski
cb81975714
Partial OpenCL support via SYCL compatible with ComputeCpp CE.
2016-09-19 12:44:13 +01:00
Emil Fresk
6edd2e2851
Made AutoDiffJacobian more intuitive to use and updated for C++11
...
Changes:
* Removed unnecessary types from the Functor by inferring from its types
* Removed inputs() function reference, replaced with .rows()
* Updated the forward constructor to use variadic templates
* Added optional parameters to the Fuctor for passing parameters,
control signals, etc
* Has been tested with fixed size and dynamic matricies
Ammendment by chtz: overload operator() for compatibility with not fully conforming compilers
2016-09-16 14:03:55 +02:00
Gael Guennebaud
18f6e47815
Fix order of "static inline".
2016-09-16 11:32:54 +02:00
Benoit Steiner
488ad7dd1b
Added missing EIGEN_DEVICE_FUNC qualifiers
2016-09-14 13:35:00 -07:00
Benoit Steiner
e4d4d15588
Register the cxx11_tensor_device only for recent cuda architectures (i.e. >= 3.0) since the test instantiate contractions that require a modern gpu.
2016-09-12 19:01:52 -07:00
Benoit Steiner
4dfd888c92
CUDA contractions require arch >= 3.0: don't compile the cuda contraction tests on older architectures.
2016-09-12 18:49:01 -07:00
Benoit Steiner
028e299577
Fixed a bug impacting some outer reductions on GPU
2016-09-12 18:36:52 -07:00
Benoit Steiner
5f50f12d2c
Added the ability to compute the absolute value of a complex number on GPU, as well as a test to catch the problem.
2016-09-12 13:46:13 -07:00
Benoit Steiner
8321dcce76
Merged latest updates from trunk
2016-09-12 10:33:05 -07:00
Benoit Steiner
eb6ba00cc8
Properly size the list of waiters
2016-09-12 10:31:55 -07:00
Benoit Steiner
a618094b62
Added a resize method to MaxSizeVector
2016-09-12 10:30:53 -07:00
Gael Guennebaud
471eac5399
bug #1195 : move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX)
2016-09-08 08:36:27 +02:00
Gael Guennebaud
e1642f485c
bug #1288 : fix memory leak in arpack wrapper.
2016-09-05 18:01:30 +02:00
Gael Guennebaud
dabc81751f
Fix compilation when cuda_fp16.h does not exist.
2016-09-05 17:14:20 +02:00
Benoit Steiner
87a8a1975e
Fixed a regression test
2016-09-02 19:29:33 -07:00
Benoit Steiner
13df3441ae
Use MaxSizeVector instead of std::vector: xcode sometimes assumes that std::vector allocates aligned memory and therefore issues aligned instruction to initialize it. This can result in random crashes when compiling with AVX instructions enabled.
2016-09-02 19:25:47 -07:00
Benoit Steiner
cadd124d73
Pulled latest update from trunk
2016-09-02 15:30:02 -07:00
Benoit Steiner
05b0518077
Made the index type an explicit template parameter to help some compilers compile the code.
2016-09-02 15:29:34 -07:00
Benoit Steiner
adf864fec0
Merged in rmlarsen/eigen (pull request PR-222)
...
Fix CUDA build broken by changes to min and max reduction.
2016-09-02 14:11:20 -07:00
Rasmus Munk Larsen
13e93ca8b7
Fix CUDA build broken by changes to min and max reduction.
2016-09-02 13:41:36 -07:00
Benoit Steiner
6c05c3dd49
Fix the cxx11_tensor_cuda.cu test on 32bit platforms.
2016-09-02 11:12:16 -07:00
Benoit Steiner
039e225f7f
Added a test for nullary expressions on CUDA
...
Also check that we can mix 64 and 32 bit indices in the same compilation unit
2016-09-01 13:28:12 -07:00
Benoit Steiner
c53f783705
Updated the contraction code to support constant inputs.
2016-09-01 11:41:27 -07:00
Gael Guennebaud
46475eff9a
Adjust Tensor module wrt recent change in nullary functor
2016-09-01 13:40:45 +02:00
Gael Guennebaud
72a4d49315
Fix compilation with CUDA 8
2016-09-01 13:39:33 +02:00
Rasmus Munk Larsen
a1e092d1e8
Fix bugs to make min- and max reducers with correctly with IEEE infinities.
2016-08-31 15:04:16 -07:00
Gael Guennebaud
1f84f0d33a
merge EulerAngles module
2016-08-30 10:01:53 +02:00
Gael Guennebaud
e074f720c7
Include missing forward declaration of SparseMatrix
2016-08-29 18:56:46 +02:00
Gael Guennebaud
6cd7b9ea6b
Fix compilation with cuda 8
2016-08-29 11:06:08 +02:00
Gael Guennebaud
35a8e94577
bug #1167 : simplify installation of header files using cmake's install(DIRECTORY ...) command.
2016-08-29 10:59:37 +02:00
Gael Guennebaud
0f56b5a6de
enable vectorization path when testing half on cuda, and add test for log1p
2016-08-26 14:55:51 +02:00
Gael Guennebaud
965e595f02
Add missing log1p method
2016-08-26 14:55:00 +02:00
Benoit Steiner
34ae80179a
Use array_prod instead of calling TotalSize since TotalSize is only available on DSize.
2016-08-15 10:29:14 -07:00
Benoit Steiner
fe73648c98
Fixed a bug in the documentation.
2016-08-12 10:00:43 -07:00
Benoit Steiner
e3a8dfb02f
std::erfcf doesn't exist: use numext::erfc instead
2016-08-11 15:24:06 -07:00
Benoit Steiner
64e68cbe87
Don't attempt to optimize partial reductions when the optimized implementation doesn't buy anything.
2016-08-08 19:29:59 -07:00
Benoit Steiner
5eea1c7f97
Fixed cut and paste bug in debud message
2016-08-04 17:34:13 -07:00
Benoit Steiner
b50d8f8c4a
Extended a regression test to validate that we basic fp16 support works with cuda 7.0
2016-08-03 16:50:13 -07:00
Benoit Steiner
fad9828769
Deleted redundant regression test.
2016-08-03 16:08:37 -07:00
Benoit Steiner
ca2cee2739
Merged in ibab/eigen (pull request PR-206)
...
Expose real and imag methods on Tensors
2016-08-03 11:53:04 -07:00
Benoit Steiner
d92df04ce8
Cleaned up the new float16 test a bit
2016-08-03 11:50:07 -07:00
Benoit Steiner
81099ef482
Added a test for fp16
2016-08-03 11:41:17 -07:00
Benoit Steiner
a20b58845f
CUDA_ARCH isn't always defined, so avoid relying on it too much when figuring out which implementation to use for reductions. Instead rely on the device to tell us on which hardware version we're running.
2016-08-03 10:00:43 -07:00
Benoit Steiner
fd220dd8b0
Use numext::conj instead of std::conj
2016-08-01 18:16:16 -07:00
Benoit Steiner
e256acec7c
Avoid unecessary object copies
2016-08-01 17:03:39 -07:00
Benoit Steiner
2693fd54bf
bug #1266 : half implementation has been moved to half_impl namespace
2016-07-29 13:45:56 -07:00
Gael Guennebaud
cc2f6d68b1
bug #1264 : fix compilation
2016-07-27 23:30:47 +02:00
Gael Guennebaud
8972323c08
Big 1261: add missing max(ADS,ADS) overload (same for min)
2016-07-27 14:52:48 +02:00
Gael Guennebaud
5d94dc85e5
bug #1260 : add regression test
2016-07-27 14:38:30 +02:00
Gael Guennebaud
0d7039319c
bug #1260 : remove doubtful specializations of ScalarBinaryOpTraits
2016-07-27 14:35:52 +02:00
Benoit Steiner
3d3d34e442
Deleted dead code.
2016-07-25 08:53:37 -07:00
Gael Guennebaud
6d5daf32f5
bug #1255 : comment out broken and unsused line.
2016-07-25 14:48:30 +02:00
Gael Guennebaud
f9598d73b5
bug #1250 : fix pow() for AutoDiffScalar with custom nested scalar type.
2016-07-25 14:42:19 +02:00
Gael Guennebaud
fd1117f2be
Implement digits10 for mpreal
2016-07-25 14:38:55 +02:00
Gael Guennebaud
9908020d36
Add minimal support for Array<string>, and fix Tensor<string>
2016-07-25 14:25:56 +02:00
Benoit Steiner
c6b0de2c21
Improved partial reductions in more cases
2016-07-22 17:18:20 -07:00
Gael Guennebaud
32d95e86c9
merge
2016-07-22 16:43:12 +02:00
Gael Guennebaud
d7a0e52478
Fix testing of log nearby 1
2016-07-22 15:44:26 +02:00
Gael Guennebaud
7acf23c14c
Truely split unit test.
2016-07-22 15:41:23 +02:00
Gael Guennebaud
d075d122ea
Move half unit test from unsupported to main tests
2016-07-22 14:34:19 +02:00
Gael Guennebaud
0f350a8b7e
Fix CUDA compilation
2016-07-21 18:47:07 +02:00
Gael Guennebaud
82798162c0
Extend unit testing of half with ADL and arrays.
2016-07-21 15:47:21 +02:00
Yi Lin
7b4abc2b1d
Fixed a code comment error
2016-07-20 22:28:54 +08:00
Benoit Steiner
20f7ef2f89
An evalTo expression is only aligned iff both the lhs and the rhs are aligned.
2016-07-12 10:56:42 -07:00
Gael Guennebaud
c98bac2966
Manually add -stdd=c++11 to nvcc for old cmake versions
2016-07-12 09:29:18 +02:00
Benoit Steiner
40eb97516c
reverted unintended change.
2016-07-11 14:28:03 -07:00
Benoit Steiner
03b71c273e
Made the packetmath test compile again. A better fix would be to move the special function tests to the unsupported directory where the code now resides.
2016-07-11 13:50:24 -07:00
Benoit Steiner
3a2dd352ae
Improved the contraction mapper to properly support tensor products
2016-07-11 13:43:41 -07:00
Benoit Steiner
0bc020be9d
Improved the detection of packet size in the tensor scan evaluator.
2016-07-11 12:14:56 -07:00
Gael Guennebaud
a96a7ce3f7
Move CUDA's special functions to SpecialFunctions module.
2016-07-11 18:39:11 +02:00
Gael Guennebaud
fd60966310
merge
2016-07-11 18:11:47 +02:00
Gael Guennebaud
7d636349dc
Fix configuration of CUDA:
...
- preserve user defined CUDA_NVCC_FLAGS
- remove the -ansi flag that conflicts with -std=c++11
- do not add -std=c++11 if already there
2016-07-11 18:09:04 +02:00
Gael Guennebaud
131ee4bb8e
Split test_slice_in_expr which seems to be huge for visual
2016-07-11 11:46:55 +02:00
Gael Guennebaud
194daa3048
Fix assertion (it did not make sense for static_val types)
2016-07-11 11:39:27 +02:00
Gael Guennebaud
18c35747ce
Emulate _BitScanReverse64 for 32 bits builds
2016-07-11 11:38:04 +02:00
Gael Guennebaud
599f8ba617
Change runtime to compile-time conditional.
2016-07-08 11:39:43 +02:00
Gael Guennebaud
544935101a
Fix warnings
2016-07-08 11:38:52 +02:00
Gael Guennebaud
59bf2774a3
Fix warnings
2016-07-08 11:38:11 +02:00
Gael Guennebaud
2f7e2614e7
bug #1232 : refactor special functions as a new SpecialFunctions module, currently in unsupported/.
2016-07-08 11:13:55 +02:00
Gael Guennebaud
8b7431d8fd
fix compilation with c++11
2016-07-07 15:18:23 +02:00
Gael Guennebaud
69378eed0b
Split huge unit test
2016-07-07 15:18:04 +02:00
Gael Guennebaud
179ebb88f9
Fix warning
2016-07-07 09:16:40 +02:00
Gael Guennebaud
5d2dada197
Fix warnings
2016-07-07 09:05:15 +02:00
Gael Guennebaud
f5e780fb05
split huge unit test
2016-07-07 08:59:59 +02:00
Gael Guennebaud
ce9fc0ce14
fix clang compilation
2016-07-04 12:59:02 +02:00
Gael Guennebaud
440020474c
Workaround compilation issue with msvc
2016-07-04 12:49:19 +02:00
Igor Babuschkin
78f37ca03c
Expose real and imag methods on Tensors
2016-07-01 17:34:31 +01:00
Benoit Steiner
cb2d8b8fa6
Made it possible to compile reductions for an old cuda architecture and run them on a recent gpu.
2016-06-29 15:42:01 -07:00
Benoit Steiner
b2a47641ce
Made the code compile when using CUDA architecture < 300
2016-06-29 15:32:47 -07:00
Igor Babuschkin
85699850d9
Add missing CUDA kernel to tensor scan op
...
The TensorScanOp implementation was missing a CUDA kernel launch.
This adds a simple placeholder implementation.
2016-06-29 11:54:35 +01:00
Benoit Steiner
1a9f92e781
Added a test to validate the tensor scan evaluation on GPU. The test is currently disabled since the code segfaults.
2016-06-27 16:02:52 -07:00
Benoit Steiner
75c333f94c
Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.
...
Also avoid taking references to values that may becomes stale after a copy construction.
2016-06-27 10:32:38 -07:00
Benoit Steiner
7944d4431f
Made the cost model cwiseMax and cwiseMin methods consts to help the PowerPC cuda compiler compile this code.
2016-08-18 13:46:36 -07:00
Benoit Steiner
647a51b426
Force the inlining of a simple accessor.
2016-08-18 12:31:02 -07:00
Benoit Steiner
a452dedb4f
Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
...
Enable efficient Tensor reduction for doubles on the GPU (continued)
2016-08-18 12:29:54 -07:00