Benoit Steiner
|
7944d4431f
|
Made the cost model cwiseMax and cwiseMin methods const to help the PowerPC cuda compiler compile this code.
|
2016-08-18 13:46:36 -07:00 |
|
Benoit Steiner
|
647a51b426
|
Force the inlining of a simple accessor.
|
2016-08-18 12:31:02 -07:00 |
|
Benoit Steiner
|
a452dedb4f
|
Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
Enable efficient Tensor reduction for doubles on the GPU (continued)
|
2016-08-18 12:29:54 -07:00 |
|
Igor Babuschkin
|
18c67df31c
|
Fix remaining CUDA >= 300 checks
|
2016-08-18 17:18:30 +01:00 |
|
Igor Babuschkin
|
1569a7d7ab
|
Add the necessary CUDA >= 300 checks back
|
2016-08-18 17:15:12 +01:00 |
|
Benoit Steiner
|
2b17f34574
|
Properly detect the type of the result of a contraction.
|
2016-08-16 16:00:30 -07:00 |
|
Benoit Steiner
|
34ae80179a
|
Use array_prod instead of calling TotalSize since TotalSize is only available on DSize.
|
2016-08-15 10:29:14 -07:00 |
|
Benoit Steiner
|
fe73648c98
|
Fixed a bug in the documentation.
|
2016-08-12 10:00:43 -07:00 |
|
Benoit Steiner
|
e3a8dfb02f
|
std::erfcf doesn't exist: use numext::erfc instead
|
2016-08-11 15:24:06 -07:00 |
|
Benoit Steiner
|
64e68cbe87
|
Don't attempt to optimize partial reductions when the optimized implementation doesn't buy anything.
|
2016-08-08 19:29:59 -07:00 |
|
Igor Babuschkin
|
841e075154
|
Remove CUDA >= 300 checks and enable outer reduction for doubles
|
2016-08-06 18:07:50 +01:00 |
|
Igor Babuschkin
|
0425118e2a
|
Merge upstream changes
|
2016-08-05 14:34:57 +01:00 |
|
Igor Babuschkin
|
9537e8b118
|
Make use of atomicExch for atomicExchCustom
|
2016-08-05 14:29:58 +01:00 |
|
Benoit Steiner
|
5eea1c7f97
|
Fixed cut and paste bug in debug message
|
2016-08-04 17:34:13 -07:00 |
|
Benoit Steiner
|
b50d8f8c4a
|
Extended a regression test to validate that basic fp16 support works with cuda 7.0
|
2016-08-03 16:50:13 -07:00 |
|
Benoit Steiner
|
fad9828769
|
Deleted redundant regression test.
|
2016-08-03 16:08:37 -07:00 |
|
Benoit Steiner
|
ca2cee2739
|
Merged in ibab/eigen (pull request PR-206)
Expose real and imag methods on Tensors
|
2016-08-03 11:53:04 -07:00 |
|
Benoit Steiner
|
d92df04ce8
|
Cleaned up the new float16 test a bit
|
2016-08-03 11:50:07 -07:00 |
|
Benoit Steiner
|
81099ef482
|
Added a test for fp16
|
2016-08-03 11:41:17 -07:00 |
|
Benoit Steiner
|
a20b58845f
|
CUDA_ARCH isn't always defined, so avoid relying on it too much when figuring out which implementation to use for reductions. Instead rely on the device to tell us on which hardware version we're running.
|
2016-08-03 10:00:43 -07:00 |
|
Benoit Steiner
|
fd220dd8b0
|
Use numext::conj instead of std::conj
|
2016-08-01 18:16:16 -07:00 |
|
Benoit Steiner
|
e256acec7c
|
Avoid unnecessary object copies
|
2016-08-01 17:03:39 -07:00 |
|
Benoit Steiner
|
2693fd54bf
|
bug #1266: half implementation has been moved to half_impl namespace
|
2016-07-29 13:45:56 -07:00 |
|
Gael Guennebaud
|
cc2f6d68b1
|
bug #1264: fix compilation
|
2016-07-27 23:30:47 +02:00 |
|
Gael Guennebaud
|
8972323c08
|
bug #1261: add missing max(ADS,ADS) overload (same for min)
|
2016-07-27 14:52:48 +02:00 |
|
Gael Guennebaud
|
5d94dc85e5
|
bug #1260: add regression test
|
2016-07-27 14:38:30 +02:00 |
|
Gael Guennebaud
|
0d7039319c
|
bug #1260: remove doubtful specializations of ScalarBinaryOpTraits
|
2016-07-27 14:35:52 +02:00 |
|
Benoit Steiner
|
3d3d34e442
|
Deleted dead code.
|
2016-07-25 08:53:37 -07:00 |
|
Gael Guennebaud
|
6d5daf32f5
|
bug #1255: comment out broken and unused line.
|
2016-07-25 14:48:30 +02:00 |
|
Gael Guennebaud
|
f9598d73b5
|
bug #1250: fix pow() for AutoDiffScalar with custom nested scalar type.
|
2016-07-25 14:42:19 +02:00 |
|
Gael Guennebaud
|
fd1117f2be
|
Implement digits10 for mpreal
|
2016-07-25 14:38:55 +02:00 |
|
Gael Guennebaud
|
9908020d36
|
Add minimal support for Array<string>, and fix Tensor<string>
|
2016-07-25 14:25:56 +02:00 |
|
Benoit Steiner
|
c6b0de2c21
|
Improved partial reductions in more cases
|
2016-07-22 17:18:20 -07:00 |
|
Gael Guennebaud
|
32d95e86c9
|
merge
|
2016-07-22 16:43:12 +02:00 |
|
Gael Guennebaud
|
d7a0e52478
|
Fix testing of log near 1
|
2016-07-22 15:44:26 +02:00 |
|
Gael Guennebaud
|
7acf23c14c
|
Truly split unit test.
|
2016-07-22 15:41:23 +02:00 |
|
Gael Guennebaud
|
d075d122ea
|
Move half unit test from unsupported to main tests
|
2016-07-22 14:34:19 +02:00 |
|
Gael Guennebaud
|
0f350a8b7e
|
Fix CUDA compilation
|
2016-07-21 18:47:07 +02:00 |
|
Gael Guennebaud
|
82798162c0
|
Extend unit testing of half with ADL and arrays.
|
2016-07-21 15:47:21 +02:00 |
|
Yi Lin
|
7b4abc2b1d
|
Fixed a code comment error
|
2016-07-20 22:28:54 +08:00 |
|
Benoit Steiner
|
20f7ef2f89
|
An evalTo expression is aligned iff both the lhs and the rhs are aligned.
|
2016-07-12 10:56:42 -07:00 |
|
Gael Guennebaud
|
c98bac2966
|
Manually add -std=c++11 to nvcc for old cmake versions
|
2016-07-12 09:29:18 +02:00 |
|
Benoit Steiner
|
40eb97516c
|
reverted unintended change.
|
2016-07-11 14:28:03 -07:00 |
|
Benoit Steiner
|
03b71c273e
|
Made the packetmath test compile again. A better fix would be to move the special function tests to the unsupported directory where the code now resides.
|
2016-07-11 13:50:24 -07:00 |
|
Benoit Steiner
|
3a2dd352ae
|
Improved the contraction mapper to properly support tensor products
|
2016-07-11 13:43:41 -07:00 |
|
Benoit Steiner
|
0bc020be9d
|
Improved the detection of packet size in the tensor scan evaluator.
|
2016-07-11 12:14:56 -07:00 |
|
Gael Guennebaud
|
a96a7ce3f7
|
Move CUDA's special functions to SpecialFunctions module.
|
2016-07-11 18:39:11 +02:00 |
|
Gael Guennebaud
|
fd60966310
|
merge
|
2016-07-11 18:11:47 +02:00 |
|
Gael Guennebaud
|
7d636349dc
|
Fix configuration of CUDA:
- preserve user defined CUDA_NVCC_FLAGS
- remove the -ansi flag that conflicts with -std=c++11
- do not add -std=c++11 if already there
|
2016-07-11 18:09:04 +02:00 |
|
Gael Guennebaud
|
131ee4bb8e
|
Split test_slice_in_expr, which seems to be too large for Visual Studio
|
2016-07-11 11:46:55 +02:00 |
|
Gael Guennebaud
|
194daa3048
|
Fix assertion (it did not make sense for static_val types)
|
2016-07-11 11:39:27 +02:00 |
|
Gael Guennebaud
|
18c35747ce
|
Emulate _BitScanReverse64 for 32 bits builds
|
2016-07-11 11:38:04 +02:00 |
|
Gael Guennebaud
|
599f8ba617
|
Change runtime to compile-time conditional.
|
2016-07-08 11:39:43 +02:00 |
|
Gael Guennebaud
|
544935101a
|
Fix warnings
|
2016-07-08 11:38:52 +02:00 |
|
Gael Guennebaud
|
59bf2774a3
|
Fix warnings
|
2016-07-08 11:38:11 +02:00 |
|
Gael Guennebaud
|
2f7e2614e7
|
bug #1232: refactor special functions as a new SpecialFunctions module, currently in unsupported/.
|
2016-07-08 11:13:55 +02:00 |
|
Gael Guennebaud
|
8b7431d8fd
|
fix compilation with c++11
|
2016-07-07 15:18:23 +02:00 |
|
Gael Guennebaud
|
69378eed0b
|
Split huge unit test
|
2016-07-07 15:18:04 +02:00 |
|
Gael Guennebaud
|
179ebb88f9
|
Fix warning
|
2016-07-07 09:16:40 +02:00 |
|
Gael Guennebaud
|
5d2dada197
|
Fix warnings
|
2016-07-07 09:05:15 +02:00 |
|
Gael Guennebaud
|
f5e780fb05
|
split huge unit test
|
2016-07-07 08:59:59 +02:00 |
|
Gael Guennebaud
|
ce9fc0ce14
|
fix clang compilation
|
2016-07-04 12:59:02 +02:00 |
|
Gael Guennebaud
|
440020474c
|
Workaround compilation issue with msvc
|
2016-07-04 12:49:19 +02:00 |
|
Igor Babuschkin
|
eeb0d880ee
|
Enable efficient Tensor reduction for doubles
|
2016-07-01 19:08:26 +01:00 |
|
Igor Babuschkin
|
78f37ca03c
|
Expose real and imag methods on Tensors
|
2016-07-01 17:34:31 +01:00 |
|
Benoit Steiner
|
cb2d8b8fa6
|
Made it possible to compile reductions for an old cuda architecture and run them on a recent gpu.
|
2016-06-29 15:42:01 -07:00 |
|
Benoit Steiner
|
b2a47641ce
|
Made the code compile when using CUDA architecture < 300
|
2016-06-29 15:32:47 -07:00 |
|
Igor Babuschkin
|
85699850d9
|
Add missing CUDA kernel to tensor scan op
The TensorScanOp implementation was missing a CUDA kernel launch.
This adds a simple placeholder implementation.
|
2016-06-29 11:54:35 +01:00 |
|
Benoit Steiner
|
1a9f92e781
|
Added a test to validate the tensor scan evaluation on GPU. The test is currently disabled since the code segfaults.
|
2016-06-27 16:02:52 -07:00 |
|
Benoit Steiner
|
75c333f94c
|
Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.
Also avoid taking references to values that may become stale after a copy construction.
|
2016-06-27 10:32:38 -07:00 |
|
Gael Guennebaud
|
cfff370549
|
Fix hyperbolic functions for autodiff.
|
2016-06-24 23:21:35 +02:00 |
|
Gael Guennebaud
|
3852351793
|
merge pull request 198
|
2016-06-24 11:48:17 +02:00 |
|
Gael Guennebaud
|
6dd9077070
|
Fix some unused typedef warnings.
|
2016-06-24 11:34:21 +02:00 |
|
Gael Guennebaud
|
ce90647fa5
|
Fix NumTraits<AutoDiff>
|
2016-06-24 11:34:02 +02:00 |
|
Gael Guennebaud
|
fa39f81b48
|
Fix instantiation of ScalarBinaryOpTraits for AutoDiff.
|
2016-06-24 11:33:30 +02:00 |
|
Rasmus Munk Larsen
|
a9c1e4d7b7
|
Return -1 from CurrentThreadId when called by a thread outside the pool.
|
2016-06-23 16:40:07 -07:00 |
|
Rasmus Munk Larsen
|
d39df320d2
|
Resolve merge.
|
2016-06-23 15:08:03 -07:00 |
|
Gael Guennebaud
|
361dbd246d
|
Add unit test for printing empty tensors
|
2016-06-23 18:54:30 +02:00 |
|
Gael Guennebaud
|
360a743a10
|
bug #1241: do not emit anything for empty tensors
|
2016-06-23 18:47:31 +02:00 |
|
Gael Guennebaud
|
7c6561485a
|
merge PR 194
|
2016-06-23 15:29:57 +02:00 |
|
Benoit Steiner
|
a29a2cb4ff
|
Silenced a couple of compilation warnings generated by xcode
|
2016-06-22 16:43:02 -07:00 |
|
Benoit Steiner
|
f8fcd6b32d
|
Turned the constructor of the PerThread struct into what is effectively a constant expression to make the code compatible with a wider range of compilers
|
2016-06-22 16:03:11 -07:00 |
|
Benoit Steiner
|
c58df31747
|
Handle empty tensors in the print functions
|
2016-06-21 09:22:43 -07:00 |
|
Benoit Steiner
|
de32f8d656
|
Fixed the printing of rank-0 tensors
|
2016-06-20 10:46:45 -07:00 |
|
Tal Hadad
|
8e198d6835
|
Complete docs and add ostream operator for EulerAngles.
|
2016-06-19 20:42:45 +03:00 |
|
Geoffrey Lalonde
|
72c95383e0
|
Add autodiff coverage for standard library hyperbolic functions, and tests.
* * *
Corrected tanh derivative, moved test definitions.
* * *
Added more test cases, removed lingering lines
|
2016-06-15 23:33:19 -07:00 |
|
Benoit Steiner
|
7d495d890a
|
Merged in ibab/eigen (pull request PR-197)
Implement exclusive scan option for Tensor library
|
2016-06-14 17:54:59 -07:00 |
|
Benoit Steiner
|
aedc5be1d6
|
Avoid generating pseudo random numbers that are multiples of 5: this helps
spread the load over multiple cpus without having to rely on work stealing.
|
2016-06-14 17:51:47 -07:00 |
|
Igor Babuschkin
|
c4d10e921f
|
Implement exclusive scan option
|
2016-06-14 19:44:07 +01:00 |
|
Gael Guennebaud
|
76236cdea4
|
merge
|
2016-06-14 15:33:47 +02:00 |
|
Gael Guennebaud
|
62134082aa
|
Update AutoDiffScalar wrt scalar-multiple.
|
2016-06-14 15:06:35 +02:00 |
|
Gael Guennebaud
|
5d38203735
|
Update Tensor module to use bind1st_op and bind2nd_op
|
2016-06-14 15:06:03 +02:00 |
|
Gael Guennebaud
|
f925dba3d9
|
Fix compilation of BVH example
|
2016-06-14 11:32:09 +02:00 |
|
Tal Hadad
|
6edfe8771b
|
Little bit docs
|
2016-06-13 22:03:19 +03:00 |
|
Tal Hadad
|
6e1c086593
|
Add static assertion
|
2016-06-13 21:55:17 +03:00 |
|
Gael Guennebaud
|
3c12e24164
|
Add bind1st_op and bind2nd_op helpers to turn binary functors into unary ones, and implement scalar_multiple2 and scalar_quotient2 on top of them.
|
2016-06-13 16:18:59 +02:00 |
|
Tal Hadad
|
06206482d9
|
More docs, and minor code fixes
|
2016-06-12 23:40:17 +03:00 |
|
Benoit Steiner
|
65d33e5898
|
Merged in ibab/eigen (pull request PR-195)
Add small fixes to TensorScanOp
|
2016-06-10 19:31:17 -07:00 |
|
Benoit Steiner
|
a05607875a
|
Don't refer to the half2 type unless it's been defined
|
2016-06-10 11:53:56 -07:00 |
|
Igor Babuschkin
|
86aedc9282
|
Add small fixes to TensorScanOp
|
2016-06-07 20:06:38 +01:00 |
|