eigen/tvmet-1.7.1/doc/compiler.dox
Benoit Jacob 3036eeca0a Starting Eigen 2 development. The current plan is to use the last
release of tvmet (inactive for 2 years and developer unreachable) as the
basis for eigen2, because it provides seemingly good expression template
mechanisms, we want that, and it would take years to reinvent that
wheel. We'll see. So this commit imports the last tvmet release.
2007-05-30 06:24:51 +00:00

/*
* $Id: compiler.dox,v 1.24 2005/03/09 12:05:19 opetzold Exp $
*/
/**
\page compiler Compiler Support
<p>Contents:</p>
- \ref requirements
- \ref gcc
- \ref kcc
- \ref pgCC
- \ref intel
- \ref vc71
\section requirements General Compiler Requirements
This library is designed for portability; no compiler-specific extensions
are used. Nevertheless, there are a few requirements, all of which are part
of the C++ standard:
- Support for the <tt>mutable</tt> keyword is required. This is used by the
CommaInitializer only.
- The <tt>typename</tt> keyword is used extensively here.
- The namespace concept is required. The tvmet library itself lives in a
namespace. To avoid collisions of operators, there is also an element_wise
namespace within tvmet.
- Partial specialization is needed for the extrema functions min and max
to distinguish between vectors and matrices. This allows tvmet to return
an object with a specific behavior. (The location of an extremum in a
matrix has a (row, column) position whereas a vector extremum has only a
single index for its position).
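As an illustration of the last requirement, here is a minimal sketch of how
partial specialization can give a vector extremum and a matrix extremum
different result types. It is not tvmet's implementation: <tt>Vec</tt>,
<tt>Mat</tt>, <tt>MaxLocator</tt> and <tt>maximum_sketch</tt> are
hypothetical names invented for this example.

```cpp
#include <cstddef>

// Hypothetical stand-ins for fixed-size containers; not tvmet's classes.
template <class T, std::size_t N>
struct Vec { T data[N]; };

template <class T, std::size_t R, std::size_t C>
struct Mat { T data[R][C]; };

// Primary template, left undefined: only the specializations are usable.
template <class Container>
struct MaxLocator;

// Partial specialization for vectors: the position is a single index.
template <class T, std::size_t N>
struct MaxLocator< Vec<T, N> > {
  struct Result { T value; std::size_t index; };
  static Result find(const Vec<T, N>& v) {
    Result r = { v.data[0], 0 };
    for (std::size_t i = 1; i < N; ++i)
      if (v.data[i] > r.value) { r.value = v.data[i]; r.index = i; }
    return r;
  }
};

// Partial specialization for matrices: the position is a (row, col) pair.
template <class T, std::size_t R, std::size_t C>
struct MaxLocator< Mat<T, R, C> > {
  struct Result { T value; std::size_t row; std::size_t col; };
  static Result find(const Mat<T, R, C>& m) {
    Result r = { m.data[0][0], 0, 0 };
    for (std::size_t i = 0; i < R; ++i)
      for (std::size_t j = 0; j < C; ++j)
        if (m.data[i][j] > r.value) { r.value = m.data[i][j]; r.row = i; r.col = j; }
    return r;
  }
};

// One entry point for both kinds; note the typename keyword, which is
// needed because Result is a dependent type.
template <class Container>
typename MaxLocator<Container>::Result maximum_sketch(const Container& c) {
  return MaxLocator<Container>::find(c);
}
```

Calling <tt>maximum_sketch</tt> on a <tt>Vec</tt> yields a result with an
<tt>index</tt> member, whereas calling it on a <tt>Mat</tt> yields a result
with <tt>row</tt> and <tt>col</tt> members; the compiler picks the variant
from the argument type alone.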
\section gcc The GNU Compiler Collection
The <a href=http://gcc.gnu.org>GNU compiler</a> collection is the main compiler used for
developing this library. It also compiles the library the fastest.
\subsection gcc2953 GNU C++ Compiler v2.95.3
Gcc v2.95.3 is the last official release of the version 2 series from gnu.org.
Since this compiler meets the \ref requirements, it does work, but only
partially.
There are certain difficulties - see \ref ambiguous_overload (also, please
read about \ref gcc296). Furthermore, there are problems with functions and
operators declared in the namespace <code>element_wise</code> - the
compiler doesn't seem to find them--even though the compiler does know
about namespace tvmet. It appears to be a problem with nested namespaces and
the compiler's ability to perform function/operator lookup, especially during
regression tests: <code> matrix /= matrix </code> compiles inside a single
file but not in the regression tests, which is contradictory.
Porting to gcc v2.95.3 requires a lot of knowledge and effort--unfortunately,
I don't have enough of either. The examples do compile and the regression
tests build partially.
Matrix and vector operators are working, but don't expect too much.
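For readers unfamiliar with the lookup rules involved, the following sketch
shows how an operator declared in a nested namespace is found on a
standard-conforming compiler. It says nothing about what exactly goes wrong
in gcc v2.95.3. All names (<tt>tvmet_sketch</tt>, <tt>Matrix2</tt>,
<tt>divide_in_place</tt>) are hypothetical and only mimic the layout of a
nested <tt>element_wise</tt> namespace.

```cpp
namespace tvmet_sketch {            // hypothetical stand-in for namespace tvmet

struct Matrix2 {                    // hypothetical 2x2 matrix, not tvmet's Matrix
  double m[4];
};

namespace element_wise {

// Element-wise division, deliberately kept out of the enclosing namespace.
inline Matrix2& operator/=(Matrix2& lhs, const Matrix2& rhs) {
  for (int i = 0; i < 4; ++i) lhs.m[i] /= rhs.m[i];
  return lhs;
}

} // namespace element_wise
} // namespace tvmet_sketch

// Argument-dependent lookup searches tvmet_sketch (where Matrix2 lives),
// but not the nested element_wise namespace, so the operator has to be
// made visible explicitly at the point of use.
inline void divide_in_place(tvmet_sketch::Matrix2& a,
                            const tvmet_sketch::Matrix2& b) {
  using namespace tvmet_sketch::element_wise;
  a /= b;
}
```

Without the using-directive, a conforming compiler rejects <tt>a /= b</tt>
because no matching operator is visible; with it, the call resolves to the
element-wise version.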
\subsection gcc296 GNU C++ Compiler v2.96 (Rh7.x, MD8.x)
This compiler isn't an official release of the GNU Compiler group but was shipped
by <a href=http://www.redhat.com>Red Hat</a> and Co.
Blitz++ uses a hasFastAccess() flag to choose between
_bz_meta_vecAssign::fastAssign (without bounds checking) and
_bz_meta_vecAssign::assign (with bounds checking). This
isn't really necessary for operations on blitz::TinyVector, since the flag is
always true. Nevertheless, it matters for the asm code produced by
gcc-c++-2.96-0.48mdk. Because of this, the code generated for Blitz++ by
gcc-2.96 is generally better than the code generated for tvmet (tested!).
I got into trouble with stl_relops.h where miscellaneous operators are defined.
Simply defining __SGI_STL_INTERNAL_RELOPS in the config header doesn't solve
the problem; only the version with the header commented out does, see \ref
ambiguous_overload. Because of this problem, the regression tests don't
compile with this version. Projects which do not use the relational
operators are not affected.
It seems that the inlining performed by this compiler collection isn't very
smart. I got a lot of warnings: can't inline call to ... So, it would be
best to use the \ref gcc30x and later compilers.
\subsection gcc30x GNU C++ Compiler v3.0.x
These compilers produce better code than the \ref gcc296! Even the problems
with blitz++ fastAssign have vanished. And this compiler conforms to the
standard. The regression tests compile and run successfully.
Due to the nature of ET and MT there is a need for a high level of inlining.
The v3.0.x seems to do this well as compared to the v2.9x compilers which
produce inline warnings.
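To see why inlining matters so much, consider the following minimal
expression template sketch. It is not tvmet code; <tt>Vec3</tt> and
<tt>AddExpr</tt> are hypothetical names. Every element of the result is
obtained through a chain of small nested <tt>operator[]</tt> calls, and only
if the compiler inlines all of them does the assignment collapse into a
plain loop over doubles.

```cpp
#include <cstddef>

// Hypothetical minimal expression template; tvmet's real classes differ.

// Node representing "lhs + rhs". It stores references and computes nothing
// until an element is requested.
template <class L, class R>
struct AddExpr {
  const L& lhs;
  const R& rhs;
  AddExpr(const L& l, const R& r) : lhs(l), rhs(r) {}
  double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

struct Vec3 {
  double d[3];
  double operator[](std::size_t i) const { return d[i]; }

  // Assignment from an expression: one pass, no temporary vectors.
  template <class L, class R>
  Vec3& operator=(const AddExpr<L, R>& e) {
    for (std::size_t i = 0; i < 3; ++i) d[i] = e[i];
    return *this;
  }
};

// vector + vector builds a node instead of a result vector.
inline AddExpr<Vec3, Vec3> operator+(const Vec3& a, const Vec3& b) {
  return AddExpr<Vec3, Vec3>(a, b);
}

// expression + vector nests the existing node inside a new one.
template <class L, class R>
inline AddExpr<AddExpr<L, R>, Vec3> operator+(const AddExpr<L, R>& a,
                                              const Vec3& b) {
  return AddExpr<AddExpr<L, R>, Vec3>(a, b);
}
```

For <tt>r = a + b + c</tt>, each <tt>e[i]</tt> in the assignment calls the
outer node's <tt>operator[]</tt>, which calls the inner node's, which calls
<tt>Vec3::operator[]</tt> twice. A compiler that gives up on inlining part
way leaves real function calls inside the innermost loop.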
This compiler works great with the
<a href=http://www.stlport.org>STLPort-4.5.3</a>
implementation of the STL/C++ Library, Tiny Vector and Matrix template library
and <a href=http://cppunit.sourceforge.net>cpp-unit</a>.
\subsection gcc31x GNU C++ Compiler v3.1
%tvmet does compile with this new GNU C++ compiler. The produced code looks
as good as the code created by \ref gcc30x. (Does anyone have time to make
a benchmark?)
The primary goal is conformance to the standard ISO/IEC 14882:1998.
\subsection gcc32x GNU C++ Compiler v3.2.x
The once again changed Application Binary Interface (ABI) doesn't affect
tvmet since it isn't a binary library--it's only compiled templates inside
the client code.
There are some problems with the GNU C++ compiler collection on the
regression test due to some bugs (IMO), \sa \ref regressiontest_failed.
\subsection gcc33x GNU C++ Compiler v3.3
Tested and works fine. There are only some warnings about failed inlining,
which don't concern tvmet directly.
Anyway, here is the code from <tt>examples/ray.cc</tt> on gcc 3.3.3 using
<tt>-O2 -DTVMET_OPTIMIZE</tt>
\par Assembler (IA-32 Intel® Architecture):
\code
movl 16(%ebp), %edx
movl 12(%ebp), %ebx
movl 8(%ebp), %esi
fldl 8(%edx)
fldl 16(%edx)
fmull 16(%ebx)
fxch %st(1)
movl %ebx, -24(%ebp)
fmull 8(%ebx)
movl %edx, -32(%ebp)
fldl (%edx)
fmull (%ebx)
fxch %st(1)
movl %edx, -60(%ebp)
movl %edx, -12(%ebp)
faddp %st, %st(2)
faddp %st, %st(1)
fadd %st(0), %st
fstpl -56(%ebp)
movl -56(%ebp), %ecx
movl -52(%ebp), %eax
movl %ecx, -20(%ebp)
movl %eax, -16(%ebp)
movl %ecx, -40(%ebp)
movl %eax, -36(%ebp)
movl %ecx, -68(%ebp)
movl %eax, -64(%ebp)
fldl (%edx)
fmull -20(%ebp)
fsubrl (%ebx)
fstpl (%esi)
fldl 8(%edx)
fmull -20(%ebp)
fsubrl 8(%ebx)
fstpl 8(%esi)
fldl 16(%edx)
fmull -20(%ebp)
fsubrl 16(%ebx)
fstpl 16(%esi)
addl $64, %esp
popl %ebx
popl %esi
popl %ebp
ret
\endcode
\subsection gcc34x GNU C++ Compiler v3.4.x
The compiler 3.4.3 works fine, starting with tvmet release 1.7.1. The problem
in earlier releases was the correct syntax for the CommaInitializer template
declaration and implementation.
There is no assembler output for our <tt>examples/ray.cc</tt>, since I don't
have this compiler yet (yes, I need to update my Linux system ;-) ).
\section kcc Kai C++
This has not been tested. Unfortunately Kai's compiler is no longer shipped
-- one should use the Intel compiler instead
(see <a href=http://www.kai.com>here</a>).
If you have used it successfully, including regression and/or benchmark tests,
please let me know.
\section pgCC Portland Group Compiler Technology
\subsection pgCC32 Portland Group C++ 3.2
The <a href=http://www.pgroup.com>Portland Group</a> C++ compiler is shipped
with the RogueWave Standard C++ Library which provides conformance to the
standard. Unfortunately, the &lt;cname&gt; C library wrapper headers and the C++
overloads of the math functions are not provided on all platforms, see
<http://www.cug.com/roundup>. The download evaluation version 3.2-4 for
Linux is affected, for example. At first glance, tvmet does compile with pgCC
since it has the great <a href=http://www.edg.com>EDG</a> front-end.
Maybe there is a solution with other standard library implementations like
<a href=http://www.stlport.org>STLPort</a> (on a quick try, STLPort
didn't recognize pgCC). If you know more about this, please let me know.
Anyway, the code produced is very poor even if I use high inlining levels
like the command line option -Minline=levels:100 which increases the compile
time dramatically! The benchmark tests have not been done. Unfortunately,
my trial period has expired, so I have no idea whether this compiler passes
the regression tests.
\subsection pgCC51 Portland Group C++ 5.1
The <a href=http://www.pgroup.com>Portland Group</a> C++ compiler is shipped
with the <a href=http://www.stlport.org>STLport</a> Standard C++ Library, cool!
The code produced isn't very compact compared with the Intel or GNU compilers.
Anyway, it works, but the compile time increases dramatically
at higher inline levels.
\section intel Intel Compiler
\subsection icc5 Intel Compiler v5.0.1
This compiler complains even more than gcc-3.0.x regarding template
specifiers (e.g. correct spaces for template arguments to std::complex are
needed even when not instanced).
The produced code looks good, but I haven't done a benchmark to compare it
with gcc-3.0.x, since the compile time for the benchmark test increases
dramatically.
I have not run any regression tests due to the compile time needed by my
AMD K6/400 Linux box ...
\subsection icc6 Intel Compiler v6.0.x
Should work, but I haven't tested it.
\subsection icc7 Intel Compiler v7.x
This compiler is well supported by tvmet and passes the regression tests
without any failure - as opposed to the GNU C++ compiler collection.
\subsection icc8 Intel Compiler v8.x
No regression tests have been done; reports are welcome. I'm not expecting
problems. However, this version implements the IEEE math functions isnan and
isinf as pure macros, which prevents overriding them with tvmet's own
functions. Therefore these functions are disabled after tvmet release 1.4.1.
Even for <tt>examples/ray.cc</tt>, the code produced is more compact than
that of \ref gcc33x.
Anyway, here is the code from <tt>examples/ray.cc</tt> using
<tt>-O2 -DTVMET_OPTIMIZE</tt>
\par Assembler (IA-32 Intel® Architecture):
\code
movl 4(%esp), %ecx
movl 8(%esp), %edx
movl 12(%esp), %eax
fldl (%edx)
fmull (%eax)
fldl 8(%edx)
fmull 8(%eax)
fldl 16(%edx)
fmull 16(%eax)
faddp %st, %st(1)
faddp %st, %st(1)
fldl (%eax)
fxch %st(1)
fadd %st(0), %st
fmul %st, %st(1)
fxch %st(1)
fsubrl (%edx)
fstpl (%ecx)
fldl 8(%eax)
fmul %st(1), %st
fsubrl 8(%edx)
fstpl 8(%ecx)
fldl 16(%eax)
fmulp %st, %st(1)
fsubrl 16(%edx)
fstpl 16(%ecx)
ret
\endcode
\section vc71 Microsoft Visual C++ v7.1
\htmlonly
<script language="JavaScript">
var m_name="blbounnejapny";
var m_domain="hotmail.com";
var m_text='<a href="mailto:'+m_name+'@'+m_domain+'?subject=tvmet and Microsoft VC++">';
m_text+='Robi Carnecky</a>';
document.write(m_text);
</script>
\endhtmlonly
\latexonly
Robi Carnecky <blbounnejapny@hotmail.com>
\endlatexonly
has reported success with tvmet using Visual C++ v7.1. In this
release of tvmet there are some warnings left; the work is in
progress.
The <a href="http://msdn.microsoft.com/visualc/vctoolkit2003/">Microsoft Visual C++ Toolkit 2003</a>
and Visual C++ versions prior to 7.1 do not compile tvmet; unfortunately, you
will get an undefined internal error.
Anyway, here is the code from <tt>examples/ray.cc</tt>:
\par Assembler (IA-32 Intel® Architecture, no SSE2):
\code
push ebp
mov ebp, esp
and esp, -8 ; fffffff8H
sub esp, 28 ; 0000001cH
mov eax, DWORD PTR _ray$[ebp]
mov ecx, DWORD PTR _surfaceNormal$[ebp]
fld QWORD PTR [eax+16]
fmul QWORD PTR [ecx+16]
push ebx
fld QWORD PTR [eax+8]
push esi
fmul QWORD PTR [ecx+8]
push edi
mov edi, DWORD PTR $T35206[esp+52]
faddp ST(1), ST(0)
mov DWORD PTR $T35027[esp+60], edi
fld QWORD PTR [eax]
pop edi
fmul QWORD PTR [ecx]
faddp ST(1), ST(0)
fadd ST(0), ST(0)
fstp QWORD PTR $T35206[esp+36]
mov esi, DWORD PTR $T35206[esp+40]
mov edx, DWORD PTR $T35206[esp+36]
mov ebx, DWORD PTR $T35265[esp+40]
mov DWORD PTR $T35027[esp+44], edx
mov edx, DWORD PTR _reflection$[ebp]
mov DWORD PTR $T35027[esp+48], esi
fld QWORD PTR $T35027[esp+44]
fmul QWORD PTR [ecx]
pop esi
mov DWORD PTR $T35027[esp+36], ebx
pop ebx
fsubr QWORD PTR [eax]
fstp QWORD PTR [edx]
fld QWORD PTR $T35027[esp+36]
fmul QWORD PTR [ecx+8]
fsubr QWORD PTR [eax+8]
fstp QWORD PTR [edx+8]
fld QWORD PTR $T35027[esp+36]
fmul QWORD PTR [ecx+16]
fsubr QWORD PTR [eax+16]
fstp QWORD PTR [edx+16]
mov esp, ebp
pop ebp
\endcode
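For orientation when reading the listings: going by the symbol names in the
Visual C++ output (<tt>_ray$</tt>, <tt>_surfaceNormal$</tt>,
<tt>_reflection$</tt>) and the sequence of floating point operations, the
compiled function appears to compute a mirror reflection,
reflection = ray - 2 * dot(ray, surfaceNormal) * surfaceNormal. The source of
<tt>examples/ray.cc</tt> is not reproduced here. The following is a plain C++
sketch of that computation without tvmet; the names <tt>Vec3d</tt>,
<tt>dot_sketch</tt> and <tt>reflect_sketch</tt> are assumptions made for this
example.

```cpp
#include <array>
#include <cstddef>

// Stand-in for a 3-vector of doubles; not tvmet's Vector class.
typedef std::array<double, 3> Vec3d;

// Three multiplications and two additions, matching the fmul/faddp
// sequence at the start of each listing.
inline double dot_sketch(const Vec3d& a, const Vec3d& b) {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// reflection = ray - 2 * dot(ray, surfaceNormal) * surfaceNormal
inline void reflect_sketch(Vec3d& reflection,
                           const Vec3d& ray,
                           const Vec3d& surfaceNormal) {
  // The doubling corresponds to the "fadd st(0), st" step in the listings.
  const double k = 2.0 * dot_sketch(ray, surfaceNormal);
  for (std::size_t i = 0; i < 3; ++i)
    reflection[i] = ray[i] - k * surfaceNormal[i];
}
```

An ideal compilation of this needs little more than the loads, multiplies,
adds, subtractions and stores themselves; the extra <tt>mov</tt>
instructions in the longer listings are temporaries that the optimizer did
not eliminate.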
\sa \ref regressiontest_failed
\sa \ref install_win
*/
// Local Variables:
// mode:c++
// End: