The scope of this document is to provide information on the implementation of floating point arithmetic in the C language on different platforms and details on different compilers. It will explain why it is problematic to simply use floating point math without considering the implications. It will also try to provide a set of preprocessor macros that allow the consistent usage of floating point arithmetic across as many platforms and compilers as possible.

This document is far from complete. Please be encouraged to contact me if you have any further information on this topic whatsoever. I'm especially interested in floating point arithmetic on non-x86 platforms, for example SPARC, IA64, ARM or Alpha. I'd also be interested in hearing from someone with knowledge of 64-bit Windows, or at least the possibility to test some things with different compilers there (especially Visual Studio 9, MinGW-gcc, Cygwin-gcc).

This document currently only deals with floating point precision. Two other topics will be discussed in future versions of this document: floating point exceptions and internal rounding control.

Please note that some understanding of assembly language and the inner workings of microprocessors is extremely helpful for understanding this document completely. Nevertheless, the conclusions should be understandable without it.

For a general explanation of floating point numbers, please refer to the Wikipedia article on IEEE 754. The following is only a small overview.

There are four main types of floating point numbers that are widely implemented today:

- **Single precision**: Stored in 32 bits: 1 bit for the sign, 8 bits for the exponent and 23 bits for the fraction.
- **Double precision**: Stored in 64 bits: 1 bit for the sign, 11 bits for the exponent and 52 bits for the fraction.
- **Double extended precision**: This is not exactly defined by IEEE 754, but the most common version is the x87 implementation, which has 1 bit for the sign, 15 bits for the exponent, 63 bits for the fraction and 1 bit for the integer part before the fraction (which is nearly always 1, except for denormals).
- **Quad precision**: This is defined in the revised IEEE 754 standard (IEEE 754-2008). Stored in 128 bits, it uses 1 bit for the sign, 15 bits for the exponent and 112 bits for the fraction. Several non-x86 (64-bit) processors support this data type.
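The stored widths above can be cross-checked from C via `<float.h>`. Note that the `*_MANT_DIG` macros count the significand bits *including* the implicit leading bit, so they report one more than the stored fraction width. A small sketch (the helper names are my own, for illustration):

```c
#include <float.h>

/* Width of the stored fraction field, recovered from <float.h>:
 * the *_MANT_DIG macros include the implicit leading bit, so the
 * stored fraction is one bit narrower. */
int float_fraction_bits(void)  { return FLT_MANT_DIG - 1; } /* 23 */
int double_fraction_bits(void) { return DBL_MANT_DIG - 1; } /* 52 */
```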

Some processors (most notably the x87 fpu) have a so-called internal precision, i.e. instead of having different registers and instructions for different floating point types (as is the case with integers) they have a single internal floating point register type (in the x87 case double extended precision) and a single operation type for doing math.

Many floating point experts state the case that using a higher internal precision for floating point calculations is better in order to ensure algorithms remain stable. While I basically agree with this argument, the problem with x87 FPUs is that they **hide** the usage of this internal precision from the programmer. This causes two different kinds of headaches:

- First of all, simple C code is highly platform- and compiler-dependent. With integers, this is not so much the case, since the C standard guarantees an integer to have at least a certain size, and when performing calculations within that size, the code remains completely portable and produces the same results on every platform (i.e. is deterministic). With floating point calculations, this is not the case anymore. A change in internal precision can result in different results on different platforms, thus the code is not deterministic anymore.
- Also, one would expect that the floating point number `0.002877` and the expression `2877.0 / 1000000.0` would be represented by the same floating point number (since both `2877` and `1000000` are exactly representable). If the conversion and the calculation are done within the same precision, this is actually the case. However, if an internal precision is used, it may not be. Say, for example, that `0.002877` is converted to the closest double representation of that number, while the expression `2877.0 / 1000000.0` is calculated in double-extended precision and truncated to double precision after the calculation. This will actually yield a slightly different number.

**Note** that the second problem apparently does not occur for single-precision floating point numbers since the number of fraction bits is so small that there appears to be no short decimal number that maps to a floating point number with these characteristics. Thus for single precision, calculating in a higher precision and then truncating *appears* to have the same effect as directly calculating in single precision.
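Both observations can be sketched in C. This is only an illustration, valid on systems where `double` arithmetic is actually carried out in double precision (e.g. SSE-based x86_64); it does not reproduce the x87 discrepancy itself. The function names are my own:

```c
/* When the decimal-to-double conversion and the division are both
 * performed in plain double precision, the literal and the quotient
 * round to the same double. */
int literal_matches_quotient(void)
{
    double literal  = 0.002877;
    double quotient = 2877.0 / 1000000.0;
    return literal == quotient;
}

/* For single precision operands, computing the division in double
 * precision and truncating afterwards gives the same result as
 * computing directly in single precision: double is wide enough
 * that no harmful double rounding occurs for a single division. */
int no_double_rounding(float a, float b)
{
    float direct    = a / b;
    float truncated = (float)((double)a / (double)b);
    return direct == truncated;
}
```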

This section will take a look at different hardware implementations and analyze them.

x86 platforms support two types of floating point instruction sets nowadays: The classical x87 floating point unit and the new MMX/SSE instructions that can even perform parallel calculations.

The characteristics of the x87 FPU are:

- Different instructions for loading and storing different types of floating point numbers from and to memory.
- A *single instruction set* for performing calculations.
- An *internal precision* with which all calculations are done.
- Instructions to *change* the internal precision.
- Support for single, double and double-extended precision.

The characteristics of the MMX/SSE instructions are:

- Different instructions for loading and storing different types of floating point numbers from and to memory.
- *Different instructions for performing calculations* depending on the data types in the registers.
- Calculation precision is *dependent on operand types*.
- Supports only single and double precision.
- Support for parallel computation.

According to the documentation (see: Book I: PowerPC User Instruction Set Architecture) and some tests done by Scott MacVicar, the following holds for PPC:

- Different instructions for loading and storing different types of floating point numbers from and to memory.
- A *single instruction set* for performing calculations.
- All calculations are done in *double precision*.
- Supports single and double precision, but not double-extended precision.
- At least the POWER4 processor (64-bit PPC) also supports quad precision.

The PPC processor also seems to support a non-IEEE-mode for the FPU. The documentation does not clearly state how this mode is defined and for portability reasons it clearly should be avoided.

Johannes Schlüter did some tests on the UltraSPARC IIIi platform which yielded the following results:

- *Different instructions for performing calculations* depending on the data types in the registers.
- Calculation precision is *dependent on operand types*.
- Supports single, double and quad precision.

I'd love to have some details on how other platforms implement floating point operations. The following is a non-exhaustive list of platforms I don't have any information on:

- Linux kernel x87 emulation for systems without x87
- IA64 (not the same as EM64T/AMD64!)
- Alpha
- ...

Some notes:

- I could get somebody to test some things on Linux on a Nokia N810 (ARM) but that does not mean other ARM devices react in the same way.

This section focuses on the implementation of floating point operations in compilers and system libraries, i.e. how C code is transformed into assembly instructions and how those interact with the internal processor state.

Calling conventions define how function parameters and return values are passed. This is platform-, operating-system- and compiler-dependent. Please note that quite a few systems pass floating point parameters and return values directly in FPU registers and not in memory. This can have interesting side-effects: consider, for example, a function that returns a `float` (single precision) whose return value is not stored but used directly within another function call. With certain calling conventions the FP registers may be passed straight through to the second function, without the value ever being stored to memory and thus without being truncated from internal precision. Other systems may have calling conventions that pass parameters on the stack, so the value is stored to memory and truncated. This also causes non-deterministic behaviour in the sense that different systems react differently to the same piece of code. I will discuss this issue separately for each compiler where applicable.
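The truncation effect itself can be illustrated portably, using `double` versus `float` as a stand-in for "internal precision" versus "memory precision". A small sketch (the function name is my own):

```c
/* Returns 1 if forcing the value through a single-precision memory
 * slot changes it, i.e. if the wider representation carried extra
 * significand bits that are lost on the narrowing store. */
int narrows_lossily(double wide)
{
    volatile float stored = (float)wide; /* explicit narrowing store */
    return (double)stored != wide;
}
```

For example, `1.0 / 3.0` needs more than 24 significand bits and is therefore changed by the store, while `0.25` is exactly representable in both formats and survives unchanged.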

I did extensive tests with `cl.exe` from Microsoft Visual C++ 2005 Express Edition under Windows 2000 and with MinGW (gcc 3.4.5 mingw-special). I could only test on 32-bit Windows; I'd be happy to hear about 64-bit Windows.

Most notably, the Visual C++ compiler does *not* support double-extended precision; it only supports single precision and double precision. ([1], [2]) It does, however, have a `long double` type, which is treated the same way as `double` for arithmetic purposes but is a distinct type with regard to type coercion. The compiler also switches the processor's internal precision to double precision, so all calculations are executed in double precision.

**TODO:** It would be nice to know when this exactly happens. Is it the kernel or some part of the runtime initialization?

The Microsoft Visual C++ runtime also provides the functions `_controlfp` and `_controlfp_s` to switch the x87 FP mode. They are documented on MSDN: _controlfp, _controlfp_s. The latter is available since Visual C++ 2005. In order to set the current precision, use:

```c
// Both are defined here
#include <float.h>

// old version
unsigned int fpu_oldcw;
fpu_oldcw = _controlfp(0, 0);       // store old value
_controlfp(precision, _MCW_PC);
// calculation here
_controlfp(fpu_oldcw, _MCW_PC);

// new version
unsigned int fpu_oldcw, fpu_cw;
_controlfp_s(&fpu_cw, 0, 0);
fpu_oldcw = fpu_cw;
_controlfp_s(&fpu_cw, precision, _MCW_PC);
// calculation here
_controlfp_s(&fpu_cw, fpu_oldcw, _MCW_PC);
```

Since the new version only does additional range checking on the values, it is ok to use the old version as long as the supplied values are valid.

The precision may be one of:

- `_PC_24`: Single precision
- `_PC_53`: Double precision
- `_PC_64`: Double-extended precision

Please note that the double-extended precision setting is for internal precision only. Values will always be truncated to double precision with Visual C++.

In order to ensure that the optimizer does not make wrong assumptions, please add `#pragma fenv_access(on)` to the source code. This tells the compiler that the floating point mode may be changed by the source.

**Calling conventions:** Windows does use FPU registers for return values, but it uses a small trick to ensure that values behave deterministically: for single precision values the compiler temporarily stores the value in a variable and loads it again, so the value is truncated to single precision. (Double precision values don't matter, since the compiler assumes the processor is always in double precision mode.) Consider the following source code:

```c
float divf (float a, float b)
{
    return a / b;
}
```

This is transformed to:

```asm
_divf PROC ; COMDAT
; Line 36
    fld   DWORD PTR _a$[esp-4]
    fdiv  DWORD PTR _b$[esp-4]
    fstp  DWORD PTR tv130[esp-4]
    fld   DWORD PTR tv130[esp-4]
; Line 37
    ret   0
_divf ENDP
```

**Note:** Is it possible to have `cl.exe` generate SSE code?

Conclusion:

- If you want to use single precision for calculations, just use the `float` datatype for all operands. Since single-precision values will always be truncated before use, the calculations will always appear to have been done in single precision. Also, it cannot hurt to use `_controlfp` just in case.
- If you want to use double precision for calculations, just use the `double` datatype for all operands. Since the internal precision set by the compiler is double precision, it will be used automatically. However, it cannot hurt to use `_controlfp` to force the precision just in case.
- If you want to use double-extended precision for calculations, it gets complicated. You can of course use `_controlfp` to force the internal precision to double-extended, but be aware that whenever a value is stored back to memory (which may happen arbitrarily, based on decisions of the optimizer), it will be truncated to double precision. Using double-extended precision with Microsoft Visual C++ is not feasible.

This compiler supports double-extended precision (datatype `long double`). It sets the processor to (or leaves it in) double-extended internal precision.

MinGW offers access to `_controlfp` but not `_controlfp_s`. The semantics are the same as with Visual C++, except that MinGW can actually use double-extended precision.

**Calling conventions:** Please note that MinGW never truncates to either double or single precision before returning a value. You can, however, force it to do so by either using the compiler option `-ffloat-store` or storing the result in an explicit variable that is declared `volatile` and then returning that variable. This will result in the desired code.

Conclusion:

- If you want to use single precision for calculations, use the `float` datatype for all operands and use the compiler option `-ffloat-store`. If it is not feasible to specify a compiler option, use `{ volatile float result = /* actual result */; return result; }` instead of `return /* actual result */;`. Also, it cannot hurt to use `_controlfp` just in case.
- If you want to use double precision for calculations, use the `double` datatype *and* use `_controlfp` to set and restore the precision at the beginning of the function (`-ffloat-store` will not suffice!).
- If you want to use double-extended precision for calculations, simply use the `long double` datatype. If you haven't touched the internal precision, you don't need to set it explicitly, but it wouldn't hurt, just in case.
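The volatile-return trick from the first bullet can be written out as a compilable sketch (the function name is my own, for illustration):

```c
/* A division whose result is forced through a single-precision memory
 * slot before being returned. The volatile store prevents the compiler
 * from returning a value that still carries internal (extended)
 * precision in an FPU register. */
float divf_truncated(float a, float b)
{
    volatile float result = a / b;
    return result;
}
```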

Not tested, feedback welcome.

Like the GCC that comes with MinGW, this compiler supports double-extended precision and also uses it as the internal processor precision.

glibc provides a header file `fpu_control.h` which defines several useful macros to set the precision of the FPU. These macros evaluate to inline assembler code that directly sets the FPU state. Example usage:

```c
#include <fpu_control.h>

fpu_control_t fpu_oldcw, fpu_cw;
_FPU_GETCW(fpu_oldcw); // store old control word
fpu_cw = (fpu_oldcw & ~_FPU_EXTENDED & ~_FPU_DOUBLE & ~_FPU_SINGLE) | precision;
_FPU_SETCW(fpu_cw);
// calculations here
_FPU_SETCW(fpu_oldcw); // restore old control word
```

The precision may be one of:

- `_FPU_SINGLE`: Single precision
- `_FPU_DOUBLE`: Double precision
- `_FPU_EXTENDED`: Double-extended precision

**Calling conventions:** In order for GCC to truncate variables to the correct precision before returning them, use the `-ffloat-store` compiler option or declare an explicit return value variable `volatile` (as with MinGW). Please note that this alone will **not** solve the problem described in this article, since the above-mentioned example `0.002877` will result in *different* doubles even *after* truncation.

**Caution:** Since the FPU control macros only expand to inline assembler, the optimizer will sometimes reorder the actual calculation after the reset of the FPU control word. Example:

```c
double divcw (double a, double b)
{
    double res;
    fpu_control_t _oldcw, _cw;

    _FPU_GETCW(_oldcw);
    _cw = (_oldcw & ~_FPU_EXTENDED & ~_FPU_SINGLE) | _FPU_DOUBLE;
    _FPU_SETCW(_cw);
    res = a / b;
    _FPU_SETCW(_oldcw);
    return res;
}
```

Without optimization, this will be expanded to:

```asm
divcw:
    # stuff left out for brevity
#APP
    fldcw -12(%ebp)
#NO_APP
    fldl  -24(%ebp)
    fdivl -32(%ebp)
    fstpl -8(%ebp)
#APP
    fldcw -10(%ebp)
#NO_APP
    fldl  -8(%ebp)
    leave
    ret
    .size divcw, .-divcw
```

With optimization (-O2, but -O1 has the same problem) however, this will result in:

```asm
divcw:
    pushl %ebp
    movl  %esp, %ebp
    subl  $16, %esp
    fldl  8(%ebp)
#APP
    fnstcw -2(%ebp)
#NO_APP
    movzwl -2(%ebp), %eax
    andl  $-769, %eax
    orl   $512, %eax
    movw  %ax, -4(%ebp)
#APP
    fldcw -4(%ebp)
    fldcw -2(%ebp)
#NO_APP
    fdivl 16(%ebp)
    leave
    ret
    .size divcw, .-divcw
```

As one can see, the optimizer reordered the instructions. I consider this a bug in GCC. GCC 3.3.6 and 3.4.6 were not affected according to my tests (but they only honour `-ffloat-store` for explicit variables). I have not yet tested newer versions.

This reordering can be avoided by two means: either use the `-ffloat-store` compiler option or declare the return variable (here: `res`) `volatile`.

**Using MMX/SSE:** GCC also supports a compiler option `-mfpmath=sse`. For it to take effect, at least the additional options `-msse` and `-march=XXX` (where XXX is e.g. pentium4 or any other arch value that supports SSE) must be supplied. In that case, single and double precision floating point calculations are done with the MMX/SSE extensions instead of the FPU, eliminating the need to change the FPU precision, because the MMX/SSE extensions always operate with the same precision as their operands.
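The "precision follows the operand types" behaviour can be sketched in C; this check is meaningful on any system where `float` and `double` arithmetic is performed in the nominal operand precision (e.g. when SSE math is in effect). The function name is my own:

```c
/* With SSE math the calculation precision follows the operand types:
 * mixing a float with a double promotes the float, so the division is
 * carried out in double precision. */
double mixed_div(float f, double d)
{
    return f / d; /* f is promoted to double before the divide */
}
```

For example, `mixed_div(1.0f, 3.0)` matches the pure-double quotient `1.0 / 3.0` and differs from the pure-float quotient `1.0f / 3.0f` widened to double.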

Conclusion:

- If you want to use single precision for calculations, use the `float` datatype for all operands and use the compiler option `-ffloat-store`. If it is not feasible to specify a compiler option, use `{ volatile float result = /* actual result */; return result; }` instead of `return /* actual result */;`. Also, it cannot hurt to use `_FPU_SETCW` just in case.
- If you want to use double precision for calculations, use the `double` datatype *and* use `_FPU_SETCW` to set and restore the precision at the beginning of the function (`-ffloat-store` will not suffice!). Please note that *in addition* to `_FPU_SETCW`, `-ffloat-store` or the volatile trick (just with `double` instead of `float`) must be used, else the compiler will apply the wrong optimization.
- If you want to use double-extended precision for calculations, simply use the `long double` datatype. If you haven't touched the internal precision, you don't need to set it explicitly, but it wouldn't hurt, just in case.

Not tested, feedback welcome.

With FreeBSD, the x87 FPU is in double precision mode by default. FreeBSD does, however, support the double-extended data type (`long double`); calculations with that data type will nevertheless only be executed in double precision by default.

FreeBSD also supports several macros to change the processor's float mode:

```c
#include <machine/ieeefp.h>

fp_prec_t fpu_oldprec;
fpu_oldprec = fpgetprec();
fpsetprec(precision);
// calculations here
fpsetprec(fpu_oldprec);
```

Valid precision values are:

- `FP_PS`: Single precision
- `FP_PD`: Double precision
- `FP_PE`: Double-extended precision

**Calling conventions:** The compiler does not generally store the result of an operation to memory and back into the registers in order to truncate it. `-ffloat-store` itself is only interpreted for actual variables, not for expressions (i.e. `double result = a / b; return result;` and `return a / b;` will result in different code with `-ffloat-store` active, even with optimization enabled).

Conclusion:

- If you want to use single precision for calculations, use the `float` datatype. Also use a temporary volatile variable to store the result: `{ volatile float result = /* actual result */; return result; }` instead of `return /* actual result */;`. `-ffloat-store` would require an additional variable anyway to have an effect, so simply declaring a new volatile variable suffices. Using `fpsetprec` couldn't hurt either.
- If you want to use double precision for calculations, use the `double` datatype. Since the internal precision is double by default, nothing needs to be changed. Using `fpsetprec` and a volatile return variable couldn't hurt, though.
- If you want to use double-extended precision for calculations, use the `long double` datatype. Also be sure to use `fpsetprec` to ensure the correct internal precision.

Not tested, feedback welcome.

It does not appear to be possible to switch to double precision on OpenSolaris on Intel platforms without resorting to inline assembler or relying on GCC and SSE extensions (and OpenSolaris runs on non-SSE-x86 processors, so that's not portable).

The compiler supports double-extended precision and uses it as the default floating point precision on this platform. There appears to be no way to explicitly change the FP precision setting without resorting to inline assembler. (There are some macros/functions defined in `sys/ieeefp.h`, but the interesting part is enclosed in an `#ifdef _KERNEL`.) The only possibilities appear to be using the MMX/SSE instructions (`-mfpmath=sse -msse -march=pentium4` or similar) or using inline assembly to set the internal precision.
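What such inline assembly could look like with GCC on x86 is sketched below. The control-word layout (precision control in bits 8 and 9) is the standard x87 one, but the helper names are my own and this is untested on OpenSolaris itself; on non-x86 targets the helpers collapse to no-ops:

```c
/* Sketch: set the x87 precision-control field via GCC inline assembly.
 * PC lives in bits 8-9 of the FPU control word: 00 = single,
 * 10 = double, 11 = double-extended. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
unsigned short x87_set_double(void)
{
    unsigned short oldcw, newcw;
    __asm__ __volatile__("fnstcw %0" : "=m"(oldcw));
    newcw = (unsigned short)((oldcw & ~0x0300) | 0x0200); /* PC = double */
    __asm__ __volatile__("fldcw %0" : : "m"(newcw));
    return oldcw; /* caller restores with x87_restore() */
}

void x87_restore(unsigned short cw)
{
    __asm__ __volatile__("fldcw %0" : : "m"(cw));
}
#else
unsigned short x87_set_double(void) { return 0; }
void x87_restore(unsigned short cw) { (void)cw; }
#endif
```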

**Calling conventions:** The compiler does not generally store the result of an operation to memory and back into the registers in order to truncate it. `-ffloat-store` itself is only interpreted for actual variables, not for expressions (i.e. `double result = a / b; return result;` and `return a / b;` will result in different code with `-ffloat-store` active, even with optimization enabled).

Conclusion:

- If you want to use single precision for calculations, use the `float` datatype. Also use a temporary volatile variable to store the result: `{ volatile float result = /* actual result */; return result; }` instead of `return /* actual result */;`. `-ffloat-store` would require an additional variable anyway to have an effect, so simply declaring a new volatile variable suffices. Using inline assembly to set the precision manually couldn't hurt either.
- If you want to use double precision for calculations, use the `double` datatype and use inline assembly to manually set the precision.
- If you want to use double-extended precision for calculations, use the `long double` datatype. Using inline assembly to force the precision couldn't hurt either.

This compiler behaves more or less identically to GCC on OpenSolaris; there are only two differences:

- For single-precision values the values are always truncated before returning them via the store-and-reload trick.
- There appears to be no option to enable SSE for the compiler.

Conclusion:

- If you want to use single precision for calculations, simply use the `float` datatype for all operands. Since the compiler will always truncate results, the semantics will be correct.
- For double and double-extended precision, the same as for GCC on OpenSolaris applies.

Not tested, feedback welcome.

This section is based on tests that other people did for me.

The FPU mode is double-extended by default and that datatype is also supported. But since Mac OS X runs only on x86 platforms that already support MMX/SSE, the compiler will always use MMX/SSE for single and double precision floating point calculations. Only double-extended precision floating point calculations will be done with the x87 FPU.

This is actually the most desirable behaviour, since the precision is always determined by the largest operand type in C.

Because of this, Mac OS X does not provide any C wrapper macros to change the internal precision setting of the x87 FPU; it is simply not necessary. Should this really be wanted, inline assembler would probably work, but I haven't tested this.

**Not tested:** See what happens if `-mfpmath=i387` is used as a compiler option.

Conclusion:

Simply use the correct datatype and the operations performed will have the correct semantics: `float` for single precision, `double` for double precision and `long double` for double-extended precision.

Not tested, feedback welcome.

GCC on 64-bit x86 Linux behaves the same as GCC on Mac OS X: the FPU mode is extended by default, but since SSE is supported by all x86_64 processors, SSE is always used for single and double precision floating point calculations, and thus the largest operand type in C defines the calculation precision.

However, it is possible to use the `-mfpmath=i387` switch. In that case, glibc also provides the fpu_control.h with the same macros as for the x86 case.

Conclusion:

Simply use the correct datatype and the operations performed will have the correct semantics: `float` for single precision, `double` for double precision and `long double` for double-extended precision.

Scott MacVicar did some tests on the POWER4 platform with GCC 4.0.1, which yielded the expected result that the compiler generates different instructions for different datatypes, thus using the correct precision. An interesting detail: quad precision calculations are done via software emulation by default, perhaps for compatibility with older processor revisions.

Conclusion:

Simply use the correct datatype and the operations performed will have the correct semantics: `float` for single precision, `double` for double precision and `long double` for quad precision.

Johannes Schlüter did some tests on the UltraSPARC IIIi platform with both GCC 3.4.3 and Sun's CC, which yielded the expected result that *both* compilers generate different instructions for different datatypes, thus using the correct precision. An interesting detail: quad precision calculations are done via software emulation by default (by both (!) compilers), perhaps for compatibility with older processor revisions.

Conclusion:

Simply use the correct datatype and the operations performed will have the correct semantics: `float` for single precision, `double` for double precision and `long double` for quad precision.

A friend of mine ran the test suite below on a Nokia N810 with Linux installed. Since the N810 does not possess an FPU, all FPU operations are emulated via function calls.

The emulation uses the same precision as the largest operand type. `long double` is treated the same as `double`, i.e. there is no extended precision.

Conclusion:

Simply use the correct datatype and the operations performed will have the correct semantics: `float` for single precision and `double` for double precision.

The best solution would probably be to use the MPFR library (or something similar). That has its drawbacks, though: it is slower than the native FPU and it requires memory allocation on the heap. Also, having such a big library as a requirement for one's own software may not always be the optimal choice. Thus a solution using the native FP unit will be built.

Using single or double precision for floating point calculations seems to be the only portable alternative without using an entire library. Double-extended precision is not supported on PPC and SPARC, quad precision is not supported on x86 and extended precision is also not supported by Microsoft's Visual C++ compiler.

Since there is probably no good reason to restrict calculations to single precision (lower precision means less accuracy), this document will focus on using double precision portably.

In order to achieve this, the following steps should be taken:

- Define several macros that automate the switching of the FPU precision setting.
- Every function doing FP math should only use double-precision values, i.e. the C `double` datatype.
- On entering a function that does FP math, the current FPU precision setting should be saved, the FPU should be set to double precision, all calculations should be done, and at the end of the function the previous FPU precision should be restored.
- When calling a function from a foreign library that potentially does floating point calculations, the FPU state should be temporarily restored for the duration of the call.
- Since it is not possible to ensure that `-ffloat-store` is always used when using GCC, a volatile return variable should be used for returning data; this has the same effect.
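The steps above can be sketched as a set of macros. This is a simplified, glibc/x87-only illustration of the idea (the real macros cover more cases); it keys off `HAVE__FPU_SETCW` as set by the configure check described below, the `_xpfpa_*` and `xpfpa_div` names are my own, and on all other systems the macros collapse to no-ops so that the operand types decide the precision:

```c
/* Simplified sketch of precision-switching macros: save the x87
 * control word, force double precision, and restore on return. */
#if defined(HAVE__FPU_SETCW) /* set by the configure check */
# include <fpu_control.h>
# define XPFPA_DECLARE()        fpu_control_t _xpfpa_oldcw, _xpfpa_cw;
# define XPFPA_SWITCH_DOUBLE()  do { \
      _FPU_GETCW(_xpfpa_oldcw); \
      _xpfpa_cw = (_xpfpa_oldcw & ~_FPU_EXTENDED & ~_FPU_SINGLE) | _FPU_DOUBLE; \
      _FPU_SETCW(_xpfpa_cw); \
  } while (0);
# define XPFPA_RESTORE()        _FPU_SETCW(_xpfpa_oldcw);
# define XPFPA_RETURN_DOUBLE(v) do { \
      volatile double _xpfpa_result = (v); /* volatile: defeat reordering */ \
      _FPU_SETCW(_xpfpa_oldcw); \
      return _xpfpa_result; \
  } while (0);
#else /* nothing to switch: operand types decide the precision */
# define XPFPA_DECLARE()
# define XPFPA_SWITCH_DOUBLE()
# define XPFPA_RESTORE()
# define XPFPA_RETURN_DOUBLE(v) return (v);
#endif

/* Example: a double division protected by the macros. */
double xpfpa_div(double a, double b)
{
    XPFPA_DECLARE()
    XPFPA_SWITCH_DOUBLE()
    XPFPA_RETURN_DOUBLE(a / b)
}
```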

The following macros provide platform-independent support for saving, switching and restoring FPU precision. If a platform does not support this or it is not necessary, they will simply expand to no-ops. XPFPA stands for "cross platform floating point arithmetic".

- `XPFPA_DECLARE()`: Declare all necessary variables to save the current FP precision setting.
- `XPFPA_SWITCH_DOUBLE()`: Save the current precision and switch to double precision.
- `XPFPA_RESTORE()`: Restore the previous precision.
- `XPFPA_RETURN_DOUBLE(val)`: Restore the previous precision setting and return a value. Handles the compiler misoptimization discussed above.
- `XPFPA_SWITCH_SINGLE()`: Save the current precision and switch to single precision. This macro is defined for completeness only.
- `XPFPA_RETURN_SINGLE(val)`: Restore the previous precision setting and return a value. Handles the compiler misoptimization discussed above.
- `XPFPA_SWITCH_DOUBLE_EXTENDED()`: Save the current precision and switch to double-extended precision. May not work at all (e.g. PPC) or not as expected (MS Visual C++). This macro is defined for completeness only.

Example usage:

```c
#include "xpfpa.h"

double somefn (double a)
{
    XPFPA_DECLARE()

    double localvar1;

    XPFPA_SWITCH_DOUBLE()

    // do some calculation
    localvar1 = a * a;

    // restore state temporarily, call an external library function
    XPFPA_RESTORE()
    localvar1 = external_lib_solve_linear_equation(localvar1);
    XPFPA_SWITCH_DOUBLE()

    // do some more calculations
    localvar1 = localvar1 / 2.0;

    // return the value
    XPFPA_RETURN_DOUBLE(localvar1);
}
```

Please be sure to always use the correct datatype for the calculations, since several architectures do not possess an internal precision but rather determine the precision according to the datatype of the operands!

The header file `xpfpa.h` defines the above macros on systems where the internal precision can be changed. On other systems, the macros expand to nothing and we assume that the correct precision is chosen by the compiler based on the operand types. The file can also be downloaded in an archive: xpfpa.tar.gz

In order to set the correct defines for the macros, I have written an m4 file that may be included in your `configure.in`. Its name is `float_precision.m4` and it is available in the archive autoconf-macros.tar.gz. Usage:

```
dnl ... top of configure.in or acinclude.m4 or whatever:
sinclude(float_precision.m4)
dnl ... below:
CHECK_FLOAT_PRECISION
```

It will set one or more of the following variables, please make sure your `config.h.in` contains them:

- HAVE__FPU_SETCW
- HAVE_FPSETPREC
- HAVE__CONTROLFP
- HAVE__CONTROLFP_S
- HAVE_FPU_INLINE_ASM_X86

I have also provided a macro for CMake that may be included in your `CMakeLists.txt`. Its name is `CheckFloatPrecision.cmake` and it is available in the archive cmake-macros.tar.gz. Usage:

```cmake
include(CheckFloatPrecision.cmake)
# or, if in module path
include(CheckFloatPrecision)
check_float_precision()
```

It will set one or more of the following variables, please make sure your `config.h.in` contains them:

- HAVE__FPU_SETCW
- HAVE_FPSETPREC
- HAVE__CONTROLFP
- HAVE__CONTROLFP_S
- HAVE_FPU_INLINE_ASM_X86

I have started to create a simple test suite that does some checks on the default floating point behaviour. It will only run on POSIX-like platforms with a C compiler that supports options in the style of GCC (e.g. GCC itself, ICC, the Sun C compiler, ...).

It can be downloaded here: fp-testsuite.tar.bz2 / fp-testsuite.tar.gz.

If you have a new platform / new compiler that I haven't discussed yet, please run the following (in the directory where you unpacked the archive):

```shell
CC=/path/to/your/compiler ./fptest.sh | tee messages.txt
```

Then please send me `messages.txt` and `results.tar.gz`.

Author: Christian Seiler, Last change: 2008-10-28