Wednesday, 5 March 2014

Changing default CFLAGS on i386


I recently discovered that upstream gcc is intentionally violating the
ISO C specification. We noticed this because of some failing tests.

The failure is caused by gcc playing 'fast and loose' with IEEE floating
point. You can read about this in the gcc manpage:

This option allows further control over excess precision on
machines where floating-point registers have more precision
than the IEEE "float" and "double" types and the processor
does not support operations rounding to those types. By
default, -fexcess-precision=fast is in effect; this means
that operations are carried out in the precision of the
registers and that it is unpredictable when rounding to the
types specified in the source code takes place. When
compiling C, if -fexcess-precision=standard is specified
then excess precision follows the rules specified in ISO
C99; in particular, both casts and assignments cause values
to be rounded to their semantic types (whereas
-ffloat-store only affects assignments). This option is
enabled by default for C if a strict conformance option
such as -std=c99 is used.

-fexcess-precision=standard is not implemented for
languages other than C, and has no effect if
-funsafe-math-optimizations or -ffast-math is specified.
On the x86, it also has no effect if -mfpmath=sse or
-mfpmath=sse+387 is specified; in the former case, IEEE
semantics apply without excess precision, and in the
latter, rounding is unpredictable.

As a programmer, my primary concern is that when I type "cc -o x x.c", I
get correct output, as per the specification. That's not currently
happening on Ubuntu.

I filed a bug about this upstream, which was quickly closed as INVALID.
It appears to have about 50 duplicates as well.

The logic given upstream is reasonably pragmatic: following the rules of
the standard would be expensive, so by default the standard is not
followed. I consider this not so different from the other violations of
spec that are enabled by -ffast-math, but there is a difference in
degree...

The good news is that this bug doesn't exist on 64 bit. Every known
64-bit x86 processor implements SSE/SSE2, and using those instructions
gives correct behaviour with high performance (higher, in fact, than the
incorrect behaviour we currently get when building 32-bit binaries).

I propose that we do one of two things, either of which would give us a
C compiler that produces correct code:

1) Use -fexcess-precision=standard and suffer the performance hit in
the name of correctness

2) Bite the bullet and use -march=pentium4 -mfpmath=sse (which is what
happens for 64bit)

The Pentium 3 brought SSE, but SSE2 (introduced with the Pentium 4) is
needed to get the improved performance and correctness for doubles.

For the record, Pentium 4 started shipping in 2000. The last Pentium 3
shipped in 2003, which is the same year that AMD released the Athlon 64,
with SSE/SSE2/SSE3 support.

One caveat: this adherence to the spec (which mandates rounding) reduces
precision, which might negatively affect programs that currently benefit
from the extra precision (even though that effect is
semi-nondeterministic and in violation of the C specification). Such
programs should use 'long double' if they want the old behaviour back.

Using -march=pentium4 would of course generally improve performance in
other areas as well.


ubuntu-devel mailing list