Home My Page Projects MPFR
Summary Activity Forums Tracker Lists Tasks Docs News SCM Files

[#11301] mpfr_sub1 bug

2010-10-17 20:10
Submitted by:
Jakub Jelinek (jakub)
Assigned to:
Paul Zimmermann (zimmerma)
Target Version:
mpfr_sub1 bug

Detailed description

This source should show a bug, where mpfr_fma (IMNSHO) gives incorrect result
(1ulp smaller), which causes also __builtin_fmal to misbehave.
The testcase comment explains why I believe the right result is
0xc.0bffffffffffffep-3942L (== 0x3.02fffffffffffff8p-3940L)
instead of
0xc.0bffffffffffffdp-3942L (== 0x3.02fffffffffffff4p-3940L)
FMA is supposed to be computed with infinite precision and only rounded at the
end. When stepping through mpfr_fma, it seems u is computed correctly:
(gdb) p mpfr_printf ("%Ra\n", u)
$1 = 27
so the problem seems to be in the following mpfr_add (s, u, z, rnd_mode);
where u has precision 128, but s and z have precision 64.
The textcase is for extended 80-bit long double, so ?86/x86_64/ia64.

#include <math.h>
#include <gmp.h>
#include <stdio.h>
#include <mpfr.h>


80-bit extended long double

fmal (0x8.07fffffffffffffp+7418L, 0xcp-11363L, -0xf.ff80f8p-4070L)

* 0xcp-11363L
= 0xc.0bffffffffffffe8p-3942L

- 0x0.0000000000000000000000000000000fff80f8p-3942L
= 0xc.0bffffffffffffe7fffffffffffffff0007f08p-3942L

= 0xc.0bffffffffffffep-3942L instead of 0xc.0bffffffffffffdp-3942L

__builtin_fmal uses mpfr_fma internally in the compiler.

volatile long double zero;

main (void)
mpz_t a, b, c;
mpz_init (a); mpz_init (b); mpz_init (c);
mpz_set_str (a, "807fffffffffffff", 16);
mpz_set_str (b, "c", 16);
mpz_mul (c, a, b);
mpz_mul_si (c, c, 2);
mpz_out_str (stdout, 16, c);
printf ("\n");
mpz_set_str (a, "c0bffffffffffffe80000000000000000000000", 16);
mpz_set_str (b, "00000000000000000000000000000000fff80f8", 16);
mpz_sub (c, a, b);
mpz_out_str (stdout, 16, c);
printf ("\n");
mpz_clear (a); mpz_clear (b); mpz_clear (c);
printf ("mul %La\n__builtin_fmal %La\nfmal %La\n",
0x8.07fffffffffffffp+7418L * 0xcp-11363L,
__builtin_fmal (0x8.07fffffffffffffp+7418L, 0xcp-11363L,
fmal (0x8.07fffffffffffffp+7418L, 0xcp-11363L,
-0xf.ff80f8p-4070L + zero));
mpfr_set_default_prec (64);
mpfr_set_emin (-16384 -64 +4);
mpfr_set_emax (16384);
mpfr_t xx, xy, xz, xr, xm;
mpfr_init (xx); mpfr_set_ld (xx, 0x8.07fffffffffffffp+7418L, GMP_RNDN);
mpfr_init (xy); mpfr_set_ld (xy, 0xcp-11363L, GMP_RNDN);
mpfr_init (xz); mpfr_set_ld (xz, -0xf.ff80f8p-4070L, GMP_RNDN);
mpfr_init (xr); mpfr_init (xm);
int i = mpfr_mul (xm, xx, xy, GMP_RNDN);
mpfr_subnormalize (xm, i, GMP_RNDN);
i = mpfr_fma (xr, xx, xy, xz, GMP_RNDN);
mpfr_subnormalize (xr, i, GMP_RNDN);
mpfr_printf ("mpfr_mul %Ra\nmpfr_fma %Ra\n", xm, xr);
mpfr_clear (xx); mpfr_clear (xy); mpfr_clear (xz);
mpfr_clear (xm); mpfr_clear (xr);
return 0;

mpfr_sub1 is called, where b is 2 limbs long (128-bit), while a and c is just
one limb long (64-bit).
As c is small, but not small enough (difference in exponents exactly 128), we
reach the:

for (k = 0; (bn > 0) || (cn > 0); k = 1)

loop in sub1.c, with sh == 0. In the first iteration bb is equal to
MPFR_LIMB_HIGHBIT and cc is 0 (it is before the highest limb in c), and we
reach the special case:

/* the case rounding to nearest with sh=0 is special since one couldn't
subtract above 1/2 ulp in the trailing limb of the result */
if ((rnd_mode == GMP_RNDN) && sh == 0 && k == 0)
mp_limb_t half = MPFR_LIMB_HIGHBIT;

is_exact = (bb == cc);

/* add one ulp if bb > cc + half
truncate if cc - half < bb < cc + half
sub one ulp if bb < cc - half

The comments seems to cover most of the cases, just not two important ones,
where either bb is equal to cc + half (this case) or bb is equal to cc - half.
In that case after this bb is equal to cc, eventhough they weren't equal
before. In this case with round to nearest we really can't decide right away,
need to look at next (or following limbs) to decide if exactly 0.5ulp was added
resp. subtracted, or if tiny bit above it or below it, but that decision should
take into account that the special case above with bb == cc + half or bb == cc
- half happened. In the code it doesn't, so we fall through to the next limb -
0 in bb (as it is after lowest limb) and 0xfff80f8000000000 in cc. k is
non-zero, so the special case isn't done now, bb is smaller than cc, so it sets
down and reaches the
420 else if (down && sh == 0)
421 goto sub_one_ulp;
case, so it incorrectly subtracts 1ulp instead of just raising inexact and
Message  ↓
Date: 2010-10-20 16:43
Sender: Jakub Jelinek

All the 3 x 10bil tests finished fine, I think this bug can be closed.
Thanks for fixing it.

Date: 2010-10-19 16:40
Sender: Jakub Jelinek

I'm now running my mpfr tester on x86_64-linux for both 80-bit long double, 64-bit double and 32-bit float, against mpfr-2.4.2 patched with a backport of your changes, each for 5 hours already and they haven't reported a failure, so it looks good.

Date: 2010-10-19 11:18
Sender: Paul Zimmermann

Dear Jakub, there was still a remaining issue, which should
be fixed now. I did draw a tree of all possible cases for
rounding to nearest. This tree has depth 3 (after depth 3 the
cases are similar) and 3 nodes at depth 1, 13 nodes at depth
2, and 15 nodes at level 3. I did check that all termination
nodes are correctly handled, and reciprocally that all conditions used are useful for at least one node.

I wait your approval to close that bug.


Date: 2010-10-18 21:06
Sender: Paul Zimmermann

Dear Jakub, my first fix was incorrect (but our test cases did
not exhibit it, which shows they are incomplete). I have now
added your 3 examples and they work correctly in revision
7213. Please could you try again?


PS: are you working on implementing a true fma in glibc?
If so, this is really great!

Date: 2010-10-18 20:54
Sender: Jakub Jelinek

Analyzing the first case shows it is a bug in the new mpfr_sub1 code:

With old mpfr:
mpfr_fma 0x8.7ffffffffffc7ffp-14444
mpfr_fma 0x8.7ffffffffffc7feffffffffffffffff8p-14444
With mpfr from svn trunk:
mpfr_fma 0x8.7ffffffffffc7fep-14444
mpfr_fma 0x8.7ffffffffffc7feffffffffffffffff8p-14444

#include <mpfr.h>


80-bit extended long double

fmal (0xf.fffffffffffffffp-14766L, -0xf.fffffffffffffffp+317L, 0x8.3ffffffffffe3ffp-14443L)

* -0xf.fffffffffffffffp+317L
= -0xf.ffffffffffffffe0000000000000001p-14445L

+ 0x10.7ffffffffffc7fe00000000000000000p-14444L
= 0x8.7ffffffffffc7feffffffffffffffff8p-14444L

= 0x8.7ffffffffffc7ffp-14444L instead of 0x8.7ffffffffffc7fep-14444L

main (void)
mpfr_set_default_prec (64);
mpfr_set_emin (-16384 -64 +4);
mpfr_set_emax (16384);
mpfr_t xx, xy, xr, xz;
mpfr_init (xx); mpfr_set_ld (xx, 0xf.fffffffffffffffp-14766L, GMP_RNDN);
mpfr_init (xy); mpfr_set_ld (xy, -0xf.fffffffffffffffp+317L, GMP_RNDN);
mpfr_init (xz); mpfr_set_ld (xz, 0x8.3ffffffffffe3ffp-14443L, GMP_RNDN);
mpfr_init (xr);
int i = mpfr_fma (xr, xx, xy, xz, GMP_RNDN);
mpfr_subnormalize (xr, i, GMP_RNDN);
mpfr_printf ("mpfr_fma %Ra\n", xr);
mpfr_clear (xx); mpfr_clear (xy); mpfr_clear (xz); mpfr_clear (xr);
mpfr_init2 (xx, 256); mpfr_set_ld (xx, 0xf.fffffffffffffffp-14766L, GMP_RNDN);
mpfr_init2 (xy, 256); mpfr_set_ld (xy, -0xf.fffffffffffffffp+317L, GMP_RNDN);
mpfr_init2 (xz, 256); mpfr_set_ld (xz, 0x8.3ffffffffffe3ffp-14443L, GMP_RNDN);
mpfr_init2 (xr, 256);
i = mpfr_fma (xr, xx, xy, xz, GMP_RNDN);
mpfr_subnormalize (xr, i, GMP_RNDN);
mpfr_printf ("mpfr_fma %Ra\n", xr);
mpfr_clear (xx); mpfr_clear (xy); mpfr_clear (xz); mpfr_clear (xr);
return 0;

Date: 2010-10-18 19:35
Sender: Jakub Jelinek

That change fixes the testcase, but on my
fma tester:
it results in many failures. With glibc from git trunk on x86_64 the tester reported earlier 28 errors out of 10 billion tests (the first one was the case reported here, haven't checked the remaining 27 but assumed it could be the same issue), now the test that ran for ~ 8 hours before reported already 400 errors within 10 minutes of runtime.
Haven't analyzed them yet, so there is theoretical possibility they might be glibc errors instead of mpfr bugs, but a problem on the mpfr side is also possible.

First 3 reported errors:

fma (0xf.fffffffffffffffp-14766, -0xf.fffffffffffffffp+317, 0x8.3ffffffffffe3ffp-14443) = 0x8.7ffffffffffc7ffp-14444 mpfr_fma 0x8.7ffffffffffc7fep-14444

fma (-0xf.fffffffffffffffp-11420, 0xf.fffffffffffffffp+9863, 0x8.fffff80ffffffffp-1551) = 0x9.fffff01ffffffffp-1552 mpfr_fma 0x9.fffff01fffffffep-1552

fma (0xf.fffffffffffffffp-2125, -0xf.fffffffffffffffp-6000, 0x8p-8119) = 0x8.000000000000001p-8120 mpfr_fma 0x8p-8120

Date: 2010-10-18 18:59
Sender: Paul Zimmermann

thank you Jakub for reporting and analyzing that bug.
It should be fixed in svn revision 7112. Please can you
confirm it works now as expected before I close it?

Paul Zimmermann

Date: 2010-10-18 10:26
Sender: Vincent Lefèvre


Thanks for the bug report. I confirm that there is a problem and I'm going to add a testcase that is architecture-independent.

Field Old Value Date By
close_date2011-05-02 14:452011-05-02 14:45vlefevre
Target Version2.42011-05-02 14:45vlefevre
ResolutionAccepted2010-10-20 20:45zimmerma
status_idOpen2010-10-20 20:45zimmerma
close_date2010-10-20 20:452010-10-20 20:45zimmerma
status_idASSIGNED2010-10-19 15:13zimmerma
close_dateNone2010-10-19 15:13zimmerma
close_date2010-10-19 11:182010-10-19 11:18zimmerma
close_date2010-10-18 21:062010-10-18 21:06zimmerma
ResolutionNone2010-10-18 18:59zimmerma
status_idOpen2010-10-18 18:59zimmerma
assigned_tonone2010-10-18 18:59zimmerma
close_date2010-10-18 18:592010-10-18 18:59zimmerma