Edge behavior in `jax.scipy.special.betainc` · Issue #21900 · jax-ml/jax · GitHub
Edge behavior in jax.scipy.special.betainc #21900

Closed
mdhaber opened this issue Jun 15, 2024 · 10 comments · Fixed by #27107
Labels: bug (Something isn't working)

@mdhaber commented Jun 15, 2024

Description

jax.scipy.special.betainc seems to have trouble with very small values of the parameter a, at least for certain values of b and x.

import matplotlib.pyplot as plt
import numpy as np
from scipy.special import betainc as betainc_scipy
import jax.numpy as xp
from jax.scipy.special import betainc as betainc_jax

a = np.logspace(-40, -1, 300)
b = 1
x = 0.25
plt.loglog(a, betainc_scipy(a, b, x), label='scipy')
plt.loglog(a, betainc_jax(xp.asarray(a), b, x), label='jax')
plt.xlabel('a')
plt.ylabel('betainc(a, 1, 0.25)')
plt.legend()
[plot: betainc(a, 1, 0.25) vs a — SciPy and JAX curves diverge for small a]

I know that it is difficult to guarantee accuracy to machine precision for all possible combinations of input : ) Just thought I'd point out this problem spot since it came up in SciPy testing (scipy/scipy#20963).


There is a separate issue related to edge case behavior in betainc that I thought I should bring up: there are some edge cases where betainc returns NaN but another result would be more useful (e.g. in Harrell-Davis quantile estimates). To some extent, the "correct" result is a matter of interpretation, but SciPy recently addressed similar cases in scipy/scipy#22425. While we're waiting for an Array API special function extension (data-apis/array-api#725), please consider adopting this behavior.

import jax.numpy as np
from jax.scipy import special
cases = [((0.0, 0.0, 0.0), np.nan),
         ((0.0, 0.0, 0.5), np.nan),
         ((0.0, 0.0, 1.0), np.nan),
         ((np.inf, np.inf, 0.0), np.nan),
         ((np.inf, np.inf, 0.5), np.nan),
         ((np.inf, np.inf, 1.0), np.nan),
         ((0.0, 1.0, 0.0), 0.0),
         ((0.0, 1.0, 0.5), 1.0),
         ((0.0, 1.0, 1.0), 1.0),
         ((1.0, 0.0, 0.0), 0.0),
         ((1.0, 0.0, 0.5), 0.0),
         ((1.0, 0.0, 1.0), 1.0),
         ((0.0, np.inf, 0.0), 0.0),
         ((0.0, np.inf, 0.5), 1.0),
         ((0.0, np.inf, 1.0), 1.0),
         ((np.inf, 0.0, 0.0), 0.0),
         ((np.inf, 0.0, 0.5), 0.0),
         ((np.inf, 0.0, 1.0), 1.0),
         ((1.0, np.inf, 0.0), 0.0),
         ((1.0, np.inf, 0.5), 1.0),
         ((1.0, np.inf, 1.0), 1.0),
         ((np.inf, 1.0, 0.0), 0.0),
         ((np.inf, 1.0, 0.5), 0.0),
         ((np.inf, 1.0, 1.0), 1.0)]

for args, ref in cases:
    print(special.betainc(*args), ref)

Produces (actual, desired):

nan nan
nan nan
nan nan
nan nan
nan nan
nan nan
nan 0.0
nan 1.0
nan 1.0
nan 0.0
nan 0.0
nan 1.0
nan 0.0
nan 1.0
nan 1.0
nan 0.0
nan 0.0
nan 1.0
nan 0.0
nan 1.0
nan 1.0
nan 0.0
nan 0.0
nan 1.0
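For reference, the conventions in the table above can be collected into a small helper that returns the edge-case value, or `None` when the ordinary algorithm should run (a sketch in plain NumPy; the name `betainc_edge` is hypothetical):

```python
import numpy as np

def betainc_edge(a, b, x):
    """Return the conventional betainc value for edge inputs, or None."""
    # Indeterminate combinations: both shape parameters at the same extreme.
    if (a == 0 and b == 0) or (np.isinf(a) and np.isinf(b)):
        return np.nan
    # Ordinary inputs: defer to the regular algorithm.
    if not (a == 0 or b == 0 or np.isinf(a) or np.isinf(b)):
        return None
    # The CDF endpoint values hold for every remaining combination.
    if x == 0:
        return 0.0
    if x == 1:
        return 1.0
    # For 0 < x < 1: a == 0 or b == inf concentrates all mass at the left
    # endpoint, so the CDF is already 1; a == inf or b == 0 concentrates
    # it at the right endpoint, so the CDF is still 0.
    if a == 0 or np.isinf(b):
        return 1.0
    return 0.0
```

A full implementation would apply this check elementwise (e.g. with `jnp.where`) before falling back to the regular evaluation.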

System info (python version, jaxlib version, accelerator, etc.)

jax: 0.4.26
jaxlib: 0.4.26
numpy: 1.25.2
python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
jax.devices (1 total, 1 local): [CpuDevice(id=0)]
process_count: 1
platform: uname_result(system='Linux', node='e901fac133dc', release='6.1.85+', version='#1 SMP PREEMPT_DYNAMIC Sun Apr 28 14:29:16 UTC 2024', machine='x86_64')

@rajasekharporeddy (Contributor) commented Jun 17, 2024

Hi @mdhaber

JAX typically uses single-precision floating-point numbers for calculations, while SciPy defaults to double precision. This difference in precision can lead to different results, especially when working with very small numbers. If double precision is enabled in JAX, it yields results consistent with SciPy even for very small numbers:

import jax

jax.config.update('jax_enable_x64', True)

import matplotlib.pyplot as plt
import numpy as np
from scipy.special import betainc as betainc_scipy
import jax.numpy as xp
from jax.scipy.special import betainc as betainc_jax

a = np.logspace(-40, -1, 300)
b = 1
x = 0.25
plt.loglog(a, betainc_scipy(a, b, x), label='scipy')
plt.loglog(a, betainc_jax(xp.asarray(a), b, x), label='jax')
plt.xlabel('a')
plt.ylabel('betainc(a, 1, 0.25)')
plt.legend()

[plot: with jax_enable_x64, the SciPy and JAX curves agree]

Please find the gist for reference.

Thank you.

@mdhaber (Author) commented Jun 17, 2024

Thanks! Although this actually came up in the context of 32-bit calculations. The definitions should have been:

a = np.logspace(-40, -1, 300, dtype=np.float32)
b = np.float32(1.)
x = np.float32(0.25)

and the plot looks the same. SciPy's outputs are float32, so I assume that's being preserved internally, although perhaps it is converting back and forth.
In any case, I know the trouble area is toward the small end of normal numbers and extends into the subnormals, so I understand if it's not a priority. Feel free to close!


To zoom in:

import matplotlib.pyplot as plt
import numpy as np
from scipy.special import betainc as betainc_scipy
import jax.numpy as xp
from jax.scipy.special import betainc as betainc_jax

a0 = np.finfo(np.float32).smallest_normal
b = np.float32(1.)
x = np.float32(0.25)
factor = np.float32(10)
a = np.logspace(np.log10(a0), np.log10(a0*factor), 300, dtype=np.float32)
plt.loglog(a, betainc_scipy(a, b, x), label='scipy')
plt.loglog(a, betainc_jax(xp.asarray(a), b, x), label='jax')
plt.xlabel('a')
plt.ylabel('betainc(a, 1, 0.25)')
plt.legend()

[plot: zoomed view of betainc(a, 1, 0.25) near the float32 smallest normal]

@rajasekharporeddy (Contributor) commented Jun 24, 2024

Hi @mdhaber

IIUC, according to scipy/scipy#8495 (comment), SciPy does all the internal (low-level C) calculations in float64 even if the input is float32 or another dtype, whereas JAX does them in float32 itself. That might be causing this difference.

Thank you.
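If internal precision is indeed the difference, one user-side workaround is to upcast the arguments to float64 and round the result back down, mimicking what an internally upcasting implementation would do (a sketch using SciPy's `betainc` for the float64 evaluation; the name `betainc_via_f64` is hypothetical):

```python
import numpy as np
from scipy.special import betainc

def betainc_via_f64(a, b, x):
    # Evaluate in double precision, then round the result back to
    # float32, as an internally upcasting implementation would.
    a64, b64, x64 = (np.asarray(v, dtype=np.float64) for v in (a, b, x))
    return betainc(a64, b64, x64).astype(np.float32)
```

The rounded-back value is correct whenever the float64 evaluation itself is accurate enough, which covers the small-`a` region from the plots above.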

@pearu (Collaborator) commented Jun 24, 2024

Whatever the reasons for SciPy to use float64 internally (one practical reason could be that no float32 implementation was available to SciPy, for instance), evaluating a function correctly in float32 requires an algorithm that properly handles overflows, underflows, and cancellations. Using higher precision is a typical cheap trick for avoiding these floating-point errors while keeping the algorithm simple.
So, I wonder where the jax.scipy.special.betainc implementation lives; that may explain the behavior observed in this issue.
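To illustrate the point about float32-aware algorithms: for `b == 1` the identity `betainc(a, 1, x) = x**a` holds, and the complementary probability `1 - x**a` can be computed to full float32 accuracy with `expm1`, while the naive subtraction cancels to zero. (A sketch of the idea, not JAX's actual algorithm.)

```python
import numpy as np

a = np.float32(2e-38)   # near the float32 smallest normal
x = np.float32(0.25)

# betainc(a, 1, x) = x**a, so 1 - betainc(a, 1, x) = -expm1(a * log(x)).
naive = np.float32(1.0) - x**a      # x**a rounds to 1.0, so this is 0.0
stable = -np.expm1(a * np.log(x))   # keeps full float32 accuracy
```

No higher precision is needed here; the algorithm just has to avoid the cancellation.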

@jakevdp (Collaborator) commented Jun 24, 2024

@mdhaber (Author) commented Jun 24, 2024

> SciPy does all the internal (low-level C) calculations in float64 even if the input is float32 or another dtype.

But that comment is about scipy.ndimage.affine_transform, not scipy.special.betainc.

I confirmed with @steppi that SciPy now uses Boost's ibeta for betainc, and the types seem to be preserved in the calculation.

Here is where betainc is defined in terms of ibeta.
https://github.com/scipy/scipy/blob/e36e728081475466d2faae65e1dfecfa2314c857/scipy/special/functions.json#L118-L123

Here is where ibeta is used for float and double instantiations of the function.
https://github.com/scipy/scipy/blob/e36e728081475466d2faae65e1dfecfa2314c857/scipy/special/boost_special_functions.h#L106-L116

and Boost's ibeta is templated:
https://beta.boost.org/doc/libs/1_68_0/libs/math/doc/html/math_toolkit/sf_beta/ibeta_function.html

@vfdev-5 (Collaborator) commented Jul 9, 2024

Checking the same code on a GPU, the plots look a bit different:

import matplotlib.pyplot as plt
import numpy as np
from scipy.special import betainc as betainc_scipy
import jax.numpy as xp
from jax.scipy.special import betainc as betainc_jax

a = np.logspace(-40, -1, 300)
b = 1
x = 0.25

output = betainc_jax(xp.asarray(a), b, x)

plt.loglog(a, betainc_scipy(a, b, x), label='scipy')
plt.loglog(a, output, label='jax')
plt.xlabel('a')
plt.ylabel('betainc(a, 1, 0.25)')
plt.legend()

print(output.devices(), output.dtype)
# {cuda(id=0)} float32

[plot: on GPU (float32), the JAX curve follows SciPy for small a]

and

import matplotlib.pyplot as plt
import numpy as np
from scipy.special import betainc as betainc_scipy
import jax.numpy as xp
from jax.scipy.special import betainc as betainc_jax

a0 = np.finfo(np.float32).smallest_normal
b = np.float32(1.)
x = np.float32(0.25)
factor = np.float32(10)
a = np.logspace(np.log10(a0), np.log10(a0*factor), 300, dtype=np.float32)

output = betainc_jax(xp.asarray(a), b, x)
plt.loglog(a, betainc_scipy(a, b, x), label='scipy')
plt.loglog(a, output, label='jax')
plt.xlabel('a')
plt.ylabel('betainc(a, 1, 0.25)')
plt.legend()

print(output.devices(), output.dtype)
# {cuda(id=0)} float32

[plot: zoomed view near the float32 smallest normal, GPU results]

So, to reproduce the issue, we can force CPU execution by adding the following on top of the code:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""
os.environ["JAX_PLATFORMS"] = "cpu"

@mdhaber (Author) commented Feb 8, 2025

Thanks for the investigation @vfdev-5. I extended the top post with a description of a separate issue regarding edge case behavior, which is probably easier to address.


@pearu (Collaborator) commented Mar 13, 2025

Heads up: #27107 fixes all issues reported here.

@mdhaber (Author) commented Mar 20, 2025

Thank you!
