What are the most common operations that would cause a NaN, in Python, which originate while working with NumPy or SciPy?

For example:

1e500 - 1e500
>>> nan

What is the reasoning for this behavior and why does it not return 0?

If you do any of the following without horsing around with the floating-point environment, you should get a NaN where you didn’t have one before:

  • 0/0 (either sign on top and bottom)
  • inf/inf (either sign on top and bottom)
  • inf - inf or (-inf) + inf or inf + (-inf) or (-inf) - (-inf)
  • 0 * inf and inf * 0 (either sign on both factors)
  • sqrt(x) when x < 0
  • fmod(x, y) when y = 0 or x is infinite; here fmod is floating-point remainder.

The canonical reference for these aspects of machine arithmetic is the IEEE 754 specification. Section 7.1 describes the invalid operation exception, which is the one that is raised when you’re about to get a NaN. “Exception” in IEEE 754 means something different than it does in a programming language context.

Lots of special function implementations document their behaviour at singularities of the function they’re trying to implement. See the man page for atan2 and log, for instance.

You’re asking specifically about NumPy and SciPy. I’m not sure whether this is simply to say “I’m asking about the machine arithmetic that happens under the hood in NumPy” or “I’m asking about eig() and stuff.” I’m assuming the former, but the rest of this answer tries to make a vague connection to the higher-level functions in NumPy. The basic rule is: If the implementation of a function commits one of the above sins, you get a NaN.

For fft, for instance, you’re liable to get NaNs if your input values are around 1e1010 or larger and a silent loss of precision if your input values are around 1e-1010 or smaller. Apart from truly ridiculously scaled inputs, though, you’re quite safe with fft.

For things involving matrix math, NaNs can crop up (usually through the inf - inf route) if your numbers are huge or your matrix is extremely ill-conditioned. A complete discussion of how you can get screwed by numerical linear algebra is too long to belong in an answer. I’d suggest going over a numerical linear algebra book (Trefethen and Bau is popular) over the course of a few months instead.

One thing I’ve found useful when writing and debugging code that “shouldn’t” generate NaNs is to tell the machine to trap if a NaN occurs. In GNU C, I do this:

#include <fenv.h>