Overflow and underflow

Updated 16 days ago

Definition

When a computer represents a real number, there is usually some approximation error, which comes from the finite binary representation itself. In many cases the computer rounds these values. Put simply, if a number can retain only 5 decimal places and its value is $10^{-6}$, the computer may round it to 0. The problems caused by this rounding are especially common in compound operations, where the errors of individual steps accumulate.

Underflow: when a number close to zero is rounded to zero, an underflow occurs.
Overflow: when a number of very large magnitude is approximated as $\infty$ or $-\infty$, an overflow occurs.
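
Both effects are easy to reproduce with ordinary floating-point arithmetic. The following is a minimal sketch, assuming Python floats are IEEE 754 double-precision numbers (true on common platforms); the variable names are chosen here only for illustration.

```python
import math

tiny = 1e-200
huge = 1e200

# The exact product is 1e-400, which is smaller than the smallest
# positive double, so the result is rounded to zero (underflow).
underflowed = tiny * tiny

# The exact product is 1e400, which is larger than the largest finite
# double (about 1.8e308), so the result becomes infinity (overflow).
overflowed = huge * huge

print(underflowed)             # 0.0
print(overflowed)              # inf
print(math.isinf(overflowed))  # True
```

Neither operation raises an error here, which is why such problems can go unnoticed until a later step divides by the zero or combines infinities.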

In fact, overflow and underflow correspond to the exploding gradient and vanishing gradient problems that appear during gradient descent. Whichever of the two occurs, model training may fail.

Example

The softmax function is a typical function that is prone to overflow and underflow. The form of this function is as follows:

$softmax(x)_i=\frac{exp(x_i)}{\sum_{j=1}^n exp(x_j)}$

Assume every $x_i$ equals a constant $c$. Then, analytically, every output of the softmax function above is $\frac{1}{n}$.

Numerically, however, things can go wrong. When $c$ is a very large negative number, $exp(c)$ is very close to 0 and may underflow to 0. The denominator then becomes 0, and the output of the softmax function is undefined. Similarly, when $c$ is a very large positive number, $exp(c)$ may overflow, so both the numerator and the denominator become infinite, which again makes the whole expression undefined.
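
The failure can be reproduced with a direct translation of the formula. This is a minimal sketch using only Python's standard library; `naive_softmax` and `try_softmax` are hypothetical helper names introduced here. Note one assumption about the environment: Python's `math.exp` raises `OverflowError` instead of returning infinity, and dividing by a float zero raises `ZeroDivisionError`, so in this sketch the "undefined" results appear as exceptions.

```python
import math

def naive_softmax(xs):
    """Softmax computed exactly as in the formula, with no safeguards."""
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def try_softmax(xs):
    """Return the softmax, or the name of the error that prevented it."""
    try:
        return naive_softmax(xs)
    except (OverflowError, ZeroDivisionError) as err:
        return type(err).__name__

n = 4
print(try_softmax([0.0] * n))      # moderate c: every output is 1/n
print(try_softmax([-1000.0] * n))  # exp(c) underflows to 0, denominator is 0
print(try_softmax([1000.0] * n))   # exp(c) is too large to represent
```

In array libraries that follow IEEE 754 semantics without raising, the same inputs would typically produce NaN values instead of exceptions.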

Solution

There are several ways to deal with overflow and underflow in softmax. One is to shift the inputs before applying the function, that is, to compute:
$softmax(z), \quad z = x - \max_i x_i$

Since the same value is subtracted from (or added to) every input, the differences between the inputs remain unchanged, and so does the output of the softmax function: large values stay large and small values stay small.
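
To see why the output is unchanged, the common factor can be written out explicitly. The following short derivation assumes the componentwise definition of softmax given earlier and writes $m = \max_j x_j$ (the symbol $m$ is introduced here only for brevity):

$softmax(x - m)_i = \frac{exp(x_i - m)}{\sum_{j=1}^n exp(x_j - m)} = \frac{exp(-m)\,exp(x_i)}{exp(-m)\sum_{j=1}^n exp(x_j)} = softmax(x)_i$

The factor $exp(-m)$ cancels between the numerator and the denominator, so the identity holds for any constant, not only for the maximum.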

This shift guarantees that, however large the inputs are, the largest element of $z$ is 0, so the largest value of $exp(z_i)$ is 1. This prevents overflow.
It also guarantees that the denominator of the softmax formula is greater than or equal to 1, because the largest $x_i - \max_i x_i$ equals 0 and $exp(0) = 1$. The denominator therefore cannot underflow to 0, which prevents division by zero.
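
The shifted version can be sketched as follows. This is a minimal illustration using only Python's standard library, not a reference implementation; `stable_softmax` is a name chosen here.

```python
import math

def stable_softmax(xs):
    """Softmax of z = x - max(x), which equals the softmax of x."""
    m = max(xs)
    # Every exponent is <= 0, so math.exp cannot overflow.
    exps = [math.exp(x - m) for x in xs]
    # At least one term is exp(0) == 1, so the sum is >= 1.
    total = sum(exps)
    return [e / total for e in exps]

print(stable_softmax([1000.0] * 4))   # [0.25, 0.25, 0.25, 0.25]
print(stable_softmax([-1000.0] * 4))  # [0.25, 0.25, 0.25, 0.25]
print(stable_softmax([1000.0, 0.0]))  # [1.0, 0.0]
```

In the last call the smaller term still underflows to 0, but this is harmless: the denominator stays at 1 and the result is a valid probability vector.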

Addendum

Besides the denominator, the numerator can also overflow or underflow. For example, in $\log softmax(x)$, when the numerator underflows to 0, the logarithm evaluates to $-\infty$, which in turn causes an overflow.
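
One common way around this, which the text above does not spell out and which is offered here only as an assumption about a suitable fix, is to compute the logarithm analytically instead of taking the log of an already underflowed probability: $\log softmax(x)_i = (x_i - m) - \log \sum_j exp(x_j - m)$ with $m = \max_j x_j$. This is often called the log-sum-exp trick. A minimal sketch in standard-library Python, with `log_softmax` as a name chosen here:

```python
import math

def log_softmax(xs):
    """Log of softmax computed without forming the probabilities first."""
    m = max(xs)
    # The sum is >= 1 (it contains exp(0)), so its log is finite and >= 0.
    log_sum = math.log(sum(math.exp(x - m) for x in xs))
    return [(x - m) - log_sum for x in xs]

# The softmax probability of the second element underflows to 0, so taking
# its log directly would fail (math.log(0.0) raises ValueError in Python).
# The analytic form returns a finite value instead.
print(log_softmax([1000.0, 0.0]))  # [0.0, -1000.0]
```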