Teachings of a Samurai Engineer

Teachings of a Samurai Engineer 2: Various Considerations on Numerical Calculations

In my last column, I said that depending on the circumstances, it is possible that an error may be produced when calculating a total, for example in an accounting system. Let’s explore such a situation in more detail.

Many programming languages, including PHP, have data types, either explicitly or implicitly.

PHP handles “numbers” with two data types:

  • int (integer numbers)
  • float/double (real numbers)

Although I wrote both float and double for the real-number data type, PHP treats the two as exactly the same type.

In general, the int data type is often used. But as I wrote in the previous column, it has a maximum permitted size.

To confirm the byte size, you can use the constant PHP_INT_SIZE. To see the numerical range that is actually permitted, use PHP_INT_MIN and PHP_INT_MAX.

In reality, these values differ depending on the operating environment.

The range of PHP_INT_MIN to PHP_INT_MAX is often -2147483648 to 2147483647 (32-bit system) or -9223372036854775808 to 9223372036854775807 (64-bit system).
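To see which of the two ranges applies to your own environment, a minimal sketch like the following prints the three constants mentioned above. It only reads built-in constants; PHP_INT_MIN assumes PHP 7.0 or later.

```php
<?php
// Report the integer width and range of the PHP build that runs this script.
printf("PHP_INT_SIZE: %d bytes\n", PHP_INT_SIZE);
printf("PHP_INT_MIN : %d\n", PHP_INT_MIN);
printf("PHP_INT_MAX : %d\n", PHP_INT_MAX);
```

A PHP_INT_SIZE of 4 corresponds to the 32-bit range, and 8 to the 64-bit range.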

In the case of the latter, the maximum value is approximately 922 “kei” (a Japanese unit equal to 10^16, so roughly 9.2 quintillion), a magnitude that most people are unfamiliar with. In the case of the former, the maximum is about 2.1 billion, a magnitude most people know.

Put another way, 2.1 billion is small enough that, depending on the work being done, you may well think, “We might have to handle a number that exceeds this value.”

So what happens when the upper limit given above is exceeded?

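In a language such as C, if 1 is added to a signed integer that already holds its maximum value (the counterpart of PHP_INT_MAX), the number typically wraps around to a negative value on common platforms (this is called “overflow”; strictly speaking, the C standard leaves signed integer overflow undefined).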

On a related note, search for “Nuclear Gandhi” on Google or another search engine. You’ll find an intriguing story about the character of Gandhi in the game Civilization, so be sure to check it out. (“Nuclear Gandhi” is commonly described as a case of what is called “integer underflow.”)

PHP has a different behavior from what I described above.

First, let’s observe this behavior with some code. We’ll then confirm that it is the intended behavior by checking the PHP manual.

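<?php
// Add 1 to the largest int and inspect the type of the result.
$i = PHP_INT_MAX;
var_dump($i);
$i += 1;
var_dump($i);
// Subtract 1 from the smallest int and inspect the type of the result.
$i = PHP_INT_MIN;
var_dump($i);
$i -= 1;
var_dump($i);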

In my environment, the results are as follows:
<Result>
int(9223372036854775807)
float(9.2233720368548E+18)
int(-9223372036854775808)
float(-9.2233720368548E+18)
</Result>

Although you might expect the result to stay an int, once a value goes above PHP_INT_MAX or below PHP_INT_MIN, it becomes a float.

This behavior is officially described in the PHP manual:
https://www.php.net/manual/en/language.types.integer.php#language.types.integer.overflow

If PHP encounters a number beyond the bounds of the integer type, it will be interpreted as a float instead. Also, an operation which results in a number beyond the bounds of the integer type will return a float instead.

In short, the data type changes from integer to float. What kind of problem does this cause?

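A float is a data type that can express a very wide range of values (to a certain extent). In exchange for this ability, it has limited precision (typically about 15 to 17 significant decimal digits, since PHP floats are usually 64-bit IEEE 754 doubles), so a value may contain a degree of error. The existence of this degree of error can be a problem.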

For example:
<?php
// Both literals exceed PHP_INT_MAX on a 64-bit system, so PHP parses them as floats.
$f1 = 9223372036854775808;
$f2 = 9223372036854775809;
var_dump( $f1 === $f2 );

The result of this code is
<Result>
bool(true)
</Result>

Even though the digits in the ones’ place are different (8 and 9), the output indicates that the two numbers are considered the same. This is a case of the “degree of error” in the float data type.
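The same effect can be seen with much smaller numbers. The following is a minimal sketch, assuming that floats are 64-bit IEEE 754 doubles (the usual case, but still an assumption about your platform). Such a float has a 53-bit significand, so 2^53 + 1 is the first positive integer it cannot hold exactly.

```php
<?php
// Assumption: PHP floats are 64-bit IEEE 754 doubles with a 53-bit significand.
$a = (float) 9007199254740992; // 2^53
$b = (float) 9007199254740993; // 2^53 + 1, which rounds to 2^53 as a float
var_dump($a === $b);           // bool(true)
```

In other words, once a value has turned into a float, precision can already be lost well below PHP_INT_MAX.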

Because of this, you generally need to watch for an int silently becoming a float when PHP_INT_MAX is exceeded, although how likely that is depends on the nature of the computing task.

To guard against this problem, you can, for example, check the calculation result with is_int(). If is_int() returns false, throw an exception to stop processing, so that you learn quickly that a problem has occurred.

<?php
// is_int() reveals the moment the value stops being an int.
$i = PHP_INT_MAX;
var_dump( is_int($i) );
$i += 1;
var_dump( is_int($i) );

<Result>
bool(true)
bool(false)
</Result>
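The is_int() check can be wrapped in a small helper that throws when the result is no longer an int. This is only a sketch: the function name addOrFail() is hypothetical, and OverflowException (a standard SPL exception class) is just one reasonable choice of exception.

```php
<?php
// Hypothetical helper: add two ints and refuse to continue if PHP
// promoted the result to float because it left the int range.
function addOrFail(int $a, int $b): int
{
    $sum = $a + $b;
    if (!is_int($sum)) {
        throw new OverflowException("Integer overflow while adding $a and $b");
    }
    return $sum;
}

var_dump(addOrFail(1, 2)); // int(3)

try {
    addOrFail(PHP_INT_MAX, 1);
} catch (OverflowException $e) {
    // In a real system you would log the error and abort the calculation here.
    echo "Stop processing: ", $e->getMessage(), "\n";
}
```

Putting the check in one place means that every total calculated through the helper either stays an exact int or fails loudly.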

Or, if you expect that the values being handled will probably exceed 2.1 billion but not 922 kei, you can write code that refuses to let the program proceed when PHP_INT_SIZE is less than 8.

This approach is easier, as you only need to write it once at a place that is always accessed, such as the top page of a website.
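Such a startup guard might look like the sketch below. The function name requireSixtyFourBitInt() and the choice of RuntimeException are my own assumptions for illustration. The size is passed as a parameter only so that the check can be exercised with other values.

```php
<?php
// Hypothetical startup guard: stop immediately on a build whose int
// type is narrower than 8 bytes (64 bits).
function requireSixtyFourBitInt(int $intSize = PHP_INT_SIZE): void
{
    if ($intSize < 8) {
        throw new RuntimeException(
            'This application requires 64-bit integers (PHP_INT_SIZE >= 8).'
        );
    }
}

// Call once from a file that every request passes through.
requireSixtyFourBitInt();
```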

Now, there are times when you may need to handle a large value and want to calculate it properly and accurately.

In the case of PHP, there are two “arbitrary-precision arithmetic” libraries available: “BCMath Arbitrary Precision Mathematics” and “GNU Multiple Precision.”

Neither module is installed by default, and proper installation steps are required (if you compile PHP yourself, the prescribed configure options must be explicitly specified). By using either library, you can calculate values correctly, even if they are “big.”

For example, let’s try writing code that adds 1 to PHP_INT_MAX using the “BCMath Arbitrary Precision Mathematics” library.
<?php
var_dump(bcadd('9223372036854775807', '1'));

<Result>
string(19) "9223372036854775808"
</Result>
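For comparison, the same addition with GMP might look like this minimal sketch. It assumes the GMP extension is loaded and simply exits if it is not. gmp_add() returns a GMP object, so gmp_strval() is used to turn the result into a string for display.

```php
<?php
// Assumption: the GMP extension is available in this environment.
if (!extension_loaded('gmp')) {
    exit("GMP extension is not loaded\n");
}

// Pass the operands as strings so they never go through int or float.
$sum = gmp_add('9223372036854775807', '1');
var_dump(gmp_strval($sum)); // string(19) "9223372036854775808"
```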

An arbitrary-precision arithmetic library often treats values as strings rather than floats, as BCMath does here, so there is no degree of error.

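With an arbitrary-precision arithmetic library, you can compare the size of numbers with comparison functions such as bccomp() and gmp_cmp(), both of which return 0 when the two values are equal. With BCMath, whose results are plain strings, you can also check for equality with === as long as both strings are written in the same form.

So you can handle these numbers much as you normally do with ordinary numbers.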
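As a small illustration, the two values that float could not tell apart earlier can be distinguished with BCMath. This sketch assumes the BCMath extension is loaded and exits otherwise.

```php
<?php
// Assumption: the BCMath extension is available in this environment.
if (!extension_loaded('bcmath')) {
    exit("BCMath extension is not loaded\n");
}

$a = '9223372036854775808';
$b = bcadd($a, '1');      // "9223372036854775809"

var_dump($a === $b);      // bool(false): the strings differ
var_dump(bccomp($a, $b)); // int(-1): $a is smaller than $b
var_dump(bccomp($b, $b)); // int(0): equal values
```

bccomp() is the safer choice for equality when the two strings may be written differently, for example with leading zeros.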

When you need to store a big number like those handled by these two libraries, the widely used MySQL allows the following, as written in its manual:
https://dev.mysql.com/doc/refman/5.6/en/numeric-type-syntax.html

You can always store an exact integer value in a BIGINT column by storing it using a string. In this case, MySQL performs a string-to-number conversion that involves no intermediate double-precision representation.

Remember this documented feature well.

This feature remains in MySQL Version 8.0.

Michiaki Furusho