Scale factor (computer science)

A scale factor is used in computer science when a real-world set of numbers needs to be represented on a different scale in order to fit a specific number format. For instance, a 16-bit unsigned integer (uint16) can only hold values up to 65,535 (decimal). If uint16 values are to represent real-world values from 0 to 131,070, a scale factor of 1/2 can be introduced. Notice that while the scale factor extends the range, it also decreases the precision. In this example, the number 3 cannot be represented, because a stored 1 represents a real-world 2 and a stored 2 represents a real-world 4.
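The uint16 example can be sketched in a few lines of Python. The names SCALE_DIVISOR, to_stored and to_real are hypothetical, chosen only for this illustration, and truncation toward zero is assumed as the way fractions are discarded.

```python
# Assumption: a scale factor of 1/2 is applied by integer division by 2.
SCALE_DIVISOR = 2
UINT16_MAX = 65_535

def to_stored(real: int) -> int:
    """Scale a real-world value so that it fits in a uint16."""
    stored = real // SCALE_DIVISOR  # the fractional part is discarded
    if not 0 <= stored <= UINT16_MAX:
        raise OverflowError(f"{real} does not fit in a uint16 after scaling")
    return stored

def to_real(stored: int) -> int:
    """Undo the scaling to get back a real-world value."""
    return stored * SCALE_DIVISOR

for real in (2, 3, 4, 131_070):
    stored = to_stored(real)
    print(real, "->", stored, "->", to_real(stored))
```

Running the loop shows that 3 comes back as 2: the extended range is paid for with lost precision.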

Uses

Certain number formats may be chosen for an application for convenience in programming, or because of advantages the hardware offers for that number format. For instance, early processors did not natively support the IEEE floating-point standard for representing fractional values, so integers were used to store representations of real-world values by applying a scale factor to the real value. By necessity, this was done in software, since the hardware did not support fractional values.

Operations on Scaled Values

Once the scaled representation of a real value is stored, the scaling can often be ignored until the value needs to come back into the "real world". For instance, adding two scaled values is just as valid as unscaling the values, adding the real values, and then scaling the result, and the former is much easier and faster. For other operations, however, the scaling is very important.
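A minimal sketch of the addition case, assuming a scale factor Z of 1000 and two arbitrary sample values (both are assumptions made for illustration, as is the helper name scale):

```python
Z = 1000  # assumed scale factor for this sketch

def scale(real: float) -> int:
    """Return the stored (scaled) integer representation of a real value."""
    return round(real * Z)

a, b = 1.5, 2.25  # arbitrary sample values

# Adding the stored values directly...
sum_of_scaled = scale(a) + scale(b)
# ...matches scaling the real-world sum.
scaled_sum = scale(a + b)

print(sum_of_scaled, scaled_sum)
```

Both expressions yield the same stored integer, so no unscaling is needed before the addition.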

Multiplication, for instance, needs to take account of the fact that both numbers are scaled. As an example, consider two real world values A and B. The real world multiplication of these real world values is:

A * B = P

Now suppose we're storing the values with a scale factor of Z. If we simply multiply the stored representations we'll get the following:

AZ * BZ = Q

Note that AZ is the scaled representation of the real world value A, that is, simply the product A * Z; likewise, BZ is the scaled representation of B. Also note that we didn't write PZ as the answer. The reason is simple: PZ is not the answer. You can see this by rearranging the statement, where each of the following lines is equivalent:

AZ * BZ = Q
A * Z * B * Z = Q
(A * B) * Z * Z = Q
P * Z * Z = Q
PZ * Z = Q

Note how we substituted P for A * B on line 4. You can now see that the result of AZ * BZ (which is Q) is NOT PZ; it is PZ * Z. If PZ were the answer, we could simply store it directly, since it has the scale factor built in, as is the case with addition and subtraction. For multiplication, however, the product of two scaled values has an extra scaling built in. As long as this is taken into account, there is still no need to convert AZ and BZ into A and B before performing the operation; you just need to divide the result by Z before storing it back. You will then have PZ stored as the result of the multiplication, which is what you want: you are not storing the result of AZ * BZ, you are storing the scaled representation of the result of A * B.
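The derivation can be checked numerically. The sketch below assumes Z = 1000 and two arbitrary sample values; the variable names mirror the symbols used in the text.

```python
Z = 1000          # assumed scale factor
A, B = 1.5, 2.25  # arbitrary real-world sample values
P = A * B         # real-world product

AZ = round(A * Z)  # stored representation of A
BZ = round(B * Z)  # stored representation of B
PZ = round(P * Z)  # what we ultimately want to store

Q = AZ * BZ        # raw product of the stored values: carries Z twice
corrected = Q // Z # divide once by Z to remove the extra scaling

print(Q, PZ * Z)      # Q equals PZ * Z, not PZ
print(corrected, PZ)  # after the correction the two agree
```

With a scale factor that is a power of two, the division by Z would typically be done with a right shift instead.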

Common Scaling Scenarios

Fractional Values Scaled to Integers

As already mentioned, many older processors (and possibly some current ones) do not natively support fractional mathematics. In this case, fractional values can be scaled into integers by multiplying them by ten to the power of whatever decimal precision you want to retain. In other words, if you want to preserve n digits to the right of the decimal point, you need to multiply the entire number by 10^n. (Or, if you're working in binary and you want to save m digits to the right of the binary point, you would multiply the number by 2^m or, alternatively, bit shift the value m places to the left.) For example, consider the following set of real world fractional values:

15.400, 0.133, 4.650, 1.000, 8.001

Notice how they all have 3 digits to the right of the decimal point. If we want to save all of that information (in other words, not lose any precision), we need to multiply these numbers by 10^3, or 1,000, giving us integer values of:

15400, 133, 4650, 1000, 8001

(Also note that these numbers cannot be stored in 8 bit integers; the largest, 15400, requires at least 14 bits or, more realistically, a 16 bit field.)
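The conversion above can be sketched as follows. The decimal module is used here only so that the inputs are not disturbed by binary floating point; that choice, and the names DIGITS and scaled, are assumptions of this sketch.

```python
from decimal import Decimal

DIGITS = 3  # number of decimal places to preserve
values = ["15.400", "0.133", "4.650", "1.000", "8.001"]

# Multiply by 10^DIGITS and keep the result as a plain integer.
scaled = [int(Decimal(v) * 10**DIGITS) for v in values]
print(scaled)

# Number of bits needed for the largest scaled value.
print(max(scaled).bit_length())
```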

Integer values to Fractional

Certain processors, particularly the DSPs common in the embedded systems industry, have built-in support for fixed point arithmetic, such as the Q and IQ formats.

Since the fractional part of a number takes up some bits in the field, the range of values possible in a fixed point value is less than the same number of bits would provide to an integer. For instance, in an 8 bit field, an unsigned integer can store values from [0, 255], but an unsigned fixed point value with 5 fractional bits has only 3 bits left over for the integer part, and so can only store integer values from [0, 7]. (Note that the number of values the two fields can store is the same, 2^8 = 256, because the fixed point field can also store 32 fractional values for each integer value.) It is therefore common to use a scaling factor to store real world values that may be larger than the maximum value of the fixed point format.
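The trade-off between range and fractional resolution can be tabulated with a small helper. The function name fixed_point_limits and the dictionary keys are hypothetical; only unsigned formats are considered.

```python
def fixed_point_limits(total_bits: int, frac_bits: int) -> dict:
    """Describe an unsigned fixed point format with the given bit split."""
    steps_per_unit = 1 << frac_bits    # fractional values per integer value
    max_raw = (1 << total_bits) - 1    # largest raw bit pattern
    return {
        "count": max_raw + 1,                  # distinct representable values
        "resolution": 1 / steps_per_unit,      # smallest step between values
        "max_integer": max_raw >> frac_bits,   # largest integer part
        "max_value": max_raw / steps_per_unit, # largest representable value
    }

print(fixed_point_limits(8, 0))  # plain 8 bit unsigned integer
print(fixed_point_limits(8, 5))  # 3 integer bits, 5 fractional bits
```

Both formats report the same count of 256 values; only the way those values are spread over the number line differs.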

As an example, assume we are using an unsigned 8 bit fixed point format with 4 fractional bits and 4 integer bits. The highest integer value it can store is 15, and the highest mixed value it can store is 15.9375 (0xF.F or 1111.1111b). If the real world values we want to manipulate are in the range [0, 160], we need to scale these values in order to get them into fixed point. Note that we cannot use a scale factor of 1/10 here, because scaling 160 by 1/10 gives us 16, which is greater than the greatest value we can store in our fixed point format. A scale factor of 1/11 will work, however, because 160/11 = 14.5454... which fits in our range. Let's use this scale factor to convert the following real world values into scaled representations:

154, 101, 54, 3, 0, 160

Scaling these with the scale factor (1/11) gives us the following values:

154/11 = 14
101/11 = 9.1818...
54/11 = 4.9090...
3/11 = 0.2727...
0/11 = 0
160/11 = 14.5454...
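The range check that ruled out 1/10 and admitted 1/11 can be sketched as below. The helper name fits is hypothetical, and exact fractions are used only to avoid floating point rounding in the comparison.

```python
from fractions import Fraction

FRAC_BITS = 4
# Largest value of the unsigned 8 bit format with 4 fractional bits: 15.9375
MAX_STORABLE = Fraction(2**8 - 1, 2**FRAC_BITS)

def fits(max_real: int, scale: Fraction) -> bool:
    """True if the largest real-world value, once scaled, is storable."""
    return max_real * scale <= MAX_STORABLE

print(fits(160, Fraction(1, 10)))  # 160/10 = 16 is compared with 15.9375
print(fits(160, Fraction(1, 11)))  # 160/11 = 14.5454... is compared with 15.9375
```

A check like this only establishes that the range fits; it says nothing about how much precision survives the scaling.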

Note, however, that many of these values have been truncated because they contain repeating fractions. When we try to store these in our fixed point format, we're going to lose some of our precision (which didn't seem all that precise when they were just integers). This is an interesting problem, because we said we could fit 256 different values into our 8 bit format, and we're only trying to store values from a range with 161 possible values (0 through 160). As it turns out, the problem was our scale factor, 1/11, which introduced unnecessary precision requirements. The resolution of the problem is to find a better scaling factor. For more information, read on.

Picking a Scale Factor

An example above illustrated how certain scale factors can cause unnecessary precision loss. We will revisit this example to further explore the situation.

We're storing representations of real data in 8 bit unsigned fixed point fields with 4 integer bits and 4 fractional bits. This gives us a range of [0, 15.9375] in decimal, or [0x0.0, 0xF.F] in hex. Our real world data is all integers and in the range [0, 160] in decimal. Note that there are only 161 unique values that we may want to store, so our 8 bit field should be plenty, since 8 bits can have 256 unique configurations.

In the example given above, we picked a scale factor of 1/11 so that all the numbers would be small enough to fit in the range. However, when we began scaling the following real world data:

154, 101, 54, 3, 0, 160

We discovered that the precision of these fractions is going to be a problem. The following box illustrates this showing the original data, its scaled decimal values, and the binary equivalent of the scaled value.

154/11 = 14 = 1110.0
101/11 = 9.1818... = 1001.00101110...
54/11 = 4.9090... = 100.111010...
3/11 = 0.2727... = 0.010001...
0/11 = 0 = 0.0
160/11 = 14.5454... = 1110.10001...

Notice how several of the binary fractions require more than the 4 fractional bits provided by our fixed point format. To fit them into our fields, we would simply truncate the remaining bits, giving us the following stored representations:

1110.0000
1001.0010
0100.1110
0000.0100
0000.0000
1110.1000

Or in decimal:

14.0
9.125
4.875
0.25
0.0
14.5

And when we need to bring them back into the real world, we need to divide by our scale factor, 1/11 (that is, multiply by 11), giving the following "real world" values:

154.0
100.375
53.625
2.75
0
159.5

Notice how they've changed? For one thing, they aren't all integers anymore, immediately indicating that an error was introduced in the storage, due to a poor choice of scaling factor.
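The round trip through the fixed point field can be reproduced with integer arithmetic. This is a sketch under the assumptions stated in the text (4 fractional bits, scale factor 1/11, extra bits truncated); the names store and recover are hypothetical.

```python
FRAC_BITS = 4       # 4 integer bits, 4 fractional bits
SCALE_DIVISOR = 11  # scale factor of 1/11

def store(real: int) -> int:
    """Raw 8 bit pattern for real/11, truncating bits beyond the field."""
    return (real << FRAC_BITS) // SCALE_DIVISOR

def recover(raw: int) -> float:
    """Read the raw pattern as fixed point, then undo the 1/11 scaling."""
    return raw / (1 << FRAC_BITS) * SCALE_DIVISOR

for real in (154, 101, 54, 3, 0, 160):
    raw = store(real)
    pattern = f"{raw >> FRAC_BITS:04b}.{raw & 0xF:04b}"
    print(f"{real:>3} -> {pattern} -> {recover(raw)}")
```

Only the inputs that are exact multiples of 11 (here 154 and 0) survive the round trip unchanged.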

Picking a better Scale Factor

Most data sets will not have a perfect scale factor; you will probably always get some error introduced by the scaling process. However, it may well be possible to pick a better scaling factor. For one thing, note that dividing a number by a power of two is the same as shifting all the bits to the right once for each power of two. (It's the same thing in decimal: when you divide by 10, you shift all the decimal digits one place to the right, and when you divide by 100, you shift them all two places to the right.) The pattern of bits doesn't change, it just moves. On the other hand, when you divide by a number that is NOT an integer power of 2, you are changing the bit pattern. This is likely to produce a bit pattern with even more bits to the right of the binary point, artificially introducing required precision. Therefore, it is almost always preferable to use a scale factor that is a power of two. You may still lose bits that get shifted right off the end of the field, but at least you won't be introducing new bits that will be shifted off the end.
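One way to see the difference is to ask whether a scaled value has a finite binary expansion at all, which holds exactly when its reduced denominator is a power of two. The helper name terminates_in_binary is hypothetical, and the sample data is borrowed from the earlier example.

```python
from fractions import Fraction

def terminates_in_binary(value: Fraction) -> bool:
    """True if the value has a finite binary expansion."""
    d = value.denominator  # Fraction always stores the reduced denominator
    return (d & (d - 1)) == 0  # power-of-two test (1 counts as 2^0)

for real in (154, 101, 54, 3, 0, 160):
    by_16 = terminates_in_binary(Fraction(real, 16))
    by_11 = terminates_in_binary(Fraction(real, 11))
    print(f"{real:>3}: /16 finite={by_16}  /11 finite={by_11}")

# Dividing by 16 and shifting right by 4 agree on the integer part.
print(154 // 16, 154 >> 4)
```

Division by 16 always yields a finite expansion, while division by 11 does so only for multiples of 11.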

To illustrate the use of powers of two in the scale factor, let's use a factor of 1/16 with the above data set. The binary value for our original data set is given below:

154 = 1001 1010
101 = 0110 0101
54 =  0011 0110
3 =   0000 0011
0 =   0000 0000
160 = 1010 0000

As we already knew, they all fit in 8 bits. Scaling these by 1/16 is the same as dividing by 16, which is the same as shifting the bits 4 places to the right. All that really means is inserting a binary point between the first four and last four bits of each number. Conveniently, that's the exact format of our fixed point fields. So just as we suspected, since all these numbers don't require more than 8 bits to represent them as integers, it doesn't have to take more than 8 bits to scale them down and fit them in a fixed point format.
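A short sketch of the 1/16 case, assuming the same six sample values; the helper names as_fixed_point and scaled_value are hypothetical. Because the raw bit pattern is unchanged by this scaling, the stored byte is simply the original integer.

```python
def as_fixed_point(raw: int) -> str:
    """Show an 8 bit pattern with a binary point after the first 4 bits."""
    bits = format(raw, "08b")
    return f"{bits[:4]}.{bits[4:]}"

def scaled_value(raw: int) -> float:
    """Value of the raw pattern read as 4 integer bits, 4 fractional bits."""
    return raw / 16

for real in (154, 101, 54, 3, 0, 160):
    raw = real  # scaling by 1/16 leaves the bit pattern as it was
    restored = scaled_value(raw) * 16
    print(f"{real:>3} -> {as_fixed_point(raw)} = {scaled_value(raw)} -> {restored}")
```

Every value is restored exactly, since no bits fall off the end of the field.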
