Q (number format)
From Wikipedia, the free encyclopedia
Q is a fixed-point number format in which the Q number specifies the number of fractional bits. For instance, Q15 has 15 fractional bits. A given Q format usually implies a standard total field size; Q15, for example, generally assumes a 16-bit value with 1 sign bit (or 1 integer bit) in addition to the 15 fractional bits. The format is often used on hardware that does not have a floating-point unit.
Characteristics
Because Q format numbers are fixed point, they can be stored and operated on as integers. The Q size and the underlying integer size are chosen on an application-specific basis, depending on the range and resolution needed.
For a given Q format stored in an N-bit integer with Q fractional bits:
Range
- Signed: $[-2^{N-1-Q},\ 2^{N-1-Q} - 2^{-Q}]$
- Unsigned: $[0,\ 2^{N-Q} - 2^{-Q}]$
Resolution
- $2^{-Q}$
Unlike floating point, resolution is consistent over the entire range.
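The range and resolution formulas can be checked numerically. The following is a minimal sketch, not a standard API: the type `q_limits` and the function `q_format_limits` are hypothetical names introduced here, and the code assumes 0 <= q < n <= 32.

```c
/* Hypothetical helper type: range and resolution of a Q format. */
typedef struct {
    double min;        /* smallest representable value */
    double max;        /* largest representable value */
    double resolution; /* spacing between adjacent values */
} q_limits;

/* n = total bits of the underlying integer, q = fractional bits,
   is_signed != 0 for a signed format. Assumes 0 <= q < n <= 32. */
q_limits q_format_limits(int n, int q, int is_signed)
{
    q_limits lim;
    /* A signed format spends one bit on the sign. */
    int int_bits = is_signed ? n - 1 - q : n - q;
    double span = (double)(1ULL << int_bits);    /* 2^(integer bits) */

    lim.resolution = 1.0 / (double)(1ULL << q);  /* 2^-q */
    lim.min = is_signed ? -span : 0.0;
    lim.max = span - lim.resolution;             /* one step below the span */
    return lim;
}
```

For Q15 in a 16-bit signed integer, `q_format_limits(16, 15, 1)` yields a minimum of -1, a maximum of 1 - 2^-15, and a resolution of 2^-15.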
Representation
- Signed: Q(N,M), with values in $[-2^{N-1},\ 2^{N-1}-1]/2^{M}$
N is the total number of bits and M is the number of bits after the binary point. There is 1 sign bit and N-M-1 integer bits. For example, Q(8,7) has 7 bits after the binary point (i.e. the resolution is $2^{-7}$) and the range is [-1, 0.992].
- Unsigned: U(N,M), with values in $[0,\ 2^{N}-1]/2^{M}$
N is the total number of bits and M is the number of bits after the binary point, so there are N-M integer bits. For example, U(8,7) has 7 bits after the binary point (i.e. the resolution is $2^{-7}$) and the range is [0, 1.992].
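The two example ranges follow directly from the formulas; the worked arithmetic below only expands them and adds nothing beyond the definitions above.

```latex
\begin{aligned}
Q(8,7):&\quad \frac{[-2^{7},\ 2^{7}-1]}{2^{7}}
  = \left[-\tfrac{128}{128},\ \tfrac{127}{128}\right]
  = [-1,\ 0.9921875] \\
U(8,7):&\quad \frac{[0,\ 2^{8}-1]}{2^{7}}
  = \left[0,\ \tfrac{255}{128}\right]
  = [0,\ 1.9921875]
\end{aligned}
```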
Conversion
Float to Q
To convert a number from floating point to Q format:
- Multiply the floating-point number by $2^{Q}$
- Round to the nearest integer
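The two steps above can be sketched in C as follows. This is an illustration under stated assumptions: the function name `float_to_q` is hypothetical, the target is a signed 16-bit integer, halves are rounded away from zero, and out-of-range inputs are saturated, a choice the steps above do not prescribe.

```c
#include <stdint.h>

/* Hypothetical helper: convert x to a signed 16-bit Q value with
   q fractional bits (0 <= q <= 15). */
int16_t float_to_q(double x, int q)
{
    /* Step 1: multiply by 2^q. */
    double scaled = x * (double)(1L << q);

    /* Step 2: round to nearest, halves away from zero; the final
       integer cast truncates toward zero and completes the rounding. */
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;

    /* Assumption: saturate instead of overflowing the 16-bit result. */
    if (scaled >= 32768.0)  return INT16_MAX;
    if (scaled <= -32769.0) return INT16_MIN;
    return (int16_t)scaled;
}
```

With q = 15, 0.5 maps to 16384 and -1.0 maps to -32768, while 1.0 is not representable and saturates to 32767.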
Q to Float
To convert a number from Q format to floating point:
- Convert the number directly to floating point
- Divide by $2^{Q}$
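A minimal sketch of the reverse conversion, assuming a signed 16-bit stored value; the function name `q_to_float` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: interpret raw as a Q value with q fractional
   bits (0 <= q <= 15) and return it as a double. */
double q_to_float(int16_t raw, int q)
{
    /* Convert the integer to floating point, then divide by 2^q. */
    return (double)raw / (double)(1L << q);
}
```

For example, the raw value 16384 with q = 15 is 0.5, and the raw value 127 with q = 7 is 0.9921875.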
Math Operations
Because Q format numbers are stored as integers, math operations can generally be done using built-in integer arithmetic units. Because multiplication and division can quickly overflow the range or underflow the resolution of the integer, the operands should either be widened to a longer integer type or shifted before the operation is performed.
Addition
- C = A + B
Subtraction
- C = A - B
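Addition and subtraction work directly on the stored integers, provided both operands use the same Q. The sketch below assumes Q15 values held in `int16_t`; the names `q_add` and `q_sub` are hypothetical, and saturating on overflow is an added assumption rather than part of the formulas above, which simply add or subtract.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate result to the 16-bit range. */
static int16_t saturate16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* C = A + B; the 32-bit sum of two 16-bit values cannot overflow. */
int16_t q_add(int16_t a, int16_t b)
{
    return saturate16((int32_t)a + (int32_t)b);
}

/* C = A - B, computed the same way. */
int16_t q_sub(int16_t a, int16_t b)
{
    return saturate16((int32_t)a - (int32_t)b);
}
```

In Q15, `q_add(8192, 8192)` gives 16384 (0.25 + 0.25 = 0.5), while `q_add(24576, 16384)` would exceed the range and is clamped to 32767.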
Multiplication
- C = (A >> (Q/2)) * (B >> (Q - (Q/2)))
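The pre-shifted formula keeps the product inside the original integer width but discards low-order bits of both operands before multiplying. The sketch below contrasts it with the alternative mentioned earlier, a wider intermediate, assuming Q15 values in `int16_t`. The function names are hypothetical, and the examples stick to non-negative operands because right-shifting a negative value is implementation-defined in C.

```c
#include <stdint.h>

#define Q 15  /* fractional bits, assumed for this sketch */

/* Wide intermediate: the 32-bit product has 2*Q fractional bits,
   so shifting right by Q returns it to Q format. The caller must
   ensure the result fits in 16 bits (-1 * -1 does not). */
int16_t q_mul_wide(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    return (int16_t)(product >> Q);
}

/* Pre-shifted form: the two shifts add up to Q, so the product is
   already in Q format, at the cost of precision lost before the
   multiplication. */
int16_t q_mul_preshift(int16_t a, int16_t b)
{
    return (int16_t)((a >> (Q / 2)) * (b >> (Q - (Q / 2))));
}
```

Both forms give 8192 for 16384 * 16384 (0.5 * 0.5 = 0.25). For 300 * 16384 the wide form gives 150, while the pre-shifted form gives 128 because the low bits of 300 are shifted out first.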
Division
- C = ((long)(A) << Q) / (B)
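A minimal sketch of the division formula for Q15 values in `int16_t`, with the hypothetical name `q_div`. It widens the dividend to 32 bits and multiplies by 2^Q instead of left-shifting, since left-shifting a negative signed value is undefined behaviour in C; for non-negative dividends the two are equivalent.

```c
#include <stdint.h>

#define Q 15  /* fractional bits, assumed for this sketch */

/* Returns a / b in Q15. The caller must ensure b != 0 and that the
   quotient fits in 16 bits. Integer division truncates toward zero. */
int16_t q_div(int16_t a, int16_t b)
{
    /* Scale the dividend by 2^Q in a wider type so the quotient
       keeps Q fractional bits. */
    int32_t numerator = (int32_t)a * (1 << Q);
    return (int16_t)(numerator / b);
}
```

For example, `q_div(8192, 16384)` returns 16384, that is 0.25 / 0.5 = 0.5 in Q15.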
See also
- IQ (number format)
- Fixed-point arithmetic