Converting Between Float and Fixed-Point with an ESP32


TL;WR: You can use FLOOR.S, CEIL.S, ROUND.S, FLOAT.S, and UFLOAT.S in a single inline-assembly instruction to convert between fixed-point and float values.
Example implementations at bottom of page.

There are alternate double-precision versions of these instructions in the Xtensa ISA, but they are not implemented in the ESP32's Tensilica cores.

You can find the full Xtensa instruction set documentation on the Cadence website.

If you’re writing code that uses floating-point numbers, you’re already using some of these instructions. The toolchain emits them whenever you convert between int/uint and float types.

The instructions take a third operand that specifies how many bits come after the binary point (fractional bits). With a value of 1 the least significant bit is worth 0.5, with 2 it is worth 0.25, and so on. In other words, with n fractional bits the integer is scaled by 2^-n.

The assembly normally emitted for something like float f = (int)53 sets this operand to 0 (no fractional bits), but by issuing the instruction manually you can choose the precision.

The fractional-bits operand is a constant in the range 0..15 that is encoded in the instruction, so it cannot be chosen at runtime. If you need several fixed-point formats, implement them as separate functions, templates, or overloads on custom types.
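One way to keep the fractional-bit count a compile-time constant is a template parameter. The following is a minimal sketch, not a definitive implementation: `fixed_to_float` is a hypothetical helper name, the `"f"`, `"r"`, and `"i"` inline-assembly constraints and the `__XTENSA_SOFT_FLOAT__` guard are assumptions about the GCC Xtensa toolchain that you should verify against your own setup, and the non-Xtensa branch is only a host-side stand-in that scales by 2^-n.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of ESP-IDF or Arduino): converts a signed
// fixed-point value with FracBits fractional bits to float.
template <unsigned FracBits>
inline float fixed_to_float(int32_t value) {
    static_assert(FracBits <= 15, "FLOAT.S only encodes 0..15 fractional bits");
#if defined(__XTENSA__) && !defined(__XTENSA_SOFT_FLOAT__)
    float result;
    // Assumed constraints: "f" = FPU register, "r" = address register,
    // "i" = immediate known at compile time. Check these for your compiler.
    __asm__("float.s %0, %1, %2"
            : "=f"(result)
            : "r"(value), "i"(FracBits));
    return result;
#else
    // Host fallback: same arithmetic result, value * 2^-FracBits.
    return std::ldexp(static_cast<float>(value), -static_cast<int>(FracBits));
#endif
}
```

With this shape, `fixed_to_float<1>(53)` yields 26.5f and `fixed_to_float<2>(53)` yields 13.25f, and asking for more than 15 fractional bits fails at compile time instead of producing an instruction the assembler rejects.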

For example, the instruction FLOAT.S f0, a2, 0 with 53 in register a2 would put 53.0f in the f0 register. However, the instruction FLOAT.S f0, a2, 1 would convert it to 26.5f.

| Fractional Bits | Example Instruction | Input Binary | Input Int | Result Float |
|---|---|---|---|---|
| 0 (normal float<>int) | FLOAT.S f*n*, a*n*, 0 | 0b00110100 | 52 | 52.0 |
| 0 (normal float<>int) | FLOAT.S f*n*, a*n*, 0 | 0b00110101 | 53 | 53.0 |
| 1 (lsb is ±0.5) | FLOAT.S f*n*, a*n*, 1 | 0b00110100 | 52 | 26.0 |
| 1 (lsb is ±0.5) | FLOAT.S f*n*, a*n*, 1 | 0b00110101 | 53 | 26.5 |
| 2 (lsb is ±0.25) | FLOAT.S f*n*, a*n*, 2 | 0b00110100 | 52 | 13.0 |
| 2 (lsb is ±0.25) | FLOAT.S f*n*, a*n*, 2 | 0b00110101 | 53 | 13.25 |
| 2 (lsb is ±0.25) | FLOAT.S f*n*, a*n*, 2 | 0b00110110 | 54 | 13.50 |
Fig 1. Example Values

The FLOOR.S, CEIL.S, and ROUND.S instructions convert float to fixed-point using the rounding method in their name. Since fixed-point can’t represent arbitrary real numbers, you are telling the processor how you’d like the number to be butchered. The FLOAT.S and UFLOAT.S instructions convert fixed-point to float using the default rounding/inf/NaN settings.

Fig 2. Simple Implementation
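
For the float-to-fixed direction, the three rounding instructions differ only in which way they round after scaling. The sketch below is a host-side model of that arithmetic rather than the instructions themselves: `Rounding` and `float_to_fixed_model` are hypothetical names, ties in the nearest mode go to even here (an assumption about ROUND.S, not something confirmed above), and NaN, infinities, and out-of-range results are deliberately not modeled.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical names, used only for this illustration.
enum class Rounding { Floor, Ceil, Nearest };

// Models the arithmetic of FLOOR.S / CEIL.S / ROUND.S: scale the float by
// 2^frac_bits, then round in the requested direction.
// Not modeled: NaN, infinities, and values that do not fit in int32_t.
inline int32_t float_to_fixed_model(float value, unsigned frac_bits, Rounding mode) {
    const float scaled = std::ldexp(value, static_cast<int>(frac_bits));
    switch (mode) {
        case Rounding::Floor:
            return static_cast<int32_t>(std::floor(scaled));
        case Rounding::Ceil:
            return static_cast<int32_t>(std::ceil(scaled));
        case Rounding::Nearest:
            // Ties go to even under the default rounding mode (assumption).
            return static_cast<int32_t>(std::nearbyint(scaled));
    }
    return 0; // not reached for valid enum values
}
```

For 13.3f with 2 fractional bits the scaled value is about 53.2, so the floor mode gives 53 (13.25), the ceiling mode gives 54 (13.5), and the nearest mode gives 53. On the ESP32 the same wrappers could presumably be written around the real instructions in the same way as the fixed-to-float case, with the integer register as the output operand.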