This installment describes a clamped add operation on 15-bit RGB pixels of the form 0RRRRRGG GGGBBBBB, where bit 15 is zero and is followed by five bits each of red, green, and blue. Below, X and Y refer to the two source pixels. The same algorithm can also be used on a pair of 15-bit RGB pixels packed into a 32-bit integer.

The clamped add operation adds the corresponding channels of both source pixels and limits each result to the channel maximum of 31, instead of letting it wrap around:

result = min( X + Y, 31 );
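For comparison, the per-channel definition above could be written the slow way, by unpacking each channel. This is only a baseline sketch to check the faster version against; the name `add_clamped_reference` is made up for illustration and is not from the original code.

```c
#include <assert.h>
#include <stdint.h>

/* Reference version: unpack each 5-bit channel, add, clamp to 31, repack.
   The function name is hypothetical. */
uint16_t add_clamped_reference(uint16_t x, uint16_t y)
{
    uint16_t result = 0;
    int shift;

    /* Both inputs are assumed to have bit 15 clear (0RRRRRGG GGGBBBBB). */
    assert(((x | y) & 0x8000) == 0);

    /* shift = 0 is blue, 5 is green, 10 is red */
    for (shift = 0; shift <= 10; shift += 5) {
        unsigned sum = ((x >> shift) & 31u) + ((y >> shift) & 31u);
        if (sum > 31u)
            sum = 31u;                      /* clamp instead of wrapping */
        result |= (uint16_t)(sum << shift);
    }
    return result;
}
```

With the X and Y used in the diagrams below (0x041F and 0x07E2), red is 1 + 1 = 2, green is 0 + 31 = 31, and blue is 31 + 2 clamped to 31.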

The first step is to add the components together, which would normally require at least one zero bit between components to hold each carry. As with the RGB mixing operation, it is best to do this without unpacking the components, and it turns out that similar logic applies here. If we XOR the two pixels and mask the result down to the low bit of each component, we end up with the sum of the low bits, ignoring any carry:

          RRRRR GGGGG BBBBB
        0 00001 00000 11111  X
    XOR 0 00001 11111 00010  Y
        -------------------
        0 00000 11111 11101
    AND 0 00001 00001 00001  mask
        -------------------
        0 00000 00001 00001  low bits
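A small sketch of this step in C, assuming 16-bit unsigned pixels. The helper names are made up for illustration. The second function simply checks, for every pair of 5-bit values, that the low bit of the XOR equals the low bit of the sum, which is the property the step relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Low bit of each component's sum, with no carry-in (hypothetical name). */
uint16_t rgb15_low_bits(uint16_t x, uint16_t y)
{
    assert(((x | y) & 0x8000) == 0);        /* bit 15 assumed clear */
    return (uint16_t)((x ^ y) & 0x0421);    /* keep bits 0, 5 and 10 */
}

/* Returns 1 if (a ^ b) and (a + b) agree in bit 0 for all 5-bit a, b. */
int xor_matches_sum_low_bit(void)
{
    unsigned a, b;

    for (a = 0; a < 32; a++)
        for (b = 0; b < 32; b++)
            if (((a ^ b) & 1u) != ((a + b) & 1u))
                return 0;
    return 1;
}
```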

If we subtract this from the sum of X and Y, each component contributes an even amount, so its low bit would be clear on its own. Whatever is left in each low bit (and in bit 15, above red) is therefore the carry out of the component to its right:

        0 00001 00000 11111  X
      + 0 00001 11111 00010  Y
        -------------------
        0 00011 00000 00001  X + Y
      - 0 00000 00001 00001  low bits
        -------------------
        0 00010 11111 00000  sum with low bits reflecting carries only
    AND 1 00001 00001 00000
        -------------------
        0 00000 00001 00000  carries
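The carry extraction can be sketched as a standalone C function, again assuming 16-bit unsigned pixels with bit 15 clear. The name `rgb15_carries` is hypothetical. A set bit 5 means blue overflowed, bit 10 means green overflowed, and bit 15 means red overflowed.

```c
#include <assert.h>
#include <stdint.h>

/* Per-component overflow flags for x + y (hypothetical name).
   Bit 5: blue overflowed, bit 10: green, bit 15: red. */
uint16_t rgb15_carries(uint16_t x, uint16_t y)
{
    uint16_t sum, low_bits;

    assert(((x | y) & 0x8000) == 0);        /* bit 15 assumed clear */

    sum      = (uint16_t)(x + y);
    low_bits = (uint16_t)((x ^ y) & 0x0421);

    /* After the subtraction each component's own contribution is even,
       so the masked bits can only come from the component to the right. */
    return (uint16_t)((sum - low_bits) & 0x8420);
}
```

Note that in the worked example green ends up at 32 in the plain sum only because of the carry from blue. Green itself (0 + 31) did not overflow, and the result correctly reports only the blue carry.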

If we subtract these carries from the sum of X and Y, we get a modulo add operation, avoiding interaction between components:

        0 00011 00000 00001  X + Y
      - 0 00000 00001 00000  carries
        -------------------
        0 00010 11111 00001  modulo sum: result = (X + Y) % 32
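Stopping at this point already gives a useful wrapping add, where each channel is computed modulo 32 independently. A minimal sketch, with the hypothetical name `rgb15_add_modulo`:

```c
#include <assert.h>
#include <stdint.h>

/* Per-channel (x + y) % 32 with no carry leaking between channels
   (hypothetical name). */
uint16_t rgb15_add_modulo(uint16_t x, uint16_t y)
{
    uint16_t sum, low_bits, carries;

    assert(((x | y) & 0x8000) == 0);        /* bit 15 assumed clear */

    sum      = (uint16_t)(x + y);
    low_bits = (uint16_t)((x ^ y) & 0x0421);
    carries  = (uint16_t)((sum - low_bits) & 0x8420);

    /* Removing each carry undoes its effect on the neighbouring channel. */
    return (uint16_t)(sum - carries);
}
```

For example, adding 0x7FFF to itself gives 31 + 31 = 62, which is 30 modulo 32, in every channel: 0x7BDE.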

The only step left is to clamp components that generated a carry and wrapped around. By subtracting the carries shifted right by 5 bits from the carries themselves, each carry bit expands into five set bits covering the component that overflowed, which we then OR with the modulo sum:

        0 00000 00001 00000  carries
      - 0 00000 00000 00001  carries shifted right by 5 bits
        -------------------
        0 00000 00000 11111  clamp bits
     OR 0 00010 11111 00001  modulo sum
        -------------------
        0 00010 11111 11111  clamped result
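The mask expansion works because a carry bit at position p minus the same bit at position p - 5 equals 2^p - 2^(p-5) = 31 * 2^(p-5), which is exactly five set bits just below the carry. A sketch of that step in isolation, with the hypothetical name `rgb15_clamp_mask`:

```c
#include <assert.h>
#include <stdint.h>

/* Expand carry flags (bits 5, 10, 15) into 5-bit channel masks
   (hypothetical name). */
uint16_t rgb15_clamp_mask(uint16_t carries)
{
    /* Only the three carry positions may be set. */
    assert((carries & 0x7BDF) == 0);

    /* Each carry bit turns into the five bits directly below it. */
    return (uint16_t)(carries - (carries >> 5));
}
```

The borrows never collide, since each shifted bit lands strictly between two carry positions.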

Here is the final code:

    sum      = x + y;
    low_bits = (x ^ y) & 0x0421;
    carries  = (sum - low_bits) & 0x8420;
    modulo   = sum - carries;
    clamp    = carries - (carries >> 5);
    result   = modulo | clamp;
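Wrapped up as a compilable C function, the same six statements might look like the sketch below. The function name and the use of `uint16_t` are assumptions; the original code does not state its types. Unsigned types matter here, because the right shift must not sign-extend and bit 15 is used to hold the red carry.

```c
#include <assert.h>
#include <stdint.h>

/* Clamped add of two 15-bit RGB pixels (hypothetical name and types). */
uint16_t rgb15_add_clamped(uint16_t x, uint16_t y)
{
    uint16_t sum, low_bits, carries, modulo, clamp;

    assert(((x | y) & 0x8000) == 0);        /* bit 15 assumed clear */

    sum      = (uint16_t)(x + y);                      /* raw add, carries leak   */
    low_bits = (uint16_t)((x ^ y) & 0x0421);           /* carry-less low bits     */
    carries  = (uint16_t)((sum - low_bits) & 0x8420);  /* per-channel overflow    */
    modulo   = (uint16_t)(sum - carries);              /* per-channel sum mod 32  */
    clamp    = (uint16_t)(carries - (carries >> 5));   /* 11111 where overflowed  */
    return (uint16_t)(modulo | clamp);
}
```

Calling `rgb15_add_clamped(0x041F, 0x07E2)` reproduces the worked example and yields 0x0BFF, that is red 2, green 31, blue 31.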

It uses only 9 operations without any branching, and can do two 15-bit pixels at once with 32-bit arithmetic by multiplying the masks by 0x10001 (giving 0x04210421 and 0x84208420). The final installment will cover a subtract clamped operation, which for some reason is more difficult for me to grasp.
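A sketch of the two-pixel variant, assuming the pixels sit in the low and high 16-bit halves of a `uint32_t` and that bit 15 of each half is clear. The function name is hypothetical. Because each half's sum is at most 0xFFFE, the low pixel's red carry stays in bit 15 and never reaches the high pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Clamped add of two packed pairs of 15-bit RGB pixels
   (hypothetical name; masks are the 16-bit ones times 0x10001). */
uint32_t rgb15x2_add_clamped(uint32_t x, uint32_t y)
{
    uint32_t sum, low_bits, carries, modulo, clamp;

    /* Bits 15 and 31 are assumed clear in both inputs. */
    assert(((x | y) & 0x80008000u) == 0);

    sum      = x + y;
    low_bits = (x ^ y) & 0x04210421u;           /* 0x0421 * 0x10001 */
    carries  = (sum - low_bits) & 0x84208420u;  /* 0x8420 * 0x10001 */
    modulo   = sum - carries;
    clamp    = carries - (carries >> 5);
    return modulo | clamp;
}
```

For instance, `rgb15x2_add_clamped(0x041F041F, 0x07E207E2)` gives 0x0BFF0BFF, the worked example applied to both halves at once.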
