- SSE2/SSSE3 RGBA to BGRA conversion (10x faster)
- Processes 4 pixels per iteration
- Automatic fallback for non-x86 platforms
- Applied to both STB and decoded image paths
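
The conversion described above could look roughly like the following. This is a minimal sketch, not the code from this change: the function name `rgba_to_bgra`, the `PIXCONV_HAVE_SSE2` macro, and the in-place, tightly packed 4-bytes-per-pixel layout are all assumptions made for illustration. It uses only SSE2 mask-and-shift operations; an SSSE3 variant would typically do the same byte swap with a single `_mm_shuffle_epi8`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: detect SSE2 at compile time; other targets use the scalar loop. */
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h> /* SSE2 intrinsics */
#define PIXCONV_HAVE_SSE2 1
#else
#define PIXCONV_HAVE_SSE2 0
#endif

/* Hypothetical helper: swaps the R and B bytes in place for `count`
 * tightly packed 4-byte pixels (RGBA in, BGRA out). */
static void rgba_to_bgra(uint8_t *pixels, size_t count)
{
    size_t i = 0;
    assert(pixels != NULL || count == 0);

#if PIXCONV_HAVE_SSE2
    /* As a little-endian 32-bit lane, an RGBA pixel reads 0xAABBGGRR. */
    const __m128i keep_ga = _mm_set1_epi32((int)0xFF00FF00u); /* A and G stay put */
    const __m128i keep_rb = _mm_set1_epi32(0x00FF00FF);       /* R and B get swapped */

    /* Main loop: 4 pixels (16 bytes) per iteration, unaligned loads/stores. */
    for (; i + 4 <= count; i += 4) {
        __m128i *p = (__m128i *)(pixels + i * 4);
        __m128i v = _mm_loadu_si128(p);
        __m128i ga = _mm_and_si128(v, keep_ga);
        __m128i rb = _mm_and_si128(v, keep_rb);
        /* Shifting each lane by 16 bits in both directions exchanges R and B. */
        __m128i swapped = _mm_or_si128(_mm_slli_epi32(rb, 16),
                                       _mm_srli_epi32(rb, 16));
        _mm_storeu_si128(p, _mm_or_si128(ga, swapped));
    }
#endif

    /* Scalar path: handles leftover pixels, or every pixel on non-x86 builds. */
    for (; i < count; ++i) {
        uint8_t *px = pixels + i * 4;
        uint8_t r = px[0];
        px[0] = px[2];
        px[2] = r;
    }
}
```

A caller would pass the pixel buffer and the pixel count (for example `width * height`), not the byte count. Keeping the scalar loop as the tail of the same function means the fallback needs no separate code path: when SSE2 is unavailable the vector loop is compiled out and the scalar loop simply starts at pixel 0.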