The CMSIS API for GPIO toggling compiles into a function call (regardless of optimization or “inline” attributes), which results in a 22-cycle pin-wiggle duration.
However, by replacing the CMSIS call:
with a normal GPIO register manipulation:
GPIOA->ODR ^= GPIO_PIN_0;
this loop reduces to 9 cycles, as it is compiled into:
ldr r3, [r2, #20]          // 3 cycles on STM32F0
eors r3, r1                // 1 cycle on an M0+
str r3, [r2, #20]          // 3 cycles on STM32F0
b.n 0x8000c04 <main+72>    // 2 cycles on an M0+
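As a rough illustration of what that listing does, here is a minimal host-side sketch of the same read-modify-write toggle. The `mock_gpio_t` struct, `MOCK_PIN_0`, `toggle_pin`, and `toggle_n` are hypothetical stand-ins so the code builds off-target; on real hardware `GPIOA` and `GPIO_PIN_0` come from the vendor headers. The cycle constant simply sums the per-instruction annotations from the listing and is not measured here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the GPIO register block, so this builds on a host.
   Only the output data register is modelled. */
typedef struct {
    volatile uint32_t ODR;
} mock_gpio_t;

/* Hypothetical stand-in for GPIO_PIN_0 (bit 0 of the port). */
#define MOCK_PIN_0 ((uint32_t)0x0001u)

/* Per-iteration cycle budget taken from the listing's annotations:
   ldr (3) + eors (1) + str (3) + b.n (2). */
enum { CYCLES_PER_TOGGLE = 3 + 1 + 3 + 2 };

/* One toggle: load ODR, XOR in the pin mask, store the result back.
   This mirrors the ldr / eors / str sequence. */
static inline void toggle_pin(mock_gpio_t *port, uint32_t mask)
{
    port->ODR ^= mask;
}

/* Toggle the pin n times and return the final ODR value. */
uint32_t toggle_n(mock_gpio_t *port, uint32_t mask, unsigned n)
{
    for (unsigned i = 0; i < n; ++i) {
        toggle_pin(port, mask);
    }
    return port->ODR;
}
```

Because the toggle is a read-modify-write of the whole register, an odd number of calls leaves the pin inverted and an even number restores it. On the target, each pass through the loop costs the 9 cycles tallied in the listing.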
This is slower than