Forget the AI hype - FFT is the real unsung hero of computing...
The Fast Fourier Transform (FFT) is everywhere: multiplying large numbers, audio and video compression, high-frequency trading, weather prediction - you name it. It’s also the foundation of other key transforms: DCT for image compression, MDCT for audio compression, MFCC for machine learning, and more.
FFT is the most underrated algorithm of the 20th and 21st century — change my mind.
The first time I saw the Fourier Matrix and finally understood the Cooley-Tukey FFT, I was hooked. There’s something beautiful and elegant about its tree-like structure. Someday, I will probably write about what happens when you unravel FFT's recursion, and how it is related to the `rbit` instruction on ARM CPU. And sometimes, I just sit at my computer, and code away to make FFT run faster. It's relaxing...
Here’s one of my little achievement: A 4-point complex-to-complex FFT in just **11** AVX2 instructions. By itself, a 4-point FFT isn’t much, but as a kernel, it helps build higher-order FFTs with blazing efficiency.
Full demo implementation is on GitHub, which computes 256 point FFT under 1 micro-second on 12th gen Intel Processors.
https://gist.github.com/ashafq/eef8ef391fb58be85b325c259ce591e3