Tue, 06 Jan 2015 21:39:09 +0100
Conditionally force memory storage according to privacy.thirdparty.isolate.
This solves Tor bug #9701, complying with disk avoidance documented in
https://www.torproject.org/projects/torbrowser/design/#disk-avoidance.
michael@0 | 1 | KISS FFT - A mixed-radix Fast Fourier Transform based upon the principle, |
michael@0 | 2 | "Keep It Simple, Stupid." |
michael@0 | 3 | |
michael@0 | 4 | There are many great FFT libraries already around. Kiss FFT is not trying |
michael@0 | 5 | to be better than any of them. It only attempts to be a reasonably efficient, |
michael@0 | 6 | moderately useful FFT that can use fixed or floating data types and can be |
michael@0 | 7 | incorporated into someone's C program in a few minutes with trivial licensing. |
michael@0 | 8 | |
michael@0 | 9 | USAGE: |
michael@0 | 10 | |
michael@0 | 11 | The basic usage for 1-d complex FFT is: |
michael@0 | 12 | |
michael@0 | 13 | #include "kiss_fft.h" |
michael@0 | 14 | |
michael@0 | 15 | kiss_fft_cfg cfg = kiss_fft_alloc( nfft ,is_inverse_fft ,0,0 ); |
michael@0 | 16 | |
michael@0 | 17 | while ... |
michael@0 | 18 | |
michael@0 | 19 | ... // put kth sample in cx_in[k].r and cx_in[k].i |
michael@0 | 20 | |
michael@0 | 21 | kiss_fft( cfg , cx_in , cx_out ); |
michael@0 | 22 | |
michael@0 | 23 | ... // transformed. DC is in cx_out[0].r and cx_out[0].i |
michael@0 | 24 | |
michael@0 | 25 | free(cfg); |
michael@0 | 26 | |
michael@0 | 27 | Note: frequency-domain data is stored from DC up to 2*pi (the sample rate), |
michael@0 | 28 | so cx_out[0] is the DC bin of the FFT |
michael@0 | 29 | and cx_out[nfft/2] is the Nyquist bin (it exists only when nfft is even). |
michael@0 | 30 | |
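The bin layout in the note above can be sketched as a small index-to-frequency
mapping. This is a minimal sketch, not part of kiss_fft: the helper name
bin_frequency_hz and its sample_rate_hz parameter are assumptions made for
illustration. Bins above nfft/2 are reported as negative frequencies, which is
the usual reading of the "up to 2pi" half of the spectrum.

```c
#include <assert.h>

/* Hypothetical helper (not part of kiss_fft): frequency in Hz represented by
 * output bin k of an nfft-point complex FFT at the given sample rate. */
double bin_frequency_hz(int k, int nfft, double sample_rate_hz)
{
    assert(nfft > 0 && k >= 0 && k < nfft);
    /* Bins 0..nfft/2 run from DC up to Nyquist; the remaining bins wrap
     * around and stand for negative frequencies. */
    int signed_k = (k <= nfft / 2) ? k : k - nfft;
    return signed_k * sample_rate_hz / nfft;
}
```

For nfft=1024 at 44100 Hz, bin 0 maps to 0 Hz and bin 512 to 22050 Hz.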
michael@0 | 31 | Declarations are in "kiss_fft.h", along with a brief description of the |
michael@0 | 32 | functions you'll need to use. |
michael@0 | 33 | |
michael@0 | 34 | Code definitions for 1d complex FFTs are in kiss_fft.c. |
michael@0 | 35 | |
michael@0 | 36 | You can do other cool stuff with the extras you'll find in tools/ |
michael@0 | 37 | |
michael@0 | 38 | * multi-dimensional FFTs |
michael@0 | 39 | * real-optimized FFTs (returns the positive half-spectrum: (nfft/2+1) complex frequency bins) |
michael@0 | 40 | * fast convolution FIR filtering (not available for fixed point) |
michael@0 | 41 | * spectrum image creation |
michael@0 | 42 | |
michael@0 | 43 | The core fft and most tools/ code can be compiled to use float, double, |
michael@0 | 44 | Q15 short or Q31 samples. The default is float. |
michael@0 | 45 | |
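When feeding a Q15 build, floating point samples need to be converted first.
The following is a minimal sketch, assuming the usual Q15 interpretation of a
16-bit short (one sign bit, 15 fractional bits, range [-1, 1)). The helper
names float_to_q15 and q15_to_float are hypothetical and are not provided by
kiss_fft.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a float in [-1, 1) to Q15, saturating
 * anything outside that range instead of wrapping around. */
int16_t float_to_q15(float x)
{
    float scaled = x * 32768.0f;
    if (scaled >= 32767.0f)
        return 32767;
    if (scaled <= -32768.0f)
        return -32768;
    /* Round to nearest; the cast truncates toward zero. */
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}

/* Hypothetical helper: convert a Q15 sample back to float. */
float q15_to_float(int16_t q)
{
    return (float)q / 32768.0f;
}
```

Saturating rather than wrapping keeps a slightly too loud input from turning
into a full-scale sign flip.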
michael@0 | 46 | |
michael@0 | 47 | BACKGROUND: |
michael@0 | 48 | |
michael@0 | 49 | I started coding this because I couldn't find a fixed point FFT that didn't |
michael@0 | 50 | use assembly code. I started with floating point numbers so I could get the |
michael@0 | 51 | theory straight before working on fixed point issues. In the end, I had a |
michael@0 | 52 | little bit of code that could be recompiled easily to do ffts with short, float |
michael@0 | 53 | or double (other types should be easy too). |
michael@0 | 54 | |
michael@0 | 55 | Once I got my FFT working, I was curious about the speed compared to |
michael@0 | 56 | a well-respected and highly optimized FFT library. I don't want to criticize |
michael@0 | 57 | this great library, so let's call it FFT_BRANDX. |
michael@0 | 58 | During this process, I learned: |
michael@0 | 59 | |
michael@0 | 60 | 1. FFT_BRANDX has more than 100K lines of code. The core of kiss_fft is about 500 lines (cpx 1-d). |
michael@0 | 61 | 2. It took me an embarrassingly long time to get FFT_BRANDX working. |
michael@0 | 62 | 3. A simple program using FFT_BRANDX is 522KB. A similar program using kiss_fft is 18KB (without optimizing for size). |
michael@0 | 63 | 4. FFT_BRANDX is roughly twice as fast as KISS FFT in default mode. |
michael@0 | 64 | |
michael@0 | 65 | It is wonderful that free, highly optimized libraries like FFT_BRANDX exist. |
michael@0 | 66 | But such libraries carry a huge burden of complexity necessary to extract every |
michael@0 | 67 | last bit of performance. |
michael@0 | 68 | |
michael@0 | 69 | Sometimes simpler is better, even if it's not better. |
michael@0 | 70 | |
michael@0 | 71 | FREQUENTLY ASKED QUESTIONS: |
michael@0 | 72 | Q: Can I use kissfft in a project with a ___ license? |
michael@0 | 73 | A: Yes. See LICENSE below. |
michael@0 | 74 | |
michael@0 | 75 | Q: Why don't I get the output I expect? |
michael@0 | 76 | A: The two most common causes of this are |
michael@0 | 77 | 1) scaling : is there a constant multiplier between what you got and what you want? |
michael@0 | 78 | 2) mixed build environment -- all code must be compiled with same preprocessor |
michael@0 | 79 | definitions for FIXED_POINT and kiss_fft_scalar |
michael@0 | 80 | |
michael@0 | 81 | Q: Will you write/debug my code for me? |
michael@0 | 82 | A: Probably not unless you pay me. I am happy to answer pointed and topical questions, but |
michael@0 | 83 | I may refer you to a book, a forum, or some other resource. |
michael@0 | 84 | |
michael@0 | 85 | |
michael@0 | 86 | PERFORMANCE: |
michael@0 | 87 | (on Athlon XP 2100+, with gcc 2.96, float data type) |
michael@0 | 88 | |
michael@0 | 89 | Kiss performed 10000 1024-pt cpx ffts in 0.63 s of cpu time. |
michael@0 | 90 | For comparison, it took md5sum twice as long to process the same amount of data. |
michael@0 | 91 | |
michael@0 | 92 | Transforming 5 minutes of CD quality audio takes less than a second (nfft=1024). |
michael@0 | 93 | |
michael@0 | 94 | DO NOT: |
michael@0 | 95 | ... use Kiss if you need the Fastest Fourier Transform in the World |
michael@0 | 96 | ... ask me to add features that will bloat the code |
michael@0 | 97 | |
michael@0 | 98 | UNDER THE HOOD: |
michael@0 | 99 | |
michael@0 | 100 | Kiss FFT uses a decimation-in-time, mixed-radix, out-of-place FFT. If you give it an input buffer |
michael@0 | 101 | and output buffer that are the same, a temporary buffer will be created to hold the data. |
michael@0 | 102 | |
michael@0 | 103 | No static data is used. The core routines of kiss_fft are thread-safe (but not all of the tools directory). |
michael@0 | 104 | |
michael@0 | 105 | No scaling is done for the floating point version (for speed). |
michael@0 | 106 | Scaling is done both ways for the fixed-point version (for overflow prevention). |
michael@0 | 107 | |
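Because the floating point version does no scaling, a forward transform
followed by an inverse transform is expected to return the input multiplied
by nfft (assuming the usual unnormalized DFT convention). The sketch below
undoes that gain. The cpx struct is a stand-in with the same .r and .i members
used in the USAGE section, and scale_by_nfft is a hypothetical helper, not a
kiss_fft function.

```c
#include <assert.h>

/* Stand-in for the library's complex type; only .r and .i are assumed. */
typedef struct { float r; float i; } cpx;

/* Hypothetical helper: divide every bin by nfft, e.g. after a
 * forward + inverse round trip in the floating point build. */
void scale_by_nfft(cpx *buf, int nfft)
{
    assert(buf != 0 && nfft > 0);
    const float gain = 1.0f / (float)nfft;
    for (int k = 0; k < nfft; ++k) {
        buf[k].r *= gain;
        buf[k].i *= gain;
    }
}
```

If the output differs from what you expect by a constant factor, this is the
first thing to check.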
michael@0 | 108 | Optimized butterflies are used for factors 2,3,4, and 5. |
michael@0 | 109 | |
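Since only factors 2, 3, 4 and 5 have optimized butterflies, a transform
length built purely from those factors should stay on the fast paths. A
minimal sketch of such a check follows. The helper has_only_fast_factors is
hypothetical, and the assumption that other factors are handled by a slower
generic path is not stated above.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if nfft has no prime factors other than
 * 2, 3 and 5 (radix 4 is covered by two factors of 2), else 0. */
int has_only_fast_factors(int nfft)
{
    assert(nfft > 0);
    const int radices[3] = {2, 3, 5};
    for (int j = 0; j < 3; ++j) {
        /* Strip every occurrence of this radix. */
        while (nfft % radices[j] == 0)
            nfft /= radices[j];
    }
    return nfft == 1;
}
```

Lengths such as 1024, 1000 and 360 pass this check, while 14 (2*7) does not.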
michael@0 | 110 | The real (i.e. not complex) optimization code only works for even length ffts. It does two half-length |
michael@0 | 111 | FFTs in parallel (packed into real&imag), and then combines them via twiddling. The result is |
michael@0 | 112 | nfft/2+1 complex frequency bins from DC to Nyquist. If you don't know what this means, search the web. |
michael@0 | 113 | |
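The nfft/2+1 bins are enough because the spectrum of a real signal is
conjugate-symmetric: bin nfft-k is the complex conjugate of bin k. The sketch
below rebuilds the full nfft-bin spectrum from the half spectrum. The cpx
struct and expand_half_spectrum are illustrative stand-ins and not part of
kiss_fft.

```c
#include <assert.h>

/* Stand-in for the library's complex type; only .r and .i are assumed. */
typedef struct { float r; float i; } cpx;

/* Hypothetical helper: expand nfft/2+1 bins (DC..Nyquist) of a real signal's
 * spectrum into all nfft bins. nfft must be even. */
void expand_half_spectrum(const cpx *half, int nfft, cpx *full)
{
    assert(half != 0 && full != 0 && nfft > 0 && nfft % 2 == 0);
    int k;
    /* Bins DC..Nyquist are copied as-is. */
    for (k = 0; k <= nfft / 2; ++k)
        full[k] = half[k];
    /* Remaining bins mirror the lower half with the imaginary part negated. */
    for (k = nfft / 2 + 1; k < nfft; ++k) {
        full[k].r =  half[nfft - k].r;
        full[k].i = -half[nfft - k].i;
    }
}
```

For the real input {1, 2, 3, 4} the half spectrum is 10, -2+2i, -2, and the
expansion adds the missing bin -2-2i.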
michael@0 | 114 | The fast convolution filtering uses the overlap-scrap method (also known as overlap-save), slightly |
michael@0 | 115 | modified to put the scrap at the tail. |
michael@0 | 116 | |
michael@0 | 117 | LICENSE: |
michael@0 | 118 | Revised BSD License, see COPYING for verbiage. |
michael@0 | 119 | Basically, "free to use & change, give credit where due, no guarantees" |
michael@0 | 120 | Note this license is compatible with GPL at one end of the spectrum and closed, commercial software at |
michael@0 | 121 | the other end. See http://www.fsf.org/licensing/licenses |
michael@0 | 122 | |
michael@0 | 123 | A commercial license is available which removes the requirement for attribution. Contact me for details. |
michael@0 | 124 | |
michael@0 | 125 | |
michael@0 | 126 | TODO: |
michael@0 | 127 | *) Add real optimization for odd length FFTs |
michael@0 | 128 | *) Document/revisit the input/output fft scaling |
michael@0 | 129 | *) Make doc describing the overlap (tail) scrap fast convolution filtering in kiss_fastfir.c |
michael@0 | 130 | *) Test all the ./tools/ code with fixed point (kiss_fastfir.c doesn't work, maybe others) |
michael@0 | 131 | |
michael@0 | 132 | AUTHOR: |
michael@0 | 133 | Mark Borgerding |
michael@0 | 134 | Mark@Borgerding.net |