|
| 1 | +# heatshrink |
| 2 | + |
| 3 | +A data compression/decompression library for embedded/real-time systems. |
| 4 | + |
| 5 | + |
| 6 | +## Key Features: |
| 7 | + |
| 8 | +- **Low memory usage (as low as 50 bytes)** |
| 9 | + It is useful for some cases with less than 50 bytes, and useful |
| 10 | + for many general cases with < 300 bytes. |
| 11 | +- **Incremental, bounded CPU use** |
| 12 | + You can chew on input data in arbitrarily tiny bites. |
| 13 | + This is a useful property in hard real-time environments. |
| 14 | +- **Can use either static or dynamic memory allocation** |
| 15 | + The library doesn't impose any constraints on memory management. |
| 16 | +- **ISC license** |
| 17 | + You can use it freely, even for commercial purposes. |
| 18 | + |
| 19 | + |
| 20 | +## Getting Started: |
| 21 | + |
| 22 | +There is a standalone command-line program, `heatshrink`, but the |
| 23 | +encoder and decoder can also be used as libraries, independent of each |
| 24 | +other. To do so, copy `heatshrink_common.h`, `heatshrink_config.h`, and |
| 25 | +either `heatshrink_encoder.c` or `heatshrink_decoder.c` (and their |
| 26 | +respective header) into your project. For projects that use both, |
| 27 | +the build also produces static libraries, one for static and one for dynamic allocation. |
| 28 | + |
| 29 | +Dynamic allocation is used by default, but in an embedded context, you |
| 30 | +probably want to statically allocate the encoder/decoder. Set |
| 31 | +`HEATSHRINK_DYNAMIC_ALLOC` to 0 in `heatshrink_config.h`. |
| 32 | + |
| 33 | + |
| 34 | +### Basic Usage |
| 35 | + |
| 36 | +1. Allocate a `heatshrink_encoder` or `heatshrink_decoder` state machine |
| 37 | +using its `alloc` function, or statically allocate one and call its |
| 38 | +`reset` function to initialize it. (See below for configuration |
| 39 | +options.) |
| 40 | + |
| 41 | +2. Use `sink` to sink an input buffer into the state machine. The |
| 42 | +`input_size` pointer argument will be set to indicate how many bytes of |
| 43 | +the input buffer were actually consumed. (If 0 bytes were consumed, the |
| 44 | +buffer is full.) |
| 45 | + |
| 46 | +3. Use `poll` to move output from the state machine into an output |
| 47 | +buffer. The `output_size` pointer argument will be set to indicate how |
| 48 | +many bytes were output, and the function return value will indicate |
| 49 | +whether further output is available. (The state machine may not output |
| 50 | +any data until it has received enough input.) |
| 51 | + |
| 52 | +Repeat steps 2 and 3 to stream data through the state machine. Since |
| 53 | +it's doing data compression, the input and output sizes can vary |
| 54 | +significantly. Looping will be necessary to buffer the input and output |
| 55 | +as the data is processed. |
| 56 | + |
| 57 | +4. When the end of the input stream is reached, call `finish` to notify |
| 58 | +the state machine that no more input is available. The return value from |
| 59 | +`finish` will indicate whether any output remains. If so, call `poll` to |
| 60 | +get more. |
| 61 | + |
| 62 | +Continue calling `finish` and `poll`ing to flush remaining output until |
| 63 | +`finish` indicates that the output has been exhausted. |
| 64 | + |
| 65 | +Sinking more data after `finish` has been called will not work without |
| 66 | +calling `reset` on the state machine. |
| 67 | + |
| 68 | + |
| 69 | +## Configuration |
| 70 | + |
| 71 | +heatshrink has a few configuration options, which impact its resource |
| 72 | +usage and how effectively it can compress data. These are set when |
| 73 | +dynamically allocating an encoder or decoder, or in `heatshrink_config.h` |
| 74 | +if they are statically allocated. |
| 75 | + |
| 76 | +- `window_sz2`, `-w` in the CLI: Set the window size to 2^W bytes. |
| 77 | + |
| 78 | +The window size determines how far back in the input can be searched for |
| 79 | +repeated patterns. A `window_sz2` of 8 will only use 256 bytes (2^8), |
| 80 | +while a `window_sz2` of 10 will use 1024 bytes (2^10). The latter uses |
| 81 | +more memory, but may also compress more effectively by detecting more |
| 82 | +repetition. |
| 83 | + |
| 84 | +The `window_sz2` setting currently must be between 4 and 15. |
| 85 | + |
| 86 | +- `lookahead_sz2`, `-l` in the CLI: Set the lookahead size to 2^L bytes. |
| 87 | + |
| 88 | +The lookahead size determines the max length for repeated patterns that |
| 89 | +are found. If the `lookahead_sz2` is 4, a 50-byte run of 'a' characters |
| 90 | +will be represented as several repeated 16-byte patterns (2^4 is 16), |
| 91 | +whereas a larger `lookahead_sz2` may be able to represent it all at |
| 92 | +once. The number of bits used for the lookahead size is fixed, so an |
| 93 | +overly large lookahead size can reduce compression by adding unused |
| 94 | +size bits to small patterns. |
| 95 | + |
| 96 | +The `lookahead_sz2` setting currently must be between 3 and the |
| 97 | +`window_sz2` - 1. |
| 98 | + |
| 99 | +- `input_buffer_size` - How large an input buffer to use for the |
| 100 | +decoder. This impacts how much work the decoder can do in a single |
| 101 | +step, and a larger buffer will use more memory. An extremely small |
| 102 | +buffer (say, 1 byte) will add overhead due to lots of suspend/resume |
| 103 | +function calls, but should not change how well data compresses. |
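
The size rules above can be sketched in a few lines of C. This is only an illustration: the helper names `hs_window_bytes`, `hs_lookahead_bytes`, and `hs_config_valid` are hypothetical and not part of the heatshrink API, and the limits are the ones stated in this section.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of heatshrink. */

/* Window size in bytes: 2^window_sz2. */
static size_t hs_window_bytes(uint8_t window_sz2) {
    return (size_t)1 << window_sz2;
}

/* Lookahead size in bytes: 2^lookahead_sz2. */
static size_t hs_lookahead_bytes(uint8_t lookahead_sz2) {
    return (size_t)1 << lookahead_sz2;
}

/* Check the ranges described above:
 * 4 <= window_sz2 <= 15 and 3 <= lookahead_sz2 <= window_sz2 - 1. */
static bool hs_config_valid(uint8_t window_sz2, uint8_t lookahead_sz2) {
    if (window_sz2 < 4 || window_sz2 > 15) { return false; }
    if (lookahead_sz2 < 3 || lookahead_sz2 > window_sz2 - 1) { return false; }
    return true;
}
```

For example, `hs_window_bytes(8)` gives 256 and `hs_window_bytes(10)` gives 1024, matching the figures above, while `hs_config_valid(8, 8)` is rejected because the lookahead must be smaller than the window.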
| 104 | + |
| 105 | + |
| 106 | +### Recommended Defaults |
| 107 | + |
| 108 | +For embedded/low memory contexts, a `window_sz2` in the 8 to 10 range is |
| 109 | +probably a good default, depending on how tight memory is. Smaller or |
| 110 | +larger window sizes may make better trade-offs in specific |
| 111 | +circumstances, but should be checked with representative data. |
| 112 | + |
| 113 | +The `lookahead_sz2` should probably start near the `window_sz2`/2, e.g. |
| 114 | +`-w 8 -l 4` or `-w 10 -l 5`. The command-line program can be used to measure |
| 115 | +how well test data works with different settings. |
| 116 | + |
| 117 | + |
| 118 | +## More Information and Benchmarks: |
| 119 | + |
| 120 | +heatshrink is based on [LZSS], since it's particularly suitable for |
| 121 | +compression in small amounts of memory. It can use an optional, small |
| 122 | +[index] to make compression significantly faster, but otherwise can run |
| 123 | +in under 100 bytes of memory. The index currently adds 2^(window size+1) |
| 124 | +bytes to memory usage for compression, and temporarily allocates 512 |
| 125 | +bytes on the stack during index construction (if the index is enabled). |
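
As a rough sketch of the index cost described above, assuming "window size" in 2^(window size+1) refers to the `window_sz2` exponent, the extra memory could be estimated like this. The helper name `hs_index_bytes` is hypothetical and not part of heatshrink.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not part of heatshrink.
 * Extra bytes used by the optional index: 2^(window_sz2 + 1),
 * assuming "window size" in the text means the window_sz2 exponent. */
static size_t hs_index_bytes(uint8_t window_sz2) {
    return (size_t)1 << (window_sz2 + 1);
}
```

Under that assumption, a `window_sz2` of 8 would add 512 bytes for the index and a `window_sz2` of 10 would add 2048 bytes, on top of the window itself.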
| 126 | + |
| 127 | +For more information, see the [blog post] for an overview, and the |
| 128 | +`heatshrink_encoder.h` / `heatshrink_decoder.h` header files for API |
| 129 | +documentation. |
| 130 | + |
| 131 | +[blog post]: http://spin.atomicobject.com/2013/03/14/heatshrink-embedded-data-compression/ |
| 132 | +[index]: http://spin.atomicobject.com/2014/01/13/lightweight-indexing-for-embedded-systems/ |
| 133 | +[LZSS]: http://en.wikipedia.org/wiki/Lempel-Ziv-Storer-Szymanski |
| 134 | + |
| 135 | + |
| 136 | +## Build Status |
| 137 | + |
| 138 | + [](http://travis-ci.org/atomicobject/heatshrink) |