Latest CODEC Source GPL v2+

The Latest compression CODEC source. Issued GPL v2 or greater. The context can be extended beyond 4 bits if you have enough memory and data to 8 bits easily, and a sub context can be made by nesting another BWT within each context block, for a massive 16 bit context, and a spectacular 28 bit dictionary of 268,435,456 entries. The skip code on the count table assists in data reduction, to make excellent use of such a large dictionary possible.

The minor 4 bits per symbol implicit context, has maximum utility on small dictionary entries, but the extra 16 times the number of entries allows larger entries in the same coding space. With a full 16 bit context enabled, the coding would allow over 50% dictionary symbol compression, and a much larger set of dictionary entries. The skip coding on large data sets is expected to be less than a 3% loss. With only a 4 bit context, a 25% symbol gain is expected.

On English text at about 2.1 bits per letter, almost 2 extra letters per symbol is an expected coding. So with a 12 bit index, a 25% gain is expected, plus a little for using BWT context, but a minor loss likely writes this off. The estimate then is close to optimal.

Further investigation into an auto built dictionary based on letter group statistics, and generation of entry to value mapping algorithmicaly may be an effective method of reducing the space requirements of the dictionary.