Saturday, May 16, 2009

Data compression in .NET

In .NET, developers often have to choose between two options for compression and decompression: GZipStream and DeflateStream. Interestingly, GZipStream uses the same 'deflate' algorithm as DeflateStream. The difference is that the gzip format wraps the compressed data in a header and a trailer: the trailer carries a CRC-32 checksum for detecting corruption, and the header can store metadata such as the original file name and a timestamp.
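To make the relationship between the two classes concrete, here is a minimal sketch that compresses the same buffer with both and compares the results. The class and method names (CompressionDemo, DeflateCompress, GZipCompress, GZipDecompress) are made up for illustration; only DeflateStream, GZipStream and MemoryStream come from the framework.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class CompressionDemo
{
    // Raw deflate data: no header, no checksum.
    public static byte[] DeflateCompress(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
            {
                deflate.Write(input, 0, input.Length);
            } // disposing the compression stream flushes the final block
            return output.ToArray(); // ToArray still works after the stream is closed
        }
    }

    // Same algorithm, wrapped in the gzip header and CRC-32 trailer.
    public static byte[] GZipCompress(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            return output.ToArray();
        }
    }

    public static byte[] GZipDecompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
            return output.ToArray();
        }
    }

    public static void Main()
    {
        // Hypothetical sample data, repeated so that it compresses well.
        var text = new StringBuilder();
        for (int i = 0; i < 200; i++) text.Append("Data compression in .NET. ");
        byte[] original = Encoding.UTF8.GetBytes(text.ToString());

        byte[] deflated = DeflateCompress(original);
        byte[] gzipped = GZipCompress(original);

        Console.WriteLine("Original: {0} bytes", original.Length);
        Console.WriteLine("Deflate:  {0} bytes", deflated.Length);
        Console.WriteLine("GZip:     {0} bytes", gzipped.Length);
        // A gzip stream always begins with the magic bytes 0x1F 0x8B.
        Console.WriteLine("GZip magic: {0:X2} {1:X2}", gzipped[0], gzipped[1]);
    }
}
```

The gzip output should come out slightly larger than the deflate output because of the extra header and trailer. If you only need compact in-memory data and do not care about the checksum or tool compatibility, DeflateStream is the leaner choice.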

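So GZip is actually a data format, not just an algorithm. Note that a gzip file holds a single compressed stream; to put multiple files into one archive, they are usually bundled first (for example with tar) and then compressed. You can open a file written by GZipStream with any gzip-capable utility, such as WinRAR on Windows or gunzip on Linux.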
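As a rough sketch of writing such a file, the following compresses a file on disk to a .gz file and reads it back. GZipFileDemo, CompressFile, DecompressFile and the file names are hypothetical, chosen only for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GZipFileDemo
{
    // Copies everything from one stream to another in small chunks.
    private static void Copy(Stream from, Stream to)
    {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = from.Read(buffer, 0, buffer.Length)) > 0)
        {
            to.Write(buffer, 0, read);
        }
    }

    public static void CompressFile(string sourcePath, string gzPath)
    {
        using (FileStream source = File.OpenRead(sourcePath))
        using (FileStream target = File.Create(gzPath))
        using (var gzip = new GZipStream(target, CompressionMode.Compress))
        {
            Copy(source, gzip);
        }
    }

    public static void DecompressFile(string gzPath, string targetPath)
    {
        using (FileStream source = File.OpenRead(gzPath))
        using (var gzip = new GZipStream(source, CompressionMode.Decompress))
        using (FileStream target = File.Create(targetPath))
        {
            Copy(gzip, target);
        }
    }

    public static void Main(string[] args)
    {
        // Hypothetical file names; pass your own as the first argument.
        string input = args.Length > 0 ? args[0] : "sample.txt";
        CompressFile(input, input + ".gz");
        DecompressFile(input + ".gz", input + ".restored");
        Console.WriteLine("Wrote {0}.gz and {0}.restored", input);
    }
}
```

The resulting .gz file should be readable by the utilities mentioned above, for example with gunzip.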

Sample code can be found on MSDN. There is also a sample solution that shows how to work with multiple files using GZip compression. As of this writing, both classes support a maximum stream length of 4 GB.