Data Compression

Data compression is the transformation of information in order to reduce the space it occupies. It allows more efficient use of the hardware resources that store, process, and transmit that information.

Data compression works by eliminating redundancy, which is characteristic of raw (uncompressed) data. The simplest example of redundancy is the same word repeated many times in a text.

To remove this kind of redundancy, each occurrence of a frequently repeated word is replaced with a short reference: a code of strictly fixed size that points to a single stored copy of the word.
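
As an illustration of this substitution, here is a minimal Python sketch. It assumes that words are separated by single spaces and that the code table is kept alongside the encoded data. The names build_table, encode, and decode are hypothetical, chosen only for this example.

```python
from collections import Counter

def build_table(text):
    """Assign a small integer code to every distinct word.

    The most frequent words get the smallest codes. This ordering is an
    illustrative choice, not a requirement of the technique.
    """
    counts = Counter(text.split(" "))
    return {word: code for code, (word, _) in enumerate(counts.most_common())}

def encode(text, table):
    """Replace each word with its integer code."""
    return [table[word] for word in text.split(" ")]

def decode(codes, table):
    """Restore the original text from the codes and the table."""
    words = {code: word for word, code in table.items()}
    return " ".join(words[code] for code in codes)

text = "the cat and the dog and the bird"
table = build_table(text)
codes = encode(text, table)
restored = decode(codes, table)
```

Each repeated word is stored once in the table, and every occurrence in the text shrinks to one small integer. Whether this saves space overall depends on how often words repeat, because the table itself must also be stored.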

The “weight” of data can also be reduced by entropy coding: values that occur very often are given short codes, and values that occur very rarely are given long codes. If the data has no redundancy (encrypted information, “white noise”, a very short signal, etc.), it cannot be compressed without losing information.
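
One well-known form of entropy coding is Huffman coding. The sketch below is a minimal illustration rather than a production encoder: the function name huffman_codes is hypothetical, and it returns codes as strings of "0" and "1" characters instead of packed bits.

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Return a dict mapping each symbol in data to its bit string."""
    counts = Counter(data)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(weight, i, {symbol: ""})
            for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

sample = "aaaaaaaabbbc"
codes = huffman_codes(sample)
encoded_bits = sum(len(codes[symbol]) for symbol in sample)
```

In this sample the frequent symbol "a" receives a one-bit code, while the rarer "b" and "c" receive two-bit codes, so the whole string needs 16 bits instead of the 96 bits used by one byte per character.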

Lossless data compression allows the original information to be restored completely when needed, because the amount of information stored does not decrease even though the space it occupies does.
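
A short Python sketch using the standard zlib module can make this concrete. It is only an illustration under assumed inputs: the sample sentence, the fixed seed, and the variable names are made up for the example.

```python
import random
import zlib

# Highly redundant input: one phrase repeated many times.
redundant = b"the same word again and again " * 200

# Input with no exploitable redundancy: pseudo-random bytes
# (fixed seed so the run is repeatable).
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(redundant)))

packed_redundant = zlib.compress(redundant)
packed_noise = zlib.compress(noise)

# Lossless: decompression returns exactly the original bytes.
restored_redundant = zlib.decompress(packed_redundant)
restored_noise = zlib.decompress(packed_noise)

print(len(redundant), "->", len(packed_redundant))  # far fewer bytes
print(len(noise), "->", len(packed_noise))          # no reduction
```

Both inputs are restored byte for byte, but only the redundant one occupies less space after compression. The random bytes come out slightly larger because of the format overhead.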

Such compression is possible when the probabilities of the possible messages are unevenly distributed. For example, some of the messages that are possible in theory may never actually occur in the data being encoded.
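
The effect of an uneven distribution can be measured with Shannon entropy, H = -sum(p_i * log2(p_i)), which gives the average number of bits per symbol that an ideal lossless code would need. The following Python sketch is a rough illustration; the function name and the two sample strings are assumptions made for the example.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(data):
    """Shannon entropy of the symbol frequencies in data, in bits per symbol."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Four symbols, all equally likely: no unevenness to exploit.
uniform = "abcd" * 25

# The same four symbols, but one of them dominates.
skewed = "a" * 97 + "bcd"

h_uniform = entropy_bits_per_symbol(uniform)
h_skewed = entropy_bits_per_symbol(skewed)
```

With equal probabilities the entropy is 2 bits per symbol, which is exactly what a plain fixed-size code for four symbols already uses. With the skewed distribution it drops well below 1 bit per symbol, and that gap is the room a lossless compressor has to work with.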