Last updated on September 18, 2020
When a file is compressed, its size is reduced. It’s based on a simple principle: if you have a file with the text “aaaaaabbb”, which is 9 characters long, you could reduce the size by saving it as “6a3b”. Using this rudimentary algorithm, you can decompress the compressed file by multiplying the character that follows it by the number of times the digit indicates.
That’s why two files that are the same size are compressed at a different compression rate. Following the previous example, the file “aaaaaabbb” would become “6a3b”, which means a compression ratio of 4:9, i.e. it is reduced by 66%. However, the file “aaabbaaab” would become “3a2b3a1b”, with a compression ratio of 8:9, it is only reduced by 12%.
The other method of compression involves a loss of data, since it is based on eliminating information that does not contribute much to the final result, but that requires many bytes to be stored. This can be done without major problems in images and music.
There are hundreds of compression algorithms. It is a very interesting branch of computing. If you are interested in delving deeper into it, I recommend this post on the history of compression algorithms.
Now imagine that you can get a very good compression rate. 1:1000. Or better, 1:1000000. This would mean that a several TBs file would be compressed into a few KBs file. If we send this compressed file to an unsuspecting user who decompresses it, his computer will try to decompress a quantity of data greater than its storage capacity, which will cause the system to get caught. This can be used to attack the availability of a system, or to disable your antivirus and try to get in with another type of malware.
Here are several ways to do this type of attack.
Nested zip bombs
The compression method for zip files is DEFLATE. The biggest drawback is that the maximum compression rate supported by this method is 1:1032. That’s why recursion is used. That way, with a pyramidal recursion, you can get a really high compression rate. Let’s look at an example. First, I have created with a loop in python a 17GB file containing only zeros, “file.txt”.
Then I compressed it, creating the file 1.zip, which only takes 17MB. I have made 9 copies of this compressed file. To compress it just use the command zip 1.zip file.txt
Now comes the only problem I’ve found doing this PoC. If we try to compress these ten zips into another zip, we would see that the resulting file has not compressed anything, and takes up the same as the ten zips put together:
This is because Zip’s “store” method does not perform any compression, and Zip chooses this method when it feels it is better to store rather than compress, as is the case here. It is also useless to indicate that you want to compress it with the option -Z deflate, if the program considers it better it will continue storing it with “store”.
However, by using the python zipfile library you can force it to compress. I have made this little script that compresses the ten zips in one, and I have put it in a loop to have 10 copies of this new zip.
These new copies take up only 263KB:
With a 4 level nesting, we came to have a single 29KB zip containing 10 zips of 26KB each containing 10 zips of 27KB each containing 10 zips of 263KB each containing 10 zips of 17MB each containing a 17GB file. This is a total of 17*10*10*10*10 = 170TB compressed in just 29KB. And you only need to do a couple more iterations to get several PBs with just a few KB.
You can visit here the next post of the zips bomb series, in which I talk about the “quine” zips bombs that replicate themselves.