I decided to finally learn Qt and started reading “C++ GUI Programming with Qt 4” (first edition), which is available online. The book comes as a zip file that unpacks to a huge 51MB PDF. Even considering that the book is quite long (556 pages), the file is much larger than one would expect. The size makes reading the PDF less convenient, as there is a considerable delay when opening it (especially if the PDF resides on portable storage), so I decided to play around a little and see what I could do about it.
I tried re-distilling the book’s PDF by printing it through the default PDF printer that comes with KDE, and voilà, the newly printed PDF weighed only 5.5MB, roughly 10% of the original file size. I’m not sure what bloat gets removed, as I don’t see any loss of quality, except that the bookmarks and links inside the PDF were lost. I also successfully tested the technique on 30MB and 26.2MB PDFs of other books I have, which resulted in 6.7MB and 6.8MB re-distilled PDFs (respectively).
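To put the three results side by side, here is a small sketch that computes what fraction of the original size each re-distilled file occupies. The sizes are the ones reported above; the helper name `remaining_fraction` is only for illustration.

```python
# Sizes in MB as reported in the post: (original, re-distilled).
sizes_mb = [(51.0, 5.5), (30.0, 6.7), (26.2, 6.8)]


def remaining_fraction(original_mb, distilled_mb):
    """Fraction of the original size that the re-distilled file occupies."""
    return distilled_mb / original_mb


for original, distilled in sizes_mb:
    share = remaining_fraction(original, distilled)
    print(f"{original} MB -> {distilled} MB: {share:.1%} of the original")
```

This prints 10.8%, 22.3% and 26.0%, so the first book benefited noticeably more than the other two.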
So it seems one can greatly reduce the size of big PDFs by re-distilling them with a simple PDF printer, at the cost of losing the bookmarks and links in the PDF (which I believe may be worth it, depending on the original file size). It would be interesting to figure out a way to apply this technique while still keeping the bookmarks and links, if that’s possible at all.
Did you ever find out how to do this, as in distill but keep the links?
I did further testing after writing the post. If I remember correctly, the issue was that the original PDF wasn’t compressed. When I ran it through the distiller, it created a new, compressed PDF.
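One rough way to check whether a PDF is in this situation is to look at how many of its streams declare a compression filter. The sketch below is only a heuristic: it assumes the streams are compressed with `/FlateDecode` (the zlib-based filter most writers use) and simply counts byte patterns instead of parsing the file, so treat the result as a hint. The function name is made up for this example.

```python
import re


def compressed_stream_ratio(pdf_bytes):
    """Rough share of PDF streams that mention the /FlateDecode filter."""
    # Every stream body is terminated by the 'endstream' keyword.
    total = len(re.findall(rb"endstream", pdf_bytes))
    # Count mentions of the zlib-based filter (an assumption: other
    # filters, such as /DCTDecode for JPEG images, are not counted).
    flate = len(re.findall(rb"/FlateDecode", pdf_bytes))
    if total == 0:
        return 0.0
    # Cap at 1.0 because the filter name can appear more than once per stream.
    return min(1.0, flate / total)
```

For a file on disk, something like `compressed_stream_ratio(open("book.pdf", "rb").read())` would do, where `book.pdf` is a placeholder name. A value near zero suggests the content streams are stored uncompressed.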
The PDF could also be compressed using pdftk; this should be the better way, as you keep the links and bookmarks.
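A minimal sketch of the pdftk route, wrapped in Python so it can be scripted over several files. It assumes pdftk is installed and on the PATH; the function names and the file names in the usage note are placeholders, not something from the original post.

```python
import shutil
import subprocess


def pdftk_compress_command(src, dst):
    """Build the pdftk invocation that rewrites src into dst with compressed streams."""
    return ["pdftk", src, "output", dst, "compress"]


def compress_pdf(src, dst):
    """Run pdftk on src, writing the compressed result to dst."""
    # Assumption: the pdftk binary is available on the PATH.
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk was not found on the PATH")
    # check=True raises CalledProcessError if pdftk exits with an error.
    subprocess.run(pdftk_compress_command(src, dst), check=True)
```

Calling `compress_pdf("book.pdf", "book-compressed.pdf")` is equivalent to running `pdftk book.pdf output book-compressed.pdf compress` in a shell. Whether the result gets as small as the printed version will depend on the file.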