PDF Map Hack – Remove Embedded Image


Need to remove an embedded image from a PDF file?  You can easily chop out parts of it as needed with the PDFtk command line tool and a little bit of text editing.  Here’s how…

When GIS applications started allowing us to save maps into PDF files, I was happy for about 2 minutes.  The first 48MB PDF sent to a client (by “sent” I mean burned to a CD and mailed) wasn’t even viewable on their low-powered computer.

Now, more than 12 years later, I’m still dealing with a similar PDF challenge: how to remove unwanted elements from the document.  This time I had a PDF with vector graphics drawn on top of an orthophoto background image.  Since I was going to convert the PDF to a TIFF and georeference it in QGIS for someone else to use (see below), I wanted to drop the embedded image.

Remove Image from PDF Map

I’m a big fan of the PDFtk command line toolkit (aka PDFtk Server) as it is cross platform and has never let me down when splicing, chopping, concatenating and now modifying PDF files.  So I was thrilled to read a tip on the quickpdf.org forums: uncompress the PDF with PDFtk, delete the image object in a text editor, then compress the file again.

That is precisely what I did and it worked well.  The only gotcha was that I had never edited an uncompressed PDF file before and, obviously, my PDF wasn’t the same as the example in the tip.
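
For reference, here is a minimal sketch of the two PDFtk passes, wrapped in Python so they can be scripted. It assumes `pdftk` is on your PATH and relies on PDFtk's `uncompress` and `compress` output options. The helper names and file names are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess


def pdftk_cmd(src, dst, mode):
    """Build the argv for a PDFtk pass; mode is 'uncompress' or 'compress'."""
    if mode not in ("uncompress", "compress"):
        raise ValueError("mode must be 'uncompress' or 'compress'")
    # Equivalent shell form: pdftk <src> output <dst> <mode>
    return ["pdftk", src, "output", dst, mode]


def run_pdftk(src, dst, mode):
    """Run the PDFtk pass, failing early if pdftk is not installed."""
    if shutil.which("pdftk") is None:
        raise RuntimeError("pdftk not found on PATH")
    subprocess.run(pdftk_cmd(src, dst, mode), check=True)


# Typical round trip (placeholder file names):
#   run_pdftk("map.pdf", "map_uncompressed.pdf", "uncompress")
#   ... edit map_uncompressed.pdf by hand or with a script ...
#   run_pdftk("map_uncompressed.pdf", "map_noimage.pdf", "compress")
```

The uncompress pass rewrites the page streams as plain text so the file can be opened in an editor; the compress pass packs them again after the edit.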

I was able to find the object holding the only RGB image: it started at “4 0 obj” and its “endobj” was about 400 lines further down.  I deleted that whole object, ran the compress step and was off to the races.
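
If hunting for the image object by eye is tedious, the search can be scripted. The sketch below is a heuristic, not a PDF parser: it assumes an uncompressed file in which each object starts on its own line with “N G obj” and ends with a line starting “endobj”, and it treats any object containing `/Subtype /Image` as an image. The function names are made up for this example.

```python
import re

# One indirect object: "<num> <gen> obj" at a line start through the next
# line that starts with "endobj". Bytes patterns, since PDFs hold binary data.
OBJ_RE = re.compile(rb"^(\d+) (\d+) obj\b.*?^endobj\b", re.DOTALL | re.MULTILINE)
IMAGE_RE = re.compile(rb"/Subtype\s*/Image\b")


def find_image_objects(pdf_bytes):
    """Return (object_number, start, end) for each object marked as an image."""
    hits = []
    for m in OBJ_RE.finditer(pdf_bytes):
        if IMAGE_RE.search(m.group(0)):
            hits.append((int(m.group(1)), m.start(), m.end()))
    return hits


def drop_object(pdf_bytes, start, end):
    """Return the PDF bytes with the span [start, end) cut out."""
    return pdf_bytes[:start] + pdf_bytes[end:]
```

Read the file with `open(path, "rb").read()`, pick the hit you want to drop, write the result back out and run the compress step on it. As with the manual edit, this leaves the cleanup of the file's internal offsets to PDFtk.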

PDF to TIFF Command Line Tool on OSX

As a bonus for reading this far, I then converted the PDF to TIFF so I could use it in QGIS.  I used the built-in OSX command line tool:
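
One built-in option on OSX is `sips` (the scriptable image processing system). Whether that is the tool used here is an assumption, so treat the following as a sketch of one way to do the conversion. The helper names and file names are placeholders.

```python
import shutil
import subprocess


def sips_tiff_cmd(src_pdf, dst_tiff):
    """Build the argv for converting a PDF to TIFF with sips."""
    # Equivalent shell form: sips -s format tiff <src_pdf> --out <dst_tiff>
    return ["sips", "-s", "format", "tiff", src_pdf, "--out", dst_tiff]


def pdf_to_tiff(src_pdf, dst_tiff):
    """Run the conversion, failing early if sips is unavailable (non-OSX)."""
    if shutil.which("sips") is None:
        raise RuntimeError("sips not found; it ships with OSX only")
    subprocess.run(sips_tiff_cmd(src_pdf, dst_tiff), check=True)


# Example (placeholder names):
#   pdf_to_tiff("map_noimage.pdf", "map_noimage.tiff")
```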

Incidentally, using the QGIS Georeferencer plugin, I was only four clicks away from applying a thin plate spline transformation that turned the TIFF file into a georeferenced image with transparency.  Now I could put whatever background image I wanted behind it.



© Tyler Mitchell / Spatialguru.com