PDF conversion for EPUB

Observant readers will have spotted that I have been obsessing about ereaders recently and have just bought a Kobo Touch (see here for the details). I have, therefore, started to work on converting a stack of PDF files into EPUB format for reading on said device. Don’t bother with native PDF support on any of these devices: a PDF always renders in a fixed page layout, so it cannot be read without scrolling on any portable device smaller than the paper for which it was intended. I immediately found out why people find conversion so difficult! I don’t really have a good solution (so far) but I at least understand the problem.

The problem with PDF files

Adobe designed PDF files (or Acrobat files as we should call them) to provide an electronic image of a page as it would appear on a printed copy. They are widely used for sending the equivalent of an electronic fax, and also for sending material to people when you don’t want to give them the source files.

Pages can be complex, with multiple columns, images, tables, side notes, footnotes and so forth, so converting a page representation into blocks of HTML (which is roughly what EPUB looks like to me) is not simple.
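To see those "blocks of HTML" for yourself, here is a minimal Python sketch. It relies on the fact that an EPUB file is a ZIP archive and simply lists the HTML-like documents inside it. The function name list_epub_documents and the file name book.epub are made up for illustration, and the extension filter is an assumption about how the files are named, not something the EPUB format guarantees.

```python
import zipfile


def list_epub_documents(epub_path):
    """Return the names of the HTML-like files inside an EPUB archive.

    Assumption: content documents use a .xhtml, .html or .htm extension.
    A stricter tool would read the package manifest instead.
    """
    with zipfile.ZipFile(epub_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.lower().endswith((".xhtml", ".html", ".htm"))
        )
```

Calling list_epub_documents("book.epub") on a converted book should show roughly one file per chapter or section, which is handy for checking where a conversion decided to split things.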

I would have hoped that simple books which are one column with maybe page numbers, chapter breaks and headers would be achievable. How wrong I was!

Freeware options for conversion

As far as I can tell the best (freeware) package out there is Calibre, which has a simple interface, a tool for getting book data from the web to populate metadata, and the ability to convert formats and load books onto your device. I pointed it at some PDF files and they seemed to work; that was until I got them onto my Kobo…

To be fair to Calibre, the help file says ‘PDF is unreliable’ and the software is upgraded regularly (in fact I notice I don’t have the newest version), but it does weird things on conversion: odd page breaks, weird splitting of the word Chapter and inconsistently losing ‘ll’. Painful in the extreme.

I have had to fall back on something called Mobipocket Creator to convert PDF to PRC format, which seems to do a reasonable job of handling the pure text. I then use Calibre to convert the PRC file to EPUB. This of course means that the actual EPUB code produced is a bit inconsistent and far from obvious, and the Table of Contents is no longer generated.
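The second step (PRC to EPUB) can also be scripted rather than clicked through. Calibre ships a command-line tool called ebook-convert, and the sketch below wraps it from Python. Treat it as a rough outline under stated assumptions rather than a recipe: the helper names and file names are hypothetical, and the optional --level1-toc XPath expression will only rebuild a Table of Contents if the intermediate file really marks chapters with the tags you point it at.

```python
import shutil
import subprocess


def build_convert_command(source, target, toc_xpath=None):
    """Build the argument list for Calibre's ebook-convert tool."""
    command = ["ebook-convert", str(source), str(target)]
    if toc_xpath is not None:
        # Ask Calibre to build first-level Table of Contents entries
        # from the elements matched by this XPath expression.
        command += ["--level1-toc", toc_xpath]
    return command


def convert(source, target, toc_xpath=None):
    """Run the conversion, failing loudly if Calibre is not installed."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is Calibre installed?")
    subprocess.run(build_convert_command(source, target, toc_xpath), check=True)
```

For example, convert("book.prc", "book.epub", toc_xpath="//h:h2") would try to build the contents from second-level headings; whether that matches anything depends entirely on what Mobipocket Creator produced, so expect to experiment.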

I then need to edit the EPUB. I could spend more time in Calibre and tweak the conversion, but I prefer to get hands-on, so I used an editor called Sigil, which lets me get right into the code, although of course the previous steps have already tagged things for style and structure!

The moral of this tale?

To be fair I want something for nothing – lots of books in PDF converted for free to EPUB to allow me to read in comfort. Maybe a bit of pain in conversion is good for the soul?

I will let you know how this progresses in coming months!



About Tony Jones
Big Finish writer, reviewer and blogger, I'm interested in science fiction and Doctor Who. I review for CultBox, The Doctor Who Companion and others. I am also Audio Drama editor for Starburst Magazine, and write the occasional piece for Vortex, the BSFA critical magazine.
