I am currently starting a project that deals with the ePub format. I thought a good first step would be to build a basic tool chain for creating ePub files from existing documents.
The de facto document format is still PDF, so I started with a document in this format. To make sure I was dealing with a CC-licensed document, I used one I wrote myself: my ancient diploma thesis (there is only a German version :-/). The PDF is around 3.5 MB in size.
If you work with different eBook formats and different eBook readers, you should definitely have a look at calibre, an all-in-one eBook solution. It handles PDF and ePub files and, more importantly, can convert PDF files into ePub files. The resulting ePub file of my diploma thesis is available for download here. I hope it looks better than it does on my iPhone using Stanza ;-).
As you can see, the result of the conversion still needs some polishing, but what the calibre converter produces is quite good and readable. There is a big problem with the headers and footers (e.g. page numbers), which end up in the text, but for an automatic, free process I can live with it.
There might be some PDFs that calibre cannot process. I was able to solve this problem by “reprinting” the PDF with the macOS PDF export. calibre could then convert these PDF files as well.
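If you prefer the command line, calibre also ships a tool called ebook-convert that does the same conversion. The following is only a minimal sketch, assuming ebook-convert is on your PATH; the helper names epub_name and pdf_to_epub and the file name thesis.pdf are made up for illustration.

```shell
#!/bin/sh
# Derive the ePub file name from the PDF name (hypothetical helper):
# thesis.pdf -> thesis.epub
epub_name() {
    printf '%s\n' "${1%.pdf}.epub"
}

# Convert one PDF into an ePub using calibre's ebook-convert.
# Returns 1 if calibre's command-line tools are not installed or not on PATH.
pdf_to_epub() {
    if ! command -v ebook-convert >/dev/null 2>&1; then
        echo "ebook-convert not found; install calibre first" >&2
        return 1
    fi
    ebook-convert "$1" "$(epub_name "$1")"
}

# Usage (thesis.pdf is a placeholder file name):
# pdf_to_epub thesis.pdf
```

The output format is picked from the file extension of the second argument, so the sketch only has to swap .pdf for .epub.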
Another cool thing I found today is the conversion of PDFs into image files using the ImageMagick convert tool. There is a good blog post about it at medicalnerds.com.
You need just a few steps.
First, make sure that you have a version of ImageMagick installed on your system:
sudo port install ImageMagick    # MacPorts
fink install imagemagick         # Fink
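A quick way to check the installation is to look for the tools on your PATH. This is just a sketch: have_tool is a made-up helper name, and the check for gs assumes that ImageMagick hands PDF input over to Ghostscript, which is how it normally reads PDFs.

```shell
#!/bin/sh
# Return success if the given command can be found on the PATH
# (hypothetical helper name).
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# convert is ImageMagick's conversion tool; gs is Ghostscript,
# which ImageMagick uses to render PDF pages.
for tool in convert gs; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done
```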
After that, you can use the convert command to generate an image file for every page of your PDF document:
convert -density 300 file.pdf file.jpg
To get information about the progress, you can add the -monitor switch to the call.
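Put together, the call could be wrapped like this. It is a rough sketch under a few assumptions: page_pattern and pdf_to_images are made-up helper names, file.pdf is a placeholder, and the %03d in the output name asks ImageMagick to number the pages with three digits (file-000.jpg, file-001.jpg, ...).

```shell
#!/bin/sh
# Build the numbered output pattern from the PDF name (hypothetical helper):
# file.pdf -> file-%03d.jpg
page_pattern() {
    printf '%s\n' "${1%.pdf}-%03d.jpg"
}

# Render every page of a PDF into its own JPEG.
# $1 = PDF file, $2 = density in dpi (defaults to 300, as in the call above)
pdf_to_images() {
    pdf=$1
    density=${2:-300}
    if ! command -v convert >/dev/null 2>&1; then
        echo "convert not found; install ImageMagick first" >&2
        return 1
    fi
    # -monitor prints progress information while the pages are processed
    convert -density "$density" -monitor "$pdf" "$(page_pattern "$pdf")"
}

# Usage (file.pdf is a placeholder file name):
# pdf_to_images file.pdf
# pdf_to_images file.pdf 150
```

A lower density gives smaller images and a faster run, which may be enough if the pages are only read on a small screen.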