...making Linux just a little more fun!
By Anonymous
This article is a follow-up to Maxin B. John's article, which introduced us to the Festival text-to-speech synthesizer and some possible applications. Here, we will push it a bit further and see how we can convert ebooks from the most common formats like HTML, CHM, PS and PDF into audiobooks ready to send to your portable player.
With cheap, small portable MP3 players so widely available these days, it has become very convenient to listen to books and articles just about anywhere, at times when you would not necessarily be able to read them. Audiobooks usually require very small bit-rates, and hence very small file sizes, so they are the most suitable content for cheap, small-capacity MP3 players (128 MB or less).
There are lots of websites out there catering to audiobook needs, with a wide range of choices. However, it might happen that you really want to read that article or book that you found on the web as a PDF or as HTML, and there is probably no audio version of it available (yet). I will provide you with some scripts that will enable you to convert all your favorite texts into compressed audio files ready to upload and enjoy on your portable player. Here we go!
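Before going any further, it may save some frustration to check that the commands used by the scripts in this article are actually on your PATH. The following is only a small sketch: the helper name missing_tools is made up for this example, and the list of commands (ps2ascii, text2wave, lame, lynx and archmage) is simply what the scripts below call.

```sh
#!/bin/sh
# missing_tools NAME...: print each command that cannot be found in PATH.
# (Hypothetical helper, not part of any of the tools used in this article.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The commands called by ps2mp3, html2mp3 and chm2mp3.
missing=$(missing_tools ps2ascii text2wave lame lynx archmage)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found in PATH:" $missing
else
    echo "All tools found."
fi
```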
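The scripts below rely on Festival (which provides 'text2wave'), 'lame', 'ps2ascii' (part of Ghostscript), 'lynx' and 'archmage'. Most of these tools are packaged in the main Linux distributions. Once you have all of the above installed, we can start the fun. We will begin with one of the most common formats for ebooks: Adobe PDF.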
#!/bin/sh -

chunks=200

if [ "$#" -eq 0 ]; then
    echo "Usage: $0 [-a author] [-t title] [-l lines] <ps or pdf file>"
    exit 1
fi

# Parse the optional author, title and lines-per-chunk arguments.
while getopts "a:t:l:" option
do
    case "$option" in
        a) author="$OPTARG";;
        t) title="$OPTARG";;
        l) chunks="$OPTARG";;
    esac
done
shift $((OPTIND-1))

# Convert the document to plain text and split it into chunks.
ps2ascii "$@" | split -l "$chunks" - tmpsplit

# Synthesize and encode each chunk in turn.
count=1
for i in tmpsplit*
do
    text2wave "$i" | lame --ta "${author:-psmp3}" --tt "$count ${title:-psmp3}" \
        --tl "${title:-psmp3}" --tn "$count" --tg Speech --preset mw-us \
        - "abook${count}.mp3"
    count=$((count + 1))
done

rm tmpsplit*
First, 'ps2ascii' converts the PDF or PostScript file to plain text. That text is then split into chunks of $chunks lines each; you might have to tweak that value, since splitting the book into more than 255 files might cause trouble in some players (the ID3v1 track number tag can only go up to 255). After that, each chunk is processed by 'text2wave' and the resulting audio stream is sent directly to 'lame' through a pipe. The encoding is performed with the mw-us preset, which is mono ABR at an average of 40 kbps, sampled at 16 kHz. That should be enough, since Festival outputs a voice sampled at 16 kHz by default. You can leave it as it is, unless you are using a voice synthesizer with a different sampling rate. Refer to 'lame --preset help' for optimum settings for different sampling rates.
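If you are wondering which value to pass with -l, one way to approach it is to work backwards from the 255-file limit. The sketch below is only an illustration: the function name lines_per_chunk and the 60000-line example are made up, and the arithmetic is a plain ceiling division.

```sh
#!/bin/sh
# lines_per_chunk TOTAL [MAX_FILES]: smallest chunk size (in lines) that keeps
# the number of pieces at or below MAX_FILES (255 by default).
# Hypothetical helper; assumes TOTAL is a positive integer.
lines_per_chunk() {
    total=$1
    max_files=${2:-255}
    # Ceiling division: round up so no extra file is ever needed.
    echo $(( (total + max_files - 1) / max_files ))
}

# Example: a book whose text dump is 60000 lines long.
lines_per_chunk 60000
```

With the real tools, the total could come from something like ps2ascii my.pdf | wc -l, and the result would then be passed to ps2mp3 with the -l option. In practice you will probably want larger chunks than this minimum, since each chunk becomes one track.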
When you pass the author or title, do not forget to quote the string if it includes spaces; for example:
ps2mp3 -a "This is the author" -t "This is the title" my.pdf
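To see why the quotes matter, here is a small sketch that mimics the getopts loop used by the scripts in this article. The function parse_opts and the variable first_file are made up for the demonstration; only the option letters are taken from the scripts.

```sh
#!/bin/sh
# parse_opts ARGS...: hypothetical helper that repeats the option loop of the
# scripts in this article and records what it saw in global variables.
parse_opts() {
    author=""
    title=""
    OPTIND=1                      # restart option parsing on every call
    while getopts "a:t:" option; do
        case "$option" in
            a) author="$OPTARG" ;;
            t) title="$OPTARG" ;;
            *) return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    first_file=${1:-}             # first argument left after the options
}

# Quoted: the whole phrase reaches -a as a single argument.
parse_opts -a "This is the author" -t "This is the title" my.pdf
echo "quoted:   author='$author' file='$first_file'"

# Unquoted: -a only receives "This", and "is" is mistaken for the file name.
parse_opts -a This is the author my.pdf
echo "unquoted: author='$author' file='$first_file'"
```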
Next, we are going to see how to convert the most common format of all, HTML, into audio files.
#!/bin/sh -
#requires lynx, festival and lame

if [ "$#" -eq 0 ]; then
    echo "Usage: $0 [-a author] [-t title] <html file1> <html file2> ..."
    exit 1
fi

while getopts "a:t:" option
do
    case "$option" in
        a) author="$OPTARG";;
        t) title="$OPTARG";;
    esac
done
shift $((OPTIND-1))

count=1
for htmlfile in "$@"
do
    section=`expr match "${htmlfile##*/}" '\(.*\)\.htm'`
    lynx -dump -nolist "$htmlfile" | text2wave - | lame --ta "${author:-html2mp3}" \
        --tt "$count. ${section:-html2mp3}" --tl "${title:-html2mp3}" \
        --tn "$count" --tg Speech --preset mw-us - "${section}.mp3"
    #rm /tmp/est_*
    count=`expr $count + 1`
done
The first part of the script, up to line 16, extracts the optional parameters from the command line. From line 19 on, we loop over the list of HTML files, that is, the remaining arguments given on the command line. On line 21, "${htmlfile##*/}" strips out everything up to and including the last "/" character (useful if we are dealing with URLs or a directory path) so that only the filename remains. Then the '\(.*\)\.htm' regular expression takes care of the file extension, so the variable section holds only the stem of the filename. It will be used to tag and name the resulting MP3 files.
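If you want to play with the filename handling on its own, the stem can also be computed with parameter expansion alone. This is a sketch of an alternative, not what the script above does: the function name stem and the example paths are made up, and unlike 'expr match', which prints nothing when the name does not contain ".htm", this version returns such a name unchanged.

```sh
#!/bin/sh
# stem PATH_OR_URL: file name without directories and without .htm/.html.
# Hypothetical helper built only on POSIX parameter expansion.
stem() {
    name=${1##*/}                   # drop everything up to the last "/"
    printf '%s\n' "${name%.htm*}"   # drop the shortest trailing ".htm..." part
}

stem /home/user/book/chapter1.html
stem http://example.com/docs/intro.htm
stem notes.txt
```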
Line 22 is really the heart of the script: first, 'lynx' takes an HTML file as input and dumps its text to stdout. That output is piped to 'text2wave' and converted into a WAV-encoded stream, which is then piped to 'lame' to be encoded with the mw-us preset and ID3-tagged with the artist, the title and the Speech genre.
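The tag values rely on the "${variable:-default}" expansion, which falls back to a default when the variable is unset or empty. The sketch below prints the values that would end up in each 'lame' tag option (--ta artist, --tt title, --tl album, --tn track, --tg genre) without running 'lame' at all; the function name describe_tags and the sample author and title are made up for the illustration.

```sh
#!/bin/sh
# describe_tags COUNT SECTION: print the ID3 values html2mp3 would hand to
# lame. Reads the global variables author and title, which may be unset.
# Hypothetical helper for illustration only.
describe_tags() {
    count=$1
    section=$2
    printf 'artist=%s\n' "${author:-html2mp3}"
    printf 'title=%s\n'  "$count. ${section:-html2mp3}"
    printf 'album=%s\n'  "${title:-html2mp3}"
    printf 'track=%s\n'  "$count"
    printf 'genre=%s\n'  "Speech"
}

# No -a or -t given: every tag falls back to its default.
unset author title
defaults=$(describe_tags 1 "")

# Both options given (sample values).
author="Jane Doe"
title="My Book"
custom=$(describe_tags 3 chapter3)

printf '%s\n\n%s\n' "$defaults" "$custom"
```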
Note that the script can also take URLs as arguments, since they are directly sent to lynx.
This html2mp3 script is going to be very useful for our next step, which is converting from CHM to MP3.
CHM files are a proprietary format developed by Microsoft, but basically they are just compiled HTML files with an index and a table of contents in one file. Their use as an ebook format is certainly not as widespread as HTML or PDF, but as you will see, it is pretty straightforward to convert them to audio files once you have the right tools.
#!/bin/sh -
#requires archmage and html2mp3

if [ "$#" -eq 0 ]; then
    echo "Usage:"
    echo "  $0 [-a author] [-t title] <chm file>"
    exit 1
fi

while getopts "a:t:" o
do
    case "$o" in
        a) author="$OPTARG";;
        t) title="$OPTARG";;
    esac
done
shift $((OPTIND-1))

# Extract the pages into a scratch directory, convert each one, then clean up.
archmage "$1" tmpchm
find tmpchm -name "*.htm*" -exec html2mp3 -a "$author" -t "$title" {} \;
rm -fr tmpchm
'archmage' is a Python-based script that extracts HTML files from CHM. You will need to have Python installed to get it to run.
Unlike 'ps2mp3', 'chm2mp3' does not require an arbitrary decision on where to split the book: every page compiled into the CHM file becomes its own audio file. All we need to do is extract these pages with 'archmage' and convert them with 'html2mp3'.
We are using the find command to recursively search for HTML files in the CHM book that we extracted, since sometimes the HTML files are stored in subdirectories inside the CHM. Then, for each HTML file found, we call 'html2mp3'.
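The behaviour of 'find' is easy to try out on a throwaway directory before pointing it at a real book. The sketch below builds a made-up tree (all file names are invented) and shows two things: the "*.htm*" pattern picks up pages in subdirectories, and the "{} +" form of -exec hands all matches to a single invocation of the command. That second point is worth knowing because, with "{} \;", 'html2mp3' is started once per page, so its track counter starts again at 1 each time.

```sh
#!/bin/sh
# Build a throwaway tree that mimics an extracted CHM (names are made up).
tmpchm=$(mktemp -d)
mkdir -p "$tmpchm/part1"
: > "$tmpchm/index.html"
: > "$tmpchm/part1/ch1.htm"
: > "$tmpchm/part1/ch2.html"
: > "$tmpchm/part1/style.css"

# list_pages DIR: print every HTML page under DIR, one per line, sorted.
# (Hypothetical helper; find itself does not guarantee any order.)
list_pages() {
    find "$1" -type f -name "*.htm*" | sort
}

page_count=$(list_pages "$tmpchm" | wc -l | tr -d ' ')
echo "pages found: $page_count"

# With "{} +", find passes all matches to ONE invocation of the command;
# here a tiny inline script just reports how many arguments it received.
args_in_one_call=$(find "$tmpchm" -type f -name "*.htm*" \
    -exec sh -c 'echo "$#"' sh {} +)
echo "arguments passed in a single call: $args_in_one_call"

rm -rf "$tmpchm"
```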
Remember that it can take a while to encode several dozen pages of text to speech and then to MP3. But you do not need to encode a full book to start uploading and enjoying it on your portable player.