Getting Started
This library is an end-to-end audio/text aligner. It is meant to be used together with the ReadAlong-Web-Component to interactively visualize the alignment.
Background
The concept is a web application with a series of processing stages that ultimately produce a time-aligned audiobook, i.e., a package of:
- ReadAlong XML file describing text
- Audio file (WAV or MP3)
- HTML file describing the web component
This package can be loaded using the read-along web component.
A book is generated as a standalone HTML page by default, but can optionally be generated as an ePub file.
Required knowledge
- How to use a Command-line interface (CLI).
- How to edit and manipulate plain text, XML and SMIL files using a text editor or a code editor.
- How to edit and examine an audio file with Audacity or similar software.
- How to spin up a local web server (e.g., see "How do you set up a local testing server?").
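As one way to cover the last item, Python's standard library can serve a folder over HTTP. The sketch below is a minimal, assumed setup and is not specific to this library: the helper name `make_server`, the port, and the served directory are all illustrative choices.

```python
import functools
import http.server


def make_server(directory=".", port=8000):
    """Build an HTTP server that serves static files from `directory`.

    `make_server` is a hypothetical helper for illustration only.
    """
    # Pin the handler to the chosen directory instead of the current one.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # Bind to localhost only, so the files are not exposed to the network.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server("path/to/book", 8000).serve_forever()` should then serve that folder at http://127.0.0.1:8000 until interrupted. For quick use, running `python -m http.server 8000` from inside the folder does much the same job.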
What you need to make a ReadAlong
In order to create a ReadAlong you will need two files:
- A text file, either in plain text (.txt) or in ReadAlong XML (.readalong)
- Clear audio in any format supported by ffmpeg
The content of the text file should be a transcription of the audio file. The audio can be spoken or sung, but if there is background music or noise of any kind, the aligner is likely to fail. Clearly enunciated audio is also likely to increase accuracy.
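Before running an alignment, it can help to confirm that both input files are in place. The following is a minimal sketch, not part of this library: `check_inputs` is a hypothetical helper, and it only checks the text file's extension and that both files exist. It does not verify that ffmpeg can decode the audio or that the transcription matches the recording.

```python
from pathlib import Path

# Text formats named above: plain text and ReadAlong XML.
TEXT_SUFFIXES = {".txt", ".readalong"}


def check_inputs(text_path, audio_path):
    """Return a list of problems; an empty list means both files look usable.

    Hypothetical helper for illustration only.
    """
    text, audio = Path(text_path), Path(audio_path)
    problems = []

    if text.suffix.lower() not in TEXT_SUFFIXES:
        problems.append(f"{text.name}: expected a .txt or .readalong file")
    if not text.is_file():
        problems.append(f"{text}: text file not found")
    elif text.stat().st_size == 0:
        problems.append(f"{text}: text file is empty")

    # Any ffmpeg-supported audio format is acceptable, so only existence
    # is checked here.
    if not audio.is_file():
        problems.append(f"{audio}: audio file not found")

    return problems
```

A call such as `check_inputs("story.txt", "story.mp3")` (file names assumed) returns an empty list when both files are present and the text file is non-empty.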