I would like to convert printed books I own into audio by scanning them with OCR and then running the text through a TTS engine. These titles are not available as ebooks.
Since OCR can make small errors, especially on images of old typefaces, I would like to find an OCR engine that tags each region of text with metadata describing the engine's confidence that the match is correct, or with a list of alternative candidates. For example, Google Voice's voicemail transcription highlights each word in a shade of gray that indicates how confident the speech-to-text engine is in that word.
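To make the request concrete, here is a rough sketch of the kind of output I have in mind. It does not use any real OCR library; the `OcrWord` class, its field names, the 0-to-1 confidence scale, and the 0.8 threshold are all hypothetical and only illustrate the shape of the data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OcrWord:
    """Hypothetical per-word OCR result (not any real engine's format)."""
    text: str                 # the engine's best guess
    confidence: float         # assumed scale: 0.0 (no idea) to 1.0 (certain)
    alternatives: List[str] = field(default_factory=list)  # other candidates


def flag_uncertain(words: List[OcrWord], threshold: float = 0.8) -> List[OcrWord]:
    """Return the words whose confidence falls below the threshold."""
    return [w for w in words if w.confidence < threshold]


def mark_for_review(words: List[OcrWord], threshold: float = 0.8) -> str:
    """Join the words into text, bracketing the ones that need proofreading."""
    return " ".join(
        w.text if w.confidence >= threshold else f"[{w.text}?]"
        for w in words
    )


# Made-up sample data, purely for illustration.
sample = [
    OcrWord("The", 0.99),
    OcrWord("olde", 0.55, ["old", "olde", "olcle"]),
    OcrWord("house", 0.93),
]

if __name__ == "__main__":
    print(mark_for_review(sample))
    for w in flag_uncertain(sample):
        print(w.text, w.confidence, w.alternatives)
```

With output like this I could proofread only the bracketed words before sending the text to the TTS engine, instead of rereading the whole book.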
Do you know of any packages that offer this?