The art of converting the pixels of an image into machine-editable text can by now look
back on a long tradition of successful development. Programs available today often
produce quite respectable and sometimes even astonishing results. Working with historical
sources, however, is quite a different story. Doing the job in this field requires not
only adequate hardware but also a considerable amount of expertise and experience.
The historical period the AAC has been working on in recent years poses a number of quite
particular challenges, such as poor paper quality, small type sizes and black-letter
typefaces. These features occur in varying combinations, and each combination demands its
own procedures.
In dealing with these challenges, we have tried to tap the full potential of existing
solutions. In many cases training the software has been helpful; very often, more
time-consuming interactive approaches had to be chosen. In a number of cases the
application of standard OCR procedures turned out to be unjustifiably inefficient or
yielded little usable output. In such circumstances OCR was replaced by double-keying, a
technique which has been applied in particular to newspaper sources, in cooperation with
a Beijing-based company.
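
The principle behind double-keying can be illustrated with a small sketch: the same
page is typed independently by two people, the two versions are compared, and only the
places where they disagree are checked against the source. The code below is a minimal
illustration under our own assumptions, not the workflow actually used for the AAC
material; the function name keying_disagreements and the sample strings are invented
for the example, and the comparison is done word by word with Python's standard difflib
module.

```python
from difflib import SequenceMatcher


def keying_disagreements(first_keying: str, second_keying: str):
    """Return (first, second) word spans where two transcriptions differ.

    Hypothetical helper for illustration only: words on which both keyers
    agree are accepted as they stand, everything else is flagged for a
    manual check against the source image.
    """
    a = first_keying.split()
    b = second_keying.split()
    # autojunk=False keeps difflib from treating frequent words as junk.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    disagreements = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            disagreements.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return disagreements


# Invented sample: the second keyer misread a black-letter long s as f.
first = "Wien, den ersten April"
second = "Wien, den erften April"
for left, right in keying_disagreements(first, second):
    print(f"check against source: {left!r} vs {right!r}")
```

The sketch rests on the assumption that two independent keyers rarely make the same
mistake in the same place, so passages on which both agree can be accepted without
further inspection.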






