Text Retrieval
Quick and reliable access is a mandatory prerequisite for putting digital texts to work. However, adequate interfaces and efficient search engines are not easy to build with standard tools. Most solutions do not scale with the constantly growing size of corpora, and the problem is more acute in complexly structured and deeply annotated text collections.

During the initial years of the AAC’s corpus build-up, corpus data were accessed using traditional full-text search tools. Although XML data have already been utilised fruitfully, exploiting the full potential of the XML data incorporated in the corpus remains on the agenda for the consolidation phase of the project. The tools applied during the first years were for the most part database applications with efficient full-text search capabilities, adapted to the particular needs of our researchers.

Having established a working infrastructure for the digital texts available at this point, the AAC is currently working on more sophisticated methods of utilising large-scale corpora. It is experimenting with various database systems as well as XML-aware indexing tools in order to establish standard procedures for accessing large XML text repositories.
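
The difference between plain full-text search and XML-aware retrieval can be illustrated with a minimal sketch. It is not the AAC's implementation: the sample document, the element names (div, head, p), the type attribute and both function names are assumptions made purely for illustration, and it uses only the Python standard library rather than any of the database or indexing systems under evaluation.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are
# illustrative and do not reflect the AAC's actual markup scheme.
SAMPLE = """<text>
  <div type="article">
    <head>On the press</head>
    <p>The press shapes opinion.</p>
  </div>
  <div type="advertisement">
    <p>Our press prints daily.</p>
  </div>
</text>"""


def full_text_hits(xml_source, term):
    """Plain full-text search: discard the markup, then count matches."""
    plain = re.sub(r"<[^>]+>", " ", xml_source)
    return len(re.findall(re.escape(term), plain))


def structured_hits(xml_source, term, div_type):
    """Structure-aware search: return the text of <p> elements that
    contain the term and sit inside a <div> of the requested type."""
    root = ET.fromstring(xml_source)
    hits = []
    for div in root.iter("div"):
        if div.get("type") != div_type:
            continue
        for p in div.iter("p"):
            text = "".join(p.itertext())
            if term in text:
                hits.append(text)
    return hits


if __name__ == "__main__":
    print(full_text_hits(SAMPLE, "press"))
    print(structured_hits(SAMPLE, "press", "article"))
```

The full-text variant counts every occurrence regardless of where it appears, while the structure-aware variant can restrict a query to, say, paragraphs within articles. A linear scan like this would not scale to a large corpus, which is the reason dedicated XML-aware indexing is of interest.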