Quick and reliable access is a mandatory prerequisite for putting digital texts to work.
However, adequate interfacing and the implementation of efficient search engines are not
easily achievable by means of standard tools. Most solutions do not scale with the constantly
increasing size of corpora, particularly in the case of complexly structured and deeply
annotated text collections.
During the initial years of the AAC’s corpus build-up, corpus data were accessed using
traditional full text search tools. Although XML data have already been utilised fruitfully,
exploitation of the full potential of XML data incorporated in the corpus remains an issue
on the agenda of the consolidation phase of the project. The tools applied during the first
years were for the most part database applications with efficient full text search
capabilities that were adapted to the particular needs of our researchers.
Having established a working infrastructure for the digital texts available at this point,
the AAC is currently working on more sophisticated methods of utilising large-scale corpora,
experimenting with various database systems as well as XML-aware indexing tools in order to
establish standard procedures for accessing large XML text repositories.







