Creating a searchable website for medieval Welsh prose was an interesting exercise given that I don’t speak any Welsh, never mind the medieval sort.
Rhyddiaith Gymraeg (Welsh Prose) is a bilingual Cardiff University site containing transcriptions of medieval manuscripts, some 1.8 million words in all.
The manuscripts are encoded as XML documents that record the structure of the original text, including indications of damage, scribal additions, and so on. A searchable database is generated from the XML using XSL and PHP tools.
The site was built on a fairly conventional PHP/MySQL platform, but Welsh digraphs and collation rules created special problems. In Welsh (as in traditional Spanish ordering), ‘ch’ and ‘ll’, among others, are treated as single letters, sorting after ‘c’ and ‘l’ respectively. This means that, for instance, achas comes after acris, not before. Also, ‘c’ and ‘k’ are considered equivalent, so datkenir comes between datcan and datcud. These rules complicated the Wordlist facility in particular, which lists words beginning with a given prefix.
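To make the collation rules concrete, here is a minimal sketch in Python (not the PHP/MySQL code the site actually uses). It assumes the modern Welsh alphabet order, greedy matching of digraphs, and that letters outside that alphabet sort after it; the names `welsh_letters`, `welsh_sort_key`, and `starts_with` are hypothetical, and the prefix test is only a guess at what the Wordlist facility needs.

```python
# Modern Welsh alphabet order; digraphs count as single letters.
# (Assumption: the site's medieval texts may need a richer table.)
WELSH_ALPHABET = [
    "a", "b", "c", "ch", "d", "dd", "e", "f", "ff", "g", "ng", "h",
    "i", "j", "l", "ll", "m", "n", "o", "p", "ph", "r", "rh", "s",
    "t", "th", "u", "w", "y",
]
RANK = {letter: i for i, letter in enumerate(WELSH_ALPHABET)}
DIGRAPHS = {letter for letter in WELSH_ALPHABET if len(letter) == 2}
EQUIVALENTS = {"k": "c"}  # from the text: 'c' and 'k' are equivalent


def welsh_letters(word):
    """Split a word into Welsh letters, matching digraphs greedily."""
    word = word.lower()
    letters = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            letters.append(pair)
            i += 2
        else:
            letters.append(EQUIVALENTS.get(word[i], word[i]))
            i += 1
    return letters


def welsh_sort_key(word):
    """Sort key: the rank of each letter in the Welsh alphabet."""
    # Letters not in the table (v, q, x, z, ...) sort after it, by code point.
    return [RANK.get(l, len(WELSH_ALPHABET) + ord(l[0]))
            for l in welsh_letters(word)]


def starts_with(word, prefix):
    """True if the word begins with the prefix, compared letter by letter."""
    p = welsh_letters(prefix)
    return welsh_letters(word)[:len(p)] == p


by_welsh_order = sorted(["achas", "datcud", "acris", "datkenir", "datcan"],
                        key=welsh_sort_key)
```

Because the comparison works on whole letters, `starts_with("chwaer", "c")` is false while `starts_with("kanu", "c")` is true. Greedy matching is a simplification: it cannot tell a true ‘ng’ from an ‘n’ followed by ‘g’, which real data would need to handle separately.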
I also proposed and implemented the visual design for this project. The site can be seen at www.rhyddiaithganoloesol.caerdydd.ac.uk.