Lost in Translation

I may not officially work any more, but I do occasionally do some unpaid work. Currently this involves htmlifying the Slovenian translation of a set of web pages. Except that the translations aren’t based on the web pages but on the original, much edited and amended, documents.

This leads to all sorts of problems:

  • The English web page does not match the document
  • The Slovenian document does not therefore match the web page
  • The Slovenian document doesn’t always match the English document
  • Not all words have been translated

I end up needing 3, and in some cases 4, files open, plus the web browser, to complete the task.

The task is made harder by one character (well, one so far; there may end up being more): the c caron ( č ). The translators have used different methods to enter it in some of the Word .doc files, and my text editor doesn’t like the character and turns it into a plain c. So I replaced the character in the Word .doc, thinking it would be easier. It is, to a certain extent, until you realise that Word occasionally changes the character to cč but still treats it as a single character!
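One way to sidestep an editor that mangles č is to turn it into a numeric HTML entity before the text ever reaches the editor. The sketch below is only an illustration, not what I actually did: the function name `to_html_entities` is made up, and it assumes the text has already been extracted from Word as Unicode. It also assumes that the "different methods" may include a plain c followed by a combining caron (U+030C), which NFC normalisation folds into the single character č (U+010D).

```python
import unicodedata


def to_html_entities(text):
    """Return ASCII-only text with non-ASCII characters as numeric entities.

    Hypothetical helper for illustration only.
    """
    # Fold "c" + combining caron (U+030C) into the precomposed U+010D,
    # so both ways of typing the letter end up identical.
    composed = unicodedata.normalize("NFC", text)
    # Replace every non-ASCII character with a decimal character reference.
    return composed.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(to_html_entities("ro\u010dno"))    # precomposed c caron
    print(to_html_entities("roc\u030cno"))   # c plus combining caron
```

Both calls should give the same ASCII-only result, which any text editor can cope with and any browser will render as č.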

I have also discovered that text boxes are outside the normal flow of the document, and a search and replace doesn’t traverse these sections.
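If the files were re-saved as .docx (an assumption; mine are old .doc files, which this will not read), the text box contents could at least be listed separately for checking. A .docx is a zip archive, and as far as I understand the format, text box text sits inside `w:txbxContent` elements in `word/document.xml`. The function name `text_box_paragraphs` is hypothetical, and this is a rough sketch rather than a tested tool.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements such as w:p and w:t.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def text_box_paragraphs(docx_file):
    """Return the text of each non-empty paragraph found inside text boxes.

    docx_file may be a path or a file-like object. Hypothetical helper.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for box in root.iter(W + "txbxContent"):
        for para in box.iter(W + "p"):
            # A paragraph's text is split across one or more w:t runs.
            text = "".join(t.text or "" for t in para.iter(W + "t"))
            if text:
                paragraphs.append(text)
    return paragraphs
```

Word often stores each text box twice (a modern version and a fallback), so the same paragraph may turn up more than once in the result.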

The long slog continues…

