Sunday, 21 November 2010

On Databases

I doubt there are (m)any translators around who choose CafeTran (CT) as their first-ever CAT tool. Most people who consider CT will be "switchers." I am one of them: I have been using DéjàVu (DV) for 13 or 14 years now.

The heart of any CAT tool is its databases, and some time ago, a DV user suggested using only one set of databases for each language combination. That idea made sense. In DV, there are three kinds of databases:
MDB, the memory database, also referred to as TM (Translation Memory), and called Big Mama if you adopted the idea mentioned above.
TDB, the terminology database, or Big Papa, used for terms, but also for segments that occur frequently, like "For more information, please go to [URL]".
And the project-specific Lexicon.
Since I want to be able to use all three kinds of databases, I will discuss all of them, one by one.

Arguably the most useful database is the TDB, so I'll start with that one. My Big Papa is about 50 MB, built up over I don't know how many years. I want to use that database in CT, of course, so I will have to convert it. I have the option to export to text, several versions of MS Excel, and Catalyst. I choose tab-delimited text for three reasons: I have no idea what Catalyst is, I can easily convert tab-delimited text into Excel, and I have the feeling a text file will be a lot smaller than any of the other possibilities. Size can be crucial since CT uses RAM to access memories, and even though I have 4 GB of RAM available, a smaller memory file should be accessible faster (methinks).
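
Once the export is done, a quick sanity check of the text file can save trouble at import time. What follows is only a minimal Python sketch, not anything that DV or CT provides. It assumes the export is UTF-8 with one entry per line and tab-separated fields; the function name summarize_tab_export and the file name in the usage note are made up for illustration, and the real export may well use a different encoding.

```python
import csv
import os
from collections import Counter


def summarize_tab_export(path, encoding="utf-8"):
    """Return (size_in_bytes, row_count, Counter of fields-per-row).

    Assumption: one entry per line, fields separated by tabs.
    The encoding default is a guess; pass another one if needed.
    """
    fields_per_row = Counter()
    rows = 0
    # newline="" lets the csv module handle line endings itself.
    with open(path, encoding=encoding, newline="") as handle:
        # QUOTE_NONE: treat quotation marks as ordinary characters,
        # since term entries may contain them.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if not row:  # skip blank lines
                continue
            rows += 1
            fields_per_row[len(row)] += 1
    return os.path.getsize(path), rows, fields_per_row
```

Calling summarize_tab_export("big_papa.txt") (a hypothetical file name) gives the file size in bytes, the number of entries, and a tally of how many fields each line has. If nearly all lines have the same number of fields and a handful do not, those few are probably entries with stray tabs that deserve a look before importing.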

My Big Papa ended up as a 15 MB text file. So far, so good. The next step will be importing the TDB file into CT. See you later.