CafeTran4Mac: The Big Mama, an Approach

A Big Mama (BM) is a general memory for segments. It is the TM in which you keep all your translations for the language pair. It’s optional of course, but results from your Big Mama may surprise you.

Some people doubt the use of a Big Mama, but at the end of this entry, I will list its advantages.

As I said, the BM is where you save all your translations. You can of course add TMs from other sources, but you’ll have to make sure they are of at least the same quality (and unfortunately this isn’t always the case), and not excessively specific. If you never do chemical work, adding a very specific chemical TM to your BM is not recommended: its content will be used for Auto-Assemble (AA) and also for Auto-Complete (please see the relevant entries in the Wiki), so it will probably be counter-productive.

When you start using a BM, I take it it’s rather small. This allows you to benefit most from the TM settings. The recommended settings:

Memory Type. It’s clear we’re dealing with a Translation Memory (a memory for segments). Leave all the other options in this section disabled (though some people think it’s a good idea to enable “Processing Tags”, I have my doubts).
Priority. Set it to “Low Priority”. As a rule, all other attached TMs are more important.
Workflow Integration. With “Automatic” you will benefit optimally from Auto-Assemble. CT will search your BM in real time.
Matching Type. “Fuzzy and Hits” will offer you most results.

If you use your relatively little Big Mama at this stage, the results of automatic concordance search and AA will show up instantly when you move to the next segment.

Unfortunately, as your Big Mama grows, the delay will increase.
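If you want a number to go with that growth, you can count the translation units in the TMX file yourself. The following is only a minimal sketch outside CafeTran: it assumes your BM is a plain TMX file whose units are un-namespaced `<tu>` elements, and the function name is my own invention.

```python
import xml.etree.ElementTree as ET


def count_translation_units(tmx_path):
    """Count the <tu> elements in a TMX file, reading it incrementally."""
    count = 0
    # iterparse yields each element once its closing tag has been read,
    # so the file is processed as a stream instead of in one go.
    for _event, elem in ET.iterparse(tmx_path, events=("end",)):
        if elem.tag == "tu":  # assumption: TMX tags carry no XML namespace
            count += 1
            elem.clear()  # drop the unit's text and children once counted
    return count
```

Running it now and then (for example `count_translation_units("BigMama.tmx")`, with a hypothetical file name) gives you a rough idea of when the BM has outgrown the current settings.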

When you notice that, the first thing you should do is increase the RAM assigned to CT in Menu, Edit | Options | Memory | Java Memory Size. Do not assign more to CT than the physical memory of your computer. Increasing the RAM improves the speed of concordance search and AA without affecting the results you’ll get.

If the delays increase again, the next step would be to change the Matching Type from “Fuzzy and Hits” to “Fuzzy”. This will have some impact on your results, but I bet you can still live with it.

The step after that is to switch to the "Pretranslation" mode from the default "Automatic" in Memory Setting in Project Info and check the Translation in Review box. When you start the translation, you will see a progress bar that indicates the background pretranslation. There's no need to wait until the pretranslation is finished; just start working. However, it’s recommended to use all attached TMs for the pretranslation (together with the Big Mama), since later changes will not be included in it. This also almost requires you to use a Project TM with the highest Priority, set for use in the QA.

And when your Mama really gets out of hand, you should set the Workflow Integration to Manual: For use with the Search function only.

There are two more ways to still use an oversized BM, but they (seemingly) defy the Big Mama concept of saving all your translations in one file:

You can set your BM to Read-Only. It will use considerably less RAM, making the process faster. I don’t know if this has any consequences for the results, since in this setting, the segments are stripped of their internal XML codes. After finishing the translation, you should use your ProjectTM to update your BM. More on the ProjectTM later.

Import your BM into the external database: Menu, Edit | Options | Total Recall | CAT Tools Exchange | Load from TMX Memory… Since the table will be indexed, concordance searches are blisteringly fast, but you won’t benefit from AA unless you use the Recall functionality: Menu, Edit | Options | Total Recall | Recall Segments to Memory… This kind of reverses the process: CT will create a memory that consists of only the relevant segments. More on external databases later. After finishing the translation, you should use your ProjectTM to update your BM, and load it into the table as well to update that.

Both solutions may present problems for the elderly users who tend to forget things. I suggest running a script to remind you of it.
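Such a reminder script could look like the sketch below. It is not a CafeTran feature, just an illustration: it assumes both the ProjectTM and the BM are ordinary files on disk, and it treats a ProjectTM that was modified more recently than the BM as a sign that the update is still pending. The function names and file names are hypothetical.

```python
import os
import sys


def big_mama_is_stale(project_tm, big_mama):
    """Return True if the project TM was modified after the Big Mama."""
    return os.path.getmtime(project_tm) > os.path.getmtime(big_mama)


def reminder(project_tm, big_mama):
    """Build the message to show for the two files."""
    if big_mama_is_stale(project_tm, big_mama):
        return f"Reminder: {project_tm} is newer than {big_mama}. Update your Big Mama."
    return "Big Mama looks up to date."


if __name__ == "__main__":
    # Hypothetical usage: python remind.py ProjectTM.tmx BigMama.tmx
    if len(sys.argv) == 3:
        print(reminder(sys.argv[1], sys.argv[2]))
    else:
        print("usage: python remind.py PROJECT_TM BIG_MAMA")
```

Comparing modification times is a crude heuristic, but it needs no knowledge of the file contents. You could let your operating system's scheduler run it once a day.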

Maintaining a BM is a good idea for, among others, the following reasons:

You kind of automatically keep track of everything you ever translated.
The content is yours, so you can trust it (we hope).
You’ll be surprised how much “repetition” you will stumble upon, often from texts you did years ago.*
Auto-Complete will take suggestions from your BM.

* Earlier this year, I did a 20,000-word financial report. I’d set my BM to manual workflow, but I soon found out that a search in it resulted in a lot of hits. I tried “Insert All Exact Matches” from the Translation menu, and even though the workflow was manual, I got heaps of matches, some of them from 2006, possibly earlier. It reduced the job to some 2,000 words.

Related: TMX Files, an Approach

Saturday, 11 April 2015
