Language mystery: Fight the machine? (2)

You want to travel from A to B. You get in your car, tell it where you want to go, relax - and the car does the rest. It recognises the route, keeps to the traffic regulations, detects hazards, avoids collisions with other vehicles, uses its database of road works and congestion to choose the best route and gets you to your destination quickly and safely.

Is this vision realistic? Could a car really be programmed to "see" traffic and detect all hazards? What road markings are necessary for the car to tell the difference between road and non-road? Can vehicle programming be sophisticated enough to anticipate all possible traffic situations? There are already automatic vehicle systems such as reversing cameras to help with city parking, collision warning systems and speed control systems. Satellite navigation systems can help drivers to find their way. Can these systems be combined and refined to create all-round automatic travel?

The answer to this question is probably rather complex and full of "ifs" and "buts". Just like the answer to the question of whether computerised translation is suitable for professional use.

What for?

Machine translation (MT) can be useful for some purposes. If I come across an Internet text in a language that I do not understand at all, and if there is an automatic translation solution available for this language, a machine translation into a language that I understand may give me a general idea of the content. This is known as "gisting", i.e. my aim is to understand the "gist" of the text. Usually this process will give me a reasonable indication of the subject matter, but it is notoriously unreliable on the details, and in places I must expect the text to contain serious mistranslations. If I need a reliable translation, the only solution is to consult a competent human translator who understands the message of the text and can express it in the desired target language.

What subject matter?

It is generally agreed that MT is not suitable for literary texts. But there are many other domains and types of text for which it is completely unsuitable. I frequently translate contracts and other legal texts from German to English (and occasionally vice versa), and most of the sentences that I face are completely unsuited to machine translation. This is partly because of the potential for terminology mismatch between the two legal systems, even where specialist dictionaries suggest equivalent terms or phrases. But it is also due to the sentence structure. Legal writing is often very complicated, with intricate clause structures and multiple layers of meaning within most sentences. The typical English word order "subject-verb-object" is sometimes reflected in German, but the alternative patterns "object-verb-subject" or "object-subject-verb" are also common. And complex adjectives (which frequently occur in legal language) are handled completely differently. A competent specialist translator must first take time to grasp the structure and interconnections of the elements in the German sentence, and then spend more time working out how these elements should be transferred into meaningful English, a process which often involves trial and error and, in an age of computers, shuffling of the elements by "drag and drop".

There are pitfalls in many other subject areas, too. Topics such as investment banking, business management, accounting and many others have their own conventions in each language. Even in technical disciplines there can be mismatches in terminology and syntax which can lead to problems. I recently translated the technical specifications for the construction of a facade for a building. This text contained many terms which were not found in any specialist dictionaries and did not occur on a number of bilingual sites that I sometimes use for research on the Internet. In one or two cases, even the leading search engines had never heard of the concepts. I had to solve these problems by a multi-stage process which involved breaking these compound German terms down into their parts, investigating the meaning of the parts on their own, finding and checking other combinations of these parts, compiling a short list of possible English equivalents, then using search engines to check how plausible these equivalents were.
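
For readers who like to see such a process spelled out, the first step (breaking a compound into known parts) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mini-lexicon, the two example words and the names `LEXICON`, `LINKERS`, `split_compound` and `gloss` are invented here and are not the terms or tools from the job described above. The later steps (checking other combinations and testing candidate English equivalents) still depend on human judgement.

```python
from functools import lru_cache

# Hypothetical mini-lexicon: known German parts with rough English glosses.
LEXICON = {
    "fassade": "facade",
    "unter": "sub / under",
    "konstruktion": "construction / structure",
    "befestigung": "fixing",
    "element": "element",
}

# Linking elements that German may insert between the parts of a compound.
LINKERS = ("", "s", "n", "es", "en")


def split_compound(word):
    """Return every segmentation of `word` into parts found in LEXICON."""
    word = word.lower()

    @lru_cache(maxsize=None)
    def helper(start):
        if start == len(word):
            return [[]]  # reached the end: one valid (empty) continuation
        results = []
        for end in range(start + 1, len(word) + 1):
            part = word[start:end]
            if part not in LEXICON:
                continue
            for linker in LINKERS:
                nxt = end + len(linker)
                # A linking element may only occur between two parts,
                # never at the very end of the word.
                if linker and nxt >= len(word):
                    continue
                if word.startswith(linker, end):
                    for rest in helper(nxt):
                        results.append([part] + rest)
        return results

    return helper(0)


def gloss(parts):
    """Show each part with its gloss as a starting point for research."""
    return " + ".join(f"{p} ({LEXICON[p]})" for p in parts)


for term in ("Fassadenunterkonstruktion", "Befestigungselement"):
    for parts in split_compound(term):
        print(term, "->", gloss(parts))
```

An empty result simply means that the lexicon does not cover the word, which is exactly the situation in which the manual research begins.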

In cases like this, machine translation is out of its depth. Rule-based translation systems are liable to fail when the author of the source text bends or breaks the rules. Statistical machine translation systems, which depend on a corpus of previous material, are lost when there is no corpus.

How sophisticated?

The most sophisticated MT projects have a restricted subject area and a well-defined procedural structure. They usually deal with mass-produced technical products, especially in areas where manufacturers produce a range of products which are similar in many respects and where the documentation has various recurring patterns. This structure involves several stages. The first stage is editing the source text before it goes through the translation process, to remove factual mistakes, language mistakes and non-standard wording. Then comes the machine translation itself, which is followed by post-editing by a competent editor. This editor usually needs to understand the source language (and the specific technological discipline) in order to spot and correct any mistranslations. And the editor also needs to give feedback to the system, thus enabling the MT system to expand or correct its data and "learn" from the work of the editor.

There is still controversy about the use of such systems. Proponents point to the savings in cost and the increase in efficiency. Others speak of the risk of liability if the quality control system is unable to eradicate the inherent errors and if the "translated" documentation therefore contains mistakes which lead to damage or injury. Another controversial issue is the role of the post-editor. What qualifications does the post-editor need, what are the potential earnings and how satisfying is the work likely to be?

Just another dictionary?

Professional translators have plenty of reference works. First of all, there are shelf-loads of dictionaries. I have about 70 dictionaries in various subject areas (some bilingual, some monolingual) and a good range of background reading. I also have several bilingual dictionaries in digital form. I have developed various strategies to extend my terminology searching on the Internet, and I also use a "translation memory" software program which gives me an easy way to look up all of the work I have done over the last 11 years.

This is not in any way special - the majority of the really experienced and competent professional translators probably have similar resources. It is therefore perfectly logical to add some form of access to an MT program. Some time ago I invested in such a program ("Personal Translator" from linguatec), and I occasionally use it for a "second opinion" on individual sentences. But I do not use it very often, because the results are rarely good enough. From time to time it provides a good suggestion which I can incorporate into my work, but in most cases it is more effective to work without it. And when I do use MT, the guiding principle is the same as when I use paper-based or digital dictionaries or Internet resources: the help that I find is just a suggestion. I am the one who must judge whether it is really useful, and I am always free to adapt it to the requirements of the text that I am working on.

The title of this article is "Fight the machine?". The short answer is: No. I don't wish to fight against the machine, and I am open to using the resources provided by computer programs and the Internet. But these resources need to be used carefully and critically. They are a resource for our work, not a source of higher wisdom.

3 comments:

Victor Dewsbery, 8 October 2010 at 18:07
Just seen a news item relevant to my opening paragraph. The Technical University (TU) of Braunschweig, Germany, has a pilot study for a car that can drive automatically through the city. News at http://www.ndr.de/regional/niedersachsen/harz/leonie129.html, or http://www.tu-braunschweig.de/ and look for links to "Leonie".
Victor Dewsbery, 10 October 2010 at 09:12
Another automatic car pilot project, this time in California and with a report in English: http://www.bbc.co.uk/news/technology-11508351
So is the analogy between automatic car driving and automatic translation valid or helpful? What do you think?
Victor Dewsbery, 9 August 2011 at 17:20
Update August 2011: First automatic car crash, although Google claims that the human in the driving seat was actually driving it. In other words: don't blame our programming, it was the user's fault. See http://www.nbcbayarea.com/blogs/press-here/Google-Blames-Human-for-Robot-Car-Crash-126845683.html
So who is liable if I use Google Translate to help me with a translation, and the translation that I deliver is defective? Should I expect that the automatic control (sorry: machine translation) will make no mistakes?

Monday 27 September 2010
