Copyright © 2009. National Academy of Sciences. All rights reserved.
3
The Technical Challenges:
Approaches to Research and Assessment
Dramatic improvements in computer hardware and software have contributed
to progress in machine translation. System performance, which can be measured
in "raw LIPS," or logical inferences per second, is now doubling every one to two
years.20 Despite, or perhaps because of, these rapid advances in computer
technology, the barriers in linguistic theory and other areas have become ever
more apparent. The result is a receding horizon: as strides are made in R&D, it
is clear that much more needs to be done.
Over more than 30 years of research and development, work on machine
translation has taken three approaches to the process of translation. But there are
a variety of machine translation systems in use today and new advances in
technology have ushered in systems using intermediate language representations
and artificial intelligence that enable the computer to "learn" a language. The
discussion that follows briefly reviews the current strategies for developing
machine translation systems, key problems for research and development, and
issues in the evaluation of machine translation systems.
20 Logical inferences per second is a measure of the speed at which the system: 1) recognizes a
pattern match between an element of the input text and a previously stored pattern, and 2) applies the
rule that goes with that pattern to translate the element.
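
The two steps in this definition can be illustrated with a minimal sketch. It assumes a toy rule table (`RULES`, a hypothetical list of romanized Japanese words and their English equivalents) and counts one inference for each successful pattern match followed by the application of its rule. The function names are illustrative and do not come from any system described in this chapter.

```python
import time

# Hypothetical pattern -> translation rules; the entries are illustrative only.
RULES = {"inu": "dog", "neko": "cat", "hon": "book"}


def translate_elements(elements, rules):
    """Return (output, inference_count) for a list of input elements.

    One inference = one successful pattern match plus the application
    of the rule stored with that pattern.
    """
    output = []
    inferences = 0
    for element in elements:
        if element in rules:               # step 1: recognize a pattern match
            output.append(rules[element])  # step 2: apply the associated rule
            inferences += 1
        else:
            output.append(element)         # no stored pattern: pass through
    return output, inferences


def measure_lips(elements, rules):
    """Time one translation pass and return (output, inferences, LIPS)."""
    start = time.perf_counter()
    output, inferences = translate_elements(elements, rules)
    elapsed = time.perf_counter() - start
    lips = inferences / elapsed if elapsed > 0 else float("inf")
    return output, inferences, lips


if __name__ == "__main__":
    _, count, lips = measure_lips(["inu", "neko", "hon"] * 10000, RULES)
    print(f"{count} inferences, roughly {lips:.0f} per second")
```

The figure printed depends entirely on the machine and the rule table, so it is useful only for comparing runs of the same sketch.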
DEVELOPMENTAL STRATEGIES AND PROBLEMS
Regardless of which languages are translated, there are now three primary
translation strategies for machine translation. The direct translation method
deals only with single language pairs and translates words directly from one
language to another. Used for most of the earliest systems, this method involves
very little or no linguistic analysis and produces very rough translations.
Although this strategy is not designed to handle translations of complete
documents, it has been used for machine translation of large databases, tables of
contents, and titles of technical publications. The International Liaison Office of
MCC is using this strategy for Japanese to English translation in order to develop
databases on scientific developments in Japan for its customers who require an
overview of available materials in particular fields. It is expected that the users
will select documents for full translation by other means.
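
As a rough illustration of why direct translation suits titles and database entries but yields only rough output, the following sketch replaces each word independently, with no linguistic analysis. The word list `LEXICON` and the function `direct_translate` are hypothetical and are not taken from the MCC system.

```python
# Hypothetical romanized Japanese -> English word list for a single language pair.
LEXICON = {
    "kagaku": "science",
    "gijutsu": "technology",
    "kenkyu": "research",
    "hokoku": "report",
}


def direct_translate(words, lexicon):
    """Translate word by word; unknown words are kept and marked with <>."""
    return [lexicon.get(word, f"<{word}>") for word in words]


if __name__ == "__main__":
    title = ["kagaku", "gijutsu", "kenkyu", "hokoku"]
    print(" ".join(direct_translate(title, LEXICON)))
```

Because word order and grammar are never examined, a title made up of nouns survives this treatment far better than a full sentence would.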
The transfer system, which operates in three stages, is the most widely used
strategy for machine translation. The source language is first analyzed and
converted into representations that can be transposed into sentence structures
through semantic analysis. In the second stage, the source language
representations are converted to the target language transfer representations. The
final process synthesizes the transfer language representation into the text of the
target language. This method is the one most widely used in Japanese to English
machine translation systems, although the transfer system works best when the
language pairs are closely related. In some mainframe systems such as
experimental systems ATHENE/N developed by Hitachi and FAI by Mitsubishi,
the transfer system is enhanced by the use of case analysis.22
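
The three stages can be sketched as follows. This is a toy illustration under strong assumptions: the input is a three-word Japanese clause in subject-object-verb order with particles already removed, the roles stand in for the source representation, and `TRANSFER_LEXICON` is a hypothetical word list. It does not reproduce ATHENE/N or any other system named above.

```python
# Hypothetical bilingual lexicon used in the transfer stage.
TRANSFER_LEXICON = {"neko": "cat", "sakana": "fish", "taberu": "eats"}


def analyze(tokens):
    """Stage 1: analyze the source clause into a role-based representation.

    Assumes exactly three tokens in subject-object-verb order.
    """
    subject, obj, verb = tokens
    return {"subject": subject, "object": obj, "verb": verb}


def transfer(source_rep, lexicon):
    """Stage 2: convert the source representation to a target representation."""
    return {role: lexicon[word] for role, word in source_rep.items()}


def synthesize(target_rep):
    """Stage 3: generate English text in subject-verb-object order."""
    return f"the {target_rep['subject']} {target_rep['verb']} the {target_rep['object']}"


def translate(tokens):
    """Run the three stages in sequence."""
    return synthesize(transfer(analyze(tokens), TRANSFER_LEXICON))


if __name__ == "__main__":
    print(translate(["neko", "sakana", "taberu"]))  # the cat eats the fish
```

Keeping the stages separate means that only the middle stage depends on both languages, which is one way to read the remark that the approach works best for closely related pairs.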
The pivot method is the third translation strategy. It is based on the ideas that
language is a universal human experience and that a universal interlingua can be
developed, which can be understood by a machine. This method is designed to
convert the source language into the interlingua, which is then converted into the
target language. The interlingua today is still largely a theoretical concept,
although the Logos system makes use of an interlingua in a hybrid
interlingual/transfer architecture. Researchers working on the interlingua expect
that the application of artificial intelligence will permit significant advances to be
made.23
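
A minimal sketch of the pivot idea is given below. It assumes that an interlingua can be approximated by a set of language-neutral concept identifiers, which is a large simplification of what researchers have in mind. The tables and identifiers are hypothetical and do not describe the Logos system.

```python
# Hypothetical mappings from each language's words to language-neutral concepts.
TO_INTERLINGUA = {
    "ja": {"inu": "CONCEPT_DOG", "mizu": "CONCEPT_WATER"},
    "en": {"dog": "CONCEPT_DOG", "water": "CONCEPT_WATER"},
    "fr": {"chien": "CONCEPT_DOG", "eau": "CONCEPT_WATER"},
}

# Invert each table so that concepts can be rendered in any listed language.
FROM_INTERLINGUA = {
    lang: {concept: word for word, concept in table.items()}
    for lang, table in TO_INTERLINGUA.items()
}


def pivot_translate(words, source, target):
    """Convert source words to concepts, then concepts to target words."""
    concepts = [TO_INTERLINGUA[source][word] for word in words]
    return [FROM_INTERLINGUA[target][concept] for concept in concepts]


if __name__ == "__main__":
    print(pivot_translate(["inu", "mizu"], "ja", "en"))
    print(pivot_translate(["inu", "mizu"], "ja", "fr"))
```

In this sketch, adding a language means adding one table to and from the concept set rather than a separate module for every language pair.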
21 This overview of machine translation strategies is drawn from a paper by Wayne Kiyosaki,
"Machine Translation: Time for a Reappraisal," forthcoming.
22 "Japanese Machine Translation Systems Described," Tokyo NIKKEI ELECTRONICS in Japanese,
February 1986, pp.137-168.
23 For a detailed analysis of the strengths and weaknesses of these three strategies, see W. John
Hutchins, "Recent Developments in Machine Translation," New Directions in Machine Translation,
Conference Proceedings, Budapest, August 1988.
The systems that have been developed use various approaches; they can be
compared to tools in a tool chest. No one tool is always best but in some cases
one tool may be better than another.24
Using primarily direct translation or translation by the transfer system, there
are three possible ways in which human intervention can occur. Pre-editing can
involve two kinds of operations. In one case, a text is revised to eliminate
structural or lexical ambiguities before being translated by a computer. In the
past this approach was not widely used, due to the difficulty in anticipating
structures or words that will be difficult for a computer to handle. More recently,
with the introduction of text-critiquing software, the potential ambiguities can be
brought to the attention of a human translator automatically.
In a second approach to pre-editing, the input text is produced especially for
the machine. In some cases it is a new version of an existing text, in others an
entirely new text. Multinational Customized English is an example of a restricted
English developed by Xerox for use on its Systran system. In some cases, pre-
editing is almost as difficult as traditional translation. The efficacy of pre-editing
depends to a great extent on the human editor knowing the limitations of the
machine translation system.
In the case of interactive editing, the computer calls on the editor to make
choices among various alternatives in order to resolve ambiguities. It is also
possible to combine post-editing with interactive editing. However, this can
make the process costly. The first interactive systems were introduced in the
early 1980s and the interactive editing method seems to have gained wider
acceptance in recent years.
Of the three editing options, post-editing is clearly the most widely used.
Usually a professional translator, the post-editor corrects the machine's output.
This is more efficient when done directly on the screen using appropriate word
processing software. If the post-editor writes the corrections on hard copy and
they are then entered into the computer, the process is much slower. Some
estimate that an experienced post-editor can produce 4,000 to 8,000 words a day
and in some cases as many as 10,000.25
Since human intervention is costly, the goal of some developmental efforts is
fully automatic operation of a machine translation system. When the
application involves merely gleaning the "gist" of the text, some of the large,
general-purpose systems are used on a fully automatic basis. If a more careful
translation is needed, output can be post-edited. Such systems include
general-purpose systems that are able to handle a wide variety of source texts and
special-purpose systems designed to translate a special type of source text such as
weather reports or abstracts of technical articles in particular fields.
24 Observation by Jaime Carbonell, Carnegie Mellon University.
25 This discussion of editing options is based on work by Muriel Vasconcellos.
26 The Center for Machine Translation at Carnegie Mellon University is working to improve the
quality of machine translation output through the incorporation of knowledge bases, especially for
applications in limited domains.
In addition to full machine translation systems, there are many related
technologies that are used as translation aids. These include on-line dictionaries,
grammar checkers, and libraries of phrases that are regularly used by human
translators. These systems are updated and developed as post-editors contribute to
the on-line dictionaries and users give input to improve the way that the machine
translation system does the actual translation.
Research and technical challenges particularly relevant to Japanese to English
machine translation include the problem of inputting the text, which includes
Chinese characters as well as two phonetic scripts. Optical character readers will
help to solve the problem of text input, but there are still many difficulties
associated with input of Japanese text because of different character fonts and the
placement of charts, graphs, and tables in the text. Optical character readers are
now being coupled with machine translation systems in Japan, but the extent to
which they increase savings over manual input is not clear.
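
To make the mixed-script input problem concrete, the sketch below classifies each character of a Japanese string as kanji (Chinese characters), hiragana, or katakana. It assumes the text is already available as Unicode, which sidesteps the input and font problems described above, and it uses only the basic Unicode blocks for the three scripts.

```python
def script_of(ch):
    """Classify one character by its Unicode block (basic blocks only)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:   # CJK Unified Ideographs
        return "kanji"
    return "other"


def script_counts(text):
    """Count how many characters of each script appear in the text."""
    counts = {}
    for ch in text:
        script = script_of(ch)
        counts[script] = counts.get(script, 0) + 1
    return counts


if __name__ == "__main__":
    # "machine translation system": four kanji followed by four katakana
    print(script_counts("機械翻訳システム"))
```

Rare ideographs outside the basic block, half-width katakana, and punctuation all fall into "other" here, so a real input component would need wider coverage.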
A major research and development question relates to the problem of pre-
editing. The better the source text (the clearer the expression and the shorter the
sentences), the better the resulting machine translated text and the less post-editing
needed. But, as mentioned above, pre-editing is time consuming and tedious work
that requires special skills.
While significant advances have been made in computational linguistics, there
remain problems that must be overcome in order to build linguistic theory and
develop more sophisticated machine translation systems. This set of challenges
could be approached in a step-by-step fashion, as some Japanese experts suggest.
Research in the following areas is needed: the introduction of priority information
in order to disambiguate several possible sentence structures and words; the
development of learning mechanisms that produce preference values for the
disambiguations; the establishment of grammatical rules that consider many more
than two elements simultaneously; improved capabilities for dealing with such
problems as anaphora resolution, ellipsis, and the analysis of sentence fragments.
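
One simple way to picture a learning mechanism that produces preference values is sketched below. It assumes that preferences are learned by counting which reading a human editor chose in the past, which is only one of many possible mechanisms and is not attributed to the Japanese experts cited. The class name and the example readings of the homophone "hashi" are illustrative.

```python
from collections import defaultdict


class PreferenceModel:
    """Learn preference values for readings of ambiguous items by counting."""

    def __init__(self):
        # counts[item][reading] = number of times an editor chose this reading
        self.counts = defaultdict(lambda: defaultdict(int))

    def learn(self, item, chosen_reading):
        """Record one editor decision."""
        self.counts[item][chosen_reading] += 1

    def preference(self, item, reading):
        """Return the share of past decisions that favored this reading."""
        total = sum(self.counts[item].values())
        if total == 0:
            return 0.0
        return self.counts[item][reading] / total

    def disambiguate(self, item, candidates):
        """Pick the candidate with the highest preference value.

        Ties go to the first candidate listed.
        """
        return max(candidates, key=lambda reading: self.preference(item, reading))


if __name__ == "__main__":
    model = PreferenceModel()
    for choice in ["bridge", "bridge", "chopsticks"]:
        model.learn("hashi", choice)
    print(model.disambiguate("hashi", ["chopsticks", "bridge"]))  # bridge
```

A practical system would condition the counts on context, such as neighboring words or subject field, rather than on the ambiguous item alone.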
Although machine translation strategies and system types are more or less
universal, the ways in which the researchers in the United States and Japan
approach these subjects are quite different. As one observer put it: Americans
write papers; Japanese build machines.27 The Japanese approach has been more
27 This distinction should be considered carefully. Some question the notion that Japanese
researchers are not theory-oriented: one leading Japanese researcher believes that (instead) they focus
on second and third approximations required for machine processing of natural language, less
beautiful and less academic but useful theory. On the other hand, one critic of Japan's machine
translation says that the machines that the Japanese build do not really do the job and (therefore) their
approach is not practical.
pragmatic and oriented toward experimenting with systems. This involves a
problem-solving approach to linguistic analysis. The parts of the language that do
not fit neatly into linguistic theory models are approached by combining different
theories or by accumulating individual facts to deal with specific problem areas.
In contrast, the U.S. research community has concentrated more on machine
translation theory than on applications. Much of the researcher's time is devoted
to writing papers and developing new models of natural language. As a result,
critics argue that U.S. researchers construct models that are elegant but not
amenable to practical use. At the same time, we should remember that the strides
that have been made in basic computational linguistics, a research approach
recommended by the ALPAC report, make today's machine translation systems
possible.
The theoretical work that has been done in the United States and other
countries, including Japan, has made machine translation developers and users
aware of the research challenges that are present. These include the need for a
bilingual text corpus and the development of automatic comparison algorithms
for this corpus. The automatic collection of special terminology words and
construction of a thesaurus of these terms would improve many machine
translation systems. Standardizing dictionary theory and practice, proper analysis
of broken utterances, improved grammar checking devices, and automated
approaches to the detection and resolution of ambiguities are other important
research themes. Even linguistic and cognitive studies of pre- and post-editors'
behavior have been suggested as avenues to improved machine translation.
All of this requires an increase in the number of researchers working on
machine translation as well as more basic research themes. Some practical steps
might be taken to make experimental tools for natural language processing and
machine translation easily available to researchers. These might include the
construction of a portable software package for natural language processing, and
its distribution to interested researchers; establishment of core grammars for
English and Japanese that are linguistically sound, and their distribution to
interested researchers; the construction of a text database that includes bilingual
text data for use in natural language processing.
In the United States, where the thrust of research has been in more theoretical
areas, there is a need to improve interactions with those who take an
"engineering" and applications-oriented approach if commercialization of useful
systems is the objective. As noted above, interaction with users is essential to
system development. These and other questions central to R&D policy in the
United States will be explored more fully in Section 4.
EVALUATING MACHINE TRANSLATION SYSTEMS
Corporations involved in development, researchers working on fundamental
technologies, potential users, and government policymakers all need to know
how good machine translation systems are in order to make choices.
Unfortunately, there is no generally accepted method for evaluating the quality
and accuracy of translations by people, or by machines.
Japanese developers of machine translation systems often say that the systems
are 80% acceptable. This general score is, however, more an intuitive judgment
than the result of systematic research. It was pointed out that if 20% of the
cookies in the cookie jar are poisoned, no one will want to eat any of them.
Overall assessments of machine translation are less useful than evaluations of
specific systems because the evaluation depends very much on the needs of a
specific user. Japanese developers note that in some cases a reasonably accurate
or even a rough translation may be appropriate, while in other cases where high
levels of accuracy are essential, machine translation is unacceptable.
A researcher who needs to comb through a vast mountain of information may find
rough translations of abstracts very useful in tracking overall trends in research or
in selecting articles for full translation. Nothing less than absolute accuracy in
translation will satisfy a lawyer working on a legal brief or a politician whose
words are quoted by the media. The machine translation systems now in
operation, particularly the prototype Japanese to English systems, have been
developed to translate technical documents, manuals, and information in
restricted domains.
Participants in the Japanese machine translation project supported by the
Science and Technology Agency of Japan developed an approach to evaluation
using two independent indicators: intelligibility (the extent to which the
translation can be understood by a native speaker of the target language) and
accuracy (the degree to which the translated text conveys the meaning of the
original).28 Samples of machine translated sentences were evaluated by the
researchers as roughly 80% acceptable. This overall evaluation was based on the
result that 80% of the sentences were given a score of at least 3 in intelligibility
and accuracy.29 It is estimated that 20 to 30% of the output sentences in Japanese
to English machine translation systems are unacceptable, and in those cases
post-editing cannot be carried out effectively.
28 See Makoto Nagao, Junichi Tsuji, and Junichi Nakamura, "Machine Translation from Japanese into
English," Proceedings of the IEEE, vol. 74, July 1986.
29 A score of 3 in intelligibility was given to sentences whose meaning was clear, but where the
evaluator was not sure of some word and grammar usage. A score of 3 in accuracy was given to
sentences where the content of the input sentence was generally conveyed in the output sentence, but
where there were problems with tense, voice, etc. Ibid., p. 1006.
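
The acceptability figure described above can be computed as in the following sketch. It assumes, as the phrase "at least 3" suggests, that higher scores are better and that a sentence must reach the threshold on both indicators. The cited paper should be consulted for the actual scales. The sample scores are invented for illustration.

```python
def acceptable_fraction(scores, threshold=3):
    """Return the share of sentences scoring at least `threshold` on both
    intelligibility and accuracy.

    `scores` is a list of (intelligibility, accuracy) pairs, one per sentence.
    """
    if not scores:
        raise ValueError("no scored sentences")
    acceptable = sum(
        1
        for intelligibility, accuracy in scores
        if intelligibility >= threshold and accuracy >= threshold
    )
    return acceptable / len(scores)


if __name__ == "__main__":
    # Hypothetical scores for ten sentences; eight pass on both indicators.
    sample = [(4, 4), (3, 5), (5, 3), (3, 3), (4, 3),
              (3, 4), (5, 5), (4, 5), (2, 4), (4, 1)]
    print(f"{acceptable_fraction(sample):.0%} acceptable")  # 80% acceptable
```

A single percentage of this kind hides how the failures are distributed, which is why the two indicators are reported separately in the original evaluation.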
While no commercially available system can do it, some Japanese to English
systems now in use by researchers in Japan reportedly can identify inaccurate
text. Leaders in Japanese to English machine translation research, however, note
that no accurate data are available to judge particular systems and that the
assessments of accuracy and intelligibility are not based on rigorous testing.30
Nor are there unambiguous cost evaluations of machine translation systems,
although developers contend that the time taken and cost are generally less than
with pure human translation. Here, again, the conclusions drawn about the
relative cost of machine translation depend on the type of text and the purpose of
the user. According to Japanese expert reports, the best Japanese machine
translation systems are cost effective. In one example, a page of text can be
translated in 40 minutes when post-editing is done on hard copy, while human
translation requires about 43 minutes per page. The charge for machine
translation is about 75% the amount for human translation in this particular
instance.31
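
The arithmetic behind this example is small enough to check directly. The sketch below uses only the three figures quoted in the text. The helper `relative_saving` is a hypothetical name.

```python
MT_MINUTES_PER_PAGE = 40     # machine translation with post-editing on hard copy
HUMAN_MINUTES_PER_PAGE = 43  # pure human translation
MT_CHARGE_RATIO = 0.75       # machine charge as a fraction of the human charge


def relative_saving(machine, human):
    """Return the saving of the machine figure relative to the human figure."""
    return (human - machine) / human


time_saving = relative_saving(MT_MINUTES_PER_PAGE, HUMAN_MINUTES_PER_PAGE)
charge_saving = relative_saving(MT_CHARGE_RATIO, 1.0)

if __name__ == "__main__":
    print(f"time saved per page: {time_saving:.1%}")
    print(f"charge saved per page: {charge_saving:.1%}")
```

In this example the time saving is about 7 percent of the human figure while the charge is 25 percent lower, so the cost advantage reported here is larger than the time advantage.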
The more a user uses a machine translation system, the more efficient the work.
It takes at least one year and usually two years for a user to become really familiar
with a system and for cost efficiencies to become apparent. (See Figures 4 and 5.)
It appears that a significant volume of text must be translated in order to achieve
such "learning curve" benefits. The more carefully selected the text (with short
sentences and well tuned content consistent with the parameters of the system), the
more apparent the cost efficiencies over time. (See Figures 6 and 7.)
Unfortunately, evaluations of machine translation systems currently depend on
subjective judgments as to what constitutes acceptable levels of cost and accuracy.
In many respects, beauty is in the eye of the beholder. What may be unacceptable
text to one user may be usable to another. A major obstacle to the development of
machine translation systems is the reluctance of some involved in development to
provide detailed information about performance characteristics and to exchange
information about their experiences. Developers anxious to convince potential
funders of research and users of the systems have oversold their systems, resulting
in frustration. Potential users are well advised to conduct systematic comparisons of
system performance on sample texts of their own selection that are typical of the
application envisaged. In order to facilitate research and development, it will be
30 It should be noted that the ratings are carried out by the developers and reflect evaluations of
carefully "tuned" texts appropriate to the system.
31 See Japan Electronic Industry Development Association, A Japanese View of Machine
Translation. . ., op. cit., p. 12. This utilization example involves machine translation of a technical
text "tuned" to the system. See also Appendix 9 of the report, Examples of Machine Translation Use
in Japan. One participant in the symposium reports that better results for machine translation as
compared to human translation from Japanese to English were recently reported at a conference in
Munich.
FIGURE 4 Developer's effort to improve (vertical axis: 0% to 100%; horizontal axis: start,
0.5 year, 1.5 year, 2.5 year, 3.5 year). SOURCE: Data collected by a major Japanese firm
involved in machine translation development.
FIGURE 5 User's effort to improve (horizontal axis: 1986 to 1988; plotted series: translation
rate, shown against human translation; labeled values of 61%, 76%, 80%, and 85%; annotation:
"Daily operation began here"). SOURCE: Data collected by a major Japanese firm involved
in machine translation development.
FIGURE 6 Length of sentences in text (vertical axis: 0% to 100%; horizontal axis: number of
characters in sentence, in bins 1-20, 21-40, 41-60, 61-80, 81-100, >100; series: original text
and pre-edited text). SOURCE: Data collected by a major Japanese firm
involved in machine translation development.
FIGURE 7 Accuracy of translation (by length of sentence) (vertical axis: 0% to 100%;
horizontal axis: number of characters in sentence, in bins 1-20, 21-40, 41-60, 61-80, 81-100,
>100; series: original text and pre-edited text). SOURCE: Data collected by a major
Japanese firm involved in machine translation development.
necessary to improve techniques for evaluating system performance and timely
exchange of information about new developments.32
32 One participant in the symposium expressed doubt, based on experience of the past 20 years, that
reliable methodologies for evaluating machine translation systems can be developed. A comparison
of parsers under controlled conditions was suggested as a possibility.