PROBLEMS AND ISSUES IN MACHINE TRANSLATION: THE CASE
ŠIAULIAI UNIVERSITY
FACULTY OF HUMANITIES
DEPARTMENT OF ENGLISH PHILOLOGY

PROBLEMS AND ISSUES IN MACHINE TRANSLATION: THE CASE OF TRANSLATION FROM ENGLISH TO LITHUANIAN

BACHELOR THESIS

Research adviser: Assist. Lolita Petrulionė
Student: Viktorija Stalmačenkaitė

Šiauliai, 2013

CONTENTS

INTRODUCTION
1. AN OVERVIEW ON MACHINE TRANSLATION
1.1. General definitions
1.2. Machine translation process
2. PROBLEMS OCCURRING IN MT
2.1. Linguistic mistakes
2.1.1. Grammatical mistakes
2.1.2. Lexical mistakes
2.2. Systemic mistakes
3. TRANSLATING TEXTS OF DIFFERENT REGISTERS
4. THE ANALYSIS OF TEXTS OF DIFFERENT REGISTERS
4.1. The methodology of the research
4.2. Analysis of technical set of instructions
4.3. Analysis of popular non-fiction text
4.4. Analysis of belles-lettres style
4.5. Analysis of newspaper article
4.6. Analysis of the official document
4.7. Statistical analysis of data
CONCLUSIONS
REFERENCES
WEBSITES
DICTIONARIES
SOURCES
ANNEX 1

INTRODUCTION

It can be stated that the need to translate between languages arose together with the first civilizations. According to the Book of Genesis, “the whole earth had one language and the same words” (Genesis 11:1). Everything changed when people decided to build an enormous tower that would reach heaven. Deffinbaugh (2004) asserts that God then realized that, if humans succeeded, it would lead to “arrogant self-confidence and independence of God,” so he decided to thwart people’s plans: “Come, let us go down, and confuse their language there, so that they will not understand one another’s speech” (Genesis 11:7).
From this point, people became bewildered by the variety of languages. In the course of time, humans realized that the ability to understand and use other languages is crucial in society, which led to the emergence of translation. Shapa (2009) distinguishes four formal types of translation:
1. Oral translation (interpreting).
2. Written translation.
3. Computer-assisted translation.
4. Machine translation.

Each type requires solid knowledge on the part of the person rendering the text. To save the time and energy of human translators, the idea of machines that could translate large amounts of text has been implemented. However, the products of a human translator and of a machine can differ greatly. For this reason, machine translation has been investigated more and more thoroughly in translation studies.

The aim of the current paper is to discuss problems and issues of machine translation in texts of various genres. The following objectives have been set to reach the aim:
1. To analyze scholarly literature on the notion, process, problems and issues of machine translation.
2. To briefly present the notion of different text genres in the English language.
3. To perform machine translation and compare the output with the texts rendered by a human translator.
4. To highlight the most crucial mistakes in each text and evaluate the quality of the texts.

The methods employed in this study are as follows:
1. The method of meta-analysis helped to review the conclusions made by other authors about the problems and issues of machine translation and the theory related to different genres in the English language.
2. The sampling method was used to select and classify the examples of mistakes found in the output of machine translation.
3. The contrastive method made it possible to investigate texts in different languages in order to highlight the differences between machine translation and human translation and the issues of the former.
4. The statistical method helped to systematize and generalize the empirical data of the present research.

The scope of the paper is 49 examples selected from texts of different genres. Each example consists of 3 segments: the source text segment and two target segments. The total number of words under analysis is 1,577. The sources were mainly extracted from the Internet; one work of fiction and the instruction manual of an alarm clock were also used. A more detailed description of all sources and of the process of analysis is given in chapter 4.1.

As regards the practical value of the research paper, very little research has been done on machine translation from English to Lithuanian. Therefore, other students carrying out research on this phenomenon will be able to use the data collected in this paper.

The current bachelor thesis consists of the following parts: the introduction, the theoretical part, the practical part (which includes the methodology of the research and the statistical analysis), the conclusions, and a list of references and sources. The introduction presents the aim of the study as well as the objectives, the scope, the materials, the methods used in this work, and the practical value of the present paper. The theoretical part covers the theory of machine translation, i.e. the general definitions, the process, and the findings on machine translation already discussed by other researchers. It also covers the theory related to different genres of the English language. The practical part consists of the collected examples and their explanations, as well as the methodology of the study and the statistical analysis.

1. AN OVERVIEW ON MACHINE TRANSLATION

This chapter briefly discusses the concept of machine translation, its process and approaches. It also presents findings concerning problems and issues of machine translation made by other researchers.

1.1. General definitions

Daudaravičius (2006:7) suggests that the terms automatic, computer and machine translation are often used without drawing any distinction between them and that the choice of term is usually determined by the context. However, Hutchins and Somers (1992:3) state that “Machine Translation is traditional and standard name for computerised systems responsible for the production of translations from one natural language into another, with or without human assistance.” According to the authors, such terms as automatic translation and mechanical translation are extremely rare in English. What is more, all authors agree that the definitions mentioned above do not cover computer-based translation tools, i.e. tools that support translators by giving them access to various online dictionaries or terminology databases (Hutchins and Somers 1992:3, Daudaravičius 2006:7). In machine translation proper, the target text (TT) is rendered by employing pre-established algorithms and logical rules (Daudaravičius 2006:7).

Thus we can conclude that it would be incorrect to state that machine translation is a fully automatic process, because human intervention is necessary to some extent, e.g. to install the rules followed by the machine. One more fact supporting this idea is that it is still practically impossible to obtain a high quality machine translation product. Therefore, the outputs of almost every machine system are post-edited by humans (Hutchins and Somers 1992:3). However, the idea of machine translation is that of a fully automatic translation process. Therefore, it should not be confused with Machine-Aided Human Translation (MAHT) and Human-Aided Machine Translation (HAMT), the boundaries between which are very often uncertain; the term Computer-Aided Translation (CAT) can cover both (Ibid.). Throughout the present paper the term machine translation (MT) will be used to present the material relevant to the process of applying a machine system to render selected texts.
As mentioned previously, MT denotes any computerized process of translating texts, but the systems themselves can be of different kinds. Hutchins and Somers (1992:4) point out two types of MT systems: 1) bilingual – a system designed for only one particular pair of languages and 2) multilingual – a system designed for more than two languages. The latter may be either uni-directional or bi-directional (Robin 2009). However, in most cases they tend to be bi-directional, which means they can translate texts from any provided language to any other given language and vice versa (Ibid.).

The definitions given in this chapter describe MT systems in general, but it is important to distinguish which forms are most often used by society. This aspect has been discussed by Manion (2009:6), who distinguishes the following types:
1) MT for dissemination – this form is usually employed by corporations which publish large amounts of documentation, and the TT must be of publishable quality. The texts are edited by human translators afterwards, which helps them to save time;
2) MT for assimilation – most often used on the Internet, for it provides information in real time, but the TT will not always be intelligible. The aim of this form is to give a general meaning of the source text (ST);
3) MT for communication – this form is very similar to the previous one, because it also has to perform in real time and is also found on the Internet. However, the difference is that MT for communication deals with the language used in conversations, so it is more appropriate for emails and chat rooms (Ibid.).

The system employed in this research is of the second type, for it provides the information in real time, but the texts are not always comprehensible.
The other types do not suit here: the system is not able to cope with large amounts of text, which is usually the case with the texts corporations translate, and the third type is not suitable because chat rooms usually employ many colloquial words, abbreviations, etc., which are also a big obstacle for the current MT system.

1.2. Machine translation process

Even though the translation process consists of many tasks, two main actions can be pointed out, i.e. analysing the meaning of the ST and transferring the encoded meaning into the TT. However, the MT process is somewhat more complex and mostly hinges on the different approaches used in particular systems. The classical MT structure is presented by Jurafsky and Martin (2006:10). The authors present 3 approaches usually applied in MT and briefly describe them.

Firstly, they discuss direct translation and say that in this approach “we proceed word-by-word through the source language text, translating each word as we go” (Ibid.). Extensive bilingual dictionaries are employed in this process, where each entry works like a small programme responsible for translating one word (Ibid.). Next, the transfer approach is described. Jurafsky and Martin (2006:10) suggest that in this case we first perform a grammatical analysis of the ST and then restructure the input into the target language (TL) parse structure by applying various rules. The next step, according to the authors, is building the TL sentence from the grammatical structure. Finally, Jurafsky and Martin (2006:10) point out the so-called interlingua approach. The essence of this approach is that we analyze the ST and convert it into an abstract representation called an interlingua. The following stage is to create the TT “from this interlingual representation” (Ibid.). These approaches and processes are illustrated in the well-known Vauquois triangle presented in Figure 1 below.

Figure 1. The Vauquois triangle adopted from Jurafsky and Martin (2006:11)

The triangle illustrates the knowledge required in the different types of analysis. It is evident that the direct approach uses the least translation knowledge, because the text is rendered word by word as we go along. As we move up the triangle, the process of translation gets more and more complex, because a deeper analysis of the ST and greater efforts to generate the meaning in the TT are required. The ideal of the rendering process is interlingua, i.e. “a scheme capable of representing all meaning expressible in any language in language-independent form” (Gerber 2009).

A slightly different and less complex scheme of the MT process is presented by Robin (2010) (see Figure 2).

Figure 2. A typical MT process adopted from Robin (2010)

Comparing it with Figure 1, we can state that Figure 2 shows the interlingua approach, for it involves various types of analysis which help to transfer the meaning of the ST into the TT as closely as possible. To achieve a decent product, deformatting and reformatting are used. Deformatting identifies the portions of the ST which do not require translation, while reformatting puts those non-translated portions back into the TT (Robin 2010). Pre-editing means segmenting long sentences into short ones, fixing punctuation and separating portions which are untranslatable. Meanwhile, post-editing fixes the TT so that it is up to the mark (Ibid.).

Both figures support the idea that MT is an elaborate process if one seeks to produce a satisfactory translation product. The machine has to perform a number of tasks to analyse the ST properly and to generate a TT whose meaning is as close as possible to that of the source text.

2. PROBLEMS OCCURRING IN MT

Many authors have discussed and classified the problems one faces while using MT systems.
For instance, Hutchins and Somers (1992:81-96) distinguish 5 categories: 1) morphology problems; 2) lexical ambiguity; 3) structural ambiguity; 4) anaphora[1] resolution and 5) quantifier scope ambiguity. However, only 3 reasons why MT is an elaborate process are discussed by Arnold et al. (1994:105). These problems are as follows: 1) ambiguity; 2) structural and lexical differences between languages, and 3) multiword units[2]. Riedel and Schwarze (2001), cited by Petkevičiūtė and Tamulynas (2011:38), provide one more classification of translation issues. The authors divide rendering problems into 8 groups: 1) polysemy[3]; 2) homonymy[4]; 3) syntactic ambiguity; 4) referential ambiguity; 5) indefinite errors[5]; 6) synonyms; 7) metaphors and symbols, and 8) neologisms.

It is evident that all authors highlight more or less the same problems; only the depth of the analysis of these problems differs. Petkevičiūtė and Tamulynas (2011:39) conclude that the most adequate classification is given by Hutchins and Somers (1992:81-96), for it is thorough and covers all spheres of translation issues.

[1] Anaphora – use of a grammatical substitute (as a pronoun or a pro-verb) to refer to the denotation of a preceding word or group of words. Merriam-Webster Online. [Online] Available from: http://www.merriamwebster.com/dictionary/anaphora [Accessed on 10th October 2012].
[2] Authors mean idioms and collocations.
[3] Polysemy – having multiple meanings. Merriam-Webster Online. [Online] Available from: http://www.merriam-webster.com/dictionary/polysemous [Accessed on 10th October 2012].
[4] Monosemy – the property of having only one meaning. Oxford Dictionaries [Online] Available from: http://oxforddictionaries.com/definition/english/monosemy [Accessed on 10th October 2012].
[5] By indefinite errors it is meant terms, sayings, and unclear words.

The present research will be based on the classification proposed by Petkevičiūtė and Tamulynas (2011:39).
They distinguish 2 types of possible MT mistakes: linguistic and systemic. Linguistic mistakes are further subdivided into morphological (grammatical) and lexical issues, whereas systemic problems do not have any subgroups (Ibid.). Morphological, lexical, and systemic problems are discussed in more detail in the following sections.

2.1. Linguistic mistakes

This section presents the classification proposed by two Lithuanian authors, Petkevičiūtė and Tamulynas (2011:39). The classification includes the most common mistakes in the MT process when translating English texts into Lithuanian. Below we discuss the grammatical and lexical mistakes marked out by these authors.

2.1.1. Grammatical mistakes

Petkevičiūtė and Tamulynas (2011:39-40) point out 7 main grammatical mistakes evident in the process of MT: grammatical case; verb forms; number; verb conjugation; gender; parts of speech; negative verbs.

According to the authors, mistakes related to grammatical case are among the most common. An MT system usually translates words in the nominative case. This issue is especially noticeable in complex sentences where the system has to translate words which are linked together but separated by other words. In such cases the MT system is usually able to identify the case of the first word correctly, but the second word is rendered in the nominative or some other random case (Petkevičiūtė and Tamulynas 2011:39). This problem occurs due to differences between the case systems of Lithuanian and English. In Lithuanian, cases are marked by adding different inflections to the word, whereas in English word order is the main tool for determining grammatical case (Valeika and Buitkienė 2003:49). If an MT system is not equipped to handle these peculiarities of both languages, it will most likely run into problems in translating the ST.
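The contrast between inflection and word order can be illustrated with a minimal Python sketch. It is only a toy under stated assumptions: the three-word lexicon, the function name `translate_svo` and the `use_word_order` flag are hypothetical and do not describe how any real MT system is built. The sketch shows that an English object is recognised by its position after the verb, while the Lithuanian output has to carry an accusative ending; a system that ignores position falls back to the nominative, which is the kind of mistake described above.

```python
# Hypothetical toy lexicon: English word -> Lithuanian forms by case.
NOUNS = {
    "Tom": {"nom": "Tomas", "acc": "Tomą"},
    "book": {"nom": "knyga", "acc": "knygą"},
}
VERBS = {"reads": "skaito"}


def translate_svo(subject, verb, obj, use_word_order=True):
    """Translate an English subject-verb-object triple into Lithuanian.

    With use_word_order=True the word after the verb is treated as the
    object and inflected for the accusative. With use_word_order=False
    the object is left in the nominative, imitating the fallback
    behaviour discussed in the text.
    """
    obj_case = "acc" if use_word_order else "nom"
    return " ".join([
        NOUNS[subject]["nom"],   # subject: nominative
        VERBS[verb],             # verb: third person present
        NOUNS[obj][obj_case],    # object: accusative or nominative fallback
    ])


print(translate_svo("Tom", "reads", "book"))
print(translate_svo("Tom", "reads", "book", use_word_order=False))
```

The first call yields the grammatical sentence with the accusative object, the second one the ungrammatical variant with the object left in the nominative.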
Another common mistake in MT concerns verb forms. Usually an MT system uses the infinitive or the third-person form of the present tense. These forms are selected because the system cannot use any other text comprehension knowledge, i.e. semantics and pragmatics, and this makes it difficult to assess which form is more appropriate (Petkevičiūtė and Tamulynas 2011:39). What is more, in some cases the preceding words and the grammatical structures do not indicate which forms should be employed (Ibid.). The authors state that this problem is met quite often when an MT system has to translate derivative verb forms.

Further, Petkevičiūtė and Tamulynas (2011:39-40) discuss issues related to grammatical number. The authors indicate that the incorrect usage of number is usually determined by the preceding pronoun, e.g. the English pronoun you can mean either tu or jūs in Lithuanian. Since, as mentioned in the previous paragraph, MT systems are unable to use text comprehension knowledge, it is hard for them to identify which translation is more appropriate. Consequently, if the pronoun is rendered incorrectly, the words following the pronoun will be translated incorrectly as well. The authors also note that in some cases the wrong grammatical number is used even though the pronoun is translated properly (Petkevičiūtė and Tamulynas 2011:40). However, this mistake could be attributed to the group of systemic mistakes, because it is hard to explain why it happens.

The fourth mistake presented by Petkevičiūtė and Tamulynas (2011:40) is verb conjugation. This mistake occurs for reasons similar to those mentioned in the preceding paragraph. The authors point out that conjugations also depend on the pronouns which precede them. Therefore, the incorrect translation of the pronoun leads to a mismatch in the verb conjugation (Ibid.).

The next important issue is the usage of gender.
Petkevičiūtė and Tamulynas (2011:40) mark out that “if there are no clear attributes of gender (like pronouns she, he) MT system renders word in masculine gender.” This is because the masculine gender is the unmarked member in the English language. The same can be said about the Lithuanian language. However, the difference is that in English only words denoting persons (and some animals) are marked as masculine or feminine, and words denoting non-persons have neuter gender. The Lithuanian language, on the other hand, marks both groups of words, persons and objects, as masculine or feminine (Valeika and Buitkienė 2003:54). These differences can cause agreement errors in the TT if the MT system is not programmed to identify those non-person words and to make their gender agree with the remaining text.

One more common and grave mistake is the usage of the incorrect part of speech. It is a severe error, because an improper part of speech can completely change the meaning of the ST sentence. We encounter this problem because some words in the English language can be a noun, an adjective, an adverb, or a verb, which is why the semantic information of the sentence is fundamental here (Petkevičiūtė and Tamulynas 2011:40). Usually, if the word is preceded by the particle to, it is taken to be a verb; if the particle is absent, we assume the word to be a noun or an adjective. However, MT systems do not always follow these rules (Ibid.).

The last grammatical mistake presented by Petkevičiūtė and Tamulynas (2011:40) is the translation of negative verbs, i.e. sentences where a verb is preceded by a negative adverb (mostly never) and both words must be translated in a negative form in the TT, e.g. Tom never asks Ann’s permission must be translated Tomas niekada neprašo Anos leidimo. Even though this mistake is quite rare in the MT process, the authors highlight that it may occur when one or more extra words intervene between the negative adverb and the verb.
In such cases these two words are not linked together and are rendered separately (Ibid.).

Summing up the topic of grammatical mistakes in the MT process, it is evident that grammar is essential to MT. Grammatical rules embedded in an MT system help to achieve a high quality TT. Systems should be programmed to deal with each grammatical aspect separately in order to render the text as precisely as possible. However, due to certain shortcomings, such as the inability to use text comprehension knowledge, some rules are skipped, which eventually lowers the quality of the TT.

2.1.2. Lexical mistakes

Lexis in the MT process is as important as grammar, which is why it is important to discuss the lexical mistakes occurring in the translation process. Petkevičiūtė and Tamulynas (2011:40) classify lexical mistakes in the following way: sayings; polysemy; words which were not translated; abbreviations; pronouns; phraseological units/collocations; hyphenated words; proper names.

Firstly, one of the most severe and common mistakes is related to polysemantic words. The Lithuanian language, just like any other language in the world, has many words which are polysemantic. Polysemy is especially evident in the English language, because it has many polysemous words whose meaning can only be deciphered from the co-text, e.g. fast may be an adjective, a verb, an adverb, or a noun. As a result, an MT system does not always choose the right meaning for the context, which causes a change or a total loss of the meaning of the sentence. MT systems usually use large corpora to check how often a word is used and in which contexts it usually appears (Petkevičiūtė and Tamulynas 2011:40). However, we may assume that if a word is placed in an unusual text, the quality of translation will decrease.
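The corpus-based choice described above can be sketched in a few lines of Python. This is a hedged illustration only: the co-occurrence counts are invented for the example, the function name `choose_translation` is hypothetical, and real systems rely on far larger corpora and more elaborate models. The sketch scores each Lithuanian candidate for the English word fast by how often it has been seen next to the surrounding words and falls back to the first-listed (primary) sense when the context is unfamiliar.

```python
from collections import Counter

# Hypothetical co-occurrence counts for two Lithuanian renderings of "fast":
# "greitas" (adjective, quick) and "pasninkas" (noun, a period of fasting).
COOCCURRENCE = {
    "greitas": Counter({"car": 40, "train": 25, "runner": 10}),
    "pasninkas": Counter({"religious": 30, "lent": 20, "food": 15}),
}


def choose_translation(context_words, candidates=COOCCURRENCE):
    """Pick the candidate whose known contexts overlap most with the input."""
    scores = {
        candidate: sum(counts[word] for word in context_words)
        for candidate, counts in candidates.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        # Unfamiliar context: fall back to the first-listed (primary) sense.
        return next(iter(candidates))
    return best


print(choose_translation(["the", "car", "is"]))
print(choose_translation(["a", "religious", "period"]))
print(choose_translation(["quantum", "lattice"]))
```

The last call shows the weakness noted in the text: when none of the context words is known from the corpus, the sketch can only guess the primary sense, whether or not it fits.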
Cvilikaitė (2008:30) also notes that this problem may occur if the system was programmed to translate only general texts and specialised corpora were not included in the system. The author also indicates that many mistakes occur while dealing with polysemous words which end in –ed and –ing (Ibid.). Cvilikaitė (2008:31) marks out that a human translator usually changes the part of speech to avoid word-by-word translation. However, an MT system is not able to perform such a transformation and translates the word by its primary meaning (Cvilikaitė 2008:32). The issues discussed above confirm that it is important to install high quality dictionaries and corpora in MT systems in order to achieve the best results in the translation process.

Further, we face the problem of translating pronouns from English into Lithuanian. As mentioned in the previous section, the biggest mistake occurs when the system has to translate the pronoun you, because it can be translated either as jūs or as tu. Petkevičiūtė and Tamulynas (2011:40) notice that almost every time the machine translates this pronoun as jūs, even though the second meaning, tu, is often preferred. This mistake appears for the reason mentioned before, i.e. MT systems are not able to use text comprehension knowledge and choose the right form of the word. The authors also note that in many cases an incorrectly translated pronoun has a wrong grammatical case as well (Ibid.).

One more big issue is the translation of proper names/nouns. This problem is discussed by Cvilikaitė (2008:31-32) and by Petkevičiūtė and Tamulynas (2011:41). All authors agree that systems usually translate proper names if they coincide with nouns included in the systems’ dictionaries. We can state that it is important to create a rule which would recognize proper nouns and leave them in their original form.

Another grave mistake that is essential to translation quality is the rendering of cultural realia[6].
Nida (1964:30) suggests that poor knowledge of the source culture can cause more problems than differences in language structure. Petkevičiūtė and Tamulynas (2011) do not address this problem, but many other authors have discussed it thoroughly. Karamanian (2002) notes that “translators must be both bilingual and bicultural, if not indeed multicultural”; therefore, the idea of Robinson (2003:186) that words so familiar and usual in one language can be completely untranslatable in another language sounds indisputable. The idea of Karamanian (2002) is supported by Petrulionė (2012:44), who also states that cultural realia “require from translators both linguistic and cultural competence” so that the TT does not lose its value. Here we face a major problem, because it would be very hard to install all the peculiarities of all cultures and languages into the machine, so rendering realia is a huge challenge for MT systems. Cvilikaitė (2008:34) writes that systems translate realia only if the concept is included in the dictionary; if not, MT systems can take the following steps: 1) leave the word in its original form; 2) omit the word; 3) perform word-by-word translation; 4) use the explanation of the concept given in the dictionary. In rare cases some of these options can meet the requirements of the TT, but often they cause problems. Usually, if a human translator leaves a word in its original form, he includes footnotes to explain the concept, which an MT system is unable to do (Cvilikaitė 2008:34). Omission and word-by-word translation can cause a gap in information or great confusion which prevents the reader from grasping the idea of the sentence or the text. Finally, a wrong explanation can be included in the dictionary, and its usage will also make the TT confusing (Ibid.).

[6] Cultural realia (sometimes referred to as lacuna or non-equivalence) – indicates the absence of a word in one language from the point of view of another language, when in the reality, portrayed by those two languages, this particular token or phenomenon exists (Gudavičius 2007:93).

Finally, it must be noted that Petkevičiūtė and Tamulynas (2011:40) point out that a number of lexical mistakes occur for the same reason, i.e. certain words or collocations are absent from the system’s dictionary. This problem is most evident when translating sayings, abbreviations, phraseological units/collocations, and hyphenated nouns (Ibid.). Therefore, we could state that the improvement and constant updating of the dictionaries installed in MT systems would considerably improve the outputs of MT.

Concluding this topic, we can note that it is fundamental to update the dictionaries and corpora used in MT systems, for low quality databases reduce the quality of the TT.

2.2. Systemic mistakes

Systemic mistakes usually occur due to inaccuracies in the system, i.e. issues regarding the algorithms of the dictionaries or of the programme. Most often these mistakes do not have any logical explanation. Petkevičiūtė and Tamulynas (2011:41) provide the following classification of systemic mistakes: omission of verb; spelling of capital and minuscule letters; omission of word; word translated in another language; ignorance of diacritics; word translated using a concept absent in the dictionary; extra word.

The description of these mistakes is quite brief because, as the authors state, no other researchers have distinguished this type of MT problem (Ibid.). Firstly, Petkevičiūtė and Tamulynas (2011:41) present the omission of the verb. They state that the system usually omits the predicate, e.g. if the predicate is is (or its forms was, were) it may be omitted. However, this mistake does not appear in all sentences: the predicate may be omitted in one sentence but translated in the following one (Ibid.). Here we can add the issue concerning the omission of words, for these two problems are similar.
The researchers point out that in most cases the system omits dependent parts of speech (conjunctions, prepositions, particles) (Ibid.). However, the consequences of these mistakes differ. The absence of the predicate can cause a misunderstanding of the whole meaning of the sentence, whereas the absence of dependent parts of speech does not bear such significance for grasping the meaning of the sentence (Ibid.). The discussion of the remaining problems is quite superficial, for the authors do not explain why these mistakes occur.

As mentioned above, it is hard to explain why systemic problems occur. We may state that an understanding of why one or another systemic mistake appears can only be achieved through the analysis of the translation system itself, i.e. the rules and algorithms which are installed in it.

3. TRANSLATING TEXTS OF DIFFERENT REGISTERS

DiMarco and Hirst (1990), Calude (2004), and Proshina (2008) highlight the importance of text genre in the process of translation. DiMarco and Hirst (1990: 65) state that “a significant part of the meaning of any text lies in the author’s style<...>which must be carried through in any translation if it is to be considered faithful.” The idea that the TT must match, as closely as possible, the writing style of the ST is maintained by Proshina (2008: 196). She also adds that at the same time the translator must mind the stylistic peculiarities of the source language (SL) (Ibid.), which is a hard task for current MT systems, say DiMarco and Hirst (1990: 65).

The most common classification of functional styles of the English language is that of Galperin (1981: 33). The author distinguishes five major genres, which are as follows:
1) The language of belles-lettres;
2) The language of publicistic literature;
3) The language of newspapers;
4) The language of scientific prose;
5) The language of official documents.
Proshina (2008: 196), however, adds to this list such functional styles as colloquial style and advertising style, whereas Galperin (1981) does not list colloquial style as a separate functional style, but points out that this genre is used in belles-lettres style and newspaper style. The advertising style is listed as a subgroup of newspaper style (Ibid.).

A quite different approach is proposed by DiMarco and Hirst (1990:66). The authors base their classification on the MT point of view. They state that, rather than looking at separate styles, the machine is more concerned with the analysis of group style [7]. They further divide this group into two big subgroups: literary and utilitarian styles (Ibid.). It is further explained that utilitarian style is a more general name for texts ‘‘which have a particular purpose, such as medical text books or newspaper articles’’ (DiMarco and Hirst 1990:66).

Each of the styles has its own goals: to inform, to instruct, to suggest a possible way, to convince the reader, etc. As mentioned previously, the TT must preserve the same goal, and the functions of different functional styles cannot be mixed (Galperin 1981, DiMarco and Hirst 1990, Proshina 2008).

[7] Group style: the authors refer ‘‘to a characteristic of text that, although possibly produced by one individual, shares the stylistic standards of a body of writers.’’ (Ch. DiMarco, G. Hirst 1990: 66).

It is also important to understand the properties of different functional styles. The properties of a text, in this case, mean the pragmatics [8] of texts of different genres. Calude (2004:8) presents a table with general attributes of four styles.
Text genres | Sentence types | Pragmatic information | Domain/scope
Technical set of instructions (scientific prose*) | Short | Little | Very limited
Popular magazine extract (publicistic style*) | Combination of long and short | Neutral | Neutral
Newspaper article extract (language of newspapers*) | Many short and effective | Lots | Very broad
Short story extract (belles-lettres style*) | Long, elaborate | In abundance | Very broad

Table 1. Attributes of the different text genres, adapted from Calude (2004: 8)
* Author’s remarks.

Table 1 shows the properties of four functional styles. As is evident, each of these styles bears a different amount of pragmatic information. Belles-lettres style has an abundance of contextual meaning, whereas scientific prose has very little of it. The magazine extract has neutral pragmatic information, which means that contextual meaning is not always found in these kinds of articles. Finally, we can see that newspaper articles have lots of contextual meaning. Calude (2004:8) explains that this genre usually favours context-bound information, i.e. it is assumed that the reader is already aware of events that have happened in the world.

However, since Calude (2004) does not describe the style of official documents, the theory presented by Galperin (1981) must be consulted. He claims that in this style ‘‘there is no room for contextual meaning or any kind of simultaneous realization of two meanings’’ (Galperin 1981:314). What is more, the author states that each substyle of official documents has its own peculiar compositional patterns (Ibid.). In the practical part of this paper we will follow the classification suggested by Galperin (1981).

[8] Pragmatics: the study of how words and phrases are used with special meanings in particular situations. Longman Dictionary of Contemporary English (2005).

4. THE ANALYSIS OF TEXTS OF DIFFERENT REGISTERS

4.1. The methodology of the research

Before proceeding to the empirical part of the study, the methods and the process of the analysis employed in this paper will be briefly discussed. The main purpose of the present paper is to analyse MT mistakes in texts of various genres translated from English into Lithuanian. To fulfil this goal several methods were applied.

Firstly, the sampling method was used to collect a number of examples from various literature and Internet sources. In order to obtain better and clearer results on how the machine deals with texts which have an abundance of contextual meaning and with those which have very little of it, our sources range from the most pragmatic to the least context-bound texts. Some examples for the present paper were taken from the Internet. These include a popular magazine article found on the website http://www.popsci.com and a newspaper article extracted from the website http://www.guardian.co.uk/ (both texts were translated by the author of this research, because translations made by professional translators could not be found). A text of official documents was also taken from the website http://www.constitution.org/constit_.htm. Its Lithuanian translation was found in the library of Šiauliai University. Additionally, an instruction manual received with the purchase of an alarm clock radio, together with its Lithuanian version attached to the clock, was taken as a source for the present research, whereas examples of belles-lettres style were picked from Oscar Wilde’s novel ‘‘The Picture of Dorian Gray’’ (2003) and its Lithuanian version ‘‘Doriano Grėjaus Portretas’’ (2001) translated by Lilija Vanagienė. It is also important to note that all the examples were selected randomly, instead of analysing the whole or a continuous part of the selected texts.
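
The shares of grammatical, lexical and systemic mistakes reported in the analysis are plain percentages of the form X = N : Z * 100%. The research itself computed them in MS Excel; the Python sketch below is only an equivalent illustration of that arithmetic. The function names and the counts in it are hypothetical and are not data from this research.

```python
def share_percent(n, z):
    """Return X = N : Z * 100, the percentage that count n makes up of total z."""
    if z <= 0:
        raise ValueError("the total Z must be positive")
    return n / z * 100


def mistake_shares(counts):
    """Map each mistake group to its percentage of all counted mistakes."""
    total = sum(counts.values())
    return {group: round(share_percent(n, total), 1) for group, n in counts.items()}


# Hypothetical counts, invented for illustration only (not the thesis data).
example_counts = {"grammatical": 6, "lexical": 3, "systemic": 1}
print(mistake_shares(example_counts))
```

Here Z is taken to be the total number of mistakes found in one text, so the three shares for a text add up to roughly 100% (rounding aside).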
After performing MT, which was done by employing an online machine translation programme created by Vytautas Magnus University (available at http://vertimas.vdu.lt/twsas/), all examples were grouped into those containing grammatical, lexical and systemic misalignments. This was done by using the contrastive method and comparing the ST with the TT and the MT output. The descriptive method enabled us to describe our findings and provide a short analysis. Finally, by utilizing the statistical method, the results were presented graphically and the percentages of the previously mentioned mistake groups, i.e. grammatical, lexical and systemic, were calculated using the MS Excel programme. The percentage was calculated according to the mathematical formula X = N : Z * 100%, where X is the percentage of the number N, N is the number whose percentage needs to be found, and Z is the number which denotes 100%. It should also be mentioned that this research followed the black-box approach, i.e. only the SL texts and the rendered texts were analyzed, and no analysis was made of how the programme itself works.

4.2. Analysis of technical set of instructions

Firstly, the analysis of the instruction manual of an alarm clock radio (see Annex 1) produced by ‘‘SOUNDMAX’’ was performed. The first three examples illustrate the grammatical mistakes detected in the selected text.

(1) ST: ‘‘Connect a 9 volt battery (not included) to the terminals inside the battery compartment.’’
TT: ‘‘Prijunkite 9 voltų bateriją (nepridedama) prie baterijos dėkle esančių įvadų.’’
MT: ‘‘Prijunkite 9 voltų bateriją (neįtrauktas) į terminalus baterijų skyriuje.’’

In the example (1) we see the incorrect usage of the grammatical gender in the translation of the words in brackets. This mistake occurs because the word battery in the English language has a neuter gender, but in Lithuanian it is of the feminine gender.
Because there are no specific indications about this misalignment, the system follows the rule that the masculine gender is the unmarked member in the English language and translates the words in the masculine gender. The second mistake in the example (1) is a misalignment in the grammatical number. We see that the second word battery is in the singular form, but the MT system translates it in the plural form. This mistake is quite surprising, because there are no pronouns which could mislead the system. The machine probably assumes that there are mutual syntactic relations between the words terminals and battery, and that these words must therefore share the same grammatical number.

(2) ST: ‘‘<...>but there is now the advantage that if there is a mains current failure your clock will continue to work.’’
TT: ‘‘<...>tačiau privalumas tas, kad nutrūkus elektros tiekimui, jūsų laikrodis ir toliau veiks.’’
MT: ‘‘<...>bet yra dabar pranašumo kad, jei yra maitinimo tinklo srovės nesėkmė, jūsų laikrodis tęs dirbti.’’

The example (2) shows two more grammatical mistakes discussed in the theoretical part of this paper, i.e. the wrong usage of the grammatical case and the incorrect translation of the verb. It is quite hard to explain why the machine uses the improper case while translating the word advantage, because no other word in this sentence requires the genitive case. The mistake is probably caused by some systemic failure. The second mistake is made while translating the infinitive to work. As seen in the TT, the word is translated in the future simple, but the machine leaves it in the infinitive form. As is evident from the MT output, the infinitive to work in this case should be translated as a noun, because the preceding word tęs requires the following word to be in the accusative case. The MT system is probably unable to identify this and translates the word in the infinitive form.
(3) ST: ‘‘The clock display will not light up, as the clock time will be held in the clock memory by the battery back-up system.’’
TT: ‘‘Laikrodžio ekranėlis nenušvis, nes laikrodžio laiką baterijos rezervinė sistema laikys laikrodžio atmintyje.’’
MT: ‘‘Laikrodžio parodymas neapšvies, kadangi laikrodžio laikas bus laikytas laikrodžio atmintyje baterijos atsarginės sistemos.’’

The example (3) contains a grammatical mistake concerning the wrong usage of the part of speech. Instead of using a verb in the future simple tense, the MT system uses the participle of the past tense. What is more, the word laikytas is preceded by the word bus indicating the future. A word combination like this indicates some future result, but the ST does not have this indication; it simply explains what happens at the moment. What is more, the words bus and laikytas cannot be used together in Lithuanian, because they contradict each other. That is why the usage of the past participle is incorrect in this case.

Further examples illustrate the lexical mistakes found in the analysis of the selected text.

(4) ST: ‘‘The battery back-up system is only meant to be used from short temporary power failures.’’
TT: ‘‘Baterijos rezervinė sistema skirta naudoti tik trumpam nutrūkus elektros tiekimui.’’
MT: ‘‘Baterijos atsarginė sistema yra tiktai reikšta, kad būtų panaudota nuo trumpų laikinų valdžios nesėkmių.’’

The example (4) contains a lexical mistake in a word which is essential for the correct understanding of the sentence. It is quite confusing why the machine chooses to translate the word power as valdžios, because in the dictionaries [9] that were consulted this is only the third definition provided, and MT systems usually use the first given meaning. We could say that this mistake occurs because the machine does not recognize the context of the text and is unable to choose the right meaning of the word power.
(5) ST: ‘‘The clock display will not light up, as the clock time will be held in the clock memory by the battery back-up system.’’
TT: ‘‘Laikrodžio ekranėlis nenušvis, nes laikrodžio laiką baterijos rezervinė sistema laikys laikrodžio atmintyje.’’
MT: ‘‘Laikrodžio parodymas neapšvies, kadangi laikrodžio laikas bus laikytas laikrodžio atmintyje baterijos atsarginės sistemos.’’

The example (5) also shows an incorrect translation, this time of the word display; however, this mistake is not as severe as the one discussed in the example (4). The misalignment in the example (5) has no impact on the meaning of the sentence. It can only confuse the reader, because one may think that the text refers to the numbers shown and not to the whole clock screen. This mistake occurs for the same reason as the error in the example (4), i.e. the system is not able to recognize the context and chooses an improper meaning of the specific word.

As is evident from the provided examples, the vast majority of mistakes occurring in the translation of the technical set of instructions are grammatical mistakes. However, they are not very serious and would not cause misinterpretation of the text. The only crucial mistake in this text is the lexical one provided in the example (4). No systemic mistakes occur in the selected text. Few crucial mistakes can be found in the text under analysis because such texts usually use clear, definite statements and standardized words, i.e. words which have only one meaning. All in all, we can conclude that the translation of the technical text is quite sufficient. Despite two lexical mistakes, the text is understandable and there are no other severe mistakes which could prevent the reader from understanding the text.

[9] (2007) Mokomasis Anglų Kalbos Žodynas. Vilnius: Alma Littera. Baravykas, V. (1961) Anglų-Lietuvių Kalbų Žodynas. 2nd edition. Vilnius: Valstybinė Politinės Ir Mokslinės Literatūros Leidykla. Piesarskas, B. (2004) Dvitomis Anglų-Lietuvių Kalbų Žodynas. Vilnius: Alma Littera.
4.3. Analysis of popular non-fiction text

The second text was taken from the online popular non-fiction magazine Popular Science (see: www.popsci.com). The chosen article, entitled Warp Factor, was written by Konstantin Kakaes, who discusses NASA’s intention to create a spaceship which could travel faster than light and therefore make travel beyond our solar system possible. Firstly, we will present the grammatical mistakes found in the process of translation.

(6) ST: ‘‘<...>engineers and space enthusiasts gathered at the Hyatt Hotel in downtown Houston<...>’’ (K. Kakaes 2013)
TT: ‘‘<...>inžinierių ir kosmoso entuziastų rinkosi Hyatt viešbutyje, Hiustono centre<...>’’ (Author)
MT: ‘‘<...>inžinierių ir kosminių entuziastų rinkosi Hyatt Viešbutyje Hiustono miesto centre<...>’’

(7) ST: ‘‘The first mainstream use of the expression ‘‘warp drive’’ dates to 1966...’’ (K. Kakaes 2013)
TT: ‘‘Pirmą kartą sąvoka ‚‚deformacijos variklis‘‘ imta plačiai vartoti dar 1966...’’ (Author)
MT: ‘‘Pirmas vyraujantis posakio naudojimas ‚‚deformuoja variklį‘‘ datos iki 1966...’’

The examples (6) and (7) illustrate the grammatical mistake of translating a word as the wrong part of speech. The MT rendering of space enthusiasts as kosminių entuziastų in the example (6) can be treated as misleading, because it implies that the enthusiasts are from space: kosminių here stands as an adjective and describes the following word. However, the original text refers to a certain sphere which people are interested in; therefore, the word space must be treated as a noun and rendered as kosmoso. In the example (7) we observe that the machine is not able to identify that the word dates is a verb and translates it as a noun in the plural form, which not only causes misinterpretation of the text but also makes the sentence hardly comprehensible.

(8) ST: ‘‘<...>and then takes me down the hall to Eagleworks.’’ (K. Kakaes 2013)
TT: ‘‘<...>po to nuveda mane koridoriumi žemyn į ‘‘Erelio dirbtuves’’.’’ (Author)
MT: ‘‘<...>ir paskui ima mane žemyn salė į Eagleworks.’’

The example (8) presents another type of grammatical mistake, which is the wrong usage of the grammatical case. This mistake occurs because the MT system is not able to link the words take me down and the hall. As is evident from the TT, the word hall must be in the instrumental case, and not in the nominative case, as rendered by the machine.

A great number of lexical mistakes are found in the popular non-fiction text. These misalignments are presented below.

(9) ST: ‘‘<...>something he calls a quantum vacuum plasma thrusters (QVPT).’’ (K. Kakaes 2013)
TT: ‘‘<...>kažką ką jis vadina plazmine važiuokle (variklio tipas, kuris geba iš vakuumo išgauti energiją) (PV).’’ (Author)
MT: ‘‘<...>kažkas, kurį jis kviečia, kvantas siurbia plazminį stūmiką (QVPT).’’

The example (9) contains one of the most severe mistakes within the text under analysis. We can see that the machine is unable to translate the specific term and performs word-by-word translation, which is utterly incomprehensible. This error appears because it is quite difficult to translate such terms, which are used in a particular sphere of study. Even a human translator finds it difficult to render such word clusters, so one has to use explanatory notes, as the author did while translating the example (9). The translator must look through various materials to find out what the phrase means and put it in the text in short terms, yet we cannot expect the machine to perform such an operation.

(10) ST: ‘‘White shows me into the facility<...>’’ (K. Kakaes 2013)
TT: ‘‘Vaitas palydi mane į patalpą, kurioje stovi įrenginys<...>’’ (Author)
MT: ‘‘Baltas rodo man į priemonę<...>’’

Another crucial mistake is evident in the example (10).
This example illustrates the kind of mistake when the SL word requires a whole phrase in the TL, as opposed to a single word. The improper translation of such words usually severely affects the meaning of the whole sentence. The MT system uses a meaning of the word facility which can be found in the dictionary, but from the ST we understand that by facility the author means the room where the device is located, and this must be made clear in the TT. In addition, the machine does not understand that the words show into must be translated together. This happens because the pronoun me stands between show and into, which becomes an obstacle for the MT system. Finally, this example contains the wrong translation of a proper noun, in this case a surname. This mistake occurs because no distinction is made between common words and proper names when the latter have lexical meaning. The machine cannot distinguish between them if they are not marked in the deformatting stage. However, it can be noted that when the name is in the possessive case, e.g. White’s device, this error does not occur.

(11) ST: ‘‘<...>and that he was commencing physical tests in his NASA lab, which he calls Eagleworks.’’ (K. Kakaes 2013)
TT: ‘‘<...>ir kad jis jau pradėjo fizinius testus savo NASA laboratorijoje, kurią jis vadina ‚‚Erelio dirbtuvėmis‘‘.’’ (Author)
MT: ‘‘<...>ir kad jis pradėjo fizinius testus savo NASA laboratorijoje, kurią jis kviečia Eagleworks.’’

The example (11) depicts another lexical mistake found in the selected text, which is the wrong choice of a word meaning. A check of a number of dictionaries (see footnote 9, p. 20) shows that the meaning kviesti of the word call is only the second one, and, as mentioned earlier in this paper, MT systems usually employ the first meaning of a word. That is why this mistake is quite unusual. However, this mistake is not crucial, for it does not have an impact on the meaning of the sentence.
(12) ST: ‘‘Put plainly, warp drive would permit faster-than-light travel.’’ (K. Kakaes 2013)
TT: ‘‘Aiškiai kalbant, deformacijos diskas leistų keliauti greičiau už šviesą.’’ (Author)
MT: ‘‘Padėtas aiškiai, deformacijos variklis leistų greitesnę negu šviesa kelionę.’’

The example (12) deals with the translation of a collocation. As is evident, the collocation put plainly is absent from the system’s dictionary; therefore, the machine performs word-by-word translation.

The selected text also contains several systemic mistakes. They are presented in the examples below.

(13) ST: ‘‘As we walk, he tells me about his quest to open the lab.’’ (K. Kakaes 2013)
TT: ‘‘Mums beeinant jis man pasakoja apie savo siekius atidaryti laboratoriją.’’ (Author)
MT: ‘‘Kadangi mes einame, jis sako, kad aš apie jo ieškojimą atidaryčiau laboratoriją.’’

In the example (13) we notice that the MT system inserts the conjunction kad in its output even though it is absent in the ST. As mentioned in the theoretical part of this paper, it is hard to explain why such systemic mistakes occur. To understand them we would have to analyze the machine itself.

(14) ST: ‘‘Johnson Space Center sprawls beside lagoons<...>’’ (K. Kakaes 2013)
TT: ‘‘Džonsono kosminis centras driekiasi palei lagūnas<...>’’ (Author)
MT: ‘‘Johnson Kosminis centras išsidriekia šalia lagūnų<...>’’

The example (14) presents a misalignment in the spelling of capital and minuscule letters. It is known that in English the full names of companies or institutions are considered to be proper nouns and every word of the title is capitalized, e.g. Chelsea Hotel, Houston Space Center, The Homeless Center for Strafford County, etc. (Marshall 2012). However, this is not the case in Lithuanian. Usually such words as center, hotel, etc. are written in minuscule letters (Lingytė 2002). This rule is probably not installed in the MT system, and the machine renders the name improperly.

(15) ST: ‘‘<...>warp drive<...>’’ (K. Kakaes 2013)
TT: ‘‘<...>deformacijos diskas<...>’’ (Author)
MT: ‘‘<...>deformacijos variklis<...>’’

The example (15) shows the error when the machine translates a word using a concept absent from the dictionary. According to a number of dictionaries (see footnote 9, p. 20), the word drive does not have the meaning variklis. However, this mistake could be explained by saying that the person or persons who installed the dictionaries in the MT system were aware of this phrase, but had a slightly different understanding of it and did not check how it is rendered in other scientific sources. What is more, this mistake does not cause misinterpretation of the text, because the reader can still understand the meaning of the text.

Summing up, it could be said that, compared with the text of technical instructions analyzed in the previous chapter, this text has more lexical mistakes than grammatical ones. In addition, there are a few systemic mistakes in this text, which are absent in the text analyzed in chapter 4.2. What is more, the translation of the popular non-fiction text contains more severe errors. The examples (7) and (9) illustrate mistakes which make the sentences hard to comprehend, and the example (10) is rendered completely incorrectly. Furthermore, the MT system has difficulty translating specific terms used in a certain field of interest, as seen in the examples (9) and (15). To avoid these kinds of mistakes, the dictionaries installed in the system should be constantly updated, and the person or persons responsible for updating them should look closely for the right terms.

4.4. Analysis of belles-lettres style

For the belles-lettres style the novel The Picture of Dorian Gray by Oscar Wilde was selected. The story is about a young man, Dorian Gray, and his mysterious portrait, which grew older and uglier as its master lived a wild and sinful life. The novel was translated into Lithuanian by Lilija Vanagienė.
The examples (16) – (21) illustrate the grammatical mistakes found in the text under analysis.

(16) ST: ‘‘<...>there came through the open door the heavy scent<...>’’ (Wilde 2003:4)
TT: ‘‘<...>pro atviras duris svaigiai padvelkdavo<...>’’ (Wilde 2001:8)
MT: ‘‘<...>ten prasiskverbė atviros duris sunkus aromatas<...>’’

The example (16) presents a misalignment in the grammatical case. We see that, regardless of the indication that the accusative case should be used, which is expressed by the word through, the machine still renders the word open in the wrong case.

(17) ST: ‘‘<...>a smile of pleasure passed across his face, and seemed about to linger there.’’ (Wilde 2003:5)
TT: ‘‘<...>ir pasitenkinimo šypsena neblėso jam iš veido.’’
MT: ‘‘<...>malonumo šypsena praėjo per jo veidą, ir atrodo, ketino užtrukti ten.’’

(18) ST: ‘‘<...>a long thin dragon-fly floated past<...>’’ (Wilde 2003:8)
TT: ‘‘<...>ir ilgas plonas laumžirgis<...>praplaukė pro šalį<...>’’ (Wilde 2001:12)
MT: ‘‘<...>ilgas plonas laumžirgis paskleista praeitis<...>’’

The words seemed and floated in the examples (17) and (18) have definite indicators of the past tense, i.e. the ending –ed, but still the word seemed is translated in the present tense and the word floated is not rendered as a finite verb at all, while the following word past is translated as a noun.

(19) ST: ‘‘<...>my acquaintances for their good characters<...>’’ (Wilde 2003:10)
TT: ‘‘<...>pažįstamus dėl gero būdo<...>’’ (Wilde 2001:14)
MT: ‘‘<...>savo pažįstamą jų geriems charakteriams<...>’’

In the example (19), where the noun has a clear sign of being in the plural form, the ending –s, the word is still translated in the singular form. This mistake becomes more conspicuous because we see that the following words are rendered correctly, i.e. in the plural form. In view of this, the misalignment could even be classified as a systemic mistake.
(20) ST: ‘‘<...>talking [Dorian Gray] to the pretty Duchess of Monmouth<...>’’ (Wilde 2003:170)
TT: ‘‘<...>šnekučiavosi [Dorianas Grėjus] su dailiąją Monmeto hercogiene<...>’’ (Wilde 2001:174)
MT: ‘‘<...>kalbėdama [Dorianas Grėjus] su gražia Monmouth Kunigaikštiene<...>’’

In the example (20) the word talking is translated in the feminine gender, even though it is mentioned that the person who is talking is male. This error could also be classified as a systemic mistake, because it is hard to explain why the machine makes such mistakes despite clear indications.

(21) ST: ‘‘The girl never really lived<...>’’ (Wilde 2003:93)
TT: ‘‘Mergaitė niekada tikrai negyveno<...>’’ (Wilde 2001:97)
MT: ‘‘Mergaitė niekada iš tikrųjų gyveno<...>’’

Finally, the example (21) is a typical example of the translation of negative forms. The machine renders the segment incorrectly because the two words which together express negation, never and lived, are separated by another word. However, when translating such a simple sentence as He never lived on his own [MT: jis niekada negyveno savarankiškai], we observe that the machine is capable of recognizing and translating negative forms correctly.

Further examples illustrate the lexical mistakes found in the text of belles-lettres style.

(22) ST: ‘‘There was something in his face that made one trust him at once.’’ (Wilde 2003:17)
TT: ‘‘Kažkodėl jo veidas skatino juo pasitikėti.’’ (Wilde 2001:21)
MT: ‘‘Buvo kažkas jo veide, kuris privertė vieną patikėti juo tuojau pat.’’

(23) ST: ‘‘Was he always to be burdened by his past?’’ (Wilde 2003:196)
TT: ‘‘Negi visada jį slėgs praeitis?’’ (Wilde 2001:200)
MT: ‘‘Jis turėjo visada būti kraunamas jo praeities?’’

The selected text contains a number of mistakes concerning the wrong translation of pronouns. Some of these mistakes are presented in the examples (22) and (23). In the example (22) a wrong indefinite pronoun is used.
Instead of kuris the system has to use the pronoun kas, because the word kažkas is of neuter gender and so must be the pronoun linked with that word. The example (23), on the contrary, deals with personal pronouns. In this example the primary pronoun jo is used instead of the possessive one. This mistake is quite crucial, for it has an impact on the meaning of the sentence. When a primary pronoun is used, we get the idea that someone is burdened by the past of another person, but when a possessive pronoun is used, the reader can clearly understand that the person is suffering because of his own deeds in the past.

(24) ST: ‘‘<...>they got on the roof<...>’’ (Wilde 2003:197)
TT: ‘‘<...>jie užlipo ant stogo<...>’’ (Wilde 2001:200)
MT: ‘‘<...>jie sėdo ant stogo<...>’’

(25) ST: ‘‘There was only one bit of evidence left against him.’’ (Wilde 2003:196)
TT: ‘‘Prieš jį tėra tik vienas įrodymas.’’ (Wilde 2001:200)
MT: ‘‘Buvo tik vienas bitas įrodymo, kurį paliekama prieš jį.’’

Errors concerning polysemy are present as well and are shown in the examples (24) and (25). The example (24) is incorrect because, when to get on is translated as sėsti, it indicates that someone is moving inside. However, in this case people are moving outside. In the following example the word bit is misinterpreted and confused with the term used to define computer capacity. This mistake is also quite conspicuous, because there are no previous indications about computers or any electronic devices which could confuse the system. The machine is probably misled by the word one and assumes that those two words denote the capacity of some kind of device.

Systemic mistakes are also present in the belles-lettres text. They are represented in the examples below.
(26) ST: ‘‘The studio was filled with the rich odour of roses<...>’’ (Wilde 2003:4)
TT: ‘‘Dailininko dirbtuvėje tvyrojo saldus rožių aromatas<...>’’ (Wilde 2001:8)
MT: ‘‘Studija buvo pripildyta turtingo roses aromato<...>’’

(27) ST: ‘‘Dorian Gray stepped up on the dais with the air of a young Greek martyr<...>’’ (Wilde 2003:18)
TT: ‘‘Dorijanas Grėjus užlipo ant pakylos jauno graikų kankinio veidu<...>’’ (Wilde 2001:22)
MT: ‘‘Dorėnų Pilkuma žengė į priekį ant pakylos [su] išraiška jauno Graikijos kankinio<...>’’

(28) ST: ‘‘<...>the sharp snaps of the guns that followed<...>’’ (Wilde 2003:177)
TT: ‘‘<...>ir įkandin jį sekąs šaižūs šautuvų pyškėjimai<...>’’ (Wilde 2001:181)
MT: ‘‘<...>ir aštrus šnapso ginklų, kurie sekė<...>’’

The examples (26) and (28) could be explained by a major error in the system’s dictionary: the word rose is absent, whereas the word snaps is confused with the word schnapps. But we cannot explain why the machine omits the word su in the example (27) even though the ST contains the preposition with. We see that in the TT this preposition is absent as well, but there the grammatical case is adjusted to maintain the meaning, which is not done in the MT output.

What is more, some portions of the selected text are of such low quality that it is almost impossible to understand them. Such sentences could be called miscellaneous, for they contain a number of various mistakes, grammatical and lexical, which together make the sentences so unsatisfactory. These portions are presented below.
(29) ST: ‘‘From the corner of the divan of Persian saddle-bags on which he was lying, smoking, as was his custom, innumerable cigarettes, Lord Henry Wotton could just catch the gleam of the honey-sweet and honey-coloured blossoms of a laburnum, whose tremulous branches seemed hardly able to bear the burden of a beauty so flamelike as theirs<...>’’ (Wilde 2003:4)
TT: ‘‘Gulėdamas ant persiškom gūniom apklotos kanapos ir kaip visada rūkydamas nežinia kelintą iš eilės cigaretę, lordas Henris Votonas iš savo kampo dar matė geltonus ir saldžius it medus akacijos žiedus ir virpančias šakas, lūžte lūžtančias nuo grožybių naštos, taip panašios į liepsną<...>’’ (Wilde 2001:8)
MT: ‘‘Nuo kampo sofos persų persisveriamų krepšių, ant kurių jis gulėjo, rūkymas, kaip buvo jo padaryta pagal užsakymą, nesuskaičiuojamos cigaretės, Lord Henry Wotton galėjo tik sugauti saldaus medumi ir sužydėjimo laburnum spalvos medaus šviesaus, kurios drebančios šakos atrodė vos tik gabios turėti naštą grožio taip panašaus į liepsną kaip jų<...>’’

(30) ST: ‘‘<...>but now and then a thrill of terror ran through him when he remembered that, pressed against the window of the conservatory, like a white handkerchief, he had seen the face of James Vane watching him.’’ (Wilde 2003:175)
TT: ‘‘<...>tačiau kartkartėmis jį persmelkdavo siaubas, kai tik prisimindavo Džeimso Veino veidą, kuris jį stebėjo, prisispaudęs prie oranžerijos stiklo lyg balta nosinė.’’ (Wilde 2001:179)
MT: ‘‘<...>bet dabar ir paskui teroro jaudulys pakartojo jį, kai jis atsiminė, kad, spaustas prieš langą konservatorijoje, kaip balta nosinė, jis pamatė veidą James Vane, stebinčio jį.’’

The examples above contain almost every mistake possible in MT, i.e. the wrong part of speech usage, e.g. rūkymas instead of rūkyti (example (29)), spaustas instead of prisispaudęs (example (30)); misalignment in verb forms, e.g. pakartojo instead of persmelkdavo (example (30)); the improper grammatical case, e.g.
medaus šviesaus instead of it medus (example (29)); polysemy, e.g. padarytas pagal užsakymą instead of kaip visada (example (29)); an untranslated word, e.g. laburnum [akacija] (example (29)); and the wrong translation of a culture-specific item, e.g. persų persisveriami krepšiai instead of persiškom gūniom apklota kanapa (example (29)). All these mistakes together contribute greatly to the low quality of the sentences, because one error inevitably leads to another. What makes these examples even harder to understand is the word order, which is very primitive.

To sum up, it is evident that the translation of belles-lettres style is completely unsatisfactory. The output contains almost every possible mistake the machine can make while translating a text. What is more, even though some sentences are understandable, there are a number of portions which are hardly, if at all, comprehensible. In addition, incorrect word order is also a common mistake which contributes greatly to the misinterpretation of the selected text.

4.5. Analysis of newspaper article

The article was extracted from the website of The Guardian, one of the biggest newspapers in Britain (see: http://www.guardian.co.uk). The article is called Kim Jong-un Has Made A Decent Fist of Rattling The US and discusses the motives behind the threats of nuclear war that the USA has recently received from North Korea. The author of the article is Justin McCurry.

Grammatical mistakes are in abundance in the selected text and are shown in the examples below.

(31) ST: ‘‘<...>or an attack on islands near the disputed North-South maritime border.’’ (McCurry 2013)
TT: ‘‘<...>arba salų esančių šalia ginčytinos Šiaurės-Pietų jūrų sienos puolimas.’’ (Author)
MT: ‘‘<...>ar atakos ant salų šalia ginčytino Šiaurės-pietų jūrinė siena.’’

The example (31) illustrates the improper usage of the grammatical case.
As is the case in other texts, here the machine is unable to link certain words, such as ginčytina and jūrų sienos, and this eventually causes the wrong choice of grammatical case. This mistake can be explained through another mistake evident in this example, namely misalignment in gender. Apparently, the machine is not able to relate the word ginčytina to any other word in the sentence and renders it in the masculine gender, even though every other word is in the feminine gender. This misalignment contributes to the improper usage of the grammatical case, because the Lithuanian word siena [border] has only the feminine gender and cannot be combined with a masculine word. Because the machine cannot relate those two words, it translates the rest of the sentence as a separate unit and uses the nominative case.

(32) ST: ‘‘Jang, who says he talks ‘two or three times a day’<...>’’ (McCurry 2013)
TT: ‘‘Jang, kuris teigia, kad kalbasi ‘du ar tris kartus per dieną’<...>’’ (Author)
MT: ‘‘Jang, kas sako, kad jis kalba ‘du ar trys kartai dieną’<...>’’

The example (32) also presents an error concerning the grammatical case. In this instance the machine is unable to link the words three times with a day. However, the origin of this mistake is quite different from the one discussed in the example (31). This time the mistake occurs due to another error evident in this example: the omission of the article a. When this article is omitted, the word day loses its function as an adverbial modifier of time and becomes a simple noun. As a result, the system is unable to link those words and renders them both in the nominative case.

(33) ST: ‘‘The coming weeks could see more attempts to unsettle the region.’’ (McCurry 2013)
TT: ‘‘Daugiau pasikėsinimų sutrikdyti sritį gali būti įvykdyta per ateinančias savaites.’’ (Author)
MT: ‘‘Besiartinančios savaitės galėjo pamatyti daugiau pastangų sutrikdyti sritį.’’

The example (33) depicts the mistake concerning the wrong usage of verb form.
Regardless of the indication of the future tense, could see, the machine translates the verb in the past tense. These kinds of mistakes are hard to explain: when there is a clear indication of a certain tense but the machine chooses a completely different one, we can assume there are some systemic faults in the system itself.

(34) ST: ‘‘Among the options<...>’’ (McCurry 2013)
TT: ‘‘Tarp galimų pasirinkimų<...>’’ (Author)
MT: ‘‘Tarp pasirinkimo<...>’’

Misalignment in the grammatical number is also present in the selected text. As is the case in the texts analyzed earlier, the example (34) shows that despite the distinct indication of the plural form, the ending -s, the system translates the word in the singular form. This can also be regarded as a systemic mistake, because there is no clear explanation why it happens.

(35) ST: ‘‘<...>and repairing the damage UN sanctions have inflicted<...>’’ (McCurry 2013)
TT: ‘‘<...>ir atitaisyti žalą sukeltą JT apropojimų<...>’’ (Author)
MT: ‘‘<...>ir pakenkimo JT sankcijų remontas sukėlė<...>’’

The mistake in the example (35) deals with the misinterpretation of the part of speech, which is also evident in the text under analysis. As the MT output shows, this particular mistake occurs because the word repairing is related to the words sanctions have inflicted, and the system assumes it has to be rendered as a noun rather than a verb.

The lexical mistakes found in the selected text are presented and discussed next.

(36) ST: ‘‘<...>that his rule would mark<...>’’ (McCurry 2013)
TT: ‘‘<...>kad jo valdymas žymės<...>’’ (Author)
MT: ‘‘<...>kad jo taisyklė pažymės<...>’’

The machine faces problems translating polysemous words, as can be seen in the example (36). In this particular example the error appears because the machine is not able to draw on any contextual knowledge; consequently, it is unable to detect that the word rule must be rendered by its other meaning, i.e.
valdymas instead of taisyklė.

(37) ST: ‘‘<...>now a senior fellow at the Institute for National Security Strategy<...>’’ (McCurry 2013)
TT: ‘‘<...>vyriausiasis Instituto, atsakingo už Nacionalinio saugumo strategiją, narys<...>’’ (Author)
MT: ‘‘<...>dabar vyresnis bičiulis Institute Nacionalinio saugumo Strategijos<...>’’

One misinterpreted collocation is also present in the article under analysis and is shown in the example (37). A phraseological unit such as senior fellow is probably not very common in everyday English and is not included in the dictionary the system uses. That is why this collocation is translated word-by-word and loses its meaning, which is ‘‘the most experienced, or most successful of an elite group of people who work together as peers in an academic setting or institution’’ (Jones 2013).

(38) ST: ‘‘The defector, who arrived in South Korea with his wife and children<...>’’ (McCurry 2013)
TT: ‘‘Pabėgėlis, kuris atvyko į Pietų Korėją su savo žmona ir vaikais<...>’’ (Author)
MT: ‘‘Dezertyras, kuris atvyko į Pietų Korėją su jo žmona ir vaikais<...>’’

The example (38) shows the improper translation of pronouns. The machine once again uses a personal pronoun (jo) instead of the reflexive possessive (savo). This is a crucial mistake, because it causes misinterpretation of the text. In this particular case the idea of the ST is that someone came to South Korea together with his own family, whereas the output of MT conveys the idea that the person came to the country with someone else’s family.

(39) ST: ‘‘<...>a departure from bellicosity<...>’’ (McCurry 2013)
TT: ‘‘<...>karo veiksmų nutraukimas<...>’’ (Author)
MT: ‘‘<...>išvykimą nuo bellicosity<...>’’

The example (39) shows that the dictionary of the system is quite limited: it lacks the noun bellicosity, although a check of related forms showed that they are included. The MT system correctly renders words such as bellicose [karingas], belligerence [karingumas], belligerent [karingas].
Some systemic mistakes are also found in the text. They are illustrated in the examples below.

(40) ST: ‘‘Kim Jong-un’s aim is to unite the North Korean military and people around his regime<...>’’ (McCurry 2013)
TT: ‘‘Kim Jong-uno tikslas yra, kad Šiaurės Korėjos karinės pajėgos ir liaudis prisijungtų prie jo režimo<...>’’ (Author)
MT: ‘‘Kim Jong-un tikslas suvienija Šiaurės Korėjos kariuomenę ir žmones aplink jo režimą<...>’’

The example (40) shows the error of omitting the verb. Once again, no clear explanation can be given for this mistake. However, it is quite important, because it changes the meaning of the sentence. The ST means that people are against Kim Jong-un’s regime and he is taking some actions to change this. Yet the output of MT conveys the idea that people are already accepting the leader’s regime.

One instance found in the selected text is translated incomprehensibly. The portion is presented in the example (41).

(41) ST: ‘‘<...>but is not willing to give anything up to get it.’’ (McCurry 2013)
TT: ‘‘<...>tačiau nesiruošia ką nors aukoti, kad tai gautų.’’ (Author)
MT: ‘‘<...>bet nenori duoti kažkam iki, gauna tai.’’

Evidently, the most crucial mistake here is the wrong translation of the particle to. In this case the particle is matched to the first part of the sentence, which is but is not willing to give anything up, whereas it has to be attributed to the remaining part of the sentence. This one mistake leads to another crucial mistake: the wrong usage of verb form (cf. kad gautų tai vs. gauna tai). It is important to note that such portions are not as common in the newspaper article as in the text of belles-lettres style, but they are still present. Consequently, the quality of the text under analysis decreases greatly.

Finally, it can be concluded that the newspaper article has the best error score so far.
Five of the seven types of grammatical mistakes are present in this article: misalignment in the grammatical case, number, gender, verb forms, and the part of speech. Common lexical mistakes such as polysemy, untranslated words, the wrong translation of pronouns, and collocations are also detected in the selected text. What is more, the systemic mistake of omitting the verb is found, and some portions are completely misinterpreted. All in all, we can state that the MT output is below average and serious editing is required if a newspaper article is rendered by the machine.

4.6. Analysis of the official document

The Constitution for the United States of America was chosen to see what mistakes occur when the MT system translates the text of an official document. The text was extracted from the online version of the Constitution (see: http://constitution.org/c5/index.php). Findings of the analysis are presented and discussed below. Firstly, we will discuss the grammatical mistakes.

(42) ST: ‘‘<...>two Senators from each State, chosen by the Legislature thereof, for six Years<...>’’ (Constitution for the United States of America (henceforth U.S. Constitution), Article I, Section 3)
TT: ‘‘<...>du senatorius, renkamus atitinkamų valstijų įstatymų leidžiamųjų susirinkimų šešeriems metams<...>’’ (Jungtinių Amerikos Valstijų Konstitucija (nuo čia JAV Konstitucija), I Straipsnis, 3 Skyrius)
MT: ‘‘<...>dviejų Senatorių nuo kiekvienos valstybės, pasirinktos Įstatymų leidžiamojo organo jo šešerius Metus<...>’’

(43) ST: ‘‘<...>shall be vested in a Congress of the United States, which shall consist of a Senate and House of Representatives.’’ (U.S.
Constitution, Article I, Section 1)
TT: ‘‘<...>suteikiami Jungtinių Amerikos Valstijų Kongresui, susidedančiam iš Senato ir Atstovų rūmų.’’ (JAV Konstitucija I Straipsnis, 1 Skyrius)
MT: ‘‘<...>turi būti suteiktos Kongrese Jungtinių Valstijų, kurios turi susidėti iš Senato ir Atstovų Rūmų.’’

The most common grammatical mistakes present in the text of the official document are misalignment in the grammatical gender and number, as can be seen in the examples (42) and (43). In both cases the machine relates the wrong words. In the example (42) chosen is related to States, and in the example (43) the word which is related to the same word. This causes the words chosen and which to be translated in the feminine gender, even though they have to be rendered in the masculine gender. Errors in connecting the right words consequently lead to the improper usage of the grammatical number. Because the word States is in the plural form, the words correlated with it are also translated in the plural, even though they have to be in the singular. Another common mistake in the text under analysis is the misinterpretation of the grammatical case, depicted in the example (42). This mistake is present because the machine misses the preposition for, which shows that the dative case must be used.

(44) ST: ‘‘When the president of the United States is tried<...>’’ (U.S. Constitution, Article I, Section 3)
TT: ‘‘Kai teisiamas Jungtinių Valstijų prezidentas<...>’’ (JAV Konstitucija, I Straipsnis, 3 Skyrius)
MT: ‘‘Kai Jungtinių Valstijų prezidentas bus teistas<...>’’

The example (44) presents misalignments in verb form and the part of speech. As can be seen, the machine renders the verb is in the future tense, even though it is used in the present tense in the ST. This mistake becomes more conspicuous due to the wrong usage of the part of speech. The machine translates the verb tried as the participle.
What is more, the participle is used in the past tense, which is also a crucial mistake, because such word combinations are not employed in Lithuanian: the two tenses contradict each other.

(45) ST: ‘‘<...>and shall have the sole Power of Impeachment.’’ (U.S. Constitution, Article I, Section 2)
TT: ‘‘<...>tiktai jie turi išimtinę teisę pradėti apkaltos procesą.’’ (JAV Konstitucija, I Straipsnis, 2 Skyrius)
MT: ‘‘<...>ir turėsiu vienintelę Apkaltos Valdžią.’’

Misalignment in verb conjugation is illustrated in the example (45). This mistake appears because there is no pronoun the machine could relate the word have to; in addition, the system is not able to link this clause with the previous one and determine that both have the same subject, which is The House of Representatives (see the full sentence in the example (48)). Therefore, the machine assumes someone is talking in the first person and renders the word improperly.

The examples below depict the lexical mistakes found in the text under analysis.

(46) ST: ‘‘The House of Representatives shall be composed of Members chosen every second Year by the People of the several States<...>’’ (U.S. Constitution, Article I, Section 2)
TT: ‘‘Atstovų rūmus sudaro nariai, renkami kas dveji metai valstijų gyventojų<...>’’ (JAV Konstitucija, I Straipsnis, 2 Skyrius)
MT: ‘‘Atstovų Rūmai turi būti sudaryti iš Narių, pasirinktų kiekvieni antri Metai Žmonių kelių valstybių<...>’’

The most common mistake in the text is related to polysemy and is presented in the example (46). This mistake usually occurs while translating the word States, which is translated as valstybė. The machine is probably programmed to translate the word States as Valstija only in a collocation such as the United States of America, and when the word occurs alone the first meaning is programmed to be used. This assumption is made because when we try to translate a word cluster including this word, in which it should be translated as Valstija, the same mistake is present (cf. The State of Alabama rendered as Alabamos valstybė).
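The assumption made above can be illustrated with a minimal sketch. It is not the implementation of the analyzed MT system, whose internals are unknown; the dictionary entries and the function name translate_phrase are hypothetical and were invented for this illustration. The sketch only shows how a longest-match collocation lookup with a first-sense fallback would produce output of the kind observed in the example (46).

```python
# Hypothetical illustration only: the entries below are invented for this
# sketch and do not come from the analyzed MT system.

# Multi-word collocations, keyed by lowercase token tuples.
COLLOCATIONS = {
    ("united", "states", "of", "america"): "Jungtinės Amerikos Valstijos",
}

# Single-word entries; the first sense in each list is the default one.
SINGLE_WORDS = {
    "state": ["valstybė", "valstija"],
    "states": ["valstybės", "valstijos"],
}


def translate_phrase(tokens):
    """Translate tokens using longest-match collocation lookup.

    Words outside any known collocation fall back to their first
    dictionary sense; unknown words are passed through unchanged.
    """
    output = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest possible collocation starting at position i.
        for length in range(len(tokens) - i, 1, -1):
            key = tuple(t.lower() for t in tokens[i:i + length])
            if key in COLLOCATIONS:
                match = (COLLOCATIONS[key], length)
                break
        if match:
            output.append(match[0])
            i += match[1]
            continue
        senses = SINGLE_WORDS.get(tokens[i].lower())
        # First sense wins when no collocation covers the word.
        output.append(senses[0] if senses else tokens[i])
        i += 1
    return output
```

Under these assumptions the word State on its own receives the default sense valstybė, the full collocation is rendered with Valstijos, and a word missing from the dictionary, such as bellicosity in the example (39), is left in its original form.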
(47) ST: ‘‘The Times, Places and Manner of holding Elections for Senators and Representatives, shall be prescribed in each State by the Legislature thereof<...>’’ (U.S. Constitution, Article I, Section 4)
TT: ‘‘Senatorių ir atstovų rinkimų laiką, vietą ir tvarką kiekvienoje valstijoje nustato jos įstatymų leidžiamasis susirinkimas<...>’’ (JAV Konstitucija, I Straipsnis, 4 Skyrius)
MT: ‘‘The Times, Vietos ir Būdas surengti Rinkimus Senatoriams ir Atstovams, turi būti nurodytas kiekvienoje valstybėje Įstatymų leidžiamojo organo jo<...>’’

A quite conspicuous mistake concerning untranslated words can be seen in the example (47). This mistake is interesting, because we cannot explain why the machine leaves the words The Times in English. The word times is included in the system’s dictionary, for it is translated when entered alone. What is more, the word combination the Times is also rendered correctly when entered on its own, but in the word combination presented in the above example the machine does not translate it. Thus, we can assume there are some problems in the way the machine works, and this mistake could even be categorized as a systemic mistake.

(48) ST: ‘‘The House of Representatives shall chuse their Speaker and other Officers; and shall have the sole Power of Impeachment.’’ (U.S. Constitution, Article I, Section 2)
TT: ‘‘Atstovų rūmai renka savo spikerį ir kitus pareigūnus; tiktai jie turi išimtinę teisę pradėti apkaltos procesą.’’ (JAV Konstitucija, I Straipsnis, 2 Skyrius)
MT: ‘‘Atstovų Rūmai turi būti chuse jų Kalbėtoją ir kitus Pareigūnus; ir turėsiu vienintelę Apkaltos Valdžią.’’

The usage of archaic [10] words, such as chuse seen in the example (48), also causes misalignment in the output of MT. These kinds of words are not incorporated in present-day dictionaries; therefore, the machine does not recognize them and leaves them in their original form. Misinterpretation of collocations, also shown in the example above, occurs while translating the words Speaker and sole Power. Each of these words is a specific term used in a certain context, which is why their meanings should be checked more closely. However, the machine is not able to perform such a task, so it renders the word Speaker by its first meaning, kalbėtojas, and performs a word-by-word translation of the collocation sole Power, which is translated as vienintelė valdžia.

[10] Having the characteristics of the language of the past and surviving chiefly in specialized uses. Merriam-Webster Online [Online] Available from: http://www.merriam-webster.com/dictionary/archaic [Accessed on 10th April 2013].

A further example illustrates a part of the text which is translated incomprehensibly.

(49) ST: ‘‘<...>but the Party convicted shall nevertheless be liable and subjected to Indictment, Trial, Judgement and Punishment, according to Law.’’ (U.S. Constitution, Article I, Section 3)
TT: ‘‘Tačiau šitaip nuteistas asmuo taip pat gali būti pagal įstatymą traukiamas baudžiamojon atsakomybėn, teisiamas ir baudžiamas teismo nuosprendžiu.’’ (JAV Konstitucija, I Straipsnis, 3 Skyrius)
MT: ‘‘<...>bet Partija kaltino, vis dėlto būsiu atsakingas ir paveiksiu prie Kaltinamojo akto, Teismo, Nuosprendžio ir Bausmės, pagal Įstatymą.’’

As is the case in the belles-lettres and popular non-fiction texts, such a portion appears due to a number of grammatical and lexical mistakes occurring at the same time. As can be seen in the provided example, a short sentence contains the polysemous word Party, which is misinterpreted; this leads to the wrong use of the part of speech and, furthermore, to misalignment in verb form (cf. asmuo gali būti traukiamas vs. partija kaltino). In addition, the same example contains the wrong verb conjugation, which makes the sentence completely unclear.

What is more, one mistake present throughout all the examples picked out from the text of the official document is the use of capital letters.
All words written with capital letters in English keep their capital letters in Lithuanian, even though the human translation does not capitalize them. The translator probably thought these words did not have the same significance in Lithuanian as they do in English and therefore wrote them with minuscule letters. However, the machine is not able to make such assumptions; consequently, it transfers the words as they are written in the ST. Nevertheless, we cannot state that this is a crucial mistake, because for some people these words can have great significance, and writing them with capital letters seems completely natural.

Summing up, we could state that the quality of the translation of the official document is below average. Due to numerous specific terms and archaic words used in the text, the machine is incapable of performing a satisfactory translation. In addition, grammatical mistakes are also evident in the selected text. The most common ones are misalignment in the grammatical gender, number, case, verb forms, and verb conjugation. What is more, it should be noted that the text of the official document is the only one which contains errors concerning verb conjugation.

It is also important to note that other crucial mistakes can be detected in the examples provided in the chapters above. These errors are not discussed in detail due to the limited scope of the present paper and because they are not mentioned in the theoretical review presented in the current research. However, these mistakes are as important as the ones discussed here. They include:
1. translation of prepositions, which makes up a large percentage of all mistakes found in all styles and greatly affects the quality of MT output, and
2. word order in the translated text, which also contributes greatly to the comprehensibility of the translation.
Also the mistake found in the example (47), i.e.
the case when the word cluster The Times is left untranslated, deserves more attention, because it is unclear why a capital letter has such a big influence on the machine’s work.

4.7. Statistical analysis of data

After translating 5 different texts from English into Lithuanian and analyzing them thoroughly, 63 [11] instances of various mistakes were found. Below we present the statistical data on those mistakes. Firstly, the pie chart showing the percentage distribution (rounded to units) of all mistakes found in all texts is presented.

Figure 3. The percentage distribution of all mistakes found in the selected texts.

[11] The number of total mistakes differs from the scope of the paper, because some sentences contain more than one mistake.

Figure 3 shows that the most dominant mistakes throughout all the texts are the grammatical errors, which make up 43% of all mistakes. The second most common mistakes are the lexical ones, making up 40% of all misalignments. The systemic mistakes make up 7% of all mistakes found in the selected texts. The miscellaneous mistakes make up only 6%. From the data presented, we can assume that particular attention should be paid to the algorithms dealing with Lithuanian grammar.

A further pie chart illustrates the percentage distribution (rounded to units) of all grammatical mistakes found in the 5 different texts.

Figure 4. The percentage distribution of the grammatical mistakes found in the selected texts

Figure 4 reveals that the most common grammatical mistake is the part of speech misalignment, which makes up 26% of all grammatical errors. Mistakes in the grammatical case and in verb forms are the second most common types; each contributes 19% of all grammatical errors. Misalignment in the grammatical gender and number is distributed evenly: 15% each. The lowest percentage, only 4% each, belongs to the mistakes dealing with verb conjugation and negative verbs.
Statistical analysis of the collected data also presents the percentage distribution (rounded to units) of the lexical mistakes. This distribution is illustrated in the pie chart below.

Figure 5. The percentage distribution of the lexical mistakes found in the selected texts.

From Figure 5 it is evident that all texts contain a number of polysemous words and that they are a big obstacle for the machine. These kinds of mistakes make up 39% of all the lexical mistakes. The texts are also rich in collocations, which the system is likewise unable to translate. Mistranslated collocations make up 17% of all lexical mistakes. However, not many proper names and cultural realia are found in the selected texts, and these kinds of mistakes make up only 4% each. No hyphenated words or abbreviations are found in the selected texts. From the data presented we may assume that the dictionaries must be updated constantly and new concepts should be included in order to reduce the percentage of lexical mistakes.

Finally, a pie chart was prepared to present the percentage distribution of systemic mistakes. The percentages are also rounded to units.

Figure 6. The percentage distribution of the systemic mistakes found in the selected texts.

Figure 6 reveals that the systemic mistakes are spread quite evenly. The biggest share, 32%, belongs to words translated with a concept absent in the dictionary. The other mistakes, i.e. the omission of the verb, the usage of capital and minuscule letters, the omission of a word, and an extra word, make up 17% of all systemic mistakes each. What is more, not all systemic mistakes mentioned in the theoretical part were found in the selected texts. The mistakes absent from the texts are words translated into another language and ignored diacritics. Therefore, we can conclude that the machine itself works quite well and makes few systemic mistakes. However, an improvement is needed to avoid rendering words with concepts which are not suitable for them.
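The percentage distributions shown in Figures 3 to 6 are obtained by dividing the number of mistakes in each category by the total number of mistakes and rounding the result to units. The calculation can be sketched as follows. The function name percentage_distribution and the counts are hypothetical placeholders chosen only for illustration; they are not the data collected in this research. The sketch also shows that shares rounded to units do not necessarily add up to exactly 100.

```python
def percentage_distribution(counts):
    """Return each category's share of the total, rounded to whole percent.

    Note: Python's round() rounds exact halves to the nearest even number.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one mistake must be counted")
    return {name: round(100 * n / total) for name, n in counts.items()}


# Hypothetical counts, chosen only to illustrate the calculation.
example_counts = {"grammatical": 4, "lexical": 1, "systemic": 1}
shares = percentage_distribution(example_counts)

print(shares)                # shares per category in whole percent
print(sum(shares.values()))  # the rounded shares may not sum to 100
```

With these placeholder counts the rounded shares are 67, 17 and 17, which add up to 101 rather than 100; small deviations of this kind are an expected effect of rounding each share separately.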
CONCLUSIONS

The aim of this paper was to discuss the problems and issues the machine encounters while translating texts of various genres from English into Lithuanian. After gathering a great amount of theoretical material and analyzing 5 different texts, the following conclusions have been drawn:

1. The most general term used in Modern English for the process of translation done by the machine is machine translation (MT). Nevertheless, despite the idea of fully automatic translation, in almost every case the output of MT is edited by a human. What is more, there are different types of MT systems: those designed for only one particular pair of languages, i.e. bilingual, and those designed for a variety of language pairs, i.e. multilingual. Moreover, systems can also differ in their approaches. Three main approaches are usually distinguished: direct, transfer and interlingua.

2. After analyzing several classifications of the problems and issues of MT, it was found that the most common mistakes are grammatical, lexical and systemic. Linguists point out that grammatical mistakes are the chief mistakes in the output of MT.

3. Several different approaches towards text genres in the English language were reviewed. Mostly 5 different genres are distinguished, which are: 1) the language of belles-lettres; 2) the language of publicistic style; 3) the language of newspapers; 4) the language of scientific prose and 5) the language of official documents. Each of these styles bears a different amount of pragmatic information, ranging from an abundance of pragmatics to the least pragmatic texts respectively.

4. After analyzing 5 different English texts, 63 mistakes were found in 49 instances. When all errors were assembled into charts, it turned out that the biggest group of mistakes was that of the grammatical misalignments, which made up 43% of all errors. The second most common type of mistakes was the lexical one.
Such mistakes contributed to 40% of all misalignments. Systemic and miscellaneous errors made up 7% and 6% of all mistakes respectively.

5. Statistical analysis revealed that the wrong usage of the part of speech, grammatical case and verb forms were the most common mistakes in the category of grammatical errors. Such mistakes made up 26%, 19% and 19% of all grammatical misalignments respectively. The incorrect usage of grammatical number and gender was also common in the selected texts. Each of these errors made up 15% of all grammatical mistakes, whereas misalignments in verb conjugation and negative verbs contributed only 4% each. After analyzing the lexical mistakes, it was found that polysemy, collocations and untranslated words were the most common errors in this group. These mistakes contributed 39%, 17% and 16% of all lexical misalignments respectively. The other lexical mistakes were distributed in the following way: pronouns 12%, sayings 8%, proper names 4% and cultural realia 4%. Among the systemic mistakes, the most common errors were those dealing with words which were translated using concepts absent in the dictionary. Such misalignments made up 32% of all systemic mistakes. The remaining errors, i.e. omission of the verb, spelling of capital and minuscule letters, omission of a word, and an extra word, contributed 17% each.

6. What is more, a great number of other mistakes, such as the translation of prepositions and word order, were evident in the examples under analysis and greatly affected the quality of the MT output. These mistakes were not analyzed in detail in the current paper due to its limited scope and because they were excluded from the classification of MT problems on which this research was based.

All in all, it was clear that the best quality text was that of technical instruction, whereas the worst was the belles-lettres text.
Consequently, it can be stated that further improvement is necessary if we desire to have a machine capable of producing high-quality texts of all genres. This improvement should consist not only of updating the dictionaries installed in the system; further research should also be conducted to obtain a better understanding of how the machine itself works.

REFERENCES

1. (1960) The Book of Genesis. New York: Paulist Press.
2. Arnold, D., Balkan, L., Meijer, S., Humphreys, R. L., Sadler, L. (1994) Machine Translation: An Introductory Guide. London: Blackwells-NCC.
3. Calude, A. S. (2004) Machine Translation of Various Text Genres. [Online] Available from: http://www.calude.net/andreea/MT.pdf [Accessed on 11th December 2012].
4. Cvilikaitė, J. (2008) Leksinės Mašininio Vertimo Klaidos: Beekvivalenčių Žodžių Vertimas. Filologija, (13), 27-38.
5. Daudaravičius, V. (2006) Pradžia į begalybę. Mašininis vertimas ir lietuvių kalba. Darbai ir dienos, (45), 9-18.
6. DiMarco, Ch., Hirst, G. (1990) Accounting for Style in Machine Translation. [Online] Available from: http://mt-archive.info/TMI-1990-DiMarco.pdf [Accessed on 11th December 2012].
7. Galperin, I. R. (1981) Stylistics. 3rd edition. Moscow: Higher School.
8. Gudavičius, A. (2007) Gretinamoji Semantika. Šiauliai: Šiaulių Universiteto leidykla.
9. Hutchins, J. W., Somers, H. L. (1992) An Introduction to Machine Translation. [ebook] London: Academic Press. Available from: http://www.hutchinsweb.me.uk/IntroMT-TOC.htm [Accessed on 5th October 2012].
10. Jurafsky, D., Martin, J. H. (2006) Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Pearson Prentice Hall.
11. Manion, S. L. (2009) Fluency Enhancement. Applications to Machine Translation. MA thesis. Massey University.
12. Nida, E. A. (1964) Toward a Science of Translating: With Special Reference to Principles and Procedures Involved in Bible Translating. The Netherlands: Brill Archive.
13. Petkevičiūtė, I., Tamulynas, B. (2011) Computational Linguistics. Studies About Languages, (18), 38-45.
14. Petrulionė, L. (2012) Translation of Culture-Specific Items from English into Lithuanian: the Case of Joanne Harris’s Novels. Studies about Languages, (21), 43-49.
15. Proshina, Z. (2008) Theory of Translation (English and Russian). 3rd edition. Vladivostok: Far Eastern University Press.
16. Riedel, M., Schwarze, T. (2001) Machine Translation: History, Theory, Problems and Usage. In: Petkevičiūtė, I., Tamulynas, B. (2011) Computational Linguistics. Studies About Languages, (18), 38-45.
17. Robinson, D. (2003) Becoming a Translator: An Introduction to the Theory and Practice of Translation. London and New York: Routledge.
18. Valeika, L., Buitkienė, J. (2003) An Introductory Course in Theoretical English Grammar. Vilnius: Vilnius Pedagogical University.

WEBSITES

1. Deffinbaugh, B. (2004) The Unity of Unbelief (Genesis 11: 1-9). [Online] Available from: http://bible.org/seriespage/unity-unbelief-genesis-111-9 [Accessed 20th October 2012].
2. Gerber, L. (2009) Machine Translation: Ingredients for Productive and Stable MT Deployments – Part 2. [Online] Available from: http://www.translationdirectory.com/articles/article1945.php [Accessed on 26th October 2012].
3. Jones, P. S. (2013) What is a Senior Fellow? [Online] Available from: http://www.wisegeek.com/what-is-a-senior-fellow.htm [Accessed on 7th April 2013].
4. Karamanian, A. P. (2002) Translation and Culture. Translation Journal, [Online] 6 (1), Available from: http://www.bokorlang.com/journal/19culture2.htm [Accessed on 1st December 2012].
5. Lingytė, J. (2002) Tikriniai Sudėtiniai Įstaigų, Įmonių ir Organizacijų Pavadinimai. Lietuvių Kalbos Taisyklių Sąvadas [Online] Available from: http://siauliai.mok.lt/daukantas/darbai/Rasyba/Tikriniai_istaigu_pavadinimai.htm [Accessed on 25th April 2013].
6. Marshall, P. (2012) Proper Nouns. K12 Reader. [Online] Available from: http://www.k12reader.com/proper-nouns/ [Accessed on 25th April 2013].
7. Robin, A. (2009) Machine Translation – Overview. [Online] Available from: http://language.worldofcomputing.net/machine-translation/machine-translationoverview.html [Accessed on 3rd November 2012].
8. Robin, A. (2010) Machine Translation Process. [Online] Available from: http://language.worldofcomputing.net/machine-translation/machine-translationprocess.html [Accessed on 2nd November 2012].
9. Shapa, E. (2009) Translation Types. [Online] Available from: http://www.slideshare.net/elenashapa/translation-types [Accessed on 23rd October 2012].

DICTIONARIES

1. (2005) Longman Dictionary of Contemporary English. Harlow: Pearson Education.
2. (2007) Mokomasis Anglų Kalbos Žodynas. Vilnius: Alma Littera.
3. Baravykas, V. (1961) Anglų-Lietuvių Kalbų Žodynas. 2nd edition. Vilnius: Valstybinė Politinės ir Mokslinės Literatūros Leidykla.
4. Merriam-Webster Online [Online] Available from: http://www.merriam-webster.com/ [Accessed on 10th October 2012].
5. Oxford Dictionaries [Online] Available from: http://oxforddictionaries.com/ [Accessed on 10th October 2012].
6. Piesarskas, B. (2004) Dvitomis Anglų-Lietuvių Kalbų Žodynas. Vilnius: Alma Littera.

SOURCES

1. A System of Machine Translation from English to Lithuanian. (2008) [Online] Available from: http://vertimas.vdu.lt/twsas/
2. (2009) Jungtinių Amerikos Valstijų konstitucija. Viena: Usia Regional Program Office.
3. Constitution for the United States of America. [Online] Available from: http://www.constitution.org/constit_.htm [Accessed on 9th April 2013].
4. Kakaes, K. (Monday 1st April 2013) Warp Factor. Popular Science. [Online] Available from: http://www.popsci.com/technology/article/2013-03/warp-factor?single-page-view=true [Accessed on 5th April 2013].
5. McCurry, J. (Friday 5th April 2013) Kim Jong-un Has Made A Decent Fist Of Rattling The US. The Guardian. [Online] Available from: http://www.guardian.co.uk/world/2013/apr/05/kim-jong-un-rattles-us?INTCMP=SRCH [Accessed on 6th April 2013].
6. Soundmax. Radijas Su Žadintuvu: Naudojimo instrukcija.
7. Soundmax. Alarm Clock Radio: Instruction Manual.
8. Wilde, O. (2001) Doriano Grėjaus Portretas. Vilnius: Alma Littera.
9. Wilde, O. (2003) The Picture Of Dorian Gray. London: Collectors Library.

ANNEX 1