TRICKLET Translation Research in Corpora, Keystroke Logging and
Transcription
Changes of word class during the translation process
Insights from a combined analysis of keystroke logging and eye-tracking data

Tatiana Serbina, Sven Hintzen, Adjan Hansen-Ampah, Paula Niemietz, Stella Neumann
Translation in Transition, Germersheim, 29-30 January 2015
A HumTec Boost Fund Project funded by the Excellence Initiative of the German State and Federal Governments

Overview
  Translation shifts
  Grammatical complexity
  Aims of the study
  Methodology
  Product-based analysis: word class changes ST vs. TT
  Process-based analyses: eye-tracking analysis, word class changes in intermediate versions of translations

Empirical translation studies
  Product-based studies
    Method: corpus analyses
    Typical research questions: translation shifts or translation properties
  Process-based studies
    Method: translation experiments (frequently keystroke logging and eye-tracking)
    Typical research questions: translators' styles, levels of expertise and their effect on the translation process
  Our research
    Treating keystroke logs as a corpus (cf. e.g. Alves & Magalhães 2004, Alves & Vale 2009, 2011)
    A combination of product-based and process-based perspectives

Translation shifts
  Translation shifts: differences between source and target texts, e.g. part of speech change or change of semantic perspective (Čulo et al. 2008, Cyrus 2009, Halverson 2007)
  Changes of word class – transpositions (Vinay and Darbelnet 1958/1995, 36)
  EO: Crumpling a sheet of paper seems simple and doesn't require much effort, but explaining [why the crumpled ball [behaves]Verb the way it does]Clause is another matter entirely.
  GTrans: Ein Blatt Papier zusammen zu knüllen, erscheint einfach und erfordert wenig Anstrengung; [die [Verhaltensweise]Noun des Papierknäuels]NP zu erklären, ist dagegen eine völlig andere Sache.
  (KLTC PROBRAL GT7)

Word classes – contrastive difference
  German: nominal word classes 40.21%, verbal word classes 22.53%, ratio of 1.784
  English: nominal word classes 41.39%, verbal word classes 25.47%, ratio of 1.625
  More pronouns in German (8.45%) than in English (5.46%)
  German appears to be more nominal than English (Hansen-Schirra, Neumann, and Steiner 2012, 77-78)
  In the translations into German: more shifts from verbs to nouns and fewer shifts from nouns to verbs than in the opposite translation direction (Čulo et al. 2008, 50)

Grammatical complexity
  Word classes are associated with different levels of grammatical complexity
  Verbs – a possible indicator that the process is realized canonically through a clause
  Nominalizations – may result in a more condensed and thus grammatically more complex version (Halliday and Matthiessen 2014, 715)
  EO: [After the [crumpling]Noun of a sheet of thin aluminized Mylar]PP, the researchers placed it inside a cylinder.
  GTrans: [Nachdem sie ein dünnes Blatt aluminiumbeschichtetes Mylar [verknittert]Verb hatten]Clause, gaben sie es in einen Zylinder.
  (KLTC PROBRAL GT3)
  Translation: understanding the more complex units in the ST could involve paraphrasing them with grammatically simpler structures in the TT (Steiner 2001, Hansen-Schirra, Neumann, and Steiner 2012, 257-261)

Aims of the study
  Analysis of POS distribution and shifts between main word classes
  This study concentrates on nouns and verbs and investigates the cognitive effort during the translation process depending on
    the word class in the original
    the type of shift in the translation
  Analysis of intermediate versions in the keystroke logging data
  Assumptions:
    due to the contrastive difference, the translation direction English-German may be characterized through shifts from verbs to nouns
    due to the process of understanding related to the grammatically dense noun phrases in the ST, translations into German may be characterized through shifts from nouns to verbs (in the intermediate or final versions)

Our translation process data
  Translation experiment (Neumann et al. 2010)
  Translation direction: English-German
  Subjects
    8 professional translators
    8 physicists
  Material
    Two versions of an authentic text with ten integrated stimuli
    Abridged version of a popular-scientific text published in the journal Scientific American Online
  Apparatus
    Tobii 2150 remote eyetracker, software Tobii Studio 1.5
    Keystroke logging software Translog

Keystroke Logged Translation Corpus
  The corpus consists of:
    2 versions of the original (source texts)
    16 translations (target texts)
    16 log files (process texts)
  Corpus size (comprising STs and final TTs): approx. 3,650 words
  Corpus register: popular scientific writing
  (Serbina, Niemietz, and Neumann, forthcoming)

Methodology
  Automatic POS annotation of ST and TT using TreeTagger (Schmid 1994)
  Manual alignment between ST and TT words using the alignment tool (Hansen-Ampah 2014) based on the alignment guidelines (Samuelsson et al. 2010)
  Manual extraction of ST words belonging to main word classes and the aligned TT words
  Translation pairs selected for further analysis:
    Shifts between nominal and verbal variants
    Random samples of verbs and nouns that do not contain a shift in the final translation
  Keystroke data: identification of intermediate versions for the selected ST words
  Eye-tracking data: calculation of total fixation duration as a concrete indicator of cognitive effort for the selected ST words

POS distribution in ST and TT
                English STs         German TTs
  Nouns         32.75% (113/345)    27.00% (882/3267)
  Verbs         17.39% (60/345)     15.81% (511/3267)
  Adjectives    11.30% (39/345)     9.77% (319/3267)
  Adverbs       4.35% (15/345)      5.17% (169/3267)
  More nouns and verbs in English originals than in German translations
  Technical problem: compound nouns counted as several nouns in English but as one in German (Čulo et al. 2008, 49)

Types of word class shifts
                  Absolute numbers    % of all shifts
  VERB → NOUN     37                  27.41%
  ADJ → NOUN      23                  17.04%
  NOUN → VERB     16                  11.85%
  VERB → ADJ      14                  10.37%
  ADV → PP        14                  10.37%
  NOUN → ADJ      10                  7.41%
  VERB → ADV      9                   6.67%
  ADV → ADJ       6                   4.44%
  NOUN → ADV      4                   2.96%
  ADJ → ADV       2                   1.48%

Types of word class shifts II
              English ST verb    English ST noun
  No shift    350                776
  Shift       60                 30
  The translation direction English-German is characterized through shifts from verbs to other word classes, in particular to nouns

Cognitive effort I
  Does the translation of nouns require more cognitive effort than the translation of verbs?
  Cognitive effort is measured using log-transformed values for total fixation duration normalized per character
  Means: noun -1.9, verb -1.78
  t = -0.7, df = 94.85, p-value = 0.48

Cognitive effort II
  Means: n-v -2.05, v-n -1.65
  t = -1.57, df = 33.1, p-value = 0.13
  The slightly lower mean total fixation duration associated with shifts from nouns to verbs could potentially be explained by a reduction of grammatical complexity
  EO: Instead of collapsing to a final fixed size, the height of the crushed ball continued to decrease, even three weeks [after the [application]Noun of weight]NP.
  GTrans: Statt zu einer endgültigen festen Größe zusammenzufallen, nahm die Höhe des zusammengeknüllten Papierballs weiter ab, und zwar auch noch drei Wochen, [nachdem das Gewicht [angewendet]Verb wurde]Clause.
  (KLTC PROBRAL GT5)

Intermediate versions I
  Verb to Noun shifts:
    Verb → Verb → Noun (3x)
    Verb → Noun → Noun (1x)
  EO: Crumpling a sheet of paper seems simple and doesn't require much effort, but explaining [why the crumpled ball [behaves]Verb the way it does]Clause is another matter entirely.
  GTrans_i: Ein Blatt Papier zusammen zu knüllen, erscheint einfach und erfordert wenig Anstrengung, jedoch zu erklären, [warum der zeras Papierknäuel sich so [verhält]Verb, wie es das tut]Clause, ist eine völlig andere Sache.
  GTrans: Ein Blatt Papier zusammen zu knüllen, erscheint einfach und erfordert wenig Anstrengung; [die [Verhaltensweise]Noun des Papierknäuels]NP zu erklären, ist dagegen eine völlig andere Sache.
  (KLTC PROBRAL GT7)

Intermediate versions II
  Noun to Verb shifts:
    Noun → Noun → Verb (1x)
    Noun → Verb → (Verb) → Verb (2x)
  EO: [After the [crumpling]Noun of a sheet of thin aluminized Mylar]PP, the researchers placed it inside a cylinder.
  GTrans: [Nachdem sie ein dünnes Blatt aluminiumbeschichtetes Mylar [verkrumpelt]Verb hatten]Clause, gaben sie es in einen Zylinder.
  GTrans: [Nachdem sie ein dünnes Blatt aluminiumbeschichtetes Mylar [verknäuelt]Verb hatten]Clause, gaben sie es in einen Zylinder.
  GTrans: [Nachdem sie ein dünnes Blatt aluminiumbeschichtetes Mylar [verknittert]Verb hatten]Clause, gaben sie es in einen Zylinder.
  (KLTC PROBRAL GT3)

Conclusion & Outlook
  An application of a Keystroke Logged Translation Corpus to triangulate product and process data
  Shifts from verbs in the ST to nouns in the final TT are the most pervasive type of shift in the translation direction English-German
  Verbs shifted to nouns are fixated slightly longer than nouns shifted to verbs
  Both types of shifts can involve intermediate stages (either ST POS or TT POS, i.e. a synonym of the item in the final TT)
  Determining cognitive effort based not only on the shift in the final but also in the intermediate translation versions
    more data points required
  Taking into account further indicators of cognitive effort (further combining eye-tracking and keystroke logging data streams)

e-cosmos platform
  Creating a web-based platform for different scenarios of multimodal data integration and analysis
  Translation data
    Integration: text, keystroke logging and eye-tracking data
    Identification of word tokens in the intermediate translation versions and their linguistic annotation
    Query tool for quantitative analyses of product and process data (cf. Carl & Jakobsen 2009)
  e-cosmos platform: Linguistics, Computer Science, Information Management

Thank you for your attention!
  Tatiana Serbina
  RWTH Aachen University
  Templergraben 55
  52056 Aachen
  www.rwth-aachen.de

References
  Alves, Fabio, and Célia Magalhães. 2004. “Using small corpora to tap and map the process-product interface in translation.” TradTerm 10: 179–211.
  Alves, Fabio, and Daniel Couto Vale. 2009. “Probing the unit of translation in time: Aspects of the design and development of a web application for storing, annotating, and querying translation process data.” Across Languages and Cultures 10 (2): 251–73.
  Alves, Fabio, and Daniel Couto Vale. 2011. “On drafting and revision in translation: A corpus linguistics oriented analysis of translation process data.” Translation: Computation, Corpora, Cognition 1: 105–22.
  Carl, Michael, and Arnt Lykke Jakobsen. 2009. “Objectives for a query language for user-activity data.” In 6th International Natural Language Processing and Cognitive Science Workshop, Milano, Italy.
  Čulo, Oliver, Silvia Hansen-Schirra, Stella Neumann, and Mihaela Vela. 2008. “Empirical studies on language contrast using the English-German comparable and parallel CroCo corpus.” In Proceedings of the LREC 2008 Workshop “Building and Using Comparable Corpora”, 47–51. Marrakesh, Morocco. http://www.dfki.de/lt/publication_show.php?id=3991.
  Cyrus, Lea. 2009. “Old concepts, new ideas: Approaches to translation shifts.” MonTI. Monografías de Traducción e Interpretación 1: 87–106.
  Halliday, Michael A. K., and Christian M. I. M. Matthiessen. 2013. An introduction to functional grammar. London: Arnold.
  Halverson, Sandra L. 2007. “A cognitive linguistic approach to translation shifts.” In The study of language and translation, edited by Willy Vandeweghe, Sonia Vandepitte, and Marc van de Velde, 105–21. Amsterdam: Benjamins.
  Hansen-Schirra, Silvia, Stella Neumann, and Erich Steiner. 2012. Cross-linguistic corpora for the study of translations: Insights from the language pair English-German. Berlin: de Gruyter.
  Hansen-Schirra, Silvia, Stella Neumann, and Mihaela Vela. 2006. “Multi-dimensional annotation and alignment in an English-German translation corpus.” In Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing, EACL 2006, 35–42. Trento, Italy.
  Neumann, Stella, Adriana Pagano, Fabio Alves, Piritta Pyykkönen, and Igor da Silva. 2010. “Targeting (de-)metaphorization: Process-based insights.” The 22nd European Systemic Functional Linguistics Conference and Workshop, 9–11 July 2010, Koper, Slovenia.
  Serbina, Tatiana, Paula Niemietz, and Stella Neumann. Forthcoming. “Development of a keystroke logged translation corpus.” In Parallel corpora for translation studies, edited by Claudio Fantinuoli and Federico Zanettin. Language Science Press.
  Steiner, Erich. 2001. “Translations English-German: Investigating the relative importance of systemic contrasts and the text-type ‘translation’.” SPRIKreports: Reports of the Project Languages in Contrast 7: 1–49.
  Vinay, Jean-Paul, and Jean Darbelnet. 1995 (1958). Comparative stylistics of French and English: A methodology for translation. Translated and edited by Juan C. Sager and M.-J. Hamel. Amsterdam: Benjamins.
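
Worked example: shares and shift rates
  The figures on the two "Types of word class shifts" slides can be cross-checked from the absolute numbers alone. The sketch below is not part of the original study. It copies the counts from the slides, recomputes each shift type's share of all shifts, and derives the proportion of aligned ST verbs and ST nouns that end up in a different word class. The function names (shift_shares, shifts_from, shift_rate) are hypothetical helpers introduced only for this illustration.

```python
# Absolute numbers copied from the "Types of word class shifts" slide.
SHIFT_COUNTS = {
    ("VERB", "NOUN"): 37,
    ("ADJ", "NOUN"): 23,
    ("NOUN", "VERB"): 16,
    ("VERB", "ADJ"): 14,
    ("ADV", "PP"): 14,
    ("NOUN", "ADJ"): 10,
    ("VERB", "ADV"): 9,
    ("ADV", "ADJ"): 6,
    ("NOUN", "ADV"): 4,
    ("ADJ", "ADV"): 2,
}

# "No shift" counts copied from the "Types of word class shifts II" slide.
NO_SHIFT = {"VERB": 350, "NOUN": 776}


def shift_shares(counts):
    """Return each shift type's percentage of all observed shifts."""
    total = sum(counts.values())
    return {pair: 100.0 * n / total for pair, n in counts.items()}


def shifts_from(counts, source_class):
    """Count all shifts whose ST word class is `source_class`."""
    return sum(n for (src, _tgt), n in counts.items() if src == source_class)


def shift_rate(counts, no_shift, source_class):
    """Proportion of aligned ST items of one class that change word class."""
    shifted = shifts_from(counts, source_class)
    return shifted / (shifted + no_shift[source_class])


if __name__ == "__main__":
    for (src, tgt), share in shift_shares(SHIFT_COUNTS).items():
        print(f"{src} -> {tgt}: {share:.2f}%")
    for word_class in ("VERB", "NOUN"):
        rate = shift_rate(SHIFT_COUNTS, NO_SHIFT, word_class)
        print(f"ST {word_class}: {100 * rate:.1f}% shifted")
```

  Summing the rows by ST word class reproduces the "Shift" row of the second table (60 verbs, 30 nouns), which is a quick consistency check between the two slides.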
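
Worked example: comparing cognitive effort between two groups
  The "Cognitive effort" slides report means of log-transformed total fixation duration per character together with t, df and p values. The slides do not name the test. The non-integer degrees of freedom suggest Welch's unequal-variance t-test, so the sketch below assumes that. It also assumes natural logarithms and durations in seconds, neither of which is stated in the slides. The fixation values are invented purely to make the code runnable and are not data from the study.

```python
import math
from statistics import mean, variance


def log_tfd_per_char(total_fixation_s, n_chars):
    """Log-transformed total fixation duration normalized per character.

    Assumption: natural log and seconds; the slides specify neither.
    """
    return math.log(total_fixation_s / n_chars)


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


if __name__ == "__main__":
    # Hypothetical (total fixation in seconds, word length in characters).
    noun_to_verb = [(1.2, 11), (0.9, 9), (1.6, 11), (0.7, 8)]
    verb_to_noun = [(1.4, 7), (2.1, 8), (1.1, 6), (1.9, 7), (1.3, 8)]

    nv = [log_tfd_per_char(d, c) for d, c in noun_to_verb]
    vn = [log_tfd_per_char(d, c) for d, c in verb_to_noun]

    t, df = welch_t(nv, vn)
    print(f"mean n-v = {mean(nv):.2f}, mean v-n = {mean(vn):.2f}")
    print(f"t = {t:.2f}, df = {df:.2f}")
```

  The p-value is left out because the standard library has no t distribution. With SciPy installed, scipy.stats.ttest_ind(nv, vn, equal_var=False) runs the same Welch test and also returns the p-value.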