Lecture 2nd May - AG Wissensbasierte Systeme
Transcription
Collaborative Intelligence - Lecture SS 2016 - Prof. Dr. Andreas Dengel

Collaborative Intelligence focuses on the support of knowledge workers within socio-technical networks

Chapter 1: Search & Classification
Chapter 2: Attention-based Collaborative Intelligence
Chapter 3: Recommender Systems
Chapter 4: Proactive Multi-Channel Information Extraction
Chapter 5: Usability in Collaborative Systems
Chapter 6: Social Media Monitoring, Discovery & Forecast

This allows a general description of an index system

An index system I maps a set of terms T onto a set of documents D.
The assumption is a homogeneous representation via an inverted index:

$I: T \rightarrow D$

As a result of the mapping we expect a group of documents whose contents correlate with the terms of the query.
For the evaluation we may use the measures recall and precision.

Basic anatomy of a search system*

[Figure: search system diagram with the labels: query languages, query builders, metadata, controlled vocabulary, user, query, search interface, search engine, content, ranking and clustering algorithms, interface design, results]

Users ask, browse, or search again until they succeed or give up.

For evaluating the relevance of documents there is room for improvement, i.e.,
* Source: Rosenfeld & Morville, 2002

All methods aim at the modification of either index, term vector, or query

[Table: approaches marked against three targets (index term consolidation, term vector expressiveness, query support). Approaches listed: stemming, weighting the terms, thesaurus, generalization, specialization, removing stop words, distances, grammar and dictionary, quorum-level search, relevance feedback, contextual search]

Note that most of the methods are implicit.

How can we weight terms for the indexing of large document collections?

Remember: "It is here proposed that the frequency of word occurrence in an article furnishes a useful measurement of word significance" (Luhn, 1958)

Let's start with a first attempt to define the term weights of a document corpus

FIRST APPROACH
The weight $w_{t,d}$ of a term t for a document d is defined as the quotient of its frequency $tf_{t,d}$ in d and the number $n_D$ of documents in the entire document collection D in which the term occurs:

$w_{t,d} = \frac{tf_{t,d}}{n_D}$

The term weight $w_{t,d}$ is high if only a few articles contain the term but the term has a high frequency in document d.
Although there is an inherent danger in using such a weighting, we take it for a first experiment.
Note that, as a result, typical stop words have a low weight.

Assuming we have a large set of documents and the goal of indexing them: how do we proceed?

Example document d (corpus: 200 German daily newspaper articles):

Tendenz zur Lästigkeit
Die institutionelle Kompetenzschwäche Michael Naumanns und wie er sie nutzen kann. Was der Kulturbeauftragte darf und was nicht. Staatstragende Überlegungen von Elke Gurlit
Niemand wird bestreiten, dass Gerhard Schröder mit der Etablierung des Bundeskulturbeauftragten ein Coup gelungen ist. Nicht nur die staatliche Kulturpolitik, sondern auch das Räsonieren über Kultur hat in den letzten Monaten einen enormen Bedeutungszuwachs erfahren. Die tägliche Naumann-Meldung gehört zum unverzichtbaren Repertoire des Feuilletons. Man gewinnt fast den Eindruck, Michael Naumann handele als Beauftragter unterbeschäftigter Kulturredaktionen. Zum besseren Verständnis der Stellung des Kulturbeauftragten lohnt ein Blick auf das Beauftragtenwesen, das sich in der Bundesrepublik flächendeckend ausgebreitet hat. Wir kennen [...]

First of all, the index system has to count the occurrences of terms in the documents.

Ranking based on term weight, $w_{t,d} = tf_{t,d} / n_D$

Term                        tf_{t,d}   n_D    w_{t,d}
Bundeskulturbeauftragter    5          1      5.0000
Kulturbeauftragte           14         3      4.6667
Kompetenzschwäche           3          1      3.0000
Naumann                     11         5      2.2000
Kulturhoheit                2          1      2.0000
Bundesbeauftragte           2          1      2.0000
parlamentarisch             12         6      2.0000
Lästigkeit                  2          1      2.0000
Staatssekretär              9          5      1.8000
Kulturpolitik               5          3      1.6667
Beauftragter                5          4      1.2500
[...]                                                   (threshold)
alle                        1          191    0.0052
oder                        1          191    0.0052
nach                        1          197    0.0051
so                          1          195    0.0051
wie                         1          199    0.0050
dass                        1          200    0.0050

For all terms with $w_{t,d} > 1$ we can index the example text d (the article excerpt above).

But when is a term an index term?

As a matter of fact, the terms should be selected by their weights.
However, using a threshold is only one of the options we have.
The selection of index terms may be made:
- on the basis of their rank (e.g. the first five)
- by a comparison, e.g. of relative frequency: terms such as "Bundeskulturbeauftragter", "Kompetenzschwäche", or "Lästigkeit" have a relative frequency of 1/200 = 0.005

What do we have to consider when we deal with a corpus of documents?
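The first approach and the threshold-based selection of index terms can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the numbers are copied from a subset of the ranking table, while the function names `first_approach_weight` and `select_index_terms` are made up for this sketch.

```python
# Term statistics copied from the ranking table: term -> (tf in d, n_D).
TABLE = {
    "Bundeskulturbeauftragter": (5, 1),
    "Kulturbeauftragte": (14, 3),
    "Naumann": (11, 5),
    "Beauftragter": (5, 4),
    "oder": (1, 191),
    "dass": (1, 200),
}


def first_approach_weight(tf: int, n_docs_with_term: int) -> float:
    """w_{t,d} = tf_{t,d} / n_D (hypothetical helper name)."""
    return tf / n_docs_with_term


def select_index_terms(table, threshold=1.0):
    """Rank terms by weight and keep those strictly above the threshold."""
    weights = {t: first_approach_weight(tf, n) for t, (tf, n) in table.items()}
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return [(term, w) for term, w in ranked if w > threshold]


index_terms = select_index_terms(TABLE)
for term, weight in index_terms:
    print(f"{term:26s} {weight:.4f}")
```

With the threshold of 1 used on the slide, the stop words "oder" and "dass" drop out, while the four content terms remain.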
There are different ways to calculate weights, each of them causing different effects

SECOND APPROACH
The documents used to create an index system may vary in size.
If the weight is based only on the frequency of a term in the document, the evaluation will be distorted.
Problem with our first approach to weighting: in large documents index terms appear more often, thus these documents are overrated.
Hence, it is necessary to normalize the term frequency $tf_{t,d}$ of the term t with respect to the number $n_{t,d}$ of terms in the document d:

$ntf_{t,d} = \frac{tf_{t,d}}{n_{t,d}}$

Note: The measure is now related to a single document.

However, frequency alone does not help to find the best matches in a document collection

The normalized term frequencies $ntf_{t,d}$ are comparable for the same terms in different documents.
However, they do not help in distinguishing between more relevant and less relevant documents.
Note: Terms often correlate thematically with a document but are usually not specific enough for a precise differentiation of the document content.
Applying pure frequency measures as the sole basis for weights leads to a high recall and a low precision.
This is because the documents used to create an index system may be very specific (some terms are very frequent in a collection but concentrate on a few documents only).
Taking this into account may lead to an improvement of the precision.
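A small sketch of the normalization step may help: it shows that a term's raw frequency grows with document length while its normalized frequency stays comparable. The token lists and the function name `normalized_term_frequencies` are invented for illustration only.

```python
from collections import Counter


def normalized_term_frequencies(tokens):
    """ntf_{t,d} = tf_{t,d} / n_{t,d}, with n_{t,d} taken as the token count of d."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


# Toy documents (assumption): the long one is the short one repeated ten times.
short_doc = ["retrieval", "system", "retrieval"]
long_doc = short_doc * 10

ntf_short = normalized_term_frequencies(short_doc)
ntf_long = normalized_term_frequencies(long_doc)

print(Counter(short_doc)["retrieval"], Counter(long_doc)["retrieval"])  # raw counts differ
print(ntf_short["retrieval"], ntf_long["retrieval"])                    # normalized values agree
```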
Low frequency terms define and differentiate the document content more significantly than high frequency terms

With respect to the document content, an index term is the more distinctive the more often it appears within the document and the less often it appears in general (inverse document frequency).
The inverse document frequency is expressed by the logarithm of the quotient of the total number of documents N and the document frequency $df_{t,D}$ of the term t (the number of documents in the document set D in which t appears):

$idf_{t,D} = \log \frac{N}{df_{t,D}}$

The logarithm is used to compress the range of the relative frequency values.
The inverse document frequency increases accuracy by reassessing the normalized term frequency of a term.

There are two observations on which the determination of useful descriptors for text is based

Observation 1: The weight of an index term relates to a single document. The very best descriptors are frequent with respect to the total length of the document.
Observation 2: The weight of an index term relates to the document corpus. Good descriptors can be found only rarely in the document collection, which leads to a differentiation effect.

We may combine the normalized term frequency and the inverse document frequency into a new weighting.

This results in a final measurement

THIRD APPROACH
The weight $w_{t,d,D}$ of a term t is defined as the product of its normalized frequency $ntf_{t,d}$ in d and the inverse document frequency $idf_{t,D}$ based on a document collection D:

$w_{t,d,D} = ntf_{t,d} \cdot idf_{t,D}$

Example (document d is the article excerpt "Tendenz zur Lästigkeit" used before):
Assumption: $n_{t,d} = 1427$. "Staatssekretär" occurs 3 times in the document and in 5 different articles; the collection has 200 articles.
This results in a weight of

$w_{\text{Staatssekretär},d,D} = \frac{3}{1427} \cdot \log \frac{200}{5} \approx 0.0077$ (using the natural logarithm)

Term                        w_{t,d,D} = ntf_{t,d} * idf_{t,D}
Kulturbeauftragte           0.0412
parlamentarisch             0.0295
Naumann                     0.0284
Bundeskulturbeauftragter    0.0186
Kulturpolitik               0.0147
staatlich                   0.0144
Beauftragte                 0.0137
Kompetenzschwäche           0.0111
kulturell                   0.0100
institutionell              0.0098
:
Staatssekretär              0.0077

But how may we express the relevance of a term for a document?

The term relevance differs according to the term frequency

The relevance of a term describes its capability to find appropriate documents within a collection and to disregard inappropriate documents.
The higher the IDF value, the higher the relevance of a term for a document.
Based on this, the frequency of a term may have a distinct influence on the differentiation of documents.
Relevant terms are those in the medium frequency range.

How useful are terms to differentiate documents?

It is possible to determine the usefulness of a term for differentiation

Define a similarity coefficient SC over all document vectors of an index system:

$SC = \frac{1}{N(N-1)} \sum_{t=1}^{N} \sum_{k=1, k \neq t}^{N} S(d_t, d_k)$

Note: S refers to the similarity of two documents (more on this later).
After excluding the term i from the document vectors, another similarity coefficient $SC_{/i}$ may be calculated.
The differentiation coefficient $DC_i$ describes the change between the two similarity coefficients that is caused by the index term i:

$DC_i = SC - SC_{/i}$
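The TF-IDF weight and the differentiation coefficient can be illustrated together. The sketch below reproduces the "Staatssekretär" calculation with the natural logarithm and then computes SC and DC on three toy documents. Two things are assumptions of this sketch, not statements of the lecture: the cosine measure is used for S (the slides only say S is introduced later), and the toy term counts and all function names are invented for illustration.

```python
import math
from itertools import permutations


def tf_idf(tf, doc_len, n_docs, df):
    """w_{t,d,D} = (tf / doc_len) * ln(N / df)."""
    return (tf / doc_len) * math.log(n_docs / df)


def cosine(a, b):
    """Cosine similarity of two sparse vectors (dict: term -> weight)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_coefficient(docs):
    """SC: average similarity over all ordered pairs of distinct documents."""
    n = len(docs)
    return sum(cosine(a, b) for a, b in permutations(docs, 2)) / (n * (n - 1))


def differentiation_coefficient(docs, term):
    """DC_i = SC - SC_{/i}, following the sign convention of the formula above."""
    reduced = [{t: w for t, w in d.items() if t != term} for d in docs]
    return similarity_coefficient(docs) - similarity_coefficient(reduced)


# Worked example from the slide: 3 occurrences, 1427 terms, 200 articles, df = 5.
w_staatssekretaer = tf_idf(3, 1427, 200, 5)
print(round(w_staatssekretaer, 5))

# Toy documents with raw term counts (assumption, for illustration only).
docs = [
    {"data": 1, "multimedia": 1, "computer": 1, "retrieval": 2},
    {"similarity": 1, "multimedia": 1, "retrieval": 1},
    {"data": 2, "computer": 1},
]
dc_retrieval = differentiation_coefficient(docs, "retrieval")
print(dc_retrieval)
```

With this sign convention, a term whose removal makes the documents look more alike on average yields a negative value, and a term whose removal makes them look less alike yields a positive one.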
The differentiation coefficient can be incorporated into the weighting of the terms

Its particular impact on the index system can take effect through a novel weighting:

$w_{t,d,D} = ntf_{t,d} \cdot DC_t$ or $w_{t,d,D} = ntf_{t,d} \cdot idf_{t,D} \cdot DC_t$

In this case term weights are evaluated on the basis of the differentiation potential of the vectors.
The novel weighting allows the optimization of both recall and precision in the results.

So how does an index system work?

The work with an index system can be divided into two subtasks

The first subtask deals with building the inverted index:
1. Identification of every word in a set of documents
2. Elimination of all stop words, which have no value for the differentiation of documents
3. Generating the root word for each index term (stemming)
4. Calculating the weight of each root word
5. Representation of each document through all root words and the associated weights

The second subtask of an index system deals with query processing.
The precondition is an existing inverted index, which is employed by the individual processing steps:
1. Manual or automated formulation of a query
2. Generating the root words of the query terms (stemming)
3. Detecting the set of appropriate documents
4. Output of the result according to relevance

The basis for a reasonable term weighting is a homogeneous domain

Example

d1: Computer werden im Information Retrieval eingesetzt.
Es existieren Verfahren auf Computern für automatisches Retrieval. Moderne Computer ermöglichen ein effizientes Retrieval.

d2: Nutzer von Systemen zum Information Retrieval wurden befragt. Viele Nutzer waren mit der Funktionalität des Retrieval zufrieden. Die vorhandenen Systeme zum Information Retrieval genügen den Anforderungen der Nutzer. Es existieren eine Reihe von Systemen auf Computern.

d3: Die Entwicklung neuer Systeme für das Information Retrieval wird von vielen Nutzern begrüsst. Die Entwicklung zielt auf neue Methoden des Retrievals mit Computern ab. Systeme zum effizienten Retrieval nach Information befinden sich derzeit in der Entwicklung.

d4: Das Information Retrieval wird in Datenbanken durchgeführt. Verschiedene Datenbanken haben eine Oberfläche für den Nutzer, die ein zielgerichtetes Retrieval in Informationsräumen ermöglicht. Verschiedene Systeme für ein Retrieval in Datenbanken stehen derzeit dem Nutzer zur Verfügung.

d5: Die Entwicklung von Systemen zum Retrieval in Informationsräumen ist für viele Nutzer von Datenbanken interessant. In Informationsräumen kann man navigieren und somit das Information Retrieval unterstützen. Der Informationsraum wird dreidimensional auf Computern visualisiert.

Task: Automated indexing of those five documents

In the first step, high frequency words are eliminated. The stop-word list contains all kinds of words except for nouns (exceptions: procedure (Verfahren), requirement (Anforderung), row (Reihe), method (Methode), availability (Verfügung), functionality (Funktionalität), surface (Oberfläche)). Analyzed nouns are reduced to the nominative singular.
In the second step, the term weight is calculated on the basis of TF/IDF.
In the third step, the quality of an index term is evaluated by defining threshold values. In the given example we can do without an upper threshold value because of the size of the stop-word list.
(... optional homework exercise)

Are there alternative approaches to deal with the length of a document when evaluating the captured text?

Let us consider an example

Document d1: {data, multimedia, computer, retrieval, retrieval}
Document d2: {similarity, multimedia, retrieval}
Document d3: {data, computer, data}

        data   multimedia   retrieval   similarity   computer
d1 = {  1/5,   1/5,         2/5,        0,           1/5  }
d2 = {  0,     1/3,         1/3,        1/3,         0    }
d3 = {  2/3,   0,           0,          0,           1/3  }

Vectors describe multi-dimensional spaces for representing documents and queries

Based on their term weights, queries and documents are represented as vectors in the vector space, e.g.: q = {data, retrieval}

[Figure: vectors q, d1, d2, and d3 plotted against the axes $w_{data,d}$ and $w_{retrieval,d}$]

Interpretation: The smaller the angle between the query vector and the document vector, the higher the relevance of the document.
Note: A cosine value of zero means that the query and document vectors are orthogonal and have no match.
The so-called vector space model is the most commonly used model today.

The angle between two vectors is determined by the so-called cosine measure

Instead of the angle itself, the cosine of the angle is easier to calculate.
The similarity of a query and a document is expressed by the cosine measure, i.e. as a correlation of the vectors quantified by the cosine of the enclosed angle $\theta$.
Employing a scalar product, it is possible to calculate the length of a vector as well as the angle between two vectors.
While determining the angle, the length of a vector can be neglected.

$S(d_i, q) = \cos\theta = \frac{d_i \cdot q}{|d_i| \cdot |q|}$

Note: $|q| = \sqrt{\sum_{i=1}^{n} q_i^2}$

The vector space model reveals some obvious strengths and weaknesses

Positive:
- Its calculation is straightforward because of the simple model based on linear algebra.
- It considers term weights instead of Boolean values.
- It allows computing a continuous degree of similarity between queries and documents.
- It allows ranking documents according to their possible relevance.

Negative:
- Long documents are poorly represented because they have poor similarity values.
- Search keywords must precisely match document terms, i.e. there are problems with substrings (→ false positives) or with missing contextually relevant documents because of different terminology (→ false negatives).

Hence, not only a query but an entire document can function as a query

The similarity $S(d_1, d_2)$ of two documents $d_1$ and $d_2$ is calculated as the scalar product of the respective term weights $w_{t,d_1,D}$ and $w_{t,d_2,D}$ divided by the product of the lengths of the two vectors, which yields the cosine of the angle between them:

$S(d_1, d_2) = \frac{\sum_{i=1}^{n} w_{t_i,d_1} \cdot w_{t_i,d_2}}{\sqrt{\sum_{i=1}^{n} w_{t_i,d_1}^2} \cdot \sqrt{\sum_{i=1}^{n} w_{t_i,d_2}^2}}$

Interpretation of the method:
- Low frequency terms reduce the similarity of documents since they appear rarely.
- High frequency terms increase the similarity since they appear in many documents.
- Terms with a medium range frequency tend to organize a collection into clusters differentiated by content.
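The cosine measure can be tried out on the three example documents d1, d2, d3 and the query q = {data, retrieval} from the vector space slides. The document vectors are taken from the example; representing the query as a binary vector is an assumption of this sketch, as are the names used in the code.

```python
import math

# Term order: data, multimedia, retrieval, similarity, computer.
DOCS = {
    "d1": [1/5, 1/5, 2/5, 0, 1/5],
    "d2": [0, 1/3, 1/3, 1/3, 0],
    "d3": [2/3, 0, 0, 0, 1/3],
}
# Assumption: the query {data, retrieval} is encoded with binary weights.
QUERY = [1, 0, 1, 0, 0]


def cosine(a, b):
    """S(a, b) = (a . b) / (|a| * |b|); 0.0 if either vector has length zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


scores = {name: cosine(vec, QUERY) for name, vec in DOCS.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(name, round(scores[name], 4))
```

Under these assumptions d1 ranks first, since it contains both query terms, followed by d3 and d2. Because the cosine ignores vector length, the same function also serves for the document-to-document similarity $S(d_1, d_2)$.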
How are these concepts used in web search?

Web search poses unique challenges to information retrieval

Methods that work well on well-controlled document collections often do not produce good results on the web.
The vector space model returns the documents that most closely approximate the query.
On the web, this strategy often returns very short documents that consist of the query plus a few words (see weaknesses of the VSM).
Major challenges for web search engines include:
- Ultra-large scale (more than 60 billion web pages to index)
- High throughput (several hundred million queries per day)
- Extreme variation in document contents (language, vocabulary, format, ...)
- Companies deliberately manipulating search engines for profit
- Use of external meta-information

The original Google paper

PageRank: Bringing Order to the Web

PageRank makes use of the link structure of the web to calculate a quality ranking for each page, called the PageRank.
It is a probability distribution used to represent the likelihood that a person randomly clicking on links will arrive at any particular page.
It considers the importance of each page that casts a vote, as votes from some pages are considered to have greater value, thus giving the linked page greater value.

$PR(A) = (1 - d) + d \left( \sum_{T_i \in L(A)} \frac{PR(T_i)}{C(T_i)} \right)$

Note that PageRanks form a probability distribution over webpages, so the sum over all webpages will be 1 (this holds for the normalized variant in which the first term is $(1 - d)/N$ for N pages).

PR(A): PageRank of a webpage A
PR(T_i): PageRank of a webpage T_i pointing to A
C(T_i): number of outbound links of webpage T_i
L(A): set of webpages linking to A
d: damping factor, a value between 0 and 1; in the formula above it weights the link-following term, i.e. it is the probability that a random surfer keeps clicking on links, while 1 - d is the probability that the surfer stops clicking

How to apply these techniques for document classification?

Classification means the association of objects with known categories

Classification in information retrieval addresses the assignment of documents to document classes defined by content.
The purpose of document classification is the automated routing and the organized filing of documents.

[Figure: unknown documents pass through a categorization system into the classes K1, K2, K3]

How is the set of classes K = [K1, ..., Kn] formed?

The difference from the previous approach is given by an altered mapping function

The domain is formed by a set T of terms; the range of values is now reduced to a defined number of classes K.
Now the terms refer to a set of classes instead of documents as before:

$I': T \rightarrow K$ (mapping of an index system for classification)

The challenge is to extract all terms relevant for splitting the set of documents into classes.

If a document is to be assigned to a class, its term vector is given to the system I'', which identifies the respective class

$I'': d \rightarrow K$ (mapping of an index system for the classification of documents)

The difference in function causes an altered design of the inverted index.
The basis is a set of documents that has already been classified.
The term vectors created on this basis are no longer stored separately for each document but are combined into class vectors.

Index terms of document vectors belonging to a class are transferred into a term vector

Example: the combination of document vectors into class vectors

Document vectors of class 1:
      d1   d2   d3
t1    3    5    0
t2    1    0    4
t3    0    9    2

Document vectors of class N:
      dM-2   dM-1   dM
t1    0      0      2
t2    7      2      4
t3    0      3      4

Inverted index of the class vectors:
      k1   kN
t1    8    2
t2    5    13
t3    11   7

Note: During a query, the class of a document is determined by comparing the document vector with the class vectors.
In contrast to common index systems, which are only capable of evaluating already indexed documents, the examination of documents unknown to the system is now also possible.

Here again, similarity measures determine the assignment of documents to classes

The assignment of a document to a class is determined by the similarity of its document vector d and the respective class vector k.
The more similar those vectors are, the more likely it is that the document belongs to this class.
In order to determine the similarity of two vectors, different evaluation functions S are available, measuring the distances between vectors.
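The class-vector example can be reproduced in a short sketch: the document vectors of each class are summed into a class vector, and an unknown document is assigned to the class whose vector it resembles most. The document vectors are those of the example; using the cosine as the evaluation function S, the two unknown documents, and the function names are assumptions made for illustration.

```python
import math

# Document vectors per class as (t1, t2, t3), taken from the example tables.
CLASS_DOCS = {
    "k1": [[3, 1, 0], [5, 0, 9], [0, 4, 2]],   # d1, d2, d3
    "kN": [[0, 7, 0], [0, 2, 3], [2, 4, 4]],   # dM-2, dM-1, dM
}

# Combine the document vectors of each class into one class vector (term-wise sum).
class_vectors = {
    name: [sum(column) for column in zip(*docs)]
    for name, docs in CLASS_DOCS.items()
}


def cosine(a, b):
    """Cosine of the angle between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(doc_vector):
    """Return the name of the class whose vector is most similar to the document."""
    return max(class_vectors, key=lambda name: cosine(doc_vector, class_vectors[name]))


print(class_vectors)
print(classify([1, 6, 3]))   # hypothetical unknown document dominated by t2
print(classify([4, 1, 5]))   # hypothetical unknown document dominated by t1 and t3
```

The summed vectors match the inverted index of the class vectors shown in the example, (8, 5, 11) for k1 and (2, 13, 7) for kN.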
Normalizing the evaluation with respect to the size of the vectors ensures the comparability of the results

Normalization of the evaluation is done by determining the angle between two vectors in the vector space.
In order to do so, the vectors are considered parts of a high-dimensional vector space which is spanned by the set of all terms within the system.
The sine of the angle between document vector and class vector in this space determines their similarity:

$S(d, k) = 1 - \frac{\left( \sum_{t \in d} \sum_{s \in k} tf_{t,d} \cdot w_{s,k} \cdot a_{t,s} \right)^2}{\sum_{t \in d} tf_{t,d}^2 \cdot \sum_{s \in k} w_{s,k}^2}$

The disadvantage of this function is its complexity.

The categorization results can be expressed in different ways

- Singular categorization ("best match")
- Assorted categorization ("ranked match"): 1., 2., 3.
- Validated categorization ("measured match"): e.g. 0.6, 0.2, 0.1

The different ways of expressing the results complicate the comparability of the systems.

Some more sophisticated methods also include the consideration of word context!

Terminological conceptualization of information objects based on term similarity (syntactic distance)

REMEMBER
Term similarity is determined through a pre-computed statistical analysis of the strings captured in a document.
Association matrices quantify term correlations based on how frequently the terms co-occur (term co-occurrence matrix).
The correlation $c_{ij}$ between terms $t_i$ and $t_j$ is expressed by their joint occurrence within a document.
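A term co-occurrence matrix can be sketched as a simple pair count. The example below reuses the three small documents d1, d2, d3 from the vector space example; counting each pair once per document in which both terms occur is one possible reading of "joint occurrence within a document" and is an assumption of this sketch, as are the function names.

```python
from collections import Counter
from itertools import combinations

DOCS = [
    ["data", "multimedia", "computer", "retrieval", "retrieval"],   # d1
    ["similarity", "multimedia", "retrieval"],                      # d2
    ["data", "computer", "data"],                                   # d3
]


def cooccurrence(docs):
    """Count, for every unordered term pair, the documents containing both terms."""
    counts = Counter()
    for tokens in docs:
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    return counts


MATRIX = cooccurrence(DOCS)


def correlation(term_i, term_j):
    """c_ij: number of documents in which term_i and term_j occur together."""
    return MATRIX[tuple(sorted((term_i, term_j)))]


print(correlation("multimedia", "retrieval"))
print(correlation("data", "computer"))
print(correlation("data", "similarity"))
```

Storing only sorted pairs keeps the matrix symmetric by construction, so `correlation(a, b)` and `correlation(b, a)` return the same value.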
However, what is the situation at the workspace of a knowledge worker?

The perception of documents is subjective

In office environments people classify documents according to their preferences, i.e. they create folders as categories and name them.
Resulting taxonomies correspond to subjective concepts of the world but
- have no unique meaning
- do not allow perspective considerations
- are not integrative

[Figure: two folders, both labeled "vacation"]
[Figure: perspective questions: How? What? Where? Who?]
[Figure: file system, email folders, and bookmarks with entries such as Personal Favorites, Local Files, Keynotes, Inbox, Outbox, Contacts, Miles Zhang, Novotel Melbourne, Springer Homepage, HCM 2006, KSEM07, EDOC 2006; further labels: Hybrid Classification, Personal Memory, Semantic Desktop]

Lecture Knowledge Management, TU KL, Dengel
Using term co-occurrence, the documents stored in folders make it possible to dynamically learn content profiles

[Figure: content profile of the folder "vacation" with the terms clouds, water, wind, blue sky, wave, ocean, sand, snorkeling, palm tree, coral reef, shell, Barbados]

Profiles represent subjective perceptions of content and have a descriptive character for the content of a folder.
The profiles of newly created or incoming documents are compared with the folder profiles, and the documents are categorized accordingly.
Storing a document in a folder causes a dynamic adaptation of the profile over time.
Profiles can be used for categorizing new documents and for query expansion while searching.

Multi-dimensional management may generate "on-the-fly" metadata to describe context

The "Email" folder is just part of the view addressing document types.
The application of several taxonomies enables the multi-dimensional management of content.
So, for example, content can be administered by virtual folders on an overlying level, synchronously with Explorer, email system, and browser.
Content is simultaneously assigned to different folders (criteria or categories).
Folder names (categories) and views (super categories) provide metadata for indexing content.

Views may be individually centered or group centered

- Organizational view of the business structure
- Individual views (my personal views)
- View of running projects

How to evaluate document classification systems?

Confusion matrices or contingency charts make it possible to summarize correct and incorrect class assignments
A confusion matrix can be used for comparing desired results and classification results
Example: single binary classification

              Ground Truth
Classifier    K      ¬K
K             a      b
¬K            c      d

The matrix represents the four possibilities of the binary classification

Extension to n classes in order to evaluate classifiers with more than two classes
Since it may occur that a document cannot be assigned to any class, the new class R (reject) is introduced
Ai,j indicates the number of documents assigned to class Ki which, according to the ground-truth information, belong to class Kj

              Ground Truth
Classifier    K1     K2     K3     ...   Kn
K1            A1,1   A1,2   A1,3   ...   A1,n
K2            A2,1   A2,2   A2,3   ...   A2,n
K3            A3,1   A3,2   A3,3   ...   A3,n
...
Kn            An,1   An,2   An,3   ...   An,n
R             R1     R2     R3     ...   Rn

On the basis of clear yes/no decisions the results can be divided into four types
1. True Positives: the number Ai of documents that are correctly assigned to class Ki
2. False Positives: the number Bi of documents that are falsely assigned to class Ki
3. False Negatives: the number Ci of documents that do belong to class Ki but have not been assigned to it
4. True Negatives: the number Di of documents that do not belong to class Ki and have not been assigned to it
The values Ai, Bi, Ci, and Di correspond to the values a, b, c, d for binary classification, assuming that Ki is the class K and ¬K is the union of all classes Kj with i ≠ j and the reject class R

Contingency charts are well-suited data structures for dealing with classification results
Simple calculation of recall and precision:
$\mathrm{Recall}(K_i) = \frac{A_i}{A_i + C_i}$
$\mathrm{Precision}(K_i) = \frac{A_i}{A_i + B_i}$
Recall and precision values are often combined using an additional parameter β
Calculation of the so-called F-measure $F_\beta$:
$F_\beta = \frac{(\beta^2 + 1) \cdot \mathrm{Precision}(K_i) \cdot \mathrm{Recall}(K_i)}{\beta^2 \cdot \mathrm{Precision}(K_i) + \mathrm{Recall}(K_i)} = \frac{(\beta^2 + 1) \cdot A_i}{(\beta^2 + 1) \cdot A_i + B_i + \beta^2 \cdot C_i}$
β lies in [0, ∞[ and indicates the effect of the respective measure on the evaluation

Example for low recall and high precision
[Figure; German labels translated: classes, classification task, classified]

Example for high recall and high precision
[Figure]

In addition, cost-benefit measures can be applied
They make it possible to value each correct class assignment differently in order to give a higher priority to particular document types
They represent user preferences (e.g. the costs of a false classification in the incoming mail scenario)
In the most common setting the benefit ben is defined for a correct classification and the cost for an incorrect classification
$\text{c/b-Measure}_{ben,cost} = \sum_{i=1}^{n} \left( A_i \cdot ben - B_i \cdot cost \right)$
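
The measures above can be checked on a small example. The following is a minimal sketch, assuming the n-class matrix is stored as a list of rows (classifier decision) over columns (ground truth) plus a separate reject row; the function names and the toy numbers are made up for illustration and are not taken from the lecture.

```python
from typing import Dict, List


def per_class_counts(matrix: List[List[int]], reject: List[int]) -> List[Dict[str, int]]:
    """Derive A_i (true positives), B_i (false positives), C_i (false negatives).

    matrix[i][j] is A_{i,j}: documents assigned to class K_i whose ground truth is K_j.
    reject[j] is R_j: documents of ground-truth class K_j that were rejected.
    """
    n = len(matrix)
    counts = []
    for i in range(n):
        a = matrix[i][i]                                    # correctly assigned to K_i
        b = sum(matrix[i][j] for j in range(n) if j != i)   # assigned to K_i, belong elsewhere
        c = sum(matrix[k][i] for k in range(n) if k != i) + reject[i]  # belong to K_i, missed
        counts.append({"A": a, "B": b, "C": c})
    return counts


def recall(c: Dict[str, int]) -> float:
    total = c["A"] + c["C"]
    return c["A"] / total if total else 0.0


def precision(c: Dict[str, int]) -> float:
    total = c["A"] + c["B"]
    return c["A"] / total if total else 0.0


def f_measure(c: Dict[str, int], beta: float = 1.0) -> float:
    """F_beta written directly in terms of A_i, B_i and C_i."""
    b2 = beta * beta
    denom = (b2 + 1) * c["A"] + c["B"] + b2 * c["C"]
    return (b2 + 1) * c["A"] / denom if denom else 0.0


def cost_benefit(counts: List[Dict[str, int]], ben: float, cost: float) -> float:
    """c/b-measure: sum over all classes of A_i * ben - B_i * cost."""
    return sum(c["A"] * ben - c["B"] * cost for c in counts)


# Toy data (made up): three classes plus the reject row.
MATRIX = [[8, 1, 0],
          [2, 6, 1],
          [0, 1, 7]]
REJECT = [0, 2, 2]
COUNTS = per_class_counts(MATRIX, REJECT)
```

For class K1 of the toy data this gives A = 8, B = 1 and C = 2, so recall is 8/10 and precision is 8/9. Rejected documents count as false negatives of their ground-truth class, which is how the reject class R enters ¬K in the text above.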

Charts can be used to administer costs and benefits of many classes simultaneously
Differentiated evaluation of individual cases:

              Ground Truth
Classifier    K1        K2        K3        ...   Kn
K1            Ben1      Cost1,2   Cost1,3   ...   Cost1,n
K2            Cost2,1   Ben2      Cost2,3   ...   Cost2,n
K3            Cost3,1   Cost3,2   Ben3      ...   Cost3,n
...
Kn            Costn,1   Costn,2   Costn,3   ...   Benn

For every single class Ki the benefit Beni is defined for a correct classification
For every possible misclassification of a document of the ground-truth class Ki into the class Kj, i ≠ j, the costs Costi,j are defined

... some additional techniques?

User profiles are used for information filtering
[Figure: an information retrieval system uses the user profile to separate unknown documents into relevant documents and not relevant documents]

Passage retrieval solves a typical problem of information retrieval
Search for a number, a name, a result, a place, etc.
- How many employees are working at Facebook?
- When is the next plenary meeting of Telekom AG?
- Who is the chairman of the board of Deutsche Bahn?
- Where ... ?
- What is a ... ?
While conventional document retrieval delivers a whole document as an answer to a query, passage retrieval identifies relevant sentences or passages in the document collection
Realization for example with the aid of the vector space model
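
One simple way to picture passage retrieval is a window that slides over the tokens of a document and looks for the region where weighted query terms are densest. The sketch below assumes a rectangular window (every position inside the window counts equally) and made-up term weights; the function name `best_passage`, the toy sentence and the weights are hypothetical.

```python
from typing import Dict, List, Tuple


def best_passage(tokens: List[str], query_weights: Dict[str, float],
                 window: int) -> Tuple[int, int, float]:
    """Return (start, end, score) of the densest window of `window` tokens.

    The score of a window is the sum of the weights of the query terms it
    contains. Ties are resolved in favour of the earliest window.
    """
    if not tokens:
        return (0, 0, 0.0)
    window = min(window, len(tokens))
    weights = [query_weights.get(t.lower(), 0.0) for t in tokens]
    score = sum(weights[:window])
    best = (0, window, score)
    for start in range(1, len(tokens) - window + 1):
        # Slide by one token: add the entering weight, drop the leaving one.
        score += weights[start + window - 1] - weights[start - 1]
        if score > best[2]:
            best = (start, start + window, score)
    return best


# Toy document and weights (made up for illustration).
TEXT = ("The company was founded in 1999 . Today about 500 employees "
        "are working at the company .")
TOKENS = TEXT.split()
QUERY_WEIGHTS = {"employees": 2.0, "working": 1.0, "company": 1.0}

START, END, SCORE = best_passage(TOKENS, QUERY_WEIGHTS, window=7)
PASSAGE = " ".join(TOKENS[START:END])
```

With a window of seven tokens the densest region is the passage that also contains the number asked for. Other window functions, for example ones that weight the centre of the window more strongly, would change only the scoring line.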

Passage retrieval aims at providing the most relevant passages from a document collection
[Figure: weights of terms in the document; window function f(x); W: size of the window; density distribution; maximal value; answer passage]

Chapter 2
Attention-based Collaborative Intelligence

Augmenting our mind strongly depends on intrinsic motivation and on cognitive attention to understand and learn

Eye tracking is one option for getting better insights into contextual behavior
In many cases, eye tracking experiments are still done using a special head-mounted device
Sources: www.cure.at, www.egr.vcu.edu

The new generation of eye trackers is lightweight and unobtrusive

Up to now, gaze data from eye trackers has been widely used for usability applications or in "passive" tools for behavior analysis
Sources: www.andreas.com, www.agencytimes.net, www.stz-medienforschung.de, www.usibilty.at

However, what is the difficulty in attention recognition?

Your Daily Illusion!

Eye Movements: saccadic suppression!

So how can we understand more about textual relevance in context?
Source: www.childrebookblock.com

Reading is based on ocular movements divided into fixations and saccades
Some fundamental facts:
[Sample text shown on the slide:] When a person is reading a sentence silently, the eye movements show that not every word is fixated. Every once in a while, a regression (an eye movement that goes back in the text) is made to re-examine a word that may have not been fully understood the first time. This only happens with about 10% of the fixations depending on how difficult the text is. The more difficult the higher the likelihood that regressions are made.
- Fixations occur when the eye gaze pauses at a certain position, normally lasting between 200 and 400 ms
- Saccades are the jumps of the gaze between fixations, taking 10-20 ms
- Reading does not happen in exactly linear saccadic movements; sometimes we need control fixations (regressions) that move against the text direction
- The regression rate depends on the subjective difficulty of a text

Some fundamental facts:
- The distance between two fixation points is about 8 characters
- The perceptual span, i.e. the size of the visual window in which reading occurs, is asymmetric (Moving Window Technique)
- Humans also read some text to the left and right of a fixation point
- For reading a text: 3 to 4 letters to the left and up to 15 letters to the right (alignment depends on the language)
- The asymmetry is caused by the fact that the information to be captured lies in the region right of the fixation point (for readers of Hebrew it is vice versa)

While reading we follow the text in order to understand the captured message
Some fundamental facts:
[Figure: horizontal eye position (left to right) over time while reading a book; time scale: 5 sec.]

The perceptual sensitivity during and after fixation is different
Some fundamental facts:
- There is a kind of blindness while moving the eye (which is called saccadic suppression)
- Almost all information from the eye is made available during a fixation
- Rare and unexpected words have to be fixated longer than common and known terms (Spillover Effect)
- Humans read with expectations, i.e. expected words are often not fixated anymore
[Figure: perceptual sensitivity (50% to 100%) over time in ms, from 200 ms before the saccade starts to 300 ms after the saccade starts]
Results from two studies*
* Source: Goldstein, 2002

However, reading behavior differs and may be categorized depending on purpose, form and reading process
Reading categorization:
- Form: silent reading, oral reading
- Purpose: recreatory, motivated
- Psychological process: observatory (noting), assimilative (understanding/remembering), reflective (evaluating), creative (employing)
Distinguishing all four modes via eye tracking is very hard since there are different reader types
Skimming is a quick movement of the eyes across the page, picking up the occasional observation or idea; it is used when the assignment is not too important (content is not (that) relevant)
Reading has to process every sentence and then tries to make use of the salient arguments; it is used when we know we will later profit from the material (content is relevant)

We use eye tracking for distinguishing reading and skimming behavior!

For reading mode detection we build on an eye tracker from Tobii with integrated sensing
Setting:
- The eye tracker emits infrared light (invisible to humans)
- The eye ground reflects the light back to the eye tracker
- A camera sensitive to infrared light detects these reflections and computes the focus point of attention
- Unobtrusive, relatively precise (1° of visual angle)
- Can be adjusted in a short time and even works for users with glasses
Sources: www.tobii.com, www.dfki.de

For reading mode detection we use various filters along a processing chain
[Figure: processing chain]
- Eye Tracker
- Point of Regard Computation Filter: computation of the point of regard; memory for points of regard
- Fixation Detection Filter: fixation clustering; decision "Fixation ended?" (yes/no)
- Saccade Classification Filter: classification of a saccade based on the last two fixations; memory for saccade features
- Reading and Skimming Detection Filter: reading and skimming differentiation; decision "Characteristic scanpath detected?" (yes/no)
- Line-Averaging Filter: horizontal averaging
- Line Matching Filter: OCR-based bounding box detection and line matching

Noisy gaze data from the eye tracker
[Figure: the processing chain (Eye Tracker, Point of Regard Computation Filter, Fixation Detection Filter, Saccade Classification Filter, Reading and Skimming Detection Filter, Line-Averaging Filter, Line Matching Filter)]

For recognizing fixations we apply a so-called dispersion approach
Gaze locations produced by the eye tracker (50 Hz, real time; shown in slow motion)
[Figure: numbered gaze locations 1 to 14 with a 30 pixel circle and a 50 pixel circle; labels: Outliers, Fixation with Drift, New Fixation]
- A new fixation is detected if 4 successive nearby gaze locations are accumulated
- Gaze points are considered nearby when they fit together in a circle of 1° diameter (30 pixel)
- The circle is grown to 50 pixel to be robust against drifting
- Four gaze locations correspond to a duration between 80 and 100 ms, which is the minimum fixation duration according to the literature
- Fixations are determined based on gaze location and gaze order
G. Buscher, A. Dengel and L. van Elst, High Level Eye Movement Measures for Relevance Assessments of Information Items, Proceedings CHI 2008, Florence, Italy (Apr. 2008).
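
The dispersion rules above can be sketched in a few lines. This is a simplified illustration, not the implementation from the cited work: the "fits into a circle" test is approximated by the maximal pairwise distance, a running fixation is extended as long as new points stay within the grown circle around its centroid, and no special outlier tolerance is modelled. The function names and the gaze samples are made up.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def centroid(points: List[Point]) -> Point:
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def max_pairwise_distance(points: List[Point]) -> float:
    return max(math.dist(p, q) for p in points for q in points)


def detect_fixations(gaze: List[Point], min_points: int = 4,
                     start_diameter: float = 30.0,
                     grown_diameter: float = 50.0) -> List[Tuple[int, int, Point]]:
    """Return fixations as (first index, last index, centroid).

    A fixation starts when `min_points` successive gaze locations lie close
    together (approximated: maximal pairwise distance <= start_diameter).
    It is extended while the next point stays within grown_diameter / 2 of
    the current centroid, which tolerates slow drift.
    """
    fixations = []
    i, n = 0, len(gaze)
    while i + min_points <= n:
        if max_pairwise_distance(gaze[i:i + min_points]) <= start_diameter:
            j = i + min_points
            while j < n and math.dist(gaze[j], centroid(gaze[i:j])) <= grown_diameter / 2:
                j += 1
            fixations.append((i, j - 1, centroid(gaze[i:j])))
            i = j
        else:
            i += 1
    return fixations


# Made-up gaze samples in pixel coordinates: two clusters far apart.
GAZE = [(100, 100), (102, 101), (99, 103), (101, 98), (105, 104),
        (300, 200), (302, 201), (301, 199), (299, 202)]
FIXATIONS = detect_fixations(GAZE)
```

On the toy samples the first five points form one fixation (the fifth point drifts slightly but stays inside the grown circle) and the last four points form a second one.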

Noisy gaze data from the eye tracker
Fixation detection and saccade classification
[Figure: the processing chain (Eye Tracker, Point of Regard Computation Filter, Fixation Detection Filter, Saccade Classification Filter, Reading and Skimming Detection Filter, Line-Averaging Filter, Line Matching Filter)]

In use cases we were able to distinguish different saccade features that may be used to characterize the reading mode
Move forward (distinguished via the distance between fixation points along the text orientation, in number of characters)
- Read forward movements
- Skim forward movements
- Long skim jumps
Move backward (distinguished via the distance between fixation points against the text orientation, in number of characters)
- Short regressions
- Long regressions
- Reset jump: go to new line
Move elsewhere
- Unrelated move: all other movements

Based on this observation saccade features may be classified and appropriate scores may be associated
Note that these values may differ depending on the type of reader

Reading Detection - Example
Tobii 1750 eye tracker
Plausibility of reading: sr = 62
Plausibility of skimming: ss = 51
Reading behavior detected
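
The idea of classifying saccades and accumulating plausibility scores for reading and skimming can be sketched as follows. The distance thresholds and the score values in this sketch are placeholders chosen for illustration; the values used in the lecture are not given in the text, and the names `classify_saccade`, `SCORES` and `detect_mode` are hypothetical.

```python
from typing import Dict, List, Tuple

# Placeholder scores per saccade feature: (reading score, skimming score).
SCORES: Dict[str, Tuple[int, int]] = {
    "read_forward": (10, 5),
    "skim_forward": (5, 10),
    "long_skim_jump": (-5, 8),
    "short_regression": (-8, -8),
    "long_regression": (-5, -3),
    "reset_jump": (5, 5),
    "unrelated_move": (0, 0),
}


def classify_saccade(dx_chars: float, new_line: bool = False) -> str:
    """Classify a saccade by its horizontal length in characters.

    Positive values follow the text orientation, negative values go against
    it. All thresholds are assumptions made for this sketch.
    """
    if new_line:
        return "reset_jump"
    if 0 < dx_chars <= 11:
        return "read_forward"
    if 11 < dx_chars <= 21:
        return "skim_forward"
    if dx_chars > 21:
        return "long_skim_jump"
    if -6 <= dx_chars < 0:
        return "short_regression"
    if dx_chars < -6:
        return "long_regression"
    return "unrelated_move"


def detect_mode(saccades: List[Tuple[float, bool]]) -> Tuple[str, int, int]:
    """Sum both scores over a saccade sequence and pick the larger one."""
    s_r = s_s = 0
    for dx_chars, new_line in saccades:
        reading, skimming = SCORES[classify_saccade(dx_chars, new_line)]
        s_r += reading
        s_s += skimming
    return ("reading" if s_r > s_s else "skimming", s_r, s_s)


# Made-up saccade sequence: mostly short forward moves, one short regression.
SACCADES = [(8, False), (9, False), (7, False), (-3, False),
            (10, False), (15, False), (8, False)]
MODE, S_R, S_S = detect_mode(SACCADES)
```

The decision rule mirrors the example above, where reading is detected because the plausibility of reading exceeds the plausibility of skimming. As the slide notes, such values may differ per reader type, so in practice they would have to be calibrated.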

Noisy gaze data from the eye tracker
Fixation detection and saccade classification
Reading identification and saccade sequence alignment
[Figure: the processing chain (Eye Tracker, Point of Regard Computation Filter, Fixation Detection Filter, Saccade Classification Filter, Reading and Skimming Detection Filter, Line-Averaging Filter, Line Matching Filter)]

Results can be represented by gaze-based document metadata*
Line matching by mapping with line segmentation results (plus OCR)
Store reading information as document annotations in a semantic Wiki
[Sample text shown on the slide:] An enormous amount of research has been done during last one hundred years concerning eye movements while reading. When reading silently, as summed up in [Rayner 1998], the eye shows a very characteristic behavior composed of fixations and saccades. A fixation is a time of about 250ms on average when the eye is steadily gazing at one point. A saccade is a rapid, ballistic eye movement from one fixation to the next. The mean left-to-right saccade size is 7-9 letter spaces. It depends on the font size and is relatively invariant concerning the distance between the eyes and the text.
[Annotation (Read) attached to the sample text:]
- author: Georg
- start date: 07.12.2009 10:46:08
- end date: 07.12.2009 10:46:12
- length: 226 chars
- mean fixation duration: 217ms
- mean saccade length: 9.4 chars
- regression ratio: 13.9%
- task: write report
* G. Buscher, A. Dengel, L. van Elst, F. Mittag, Generating and Using Gaze-Based Document Annotations, in Proceedings CHI 2008, Florence, Italy (Apr. 2008).
G. Buscher, A. Dengel and L. van Elst, Query Expansion Using Gaze-Based Feedback on the Subdocument Level, Proceedings SIGIR '08, 31st Annual Int'l ACM SIGIR Conference, Singapore (July 2008), accepted for publication

The various measures may be used to determine the relevance of read text!

Not all of the features are valid for defining a measure of text relevance
- Fixation duration
- Fixation count
- Average saccade length
- Regression rate
- Viewing time
- Reading vs. skimming behavior
- Length of coherently read text
[The slide marks the features with + and - signs; the assignment of the marks is garbled in the transcript]

There is high variability of most eye movement measures both within as well as between readers
Since it is difficult to build methods estimating the relevance of read text based on absolute values, gaze measures are individually personalized
Procedure:
- Determine the distribution of a measure for an individual user by analyzing all of her/his recorded eye movement data during reading (example: forward saccade lengths)
- Compute upper/lower whiskers (limits) concerning the measure's value distribution, e.g. lower whisker = max(min, lq - 1.5 * iqr)
- Normalize absolute values of the eye movement measures with respect to the individual whisker intervals
Upper and lower whiskers define a user-specific interval where outliers are excluded
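
The personalization procedure can be illustrated with a short sketch. It assumes that lq and iqr denote the lower quartile and the interquartile range, that the upper whisker is defined symmetrically as min(max, uq + 1.5 * iqr), and that normalized values are clipped to the interval [0, 1]; the symmetric definition and the clipping are assumptions of this sketch, as are the function names and the sample data.

```python
import statistics
from typing import List, Tuple


def whiskers(values: List[float]) -> Tuple[float, float]:
    """Lower and upper whisker of one user's distribution of a gaze measure."""
    lq, _, uq = statistics.quantiles(values, n=4, method="inclusive")
    iqr = uq - lq
    lower = max(min(values), lq - 1.5 * iqr)
    upper = min(max(values), uq + 1.5 * iqr)   # assumed symmetric definition
    return lower, upper


def normalize(value: float, lower: float, upper: float) -> float:
    """Map a raw measure onto the user's whisker interval, clipped to [0, 1]."""
    if upper == lower:
        return 0.0
    scaled = (value - lower) / (upper - lower)
    return min(1.0, max(0.0, scaled))


# Made-up forward saccade lengths (in characters) of one user, with one outlier.
SACCADE_LENGTHS = [6, 7, 8, 8, 9, 9, 10, 11, 30]
LOWER, UPPER = whiskers(SACCADE_LENGTHS)
```

For the sample data the quartiles are 8 and 10, so the whisker interval is [6, 13]. The outlier of 30 characters no longer stretches the scale and is simply mapped to 1.0.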

In an experiment we could show that the more intensively a given text is read, the more useful it is for the reader
[Figure: percentage of read text for documents, broken down by relevance judgments]

Based on the relevance measure we may attach so-called attention paths to best practices
Assuming we are in the task context of a knowledge worker:
[Figure: Task X with Document 1, Document 2, Document 3 and Document 4, connected by "next" and "prev" links]

Using eye tracking for text-based information retrieval

A user creates a query and has to view and filter the result list in order to find the relevant documents
There are various options to allow for the consideration of user-centric (subjective) retrieval
[Figure labels: Objective Method: Query, Retrieval Engine, Retrieval Knowledge, Sim. Measure & Background, Document Corpus, Non-personalized Data, Ranked Result List; User-Centered Method: Query Expansion / Reformulation Engine, Re-ranking Mechanism, "Island" Documents, Personalized Documents, Annotated Documents, Personalized Data, User Model, Implicit Feedback, Implicit/Explicit Relevance Feedback, User Context etc., Eye Tracker, User Observation]

Reading and attention data allow the implementation of implicit relevance feedback
Explicit relevance feedback based on Rocchio is effective but requires extra effort
Implicit feedback is an alternative for automatic recognition of relevance
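
For orientation, the classical Rocchio update moves the query vector towards the centroid of the relevant documents and away from the centroid of the non-relevant ones. The sketch below uses sparse term-weight dictionaries; dropping non-positive weights is a common convention and an assumption here, and the default parameter values are frequently quoted textbook defaults, not values from the lecture. In an implicit-feedback setting the "relevant" vectors could come from passages that were detected as read, which is again only an illustrative assumption.

```python
from typing import Dict, List

Vector = Dict[str, float]


def rocchio(query: Vector, relevant: List[Vector], non_relevant: List[Vector],
            alpha: float = 1.0, beta: float = 0.75, gamma: float = 0.15) -> Vector:
    """Return alpha*q + beta*centroid(relevant) - gamma*centroid(non_relevant)."""
    updated: Vector = {term: alpha * w for term, w in query.items()}
    for docs, factor in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, w in doc.items():
                updated[term] = updated.get(term, 0.0) + factor * w / len(docs)
    # Keep only terms with positive weight (assumed convention).
    return {term: w for term, w in updated.items() if w > 0}


# Made-up example vectors.
QUERY = {"eye": 1.0}
RELEVANT = [{"eye": 1.0, "tracking": 1.0}, {"gaze": 2.0}]
NON_RELEVANT = [{"car": 1.0}]
EXPANDED = rocchio(QUERY, RELEVANT, NON_RELEVANT, alpha=1.0, beta=0.5, gamma=0.25)
```

In the example the query gains the terms "tracking" and "gaze" from the relevant vectors, while "car" receives a negative weight and is dropped. The expanded query can then be submitted to the retrieval engine in place of the original one.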

Gaze-based methods provide a remarkable improvement of 20% information gain compared to classical approaches
[Figure labels: On basis of content; On basis of reading data; Personalization Black Box; Query Expansion; Individualized Result List; seen; seen and read]
G. Buscher, A. Dengel and L. van Elst, Query Expansion Using Gaze-Based Feedback on the Subdocument Level, Proceedings SIGIR '08, 31st Annual Int'l ACM SIGIR Conference, Singapore (July 2008).

This approach may also be used for improving classifier learning

Application Example: Document Classification
- Classification is done by manually moving documents into a folder
- Classification is based on subjective perception of content
- For making her/his decision regarding categorization, a user ...
  ... reads some passages
  ... skims over others
  ... skips parts that are not interesting or relevant
  according to her/his familiarity with the sources, her/his interest, ...
- All documents within one folder contain terminology which is characteristic for a class
- Only consider those parts of the document for classifier learning which are read by the user
[Figure: example "What if Britain would leave the EU" with the terms Europe, UK, IWF, London, Risk, Johnson, Brussels, Cameron, Brexit, Euro, Jobs, Independence, Referendum; label: 45% improvement]
G. Buscher and A. Dengel, Attention-Based Document Classifier Learning, Proceedings DAS08, Int'l IAPR Workshop on Document Analysis Systems, Nara, Japan, IEEE Comp. Society Press (Sep. 2008), pp. 87-94

... but there are also new applications in infotainment

Imagine there were input devices which could allow text to know if and how it is read
- Text 2.0 is an innovative interaction mode between humans and computers
- It is built on the idea that the computer knows which text line, sentence, or word a person is looking at
- It supplements the text with hidden "attentive mark-ups" that are activated during reading, i.e. when a specific reading mode is recognized
- It reveals new business options, e.g. in online marketing and advertisement

Text 2.0 provides a simple-to-use framework for constructing gaze-attentive and -responsive applications
[Figure: framework components: reading the text via an eye tracker; data filtering and normalization; data clustering for fixation recognition; determination of saccade lengths; matching with hidden mark-ups; effect generation]
Real-time export of fixation recognition data and reading mode into HTML

Text 2.0 is one of 32 recent megatrends selected by TrendOne

We employed the idea for mobile eye-trackers with head-mounted display
T. Toyama, D. Sonntag, A. Dengel, T. Matsuda, M. Iwamura, and K. Kise, A Mixed Reality Head-Mounted Text Translation System Using Eye Gaze Input, Proceedings IUI 2014, 19th Int'l Conf. on Intelligent User Interfaces, Haifa, Israel (Feb. 2014), pp. 329-334.
T. Toyama, A. Dengel, W. Suzuki, and K. Kise, User Attention Oriented Augmented Reality on Documents Using a See-through HMD and a Wearable Eye Tracker, submitted to ISMAR 2013, Symp. on Mixed and Augmented Reality, Adelaide, Australia (Oct. 2013), pp. 299-300.

The relationship between authors/readers, the real world, and a document can be described via the Semiotic Triangle
- Imagination: reading a document puts contents together and creates very individual imaginations ("what I have in mind")
- Reality: our environment consists of items, facts and events that are "real" and determine our lives ("what is going on")
- Document: to express thoughts, we use symbols or characters that may be understood by others ("what is couched or explicated", e.g. via a document)

Why do some documents work better than others at holding our attention and conveying information?
- Typography usage and design are essential for text understanding (kerning, tracking, and leading)
- The additional employment of graphics simplifies the understanding of a text message
- Legibility of text influences reluctance and often decides whether we pursue reading it
- Some research backs up that we are naturally drawn to images with proportions approaching the golden ratio

By using the Golden Ratio in text line spacing, we investigated the influence of text line quality on the fluency of reading
- Document Type 1 (DT1): with Golden Ratio line spacing
- Document Type 2 (DT2): without Golden Ratio line spacing

For our experiments we combined document retrieval with gaze feature extraction while reading
[Figure: Record Eye Tracking Trials; Document Retrieval; Gaze Feature Extraction (Fixations, Saccades, Regressions); Data Analysis]

Results proved that the quality of text layout strongly influences the perceptual abilities of a reader
[Figure: four bar charts, each comparing DT1 and DT2: regression ratio; the average fixations of trials; the average of reading time; retention evaluation]
S. S. Mozaffari, S. Bukhari, and A. Dengel, Using The Wearable Eye Trackers to measure of reading performance by applying the golden ratio parameter for line spacing, Proceedings WeSAX15, IEEE International Conference on Multimedia and Expo Workshops, Torino, Italy (July 2015)

But what about text legibility?

Making out letters does not mean that words are necessarily easy to read or comprehend
- Readability is the ease with which text can be read
- Comprehension is a key factor in terms of readability, as is being able to quickly look at lettering AND understand
But how can we measure the legibility of text?
- Eye tracking accuracy, different reading styles, adequate features ...

In order to manage these challenges we did not consider single words but the environment of words
We partitioned the entire page into a grid structure and applied a window technique
[Figure labels: Actual Area, Consideration Area]
Tile size (actual area) approximately similar to the word size
209 O4(-<)+*,9<4(",*+,"+M9+()-"94"+,;H"-<)+"K8"9H+"+-9<:,9+)"9*,;J<(/"+**4*" !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"#$?""1""77"#$%&" WM/04-05 S. 209 c("4E*"+M=+*<:+(9-"5+"<(F+-9</,9+)","KE(;H"4I"=49+(9<,008" *+0+F,(9"I+,9E*+-"" 'F+*,/+"^<M,9<4(".E*,9<4("X'^.Y" h d ... . WM/04.02 S. 210 !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"#%$""1""77"#$%&" WM/04-05 S. 210 c("4E*"+M=+*<:+(9-"5+"<(F+-9</,9+)","KE(;H"4I"=49+(9<,008" *+0+F,(9"I+,9E*+-"" 'F+*,/+"^<M,9<4(".E*,9<4("X'^.Y" .E*,9<4("4I"^<*-9"^<M,9<4("d4<(9"X^^.Y" h d ... . WM/04.02 S. 211 !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"#%%""1""77"#$%&" WM/04-05 S. 211 c("4E*"+M=+*<:+(9-"5+"<(F+-9</,9+)","KE(;H"4I"=49+(9<,008" *+0+F,(9"I+,9E*+-"" 'F+*,/+"^<M,9<4(".E*,9<4("X'^.Y" .E*,9<4("4I"^<*-9"^<M,9<4("d4<(9"X^^.Y" R+,)<(/"S"7J<::<(/"O0,--<P;,9<4("XR7Y" h d ... . WM/04.02 S. 212 !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"#%#""1""77"#$%&" WM/04-05 S. 212 c("4E*"+M=+*<:+(9-"5+"<(F+-9</,9+)","KE(;H"4I"=49+(9<,008" *+0+F,(9"I+,9E*+-"" 'F+*,/+"^<M,9<4(".E*,9<4("X'^.Y" .E*,9<4("4I"^<*-9"^<M,9<4("d4<(9"X^^.Y" R+,)<(/"S"7J<::<(/"O0,--<P;,9<4("XR7Y" h . .. R+,)<(/"O4E(9"XROYe" WM/04.02 S. 213 c+'D6+.?>/65+34+B01161+:;0112<67+01+56072.8+2.96516:92.8+9D6+:3.127650923.+0560+34+9D6+92;6E+ !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"#%@""1""77"#$%&" WM/04-05 S. 213 c("4E*"+M=+*<:+(9-"5+"<(F+-9</,9+)","KE(;H"4I"=49+(9<,008" *+0+F,(9"I+,9E*+-"" 'F+*,/+"^<M,9<4(".E*,9<4("X'^.Y" .E*,9<4("4I"^<*-9"^<M,9<4("d4<(9"X^^.Y" R+,)<(/"S"7J<::<(/"O0,--<P;,9<4("XR7Y" h ... R+,)<(/"O4E(9"XROY" f"R+/*+--<4(-"-9,*9<(/"<("9H+"'*+,"XR7Y" WM/04.02 S. 214 !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"#%A""1""77"#$%&" WM/04-05 S. 
In our experiments we investigated a bunch of potentially relevant features

                                          T1     T2     T3    …
Average Fixation Duration (AFD)           230    313    135
Duration of First Fixation Point (FFD)    170     70    120
Reading/Skimming Classification (RS)      0.6   -4.0    2.1
Reading Count (RC)                          -      1      2
# Regressions starting in the Area (RS)     -      1      -
# Regressions ending in the Area (RT)       2      -      -

The feature values are scaled to the range 0…1

                                          T1     T2     T3    …
Average Fixation Duration (AFD)           0.6    0.8    0.…
Duration of First Fixation Point (FFD)    0.9    0.1    0.4
Reading/Skimming Classification (RS)      0.3    0.1    0.5
Reading Count (RC)                          -    0.3    0.6
# Regressions starting in the Area (RS)     -    0.5      -
# Regressions ending in the Area (RT)     0.6      -      -

The same values are collected for all users (three values per tile shown)

                                          T1             T2             T3
Average Fixation Duration (AFD)           0.2 0.6 0.2    0.9 0.8 1.0    0.2 0.2 0.1
Duration of First Fixation Point (FFD)    0.2 0.9 0.4    0.3 0.1 0.0    0.5 0.4 0.1
Reading/Skimming Classification (RS)      0.5 0.3 1.0    0.2 0.1 0.4    0.7 0.5 0.7
Reading Count (RC)                        0.3 0.9 0.3 - 0.9 0.6 0.1
# Regressions starting in the Area (RS)   - 0.2 0.5 0.2 0.9 0.2
# Regressions ending in the Area (RT)     0.2 0.6 0.8 1.0 0.2 0.6
…
(for the last three rows the assignment of values to tiles is not recoverable)
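The slides do not say how the raw feature values are mapped to 0…1. A minimal sketch, assuming plain min-max scaling per feature and a simple mean over the users of a tile; both function names are hypothetical, and missing values (the "-" entries) are represented as `None` and skipped:

```python
def min_max_normalize(values):
    """Scale the present values of one feature to [0, 1]; None stays None."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)
    lo, hi = min(present), max(present)
    if hi == lo:
        # All present values are equal, so there is no spread to scale.
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - lo) / (hi - lo) for v in values]


def average_over_users(per_user):
    """Mean per tile over several users.

    `per_user` holds one list per user, each with one value per tile.
    Users without a value for a tile (None) are ignored for that tile.
    """
    averages = []
    for tile_values in zip(*per_user):
        present = [v for v in tile_values if v is not None]
        averages.append(sum(present) / len(present) if present else None)
    return averages
```

With only the three example tiles as input, min-max scaling does not reproduce the scaled values shown in the table, so the lecture presumably normalizes over a larger set of tiles or uses a different scheme.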
The values of all users are then averaged (ø) per feature and tile

An aggregation of reading data from various readers gives insights into the readability of a text

[Figure: aggregated values per tile for AFD, FFD, Reading/Skimming Classification, Reading Count, # Regressions starting in the Area and # Regressions ending in the Area; sample output for a document read by seven users]

R. Biedert, M. El Hosseiny, A. Dengel, and G. Buscher, Towards Robust Gaze-Based Objective Quality Measures for Text, Proceedings 7th Biennial Symposium on Eye Tracking Research & Applications, Santa Barbara, CA, USA (March 2012), pp. 201-204

1  For testing, users had to write short texts explaining some facts but without proofreading

Italian pupils first attend the scuola elementare for five years, comparable to the German primary school. Then follows the scuola media, which the pupils attend for three years. Before they can move on to a secondary school (liceo). This type of school comes in different forms (e.g.
liceo classico, liceo linguistico), each with a focus on a different group of subjects. Instead of a liceo, schooling can also be continued by attending an Istituto tecnico (technical school), which prepares for commercial professions, or an Instituto professionale, a vocational school with the branches trade, tourism, industry and agriculture,

During my internship at DFKI, the German Research Center for Artificial Intelligence, in Kaiserslautern I have to use various means of transport every morning to get to my beloved workplace. Either a Smart or a bicycle, plus a regional train of Deutsche Bahn, serve as means of transport. The journey from Pirmasens to Kaiserslautern starts at home at a quarter past seven. Once the laptop and the backpack with food, drinks, paper and pens have been loaded into the huge, oversized trunk of the Smart, the trip begins. Since in the morning one has to get through the, for Pirmasens, relatively …

In music there are various genres, e.g. Dubstep, Indie, Electro, House, Pop, Rock, Metal, Classical, Core, Punk, Evil Disco, Blues, Alternative, Hip-Hop, Rap, Soul, Soundtrack, Jazz, Dance, Hardstyle, Shuffle, Jumpstyle, etc. There are distinctions between the genres: some focus more on the instrumental part, others deal primarily with the vocals, and others with electronically generated rhythms and melodies. Dubstep, Electro, House, Hardstyle, Shuffle and Jumpstyle belong to the electronic genres; the electronic genre mostly focuses on the bass & fast beats. But Core, too, is influenced by electronic melodies, e.g. Breakcore. To Jumpstyle, Shuffle …

I am passionate about playing handball. However, I only started this sport at the age of 15. The training times are Tuesdays from 20:00 to 22:00 and Thursdays from 18:30 to 20:00.
Last season my club, TV Dahn, finished (only) 3rd in the Pfalzliga table, because the first half of the season went very badly and we had lost almost every match. In the second half, however, we were able to score points again and won every match. Still, in the second half we had to leave 2 points here in Kaiserslautern due to a lack of players, as we arrived with only 6 people and the goalkeeper was missing as well.

My first car is a VW Golf Caddy I 1.4d. This car is a pickup based on the VW Golf I and is powered by a 4-speed gearbox and a 1.4 l diesel engine. The front of the car was taken over one-to-one from a 5-door Golf I. It is a 2-seater whose rear part consists of a loading area about 1.80 m long. The restoration of the vehicle began at the start of the second half of the Easter holidays 2011 and ended on 7 June 2011. During this period quite a lot was done to the vehicle. The condition of the car before the repairs was as follows.

2  Other users read and rated the passages individually by grades 1 to 5

[Figure: the same passages, colored by rating]
Color reflects mark: Green = 1 (very good legibility), Red = 5 (very bad legibility)

3  Then we used the tiles to generate ground truth

[Figure: the same passages, divided into tiles]
Result: classification (62% accuracy)

What types of displayed data are better suited to solve specific types of problems?
In order to get a first impression, we ran a pilot study with students from physics

(Not only) in physics education, it is highly important for educators and instructors to have insight into the appropriate preparation of representations, like vectors, tables, and diagrams, for solving specific types of problems

Experimental Setup
We have conducted an eye tracking experiment by employing a light-weight, low price and portable eye tracker paired with a tablet

Students were shown three coherent representations about a phenomenon and were instructed to solve a physics problem

[Figure: the three representations: Vectors, Table, Diagram]

Example 1: Raw-gaze of a single participant for a particular question

A body initially at rest is accelerated irregularly. During the experiment various data regarding the movement of the body are collected. Please verify whether the following statement is correct: “The body reaches its maximum speed at time t1.”

The accumulation and visualization of the results for a single question reveal insights into relevance and preference
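How the raw gaze of several participants is accumulated is not spelled out on the slide. A minimal sketch, assuming raw gaze samples are given as (x, y) screen coordinates and are simply counted per cell of a coarse grid, which is the data a heat map would be drawn from; the function name and the cell size are hypothetical:

```python
from collections import Counter


def accumulate_gaze(samples_per_participant, cell_size):
    """Count the raw gaze samples of all participants per grid cell.

    Returns a Counter mapping (column, row) to the number of samples
    that fell into that cell.
    """
    counts = Counter()
    for samples in samples_per_participant:
        for x, y in samples:
            cell = (int(x // cell_size), int(y // cell_size))
            counts[cell] += 1
    return counts
```

Cells with high counts mark the parts of a representation that many participants looked at while answering the question.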
228 LM,:=0+"#N"R,51/,_+"4I","-<(/0+"=,*9<;<=,(9"I4*","=,*9<;E0,*" ]E+-9<4(" An objects falls from the roof a building at time t=0. After some time the air resistance leads to a constant fall velocity. z is the position, v the speed, and a the acceleration of the object. Please verify whether the the following statement is correct: “The acceleration at t= 0 is maximal.” WM/04.02 S. 229 !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"##?""1""77"#$%&" WM/04-05 S. 229 ^4*"+,;H"=,*9<;<=,(9T"5+";,=9E*+)"*,51/,_+"4I","=,*9<;<=,(9" I4*","=,*9<;E0,*"]E+-9<4(" An objects falls from the roof a building at time t=0. After some time the air resistance leads to a constant fall velocity. z is the position, v the speed, and a the acceleration of the object. Please verify whether the the following statement is correct: “The acceleration at t= 0 is maximal.” WM/04.02 S. 230 !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"#@$""1""77"#$%&" WM/04-05 S. 230 GH+"+b+;9<F+(+--"4I"+,;H"*+=*+-+(9,9<4("5,-",--+--+)"I4*"9H*++" 0+F+0-"4I"-9E)+(9"+M=+*9<-+N"+M=+*9-T"<(9+*:+)<,9+-",()"(4F<;+-" )6:935+ '0/;6+ F20850>+ LM=+*9" %C>Dg" A$TDg" A%TAg" `+)<4;*+" #@T$g" @CT?g" @?T$g" a4F<;+" ##TDg" gJGUi+ @&T@g" Expert gives least preference to vector, and high and equal preference to both table and diagram" Mediocre gives little higher preference to vector as compared to expert, and same, but little lower as compared to expert preference to both table and diagram" Novice, unlike expert and mediocre, gives the highest preference to table WM/04.02 S. 231 among all representations" !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"#@%""1""77"#$%&" WM/04-05 S. 
231 Q+"H,F+",0-4"*+;4*)+)"9H+"/,_+"K+H,F<4*")E*<(/",(-5+*<(/" d0+,-+",(-5+*"(45"9H+"]E+-9<4(N" c-"9H+"I40045<(/"-9,9+:+(9";4**+;9Z" """"WGH+"K4)8"<-"<(<9<,008",;;+0+*,9+)"+F+(08h>" " GH+",(-5+*"<-N" O4**+;9 "<(;4**+;9" " i45";+*9,<(",*+"84EZ" j" d0+,-+",(-5+*"(45"9H+"]E+-9<4(N" c-"9H+"I40045<(/"-9,9+:+(9";4**+;9Z" """"W^4*"-H4*9"=+*<4)-"1445&(&67189:1&5<9H"9H+" ""P(,0",;;+0+*,9<4(":h>" " GH+",(-5+*"<-N" O4**+;9 "<(;4**+;9" " i45";+*9,<(",*+"84EZ" j" We calculated a confidence score (CS) and found out that there is a WM/04.02 S. 232 difference: 95% (Expert CS), 88% (Intermediate), 84% (Novice)" !"#$%&"'()*+,-".+(/+0""1""'2"3(450+)/+"6,-+)"78-9+:-""1""7;*<=9"!"##$%"&$'()*+(,'*##(-*,!*++1""=>"#@#""1""77"#$%&" WM/04-05 S. 232
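The slides report preference percentages per representation (vector, table, diagram) but do not say how they were derived. One plausible reading, and only an assumption here, is the share of total fixation time that falls on each representation. A minimal sketch of that computation, with a hypothetical function name and input format:

```python
from collections import defaultdict


def preference_shares(fixations):
    """Percentage of total fixation time per representation.

    `fixations` is an iterable of (representation, duration_ms) pairs,
    e.g. ("table", 250.0). Returns an empty dict if there is no gaze data.
    """
    totals = defaultdict(float)
    for representation, duration_ms in fixations:
        totals[representation] += duration_ms
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {rep: 100.0 * t / grand_total for rep, t in totals.items()}
```

Computed separately for experts, intermediates and novices, such shares would give one row of the table each.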