Techniques and Challenges for Temporal Event Tracking
Heng Ji
Computer Science Department, Queens College and the Graduate Center, City University of New York
June 1, 2010

Outline
  A Cross-document IE Task
  Methods in Global Time Discovery
  Remaining Challenges
  Chinese-specific Challenges

Traditional Single-document IE
  Barry Diller on Wednesday quit as chief of Vivendi Universal Entertainment.
  Trigger: quit (a "Personnel/End-Position" event)
  Arguments:
    Role = Person: Barry Diller
    Role = Organization: Vivendi Universal Entertainment
    Role = Position: chief
    Role = Time-within: Wednesday (2003-03-04)

Limitations of Single-document IE
  Events evolve and are updated, repeated and corrected across different documents
  Most current IE systems analyze single documents in isolation
  The net result is a set of facts which are:
    Unconnected: related events (e.g. "Tony Blair's foreign trips") appear unconnected and unordered; 13,480 event arguments for 337 articles per day in the TDT5 corpus
    Unranked: all events are considered equally important
    Redundant: many events are frequently repeated in different documents
    Erroneous and incomplete ("performance ceiling"): ACE event extraction systems barely exceeded 50% F-score on argument labeling; more than 50% of event instances don't include explicit time arguments

A New Cross-document IE Task
  Example output: Centroid = "Toefting", Rank = 26
    Time 2002-01-01: Event = Attack; Person = Toefting; Place = Copenhagen; Target = workers
    Time 2003-03-15: Event = End-Position; Person = Toefting; Entity = Bolton
    Time 2003-03-31: Event = Sentence; Defendant = Toefting; Sentence = four months in prison; Crime = assault
  Input: a test set of documents
  Output: identify a set of centroid entities, and then for each centroid entity, link and order the events centered around it on a time line

Evaluation Metrics
  Browsing Cost: incorporate novelty/diversity into F-measure
    An argument is correctly extracted in an event chain if its event type, string and role match any of the reference argument mentions
    Two arguments in an event chain are redundant if their event types, event time, string (the full or partial name) and roles overlap
    Browsing Cost(i) = the number of incorrect or redundant event arguments that a user must examine before finding i correct event arguments
  Temporal Correlation: measure coherence
    Temporal Correlation = the correlation of the temporal order of the argument set in the system output and the answer key
    Argument recall = number of unique and correct arguments in response / number of unique arguments in key

A Bit Different from What I Learned from This Workshop
  We also like event arguments
  We want to recover implicit event time arguments
  Ideally we hope to conduct cross-document inference
  Event chains center around centroid entities, with links at the event-mention level instead of the sentence level (main verb)

A Cross-document IE System (Ji et al., RANLP 2009)
  [Architecture diagram]
  Inputs: test docs; background data (Wikipedia, related docs)
  Components: Single-doc IE (produces unconnected events), Cross-doc Argument Refinement, Centroid Entity Detection, Global Time Discovery, Cross-doc Event Selection & Temporal Linking, Cross-doc Event Coreference
  Output: ranked temporal event chains

What's New: Research Challenges Overview
  More salient: detecting centroid entities using global confidence
  More accurate and complete: correcting and enriching arguments from the background data
  More concise: conducting cross-document event coreference resolution to remove redundancy

Why Detect Event Time?
  It's important to many NLP applications
    Textual inference (Baral et al., 2005)
    Multi-document text summarization (e.g. Barzilay et al., 2002)
    Temporal event tracking (e.g. Bethard et al., 2007; 2008; Chambers et al., 2009; Ji and Chen, 2009)
    Template-based question answering (Ahn et al., 2006)
  It's challenging because about half of the event instances don't include explicit time arguments
  Prior work on detecting implicit time arguments
    Filatova and Hovy, 2001; Mani et al., 2003; Lapata and Lascarides, 2006; Eidelman, 2008
    Most work focused on the sentence level
    Linguistic evidence such as verb tense was used for inference
  Our focus
    More fine-grained events
    An event mention and all of its coreferential event mentions do not include any explicit or implicit time expressions

Observations about Events in News
  News is based on series of events
    Various situations are evolving, updated, repeated and corrected in different event mentions
  Events occur as chains
    Conflict -> Life-Die/Life-Injure
    Justice-Convict -> Justice-Charge-Indict/Justice-Trial-Hearing
  Writers won't mention time repeatedly
    To avoid redundancy, they rarely provide time arguments for all of the related events
  The reader is expected to use inference
    "On Aug 4 there is fantastic food in Suntec… Millions of people came to attend the IE session." => the IE session is on Aug 4

Solution 1: Background Knowledge Reasoning
  Time search from related documents
    [Test Sentence] <entity>Al-Douri</entity> said in the <entity>AP</entity> interview he would love to return to teaching but for now he plans to remain at the United Nations.
    [Sentences from Related Documents] In an interview with <entity>The Associated Press</entity> <time>Wednesday</time> night, <entity>Al-Douri</entity> said he will continue to work at the United Nations and had no intention of defecting.
  Time search from Wikipedia
    [Test Sentence] <person>Diller</person> started his entertainment career at <entity>ABC</entity>, where he is credited with creating the "movie of the week" concept.
    [Sentences from Wikipedia] <person>Diller</person> was hired by <entity>ABC</entity> in <time>1966</time> and was soon placed in charge of negotiating broadcast rights to feature films.

Solution 2: Time Propagation between Events
  Event mention with time: Injured Russian diplomats and a convoy of America's Kurdish comrades in arms were among unintended victims caught in crossfire and friendly fire Sunday. [Time: Sunday]
  Event mention without time: Kurds said 18 of their own died in the mistaken U.S. air strike.
  Event mention with time: A state security court suspended a newspaper critical of the government Saturday after convicting it of publishing religiously inflammatory material. [Time: Saturday]
  Event mention without time: The sentence was the latest in a series of state actions against the Monitor, the only English language daily in Sudan and a leading critic of conditions in the south of the country, where a civil war has been waged for 20 years.
  (Gupta and Ji, ACL 2009)

Rule-based Prediction
  Same-Sentence Propagation
    EMi and EMj are in the same sentence and only one time expression exists in the sentence
  Relevant-Type Propagation
    typei = "Conflict", typej = "Life-Die/Life-Injure"
    argi is coreferential with argj
    rolei = "Target" and rolej = "Victim", or rolei = rolej = "Instrument"
  Same-Type Propagation
    argi is coreferential with argj, typei = typej, rolei = rolej, and the roles match the time-cue roles:
    Typei                                   Rolei
    Conflict                                Target/Attacker/Crime
    Justice                                 Defendant/Crime/Plaintiff
    Life-Die/Life-Injure                    Victim
    Life-Be-Born/Life-Marry/Life-Divorce    Person/Entity
    Business                                Organization/Entity
    Movement-Transport                      Destination/Origin
    Transaction                             Buyer/Seller/Giver/Recipient
    Contact                                 Person/Entity
    Personnel                               Person/Entity

Statistical Learning based Prediction
  Maximum Entropy based model for propagate/non-propagate classification of any event mention pair <EMi, EMj>
  Features
    Same Sentence: whether EMi and EMj are located in the same sentence or not
    Number of Time Arguments: if EMi and EMj are in the same sentence, the number of time arguments in that sentence
    Time-Cue Argument Role Matching: whether the time-cue role types in EMi and EMj match or not

Cross-document Event Coreference Resolution
  1. An explosion in a cafe at one of the capital's busiest intersections killed one woman and injured another Tuesday
  2. Police were investigating the cause of the explosion in the restroom of the multistory Crocodile Cafe in the commercial district of Kizilay during the morning rush hour
  3. The blast shattered walls and windows in the building
  4. Ankara police chief Ercument Yilmaz visited the site of the morning blast
  5. The explosion comes a month after
  6. a bomb exploded at a McDonald's restaurant in Istanbul, causing damage but no injuries
  7. Radical leftist, Kurdish and Islamic groups are active in the country and have carried out the bombing in the past
  (Chen and Ji, 2009)

Method 1: Spectral Graph Clustering
  Event mentions as graph nodes (trigger: arguments)
    explosion: Place = a cafe; Time = Tuesday
    explosion: Place = restroom; Time = morning rush hour
    explosion: Place = building
    blast: Place = site; Time = morning
    explosion: Time = a month after
    exploded: Place = restaurant
    bombing: Attacker = groups

Spectral Graph Clustering
  [Figure: weighted graph of event mentions partitioned into clusters A and B]
  cut(A,B) = 0.1 + 0.2 + 0.2 + 0.3 = 0.8

Automatically Detect Event Attributes
  Modality
    Expresses degrees of possibility, belief, evidentiality, expectation, attempting, and command (Sauri et al., 2006)
    An event is ASSERTED when the author or speaker makes reference to it as though it were a real occurrence; all other events are annotated as OTHER
  Polarity
    NEGATIVE if an event did not occur; otherwise POSITIVE
  Genericity
    SPECIFIC if an event is a singular occurrence at a particular place and time; otherwise GENERIC
  Tense
    Determined with respect to the speaker or author. Possible values: PAST, FUTURE, PRESENT, and UNSPECIFIED
  6/11/10 eETTs 2009

Event Attribute Disagreement Examples
  Modality
    Other: Toyota Motor Corp. said Tuesday it will promote Akio Toyoda, a grandson of the company's founder who is widely viewed as a candidate to some day head Japan's largest automaker.
    Asserted: Managing director Toyoda, 46, grandson of Kiichiro Toyoda and the eldest son of Toyota honorary chairman Shoichiro Toyoda, became one of 14 senior managing directors under a streamlined management system set to be…
  Polarity
    Positive: At least 19 people were killed in the first blast
    Negative: There were no reports of deaths in the blast
  Genericity
    Specific: An explosion in a cafe at one of the capital's busiest intersections killed one woman and injured another Tuesday
    Generic: Roh has said any pre-emptive strike against the North's nuclear facilities could prove disastrous
  Tense
    Past: Israel holds the Palestinian leader responsible for the latest violence, even though the recent attacks were carried out by Islamic militants
    Future: We are warning Israel not to exploit this war against Iraq to carry out more attacks against the Palestinian people in the Gaza Strip and destroy the Palestinian Authority and the peace process.

Experiments: Data
  106 newswire texts from the ACE 2005 training corpora as the test set
  Extracted the top 40 ranked person names as centroid entities, and manually created temporal event chains by:
    Aggregating reference event mentions (inter-annotator agreement: ~90%)
    Filling in the implicit event time arguments from the background data (inter-annotator agreement: ~82%)
    Annotated by two annotators independently and adjudicated
  278,108 texts from the English TDT5 corpus and 148 million sentences from Wikipedia as the source of background data
  140 events with 368 arguments (257 are unique)
  The top ranked centroid entities are "Bush", "Ibrahim", "Putin", "Al-douri", "Blair", etc.
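The Browsing Cost metric defined in the Evaluation Metrics slide can be sketched in a few lines of Python. This is only an illustration, not the evaluation script used in the experiments: the `Arg` record, the substring test standing in for "full or partial name" overlap, and the choice to check redundancy against every argument already shown are all assumptions made for the example.

```python
from typing import NamedTuple, Optional


class Arg(NamedTuple):
    """One extracted event argument (hypothetical record layout)."""
    event_type: str
    role: str
    text: str
    time: Optional[str] = None


def is_correct(arg: Arg, reference: list[Arg]) -> bool:
    # Correct if event type, string and role match any reference mention.
    return any(
        (arg.event_type, arg.role, arg.text) == (r.event_type, r.role, r.text)
        for r in reference
    )


def is_redundant(arg: Arg, seen: list[Arg]) -> bool:
    # Redundant if type, time and role agree with an argument already shown
    # and one string contains the other (assumed proxy for name overlap).
    return any(
        arg.event_type == s.event_type
        and arg.time == s.time
        and arg.role == s.role
        and (arg.text in s.text or s.text in arg.text)
        for s in seen
    )


def browsing_cost(ranked: list[Arg], reference: list[Arg], i: int) -> Optional[int]:
    """Incorrect or redundant arguments examined before the i-th correct one."""
    cost, found, seen = 0, 0, []
    for arg in ranked:
        if is_redundant(arg, seen) or not is_correct(arg, reference):
            cost += 1
        else:
            found += 1
            if found == i:
                return cost
        seen.append(arg)
    return None  # the output contains fewer than i correct arguments


# Toy data loosely based on the "Toefting" chain shown earlier.
reference = [
    Arg("Attack", "Place", "Copenhagen", "2002-01-01"),
    Arg("End-Position", "Entity", "Bolton", "2003-03-15"),
]
ranked = [
    Arg("Attack", "Place", "Copenhagen", "2002-01-01"),     # correct
    Arg("Attack", "Place", "Copenhagen", "2002-01-01"),     # redundant repeat
    Arg("Attack", "Target", "Bolton", "2002-01-01"),        # incorrect
    Arg("End-Position", "Entity", "Bolton", "2003-03-15"),  # correct
]
print(browsing_cost(ranked, reference, 1))  # 0
print(browsing_cost(ranked, reference, 2))  # 2
```

A lower cost at a given i means the user reaches i correct arguments after wading through fewer wrong or repeated ones, which is how the metric rewards removing redundancy through event coreference.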
Browsing Cost

Temporal Correlation
  Method                                         Temporal Correlation   Argument Recall
  Baseline: ordered by event reporting time      3.71%                  27.63%
  Method 1: Single-document IE                   44.02%                 27.63%
  Method 2: 1 + Cross-doc Event Coreference      46.15%                 27.63%
  Method 3: 2 + Cross-doc Argument Refinement    55.73%                 30.74%
  Method 4: 3 + Global Time Discovery            70.09%                 33.07%

Time Propagation Experiments
  Data and answer-key annotation
    Construct any pair of event mentions <EMi, EMj> as a candidate sample if EMi includes a time argument while EMj and its coreferential event mentions don't include any time arguments; manually label "Propagate"/"Not-Propagate" for <EMi, EMj>
    47 ACE05 newswire texts for training (485 "Propagate" samples and 617 "Not-Propagate" samples) and a blind test on 10 texts (212 samples)
  Results
    Method                 P (%)   R (%)   F (%)
    Rule-Based             70.40   74.06   72.18
    Statistical Learning   72.48   50.94   59.83
  The most common correctly propagated pairs are:
    (Conflict-Attack, Life-Die/Life-Injure)
    (Justice-Convict, Justice-Sentence/Justice-Charge-Indict)
    (Movement-Transport, Contact-Meet)
    (Justice-Charge-Indict, Justice-Convict)

Why Rule-based Prediction Performs Better
  Not enough training data to capture all the evidence from different time-cue roles
  Example: only one positive training sample matches the "Defendant" role (newspaper/Monitor):
    Event mention with time: A state security court suspended a newspaper critical of the government Saturday after convicting it of publishing religiously inflammatory material. [Time: Saturday]
    Event mention without time: The sentence was the latest in a series of state actions against the Monitor, the only English language daily in Sudan and a leading critic of conditions in the south of the country, where a civil war has been waged for 20 years.
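To make the newspaper/Monitor example concrete, here is a minimal Python sketch of the three propagation rules from the Rule-based Prediction slide. It is one reading of the slide, not the authors' implementation: the `EventMention` record is hypothetical, coreference is reduced to comparing entity ids, event types are compared at the main-type level (as in the time-cue role table), and the mention with a time is assumed to play the part of EMi.

```python
from dataclasses import dataclass
from typing import Optional

# Time-cue roles per event type, copied from the rule table in the slides.
TIME_CUE_ROLES = {
    "Conflict": {"Target", "Attacker", "Crime"},
    "Justice": {"Defendant", "Crime", "Plaintiff"},
    "Life-Die/Life-Injure": {"Victim"},
    "Life-Be-Born/Life-Marry/Life-Divorce": {"Person", "Entity"},
    "Business": {"Organization", "Entity"},
    "Movement-Transport": {"Destination", "Origin"},
    "Transaction": {"Buyer", "Seller", "Giver", "Recipient"},
    "Contact": {"Person", "Entity"},
    "Personnel": {"Person", "Entity"},
}


@dataclass
class EventMention:
    """Hypothetical event-mention record used only for this sketch."""
    event_type: str
    sentence_id: int
    args: dict  # role -> entity id; equal ids stand in for coreference
    time: Optional[str] = None
    times_in_sentence: int = 0


def should_propagate(src: EventMention, tgt: EventMention) -> bool:
    """True if src's time may be copied to tgt under one of the three rules."""
    if src.time is None or tgt.time is not None:
        return False
    # Same-Sentence Propagation: same sentence with exactly one time expression.
    if src.sentence_id == tgt.sentence_id and src.times_in_sentence == 1:
        return True
    for role_i, ent_i in src.args.items():
        for role_j, ent_j in tgt.args.items():
            if ent_i != ent_j:  # the two arguments must be coreferential
                continue
            # Relevant-Type Propagation: Conflict to Life-Die/Life-Injure.
            if (src.event_type == "Conflict"
                    and tgt.event_type == "Life-Die/Life-Injure"
                    and ((role_i, role_j) == ("Target", "Victim")
                         or role_i == role_j == "Instrument")):
                return True
            # Same-Type Propagation: same type and the same time-cue role.
            if (src.event_type == tgt.event_type
                    and role_i == role_j
                    and role_i in TIME_CUE_ROLES.get(src.event_type, set())):
                return True
    return False


# The newspaper/Monitor example: both mentions share the Defendant entity.
convict = EventMention("Justice", 1, {"Defendant": "e_monitor"},
                       time="Saturday", times_in_sentence=1)
sentence = EventMention("Justice", 2, {"Defendant": "e_monitor"})
unrelated = EventMention("Justice", 3, {"Defendant": "e_other"})

print(should_propagate(convict, sentence))   # True
print(should_propagate(convict, unrelated))  # False
```

The rules need no training data at all, which is consistent with the point of this slide: a role match seen only once in training is hard for a classifier to learn but trivial to state as a rule.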
  Combining these two approaches in a self-training framework (adding the high-confidence results from rules as additional training data to re-train the MaxEnt classifier) did not provide further improvement

To Fix the Remaining Spurious Errors
  Incorporate distance, event reporting order, context event features and better entity coreference resolution
    Event mention with time: American troops stormed a presidential palace and other key buildings in Baghdad as U.S. tanks rumbled into the heart of the battered Iraqi capital on Monday amid the thunder of gunfire and explosions… [Time: Monday]
    Event mention without time: At the palace compound, Iraqis shot small arms fire from a clock tower, which the U.S. tanks quickly destroyed.
    Event mention with time: The first one was on Saturday and triggered intense gun battles, which according to some U.S. accounts, left at least 2,000 Iraqi fighters dead. [Time: Saturday]

Remaining Challenges: Cross-document Discourse Reasoning
  Query: When was Carol Shepp McCain acting as the wife of John McCain?
  Answer: 1966-1980
  DOCID: LTW_ENG_20081007.0068.LDC2009T13
    Carol Shepp McCain, then 42, had endured much in more than 14 years of marriage to John. She had raised their three young children alone while her husband languished in a North Vietnamese prison camp for 5 1/2 years
  DOCID: LTW_ENG_20081007.0068.LDC2009T13
    Nine months earlier, at a cocktail reception in Hawaii, he met a glamorous young heiress named Cindy Lou Hensley and, by all accounts, fell instantly in love. According to public records, he and Cindy received a marriage license in Maricopa County, Ariz., in early March 1980, four weeks before his divorce from Carol was final.

Remaining Challenges: Paraphrase Discovery
  Query: When was R. Nicholas Burns a member of the U.S. State Department?
  Answer: 1995-2008
  DOCID: APW_ENG_19950112.0477.LDC2007T07
    R. Nicholas Burns, a career foreign service officer in charge of Russian affairs at the National Security Council, is due to be named the new spokesman at the U.S. State Department, a senior U.S. official said Thursday.
  DOCID: APW_ENG_20070324.0924.LDC2009T13 (and many other docs)
    The United States is "very pleased by the strength of this resolution" after two years of diplomacy, said R. Nicholas Burns, undersecretary for political affairs at the State Department.
  DOCID: NYT_ENG_20080118.0161.LDC2009T13
    R. Nicholas Burns, the country's third-ranking diplomat and Secretary of State Condoleezza Rice's right-hand man, is retiring for personal reasons, the State Department said Friday.
  DOCID: NYT_ENG_20080302.0157.LDC2009T13
    The chief U.S. negotiator, R. Nicholas Burns, who left his job on Friday, countered that the sanctions were all about Iran's refusal to stop enriching uranium, not about weapons. But that argument was a tough sell.

Remaining Challenges: Unclear Boundaries
  With ending-date only / with vague starting-date only:
    Sotheby's main shareholder and former chairman
    Vivendi Universal, the world's second-largest media group after AOL Time Warner of the United States, has been digging out from under a mountain of debt since the removal of expansionist boss Jean-Marie Messier last July, largely through asset sales
    Nathan divorced wallpaper salesman Bruce Nathan in 1992
    the recently appointed Palestinian prime minister
  With time-duration only:
    Liana Owen drove 10 hours from Pennsylvania to attend the rally in Manhattan with her parents
    Hariri submitted his resignation during a 10-minute meeting with the head of state at the Baabda presidential palace, outside the capital

Chinese-specific Challenges
  Time argument associated with undefined events
    贝克和他的研究小组在1990年代的初期到中期一共研究了103个人,这项最新的研究结论是在追踪这103人病例以后所获得的,研究人发现在这103人当中,婚姻不愉快的心脏的内壁都比较厚。
    From the early to mid-1990s, Baker and his research team studied 103 people in total; this latest conclusion was reached after following the cases of these 103 people. The researchers found that, among them, those in unhappy marriages had thicker inner heart walls.
    检察官说,他们希望正式起诉英国和欧洲官员,因为1989年,他们在国内禁止使用被疑受污染的动物饲料,但却允许英国继续出口这种饲料
    Prosecutors said they hope to formally charge British and European officials because, in 1989, they banned the domestic use of animal feed suspected of being contaminated but allowed Britain to continue exporting it.
  Time argument associated with longer-distance events
    第55届联大20号晚在巴以冲突紧急特别会议上以压倒性多数的表决结果通过决议
    On the evening of the 20th, the 55th UN General Assembly passed a resolution by an overwhelming majority vote at the emergency special session on the Palestinian-Israeli conflict.

Chinese-specific Challenges (Cont'd)
  Reasoning across multiple time arguments
    还有半个月是结婚一周年
    The first wedding anniversary is half a month away.
    昨天晚上香港中华总商会在香港会展中心举办成立100周年庆祝酒会
    Last night the Chinese General Chamber of Commerce in Hong Kong held a reception at the Hong Kong Convention and Exhibition Centre to celebrate the 100th anniversary of its founding.
  Reasoning across multiple events/slots
    此次释放全部被捕军警的行动是在哥政府与游击队代表在哈瓦那经过一周多协商后由该游击队组织单方面决定的。
    The release of all the captured soldiers and police was decided unilaterally by the guerrilla organization after more than a week of negotiations in Havana between the Colombian government and guerrilla representatives.
    摩托罗拉设计师梁丽娇(42岁)第一次出国公干,不料竟踏上不归路。
    Motorola designer Liang Lijiao (42) went abroad on business for the first time and, unexpectedly, never returned.

Chinese-specific Challenges (Cont'd)
  "Hidden" tense
    新华社南京12月17日电(记者 赵明亮)我国家电行业大型企业之一江苏苏宁电器集团昨天在南京宣布:未来三年内在全国建立1500家综合电器连锁店,比现在的数量增加10倍,形成行业内的"航空母舰",以应对入世后跨国商业资本的<进入>。
    Xinhua News Agency, Nanjing, December 17 (reporter Zhao Mingliang): Jiangsu Suning Appliance Group, one of the large enterprises in China's home appliance industry, announced in Nanjing yesterday that within the next three years it will establish 1,500 general appliance chain stores nationwide, a tenfold increase over the current number, forming an "aircraft carrier" within the industry to respond to the <entry> of multinational commercial capital after WTO accession.

Related Work
  Event tracking
    Discovering temporal event chains: TempEval (Verhagen et al., 2007); e.g. Bethard and Martin (2008), Chambers and Jurafsky (2008)
    Topic detection and tracking (Allan, 2002)
  Information redundancy to improve extraction accuracy
    Downey et al. (2005), Yangarber (2006), Mann (2007), Patwardhan and Riloff (2007; 2009)
  What's new
    Extend the representation of each node in the linked chains from an event trigger/sentence to a structured aggregated event including fine-grained information such as event types, arguments and their roles
    Global argument correction and implicit time discovery: correct the original extracted facts and discover implicit time arguments using background knowledge

Conclusion
  Temporal event tracking is an important and challenging task
  Substantial improvement requires global reasoning and more fine-grained temporal annotation
  Let's keep working on it.
  An advertisement: NIST Knowledge Base Population: http://nlp.cs.qc.cuny.edu/kbp/2010/
  New research focus in KBP 2011:
    Temporal KBP
    Cross-lingual (Chinese to English) KBP

Thank you

Evaluation Metrics (backup)
  Centroid Entity Detection
    F-Measure: a centroid entity is correctly detected if its name (and document id) matches the full or partial name of a reference centroid
    Normalized Kendall tau distance (centroid entities) = the fraction of correct system centroid entity pairs out of salience order
    Centroid Entity Ranking Accuracy = 1 - Normalized Kendall tau distance (centroids)
  Browsing Cost and Temporal Correlation: same definitions as on the earlier Evaluation Metrics slide

Baseline Single-document IE System
  Includes entity extraction, time expression extraction and normalization, relation extraction and event extraction
  Event extraction
    Pattern matching
      "British and US forces reported gains in the advance on Baghdad" matches the pattern: PER report gain in advance on LOC
    Maximum Entropy models
      Trigger Labeling: to distinguish event instances from non-events, and to classify event instances by type
      Argument Identification: to distinguish arguments from non-arguments
      Argument Classification: to classify arguments by argument role
      Reportable-Event Classifier: to determine whether there is a reportable event instance
  Each step produces a local confidence (Grishman et al., 2005)

Conclusion and Future Work
  Used propagation between related events to predict unknown time arguments, which was not possible with traditional explicit time argument extraction techniques
  Compared two approaches and demonstrated that embarrassingly simple but smart knowledge engineering can perform better than supervised learning with small training corpora for some particular tasks
  Further work
    Applied to temporal event tracking (Ji et al., 2009), improving the correlation score from 55.73% to 70.09%
  Future work
    Incorporate dynamic context features into MaxEnt
    Extend to cross-document IE; predict event time from related documents
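For the centroid ranking metric on the backup Evaluation Metrics slide, a small Python sketch may help. It assumes exact name matching between system and reference centroids (the slide allows full or partial names) and that each name appears once per ranking; the function names and the example rankings are illustrative only.

```python
from itertools import combinations


def normalized_kendall_tau_distance(system: list[str], reference: list[str]) -> float:
    """Fraction of correctly detected centroid pairs ranked in the opposite order."""
    ref_rank = {name: rank for rank, name in enumerate(reference)}
    # Keep only system centroids that also appear in the reference ("correct").
    correct = [name for name in system if name in ref_rank]
    pairs = list(combinations(correct, 2))
    if not pairs:
        return 0.0
    # Each pair (a, b) is in system order; it is out of order if the
    # reference ranks b ahead of a.
    discordant = sum(1 for a, b in pairs if ref_rank[a] > ref_rank[b])
    return discordant / len(pairs)


def ranking_accuracy(system: list[str], reference: list[str]) -> float:
    # Centroid Entity Ranking Accuracy = 1 - normalized Kendall tau distance.
    return 1.0 - normalized_kendall_tau_distance(system, reference)


# Illustrative rankings; the reference order here is assumed for the example.
reference = ["Bush", "Ibrahim", "Putin", "Al-douri", "Blair"]
system = ["Bush", "Putin", "Ibrahim", "Blair", "Chirac"]
print(round(ranking_accuracy(system, reference), 4))  # 0.8333
```

In this toy case four system centroids are correct, giving six pairs, and only the (Putin, Ibrahim) pair is swapped relative to the reference, so the distance is 1/6.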