Comments to: Referential Structures and Links (S. Salmon-Alt and L. Romary)
Massimo Poesio
University of Essex, Language & Computation Group

Summary of the RAF proposal
• A Reference Annotation Framework (RAF) instantiating Ide & Romary's LAF
• TWO main 'data categories' (= elements):
  – MARKABLES
  – LINKS
• Recommendations about descriptors of markables and links

ISO/SIGSEM Workshop on Multim. Sem. Repr - Tilburg, Jan 2005

Overview of comments
• I share much of the philosophy behind the proposal
  – Markables
  – Separate links
  – Useful attributes for reference
• But (some of these issues already covered in presentation)
  – Some aspects may be more ready for standardization than others
  – Are all the proposed descriptors really part of 'reference' annotation?
  – Practical considerations to think about:
    • Use
    • Annotation technology (where does standoff fit in?)
• Some difficult issues still to be addressed

Separate links
• Case for links as independent elements: a form of normalization
  – (Ambiguity – see below)
  – Multiple relations (identity, bridging, more general ACE-style semantic relations)
  – Attributes of relations (type of relation, different discourse models)
• BUT: MMAX-style set-based annotation is a more transparent representation of discourse models

Multiple relations in the GNOME corpus

Cabinet on Stand
The decoration on this monumental cabinet refers to the French king Louis XIV's military victories. A panel of marquetry showing the cockerel of France standing triumphant over both the eagle of the Holy Roman Empire and the lion of Spain and the Spanish Netherlands decorates the central door. On the drawer above the door, gilt-bronze military trophies flank a medallion portrait of Louis XIV. In the Dutch Wars of 1672–1678, France fought simultaneously against the Dutch, Spanish, and Imperial armies, defeating them all.
This cabinet celebrates the Treaty of Nijmegen, which concluded the war. Two large figures from Greek mythology, Hercules and Hippolyta, Queen of the Amazons, representatives of strength and bravery in war, appear to support the cabinet. The fleurs-de-lis on the top two drawers indicate that the cabinet was made for Louis XIV. As it does not appear in inventories of his possessions, it may have served as a royal gift. The Sun King's portrait appears twice on this work. The bronze medallion above the central door was cast from a medal struck in 1661 which shows the king at the age of twenty-one. Another medallion inside shows him a few years later.

Multiple relations
…. <ne ID="ne127">a medallion portrait of Louis XIV</ne>. ……
<ne ID="ne164">this work</ne>. <ne ID="ne165">The bronze medallion</ne> …..

<link href="coref.xml#id(ne165)" type="ident">
  <anchor href="coref.xml#id(ne127)"/>
</link>
<link href="coref.xml#id(ne165)" type="poss">
  <anchor href="coref.xml#id(ne164)"/>
</link>

Link descriptors: type of relation, who-believes
GIVER: Do_you have <coref:de ID="de_20">diamond_mine.</coref:de>
FOLLOWER: Yes I've got <coref:de ID="de_21">a gold_mine.</coref:de>
GIVER: Ah. S--.
FOLLOWER: ....
GIVER: You don't have <coref:de ID="de_22">diamond_mine</coref:de> though.
FOLLOWER: No. <coref:de ID="de_23"> It </coref:de> is a gold_mine according to this one. Presumably <coref:de ID="de_24">that</coref:de>'s the same.
GIVER: Well I've got <coref:de ID="de_25">a gold_mine</coref:de> as well you see.

<coref:link href="coref.xml#id(de_23)" who-believes="F" type="ident">
  <coref:anchor href="coref.xml#id(de_22)"/>
</coref:link>

Markable descriptors: GNOME
<ne id="ne2" cat="poss-np" per="per3" num="sing" gen="neut" gf="subj" lftype="term" onto="concrete" ani="inanimate" deix="deix-no" count="undersp-count" structure="undersp-structure" generic="generic-no" loeb="sem-function">
  <ne id="ne3" cat="this-np" per="per3" num="sing" gen="neut" gf="gen" lftype="term" onto="concrete" ani="inanimate" deix="deix-yes" count="count-yes" structure="atom" generic="generic-no" loeb="disc-function">
    This table's
  </ne>
  unusual materials and coloring ….
</ne>

Descriptors, I: morpho-syntax
• Morpho-syntactic descriptors
  – (Syntactic) agreement features, syntactic category, grammatical function:
    • Indeed very useful both for interpretation and for generation
    • But: separate level?
    • In GNOME: with markables for technical reasons
  – POS: separate level?

Descriptors, II: Semantic
• Semantic descriptors:
  – Wordsense: separate level?
  – Abstractness, animacy, named entity: Reference?
  – Countability: a difficult issue
  – Collectiveness, Inalienability (!)

Descriptors, III: Referential
• Referential descriptors:
  – Cardinality (!?)
  – Natural Gender (!)
  – Definiteness (?)
  – Informational status (!?)
  – Referential status (?)
• Not included, but very useful in GNOME:
  – LFTYPE (term / quantifier / predicate)

Link descriptors
• Referential target: "single target, use structural solutions for describing ambiguity" (see presentation)
• Objectal relations vs. linguistic relations
  – OK, but linguistic relations very difficult (VENEX)

Ambiguity
15.12 M: we're gonna take the engine E3
15.13  : and shove it over to Corning
15.14  : hook it up to the tanker car
15.15  : _and_
15.16  : send it back to Elmira
(from the University of Rochester TRAINS-91 dialogues)

Marking ambiguity
<de ID="de_01">we</de>'re gonna take <de ID="de_07"> the engine E3 </de>
and shove <de ID="de_08"> it </de> over to <de ID="de_02">Corning</de>,
hook <de ID="de_09"> it </de> up to <de ID="de_03">the tanker car</de>...
_and_ and send <de ID="de_19">it</de> back to Elmira

<link href="coref.xml#id(de_08)" type="ident">
  <anchor href="coref.xml#id(de_07)"/>
</link>
<link href="coref.xml#id(de_19)" type="ident">
  <anchor href="coref.xml#id(de_09)"/>
  <anchor href="coref.xml#id(de_03)"/>
</link>

General questions
• What aspects of the proposal will become part of the standard?
  – What is the status of the recommendations re: attributes?
  – For my own research I need richly annotated corpora, but who else does? For which applications?
• How can the standard incorporate future extensions? (See next slides)
  – What about simplifications?

The difficult issues
• A 'general purpose' (coreference? anaphora?) annotation?
• Ambiguity (GNOME, ARRAU)
• Underspecified relations (GNOME)
• Reference to abstract objects (ARRAU)
• Misunderstandings in multimodal reference (VENEX)

Object AND relation ambiguity
In 1740, Caffieri's wife bought a royal privilege - a form of permit from the king - which allowed them to gild bronze as well as cast it; within the same workshop these two processes would usually have been done by separate businesses.
After his son Philippe Caffieri joined the workshop in 1747, they produced designs for chandeliers, ornaments for coaches, wall lights, and furniture mounts.

DE/REL AMBIGUITY
which allowed <ne id="ne1">them</ne> ….
After <ne id="ne2">his son Philippe Caffieri</ne> …. joined <ne id="ne3">the workshop</ne> … <ne id="ne4">they</ne> produced …

<ante current="ne4">
  <anchor id="ne1" rel="ident"/>
  <anchor id="ne2" rel="element"/>
  <anchor id="ne3" rel="ident"/>
</ante>

Product / instance metonymy & anaphora
XXX Cream
What you should know about XXX Cream
* What your cream does
XXX Cream is a strong and rapidly effective treatment for inflamed skin conditions such as eczema, psoriasis and dermatitis.
IT IS IMPORTANT TO READ THIS CAREFULLY BEFORE STARTING TREATMENT
Following this advice gives the cream a chance to work safely and effectively.
* What is in your cream
XXX Cream is one of a group of medicines called topical steroids. "Topical" means they are put on the skin.

Misunderstandings in the MapTask (VENEX)

MISUNDERSTANDINGS IN DIALOGUE
G037: Perfetto, allora questa... sei arrivata dove sta la punta delle barche?
      Perfect then, did you arrive at the top of the boats?
F038: Sí
      Yes
G039: Mh, ora fai un mezzo cerchio.
      Now do a half-circle
F040: La prima però, la prima figura <La seconda,> barche.
      The first however, the first landmark <the second> boats
G041: Quella più in alto.
      The one above
F042: Ah, quella più <Eh> in alto?
      The one above?
G043: Sí, ma non ci andare proprio sopra, cioé ci devi sta_, ci devi passà vicino non ci de_, non devi andá sulle figure. Fatto?
      Yes, but you should not go over it, I mean you must … you must get close but you cannot go over the landmark. Done?

MISUNDERSTANDINGS IN DIALOGUE, 2
F044: Sí
      Yes
G045: Perfetto, ora fai un mezzo cerchio verso destra come se volessi, come se volessi <Scendere.> fare ...
      Perfect, now draw a half-circle to the right as if you wanted to <go down>
F046:
G047: <Perché> No.
      No.
F048: La figura barche sta più sopra di valle limpida.
      The boat landmark is above valle limpida
G049: Sta piú sopra?
      [It] is above?
F050: Ce ne sono due, una piú sotto e una piú sopra.
      There are two, one below and one above

MISUNDERSTANDINGS IN DIALOGUE, CONT'D
G051: Mh, molto bene. Allora, mh, non c'hai barche un poco piú in alto spostato verso, verso destra, cioè non c'hai scritto valle limpida?
      Very good. Then, mh, don't you have boats a bit higher towards, to the right, that is, don't you have valle limpida?
F052: Sí, c'è scritto, peró poi piú sopra c'è un'altra figura barche.
      Yes, it's there, however above that there is a second boats landmark
G053: Ci sono due figure barche?
      There are two boat landmarks?
F054: Sí.
      Yes.

Conclusions
• Core aspects of the proposal good
• Need to clarify some of the more complex aspects and their role in the standard
• Would also be useful to expand on
  – Use in specific annotations / tasks
  – Relation with standoff
  – How future extensions can be included
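
Appendix sketch: reading standoff links
The link notation on the "Marking ambiguity" slide (one <link> per anaphor, one <anchor> per candidate antecedent) lends itself to simple mechanical processing. The following is a minimal sketch in Python, not part of the RAF proposal: the <links> wrapper element, the function names, and the rule "more than one anchor means ambiguous" are assumptions made for illustration only.

```python
import re
import xml.etree.ElementTree as ET

# The two <link> elements from the "Marking ambiguity" slide.
# The enclosing <links> root is an assumption of this sketch.
SAMPLE = """
<links>
  <link href="coref.xml#id(de_08)" type="ident">
    <anchor href="coref.xml#id(de_07)"/>
  </link>
  <link href="coref.xml#id(de_19)" type="ident">
    <anchor href="coref.xml#id(de_09)"/>
    <anchor href="coref.xml#id(de_03)"/>
  </link>
</links>
"""

# Matches the markable identifier inside hrefs such as coref.xml#id(de_08)
ID_PATTERN = re.compile(r"#id\(([^)]+)\)")


def target_id(href):
    """Extract the markable ID from an href of the form file#id(ID)."""
    match = ID_PATTERN.search(href)
    if match is None:
        raise ValueError("unexpected href format: " + href)
    return match.group(1)


def read_links(xml_text):
    """Return one record per <link>: anaphor, relation type, candidate anchors."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for link in root.iter("link"):
        records.append({
            "markable": target_id(link.get("href")),
            "type": link.get("type"),
            "anchors": [target_id(a.get("href")) for a in link.findall("anchor")],
        })
    return records


def ambiguous_links(records):
    """Assumed convention: a link with several anchors marks an ambiguous anaphor."""
    return [r for r in records if len(r["anchors"]) > 1]


if __name__ == "__main__":
    for record in ambiguous_links(read_links(SAMPLE)):
        print(record["markable"], "may refer to", record["anchors"])
```

Under these assumptions only de_19 (the last "it") is reported as ambiguous, between de_09 and de_03. A set-based (MMAX-style) representation would need a different reader, since there the discourse entity, not the pairwise link, is the unit.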