Rules based interlingua translation in ATLAS
Leonardo Lesmo & Alessandro Mazzei & Cristina Battaglino
Dipartimento di Informatica, Università degli Studi di Torino
Interaction Models Group
Seminari CIRMA 2012, 13-06-2012

The linguistics of LIS
● Several articulators: parallelism
● Spatial "organization" of the sentence
● Meteo19-3
● Plural
● SOV
● Many local dialects
● A few linguistic studies
● No prepositions, gender, articles
● No natural written form (!!!): glosses + features

Rules-based Translation Architecture
(Diagram) Input: Italian Sentence
Modules: Parser, Semantic Interpreter, Generator, Spatial Allocation Planner, Animation Interpreter
Resources: Dependency Grammar for Italian, Ontology, CCG for LIS, Spatial Allocation Strategy, Signary
Intermediate formats: AW-LIS, AEW-LIS

Parser → Semantic Interpreter → Generator → Spatial Allocation Planner

Turin University Parser
● A wide-coverage, bottom-up, rule-based dependency parser
● Rules for: Chunking, Coordination, Verb-SubCat
● Dependency relations: Morpho-syntactic, Syntactic-functional, Semantic

Turin University Parser
Paolo è davvero veloce ("Paolo is really fast")
● After Chunking: rmod
● After Verb-SubCat: subj, predcompl, rmod

Turin University Parser
Oggi ultimo giorno del mese di Giugno, con valori di temperatura superiori alla media ("Today, last day of the month of June, with temperature values above average")
(Diagram: dependency tree of the sentence, with arcs labelled rmod, arg and separator.)

Semantic Interpreter
● Fregean compositional approach
● Ontology-based syntax-semantics interface
● Sentence meaning is a complex semantic network
● Semantic objects (e.g. roles, time, space) are encoded in the network

Ontology = Description + Weather + World
(Diagram of ontology nodes: ££entity, ££weatherstatusdescription, £Adriatic, ££sea, ££weatherevent, ££it-geograrea, ££daydescription, ££evaluableentity, ££meteostatus-situation, ££day, ££timeintervaldescription, ££it-area-spec, ££time-interval, ££geographicarea, ££description, ££positiveeval-entity, ££evening, ££weatherstatussituation, ££precipitation, ££positive-evening, £perturbato, ££deicticdaydescription, ££itregion, ££storm, ££clouds, ££positive-day, ££it-adriatic-region, £monday, £today.)

Computing Meaning by recursion
(Diagram: a node n with dependents d1, ..., di, ..., dK.)

Ordinals
… ultimo giorno del mese … ("… last day of the month …")
(Dependency fragment: ultimo [£last], giorno [££day], del, mese [££month]; arcs: adjc+ordin-rmod, rmod, arg.)

Ordinals in the ontology
(Diagram of ontology nodes and relations: ££physical-entity, &part-smaller, &temporalpart-smaller, ££time-interval, &day-indaymonth, ££day, &part-bigger, ££part-of, &temporalpart-bigger, ££temporal-part-of, ££day-month-part-of, &month-indaymonth, ££month, ££day-sequence, ££sequenceableentity, &ord-described-item, ££ordinaldescription, &reference-sequence, &ordinaldesc-selector, ££entity-sequence, ££ordinal-selector, £last.)
(Next build: the same ontology fragment overlaid with the dependency fragment ultimo [£last], giorno del [££day], mese [££month].)

Ontological restriction
(Diagram: the dependency fragment ultimo [£last], giorno del [££day], mese [££month], with arcs adjc+ordin-rmod, rmod, arg
 ⇝ ££ordinaldescription with relations &reference-sequence, &ord-described-item, &ordinaldesc-selector over ££day, ££month, £last.)

Semantic Interpreter
(Diagram: dependency tree of "Oggi ultimo giorno del mese di giugno, con valori di temperatura superiori alla media", with arcs labelled rmod, arg and separator.)

Natural Language Generation
NLG systems use knowledge about language and the application domain to automatically produce documents, reports, explanations, help messages, and other kinds of texts.
Document Planning
● Content Selection
Micro Planning
● Lexicalization
● Aggregation
● Ref. Expression
Surface Realization
● Word Order
● Functional Words
● Inflections
→ TEXT

NLG in ATLAS
Semantic Network → (Micro Planning) → Abstract Syntax Tree → (Surface Realization) → LIS Dependency Tree

ATLAS Generator
MicroPlanner: Sentence Designer
● Homemade
● LISP
● Heuristics
● Expert System
● Taxonomy
Realizer: OpenCCG
● Take-away (off-the-shelf)
● Java
● LIS-CCG (homemade)
● AMMA-DB

Sentence Designer
Algorithm:
1. Segmentation
   Split the semantic network into single messages
2. Lexicalization
   Introduce quasi-lexical items
   Introduce syntactic relations between items
3. Simplification
   Remove unnecessary quasi-lexical items
   Remove repetitions

Sentence Designer
LISA Expert System: ~50 rules

    (defrule rule-TO-FORESEE-01 ()
      (SEMANTIC-STATE (NAME ££TO-FORESEE-1) (ARG-1 ?X2))
      (SEMANTIC-RELATION (NAME &FORESEEN) (ARG-1 ?X2) (ARG-2 ?X3))
      =>
      (assert (syntactic-relation (name SYN-OBJ) (arg-1 ?X2) (arg-2 ?X3))))

    (defrule rule-COMPARISON-RELATION ()
      (SEMANTIC-STATE (NAME ££COMPARISON-RELATION) (ARG-1 ?X12))
      (SEMANTIC-RELATION (NAME &COMPAR-ARG1) (ARG-1 ?X12) (ARG-2 ?X11))
      (SEMANTIC-RELATION (NAME &COMPAR-ARG2) (ARG-1 ?X12) (ARG-2 ?X14))
      (SEMANTIC-RELATION (NAME &COMPAR-OP) (ARG-1 ?X12) (ARG-2 ?Y12))
      =>
      (assert (syntactic-relation (name SYN-SUBJ) (arg-1 ?y12) (arg-2 ?x11)))
      (assert (syntactic-relation (name SYN-OBJ) (arg-1 ?y12) (arg-2 ?x14))))

OpenCCG
LIS-CCG: ~40 syntactic families, ~30 morphological families

    noun-II(nord-1533-1,it-northern-region,it-region,noun)
    ...
    family TransV_III-A-s_nuvola-aumentare {
      entry: s<1> [E] \ np [X] : E:meteo-status-situation(* <SYN-SUBJ>X:clouds);
    };
    family TransV_III-B {
      entry: s<1> [E PoS=verb] \ np [X] \ np [Y] :
        E:meteo-status-situation(* <SYN-SUBJ>X:evaluable-entity <SYN-OBJ>Y:evaluable-entity);
    };
    family Adj-II {
      entry: n<~5> [X] \* n<5> [X] : X(<ATT>(P *));
      entry: s <1>[X adj] \* np [Y] : X:weather-status-situation(*<SYN-SUBJ>Y:meteo-status-situation);
      entry: s <1>[X adj] \* np [Y] : X:weather-status-situation(*<SYN-SUBJ>Y:evaluable-entity);
      entry: s <1>[X adj] \* np <5> [X] : X(<SYN-RMOD-SEQPOS>(P *));
      entry: s <~1> [Z adj] \* s <1>[Z] : Z:event( <COORD> (P *));
    }

OpenCCG
(Diagram: LIS dependency tree for the example sentence. Nodes: oggi, mese, giugno, giorno, ultimo, temperatura, valore_valere, media, superiore. Arc labels: SUBJ, OBJ, RMOD. Sign identifiers as they appear: noun–2669-2, noun–1398-2, noun–3056-1, noun–1052-2, noun–2747-1, noun–2998-2, noun–80020-2, adj–3072-2, verb–3168-2.)

Spatial Allocation Planner
● Rule-based
● Space
● Iconic
● Grammatical
● Syntax
● Phonology
● Gravitation

Managing grammatical space
FUNCTION Split-space (…)
1. Find the dependents of the head under examination that must be assigned a subspace because of structural features of the sentence
2. Divide the subspace of the head (giorno in the example) among the dependents found
3. Call Split-space recursively on the dependents found
4. Assign a position to the head under examination
5. Find the dependents that relocate onto the parent head because of structural features of the sentence

Split-space (giorno)
1. Find only the dependents of "giorno" that must be assigned a subspace because of structural features of the sentence: {oggi, mese}
2. Divide the subspace of "giorno" among the dependents found
3. Call Split-space on the dependents found: Split-space (oggi), Split-space (mese)
4. Assign a position to "giorno"
5. Check whether any dependents of "giorno" have not yet been examined
6. Relocate them onto the head "giorno": {ultimo}
(Diagram, built up step by step over the nodes Giorno, Oggi, Mese, Ultimo, Giugno. Giorno has space [-1 0 0 0 0 0 1 0 0] and, after step 4, is marked relocatable, non-directional, at position (0.0 0.0 0.0). Step 2 assigns the subspaces [-1 0 0 -0.5 0 0 0 0 0] and [0 0 0 +0.5 0 0 1 0 0] to Oggi and Mese. After step 6, Ultimo is marked relocatable, non-directional, with (0.0 0.0 0.0) and [-1 0 0 0 0 0 1 0 0].)

Spatial Allocation Planner
Output: AEW-LIS

Conclusions
● "Classic" interlingua machine translation system
   Deep analysis of Italian
   LIS generation
   Grammatical management of space
● Engineered
   Desktop demo
   Web service
● Coverage and Quality
   Strengthening the knowledge bases
   Recovery system
   ...

Coverage and Quality
The Welsh text says: I am not in the office at the moment. Send any work to be translated.

Thank you for your attention.

    POST /TURBTWebService/RULEBTranslator/01234 HTTP/1.1
    User-Agent: ATLAS-Orchestrator/1.0.0 (Linux)
    Host: localhost:8080
    Content-Type: text/xml
    Content-length: ...

    <?xml version="1.0"?>
    <text id="01234">
      <sentence id="5" start_time="8.45" duration_time="1.425">Temporali a centro nord</sentence>
      <sentence id="6" start_time="9.875" duration_time="14.359">Sole a sud e al centro.</sentence>
    </text>

(Diagram labels: numbered steps 1 to 8; WEB Server; Servlet Server, Apache Tomcat 7; RBT.java; TextID + XML; 4.1 Parser; 4.2.)

    HTTP/1.1 200 OK
    Content-Type: text/sal
    Content-Length: ...
    5 8.45 1.425 ftp://.../01234/atlas_aewlis_manual_movie_meteo_080701_030000_f1.xml
    6 9.875 14.359 ftp://.../01234/atlas_aewlis_manual_movie_meteo_080701_030000_f2.xml

(Diagram labels: FTP server; Semantic Interpreter; 4.3; 4.4; Generator; Spatial Allocation Planner.)

Turin University Parser
Domani le nuvole sono in aumento al nord ("Tomorrow clouds are increasing in the north")

Example: "Le nuvole sono in aumento al nord"
Sentence Designer

OpenCCG
(Diagram: nuvola_aumentare (verb–2888-2) with SUBJ nuvola (noun–2667-2) and RMOD nord (noun–1553-1).)

Input parameter     Parameter description                  GIORNO
Sign id             Id in the signary                      1053
Time Seq            Position in the sentence               4
Lemma               Name of the lemma                      Giorno
Link                Arc label                              nil
Arg-ref             Relocation direction                   {}
Dependents          List of dependents                     {oggi, mese, giugno}
Reloc               Relocatability                         Yes
Absolute position   Position of a non-relocatable sign     Nil
Dir                 Directionality                         No

NLG
Document Planning
● Content Selection
   like(Ale-4475,radio'), dislike(Ale-4475,tv')
Micro Planning
● Lexicalization
● Aggregation
● Ref. Expression
   amare [sbj Alex, obj radio] conj odiare [sbj Alex, obj TV]
Surface Realization
● Word Order
● Functional Words
● Inflections
   TEXT: "Alex ama la radio e Alex odia la TV" ("Alex loves the radio and Alex hates the TV")
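
The NLG example on the last slide (two messages, lexicalized into two clauses and joined by a conjunction) can be sketched as a toy pipeline. This is only an illustration under stated assumptions: the `Message` class, the tiny `VERBS` and `NOUNS` lexicons, and the function names are hypothetical and are not part of the ATLAS generator, which uses the Sentence Designer and OpenCCG instead.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Message:
    """One selected content item, e.g. like(Alex, radio). Hypothetical type."""
    predicate: str
    agent: str
    theme: str


# Hypothetical toy lexicons: inflected verb forms and noun phrases with article.
VERBS: Dict[str, str] = {"like": "ama", "dislike": "odia"}
NOUNS: Dict[str, str] = {"radio": "la radio", "tv": "la TV"}


def lexicalize(msg: Message) -> Dict[str, str]:
    """Micro planning: map a message to a small dependency structure."""
    return {"head": VERBS[msg.predicate], "sbj": msg.agent, "obj": NOUNS[msg.theme]}


def realize_clause(tree: Dict[str, str]) -> str:
    """Surface realization: linearize one clause in subject-verb-object order."""
    return f'{tree["sbj"]} {tree["head"]} {tree["obj"]}'


def generate(messages: List[Message]) -> str:
    """Lexicalize each message, realize it, and coordinate the clauses with 'e'."""
    clauses = [realize_clause(lexicalize(m)) for m in messages]
    return " e ".join(clauses)


text = generate([Message("like", "Alex", "radio"), Message("dislike", "Alex", "tv")])
print(text)
```

The sketch keeps the subject in both clauses, as the slide does; a fuller aggregation step could drop the repeated "Alex".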
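
The Split-space procedure from the Spatial Allocation Planner slides can also be sketched as a short recursive function. This is a minimal sketch under several assumptions that are not in the slides: the signing space is reduced to a one-dimensional interval instead of the nine-element vectors shown in the diagram, the structural test of steps 1 and 5 is replaced by a hypothetical `needs_subspace` flag, a head is placed at the centre of its interval, and "giugno" is assumed to depend on "mese".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Sign:
    lemma: str
    # Assumption: a boolean stands in for the structural test of steps 1 and 5.
    needs_subspace: bool = False
    dependents: List["Sign"] = field(default_factory=list)


def split_space(head: Sign, space: Tuple[float, float],
                positions: Dict[str, float]) -> Dict[str, float]:
    left, right = space
    # Step 1: dependents that must receive their own subspace.
    spaced = [d for d in head.dependents if d.needs_subspace]
    if spaced:
        # Step 2: divide the head's interval evenly among them.
        width = (right - left) / len(spaced)
        for i, dep in enumerate(spaced):
            sub = (left + i * width, left + (i + 1) * width)
            # Step 3: recurse on each of these dependents.
            split_space(dep, sub, positions)
    # Step 4: place the head (assumed here: centre of its interval).
    positions[head.lemma] = (left + right) / 2
    # Steps 5-6: remaining dependents are relocated onto the head.
    # Simplification: any dependents of a relocated sign are not visited.
    for dep in head.dependents:
        if not dep.needs_subspace:
            positions[dep.lemma] = positions[head.lemma]
    return positions


# Assumed tree for the example: giorno -> {oggi, mese, ultimo}, mese -> {giugno}.
giorno = Sign("giorno", dependents=[
    Sign("oggi", needs_subspace=True),
    Sign("mese", needs_subspace=True, dependents=[Sign("giugno")]),
    Sign("ultimo"),
])
positions = split_space(giorno, (-1.0, 1.0), {})
print(positions)
```

With the interval (-1.0, 1.0), "oggi" and "mese" each receive one half of the space and "ultimo" lands on the position of "giorno", mirroring the order of the steps in the walk-through rather than the planner's actual coordinates.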