Automatic Translation from Textual Representations of Laws to Formal Models through UML

Pietro Mercatali, Francesco Romano
ITTIG-CNR (Istituto di Teoria e Tecniche dell'Informazione Giuridica, CNR)
{ mercatali, romano }@ittig.cnr.it

Luciano Boschi, Emilio Spinicci
DSI (Dipartimento di Sistemi e Informatica, Università degli Studi di Firenze)
{ boschi, spinicci }@dsi.unifi.it

JURIX 2005 – 18th Annual Conference on Legal Knowledge and Information Systems
Bruxelles, December 8, 2005

P. Mercatali, F. Romano, L. Boschi, E. Spinicci: Automatic Translation from Textual Representation of Laws to Formal Models through UML

Outline
● Introduction
● Information Extraction & Modelling of Express Textual Amendments using UML
  ● UML modelling of the contents of a legislative text and of the applied amendments
● LexLooter plug-in
  ● Tool functionalities: UML instantiation, generation of the coordinated text
  ● Application example
● Conclusions and future perspectives

Introduction
Motivations:
● The contents of a legislative text subject to Express Textual Amendments (ETAs) may be hard to understand
● Recognition and validation can be supported by:
  ● NLP and IE tools and techniques (XML markup)
  ● Formal representation of legislative contents (UML)
  ● Automatic generation of the coordinated text
Goal:
● Implementing a software tool for the automatic processing of a legislative text subject to several ETAs
  ● LexLooter

Express Textual Amendment (Novella)
● A kind of normative nexus in which an active disposition affects a passive one by modifying its text
● Kinds of amendment: repeal, insertion, substitution
[Diagram: Root Law + Express Textual Amendment → Coordinated (Enforced) Text]

ETA example
Regional Statute 43/2003, article 1, comma 1 (ETA):
  1. In article 1, comma 2, of Regional Statute October 14, 1999, n. 52 (Norme sulle concessioni, le autorizzazioni e le denunce d'inizio delle attivita' edilizie [...]), after the words "procedimenti amministrativi," are inserted the following: "al soddisfacimento dei bisogni sociali ed".
Regional Statute 52/1999, article 1, comma 2 (Root Law):
  2. La presente legge e' finalizzata all'applicazione dei principi di efficienza e di trasparenza nei procedimenti amministrativi, al perseguimento contestuale del servizio al singolo cittadino e della tutela degli interessi pubblici e collettivi.
Regional Statute 52/1999, article 1, comma 2 (Root Law, modified):
  2. La presente legge e' finalizzata all'applicazione dei principi di efficienza e di trasparenza nei procedimenti amministrativi, al soddisfacimento dei bisogni sociali ed al perseguimento contestuale del servizio al singolo cittadino e della tutela degli interessi pubblici e collettivi.
  [English gloss: "This law aims at applying the principles of efficiency and transparency in administrative procedures, at satisfying social needs and at the simultaneous pursuit of service to the individual citizen and of the protection of public and collective interests."]
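
The insertion shown in the ETA example can be reproduced in a few lines of Python. This is only a minimal sketch of a string-level insertion, not LexLooter's actual algorithm; the function name `insert_after` is a hypothetical helper introduced here for illustration.

```python
def insert_after(passive_text: str, anchor: str, new_text: str) -> str:
    """Insert new_text right after the first occurrence of anchor.

    Hypothetical helper: it mimics an insertion-type ETA acting on a
    text string of the passive (root) disposition.
    """
    pos = passive_text.find(anchor)
    if pos < 0:
        # e.g. a typing error in the novella prevents the match
        raise ValueError(f"anchor not found: {anchor!r}")
    end = pos + len(anchor)
    return f"{passive_text[:end]} {new_text}{passive_text[end:]}"


# Root Law: Regional Statute 52/1999, article 1, comma 2
ROOT = (
    "La presente legge e' finalizzata all'applicazione dei principi di "
    "efficienza e di trasparenza nei procedimenti amministrativi, al "
    "perseguimento contestuale del servizio al singolo cittadino e della "
    "tutela degli interessi pubblici e collettivi."
)

# ETA: Regional Statute 43/2003, article 1, comma 1
amended = insert_after(
    ROOT,
    "procedimenti amministrativi,",
    "al soddisfacimento dei bisogni sociali ed",
)
print(amended)
```

An anchor that cannot be found raises an error instead of silently leaving the text unchanged, which matches the idea that unmatched amendments should be reported rather than ignored.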

UML Modelling
● Modelling legislative acts: a UML Class Diagram including:
  ● Structure of the text
  ● Amendment dispositions
● Instantiating the Class Diagram for each legislative act:
  ● Generating one Object Diagram (UML 2.0) for each analysed act
● Processing algorithms for:
  ● Search of partitions and amendment dispositions
  ● ETA application to Root Models
  ● Verification of instantiated structures

Structure Package
● NIR Law: type, number, date, ...
● Articles
● Partitions: Titolo, Articolo, Comma, ...
● Text Elements: Corpo, Rubrica, ...

ETA
● Associated with the partitions of the Novella
● Repeals, Insertions, Substitutions
  ● of text strings or entire partitions
● References to Law partitions

Coordinated model
● Change Class: allows tracking of the applied amendments
● Included in:
  ● Legge (Law) Class
  ● Text Elements
  ● Partitions

Adopted Information Extraction tools
● LexEdit XXI: a tool by ITTIG and CELI s.r.l. for XML structural marking and checking of a legislative text
● Sophia 2.1: a general-purpose FSA linguistic parser developed by CELI s.r.l., used for automatic recognition and XML marking of ETAs

Document Processing
[Figure: processing pipeline. Labels: Root Law, LEXXI XML (twice), Root Model, Structure Model, ETA, Sophia XML, ETA Model, Coordinated Model]
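
The class model described in the slides (structure package, ETA, Change class) can be approximated with plain Python dataclasses. This is a rough sketch under stated assumptions: the class names `Legge`, `Partition`, `TextElement` and `Change` follow the slide labels, but every attribute name and the helper `find_partition` are hypothetical and do not reproduce the authors' UML diagram.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class AmendmentKind(Enum):
    REPEAL = "repeal"
    INSERTION = "insertion"
    SUBSTITUTION = "substitution"


@dataclass
class Change:
    """Tracks one applied amendment (attribute names are assumptions)."""
    kind: AmendmentKind
    source_act: str  # the novella that introduced the change


@dataclass
class TextElement:
    """A text element such as Corpo or Rubrica."""
    role: str
    text: str
    changes: list[Change] = field(default_factory=list)


@dataclass
class Partition:
    """A partition such as Titolo, Articolo or Comma."""
    kind: str
    number: str
    elements: list[TextElement] = field(default_factory=list)
    children: list[Partition] = field(default_factory=list)
    changes: list[Change] = field(default_factory=list)


@dataclass
class Legge:
    """A law: type, number, date and its partitions."""
    act_type: str
    number: str
    date: str
    partitions: list[Partition] = field(default_factory=list)
    changes: list[Change] = field(default_factory=list)


def find_partition(law: Legge, path: list[tuple[str, str]]) -> Partition | None:
    """Follow (kind, number) pairs down the partition tree."""
    level = law.partitions
    found = None
    for kind, number in path:
        found = next(
            (p for p in level if p.kind == kind and p.number == number), None
        )
        if found is None:
            return None
        level = found.children
    return found


# Object-diagram-like instance for Regional Statute 52/1999, art. 1, comma 2
comma2 = Partition("Comma", "2", elements=[
    TextElement("Corpo", "La presente legge e' finalizzata [...]"),
])
law = Legge("Regional Statute", "52", "1999-10-14",
            partitions=[Partition("Articolo", "1", children=[comma2])])

target = find_partition(law, [("Articolo", "1"), ("Comma", "2")])
target.changes.append(Change(AmendmentKind.INSERTION, "L.R. 43/2003"))
```

Attaching `Change` records to the law, its partitions and its text elements mirrors the slide's remark that the Change class is included in all three.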

Model generation
● T0: Root Model
● T1: ETA Model 1 → Coordinated Model 1
● T2: ETA Model 2 → Coordinated Model 2
● T3: ETA Model 3 → Coordinated Model 3
● ...

LexLooter
Main features:
● Plug-in for the Eclipse 3.0 framework
● OO design
● Integrates the UML2 Eclipse library
Functionalities:
● Implementation of the proposed methodology:
  ● Parsing of the XML documents produced by LexEdit XXI and Sophia 2.1
  ● Instantiating the UML model of the analysed act
● Verification of the applied amendments
● Generation of the Coordinated Text
● Analysis of the text currently in force

Example of Coordinated Model
[Figure: processing pipeline. Labels: Root Law, LEXXI XML (twice), Root Model, Structure Model, ETA, Sophia XML, ETA Model, Coordinated Model]

Coordinated Text Generation
● The coordinated model is converted to HTML using an XSL (eXtensible Stylesheet Language) Transformation

Text in force Analysis
[Figure labels: Root Model, ETA Model 1, ETA Model 2, Coordinated Model 1, Coordinated Model 2]
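
The model generation chain (root model at T0, then one coordinated model per ETA) and the text-in-force analysis can be pictured as a fold over the amendments in chronological order. The sketch below is illustrative only and rests on assumptions not stated in the slides: each ETA is reduced to a function on plain text and carries an effective date, and the names `EtaModel`, `coordinated_versions` and `text_in_force` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class EtaModel:
    """One amending act, reduced to a date and a text transformation."""
    in_force_from: date          # assumption: every novella has an effective date
    apply: Callable[[str], str]  # assumption: an ETA acts as a function on text


def coordinated_versions(
    root_text: str, root_date: date, etas: List[EtaModel]
) -> List[Tuple[date, str]]:
    """Return the root model followed by each coordinated model, oldest first."""
    versions = [(root_date, root_text)]
    current = root_text
    for eta in sorted(etas, key=lambda e: e.in_force_from):
        current = eta.apply(current)
        versions.append((eta.in_force_from, current))
    return versions


def text_in_force(versions: List[Tuple[date, str]], on: date) -> Optional[str]:
    """Pick the latest version whose date is not after `on` (None if none)."""
    in_force = None
    for start, text in versions:
        if start <= on:
            in_force = text
    return in_force


# Toy data: dates and texts are invented for illustration only.
etas = [
    EtaModel(date(2003, 1, 1), lambda t: t.replace("B", "X")),  # substitution
    EtaModel(date(2002, 1, 1), lambda t: t + " D"),             # insertion
]
versions = coordinated_versions("A B C", date(2000, 1, 1), etas)
print(text_in_force(versions, date(2002, 6, 1)))
```

Keeping every intermediate version, rather than only the last one, is what makes it possible to ask which text was in force on a given date.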

Application Example: Tuscany Statute 52/1999
Express Textual Amendments marked by Sophia and applied by LexLooter (Str. = text string, Part. = partition):

Novella        Tool                    Total  Str.Repeal  Str.Subst.  Str.Ins.  Part.Repeal  Part.Subst.  Part.Ins.
L.R. 13/2002   Marked by Sophia            8           0           2         0            2            2          2
L.R. 13/2002   Applied by LexLooter        8           0           2         0            2            2          2
L.R. 43/2003   Marked by Sophia           34           0           7         4            7           13          3
L.R. 43/2003   Applied by LexLooter       31           0           7         2            7           13          2
Overall        Marked by Sophia           42           0           9         4            9           15          5
Overall        Applied by LexLooter       39           0           9         2            9           15          4

● LexLooter successfully applied 39 of the 42 ETAs marked by Sophia (93%)

Unmodelled amendments
● Typing errors (2) prevent correct matching by the text extraction regex
● Missing structural XML markup of the text added by an amendment (1):
  ● a partition (e.g. a comma) is replaced by several partitions (several commas)
  ● the text added by an amendment appears quoted in the novella and cannot be marked by LexEdit
● Text normalisation is currently implemented (blank removal, quote reduction, …)
● Regex refinement and better LexEdit integration will allow structural markup of added partitions

Unmarked Amendments
● Unmarked amendments are also taken into account
● Sophia only marks “well formed” amendments, i.e. those written according to modern drafting rules
● LexLooter can estimate the number of unmarked amendments by regex matching on the text outside the Sophia markup
● Warnings report the estimated number of unmarked or incorrectly marked amendments, as well as the unmodelled elements
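
The estimate of unmarked amendments by regex matching outside the Sophia markup could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the slides do not name the element Sophia uses to wrap a recognised amendment (a hypothetical `<mod>` element is used here), and the Italian cue phrases are a small invented sample, not LexLooter's actual patterns.

```python
import re

# Assumption: recognised amendments are wrapped in a <mod>...</mod> element.
MARKED = re.compile(r"<mod\b[^>]*>.*?</mod>", re.DOTALL)

# Assumption: a few Italian cue phrases that typically signal an amendment,
# e.g. "sono inserite", "è sostituito", "e' abrogato".
CUES = re.compile(
    r"\b(?:è|e'|sono)\s+(?:inserit|sostituit|abrogat|soppress|aggiunt)[oaie]\b",
    re.IGNORECASE,
)


def estimate_unmarked(xml_text: str) -> int:
    """Count amendment-like phrases found outside the marked regions."""
    outside = MARKED.sub(" ", xml_text)  # drop everything already marked
    return len(CUES.findall(outside))


sample = (
    '<mod>dopo le parole "x" sono inserite le seguenti: "y"</mod> '
    "Il comma 3 è abrogato. Il comma 4 e' sostituito dal seguente: [...]"
)
print(estimate_unmarked(sample))
```

A count of this kind is only an estimate: a cue phrase may occur in ordinary text, and an unusual wording may escape every pattern, which is why the slides speak of warnings rather than of automatic modelling.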

Next Developments
● Extending the Sophia marking rules so that “badly formed” amendments also appear in the UML diagram
● Implementing a verification algorithm to validate the modelled amendments
● Correctness attributes in the Change Class, to be instantiated for “well formed” or “badly formed” amendments (and notified with suitable warning messages)

Conclusions
● The LexLooter prototype has shown its potential:
  ● as a tool supporting the recognition of a legislative text liable to express textual amendments
  ● as a tool supporting the validation and verification of a legislative text
● Further improvements include:
  ● Definition of OCL structural constraints to validate the structure of the model according to legislative drafting rules
  ● UML interaction: Omondo EclipseUML / GMF
● Experimentation on an extended corpus of legislative acts is also a priority

Future Work
● Interfacing with NirEditor and adaptation of metadata (DTD NiR)
● Evolution of the adopted UML model:
  ● Inclusion of further structural and textual information
    ● Non-standard structures
    ● Differentiated being-in-force of individual articles
  ● Conditioned amendments
● Extension of the methodology to other documentation:
  ● Administrative acts

Open Issues (1)
● As in other projects, the main goal of this work is to capture the semantics of a text by adopting the analysis techniques of text linguistics.
● Text linguistics postulates three main levels constituting the communicative act:
  ● textual
  ● inter-textual
  ● extra-textual

Open Issues (2)
● In this work, the implemented model is built within the textual and inter-textual levels, without considering the extra-textual one. We call it the “pre-situational and static model”.
● The model represents a text without taking into account the factual situations, which nevertheless interact in determining the (continuously evolving) meaning of the text itself.
● We link the “pre-situational model” to a “normativistic model”, which represents the general and abstract legal rule.
● The “situational model”, instead, is the model obtained after positioning the utterance in the factual situation.
● We link the “situational model” to the “interpretative model”, which represents the interpreted norm.

Open Issues (3)
● In the following, we exemplify this through the difference between enforcement and effectiveness.
● We consider the first institute to be saturated within the textual and inter-textual levels, and the second one to result from the interaction between the textual, inter-textual and extra-textual levels.
● In Italian jurisprudence, the difference between enforcement and effectiveness is often exemplified by the war criminal code: in peace time the code is certainly enforced but not effective.
  Its effectiveness will depend on the declaration of war, but also on real situations that cannot be computed by a model representing the legislative system, since they are extraneous to the system itself.
● The war criminal code, in fact, provides for cases of “automatic” effectiveness (application) of war criminal law connected to certain events, without the need for a bill, an order or any other act and, in particular, without the declaration of war which, as a general rule, triggers the application of the war criminal code (art. 3 c.p.m.g.).
● On the one hand, enforcement is captured by a model built only at the textual and inter-textual levels; on the other, capturing effectiveness requires covering the extra-textual level.

Open Issues (4)
● Building the “situational/interpretative model” requires the representation of events and of the dynamic interaction between the events and the text.
● We believe UML is a notation potentially able to represent the extra-textual level as well, enabling more effective “interpretative models” for evaluating the normative product; such models could be adopted not only by the legislator, but also by lawyers who apply and interpret the law.
● Modelling such events and situations needs UML notations different from the static diagrams adopted to represent the textual and inter-textual levels.
● UML notations for describing dynamic aspects (e.g. sequence and activity diagrams) should be adopted to model the extra-textual level and its interaction with the text.
● We hope that this issue receives critical attention from the computer/law community, so that it can be either developed or demolished.

Thank you for your attention!

Additional Slides

Introduction
Expertise of DSI and ITTIG:
● Experience in information extraction & modelling of technical documentation:
  ● Patents, Software Requirements Specifications
● Experience in analysing legislative texts using structural and semantic information extraction techniques:
  ● Structural XML marking: NormeInRete (NIR) project, NIR DTDs
  ● Semantic analysis of Express Textual Amendments

Warnings after Sophia XML Analysis
● Detection of amendments left unmarked or incorrectly marked by Sophia:
  ● Possible wrong tags
  ● Assessment of unmarked amendments

Modelling notifications
● Further warnings may be produced at the end of the UML modelling phase:
  ● typing errors
  ● missing structural markup of added text

Unmodelled amendments
Text “normalisation” currently implemented:
● accented characters (è, é, e')
● quotes (e.g.: “ ” '' ` ´ ')
● removal of XML tags that are unnecessary for generating the coordinated text (e.g. reference tags <rif…> […] </rif>)
● blank removal
Next implementation:
● Refinement of the regular expressions for text extraction
● Integration with LexEdit for the markup of added text

Next Developments
[Figure: example of coordinated model with verified amendment correctness]
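
The normalisation steps listed on the "Unmodelled amendments" slide (accented characters, quotes, unnecessary XML tags, blanks) can be sketched as a single Python function. This is a minimal illustration, not the tool's implementation. It assumes that accented vowels are unified into the "vowel + apostrophe" spelling used in the statute text, and that `<rif>` tags are dropped while their content is kept; the function name `normalise` is hypothetical.

```python
import re
import unicodedata

# Map typographic quote variants to plain ASCII quotes.
QUOTES = str.maketrans({
    "“": '"', "”": '"',
    "‘": "'", "’": "'", "`": "'", "´": "'",
})


def normalise(text: str) -> str:
    """Normalise a text fragment before comparing it with another one."""
    # 1. Drop <rif ...> and </rif> tags, keeping the text they enclose.
    text = re.sub(r"</?rif\b[^>]*>", "", text)
    # 2. Reduce quote variants ('' counts as a double quote).
    text = text.replace("''", '"').translate(QUOTES)
    # 3. Unify accents: decompose, then turn grave/acute marks into "'".
    text = unicodedata.normalize("NFD", text)
    text = text.replace("\u0300", "'").replace("\u0301", "'")
    text = unicodedata.normalize("NFC", text)
    # 4. Collapse runs of blanks and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(normalise('La  presente legge è <rif id="x">finalizzata</rif>  “oggi”'))
```

Applying the same function to both the quoted words of the novella and the root text makes spelling variants such as "attività" and "attivita'" compare as equal.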