Legal Information Retrieval
on the Web
The Experience of the NiR Portal
Costantino Ciampi
Rome, 26 April 2004
CONSIGLIO NAZIONALE DELLE RICERCHE
Istituto di Teoria e Tecniche dell’Informazione Giuridica
http://www.ittig.cnr.it
NiR Portal: http://www.nir.it
Contents


Normeinrete (NIR) – “Access to Law on the Net”: an e-Government project
• Project description (goals, technology, results)
Standardization in the legal domain:
• XML representation of Italian norms
• URN adoption to automate hyperlinking among norms in a distributed
environment
NiR Project "Access to Law on the Net"
Project goals
• Improving accessibility to legislation by providing a single
point of access to Italian and EU legal documents published on
different web sites
– ICT to support the exercise of rights
• Supporting Public Administrations (PA) in managing the life cycle
of legislative documentation and law consolidation by providing
standards, software tools and methodologies
– ICT to improve PA efficiency
A system prototype (third version) is available at the URL:
http://www.normeinrete.it
NiR Actors
• Main Actors:
– Ministry of Justice (initiator) (www.giustizia.it)
– AIPA, now CNIPA: formerly the Authority, now the National Center for
Information Technology in the Public Administration (founder and
technical coordinator) (www.cnipa.it)
• Scientific and Technical Partners:
– Institute of Legal Information Theory and Technologies of the
CNR, Florence (www.ittig.cnr.it)
– CINECA Consortium, Bologna (www.cineca.it)
• Public Administrations participating in the Project
Steps and Resources of the NiR Project
• Phase I (May 1999 - May 2000)
First feasibility study and implementation of the Portal prototype
• Phase II (December 2000 - November 2001)
Second feasibility study, extension of the document base
and qualitative evolution of the Portal prototype
• Phase III (years 2002/2003)
Definition of standards (URN and XML) and preparation of the
software for the dissemination of the standards (parser of
references and parser of structures, NIREditor XML)
• Phase IV (years 2004/2005)
Handover to external operators and full operation of the NIR
Portal (with economic resources from the e-Government
programme and Italian financial laws)
NiR Project Strategy
• Implementation of a specialized portal, delivering
search and retrieval functions for legislative documents
published on the web sites of various Public Administrations;
• Definition of standards, consistent with Internet
technologies, to represent data and metadata meaningful
in the legal domain;
• Development and distribution of open source
software to support legislative document management
and publishing;
• Training and knowledge sharing among Public
Administrations.
Present Results
• www.normeinrete.it: provides unified access to Italian and European
Union legislation published on different institutional web sites
So far:
– more than 50 public institutions have taken part in the Project;
– more than 140,000 documents have been indexed;
– about 160,000 search sessions take place on the site each month;
– the NiR Legal Database ("Norm Catalogue"), including metadata, has been
created and is kept up to date;
– the NiR Standards have been defined.
• Two standards issued by AIPA/CNIPA as technical norms:
– DTD definitions for Italian legislation;
– URN definition for any kind of legal document.
• Editors and other software tools developed and distributed to PA to support
the implementation of the standards.
NiR Features
• The system is based on a co-operative technological architecture,
resulting in a federation of legislative databases developed on different
platforms.
• Co-operation is achieved by means of suitable application
gateways which provide "loose" integration by adopting two standards:
– one for identifying legal resources (URNs), and
– one for representing document structures and metadata in XML mark-up
according to ad hoc DTDs.
Searching Tools and Architecture of the NIR System (1)
The NIR System consists of:
• NiR nodes: components belonging to administration domains
containing legal database systems and related application
gateways. Documents can be stored in the file system or within
database/full text management systems: they are all accessible
through the Internet
• Central registries: components in the co-operative layer
publishing the information needed for effective co-operation
Searching Tools and Architecture of the NIR System (2)
• Central registries include:
– Standards repository (XML DTD and URN grammar definitions
and tools);
– Registry of official Authority names, needed to standardise URN
adoption;
– Registry of NiR nodes, containing information needed to allow
interaction between NiR agents and domain application
gateways;
– Norm Catalogue, containing, for each norm: title, basic
classification, URN and the list of known physical addresses
(URL) where it is published.
The Norm Catalogue (> 45,000 documents)
– The Norm Catalogue is a relational
database containing, for each norm:
title, basic classification, URN and the list
of known physical addresses (URL) where
it is published
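The catalogue structure described above can be sketched as two relational tables, here with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names, column names, sample URN, placeholder title and example.org addresses are hypothetical and do not reproduce the actual NiR schema or data.

```python
import sqlite3

def create_catalogue(conn):
    # Hypothetical schema: one row per norm, plus one row per known URL.
    conn.executescript("""
        CREATE TABLE norm (
            urn            TEXT PRIMARY KEY,
            title          TEXT NOT NULL,
            classification TEXT
        );
        CREATE TABLE norm_location (
            urn TEXT NOT NULL REFERENCES norm(urn),
            url TEXT NOT NULL,
            PRIMARY KEY (urn, url)
        );
    """)

def add_norm(conn, urn, title, classification, urls):
    # Store the descriptive record, then every address where the norm is published.
    conn.execute("INSERT INTO norm VALUES (?, ?, ?)", (urn, title, classification))
    conn.executemany(
        "INSERT INTO norm_location VALUES (?, ?)", [(urn, url) for url in urls]
    )

def locations(conn, urn):
    # Return the known URLs for a URN, or an empty list if none are recorded.
    rows = conn.execute(
        "SELECT url FROM norm_location WHERE urn = ? ORDER BY url", (urn,)
    )
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
create_catalogue(conn)
add_norm(
    conn,
    "urn:nir:stato:legge:1996-12-31;675",   # sample identifier (assumed layout)
    "Sample act (placeholder title)",
    "sample classification",
    ["http://example.org/a/l675.html", "http://example.org/b/675.xml"],
)
print(locations(conn, "urn:nir:stato:legge:1996-12-31;675"))
```

Keeping the addresses in a separate table reflects the fact that one norm may be published on several institutional web sites.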
NiR Standards
• Uniform Resource Name (URN) definition (based on
IETF standards) to:
– identify each document regardless of its physical address (URL)
– allow automatic hyperlinking through a resolution system (similar to DNS)
• Document Type Definition (DTD) for Italian legislative
and regulatory acts (based on the W3C XML meta-language) to
represent document structure, semantics and metadata
(*) The standards have been issued as AIPA/CNIPA technical standards and
published as regulations in the Italian Official Journal
URNs (1/3)
• Each law contains several references to other laws: the whole
legislative corpus can be seen as a network in which laws are nodes
connected through references;
• Manual work is required to build a hypertext of laws through URLs;
• The URN is a persistent, location-independent resource
identification mechanism;
• URNs are defined as a combination of elements, according to a
specific grammar: essentially the name of the enacting Authority, the
type of norm, the date, the number and, when needed, some more
detailed specifications;
• URNs can be built regardless of whether the corresponding
documents are available on-line.
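The combination of elements listed above can be illustrated with a short Python sketch. The layout `urn:nir:<authority>:<type>:<date>;<number>` is a simplified assumption made for this example, as are the class name, the field names and the separator used for the optional detail; the official grammar is the one laid down in the NiR URN standard and may differ.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class NormId:
    """Hypothetical container for the elements that make up a URN."""
    authority: str      # name of the enacting Authority
    act_type: str       # type of norm
    enacted: date       # date of the act
    number: str         # number of the act
    detail: str = ""    # optional further specification

    def urn(self) -> str:
        # Assumed layout: urn:nir:<authority>:<type>:<yyyy-mm-dd>;<number>
        base = (
            f"urn:nir:{self.authority}:{self.act_type}:"
            f"{self.enacted.isoformat()};{self.number}"
        )
        # The ':' separator before the detail is an assumption of this sketch.
        return f"{base}:{self.detail}" if self.detail else base

law = NormId("stato", "legge", date(1996, 12, 31), "675")
print(law.urn())
```

Because the identifier is computed from the citation data alone, it can be produced even when no copy of the act is on-line yet.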
URNs (2/3)
• The adoption of a URN-based scheme makes it possible to build an
automated distributed hypertext, according to a model similar to the DNS
(Domain Name System) used to resolve self-explanatory web site names into
numerical IP addresses.
• This opportunity relies on the following considerations:
– the natural language expressions used in law references usually follow
repetitive patterns and can therefore be detected automatically;
– the URN is built by combining data (almost) always included in the
reference;
– cross references between each URN and the list of corresponding URLs,
needed for the resolution service, can be built automatically.
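The first two considerations can be illustrated with a small Python sketch that detects one common citation pattern and derives an identifier from it. This is not the NiR parser: the regular expression covers only two act types, the enacting Authority is assumed to be the State ("stato"), and the URN layout is a simplified assumption.

```python
import re

# Italian month names as they appear in citations.
MONTHS = {
    "gennaio": 1, "febbraio": 2, "marzo": 3, "aprile": 4,
    "maggio": 5, "giugno": 6, "luglio": 7, "agosto": 8,
    "settembre": 9, "ottobre": 10, "novembre": 11, "dicembre": 12,
}

# Matches citations such as "legge 31 dicembre 1996, n. 675".
REFERENCE = re.compile(
    r"\b(?P<type>legge|decreto legislativo)\s+"
    r"(?P<day>\d{1,2})\s+(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<year>\d{4}),\s*n\.\s*(?P<number>\d+)",
    re.IGNORECASE,
)

def references_to_urns(text):
    """Return one assumed-layout URN per citation found in the text."""
    urns = []
    for match in REFERENCE.finditer(text):
        act_type = match.group("type").lower().replace(" ", ".")
        month = MONTHS[match.group("month").lower()]
        day = int(match.group("day"))
        urns.append(
            f"urn:nir:stato:{act_type}:"
            f"{match.group('year')}-{month:02d}-{day:02d};{match.group('number')}"
        )
    return urns

sample = (
    "Visto l'articolo 1 della legge 31 dicembre 1996, n. 675, "
    "e il decreto legislativo 30 luglio 1999, n. 300."
)
print(references_to_urns(sample))
```

Each detected URN could then be looked up in a table of known addresses to turn the citation into a hyperlink.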
URNs: tools and examples (3/3)
• Parser
– Available on-line, automatically detects references within laws.
• Resolution service
– Resolves URNs into URLs (when known).
XML Representation of Italian Legislative and Regulatory Acts (1/5)
Three categories:
• Documents with a well-defined structure
– laws, constitutional laws, regional laws
• Partially structured documents
– regulatory acts, decrees
• Generic documents
– any kind of non-structured acts, annexes, ...
DTD definition approach (2/5)
Three DTDs
• Basic DTD: well structured simple documents
• Strict DTD: well structured complex documents
• Loose DTD: documents with irregular structure,
exceptions (suitable for historical documents)
Each DTD can represent several document types
Mark-up must be carried out using only relevant elements
XML Elements (categories) (3/5)
• Structural elements
– heading, preamble, sections, articles, paragraphs...
• Special elements
– references to other laws, formatted representation of relevant
entities embedded in the text (institutions, dates, places)
• Elements containing Metadata
– subject-matter classification, publication data, preparatory
procedure (iter)
• Semantic elements
– obligation, prohibition, penalties, exceptions, modifications,
abrogations,...
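To show how structural and special elements can be exploited once a text is marked up, here is a small Python sketch using the standard xml.etree.ElementTree module. The element and attribute names (`act`, `heading`, `article`, `paragraph`, `ref`, `urn`) are hypothetical English stand-ins chosen for readability; they are not the names defined in the NiR DTDs, and the sample text is invented.

```python
import xml.etree.ElementTree as ET

# Invented fragment: two articles, one of which cites another law.
SAMPLE = """
<act>
  <heading>Sample act</heading>
  <article id="art1">
    <paragraph id="art1-par1">Text referring to
      <ref urn="urn:nir:stato:legge:1996-12-31;675">law no. 675 of 1996</ref>.
    </paragraph>
  </article>
  <article id="art2">
    <paragraph id="art2-par1">Second provision.</paragraph>
  </article>
</act>
"""

def outline(xml_text):
    """List each article id with the ids of its paragraphs (structural elements)."""
    root = ET.fromstring(xml_text.strip())
    return [
        (article.get("id"), [p.get("id") for p in article.findall("paragraph")])
        for article in root.iter("article")
    ]

def cited_urns(xml_text):
    """Collect the identifiers carried by reference elements (special elements)."""
    root = ET.fromstring(xml_text.strip())
    return [ref.get("urn") for ref in root.iter("ref")]

print(outline(SAMPLE))
print(cited_urns(SAMPLE))
```

The same traversal approach would extend to metadata or semantic elements, provided the mark-up makes them explicit.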
Examples of Legal Texts in XML (4/5)
• Example of an Italian Act, marked up with the Basic DTD
• Examples of fragments of legal texts in different formats
(XML vs HTML)
• Navigating the document structure with a visual
XML editor
Training on XML and Development of an XML NirEditor (5/5)
Considering the relevance of XML to NIR:
• an intense training activity has been carried out, also with the
aid of a multimedia e-learning product developed by
ITTIG-CNR;
• an XML Editor, which will be distributed as open source
software, has been developed by ITTIG-CNR and enriched with
parsing functions.
Opportunities deriving from NIR standards
• Advanced search functions
• Support for the life cycle of legislative documents (law-enacting
workflow, "law in force" at any given date)
• Moving from a totally “free” approach to a more
formally-defined organizational model in order to
achieve completeness and to improve precision
Concluding Remarks: Current Developments and Future Initiatives
• Software tools to support Administrations in the
adoption of NiR standards
• XML Schema definition
• Parsing services
• New metadata
• Implementation of distributed URN resolution
• Certification of the authenticity of acts through digital
signature technology
... The End …