HUMANITIES COMPUTING D:
LEXICOGRAPHY AND COMPUTERS
Lexical semantics
Thesauri
WordNet
LEXICAL SEMANTICS


In lecture 2 we began to discuss how
contemporary dictionaries characterize
the meaning of words
In this lecture we look at these definitions
in more detail, and at other kinds of
dictionaries that try to characterize word
meanings more precisely: thesauri and WordNet
TYPES OF DEFINITION IN A
DICTIONARY

GENUS AND DIFFERENTIA:
“stating the superordinate concept next to
the definiendum together with at least
one distinctive feature”
SYNONYMY
TYPICALITY
USE
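GENUS AND DIFFERENTIA
horse
noun
1 a solid-hoofed plant-eating domesticated mammal with a flowing mane
and tail, used for riding, racing, and to carry and pull loads
New Oxford Dictionary of English
GENUS: mammal
DIFFERENTIAE: solid-hoofed, plant-eating, domesticated, with a flowing mane
and tail, used for riding, racing, and to carry and pull loads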
LIMITS OF DEFINITION BY
GENUS & DIFFERENTIA (lecture 2)

Putnam:
‘beech’ / ‘elm’
‘diamond’ / ‘zircon’
Jackson: happen vs occur vs befall vs
transpire
Everything is illuminated: ‘harmonize’ vs
‘agree’,
TYPES OF DEFINITION IN A
DICTIONARY

GENUS AND DIFFERENTIA
SYNONYMY
Many words, especially abstract ones, are
hard to define analytically
In such cases synonyms are used instead
TYPICALITY
USE
DEFINITION BY
SYNONYMY
miserable 1 very unhappy, wretched 2 causing misery 3
squalid 4 mean
unhappy 1 sad or depressed 2 unfortunate or wretched
wretched 1 miserable or unhappy 2 worthless
Collins Pocket English Dictionary (2000)
CIRCULARITY: miserable is defined as wretched, and wretched as miserable
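The circularity in the three Collins entries can be checked mechanically. The sketch below is only an illustration: the dictionary `DEFINED_BY` and the function `find_cycle` are names invented here, and the data is a hand-made reduction of the entries quoted above (only the single-word synonyms are kept).

```python
# Toy data: each headword maps to the words its synonym-style definition
# relies on (reduced by hand from the three Collins Pocket entries).
DEFINED_BY = {
    "miserable": ["unhappy", "wretched"],
    "unhappy": ["sad", "depressed", "unfortunate", "wretched"],
    "wretched": ["miserable", "unhappy", "worthless"],
}


def find_cycle(start, graph):
    """Return a path start -> ... -> start if one exists, else None.

    Depth-first search over the 'is defined by' links.
    """
    stack = [(start, [start])]
    visited = set()
    while stack:
        word, path = stack.pop()
        for nxt in graph.get(word, []):
            if nxt == start:
                return path + [start]  # back at the headword: circular
            if nxt not in visited:
                visited.add(nxt)
                stack.append((nxt, path + [nxt]))
    return None


cycle = find_cycle("miserable", DEFINED_BY)
print(" -> ".join(cycle) if cycle else "no cycle")
```

For `miserable` the search returns a path that starts and ends at the headword; for a word with no entry of its own, such as `sad`, it returns `None`.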
TYPES OF DEFINITION IN A
DICTIONARY

GENUS AND DIFFERENTIA
SYNONYMY
TYPICALITY
The definition specifies what is “typical” of the
referent
USE
DEFINITION BY
TYPICALITY
day of rest a day set aside from normal activity, typically
Sunday on religious grounds
measles an infectious viral disease causing fever and a red
rash, typically occurring in childhood
Concise Oxford Dictionary
TYPES OF DEFINITION IN A
DICTIONARY

GENUS AND DIFFERENTIA
SYNONYMY
TYPICALITY
USE
The definition explains how a word is used
Especially common for function words
(articles, prepositions, etc.)
MEANING RELATIONS

Many of these definitions establish
the meaning of a word through
meaning relations with other words:
HYPONYMY: dog / animal
SYNONYMY: fool / idiot
ANTONYMY: right / wrong
MERONYMY: horse / mane
HYPONYMY

HYPONYMY is the relation between a subclass and
a superclass:
CAR and VEHICLE
DOG and ANIMAL
BUNGALOW and HOUSE
Generally speaking, a hyponymy relation holds
between X and Y whenever it is possible to
substitute Y for X:
That is an X -> That is a Y
E.g., That is a CAR -> That is a VEHICLE.
HYPERNYMY is the inverse relation
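The substitution test can be pictured as climbing superclass links. The following is a minimal sketch, not a real lexical resource: `HYPERNYM_OF` and `is_hyponym` are names made up for this example, and the link from `house` to `building` is an extra assumption added to show that the relation chains.

```python
# Toy taxonomy: each word points to its direct superclass (hypernym).
HYPERNYM_OF = {
    "car": "vehicle",
    "dog": "animal",
    "bungalow": "house",
    "house": "building",  # assumed link, added for illustration only
}


def is_hyponym(x, y, hypernym_of=HYPERNYM_OF):
    """True if 'That is an x' licenses 'That is a y'.

    Climbs the superclass links from x and checks whether y is reached.
    """
    seen = set()
    current = hypernym_of.get(x)
    while current is not None and current not in seen:
        if current == y:
            return True
        seen.add(current)
        current = hypernym_of.get(current)
    return False


print(is_hyponym("car", "vehicle"))   # the CAR / VEHICLE pair from the slide
print(is_hyponym("vehicle", "car"))   # the inverse direction is hypernymy
```

The check is directional: swapping the arguments asks the hypernymy question instead.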
SYNONYMY

Two words are SYNONYMS if they have the same
meaning at least in some contexts
E.g., PRICE and FARE; CHEAP and INEXPENSIVE;
LAPTOP and NOTEBOOK; HOME and HOUSE
I’m looking for a CHEAP FLIGHT / INEXPENSIVE FLIGHT
From Roget’s thesaurus:
OBLITERATION, erasure, cancellation, deletion
But few words are truly synonymous in ALL contexts:
I wanna go HOME / ?? I wanna go HOUSE
The flight was CANCELLED / ?? OBLITERATED / ??? DELETED
ANTONYMY

The antonymy relation links lemmas
with opposite meanings:
right / wrong; small / large
Sometimes also ‘extended’ antonymy:
right / left; dog / cat
ANTONYMY
artificial not real
conventional not spontaneous or sincere or original
vacant not occupied
Concise Oxford Dictionary 9
MERONYMY

The relation between the parts and the whole:
mane / horse; wheel / car
MERONYMY IN
DEFINITIONS
horse
noun
1 a solid-hoofed plant-eating domesticated mammal with a flowing mane
and tail, used for riding, racing, and to carry and pull loads
New Oxford Dictionary of English
HYPERNYM: mammal
PARTS: mane, tail
HOW MANY MEANINGS?
horse
noun
1 a solid-hoofed plant-eating domesticated mammal with a flowing mane
and tail, used for riding, racing, and to carry and pull loads
•Equus caballus, family Equidae (the horse family), descended from the wild
Przewalski’s horse. The horse family also includes the asses and zebras.
An adult male horse; a stallion or gelding. A wild mammal of the horse family
2 a frame or structure on which something is mounted or supported,
especially a sawhorse.
3 [mass noun] informal heroin
4 informal a unit of horsepower: the huge 63-horse 701-cc engine
5 Mining an obstruction in a vein
New Oxford Dictionary of English
HOW MANY MEANINGS?
horse
n 1 a domesticated perissodactyl mammal, Equus caballus, used for
draught work and riding: family Equidae 2 the adult male of this
species; stallion. 3 wild horse. 3a a horse (Equus caballus) that has
become feral. 3b another name for Przewalski’s horse. 4a any other
member of the family Equidae, such as the zebra or ass. 4b (as
modifier): the horse family 5 (functioning as pl) horsemen, especially
cavalry: a regiment of horse 6 Also called: buck Gymnastics: a padded
apparatus on legs, used for vaulting, etc 7 a narrow board supported by
a pair of legs at each end, used as a frame for sawing or as a trestle,
barrier, etc 8 a contrivance on which a person may ride and exercise 9
a slang word for heroin 10 Mining a mass of rock within a vein or ore.
11 Nautical. A rod, rope or cable, fixed at the ends, along which
something may slide by means of a thimble, shackle, or other fitting;
traveller. 12 Chess. An informal name for knight. 13 Informal. Short for
horsepower. 14 (modifier) drawn by horse or horses: a horse cart.
Collins English Dictionary 4
HOMONYMY AND POLYSEMY

HOMONYMY: the meanings are clearly distinct
(e.g., different etymologies)
BANK
Italian SCANNARE as ‘to cut to pieces’ /
‘Italianization of TO SCAN’; Italian GRU (‘crane’) as
bird / machine for lifting weights
POLYSEMY: the meanings are related
MOUTH
Italian VERDE (‘green’) as ‘having a certain colour’ and as
‘rich in vegetation’
HOW MANY MEANINGS?
The ‘lumpers’ like to lump meanings together and leave the user
to extract the nuance of meaning that corresponds to a
particular context, whereas the ‘splitters’ prefer to enumerate
differences of meaning in more detail; the distinction
corresponds to that between summarizing and analysing.
Allen, R. Lumping and splitting, English Today, 16(4), 61-3
CRITERIA?

GRAMMATICAL
Nominal vs verbal senses
Transitive & intransitive uses (Hirst, 1987):
Ross KEPT staring at Nadia’s décolletage
Nadia KEPT calm and made a cutting remark
Ross wrote of his embarrassment in the diary that he KEPT.
COLLOCATIONS
isometric, from CED4:
(of a crystal or system of crystallization) having three
mutually perpendicular equal axes
(of a method of projecting a drawing in three dimensions)
having the three axes equally inclined and all lines drawn to
scale
ETYMOLOGY
PROBLEMS

Already mentioned: sense distinctions
are not always easy to draw
Circularity
Relations are not used consistently
SEMANTICS & LEXICON: A
SUMMARY
WORD-FORMS: “eat”, “eats”, “ate”, “eaten”
LEXEMES: EAT-LEX-1
SENSES: eat0600, eat0700
THE ORGANIZATION OF THE
LEXICON
WORD-FORMS: “stock”
LEXEMES: STOCK-LEX-1, STOCK-LEX-2, STOCK-LEX-3
SENSES: stock0100, stock0200, stock0600, stock0700, stock0900, stock1000
SYNONYMY
WORD-FORMS: “cheap”, “inexpensive”
LEXEMES: CHEAP-LEX-1, CHEAP-LEX-2, INEXP-LEX-3
SENSES: cheap0100, …, cheapXXXX, inexp0900, inexpYYYY
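The three layers (word-forms, lexemes, senses) can be modelled as a small data structure. This is a sketch under assumptions: the class `Lexeme`, the functions `lexemes_for`, `senses_of` and `share_a_sense`, and the shared sense identifier `lowprice0100` are all hypothetical. The slides leave the identifiers that link “cheap” and “inexpensive” unspecified, so a made-up one stands in for them here.

```python
from dataclasses import dataclass, field


@dataclass
class Lexeme:
    """A lexeme groups word-forms and points to one or more senses."""
    name: str
    forms: set = field(default_factory=set)
    senses: set = field(default_factory=set)


EAT = Lexeme("EAT-LEX-1", {"eat", "eats", "ate", "eaten"}, {"eat0600", "eat0700"})
# "lowprice0100" is a hypothetical sense id standing in for the
# unspecified shared sense of "cheap" and "inexpensive".
CHEAP = Lexeme("CHEAP-LEX-1", {"cheap"}, {"cheap0100", "lowprice0100"})
INEXP = Lexeme("INEXP-LEX-3", {"inexpensive"}, {"inexp0900", "lowprice0100"})

LEXICON = [EAT, CHEAP, INEXP]


def lexemes_for(form):
    """All lexemes that list this word-form."""
    return [lexeme for lexeme in LEXICON if form in lexeme.forms]


def senses_of(form):
    """Union of the senses of every lexeme the form belongs to."""
    found = set()
    for lexeme in lexemes_for(form):
        found |= lexeme.senses
    return found


def share_a_sense(form_a, form_b):
    """Synonymy in the weak sense used here: at least one common sense."""
    return bool(senses_of(form_a) & senses_of(form_b))


print(share_a_sense("cheap", "inexpensive"))
print([lexeme.name for lexeme in lexemes_for("ate")])
```

Treating synonymy as a shared sense, rather than as a direct link between word-forms, matches the "same meaning at least in some contexts" definition given earlier.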
DICTIONARIES ORGANIZED
BY MEANING

Thesauri
WordNet
THESAURI

Dictionaries organized by topic
appeared at the same time as
alphabetically organized ones (Ælfric:
Glossary, ~ 1000)
The most famous thematic dictionary: Peter
Mark Roget, Thesaurus of English
Words and Phrases, first published
in 1852
ROGET’S THESAURUS:
CLASSES
I. ABSTRACT RELATIONS
Sections: Existence, relation, quantity, order,
number, time, change, causation
II. SPACE
III. MATTER
IV. INTELLECT
V. VOLITION
VI. AFFECTIONS
ROGET’S THESAURUS:
SECTIONS & WORD SETS
I. ABSTRACT RELATIONS
….
IV. ORDER
1. GENERAL
58 Order          59 Disorder
60 Arrangement    61 Derangement
2. CONSECUTIVE
62 Precedence     63 Sequence
64 Precursor      65 Sequel
66 Beginning      67 End
68 Middle
OTHER THESAURI

A THESAURUS OF OLD ENGLISH
(Roberts, 1995)
HISTORICAL THESAURUS OF
ENGLISH (Christian Kay)
LONGMAN DICTIONARY OF
SCIENTIFIC USAGE
WORDNET

A lexical database created at Princeton
Freely available for research from the Princeton site
http://www.cogsci.princeton.edu/~wn/
Information about a variety of SEMANTIC
RELATIONS
Three sub-databases (supported by psychological
research as early as (Fillenbaum and Jones, 1965)):
NOUNS
VERBS
ADJECTIVES and ADVERBS
Each database organized around SYNSETS
SYNSETS


Senses (or ‘lexicalized concepts’) are
represented in WordNet by the set of words
that can be used in AT LEAST ONE
CONTEXT to express that sense /
lexicalized concept: the SYNSET
E.g.,
{chump, fish, fool, gull, mark, patsy, fall guy,
sucker, shlemiel, soft touch, mug}
(gloss: person who is gullible and easy to
take advantage of)
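A synset is just a set of word-forms plus a gloss, and a word-form is polysemous when it occurs in more than one synset. The sketch below illustrates that idea with plain Python sets; it does not use the WordNet files or any WordNet API. The names `SYNSETS`, `build_index` and `glosses_of` are invented here, and the second synset (with its gloss) is a hypothetical entry added only to show a word with two senses.

```python
from collections import defaultdict

SYNSETS = [
    # The synset and gloss quoted on the slide.
    (frozenset({"chump", "fish", "fool", "gull", "mark", "patsy", "fall guy",
                "sucker", "shlemiel", "soft touch", "mug"}),
     "person who is gullible and easy to take advantage of"),
    # Hypothetical second entry, added only to show a polysemous form.
    (frozenset({"gull", "seagull"}), "a seabird (illustrative gloss)"),
]


def build_index(synsets):
    """Map each word-form to the positions of the synsets containing it."""
    index = defaultdict(list)
    for position, (words, _gloss) in enumerate(synsets):
        for word in words:
            index[word].append(position)
    return index


INDEX = build_index(SYNSETS)


def glosses_of(word):
    """One gloss per synset the word belongs to (empty if unknown)."""
    return [SYNSETS[position][1] for position in INDEX.get(word, [])]


for gloss in glosses_of("gull"):
    print(gloss)
```

Looking up a word returns one gloss per sense, which is how a sense inventory falls out of the synset organization.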
THE NOUN DATABASE

About 90,000 forms, 116,000 senses
Relations:
hypernym      breakfast -> meal
hyponym       meal -> lunch
has-member    faculty -> professor
member-of     copilot -> crew
has-part      table -> leg
part-of       course -> meal
antonym       leader -> follower
HYPERNYMY
2 senses of robin
Sense 1
robin, redbreast, robin redbreast, Old World robin, Erithacus rubecola -- (small Old World songbird with a reddish breast)
   => thrush -- (songbirds characteristically having brownish upper plumage with a spotted breast)
      => oscine, oscine bird -- (passerine bird having specialized vocal apparatus)
         => passerine, passeriform bird -- (perching birds mostly small and living near the ground with feet having 4 toes arranged to allow for gripping the perch; most are songbirds; hatchlings are helpless)
            => bird -- (warm-blooded egglaying vertebrates characterized by feathers and forelimbs modified as wings)
               => vertebrate, craniate -- (animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium)
                  => chordate -- (any animal of the phylum Chordata having a notochord or spinal column)
                     => animal, animate being, beast, brute, creature, fauna -- (a living organism characterized by voluntary movement)
                        => organism, being -- (a living thing that has (or can develop) the ability to act or function independently)
                           => living thing, animate thing -- (a living (or once living) entity)
                              => object, physical object --
                                 => entity, physical thing --
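The listing above is a walk up the hypernym links from one sense of robin to the root. A minimal sketch of that walk follows; it uses a hand-copied toy table (first word of each synset only), not the WordNet database, and the names `HYPERNYM`, `hypernym_chain` and `render_chain` are made up for this example.

```python
# First word of each synset in the robin listing, linked to its hypernym.
HYPERNYM = {
    "robin": "thrush",
    "thrush": "oscine",
    "oscine": "passerine",
    "passerine": "bird",
    "bird": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
    "animal": "organism",
    "organism": "living thing",
    "living thing": "object",
    "object": "entity",
}


def hypernym_chain(word, hypernym=HYPERNYM):
    """List the word followed by each successive hypernym up to the root."""
    chain = [word]
    while chain[-1] in hypernym:
        chain.append(hypernym[chain[-1]])
    return chain


def render_chain(word):
    """Indent each level and prefix it with '=>' in the style of the listing."""
    lines = []
    for depth, node in enumerate(hypernym_chain(word)):
        prefix = "" if depth == 0 else "   " * depth + "=> "
        lines.append(prefix + node)
    return "\n".join(lines)


print(render_chain("robin"))
```

The length of the chain gives the depth of a sense in the hierarchy, a quantity that many WordNet-based similarity measures build on.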
MERONYMY
wn beak -holon
Holonyms of noun beak
1 of 3 senses of beak
Sense 2
beak, bill, neb, nib
PART OF: bird
VERBS

About 10,000 forms, 20,000 senses
Relations between verb meanings:
Hypernym    fly -> travel
Troponym    walk -> stroll
Entails     snore -> sleep
Antonym     increase -> decrease
RELATIONS BETWEEN VERB
MEANINGS
V1 ENTAILS V2
when Someone V1 (logically) entails Someone V2
- e.g., snore entails sleep
TROPONYMY
when To do V1 is To do V2 in some manner
- e.g., limp is a troponym of walk
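The two verb relations can be stored as sets of ordered pairs. In the sketch below the pairs come from the slides, while the names `ENTAILS`, `TROPONYM_OF`, `entails` and `troponyms` are invented here. It also makes one assumption that the slides do not state: that doing V1, when V1 is a manner of V2, also counts as doing V2, so troponymy is treated as a special case of entailment.

```python
# (V1, V2) pairs: "Someone V1" entails "Someone V2".
ENTAILS = {("snore", "sleep")}

# (V1, V2) pairs: "to V1 is to V2 in some manner".
TROPONYM_OF = {("limp", "walk"), ("stroll", "walk")}


def entails(v1, v2):
    """True if V1 entails V2 directly or as a manner of V2.

    Counting troponym pairs as entailments is an assumption of this sketch.
    """
    return (v1, v2) in ENTAILS or (v1, v2) in TROPONYM_OF


def troponyms(verb):
    """All verbs recorded as a manner of doing `verb`, sorted alphabetically."""
    return sorted(v1 for (v1, v2) in TROPONYM_OF if v2 == verb)


print(entails("snore", "sleep"))
print(troponyms("walk"))
```

Both relations are directional: sleeping does not entail snoring, and walking is not a manner of limping.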
ADJECTIVES & ADVERBS

About 20,000 adjective forms, 30,000
senses
4,000 adverbs, 5,600 senses
Relations:
Antonym (adjective)    heavy <-> light
Antonym (adverb)       quickly <-> slowly
HOW TO USE IT

Online:
http://cogsci.princeton.edu/cgi-bin/webwn
Or download it, then from the command line:
Get synonyms:
wn bank -synsn
Get hypernyms:
wn robin -hypen
Get antonyms (also for adjectives and verbs):
wn right -antsa
THE LIMITS OF WORDNET

Coverage
Words not in WordNet:
Crocidolite, spinoff (spin-off)
Context-dependent senses:
slump, crash, bust all synonyms in the WSJ
corpus
The structure of WordNet
Missing information: MERONYMY
Some information is encoded in complex ways
(room, wall, floor)
But: MOVING TARGET!!
MERONYMY IN WORDNET:
AN EXPERIMENT



100 bridging descriptions in a mereological
relation
Ran a script trying to find a direct link in
WordNet (1.7) between one of the senses of
the BD and one of the senses of any of the
previous NPs
Results: in only 6 cases there is in WordNet
a direct lexical relation between a BD and
one of the CFs
John looked at the HOUSE.
The WALL was crumbling.
[Figure: fragment of the WordNet hierarchy around HOUSE and WALL. Nodes: ARTIFACT, HOUSING, BUILDING, HOUSE, HOME, ROOM, WALL, FLOOR. Edges: IS-A and PART-OF.]
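The HOUSE / WALL example shows why a search for a direct link fails even when the information is present indirectly. The sketch below is an illustration only: the node names come from the slide, but the exact edges in `IS_A` and `PART_OF` are assumptions made for this example (they are not taken from WordNet 1.7), and all function names are invented here.

```python
from collections import deque

# Hypothetical edges: node names are from the slide, the links are assumed.
IS_A = {
    "house": ["building"],
    "building": ["artifact"],
    "home": ["housing"],
    "housing": ["artifact"],
}
PART_OF = {
    "wall": ["room"],
    "floor": ["room"],
    "room": ["building"],
}


def directly_related(part, whole):
    """The check described in the experiment: a single PART-OF link."""
    return whole in PART_OF.get(part, [])


def is_a_ancestors(word):
    """The word itself plus everything reachable through IS-A links."""
    seen, queue = {word}, deque([word])
    while queue:
        for parent in IS_A.get(queue.popleft(), []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def part_of_path(part, whole):
    """Breadth-first search up PART-OF links from `part`.

    Succeeds when it reaches `whole` or one of its IS-A ancestors.
    """
    targets = is_a_ancestors(whole)
    queue, seen = deque([[part]]), {part}
    while queue:
        path = queue.popleft()
        for holder in PART_OF.get(path[-1], []):
            if holder in targets:
                return path + [holder]
            if holder not in seen:
                seen.add(holder)
                queue.append(path + [holder])
    return None


print(directly_related("wall", "house"))
print(part_of_path("wall", "house"))
```

Under these assumed edges there is no direct link from WALL to HOUSE, but a path exists through ROOM and BUILDING once IS-A ancestors of HOUSE are accepted as targets.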
SOLUTION: LEXICAL
ACQUISITION

Partial (add information to
WordNet, especially for specialized
domains)
Total (build a new lexicon from
scratch)
READINGS

Jackson, ch. 8
C. Fellbaum. WordNet: An electronic
lexical database. MIT Press, 1998, ch. 1