Distribution data format
The MultiWordNet distribution consists of 11 files: 10 tables
from a dump of the MySQL database and 1 text file. Each file
is described here with a sample entry:
common_relations.sql lists the semantic relations
that are common to all languages.
INSERT INTO common_relations
VALUES ('*','v#00001740','v#00003763',NULL);
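Entries in the .sql files are single-row INSERT statements like the one above. As a minimal sketch, assuming every statement has the form `INSERT INTO table VALUES (...);` and contains only quoted strings, integers and NULL, the values can be extracted as follows. The names `parse_insert`, `LITERAL` and `STATEMENT` are illustrative and not part of the distribution.

```python
import re

# One SQL literal: a single-quoted string, NULL, or an integer.
LITERAL = re.compile(r"'((?:[^'\\]|\\.)*)'|(NULL)|(-?\d+)", re.DOTALL)
STATEMENT = re.compile(
    r"\s*INSERT INTO\s+(\w+)\s+VALUES\s*\((.*)\)\s*;?\s*$", re.DOTALL
)


def parse_insert(statement):
    """Return (table_name, values) for one single-row INSERT statement."""
    match = STATEMENT.match(statement)
    if match is None:
        raise ValueError("not a single-row INSERT statement")
    table, body = match.groups()
    values = []
    for text, null, number in LITERAL.findall(body):
        if null:
            values.append(None)
        elif number:
            values.append(int(number))
        else:
            # Undo only the \' and \\ escapes; other escapes are left as-is.
            values.append(re.sub(r"\\(['\\])", r"\1", text))
    return table, values


table, values = parse_insert(
    "INSERT INTO common_relations\n"
    "VALUES ('*','v#00001740','v#00003763',NULL);"
)
print(table, values)
```

For bulk loading, importing the dump into a database server is likely the more robust route; a parser like this is mainly useful for quick inspection of individual entries.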
english_relations.sql and italian_relations.sql
contain the language-dependent relations.
These relations are instances of the standard lexical
relations used in Princeton WordNet (e.g. antonymy, pertains
to).
INSERT INTO english_relations
VALUES ('!','v#00009549','v#00009666','rest','be_active',NULL);
english_synsets.sql and italian_synsets.sql
contain the English and Italian synsets. Most of them
are aligned with Princeton WordNet, but some are new.
INSERT INTO english_synsets
VALUES ('n#00008864',' plant flora plant_life ','a living
organism lacking the power of locomotion');
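Judging from the sample entry, a synset row holds an id of the form `pos#offset`, a space-delimited word list in which multiword lemmas are joined by underscores, and a gloss. The sketch below turns such a row into a small record. The three-column order is read off the example, and the `Synset` class and `make_synset` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Synset:
    pos: str      # part-of-speech letter before '#', e.g. 'n'
    offset: str   # digits after '#', kept as text to preserve leading zeros
    words: list   # lemmas of the synset
    gloss: str


def make_synset(row):
    """Build a Synset from an (id, words, gloss) row."""
    synset_id, words, gloss = row
    pos, _, offset = synset_id.partition("#")
    # The word field is space-delimited; a missing field becomes an empty list.
    lemmas = (words or "").split()
    # Collapse whitespace in case the gloss was wrapped across lines.
    clean_gloss = " ".join((gloss or "").split())
    return Synset(pos, offset, lemmas, clean_gloss)


plant = make_synset((
    "n#00008864",
    " plant flora plant_life ",
    "a living\norganism lacking the power of locomotion",
))
print(plant.words)
```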
english_index.sql and italian_index.sql
list the English and Italian lemmas. These tables make
it possible to quickly retrieve, for a given lemma, the
synset ids and the available searches for each of its
parts of speech.
INSERT INTO italian_index
VALUES ('parlamentare','#m @','@ ~',NULL,NULL, 'n#07457674','
v#00518082','a#02590962',NULL);
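The column layout of the index tables is not spelled out here, so the following sketch avoids relying on column order. It scans every field of an index row for strings shaped like synset ids (`letter#digits`) and groups them by part-of-speech letter. The function name `synsets_by_pos` and the id pattern are assumptions based on the sample entry.

```python
import re

# A synset id as it appears in the samples: one letter, '#', then digits.
SYNSET_ID = re.compile(r"\b([a-z])#(\d+)\b")


def synsets_by_pos(row):
    """Group all synset ids found in a row by their part-of-speech letter."""
    result = {}
    for value in row:
        if not isinstance(value, str):
            continue  # skip NULL (None) and numeric fields
        for pos, offset in SYNSET_ID.findall(value):
            result.setdefault(pos, []).append(f"{pos}#{offset}")
    return result


row = ("parlamentare", "#m @", "@ ~", None, None,
       "n#07457674", " v#00518082", "a#02590962", None)
print(synsets_by_pos(row))
```

Fields such as `'#m @'` are not matched because the `#` precedes the letter rather than following it.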
semfield.sql contains one or more domain labels
(version 1.1.1) for each synset in MultiWordNet. Domain
labels are language-independent and apply to both English
and Italian synsets.
INSERT INTO semfield
VALUES ('n#00028871','Aeronautic Tourism','Aeronautica
Turismo');
semfield_hierarchy.sql contains the hierarchy
of the 164 domain labels used to tag MultiWordNet synsets.
INSERT INTO semfield_hierarchy
VALUES (25,'Sport','Sport','Sport','Free_Time','Badminton
Baseball Basketball Cricket Football Golf Rugby Soccer
Table_Tennis Tennis Volleyball Cycling Skating Skiing
Hockey Mountaineering Rowing Swimming Sub Diving Racing
Athletics Wrestling Boxing Fencing Archery Fishing Hunting
Bowling');
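To illustrate how the hierarchy might be navigated, the sketch below indexes rows by domain label. It assumes, from the sample entry only, that the second field is the label, the fifth field lists the parent domains and the sixth field lists the child domains, each space-delimited. This column interpretation and the name `index_hierarchy` are assumptions, not documented facts.

```python
def index_hierarchy(rows):
    """Return (parents, children) dicts keyed by domain label."""
    parents, children = {}, {}
    for row in rows:
        # Assumed positions: 1 = label, 4 = parent domains, 5 = child domains.
        label, hypers, hypons = row[1], row[4], row[5]
        parents[label] = hypers.split() if hypers else []
        children[label] = hypons.split() if hypons else []
    return parents, children


sport_row = (
    25, "Sport", "Sport", "Sport", "Free_Time",
    "Badminton Baseball Basketball Cricket Football Golf Rugby Soccer "
    "Table_Tennis Tennis Volleyball Cycling Skating Skiing "
    "Hockey Mountaineering Rowing Swimming Sub Diving Racing "
    "Athletics Wrestling Boxing Fencing Archery Fishing Hunting "
    "Bowling",
)
parents, children = index_hierarchy([sport_row])
print(parents["Sport"])
print("Tennis" in children["Sport"])
```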
english_frame.sql contains the subcategorization
frames of the English verbs. Italian verbs do not yet
have this information.
INSERT INTO english_frame
VALUES ('v#00182674',8,'get_over');
verb-frames.txt lists the 35 kinds of
subcategorization frames for English verbs, each
identified by a sequential number from 1 to 35.
1 Something ----s
35 Something ----s INFINITIVE
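The frame numbers stored in english_frame.sql can be resolved against verb-frames.txt. As a minimal sketch, assuming each line of the text file is a frame number followed by whitespace and the frame template, the file can be read into a lookup table. The helper names and the pairing of the verb "happen" with frame 1 are purely illustrative.

```python
def parse_frames(lines):
    """Map frame number -> template for lines like '1 Something ----s'."""
    frames = {}
    for line in lines:
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip blank lines
        frames[int(parts[0])] = parts[1] if len(parts) > 1 else ""
    return frames


def instantiate(template, verb):
    """Naively substitute the verb for the '----' placeholder."""
    return template.replace("----", verb.replace("_", " "))


frames = parse_frames(["1 Something ----s", "35 Something ----s INFINITIVE"])
print(instantiate(frames[1], "happen"))  # prints "Something happens"
```

The substitution is deliberately naive: it does not inflect the verb, so multiword verbs such as get_over would need extra handling.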