The Bilingual English-Russian Russian-English Dictionaries were produced by the SCIPER company with funding from ELRA, in the framework of the European Commission project LRsP&P (Language Resources Production & Packaging - LE4-8335).
This work used linguistic resources originally produced in Russia. It is well known that a number of very high-quality linguistic resources were developed in Russia during the Soviet period. Among them are dictionaries, especially bilingual dictionaries, which generally have many more entries than those found in Western countries.
The bilingual language resources produced within the above-mentioned LRsP&P project contain, in total, more than 350,000 pairs of words (in tabular form). In the XML format, which conforms to the DTD, the dictionaries have the following sizes:
1) Russian-English dictionary - more than 130,000 entries
2) English-Russian dictionary - more than 95,000 entries
In this format, a dictionary entry may correspond to more than one pair of words, because it may contain several semantically equivalent translations in the target language.
Each dictionary entry contains the following information:
- source word (lemma);
- part of speech of the source word;
- target word(s) (lemma(s)), grouped by meaning;
- part of speech of the target word(s).
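The entry structure listed above, and its relation to the tabular word pairs, can be sketched in Python as follows. This is only an illustration: the class and field names (`Entry`, `Sense`, `word_pairs`) and the sample words are assumptions made for the example and are not taken from the dictionaries or their DTD.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Sense:
    """One meaning: the target lemmas that share it (hypothetical name)."""
    targets: List[Tuple[str, str]]  # (target lemma, part of speech)


@dataclass
class Entry:
    """Hypothetical in-memory form of one dictionary entry."""
    source: str  # source lemma
    pos: str     # part of speech of the source word
    senses: List[Sense] = field(default_factory=list)

    def word_pairs(self) -> List[Tuple[str, str]]:
        """Flatten the entry into (source, target) pairs, as in the tabular form."""
        return [
            (self.source, lemma)
            for sense in self.senses
            for lemma, _pos in sense.targets
        ]


# Invented sample entry: one source word, two meanings, three translations.
entry = Entry(
    source="дом",
    pos="noun",
    senses=[
        Sense(targets=[("house", "noun"), ("building", "noun")]),
        Sense(targets=[("home", "noun")]),
    ],
)
pairs = entry.word_pairs()
print(pairs)
```

Under these assumptions, a single entry yields one word pair per translation, which is why the number of pairs exceeds the number of entries.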
As 'source words', both dictionaries contain only significant parts of speech: nouns, adjectives, verbs, adverbs and nominals. Stop-words (prepositions, articles, pronouns, etc.) have not been included in the dictionaries because of their intended use in multilingual search and cross-lingual interrogation.
Both dictionaries are provided as XML files with the same DTD. The files are encoded in Unicode (UTF-8).
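As a rough illustration of reading such a UTF-8 XML file, the Python sketch below parses a small inline sample with the standard library. The DTD is not reproduced in this description, so the element and attribute names used here (`dictionary`, `entry`, `source`, `sense`, `target`, `pos`) are invented placeholders and will need to be replaced with the names the actual DTD defines.

```python
import xml.etree.ElementTree as ET

# Invented sample; the real element names come from the DTD, not from here.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
  <entry>
    <source pos="noun">книга</source>
    <sense>
      <target pos="noun">book</target>
    </sense>
  </entry>
</dictionary>
"""


def read_entries(xml_bytes: bytes):
    """Yield (source lemma, source POS, senses) for each hypothetical <entry>."""
    root = ET.fromstring(xml_bytes)
    for entry in root.iter("entry"):
        source = entry.find("source")
        # Each sense is a list of (target lemma, part of speech) tuples.
        senses = [
            [(target.text, target.get("pos")) for target in sense.findall("target")]
            for sense in entry.findall("sense")
        ]
        yield source.text, source.get("pos"), senses


# Passing bytes lets the parser honour the encoding declared in the XML header.
entries = list(read_entries(SAMPLE.encode("utf-8")))
print(entries)
```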
The dictionaries are consistent, i.e. each is the inverted version of the other. This feature is very useful for aligners, multilingual search engines, etc.
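The consistency property can be checked on word pairs with a few lines of Python. The sketch below assumes each dictionary has already been flattened into (source, target) pairs; the function names and the sample pairs are hypothetical and serve only to illustrate the idea.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]


def inverted(pairs: Iterable[Pair]) -> Set[Pair]:
    """Swap source and target in every pair."""
    return {(target, source) for source, target in pairs}


def missing_inverses(forward: Iterable[Pair], backward: Iterable[Pair]) -> Set[Pair]:
    """Return the pairs of `forward` whose inverse is absent from `backward`."""
    backward_set = set(backward)
    return {(s, t) for s, t in forward if (t, s) not in backward_set}


# Invented sample pairs standing in for the two dictionaries.
ru_en = [("дом", "house"), ("дом", "home"), ("книга", "book")]
en_ru = [("house", "дом"), ("home", "дом"), ("book", "книга")]

consistent = inverted(ru_en) == set(en_ru)
gaps = missing_inverses(ru_en, en_ru[:2])  # drop one pair to show a detected gap
print(consistent, gaps)
```

Two dictionaries are consistent in this sense when `missing_inverses` returns an empty set in both directions.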
The list of domains can be found in the DTD. It contains more than 100 domain names.