The Sanskrit Heritage Site
Version 3.63 [2024-12-11] (fr)
Welcome to the Sanskrit Heritage site.
It provides various services for the computational treatment of Sanskrit.
The first service is dictionary access. The dictionary is a hypertext structure
giving access to the Sanskrit lexicon, given with grammatical information.
There are currently two versions of the dictionary.
The first one is the original Heritage Sanskrit-French dictionary, which
serves as a morphology generator, and is thus fully equipped with grammatical
tools. Furthermore, it offers rich encyclopedic content about Indian culture.
You may also download a printable pdf version of this dictionary, as
explained below. A fully hypertext version in the
Goldendict format is also available.
The second lexicon is a digital version of the Monier-Williams Sanskrit-English
dictionary, a much more complete lexicon for the Sanskrit language.
It derives from Thomas Malten's digitization of the Monier-Williams
at Köln University, turned into an XML databank by Jim Funderburk,
and finally adapted to the HTML Heritage look and feel by Pawan Goyal.
The Sanskrit Heritage dictionary is thus mirrored in the Monier-Williams, which
allows compatibility of the grammatical tools.
The dictionary defaults to Heritage when you access
the standard entry page "sanskrit.inria.fr", and to Monier-Williams
when you instead open
"https://sanskrit.inria.fr/index.en.html".
Each dictionary is accessible separately by its search page,
respectively
Sanskrit Heritage and
Monier-Williams.
This site offers a number of linguistic services for the Sanskrit language, such
as a Sanskrit Reader that parses Sanskrit
text under various formats into Sanskrit banks of tagged hypertext.
Various phonological and morphological tools are also provided.
Please visit the Reference manual for learning how
to use the various facilities.
An elementary Sanskrit course for beginners, using the site resources,
may be found in its French version
here and in its English version
here.
It allows you to read and understand a simple text, extracted from the
Vikramacarita story. An updated version of the English lesson is available
here.
!!! NEW !!!
The story of Nala, taken from the 3rd book of Mahābhārata, is now available
in the Corpus repository, section Nala. Its French translation is available
here as the second lesson on
using our tools.
A more extensive course on using Hypertext Sanskrit Tools, developed jointly
with the Sanskrit Studies Department of the University of Hyderabad,
was taught remotely in Spring 2024. Its video content may be accessed
here.
Sanskrit Heritage dictionary in book form
You may download the Heritage dictionary as a pdf document from
PDF.
This document is readable with Acrobat Reader,
document-viewing software from Adobe that is freely available on the Internet.
Since the document is rather large, expect some delay
in loading its 5 Mb. This is a still ongoing effort: lexical acquisition
implies quick obsolescence of this document, which grows along with versions.
The Sanskrit Heritage dictionary is also available in an older version,
in an ebook format,
usable with the Babyloo, Stardict or Goldendict software.
Please visit the Golden Sanskrit Heritage page.
Multilingual hyper-text dictionary
Interactive browsing
The dictionary may be accessed through an indexing engine:
Héritage du sanskrit for the
French dictionary, or
Monier-Williams for the
English one.
Your browser must be HTML5 compliant, and for proper viewing
of Sanskrit text you must have installed on your system OpenType fonts
for roman transliteration with diacritics, and for devanāgarī.
A Unicode-compliant font for devanāgarī with proper ligatures
is Apple's Devanagari MT for Macintosh OS X stations. For Windows users,
installation of the font 'Arial Unicode MS' is advised for proper rendering.
Note that many words are given with their etymology as hypertext links. You
may thus navigate from a word to its morphological components, down to its roots.
Also, the gender declarations of
the main entries are mouse-sensitive, and give you direct access to the
relevant declension table. Similarly, the present class mark of the verbal roots
gives access to the conjugation schemes. Also for verb entries, preverbs
lead you to the correspondingly prefixed derived verbs.
All these grammatical tools, originally developed for the Heritage dictionary,
are being progressively extended to the Monier-Williams dictionary.
Thus our HTML Monier-Williams offers similar declension and conjugation
facilities.
Sanskrit made easy
If you want to search for a Sanskrit word
without knowing its exact transliteration, go to section "Sanskrit made easy"
of the index page, which allows you to search for words without knowing
precise diacritics usage.
For instance, search Vishnou, Siva, or the grammarian Panini. This
interface is limited for the moment to the Sanskrit Heritage dictionary.
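To illustrate the idea of searching without precise diacritics, here is a minimal Python sketch. It is not the site's actual algorithm, which is not described here: the function names `fold` and `lookup`, the tiny headword list, and the two spelling rules ("sh" to "s", "ou" to "u") are assumptions chosen only so that the three example queries above find their entries.

```python
import unicodedata

def fold(word):
    """Reduce a word to a loose search key (hypothetical scheme)."""
    # Decompose accented letters, then drop the combining marks,
    # so that "ś", "ṣ" and "s" all become plain "s".
    decomposed = unicodedata.normalize("NFD", word.lower())
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumed simplifications for common loose spellings.
    for loose, canonical in (("sh", "s"), ("ou", "u")):
        plain = plain.replace(loose, canonical)
    return plain

def lookup(query, headwords):
    """Return the headwords whose folded key equals the folded query."""
    key = fold(query)
    return [h for h in headwords if fold(h) == key]

# Illustrative headword list, not the dictionary's real index.
HEADWORDS = ["viṣṇu", "śiva", "pāṇini"]
print(lookup("Vishnou", HEADWORDS))
print(lookup("Siva", HEADWORDS))
print(lookup("Panini", HEADWORDS))
```

Folding both the query and the headword to the same key keeps the comparison symmetric, at the cost of occasionally matching several distinct entries.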
Sanskrit Grammarian
This interface gives the declension tables for Sanskrit substantives.
Try out this
declension engine by submitting Sanskrit stems
with intended gender. The same transliteration conventions as for the
dictionary index apply. For instance, submit "deva" with gender Mas,
or (assuming Velthuis transliteration) "devii" with gender Fem,
or "brahman" with gender Neu. The fourth
button, labeled "Any", may be used for the words which take their
gender from the context, such as deictic personal pronouns ("aham", "tvad"),
or numeral words such as "dva", "tri", etc.
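The Velthuis-style inputs quoted on this page ("devii", "m.rj", "j~naa") can be related to their diacritic spellings with a small converter. The Python sketch below is illustrative only: the table covers a subset of the standard Velthuis conventions, the mapping of "z" to ś is inferred from examples on this page such as "darzayi.syati" for a form of dṛś, and the name `velthuis_to_iast` is hypothetical rather than part of the site's tools.

```python
# Partial Velthuis-to-IAST table (assumed subset, not the site's full scheme).
VELTHUIS_TO_IAST = {
    "aa": "ā", "ii": "ī", "uu": "ū",
    ".rr": "ṝ", ".r": "ṛ", ".l": "ḷ",
    ".m": "ṃ", ".h": "ḥ",
    '"n': "ṅ", "~n": "ñ",
    ".t": "ṭ", ".d": "ḍ", ".n": "ṇ",
    '"s': "ś", "z": "ś", ".s": "ṣ",
}

# Try longer codes first so that ".rr" wins over ".r".
_CODES = sorted(VELTHUIS_TO_IAST, key=len, reverse=True)

def velthuis_to_iast(text):
    """Convert a Velthuis-style string to IAST by greedy longest match."""
    out = []
    i = 0
    while i < len(text):
        for code in _CODES:
            if text.startswith(code, i):
                out.append(VELTHUIS_TO_IAST[code])
                i += len(code)
                break
        else:
            # No code starts here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)

for word in ("devii", "m.rj", "j~naa", "a.s.tau"):
    print(word, "->", velthuis_to_iast(word))
```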
A conjugation engine for roots is also available. It handles
the full present system: present indicative, imperfect, imperative and
optative, as well as the passive present system, the perfect, the aorist
and the future.
Participial stems, absolutives and infinitives are listed as well.
Some secondary conjugations (causative, intensive,
desiderative) are also generated, for the full present and future systems.
Try out this conjugation engine
with data such as "bhuu" 1, "as" 2, "m.rj" 2, "han" 2, "haa" 3, "hu" 3,
"daa" 4, "su" 5, "p.r" 6, "yuj" 7, "k.r" 8, "j~naa" 9, "cur" 10, "namas" 11.
You may cascade by generating declensions of the generated participial stems.
A word of caution is called for here. The only safe way to get correct
inflected forms is to enter the stem and its morphological parameters
consistently with their specification in the Heritage dictionary. This is
especially true of roots, since they appear with various names according to
Sanskrit grammars. For instance, root hū is called hū,
hvā or hve according to various grammarians. Another problem
is homophony. When two items have the same phonetic realization, their
respective lexemes are disambiguated
by an integer index, which is specific to the lexicon. Thus there are
three roots named mā in the Sanskrit Heritage dictionary. They are
addressed respectively (in Velthuis transliteration) as maa#1, maa#3 and maa#4.
If you ask for the conjugated forms of maa in present classes 2 or 3, the
system will guess you mean maa#1 (to measure). But if you mean maa#3
(to mow) or maa#4 (to exchange) you have to enter explicitly their disambiguated
stems maa#3 or maa#4. Entering an arbitrary stem and arbitrary morphology
parameters may yield random results or error messages.
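The "stem#index" convention can be handled mechanically on the client side. The following Python sketch, with the hypothetical helper `split_lexeme`, separates a stem from its optional homonym index. It only parses the notation: which index the system assumes when none is given is decided by the site, not by this code.

```python
import re

# A stem, optionally followed by "#" and an integer homonym index.
_LEXEME = re.compile(r"^(?P<stem>[^#]+)(?:#(?P<index>\d+))?$")

def split_lexeme(entry):
    """Split 'maa#3' into ('maa', 3); a bare 'maa' gives ('maa', None)."""
    match = _LEXEME.match(entry.strip())
    if match is None:
        raise ValueError(f"not a valid stem: {entry!r}")
    index = match.group("index")
    return match.group("stem"), (int(index) if index is not None else None)

print(split_lexeme("maa#3"))
print(split_lexeme("maa"))
```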
Lemmatizer
Conversely, a
lemmatizer
attempts to tag inflected words.
Try for instance (in Velthuis format)
"devaat", "jagmivaan", "a.s.tau" (selecting Noun)
or "apibat", "akaar.siit", "dudoha", "vaahyate" (selecting Verb).
This lemmatizer knows about inflected forms of derived stems in some
secondary derivations.
For instance, "darzayi.syati" is found as conjugated form:
{ ca. fut. ac. sg. 3 }[dṛś_1],
"dariid.rzyate" yields { int. pr. md. sg. 3 }[dṛś_1],
"did.rk.sate" yields { des. pr. md. sg. 3 }[dṛś_1]
and "bibhik.se" yields { des. pft. md. sg. 3 | des. pft. md. sg. 1 }[bhaj].
Please note the multitag notation of this ambiguous form.
Other lexical categories are available, such as Part for participles.
For instance, "bhikṣitavyānām" (selecting IAST transliteration and the Part
lexical category), yields { g. pl. f. | g. pl. n. | g. pl. m. }[bhikṣitavya { des. pfp. [3] }[bhaj]].
The various grammatical abbreviations used in these lemmas are available
here.
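The lemma notation shown above, a brace-enclosed list of tags separated by "|" followed by the base in square brackets, is regular enough to be read by a short program. This Python sketch is an assumption based only on the examples quoted here, not on a published grammar of the output format, and `parse_lemma` is a hypothetical name. For a nested lemma such as the participle example, it returns the inner part as an unparsed string.

```python
import re

# "{ tag | tag ... }[base]": tags inside braces, base inside the outer brackets.
_LEMMA = re.compile(r"^\{\s*(?P<tags>.*?)\s*\}\[(?P<base>.*)\]$")

def parse_lemma(text):
    """Return (list of alternative tags, base) for one lemma string."""
    match = _LEMMA.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized lemma: {text!r}")
    # An ambiguous form carries several tags separated by "|".
    tags = [tag.strip() for tag in match.group("tags").split("|")]
    return tags, match.group("base")

tags, base = parse_lemma("{ des. pft. md. sg. 3 | des. pft. md. sg. 1 }[bhaj]")
print(tags, base)
```

Applying `parse_lemma` again to the base of a nested lemma, after splitting off the leading stem, would recover the inner tags; that step is left out to keep the sketch short.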
N.B. Do not attempt to lemmatize verbal forms with preverbs: this will
not work, since the lemmatizer only knows how to invert root forms. Lemmatizing
more complex forms is possible through the Sanskrit Reader interface,
as explained in the manual.
Morphology
A dictionary of inflected forms of Sanskrit words is provided
in XML form under various transliteration schemes.
Please visit the Sanskrit linguistic resources page.
Sanskrit Reader
The main tool provided by this site is a
Sanskrit Reader that allows machine-assisted
analysis of Sanskrit sentences, that is segmentation
(including sandhi viccheda), morphological tagging, and several parsers.
Please consult the Reference manual for learning how
to use these tools.
The Zen Library
This site reflects an ongoing project of Sanskrit processing
on a comprehensive software platform.
The project is based on a structured lexicographic database, compiled from
the Sanskrit Heritage dictionary, and on
the Zen computational linguistics toolkit. This toolkit is a library
of programs implemented in the
Objective Caml
programming language. The Zen library and its documentation are available
as free software under the GNU Lesser General Public License (LGPL) from the
Zen gitlab site.