Manually tagged data was developed as part of the consortium project 'Development of Sanskrit computational tools and Sanskrit-Hindi Machine Translation system (2008-2012)', funded by DeitY, Government of India, under the TDIL programme.
The data was tagged following these guidelines.
The following data is available for research:
- POS-tagged corpus
- Dependency analysis of the corpus
- samAsa-annotated data in WX notation
- Frequency of compound components in WX notation
- Frequency of sandhi rules within compound components in WX notation
- Frequency of words in WX notation
- External sandhi rules with their frequencies in WX notation
- Sandhi-split parallel data extracted from the annotated texts