8000 GitHub - sanskritsahitya-com/data: Repo for storing all SanskritSahitya.com data.
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

sanskritsahitya-com/data

Repository files navigation

This repository hosts all the data that powers SanskritSahitya.org.

The structure of the repo is straightforward:

  • code/ holds any scripts that might be useful for reading / writing this data
  • For a typical Kaavya named kaavya, there would be file kavya/kaavya.json that has the entire text of the Kaavya.

For large kaavyas such as the Mahabharatam, the data has been sharded, with the number appended to the filename. eg. mahabharatam/mahabharatam1.json etc.

The typical structure of the JSON is:

{
    "title": "Name of the Kaavya",
    "books": [{"number": "1", "name": "Name of Book 1"},
              {"number": "2", "name": "Name of Book 2"}, ...]
    "chapters": [{"number": "1", "name": "Name of Chapter 1"},
                 {"number": "2", "name": "Name of Chapter 2"}, ...]
    "data": [
        {"c": "1", "n": "1", "i": 0, "v": "Shlok 1.1"},
        {"c": "1", "n": "2", "i": 1, "v": "Shlok 1.2"},
        ...
    ]
}

The format currently supports kaavyas that are either:

  • Not sub-divided into chapters / sections. A shloka in such a kaavya is referenced with just the number of the shloka. Example /meghadutam/1.
  • Divided into sections / chapters that comprise of shlokas. A shloka in such a kaavya is referenced with the chapter number combined with the shloka number. Example /raghuvansham/1.1.
  • Divided into books which are sub-divided into sections, which then comprise of shlokas. A chapter in such a kaavya is referenced with the book number and chapter number combined (e.g. 1.1) and a shloka is represented with all three combined. Example /mahabharatam/1.1.1.

The data key has a list of objects (referred to as Shloka objects) representing the entire text.

The structure of a Shloka object is:

{
    // [String] Chapter number, matches the chapter numbers declared in the `chapters` key.
    // For Kaavyas with books, this would include the book number.
    "c": "1",

    // [String] Shloka number
    "n": "1",

    // [Integer] Index. This represents the overall index of this shloka / line
    // in the text, and runs monotonically from start to end.
    "i": 0,

    // [String] Verse [Shloka]. The actual text of the shloka.
    "v": "वागर्थाविव संपृक्तौ..."

    // [String] Text. Refers to non-shloka text such as 'धृतराष्ट्र उवाच ।' that is
    // meant to be interspersed with shlokas.
    // A Shloka object should have only one of |v| and |t| keys.
    // A text entry doesn't have the "n" or "i" keys present.
    "t": "धृतराष्ट्र उवाच ।"

    // [String] Speaker. This is used in plays to annotate the speaker of this particular text.
    // Using this makes it easier to show the speaker in the UI and allows searching for dialogues
    // based on speakers.
    "sp": "शकुन्तला"
}

The focus is to make this easy to access through computational means, but it's written out in a specifically pretty-printed JSON format to be relatively easily human readable and to lead to clean diffs when modifying any line. It is still a pure JSON object that can be read by any parser, but if you are planning to submit changes to the repository, use the smart_json_dump helper function in code/smart_json_dump.py to pretty-print your output to the repo style.

About

Repo for storing all SanskritSahitya.com data.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 5

Languages

0