Wikidata:WordGraph

On November 7, 2024, Google released the WordGraph dataset as a belated present for Wikidata’s 12th birthday. According to its self-description, “[t]he WordGraph dataset contains multilingual lexicon entries linked to wikipedia entities, focusing on human-denoting nouns and demonym adjectives. Each lexicon entries contain inflected word-form and morphological information [of] all locales.” WordGraph was made available on GitHub.

The dataset is published under the Creative Commons CC0 License (Q6938433), which makes it suitable for integration with Wikidata. We want to thank Bruno Cartoni (Q111290954), Saran Lertpradit, Seungmin Back, Daniel Calvelo Aros (Q100784694), Kuang-Yu Samuel Chang (Q86087144) and Abdelrahman Nabil at Google for this beautiful gift.

Here we look at some statistics about WordGraph, particularly compared to Wikidata.

Altogether, the WordGraph release contains almost a million word forms (968,153) in 39 languages: Arabic, Bulgarian, Bangla, Catalan, Czech, Danish, German, Greek, English, Spanish, Estonian, Persian, Finnish, French, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Kannada, Lithuanian, Latvian, Marathi, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Serbian, Swedish, Tamil, Telugu, Turkish, Ukrainian, and Urdu.

Norwegian, Hindi, and Urdu are not properly mapped, because WordGraph and Wikidata represent these languages differently. I am asking for help with fixing this mapping.

The statistics in the following sections are defined as follows:

  • Forms are the different forms a lexeme can have, counted once for each combination of grammatical features.
  • Unique forms count only the forms that actually differ from one another.
  • Unique lemmas count the different headwords of lexemes.
  • Unique topics are the distinct QIDs mentioned in the claims (for Wikidata) or in the mappings (for WordGraph) of an entry.
  • Senses are the mappings of lemmas to QIDs. In Wikidata this might be an overcount, as it counts all QIDs on claims of the senses, not just the sense-related ones.
  • Shared forms and shared lemmas count how many forms and lemmas appear in both WordGraph and Wikidata.
  • Shared senses count only matches of lemmas to QIDs for their meaning.
  • Missing senses are QIDs given in WordGraph that are missing on a shared lemma in Wikidata.
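To make these definitions concrete, here is a minimal Python sketch that computes the same kinds of statistics for two tiny, made-up datasets. The `Entry` class, the `stats` and `compare` helpers, the toy words, and the QIDs (placeholders such as `Q_TEACHER`) are all assumptions for illustration. This is not the script that produced the numbers on this page, and the real WordGraph and Wikidata data formats are more complex.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One lexicon entry (hypothetical, simplified structure)."""
    lemma: str
    forms: list[tuple[str, str]]  # (surface form, grammatical features)
    qids: set[str] = field(default_factory=set)  # topics the lemma maps to


def stats(entries: list[Entry]) -> dict:
    """Per-dataset statistics, following the definitions above."""
    all_forms = [form for e in entries for form, _features in e.forms]
    senses = {(e.lemma, q) for e in entries for q in e.qids}
    return {
        "forms": len(all_forms),               # one per feature combination
        "unique_forms": set(all_forms),        # forms that actually differ
        "unique_lemmas": {e.lemma for e in entries},
        "unique_topics": {q for _lemma, q in senses},
        "senses": senses,                      # (lemma, QID) pairs
    }


def compare(wordgraph: list[Entry], wikidata: list[Entry]) -> dict:
    """Cross-dataset statistics: shared forms, lemmas, senses, missing senses."""
    wg, wd = stats(wordgraph), stats(wikidata)
    shared_lemmas = wg["unique_lemmas"] & wd["unique_lemmas"]
    # A sense is "missing" if the lemma exists in both datasets but the
    # WordGraph QID is not attached to that lemma on the Wikidata side.
    missing = {
        (lemma, q) for lemma, q in wg["senses"]
        if lemma in shared_lemmas and (lemma, q) not in wd["senses"]
    }
    return {
        "shared_forms": len(wg["unique_forms"] & wd["unique_forms"]),
        "shared_lemmas": len(shared_lemmas),
        "shared_senses": len(wg["senses"] & wd["senses"]),
        "missing_senses": len(missing),
    }


# Toy data; the QIDs are placeholders, not real Wikidata identifiers.
wordgraph = [
    Entry("teacher", [("teacher", "singular"), ("teachers", "plural")], {"Q_TEACHER"}),
    Entry("baker", [("baker", "singular"), ("bakers", "plural")], {"Q_BAKER"}),
    Entry("Swiss", [("Swiss", "singular"), ("Swiss", "plural")], {"Q_SWISS"}),
]
wikidata = [
    Entry("teacher", [("teacher", "singular"), ("teachers", "plural")], {"Q_TEACHER"}),
    Entry("baker", [("baker", "singular")]),  # lexeme exists, but has no sense yet
    Entry("dog", [("dog", "singular"), ("dogs", "plural")], {"Q_DOG"}),
]

wg_stats = stats(wordgraph)
result = compare(wordgraph, wikidata)
print(wg_stats["forms"], len(wg_stats["unique_forms"]))
print(result)
```

In this toy example "Swiss" contributes two forms but only one unique form, and "baker" is a shared lemma whose WordGraph sense is missing on the Wikidata side, which is the case the "missing senses" figures are meant to surface.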

Arabic

  • 32526 forms in WordGraph
  • 12379 unique forms in WordGraph
  • 919 unique lemmas in WordGraph
  • 944 unique topics in WordGraph
  • 957 senses in WordGraph
  • 2231 lexemes in Wikidata
  • 2221 unique lemmas in Wikidata
  • 600 unique topics in Wikidata
  • 656 senses in Wikidata (possible overcount)
  • 1842 forms in Wikidata
  • 1779 unique forms in Wikidata
  • 5 shared forms
  • 2 shared lemmas
  • 0 shared senses
  • 2 missing senses on existing Lexemes

Bulgarian

  • 14232 forms in WordGraph
  • 10471 unique forms in WordGraph
  • 1291 unique lemmas in WordGraph
  • 1340 unique topics in WordGraph
  • 1345 senses in WordGraph
  • 173 lexemes in Wikidata
  • 173 unique lemmas in Wikidata
  • 67 unique topics in Wikidata
  • 67 senses in Wikidata (possible overcount)
  • 245 forms in Wikidata
  • 233 unique forms in Wikidata
  • 7 shared forms
  • 2 shared lemmas
  • 2 shared senses
  • 1 missing senses on existing Lexemes

Bangla

  • 27282 forms in WordGraph
  • 8058 unique forms in WordGraph
  • 1398 unique lemmas in WordGraph
  • 1347 unique topics in WordGraph
  • 1464 senses in WordGraph
  • 10780 lexemes in Wikidata
  • 10243 unique lemmas in Wikidata
  • 1823 unique topics in Wikidata
  • 3818 senses in Wikidata (possible overcount)
  • 69902 forms in Wikidata
  • 57227 unique forms in Wikidata
  • 136 shared forms
  • 34 shared lemmas
  • 22 shared senses
  • 19 missing senses on existing Lexemes

Catalan

  • 3797 forms in WordGraph
  • 2935 unique forms in WordGraph
  • 924 unique lemmas in WordGraph
  • 957 unique topics in WordGraph
  • 957 senses in WordGraph
  • 175 lexemes in Wikidata
  • 173 unique lemmas in Wikidata
  • 104 unique topics in Wikidata
  • 107 senses in Wikidata (possible overcount)
  • 208 forms in Wikidata
  • 200 unique forms in Wikidata
  • 4 shared forms
  • 2 shared lemmas
  • 1 shared senses
  • 1 missing senses on existing Lexemes

Czech

  • 51247 forms in WordGraph
  • 22823 unique forms in WordGraph
  • 1430 unique lemmas in WordGraph
  • 1466 unique topics in WordGraph
  • 1479 senses in WordGraph
  • 29644 lexemes in Wikidata
  • 29260 unique lemmas in Wikidata
  • 20902 unique topics in Wikidata
  • 22451 senses in Wikidata (possible overcount)
  • 861902 forms in Wikidata
  • 211267 unique forms in Wikidata
  • 2333 shared forms
  • 83 shared lemmas
  • 67 shared senses
  • 21 missing senses on existing Lexemes

Danish

  • 13736 forms in WordGraph
  • 12194 unique forms in WordGraph
  • 1584 unique lemmas in WordGraph
  • 1633 unique topics in WordGraph
  • 1665 senses in WordGraph
  • 91505 lexemes in Wikidata
  • 90417 unique lemmas in Wikidata
  • 7820 unique topics in Wikidata
  • 9037 senses in Wikidata (possible overcount)
  • 611986 forms in Wikidata
  • 564466 unique forms in Wikidata
  • 6646 shared forms
  • 292 shared lemmas
  • 261 shared senses
  • 67 missing senses on existing Lexemes

German

  • 79418 forms in WordGraph
  • 13787 unique forms in WordGraph
  • 2969 unique lemmas in WordGraph
  • 2838 unique topics in WordGraph
  • 3148 senses in WordGraph
  • 214090 lexemes in Wikidata
  • 213399 unique lemmas in Wikidata
  • 12739 unique topics in Wikidata
  • 16586 senses in Wikidata (possible overcount)
  • 567158 forms in Wikidata
  • 206858 unique forms in Wikidata
  • 3747 shared forms
  • 556 shared lemmas
  • 474 shared senses
  • 145 missing senses on existing Lexemes

Greek

  • 10736 forms in WordGraph
  • 4369 unique forms in WordGraph
  • 635 unique lemmas in WordGraph
  • 681 unique topics in WordGraph
  • 681 senses in WordGraph
  • 43326 lexemes in Wikidata
  • 43263 unique lemmas in Wikidata
  • 133 unique topics in Wikidata
  • 144 senses in Wikidata (possible overcount)
  • 70987 forms in Wikidata
  • 38905 unique forms in Wikidata
  • 701 shared forms
  • 4 shared lemmas
  • 4 shared senses
  • 0 missing senses on existing Lexemes

English

  • 6904 forms in WordGraph
  • 5923 unique forms in WordGraph
  • 3567 unique lemmas in WordGraph
  • 3355 unique topics in WordGraph
  • 3784 senses in WordGraph
  • 75194 lexemes in Wikidata
  • 67495 unique lemmas in Wikidata
  • 25335 unique topics in Wikidata
  • 28160 senses in Wikidata (possible overcount)
  • 137911 forms in Wikidata
  • 115078 unique forms in Wikidata
  • 1885 shared forms
  • 960 shared lemmas
  • 811 shared senses
  • 297 missing senses on existing Lexemes

Spanish

  • 14620 forms in WordGraph
  • 8882 unique forms in WordGraph
  • 2832 unique lemmas in WordGraph
  • 3058 unique topics in WordGraph
  • 3270 senses in WordGraph
  • 60088 lexemes in Wikidata
  • 56089 unique lemmas in Wikidata
  • 5179 unique topics in Wikidata
  • 7617 senses in Wikidata (possible overcount)
  • 679598 forms in Wikidata
  • 555915 unique forms in Wikidata
  • 3491 shared forms
  • 710 shared lemmas
  • 897 shared senses
  • 127 missing senses on existing Lexemes

Estonian

  • 2744 forms in WordGraph
  • 2351 unique forms in WordGraph
  • 87 unique lemmas in WordGraph
  • 97 unique topics in WordGraph
  • 97 senses in WordGraph
  • 83199 lexemes in Wikidata
  • 80642 unique lemmas in Wikidata
  • 1078 unique topics in Wikidata
  • 1136 senses in Wikidata (possible overcount)
  • 2801896 forms in Wikidata
  • 2637583 unique forms in Wikidata
  • 512 shared forms
  • 0 shared lemmas
  • 0 shared senses
  • 0 missing senses on existing Lexemes

Persian

  • 2142 forms in WordGraph
  • 1677 unique forms in WordGraph
  • 81 unique lemmas in WordGraph
  • 80 unique topics in WordGraph
  • 81 senses in WordGraph
  • 18293 lexemes in Wikidata
  • 17594 unique lemmas in Wikidata
  • 1715 unique topics in Wikidata
  • 34642 senses in Wikidata (possible overcount)
  • 40389 forms in Wikidata
  • 37092 unique forms in Wikidata
  • 3 shared forms
  • 1 shared lemmas
  • 0 shared senses
  • 1 missing senses on existing Lexemes

Finnish

  • 7020 forms in WordGraph
  • 6073 unique forms in WordGraph
  • 225 unique lemmas in WordGraph
  • 232 unique topics in WordGraph
  • 232 senses in WordGraph
  • 1054 lexemes in Wikidata
  • 1053 unique lemmas in Wikidata
  • 459 unique topics in Wikidata
  • 558 senses in Wikidata (possible overcount)
  • 9337 forms in Wikidata
  • 9205 unique forms in Wikidata
  • 22 shared forms
  • 2 shared lemmas
  • 1 shared senses
  • 1 missing senses on existing Lexemes

French

  • 14372 forms in WordGraph
  • 8905 unique forms in WordGraph
  • 2906 unique lemmas in WordGraph
  • 2853 unique topics in WordGraph
  • 3119 senses in WordGraph
  • 19835 lexemes in Wikidata
  • 19505 unique lemmas in Wikidata
  • 8414 unique topics in Wikidata
  • 10401 senses in Wikidata (possible overcount)
  • 331819 forms in Wikidata
  • 253717 unique forms in Wikidata
  • 1627 shared forms
  • 569 shared lemmas
  • 419 shared senses
  • 252 missing senses on existing Lexemes

Hebrew

  • 14353 forms in WordGraph
  • 9107 unique forms in WordGraph
  • 1245 unique lemmas in WordGraph
  • 1257 unique topics in WordGraph
  • 1301 senses in WordGraph
  • 29640 lexemes in Wikidata
  • 27893 unique lemmas in Wikidata
  • 5838 unique topics in Wikidata
  • 6024 senses in Wikidata (possible overcount)
  • 441485 forms in Wikidata
  • 322656 unique forms in Wikidata
  • 2056 shared forms
  • 216 shared lemmas
  • 182 shared senses
  • 59 missing senses on existing Lexemes

Hindi

Note: Hindi was not properly mapped. Only the first five statistics (the WordGraph counts) are trustworthy; all others are not.

  • 15526 forms in WordGraph
  • 3199 unique forms in WordGraph
  • 1868 unique lemmas in WordGraph
  • 1892 unique topics in WordGraph
  • 1974 senses in WordGraph
  • 1 lexemes in Wikidata
  • 1 unique lemmas in Wikidata
  • 0 unique topics in Wikidata
  • 0 senses in Wikidata (possible overcount)
  • 0 forms in Wikidata
  • 0 unique forms in Wikidata
  • 0 shared forms
  • 0 shared lemmas
  • 0 shared senses
  • 0 missing senses on existing Lexemes

Croatian

  • 52171 forms in WordGraph
  • 17990 unique forms in WordGraph
  • 1413 unique lemmas in WordGraph
  • 1451 unique topics in WordGraph
  • 1482 senses in WordGraph
  • 773 lexemes in Wikidata
  • 747 unique lemmas in Wikidata
  • 828 unique topics in Wikidata
  • 1159 senses in Wikidata (possible overcount)
  • 10180 forms in Wikidata
  • 5025 unique forms in Wikidata
  • 345 shared forms
  • 28 shared lemmas
  • 28 shared senses
  • 3 missing senses on existing Lexemes

Hungarian

  • 10574 forms in WordGraph
  • 10160 unique forms in WordGraph
  • 299 unique lemmas in WordGraph
  • 310 unique topics in WordGraph
  • 310 senses in WordGraph
  • 121 lexemes in Wikidata
  • 118 unique lemmas in Wikidata
  • 62 unique topics in Wikidata
  • 64 senses in Wikidata (possible overcount)
  • 159 forms in Wikidata
  • 154 unique forms in Wikidata
  • 0 shared forms
  • 0 shared lemmas
  • 0 shared senses
  • 0 missing senses on existing Lexemes

Indonesian

  • 13451 forms in WordGraph
  • 5132 unique forms in WordGraph
  • 1647 unique lemmas in WordGraph
  • 1725 unique topics in WordGraph
  • 1751 senses in WordGraph
  • 20212 lexemes in Wikidata
  • 20191 unique lemmas in Wikidata
  • 385 unique topics in Wikidata
  • 402 senses in Wikidata (possible overcount)
  • 412640 forms in Wikidata
  • 393154 unique forms in Wikidata
  • 365 shared forms
  • 15 shared lemmas
  • 13 shared senses
  • 5 missing senses on existing Lexemes

Italian

  • 9932 forms in WordGraph
  • 6786 unique forms in WordGraph
  • 2287 unique lemmas in WordGraph
  • 2301 unique topics in WordGraph
  • 2453 senses in WordGraph
  • 64254 lexemes in Wikidata
  • 62040 unique lemmas in Wikidata
  • 18765 unique topics in Wikidata
  • 20732 senses in Wikidata (possible overcount)
  • 521626 forms in Wikidata
  • 412361 unique forms in Wikidata
  • 2129 shared forms
  • 546 shared lemmas
  • 453 shared senses
  • 177 missing senses on existing Lexemes

Kannada

  • 299 forms in WordGraph
  • 275 unique forms in WordGraph
  • 63 unique lemmas in WordGraph
  • 63 unique topics in WordGraph
  • 63 senses in WordGraph
  • 14 lexemes in Wikidata
  • 14 unique lemmas in Wikidata
  • 14 unique topics in Wikidata
  • 14 senses in Wikidata (possible overcount)
  • 9 forms in Wikidata
  • 9 unique forms in Wikidata
  • 0 shared forms
  • 0 shared lemmas
  • 0 shared senses
  • 0 missing senses on existing Lexemes

Lithuanian

  • 17424 forms in WordGraph
  • 11798 unique forms in WordGraph
  • 584 unique lemmas in WordGraph
  • 617 unique topics in WordGraph
  • 618 senses in WordGraph
  • 22 lexemes in Wikidata
  • 22 unique lemmas in Wikidata
  • 16 unique topics in Wikidata
  • 16 senses in Wikidata (possible overcount)
  • 95 forms in Wikidata
  • 85 unique forms in Wikidata
  • 0 shared forms
  • 0 shared lemmas
  • 0 shared senses
  • 0 missing senses on existing Lexemes

Latvian

  • 15348 forms in WordGraph
  • 9296 unique forms in WordGraph
  • 611 unique lemmas in WordGraph
  • 638 unique topics in WordGraph
  • 641 senses in WordGraph
  • 273 lexemes in Wikidata
  • 272 unique lemmas in Wikidata
  • 122 unique topics in Wikidata
  • 128 senses in Wikidata (possible overcount)
  • 2637 forms in Wikidata
  • 1950 unique forms in Wikidata
  • 54 shared forms
  • 3 shared lemmas
  • 3 shared senses
  • 0 missing senses on existing Lexemes

Marathi

  • 28748 forms in WordGraph
  • 10371 unique forms in WordGraph
  • 1340 unique lemmas in WordGraph
  • 1280 unique topics in WordGraph
  • 1397 senses in WordGraph
  • 54 lexemes in Wikidata
  • 50 unique lemmas in Wikidata
  • 59 unique topics in Wikidata
  • 116 senses in Wikidata (possible overcount)
  • 111 forms in Wikidata
  • 94 unique forms in Wikidata
  • 2 shared forms
  • 2 shared lemmas
  • 0 shared senses
  • 2 missing senses on existing Lexemes

Dutch

  • 7912 forms in WordGraph
  • 5024 unique forms in WordGraph
  • 1891 unique lemmas in WordGraph
  • 1641 unique topics in WordGraph
  • 1941 senses in WordGraph
  • 560 lexemes in Wikidata
  • 534 unique lemmas in Wikidata
  • 396 unique topics in Wikidata
  • 415 senses in Wikidata (possible overcount)
  • 1172 forms in Wikidata
  • 1104 unique forms in Wikidata
  • 13 shared forms
  • 5 shared lemmas
  • 4 shared senses
  • 1 missing senses on existing Lexemes

Norwegian

Note: Norwegian was not properly mapped. Only the first five statistics (the WordGraph counts) are trustworthy; all others are not.

  • 18044 forms in WordGraph
  • 8118 unique forms in WordGraph
  • 1083 unique lemmas in WordGraph
  • 1138 unique topics in WordGraph
  • 1139 senses in WordGraph
  • 0 lexemes in Wikidata
  • 0 unique lemmas in Wikidata
  • 0 unique topics in Wikidata
  • 0 senses in Wikidata (possible overcount)
  • 0 forms in Wikidata
  • 0 unique forms in Wikidata
  • 0 shared forms
  • 0 shared lemmas
  • 0 shared senses
  • 0 missing senses on existing Lexemes

Polish

  • 82047 forms in WordGraph
  • 28445 unique forms in WordGraph
  • 1912 unique lemmas in WordGraph
  • 1915 unique topics in WordGraph
  • 2034 senses in WordGraph
  • 3533 lexemes in Wikidata
  • 3376 unique lemmas in Wikidata
  • 2167 unique topics in Wikidata
  • 3193 senses in Wikidata (possible overcount)
  • 31744 forms in Wikidata
  • 19313 unique forms in Wikidata
  • 204 shared forms
  • 21 shared lemmas
  • 20 shared senses
  • 6 missing senses on existing Lexemes

Portuguese

  • 15735 forms in WordGraph
  • 9403 unique forms in WordGraph
  • 3056 unique lemmas in WordGraph
  • 2879 unique topics in WordGraph
  • 3271 senses in WordGraph
  • 4544 lexemes in Wikidata
  • 4464 unique lemmas in Wikidata
  • 2694 unique topics in Wikidata
  • 3115 senses in Wikidata (possible overcount)
  • 46865 forms in Wikidata
  • 34634 unique forms in Wikidata
  • 561 shared forms
  • 151 shared lemmas
  • 116 shared senses
  • 68 missing senses on existing Lexemes

Romanian

  • 22990 forms in WordGraph
  • 13208 unique forms in WordGraph
  • 1245 unique lemmas in WordGraph
  • 1303 unique topics in WordGraph
  • 1310 senses in WordGraph
  • 99 lexemes in Wikidata
  • 98 unique lemmas in Wikidata
  • 96 unique topics in Wikidata
  • 100 senses in Wikidata (possible overcount)
  • 119 forms in Wikidata
  • 101 unique forms in Wikidata
  • 6 shared forms
  • 2 shared lemmas
  • 1 shared senses
  • 1 missing senses on existing Lexemes

Russian

  • 106335 forms in WordGraph
  • 26400 unique forms in WordGraph
  • 2182 unique lemmas in WordGraph
  • 2174 unique topics in WordGraph
  • 2312 senses in WordGraph
  • 101971 lexemes in Wikidata
  • 99742 unique lemmas in Wikidata
  • 12077 unique topics in Wikidata
  • 16131 senses in Wikidata (possible overcount)
  • 1243365 forms in Wikidata
  • 913065 unique forms in Wikidata
  • 10412 shared forms
  • 380 shared lemmas
  • 351 shared senses
  • 69 missing senses on existing Lexemes

Slovak

  • 45758 forms in WordGraph
  • 20794 unique forms in WordGraph
  • 1354 unique lemmas in WordGraph
  • 1390 unique topics in WordGraph
  • 1401 senses in WordGraph
  • 16477 lexemes in Wikidata
  • 16216 unique lemmas in Wikidata
  • 574 unique topics in Wikidata
  • 643 senses in Wikidata (possible overcount)
  • 235262 forms in Wikidata
  • 128234 unique forms in Wikidata
  • 3141 shared forms
  • 43 shared lemmas
  • 36 shared senses
  • 9 missing senses on existing Lexemes

Slovenian

  • 53478 forms in WordGraph
  • 21836 unique forms in WordGraph
  • 1545 unique lemmas in WordGraph
  • 1609 unique topics in WordGraph
  • 1616 senses in WordGraph
  • 32 lexemes in Wikidata
  • 32 unique lemmas in Wikidata
  • 30 unique topics in Wikidata
  • 32 senses in Wikidata (possible overcount)
  • 170 forms in Wikidata
  • 103 unique forms in Wikidata
  • 11 shared forms
  • 2 shared lemmas
  • 1 shared senses
  • 1 missing senses on existing Lexemes

Serbian

  • 45843 forms in WordGraph
  • 15997 unique forms in WordGraph
  • 1305 unique lemmas in WordGraph
  • 1338 unique topics in WordGraph
  • 1350 senses in WordGraph
  • 25 lexemes in Wikidata
  • 25 unique lemmas in Wikidata
  • 18 unique topics in Wikidata
  • 18 senses in Wikidata (possible overcount)
  • 39 forms in Wikidata
  • 32 unique forms in Wikidata
  • 0 shared forms
  • 0 shared lemmas
  • 0 shared senses
  • 0 missing senses on existing Lexemes

Swedish

  • 13664 forms in WordGraph
  • 11132 unique forms in WordGraph
  • 1589 unique lemmas in WordGraph
  • 1621 unique topics in WordGraph
  • 1660 senses in WordGraph
  • 42779 lexemes in Wikidata
  • 42385 unique lemmas in Wikidata
  • 8256 unique topics in Wikidata
  • 9251 senses in Wikidata (possible overcount)
  • 294289 forms in Wikidata
  • 262349 unique forms in Wikidata
  • 4351 shared forms
  • 405 shared lemmas
  • 361 shared senses
  • 83 missing senses on existing Lexemes

Tamil

  • 23618 forms in WordGraph
  • 11544 unique forms in WordGraph
  • 1249 unique lemmas in WordGraph
  • 1129 unique topics in WordGraph
  • 1296 senses in WordGraph
  • 883 lexemes in Wikidata
  • 871 unique lemmas in Wikidata
  • 706 unique topics in Wikidata
  • 1057 senses in Wikidata (possible overcount)
  • 6670 forms in Wikidata
  • 6572 unique forms in Wikidata
  • 140 shared forms
  • 22 shared lemmas
  • 19 shared senses
  • 8 missing senses on existing Lexemes

Telugu

  • 14751 forms in WordGraph
  • 9672 unique forms in WordGraph
  • 1129 unique lemmas in WordGraph
  • 1133 unique topics in WordGraph
  • 1173 senses in WordGraph
  • 75 lexemes in Wikidata
  • 72 unique lemmas in Wikidata
  • 38 unique topics in Wikidata
  • 50 senses in Wikidata (possible overcount)
  • 52 forms in Wikidata
  • 52 unique forms in Wikidata
  • 3 shared forms
  • 2 shared lemmas
  • 2 shared senses
  • 0 missing senses on existing Lexemes

Turkish

  • 12078 forms in WordGraph
  • 10058 unique forms in WordGraph
  • 851 unique lemmas in WordGraph
  • 905 unique topics in WordGraph
  • 906 senses in WordGraph
  • 1815 lexemes in Wikidata
  • 1766 unique lemmas in Wikidata
  • 1081 unique topics in Wikidata
  • 1425 senses in Wikidata (possible overcount)
  • 2517 forms in Wikidata
  • 2298 unique forms in Wikidata
  • 24 shared forms
  • 15 shared lemmas
  • 12 shared senses
  • 10 missing senses on existing Lexemes

Ukrainian

  • 47172 forms in WordGraph
  • 19775 unique forms in WordGraph
  • 1175 unique lemmas in WordGraph
  • 1219 unique topics in WordGraph
  • 1221 senses in WordGraph
  • 16295 lexemes in Wikidata
  • 16293 unique lemmas in Wikidata
  • 310 unique topics in Wikidata
  • 319 senses in Wikidata (possible overcount)
  • 508091 forms in Wikidata
  • 238647 unique forms in Wikidata
  • 73 shared forms
  • 4 shared lemmas
  • 2 shared senses
  • 2 missing senses on existing Lexemes

Urdu

Note: Urdu was not properly mapped. Only the first five statistics (the WordGraph counts) are trustworthy; all others are not.

  • 126 forms in WordGraph
  • 77 unique forms in WordGraph
  • 69 unique lemmas in WordGraph
  • 68 unique topics in WordGraph
  • 70 senses in WordGraph
  • 0 lexemes in Wikidata
  • 0 unique lemmas in Wikidata
  • 0 unique topics in Wikidata
  • 0 senses in Wikidata (possible overcount)
  • 0 forms in Wikidata
  • 0 unique forms in Wikidata
  • 0 shared forms
  • 0 shared lemmas
  • 0 shared senses
  • 0 missing senses on existing Lexemes