Jacy is being developed in cooperation with the Hinoki Treebank.
Corpora
Name | ID | Full Name | # Sentences | # Words | Comments |
mrs | 0 | MRS Test Suite | 136 | ??? | |
tc | 100,000 | Tanaka Corpus | 150,341 | 1,756,825 | Includes English translations; 10 profiles (6-15) treebanked |
These treebanks are in the jacy/tsdb/gold directory. They may lag behind the most recent version of the grammar.
If you want silver data, parsing the rest of the Tanaka Corpus is a good place to start.
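To check which gold profiles a checkout actually contains, and how many items each holds, something like the following sketch may help. It assumes the usual [incr tsdb()] layout, in which each profile is a directory with a `relations` schema file and an `item` data file (possibly gzipped as `item.gz`) with one item per line. The function names `open_tsdb_file` and `count_items` are made up for this illustration and are not part of Jacy or any DELPH-IN tool.

```python
import gzip
from pathlib import Path


def open_tsdb_file(path):
    """Open a profile data file stored either plain or gzipped; None if absent."""
    gz_path = path.with_name(path.name + ".gz")
    if path.is_file():
        return open(path, encoding="utf-8")
    if gz_path.is_file():
        return gzip.open(gz_path, "rt", encoding="utf-8")
    return None


def count_items(gold_dir):
    """Map each profile name under gold_dir to the number of lines in its item file."""
    counts = {}
    for profile in sorted(Path(gold_dir).iterdir()):
        # Assumption: a directory counts as a profile if it has a `relations` file.
        if not (profile / "relations").is_file():
            continue
        handle = open_tsdb_file(profile / "item")
        if handle is None:
            counts[profile.name] = 0
            continue
        with handle:
            # One non-empty line per item.
            counts[profile.name] = sum(1 for line in handle if line.strip())
    return counts


if __name__ == "__main__":
    gold = Path("jacy/tsdb/gold")  # path as given above, relative to the working directory
    if gold.is_dir():
        for name, n in count_items(gold).items():
            print(f"{name}\t{n}")
```

Comparing these counts against the sentence totals in the table gives a rough idea of how much of each corpus has been treebanked so far.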
Last update: 2020-07-14 by FrancisBond