Tanaka Corpus Development Data (rtc000, rtc001, rtc002)
Changes | Jaen Rev. | Date | Parse Coverage | Transfer Coverage | Generation Coverage | End-to-End Coverage | NEVA | Oracle | F1 |
settings as below, minus bad surface rules (aggressive version) | 9388 | 2011/08/11 | 3726/4500 (82.8%) | 2267/3726 (60.8%) | 1247/2267 (55.0%) | 1247/4500 (27.7%) | 19.0 | 25.0 | 22.5 |
sg_relratio = 0.2, mwe_relratio = 1, sg_thresh = 0.15, mwe_thresh = 0.25 | 9388 | 2011/08/11 | 3726/4500 (82.8%) | 2281/3726 (61.2%) | 1252/2281 (54.9%) | 1252/4500 (27.8%) | 19.0 | 25.0 | 22.6 |
Passive | 9388 | 2011/06/01 | 3726/4500 (82.8%) | 2259/3726 (60.6%) | 1240/2259 (54.9%) | 1240/4500 (27.6%) | 19.0 | 25.0 | 22.5 |
Relation prob > 0.1 | 9336 | 2011/05/31 | 3725/4499 (82.8%) | 2295/3725 (61.6%) | 1276/2295 (55.6%) | 1276/4499 (28.4%) | 18.7 | 24.3 | 22.5 |
MT version (more debugging + zero pronouns) | 9343 | 2011/05/24 | 3726/4500 (82.8%) | 2262/3726 (60.7%) | 1246/2262 (55.1%) | 1246/4500 (27.7%) | 18.3 | 24.0 | 22.1 |
debugged alternation bug | 9336 | 2011/05/20 | 3726/4500 (82.8%) | 2079/3726 (55.8%) | 1121/2079 (53.9%) | 1121/4500 (24.9%) | 18.0 | 23.7 | 20.9 |
+ unknown words / lower threshold on omtr 0.4>0.2 | 9336 | 2011/05/18 | 3726/4500 (82.8%) | 1984/3726 (53.2%) | 1086/1984 (54.7%) | 1086/4500 (24.1%) | 18.3 | 24.0 | 20.8 |
+ phrase table single rules (all MWEs) | 9336 | 2011/05/16 | 3726/4500 (82.8%) | 1933/3726 (51.9%) | 1055/1933 (54.6%) | 1055/4500 (23.4%) | 18.3 | 23.7 | 20.6 |
+ phrase table single rules (some missing MWEs) | 9319 | 2011/05/12 | 3726/4500 (82.8%) | 1894/3726 (50.8%) | 1037/1894 (54.8%) | 1037/4500 (23.0%) | 18.0 | 24.0 | 20.2 |
- phrase table single rules (some missing MWEs) | 9319 | 2011/05/12 | 3725/4499 (82.8%) | 1735/3725 (46.6%) | 960/1735 (55.3%) | 960/4499 (21.3%) | 18.7 | 24.0 | 19.9 |
Selected rules for each profile | 9319 | 2011/04/29 | 3726/4500 (82.8%) | 1772/3726 (47.6%) | 976/1772 (55.1%) | 976/4500 (21.7%) | 18.3 | 24.0 | 19.9 |
Selected transfer rules | 9308 | 2011/04/28 | 3726/4500 (82.8%) | 1772/3726 (47.6%) | 973/1772 (54.9%) | 973/4500 (21.6%) | 18.3 | 24.0 | 19.8 |
All anymalign, small transfer model | 9296 | 2011/04/21 | 3726/4500 (82.8%) | 1788/3726 (48.0%) | 983/1788 (55.0%) | 983/4500 (21.8%) | 18.3 | 23.3 | 19.9 |
New prepositions, - pcorp omtrs, + half anymalign | | 2011/04/20 | 3726/4500 (82.8%) | 1779/3726 (47.7%) | 955/1779 (53.7%) | 955/4500 (21.2%) | 18.3 | 23.7 | 19.7 |
Updated Edict, + MWEs | 9140 | 2011/04/06 | 3725/4500 (82.8%) | 1644/3725 (44.1%) | 906/1644 (55.1%) | 906/4500 (20.1%) | 18.3 | 24.0 | 19.2 |
Updated Edict, - MWEs | 9140 | 2011/04/04 | 3726/4500 (82.8%) | 1600/3726 (42.9%) | 874/1600 (54.6%) | 874/4500 (19.4%) | 18.0 | 23.7 | 18.7 |
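The coverage columns chain together: each stage's denominator is the previous stage's numerator, and end-to-end coverage is the generated count over all input items. A minimal Python sketch of that arithmetic, using the counts from the first row of the table above (2011/08/11). The function name `coverage_report` is illustrative and not part of the jaen tooling.

```python
def coverage_report(total, parsed, transferred, generated):
    """Return the four coverage percentages for one profile run.

    Each stage is measured against the output of the previous stage;
    end-to-end coverage is measured against all input items.
    """
    def pct(num, den):
        return round(100.0 * num / den, 1)

    return {
        "parse": pct(parsed, total),
        "transfer": pct(transferred, parsed),
        "generation": pct(generated, transferred),
        "end_to_end": pct(generated, total),
    }


# Counts taken from the 2011/08/11 "aggressive version" row.
report = coverage_report(total=4500, parsed=3726, transferred=2267, generated=1247)
print(report)
```

Because the stages multiply, a gain in transfer coverage only shows up end to end in proportion to generation coverage, which is why the end-to-end column moves more slowly than the transfer column.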
Tanaka Corpus Test Data (rtc003, rtc004, rtc005)
Changes | Date | Parse Coverage | Transfer Coverage | Generation Coverage | End-to-End Coverage | NEVA | Oracle | F1 |
ACL v. baseline | 2011/06/09 | 3614/4500 (80.3%) | 2117/3614 (58.6%) | 1163/2117 (54.9%) | 1163/4500 (25.8%) | 17.7 | 24.0 | 21.0 |
MT v. + complex MWE rules (relratio = 1, threshold = 0.35) | 2011/06/07 | 3613/4499 (80.3%) | 2151/3613 (59.5%) | 1172/2151 (54.5%) | 1172/4499 (26.1%) | 17.7 | 23.7 | 21.1 |
MT v. + complex MWE rules (relratio = 0.8, threshold = 0.25) | 2011/06/07 | 3614/4500 (80.3%) | 2154/3614 (59.6%) | 1169/2154 (54.3%) | 1169/4500 (26.0%) | 18.0 | 24.0 | 21.3 |
MT v. + complex MWE rules (relratio = 0.2, threshold = 0.15) | 2011/06/04 | 3614/4500 (80.3%) | 2157/3614 (59.7%) | 1158/2157 (53.7%) | 1158/4500 (25.7%) | 17.7 | 23.7 | 21.0 |
MT version (more debugging + zero pronouns) | 2011/05/24 | 3612/4498 (80.3%) | 2172/3612 (60.1%) | 1175/2172 (54.1%) | 1175/4498 (26.1%) | 17.7 | 24.0 | 21.1 |
debugged alternation bug | 2011/05/20 | 3613/4499 (80.3%) | 1983/3613 (54.9%) | 1053/1983 (53.1%) | 1053/4499 (23.4%) | 17.7 | 24.0 | 20.1 |
+ unknown words / lower threshold on omtr 0.4>0.2 | 2011/05/18 | 3614/4500 (80.3%) | 1861/3614 (51.5%) | 1001/1861 (53.8%) | 1001/4500 (22.2%) | 18.0 | 24.3 | 19.9 |
+ phrase table single rules (all MWEs) | 2011/05/16 | 3614/4500 (80.3%) | 1854/3614 (51.3%) | 986/1854 (53.2%) | 986/4500 (21.9%) | 17.7 | 23.7 | 19.6 |
+ n and adj MWEs from Moses and Anymalign | 2011/05/08 | 3614/4500 (80.3%) | 1704/3614 (47.1%) | 900/1704 (52.8%) | 900/4500 (20.0%) | 18.0 | 24.0 | 18.9 |
+ PP MWEs from Moses and Anymalign | 2011/05/08 | 3614/4500 (80.3%) | 1659/3614 (45.9%) | 877/1659 (52.9%) | 877/4500 (19.5%) | 18.0 | 24.0 | 18.7 |
+ MWEs (All MWEs from Moses and Anymalign) | 2011/05/06 | 3614/4500 (80.3%) | 1729/3614 (47.8%) | 906/1729 (52.4%) | 906/4500 (20.1%) | 18.0 | 24.3 | 19.0 |
+ Verb MWEs from Moses and Anymalign | 2011/05/06 | 3612/4498 (80.3%) | 1688/3612 (46.7%) | 885/1688 (52.4%) | 885/4498 (19.7%) | 17.7 | 23.3 | 18.6 |
- MWEs (Baseline for MWE paper) | 2011/05/04 | 3613/4499 (80.3%) | 1647/3613 (45.6%) | 870/1647 (52.8%) | 870/4499 (19.3%) | 17.7 | 23.7 | 18.5 |
Corrected lexicon, + MWEs | 2011/04/06 | 3614/4500 (80.3%) | 1576/3614 (43.6%) | 852/1576 (54.1%) | 852/4500 (18.9%) | 18.3 | 23.7 | 18.6 |
Corrected lexicon, - MWEs | 2011/03/31 | 3614/4500 (80.3%) | 1505/3614 (41.6%) | 824/1505 (54.8%) | 824/4500 (18.3%) | 17.7 | 23.3 | 18.0 |
New Edict (with particle bug) + Auto MWEs | 2011/03/29 | 3614/4500 (80.3%) | 1526/3614 (42.2%) | 830/1526 (54.4%) | 830/4500 (18.4%) | 18.0 | 23.7 | 18.2 |
Auto MWEs | 2011/03/28 | 3612/4498 (80.3%) | 1507/3612 (41.7%) | 816/1507 (54.1%) | 816/4498 (18.1%) | 18.3 | 23.7 | 18.2 |
New proper and relative nouns | 2011/03/20 | 3614/4500 (80.3%) | 1503/3614 (41.6%) | 815/1503 (54.2%) | 815/4500 (18.1%) | 17.7 | 24.0 | 17.9 |
Old cheap + tc + end2end | 2011/03/14 | 3614/4500 (80.3%) | 1487/3614 (41.1%) | 785/1487 (52.8%) | 785/4500 (17.4%) | 17.3 | 23.3 | 17.4 |
Old cheap + rtc000 + end2end | 2011/03/14 | 3614/4500 (80.3%) | 1487/3614 (41.1%) | 785/1487 (52.8%) | 785/4500 (17.4%) | 16.0 | 22.3 | 16.7 |
New cheap + end to end model | 2011/03/10 | 3612/4498 (80.3%) | 1252/3612 (34.7%) | 664/1252 (53.0%) | 664/4498 (14.8%) | 18.0 | 24.3 | 16.2 |
New auto rules with parallel corpus rules | 2011/03/08 | 3614/4500 (80.3%) | 1487/3614 (41.1%) | 785/1487 (52.8%) | 785/4500 (17.4%) | 15.7 | 22.0 | 16.5 |
New auto rules (no parallel corpus rules) | 2011/03/07 | 3614/4500 (80.3%) | 1374/3614 (38.0%) | 739/1374 (53.8%) | 739/4500 (16.4%) | 15.3 | 21.7 | 15.9 |
New ERG | 2011/03/07 | 3613/4499 (80.3%) | 1374/3613 (38.0%) | 730/1374 (53.1%) | 730/4499 (16.2%) | 15.7 | 22.0 | 15.9 |
Wikipedia rules (first batch) | 2011/02/17 | 3614/4500 (80.3%) | 1465/3614 (40.5%) | 775/1465 (52.9%) | 775/4500 (17.2%) | 16.3 | 23.0 | 16.8 |
Debugging | 2011/02/17 | 3613/4499 (80.3%) | 1463/3613 (40.5%) | 776/1463 (53.0%) | 776/4499 (17.2%) | 16.3 | 23.0 | 16.8 |
Modifications | 2011/01/27 | 3613/4499 (80.3%) | 1422/3613 (39.4%) | 754/1422 (53.0%) | 754/4499 (16.8%) | 16.3 | 22.7 | 16.5 |
Multi-word expressions | 2011/01/06 | 3614/4500 (80.3%) | 1506/3614 (41.7%) | 787/1506 (52.3%) | 787/4500 (17.5%) | 16.0 | 22.3 | 16.7 |
Parallel corpus rules (new batch) | 2010/12/28 | 3613/4499 (80.3%) | 1517/3613 (42.0%) | 793/1517 (52.3%) | 793/4499 (17.6%) | 16.3 | 22.7 | 17.0 |
Parallel corpus rules (fixed possessives) | 2010/12/15 | 3614/4500 (80.3%) | 1499/3614 (41.5%) | 789/1499 (52.6%) | 789/4500 (17.5%) | 16.3 | 22.3 | 16.9 |
New Auto Transfer (fixed possessives) | 2010/12/14 | 3614/4500 (80.3%) | 1375/3614 (38.0%) | 739/1375 (53.7%) | 739/4500 (16.4%) | 15.3 | 21.7 | 15.9 |
Parallel corpus rules (wnjpn) | 2010/12/06 | 3614/4500 (80.3%) | 1518/3614 (42.0%) | 719/1518 (47.4%) | 719/4500 (16.0%) | 0.0 | 0.0 | 0.0 |
New Auto Transfer | 2010/12/02 | 3614/4500 (80.3%) | 1370/3614 (37.9%) | 670/1370 (48.9%) | 670/4500 (14.9%) | 0.0 | 0.0 | 0.0 |
TC Gen Model | 2010/10/26 | 3613/4499 (80.3%) | 1383/3613 (38.3%) | 732/1383 (52.9%) | 732/4499 (16.3%) | 15.7 | 22.0 | 16.0 |
2010/09/23 | 3614/4500 (80.3%) | 1384/3614 (38.3%) | 731/1384 (52.8%) | 731/4500 (16.2%) | 14.3 | 20.7 | 15.2 |
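The page does not define the F1 column. The reported values are consistent with the harmonic mean of end-to-end coverage and NEVA, which can be checked against a few rows of the table above. Treat this as an inference from the numbers, not a documented definition; the function name `f1` is chosen here for illustration.

```python
def f1(coverage, neva):
    """Harmonic mean of end-to-end coverage (%) and NEVA (assumed definition)."""
    if coverage + neva == 0:
        return 0.0
    return 2 * coverage * neva / (coverage + neva)


rows = [
    # (end-to-end coverage %, NEVA, reported F1)
    (25.8, 17.7, 21.0),  # ACL v. baseline, 2011/06/09
    (26.1, 17.7, 21.1),  # MT v. + complex MWE rules (relratio = 1, threshold = 0.35)
    (16.2, 14.3, 15.2),  # 2010/09/23
]
for coverage, neva, reported in rows:
    print(f"{f1(coverage, neva):.1f} (reported {reported})")
```

Under this reading, the two rows with NEVA 0.0 (2010/12/02 and 2010/12/06) also yield the reported F1 of 0.0.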
Tanaka Corpus Development Data
Changes | Date | HG Rev | PPID | Parse Coverage | Transfer Coverage | Generation Coverage | End-to-End Coverage | NEVA | Oracle | F1 |
:wait=7200, :quantum=1200 | 2009/09/22 | 604@27 | 21502 | 3722 / 4499 (82.73%) | 1496 / 3722 (40.19%) | 809 / 1496 (54.08%) | 809 / 4499 (17.98%) | 14.90 | 20.70 | 16.30 |
:wait=3600, :quantum=600 | 2009/09/22 | 604@26 | 29895 | 3722 / 4499 (82.73%) | 1496 / 3722 (40.19%) | 810 / 1496 (54.14%) | 810 / 4499 (18.00%) | 14.94 | 20.74 | 16.33 |
:wait=3600, :quantum=600 | 2009/09/22 | 604@26 | 15939 | 3723 / 4500 (82.73%) | 1497 / 3723 (40.21%) | 811 / 1497 (54.18%) | 811 / 4500 (18.02%) | 14.94 | 20.74 | 16.34 |
T20000 | 2009/09/21 | 604@24 | 2265 | 3722 / 4499 (82.73%) | 1496 / 3722 (40.19%) | 811 / 1496 (54.21%) | 811 / 4499 (18.03%) | 14.94 | 20.74 | 16.34 |
T15000 | 2009/09/21 | 604@23 | 20118 | 3721 / 4499 (82.71%) | 1495 / 3721 (40.18%) | 811 / 1495 (54.25%) | 811 / 4499 (18.03%) | 14.94 | 20.74 | 16.34 |
T10000 | 2009/09/21 | 604@22 | 29341 | 3723 / 4500 (82.73%) | 1497 / 3723 (40.21%) | 810 / 1497 (54.11%) | 810 / 4500 (18.00%) | 14.94 | 20.74 | 16.33 |
+MRS_MODEL | 2009/09/21 | 604@21 | 30836 | 3477 / 4200 (82.79%) | 1407 / 3477 (40.47%) | 765 / 1407 (54.37%) | 765 / 4200 (18.21%) | 14.94 | 20.72 | 16.41 |
-MRS_MODEL | 2009/09/20 | 604@20 | 21485 | 3722 / 4499 (82.73%) | 1496 / 3722 (40.19%) | 787 / 1496 (52.61%) | 787 / 4499 (17.49%) | 14.92 | 20.41 | 16.11 |
MAX3, FB_CLEAN | 2009/09/20 | 604@19 | 10091 | 3723 / 4500 (82.73%) | 1497 / 3723 (40.21%) | 810 / 1497 (54.11%) | 810 / 4500 (18.00%) | 14.94 | 20.74 | 16.33 |
+CHANGE | 2009/09/18 | 585@16 | 19596 | 3722 / 4499 (82.73%) | 1488 / 3722 (39.98%) | 795 / 1488 (53.43%) | 795 / 4499 (17.67%) | 14.78 | 20.65 | 16.10 |
+CHANGE | 2009/09/18 | 585@14 | 29791 | 3720 / 4500 (82.67%) | 1310 / 3720 (35.22%) | 705 / 1310 (53.82%) | 705 / 4500 (15.67%) | 15.50 | 21.36 | 15.58 |
+CHANGE | 2009/09/18 | 565@17 | 22899 | 3723 / 4500 (82.73%) | 1324 / 3723 (35.56%) | 709 / 1324 (53.55%) | 709 / 4500 (15.76%) | 15.38 | 21.11 | 15.56 |
+CHANGE | 2009/09/17 | 585@9 | 10748 | 3722 / 4499 (82.73%) | 1491 / 3722 (40.06%) | 797 / 1491 (53.45%) | 797 / 4499 (17.72%) | 15.05 | 20.75 | 16.28 |
+CHANGE | 2009/09/17 | 585@8 | 19904 | 3722 / 4499 (82.73%) | 1490 / 3722 (40.03%) | 795 / 1490 (53.36%) | 795 / 4499 (17.67%) | 14.74 | 20.62 | 16.07 |
+CHANGE | 2009/09/17 | 585@7 | 5196 | 3722 / 4499 (82.73%) | 1496 / 3722 (40.19%) | 811 / 1496 (54.21%) | 811 / 4499 (18.03%) | 14.94 | 20.74 | 16.34 |
+CHANGE | 2009/09/17 | 585@13 | 22095 | 3721 / 4498 (82.73%) | 1477 / 3721 (39.69%) | 798 / 1477 (54.03%) | 798 / 4498 (17.74%) | 14.99 | 20.60 | 16.25 |
+CHANGE | 2009/09/17 | 585@12 | 11178 | 3719 / 4498 (82.68%) | 1471 / 3719 (39.55%) | 797 / 1471 (54.18%) | 797 / 4498 (17.72%) | 14.68 | 20.40 | 16.06 |
+CHANGE | 2009/09/16 | 585@6 | 16496 | 3723 / 4500 (82.73%) | 1514 / 3723 (40.67%) | 817 / 1514 (53.96%) | 817 / 4500 (18.16%) | 14.68 | 20.59 | 16.23 |
HG585 | 2009/09/15 | 585 | 8772 | 3715 / 4492 (82.70%) | 1450 / 3715 (39.03%) | 781 / 1450 (53.86%) | 781 / 4492 (17.39%) | 14.76 | 20.60 | 15.96 |
+MAX5, +T10000, -LONG_WAIT, +MOSES_2.05, -FEEDBACK | 2009/09/15 | 3687 / 4457 (82.72%) | 1436 / 3687 (38.95%) | 776 / 1436 (54.04%) | 776 / 4457 (17.41%) | 14.76 | 20.66 | 15.98 | ||
+MAX3, +T10000, -LONG_WAIT, +MOSES_2.05, +FEEDBACK | 2009/09/14 | 3722 / 4499 (82.73%) | 1496 / 3722 (40.19%) | 809 / 1496 (54.08%) | 809 / 4499 (17.98%) | 14.94 | 20.73 | 16.32 | ||
+MAX3, +T5000, -LONG_WAIT, +MOSES_2.05, +FEEDBACK | 2009/09/14 | 3723 / 4500 (82.73%) | 1497 / 3723 (40.21%) | 811 / 1497 (54.18%) | 811 / 4500 (18.02%) | 14.94 | 20.74 | 16.34 | ||
+FEEDBACK, +MAX10, +T10000, -LONG_WAIT, +MOSES_2.05 | 2009/09/14 | 3721 / 4498 (82.73%) | 1470 / 3721 (39.51%) | 795 / 1470 (54.08%) | 795 / 4498 (17.67%) | 15.02 | 20.64 | 16.24 | ||
+MAX10, +T10000, -LONG_WAIT, +MOSES_2.05 | 2009/09/11 | 3721 / 4498 (82.73%) | 1457 / 3721 (39.16%) | 792 / 1457 (54.36%) | 792 / 4498 (17.61%) | 14.69 | 20.52 | 16.01 | ||
+FEEDBACK_SORTING | 2009/09/03 | 3721 / 4498 (82.73%) | 1371 / 3721 (36.84%) | 741 / 1371 (54.05%) | 741 / 4498 (16.47%) | 14.60 | 20.39 | 15.48 | ||
+BURN_IN | 2009/08/31 | 3719 / 4495 (82.74%) | 1508 / 3719 (40.55%) | 814 / 1508 (53.98%) | 814 / 4495 (18.11%) | 14.74 | 20.61 | 16.25 | ||
+FEEDBACK, +MAX3, +T5000, -LONG_WAIT, +MOSES_2.05 | 2009/08/29 | 3721 / 4497 (82.74%) | 1518 / 3721 (40.80%) | 826 / 1518 (54.41%) | 826 / 4497 (18.37%) | 14.75 | 20.32 | 16.36 | ||
+FEEDBACK, +MAX10, +T10000, +LONG_WAIT, +MOSES_2.05 | 2009/08/27 | 3654 / 4416 (82.74%) | 1461 / 3654 (39.98%) | 793 / 1461 (54.28%) | 793 / 4416 (17.96%) | 14.72 | 20.44 | 16.18 | ||
+MAX10, +T10000, +LONG_WAIT, +MOSES_2.05 | 2009/08/26-2 | 3722 / 4499 (82.73%) | 1445 / 3722 (38.82%) | 788 / 1445 (54.53%) | 788 / 4499 (17.52%) | 14.71 | 20.55 | 15.99 | ||
+FEEDBACK, +MAX10, +T5000, +LONG_WAIT, +MOSES_2.05 | 2009/08/26 | 3527 / 4256 (82.87%) | 1414 / 3527 (40.09%) | 769 / 1414 (54.38%) | 769 / 4256 (18.07%) | 14.64 | 20.34 | 16.17 | ||
+JACY_BARCELONA | 2009/08/24 | 3594 / 4349 (82.64%) | 1460 / 3594 (40.62%) | 790 / 1460 (54.11%) | 790 / 4349 (18.17%) | 14.55 | 20.35 | 16.16 | ||
-BLACKLIST | 2009/08/23-2 | 3722 / 4499 (82.73%) | 1477 / 3722 (39.68%) | 794 / 1477 (53.76%) | 794 / 4499 (17.65%) | 14.86 | 20.74 | 16.13 | ||
+T5000, -LONG_WAIT | 2009/08/23 | 3716 / 4492 (82.72%) | 1348 / 3716 (36.28%) | 744 / 1348 (55.19%) | 744 / 4492 (16.56%) | 14.83 | 21.07 | 15.65 | ||
+MAX3 | 2009/08/19 | 3722 / 4499 (82.73%) | 1364 / 3722 (36.65%) | 765 / 1364 (56.09%) | 765 / 4499 (17.00%) | 15.11 | 21.21 | 16.00 | ||
+LONG_WAIT | 2009/08/18-3 | 3722 / 4499 (82.73%) | 1345 / 3722 (36.14%) | 748 / 1345 (55.61%) | 748 / 4499 (16.63%) | 15.11 | 21.06 | 15.83 | ||
+MOSES_2.5, +MAX5, -LONG_WAIT | 2009/08/18-2 | 3723 / 4500 (82.73%) | 1356 / 3723 (36.42%) | 752 / 1356 (55.46%) | 752 / 4500 (16.71%) | 15.05 | 21.19 | 15.84 | ||
-MOSES, +BLACKLIST | 2009/08/18 | 3721 / 4498 (82.73%) | 1354 / 3721 (36.39%) | 756 / 1354 (55.83%) | 756 / 4498 (16.81%) | 14.99 | 21.35 | 15.85 | ||
+LEMMAS | 2009/08/17 | 3720 / 4497 (82.72%) | 1441 / 3720 (38.74%) | 808 / 1441 (56.07%) | 808 / 4497 (17.97%) | 14.92 | 21.22 | 16.30 | ||
+T5000, +MOSES_2.25 | 2009/08/16 | 3721 / 4498 (82.73%) | 1438 / 3721 (38.65%) | 806 / 1438 (56.05%) | 806 / 4498 (17.92%) | 15.03 | 21.16 | 16.35 | ||
+MAX3 | 2009/08/12 | 3721 / 4498 (82.73%) | 1543 / 3721 (41.47%) | 816 / 1543 (52.88%) | 816 / 4498 (18.14%) | 13.88 | 19.33 | 15.72 | ||
-MRS_MODEL | 2009/08/09 | 3493 / 4270 (81.80%) | 1536 / 3493 (43.97%) | 826 / 1536 (53.78%) | 826 / 4270 (19.34%) | 15.10 | 20.77 | 16.96 | ||
+T10000, +COMBINED_MTRS, +LONG_WAIT | 2009/08/03 | 3721 / 4497 (82.74%) | 1172 / 3721 (31.50%) | 601 / 1172 (51.28%) | 601 / 4497 (13.36%) | 14.55 | 20.97 | 13.93 | ||
+BARCELONA_TEST3 | 2009/07/13 | 3723 / 4500 (82.73%) | 1735 / 3723 (46.60%) | 943 / 1735 (54.35%) | 943 / 4500 (20.96%) | 14.81 | 20.53 | 17.35 | ||
+BARCELONA_TEST2 | 2009/07/07 | 3722 / 4499 (82.73%) | 1734 / 3722 (46.59%) | 942 / 1734 (54.33%) | 942 / 4499 (20.94%) | 14.85 | 20.63 | 17.38 | ||
+BARCELONA_TEST | 2009/07/03 | 3723 / 4500 (82.73%) | 1733 / 3723 (46.55%) | 939 / 1733 (54.18%) | 939 / 4500 (20.87%) | 14.81 | 20.71 | 17.32 | ||
+TC0906, -FEEDBACK2 | 2009/06/06 | 3721 / 4500 (82.69%) | 1705 / 3721 (45.82%) | 939 / 1705 (55.07%) | 939 / 4500 (20.87%) | 14.90 | 20.61 | 17.38 | ||
+FEEDBACK2 | 2009/06/04 | 3730 / 4500 (82.89%) | 1698 / 3730 (45.52%) | 927 / 1698 (54.59%) | 927 / 4500 (20.60%) | 14.77 | 20.53 | 17.21 | ||
+JACY_EXP, -FEEDBACK | 2009/06/03 | 3730 / 4500 (82.89%) | 1693 / 3730 (45.39%) | 926 / 1693 (54.70%) | 926 / 4500 (20.58%) | 14.86 | 20.63 | 17.26 | ||
+FEEDBACK | 2009/06/02 | 3717 / 4500 (82.60%) | 1696 / 3717 (45.63%) | 918 / 1696 (54.13%) | 918 / 4500 (20.40%) | 14.79 | 20.08 | 17.15 | ||
+PN_FIX, +JACY_SVN | 2009/05/28 | 3717 / 4500 (82.60%) | 1716 / 3717 (46.17%) | 941 / 1716 (54.84%) | 941 / 4500 (20.91%) | 14.97 | 20.64 | 17.45 | ||
+LIKE | 2009/05/21 | 3721 / 4500 (82.69%) | 1719 / 3721 (46.20%) | 944 / 1719 (54.92%) | 944 / 4500 (20.98%) | 14.97 | 20.64 | 17.47 | ||
+ZERO_FIX | 2009/05/19 | 3721 / 4500 (82.69%) | 1638 / 3721 (44.02%) | 936 / 1638 (57.14%) | 936 / 4500 (20.80%) | 14.97 | 19.97 | 17.41 | ||
-TERG, -TERGDICT, +0902, +0902DICT | 2009/05/18 | 3721 / 4500 (82.69%) | 1631 / 3721 (43.83%) | 721 / 1631 (44.21%) | 721 / 4500 (16.02%) | 16.56 | 21.56 | 16.29 | ||
+TERG, +TERGDICT | 2009/05/16 | 3721 / 4500 (82.69%) | 1565 / 3721 (42.06%) | 624 / 1565 (39.87%) | 624 / 4500 (13.87%) | 15.62 | 21.62 | 14.69 | ||
+SVN | 2009/05/14 | 3721 / 4500 (82.69%) | 1600 / 3721 (43.00%) | 723 / 1600 (45.19%) | 723 / 4500 (16.07%) | 16.26 | 22.63 | 16.16 | ||
+RELATIONAL_N2, +TAME, +NAKEREBA, +MOSES, +CVS_HEAD | 2008/11/11 | 3599 / 4500 (79.98%) | 1711 / 3599 (47.54%) | 937 / 1711 (54.76%) | 937 / 4500 (20.82%) | 14.67 | 20.67 | 17.21 | ||
+T5000, +BOOT | 2008/11/08 | 3541 / 4409 (80.31%) | 1667 / 3541 (47.08%) | 903 / 1667 (54.17%) | 903 / 4409 (20.48%) | 14.32 | 23.00 | 16.86 | ||
+UNKNOWN | 2008/11/04 | 3606 / 4499 (80.15%) | 1492 / 3606 (41.38%) | 852 / 1492 (57.10%) | 852 / 4499 (18.94%) | 14.33 | 23.64 | 16.31 | ||
+LMC, +SEMI, -GIZA | 2008/11/02 | 3607 / 4500 (80.16%) | 1351 / 3607 (37.45%) | 801 / 1351 (59.29%) | 801 / 4500 (17.80%) | 14.33 | 23.65 | 15.88 | ||
-LMD | 2008/11/01 | 3606 / 4499 (80.15%) | 988 / 3606 (27.40%) | 681 / 988 (68.93%) | 681 / 4499 (15.14%) | 13.02 | 24.64 | 14.00 | ||
-LMC, +GIZA, +LMD | 2008/10/31 | 3607 / 4500 (80.16%) | 989 / 3607 (27.42%) | 682 / 989 (68.96%) | 682 / 4500 (15.16%) | 14.37 | 24.64 | 14.75 | ||
-T10000, +CVS, +LMC | 2008/10/30 | 3606 / 4499 (80.15%) | 988 / 3606 (27.40%) | 682 / 988 (69.03%) | 682 / 4499 (15.16%) | 14.69 | 24.64 | 14.92 | ||
+T10000 | 2008/10/26 | 3605 / 4499 (80.13%) | 1095 / 3605 (30.37%) | 724 / 1095 (66.12%) | 724 / 4499 (16.09%) | 13.67 | 24.29 | 14.78 | ||
+NO_AMBIGUOUS_V3, +RELATIONAL_N | 2008/10/24 | 3607 / 4500 (80.16%) | 1010 / 3607 (28.00%) | 692 / 1010 (68.51%) | 692 / 4500 (15.38%) | 13.66 | 24.64 | 14.47 | ||
+GEN2, +GEDICT2, +NO_AMBIGUOUS_V2 | 2008/10/21 | 3625 / 4500 (80.56%) | 872 / 3625 (24.06%) | 605 / 872 (69.38%) | 605 / 4500 (13.44%) | 13.67 | 23.63 | 13.56 | ||
+NEW_TANAKA | 2008/10/20 | 3484 / 4500 (77.42%) | 882 / 3484 (25.32%) | 618 / 882 (70.07%) | 618 / 4500 (13.73%) | 13.66 | 23.64 | 13.70 | ||
-UNKNOWN, +PN, +NO_AMBIGUOUS_V | 2008/10/13 | 3550 / 4500 (78.89%) | 911 / 3550 (25.66%) | 636 / 911 (69.81%) | 636 / 4500 (14.13%) | 13.67 | 23.63 | 13.90 | ||
+UNKNOWN, -IN_DOMAIN | 2008/10/11 | 3564 / 4499 (79.22%) | 1288 / 3564 (36.14%) | 822 / 1288 (63.82%) | 822 / 4499 (18.27%) | 12.36 | 21.66 | 14.75 | ||
+DISCOURSE, CORRECT_GRAMMAR | 2008/09/28 | 3566 / 4500 (79.24%) | 972 / 3566 (27.26%) | 651 / 972 (66.98%) | 651 / 4500 (14.47%) | 12.67 | 21.97 | 13.51 | ||
+DISCOURSE | 2008/09/25 | 3567 / 4500 (79.27%) | 967 / 3567 (27.11%) | 647 / 967 (66.91%) | 647 / 4500 (14.38%) | 12.67 | 22.32 | 13.47 | ||
+IN_DOMAIN | 2008/07/22 | 3551 / 4500 (78.91%) | 861 / 3551 (24.25%) | 617 / 861 (71.66%) | 617 / 4500 (13.71%) | 13.65 | 23.28 | 13.68 | ||
+WA | 2008/06/26 | 3551 / 4500 (78.91%) | 866 / 3551 (24.39%) | 625 / 866 (72.17%) | 625 / 4500 (13.89%) | 13.65 | 22.94 | 13.77 | ||
+VN3 | 2008/06/19 | 3543 / 4500 (78.73%) | 861 / 3543 (24.30%) | 610 / 861 (70.85%) | 610 / 4500 (13.56%) | 13.00 | 22.97 | 13.27 | ||
-STRICT_N, -STRICT_V, -VN, +VN2 | 2008/06/18 | 3543 / 4500 (78.73%) | 857 / 3543 (24.19%) | 598 / 857 (69.78%) | 598 / 4500 (13.29%) | 12.69 | 22.66 | 12.98 | ||
+PET, +PMODEL | 2008/06/17 | 3543 / 4500 (78.73%) | 882 / 3543 (24.89%) | 612 / 882 (69.39%) | 612 / 4500 (13.60%) | 12.69 | 22.66 | 13.13 | ||
-NO_SPURIOUS | 2008/06/16 | 3013 / 4499 (66.97%) | 730 / 3013 (24.23%) | 528 / 730 (72.33%) | 528 / 4499 (11.74%) | 12.68 | 22.32 | 12.19 | ||
-CONJ | 2008/06/16 | 3068 / 4500 (68.18%) | 742 / 3068 (24.19%) | 510 / 742 (68.73%) | 510 / 4500 (11.33%) | 14.04 | 22.64 | 12.54 | ||
+NO_SPURIOUS, +CONJ | 2008/06/15 | 3068 / 4500 (68.18%) | 777 / 3068 (25.33%) | 528 / 777 (67.95%) | 528 / 4500 (11.73%) | 13.36 | 22.32 | 12.49 | ||
+GEDICT | 2008/06/15 | 3014 / 4500 (66.98%) | 722 / 3014 (23.95%) | 520 / 722 (72.02%) | 520 / 4500 (11.56%) | 13.37 | 22.31 | 12.40 | ||
+VN, +STRICT_N, +STRICT_V | 2008/06/14 | 3014 / 4500 (66.98%) | 703 / 3014 (23.32%) | 500 / 703 (71.12%) | 500 / 4500 (11.11%) | 13.69 | 22.64 | 12.27 | ||
+GEN | 2008/06/14 | 3014 / 4500 (66.98%) | 691 / 3014 (22.93%) | 491 / 691 (71.06%) | 491 / 4500 (10.91%) | 14.04 | 23.00 | 12.28 | ||
+IF/THEN | 2008/06/09 | 2779 / 4500 (61.76%) | 679 / 2779 (24.43%) | 487 / 679 (71.72%) | 487 / 4500 (10.82%) | 14.70 | 23.35 | 12.47 | ||
+HAND, +SYNC | 2008/06/07 | 2779 / 4500 (61.76%) | 662 / 2779 (23.82%) | 478 / 662 (72.21%) | 478 / 4500 (10.62%) | 14.36 | 23.02 | 12.21 | ||
2008/06/05 | 2779 / 4500 (61.76%) | 549 / 2779 (19.76%) | 383 / 549 (69.76%) | 383 / 4500 (8.51%) | 14.33 | 22.34 | 10.68 |
Tanaka Corpus Test Data
Changes | Date | HG Rev | PPID | Parse Coverage | Transfer Coverage | Generation Coverage | End-to-End Coverage | NEVA | Oracle | F1 |
+CHANGE | 2009/09/22 | 604@27 | 21502 | 3195 / 3994 (79.99%) | 1286 / 3195 (40.25%) | 667 / 1286 (51.87%) | 667 / 3994 (16.70%) | 13.40 | 19.03 | 14.87 |
+CHANGE | 2009/09/22 | 604@26 | 29895 | 3573 / 4450 (80.29%) | 1441 / 3573 (40.33%) | 742 / 1441 (51.49%) | 742 / 4450 (16.67%) | 13.23 | 18.83 | 14.75 |
+CHANGE | 2009/09/22 | 604@26 | 15939 | 3609 / 4499 (80.22%) | 1460 / 3609 (40.45%) | 755 / 1460 (51.71%) | 755 / 4499 (16.78%) | 13.30 | 19.00 | 14.84 |
+CHANGE | 2009/09/21 | 604@24 | 2265 | 3612 / 4499 (80.28%) | 1462 / 3612 (40.48%) | 756 / 1462 (51.71%) | 756 / 4499 (16.80%) | 13.30 | 19.01 | 14.85 |
+CHANGE | 2009/09/21 | 604@23 | 20118 | 3614 / 4500 (80.31%) | 1463 / 3614 (40.48%) | 756 / 1463 (51.67%) | 756 / 4500 (16.80%) | 13.30 | 19.01 | 14.85 |
+CHANGE | 2009/09/21 | 604@22 | 29341 | 3613 / 4499 (80.31%) | 1461 / 3613 (40.44%) | 755 / 1461 (51.68%) | 755 / 4499 (16.78%) | 13.30 | 19.00 | 14.84 |
+CHANGE | 2009/09/20 | 604@20 | 21485 | 3612 / 4498 (80.30%) | 1458 / 3612 (40.37%) | 751 / 1458 (51.51%) | 751 / 4498 (16.70%) | 13.29 | 18.84 | 14.80 |
+CHANGE | 2009/09/20 | 604@19 | 10091 | 3613 / 4500 (80.29%) | 1459 / 3613 (40.38%) | 755 / 1459 (51.75%) | 755 / 4500 (16.78%) | 13.30 | 19.00 | 14.84 |
+CHANGE | 2009/09/18 | 585@16 | 19596 | 3613 / 4499 (80.31%) | 1436 / 3613 (39.75%) | 726 / 1436 (50.56%) | 726 / 4499 (16.14%) | 12.96 | 18.85 | 14.38 |
+CHANGE | 2009/09/18 | 585@14 | 29791 | 3611 / 4498 (80.28%) | 1249 / 3611 (34.59%) | 650 / 1249 (52.04%) | 650 / 4498 (14.45%) | 14.00 | 19.71 | 14.22 |
+CHANGE | 2009/09/18 | 565@17 | 22899 | 3613 / 4499 (80.31%) | 1256 / 3613 (34.76%) | 647 / 1256 (51.51%) | 647 / 4499 (14.38%) | 14.02 | 19.84 | 14.20 |
+CHANGE | 2009/09/17 | 585@9 | 10748 | 3611 / 4499 (80.26%) | 1448 / 3611 (40.10%) | 750 / 1448 (51.80%) | 750 / 4499 (16.67%) | 13.24 | 19.08 | 14.76 |
+CHANGE | 2009/09/17 | 585@8 | 19904 | 3614 / 4500 (80.31%) | 1439 / 3614 (39.82%) | 728 / 1439 (50.59%) | 728 / 4500 (16.18%) | 12.93 | 18.85 | 14.37 |
+CHANGE | 2009/09/17 | 585@7 | 5196 | 3611 / 4498 (80.28%) | 1461 / 3611 (40.46%) | 755 / 1461 (51.68%) | 755 / 4498 (16.79%) | 13.30 | 19.00 | 14.84 |
+CHANGE | 2009/09/17 | 585@13 | 22095 | 3612 / 4498 (80.30%) | 1439 / 3612 (39.84%) | 745 / 1439 (51.77%) | 745 / 4498 (16.56%) | 13.17 | 19.13 | 14.67 |
+CHANGE | 2009/09/17 | 585@12 | 11178 | 3612 / 4498 (80.30%) | 1425 / 3612 (39.45%) | 725 / 1425 (50.88%) | 725 / 4498 (16.12%) | 12.89 | 18.69 | 14.33 |
+CHANGE | 2009/09/16 | 585@6 | 16496 | 3611 / 4497 (80.30%) | 1470 / 3611 (40.71%) | 748 / 1470 (50.88%) | 748 / 4497 (16.63%) | 13.16 | 18.88 | 14.69 |
HG585 | 2009/09/15 | 3435 / 4287 (80.13%) | 1328 / 3435 (38.66%) | 683 / 1328 (51.43%) | 683 / 4287 (15.93%) | 12.88 | 19.00 | 14.24 | ||
HG584 | 2009/09/15 | 3613 / 4499 (80.31%) | 1462 / 3613 (40.46%) | 755 / 1462 (51.64%) | 755 / 4499 (16.78%) | 13.30 | 19.00 | 14.84 | ||
+MAX3, +T10000, -LONG_WAIT, +MOSES_2.05, +FEEDBACK | 2009/09/14 | 3613 / 4499 (80.31%) | 1462 / 3613 (40.46%) | 755 / 1462 (51.64%) | 755 / 4499 (16.78%) | 13.30 | 19.00 | 14.84 | ||
+MAX3, +T5000, -LONG_WAIT, +MOSES_2.05, +FEEDBACK | 2009/09/14 | 3612 / 4498 (80.30%) | 1457 / 3612 (40.34%) | 754 / 1457 (51.75%) | 754 / 4498 (16.76%) | 13.31 | 19.02 | 14.84 | ||
+MAX10, +T10000, -LONG_WAIT, +MOSES_2.05, +FEEDBACK | 2009/09/14 | 3608 / 4495 (80.27%) | 1411 / 3608 (39.11%) | 736 / 1411 (52.16%) | 736 / 4495 (16.37%) | 13.19 | 19.07 | 14.61 | ||
+MAX10, +T10000, -LONG_WAIT, +MOSES_2.05 | 2009/09/11 | 3612 / 4498 (80.30%) | 1415 / 3612 (39.17%) | 721 / 1415 (50.95%) | 721 / 4498 (16.03%) | 12.88 | 18.75 | 14.28 | ||
+FEEDBACK_SORTING | 2009/09/04 | 3613 / 4500 (80.29%) | 1336 / 3613 (36.98%) | 672 / 1336 (50.30%) | 672 / 4500 (14.93%) | 13.53 | 19.40 | 14.19 | ||
+BURN_IN | 2009/08/31 | 3613 / 4499 (80.31%) | 1467 / 3613 (40.60%) | 748 / 1467 (50.99%) | 748 / 4499 (16.63%) | 13.31 | 18.94 | 14.78 | ||
+FEEDBACK, +MAX3, +T5000, -LONG_WAIT, +MOSES_2.05 | 2009/08/29 | 3609 / 4495 (80.29%) | 1476 / 3609 (40.90%) | 750 / 1476 (50.81%) | 750 / 4495 (16.69%) | 13.57 | 18.98 | 14.96 | ||
+FEEDBACK, +MAX10, +T10000, +LONG_WAIT, +MOSES_2.05 | 2009/08/27 | 3612 / 4497 (80.32%) | 1453 / 3612 (40.23%) | 735 / 1453 (50.58%) | 735 / 4497 (16.34%) | 13.24 | 19.06 | 14.63 | ||
+MAX10, +T10000, +LONG_WAIT, +MOSES_2.05 | 2009/08/26-2 | 3607 / 4493 (80.28%) | 1383 / 3607 (38.34%) | 712 / 1383 (51.48%) | 712 / 4493 (15.85%) | 12.89 | 18.79 | 14.22 | ||
+FEEDBACK, +MAX10, +T5000, +LONG_WAIT, +MOSES_2.05 | 2009/08/26 | 3290 / 4088 (80.48%) | 1303 / 3290 (39.60%) | 665 / 1303 (51.04%) | 665 / 4088 (16.27%) | 13.25 | 19.00 | 14.60 | ||
+JACY_BARCELONA | 2009/08/24 | 3613 / 4499 (80.31%) | 1458 / 3613 (40.35%) | 746 / 1458 (51.17%) | 746 / 4499 (16.58%) | 13.39 | 18.94 | 14.81 | ||
-BLACKLIST | 2009/08/23-2 | 3613 / 4499 (80.31%) | 1426 / 3613 (39.47%) | 724 / 1426 (50.77%) | 724 / 4499 (16.09%) | 13.27 | 19.15 | 14.54 | ||
+T5000, -LONG_WAIT | 2009/08/23 | 3614 / 4500 (80.31%) | 1317 / 3614 (36.44%) | 667 / 1317 (50.65%) | 667 / 4500 (14.82%) | 14.00 | 19.52 | 14.40 | ||
+MAX3 | 2009/08/19 | 3373 / 4199 (80.33%) | 1246 / 3373 (36.94%) | 648 / 1246 (52.01%) | 648 / 4199 (15.43%) | 13.81 | 19.68 | 14.57 | ||
+LONG_WAIT | 2009/08/18-3 | 3613 / 4499 (80.31%) | 1311 / 3613 (36.29%) | 672 / 1311 (51.26%) | 672 / 4499 (14.94%) | 13.45 | 19.42 | 14.15 | ||
+MOSES_2.5, +MAX5, -LONG_WAIT | 2009/08/18-2 | 3614 / 4500 (80.31%) | 1320 / 3614 (36.52%) | 674 / 1320 (51.06%) | 674 / 4500 (14.98%) | 13.43 | 19.47 | 14.16 | ||
-MOSES, +BLACKLIST | 2009/08/18 | 3613 / 4499 (80.31%) | 1322 / 3613 (36.59%) | 682 / 1322 (51.59%) | 682 / 4499 (15.16%) | 13.65 | 19.54 | 14.37 | ||
+LEMMAS | 2009/08/17 | 3564 / 4440 (80.27%) | 1364 / 3564 (38.27%) | 710 / 1364 (52.05%) | 710 / 4440 (15.99%) | 13.31 | 19.48 | 14.53 | ||
+T5000, +MAX3, +COMBINED_MTRS, +MOSES_2.25 | 2009/08/16 | 3613 / 4499 (80.31%) | 1389 / 3613 (38.44%) | 724 / 1389 (52.12%) | 724 / 4499 (16.09%) | 13.66 | 19.81 | 14.78 | ||
+T10000, +MAX3, +COMBINED_MTRS | 2009/08/12 | 3611 / 4497 (80.30%) | 1455 / 3611 (40.29%) | 746 / 1455 (51.27%) | 746 / 4497 (16.59%) | 11.99 | 18.21 | 13.92 | ||
+T10000, +COMBINED_MTRS, +LONG_WAIT | 2009/08/03 | 3607 / 4493 (80.28%) | 1066 / 3607 (29.55%) | 548 / 1066 (51.41%) | 548 / 4493 (12.20%) | 13.97 | 20.18 | 13.02 | ||
+BARCELONA_TEST3 | 2009/07/13 | 3614 / 4500 (80.31%) | 1679 / 3614 (46.46%) | 863 / 1679 (51.40%) | 863 / 4500 (19.18%) | 14.12 | 20.11 | 16.27 | ||
+BARCELONA_TEST2 | 2009/07/07 | 3614 / 4500 (80.31%) | 1680 / 3614 (46.49%) | 863 / 1680 (51.37%) | 863 / 4500 (19.18%) | 14.25 | 20.04 | 16.35 | ||
+BARCELONA_TEST | 2009/07/03 | 3613 / 4499 (80.31%) | 1677 / 3613 (46.42%) | 849 / 1677 (50.63%) | 849 / 4499 (18.87%) | 14.09 | 20.18 | 16.13 | ||
+TC0906, -FEEDBACK2 | 2009/06/06 | 3611 / 4499 (80.26%) | 1654 / 3611 (45.80%) | 832 / 1654 (50.30%) | 832 / 4499 (18.49%) | 14.50 | 20.64 | 16.25 | ||
+FEEDBACK2 | 2009/06/04 | 3626 / 4500 (80.58%) | 1649 / 3626 (45.48%) | 834 / 1649 (50.58%) | 834 / 4500 (18.53%) | 14.68 | 20.68 | 16.38 | ||
+JACY_EXP, -FEEDBACK | 2009/06/03 | 3628 / 4500 (80.62%) | 1642 / 3628 (45.26%) | 827 / 1642 (50.37%) | 827 / 4500 (18.38%) | 14.51 | 20.75 | 16.21 | ||
+FEEDBACK | 2009/06/02 | 3613 / 4500 (80.29%) | 1638 / 3613 (45.34%) | 838 / 1638 (51.16%) | 838 / 4500 (18.62%) | 14.97 | 20.87 | 16.60 | ||
+PN_FIX, +JACY_SVN | 2009/05/28 | 3611 / 4498 (80.28%) | 1653 / 3611 (45.78%) | 841 / 1653 (50.88%) | 841 / 4498 (18.70%) | 14.69 | 20.37 | 16.46 | ||
+LIKE | 2009/05/23 | 3616 / 4500 (80.36%) | 1653 / 3616 (45.71%) | 841 / 1653 (50.88%) | 841 / 4500 (18.69%) | 14.69 | 20.37 | 16.45 | ||
+ZERO_FIX | 2009/05/19 | 3616 / 4500 (80.36%) | 1602 / 3616 (44.30%) | 837 / 1602 (52.25%) | 837 / 4500 (18.60%) | 14.70 | 20.37 | 16.42 | ||
-TERG, -TERGDICT, +0902, +0902DICT | 2009/05/18 | 3616 / 4500 (80.36%) | 1602 / 3616 (44.30%) | 634 / 1602 (39.58%) | 634 / 4500 (14.09%) | 15.07 | 21.77 | 14.56 | ||
+TERG, +TERGDICT | 2009/05/16 | 3616 / 4500 (80.36%) | 1549 / 3616 (42.84%) | 559 / 1549 (36.09%) | 559 / 4500 (12.42%) | 14.04 | 19.76 | 13.18 | ||
+SVN | 2009/05/14 | 3616 / 4500 (80.36%) | 1555 / 3616 (43.00%) | 655 / 1555 (42.12%) | 655 / 4500 (14.56%) | 16.08 | 22.44 | 15.28 | ||
+SEMI, +T5000, +BOOT, +RELATIONAL_N2, +TAME, +NAKEREBA, +MOSES, +CVS_HEAD | 2008/11/11 | 3500 / 4499 (77.80%) | 1658 / 3500 (47.37%) | 871 / 1658 (52.53%) | 871 / 4499 (19.36%) | 15.04 | 22.07 | 16.93 | ||
+CVS, +LMC | 2008/10/30 | 3506 / 4500 (77.91%) | 983 / 3506 (28.04%) | 662 / 983 (67.34%) | 662 / 4500 (14.71%) | 13.66 | 24.01 | 14.17 | ||
+NO_AMBIGUOUS_V3, +GEN2, +GEDICT2, +RELATIONAL_N | 2008/10/25 | 3505 / 4499 (77.91%) | 1011 / 3505 (28.84%) | 677 / 1011 (66.96%) | 677 / 4499 (15.05%) | 12.97 | 23.99 | 13.93 | ||
-UNKNOWN, +PN, +NO_AMBIGUOUS_V | 2008/10/19 | 3491 / 4500 (77.58%) | 921 / 3491 (26.38%) | 623 / 921 (67.64%) | 623 / 4500 (13.84%) | 13.32 | 23.66 | 13.58 | ||
+UNKNOWN | 2008/10/13 | 3509 / 4499 (78.00%) | 1255 / 3509 (35.77%) | 805 / 1255 (64.14%) | 805 / 4499 (17.89%) | 11.66 | 20.68 | 14.12 | ||
+WA | 2008/06/27 | 3490 / 4500 (77.56%) | 865 / 3490 (24.79%) | 595 / 865 (68.79%) | 595 / 4500 (13.22%) | 13.03 | 22.70 | 13.12 | ||
+VN3 | 2008/06/21 | 3487 / 4500 (77.49%) | 859 / 3487 (24.63%) | 578 / 859 (67.29%) | 578 / 4500 (12.84%) | 12.01 | 22.67 | 12.41 | ||
+PET, +PMODEL | 2008/06/18 | 3486 / 4499 (77.48%) | 885 / 3486 (25.39%) | 584 / 885 (65.99%) | 584 / 4499 (12.98%) | 12.01 | 22.01 | 12.48 | ||
-NO_SPURIOUS | 2008/06/17 | 2939 / 4500 (65.31%) | 757 / 2939 (25.76%) | 507 / 757 (66.97%) | 507 / 4500 (11.27%) | 12.04 | 21.36 | 11.64 | ||
+GEN, +GEDICT, +VN, +STRICT_N, +STRICT_V, +NO_SPURIOUS | 2008/06/16 | 3005 / 4500 (66.78%) | 804 / 3005 (26.76%) | 514 / 804 (63.93%) | 514 / 4500 (11.42%) | 12.03 | 21.35 | 11.72 | ||
+IF/THEN | 2008/06/09 | 2764 / 4500 (61.42%) | 720 / 2764 (26.05%) | 494 / 720 (68.61%) | 494 / 4500 (10.98%) | 12.00 | 21.34 | 11.47 | ||
+HAND, +SYNC | 2008/06/07 | 2764 / 4500 (61.42%) | 698 / 2764 (25.25%) | 488 / 698 (69.91%) | 488 / 4500 (10.84%) | 12.00 | 21.00 | 11.39 | ||
+PRO | 2008/06/05 | 2764 / 4500 (61.42%) | 572 / 2764 (20.69%) | 398 / 572 (69.58%) | 398 / 4500 (8.84%) | 12.02 | 21.03 | 10.19 |
System Changes Legend
BARCELONA_TEST | Test of jaen for Barcelona LOGON release |
JACY_EXP | Francis’ experimental uncommitted Jacy fixes |
FEEDBACK | feedback cleaning round #1 (feedback clean won ;_;) |
JACY_SVN | re-checked out Jacy SVN |
PN_FIX | make pn-omtr inherit from pn-mtr instead of proper_noun-mtr |
LIKE | fixes for すること/のが好き/嫌い, some modification to idioms (“thank you”, “ok”) |
ZERO_FIX | FCB’s fix to zero pronoun translation |
0TGT | allow rules where the target word doesn’t appear in tc |
0902DICT | rebuilt EDICT rules with 0902 TERG mrs rels |
0902 | reverted to 0902 tip TERG |
TERGDICT | rebuilt EDICT rules with TERG mrs rels |
TERG | switched to trunk ERG in ja2en.lisp |
SVN | updated to the logon svn branch |
CVS_HEAD | updated the logon branch with cvs update -r HEAD |
MOSES | added rules acquired from Moses’ phrase table |
NAKEREBA | added rules for nakereba/nai+to naranai/ikenai |
TAME | added rules for ため and its many variations |
RELATIONAL_N2 | fixed relational noun rules and added rules for embedding relational noun args |
BOOT | updated bootstrapped rules from Tanaka Corpus and SLT06 data |
T5000 | set transfer edges to 5,000 |
SEMI | relaxed semi-test to (setf *semi-test* '(:predicates :properties)) |
LMD | set language model weights to 0.2/0.2/0.1/0.3/0.0/0.2 |
GIZA | added giza++ alignment models for jaen |
LMC | set language model weights in .tsdbrc to 0.2/0.2/0.1/0.5 |
CVS | updated LOGON CVS on 2008/10/28 |
T10000 | increased transfer edges to 10,000 |
RELATIONAL_N | added a clean-up rule to insert ARG1s into relational nouns (_n_of,_n_for,_n_to,_n_about) |
NO_AMBIGUOUS_V3 | added なう and にる to ambiguous verb blacklist |
GEDICT2 | updated mtrs for Tanaka corpus generic entries |
NO_AMBIGUOUS_V2 | updated ambiguous verb form blacklist and added to Jacy SVN |
GEN2 | generic entries updated for new Tanaka corpus |
NEW_TANAKA | cleaned up version of Tanaka corpus |
NO_AMBIGUOUS_V | removed ambiguous verb entries from the Tanaka corpus unknown lexical entries. This includes potential forms of verbs like 買える for 買う and kana verb entries that cause particle ambiguity like でる, にる, はる, etc. |
PN | Proper noun rules like シェクスピア→Shakespeare |
UNKNOWN | fixes to unknown word handling: reinstating common noun -> proper noun coercion, stripping off _rel, etc. |
DISCOURSE | changes to the grammar adding _d_ discourse rels for wa, mo, etc. |
IN_DOMAIN | include up to 3 translations where src and tgt are both in the training data |
WA | fixes to wa and topicalization in grammar |
VN3 | apply VN handling rules after dictionary rules |
VN2 | added a FLAG.SUBSUMES check for args to VN handling |
PMODEL | parsing model trained on Tanaka corpus |
PET | switched to PET for parsing Japanese |
CONJ | fixed conjunction_mtr definition |
NO_SPURIOUS | reduced spurious ambiguity by removing _ga_5_rel,_iru_6_rel,_iru_7_rel from Japanese grammar |
STRICT_V | added checks to make sure ARG0 is of type e for verb rules |
STRICT_N | added checks to make sure ARG0 is of type x for noun rules |
VN | convert verbal nouns to nouns by stripping nominalization_rel and converting ARG0 to x in preprocessing |
GEDICT | added translation rules from Edict for generic entries |
GEN | added generic entries to Japanese grammar for unknown words in Tanaka corpus |
IF/THEN | fixed handling of ~eba/~tara/~nara -> if/then |
SYNC | synchronized rel names in grammar and handcrafted rules |
HAND | added handcrafted lexical items |
PRO | fixed pronoun handling |
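As an illustration of the IN_DOMAIN entry in the legend above ("include up to 3 translations where src and tgt are both in the training data"), here is a minimal Python sketch. The data layout, the function name `in_domain_translations`, and ranking by descending score are all assumptions of this sketch; the page does not say how candidates were ranked or stored, and the toy entries are invented.

```python
def in_domain_translations(candidates, src_vocab, tgt_vocab, limit=3):
    """Keep at most `limit` candidates whose source and target both occur
    in the training data.

    `candidates` is a list of (source, target, score) tuples. Ordering by
    descending score is an assumption of this sketch.
    """
    kept = [
        (src, tgt, score)
        for src, tgt, score in candidates
        if src in src_vocab and tgt in tgt_vocab
    ]
    kept.sort(key=lambda item: item[2], reverse=True)
    return kept[:limit]


# Toy data, invented for illustration only.
candidates = [
    ("犬", "dog", 0.9),
    ("犬", "hound", 0.4),
    ("犬", "canine", 0.3),
    ("犬", "pooch", 0.2),
    ("犬", "cur", 0.1),
]
src_vocab = {"犬"}
tgt_vocab = {"dog", "hound", "canine", "pooch"}
selected = in_domain_translations(candidates, src_vocab, tgt_vocab)
print(selected)
```

Here "cur" is dropped because it is not in the target vocabulary, and "pooch" is dropped by the limit of three.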
Subgoal (2008-10)
- 900 sentences (20%): 253 to get from
- New lexicon (Edict+EDR, Tanaka corpus P(E|J))
- Empathy verbs
- Time expressions
- analyse by length
- reorder training set
Last update: 2011-08-16 by PetterHaugereid