MRS in Tatoeba
This is a small experiment to put MRS test suite sentences into the Tatoeba corpus. The primary purposes of this experiment are to (a) verify the naturalness of the sentences and (b) bootstrap a crosslingual MRS corpus through crowdsourcing.
Issues
Issues that came up when putting MRS test suite sentences into Tatoeba (where an item opens with a phrase followed by a colon, that phrase is a comment from a Tatoeba user on one of the MRS sentences):
- @possible copyright violation: No issue, the test suite is MIT-licensed, the same as the grammars it is part of, which are almost always MIT-licensed (at least ERG and Jacy are). In Tatoeba, we can create a list and add the license to its title. See https://tatoeba.org/eng/sentences_lists/show/166576.
- This is not a good translation: Some supposedly grammatical sentences were deemed unnatural by Tatoeba users. Should these sentences be corrected to ensure that our grammars are parsing the right stuff?
- Sentences are machine-translated.: Were the sentences in the MRS test suite translated by humans, or were they generated using LOGON?
- I cannot think of any situation where this would be used.: Sentences from the test suite are meant to validate the grammars’ correctness, so it is understandable that some of them do not look very useful to Tatoeba users on the surface. Is there a page that says which phenomenon or phenomena each sentence captures? How language-dependent are the phenomena captured in the sentences?
- Users can edit a sentence. See https://tatoeba.org/eng/sentences/show/2638181, which was edited to ‘Abrams bet Browne a cigarette that it had rained.’
Last update: 2020-06-10 by AlexandreRademaker