Preprocessing and deep grammars

Lead: Kiril

Participants: Petya, Mike, Angelina, Montse, Yi

Main discussants: Kiril and Yi

0. The focus of the discussion is on how to incorporate shallow parses into deep parsing architectures.

1. Problem:

- MaltParser - tried on Bulgarian with partial grammars as input, BUT - not always complete trees can be derived.

- only one format CoNLL

2. What to be done? - input to be chosen (POS tagged, lemmatized, chunked or DP parsed…)

- degrees of reliability of the various preprocessing steps

- degrees of control over the various preprocessing steps

- towards Statistical parsong where confidence measures can be used

3. Connection to partial FS (hence - to simplified FS)

- language grammar specification

- mapping between the grammar and the format, i.e. agreement info for chunks

- impact of the new words

- impact of sure vs. unsure decisions, made by the various preprocessing steps

4. MaltParser again:

- no underspecification is possible now

- discrepancies between locality in algorithm and linguistic locality

Example: for German first the topological fields are identified, and then – HPSG analyses are performed.

5. Should morphology be done outside the grammar?

6. Probabilistic model over an PCFG Grammar. However: hard with a single probabilistic model. Thus: parallel model to HPSG grammar?

7. Towards Robust Unification. This implies degrees of constraints. For example, the INFLECTED+/- can be unified, but not singular and plural.

CONCLUSION: The idea of a hybrid approach; information fusion. Everything in a complex system.

Last update: 2012-07-10 by EmilyBender [edit]