Designing and creating a DELPH-IN repository (systems, tools, data, lingware)

LisbonRepositoryDiscussion is the transcript of Discussion I on Thu 2005-08-18 (11:00-11:45) in the LisbonTop meeting.

Moderation: StephanOepen

Scribe: UlrichSchaefer

Stephan introduces motivation for the discussion.

  • DELPH-IN is a community effort
  • central is the common repository
  • internal goal: exchange not only ideas, but also tools, resources, experiences
  • external goal: increase visibility, marketing; this is closely related to Discussion IV: Visibility in the afternoon

    • attract new contributors and funding
    • give the consortium a visible platform with a name

Start with what’s on the current webpage on Systems & Grammars

  • Core components
    • LKB: development environment
    • PET: fast parser
    • [incr tsdb()]: evaluation and benchmarking tool for both
  • Lingware
    • Matrix : starter kit for new grammars / bottom-up crosslingual
    • ERG : English Resource Grammar
    • Jacy : Japanese Grammar
    • Modern Greek
    • NorSource : Norwegian Grammar
    • German grammar to be available as open source in 2006 (already available for DELPH-IN members now)
    • plus more candidates: Korean, French, Spanish, Catalan, … ?
  • Architecture, language technology
  • Treebanks: Redwoods, Eiche, Tiger 700, …

Questions

  • how to organize exchange and delivery
    • communication
      • wiki
      • mailing list(s)
      • external vs. internal communication
      • up-to-date info on ongoing developments (visibility; avoid double work)
      • bugtracking system (scapegoat)
      • second generation developers experiences
      • keep in mind ‘consumer-only’ users like students, people not immediately working in CL/LT research
    • repository
      • what is missing?
        • common morphology component (finite state) for integration with LKB and PET, e.g. Spanish, Portuguese, German, Japanese: availability vs. usage of internal morphology, LKB optimization/lex. rules problem: more readings than necessary
        • better interface to e.g. ambigous external tokenization/preprocessing
        • machine translation (transfer) support
        • documentation is distributed, partly hard to find, e.g. DeepThought deliverables

        • quality assurance; see also Discussion V: Developing a distributed means of testing LKB updates on Friday
        • ways of obtaining DELPH-IN components
          • release vs. full CVS access (development tracking)
        • domain adaptation support
      • which platforms (operating systems, development environments) to support
        • what OSes are currently used?
        • which systems available via DELPH-IN are currently used?
  • marketing
    • is it a good idea to separate into wiki and static ‘official’ web page? (yes)
    • color and design of the current webpage
    • logo
    • online demos (holistic demonstrator / showcase app / killer app)
    • educational material
    • presentations (e.g. DELPH-IN corporate overview, tools, available resources)
    • target audience:
      • attract new contributors (members)
      • funding agencies/companies (e.g. visibility of already funded projects to potential funders)
    • licensing: see Discussion VI: Developing a standard DELPH-IN open source license on Friday

Last update: 2011-10-09 by anonymous [edit]