Datalinks Wiki

"Apertium is a toolbox to build open-source shallow-transfer machine translation systems, especially suitable for related language pairs: it includes the engine, maintenance tools, and open linguistic data for several language pairs."

Language-pair data includes:

    Spanish ⇆ Catalan (apertium-es-ca)

    Spanish ← Romanian (apertium-es-ro)

    French ⇆ Catalan (apertium-fr-ca)

    Occitan ⇆ Catalan (apertium-oc-ca)

    English ⇆ Galician (apertium-en-gl)

    Swedish → Danish (apertium-sv-da)

    Occitan ⇆ Spanish (apertium-oc-es)

    Spanish ⇆ Portuguese (apertium-es-pt)

    English ⇆ Catalan (apertium-en-ca)

    English ⇆ Spanish (apertium-en-es)

    English ⇆ Esperanto (apertium-en-eo)

    Spanish ⇆ Galician (apertium-es-gl)

    French ⇆ Spanish (apertium-fr-es)

    Esperanto ← Spanish (apertium-eo-es)

    Welsh → English (apertium-cy-en)

    Breton → French (apertium-br-fr)

    Esperanto ← Catalan (apertium-eo-ca)

    Portuguese ⇆ Catalan (apertium-pt-ca)

    Portuguese ⇆ Galician (apertium-pt-gl)

    Basque → Spanish (apertium-eu-es)

    Norwegian Nynorsk ⇆ Norwegian Bokmål (apertium-nn-nb)

The pairs listed above are the "released" language pairs. Their data includes:

    dictionaries for morphological analysis and generation

    disambiguation data (statistical models, rules, and in some cases Constraint Grammars)

    bilingual (transfer) dictionaries

    structural transfer rules

A lot of data of these kinds also exists for unreleased language pairs, e.g. Icelandic → English and North Sámi → Lule Sámi, along with tools to maintain such data.
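
The arrows in the released-pair list show which translation directions each package covers: ⇆ means both directions, while → and ← mean one direction only. As a rough illustration, the sketch below encodes a few of the listed pairs and looks up which package covers a given direction. The `PAIRS` table, `directions` and `supported` are hypothetical names made up for this example; they are not part of Apertium.

```python
# Illustrative subset of the released pairs listed above.
# Each entry: (package name, left language, arrow, right language).
# This table and the helpers below are hypothetical, not an Apertium API.
PAIRS = [
    ("apertium-es-ca", "Spanish", "⇆", "Catalan"),
    ("apertium-es-ro", "Spanish", "←", "Romanian"),
    ("apertium-sv-da", "Swedish", "→", "Danish"),
    ("apertium-eo-es", "Esperanto", "←", "Spanish"),
]


def directions(left, arrow, right):
    """Expand one arrow into a list of (source, target) tuples."""
    if arrow == "→":
        return [(left, right)]
    if arrow == "←":
        return [(right, left)]
    if arrow == "⇆":
        return [(left, right), (right, left)]
    raise ValueError(f"unknown arrow: {arrow!r}")


def supported(source, target):
    """Return the package names that cover translation from source to target."""
    return [
        package
        for package, left, arrow, right in PAIRS
        if (source, target) in directions(left, arrow, right)
    ]


print(supported("Romanian", "Spanish"))  # ['apertium-es-ro']
print(supported("Spanish", "Romanian"))  # []
```

Read this way, "Spanish ← Romanian" covers Romanian to Spanish only, so the reverse lookup returns an empty list.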

License: the COPYING file in the language-pair data archive contains a copy of the GPL.