Datalinks Wiki
Hungarian Language Corpora and Analyzers

Type: Dataset

Link: http://mokk.bme.hu/resources/

Source: Ckan.net

Resources, including corpora and software, for processing the Hungarian language.

Language resources


    The Hunglish Corpus is a sentence-aligned Hungarian-English parallel corpus published under the Creative Commons Attribution license.


    The Hungarian Webcorpus is a gigaword corpus of Hungarian gathered from the web.


    The Hunglish dictionary is a machine-readable English-Hungarian bilingual lexicon.


    morphdb.hu is a Hungarian morphological database for use with the hunmorph morphological analyzer.

Software


    hunpos is an HMM-based, open-source part-of-speech tagger.


    hunmorph is an open-source tool and programming library for spell-checking, stemming, and morphological analysis of agglutinative, German, and other languages.


    hunalign is a language-independent, sentence-level aligner for building parallel corpora.