CSJM v.19, n.2 (56), 2011

Romanian Linguistic Resources On Very Large Scale

Authors: Dan Cristea

Abstract

This paper proposes a methodology for building a technological environment for linguistic processing, intended to conserve, update and exploit strategic linguistic resources of the Romanian language for research, public and commercial purposes. These resources are rooted in textual data contributed daily, and over the long term, by major publishing houses and mass-media institutions. In essence, the paper describes a technology able to receive, store and continuously process large amounts of textual data supplied daily by voluntary contributors. Apart from storing linguistic data over the long term for the benefit of preserving the language, the results of the processing will be returned to three categories of users: researchers working on the Romanian language and computational linguistics, the contributors of the resources, and the public at large.

Faculty of Computer Science, "Alexandru Ioan Cuza" University of Iaşi
Institute of Computer Science, Romanian Academy, the Iaşi branch


