Published in: CSJM v.29, n.3 (87), 2021

Analyzing Complex Words in Hindi using Parameters of Classical Readability Formulae (Part 1)

Authors: G. Venugopal, D. Pramod, J. R. Saini

Abstract

Readability of a passage indicates the extent to which the meaning of the text can be understood; it can be expressed as the age or the school grade a reader needs to have reached to understand the text. Researchers have devised numerous word lists and readability formulae by testing the readability of texts with children and adults. Most of these resources have been built for the English language. This study aims to analyse the complex words in Hindi sentences that were derived from a Human Intelligence Task (HIT), using variables considered in the widely adopted readability measures that focus on the lexical aspects of a sentence. Although there have been studies that analyse the readability of texts, this study claims to be the first of its kind that aims to determine whether the parameters of traditional readability measures contribute significantly to context-agnostic models that classify a Hindi word as complex or simple. We report the results of two approaches used to deem a word complex and determine the better of the two. The model built using the better approach was then used to identify the most significant features.

Gayatri Venugopal
Symbiosis Institute of Computer Studies and Research (SICSR)
Symbiosis International (Deemed University) (SIU),
Model Colony, Pune, Maharashtra, India
Phone: +91-9665856569

Dhanya Pramod
Symbiosis Centre for Information Technology (SCIT)
Symbiosis International (Deemed University) (SIU),
Hinjewadi, Pune, Maharashtra, India

Jatinderkumar R. Saini
Symbiosis Institute of Computer Studies and Research (SICSR)
Symbiosis International (Deemed University) (SIU),
Model Colony, Pune, Maharashtra, India


