CSJM v.30, n.1 (88), 2022

Revisiting the Role of Classical Readability Formulae Parameters in Complex Word Identification (Part 2)

Authors: Gayatri Venugopal, Dhanya Pramod, Jatinderkumar R. Saini
Keywords: complex word identification, readability, Hindi, binary classification, natural language processing.

Abstract

Accessibility of text is an attribute that deserves the attention of researchers and content creators. This study attempts to determine the lexical features that play a key role in identifying complex words in Hindi text. As a first step, we studied the parameters used in readability metrics for different languages and tested their importance in classifiers built on datasets created with the help of a user study. In Part 1 of the study, we reported the results of two different approaches used to label a word as complex. In this part, we compare those results with the results obtained from a third labeling approach. We found satisfactory evidence for certain parameters and also observed a new parameter that could be used when devising readability metrics for Hindi.

Gayatri Venugopal
Symbiosis Institute of Computer Studies and Research (SICSR)
Symbiosis International (Deemed University) (SIU),
Model Colony, Pune, Maharashtra, India
Phone: +91-9665856569

Dhanya Pramod
Symbiosis Centre for Information Technology (SCIT)
Symbiosis International (Deemed University) (SIU),
Hinjewadi, Pune, Maharashtra, India

Jatinderkumar R. Saini
Symbiosis Institute of Computer Studies and Research (SICSR)
Symbiosis International (Deemed University) (SIU),
Model Colony, Pune, Maharashtra, India

DOI

https://doi.org/10.56415/csjm.v30.03
