Ifeanyi-Reuben Nkechi J.¹ and Benson-Emenike Mercy E.², ¹Rhema University, Nigeria and ²Abia State Polytechnic, Nigeria
ABSTRACT
Advances in Information Technology (IT) have encouraged the use of the Igbo language in text creation, online news reporting, online searching, and article publication. As the volume of information stored as text in this language grows, there is a need for an intelligent text-based system to manage the data properly. Selecting an optimal set of features for processing plays a vital role in a text-based system. This paper analyzed the structure of Igbo text and designed an efficient feature selection model for an intelligent Igbo text-based system. It adopted the Mean TF-IDF measure to select the most relevant features from Igbo text documents represented with two word-based n-gram text representation models (unigram and bigram). The model was designed with Object-Oriented Methodology and implemented in the Python programming language with tools from the Natural Language Toolkit (NLTK). The results show that bigram-represented text gives more relevant features based on the language semantics.
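As a rough illustration of the idea in the abstract, the sketch below scores word-based unigram or bigram features by their mean TF-IDF across a small corpus and keeps the highest-scoring ones. It is not the authors' implementation. The abstract does not state the exact formula, so the definitions used here are assumptions: term frequency is the n-gram count divided by the number of n-grams in the document, inverse document frequency is log(N / df), and the mean is taken over all N documents. The function names (`word_ngrams`, `mean_tfidf`, `select_features`), the whitespace tokenization, and the `top_k` cutoff are hypothetical, and plain Python stands in for the NLTK tools the paper mentions.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n):
    """Split text on whitespace (an assumed tokenization) and return word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def mean_tfidf(docs, n):
    """Return {ngram: mean TF-IDF over all documents} for word n-grams of size n."""
    doc_grams = [word_ngrams(doc, n) for doc in docs]
    num_docs = len(doc_grams)

    # Document frequency: number of documents containing each n-gram.
    df = Counter()
    for grams in doc_grams:
        df.update(set(grams))

    # Sum TF-IDF per n-gram over the documents in which it occurs.
    totals = defaultdict(float)
    for grams in doc_grams:
        if not grams:
            continue  # document too short to produce any n-gram
        counts = Counter(grams)
        for gram, count in counts.items():
            tf = count / len(grams)
            idf = math.log(num_docs / df[gram])
            totals[gram] += tf * idf

    # Average over every document in the corpus (assumed definition of the mean).
    return {gram: total / num_docs for gram, total in totals.items()}


def select_features(docs, n, top_k):
    """Keep the top_k n-grams by mean TF-IDF; ties are broken alphabetically."""
    scores = mean_tfidf(docs, n)
    ranked = sorted(scores, key=lambda gram: (-scores[gram], gram))
    return ranked[:top_k]


if __name__ == "__main__":
    # Placeholder tokens, not real Igbo text.
    corpus = ["a b a", "b c"]
    print(select_features(corpus, n=1, top_k=2))  # unigram features
    print(select_features(corpus, n=2, top_k=1))  # bigram features
```

Calling `select_features` with `n=1` and `n=2` on the same corpus gives the two feature sets that the paper compares. A term that appears in every document gets an IDF of zero under this definition, so it never ranks as a relevant feature.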
KEYWORDS
Feature Selection, Igbo Language, Igbo Text Pre-Processing, Text Representation
Original Source URL: https://aircconline.com/ijdkp/V8N6/8618ijdkp02.pdf
https://airccse.org/journal/ijdkp/vol8.html
