Classification of Text Using Fuzzy Based Incremental Feature Clustering Algorithm

Abstract

The dimensionality of the feature vector plays a major role in text classification. We can reduce this dimensionality by using feature clustering based on fuzzy logic. We propose a fuzzy based incremental feature clustering algorithm. Using a similarity test, the words in the feature vector of a document set are grouped into clusters, and each cluster is characterized by a membership function with a statistical mean and deviation. The desired number of clusters is then formed automatically. We then take one extracted feature from each cluster, which is a weighted combination of the words contained in that cluster. With our algorithm, the derived membership functions closely match the real distribution of the training data. Our work also reduces the burden on the user of specifying the number of features in advance.
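The abstract does not give the details of the similarity test or the membership function, so the following is only a minimal sketch of how such an incremental scheme might look, not the authors' implementation. It assumes each word is described by a small numeric pattern vector (here, a made-up distribution over two classes), a Gaussian-style membership function, a similarity threshold `rho`, and an initial deviation `sigma0`. The names `Cluster`, `cluster_words`, `extract_features`, `rho`, and `sigma0`, and the toy data, are all hypothetical.

```python
import math


class Cluster:
    """One fuzzy cluster of word patterns, described by a mean and a deviation."""

    def __init__(self, word, x, sigma0):
        self.words = [word]
        self.n = 1
        self.total = list(x)                 # per-dimension running sum
        self.total_sq = [v * v for v in x]   # per-dimension running sum of squares
        self.sigma0 = sigma0                 # assumed floor added to every deviation

    def mean(self):
        return [s / self.n for s in self.total]

    def deviation(self):
        devs = []
        for s, sq in zip(self.total, self.total_sq):
            m = s / self.n
            var = max(sq / self.n - m * m, 0.0)  # guard against rounding below zero
            devs.append(math.sqrt(var) + self.sigma0)
        return devs

    def membership(self, x):
        # Assumed Gaussian-style membership: product of one-dimensional
        # Gaussians, written as the exponential of a negative sum.
        z = sum(((xi - mi) / di) ** 2
                for xi, mi, di in zip(x, self.mean(), self.deviation()))
        return math.exp(-z)

    def add(self, word, x):
        self.words.append(word)
        self.n += 1
        for i, v in enumerate(x):
            self.total[i] += v
            self.total_sq[i] += v * v


def cluster_words(patterns, rho=0.5, sigma0=0.25):
    """Process words one at a time; open a new cluster when no existing one is similar enough."""
    clusters = []
    for word, x in patterns.items():
        best = max(clusters, key=lambda c: c.membership(x), default=None)
        if best is not None and best.membership(x) >= rho:
            best.add(word, x)      # similarity test passed: update that cluster
        else:
            clusters.append(Cluster(word, x, sigma0))  # otherwise start a new cluster
    return clusters


def extract_features(doc_counts, patterns, clusters):
    """One feature per cluster: membership-weighted sum of the document's word counts."""
    return [sum(c.membership(patterns[w]) * doc_counts.get(w, 0) for w in c.words)
            for c in clusters]


# Toy word patterns (invented for illustration): P(class | word) over two classes.
patterns = {
    "goal": [0.90, 0.10],
    "match": [0.85, 0.15],
    "vote": [0.10, 0.90],
    "senate": [0.05, 0.95],
}
clusters = cluster_words(patterns)
features = extract_features({"goal": 2, "vote": 1}, patterns, clusters)
print([c.words for c in clusters])
print(features)
```

In this sketch the number of clusters is never passed in: it emerges from the threshold `rho`, which mirrors the abstract's point that the user does not have to fix the number of extracted features in advance. A document over four words is reduced to one value per cluster.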
