Arabic Text Classification Using Linear Discriminant Analysis
EasyChair Preprint 75
6 pages • Date: April 18, 2018

Abstract
Linear Discriminant Analysis (LDA) is a dimensionality reduction technique that is widely used in pattern recognition applications. LDA aims to generate effective feature vectors by projecting the original data (e.g. a bag-of-words representation) into a low-dimensional space. Hence, LDA is a convenient method for text classification, which is generally characterized by high-dimensional feature vectors. In this paper, we empirically investigate two LDA-based methods for Arabic text classification. The first method is based on computing the generalized eigenvectors of the ratio of between-class to within-class scatter (that is, of the inverse within-class scatter matrix multiplied by the between-class scatter matrix). The second method uses linear classification functions that assume equal population covariance matrices (i.e. a pooled sample covariance matrix). We used a textual data collection that contains 1,750 documents belonging to five categories. The testing set contains 250 documents belonging to five categories (50 documents for each category). The experimental results show that the linear classification functions method outperforms the eigenvalue decomposition method.

Keyphrases: Arabic, Classification, Fisher, Linear Discriminant Analysis, text
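
To make the two methods named in the abstract concrete, here is a minimal NumPy sketch of each, written in their standard textbook form. It is an illustration under stated assumptions, not the authors' implementation: the function names (`scatter_matrices`, `lda_projection`, `fit_linear_functions`, `predict`) are hypothetical, the two-feature toy data stands in for real bag-of-words vectors and is not from the paper, equal class priors are assumed, and the pseudo-inverse is used as one possible way to cope with a singular scatter matrix.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return the within-class (Sw) and between-class (Sb) scatter matrices."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - mean_all).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)
    return Sw, Sb


def lda_projection(X, y, n_components):
    """Method 1 (sketch): leading eigenvectors of pinv(Sw) @ Sb."""
    Sw, Sb = scatter_matrices(X, y)
    # pinv is an assumption made here so the sketch also runs when Sw is singular.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real


def fit_linear_functions(X, y):
    """Method 2 (sketch): one linear function per class, pooled covariance."""
    classes = np.unique(y)
    Sw, _ = scatter_matrices(X, y)
    S_pooled = Sw / (len(X) - len(classes))  # pooled sample covariance
    S_inv = np.linalg.pinv(S_pooled)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    coefs = means @ S_inv  # row c holds mean_c^T S_pooled^{-1}
    intercepts = -0.5 * np.sum(coefs * means, axis=1)
    return classes, coefs, intercepts


def predict(model, X):
    """Assign each row of X to the class with the largest linear score."""
    classes, coefs, intercepts = model
    scores = X @ coefs.T + intercepts
    return classes[np.argmax(scores, axis=1)]


# Toy data (not from the paper): three well-separated classes, two features.
X = np.array([
    [1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [2.0, 2.5],
    [6.0, 5.0], [5.0, 6.0], [5.5, 5.5], [6.5, 6.0],
    [1.0, 8.0], [2.0, 9.0], [1.5, 8.5], [2.5, 8.0],
])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])

W = lda_projection(X, y, n_components=2)  # projection matrix for method 1
X_reduced = X @ W                         # data in the discriminant space
model = fit_linear_functions(X, y)        # method 2
pred = predict(model, X)
```

In the first method the projected vectors `X_reduced` would still need a classifier applied to them, whereas the second method assigns a class directly from the largest linear score. With k classes the between-class scatter has rank at most k − 1, so at most k − 1 discriminant directions carry information.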