Class Specific TF-IDF Boosting for Short-text Classification

Ghosh, Samujjwal and Desarkar, Maunendra Sankar (2018) Class Specific TF-IDF Boosting for Short-text Classification. In: Companion of The Web Conference 2018 on The Web Conference, April 23-27, 2018, Lyon, France.

Full text not available from this repository.

Abstract

Proper formulation of features plays an important role in short-text classification tasks, as very little text is available. In the literature, Term Frequency - Inverse Document Frequency (TF-IDF) is commonly used to create feature vectors for such tasks. However, the TF-IDF formulation does not use the class information available in supervised learning. For classification problems, if terms that strongly distinguish among classes can be identified, then more weight can be given to those terms during the feature construction phase. Incorporating this extra class-label information may improve classifier performance. We propose a supervised feature construction method for classifying tweets posted during different disaster scenarios according to the actionable information they may contain. Improved classifier performance on such tasks can help rescue and relief operations. We used three benchmark datasets containing tweets posted during the Nepal and Italy earthquakes of 2015 and 2016, respectively. Experimental results show that the proposed method obtains better classification performance on these benchmark datasets.
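The general idea of giving class-discriminating terms more weight can be illustrated with a small sketch. This record does not include the paper's formulation, so the boost used here is an assumption made purely for illustration: each term's TF-IDF weight is multiplied by a factor that grows with how concentrated the term's documents are in a single class. The function names, the `alpha` parameter, the purity-based boost, and the toy tweets are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tfidf(docs):
    """Plain TF-IDF per document: raw term count times log(N / df)."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n_docs / df[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]


def class_boosts(docs, labels, alpha=1.0):
    """Illustrative boost (an assumption, not the paper's formula).

    purity = share of a term's documents that fall in its most common class.
    The boost is 1.0 for a term spread evenly over all classes and
    1.0 + alpha for a term that occurs in one class only.
    """
    k = len(set(labels))
    df = Counter()
    df_by_class = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        for term in set(doc):
            df[term] += 1
            df_by_class[term][label] += 1
    if k < 2:
        # With a single class there is nothing to discriminate.
        return {term: 1.0 for term in df}
    boosts = {}
    for term in df:
        purity = max(df_by_class[term].values()) / df[term]
        boosts[term] = 1.0 + alpha * (purity - 1.0 / k) / (1.0 - 1.0 / k)
    return boosts


def boosted_tfidf(docs, labels, alpha=1.0):
    """Multiply every TF-IDF weight by the boost of its term."""
    boosts = class_boosts(docs, labels, alpha)
    return [{t: w * boosts[t] for t, w in vec.items()} for vec in tfidf(docs)]


# Toy tokenized tweets with made-up labels.
docs = [
    ["need", "water", "food"],
    ["need", "medical", "help"],
    ["water", "available", "here"],
    ["food", "available", "camp"],
]
labels = ["need", "need", "availability", "availability"]

boosts = class_boosts(docs, labels)
plain = tfidf(docs)
boosted = boosted_tfidf(docs, labels)
```

In this toy data, "need" and "available" each occur in only one class and so receive the full boost, while "water" and "food" occur equally in both classes and keep their plain TF-IDF weight. In a real pipeline the boosts would be estimated from training labels only and then applied unchanged to unlabeled text.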

IITH Creators: Desarkar, Maunendra Sankar (ORCiD: UNSPECIFIED)
Item Type: Conference or Workshop Item (Paper)
Subjects: Computer science
Divisions: Department of Computer Science & Engineering
Depositing User: Team Library
Date Deposited: 25 Apr 2018 06:07
Last Modified: 25 Apr 2018 06:07
URI: http://raiithold.iith.ac.in/id/eprint/3885
Publisher URL: http://doi.org/10.1145/3184558.3191621