Singh, Dinesh and C, Krishna Mohan
(2018)
Projection-SVM: Distributed Kernel Support Vector Machine for Big Data using Subspace Partitioning.
In: IEEE International Conference on Big Data (Big Data), 10-13 December 2018, USA.
Abstract
Training a kernel support vector machine (SVM) is computationally demanding for large datasets where the number of samples runs into the millions. This is because the kernel matrix, which is in general not sparse, is both computationally expensive and memory intensive. Existing methods rarely achieve linear scaling and suffer from high approximation loss. We propose Projection-SVM, a distributed implementation of the kernel support vector machine for large datasets using subspace partitioning. In subspace partitioning, a decision tree is constructed on the projection of the data along the direction of maximum variance (i.e., the dominant eigenvector) to obtain smaller partitions (i.e., subspaces) of the dataset. A kernel SVM is then trained independently on each of these partitions over a cluster, thereby reducing the overall training time. The partitioning also reduces the prediction time significantly. We demonstrate the efficacy of the proposed approach on eight standard large datasets from various application domains, including mnist8m, kddcup99, and webspam, where Projection-SVM is on average 150 times faster than sequential SVM while maintaining classification accuracy. The experimental results also show the superiority of Projection-SVM over state-of-the-art approaches for distributed kernel SVMs, such as DCSVM, CASVM, and DTSVM.
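The subspace-partitioning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes power iteration to estimate the dominant eigenvector, and it swaps the paper's decision tree for a simpler recursive median split on the projected values. All function names (`dominant_direction`, `project`, `split_thresholds`, `partition_id`) are hypothetical. In a full pipeline, a kernel SVM would presumably be trained independently on the samples of each partition, and a test point would be routed by `partition_id` to the model for its partition.

```python
import bisect
import math
import random


def dominant_direction(X, iters=100):
    """Estimate the direction of maximum variance by power iteration.

    Returns (mean, v), where v is a unit vector. The covariance matrix is
    never formed explicitly: each step computes Xc^T (Xc v) on centered data.
    Assumption: the uniform start vector is not orthogonal to the answer.
    """
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in X]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        proj = [sum(r[j] * v[j] for j in range(d)) for r in centered]
        w = [sum(p * r[j] for p, r in zip(proj, centered)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def project(x, mean, v):
    """Scalar projection of one sample onto the dominant direction."""
    return sum((xi - mi) * vi for xi, mi, vi in zip(x, mean, v))


def split_thresholds(values, depth):
    """Recursively split 1-D values at the median; return sorted thresholds.

    Stand-in for the paper's decision tree (an assumption of this sketch).
    """
    if depth == 0 or len(values) < 2:
        return []
    s = sorted(values)
    mid = len(s) // 2
    t = (s[mid - 1] + s[mid]) / 2.0
    left = split_thresholds(s[:mid], depth - 1)
    right = split_thresholds(s[mid:], depth - 1)
    return left + [t] + right


def partition_id(x, mean, v, thresholds):
    """Index of the partition that a sample falls into."""
    return bisect.bisect_right(thresholds, project(x, mean, v))


# Synthetic data whose variance is concentrated along the first axis.
random.seed(0)
X = [[random.gauss(0, 5), random.gauss(0, 0.5)] for _ in range(200)]

mean, v = dominant_direction(X)
thresholds = split_thresholds([project(x, mean, v) for x in X], depth=2)

# Group sample indices by partition; each group would get its own kernel SVM.
parts = {}
for i, x in enumerate(X):
    parts.setdefault(partition_id(x, mean, v, thresholds), []).append(i)
```

Because every partition is much smaller than the full dataset, each per-partition kernel matrix is correspondingly smaller, and the partitions can be processed on separate workers of a cluster.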