Boolean and Fp-Matrix Factorization: From Theory to Practice
Fomin, Fedor and Panolan, Fahad and Patil, Anurag and et al, . (2022) Boolean and Fp-Matrix Factorization: From Theory to Practice. In: 2022 International Joint Conference on Neural Networks, IJCNN 2022, 18 July 2022through 23 July 2022, Padua.
Text
Neural_Networks.pdf - Published Version Available under License Creative Commons Attribution. Download (1MB) |
Abstract
Boolean Matrix Factorization (BMF) aims to find an approximation of a given binary matrix as the Boolean product of two low-rank binary matrices. Binary data is ubiquitous in many fields, and representing data by binary matrices is common in medicine, natural language processing, bioinformatics, computer graphics, among many others. Factorizing a matrix into low-rank matrices is used to gain more information about the data, like discovering relationships between the features and samples, roles and users, topics and articles, etc. In many applications, the binary nature of the factor matrices could enormously increase the interpretability of the data. Unfortunately, BMF is computationally hard and heuristic algorithms are used to compute Boolean factorizations. Very re-cently, the theoretical breakthrough was obtained independently by two research groups. Ban et al. (SODA 2019) and Fomin et al. (Trans. Algorithms 2020) show that BMF admits an effi-cient polynomial-time approximation scheme (EPTAS). However, despite the theoretical importance, the high double-exponential dependence of the running times from the rank makes these algorithms unimplementable in practice. The primary research question motivating our work is whether the theoretical advances on BMF could lead to practical algorithms. The main conceptional contribution of our work is the fol-lowing. While EPTAS for BMF is a purely theoretical advance, the general approach behind these algorithms could serve as the basis in designing better heuristics. We also use this strategy to develop new algorithms for related Fp -Matrix Factorization. Here, given a matrix A over a finite field GF (p) where p is a prime, and an integer r. our objective is to find a matrix B over the same field with GF (p) -rank at most r minimizing some norm of A-B. Our empirical research on synthetic and real-world data demonstrates the advantage of the new algorithms over previous works on BMF and Fp-Matrix Factorization. © 2022 IEEE.
IITH Creators: |
|
||||
---|---|---|---|---|---|
Item Type: | Conference or Workshop Item (Paper) | ||||
Additional Information: | The research received funding from the Research Council of Norway via the project BWCA (grant no. 314528) and IIT Hyderabad via Seed grant (SG/IITH/F224/2020-21/SG-79). The work is conducted while Anurag Patil and Adil Tanveer were students at IIT Hyderabad. | ||||
Uncontrolled Keywords: | Binary matrix factorization; Categorical data; Data mining | ||||
Subjects: | Computer science | ||||
Divisions: | Department of Computer Science & Engineering | ||||
Depositing User: | . LibTrainee 2021 | ||||
Date Deposited: | 18 Nov 2022 14:59 | ||||
Last Modified: | 18 Nov 2022 14:59 | ||||
URI: | http://raiithold.iith.ac.in/id/eprint/11180 | ||||
Publisher URL: | http://doi.org/10.1109/IJCNN55064.2022.9892947 | ||||
Related URLs: |
Actions (login required)
View Item |
Statistics for this ePrint Item |