Title: Bayesian consensus clustering with LIME for security in big data

Authors: S. Balamurugan; M. Thangaraj

Addresses: Department of Computer Science, Bharathiar University, Coimbatore, India; Department of Computer Science, Bharathiar University, Coimbatore, India

Abstract: Malware causes considerable disruption in the current data era. New security questions arise every day as intruders create new malware. Malware protection remains an active area of research for the Android platform. Malware is delivered through SMS/MMS on the subscriber's network, and an SMS, once read, is forwarded to other users. The device is compromised once intruders gain access to its data. Theft of device and user data can include credit card credentials, login credentials and card information, depending on the user data stored on the Android device. This paper examines how the various kinds of malware in SMS can be detected, drawing on multiple data sources, to protect mobile users from potential risks. A single data source is not very effective for spam detection, because it will not contain all of the latest malware and spam. This work uses two methods: Bayesian consensus clustering (BCC) for spam clustering and large iterative multi-tier ensemble (LIME) for malware classification. The significance of these methods lies in their ability to work with unstructured data from different sources. After the two-step classification, a set of unique malware instances is identified, and all further malware is grouped by category.

Keywords: Bayesian consensus clustering; BCC; large iterative multi-tier ensemble; LIME; ensemble; classification; clustering; data security.

DOI: 10.1504/IJDATS.2021.114665

International Journal of Data Analysis Techniques and Strategies, 2021 Vol.13 No.1/2, pp.15 - 35

Received: 14 Aug 2018
Accepted: 05 Mar 2019

Published online: 30 Apr 2021
