Article: Automated identification of callbacks in Android framework using machine learning techniques
Journal: International Journal of Embedded Systems (IJES) 2018 Vol.10 No.4 pp.301 - 312
Publisher: Inderscience Publishers

Title: Automated identification of callbacks in Android framework using machine learning techniques

Authors: Xiupeng Chen; Rongzeng Mu; Yuepeng Yan

Addresses: University of Chinese Academy of Sciences, 19A Yuquan Rd, Shijingshan District, Beijing, China, and Institute of Microelectronics of Chinese Academy of Sciences Kunshan Branch, 1699 Zuchongzhi, Kunshan, China; Institute of Microelectronics of Chinese Academy of Sciences, 3 Beitucheng West Road, Chaoyang District, Beijing, China; Institute of Microelectronics of Chinese Academy of Sciences, 3 Beitucheng West Road, Chaoyang District, Beijing, China

Abstract: The number of malicious Android applications has grown explosively, leaking large amounts of privacy-sensitive information. Existing static code analysis tools, however, rely on imprecise callback lists and therefore miss many leaks, as this paper demonstrates. The paper presents a machine learning approach that identifies callbacks in the Android framework automatically. Given a training set of hand-annotated callbacks, the proposed approach can detect all callbacks in the entire framework. A series of experiments identifies 20,391 callbacks in Android 4.2. The approach, verified by ten-fold cross-validation, is effective and efficient, with average precision and recall above 91%. The evaluation results show that many of the newly discovered callbacks are indeed used, which further confirms that the approach is suitable for all Android framework versions.
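
The evaluation protocol named in the abstract, ten-fold cross-validation scored by precision and recall, can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the labelled method names are a hypothetical toy set, and a simple name-prefix rule stands in for the paper's SVM and its features so that the example needs no third-party libraries.

```python
from collections import Counter

def k_fold_indices(n, k=10):
    """Assign indices 0..n-1 to k folds round-robin (fold i gets i, i+k, ...)."""
    return [list(range(i, n, k)) for i in range(k)]

def precision_recall(y_true, y_pred):
    """Precision and recall for boolean labels; 0.0 when a ratio is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def name_prefix(method_name):
    """Leading run of lowercase letters, e.g. 'onClick' -> 'on'."""
    prefix = ""
    for ch in method_name:
        if not ch.islower():
            break
        prefix += ch
    return prefix

def train(samples):
    """Toy stand-in for a classifier (NOT the paper's SVM): keep the
    prefixes that label more callbacks than non-callbacks in training."""
    pos, neg = Counter(), Counter()
    for name, is_callback in samples:
        (pos if is_callback else neg)[name_prefix(name)] += 1
    return {p for p in pos if pos[p] > neg[p]}

def cross_validate(samples, k=10):
    """Average precision and recall over k held-out folds."""
    scores = []
    for fold in k_fold_indices(len(samples), k):
        held_out = set(fold)
        train_set = [s for i, s in enumerate(samples) if i not in held_out]
        test_set = [samples[i] for i in fold]
        model = train(train_set)
        y_true = [label for _, label in test_set]
        y_pred = [name_prefix(name) in model for name, _ in test_set]
        scores.append(precision_recall(y_true, y_pred))
    avg_p = sum(p for p, _ in scores) / len(scores)
    avg_r = sum(r for _, r in scores) / len(scores)
    return avg_p, avg_r

# Hypothetical hand-annotated set: (method name, is_callback).
SAMPLES = [
    ("onCreate", True), ("getName", False), ("onClick", True),
    ("setTitle", False), ("onResume", True), ("toString", False),
    ("onPause", True), ("getId", False), ("run", True),
    ("setText", False), ("getWidth", False), ("onStart", True),
    ("setColor", False), ("onTouch", True), ("hashCode", False),
    ("onStop", True), ("getHeight", False), ("onDraw", True),
    ("equals", False), ("onKey", True),
]

if __name__ == "__main__":
    avg_precision, avg_recall = cross_validate(SAMPLES, k=10)
    print(f"precision={avg_precision:.2f} recall={avg_recall:.2f}")
```

In this toy set the only callback without an `on` prefix (`run`) is missed whenever it falls in the held-out fold, which is the kind of error that averaging across folds is meant to expose. A real replication would replace `train` with an SVM over the paper's features.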

Keywords: callbacks identification; machine learning; support vector machine; SVM; cross-validation; static analysis; malware; privacy; android framework; Android; mobile application security.

DOI: 10.1504/IJES.2018.093688

International Journal of Embedded Systems, 2018 Vol.10 No.4, pp.301 - 312

Received: 24 May 2016
Accepted: 17 Oct 2016
Published online: 01 Aug 2018
