Authors: Xiupeng Chen; Rongzeng Mu; Yuepeng Yan
Addresses: University of Chinese Academy of Sciences, 19A Yuquan Rd, Shijingshan District, Beijing, China; Institute of Microelectronics of Chinese Academy of Sciences Kunshan Branch, 1699 Zuchongzhi, Kunshan, China ' Institute of Microelectronics of Chinese Academy of Sciences, 3 Beitucheng West Road, Chaoyang District, Beijing, China ' Institute of Microelectronics of Chinese Academy of Sciences, 3 Beitucheng West Road, Chaoyang District, Beijing, China
Abstract: The number of malicious Android applications has grown explosively, leaking massive amounts of privacy-sensitive information. Nevertheless, existing static code analysis tools that rely on imprecise callback lists miss a large number of leaks, as demonstrated in the paper. This paper presents a machine learning approach to automatically identifying callbacks in the Android framework. Given a training set of hand-annotated callbacks, the proposed approach can detect all callbacks in the entire framework. A series of experiments is conducted, identifying 20,391 callbacks on Android 4.2. The proposed approach, verified by ten-fold cross-validation, is effective and efficient, with precision and recall averaging more than 91%. The evaluation results show that many of the newly discovered callbacks are indeed used, which further confirms that the approach is suitable for all Android framework versions.
Keywords: callbacks identification; machine learning; support vector machine; SVM; cross-validation; static analysis; malware; privacy; Android framework; Android; mobile application security.
International Journal of Embedded Systems, 2018 Vol.10 No.4, pp.301 - 312
Received: 24 May 2016
Accepted: 17 Oct 2016
Published online: 01 Aug 2018