Authors: Nasser Zalmout; Moustafa M. Ghanem
Addresses: Department of Computing, Imperial College London, UK; New York University Abu Dhabi, United Arab Emirates | Deceased; formerly of: School of Science and Technology, Middlesex University, London, UK
Abstract: In this paper, we present and apply a generic approach to multivariate community detection from Twitter data. Because social media communication is multivariate, it gives rise to multidimensional interaction patterns. We analyse different similarity and interaction patterns between users and construct multiple distance matrices from them. These distance matrices allow traditional network-centric community detection techniques to be applied to identify user clusters. The paper also incorporates an adaptive technique, based on Bayesian classification, for assigning new content to the already detected communities, addressing the dynamicity and evolution of social media. Using a dataset of UK political tweets, we evaluate the factors affecting the quality of the detected communities. We also investigate how the accuracy of the classifier is affected by the dynamicity of the network evolution and the time elapsed between community detection and classifier application.
Keywords: community detection; twitter; homophily; social network analysis; SNA; social media; dynamicity; multidimensional communities; user clusters; Bayesian classification; UK; United Kingdom; political tweets; similarity patterns; interaction patterns.
International Journal of Big Data Intelligence, 2016 Vol.3 No.4, pp.239 - 249
Received: 20 Apr 2014
Accepted: 13 Nov 2014
Published online: 21 Oct 2016
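
The pipeline outlined in the abstract (several distance matrices, community detection on the combined distances, then Bayesian classification of new content) can be illustrated with a minimal sketch. The abstract does not specify the similarity measures, the combination rule, the detection algorithm, or the classifier details, so everything below is an assumption made for illustration: Jaccard distance over hashtags and mentions, an equally weighted average of the matrices, connected components under a distance threshold, and a multinomial naive Bayes classifier with Laplace smoothing. The user names and tags are toy values, not data from the paper.

```python
import math
from collections import Counter


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets count as maximally distant."""
    union = a | b
    return 1.0 if not union else 1.0 - len(a & b) / len(union)


def distance_matrix(users, feature):
    """Pairwise Jaccard distances for one feature (assumed measure)."""
    return [[0.0 if u == v else jaccard_distance(feature[u], feature[v])
             for v in users] for u in users]


def combine(matrices, weights):
    """Weighted average of equally sized distance matrices (assumed rule)."""
    total = sum(weights)
    n = len(matrices[0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
             for j in range(n)] for i in range(n)]


def threshold_communities(users, dist, max_dist):
    """Connected components of the graph linking users within max_dist.

    A stand-in for a real community detection algorithm.
    """
    seen, communities = set(), []
    for start in range(len(users)):
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(users[i])
            for j in range(len(users)):
                if j not in seen and dist[i][j] <= max_dist:
                    seen.add(j)
                    stack.append(j)
        communities.append(sorted(members))
    return communities


def train_naive_bayes(community_tokens):
    """Per-community log-probabilities with Laplace smoothing, uniform prior."""
    vocab = {t for tokens in community_tokens.values() for t in tokens}
    model = {}
    for label, tokens in community_tokens.items():
        counts = Counter(tokens)
        denom = len(tokens) + len(vocab)
        model[label] = {t: math.log((counts[t] + 1) / denom) for t in vocab}
    return model


def classify(model, tokens):
    """Return the community with the highest score; unseen tokens are ignored."""
    scores = {label: sum(logp[t] for t in tokens if t in logp)
              for label, logp in model.items()}
    return max(scores, key=scores.get)


# Toy input (hypothetical users, hashtags, and mentions).
users = ["a", "b", "c", "d"]
hashtags = {"a": {"#t1", "#t2"}, "b": {"#t1", "#t2", "#t3"},
            "c": {"#t4", "#t5"}, "d": {"#t4", "#t5", "#t3"}}
mentions = {"a": {"@x"}, "b": {"@x"}, "c": {"@y"}, "d": {"@y"}}

combined = combine([distance_matrix(users, hashtags),
                    distance_matrix(users, mentions)], weights=[1.0, 1.0])
communities = threshold_communities(users, combined, max_dist=0.5)

# Train the classifier on the hashtags used by each detected community.
community_tokens = {idx: [t for u in members for t in hashtags[u]]
                    for idx, members in enumerate(communities)}
model = train_naive_bayes(community_tokens)
predicted = classify(model, ["#t1", "#t9"])
```

On this toy input the two features agree, so users `a` and `b` end up in one community and `c` and `d` in the other, and new content tagged `#t1` is assigned to the first. In practice the weights, the threshold, and the detection algorithm would each need tuning, and the classifier would need retraining as the network evolves, which is the time-dependence the abstract says the paper evaluates.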