Title: Analysing research collaboration through co-authorship networks in a big data environment: an efficient parallel approach

Authors: Carlos Roberto Valêncio; José Carlos De Freitas; Rogéria Cristiane Gratão De Souza; Leandro Alves Neves; Geraldo Francisco Donegá Zafalon; Angelo Cesar Colombini; William Tenório

Addresses: Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil; Fluminense Federal University (UFF), Niterói, Rio de Janeiro, Brazil; Institute of Biosciences, Humanities and Exact Sciences (IBILCE), São Paulo State University (UNESP), Campus São José do Rio Preto, São Paulo, Brazil

Abstract: Bibliometry is the quantitative study of scientific production and enables the characterisation of scientific collaboration networks. However, as science develops and scientific production increases, large collaborative networks are formed, which makes it difficult to extract bibliometrics. In this context, this work presents an efficient parallel optimisation, using multithread programming, of three bibliometrics for co-authorship network analysis: transitivity, average distance, and diameter. Our experiments found that, as the size of the co-authorship network grows, the time taken to calculate the transitivity value using the sequential approach grows 4.08 times faster than with the proposed parallel approach. Similarly, the time taken to calculate the average distance and diameter values using the sequential approach grows 5.27 times faster than with the proposed parallel approach. In addition, we report relevant values of speedup and efficiency for the developed algorithms.
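
To make the three bibliometrics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the co-authorship network is an undirected, unweighted graph stored as a dictionary of neighbour sets, takes transitivity to be the fraction of connected triples that are closed, and measures distance in hops between reachable pairs of authors. The function names (`bfs_distances`, `node_triples`, `network_metrics`) and the `workers` parameter are hypothetical. The per-node work is independent, which is what makes it natural to distribute over threads.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def bfs_distances(adj, source):
    """Return hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def node_triples(adj, v):
    """Return (closed, total) counts of connected triples centred on v."""
    closed = total = 0
    for a, b in combinations(sorted(adj[v]), 2):
        total += 1
        if b in adj[a]:  # the two neighbours of v are also co-authors
            closed += 1
    return closed, total


def network_metrics(adj, workers=4):
    """Compute (transitivity, average distance, diameter) of the graph.

    Assumption: one task per node, handed to a thread pool. This only
    illustrates the decomposition; it is not the paper's algorithm.
    """
    nodes = list(adj)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        triples = list(pool.map(lambda v: node_triples(adj, v), nodes))
        dists = list(pool.map(lambda s: bfs_distances(adj, s), nodes))

    closed = sum(c for c, _ in triples)
    total = sum(t for _, t in triples)
    transitivity = closed / total if total else 0.0

    # Keep only distances between distinct, mutually reachable nodes.
    lengths = [d for dist in dists for d in dist.values() if d > 0]
    average_distance = sum(lengths) / len(lengths) if lengths else 0.0
    diameter = max(lengths, default=0)
    return transitivity, average_distance, diameter


# Toy network: authors A, B and C wrote together; D wrote only with C.
example = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
```

Calling `network_metrics(example)` on the toy network gives a transitivity of 0.6, an average distance of 16/12 and a diameter of 2. In standard CPython builds the global interpreter lock limits the speedup threads can give for CPU-bound work such as this, so the sketch shows how the computation splits into independent tasks rather than reproducing the reported performance.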

Keywords: bibliometrics; graphs; knowledge extraction; co-authorship network; NoSQL; parallel computing.

DOI: 10.1504/IJCSE.2020.106061

International Journal of Computational Science and Engineering, 2020 Vol.21 No.3, pp.364 - 374

Received: 10 May 2018
Accepted: 07 Nov 2018

Published online: 26 Mar 2020
