Forthcoming articles

International Journal of Big Data Intelligence (IJBDI)

These articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

Register for our alerting service, which notifies you by email when new issues are published online.

Open Access: Articles marked with this Open Access icon are freely available and openly accessible to all, without any restriction except those stated in their respective CC licences.
We also offer RSS feeds, which provide timely updates of tables of contents, newly published articles and calls for papers.

International Journal of Big Data Intelligence (6 papers in press)

Regular Issues

  • A Survey of Computation Techniques on Time Evolving Graphs   Order a copy of this article
    by Shalini Sharma, Jerry Chou 
    Abstract: Time Evolving Graph (TEG) refers to graphs whose topology or attribute values change over time due to update events, including edge addition/deletion, vertex addition/deletion and attribute changes on vertices or edges. Driven by the Big Data paradigm, the ability to process and analyze TEGs in a timely fashion is critical in many application domains, such as social networks, web graphs, road network traffic, etc. Recently, many research efforts have been made with the aim of addressing the challenges of volume and velocity in dealing with such datasets. However, it remains an active and challenging research topic. Therefore, in this survey, we summarize the state-of-the-art computation techniques for TEGs. We collect these techniques from three different research communities: i) the data mining community for graph analysis; ii) the theory community for graph algorithms; iii) the computation community for graph computing frameworks. Based on our study, we also propose our own computing framework, DASH, for TEGs. We have also performed experiments comparing DASH with the Graph Processing System (GPS). We are optimistic that this paper will help researchers understand the various dimensions of problems in TEGs and continue developing the techniques needed to resolve these problems more efficiently.
    Keywords: Big Data; Time evolving graphs; Computing framework; Algorithm; Data Mining.
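The update events the abstract enumerates can be made concrete with a minimal sketch (all names here are illustrative, not taken from the survey or from DASH): a time-evolving graph kept as a timestamped event log, from which any historical snapshot can be replayed.

```python
from collections import defaultdict

class TimeEvolvingGraph:
    """A TEG as an append-only log of timestamped update events."""

    def __init__(self):
        self.events = []  # (timestamp, operation, (u, v))

    def add_edge(self, t, u, v):
        self.events.append((t, "add_edge", (u, v)))

    def remove_edge(self, t, u, v):
        self.events.append((t, "del_edge", (u, v)))

    def snapshot(self, t):
        """Replay events up to time t to materialise the adjacency sets."""
        adj = defaultdict(set)
        for ts, op, (u, v) in sorted(self.events):
            if ts > t:
                break
            if op == "add_edge":
                adj[u].add(v)
            elif op == "del_edge":
                adj[u].discard(v)
        return adj

g = TimeEvolvingGraph()
g.add_edge(1, "a", "b")
g.add_edge(2, "a", "c")
g.remove_edge(3, "a", "b")
print(sorted(g.snapshot(2)["a"]))  # ['b', 'c']
print(sorted(g.snapshot(3)["a"]))  # ['c']
```

Replaying the full log per query is the naive baseline; the computation techniques surveyed (incremental algorithms, snapshot storage, streaming frameworks) exist precisely to avoid this linear replay cost.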

  • Uncovering data stream behavior of automated analytical tasks in edge computing   Order a copy of this article
    by Lilian Hernandez, Monica Wachowicz, Robert Barton, Marc Breissinger 
    Abstract: Massive volumes of data streams are expected to be generated by the Internet of Things (IoT). Due to their dispersed and mobile nature, they need to be processed using automated analytical tasks. The research challenge is to uncover whether the data streams, which are being generated by billions of IoT devices, actually conform to a data flow that is required to perform streaming analytics. In this paper, we propose process discovery and conformance checking techniques of Process Mining in order to expose the flow dependency of IoT data streams between automated analytical tasks running at the edge of a network. Towards this end, we have developed a Petri Net model to ensure the optimal execution of analytical tasks by finding path deviations, bottlenecks, and parallelism. A real-world scenario in smart transit is used to evaluate the full advantage of our proposed model. Uncovering the actual behavior of data flows from IoT devices to edge nodes has allowed us to detect discrepancies that have a negative impact on the performance of automated analytical tasks.
    Keywords: streaming analytics; process mining; Petri Net; smart transit; Internet of Things; edge computing.
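The conformance-checking idea behind this work can be sketched with a toy place/transition Petri net and token replay (a standard process-mining move; the model, transition names and pipeline below are illustrative assumptions, not the authors' actual net):

```python
class PetriNet:
    """Place/transition net: each transition maps to (input places, output places)."""

    def __init__(self, transitions):
        self.transitions = transitions

    def replay(self, marking, trace):
        """Fire each event in the trace; a missing input token is a deviation."""
        marking = dict(marking)
        for event in trace:
            inputs, outputs = self.transitions[event]
            if any(marking.get(p, 0) < 1 for p in inputs):
                return False  # observed data flow does not conform to the model
            for p in inputs:
                marking[p] -= 1
            for p in outputs:
                marking[p] = marking.get(p, 0) + 1
        return True

# A hypothetical edge-analytics pipeline modelled as a strict sequence:
# ingest -> filter -> aggregate
net = PetriNet({
    "ingest":    ((), ("p1",)),
    "filter":    (("p1",), ("p2",)),
    "aggregate": (("p2",), ("p3",)),
})
print(net.replay({}, ["ingest", "filter", "aggregate"]))  # True
print(net.replay({}, ["ingest", "aggregate"]))            # False (skipped task)
```

Replaying the observed event streams of IoT devices against such a model is what surfaces the path deviations and bottlenecks the abstract refers to.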

  • Combining the Richness of GIS Techniques with Visualization Tools to Better Understand the Spatial Distribution of Data- A Case Study of Chicago City Crime Analysis   Order a copy of this article
    by Omar Bani Taha, M. Omair Shafiq 
    Abstract: This study aims to achieve the following objectives: (1) to explore the benefits of adding a spatial GIS layer of analysis to other existing visualization techniques; (2) to identify and evaluate patterns in selected crime data by analysing Chicago's open dataset and examining the existing literature on crime trends in the city. The motivations for this study include the magnitude and scale of crime incidents across the world, as well as the need for a better understanding of patterns and prediction of crime trends within the selected geographical location. We conclude that Chicago appears to be on course to have both its lowest violent crime rate since 1972 and its lowest murder frequency since 1967. Chicago has witnessed a vigorous drop in most crime types over the last few years compared to previous crime index data. Also, Chicago crime naturally surges during summer months and declines during winter months. Our results align with several previous decades of studies and analyses of Chicago crime, in which the same communities with the highest crime rates still experience the majority of crime; one may go back to the crime patterns reported in 1930s studies and find them very typical. The present study confirmed the efficiency of Geographic Information Systems and other visualization techniques as tools for scrutinizing crime in the city of Chicago.
    Keywords: spatial analysis; geographic information system (GIS); human-centred data science; visualization tools; traditional qualitative techniques; data visualization; spatial and crime mapping.

  • Improving collaborative filtering's rating prediction coverage in sparse datasets by exploiting the friend-of-a-friend concept   Order a copy of this article
    by Dionisis Margaris, Costas Vassilakis 
    Abstract: Collaborative filtering computes personalized recommendations by taking into account ratings expressed by users. Collaborative filtering algorithms first identify people with similar tastes by examining the likeness of already entered ratings. Users with highly similar tastes are termed near neighbours, and recommendations for a user are based on her near neighbours' ratings. However, for a number of users no near neighbours can be found, a problem termed the gray sheep problem. This problem is more intense in sparse datasets, i.e. datasets with a relatively small number of ratings compared to the number of users and items. In this work, we propose an algorithm for alleviating this problem by exploiting the friend-of-a-friend (FOAF) concept. The proposed algorithm, CFfoaf, has been evaluated against eight widely used sparse datasets and under two widely used collaborative filtering correlation metrics, namely the Pearson correlation coefficient and cosine similarity, and has proven particularly effective in increasing the percentage of users for which personalized recommendations can be formulated in the context of sparse datasets, while at the same time improving rating prediction quality.
    Keywords: collaborative filtering; recommender systems; sparse datasets; friend-of-a-friend; Pearson correlation coefficient; cosine similarity; evaluation.
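The neighbourhood-widening idea can be illustrated with a small sketch (the thresholds, helper names and toy data are assumptions for illustration; CFfoaf's actual algorithm is defined in the paper): Pearson similarity over co-rated items, plus a friend-of-a-friend step that recruits neighbours-of-neighbours when direct near neighbours are scarce.

```python
from math import sqrt

def pearson(ratings, u, v):
    """Pearson correlation of users u and v over their co-rated items."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    mu = sum(ratings[u][i] for i in common) / len(common)
    mv = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den = (sqrt(sum((ratings[u][i] - mu) ** 2 for i in common)) *
           sqrt(sum((ratings[v][i] - mv) ** 2 for i in common)))
    return num / den if den else 0.0

def neighbours(ratings, u, threshold=0.5):
    """Direct near neighbours: users whose similarity clears the threshold."""
    return {v for v in ratings if v != u and pearson(ratings, u, v) >= threshold}

def foaf_neighbours(ratings, u, threshold=0.5):
    """Widen the neighbourhood with each near neighbour's own near neighbours."""
    direct = neighbours(ratings, u, threshold)
    indirect = set()
    for v in direct:
        indirect |= neighbours(ratings, v, threshold)
    return (direct | indirect) - {u}

# alice shares no items with carol, so carol is unreachable directly,
# but bob bridges them as a friend of a friend.
ratings = {
    "alice": {"i1": 5, "i2": 3},
    "bob":   {"i1": 5, "i2": 3, "i3": 4, "i4": 2},
    "carol": {"i3": 4, "i4": 2},
}
print(sorted(neighbours(ratings, "alice")))       # ['bob']
print(sorted(foaf_neighbours(ratings, "alice")))  # ['bob', 'carol']
```

The extra neighbours are what let predictions be formulated for users the plain metric would leave as gray sheep; the paper evaluates the same idea under cosine similarity as well.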

  • Improving collaborative filtering's rating prediction accuracy by considering users' dynamic rating variability   Order a copy of this article
    by Dionisis Margaris, Costas Vassilakis 
    Abstract: Collaborative filtering computes personalized recommendations by taking into account ratings expressed by users. Collaborative filtering algorithms first identify people with similar tastes by examining the likeness of already entered ratings. Users with highly similar tastes are termed near neighbours, and recommendations for a user are based on her near neighbours' ratings. However, for a number of users no near neighbours can be found, a problem termed the gray sheep problem. This problem is more intense in sparse datasets, i.e. datasets with a relatively small number of ratings compared to the number of users and items. In this work, we propose an algorithm for alleviating this problem by exploiting the friend-of-a-friend (FOAF) concept. The proposed algorithm, CFfoaf, has been evaluated against eight widely used sparse datasets and under two widely used collaborative filtering correlation metrics, namely the Pearson correlation coefficient and cosine similarity, and has proven particularly effective in increasing the percentage of users for which personalized recommendations can be formulated in the context of sparse datasets, while at the same time improving rating prediction quality.
    Keywords: collaborative filtering; users’ ratings dynamic variability; Pearson correlation coefficient; cosine similarity; evaluation; prediction accuracy.

Special Issue on: BDCA'18 Big Data Management and Applications

  • Towards a systematic collect data process
    by Iman Tikito, Nissrine Souissi 
    Abstract: Big Data has become a well-known topic among a large number of researchers in different areas. Actions to improve the data lifecycle in the Big Data context have been conducted in different phases, focusing mainly on problems such as storage, security, analysis and visualization. In this paper, we focus on improving the collect phase, which makes the other phases more efficient and effective. We propose a process to follow in order to address the problem of collecting a huge amount of data and, as a result, optimize the data lifecycle. To do this, we analyze the different data collect processes present in the literature and identify their similarities with the process of a Systematic Literature Review. We apply our process by mapping the seven characteristics of Big Data to the sub-processes of the proposed collect data process. This mapping provides a guide for the customer to decide clearly on the need to use the proposed process by answering a set of questions.
    Keywords: Big Data; Data Collect; Data Lifecycle; Systematic Literature Review; Process; SLR.