Forthcoming and Online First Articles

International Journal of Data Science (IJDS)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

Register for our alerting service, which notifies you by email when new issues are published online.

Timely updates of tables of contents, newly published articles and calls for papers are also available.

International Journal of Data Science (12 papers in press)

Regular Issues

  • Research on precise employment data analysis and practice of university graduates based on web system implementation
    by Jie Liu, Ning Wang 
    Abstract: Based on big data technology and its application, this paper outlines the current problems of college employment, gives specific strategies for college employment services, and designs the main framework of the college student employment information service platform in the new era. To this end, a web-based employment management system for college graduates is designed. The information sharing platform is used for information collection and classification, data induction, mining and analysis, intelligent classification guidance for graduate resumes, and vocational, occupational and skill assessment, accurately matching job information to graduates and helping them achieve precise employment. At the same time, we use big data technology to analyse the employment process of students, and further guide them to find employment according to the data analysis results.
    Keywords: web; management system; big data; employment service management platform; graduates.
    DOI: 10.1504/IJICT.2022.10051509
  • Feature Analysis applying Clustering and Optimization Methods to Mahalanobis-Taguchi Method
    by Shinichi Murata, Hiroshi Morita 
    Abstract: While data analysis is important in various corporate activities, it is often the case that a company's data analysis is not well-conducted. There are two main reasons for this: the lack of teacher data and the increasingly complicated nature of the data to be analyzed, which makes it difficult to judge the appropriate analysis unit/group and to select the appropriate items to be used for the analysis. In response, we propose a data analysis approach that combines a clustering and a stochastic optimization model with the Mahalanobis-Taguchi method, making it possible to automatically determine the group of data to be analyzed and the items of data to be used, and to extract features from the data. The proposed approach enables data analysis with a single correct label and eliminates tasks that require higher-level skills (such as feature selection). The effectiveness of the proposed method is verified using recorded TV data.
    Keywords: Mahalanobis-Taguchi Method; Clustering; x-means; k-means; Optimization Method; Operations Research; Genetic Algorithm; Feature selection; Data Analysis; Recorded TV data.

  • Cloud Service Quality: A Research Roadmap
    by Xianrong Zheng 
    Abstract: As cloud services become increasingly popular, cloud providers compete to offer the same or similar services over the Internet. Quality of Service (QoS), which describes how well a cloud service is performed, will be more important. QoS refers to nonfunctional properties of cloud services, and is an important differentiator among functionally equivalent cloud services. As a result, how to evaluate and assure QoS becomes important in both IT and business disciplines. This paper argues for QoS evaluation and assurance in cloud services. It reviews the state of the art, reports our latest work, and discusses future research directions and challenges on the two topics. The paper proposes a new benchmark suite for measuring cloud services and a new economic method for allocating cloud resources. The benchmark suite can provide comparable QoS data for cloud services. The economic method can meet users' QoS needs while minimizing resources consumed for cloud services.
    Keywords: Cloud Computing; Quality of Service; Cloud Benchmark; Resource Allocation.

  • Slope One Collaborative Recommendations: A Survey
    by Neeraj Kumar Bharti, Vijay Verma 
    Abstract: Collaborative filtering (CF) is a traditional and popular technique in the recommendation system (RS) paradigm. Still, it suffers from the problems of data sparsity and cold start. There exist numerous ways to reduce the problem of data sparseness in the RS scenario. Notably, one specific form of item-based collaborative filtering (IBCF), known as the slope one algorithm, deals with data sparsity in many different ways. Slope one algorithms are simple; therefore, their implementations are more straightforward than other complicated IBCF approaches. Further, the family of slope one algorithms provides comparable efficiency with respect to other CF techniques. This work summarizes the state-of-the-art techniques for slope one recommendations. Various slope one predictors are analyzed and compared with each other along with their pros and cons. In order to validate the effectiveness of slope one predictors, several experiments have been performed using different datasets such as MovieLens-100k, MovieLens-1M, and Filmtrust. Finally, the weighted slope one predictor is compared with the basic IBCF using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) metrics. Empirical values of MAE and RMSE demonstrate that the slope one predictors provide almost the same accuracy as obtained from other complex and computationally expensive methods.
    Keywords: Recommender system; Slope one; collaborative filtering; item-based CF; data sparsity.

  • A DEA-WEI method for ranking universities in the presence of imprecise data
    by Bibi Faheema Luckhoo, Arshad Ahmud Iqbal Peer 
    Abstract: Ranking universities has become increasingly common in recent years as it is considered a significant source of comparative information for various stakeholders. The three main university rankings differ by methodology and results since different parameters are considered. In this article, data envelopment analysis (DEA) is used to obtain a unified ranking of universities based on the data of these ranking systems. Due to the absence of input measures in the data set, DEA-WEI (Without Explicit Input) models are studied. In order to consolidate the classification, the established rankings of the three ranking systems, which are ordinal data, are considered. As such, we suggest a new approach to rank the universities in situations where imprecise data and only output measures are present.
    Keywords: Data Envelopment Analysis; DEA; University rankings; Imprecise Data; Ordinal Data; Without Explicit Inputs; WEI.

  • The construction of smart city information service system in the era of big data
    by Cong Li 
    Abstract: To promote more scientific and standardized urban and rural construction and effectively enhance its overall quality and efficiency, a smart city information service system is constructed under the background of big data. This paper studies the problems existing in the construction of smart city information service systems in cities across the country, including unreasonable construction planning, obstacles to the integration of information resources, insufficient awareness of innovation and reform, low social participation and funding that struggles to meet construction needs. Following the construction objectives, principles and modes, a smart city information service system is built in the context of big data, and smart city information is integrated using a data clustering algorithm. The resulting smart city information service is of high quality and enjoys high public recognition.
    Keywords: Big data era; Smart city; Information service; PEST analysis; Text analysis.
    DOI: 10.1504/IJDS.2023.10053182
  • Application of Adaptive Back Propagation Neural Network Algorithm in Vehicle Scheduling of Logistics Enterprises
    by Tianming Zu 
    Abstract: With the rapid development of modern logistics, customers have higher and higher requirements for order delivery. With the increasing logistics pressure, the logistics vehicle scheduling problem has become the focus of the industry to ensure the timeliness and smoothness of logistics. Based on this, a vehicle scheduling model based on Self-Adaptation Back Propagation (SABP) is constructed. The results show that the prediction accuracy rate of the model established in the research is 96.5%, which is much higher than the prediction accuracy rate of the traditional Support Vector Machine (SVM) model and the traditional BP neural network model. The SABP model can reach the expected accuracy after 208 iterations, and the number of iterations is much lower than the other two models. The experiment shows that the model can accurately predict the shortest path and complete the distribution with the lowest cost.
    Keywords: BP neural algorithm; Adaptive; Logistics enterprise; Vehicle scheduling.
    DOI: 10.1504/IJDS.2023.10053344
  • The importance of trade openness and logistics performance in economic growth: A Lasso-based Approach
    by Youqin Pan, Jain Gu, David Goodof 
    Abstract: This study explores the impacts of logistics performance and trade openness on economic growth. The Least Absolute Shrinkage and Selection Operator (Lasso) regression models were applied to perform variable selection for feature variables using logistics performance index (LPI), GDP, and trade openness (OP) during the periods when LPI data are available from 2010-2018. The results reveal that infrastructure, timeliness, and trade openness are major factors that affect a country's economic growth. Moreover, interactive terms between trade openness and specific logistics indicators such as customs, infrastructure, and timeliness are statistically significant. The main policy implication is that continuous improvement in logistics infrastructure and timeliness leads to positive economic growth. Additionally, trade openness needs to be factored into logistics decision making to better boost economic development because greater trade openness may not lead to economic growth due to negative interactive effects.
    Keywords: LPI index; LASSO; Trade Openness; Infrastructure; Logistics Performance; Timeliness.

  • Performance Assessment of Mumbai Indians & Royal Challengers Bangalore in Indian Premier League by Computational Data Analysis
    by Vikas Khare 
    Abstract: The Indian Premier League (IPL) is a professional Twenty20 cricket league in India that features eight teams from eight different cities. The Mumbai Indians (MI) and Royal Challengers Bangalore (RCB) are IPL franchise cricket teams based in Mumbai and Bangalore, respectively. This paper presents a performance assessment of the two teams using a data analysis process based on the NCSS tool. The comparison is motivated by the fact that both teams spent almost the same amount of money on their players and have players of the same level, yet their performances differ in many ways. Mumbai Indians won the IPL five times, while the best performance of Royal Challengers Bangalore was runner-up in 2009 and 2016. The paper collects data for the period 2008-2020 and assesses the year-wise salary of players, win% and batting and bowling performance of both teams. All the statistical and descriptive analysis shows that MI is a much better team compared to RCB. Results show that in the future the win% of MI will be approximately 62% and the win% of RCB will be only 46%.
    Keywords: NCSS Tool; Regression Analysis; Michaelis–Menten concept; Non-linear Regression.

  • On representation of preferences
    by Erio Castagnoli, Marzia De Donno, Gino Favero, Paola Modesti 
    Abstract: A representation theorem proven by G. Debreu in 1960, although somehow neglected by the literature, implies several deep and unexplored consequences both for Economics and for Decision Theory. This paper focuses on some of them. In particular, possible decompositions of state-dependent utilities
    Keywords: Debreu's Theorem; Representation of preferences; Sure Thing principle; State-dependent utility; Benchmarking; Certainty equivalents; Incomplete preferences.
    DOI: 10.1504/IJDS.2023.10054815
  • Research on the driving mechanism of business model innovation of startups based on big data analysis in the context of digital economy
    by Hua Tian  
    Abstract: The era of big data is a new operating environment for enterprises, and business model innovation has been a hot topic in both academic and practical circles, but the mechanism of business model innovation in big data-driven scenarios is still a "black box". We use discourse analysis to refine the mechanism of business model innovation based on 100 big data cases. A model of 16 discourse elements in four categories (data, behavior, attributes and scenarios) is obtained, and the relationship paths between these discourse elements and categories are mapped. This paper evaluates the change of business model innovation of the company using the entropy-mutation level method, selecting financial data from 2007 to 2020. It is found that the evaluation results are consistent with the development history of the company, and the development of the company's business model is characterized by "overall stability and phase fluctuation".
    Keywords: big data; enterprises; financial data; business model.
    DOI: 10.1504/IJDS.2023.10054897
  • Evaluation of cigarette market state based on multi-source data modeling
    by Taicheng Wei, Hao Chen, Yuting Ou, Chen Zhang, Haiying Li, Yue Huang, Yanbing Liu 
    Abstract: Traditional cigarette market forecasting models usually have low accuracy because they do not take external data into account. Thus, a random forest was first used to extract features of the data and rank the importance of influencing factors. Then, different external factors were eliminated, the percentage of reduced model interpretation was demonstrated, and expert feedback was introduced to input evaluation values. After optimizing the training of the RF-LSTM model, the prediction of the whole market sales status was finally constructed, and the historical week cigarette market status evaluation model was also established. The proposed machine learning model had high prediction accuracy and generalization based on the local market data in Guangxi province, China. Overall results demonstrated that it can accurately and conveniently evaluate the market status of cigarettes.
    Keywords: Multi-source data; Cigarette market; Evaluation; Deep learning; Machine learning.
    DOI: 10.1504/IJDS.2023.10054898