Forthcoming and Online First Articles

International Journal of Data Science (IJDS)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Online First articles are published online here before they appear in a journal issue. They are fully citable, complete with a DOI, and can be read and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with the Open Access icon are Online First articles. They are freely available and openly accessible to all, subject only to the restrictions stated in their respective CC licenses.

International Journal of Data Science (3 papers in press)

Regular Issues

  • Prediction of Customer Churn Risk with Advanced Machine Learning Methods
    by Oguzhan Akan, Abhishek Verma, Sonika Sharma 
    Abstract: Customer churn risk prediction is an important area of research because it directly affects the revenue stream of businesses. The ability to predict customer churn allows businesses to devise better strategies to retain existing customers. In this research we perform a comprehensive comparison of feature selection methods, upsampling methods, and machine learning methods on the customer churn risk dataset. (i) Our research compares likelihood-based, tree-based, and layer-based machine learning methods on the churn dataset. (ii) Models built on the churn dataset without upsampling performed better than those built with oversampling methods. However, SMOTE and ADASYN helped stabilize model performance. (iii) The models built on the ADASYN dataset were slightly better than their SMOTE counterparts. (iv) XGBoost and Deep Cascading Forest combined with XGBoost were consistently better than the other methods across all metrics. (v) Information Value analysis performed better than PCA. In particular, the IVR DCFX model has the best AUROC score, at 89.1%.
    Keywords: Customer Churn; Deep Neural Networks; Deep Cascading Forest; SMOTE; ADASYN.
    DOI: 10.1504/IJDS.2024.10064744
     
  • Self-Evolving Data Collection Through Analytics and Business Intelligence to Predict the Price of Cryptocurrency
    by Adam Moyer, William A. Young II, Timothy J. Haase 
    Abstract: This article presents the Self-Evolving Data Collection Engine through Analytics and Business Intelligence (SEDCABI) for predicting Bitcoin prices. Traditionally, models use either structured or unstructured data alone, which limits their effectiveness. This research pioneers the use of both data types. SEDCABI harnesses analytics and BI to extract insights from structured historical price and market data. It also incorporates unstructured social media sentiment and news to capture perceptions of Bitcoin. Experiments show that integrating both data types significantly improves prediction accuracy. SEDCABI continuously adapts to the dynamic crypto market, and its plug-in prediction module enables customization. Overall, SEDCABI offers robust Bitcoin price predictions by combining structured and unstructured data. This contributes to cryptocurrency prediction research with an innovative approach to informed decision-making.
    Keywords: SEDCABI; Prediction; Bitcoin; Cryptocurrency; Text Mining; Analytics; Business Intelligence; Unstructured Data; Sentiment; Price.
    DOI: 10.1504/IJDS.2024.10064877
     
  • Comparison and Database Performance Optimisation Strategies Based on NSGA-II Genetic Algorithm: MySQL and OpenGauss
    by Ming Tang, Lincheng Qi, Sibo Bi, Xinyun Cheng, Shijie Zhang 
    Abstract: In response to the lack of dynamic adjustment and optimisation capabilities for real-time environmental changes in database performance optimisation strategies, as well as poor query throughput and response time performance, this paper adopted NSGA-II (Non-dominated Sorting Genetic Algorithm II) to study performance optimisation of MySQL (My Structured Query Language) and OpenGauss databases. Firstly, it defined three objective functions and corresponding constraints for database query response time, query throughput, and query resource utilisation, and calculated the fitness of each individual and the crowding distance of each layer. Then, the tournament rotation method was used to select parents with high fitness, and the crossover and mutation probabilities were set. Finally, the optimal parameter configuration of the database was output. The experiment was based on the TPC-DS (Transaction Processing Performance Council Decision Support Benchmark) dataset and compared the performance of MySQL and OpenGauss databases under different parameter configurations. The experimental results show that after optimisation by the NSGA-II genetic algorithm, MySQL and OpenGauss databases have certain improvements in query throughput, query response time, and query resource utilisation. Moreover, the optimisation effect on the MySQL database was as high as 90.30%, which is more significant than that on the OpenGauss database.
    Keywords: Database Performance Optimization; MySQL and OpenGauss; Non-dominated Sorting Genetic Algorithm II; Query Response Time; Dynamic Adjustment Capability.
    DOI: 10.1504/IJDS.2024.10065423
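
For readers unfamiliar with the crowding distance mentioned in the last abstract above, the following is a minimal sketch of the standard NSGA-II crowding-distance calculation for a single non-dominated front. It is not the authors' implementation: the function name `crowding_distance` and the sample objective values are illustrative assumptions, and no database parameters or measurements from the paper are reproduced here.

```python
from typing import List, Sequence


def crowding_distance(front: Sequence[Sequence[float]]) -> List[float]:
    """Standard NSGA-II crowding distance for one non-dominated front.

    `front` holds one objective vector per individual (hypothetical values,
    e.g. response time and resource utilisation). Boundary individuals on
    each objective receive infinite distance so they are always preferred.
    """
    n = len(front)
    if n == 0:
        return []
    distances = [0.0] * n
    num_objectives = len(front[0])
    for m in range(num_objectives):
        # Indices of individuals sorted by their value on objective m.
        order = sorted(range(n), key=lambda i: front[i][m])
        lowest = front[order[0]][m]
        highest = front[order[-1]][m]
        distances[order[0]] = float("inf")
        distances[order[-1]] = float("inf")
        if highest == lowest:
            # All individuals tie on this objective; it adds no spread.
            continue
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            distances[order[k]] += gap / (highest - lowest)
    return distances


# Illustrative two-objective front (made-up numbers).
example_front = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
example_distances = crowding_distance(example_front)
```

In NSGA-II, individuals with a larger crowding distance lie in sparser regions of the front, so selection between two candidates of equal rank favours the one with the larger value, which helps preserve diversity among the trade-off solutions.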