Forthcoming and Online First Articles

International Journal of Data Science (IJDS)

Forthcoming articles have been peer-reviewed and accepted for publication but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

Online First articles are published online here, before they appear in a journal issue. Online First articles are fully citeable, complete with a DOI. They can be cited, read, and downloaded. Online First articles are published as Open Access (OA) articles to make the latest research available as early as possible.

Articles marked with this Open Access icon are Online First articles. They are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

International Journal of Data Science (8 papers in press)

Regular Issues

  • Bayesian Survival Analysis of Under-Five Pneumonia Patients in Tercha General Hospital, Dawro Zone, South West Ethiopia
    by Lema Abate, Megersa Tadesse 
    Abstract: Pneumonia is among the leading causes of death in children under five worldwide. In developing countries, 3 million children die each year due to pneumonia. Ethiopia is one of the 15 countries with a high pneumonia burden. The aim of this study was to examine the risk factors affecting the survival time of under-five pneumonia patients using a Bayesian approach. A total of 281 under-five pneumonia patients were included in this study. Parametric survival models with Weibull, lognormal, and log-logistic baseline distributions were fitted to the data by introducing prior distributions. The deviance information criterion (DIC) was used to compare the baseline distributions, and on that basis the Weibull baseline distribution was selected as a good fit for the under-five pneumonia dataset. The results from the Weibull survival model showed that urban residence and admission when the patient-to-nurse ratio (PNR) was at its minimum prolonged the time to death, while admission during the spring and summer seasons, comorbidity, and severe acute malnutrition (SAM) shortened it. Sex, residence, season of diagnosis, comorbidity, SAM, patient referral status, and PNR were associated with the survival time of under-five pneumonia patients in this study. The concerned bodies should give attention to the factors identified in this study to prevent the mortality of under-five children due to pneumonia.
    Keywords: Pneumonia; Under-Five; Parametric Models; Risk Factors; Bayesian approach; WinBUGS.

  • Telecom Fraud Detection with Big Data Analytics
    by Duygu Sinanc Terzi, Seref Sagiroglu, Hakan Kilinc 
    Abstract: The rapid development of telecom has also led to an increase in fraud, which causes both revenue and reputation losses. For this reason, this paper proposes a new telecom fraud detection model based on deviations in user behaviour expressed through time-varying signatures. Based on the similarity of these deviations to known frauds, a suspect list is created and reported to fraud experts for the final decision. The proposed model was developed with the MapReduce parallel programming paradigm, which provides simplicity and flexibility for large-scale applications. Finally, the model was applied to the call detail records of a telecom company. The results show that the proposed approach detects telecom fraud with an 86% success rate and is suitable for integration into a real-world fraud management system.
    Keywords: telecom fraud detection; big data analytics; signature based user profiling; behaviour analysis.

  • Tourist volume prediction using Data Mining techniques and Change Point detection for Sri Lanka
    by Pavithra Basnayake, Nishani Chandrasekara 
    Abstract: Sri Lanka, at the heart of the Indian Ocean, attracts tourists from around the world. This study investigates the behavior of tourist arrivals in Sri Lanka using data mining techniques and change point analysis, with the main objective of forecasting tourist volume. A Time Delay Neural Network (TDNN) and a Feedforward Neural Network (FFNN) with the Levenberg-Marquardt (LM) and Scaled Conjugate Gradient (SCG) algorithms were applied in forecasting, and two windows (WA and WB) were identified through change point detection. For the entire study period, the FFNN with the LM algorithm showed the best performance. A change point was detected in the data in October 2011. For WA, no model clearly performed better, owing to fluctuations in tourist arrivals caused by terrorist activities. In WB, the best-performing model was the FFNN with the LM algorithm. This study will assist tourism-related industries in their future plans and support the development of infrastructure and the economy.
    Keywords: Change point analysis; Data mining; Forecasting; Sri Lanka; Tourism.

Special Issue on: ETMS2018 and ETMS2019 Data Analytics in Engineering and Management

    by Selim COREKCIOGLU, Bekir POLAT 
    Abstract: Small and medium-sized enterprises (SMEs) have an important place in the economy, since 99.8% of businesses in Turkey are SMEs. Survival is important for SMEs, especially newly founded enterprises. To help SMEs survive, KOSGEB, the SME development organization in Turkey, provides entrepreneurs with three years of support. However, supported entrepreneurship projects still fail and waste the resources allocated to them. This study aimed to prevent this waste of resources and to estimate the success or failure of proposed entrepreneurship projects with data mining algorithms. Thereby, the accuracy of the estimates increased and decisions about the projects were based on a scientific approach. The data for the study consisted of projects evaluated by the KOSGEB Gaziantep Directorate between 2012 and 2014, analyzed using features such as age, gender, experience, education, partnership structure, market, location, sector, personnel, and capital. The analysis examined whether the entrepreneurial projects were successful or not. The data obtained from the entrepreneurship projects were pre-processed and adapted to WEKA 3.9.2 software. The dataset was classified using 10-fold cross-validation with the C4.5, Naive Bayes, Logistic Regression, Random Forest and Support Vector algorithms. The classification results were compared, and the C4.5 algorithm was found to be the most successful, with 70.75% prediction accuracy. According to the C4.5 algorithm, the features affecting the tree were, in order, capital, partner, location, and age. The features that did not affect the tree were gender, education, market, sector, and personnel.
    Keywords: Entrepreneurship; SME; Data Mining; Classification.

    by Karina Pagan, Natália Pagan, Janaina Giraldi, Jorge Oliveira 
    Abstract: Neuromarketing is a recent area of research that is being increasingly addressed by academics and business professionals. There are several techniques that can be employed for specific purposes. This research focused on the neurofeedback technique known as electroencephalography (EEG). This technique has several advantages when compared to other neurofeedback techniques, such as greater mobility and freedom of the participants, greater possibility of creating experimental situations, high recording speed and lower costs. However, data analysis using this technique is considered complex. Therefore, this research had the objective of identifying which techniques of electroencephalography analysis were applied in research in the marketing area, and found that several techniques are used, such as ERP, time-frequency analysis and frontal asymmetry. This subject has not yet been explored in any research. This study can help future academic and market neuromarketing research to identify the appropriate technique to analyze the data and to fulfill the research objectives.
    Keywords: Marketing; Neuromarketing; Electroencephalography; Methods of analysis; Neurofeedback; ERP; Time-frequency analysis; Frontal asymmetry; ICA; Fourier Transformation.

    Abstract: Supply chain optimization has been the subject of many scientific studies due to its mathematical structure. This study examines optimization studies that model the components of the supply chain and support decision-making in supply chain management problems. These studies are classified in terms of their mathematical model structures, and the optimization and complexity of supply chain problems are discussed. For the optimization of the supply chain, an innovative approach is proposed that uses the succession relationship between the components of the supply chain. In accordance with the proposed approach, a solution method is described and its results are shown on a sample problem. The proposed method is intended to offer a new approach suitable for solving complex problems, especially supply chain optimization problems, and to contribute to finding better solutions. Development opportunities are evaluated by examining the results.
    Keywords: Supply Chain Management; Optimization; Butterfly Effect Algorithm.

  • Effects of Discount Policies on Economic Order Quantity and Total Cost for Perishables: A Case Study
    by Didem Guleryuz, Sakir Esnaf 
    Abstract: The classical Economic Order Quantity model assumes that demand stays constant over time. However, the demand for perishable goods in real life may change depending on product freshness, since customers will always prefer to purchase fresh goods at the same price. Hence, carrying cost is important when calculating the economic order quantities of such goods. This study examines how changes in the economic order quantity resulting from discount policies for perishable goods affect total cost, and suggests a discount. Data for play dough sold at a retail store in Turkey were used; discounts due to shelf life and additional costs due to perishing were added to the Economic Order Quantity formula via the Weiss model to calculate the order quantity. Total costs were then determined and compared across different discount rates. As a result, the total costs for the classical order quantity and for the order quantity including discount policies were calculated as 634.43 TRY and 641.32 TRY for decorative play dough, and 1672.695 TRY and 1732.830 TRY for standard play dough, respectively. Although the classical model has a lower cost because it does not consider the discount policy or the perishing rates, it is not suitable for perishable products in real life.
    Keywords: Economic order quantity model; perishable items; Weiss model.

  • A Study on Severity of Traffic Accidents using Road, Weather and Time Characteristics
    by Zeynep Burcu Kizilkan, Ahmet Erdogan Asliyuce, Tugay Cengiz, Ugur Can Ersen 
    Abstract: Mortality and severe injuries caused by traffic accidents are among the vital threats to society; therefore, the factors contributing to accidents are a major concern. Traffic accident severity can be understood through many attributes, such as human factors, road characteristics, weather, and accident time. Artificial neural networks (ANNs) are more practical to implement than other algorithms when using categorical data. Accordingly, ANNs are among the well-researched and widely applied techniques for traffic accident prediction models and for determining the contributing factors of traffic accidents. Using ANNs to determine risk levels is an efficient way to obtain accurate results. While past research on similar topics has focused predominantly on human factors, this paper aims to build a model to observe the impact of road, weather, and time characteristics, rather than human factors, on traffic accident risk levels. Two models are constructed using ANNs and their performance is compared. The results indicate that the ANN model reached a satisfactory certainty level. In future work, this model can be developed into a prevention system for use by governmental institutions.
    Keywords: Accident Severity; Artificial Neural Networks; Traffic Accident; Machine Learning; Supervised Learning; Prevention System.