International Journal of Knowledge Engineering and Data Mining (5 papers in press)
Application and Comparison of Neural Network, C5.0, and Classification and Regression Trees (CART) algorithms in the Credit Risk Evaluation Problem (Case Study: A Standard German Credit Dataset)
by Mahdi Massahi Khoraskani, Fahimeh Kheradmand, Alireza Arshadi Khamseh
Abstract: As global economic stability declines, banks' demand for predicting their customers' credit risk has increased significantly, and the task has become more critical, yet more challenging, than ever. This paper addresses the problem of evaluating the credit risk of bank customers using data mining tools. Three classification techniques are applied: Neural Network, C5.0, and Classification and Regression Trees (CART). To evaluate the performance of the classification techniques, an innovative two-stage evaluation process is proposed. First, the optimal configuration of each algorithm is found by tuning its parameters. Second, the tuned algorithms are ranked by the Analytical Hierarchy Process (AHP) method using four criteria: overall accuracy, precision, sensitivity, and specificity. As a case study, a standard German credit dataset is used to validate the performance of the proposed algorithms. The results show that the Neural Network algorithm is the superior algorithm for evaluating bank customers' credit risk.
Keywords: credit risk evaluation; data mining; classification; neural networks; C5.0; classification and regression trees; CARTs; analytical hierarchy process; AHP.
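The four ranking criteria named in the abstract above are all derived from a binary confusion matrix. The following is a minimal sketch of how they are commonly computed; the function name and the counts in the usage example are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute four common criteria from confusion-matrix counts.

    tp/fp/tn/fn are true/false positive/negative counts.
    Assumes every denominator is non-zero.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all cases classified correctly
        "precision": tp / (tp + fp),      # share of predicted positives that are real
        "sensitivity": tp / (tp + fn),    # share of real positives that are found
        "specificity": tn / (tn + fp),    # share of real negatives that are found
    }


# Made-up counts, for illustration only
metrics = classification_metrics(tp=40, fp=10, tn=30, fn=20)
```

Which class counts as "positive" (for example, a bad credit risk) is a modelling choice; swapping it exchanges sensitivity and specificity.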
Designing sales budget forecasting and revision system by using optimisation methods
by K. Koochakpour, M.J. Tarokh
Abstract: Sales procedures are the most important factors for keeping companies alive and profitable, so sales and sales budgets are considered important parameters influencing all other decision variables in an organisation. Poor sales forecasting can therefore lead to great losses caused by inaccurate and non-comprehensive production and human resource planning. In this research, a coherent solution is proposed for forecasting sales, and for continuously refining and revising the forecast, using an adaptive neuro fuzzy inference system (ANFIS) model that takes time series relations into account. Data were collected from the publicly accessible annual financial reports of a well-known Iranian company. To improve forecasting accuracy, the solution was also examined with back propagation neural network (BPN) and particle swarm optimisation (PSO) as optimisation methods. Comparing the predictions with the real data shows that the PSO method optimises some parts of the prediction, whereas the remaining parts agree more closely with the output of the BPN analysis. Consequently, a hybrid integrated system including both methods has been designed. The system applies each method to the part it optimises best, so it can produce relatively more precise results.
Keywords: sales forecast; adaptive neuro fuzzy inference system; ANFIS; time series analysis; PSO and BPN methods; hybrid method.
Pattern mining and process modelling of collaborative interaction data in an online multi-tabletop learning environment
by Parham Porouhan, Wichian Premchaiswadi
Abstract: This research builds on the intersection of a web-based (online) multi-interactive multi-tabletop collaborative environment (called M-ITCL) and process discovery algorithms from process mining, applied to collaborative interaction data (event logs) previously collected from an authentic learning classroom. The main focus of the study was to investigate which process mining algorithm generates process models that differentiate (replay) the events correctly with a 100% level of fitness, precision, generalisation and simplicity. The results showed that the alpha algorithm generated process models with good simplicity but poor precision and generalisation. The heuristic algorithm generated process models with good precision but poor generalisation and simplicity. The fuzzy algorithm generated rather simple process models with good precision and generalisation. Moreover, the models/graphs generated by the fuzzy algorithm could differentiate all of the cases correctly with a 100% level of fitness as a validation measure.
Keywords: human-computer interaction; process mining; computer-supported collaborative learning; educational data mining; alpha algorithm; heuristic miner algorithm; fuzzy miner; analysis of collaborative interactions; interactive table computers; tabletops; concept mapping.
Detecting crime patterns from Swahili newspapers using text mining
by George Matto, Joseph Mwangoka
Abstract: The Tanzania Police Force, like many other law enforcement agencies in developing countries, relies mostly on manual methods, personal judgments, and other inadequate tools to analyse the data in its crime databases. This approach is prone to errors. Moreover, research shows that more than half of all crimes committed in Tanzania are not reported to the police, so they are unlikely to be analysed by the police. In this study, we use text mining to extract crime patterns from sources of crime data outside police databases, specifically four Swahili daily newspapers. Using the pattern mining model we developed, we extracted several crimes reported in the newspapers and mapped the distribution of the mined crimes country-wide. Using FP-growth, we then generated association rules between the mined crimes. Results from this study will contribute to crime detection and prevention strategies.
Keywords: crime; crime patterns; text mining; association rules; FP-growth.
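An association rule such as "theft implies assault" is usually scored by its support and confidence. The sketch below computes both by direct counting over a small set of transactions. It is not FP-growth, which finds frequent itemsets far more efficiently, and the function name and sample data are hypothetical rather than drawn from the study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    transactions: non-empty list of sets of items (here, crime types
    mentioned together in one article).
    """
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(transactions)
    # Transactions containing every item of the rule
    both = sum(1 for t in transactions if antecedent | consequent <= t)
    # Transactions containing the antecedent only
    ante = sum(1 for t in transactions if antecedent <= t)
    support = both / n
    confidence = both / ante if ante else 0.0
    return support, confidence


# Made-up articles, for illustration only
articles = [
    {"theft", "assault"},
    {"theft"},
    {"theft", "assault", "fraud"},
    {"fraud"},
]
support, confidence = rule_metrics(articles, {"theft"}, {"assault"})
```

Here the rule appears in 2 of 4 articles (support 0.5) and holds in 2 of the 3 articles that mention theft (confidence about 0.67).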
An experimental design for optimising the degree of shared leadership in senior engineering design teams
by Brian J. Galli, Francisco J. Santos-Arteaga, Mohamad Amin Kaviani, Cyrus Mohebbi
Abstract: We present an experimental design approach for identifying the optimal levels of the key internal and external environmental conditions of shared leadership, which, in turn, would enable an engineering team to develop and maximise the degree of shared leadership. A full factorial design is applied to evaluate four control factors (shared purpose, social support, voice and external coaching) at two levels and analyse one output metric, defined as 'team centralisation'. Our analysis shows that the four factors are the main contributors to the output metric. However, interactions between the factors are either weak contributors or do not contribute at all to the performance of the output metric. We conclude that having high levels of the three internal team environmental factors and the external coaching environmental condition results in the greatest degree of shared leadership in the senior design environment.
Keywords: shared leadership; experimental design; organisational design; team-based structure; design team.
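A full factorial design with four factors at two levels requires 2^4 = 16 runs, one per combination of levels. The sketch below enumerates those runs using the four factors named in the abstract above; the level labels "low" and "high" and the function name are assumptions made for illustration.

```python
from itertools import product

# Factor names come from the abstract; level labels are assumed
FACTORS = ["shared purpose", "social support", "voice", "external coaching"]
LEVELS = ("low", "high")


def full_factorial(factors, levels):
    """Return every combination of levels across all factors, one dict per run."""
    return [
        dict(zip(factors, combo))
        for combo in product(levels, repeat=len(factors))
    ]


runs = full_factorial(FACTORS, LEVELS)  # 2 ** 4 = 16 experimental runs
```

Each run would then be paired with a measured value of the output metric, which allows main effects and interactions to be estimated.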