International Journal of Knowledge Engineering and Data Mining (4 papers in press)
Application and Comparison of Neural Network, C5.0, and Classification and Regression Trees (CART) algorithms in the Credit Risk Evaluation Problem (Case Study: A Standard German Credit Dataset)
by Mahdi Massahi Khoraskani, Fahimeh Kheradmand, Alireza Arshadi Khamseh
Abstract: As global economic stability declines, banks' demand for predicting their customers' credit risk has increased significantly, and the task has become more critical and more challenging than ever. This paper addresses the problem of evaluating the credit risk of bank customers using data mining tools. Three classification techniques are applied: Neural Network, C5.0, and Classification and Regression Trees (CART) algorithms. To evaluate the performance of these classification techniques, an innovative two-stage evaluation process is proposed. First, the optimal configuration of each algorithm is found by tuning its parameters. Second, the tuned algorithms are ranked by the Analytical Hierarchy Process (AHP) method using four criteria: overall accuracy, precision, sensitivity, and specificity. As a case study, a standard German credit dataset is used to validate the performance of the proposed algorithms. The results show that the Neural Network algorithm is the superior algorithm for evaluating bank customers' credit risk.
Keywords: credit risk evaluation; data mining; classification; neural networks; C5.0; classification and regression trees; CARTs; analytical hierarchy process; AHP.
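The four ranking criteria named in the abstract above are all derived from a binary confusion matrix. The following is a minimal sketch of how they are commonly computed, not the authors' implementation; the function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute four standard criteria from binary confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives. Which class counts as
    "positive" (e.g. a bad credit risk) is an assumption of the caller.
    """
    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    total = tp + fp + tn + fn
    return {
        "accuracy": ratio(tp + tn, total),      # share of correct predictions
        "precision": ratio(tp, tp + fp),        # correct among predicted positives
        "sensitivity": ratio(tp, tp + fn),      # correct among actual positives
        "specificity": ratio(tn, tn + fp),      # correct among actual negatives
    }


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=80, fp=20, tn=60, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A ranking step such as AHP would then weight these four values per algorithm; the weights depend on pairwise comparisons that the abstract does not give, so they are left out here.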
Rare Association Rule Mining: A Systematic Review
by Anindita Borah, Bhabesh Nath
Abstract: One of the indispensable tasks of data mining is the extraction of significant and meaningful association rules. While the extraction of frequent patterns using association rule mining is an important field of research, the idea of generating patterns that do not appear frequently in a database has attracted the attention of researchers in recent years. Infrequent items, more commonly known as rare items, represent unknown or unpredictable associations and are therefore more interesting than frequent ones. This study aims to provide a broad systematic review of the area of rare association rule mining. In this paper, a methodical analysis of rare itemset and rare rule generation techniques in static and dynamic environments is presented. The paper also describes the current status and future perspectives of rare association rule mining, along with some major research challenges.
Keywords: Association rule; Rare itemset; Rare association rule; Rare pattern; Systematic review.
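To make the notion of a rare itemset concrete, here is a minimal brute-force sketch. It assumes the common definition that an itemset is rare when its support (the fraction of transactions containing it) falls below a chosen threshold; it is not one of the techniques surveyed in the review. The function names, the `max_support` parameter, and the toy transactions are hypothetical.

```python
from itertools import combinations


def itemset_supports(transactions, max_size=2):
    """Return the support of every itemset (up to max_size items) that occurs."""
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))  # de-duplicate and fix item order
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] = counts.get(itemset, 0) + 1
    n = len(transactions)
    return {itemset: count / n for itemset, count in counts.items()}


def rare_itemsets(transactions, max_support, max_size=2):
    """Keep itemsets that occur at least once but with support below max_support."""
    supports = itemset_supports(transactions, max_size)
    return {i: s for i, s in supports.items() if s < max_support}


# Toy data for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "butter"],
    ["bread", "milk"],
    ["bread", "caviar"],
    ["milk", "butter"],
]
rare = rare_itemsets(transactions, max_support=0.4)
for itemset, support in sorted(rare.items()):
    print(itemset, support)
```

Enumerating all combinations grows exponentially with transaction length, which is one reason dedicated rare-pattern algorithms exist; this sketch is only suitable for small examples.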
Evaluating performance with implementation of virtualized data in the cloud using metaheuristic approach
by Jyoti Prakash Mishra, Sambit Kumar Mishra
Abstract: Cloud computing provides access to cloud resources such as storage and business applications from anywhere, simply by connecting to the internet. However, several issues come along with the benefits of cloud computing. Cloud computing has not only reshaped the field of distributed systems but has also extended the potential of businesses. A virtual machine enables the abstraction of an operating system and the applications running on it from the hardware. Virtualisation provides basic features such as flexibility, cost effectiveness, and scalability. The main goal of virtualisation is to manage workloads by transforming traditional computing units to make them more scalable. It can be applied to a wide range of system layers, including operating system level and hardware level virtualisation. This article aims to evaluate the performance of virtualised data and to identify the security issues of virtualisation in cloud computing environments.
Keywords: Virtualization; datacenter; throughput; continuous query; metaheuristic; distributed query; response time.
Fuzzy logic based framework for measuring strength of sentiments in web data
by Anil Kumar
Abstract: The number of internet users is growing exponentially, and the internet provides a platform where users can share ideas, experiences, or feedback regarding any product, service, or event. Nowadays, social media such as Facebook, Twitter, blogs, and micro-blogs are very popular media for informal discussion and feedback. Some studies show that informal reviews, feedback, and discussion should be considered for effective decision making. Sentiment may be positive or negative. Previous research shows that these two types of emotion can be measured by noting or counting certain words, such as not, no, bad, and good, in the conversation text. However, a set of these words with crisp set characteristics, or the mere occurrence of these words, does not explain the degree of sentiment. Therefore, this study develops an approach to measure the degree of sentiment of web data using fuzzy logic, based on the membership values of linguistic hedges (e.g., very, mostly, small, highly) in a fuzzy set. This approach provides a granularity of sentiment that is very useful for effective and reliable decision making.
Keywords: sentiment analysis; opinion mining; linguistic hedges; fuzzy logic; natural language processing; NLP.
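The idea of linguistic hedges modifying a fuzzy membership value can be illustrated with the classical Zadeh operators, where "very" squares the membership (concentration) and "somewhat" takes its square root (dilation). This is a minimal sketch under that assumption and may differ from the framework in the paper; the linear membership function `mu_good`, the hedge table, and the example score are hypothetical.

```python
import math


def mu_good(score):
    """Hypothetical membership of a normalised score in the fuzzy set 'good'."""
    return min(1.0, max(0.0, score))  # clamp to [0, 1], linear in between


# Classical Zadeh hedge operators (an assumption, not the paper's definitions).
HEDGES = {
    "very": lambda mu: mu ** 2,             # concentration
    "somewhat": lambda mu: math.sqrt(mu),   # dilation
}


def apply_hedge(hedge, membership):
    """Return the membership value after applying a linguistic hedge."""
    return HEDGES[hedge](membership)


score = 0.9  # illustrative normalised sentiment score
base = mu_good(score)
print("good:", base)
print("very good:", apply_hedge("very", base))
print("somewhat good:", apply_hedge("somewhat", base))
```

Under these operators the same score belongs less strongly to "very good" than to "good", which is what gives a graded rather than crisp reading of sentiment.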