International Journal of Intelligent Information and Database Systems (20 papers in press)
Impacts of Feature Selection on Classification of Individual Activity Recognitions for Prediction of Crowd Disasters
by Ali Selamat, Fatai Sadiq, Ondrej Krejcar, Roliana Ibrahim
Abstract: We examined the possibility of feature selection using Statistical Based Time Frequency Domain (SBTFD) extracted features for human activity recognition. The aim is to reduce the dimension of the feature space and remove redundant features, in order to improve accuracy and minimize false negative alarms for crowd disasters. For this, we analyzed and classified 54 SBTFD features obtained from 22,350 instances comprising climb down, climb up, peak shake while standing, standing, still, and walking; as classes V1, V2, to V8, respectively. The individual activity recognition dataset (D1) was collected from 20 students in a well-known institution in Malaysia. In addition, a similar dataset (D2) from a repository was used. This dataset contains 250,936 instances from 9 users for smartphone accelerometer signals. Both datasets were subjected to Minimum Redundancy Maximum Relevance (MRMR), correlation and chi-square techniques to filter the relevant SBTFD features and select effective features that reduce the dimension. Based on the selected features, we applied 10-fold cross validation using WEKA with Random Forest (RF), J48, Sequential Minimal Optimization (SMO) and Naive Bayes (NB) classifiers to classify and predict abnormal behaviour classes V1 to V8. We achieved an excellent accuracy and reduced the false negative rate, to save human lives from crowd disasters, with 7 features of MRMR using RF.
Keywords: Statistical Based Time Frequency Domain (SBTFD); human activity recognitions; Minimum Redundancy Maximum Relevance (MRMR); chi-square; dimensional reductions.
An approach oriented viewpoints for cooperative information system eliciting requirements
by Kahina Kessi, Zaia Alimazighi, Mourad Chabane Oussalah
Abstract: A cooperative information system (CIS) is a complex system that involves the cooperation of several stakeholders who share a common purpose but hold different viewpoints on the system. This makes its development more difficult. The successful design of a CIS therefore depends mainly on the requirements engineering phase. In software engineering, and especially in requirements engineering, viewpoint and abstraction level are two important concepts introduced to reduce system complexity. In this paper, we present an approach oriented towards viewpoints and abstraction levels which defines the concepts necessary to elicit the requirements of a CIS. In doing so, a model driven method is proposed to develop a CIS modelling tool. In this method, we first propose a meta-model oriented towards viewpoints and abstraction levels which decomposes a CIS according to its different viewpoints. We then propose a modelling tool, viewpoints for cooperative information system (VpCIS), generated from the meta-model.
Keywords: viewpoints; abstraction level; needs analysis; requirements engineering; cooperative information system; CIS; model driven method; meta-model; modelling tool.
Application of multi-objective firefly algorithm based on archive learning in robot path planning
by Fan Tanghuai, Wang Jiayuan, Feng Mirong, Zhang Xi, Wang Jiajia, Wu Ruixiu
Abstract: To address the slow convergence and low solution precision of the multi-objective firefly algorithm, we propose a multi-objective firefly algorithm based on archive learning. The algorithm saves the elite particles obtained in each generation in an external archive, and then randomly selects a particle from the external archive as the learning object of the firefly to participate in the population evolution. The algorithm was verified on four test functions, ZDT1, ZDT2, ZDT3 and ZDT6, and evaluated by the IGD comprehensive evaluation index. Experiments show that the modified firefly algorithm not only has a higher ability to escape from local optima, but also displays a significant improvement in convergence speed and solution precision. Our algorithm is more suitable for multi-objective optimization problems of higher complexity. When applied to robot path planning, our modified algorithm yields shorter and smoother paths.
Keywords: Multi-objective; firefly algorithm; external archive; path planning.
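The external archive described in the abstract above can be illustrated with a minimal sketch. It assumes minimization of every objective and a standard Pareto-dominance test; the function names (`dominates`, `update_archive`, `pick_learning_object`) are hypothetical and not taken from the paper, which may bound or prune its archive differently.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, ...]  # one objective vector, all objectives minimized (assumption)


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive: List[Point], candidate: Point) -> List[Point]:
    """Return a new archive that holds only mutually nondominated points."""
    candidate = tuple(candidate)
    # Reject the candidate if an archived point dominates or duplicates it.
    if any(dominates(kept, candidate) or kept == candidate for kept in archive):
        return list(archive)
    # Otherwise drop every archived point that the candidate dominates, then add it.
    survivors = [kept for kept in archive if not dominates(candidate, kept)]
    survivors.append(candidate)
    return survivors


def pick_learning_object(archive: List[Point], rng: Optional[random.Random] = None) -> Point:
    """Pick one archived point uniformly at random as the learning target."""
    if rng is None:
        rng = random.Random()
    return rng.choice(archive)
```

Feeding each generation's elite objective vectors through `update_archive` keeps the archive free of dominated points, and `pick_learning_object` mirrors the random choice of a learning object mentioned in the abstract.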
Chinese Text Classification Based on Character-level CNN and SVM
by Huaiguang Wu, Daiyi Li, Ming Cheng
Abstract: With the rapid development of the Internet, the volume of high dimensional text data has increased rapidly. How to build an efficient and extensible text classification algorithm has become a hot topic in the field of data mining. Aiming at the problems of high feature dimension, sparse data and long computation time in the traditional SVM classification algorithm based on TF-IDF (Term Frequency-Inverse Document Frequency), we propose a novel hybrid system for Chinese text classification, CSVM, which is independent of hand-designed features and domain knowledge. First, words are encoded by constructing a text vocabulary of size m for the input language and then quantizing each word using 1-of-m encoding. Second, we use a CNN (Convolutional Neural Network) to extract the morphological features of the character vectors of each word, and then obtain the semantic features of each word vector through training on large scale text material. Finally, the text classification is carried out with an SVM multi-class classifier. Tested on a text dataset with 10 categories, CSVM reaches an average recognition rate of up to 99%. The experimental results show that the CSVM algorithm is more effective than other traditional Chinese text classification algorithms.
Keywords: TF-IDF; SVM; Character-level CNN; Text vectorization; Text classification.
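The 1-of-m encoding step mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the vocabulary is built in order of first appearance, and a token outside the vocabulary maps to an all-zero vector. Both choices, and the function names, are hypothetical rather than taken from the paper.

```python
from typing import Dict, Iterable, List


def build_vocabulary(tokens: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct token an index in order of first appearance."""
    vocab: Dict[str, int] = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def one_of_m(token: str, vocab: Dict[str, int]) -> List[int]:
    """Return a length-m vector with a single 1 at the token's index.

    Unknown tokens yield an all-zero vector (an assumption of this sketch).
    """
    vector = [0] * len(vocab)
    index = vocab.get(token)
    if index is not None:
        vector[index] = 1
    return vector
```

With a vocabulary of size m, each encoded token is a sparse vector of length m, which is the quantized input that a convolutional layer would then consume.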
Improving Named Entity Recognition and Disambiguation in News Headlines
by Jayendra Barua, Rajdeep Niyogi
Abstract: In this paper, we present a framework for the extraction and disambiguation of hyphenated and partially named entities in news headlines. Applying state-of-the-art named entity detection and disambiguation approaches directly to news headlines results in significantly degraded performance, due to headline formatting that differs from regular text, hyphenated mentions, and partial entity mentions. We introduce a novel framework that assists existing named entity recognition and disambiguation systems in dealing with these challenges. In particular, we deal with hyphenated entity mentions and partial entity mentions present in news headlines. We modify the hyphenated and partial entity mentions in a way that increases the probability of disambiguation to the correct entity in the knowledge base. Our technique leverages headlines from the recent past to improve the entity mentions in headlines. The experimental results show that our technique improves the F1-score of mention detection by 12% and 9% in the state-of-the-art Stanford and Illinois NER systems, whereas the F1-score of disambiguation is improved by 9%, 12%, 7% and 5% in the AIDA, Wikifier, TagMe, and YODIE state-of-the-art NED systems, respectively.
Keywords: Information Retrieval; Named Entity Disambiguation; Mention Detection; Mention Modification; News headlines; Natural Language Processing.
A New Method of Event Relation Identification
by Yang Junhui, L.I.U. Zongtian, L.I.U. Wei
Abstract: Aiming at the problem that traditional event relation identification cannot take into account the semantic relations of event structural characteristics, this paper proposes a method for identifying semantic relations based on dependency and co-occurrence. The text is divided into event representations, and the distribution characteristics of the event elements, the co-occurrence of elements, and the dependency relations between text events are used to uncover clues to semantically related events. The event set with related clues is then clustered by the improved AP algorithm. Experiments show that the semantic roles of an event (six elements) allow a more accurate calculation of the degree of dependence and of the co-occurrence overlap ratio of event elements between candidate related events. This helps to produce a richer set of candidate related events and thus improves the ability to recognise event relations.
Keywords: event; event relation; event element; dependency; co-occurrence; AP algorithm.
Semantic image retrieval using random forest based AdaBoost learning
by Vijay Patil, Pramod Deore
Abstract: In this paper, an approach to reducing the semantic gap in CBIR using relevance feedback and random forest based AdaBoost learning is proposed. Initially, user feedback is used to move the query point towards the relevant images and to train the random forest classifier. After that, AdaBoost learning is used to identify the weak classifiers and to assign more weight to weak classifiers in the weighted majority voting scheme. AdaBoost learning is adopted to overcome the prediction variance of the random forest classifier. Experiments performed on the widely used Corel and Caltech databases show that the proposed approach is more efficient, achieving an average precision of 95% in 6 iterations of relevance feedback.
Keywords: content based image retrieval; semantic gap; relevance feedback; random forest learning; adaBoost learning; information retrieval.
Special Issue on: Big Data and Decision Sciences in Management and Engineering
Query optimization in real-time data warehouses
by Issam Hamdi
Abstract: Nowadays the update frequency for traditional data warehouses cannot meet the objectives of real-time data analysis relying on data freshness. To alleviate this problem, the Real-Time Data Warehouse (RTDW) technology has emerged. A RTDW allows decision makers to access and analyze fresh data as fast as possible in order to support real-time decision processes. In this paper, we focus on optimization techniques to speed up query processing; in particular, a query response time optimization and storage space optimization. Then, we propose an architecture called DETL-(m, k)-firm-RTDW architecture (Decentralized Extract-Transform-Load approach based on (m, k)-Firm constraint for Real-Time Data Warehouse). This architecture deals with diversity and disparities in data source systems to reduce the time for ETL and it has threefold objectives: i) guarantee the data freshness, and ii) enhance the deadline miss ratio even in the presence of conflicts and unpredictable workloads. Finally, we evaluate our feedback control scheduling architecture which considers both materialized views and data fragmentation using the TPC-DS (TPC, 2014) benchmark; the preliminary results are quite promising.
Keywords: Real-Time Data Warehouse; Real-Time Transactions; Materialized views; Data partitioning; ETL.
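The (m, k)-firm constraint named in the architecture above is commonly defined as: in every window of k consecutive task instances, at least m must meet their deadlines. The sketch below checks that condition on a recorded deadline history. The function name and the treatment of histories shorter than k (accepted as satisfying the constraint) are assumptions of this illustration, not details from the paper.

```python
from typing import Sequence


def satisfies_mk_firm(met: Sequence[bool], m: int, k: int) -> bool:
    """Check that every window of k consecutive instances has at least m met deadlines.

    met[i] is True when instance i met its deadline.
    """
    if k <= 0 or not 0 <= m <= k:
        raise ValueError("require k > 0 and 0 <= m <= k")
    if len(met) < k:
        return True  # assumption: no complete window exists yet
    # Count met deadlines in the first window, then slide it one instance at a time.
    window = sum(1 for flag in met[:k] if flag)
    if window < m:
        return False
    for i in range(k, len(met)):
        window += int(met[i]) - int(met[i - k])
        if window < m:
            return False
    return True
```

A scheduler could run such a check over the recent history of each update or query stream to decide whether a further deadline miss can be tolerated.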
Special Issue on: Evolutionary Algorithms in Intelligent Systems
Object tracking using the particle filter optimized by the improved artificial fish swarm algorithm
by Zhi-Gao Zeng, Haixing Bao, Zhiqiang Wen, Wenqiu Zhu
Abstract: In the particle filter algorithm, the weight values of the particles gradually decrease as the number of iterations increases, and the variance of the weight values increases. This leads to an increase in the deviation between the estimated state and the true state. To deal with this problem, an improved particle filter algorithm is proposed in this paper, in which an improved artificial fish swarm optimization algorithm is used to optimize the traditional particle filter. In the improved particle filter algorithm, the resampled particles are driven to the region with a high likelihood function to increase the weight values of the particles. Thus, the estimated state is closer to the real state. Experimental results show the advantage of our new algorithm over a range of existing algorithms.
Keywords: object tracking; particle filter; artificial fish swarm algorithm.
Solve the IRP Problem with an Enhanced Discrete Differential Evolution Algorithm
by Shi Cheng, Zelin Wang
Abstract: The inventory-routing problem is an NP-hard problem, so it is difficult to find the optimal solution in polynomial time. Many scholars have studied it over many years. This paper analyzes the inventory-routing optimization problem and the differential evolution algorithm, which performs well in solving combinatorial optimization problems. The differential evolution algorithm is improved to make it suitable for solving discrete combinatorial optimization problems. To improve the performance of the differential evolution algorithm on the inventory routing problem, this paper puts forward dynamic adjustment of the mutation factor and crossover factor of differential evolution. Numerical experiments show that the proposed algorithm has certain performance advantages, and that dynamic adjustment of the mutation factor and crossover factor improves the performance of the algorithm.
Keywords: differential evolution algorithm; inventory routing problem; mutation factor; crossover factor.
Inventory routing optimization using differential evolution with feasibility checking and local search
by Hu Peng, Changshou Deng
Abstract: The inventory routing problem (IRP) is to minimize inventory and transportation costs simultaneously in order to increase the profitability of the system. However, the two costs conflict in most cases, which makes the problem hard to solve. As a promising evolutionary algorithm, differential evolution (DE) has been successfully applied to many real-world optimization problems, but we found that it has not been used to optimize the IRP. In this paper, for the first time, we utilize the DE algorithm to optimize the one-to-many IRP, where a product is shipped from a supplier to a set of retailers over a planning period. In the proposed DEIR algorithm, a solution feasibility checking method, a local search method and an optimal routing method based on DE are designed to suit the IRP. Computational tests have been conducted on 50 benchmark instances. Experimental results and comparisons with different parameter settings show that the proposed algorithm is competitive.
Keywords: Differential evolution; Inventory routing problem; Feasible checking; Local search.
Hybrid Fireworks Algorithm with Differential Evolution Operator
by JINGLEI GUO, Wei Liu, Ming Liu, Shijue Zheng
Abstract: As a population-based intelligence algorithm, the fireworks algorithm simulates the fireworks explosion process to solve optimization problems. A comprehensive study of the enhanced fireworks algorithm (EFWA) reveals that the explosion operator generates too many sparks for the best firework, which limits the exploration ability. A hybrid version of EFWA (HFWA_DE) is proposed by adding the differential evolution (DE) operator. In HFWA_DE, the population is divided into two subpopulations; each subpopulation then evolves separately, one with the FWA operator and one with the DE operator, and they exchange the elitist individual. Experiments on 20 well-known benchmark functions are conducted to illustrate the performance of HFWA_DE. The results show that HFWA_DE outperforms some state-of-the-art FWAs on most test functions.
Keywords: Fireworks Algorithm; DE operator; explosion; exploitation; exploration.
Ant colony optimization with local search for the bandwidth minimization problem on graphs
by Jian Guan, Geng Lin, Huibin Feng
Abstract: The bandwidth minimization problem on graphs (BMPG) is an NP-complete problem, which consists of labeling the vertices of a graph with the integers from 1 to n (n is the number of vertices) such that the maximum absolute difference between labels of adjacent vertices is as small as possible. In this paper, an application of the ant colony optimization with local search is presented to solve the bandwidth minimization problem. The main novelty of the proposed approach is an efficient local search combined with first improvement and best improvement strategies. A fast incremental evaluation technique is employed to avoid excessive fitness evaluations of moves in local search. Computational experiments on 56 benchmark instances show that the proposed algorithm is able to achieve competitive results.
Keywords: metaheuristics; ant colony optimization; bandwidth minimization problem; local search.
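The objective defined in the abstract above (the maximum absolute difference between the labels of adjacent vertices) can be written down directly. The sketch below evaluates a given labeling from scratch; it does not reproduce the paper's ant colony search or its incremental evaluation technique, and the function names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def is_valid_labeling(labels: Dict[Hashable, int]) -> bool:
    """A labeling is valid when it uses each integer from 1 to n exactly once."""
    return sorted(labels.values()) == list(range(1, len(labels) + 1))


def bandwidth(edges: Iterable[Edge], labels: Dict[Hashable, int]) -> int:
    """Largest absolute label difference over all edges (0 for an edgeless graph)."""
    return max((abs(labels[u] - labels[v]) for u, v in edges), default=0)
```

For the path a-b-c-d, labeling the vertices 1, 2, 3, 4 in path order gives bandwidth 1, while swapping the labels of b and c raises it to 2. A local search move would be accepted or rejected by comparing such values.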
Applying Distance Sorting Selection in Differential Evolution
by Yuxiang Shao, Zhe Chen, Yuan Liu
Abstract: Differential Evolution (DE) is well suited to solving continuous optimization problems. However, the imbalance between exploration and exploitation in DE runs often leads to a failure to obtain good solutions. In this paper, we propose distance sorting selection. First, the individual with the best fitness among parents and offspring is selected. Then, the genotype distance from another individual to it, that is, the distance in their chromosome structure, decides whether that other individual is selected. Under the control of an adaptive scheme that we propose, distance sorting selection replaces the original selection of CoBiDE from time to time during runs. Experimental results show that, for many of the twenty-five CEC 2005 benchmark functions that have a similar changing trend of diversity and fitness in runs, our adaptive scheme for calling selection based on distance sorting improves the solutions.
Keywords: Exploration and exploitation balance; Distance; Stagnation; Premature convergence; CoBiDE.
Outlier detection based on cluster outlier factor and mutual density
by Zhang Zhongping, Zhu Mengfan, Qiu Jingyang, Liu Cong, Zhang Debin, Qi Jie
Abstract: Outlier detection is an important task in data mining with numerous applications. In recent years, research on outlier detection has been very active, and many algorithms have been proposed, including clustering-based ones. However, most clustering-based outlier detection algorithms need parameters, and it is very difficult to select a suitable parameter for different data sets. To solve this problem, this paper proposes an outlier detection algorithm based on cluster outlier factor and mutual density, which combines the natural neighbor search algorithm of the Natural Outlier Factor (NOF) algorithm with the Density and Distance Cluster (DDC) algorithm. The mutual density and γ density are used to construct a decision graph. Data points with an anomalously large γ density in the decision graph are treated as cluster centers. The algorithm detects the boundary of outlier clusters using the Cluster Outlier Factor (COF), and it can find the parameter automatically. Experiments show that this method achieves good performance in clustering and outlier detection.
Keywords: data mining; outlier; mutual density; gamma-density; cluster outlier factor.
A new quantum evolutionary algorithm using dynamic rotation angle catastrophe for knapsack problem
by Jialin Li, Wei Li
Abstract: In this paper, a quantum evolution algorithm (IQEA) based on dynamic rotation angle catastrophe technology is proposed to solve the knapsack problem. A quantum revolving gate operator with adaptive dynamic adjustment of the rotation angle is designed according to the evolution generations and fitness values. The population is divided equally into three parts, while the optimal solution of each generation is preserved. Using the quantum rotation angles of different periods in the evolution process, the catastrophe operations of these three parts are carried out, and the parallel evolution of four types of individuals is realized. With the guidance of better individuals, multi-path optimization is performed to improve the parallelism of the algorithm. This effectively increases the diversity of the population, enables multi-directional search, and retains the excellent information in the offspring population, ensuring the stability of the population. Experimental results show that the proposed algorithm is superior to traditional evolutionary algorithms and traditional quantum evolution algorithms.
Keywords: knapsack problem; quantum evolutionary algorithm; adaptive revolving gate operator; dynamic catastrophic technology.
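The revolving (rotation) gate mentioned in the abstract above acts on a qubit's amplitude pair (alpha, beta) as a plane rotation. The sketch below shows only that standard update for a given angle theta; how the paper adapts theta to the generation count and fitness values is not reproduced here, and the function name is hypothetical.

```python
import math
from typing import Tuple


def rotate_qubit(alpha: float, beta: float, theta: float) -> Tuple[float, float]:
    """Apply the 2x2 rotation [[cos, -sin], [sin, cos]] to the amplitudes (alpha, beta)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * alpha - s * beta, s * alpha + c * beta
```

Because the matrix is a rotation, alpha**2 + beta**2 stays equal to 1 after the update, so beta**2 can still be read as the probability of observing a 1 for the corresponding knapsack item.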
A Novel Particle Swarms With Mixed Cooperative Co-evolution for Large Scale Global Optimization
by Yufeng Wang, Wenyong Dong, Chunyu Xu
Abstract: Identification of variable interaction and grouping of variables play an important role in divide-and-conquer algorithms. In this paper, a novel particle swarm optimization with mixed cooperative co-evolution (MCCPSO) is proposed. It has two strategies and one mechanism: a mixed grouping of variables (MGV) strategy, a reallocate computational resources (RCR) strategy, and a competitive leadership mechanism with a lifecycle. MGV can effectively identify the direct and indirect interactive variables and form a spare sub-group pool. RCR can give more computational resources to the more important subcomponents. The leader mechanism can prevent the PSO algorithm from falling into a local optimum. To understand the characteristics of MCCPSO, we have carried out extensive computational studies on the CEC'2010 benchmark functions. The experimental results show that MCCPSO performs better than four other state-of-the-art algorithms.
Keywords: Large Scale Global Optimization; Cooperative Co-evolution; Mixed Grouping of Variables; Reallocate Computational Resources.
Study on the Optimization Ability of Natural Selection Mechanism
by Huichao Liu, Fengying Yang
Abstract: In recent years, evolutionary algorithms have developed rapidly and become an important method for solving complex and nonlinear optimization problems. Many evolutionary algorithms, such as the differential evolution algorithm (DE), the artificial bee colony algorithm (ABC) and the brainstorming algorithm (BSO), adopt the natural selection principle of "survival of the fittest" to determine the individuals of new populations. For a long time, researchers have regarded the selection operator as an important part of maintaining the evolution of the algorithm, and have seldom examined the optimization ability of the selection operator itself. In fact, the natural selection operator also has some capability to optimize. For this reason, this paper takes the DE algorithm as an example, constructs different DE variants, and compares their optimization results with those of the standard DE algorithm. Simulation results show that the new algorithm that uses only natural selection can achieve certain optimization results, whereas the DE algorithm with its greedy selection operator removed performs poorly. This shows that the natural selection operator has a certain optimization capability. Theoretical analysis shows that the natural selection mechanism can determine a searching baseline during evolution and make exploration and exploitation fuse with each other.
Keywords: Algorithm Analysis; Evolutionary Algorithm; Differential Evolution Algorithm; Optimization Ability; Selection Operator.
Special Issue on: SUSCOM-2019 Advancements in Computational Intelligence and Intelligent Database Design
Optimal Bag-of-Features using Random Salp Swarm Algorithm for Histopathological Image Analysis
by VENUBABU RACHAPUDI, Golagani Lavanya Devi
Abstract: Histopathological image classification is a prominent part of medical image classification. However, the classification of such images is a challenging task due to the presence of several morphological structures in the tissue images. Tissue images are categorized into four classes, namely epithelium, connective, muscular, and nervous. These images show a wide variety of morphological structures, so their classification is a demanding task and has been an active research area. Recently, the bag-of-features method has been used for image classification tasks. However, the bag-of-features method uses the K-means algorithm to cluster the features, which is sensitive to the initial cluster centers and often gets trapped in local optima. Therefore, in this work, an efficient bag-of-features histopathological image classification method is presented using a novel variant of the salp swarm algorithm termed the random salp swarm algorithm. The efficiency of the proposed variant has been validated against 20 benchmark functions. Further, the performance of the proposed method has been studied on the blue histology image dataset, and the results are compared with 5 other state-of-the-art meta-heuristic based bag-of-features methods in terms of four parameters, namely precision, recall, accuracy and F-measure. The experimental results demonstrate that the proposed method surpasses the other considered methods with an increase of 11% in accuracy.
Keywords: Histopathological image classification; Salp Swarm Algorithm; Bag-of-features.
Security, Privacy and Trust (SPT): Privacy Preserving Model for Internet of Things
by Shelendra Kumar Jain, Nishtha Kesswani, Basant Agarwal
Abstract: With the advancements in Information Technology, the Internet of Things (IoT) has emerged as one of the dominant technologies. IoT systems are capable of connecting everyone, everything and any service, and the analysis of the information gathered from such IoT devices provides a significant number of opportunities to solve many real-time problems in areas such as healthcare, agriculture, transport and smart cities. However, privacy protection is a very important and challenging issue in the information sharing environment, due to the sensitive and personal information communicated through IoT devices. Dealing effectively with privacy breaches in the IoT ecosystem is a high priority for user satisfaction and the success of the IoT market. In this paper, we present an overview of the issues and challenges faced by privacy protection methods in the Internet of Things. We propose a privacy preserving model that ensures data privacy in IoT devices through a lightweight data collection and data access protocol in a resource constrained IoT ecosystem. The experimental results and analysis show that the proposed model is effective and requires relatively less time for data collection and data access as compared to the existing models. We also provide a case study of the proposed approach on a healthcare based IoT system.
Keywords: Internet of Things; Data Privacy Protection; Obfuscation; Information Privacy; User Privacy; Healthcare; Agriculture.