International Journal of Artificial Intelligence and Soft Computing (5 papers in press)
A Novel Method for Network Intrusion Detection based on Nonlinear SNE and SVM
by Yasir Hamid, Ludovic Journaux, John Aldo Lee, M. Sugumaran
Abstract: Pre-processing techniques have been used extensively on network intrusion detection data to enhance the accuracy of the model. An ideal Intrusion Detection System (IDS) is one that has appreciable detection capability over all the groups of attacks. An open research problem in this area is the lower detection rate for less frequent attacks, which results from the curse of dimensionality and the imbalanced class distribution of the benchmark datasets. This work attempts to minimize the effects of imbalanced class distribution by applying random under-sampling of the majority classes and SMOTE-based oversampling of the minority classes. To alleviate the issues arising from the curse of dimensionality, the model uses Stochastic Neighbor Embedding (SNE), a nonlinear dimension reduction technique, to embed the higher-dimensional feature vectors in low-dimensional embedding spaces. A nonlinear Support Vector Machine with a radial basis function kernel, evaluated over a series of gamma values, was used to build the model. The results demonstrate that the proposed model with dimension reduction has higher detection coverage for all the attack groups in the data set as well as for the normal data. Results are evaluated on two benchmark datasets, KDD99 and UNSW-NB15.
Keywords: Accuracy; Classification; Dimension Reduction; Intrusion Detection; KDD99; NLDR; SNE; SVM; UNSW-NB15.
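The rebalancing step described in the abstract (random under-sampling of majority classes, SMOTE-based oversampling of minority classes) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the common `target` class size, and the choice of `k` are assumptions made for illustration, and each minority class is assumed to contain at least two samples.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority samples, sorted by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def undersample(majority, n_keep, seed=0):
    """Randomly keep n_keep majority samples, without replacement."""
    return random.Random(seed).sample(majority, n_keep)

def rebalance(samples_by_class, target, k=3, seed=0):
    """Bring every class to `target` samples (hypothetical common size)."""
    balanced = {}
    for label, pts in samples_by_class.items():
        if len(pts) > target:
            balanced[label] = undersample(pts, target, seed)
        else:
            balanced[label] = pts + smote_like(pts, target - len(pts), k, seed)
    return balanced

data = {
    "normal": [[float(i), float(i)] for i in range(10)],  # majority class
    "rare_attack": [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]],   # minority class
}
balanced = rebalance(data, target=5)
```

Because each synthetic point lies on the segment between two existing minority samples, the oversampled class stays inside the region already occupied by that class.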
Hierarchical Classification of Web Search Results to detect user's
by Salma Gaou, Pedro A. Castillo-Valdivieso
Abstract: Understanding the user during a web browsing session is a difficult subject that attracts the attention of many researchers in this field. Such understanding can have a great impact on many Internet-based applications. For example, if a search engine could detect the user's intent, it could better match the order of the search results to the user's needs. In this context, research has focused on analysing how users interact with the Search Engine Result Pages returned by a Web application, but most methods ignore the intent of the user while exploring the Web pages linked from the result page that the user decides to visit.
In this article, we focus on detecting and understanding the intent that motivates a user to search the web. The calculation of the similarity is preceded by the formation of the neighbourhood of the target item. The first method used is a single k-means clustering approach that places items in different groups. This method has limitations because of the sparsity problem. To overcome this limitation and increase the accuracy of our model, we opted for a systemic approach derived from Group Technology. This approach provides the BEA algorithm to improve community search. It is a new way to identify the neighbourhood and solve the problem of scalability.
Keywords: Intent user; Information search; ranking of search results; search retrieval; Group Technology; Co-Clustering; BEA algorithm.
A New Kernel based Possibilistic Intuitionistic Fuzzy c-means Clustering
by Jyoti Arora, Meena Tushir
Abstract: Fuzzy c-means and its derivatives, such as possibilistic c-means and possibilistic fuzzy c-means, are the most widely used clustering algorithms in the literature. Though efficient, these clustering algorithms do not achieve high cluster quality on real-world data sets that are not linearly separable. Kernel-based clustering algorithms employ non-linear similarity measures to define the inter-point similarities. As a result, they are able to identify clusters of arbitrary shapes and densities. Comparative analysis over standard data sets has established the superiority of kernel methods over their corresponding standard algorithms. In this paper, we propose a kernel-based Atanassov's possibilistic intuitionistic fuzzy clustering method for data clustering and image segmentation. The paper explores the performance of the proposed methodology with respect to various internal and external indices on various real data sets, and it is found to perform better than the other clustering techniques considered, i.e. conventional as well as kernel-based algorithms. Experimental results on noisy image data sets also show the competence of the proposed approach.
Keywords: Fuzzy c-means; Possibilistic c-means; Kernel method; Intuitionistic Fuzzy C Means.
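To make the idea of a non-linear, kernel-induced similarity concrete, the sketch below shows the membership computation of a plain kernel fuzzy c-means with a Gaussian kernel. It is not the possibilistic intuitionistic variant proposed in the paper; the function names, the kernel width `sigma`, the fuzzifier `m`, and the small `eps` guard are assumptions made for illustration. With a Gaussian kernel K, the squared distance in feature space reduces to 2(1 - K(x, v)).

```python
import math

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel between a point x and a cluster centre v."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / (sigma ** 2))

def kernel_distance_sq(x, v, sigma=1.0):
    """Squared feature-space distance induced by the Gaussian kernel:
    ||phi(x) - phi(v)||^2 = K(x,x) - 2K(x,v) + K(v,v) = 2 * (1 - K(x,v))."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))

def fuzzy_memberships(x, centres, m=2.0, sigma=1.0, eps=1e-12):
    """Fuzzy c-means style memberships of x in each cluster, using the
    kernel-induced distance instead of the Euclidean one."""
    # eps avoids division by zero when x coincides with a centre.
    d = [kernel_distance_sq(x, v, sigma) + eps for v in centres]
    w = [di ** (-1.0 / (m - 1.0)) for di in d]
    total = sum(w)
    return [wi / total for wi in w]

centres = [[0.0, 0.0], [3.0, 3.0]]
u = fuzzy_memberships([0.1, 0.0], centres)
```

The memberships sum to one across clusters, and a point close to a centre receives a membership near one for that cluster.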
Remote Sensing Image Classification for Jabalpur Region using Swarm Classifiers
by Shruti Goel, Gourav Khurana, Vinod Kumar Panchal
Abstract: Swarm Intelligence algorithms have been widely applied in solving many complex problems in different domains. In recent years, researchers have started incorporating Swarm Intelligence techniques into different land cover classification problems from a remote sensing perspective. Here, Swarm Intelligence algorithms are exploited for satellite image classification. The primary objective of image classification is to recognize and classify the different land cover features present in a satellite image of a particular region. In this paper, the Swarm Intelligence based Cuckoo Search (CS) and Artificial Bee Colony Optimization (ABC) algorithms are used to classify the Jabalpur region of Madhya Pradesh, India. CS and ABC were selected over other swarm intelligence concepts because of their wide and efficient applicability in different domains. These concepts also maintain the balance between exploration and exploitation that is necessary to obtain global optimization. The results obtained with the swarm classifiers are compared with non-swarm techniques, mainly the Maximum Likelihood Classifier (MLC), the Minimum Distance Classifier (MDC) and Fuzzy Logic. The accuracy assessment of each classified image is done separately and the observations are tabulated. Results are evaluated in terms of kappa coefficient, user's accuracy, producer's accuracy and overall accuracy. Later, the assessment values for each classifier are also obtained, which serve as a basis for comparing the efficiency of swarm classifiers and other classifiers. Results in terms of individual feature accuracy (user's accuracy and producer's accuracy) and consolidated accuracy (kappa coefficient and overall accuracy) show the dominance of swarm classifiers over the other classifiers considered for the classification of land cover features.
Keywords: Remote Sensing; Image Classification; Traditional Classifiers; Swarm Intelligence Classifiers; Image Classification Process; Accuracy Assessment.
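The four accuracy measures named in the abstract can all be derived from a confusion matrix. The following is a minimal sketch under one common convention (rows are reference classes, columns are classified classes); the function name and the example matrix are hypothetical and are not taken from the paper. It assumes every class has at least one reference sample and at least one classified sample.

```python
def accuracy_assessment(cm):
    """Compute overall accuracy, kappa, producer's and user's accuracy.

    cm[i][j] = number of samples of reference class i classified as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(cm[i]) for i in range(k)]                       # reference totals
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # classified totals
    diag = [cm[i][i] for i in range(k)]

    overall = sum(diag) / n
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)
    kappa = (overall - expected) / (1.0 - expected)

    producers = [d / r for d, r in zip(diag, row_tot)]  # per reference class
    users = [d / c for d, c in zip(diag, col_tot)]      # per classified class
    return {"overall": overall, "kappa": kappa,
            "producers": producers, "users": users}

# Hypothetical two-class example (e.g. water vs. vegetation).
result = accuracy_assessment([[45, 5],
                              [10, 40]])
```

Producer's accuracy reflects how much of each reference class was correctly mapped, while user's accuracy reflects how reliable each mapped class is; kappa discounts the agreement that would be expected by chance.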
Synthetic Sampling Approach Based on Model Based Clustering for Imbalanced data
by Shaukat Ali Shahee, Usha Ananthakumar
Abstract: A data set exhibits the class imbalance problem when one class has very few examples compared to the other class; this is also referred to as between-class imbalance. Apart from between-class imbalance, imbalance within classes, where classes are composed of different numbers of sub-clusters and these sub-clusters contain different numbers of examples, may also affect the performance of the classifier. In this paper, we propose a method that handles both between-class and within-class imbalance simultaneously and also takes into consideration various intrinsic characteristics of the data. The proposed method uses model-based clustering with respect to classes to identify the sub-clusters present in the dataset and oversamples examples in each sub-cluster in such a manner that it eliminates between-class and within-class imbalance simultaneously. We validate our approach using a neural network on 10 publicly available data sets. The experimental results show the proposed method to be statistically significantly superior to other methods.
Keywords: Classification; Imbalanced Dataset; Oversampling; Model based clustering.