International Journal of Computational Intelligence Studies (28 papers in press)
Air pollution prediction through Internet of Things technology and Big Data Analytics
by Safae Sossi Alaoui, Brahim Aksasse, Yousef Farhaoui
Abstract: Air pollution is one of the most serious challenges facing our planet today, and developing models to predict it is crucial. Our work aims to build an accurate model for predicting air quality in the United States using a dataset collected from connected Internet of Things (IoT) devices, namely wireless sensor networks (WSN). The large volume of data captured by these sensors (approximately 1.4 million observations) is highly complex and calls for a new form of advanced analytics: Big Data Analytics. In this paper, we examine the possibility of fusing the two concepts of Big Data and the Internet of Things in the context of predicting air pollution, which occurs when harmful substances such as NO2, SO2, CO and O3 are introduced into Earth's atmosphere.
Keywords: Internet of Things (IoT); Wireless sensor networks (WSN); Air pollution; Air Quality Index (AQI); Big Data Analytics; Apache Spark.
Handling the Crowd Avoidance Problem in Job Recommendation Systems Integrating FoDRA
by Nikolaos Almalis
Abstract: In this article, we present the basic principles and approaches of Job Recommender Systems (JRSs). Furthermore, we describe the four different relation types of the job seeking and recruiting problem, derived directly from the formal definition of JRSs. We use our previously published Four Dimensions Recommendation Algorithm (FoDRA) to calculate the suitability of a person for a job, and then we model a job seeking and recruiting problem with many candidates and many jobs (the N-N case). Finally, we execute the algorithm and present the results, proposing a solution (the minimum acceptable suitability level) for the crowd avoidance problem that occurred. Our study produces satisfactory results and shows that this approach can be considered an important asset in the domain of job seeking and recruiting.
Keywords: Recommendation system; Job seeking and recruiting; Job recommender; Matching people and jobs; Constraint-based; Information filtering.
Creating classification rules using Grammatical Evolution
by Ioannis Tsoulos
Abstract: A genetic programming based method is introduced for data classification. The fundamental element of the method is the well-known technique of Grammatical Evolution. The method constructs classification programs in a C-like programming language in order to classify the input data, producing simple if-else rules. The paper introduces the method and reports experiments on a series of datasets, comparing it against other well-known classification methods.
Keywords: Genetic algorithm; Data classification; Grammatical evolution; Stochastic methods.
Big data: A distributed storage and processing for online learning systems
by Karim DAHDOUH, Ahmed DAKKAK, Lahcen OUGHDIR
Abstract: New information and communication technologies have changed the way of teaching and learning. This is particularly true of big data technology, which has recently been developed to overcome the limitations of traditional systems of storage, processing, and analysis. Big data has been used in several fields including health care, public services, and online services such as social media and online learning. It offers a rich set of new technologies in terms of data integration, distributed storage, parallel processing, and data visualization. Furthermore, big data provides many techniques for addressing various educational problems, such as course recommendation engines, the prediction of learner behaviour, and the exponential growth of learners and pedagogical resources. Today, thanks to the big data ecosystem, it is possible to greatly improve the effectiveness and performance of online learning services. This article presents the big data paradigm, its components, technologies, and characteristics. It proposes an approach for incorporating big data, online learning systems, and cloud computing in order to enhance the efficiency of the distance learning environment. It also provides a methodology to store and process the data produced by online learning platforms using advanced big data technologies and tools. Moreover, it explores the advantages and benefits that big data offers to students, teachers and online learning professionals.
Keywords: computing environments for human learning; big data; cloud computing; e-learning; Online learning; Learner; Learning Management Systems (LMS); NoSQL database; Hadoop; MapReduce; Spark; Cassandra; Hive; Apache Flume; Apache Sqoop.
An Independent-domain Natural Language Interface for Multimodel Databases
by Bais Hanane
Abstract: Databases are gaining prime importance in the world of modern computing. Retrieving information stored in databases requires knowledge of database query languages such as Structured Query Language (SQL). However, learning such a language can be difficult for non-expert users. Hence, natural language is an easy and convenient means of access that can greatly improve the use of data stored in databases. In this paper, we present the architecture of an intelligent natural language interface for a multimodel database. This interface functions independently of the database domain, language and model. The use of a machine learning approach helps our system to automatically improve its knowledge base through experience.
Keywords: Databases; Natural Language Processing (NLP); Intermediate XML Logical Query (IXLQ); Extended Context Free Grammar (ECFG); intelligent interface.
Diabetes Risk Stratification Method Based On Fuzzy Logic And Bio-Inspired Meta-Heuristics
by Deme Andreea, Chifu Viorica Rozina, Pop Cristina Bianca, Chifu Emil Stefan, Salomie Ioan
Abstract: This paper presents a system for diabetes risk stratification that combines fuzzy logic with two bio-inspired algorithms. The developed system takes as input a set of patients described by numerical and categorical features and generates fuzzy rules to classify them into groups according to their risk of having diabetes. To take into account the uncertainty in the input data set, our system combines fuzzy logic techniques with bio-inspired algorithms and hierarchical classification. The system has been evaluated on the Pima Indians data from the UCI machine learning repository.
Keywords: CLONALG algorithm; fuzzy logic; ant clustering; patient risk stratification.
Stopping rules for a parallel genetic algorithm
by Ioannis Tsoulos, Alexandros Tzallas, Markos Tsipouras, Vasileios Christou, Dimitrios Tsalikakis
Abstract: A novel method for the implementation of parallel genetic algorithms is introduced to locate the global minimum of a multidimensional function inside a rectangular hyperbox. The algorithm relies on a client-server model and incorporates an enhanced stopping rule. A number of experiments were conducted to measure the effect on termination of applying the stopping rule either on the server machine or on the clients. The method is tested on a series of well-known test functions as well as on neural network training, and the results were compared against another parallel genetic algorithm method. The results from the experiments are reported in terms of test error and number of generations.
Keywords: Genetic algorithm; parallel algorithms; stopping rules; optimization.
Special Issue on: Applications of Hybrid Bio Inspired Algorithms
Stock Price Trend Prediction with Long Short Term Memory Neural Networks
by Varun Gupta, Mujahid Ahmad
Abstract: The stock market is an immensely complex, chaotic and dynamic environment, so predicting changes in it accurately is a challenging task. A number of approaches have been adopted to take on that challenge, and machine learning has been at the crux of many of them. There are plenty of examples of machine learning algorithms yielding satisfactory results for this type of prediction. This paper presents the use of Long Short Term Memory (LSTM) networks in this scenario, to predict future trends of stock market prices based on patterns in the price history, paired with technical analysis indicators. To achieve this, a model was built and a series of experiments was conducted over a number of parameters, and the results were analyzed against predefined metrics to assess whether this algorithm offers any improvement over other machine learning methods and strategies. A comparative study is also presented, which analyzes commonly used optimizers and error schemes to determine which optimizer yields the best results. The results obtained are promising and give a reasonably accurate prediction of the rise or fall of a particular stock in the near future.
Keywords: Stock market prediction; LSTM; Recurrent neural networks; artificial neural networks; machine learning; deep learning; artificial intelligence; soft computing.
Prediction of Air Pollution Using LSTM Based Recurrent Neural Networks
by Varun Gupta, Akshat Jain, Ashim Bhasin
Abstract: This paper proposes a system that predicts the pollution level at a given hour at a given place. It also discusses the various parameters associated with increasing pollution across the globe, its ill effects and its likely future course. An air quality dataset reporting the level of pollution and the weather every hour for five years is used, and Long Short Term Memory (LSTM) based recurrent neural networks are applied using the Keras library with TensorFlow as the back end in a Python environment. The paper studies all 13 parameters affecting the weather and air pollution conditions and forecasts the pollution for any hour given the weather conditions and pollution value for the previous hour.
Keywords: Air Pollution Prediction; LSTM; Recurrent Neural Networks; Artificial Neural Networks; deep learning; machine learning; soft computing; artificial intelligence.
Fuzzy Knowledge Based Fractional Order PID Control Implementation with Nature Inspired Algorithms
by Ambreesh Kumar, Rajneesh Sharma
Abstract: In this paper, we attempt to hybridize nature inspired optimization techniques with fuzzy knowledge based proportional integral derivative (PID) control for applications on fractional order systems. Two nature inspired approaches, namely the Genetic Algorithm (GA) and the Ant Colony algorithm (ANT), are employed for tuning the parameters of the fuzzy knowledge based fractional order PID controller offline. In the next stage, we fine tune the PID controller parameters using a fuzzy knowledge based formulation. In our proposed nature inspired fractional fuzzy PID (NIFFPID) framework, GA is used for optimizing the inputs to the ANT controller. We illustrate the effectiveness of our methodology with simulation results on four plants: one integer order and three fractional order plants of different orders. Simulation results and a comparison of our approach against other approaches, viz., fractional order PID-ANT, fractional order PID-GA, fuzzy fractional PID-ANT and fuzzy fractional PID-GA, show the feasibility and effectiveness of our approach for fractional order systems.
Keywords: Integer order plant; fract order plants; fuzzy knowledge based control; NIFFPID approach.
Human Activity Recognition from Histogram of Spatiotemporal Depth Features
by Naresh Kumar
Abstract: Recent advances in sensor-based depth information have opened up considerable scope for human activity recognition using depth image sequences. Human activities are of great interest in every real-life domain in which humans are the major actors. Activity recognition is of key importance because of its applications in several domains, such as airport surveillance systems, patient monitoring systems and care of elderly people. The variation in spatial and temporal parameters can represent any activity efficiently. Natural color imagery does not capture the complete information efficiently, because it represents flatness and occluded points in every portion of the image. This work aims to recognize daily-life human activities from spatiotemporal depth information and evaluates the results. Several varieties of actions may be performed by a single person or by more than one person at a time. For this purpose, a Kinect sensor is used to collect data on a single activity performed by multiple persons at a time. The spatiotemporal depth features are computed for activity recognition, and a support vector machine is used in the classification phase. The database for RGB-D human activity recognition contains nine classes of human actions. This dataset is reconfigured from the Cornell human activity and Berkeley multimodal human action databases. For multiple human action recognition, 91.38% accuracy is achieved on the synthetic dataset. This approach achieves performance that is difficult to attain using ordinary video frames of human activities.
Keywords: Human Action Recognition (HAR); Principal Component Analysis (PCA); spatiotemporal descriptors; Histogram of Gradient (HOG); Support Vector Machine (SVM); Histogram of oriented feature (HOOF).
Special Issue on: Intelligent Systems for Cyber Security Current Trends, Applications and New Challenges
Intrusion Detection using Data Mining
by Shubha Puthran, Ketan Shah
Abstract: Intrusion detection plays a very important role in securing information servers. Classification and clustering data mining algorithms are very effective for intrusion detection. However, classification (supervised) tends to produce false negatives, while clustering (unsupervised) tends to produce false positives. This paper introduces a framework consisting of a pre-processing unit, Intrusion Detection using quad split (IDTQS), Intrusion Detection using Correlation based quad split (IDTCA) and Intrusion Detection using Clustering (IDTC). In the proposed framework, IDTQS and IDTCA show an accuracy improvement in the range of 4%-34% on the University of New South Wales (UNSW) dataset for the DoS, Probe, R2L, U2R and Normal classes. The IDTC clustering algorithm achieves 97% accuracy. Training and testing time is improved by 14% for IDTCA in comparison with IDTQS.
Keywords: Quad split; Decision Tree; Correlated Attributes; UNSW dataset
Special Issue on: CMDM 2017 Computational Intelligence and Data Mining
A Co-evolutionary Decomposition-based Algorithm for the Bi-level knapsack optimization problem
by Abir Chaabani, Lamjed Ben Said
Abstract: Bi-level optimization problems (BOPs) are a class of challenging problems with two levels of optimization tasks. These problems make it possible to model a large number of real-life situations in which a first decision maker, hereafter the leader, optimizes his objective by taking the follower's response to his decisions explicitly into account. Evaluating a solution in the upper level therefore requires finding an optimal solution to the lower level problem. This makes BOPs difficult to handle and has kept researchers and practitioners alike busy. Recently, a new research field called EBO (Evolutionary Bi-Level Optimization) has appeared, thanks to the promising results obtained by using EAs (Evolutionary Algorithms) to solve such problems. In this context, two EBO algorithms, CODBA and CODBA-II, were recently proposed to solve combinatorial BOPs. These approaches were able to improve the quality of the generated bi-level solutions relative to other recently proposed methods in this research area. Although a wide range of applications fit the bi-level programming framework, real-life implementations are still scarce. For this reason, we propose in this paper a Co-evolutionary Decomposition-based Bi-level Algorithm for the bi-level knapsack optimization problem. The proposed algorithm turned out to be quite efficient in terms of both computation time and solution quality compared with other competitive EAs.
Keywords: Bi-level combinatorial optimization; evolutionary methods; bi-level knapsack problem.
Web service selection based on QoS and user profile
by Ilhem Feddaoui, Faîçal Felhi, Jalel Akaichi
Abstract: Web services come from different sources, are heterogeneous, and are large in volume, which leaves the user in a difficult position when selecting the best ones. The Web service selection process aims to discover the desired Web services and to select those that best match the users' query. In particular, various Web services have the same functionality, so another factor is needed to select among them: the Quality of Service (QoS). QoS plays an important role in the Web service selection process, since it is used to rank Web services that have the same functionality. This paper focuses on different concepts of QoS. We present a new approach composed of two services, whose primary role is to select the best Web service with respect to the users' query and profile. In our approach, a better knowledge of user behavior is important because users can participate in research design and construction. The experiment shows that our method can accurately recommend the needed Web services in less time.
Keywords: Web service; Query; User profile; QoS.
MC4.5 decision tree algorithm: An improved use of continuous attributes
by Anis Cherfi, Kaouther Nouira, Ahmed Ferchichi
Abstract: C4.5 is one of the top ten data mining algorithms and one of the most widely used decision tree construction techniques. Although effective, it suffers from high complexity when it deals with continuous attributes, and it also leads to a certain level of information loss. Minimizing this loss and reducing the time complexity are therefore the main goals of this paper. To alleviate these problems, this paper presents a novel algorithm, MC4.5, which proposes the statistical mean as an alternative to the C4.5 threshold selection process. To demonstrate the effectiveness of the new algorithm, a complete evaluation was carried out to show that MC4.5 meets these objectives. From the theoretical perspective, we develop a complexity analysis to compare the algorithms. Empirically, we conduct an experimental study using 30 data sets to show that, in most cases, the proposed algorithm leads to smaller decision trees with better accuracy compared to the C4.5 algorithm.
Keywords: Decision tree; MC4.5; C4.5; Statistical mean; Continuous attributes; Classification; Information gain.
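The idea described in the abstract above, taking the statistical mean of a continuous attribute as the split threshold instead of searching over candidate cut points, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy data are hypothetical, plain information gain is used for both variants, and the exhaustive search is a simplified stand-in for C4.5's threshold selection (which, among other differences, works with gain ratio).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_at(values, labels, threshold):
    """Information gain of the binary split value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # degenerate split carries no information
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


def mean_threshold_gain(values, labels):
    """Mean-based split (hypothetical helper): one candidate threshold only."""
    threshold = sum(values) / len(values)
    return threshold, gain_at(values, labels, threshold)


def exhaustive_gain(values, labels):
    """Simplified exhaustive search: midpoint between each pair of adjacent distinct values."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(((t, gain_at(values, labels, t)) for t in candidates),
               key=lambda pair: pair[1], default=(None, 0.0))


# Toy data (made up for illustration): two well-separated classes.
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = ["a", "a", "a", "b", "b", "b"]
print(mean_threshold_gain(values, labels))
print(exhaustive_gain(values, labels))
```

The mean-based variant evaluates a single threshold per attribute, whereas the exhaustive variant evaluates one threshold per pair of adjacent distinct values, which is where a saving in computation would come from.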
Solving flexible job-shop problem with sequence dependent setup time and learning effects using an adaptive genetic algorithm
by Ameni Azzouz, Meriem Ennigrou, Lamjed Ben SAID
Abstract: For most scheduling problems studied in the literature, job processing times are assumed to be known and constant over time. However, this assumption is not appropriate for many realistic situations in which employees and machines execute the same task repetitively and learn to perform it more efficiently. As a result, the processing time of a given job is shorter if it is scheduled later rather than earlier in the sequence. In this paper, we consider the Flexible Job Shop Problem (FJSP) with two kinds of constraints, namely sequence-dependent setup times (SDST) and learning effects. Makespan is the objective function to be minimized. To solve this problem, an Adaptive Genetic Algorithm (AGA) is proposed. Our algorithm uses an adaptive strategy based on: (1) the current specificity of the search space, (2) the preceding results of already used operators and (3) their associated parameter settings. We adopt this strategy in order to maintain the balance between exploration and exploitation. Experimental studies are presented to assess and validate the benefit of incorporating the learning process into the SDST-FJSP over the original problem.
Keywords: scheduling problem; Genetic algorithm; Adaptive strategy; Learning effects.
Contributions to the Automatic Processing of the User-Generated Tunisian Dialect on the Social Web
by Jihene Younes, Hadhemi Achour, Emna Souissi, Ahmed Ferchichi
Abstract: With the growing use of social media in the Arab world, Arabic dialects are rapidly spreading on the web, leading to growing interest from NLP researchers. These dialects are, however, still under-resourced languages, and the lack of available dialectal resources is a major obstacle to their study and processing. In this paper, we focus on the automatic processing of the user-generated Tunisian dialect (TD) on the social web and propose an approach that helps to automatically generate TD language resources (LRs), useful for any NLP research work dealing with this dialect. This approach exploits the large amounts of textual productions on the social web in order to extract and generate dialectal content. It is based on two main NLP components, namely TD identification and TD transliteration. A machine learning approach using Conditional Random Fields (CRF) is proposed for implementing these two components; it reached an accuracy of 87.45 for TD identification and 90.49 for the automatic generation of dialectal content by transliteration.
Keywords: Tunisian Dialect; language resources; corpora; lexica; identification; transliteration; natural language processing; machine learning.
An effective Genetic Algorithm for solving the Capacitated Vehicle Routing Problem with Two-dimensional Loading Constraint
by Ines Sbai, Olfa Limam, Saoussen Krichen
Abstract: In this article, we focus on the symmetric capacitated vehicle routing problem where customer demand is composed of two-dimensional weighted items. The objective is to design a set of trips, starting and terminating at a central depot, that minimise the total transportation cost with a homogeneous fleet of vehicles based at a depot node. Items in each vehicle trip must satisfy the two-dimensional orthogonal packing constraint. The capacitated vehicle routing problem with two-dimensional loading constraint is an NP-hard problem of high complexity. Given the importance of this problem, many solution approaches have been developed. However, it is still a challenging problem. We therefore propose a new heuristic based on an adaptive genetic algorithm in order to find better solutions. Our algorithm is tested on 150 benchmark instances and compared with state-of-the-art approaches. Results show that our proposed approach is competitive in terms of the quality of the solutions found.
Keywords: Capacitated Vehicle Routing Problem; Loading; Genetic Algorithm; 2L-CVRP.
A Multi-Level Study for Trust Management Models Assessment in VANETs
by Ilhem Souissi, Nadia Ben Azzouna, Lamjed Ben Said
Abstract: Nowadays, trust management is one of the key elements to ensure a high security level in ad hoc networks. Trust assessment can be perceived at three levels. First, the data perception trust needs to be assessed in order to ensure a high quality of raw sensed data. Second, the trust relationship assessment is required to detect selfish and malicious entities and to maintain data integrity. Finally, the data fusion trust is essential to preserve the performance of the fusion process. In this paper, we intend to point out the need to integrate the data perception trust, the communication trust and the data fusion trust in order to preserve information trustworthiness in VANETs. We further review the literature to identify recent advancements with regard to each type of trust.
Keywords: Data Perception Trust; Communication Trust; Data Fusion Trust; VANETs.
Special Issue on: BDCA'18 Data Science and Applications
Information Technology performance management by Artificial Intelligence in Microfinance Institutions: An overview
by Kaicer Mohammed
Abstract: This paper presents an overview of the use of new information technology to improve the management of microfinance institutions, experiencing a gap due to the growth of the microfinance sector and the diversity of products and services they offer to the target populations. We will show that artificial intelligence could play a role to ensure reliable management information systems in MFIs.
Keywords: Management of Informatics Technology; Artificial intelligence; Microfinance Institution; Central risk.
Special Issue on: Advances and Challenges in Nature Inspired Optimisations
Nonlinear time series forecasting using a novel self-adaptive TLBO-MFLANN model
by Sibarama Panigrahi, H.S. Behera
Abstract: Time series forecasting (TSF) is a key field of research in several areas of study including engineering, finance, economics and management science. Conventionally, TSF has been performed predominantly using various linear statistical models. However, to cope with the nonlinear patterns exhibited by most time series produced by real-world phenomena, various artificial neural network (ANN) models have recently been used. In contrast to traditional neural networks, higher order neural networks (HONNs), especially the functional link ANN (FLANN), can expand the input space with fewer trainable weights, which makes them efficient for solving various complex problems. Motivated by this, in this paper we propose a multiplicative FLANN (MFLANN) model for time series forecasting. The multiplicative unit in the MFLANN model helps to capture nonlinear patterns well. In addition, an improved version of teaching learning based optimization (TLBO), called self-adaptive TLBO (SATLBO), is proposed to train the MFLANN model. The proposed SATLBO uses the gradient descent learning algorithm in the teacher phase and uses past experience to adapt the learners' parameters. This integration of SATLBO with the gradient descent learning algorithm is used to determine the near-optimal weight set of the MFLANN. The proposed method is implemented in the MATLAB environment and the results obtained are compared with other methods (DE based MFLANN, TLBO based MFLANN, CRO based MFLANN, Jaya based MFLANN and ETLBO-JPSNN) on 11 benchmark univariate time series datasets. Extensive statistical analysis of the results indicates that the proposed SATLBO-MFLANN method is better than its counterparts, and that the difference is statistically significant.
Keywords: FLANN; Multiplicative FLANN; TLBO; Self-adaptive TLBO; Hybrid model; Time series forecasting.
Impact of C-Factor on PSO for Solar PV based BLDC Motor Drive Control
by Manoj Kumar Merugumalla, Prema Kumar Navuri
Abstract: A constriction factor (C-factor) based particle swarm optimization (CPSO) algorithm is proposed for the control of a solar photovoltaic array powered brushless direct current (BLDC) motor drive. The rotor position sensors are completely eliminated; instead, the rotor position is determined by measuring the changes in back-EMF. The CPSO algorithm is used to tune the proportional-integral-derivative (PID) controller parameters for the speed control of the drive. The BLDC motor drive is modelled in MATLAB/Simulink, and trapezoidal back-EMF waveforms are modelled as a function of rotor position using MATLAB code. This paper examines rise time, settling time, peak overshoot and steady-state error under varying conditions and assesses the effectiveness of the BLDC motor drive with the proposed algorithm. A comparison of the simulation results of the proposed algorithm with other PSO algorithms demonstrates the importance of the constriction factor for the drive controller.
Keywords: Brushless direct current motor; Solar photovoltaic array; Particle swarm optimization; PID controller; Constriction factor; Inertia weight.
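For readers unfamiliar with the constriction factor, the sketch below shows the commonly cited Clerc and Kennedy form, in which the coefficient is computed from the two acceleration constants and scales the whole velocity update. The abstract does not state which variant the paper uses, so the formula choice, the default values `c1 = c2 = 2.05` and the function names here are assumptions made for illustration only.

```python
import math
import random


def constriction_factor(c1=2.05, c2=2.05):
    """Clerc-Kennedy constriction coefficient; requires c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4:
        raise ValueError("c1 + c2 must exceed 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def update_velocity(v, x, pbest, gbest, c1=2.05, c2=2.05, rng=random):
    """One-dimensional constricted velocity update for a single particle."""
    chi = constriction_factor(c1, c2)
    r1, r2 = rng.random(), rng.random()
    # The coefficient multiplies the full bracket, damping the velocity.
    return chi * (v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x))


print(round(constriction_factor(), 4))  # 0.7298 for c1 = c2 = 2.05
```

In a PID tuning setting, each particle position would hold a candidate set of controller gains and the fitness would be a closed-loop performance measure; that part is omitted here because the abstract gives no details of it.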
Robust Estimation of IIR System's Parameter using Modified Particle Swarm Optimization Algorithm
by Meera Dash, Trilochan Panigrahi, Renu Sharma
Abstract: This paper introduces a novel method for robust parameter estimation of infinite impulse response (IIR) systems. When the training signal contains strong outliers, the conventional squared error based cost function fails to provide the desired performance. Computationally efficient robust cost functions are therefore used here. IIR system estimation is prone to becoming trapped in local minima, so gradient based algorithms, which work well for finite impulse response (FIR) systems, cannot be used for IIR systems. The parameters of the IIR system are therefore estimated using a modified particle swarm optimization algorithm. The most widely used and analyzed robust cost functions, such as Huber's and the saturation nonlinearity function, are used in the optimization algorithm. The simulation results show that the proposed robust algorithms perform better than the Wilcoxon norm based robust algorithm and the conventional squared error based PSO algorithms.
Keywords: IIR system; impulsive noise; robust estimation; Wilcoxon norm; Huber's cost function; adaptive particle swarm optimization; saturation nonlinearity.
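To illustrate why a robust cost function helps when the training signal contains outliers, the sketch below compares the mean squared error with the standard Huber loss on a small error vector. It is an illustration only: the threshold `delta = 1.0`, the function names and the sample errors are assumptions, and the saturation nonlinearity cost mentioned in the abstract is not shown because its exact form is not given there.

```python
def squared_cost(errors):
    """Conventional cost: mean of squared errors."""
    return sum(e * e for e in errors) / len(errors)


def huber_cost(errors, delta=1.0):
    """Mean Huber loss: quadratic for |e| <= delta, linear beyond it."""
    total = 0.0
    for e in errors:
        a = abs(e)
        total += 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)
    return total / len(errors)


# Made-up estimation errors, with and without a single strong outlier.
clean = [0.1, -0.2, 0.05, 0.1]
with_outlier = clean + [50.0]

print(squared_cost(clean), squared_cost(with_outlier))
print(huber_cost(clean), huber_cost(with_outlier))
```

Because the Huber loss grows only linearly for large errors, the single outlier inflates it far less than it inflates the squared cost, so an optimizer such as PSO is less inclined to distort the parameter estimate to accommodate that one sample.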
Design of Fractional Order PID Controller for Heat Flow System using Hybrid Particle Swarm Optimization and Gravitational Search Algorithm
by Rosy Pradhan, Santosh Kumar Majhi, Bibhuti Bhusan Pati
Abstract: This paper uses a hybrid of Particle Swarm Optimization (PSO) and the Gravitational Search Algorithm (GSA) to design a fractional order proportional-integral-derivative (FOPID) controller for a heat flow system. The social behavior of PSO is combined with the motion technique of GSA. The objective of the algorithm is to obtain the optimal controller parameters for the heat flow system. To this end, different performance indices such as IAE (Integral Absolute Error), ISE (Integral Squared Error), ITAE (Integral Time Absolute Error) and ITSE (Integral Time Squared Error) are considered for the optimization. The performance of the hybrid PSO-GSA is compared with IMC-PID, Fractional Order Filter-PID and PSO-FOPID. The proposed method performs comparatively better than the other published methods. The Simulink/MATLAB environment is used for simulation.
Keywords: FOPID; performance indices; PSO; PSO-GSA.
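The four performance indices named in the abstract above have standard definitions as integrals of the error signal e(t): IAE integrates |e|, ISE integrates e², ITAE integrates t·|e| and ITSE integrates t·e². The sketch below approximates them from sampled data with the trapezoidal rule. The function name, the sampling grid and the decaying test signal are assumptions for illustration; the paper itself works in Simulink/MATLAB.

```python
import math


def performance_indices(t, e):
    """Return IAE, ISE, ITAE and ITSE for sampled times t and errors e."""
    def trapz(y):
        # Trapezoidal rule over the (possibly non-uniform) time grid t.
        return sum(0.5 * (y[i] + y[i + 1]) * (t[i + 1] - t[i])
                   for i in range(len(t) - 1))

    return {
        "IAE": trapz([abs(ei) for ei in e]),
        "ISE": trapz([ei * ei for ei in e]),
        "ITAE": trapz([ti * abs(ei) for ti, ei in zip(t, e)]),
        "ITSE": trapz([ti * ei * ei for ti, ei in zip(t, e)]),
    }


# Hypothetical error signal: an exponentially decaying tracking error.
t = [i * 0.01 for i in range(1001)]
e = [math.exp(-ti) for ti in t]
print(performance_indices(t, e))
```

The time-weighted indices (ITAE, ITSE) penalize errors that persist late in the response, which is why a tuning procedure may favor them when settling behavior matters more than the initial transient.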
Suitability and Importance of Deep Learning Feature Space in the Domain of Text Categorization
by Rajendra Kumar Roul
Abstract: One of the important features of Multilayer ELM (ML-ELM) is its capability of non-linearly mapping the features to an extended dimensional space, thereby making the input features linearly separable. This paper studies the significance of the deep learning feature space using ML-ELM for classification of text data, as a follow-up to my earlier work. The previous approach introduced a new feature selection technique named Combined Cohesion Separation and Silhouette Coefficient (CCSS) to generate a good feature vector, which was then used for classification of text data using ML-ELM, a deep learning classifier. That approach is extended here in two main respects. The first is to compare the performance of the CCSS approach with traditional feature selection techniques, and the second is to test the performance of different conventional classification techniques on the higher dimensional feature space of ML-ELM. Results of experiments on different benchmark datasets show that the proposed CCSS technique is comparable with existing feature selection techniques and that the ML-ELM feature space is more promising than the traditional TF-IDF vector space for classification of text data.
Keywords: Classification; Cohesion; Deep learning; Extreme learning machine; Multilayer ELM; Separation; Silhouette coefficient.
Scalable Keyword-Based Search and Data Manipulation on Encrypted Data
by Prabhat Keshari Samantaray, Navjeet Kaur Randhawa, Swarna Lata Pati
Abstract: Cloud-based services increase productivity and reduce infrastructure costs for industries, which attracts researchers and individuals to outsource data to remote cloud servers. However, cloud servers pose several security issues, the most prominent of which is privacy. Although encrypting the data before outsourcing preserves data privacy, it does not support data usability such as keyword search on encrypted data. Searchable Symmetric Encryption (SSE) is found to be an efficient solution, and in this paper we construct an SSE method that enables search and various other data operations on encrypted data. We perform document clustering and construct an index-tree-based secure search scheme. The index and queries are constructed with the vector space model and are encrypted using the secure kNN computation method. We define two encrypted searchable schemes based on two security models. We perform thorough experiments to justify the efficiency of the proposed model.
Keywords: encrypted index construction; document clusters; encrypted search; encrypted data operations; searchable encryption; vector encryption.
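The idea behind encrypting vector-space index and query vectors while keeping relevance scores computable can be illustrated with a heavily simplified sketch. If an index vector p is stored as M^T p and a query vector q is sent as M^{-1} q, for a secret invertible matrix M, their inner product is unchanged. The full secure kNN technique adds random vector splitting and extra dimensions, which are omitted here; the matrix, vectors and function names below are illustrative assumptions and do not reproduce the paper's construction.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Secret key: an invertible matrix with determinant 1 and its inverse (assumption).
M = [[2, 1],
     [5, 3]]
M_INV = [[3, -1],
         [-5, 2]]

def encrypt_index(p):
    """Index vectors are stored as M^T p."""
    return mat_vec(transpose(M), p)

def encrypt_query(q):
    """Query vectors are sent as M^{-1} q."""
    return mat_vec(M_INV, q)

# Hypothetical term-weight vectors for one document and one query.
doc_vector = [3, 4]
query_vector = [1, 2]

enc_doc = encrypt_index(doc_vector)
enc_query = encrypt_query(query_vector)

# (M^T p) . (M^{-1} q) = p^T M M^{-1} q = p . q
plain_score = dot(doc_vector, query_vector)
encrypted_score = dot(enc_doc, enc_query)
```

The server can therefore rank documents by encrypted_score without seeing the plaintext vectors. This sketch alone offers no real security guarantee.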
Improving Bug Report Quality by Predicting Correct Component in Bug Reports
by Indu Chawla, Sandeep Kumar Singh
Abstract: Bugs reported in bug tracking systems contain important information in standard and mandatory fields such as version, operating system, product, component and type of the bug. Unfortunately, the information provided in these fields is sometimes missing or inaccurate. Inaccurate information makes program understanding difficult and delays the bug fixing process. Often these fields are reassigned during bug fixing, sometimes more than once. This study explores the automatic identification of the correct component field in a bug report. It proposes a fuzzy similarity based approach for identifying the correct component as well as predicting the possibility of component reassignment. Experiments are conducted on bug reports from three open source projects from Eclipse. The experimental results show that the fuzzy similarity approach outperforms state-of-the-art approaches on two out of three datasets in terms of precision.
Keywords: Bug tracking system; Bug report; component prediction; component reassignment.
The Connectivity and the Static-Cost-Effective Analysis of a Shifted Completely Connected Network
by Mohammed N. M. Ali, M. M. Hafizur Rahman, Adamu Abubakar Ibrahim, Dhiren K. Behera, Yasushi Inoguchi
Abstract: Finding an alternative computing device with extreme computational power has become a main concern of the research community. A computing device able to execute extremely difficult calculations in a short period of time is therefore required. At present, massively parallel computer (MPC) systems are considered the most powerful computing devices, and these systems are important for executing many operations in sectors such as engineering and science. These devices are built around an internal network, called an interconnection network, whose particular design is represented by the network topology. The cost of these networks is highly influenced by the price of the processing elements (PEs) and the communication links. Thus, the design of these topologies has a crucial impact on network cost and performance. We have proposed a new topology for MPC systems; this topology was evaluated statically in previous work and showed good results. In this paper, we therefore focus on analysing the cost-effectiveness of this network, to examine whether it is cost-beneficial before proceeding to the implementation step and to ensure that the benefit of the network justifies its cost.
Keywords: Network-on-Chip (NoC); Interconnection Networks; Hierarchical Interconnection Networks (HINs); Static Network Performance; Shifted Completely Connected Network (SCCN); Massively Parallel Computer (MPC) Systems; Conventional Interconnection Networks.