International Journal of Intelligent Engineering Informatics (12 papers in press)
Mamdani fuzzy based vehicular grouping at the intersection of roads for smart transportation system
by Harsha Vardan Maddiboyina, V. A. Sankar Ponnapalli
Abstract: A smart transportation system is an innovative solution to traffic congestion in real-time traffic environments. This research contribution proposes a smart transportation system based on a vehicular grouping technique, with length as a parameter, to avoid traffic congestion at road intersections. In this method, vehicles waiting at an intersection are divided into small groups, with the highest priority given to emergency vehicles. With the proposed work, the average number of vehicles that can cross the junction increases to 2.4 times that of existing traffic statistics for Hyderabad City, India, and the average waiting time of an emergency vehicle is reduced by up to 38%. A Mamdani fuzzy inference system is used to frame the fuzzy IF-THEN rules, and MATLAB 2014a is used to simulate the proposed work.
Keywords: Mamdani fuzzy inference system; vehicular grouping; smart transportation system; VANETs; fuzzification; defuzzification.
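For readers unfamiliar with Mamdani inference, the following is a minimal sketch of the general technique (fuzzification, min implication, max aggregation, centroid defuzzification). The input variable (queue length), the output variable (green time in seconds), the membership functions and the three rules are illustrative assumptions, not the authors' rule base or their MATLAB model.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def mamdani_green_time(queue_len, steps=601):
    """Map a queue length (vehicles) to a green time (seconds).

    Hypothetical rule base:
      IF queue is short  THEN green is short
      IF queue is medium THEN green is medium
      IF queue is long   THEN green is long
    """
    # Clamp the input to the assumed universe of discourse [0, 30].
    q = min(max(float(queue_len), 0.0), 30.0)

    # Fuzzification: firing strength of each rule, paired with the
    # (a, b, c) parameters of its output fuzzy set.
    rules = [
        (tri(q, -15, 0, 15), (0, 15, 30)),   # short queue  -> short green
        (tri(q, 0, 15, 30), (15, 30, 45)),   # medium queue -> medium green
        (tri(q, 15, 30, 45), (30, 45, 60)),  # long queue   -> long green
    ]

    # Min implication, max aggregation and centroid defuzzification
    # over a discretized output universe [0, 60] seconds.
    num = den = 0.0
    for i in range(steps):
        y = 60.0 * i / (steps - 1)
        mu = max(min(w, tri(y, *abc)) for w, abc in rules)
        num += y * mu
        den += mu
    return num / den if den else 0.0
```

A medium queue of 15 vehicles fires only the second rule, so the defuzzified green time sits at the centre of the "medium" output set.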
The Kushner-Stratonovich Stochastic Method for a Master Robot of a Tele-robotic System: Beyond Extended Kalman Filtering
by Ravish Hirpara, Shambhunath Sharma
Abstract: A surgical robotic system is a tele-robotic system comprising the operator, a master-slave configuration and the object. For estimation and control of the surgical robotic system, the master-slave setup is the subject of considerable robotics research and has proven promising. The noise equations and their extended Kalman filtering are available in the literature in detail. In this paper, the multi-dimensional stochastic differential equation and a higher-order nonlinear stochastic filter beyond extended Kalman filtering are developed for a master robot of the surgical robotic system. Higher-order nonlinear filtering preserves more of the qualitative characteristics of nonlinear stochastic systems by accounting for observation correction terms in the stochastic evolution equations. First, Lagrangian mechanics is used to develop the vector stochastic model of the master robot. Second, the Kushner-Stratonovich equation, a celebrated method in the stochastic systems and control literature, is used to filter the master robot noise equations. The paper compares the higher-order Kushner-Stratonovich filter, the extended Kalman filter and the true state trajectories. It will be of interest to robot dynamicists and control theorists seeking formal, systematic nonlinear stochastic methods with greater conceptual depth in robotic dynamics, which have not yet been explored in the literature.
Keywords: robotic surgical systems; master-slave set up; Lagrangian mechanics; the Kushner-Stratonovich equation; extended Kalman filtering; stochastic differential equations.
An Iterative Solution Approach for Steady State Analysis of Self-Excited Induction Generator
by Hayri Arabaci
Abstract: Steady-state analysis of the self-excited induction generator is widely used for stand-alone renewable energy systems. The impedance values of the equivalent circuit vary with frequency, and the frequency changes according to load, excitation capacitance and rotor speed. If the values of load and capacitance are known, frequency and slip are the two main unknowns left in the analysis of the single-phase equivalent circuit. The simulation is generally performed for a desired slip value to obtain the frequency. With the approach proposed in this study, however, the rotor slip and the frequency of the terminal voltage are both determined directly. The proposed algorithm is verified by experiments on a three-phase 1.1 kW squirrel cage induction machine.
Keywords: iterative solution approach; renewable energy; self-excited induction generator; SCIG; stand-alone systems; steady state analysis.
New weighted clustering ensemble based on external index and subspace attributes partitions for large features datasets
by Nadjia Khatir, Nait-Bahloul Safia
Abstract: Real-world datasets are commonly large and involve many features. This is due to the variety of domains from which they are obtained or to the impact of diverse feature extraction techniques. Relatively few works in the literature address selecting and weighting relevant features for the purpose of clustering data. To cope with this issue, this paper proposes a new feature selection framework based on weighting partitions, in conjunction with a clustering ensemble, for datasets with many features.
The proposed clustering framework is designed at two levels. At the first level, partitions of features are generated from the original dataset using graph partitioning methods, where the nodes represent the feature vectors and the links (edges) are the correlation coefficients between these features; then, as each subset of features gives a different representation of the same data, a baseline clustering algorithm is applied to each subset. At the second level, a new clustering ensemble scheme combines all the partitions, using a new weight based on an external index to assess the quality of each clustering partition before incorporating them into the final partition of the
Six real-world datasets from both image and biological domains are chosen for evaluation, and an average accuracy between 75.18% and 98.04% is achieved. Results show that the proposed technique outperforms state-of-the-art methods in terms of both effectiveness and efficiency.
Keywords: Clustering; Consensus; weighted partitions; multi-features data; Data fusion.
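The first level described above (features as graph nodes, correlation coefficients as edges) can be illustrated with a small sketch. The abstract does not name the graph partitioning method, so this example swaps in a deliberately simple stand-in: features are linked when the absolute Pearson correlation reaches a threshold, and each connected component becomes one feature subset. The function names and the threshold of 0.8 are assumptions made for illustration only.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b) if var_a and var_b else 0.0


def feature_groups(columns, threshold=0.8):
    """Group feature columns whose |correlation| is at least `threshold`.

    Stand-in for graph partitioning: connected components of the
    thresholded correlation graph, found with union-find.
    """
    parent = list(range(len(columns)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            if abs(pearson(columns[i], columns[j])) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(columns)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Each returned group is a list of feature indices; a baseline clustering algorithm would then be run once per group.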
Heat Transfer Dynamics Modeling by Means of Clustering and Swarm Methods
by Oualid LAMRAOUI, Yassine BOUDOUAOUI, Hacene HABBI
Abstract: This paper deals with modeling heat transfer dynamics in a thermal exchanger process using fuzzy prediction approaches. Clustering and swarm-based optimization methods are used to derive heat transfer dynamical models that predict temperature variations of the hot and cold fluids in the exchanger. The clustering method relies on a one-shot potential calculation strategy to extract the distribution of fuzzy sets from the data space, whereas the swarm optimization method employs a subject function to optimize the premise and conclusion parameters of the fuzzy structure. Experimental data from a pilot exchanger process are used to learn the fuzzy models, and their performance is compared on both training and testing measurement data.
Keywords: fuzzy clustering; fuzzy modeling; artificial bee colony optimization; metaheuristic optimization; heat exchanger.
A Comparison of Health Informatics Education in the United States
by D. Cenk Erdil
Abstract: The need for health informatics professionals has been increasing recently. One common task in health informatics is to collect data. There is also a significant need to orchestrate the collection of data through informatics infrastructure, manage computing resources, store data, and operate network-enabled medical devices. In addition, in many medical fields, the overall need to support complex medical devices is also increasing. Existing health informatics undergraduate programs in the United States do not adequately equip students with the skills to address these challenges, mostly due to limited STEM-focused courses. Thus, a skills gap arises between graduating health informatics professionals and typical job requirements in many health informatics fields, a gap that has traditionally been addressed by employing graduates with computer science and engineering degrees. Moreover, graduate programs in many health informatics fields consistently offer fewer credits in computer science and information technology than other informatics fields. For example, a basic analysis of public health graduate programs shows that the ratio of STEM credits in public health informatics to those in other health informatics fields is 1 to 2. This article provides a basic analysis of both undergraduate and graduate programs in health informatics in the United States. At the graduate level, it highlights the differences across common informatics programs in medical sciences. At the undergraduate level, it proposes more STEM-focused undergraduate degree options to complement existing undergraduate programs, which could have an immediate effect by giving students proper training to help narrow this technology-specific skills gap.
Keywords: Health informatics; information technology; computer science and engineering; big data; engineering education; STEM education.
Special Issue on: Advances in Intelligent Big Data Analytics
An Ensemble Clustering Method for Intrusion Detection
by Kapil K. Wankhade, Kalpana C. Jondhale
Abstract: The amount of data in computer networking is growing rapidly, which poses new challenges for intrusion detection systems (IDS). To handle this increasing volume of data, new hybrid approaches must be developed that achieve a high detection rate and a low false alarm rate. An intrusion detection system plays a vital role in detecting malicious attacks, and data mining and machine learning techniques are important to that detection. This paper focuses on detection rate and false alarm rate, and proposes a hybrid method, ensemble clustering, to address them. The method aims to increase the detection rate while lowering the false alarm rate. It has been tested on the KDDCup99 network intrusion dataset and performs well compared with other algorithms in terms of detection rate and false alarm rate.
Keywords: boosting; classification; clustering; data mining; divide and merge; detection rate; false alarm rate; intrusion detection system; ensemble method; k-means.
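The two evaluation measures named in this abstract have standard definitions: detection rate is the fraction of attacks that are flagged, and false alarm rate is the fraction of normal records that are wrongly flagged. A minimal sketch follows, assuming binary labels (1 for attack, 0 for normal); the function name is hypothetical and the code is not taken from the paper.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_alarm_rate) for binary labels.

    y_true, y_pred: sequences of 1 (attack) or 0 (normal).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, not flagged

    detection_rate = tp / (tp + fn) if tp + fn else 0.0
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0
    return detection_rate, false_alarm_rate
```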
Empirical Investigation of Dimension Hierarchy Sharing Based Metrics for Multidimensional Schema Understandability
by Anjana Gosain, Jaspreeti Singh
Abstract: In recent years, quality has gained a lot of importance in the development of data warehouse systems. Predicting the understandability of multidimensional schemas could play a key role in controlling data warehouse quality at early stages of development. In this area, some effort has been spent to define structural metrics and identify models for assessing the quality of these systems. Of the structural properties used to define metrics, dimension hierarchies and their sharing play a primary role in enhancing the analytical capabilities of multidimensional schemas, thereby affecting their quality. The authors have previously proposed structural metrics based on these aspects. The objective of this study is to apply Principal Component Analysis (PCA) to find whether our metrics are improvements over the other existing metrics, and to apply Logistic Regression to study whether the metrics (selected as relevant in the extracted principal components), combined together, are indicators of multidimensional schema understandability. The results of PCA confirm that our structural metrics based on the concept of sharing are different from other such metrics existing in the literature. Further, the metrics selected as principal components can be used in combination to predict the understandability of data warehouse multidimensional schemas.
Keywords: Data Warehouse; Quality Metrics; Principal Component Analysis; Logistic Regression; Understandability; Multidimensional Schemas.
Detecting Concept Drift using HEDDM in Data Stream
by Snehlata S. Dongre, Latesh G. Malik, Achamma Thomas
Abstract: In an evolving data stream, a change in the underlying concept is known as concept drift. Detecting and handling concept drift is a challenging task in data stream mining. If an algorithm is not adapted to concept drift, its performance is directly affected. A number of algorithms have been developed to handle concept drift, but they are not suited to both sudden and gradual concept drift. Thus, there is a demand for an algorithm that can react to both types of concept drift while incurring less computational cost. A new approach, the Hybrid Early Drift Detection Method (HEDDM), is proposed for drift detection; it works with an ensemble method to improve performance.
Keywords: Concept drift; data stream; classification; ensemble classifier; concept drift detection; DDM; EDDM; HEDDM; data stream mining; evolving data stream.
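The abstract does not describe how HEDDM works internally, so the sketch below shows only the classical Drift Detection Method (DDM) listed in the keywords, on which such hybrids build. It tracks the running error rate p and its standard deviation s, remembers the minimum of p + s, and signals a warning or a drift when the current p + s exceeds that minimum by two or three standard deviations. The class name, the 30-sample warm-up and the return strings are illustrative assumptions.

```python
import math


class DDM:
    """Sketch of the classical Drift Detection Method (not HEDDM)."""

    def __init__(self, min_samples=30, warning_level=2.0, drift_level=3.0):
        self.min_samples = min_samples
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        self.n = 0
        self.p = 1.0            # running error rate
        self.s = 0.0            # standard deviation of the error rate
        self.p_min = math.inf   # p at the point where p + s was lowest
        self.s_min = math.inf   # s at the point where p + s was lowest

    def update(self, error):
        """Feed one prediction outcome: 1 if misclassified, 0 if correct.

        Returns "stable", "warning" or "drift".
        """
        self.n += 1
        self.p += (error - self.p) / self.n  # incremental mean
        self.s = math.sqrt(max(0.0, self.p * (1.0 - self.p)) / self.n)

        if self.n < self.min_samples:
            return "stable"

        if self.p + self.s < self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, self.s

        if self.p + self.s > self.p_min + self.drift_level * self.s_min:
            self.reset()  # start learning the new concept from scratch
            return "drift"
        if self.p + self.s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"
```

In use, the detector is fed the 0/1 error of the base classifier after every prediction, and the classifier is retrained or replaced when "drift" is returned.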
Dynamic Social Network Analysis and Performance Evaluation
by Sanur Sharma, Anurag Jain
Abstract: Social media usage is increasing tremendously, as is the enormous amount of data it generates, which includes users' personal details, their images and the content shared on such open source platforms. This has led to a lot of research on and analysis of such networks and the data that exists in social media. This paper focuses on dynamic analysis of social networks, where snapshots of the network are taken at regular intervals and analysed on various performance measures. The real-time email dataset of a company (ENRON) has been evaluated and visualized dynamically. The network measures are evaluated at each timestamp, clustering is performed on that data, and its performance is calculated on various measures. A tabu search optimization algorithm is used to cluster the timestamped data, and fixed-size clusters are compared with variable-size clusters. The results suggest that for certain timestamps the precision, recall and F-measure values for fixed-size clusters are better than those for variable-size clusters. These measures can further be used to select dynamic clustering techniques for social network analysis.
Keywords: Social Network; Dynamic Social Network; Clustering; Dynamic Network Analysis; Data Mining.
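The abstract does not state how precision, recall and F-measure were computed for clusters. One common convention, assumed here purely for illustration, is the pairwise definition: a pair of items counts as a true positive when it shares a cluster in both the reference and the predicted clustering.

```python
from itertools import combinations


def pairwise_prf(true_labels, pred_labels):
    """Pairwise precision, recall and F-measure of a clustering.

    A pair of items is a "positive" when both items share a cluster.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_pred and same_true:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure
```

Computing these three values per snapshot gives one comparable score series for each clustering variant.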
Measuring harmfulness of class imbalance by data complexity measures in oversampling methods
by Deepika Singh, Anjana Gosain, Anju Saha
Abstract: Many real-world applications involve skewed datasets, which result in the class imbalance problem. During classification, class imbalance causes underestimation of minority classes. Researchers have proposed a number of algorithms to deal with this problem. However, recent studies have shown that some skewed datasets are not harmful, and that applying class imbalance algorithms to these datasets leads to degraded performance and increased execution time. In this paper, we pre-estimate the degree of harmfulness of class imbalance for skewed classification problems using two data complexity measures: a scatter-matrix-based class separability measure and the ratio of intra-class to inter-class nearest neighbors. The performance of oversampling-based class imbalance classification algorithms is also analyzed with respect to these data complexity measures. The experiments are conducted using k-nearest neighbor (k-NN) and naive Bayes as the base classifiers. The results illustrate the usefulness of these measures in providing prior information about the nature of imbalanced datasets, which helps in selecting the more efficient classification algorithm.
Keywords: class imbalance; data complexity measure; class separability measure; class overlapping; inter-class nearest neighbor; intra-class nearest neighbor; imbalance ratio; oversampling method.
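As an illustration of the second complexity measure, the sketch below computes the imbalance ratio and the ratio of summed intra-class to summed inter-class nearest-neighbor distances (the measure often labelled N2 in the data complexity literature). It assumes Euclidean distance, at least two classes, and at least two samples per class; the function names are hypothetical and the exact variant used in the paper is not stated in the abstract.

```python
import math


def imbalance_ratio(labels):
    """Majority class size divided by minority class size."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return max(counts.values()) / min(counts.values())


def nn_distance_ratio(X, y):
    """Sum of intra-class NN distances over sum of inter-class NN distances.

    Lower values suggest better separated classes. Assumes every class
    has at least two samples and that there are at least two classes.
    """
    intra_total = inter_total = 0.0
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Nearest neighbor from the same class (excluding the point itself).
        intra_total += min(
            math.dist(xi, xj)
            for j, (xj, yj) in enumerate(zip(X, y))
            if j != i and yj == yi
        )
        # Nearest neighbor from any other class.
        inter_total += min(
            math.dist(xi, xj) for xj, yj in zip(X, y) if yj != yi
        )
    return intra_total / inter_total
```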
Threshold based Empirical Validation of Object-Oriented Metrics on Different Severity Levels
by Aarti Aarti, Geeta Sikka, Renu Dhir
Abstract: Software metrics have become essential for predicting fault-proneness, reusability and effort. To strengthen the adequacy of object-oriented (OO) metrics, it is crucial to understand the relationship between OO metrics and fault-proneness at distinct severity levels. This paper focuses on identifying the software parts with a higher probability of faults. We examine the effect of thresholds on the OO metrics and build a predictive model based on those threshold values. The paper also presents an empirical validation of the threshold values calculated for the OO metrics for predicting faults at different severity levels, and builds a statistical model using logistic regression. It describes the detection of fault-proneness by extracting the relevant OO metrics, and focuses on projects that fall outside the specified risk level so that more resources can be allocated to them. We present the effects of threshold values at different risk levels and validate the results on the KC1 dataset using machine learning with different classifiers. The results, evaluated using receiver operating characteristic (ROC) parameters, indicate that the threshold methodology has great potential for fault prediction and that the machine learning techniques outperform logistic regression.
Keywords: Fault; Object-oriented (OO) metrics; Classification; ROC; Level of severity; Empirical Validation.
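A small sketch may help make the threshold idea concrete: a class is flagged as fault-prone when an OO metric value exceeds a threshold, and the metric's discriminative power is summarized by the area under the ROC curve. The abstract does not say how the thresholds were derived, so the threshold here is simply a parameter; the function names and the sample values in the usage note are invented for illustration.

```python
def threshold_rates(values, labels, threshold):
    """Return (sensitivity, specificity) when value > threshold means faulty.

    values: metric value per class; labels: 1 if faulty, 0 otherwise.
    """
    tp = sum(1 for v, l in zip(values, labels) if l == 1 and v > threshold)
    fn = sum(1 for v, l in zip(values, labels) if l == 1 and v <= threshold)
    fp = sum(1 for v, l in zip(values, labels) if l == 0 and v > threshold)
    tn = sum(1 for v, l in zip(values, labels) if l == 0 and v <= threshold)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def auc(values, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen faulty class has a
    higher metric value than a randomly chosen non-faulty one, with
    ties counted as one half.
    """
    pos = [v for v, l in zip(values, labels) if l == 1]
    neg = [v for v, l in zip(values, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one faulty and one non-faulty sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))
```

For example, with made-up metric values `[2, 3, 8, 10, 12, 9]` and fault labels `[0, 0, 1, 1, 0, 1]`, a threshold of 5 catches every faulty class but also flags one non-faulty class.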