International Journal of Business Intelligence and Data Mining (69 papers in press)
OLAP technology and Machine learning as the tools for validation of the Numerical Models of Convective Clouds
by Elena N. Stankova, Andrey V. Balakshiy, Dmitry A. Petrov, Vladimir V. Korkhov
Abstract: In the present work we use machine learning and OLAP technologies for more accurate forecasting of phenomena such as thunderstorms, hail and heavy rain using a numerical model of a convective cloud. Three machine learning methods (Support Vector Machine, Logistic Regression and Ridge Regression) are used to decide whether or not a dangerous convective phenomenon occurs under the present atmospheric conditions. The OLAP technology is used to develop the concept of a multidimensional database intended for distinguishing the types of phenomena (thunderstorm, heavy rainfall and light rain). A previously developed complex information system is used for collecting data about the state of the atmosphere and about the place and time at which dangerous convective phenomena are recorded.
Keywords: OLAP; online analytical processing; machine learning; validation of numerical models; numerical model of convective cloud; weather forecasting; thunderstorm; multidimensional data base; data mining.
Modelling Economic Choice under Radical Uncertainty: Machine Learning Approaches
by Antov Gerunov
Abstract: This paper utilises a novel experimental dataset on consumer choice
to investigate and benchmark the performance of alternative statistical models
under conditions of extreme uncertainty. We compare the results of logistic
regression, discriminant analysis, na
Keywords: choice; decision-making; social network; machine learning; uncertainty; logistic regression; neural network; random forest; consumer choice; modeling.
Rough Set Theory-Based Feature Selection and FGA-NN Classifier for Medical Data Classification
by B. Vijayalakshmi, Sugumar Rajendran
Abstract: The prediction of heart disease is a difficult task, which needs much
experience and knowledge. In order to reduce the risk of heart disease
prediction, in this paper we propose a rough set theory-based feature selection
and FGA-NN classifier. The overall process of the proposed system consists of
two main steps: 1) feature reduction; 2) heart disease prediction. At
first, the kernel fuzzy c-means clustering with rough set theory (KFCMRS)
algorithm is applied to the high dimensional data to reduce the dimension of the
attributes. After that, the medical data classification is done through the FGA-NN
classifier. To improve the prediction performance, a hybridisation of the firefly and
genetic algorithms (FGA) is utilised with the NN for weight optimisation. Finally,
the experimentation is performed on the Cleveland, Hungarian, and
Switzerland datasets. The experimental results show that the FGA-NN
classifier outperforms the existing approach, attaining an accuracy of 83%.
Keywords: Heart disease; FGA-NN; KFCMRS; scaled conjugate gradient; prediction; feature reduction; optimisation.
Students Performance Prediction using Hybrid Classifier Technique in Incremental Learning
by Roshani Ade
Abstract: Performance in higher education is a turning point in the academic life of all students. This academic performance is influenced by many factors; therefore it is essential to develop a predictive data mining model of students' performance so as to identify the difference between fast learners and slow learners. The knowledge is hidden in the educational data set and can be extracted through data mining techniques. In our paper we use a hybrid classifier approach for the prediction of students' performance using the Fuzzy ARTMAP and Bayesian ARTMAP classifiers. Sensitivity analysis was performed and irrelevant inputs were eliminated. The performance measures used to evaluate the techniques include the Matthews Correlation Coefficient (MCC), Accuracy Rate, True Positive, False Positive and percentage of correctly classified instances. The combined result gives good accuracy for predicting students
Keywords: Hybrid Classifier; Incremental Learning; Fuzzy ARTMAP; MCC.
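The performance measures named in the abstract above can be computed directly from confusion-matrix counts. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and returning 0 when the MCC denominator vanishes is a common convention assumed here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: MCC is reported as 0 when a marginal is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified instances."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Illustrative counts only; they are not results from the paper.
    print(matthews_corrcoef(40, 45, 5, 10), accuracy(40, 45, 5, 10))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when one class (for example, slow learners) is much rarer than the other.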
Privacy Preserving Data Mining using Hiding Maximum Utility Item First Algorithm By means of Grey wolf optimisation Algorithm
by M.T. Ketthari, Rajendran Sugumar
Abstract: In privacy preserving data mining, utility mining plays a
vital part. The objective of the suggested technique is achieved by concealing
the highly sensitive item sets with the help of the hiding maximum utility item
first (HMUIF) algorithm, which evaluates the sensitive item sets by
exploiting the user-defined utility threshold value. It
estimates the sensitive item sets by utilising an optimal threshold value
obtained by means of the grey wolf optimisation (GWO) algorithm. The optimised
threshold value is then checked in a performance analysis employing
several constraints such as the HF, MC and DIS. The novel technique is implemented
and the item sets resulting from the optimal threshold are assessed and contrasted with
those of diverse optimisation approaches. The novel HMUIF considerably cuts
down the computational complexity, thereby paving the way for the
enhancement in hiding performance of the item sets.
Keywords: Data Mining; Privacy Preserving Utility Mining; Sensitive Item sets; optimal threshold; Grey wolf optimisation.
Fuzzy-MCS algorithm based Ontology generation for E-Assessment
by A. Santhanavijayan, S.R. Balasundaram
Abstract: Ontologies can lead to important improvements in the definition of a
course's knowledge domain, in the generation of an adapted learning path, and
in the assessment phase. This paper provides an initial discussion of the role of
ontologies in the context of e-learning. Generally, automatic assessment is
preferred over manual assessment because it avoids bias and human errors and
saves teachers' time. Evaluation through objective tests such as multiple
choice questions has gained a lot of importance in e-assessment systems.
Here we propose an efficient ontology generation method based on soft
computing techniques in e-assessment for multiple choice questions. We
employ fuzzy logic combined with an optimisation algorithm, the modified
cuckoo search (MCS) algorithm. A set of rules is first designed for creating the
ontology. The rules are generated using fuzzy logic and are then
optimised in order to generate a better ontology structure.
Keywords: Ontologies; MCS algorithm; Fuzzy; e-learning.
Minimal constraint based cuckoo search algorithm for Removing Transmission Congestion and Rescheduling the Generator units
by N. Chidambararaj, K. Chitra
Abstract: In this paper, a minimal constraint based cuckoo search (CS)
algorithm is proposed for solving the transmission congestion problem by
considering both increases and decreases in generation power. The proposed
algorithm is used to optimise the real power changes of the generators when
transmission congestion occurs. The power loss, generator sensitivity
factor and congestion management cost of the system are then evaluated by the
proposed algorithm according to the transmission congestion. The proposed
method is implemented on the MATLAB platform and its congestion
management performance is analysed. The performance of the proposed
method is compared with that of existing methods such as fuzzy adaptive
bacterial foraging (FABF), simple bacterial foraging (SBF), particle swarm
optimisation (PSO), and artificial neural network (ANN)-CS. The
congestion management cost is reduced by up to 26.169%. The
comparison shows that the proposed technique outperforms
the other existing techniques in terms of congestion management measures.
Keywords: minimal constraint based CS algorithm; PSO; ANN; real power; congestion management; power loss and congestion management cost.
Effective Discovery of Missing Links in Citation Networks Using Citation Relevancy Check Process
by Nivash J P, L.D. Dhinesh Babu
Abstract: Effective dissemination of knowledge published by eminent authors
in reputed journals and ensuring that the referred work is cited properly is the
need of the hour. Citation analysis is about the similarity measures of articles or
journals which are put forward to scaling as well as clustering procedures. A
proper citation relevancy check (CRC) is required to avoid the missing links in
the citation networks. Both similar and dissimilar references in the articles have
important article citations. The purpose of this work is to devise a method to find
the most significant articles which can provide useful information to the journal
editors and writers. The strategy presented in this paper can assist an author to
incorporate most important articles and can help the editor in evaluating the
quality of the references. The main benefit in detecting the missing articles is
improvement in quality of research along with increased citation count.
Keywords: Citation network analysis; Missing citations; Citation relevancy check; Increasing citation count.
A Distributed Cross-layer Recommender System Incorporating Product Diffusion
by Ephina Thendral, C. Valliyammai
Abstract: In this era of online retailing, personalisation of web content has
become essential. A recommender system is a tool for extraction of relevant
information to render personalisation in web information retrieval systems.
With an inclination towards customer oriented service, there is a need to
understand the adaptability of customers, to provide products/services of
interest at the right time. In this paper, a model for distributed context aware
cross layer recommender system incorporating the principle of product
diffusion is proposed. The offline-online modelled recommender system learns
offline about the adaptation time of users using the principle of product
diffusion and then, uses online explore-then-exploit strategy to make effective
recommendations to the user at the most probable time of consumption. Also,
an algorithm based on product adaptability is proposed for recommending new
items to the most probable users. The extensive experiments and results
demonstrate the efficiency, scalability, reliability and enhanced retrieval
effectiveness of the proposed recommender system model.
Keywords: Recommender Systems; Personalization; Product Diffusion; Distributed Graph Model; Hadoop; Hbase; Titan graph database; Spark; Cross layer; Distributed processing.
A Critique of Imbalanced Data Learning Approaches for Big Data Analytics
by Amril Nazir
Abstract: Biomedical research is becoming reliant on multi-disciplinary,
multi-institutional collaboration, and data sharing is becoming increasingly
important for researchers to reuse experiments, pool expertise and validate
approaches. However, there are many hurdles for data sharing, including the
unwillingness to share, lack of flexible data model for providing context
information for shared data, difficulty to share syntactically and semantically
consistent data across distributed institutions, and expensive cost to provide
tools to share the data. In our work, we develop a web-based collaborative
biomedical data sharing platform SciPort to support biomedical data sharing
across distributed organisations. SciPort provides a generic metadata model for
researchers to flexibly customise and organise the data. To enable convenient
data sharing, SciPort provides a central server-based data sharing architecture,
where data can be shared by one click through publishing metadata to the
central server. To enable consistent data sharing, SciPort provides collaborative
distributed schema management across distributed sites. To enable semantic
consistency for data sharing, SciPort provides semantic tagging through
controlled vocabularies. SciPort is lightweight and can be easily deployed for
building data sharing communities for biomedical research.
Keywords: imbalanced big data learning; large-scale imbalanced data analysis; high-dimensional imbalanced data learning.
A Novel Multi-class Ensemble model based on feature selection using Hadoop framework for classifying imbalanced Biomedical Data
by Thulasi Bikku, N. Sambasiva Rao, Ananda Rao Akepogu
Abstract: Due to the exponential growth of biomedical repositories such as
PubMed and Medline, an accurate predictive model is essential for knowledge
discovery in Hadoop environment. Traditional decision tree models such as
multi-variate Bernoulli model, random forest and multinomial na
Keywords: Ensemble model; Hadoop; Imbalanced data; Medical databases; Textual Decision Patterns.
An optimised approach to detect the identity of hidden information in gray scale and colour images
by Murugeswari Ganesan, Deisy Chelliah, Ganesan Govindan
Abstract: Feature-based steganalysis is an emerging trend in the domain of
Information Forensics that aims to discover the identity of secret information
present in covert communication by analysing the statistical features of the
cover/stego image. Due to massive volumes of auditing data as well as the complex
and dynamic behaviour of steganogram features, optimising those features is
an important open problem. This paper focuses on optimising the number of
features using the proposed quick artificial bee colony (qABC) algorithm. We
test three steganalysers, namely subtractive pixel adjacency matrix
(SPAM), phase aware projection model (PHARM) and colour filter array
(CFA), on the break our steganographic system (BOSS) 1.01 datasets. The
significant improvement in the convergence of qABC quickly improves
the solution and fine-tunes the search better than its real counterparts. The results
reveal that the qABC method with a support vector machine (SVM) classifier
outperforms the non-optimised version in terms of classification accuracy and
a reduced number of features.
Keywords: Steganalysis; Feature Selection; Optimisation; Classification.
An Effective Preprocessing Algorithm for Model Building in Collaborative Filtering based Recommender System
by Srikanth T, M. Shashi
Abstract: Recommender systems suggest interesting items to online users based on the ratings they have expressed for other items, which are maintained globally as the rating matrix. The rating matrix is often sparse and very large because a large number of users express ratings for only a few items among the large number of alternatives. Sparsity and scalability are the challenging issues in achieving accurate predictions in recommender systems. This paper focuses on a model building approach to collaborative filtering-based recommender systems using low rank matrix approximation algorithms to achieve scalability and accuracy while dealing with sparse rating matrices. A novel preprocessing methodology is proposed to counter the data sparsity problem by making the sparse rating matrix denser before extracting latent factors that appropriately characterise the users and items in a low dimensional space. The quality of predictions made either directly or indirectly through user clustering was investigated and found to be competitive with existing collaborative filtering methods in terms of reduced MAE and increased NDCG values on benchmark datasets.
Keywords: Recommender System; Collaborative Filtering; Dimensionality Reduction; Pre-Processing; Sparsity; Scalability; Matrix Factorization.
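The two stages described in the abstract above (make the rating matrix denser, then extract latent factors by low rank approximation) can be illustrated with a small sketch. This is not the authors' preprocessing method, which the abstract does not specify: the mean-based fill rule, the function names and the power-iteration factorisation are all assumptions chosen to keep the example dependency-free.

```python
import math


def densify(R):
    """Fill missing ratings (None) with the average of the user mean and the
    item mean. This fill rule is an assumption made for illustration."""
    n_users, n_items = len(R), len(R[0])
    observed = [r for row in R for r in row if r is not None]
    global_mean = sum(observed) / len(observed)

    def mean(values):
        values = [v for v in values if v is not None]
        return sum(values) / len(values) if values else global_mean

    user_mean = [mean(row) for row in R]
    item_mean = [mean([R[u][i] for u in range(n_users)]) for i in range(n_items)]
    return [[R[u][i] if R[u][i] is not None else (user_mean[u] + item_mean[i]) / 2
             for i in range(n_items)] for u in range(n_users)]


def leading_triplet(M, iters=200):
    """Leading singular triplet (sigma, u, v) of M by power iteration."""
    n, m = len(M), len(M[0])
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(m)) for i in range(n)]
        w = [sum(M[i][j] * u[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    u = [sum(M[i][j] * v[j] for j in range(m)) for i in range(n)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma:
        u = [x / sigma for x in u]
    return sigma, u, v


def low_rank(M, k):
    """Rank-k approximation built by repeatedly deflating the leading triplet.
    A simplification: a production system would use a library SVD."""
    n, m = len(M), len(M[0])
    residual = [row[:] for row in M]
    approx = [[0.0] * m for _ in range(n)]
    for _ in range(k):
        sigma, u, v = leading_triplet(residual)
        for i in range(n):
            for j in range(m):
                approx[i][j] += sigma * u[i] * v[j]
                residual[i][j] -= sigma * u[i] * v[j]
    return approx
```

A predicted rating for user `u` and item `i` is then read from `low_rank(densify(R), k)[u][i]`; the rank `k` controls how many latent factors characterise users and items.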
Error Tolerant Global Search Incorporated With Deep Learning Algorithm to Automatic Hindi Text Summarization
by J. Anitha, P.V.G.D. Prasad Reddy, M.S. Prasad Babu
Abstract: There has been an exponential growth in the available electronic
information in the last two decades. This creates a pressing need to quickly
understand high volumes of text data. This paper describes an efficient algorithm
that works by assigning scores to sentences in the document which is to be
summarised. It also focuses on document extracts, a particular kind of
computed document summary. The proposed approach uses a fuzzy classifier and
a deep learning algorithm. The fuzzy classifier produces a score for each sentence and
the deep learning (DL) algorithm also produces a score for each sentence. The combination
of the scores from both the fuzzy classifier and DL produces the hybrid score. Finally,
the summarised text is generated based on this hybrid score. In our
proposed approach, we have achieved an average precision rate of 0.92 and
average recall rate of 0.88 and the compression rate is 10% according to the
Keywords: GSA; Fuzzy; summarisation; hybrid; deep learning.
Network Affinity Aware Energy Efficient Virtual Machine Placement Algorithm
by Ranjana Ramamurthy, S. Radha, J. Raja
Abstract: Efficient mapping of virtual machine (VM) requests to the available
physical machines (PMs) is an optimisation problem in data centres. It is solved by
aiming to minimise the number of physical machines and utilising them to their
maximum capacity. Another avenue of optimisation in data centres is energy
consumption. Energy consumption can be reduced by using fewer physical
machines for a given set of VM requests. An attempt is made in this work to
propose an energy efficient VM placement algorithm that is also network
affinity aware. Considering the network affinity between VMs during
placement reduces the communication cost and the network overhead. The
proposed algorithm is evaluated using the CloudSim toolkit and its
performance in terms of energy consumed, communication cost and number of
active PMs is compared with the standard first fit greedy algorithm.
Keywords: Virtualisation; affinity aware; cloud computing; virtual machine placement; network affinity.
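The first fit greedy baseline mentioned in the abstract above can be sketched as follows. This illustrates only the baseline, not the proposed affinity-aware algorithm. The single-resource capacity model, the `traffic` dictionary and the cost definition (traffic between VMs on different PMs) are assumptions made for illustration.

```python
def first_fit(vm_demands, pm_capacity):
    """Place each VM on the first active PM with enough free capacity,
    opening a new PM only when none fits. Returns (placement, active_pms),
    where placement[i] is the index of the PM hosting VM i."""
    remaining = []   # free capacity of each active PM
    placement = []
    for demand in vm_demands:
        if demand > pm_capacity:
            raise ValueError("VM demand exceeds the capacity of a single PM")
        for pm, free in enumerate(remaining):
            if demand <= free:
                remaining[pm] -= demand
                placement.append(pm)
                break
        else:
            # No active PM can host this VM: switch on a new one.
            remaining.append(pm_capacity - demand)
            placement.append(len(remaining) - 1)
    return placement, len(remaining)


def communication_cost(placement, traffic):
    """Sum of traffic between VM pairs that ended up on different PMs.
    `traffic` maps (vm_a, vm_b) to a traffic volume (hypothetical metric)."""
    return sum(volume for (a, b), volume in traffic.items()
               if placement[a] != placement[b])


if __name__ == "__main__":
    placement, active = first_fit([4, 8, 1, 4, 2, 1], pm_capacity=10)
    print(placement, active)
    print(communication_cost(placement, {(0, 1): 5, (0, 2): 3}))
```

Because first fit ignores which VMs communicate with each other, heavily communicating VMs can land on different PMs; an affinity-aware placement tries to reduce exactly that cross-PM traffic.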
A Secured Best Data Center Selection in Cloud Computing Using Encryption Technique
by Prabhu A., M. Usha
Abstract: In this work, we propose an approach for providing very high security to the cloud system. Our proposed method comprises three phases, namely the authentication phase, the cloud data centre selection phase and the user related service agreement phase. To access data from the cloud server, a secure authentication key is needed. In the authentication phase, the user is authenticated, obtains the key and then encrypts the file using the Blowfish algorithm. Before encryption, the input data is divided column-wise with the help of a pattern matching approach. In this approach, the encryption and decryption processes are carried out by employing the Blowfish algorithm. The cloud data centre in which to store the data is selected optimally; here the location is selected with the help of the bat algorithm. In the final phase, the user service agreement is verified. The implementation will be carried out in the CloudSim simulator.
Keywords: Authentication key; blowfish; Bat algorithm; pattern match; Cloud Data Center Selection.
Combined Local color curvelet and mesh pattern for image retrieval system
by Yesubai Rubavathi
Abstract: This manuscript presents a content based image retrieval
system using new textural features, namely a colour local curvelet (CLC) based
textural descriptor and the colour local mesh pattern (CLMP), with the aim
of improving the performance of the image retrieval system. The proposed
methods can utilise the distinctive details obtained from the spatial
coloured textural patterns of various spectral components within a particular
local image region. Furthermore, to gain the benefit of the harmonising
effect of joint colour texture information, the opponent colour textural
features that capture the texture patterns of spatial interactions among spectral
planes are also integrated into the creation of CLC and CLMP. Extensive
comparative experiments have been conducted on two benchmark databases,
i.e., Corel-1k and MIT VisTex. Retrieval results show that image retrieval using
colour local texture features yields better precision and recall than retrieval
approaches using either colour or texture features alone.
Keywords: Content based image retrieval system; Curvelet transform; Local mesh pattern; Color local curvelets; Color local mesh pattern.
Fuzzy Based Automated Interruption Testing Model for Mobile Applications
by Malini A, K. Sundarakantham, C. Mano Prathibhan, A. Bhavithrachelvi
Abstract: Testing of mobile applications during the occurrence of interrupts is
termed as interrupt testing. Interrupts can occur either internally within the
mobile or from other external factors or systems. Interruptions in any
smartphone may decrease the performance of mobile applications. In this paper, an
automated interruption testing model is proposed to analyse the responsiveness
of mobile applications during interrupts. This model monitors the applications
installed in the mobile devices and evaluates the overall performance of mobile
applications during interrupt using fuzzy logic. An enhanced MobiFuzzy
evaluation system (MFES) is proposed that is used to dynamically analyse the
test results and identify necessary information required for tuning the
application. Fuzzy logic will help developers or testers in tuning the
application performance by automatically categorising the impact
Keywords: Mobile application testing; Interrupt testing; Application tracker; Performance testing.
Evolution of Singular Value Decomposition in Recommendation Systems: A Review
by Rachana Mehta, Keyur Rana
Abstract: Proliferation of internet and web applications has led to
exponential growth of users and information over web. In such information
overload scenarios, recommender systems have shown their prominence by
providing user with information of their interest. Recommender systems
provide item recommendation or generate predictions. Amongst the various
recommendation approaches, collaborative filtering techniques have become
prominent because of their wide item applicability. Model-based collaborative
filtering techniques, which use a parameterised model for prediction, are
preferred over their memory-based counterparts. However, the
existing techniques deal with static data and are less accurate over sparse,
high dimensional data. In order to alleviate such issues, matrix factorisation
techniques like singular value decomposition are preferred. These techniques
have evolved from using simple user-item rating information to auxiliary
social and temporal information. In this paper, we provide a comprehensive
review of such matrix factorisation techniques and their applicability to
different input data.
Keywords: Recommendation System; Collaborative filtering; Matrix factorization; Singular Value Decomposition; Information retrieval; Data mining; Auxiliary information; Latent features; Model learning; Data sparsity.
Investigating Different Fitness Criteria for Swarm-based Clustering
by Maria P.S. Souza, Telmo M. Silva Filho, Getulio J.A. Amaral, Renata M.C.R. Souza
Abstract: Swarm-based optimisation methods have been previously used for
tackling clustering tasks, with good results. However, the results obtained by
this kind of algorithm are highly dependent on the chosen fitness criterion.
In this work, we investigate the influence of four different fitness criteria
on swarm-based clustering performance. The first function is the typical
sum of distances between instances and their cluster centroids, which is the
most used clustering criterion. The remaining functions are based on three
different types of data dispersion: total dispersion, within-group dispersion
and between-groups dispersion. We use a swarm-based algorithm to optimise these criteria and perform clustering tasks with nine real and artificial
datasets. For each dataset, we select the best criterion in terms of adjusted
Rand index and compare it with three state-of-the-art swarm-based clustering
algorithms, trained with their proposed criteria. Numerical results confirm the
importance of selecting an appropriate fitness criterion for each clustering
Keywords: Swarm Optimisation; Fitness criterion; Clustering; Artificial Bee Colony; Particle Swarm Optimisation.
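The four kinds of fitness criterion named in the abstract above can be evaluated for a candidate partition as in the sketch below. The paper's exact formulas are not given in the abstract; the definitions used here are the standard scatter quantities (squared Euclidean distances, cluster means as centroids), and the function names are hypothetical.

```python
import math


def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fitness_criteria(points, labels):
    """Return the four criteria for the partition given by `labels`."""
    clusters = {}
    for p, k in zip(points, labels):
        clusters.setdefault(k, []).append(p)
    centroids = {k: mean(ps) for k, ps in clusters.items()}
    grand_mean = mean(points)
    return {
        # Sum of distances between instances and their cluster centroids.
        "sum_dist": sum(math.sqrt(sq_dist(p, centroids[k]))
                        for p, k in zip(points, labels)),
        # Within-group dispersion: squared distances to the own centroid.
        "within": sum(sq_dist(p, centroids[k]) for p, k in zip(points, labels)),
        # Between-groups dispersion: size-weighted centroid spread.
        "between": sum(len(ps) * sq_dist(centroids[k], grand_mean)
                       for k, ps in clusters.items()),
        # Total dispersion: squared distances to the grand mean.
        "total": sum(sq_dist(p, grand_mean) for p in points),
    }
```

With these definitions the total dispersion equals the within-group plus the between-groups dispersion, so for a fixed dataset minimising the former of the two is equivalent to maximising the latter; a swarm optimiser would call such a function to score each candidate partition.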
A Combined PFCM and Recurrent Neural Network based Intrusion Detection System for Cloud Environment
by Manickam M., N. Ramaraj, C. Chellappan
Abstract: The main objective of this paper is to develop an intrusion detection system for a
cloud environment using a combined PFCM-RNN. Traditional IDSs are not
suitable for cloud environments, as network-based IDSs (NIDS) cannot detect
encrypted node communication and host-based IDSs (HIDS) are not able to
find the hidden attack trail. Traditional intrusion detection is largely
inefficient when deployed in cloud computing environments due to their
openness and specific essence. Accordingly, the proposed work consists of two
modules, namely a clustering module and a classification module. In the clustering
module, the input dataset is grouped into clusters with the use of possibilistic
fuzzy C-means clustering (PFCM). In the classification module, the centroids of
the clusters are given to the recurrent neural network, which is used to classify
whether the data is intruded or not. For experimental evaluation, we use a
benchmark database and the results clearly demonstrate that the proposed technique
outperforms conventional methods.
Keywords: cloud computing; intrusion detection system; Possibilistic Fuzzy C-means clustering; recurrent neural network.
Master node fault tolerance in distributed big data processing clusters
by Ivan Gankevich, Yuri Tipikin, Vladimir Korkhov, Vladimir Gaiduchok, Alexander Degtyarev, A. Bogdanov
Abstract: Distributed computing clusters are often built with commodity
hardware which leads to periodic failures of processing nodes due to
relatively low reliability of such hardware. While worker node fault-tolerance
is straightforward, fault tolerance of master node poses a bigger challenge.
In this paper master node failure handling is based on the concept of master
and worker roles that can be dynamically re-assigned to cluster nodes along
with maintaining a backup of the master node state on one of the worker
nodes. In such a case no special component is needed to monitor the health
of the cluster, while master node failures can be resolved except for the
cases of simultaneous failure of master and backup. We present an experimental
evaluation of the technique implementation and show benchmarks demonstrating
that a failure of a master does not affect a running job, and a failure of a backup
results in re-computation of only the last job step.
Keywords: parallel computing; Big Data processing; distributed computing; backup node; state transfer; delegation; cluster computing; fault-tolerance.
The integration of a newly defined N-gram concept and vector space model for documents ranking
by Mostafa Salama, Wafaa Salah
Abstract: Vector space model (VSM) is used in measuring the similarity
between documents according to the frequency of common words among
them. Furthermore, the N-gram concept is integrated in VSM to take
into consideration the relation between common consecutive words in the
documents. This approach does not consider the context and semantic
dependency between nonconsecutive words existing in the same sentence.
Accordingly, the approach proposed here presents a new definition of the
N-gram concept as N non-consecutive words located in the same sentence,
and utilises this definition in the VSM to enhance the measurement of
the semantic similarity between documents. This approach measures and
visualises the correlation between the words that commonly occur
together within the same sentence to enrich the analysis of domain experts.
The results of the experimental work show the robustness of the proposed
approach against the current ranking models.
Keywords: N-gram; vector space model; Text Mining.
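One possible reading of the redefined N-gram above (N words that share a sentence, whether or not they are adjacent) is sketched below together with a cosine similarity over the resulting count vectors. The tokenisation, the order-insensitive treatment of word combinations and the function names are assumptions, not details taken from the paper.

```python
import math
import re
from collections import Counter
from itertools import combinations


def sentence_ngrams(text, n=2):
    """Count every combination of n distinct words that co-occur in a sentence.
    Words are lower-cased and sorted, so word order is ignored (an assumption)."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = sorted(set(re.findall(r"[a-z0-9]+", sentence)))
        counts.update(combinations(words, n))
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    d1 = sentence_ngrams("Data mining finds patterns. Patterns guide decisions.")
    d2 = sentence_ngrams("Mining of data reveals hidden patterns.")
    print(cosine(d1, d2))
```

In this reading the pair ("data", "patterns") is a feature of the first sentence even though the two words are not adjacent, which a conventional consecutive bigram model would miss.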
Modelling and Simulation of ANFIS-Based MPPT for PV System with Modified SEPIC Converter
by M. Senthil Kumar, P.S. Manoharan, R. Ramachandran
Abstract: This paper presents the modelling and simulation of an adaptive
neuro-fuzzy inference system (ANFIS) based maximum power point tracking
(MPPT) algorithm for a PV system with a modified SEPIC converter.
Conventional MPPT methods have the major drawbacks of high
oscillations at the maximum power point and low efficiency due to the uncertain
nature of solar radiation and temperature. These problems can be
solved by the proposed ANFIS-based MPPT. The proposed work
involves ANFIS and a modified single ended primary inductor converter
(SEPIC) to extract maximum power from PV panel. The results obtained from
proposed methodology are compared with other MPPT algorithms such as
perturb and observe (P&O), incremental conductance (INC) and radial basis
function network (RBFN). The improvement in voltage rating of modified
SEPIC is compared with conventional SEPIC converter. The result confirms
the superiority of the proposed system.
Keywords: ANFIS; INC; Modified SEPIC; P&O; RBFN.
Distributed Algorithms for Improved Associative Multilabel Document Classification considering Reoccurrence of Features and handling Minority Classes
by Preeti Bailke, S.T. Patil
Abstract: Existing work in the domain of distributed data mining mainly
focuses on achieving the speedup and scaleup properties rather than improving
performance measures of the classifier. Improvement in speedup and scaleup is
obvious when a distributed computing platform is used. But its computing power
should also be used for improving performance measures of the classifier. This
paper focuses on the same by considering reoccurrence of features and
handling minority classes. Since it is very time consuming to run such complex
algorithms on large datasets sequentially, distributed versions of the algorithms
are designed and tested on a Hadoop cluster. The base associative classifier is
designed based on multi-class, multi-label associative classification (MMAC)
algorithm. Since no similar distributed algorithms exist, proposed algorithms
are compared with the base classifier and have shown improvement in classifier
Keywords: Multilabel associative classifier; Hadoop; Pig; Feature reoccurrence; Minority Class; Distributed Algorithm.
A survey on time series motif discovery
by Cao Duy Truong, Duong Tuan Anh
Abstract: Time series motifs are repeated subsequences in a long time series.
Discovering time series motifs is an important task in time series data mining
and this problem has received significant attention from researchers in data
mining communities. In this paper, we intend to provide a comprehensive
survey of the techniques applied for time series motif discovery. The survey
also briefly describes a set of applications of time series motif in various
domains as well as in high-level time series data mining tasks. We hope that
this article can provide a broad and deep understanding of the time series motif
Keywords: time series; motif discovery; window-based; segmentation-based; motif applications.
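The definition of a motif as a repeated subsequence can be made concrete with a brute-force search for the closest pair of non-overlapping windows. This is a baseline sketch under stated assumptions, not one of the surveyed algorithms: it uses plain Euclidean distance without z-normalisation, runs in quadratic time, and the function name is hypothetical.

```python
import math


def find_motif(series, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences starting at positions i < j."""
    n = len(series) - m + 1
    best = (math.inf, -1, -1)
    for i in range(n):
        # Starting j at i + m excludes trivial matches between overlapping windows.
        for j in range(i + m, n):
            d = math.sqrt(sum((series[i + k] - series[j + k]) ** 2
                              for k in range(m)))
            if d < best[0]:
                best = (d, i, j)
    return best


if __name__ == "__main__":
    series = [0, 1, 5, 1, 0, 9, 7, 8, 0, 1, 5, 1, 0]
    print(find_motif(series, 4))
```

The quadratic number of window comparisons is what more scalable motif discovery methods aim to avoid; the brute-force version is mainly useful as a correctness reference on short series.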
Trust Management Scheme for Authentication in Secure Cloud Computing Using Double Encryption Method
by P. Sathishkumar, V. Venkatachalam
Abstract: In cloud computing and banking, both the consumer and the supplier
require protection and trust in their services. This paper
suggests a trust value oriented verification procedure aided by an encryption
procedure. In the verification segment, the bank marketing database is clustered
with the kernel fuzzy c-means clustering (KFCM) method. The clustered data are
stored in the cloud for the trust data verification procedure. In the
verification segment, the consumer is verified and acquires the
verification key, then encrypts the file with the double encryption algorithm.
First, homomorphic encryption is applied to the most trusted data, the data are then
encrypted with the Blowfish algorithm, and the encrypted data are
stored in the cloud data centre. With this procedure, the banking data can
be reliably validated in cloud computing. The outcomes
show improved encryption time and highly valid data in
Keywords: Authentication; Cloud Security; Cloud Services; Trust Management; clustering; cloud computing; encryption and decryption.
Trajectory tracking of the robot end-effector for the minimally invasive surgeries
by Jose De Jesus Rubio, Panuncio Cruz, Enrique Garcia, Cesar Felipe Juarez, David Ricardo Cruz, Jesus Lopez
Abstract: Surgical technology has been widely investigated, with the
aim of achieving an efficient way of working in medicine. Consequently,
robots with small tools have been incorporated into many kinds of surgery
to achieve the following improvements: the patient recovers faster, the
surgery is not invasive, and the robot can access hidden parts of the body. In
this article, an adaptive strategy for the trajectory tracking of the robot end
effector is addressed; it consists of a proportional derivative technique plus
an adaptive compensation. The proportional derivative technique is employed
to achieve the trajectory tracking. The adaptive compensation is employed to
approximate some unknown dynamics. The robot described in this
study is employed in minimally invasive surgeries.
Keywords: Trajectory tracking; robot; minimal invasive surgery.
Multi Label Learning Approaches for Multi Species Avifaunal Occurrence Modelling: A Case Study of South Eastern Tamil Nadu
by Appavu Alias Balamurugan, P.K.A. Chitra, S. Geetha
Abstract: Many multi label problem transformation (PT) and algorithm
adaptation (AA) methods need to be explored to find good candidates for
avifaunal occupancy modelling. This research contrasted eight commonly used
state-of-the-art PT and AA multi label methods. The data was created by
collecting January 2014 to December 2014 records from the e-bird repository
for the study area, Madurai district, south eastern Tamil Nadu. The analysis
shows that classifier chain (CC) and multi label naive Bayes (MLNB) are good
candidates for avifauna data. MLNB did best, with 0.019 hamming loss and
90% average precision. To the best of our knowledge, this is the first use of
MLNB for avifaunal data, and the multi label naive Bayes results indicate
that, out of 143 species observed, six species had a high occurrence rate and
68 species had a low occurrence rate.
Keywords: Species distribution models; multi species; multi label Learning; Multi Label Naive Bayes; Central part of southern Tamil Nadu.
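For readers unfamiliar with the hamming loss figure quoted above, the following minimal sketch computes it under the standard definition (the fraction of sample-label slots that are predicted wrongly). The function name and the toy label matrices are illustrative assumptions, not data from the study.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) slots where prediction and truth differ.

    y_true and y_pred are lists of equal-length 0/1 label rows, e.g. one row
    per site and one column per species (an assumed layout).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")
    total = 0
    wrong = 0
    for t_row, p_row in zip(y_true, y_pred):
        if len(t_row) != len(p_row):
            raise ValueError("rows must have the same number of labels")
        total += len(t_row)
        wrong += sum(t != p for t, p in zip(t_row, p_row))
    return wrong / total
```

Under this definition, a hamming loss of 0.019 would mean that roughly 1.9% of all presence/absence slots were predicted incorrectly.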
Analytics on Talent Search Examination Data
by Anagha Vaidya, Vyankat Munde, Shailaja Shirwaikar
Abstract: Learning analytics and educational data mining have greatly
supported the process of assessing and improving the quality of education.
While learning analytics has a longer development cycle, educational data
mining suffers from the inadequacy of data captured through learning
processes. The data captured from examination process can be suitably
extended to perform some descriptive and predictive analytics. This paper
demonstrates the possibility of actionable analytics on the data collected from
talent search examination process by adding to it some data pre-processing
steps. The analytics provides some insight into the learners' characteristics
and demonstrates how analytics on examination data can be a major support
for improving quality in the education field.
Keywords: Learning Analytics; Educational Data Mining; clustering; linear modelling.
A fast clustering approach for large multidimensional data
by Hajar Rehioui, Abdellah Idrissi
Abstract: Density-based clustering is a strong family of clustering methods. The
strength of this family is its ability to classify data of arbitrary shapes and to
omit the noise. Among them is density-based clustering (DENCLUE),
one of the well-known powerful density-based clustering methods. DENCLUE is
based on the concept of the hill climbing algorithm. In order to find the clusters,
DENCLUE has to reach a set of points called density attractors. Despite the
advantages of DENCLUE, it remains sensitive to the growth of the size of data
and of the dimensionality, because the density attractors are calculated for
each point in the input data. In this paper, to overcome this shortcoming of
DENCLUE, we propose an efficient approach. This approach replaces the
concept of the density attractor by a new concept, the hyper-cube
representative. The experimental results, obtained on several datasets, show
that our approach finds a trade-off between clustering performance and
fast response time. In this way, the proposed clustering method works
efficiently for large multidimensional data.
Keywords: Large Data; Dimensional Data; Clustering; Density based clustering; DENCLUE.
CBRec: a book recommendation system for children using the matrix factorisation and content-based filtering approaches
by Yiu-Kai Ng
Abstract: Promoting good reading habits among children is essential, given
the enormous influence of reading on students' development as learners and
members of society. Unfortunately, very few children's websites or online
applications recommend books to children, even though they can play a
significant role in encouraging children to read. Although a few popular
book websites suggest books to children based on the popularity of books or
rankings of books, these suggestions are not customised/personalised for each
individual user and likely include books that users do not want or like. We
have integrated the matrix factorisation approach and the content-based
approach, in addition to predicting the grade levels of books, to recommend
books for children. Recent research works have demonstrated that a hybrid
approach, which combines different filtering approaches, is more effective in
making recommendations. An empirical study has verified the effectiveness of
our proposed children's book recommendation system.
Keywords: Book recommendation; matrix factorisation; content analysis; children.
Enhancing Purchase Decision using Multi-word Target Bootstrapping with Part-of-Speech Pattern Recognition Algorithm
by M. Pradeepa Sivaramakrishnan, C. Deisy
Abstract: In this research work, multi-word target related terms are extracted
automatically from customer reviews for sentiment analysis. We used the LIDF
measure and have proposed a novel measure, called TCumass, in the iterative
multi-word target (IMWT) bootstrapping algorithm. In addition, a part-of-speech
pattern recognition (PPR) algorithm has been proposed to identify the
appropriate target and emotional words from multi-word target related terms.
This article aims to bring out both implicit and explicit targets with their
corresponding polarities in an unsupervised manner. We proposed two models,
namely MWTB without PPR and MWTB with PPR. The present research
compares the proposed works with the existing
multi-aspect bootstrapping (MAB) algorithm. The experiment has been done
on different data sets and the performance was thereafter evaluated using
different measures. The results of this study show that the MWTB with PPR
model performs well, identifying precise targets and emotional words.
Keywords: Bootstrapping; emotional polarity; multi-word target; Part-of-Speech (POS); sentiment analysis.
Probabilistic Variable Precision Fuzzy Rough Set Technique for Discovering Optimal Learning Patterns in E-learning
by Bhuvaneshwari K.S, D. Bhanu, S. Sophia, S. Kannimuthu
Abstract: In an e-learning environment, optimal learning patterns are discovered
for realising and understanding effective learning styles. The value of
uncertain and imprecise knowledge collected has to be categorised into classes
known as membership grades. Rough set theory has potential for categorising data
into equivalent classes, and fuzzy logic may be applied through soft thresholds
for refining the equivalence relation that quantifies correlation between each
class of elucidated data. In this paper, a probabilistic variable precision fuzzy
rough set technique (PVPFRST) is proposed for deriving robust approximations and
generalisations that handle the types of uncertainty, namely stochastic,
imprecision and noise, in membership functions. The results indicate that the
degree of accuracy of PVPFRST is 21% superior to benchmark techniques.
The results show that PVPFRST improves effectiveness and efficiency in
identifying e-learners' styles and increases the performance by 27%, 22% and
25% in terms of discrimination rate, precision and recall value than the
Keywords: Inclusion degree; Probabilistic fuzzy information system; fuzzy membership grade; Crispness coefficient; Probabilistic variable precision fuzzy rough set; Inclusion function.
Inferring the Level of Visibility from Hazy Images
by Alexander A. S. Gunawan, Heri Prasetyo, Indah Werdiningsih, Janson Hendryli
Abstract: In our research, we would like to exploit crowdsourced photos from
social media to create low-cost fire disaster sensors. The main problem is to
analyse how hazy the environment looks. Therefore, we provide a brief
survey of methods dealing with the visibility level of hazy images. The methods
are divided into two categories: the single-image approach and the learning-based
approach. The survey begins by discussing the single-image approach. This
approach is represented by a visibility metric based on contrast-to-noise ratio
(CNR) and a similarity index between a hazy image and its dehazed image. This
is followed by a survey of the learning-based approach using two contrasting
approaches: 1) based on the theoretical foundation of light transmission,
combined with the depth image using a new deep learning method; 2) based on
a black-box method employing convolutional neural networks (CNN) on hazy
Keywords: Hazy image; visibility level; single image approach; learning based approach; social media.
The Complexity of Cluster-Connectivity of Wireless Sensor Networks
by H.K. Dai, H.C. Su
Abstract: Wireless sensor networks consist of sensor devices with limited
computational capabilities and memory operating in bounded energy
resources; hence, network optimisation and algorithmic development in
minimising the total energy or power while maintaining the connectivity of
the underlying network are crucial for their design and maintenance. We
consider a generalised system model of wireless sensor networks whose node
set is decomposed into multiple clusters, and show that the decision and the
associated minimisation problems of the connectivity of clustered wireless
sensor networks appear to be computationally intractable (completeness and
hardness, respectively, for the non-deterministic polynomial-time complexity
class). An approximation algorithm is devised to minimise the number of end
nodes of inter-cluster edges within a factor of 2 of the optimum for the
Keywords: wireless sensor network; connectivity; spanning tree; nondeterministic polynomial-time complexity class; approximation algorithm.
Efficient Moving Vehicle Detection for Intelligent Traffic Surveillance System Using Optimal Probabilistic Neural Network
by Smitha J.A, N. Rajkumar
Abstract: The vehicle detection system plays an essential role in the traffic
video surveillance system. Video communication of these traffic cameras over
real-world limited bandwidth networks can frequently suffer network
congestion. The objective of this paper is to develop an effective method for
moving vehicle detection problems that can find high quality solutions (with
respect to detection accuracy) at a high convergence speed. To achieve this
objective, we propose a method that hybridises the cuckoo search (CS) with
opposition-based learning (OBL), where OBL is used to improve the performance
of the CS algorithm while optimising the weights of the standard PNN model. The
proposed system mainly consists of two modules: 1) design of a novel
OCS-PNN model; 2) moving vehicle detection using the OCS-PNN model. The
algorithm is tested on three standard video datasets. For instance, the proposed
method achieved a maximum precision of 94%, F-measure of 94% and
similarity of 94%.
Keywords: Moving vehicle detection; Probabilistic neural network; oppositional; Cuckoo search; Traffic video surveillance system; OCS-PNN.
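The opposition-based learning step mentioned in this abstract can be sketched as follows, using the usual OBL definition in which the opposite of a candidate x in [lower, upper] is lower + upper - x. This is only an illustration of the general idea: the helper names, the fitness function (lower is better) and how the step would be wired into cuckoo search or PNN weight optimisation are assumptions, not the authors' implementation.

```python
def opposite(x, lower, upper):
    """Opposite point of x within per-dimension bounds [lower, upper]."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]

def better_of_pair(x, lower, upper, fitness):
    """Keep whichever of x and its opposite has the lower fitness value."""
    x_opp = opposite(x, lower, upper)
    return x if fitness(x) <= fitness(x_opp) else x_opp
```

Evaluating each candidate together with its opposite is a cheap way to cover the search space more evenly, which is the usual motivation given for combining OBL with population-based optimisers.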
The Mediation Roles of Purchase intention and brand trust in Relationship between social marketing activities and brand loyalty
by Nasrin Yazdanian, Saman Ronagh, Parya Laghaei, Fatemeh Mostafshar
Abstract: The rise of social media significantly challenges the way firms
manage the introduction of their brands. The literature on social media
marketing activities (SMMA) has grown, especially in the field of luxury
marketing. Building on the basis of Web 2.0, social media applications have
simplified and facilitated extraordinary growth in customer interaction in
modern times. The objective of this study is to examine the role of the
factors which influence Iranian luxury brand customers' attitude toward
purchase intention and brand loyalty. A questionnaire was used for collecting
data from a sample of 114 luxury brand customers on social media in Tehran,
the capital and a metropolitan city of Iran. Structural equation modelling was
applied to examine the impact of social media marketing activities on brand
loyalty. The mediating role of purchase intention and brand trust is also
considered. The results indicated that entertainment does not have a positive
impact on purchase intention, brand trust and brand loyalty. The results of this
research enable luxury brand managers to forecast the future purchasing
behaviour of their customers and provide a guide to managing their strategies
and marketing activities in a competitive environment.
Keywords: Luxury brands; Social Media Marketing Activities; brand trust ; loyalty; purchase intention.
Application of a hybrid data mining model to identify the main predictive factors influencing hospital length of stay
by Ahmed Belderrar, Abdeldjebar Hazzab
Abstract: Length of hospital stay is one of the most appropriate measures that can be used for management of hospital resources and assessment of hospital admissions. The main predictive factors associated with the length of stay are critical requirements and should be identified to build a reliable prediction model for hospital stays. A hybrid integration approach consisting of a fuzzy radial basis function neural network and hierarchical genetic algorithms was proposed. The proposed approach was applied to a data set collected from a variety of intensive care units. We achieved an acceptable forecast accuracy level of more than 80.50%. We found 14 common predictive factors. Most notably, we consistently found that the demographic characteristics, hospital features, medical events and comorbidities strongly correlate with the length of stay. The proposed approach can be used as an effective tool for healthcare providers and can be extended to other hospital predictions.
Keywords: data mining; hospital management; length of hospital stay; hybrid prediction model; predictive factors.
Genetic Algorithm based Intelligent Multiagent Architecture for Extracting Information from Hidden Web Databases
by Weslin D, T. Joshva Devadas
Abstract: Though an enormous amount of information is available on the
web, only a very small portion of it is visible to users.
Because so much information is not visible, traditional search engines
cannot index or access all the information present on the web. The main challenge
in mining relevant information from a huge hidden web database is to
identify the entry points for accessing the hidden web databases. Existing web
crawlers cannot retrieve all information from the hidden web databases. To
retrieve all the relevant information from the hidden web, this paper proposes
an architecture that uses a genetic algorithm and intelligent agents for accessing
hidden web databases. The proposed architecture is termed the genetic algorithm
based intelligent multi-agent system (GABIAS). The experimental results show
that the proposed architecture provides better precision and recall than the
existing web crawlers.
Keywords: Genetic Algorithm (GA); Hidden Web; Intelligent Agent; Web Crawler.
Efficient Clustering Technique for K-Anonymization with Aid of Optimal KFCM
by Chitra Ganabathi G., P. Uma Maheswari
Abstract: The k-anonymity model is a simple and practical approach for data
privacy preservation. To minimise the information loss due to anonymisation, it
is crucial to group similar data together and then anonymise each group
individually. Therefore, this paper proposes a novel clustering method for
applying the k-anonymity model effectively. The clustering is done by
an optimal kernel based fuzzy c-means clustering algorithm (KFCM). In
KFCM, the original Euclidean distance in the FCM is replaced by a
kernel-induced distance. Here the objective function of the kernel fuzzy
c-means clustering algorithm is optimised with the help of a modified grey wolf
optimisation algorithm (MGWO). Based on that, the collected data is grouped
in an effective manner. The performance of the proposed technique is evaluated
in terms of information loss and the time taken to group the available data. The
proposed technique will be implemented in the working platform of MATLAB.
Keywords: Privacy preservation; k-anonymity; Kernel Fuzzy C-Means; Grey wolf optimization; information loss.
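To make the phrase "kernel-induced distance" concrete, the sketch below shows the form it takes for a Gaussian kernel, where K(x, x) = 1 and the squared feature-space distance reduces to 2(1 - K(x, v)). The abstract does not state which kernel the authors use, so the Gaussian kernel, the `sigma` parameter and the function names are assumptions made purely for illustration.

```python
import math

def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2) (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / sigma ** 2)

def kernel_distance_sq(x, v, sigma=1.0):
    """Squared distance in feature space: K(x,x) + K(v,v) - 2K(x,v) = 2(1 - K(x,v))."""
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))
```

Unlike the Euclidean distance, this quantity is bounded above by 2, so points far from a cluster centre contribute a saturated rather than unbounded distance.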
Optimal Decision Tree Fuzzy Rule Based Classifier (ODT-FRC) For Heart Disease Prediction Using Improved Cuckoo Search Algorithm
by Subhashini Narayan, Jagadeesh Gobal
Abstract: Heart disease is a major cause of mortality in developed countries
and one of the basic diseases in developing countries. There is therefore a
need for a decision support system for predicting heart disease
in a patient. The clinical decision support system comprises three
stages: preprocessing, decision rule generation and rule
weighting, and classification. Initially, the Cleveland, Hungarian and
Switzerland data are loaded from the database in
preprocessing. In this stage, a feature dimension reduction method is
applied to reduce the feature space using orthogonal
locality preserving projection (OLPP). The
combination of the cuckoo search algorithm, fuzzy and decision tree classifiers
creates a hybrid classifier. Here, the fuzzy and decision tree algorithms are
combined with the cuckoo search (CS) algorithm to guide
accurate classification.
Keywords: preprocessing; cuckoo search; fuzzy; decision tree; classification.
A Novel Attribute Based Dynamic Clustering with Schedule Based Rotation Method (ADC-SBR) for Outlier Detection
by Karthikeyan .G, P. Balasubramanie
Abstract: Detection of outliers in bank transactions has gained popularity in
recent years. The existing outlier detection techniques are unable to process
the high volume of data. Hence, to address this issue, an efficient attribute
based dynamic clustering-schedule based rotation (ADC-SBR) method is
proposed. The similarity between transactions within a cluster is estimated
using Jaccard coefficient based labelling approach and the optimal cluster head
is chosen by the similarity-based cluster head selection (SbCHS) method.
The outlier detection is performed in two levels. The node level outlier
detection is performed using linear regression model and the cluster level
outlier detection is performed by deviation based ranking. Our own dataset of
bank transactions is used for the experimental analysis. The suggested method
is implemented in Apache Spark and is compared with existing algorithms on
several metrics. The comparison results show that the proposed method
outperforms existing algorithms on all metrics.
Keywords: Attribute based Dynamic Clustering (ADC) - Schedule based Rotation (SBR); Jaccard coefficient; Linear Regression method; Deviation based ranking; Similarity based Cluster Head Selection (SbCHS).
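The Jaccard coefficient used above for within-cluster similarity can be illustrated with a few lines of code. Representing each transaction as a set of attribute values is an assumption made for this sketch; the abstract does not describe the encoding or the labelling step in detail.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two attribute collections."""
    a, b = set(a), set(b)
    if not a and not b:
        # Convention chosen here: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)
```

For two hypothetical transactions described by `{"atm", "night", "cash"}` and `{"atm", "day", "cash"}`, two of four distinct attributes are shared, giving a coefficient of 0.5.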
Mining Multilingual and Multiscript Twitter Data: Unleashing the Language and Script Barrier
by Bidhan Sarkar, Nilanjan Sinhababu, Manob Roy, Pijush Kanti Dutta, Prasenjit Choudhury
Abstract: Micro-blogging sites like Twitter have become an opinion hub
where views on diverse topics are expressed. Interpreting, comprehending and
analysing this emotion-rich information can unearth many valuable insights.
The job is trivial if the tweets are in English. But lately, the increased use
of native languages for communication has posed a great challenge in social
media mining. Things become more complicated when people use Roman scripts to
write non-English languages. India, being a country with a diverse collection of
scripts and languages, encounters the problem severely. We have developed a
system that automatically identifies and classifies native tweets, irrespective of
the script used. Converting all tweets to English, we get rid of the script vs
language problem. The new approach we formulated consists of Script
Identification, Language analysis, and Clustered mining. Considering English
and the top two Indian languages, we found that the proposed framework gives
better precision than the prevailing approaches.
Keywords: Twitter Mining; Language Classification; Script Identification; Indic language; Preprocessing; Naive Bayes; Support Vector Machine; LDA.
KNOWLEDGE TRANSFER FOR EFFICIENT CROSS DOMAIN RANKING USING ADARANK ALGORITHM
by Geetha Narayanan, P.T. Vanathi
Abstract: Learning-to-rank has been an exciting topic of research, both
in theory and in information retrieval practice.
Usually, in learning-based ranking procedures, it is assumed that the training
and testing data are drawn from the same data distribution. However, those
existing research methods do not work well in the case of multiple documents
retrieved from cross domains (different domains). In this case, ranking
documents is more difficult because the contents are described in multiple
documents from different domains. The main goal of this research
method is to rank the documents gathered from multiple domains with an
improved learning rate by learning features from different domains. The
feature-level knowledge transfer and instance-level knowledge transfer are
achieved with four learners, namely RankNet, ranking support vector machine
(SVM), RankBoost and AdaRank. The evaluation results show that the
AdaRank algorithm achieves good performance.
Keywords: Learning-to-rank; knowledge transfer; RankNet; Ranking SVM; RankBoost;AdaRank.
An Automated Ontology Learning for benchmarking classifier models through Gain-Based Relative-Non-Redundant (GBRNR) Feature Selection : A case-study with Erythemato
by S. Sivasankari, Shomona Gracia Jacob
Abstract: Erythemato-squamous disease (ESD) is one of the complex diseases
in the dermatology field, the diagnosis of which is challenging, due to common
morphological features and often leads to inconsistent results. Besides,
diagnosis has been done on the basis of inculcated visible symptoms pertinent
with the expertise of the physician. Hence, ontology construction for prediction
of Erythemato-squamous disease through data mining techniques was believed
to yield a clear representation of the relationships between the disease,
symptoms and course of treatment. However, the classification accuracy
needed to be high in order to obtain a precise ontology. This required
identifying the optimal set of features needed to predict ESD. This
paper proposes the Gain based Relative-Non-Redundant Attribute selection
approach for diagnosis of ESD. This methodology yielded 98.1% classification
accuracy with the Adaboost algorithm using J48 as the base classifier. The
feature selection approach revealed an optimal feature set comprising 19
Keywords: Ontology; Feature Selection; Classifier; Web Ontology Language; Gain Base;Erythemato-Squamous.
Fuzzy C Means clustering and Elliptic Curve Cryptography Using Privacy Preserving in Cloud
by Sasidevi Jayaraman, Sugumar Rajendran, Shanmuga Priya P
Abstract: Cloud computing is the distribution of computing devices which
reduces the cost of IT infrastructure. In the proposed approach, the databases
are clustered to generate the intermediate datasets. The
information gain of these datasets is used to pick the sensitive data for the
encryption and decryption procedure; the selection of sensitive data depends
upon a threshold value. The information gain is integrated to obtain the upper
bound constraint on the joint privacy leakage. The sensitive data are passed to
the elliptic curve cryptography (ECC) system, which encrypts the data for
privacy. An encrypted data storage system is utilised to protect cloud data.
Encrypting every intermediate dataset is neither efficient nor
cost-effective. The experimental results show that the privacy-preserving cost
of intermediate datasets can be significantly reduced by our method compared
with existing ones in which all datasets are encrypted.
Keywords: cloud computing; intermediate datasets; privacy preserving; Encryption and Decryption; cryptography and clustering.
Optimal Page Ranking System For Web Page Personalization Using MKFCM And GSA
by Pranitha P., M.A.H. Farquad, G. Narshimha
Abstract: In this personalised web search (PWS) approach, we utilise a
kernel-based FCM for clustering web pages. For effective personalised web
search, queries are optimised using GSA with respect to clustered query
sessions. In offline processing, the input information taken from web pages
visited by the user is first preprocessed and transformed into a numerical
matrix. These matrices are clustered with the kernel-based FCM method; a vector
is then produced for the user query, and the minimum-distance centroid values
are detected and given as input to the GSA algorithm, which generates the links
of the top N web pages from the cluster. In online processing, the user query
is taken as input, web pages are retrieved from Google, Bing and Yahoo, and the
content and snippet are extracted from each web page. Finally, the sum over
contents and snippets is computed and the web pages are ranked in descending
order.
Keywords: Kernel-based Fuzzy c-means; Clustering; offline; online; preprocessing; Google; Bing; Yahoo.
DYNAMIC RUNTIME PROTECTION TECHNIQUE AGAINST THE FILE SYSTEM INJECTING MALWARES (DRPT -FSIM)
by Arul Easwaramoorthy, Venugopal Manikandan
Abstract: Malware enters the victim system by injecting code into the
victim system's executable files or well-known files or folders. In this paper,
the proposed dynamic runtime protection technique (DRPT) ensures
protection against all the modes by which malware enters the system. In the
affected system, the behaviours of the injected file are monitored and controlled,
whether the malware spreads through online or offline modes via files. The
DRPT unpacks the malware, then continuously monitors and analyses the Windows
application programming interface (API) calls in the imported and exported
dynamic link libraries (DLLs) of the malware to find the injection code. DRPT
also protects against the spread of the malware into other files and the stealing
of information from the victim machine. The DRPT was tested with 1,517
executable files, among which 811 malicious files were taken from different
malware families. The result of DRPT shows true positive of 94.20% and false positive
Keywords: Malware; DRPT; DLL; API.
Privacy Preserving-Aware Over Big Data in Clouds Using GSA and Map Reduce Framework
by Sekar K., Mokkala Padmavathamma
Abstract: This paper proposes a privacy preserving-aware-based approach
over Big data in clouds using GSA and MapReduce framework. It consists of
two modules: a MapReduce (MR) module and an evaluation module. In the MR
module, a convolution process is applied to the dataset and creates a new kernel
matrix. When the convolution process is done correctly, the utility and privacy
information of the data is well secured. Once the convolution process is over,
the privacy-preserving framework over big data in cloud systems is performed
based on the evaluation module. In the evaluation module, the neural network is
trained with the Gravitational Search Algorithm with Scaled conjugate
gradient (GSA-SCG) algorithm, which improves the utility of the privacy
data. Finally, the reduced privacy data are stored in the cloud service provider
(CSP). The MapReduce framework protects the private data and is in
charge of anonymising original data sets as per privacy requirements.
Keywords: Map reduce; privacy preserving; big data; Cloud service provider; cloud system; GSA; convolution; entropy.
Secure Hash Algorithm based Multiple Tenant User Security over Green Cloud Environment
by Ram Mohan, S. Padmalal, B. Chitra
Abstract: This paper proposes a green cloud multi-tenant trust authentication
with secure hash algorithm-3 (GreenCloud-MTASHA3) scheme to eliminate
unauthorised tenant access. The GreenCloud-MTASHA3 scheme provides
security over multiple tenant requests by addressing the confidentiality,
integrity and availability rate. Confidentiality refers to limiting unauthorised
tenants' access to green cloud data using the additive homomorphic privacy
property in the proposed scheme. An additive homomorphic privacy property-based
encryption function is developed to improve the privacy preserving level.
To attain the integrity level between the tenant requests and green cloud
server machine in GreenCloud-MTASHA3 scheme an encrypted trust data
management process is carried out. Trustworthiness of tenant request is
measured to maintain the consistency level on security with minimal
computational time. The proposed scheme attains the confidentiality, integrity
and availability rate on communicating tasks. Experiments are conducted on
factors such as secure computation confidence, authorised tenant computational
time and space taken on storing encrypted data.
Keywords: Green Cloud; Security; Confidentiality; Secure Hash Algorithm; Computational Time; Multi-Tenant; Integrity; Privacy Level; Cryptographic System.
Frequent Pattern Mining for Parameterised Automatic Variable Key based cryptosystems
by Shaligram Prajapat
Abstract: A huge amount of information is exchanged electronically in most
enterprises and organisations. In particular, in all financial and e-business
setups, the amount of data stored or exchanged over public networks among a
variety of computing devices is growing enormously. Securing this gargantuan
volume of data is challenging. This paper provides a framework for securing
information exchange using parametric approaches with the AVK approach and for
investigating the strength of this cryptosystem using mining algorithms on a
symmetric key-based cryptosystem. This work demonstrates the application of
association rules as one of the components of a cryptic mining system used to
process the encrypted data for extracting useful patterns and associations.
The degree of identified patterns may be useful for ranking the degree of
safety and the class of a cryptic algorithm during auditing of security algorithms.
Keywords: Mining algorithms; symmetric key cryptography; AVK.
A hybrid framework for Job Scheduling on Cloud using Firefly and BAT algorithm
by Hariharan B., Dassan Paul Raj
Abstract: Cloud computing is an emerging field that requires more
algorithms and techniques for its various processes. Here, we
have considered the job scheduling process in a cloud computing platform, which
needs a good algorithm to schedule the jobs requested by various users of the
cloud computing environment. Here, a request can come from any platform, so
scheduling is indispensable when a number of users need particular
jobs. In this research, we intend to develop a hybrid algorithm for job
scheduling in a cloud computing environment. Accordingly, multiple criteria will
be taken into account for scheduling various jobs located on various servers.
Then, the job scheduling will be done based on a hybrid optimisation algorithm.
Additionally, different jobs with different constraints will be considered and the
cloud computing environment is simulated with the help of the CloudSim tool.
Keywords: Cloud Computing; Firefly Algorithm; BAT algorithm; Job Scheduling; FF-BAT Algorithm.
An effective Feature Selection for Heart Disease Prediction with Aid of Hybrid Kernel SVM
by Keerthika T., K. Premalatha
Abstract: In today's modern world, cardiovascular disease is the most lethal
disease. It strikes so suddenly that there is hardly any time for
treatment. Diagnosing patients correctly and in a timely manner is therefore the
most challenging task for the medical fraternity. In order to reduce the risk of
heart disease, an effective feature selection and classification-based prediction
system is proposed. Efficient feature selection is applied to the
high-dimensional medical data; the fish swarm optimisation algorithm is used to
select the features. The selected features from the medical dataset are then fed
to the HKSVM for classification. The performance of the proposed technique is
evaluated by accuracy, sensitivity, specificity, precision, recall and F-measure.
Experimental results indicate that the proposed classification framework
performs better, with an accuracy of 96.03% on the Cleveland dataset, compared
with the existing SVM method, which achieved only 91.41%, and the optimal rough fuzzy classifier
Keywords: Hybrid Kernel Support Vector Machine; feature selection; Fish swarm Optimization; SVM; optimal rough fuzzy; Cleveland; Hungarian and Switzerland.
Brain Tumour Detection using Self-Adaptive Learning PSO-Based Feature Selection Algorithm in MRI images
by A.R. Kavitha, C. Chellamuthu
Abstract: In this paper, we propose a brain tumour classification scheme to classify brain tissues as normal or abnormal. First, we segment the region of interest (ROI) from the medical image using a modified region growing algorithm (MRGA). A feature matrix is generated by applying the gray-level co-occurrence matrix (GLCM) to all the detail coefficients from the 2D-DWT of the ROI. To derive the relevant features from the feature matrix, we use the self-learning particle swarm optimisation (SLPSO) algorithm. In SLPSO, four updating strategies are used to adaptively update the velocity of every particle to ensure its diversity and robustness. The relevant features are passed to a feed forward neural network (FFNN) classifier for classification. The method yields very encouraging results in terms of classification accuracy using a neural network. In most experimental cases, the classification accuracy improved on previously reported results.
Keywords: Region of interest; modified region growing; co-occurrence matrix; GLCM; 2D-DWT; SLPSO; features; feed forward neural network; classification.
Accurate Recognition of Ancient Handwritten Tamil Characters from Palm Prints for the Siddha Medicine Systems
by Vellingiriraj EK, P. Balasubrmanie
Abstract: Recognition of ancient Tamil characters is a complex task
because sufficient training information is not available. Various
researchers have attempted to perform accurate recognition of ancient Tamil
characters. In our preceding work, a hybrid multi-neural learning based
prediction and recognition system (HMNL-PRS) was introduced for the
prediction process, but it suffers from inaccurate recognition. In this
research work, this is overcome by proposing the Brahmi character prediction
and conversion system (BC-PCS) methodology. Here, the modified graph
based segmentation algorithm (MGSA) is used to segment the characters.
Statistical and structural features are then extracted, and
classification is performed on them using a hybridised support vector machine
based fuzzy neural network. The proposed research work is implemented in the
MATLAB simulation environment, and it is confirmed that the proposed
work gives excellent results compared to the preceding research
methodology in terms of recognition rate.
Keywords: Brahmi characters; accurate recognition; segmentation; graph based approach; Classification.
Benchmarking Tree based Least Squares Twin Support Vector Machine Classifiers
by Mayank C, S.S. Bedi
Abstract: The least square twin support vector machine is an emerging learning method applied to classification problems. This paper presents a tree-based least square twin support vector machine (T-LSTWSVM) for classification. The classification procedure depends on the correlation of the input features as well as the output features. UCI benchmark data sets are used to evaluate the test set performance of T-LSTWSVM classifiers with multiple kernel functions, namely linear, polynomial and radial basis function (RBF) kernels. The method applies to two main types of classification problem: binary-class and multi-class problems. The evaluation and accuracy are calculated in terms of a distance metric. It was observed that the tree-based method performed excellently on multi-class classification problems.
Keywords: Binary Tree; Classification; Hyper plane; Kernel Function; Machine Learning; Support Vector Machine (SVM); Least Square Twin SVM.
An Utility Based Approach for Business Intelligence to Discover Beneficial Itemsets With or Without Negative Profit in Retail Business Industry
by C. SIVAMATHI, S. Vijayarani
Abstract: Utility mining is defined as the discovery of high utility itemsets from large databases. It can be applied in business intelligence for business decision-making such as arranging products on shelves, catalogue design, customer segmentation, cross-selling, etc. In this work, a novel algorithm, MAHUIM (matrix approach for high utility itemset mining), is proposed to reveal high utility itemsets from a transaction database. The proposed algorithm uses a dynamic matrix structure. The algorithm scans the database only once and does not generate candidate itemsets. The algorithm calculates the minimum threshold value automatically, without requesting it from the user. The proposed algorithm is compared with existing algorithms such as HUI-Miner, D2HUP and EFIM. For handling negative utility values, the MANHUIM algorithm is proposed and compared with HUINIV. For performance analysis, four benchmark datasets (Connect, Foodmart, Chess and Mushroom) are used. The results show that the proposed algorithms are more efficient than the existing ones.
Keywords: Utility mining; High utility itemset mining; individual item utility; transaction utility; Minimum utility threshold; Negative utility; Pruning strategy; Profitable transactions.
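To make the notion of itemset utility concrete, the following is a minimal sketch in Python. It is not MAHUIM or MANHUIM: it enumerates every itemset by brute force, and the toy transactions, the unit profits (including one negative profit) and the user-supplied threshold are all assumptions made for illustration.

```python
from itertools import combinations


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over the transactions containing every item."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force enumeration of all itemsets; only practical for tiny examples."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


# Hypothetical data: each transaction maps an item to the purchased quantity.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "pen": 3},
    {"milk": 2, "pen": 1},
]
# Hypothetical unit profits; "pen" is sold at a loss (negative utility).
unit_profit = {"bread": 3, "milk": 5, "pen": -1}

hui = high_utility_itemsets(transactions, unit_profit, min_utility=10)
print(hui)  # {('milk',): 15, ('bread', 'milk'): 11}
```

In this toy data, "bread" alone has utility 9 and falls below the threshold, while the superset {bread, milk} reaches 11. Utility is therefore not anti-monotone the way support is, which is why dedicated pruning strategies are needed in practice.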
Automated Optimal Test Data Generation for OCL Specification Using Harmony Search Algorithm
by A. Jali
Abstract: Exploring software testing possibilities early in the software life cycle is increasingly necessary to avoid the propagation of defects to subsequent phases. This requirement demands techniques that can generate automated test cases in the initial phases of software development. Thus, we propose a novel framework for automated test data generation using formal specifications written in object constraint language (OCL). We also define a novel fitness function named exit-predicate-wise branch coverage (EPWBC) to evaluate the generated test data. Another focus of the proposed approach is to optimise the test case generation process by applying the harmony search (HS) algorithm. The experimental results indicate that the proposed framework outperforms the other OCL-based test case generation techniques. Furthermore, it has been inferred that OCL-based testing adopting the HS algorithm forms an excellent combination to produce more test coverage and an optimal test suite, thereby improving the quality of a system.
Keywords: specification-based testing; OCL; object constraint language; HS; harmony search; EPWBC; exit-predicate-wise branch coverage; Optimal Test Case Generation.
Enhancing the JPEG Image Steganography Security by RSA and attaining High Payload using Advanced DCT Replacement and Modified Quantization Table
by Hemalatha J, M.K. Kavitha Devi
Abstract: Steganography is the science of hiding information. It offers strong security in defence and commercial applications, since imperceptibly embedded information is not exposed to or detected by others. The aim of this paper is to propose a novel steganographic method for JPEG images that strengthens data security with the RSA algorithm and attains a higher payload with a modified quantisation table. These goals are realised by: 1) modifying the quantisation table of the JPEG-JSTEG tool and hiding the secret message in the middle frequencies to offer a large embedding capacity; 2) using the secure RSA algorithm to prevent the data from being extracted. A broad experimental evaluation comparing the performance of the proposed work with the existing JSTEG was conducted. The algorithm resulted in greater PSNR values and a more similar steganogram histogram. Experimental results show that the proposed system is a state-of-the-art model, providing an abundant payload and resisting statistical detection. Moreover, our method performs better than the JPEG-JSTEG method on all parameters.
Keywords: RSA; Information Forensics; Robustness; DCT; JPEG; Quantization Table.
Characteristic of Enterprise Collaboration System and Its Implementation Issues in Business Management
by Tanvi Bhatia, Sudhanshu Joshi, Sadhna Sharma, Durgesh Samadhiya, Rajiv Ratn Shah
Abstract: Collaboration is an extremely useful area for most enterprise systems, particularly within Web 2.0 and Enterprise 2.0. Collaboration helps an enterprise collaboration system (ECS) to achieve the desired goal by unifying the completed tasks of employees or people working on a similar or the same task. Thus, collaboration systems have attracted significant attention. The ECS provides consistent and off-the-shelf support to processes and management within organisations. Management techniques of the ECS may be useful to a community which manages ECS systems for collaboration. In this context, this paper focuses on the enterprise collaboration system and answers critical questions related to ECS, including: 1) what does collaboration really mean for an enterprise system; 2) how can collaboration help to improve the internal processes and management of the system; 3) how is it helpful in improving interactions with customers and partners?
Keywords: Enterprise Collaboration System; Web 2.0; Enterprise 2.0; Management Techniques; Enterprise System.
Unsupervised Key Frame Selection using Information Theory and Color Histogram Difference
by Janya Sainui, Masashi Sugiyama
Abstract: Key frame selection is one of the important research issues in video content analysis, as it helps effective video browsing and retrieval as well as efficient storage. Key frames should typically be as different from each other as possible but, at the same time, cover the entire content of the video. However, the existing methods still lose some meaningful frames due to an inaccurate evaluation of the differences between frames. To address this issue, in this paper, we propose a novel method of key frame selection which combines an information theoretic measure, called quadratic mutual information (QMI), with the colour histogram difference. Here, these two criteria are used to produce an appropriate frame difference measure. Through experiments, we demonstrate that the proposed key frame selection method achieves better coverage of the entire video content with minimum redundancy of key frames compared with the competing approaches.
Keywords: Key frame selection; Similarity measure; Information theory; Quadratic mutual information; Color histogram difference.
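As an illustration of the colour histogram difference criterion alone, here is a minimal Python sketch. It is not the authors' method: the QMI term is omitted, and the greedy thresholding rule, the bin count, the threshold and the tiny synthetic frames are all assumptions made for this example.

```python
def colour_histogram(frame, bins=4):
    """Normalised per-channel histogram of a frame.

    A frame is a list of (r, g, b) pixels with channel values in 0..255.
    """
    hist = [0.0] * (3 * bins)
    for pixel in frame:
        for channel, value in enumerate(pixel):
            bin_index = min(value * bins // 256, bins - 1)
            hist[channel * bins + bin_index] += 1
    return [count / len(frame) for count in hist]


def histogram_difference(frame_a, frame_b, bins=4):
    """L1 distance between the colour histograms of two frames."""
    hist_a = colour_histogram(frame_a, bins)
    hist_b = colour_histogram(frame_b, bins)
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def select_key_frames(frames, threshold):
    """Greedy rule: keep a frame when it differs enough from the last kept one."""
    if not frames:
        return []
    selected = [0]
    for index in range(1, len(frames)):
        if histogram_difference(frames[selected[-1]], frames[index]) > threshold:
            selected.append(index)
    return selected


# Hypothetical 4-pixel frames: two nearly identical red frames, then two blue ones.
red = [(255, 0, 0)] * 4
almost_red = [(250, 5, 5)] * 4
blue = [(0, 0, 255)] * 4

key_frames = select_key_frames([red, almost_red, blue, blue], threshold=1.0)
print(key_frames)  # [0, 2]
```

The near-duplicate red frame and the repeated blue frame are skipped, so the selection covers both scenes without redundancy. A histogram alone ignores spatial layout, which is one reason to combine it with a second criterion.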
Estimation of coffee rust infection and growth through two-level classifier ensembles based on expert knowledge
by David Camilo Corrales, Emmanuel Lasso, Apolinar Figueroa Casas, Agapito Ledezma, Juan Carlos Corrales
Abstract: Rust is a disease that leads to considerable losses in the worldwide coffee industry. There are many contributing factors to the onset of coffee rust, e.g., crop management decisions and the prevailing weather. In Colombia, coffee production was reduced by 31% on average during the epidemic years compared with 2007. Recent research efforts focus on the detection of disease incidence using computer science techniques such as supervised learning algorithms. However, a number of authors have demonstrated that results obtained with a single classifier are not sufficiently accurate. Authors in the computing field propose alternatives for this problem, making use of techniques that combine classifier results. Nevertheless, the traditional approaches have limited performance due to the absence of datasets. Therefore, we propose two-level classifier ensembles for coffee rust infection and growth estimation in Colombian crops, based on expert knowledge.
Keywords: coffee; rust; classifier; ensemble; dataset; expert; knowledge.
Prediction parameters in nano fibre composite membrane for effective air filtration using optimal neural network
by Veeracholapuram Subburathinam Kandavel, Gabriel Mohan Kumar
Abstract: The capacity to build stable and extensive trench structures from extremely thin fibres would have wide technological implications. Here, we report a procedure to design and fabricate a sandwich-structured polyamide-6/polyacrylonitrile/polyamide-6 (PA-6/PAN/PA-6) composite membrane. The membrane is suited to effective air filtration and is produced via sequential electrospinning; its different mechanical properties are predicted with the help of an ANN structure with optimal weights. Different nature-inspired optimisation strategies are used to arrive at the optimal weights of the ANN. The results show that the error between the experimental values and the predicted values is close to zero in the designed network. In addition, the results show the highest filtration accuracy and lower pressure drops, with a minimum error of 96.72% determined by the ANN. This is achieved by the artificial fish swarm optimisation (AFSO) strategy.
Keywords: nanonets; composite membrane; high efficiency; neural network and optimisation techniques.
Multi performance parameters analysis in a manufacturing system using fuzzy logic and optimal neural network model
by R. Prasanna Lakshmi, P. Nelson Raja
Abstract: Maintenance operations improve machine conditions but also consume potential production time, possibly delaying customer orders. The objective of this paper is to determine the performance parameters of every workstation in order to predict the cost, reliability and availability of the business. This prediction analysis considers two different approaches: FLP and an optimal neural network model. FLP is first used to predict the performance parameters; the accuracy of the analysis is then increased by an ANN whose weights are tuned with an inspired optimisation technique. The results show that the error between the experimental values and the predicted values is close to zero in the designed network. The proposed KHO-based optimal neural network achieves an accuracy of 98.23% when compared with the Pareto optimisation model.
Keywords: preventive maintenance; optimisation; neural network; fuzzy logic and manufacturing industry.
Behaviour-based analysis of tourism demand in Egypt
by Taheya H. Ahmed, Mervat Abu-Elkheir, Ahmed Abou Elfetouh Saleh
Abstract: Tourism demand is the total number of persons who travel, or wish to travel, to use tourists' facilities and services at places away from their places of work or residence. Analysis of tourism demand helps companies understand tourists' needs and improves their marketing strategies. Current research for predicting tourism demand is targeted at foreign countries, and the little research targeted at predicting tourism demand in Egypt is based on macro forecasting and not on understanding the collective behaviour of tourists. In this paper, we devise different granularities from tourist data that we collect and use these different granularities to provide different levels of demand prediction. We develop a hybrid prediction framework to analyse tourists' behaviour and infer behaviour rules. These rules will act as recommendations that help to understand tourists' behaviour and their needs, and define future policies regarding tourism in Egypt.
Keywords: tourist demand; clustering; data mining; cobweb; classification; Egyptian tourism.
A multi-objective analysis model in mass real estate appraisal
by Benedetto Manganelli, Pierfrancesco De Paola, Vincenzo Del Giudice
Abstract: The purpose of this research is to analyse the performance of a real estate valuation model based on multi-objective decision-making methods. The optimal price function is obtained with a goal programming model. The price function is described as the sum of the individual objectives (criteria), and the goals are the prices of comparable properties. The model integrates the inductive and deductive approaches and overcomes many of the assumptions of the best-known statistical approaches. The proposed model is evaluated by comparing its results with those obtained by applying, to the same case study, a multiple linear regression model and a nonlinear regression method based on a penalised spline smoothing model. The comparison shows, first of all, the better interpretive capabilities of the proposed model.
Keywords: goal programming; multi-criteria; real estate market; multi objective decision making; MODM.
Haphazard, enhanced haphazard and personalised anonymisation for privacy preserving data mining on sensitive data sources
by M. Prakash, G. Singaravel
Abstract: Privacy preserving data mining is a fast growing new area of research due to recent advancements in information, data mining, communications and security technologies. Government agencies and many non-governmental organisations often need to publish sensitive data that contain information about individuals. The important problem is publishing data about individuals without revealing sensitive information about them. A breach in the security of sensitive data may expose the private information of an individual, and the interception of a private communication may compromise the security of sensitive data. The objective of the research is to publish data without revealing the sensitive information of individuals, while the miner still needs to discover non-sensitive knowledge. To achieve this objective, haphazard anonymisation, enhanced haphazard anonymisation and personalised anonymisation are proposed for privacy and utility preservation. Their performance is evaluated based on vulnerability to attacks, efficiency and data utility.
Keywords: analytics; anonymisation; big data; data mining; data publishing; microdata; privacy preserving; privacy; sensitive data.
Information graph-based creation of parallel queries for databases
by Yulia Shichkina, Dmitry Gushchanskiy, Alexander Degtyarev
Abstract: The article describes a query parallelisation method that takes into account the dependencies between operations in a data query. The method is based on the representation of the query as a directed graph with operations as vertices and data connections as edges. The graph is processed as an adjacency list, which uses less memory than processing a sparse adjacency matrix. The graph is modified only by operations which do not change the elements of the adjacency list. It is therefore possible to achieve intra-query parallelism by considering the structure of the request and applying mathematical methods of parallel computation for its equivalent transformation. The article also presents an example of complex query parallelisation and describes the applicability of graph theory and methods of parallel computing to both query parallelisation and optimisation.
Keywords: parallel computing; optimisation methods; relational database; query; information graph; query parallelisation.
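The idea of deriving intra-query parallelism from a dependency graph stored as an adjacency list can be sketched as follows. This is a generic level-by-level topological layering in Python, not the authors' transformation method; the operation names and the example query graph are hypothetical.

```python
def parallel_levels(adjacency):
    """Group the operations of a query graph into levels.

    `adjacency` maps each operation to the list of operations that consume
    its output. Operations in the same level have no dependencies on each
    other, so they could be executed in parallel.
    """
    # Count the unmet dependencies (incoming edges) of every vertex.
    indegree = {node: 0 for node in adjacency}
    for targets in adjacency.values():
        for target in targets:
            indegree[target] = indegree.get(target, 0) + 1

    current = sorted(node for node, degree in indegree.items() if degree == 0)
    levels = []
    visited = 0
    while current:
        levels.append(current)
        visited += len(current)
        next_level = []
        for node in current:
            for target in adjacency.get(node, []):
                indegree[target] -= 1
                if indegree[target] == 0:
                    next_level.append(target)
        current = sorted(next_level)

    if visited != len(indegree):
        raise ValueError("query graph contains a cycle")
    return levels


# Hypothetical query: two table scans, a filter, a join and an aggregation.
query_graph = {
    "scan_orders": ["filter_orders"],
    "scan_customers": ["join"],
    "filter_orders": ["join"],
    "join": ["aggregate"],
    "aggregate": [],
}

levels = parallel_levels(query_graph)
for depth, operations in enumerate(levels):
    print(depth, operations)
```

Here the two scans land in the same level and can run concurrently, while the join has to wait for both of its inputs. The adjacency list is read but never rewritten; only the separate in-degree counters change.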