Forthcoming articles


International Journal of Computational Science and Engineering


These articles have been peer-reviewed and accepted for publication in IJCSE, but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.


Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.


Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.


Articles marked with this Open Access icon are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.


Register for our alerting service, which notifies you by email when new issues of IJCSE are published online.


We also offer RSS feeds which provide timely updates of tables of contents, newly published articles and calls for papers.


International Journal of Computational Science and Engineering (202 papers in press)


Regular Issues


  • CUDA-based PSO-trained neural network for computation of resonant frequency of circular microstrip antenna   Order a copy of this article
    by Feng Chen, Yu-bo Tian 
    Abstract: Resonant frequency is an important parameter in the design process of microstrip antennae (MSA). An artificial neural network (ANN) trained by the particle swarm optimisation (PSO) algorithm (PSO-ANN) has been used to model the resonant frequency of circular MSA. To address the long calculation time of PSO-ANN training, a parallel scheme for the graphics processing unit (GPU) environment is presented in this paper. The parallel PSO-ANN algorithm parallelises particle behaviour in PSO by mapping one particle to one thread, and runs a large number of GPU threads in parallel to reduce training time. This scheme is applied to model the resonant frequency of circular MSA under the compute unified device architecture (CUDA). Experimental results show that, compared with CPU-based sequential PSO-ANN, GPU-based parallel PSO-ANN achieves a speedup of more than 340 times with the same optimisation stability. Moreover, the modelling error can be remarkably reduced with a very limited runtime increment when the number of particles on the GPU side is substantially enlarged.
    Keywords: artificial neural network; ANN; compute unified device architecture; CUDA; microstrip antennae; MSA; particle swarm optimisation; PSO; resonant frequency.

  • Heterogeneous data fusion for three-dimensional gait analysis using wearable MARG sensors   Order a copy of this article
    by Sen Qiu, Zhelong Wang 
    Abstract: Gait analysis has become a research highlight. In this paper, we propose a computing method using wearable MARG (magnetic, angular rate and gravity) sensor arrays with a wireless network, which can calculate absolute and relative orientation and position information of human foot motion during level walking and stair climbing. Three-dimensional foot orientation and position were estimated by a Kalman-based sensor fusion algorithm and validated against ground truth provided by a Vicon system. The repeatability of the alignment procedure and the measurement errors were evaluated on healthy subjects. Experimental results demonstrate that the proposed method performs well for both motion patterns. No significant drifts exist in the overall results presented in the paper. The measured and estimated information can be transmitted to a remote server through the internet. Moreover, this method could be applied to other cyclic activity monitoring.
    Keywords: gait analysis; attitude computation; wireless sensor network; wearable sensors; MARG.

  • Relay selection in the multiple-mobile-relay-based D2D system over N-Nakagami fading channels   Order a copy of this article
    by Lingwei Xu, Hao Zhang, Tingting Lu, Xing Liu 
    Abstract: Based on the variable-gain amplify-and-forward relaying scheme, the lower bound on the outage probability of the multiple-mobile-relay-based device-to-device system with relay selection over N-Nakagami fading channels is investigated in this paper. Using the harmonic mean of positive random variables, closed-form expressions for the lower bound on the outage probability are derived. The outage probability performance under different conditions is then evaluated through numerical simulations to verify the analysis. The simulation results show that the number of mobile relays, the fading coefficient, the number of cascaded components, the relative geometrical gain, and the power-allocation parameter all have an important influence on the outage probability performance.
    Keywords: D2D communication; N-Nakagami fading channels; amplify-and-forward relaying; outage probability.

  • Evolving recommendations from past travel sequences using soft computing techniques   Order a copy of this article
    by Sunita Tiwari, Saroj Kaushik 
    Abstract: The World Wide Web and mobile devices have become an indispensable part of life. The pervasiveness of location acquisition technologies such as the Global Positioning System (GPS) has enabled the convenient logging of users' movement sequences using mobile devices. This work proposes a personalised tourist spot recommender system for mobile users using a genetic algorithm (GA) for situations in which explicit user ratings for tourist spots are not available. Implicit ratings of users for tourist spots are mined from GPS trajectory logs. GA is used to evolve ratings of unvisited spots from the implicit ratings. A GPS trajectory dataset of 178 users collected by Microsoft Research Asia's GeoLife project is used for evaluation and experiments. We emphasise that the proposed approach is comparable with existing related approaches in terms of average root mean squared error (RMSE) and provides focused, personalised and relevant recommendations.
    Keywords: personalised recommender systems; GPS log mining; genetic algorithms; user preference discovery; soft computing; location recommender

  • Fusion of statistical and machine learning approaches for time series prediction using Earth observation data   Order a copy of this article
    by K.P. Agrawal, Sanjay Garg, Shashikant Sharma, Pinkal Patel, Ayush Bhatnagar 
    Abstract: This paper focuses on the fusion of statistical and machine learning models to improve the accuracy of prediction. Statistical models such as Integration of Auto Regressive (AR) and Moving Average (MA) are capable of handling non-stationary time series but can deal with only a single time series. A machine learning approach (i.e. Support Vector Regression (SVR)) can handle dependency among different time series along with non-linear separable domains, but it cannot incorporate the past behaviour of time series. This led us to combine these two approaches to improve the accuracy of time series prediction, where the focus has been on minimising forecast error using residuals, which helps in taking appropriate action for the near future. With this objective in view, Auto Regressive Integrated Moving Average (ARIMA) models have been hybridised with SVR models. In order to reduce the number of area-wise models and the time complexity of tuning different parameters, emphasis has been laid on handling scalability issues by taking suitable representative samples from each sub-area. The results show that the performance of the proposed hybrid model is better than that of the individual models.
    Keywords: prediction; time series; ARIMA; SVR; scalability.

  • Energy analysis of code regions of HPC applications using EnergyAnalyzer tool   Order a copy of this article
    by Shajulin Benedict, Rejitha R.S., Preethi B.C, Bency Bright, Judyfer W.S 
    Abstract: Energy consumption analysis is emerging as a crucial step in analysing scientific applications. It is essential for application developers to design energy-conscious parallel algorithms. Although some power-measuring tools exist for parallel machines, code-region-specific energy consumption analysis tools for scientific applications are very rare and challenging to implement, especially when future exascale or large-scale computing machines are targeted. This paper focuses on revealing the design methodology of the EnergyAnalyzer tool - a code-region-based energy consumption analysis tool for scientific applications. The tool was tested with several HPC applications, such as Multiple EM for Motif Elicitation (MEME), Gapped Local Alignment of Motifs (GLAM2), High Performance Computing Challenge (HPCC) benchmarks, NAS parallel benchmarks (BT, CG, EP, FT, LU, MG, SP, and so forth), and a few other HPC benchmarks, at the HPCCLoud Research Laboratory. In addition, we investigated the energy consumption of code regions of MEME/GLAM2 applications when the application-specific parameters were modified.
    Keywords: energy analysis; HPC; performance analysis; scientific applications; tools.

  • Geometric optimisation of thermoelectric coolers using simulated annealing   Order a copy of this article
    by Doan Vi Kim Khanh, Pandian M. Vasant, Irraivan Elamvazuthi, Vo N. Dieu 
    Abstract: The field of thermoelectric coolers (TEC) has grown substantially in recent years. In extreme environments, such as thermal energy and gas drilling operations, TEC is an effective cooling mechanism for instruments. Nevertheless, limitations such as the relatively low energy conversion efficiency and the ability to dissipate only a limited amount of heat flux may seriously affect the lifetime and performance of the instruments. To date, many studies have been conducted to improve the efficiency of TEC. The material parameters are the most significant, but they are restricted by currently available materials and module fabrication technologies. Therefore, the main objective in finding the optimal TEC design is to define a set of design parameters. In this paper, technical issues of TEC are discussed. A new method of optimising the dimensions of TEC using simulated annealing (SA) to maximise the cooling rate is then proposed. Equality and inequality constraints are taken into consideration. This work reveals that SA performs better than the genetic algorithm in terms of stability and reliability, and establishes a better geometric design of single-stage TEC that maximises the cooling rate when compared with Cheng's work (2005).
    Keywords: single-stage thermoelectric cooler; simulated annealing; optimisation; geometric design; coefficient of performance.
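
The simulated annealing loop mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' TEC model: the objective `toy_cooling_rate`, the bounds, the step size and the cooling schedule are all hypothetical stand-ins, and only a simple bound (inequality) constraint is handled, by clipping.

```python
import math
import random

def simulated_annealing(objective, lower, upper, start, steps=2000,
                        temp0=1.0, cooling=0.995, seed=0):
    """Maximise `objective` on [lower, upper] with a basic annealing loop."""
    rng = random.Random(seed)
    current = best = start
    f_current = f_best = objective(start)
    temp = temp0
    for _ in range(steps):
        # Propose a neighbour and clip it so the bound constraint always holds.
        step = rng.gauss(0.0, 0.1 * (upper - lower))
        candidate = min(upper, max(lower, current + step))
        f_candidate = objective(candidate)
        delta = f_candidate - f_current
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, f_current = candidate, f_candidate
            if f_current > f_best:
                best, f_best = current, f_current
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best, f_best

def toy_cooling_rate(x):
    # Hypothetical single-parameter objective with its maximum at x = 3.
    return -(x - 3.0) ** 2 + 5.0

best_x, best_f = simulated_annealing(toy_cooling_rate, 0.0, 10.0, start=9.0)
print(best_x, best_f)
```

Tracking the best point separately from the current point means the returned value never falls below the starting value, even though the walk itself may accept worse moves early on.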

  • Trajectory anonymisation based on graph split using earth mover's distance   Order a copy of this article
    by Priti Jagwani, Saroj Kaushik 
    Abstract: Analysing and mining trajectories poses new challenges for trajectory privacy. We address the privacy issue for offline (historical) trajectories, which are generally published for public research. A fundamental research question in the trajectory privacy domain is that of trajectory anonymisation. K-anonymity is used as a standard for privacy, which ensures that every entity in the dataset is indistinguishable from (k−1) other entities. The proposed work aims at anonymising trajectories based on the graph split method for trajectory privacy. We construct a trajectory graph to simulate the spatial and temporal relations of trajectories, based on which trajectory k-anonymity sets are found through graph split. For this purpose, we propose a novel method that uses earth mover's distance, rather than Euclidean distance, as the metric for finding trajectory k-anonymity sets. The two methods have been compared through a series of experiments. The proposed method is found to perform better in terms of information loss and computation time.
    Keywords: information security; trajectory anonymisation; EMD; earth mover’s distance; trajectory privacy
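
To illustrate why earth mover's distance can separate cases that Euclidean distance cannot, here is a minimal sketch. It assumes each trajectory has been reduced to a one-dimensional histogram over equally spaced bins, which is a simplification chosen for illustration and not the representation used in the paper.

```python
from itertools import accumulate
from math import sqrt

def emd_1d(p, q):
    """Earth mover's distance between two histograms on the same unit-spaced bins."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have equal total mass")
    # In one dimension, EMD equals the sum of absolute differences of the
    # cumulative sums of the two histograms.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

def euclidean(p, q):
    """Plain Euclidean distance between the two histogram vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical histograms: all mass in bin 0, bin 1 and bin 3 respectively.
a = [1.0, 0.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0, 0.0]
c = [0.0, 0.0, 0.0, 1.0]

print(emd_1d(a, b), emd_1d(a, c))        # EMD grows with how far the mass moves
print(euclidean(a, b), euclidean(a, c))  # Euclidean distance is the same for both
```

Here `b` is closer to `a` than `c` is in terms of how far the mass must move, which EMD reflects, whereas the Euclidean distance treats both pairs identically.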

  • Computational investigation into the influence of yaw on the aerodynamics of a rotating wheel in free air   Order a copy of this article
    by Tharaka Kothalawala, Alvin Gatto 
    Abstract: This paper details a computational investigation into the influence of applied yaw angle on the aerodynamics of a rotating wheel in free air. The main analysis tool employed was Unsteady Reynolds-Averaged Navier-Stokes simulations, with the primary aim of investigating and characterising the complex surface and near-wake flow field physics of the configuration. Overall, results showed that the flow field surrounding the wheel was principally vortical in nature, with the number and strength of developed vortical structures heavily dependent on the level of applied yaw angle. Lift, drag, and side force coefficients, as well as on-surface pressures, were also found to depend on the level of yaw angle applied.
    Keywords: CFD; wheel; landing gear; rotating wheel; yaw; rotation; bluff body; aerodynamics; free air.

  • Temporal dynamic recommendation based on data imputation through association analysis   Order a copy of this article
    by Zhang Yuxiang, Wang Xiayang, Xiao Chunjing, Sun Yu 
    Abstract: As a novel method for modelling user interest drift over time, we explore session-based temporal dynamic recommendation, in which we impute missing ratings in terms of user association. Firstly, we mine user association groups through association analysis according to users' common preferences. Secondly, the user's consumption history is divided into sessions, and we impute vacant values based on the correlation and occupation of user association groups in each session. Thirdly, we model the user interest drift over time by Latent Dirichlet Allocation (LDA) in each session and predict the user's current interest by an exponential decay function. Finally, we predict ratings on items for the active user using neighbour-based collaborative filtering. Experiments on a real dataset show that the proposed framework is more effective than previous methods on several tasks.
    Keywords: recommender system; interest drift; data sparsity; data imputation; association analysis
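
The exponential decay step in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the per-session topic distributions are taken as given (the paper obtains them with LDA), and the function name, the decay form `exp(-rate * age)` and the value of `decay_rate` are hypothetical.

```python
from math import exp

def decayed_interest(session_topics, decay_rate=0.5):
    """Combine per-session topic distributions, weighting recent sessions more.

    session_topics: list of topic-probability lists, oldest session first.
    decay_rate: hypothetical decay constant, not taken from the paper.
    """
    last = len(session_topics) - 1
    # A session that is `age` steps old gets weight exp(-decay_rate * age).
    weights = [exp(-decay_rate * (last - s)) for s in range(len(session_topics))]
    total = sum(weights)
    n_topics = len(session_topics[0])
    return [
        sum(w * topics[k] for w, topics in zip(weights, session_topics)) / total
        for k in range(n_topics)
    ]

# Hypothetical user whose interest drifts from topic 0 to topic 1 over three sessions.
sessions = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
current = decayed_interest(sessions)
uniform = decayed_interest(sessions, decay_rate=0.0)
print(current, uniform)
```

With `decay_rate=0.0` every session counts equally and the result is the plain average; a positive rate shifts the estimate towards the most recent session.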

  • Including category information as supplements in latent semantic analysis of Hindi documents   Order a copy of this article
    by Karthik Krishnamurthi, Vijayapal Reddy Panuganti, Vishnu Vardhan Bulusu 
    Abstract: Latent Semantic Analysis (LSA) is a mathematical model that is used to capture the semantic structure of documents by using the correlations between the textual elements in them. LSA captures the semantic structure very well, as it is independent of external sources of semantics. However, the model's performance improves when it is supplemented with extra information. The work presented in this paper modifies the model to analyse word correlations in documents by considering the document category information as supplements in the process. This enhancement is referred to as Supplemented Latent Semantic Analysis (SLSA). SLSA's performance is empirically evaluated in a document classification application by comparing its classification accuracies against those of plain LSA for various term weighting schemes. Increments of 1.14%, 1.30% and 1.63% are observed in the classification accuracies when SLSA is compared with plain LSA for tf, idf and tfidf, respectively, in the initial term-by-document matrix.
    Keywords: dimensionality reduction; document classification; latent semantic analysis; semantic structure; singular value decomposition.

  • Fuzzy time series forecasting based on information granule and neural network   Order a copy of this article
    by Lanlan Gu, Hongyue Guo, Xiaodong Liu 
    Abstract: Time series forecasting is critical for the research of a fuzzy time series. In this paper, a novel model that combines an information granule partitioning method with a back propagation neural network (BPNN) is proposed to forecast the time series. First, the unequal-dividing method based on information granule is applied to divide the universe of discourse to form the fuzzy sets. Then, we use the fuzzy sets to fuzzify the historical data into labels. Next, a second-order fuzzy logical relationship of the labeled dataset is constructed to train a BPNN to forecast the labels. Finally, the forecasting labels are defuzzified to obtain predictions. The Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX) is used to verify the effectiveness of the proposed model. The results show that the proposed model performs better than existing models according to root mean-square error (RMSE).
    Keywords: fuzzy time series; forecasting; information granule; back propagation neural network
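
The fuzzification and second-order relationship steps described in the abstract above can be sketched as below. The cut points are hypothetical and hand-picked (the paper derives unequal intervals from information granules), crisp interval labels stand in for fuzzy sets, and the sketch stops before the BPNN training and defuzzification stages.

```python
from bisect import bisect_right
from collections import defaultdict

def fuzzify(values, cut_points):
    """Map each value to the index of the interval it falls in.

    cut_points: sorted interior boundaries; they may be unequally spaced.
    """
    return [bisect_right(cut_points, v) for v in values]

def second_order_relations(labels):
    """Collect second-order relationships (A[t-2], A[t-1]) -> A[t]."""
    relations = defaultdict(list)
    for prev2, prev1, nxt in zip(labels, labels[1:], labels[2:]):
        relations[(prev2, prev1)].append(nxt)
    return dict(relations)

# Hypothetical series and unequal partition of the universe of discourse.
series = [12, 18, 25, 31, 24, 17, 26, 33]
cuts = [15, 22, 30]
labels = fuzzify(series, cuts)
rules = second_order_relations(labels)
print(labels)
print(rules)
```

Each key in `rules` is a pair of consecutive labels and each value lists the labels that followed that pair, which is the kind of labelled dataset a forecasting model could then be trained on.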

  • The analysis and recognition of Chinese temporal expressions based on a mixtured model using statistics and rules   Order a copy of this article
    by Dandan Zhao, Degen Huang, Yuzhe Wang, Qiong Wu 
    Abstract: As the first step of temporal information understanding, the results of temporal expression recognition directly affect further usage of temporal information, such as temporal relationship extraction. In Chinese, temporal expressions have many distinct characteristics in both word morphology and syntax compared with Western languages, on which much research has been done in the last decade. Classifications and constructions of Chinese temporal expressions are analysed, and an approach for extracting temporal expressions from Chinese texts is presented in this paper. The model comprises a cascade of rule-based and machine-learning pattern recognition procedures. Conditional Random Fields (CRFs) were applied to recognise time units rather than time expressions, to avoid the boundary localisation problems in Chinese temporal expressions. Rules for temporal expression boundary localisation were formulated based on a time trigger thesaurus and a time affix word thesaurus. The F-measure of temporal expression identification was 95.93% on the temporal 2010 Chinese corpus. The experimental results showed the validity of the proposed approach.
    Keywords: temporal expressions; TEs; conditional random fields; CRFs; time unit; rules; time trigger; time affix word; Chinese information processing; thesaurus

  • High quality multi-core multi-level algorithm for community detection   Order a copy of this article
    by Suely Oliveira, Rahil Sharma 
    Abstract: One of the most relevant and widely studied structural properties of networks is their community structure or clustering. Detecting communities is of great importance in various disciplines where systems are often represented as graphs. Different community detection algorithms have been introduced in the past few years, which look at the problem from different perspectives. Most of these algorithms, however, have high computational costs that make them impractical for the large graphs found in the real world. Maintaining a good balance between the computational time and the quality of the communities discovered is a well-known open problem in this area. In this paper, we propose a multi-core multi-level (MCML) community detection algorithm based on the topology of the graph, which contributes towards solving this problem. Applied to two benchmark datasets, the MCML algorithm detects accurate communities. We detect high-modularity communities by applying MCML to a Facebook Forum dataset, to find users with similar interests, and to an Amazon product dataset. We also show the scalability of MCML on these large datasets with 16 Xeon Phi cores.
    Keywords: parallel algorithm; multi-level; multi-core; community detection; Facebook user interaction; big data.

  • An improved indoor localisation algorithm based on wireless sensor network   Order a copy of this article
    by Min-Yi Guo, Chen Li, Jianzhong Wu, Jianping Cai, Zengwei Zheng, Jin Lv 
    Abstract: Many sensor network applications require location awareness. In this paper, an improved positioning algorithm based on fingerprinting is presented for indoor environments. Unlike the traditional fingerprint recognition algorithm, the improved algorithm does not require offline fingerprint collection. It is robust in complex indoor environments and can effectively deal with the failure of a beacon node. When new nodes are added to the wireless sensor network, the algorithm makes use of them by generating new fingerprints to maintain positioning performance.
    Keywords: wireless sensor network; indoor localisation; distributed database.

  • Improved artificial bee colony algorithm with differential evolution for numerical optimisation problems   Order a copy of this article
    by Jiongming Jiang, Yu Xue, Tinghuai Ma, Zhongyang Chen 
    Abstract: Evolutionary algorithms have been widely used in recent years. The Artificial Bee Colony (ABC) algorithm is an evolutionary algorithm for numerical optimisation problems. Recently, more and more researchers have shown interest in the ABC algorithm. Previous studies have shown that it is an efficient, effective and robust evolutionary optimisation method. However, the convergence rate of the ABC algorithm still does not meet our requirements and it is necessary to optimise it. In this paper, several local search operations are embedded into the ABC algorithm. This modification enables the algorithm to achieve a better balance between the convergence rate and the robustness. It thus becomes possible to increase the convergence speed of the algorithm and thereby obtain an acceptable solution. Such an improvement can be advantageous in many real-world problems. This paper focuses on improving the performance of the ABC algorithm with a differential strategy on numerical optimisation problems. The proposed algorithm has been tested on 18 benchmark functions from the relevant literature. The experimental results indicate that the performance of the improved ABC algorithm is better than that of the original ABC algorithm and some other classical algorithms.
    Keywords: artificial bee colony; numerical optimisation; differential algorithm.

  • A study of cooperative advertising in a one-manufacturer two-retailers supply chain based on the multi-stage dynamic game theory   Order a copy of this article
    by Hong Zhang, Quanju Zhang 
    Abstract: In this paper, the coordination of cooperative advertising decisions is analysed in a supply chain with one manufacturer and two retailers. Suppose that the manufacturer invests in national advertising, one retailer invests in local advertising, and the manufacturer agrees to share part of the local advertising cost with that retailer. Meanwhile, the other retailer refuses to take part in cooperative advertising. The manufacturer and the retailer who invest in cooperative advertising can choose a cooperative or a non-cooperative attitude, but the other retailer always chooses a non-cooperative attitude. We select four decision variables, namely local advertising effort, the two retailers' marginal profits and the price of the product, to discuss seven three-stage dynamic game models according to whether the parties' attitudes are cooperative or not. The seven game models, comprising one non-cooperative model, five partially cooperative models and one cooperative model, are investigated in detail through a full mathematical analysis. By comparing the seven models, several interesting propositions are obtained, and the corresponding results are derived from these propositions.
    Keywords: cooperative advertising; multi-stage dynamic model; supply chain; game theory

  • Cricket chirping algorithm: an efficient meta-heuristic for numerical function optimisation   Order a copy of this article
    by Jonti Deuri, Siva Sathya Sundaram 
    Abstract: Nature-inspired meta-heuristic algorithms have proved to be very powerful in solving complex optimisation problems in recent times. The literature reports several inspirations from nature, exploited to solve computational problems. This paper is yet another step in the journey towards the use of natural phenomena for seeking solutions to complex optimisation problems. In this paper, a new meta-heuristic algorithm based on the chirping behaviour of crickets is formulated to solve optimisation problems. It is validated against various benchmark test functions and then compared with popular state-of-the-art optimisation algorithms, such as genetic algorithm, particle swarm optimisation, bat algorithm, artificial bee colony algorithm and cuckoo search algorithm for performance efficiency. Simulation results show that the proposed algorithm outperforms its counterparts in terms of speed and accuracy. The implication of the results and suggestions for further research are also discussed.
    Keywords: optimisation; meta-heuristic algorithm; numerical function; cuckoo search; artificial bee colony; particle swarm optimisation; genetic algorithm; cricket chirping algorithm; calling chirp; aggressive chirp

  • Optimising the stiffness matrix integration of n-noded 3D finite elements   Order a copy of this article
    by J.C. Osorio, M. Cerrolaza, M. Perez 
    Abstract: The integration of the stiffness and mass matrices in finite element analysis is a time-consuming task. When dealing with large problems having very fine discretisations, the finite element mesh becomes very large and several thousand elements are usually needed. Moreover, when dealing with nonlinear dynamic problems, the CPU time required to obtain the solution increases dramatically because of the large number of times the global matrix must be computed and assembled. For this reason, any reduction in computer time (even a small one) when evaluating the problem matrices is of interest to engineers and analysts. The integration of the stiffness matrix of n-noded high-order hexahedral finite elements is carried out by taking advantage of some mathematical relations among the nine terms of the nodal stiffness matrix, previously found for the simpler brick element. Significant time savings were obtained in the 20-noded finite element example case.
    Keywords: stiffness matrix; finite elements; n-noded hexahedral elements; saving CPU time

  • A cost-effective graph-based partitioning algorithm for a system of linear equations   Order a copy of this article
    by Hiroaki Yui, Satoshi Nishimura 
    Abstract: There are many techniques for reducing the number of operations in directly solving a system of sparse linear equations. One such method is nested dissection (ND). In numerical analysis, the ND algorithm heuristically divides and conquers a system of linear equations, based on graph partitioning. In this article, we present a new algorithm for the first level of such graph partitioning, which splits a graph into two roughly equal-sized subgraphs. The algorithm runs in almost linear time. We evaluate and discuss the solving costs by applying the proposed algorithm to various matrices.
    Keywords: sparse matrix; nested dissection; graph partitioning; graph algorithm; Kruskal’s algorithm; Gaussian elimination; bit vector; adjacent list; refinement; system of equations.

  • A based-on-set-partitioning exact approach to multi-trip of picking up and delivering customers to airports   Order a copy of this article
    by Wei Sun, Yang Yu, Jia Li 
    Abstract: Picking up and delivering customers to airports (PDCA) is a new service provided in China. The multi-trip mode of PDCA (MTM-PDCA) service is a promising measure to reduce operation costs. To obtain the exact solution, we propose a novel modelling approach including two stages. In the first stage, all feasible trips of each subset of the customer point set are produced, and then the two local optimum trips of each subset can be obtained easily. Subsequently, using the local optimum trips obtained in the first stage, we establish the novel trip-oriented set-partitioning (TO-SP) model to formulate MTM-PDCA. The MTM-PDCA based on the TO-SP model can be solved exactly by CPLEX. By testing extensive instances, we summarise several managerial insights that can be used to successfully reduce the costs of PDCA by using multi-trip mode.
    Keywords: multi-trip; single-trip; set-partitioning; exact approach.

  • Reliability prediction and QoS selection for web service composition   Order a copy of this article
    by Liping Chen, Weitao Ha 
    Abstract: Web service composition is a distributed model to construct new web services on top of existing primitive or other composite web services. The key issues in the development of web service composition are the dynamic and efficient reliability prediction and the appropriate selection of component services. However, the reliability of the service-oriented systems heavily depends on the remote web services as well as the unpredictable internet. Thus, it is hard to predict the system reliability. In addition, there are many reliable functionally equivalent partner services for the same composite service which have different Quality of Service (QoS). It is important to identify the best QoS candidate web services from a set of functionally equivalent services. But efficient selection from the large numbers of candidate web services brings challenges to the existing methods. In this paper, we discuss web service composition in two ways: reliability prediction and QoS optimal selection. First, we propose a reliability prediction model based on Petri net. For atomic services, a staged reliability model is provided which predicts reliability from network environment availability, hermit equipment availability, discovery reliability and binding reliability. To address the complex connecting relationship among subservices, places of basic Petri net for input and output are extended to some subtypes for multi-source input place and multiuse output place. Secondly, we use a new skyline algorithm based on an R-tree index. The index tree is traversed to judge whether it is dominated by the candidate skyline sets. The leaf points store optimal component services. Experimental evaluation of real and synthetic data shows the effectiveness and efficiency of the proposed approach. The approach has been implemented and has been used in the context of travel process mining. Although the results are presented in the context of Petri nets, the approach can be applied to any process modelling language with executable semantics.
    Keywords: web service composition; atomic services; reliability prediction; QoS; skyline; optimisation
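
The dominance test behind skyline selection can be sketched as follows. This is a naive pairwise comparison, not the R-tree-indexed algorithm described in the abstract, and the service names and QoS attributes (response time and price, lower is better for both) are hypothetical.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every attribute and strictly better
    on at least one (lower values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(services):
    """Return the services that no other candidate dominates."""
    return {
        name: qos
        for name, qos in services.items()
        if not any(dominates(other, qos)
                   for other_name, other in services.items()
                   if other_name != name)
    }

# Hypothetical functionally equivalent services: (response time in ms, price).
candidates = {
    "s1": (120, 0.05),
    "s2": (200, 0.02),
    "s3": (250, 0.06),
    "s4": (120, 0.04),
}
best = skyline(candidates)
print(sorted(best))
```

The pairwise scan is quadratic in the number of candidates; an index such as an R-tree is one way to prune comparisons when the candidate set is large.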

  • Cost-sensitive ensemble classification algorithm for medical image   Order a copy of this article
    by Minghui Zhang, Haiwei Pan, Niu Zhang, Xiaoqin Xie, Zhiqiang Zhang, Xiaoning Feng 
    Abstract: Medical image classification is an important part of domain-specific application image mining. In this paper, we quantify the domain knowledge about medical images for feature extraction. We propose a cost-sensitive ensemble classification algorithm (CEC), which uses a new training method and adopts a new method to acquire parameters. In the weak classifier training process, we mark the samples that were wrongly classified in the previous iteration, re-sample from the samples that were correctly classified, and put all the wrongly classified samples into the next training round. The classifier can thus pay more attention to those samples that are hard to classify. The weight parameters of the weak classifiers are determined not only by their error rates, but also by their ability to recognise the positive samples. Experimental results show that our algorithm is more efficient for medical image classification.
    Keywords: medical image; domain knowledge; cost-sensitive learning; ensemble classification

  • Mining balanced API protocols
    by Deng Chen, Yanduo Zhang, Wei Wei, Rongcun Wang, Huabing Zhou, Xun Li, Binbin Qu 
    Abstract: API protocols can be used in many aspects of software engineering, such as software testing, program validation, and software documentation. Mining API protocols based on probabilistic models has proved to be an effective approach to obtaining protocols automatically. However, it always yields unbalanced protocols, that is, protocols described using probabilistic models have unexpected, extremely high and low probabilities. In this paper, we discuss the unbalanced probability problem and propose to address it by preprocessing the method call sequences used for training. Our method first finds tandem arrays in method call sequences based on the suffix tree. Then, it substitutes each tandem array with a tandem repeat. Since repeated sub method call sequences are eliminated, balanced API protocols may be obtained. In order to investigate the feasibility and effectiveness of our approach, we implemented it in our previous prototype tool ISpecMiner, and used the tool to perform a comparison test based on several real-world applications. Experimental results show that our approach can obtain more balanced API protocols than existing approaches, which is essential for mining valid and precise API protocols.
    Keywords: mining API protocol; suffix tree; probability balance; method call sequence; Markov model; tandem array

  • Advanced DDOS detection and mitigation technique for securing cloud
    by Masoumeh Zareapoor, Pourya Shamsolmoali, M.Afshar Alam 
    Abstract: Distributed Denial of Service (DDoS) attacks have become a serious problem for internet security and cloud computing. This kind of attack is the most complex form of Denial of Service (DoS) attack. Such an attack can easily falsify its source address, as in a spoofing attack, which disguises the real origin of the attack. Therefore, the DDoS attack is the most significant challenge for network security. In this paper, we present a model to detect and mitigate DDoS attacks in cloud computing. The proposed model requires very little storage and is capable of fast detection. The experimental results show that the system is able to mitigate most of the attacks. Detection accuracy and processing time were the metrics used to evaluate the performance of the proposed model. From the results, it is evident that the system achieves high detection accuracy (97%) with some minor false alarms.
    Keywords: distributed denial-of-service; DDOS; information divergence; cloud security; filtering

  • Global and local optimisation based hybrid approach for cloud service composition
    by Jyothi Shetty, Demian Antony D'Mello 
    Abstract: The goal of service composition is to find the best set of services to meet the user's requirements. Efficient local optimisation methods may fail to satisfy the user's end-to-end requirements. Global optimisation methods are popular when the user's end-to-end requirements are to be satisfied. Optimal composition under end-to-end requirements takes exponential time when the search space is large. Metaheuristic methods, which give near-optimal solutions, are being used to solve this problem. This paper proposes an approach in which both local and global optimisations are used. In order to avoid local optima during local optimisation, the proposed work selects a set of best services from each task and then uses a global optimisation method on the smaller search space to select the best composition. In order to reduce the communication costs, the optimal solution identifies the minimum number of clouds for composition.
    Keywords: cloud service; service composition; task level selection; global optimisation; local optimisation; exact algorithm

  • Homomorphisms between the covering information systems
    by Zengtai Gong, Runli Chai, Yongping Guo 
    Abstract: The system of information is an important mathematical model in many fields, such as data mining, artificial intelligence, and machine learning. The relation or mapping is a popular method for exploring the communication between two systems of information. In this paper, we first introduce the concepts of the covering relation or mapping and the inverse covering relation or mapping between two covering systems of information and investigate their properties. Then, we propose the view of homomorphism of covering systems of information that are based on covering relation. Moreover, we prove that attribute reductions in the image system and original system are equivalent to each other under the conditions of homomorphism given in this paper.
    Keywords: covering-based rough sets; homomorphism of information; attribute reductions

  • The remote farmland environment monitoring system based on ZigBee sensor network
    by Yongfei Ye, Xinghua Sun, Minghe Liu, Zhisheng Zhao, Xiao Zhang, Hongxi Wu 
    Abstract: In order to change the traditional management of agricultural production, ZigBee technology is used for short-distance wireless transmission to design an intelligent remote monitoring system for the farmland environment, which integrates communication, computing and all aspects of network technology. The real-time, accurate collection of farmland soil pH value, the temperature and humidity surrounding the plants, illumination intensity and crop chlorophyll content provides reliable data for intelligent agricultural production, thereby raising the level of intelligence of agricultural management. Based on precision guidance, irrigation will become intelligent, which can avoid the waste of water resources and low use rate caused by free operation. At the same time, it will promote modernisation of agricultural production processes.
    Keywords: farmland environment; remote monitoring; ZigBee technology; sensor network; intelligent farmland environment; precise agriculture; agricultural information; data collection; data transmission; real time; agricultural knowledge; computational science; computational engineering.

  • Optimising order selection algorithm based on online taxi-hailing applications
    by Tian Wang, Wenhua Wang, Yongxuan Lai, Diwen Xu, Haixing Miao, Qun Wu 
    Abstract: Nowadays, with the widespread use of smart devices and networking technologies, the application of taxi-hailing servers is becoming more and more popular in our daily life. However, drivers' behaviour of grabbing orders while driving poses serious potential traffic safety problems. Considering the characteristics and deficiencies of the mainstream taxi-hailing apps on smart devices, this paper studies the order selection problem from the drivers' end. According to different customers' requirements, an order auto-selection algorithm is proposed. Moreover, it adopts a time buffer mechanism to avoid time conflicts among the orders, and a new concept of 'efficiency value of orders' is proposed to evaluate the profits of orders. This algorithm can auto-select orders for the driver according to their qualities, which can not only improve safety, but also maximise the drivers' revenue. Extensive simulations validate the performance of the proposed method.
    Keywords: taxi-hailing application; order selection algorithm; biggest profit; greedy algorithm; safety; efficiency value of orders.

  • Towards UNL based machine translation for Moroccan Amazigh language
    by Imane Taghbalout, Fadoua Ataa Allah, Mohamed El Marraki 
    Abstract: Amazigh languages, also called Berber, belong to the Afro-Asiatic language (Hamito-Semitic) family. They are a family of similar and closely related languages and dialects indigenous to North Africa. They are spoken in Morocco, Algeria, and by some populations in Libya, Tunisia, northern Mali, western and northern Niger, northern Burkina Faso, Mauritania, and in the Siwa Oasis of Egypt. Large Berber-speaking migrant communities have been living in Western Europe since the 1950s. In this paper, we study Standard Moroccan Amazigh. It became a constitutionally official language of Morocco in 2011. However, it is still considered a less-resourced language. So, it is time to develop linguistic resources and applications for automatically processing this language, in order to ensure its survival and promotion by integrating it into the new information and communication technologies (NICT). In this context, and with a view to producing a Universal Networking Language (UNL) based machine translation system for this language, we have undertaken the creation of the Amazigh-UNL dictionary, as a first step in developing the linguistic resources required by the UNL system to achieve translation. Thus, this paper focuses on presenting the implementation of linguistic features, such as morphological, syntactical and semantic information of the Amazigh languages.
    Keywords: Amazigh language; machine translation; Universal Networking Language; Amazigh-UNL dictionary; inflectional paradigm; subcategorisation frame; Universal Word

  • Population diversity of particle swarm optimisation algorithms for solving multimodal optimisation problems
    by Shi Cheng, Junfeng Chen, Quande Qin, Yuhui Shi 
    Abstract: The aim of multimodal optimisation is to locate multiple peaks/optima in a single run and to maintain these found optima until the end of a run. In this paper, seven variants of particle swarm optimisation (PSO) algorithms, which include PSO with star structure, PSO with ring structure, PSO with four clusters structure, PSO with Von Neumann structure, social-only PSO with star structure, social-only PSO with ring structure, and cognition-only PSO, are used to solve multimodal optimisation problems. The population diversity, or more specifically, the position diversity, is used to measure the candidate solutions during the search process. Our goal is to measure the performance and effectiveness of variants of PSO algorithms and investigate why an algorithm performs effectively from the perspective of population diversity. The experimental tests are conducted on eight benchmark functions. Based on the experimental results, it can be concluded that the PSO with ring structure and social-only PSO with ring structure perform better than the other PSO variants on multimodal optimisation. The population diversity measurement shows that, to obtain good performance on multimodal optimisation problems, an algorithm needs to balance its global search ability and solution maintenance ability, which means that the population diversity should converge to a certain level quickly and be kept at that level during the whole search process.
    Keywords: swarm intelligence algorithm; multimodal optimisation; particle swarm optimisation; population diversity; nonlinear equation systems
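
To make the notion of position diversity in the abstract above concrete, the sketch below computes one common dimension-wise, L1-based measure: the mean absolute deviation of particle positions from the swarm centre, averaged over dimensions. This is an assumption for illustration; the paper's exact definition may differ, and the function name `position_diversity` is hypothetical.

```python
from typing import List


def position_diversity(positions: List[List[float]]) -> float:
    """Mean absolute deviation from the swarm centre, averaged over all dimensions."""
    m = len(positions)      # number of particles
    n = len(positions[0])   # number of dimensions
    # Centre of the swarm in each dimension
    centre = [sum(p[j] for p in positions) / m for j in range(n)]
    # Average distance to the centre, computed separately per dimension
    per_dim = [sum(abs(p[j] - centre[j]) for p in positions) / m for j in range(n)]
    return sum(per_dim) / n


# Two hypothetical particles in a 2-D search space
print(position_diversity([[0.0, 0.0], [2.0, 4.0]]))
```

A value near zero means the particles have collapsed onto one point, while a larger value means they are still spread across the search space.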

  • A pseudo nearest centroid neighbour classifier
    by Hongxing Ma, Jianping Gou, Xili Wang 
    Abstract: In this paper, we propose a new reliable classification approach, called the pseudo nearest centroid neighbour rule, which is based on the pseudo nearest neighbour rule (PNN) and nearest centroid neighbourhood (NCN). In the proposed PNCN, the nearest centroid neighbours rather than nearest neighbours per class are first searched by means of NCN. Then, we calculate k categorical local mean vectors corresponding to k nearest centroid neighbours, and assign the weight to each local mean vector. Using the weighted k local mean vectors for each class, PNCN designs the corresponding pseudo nearest centroid neighbour and decides the class label of the query pattern according to the closest pseudo nearest centroid neighbour among all classes. The classification performance of the proposed PNCN is evaluated on real and artificial datasets in terms of the classification accuracy. The experimental results demonstrate the effectiveness and robustness of PNCN over the competing methods in many practical classification problems.
    Keywords: K-nearest neighbour rule; nearest centroid neighbourhood; pseudo nearest centroid neighbour rule; local mean vector; pattern classification

  • A comparative study of mixed least-squares FEMs for the incompressible Navier-Stokes equations
    by Alexander Schwarz, Masoud Nickaeen, Serdar Serdas, Abderrahim Ouazzi, Jörg Schröder, Stefan Turek 
    Abstract: In the present contribution we compare (quantitatively) different mixed least-squares finite element methods (LSFEMs) with respect to computational costs and accuracy. In detail, we consider an approach for Newtonian fluid flows, which are described by the incompressible Navier-Stokes equations. Various first-order systems are derived based on the residual forms of the equilibrium equation and the continuity condition. From these systems L^2-norm least-squares functionals are constructed, which are the basis for the associated minimisation problems. The first formulation under consideration is a div-grad first-order system resulting in a three-field formulation with total stresses, velocities, and pressure (S-V-P) as unknowns. Here, the variables are approximated in H(div) x H^1 x L^2 on triangles and in H^1 x H^1 x L^2 on quadrilaterals. In addition, a reduced stress-velocity (S-V) formulation is derived and investigated. An advantage of this formulation is a smaller system matrix due to the absence of the pressure degree of freedom, which is eliminated in this approach. S-V-P and S-V formulations are promising approaches when the stresses are of special interest, e.g. for non-Newtonian, multiphase or turbulent flows. Furthermore, since in the total stress approach the pressure is approximated instead of its gradient, the proposed S-V-P formulation could be used in formulations with discontinuous pressure interpolation. For comparison, the well-known first-order vorticity-velocity-pressure (V-V-P) formulation is investigated. Here, all unknowns are approximated in H^1 on quadrilaterals. Besides some numerical advantages, e.g. an inherent symmetric structure of the system of equations and a directly available error estimator, it is known that least-squares methods have a drawback concerning mass conservation, especially when lower-order elements are used. Therefore, the main focus of the work is on performance and accuracy aspects, on the one hand for finite elements with different interpolation orders and on the other hand for the usage of efficient solvers, for instance of Krylov-space or multigrid type. Finally, two well-known benchmark problems are presented and the results are compared for different first-order formulations.
    Keywords: least-squares FEM; V-V-P formulation; S-V-P formulation; S-V formulation; Navier-Stokes; multigrid

  • Enhanced differential evolution with modified parent selection technique for numerical optimisation
    by Xiang Li 
    Abstract: Differential evolution (DE) is considered to be one of the most prominent evolutionary algorithms for numerical optimisation. However, it may suffer from a slow convergence rate, especially in the late stage of the evolution process. The reason might be that the parents in the mutation operator are randomly selected from the parent population. To remedy this limitation and to enhance the performance of DE, in this paper, a modified parent selection technique is proposed, where the parents in the mutation operator are chosen based on their previous successful experiences. The major advantages of the proposed parent selection technique are its simplicity and generality. It does not destroy the simple structure of DE, and it can be used in most DE variants. To verify the performance of the proposed technique, it is integrated into the classical DE algorithm and three advanced DE variants. Thirteen widely used benchmark functions are used as the test suite. Experimental results indicate that the proposed technique is able to enhance the performance of the classical DE and advanced DE algorithms in terms of both the quality of final solutions and the convergence rate.
    Keywords: differential evolution; parent selection; mutation operator; numerical optimisation
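
For context on the random parent selection that the abstract above identifies as a limitation, here is a minimal sketch of the classical DE/rand/1 mutation, v = x_r1 + F (x_r2 - x_r3), with the three parents drawn uniformly at random. It shows the baseline only, not the modified selection technique the paper proposes. The function name, the default scale factor `f = 0.5` and the optional `rng` argument are assumptions made for illustration.

```python
import random
from typing import List, Optional


def de_rand_1_mutation(population: List[List[float]], target: int,
                       f: float = 0.5,
                       rng: Optional[random.Random] = None) -> List[float]:
    """Build a mutant vector x_r1 + f * (x_r2 - x_r3) from uniformly chosen parents."""
    if rng is None:
        rng = random.Random()
    # Three distinct parent indices, all different from the target individual
    candidates = [i for i in range(len(population)) if i != target]
    r1, r2, r3 = rng.sample(candidates, 3)
    x1, x2, x3 = population[r1], population[r2], population[r3]
    return [a + f * (b - c) for a, b, c in zip(x1, x2, x3)]


# Hypothetical population of five 2-D individuals
pop = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
print(de_rand_1_mutation(pop, target=0, rng=random.Random(42)))
```

Because every non-target individual is equally likely to be picked, there is no selective pressure in this step; a selection scheme based on past success would replace the uniform `rng.sample` call.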

  • Intelligent selection of parents for mutation in differential evolution
    by Meng Zhao, Yiqiao Cai 
    Abstract: In most DE algorithms, the parents for mutation are randomly selected from the current population. As a result, all vectors involved in mutation are equally likely to be selected as parents, without any selective pressure. Although such a mutation strategy is easy to use, it is inefficient for solving complex problems. To address this issue, we present an intelligent parents selection strategy (IPS) for DE. The new algorithmic framework is named DE with IPS-based mutation (IPSDE). In IPSDE, the neighbourhood of each individual is first constructed with a population topology. Then, all the neighbours of each individual are partitioned into two groups based on their fitness values, and a probability value for each neighbour being selected as a parent in the respective group is calculated based on its distance from the current individual. With the probability values, IPS selects the parents from the neighbourhood of the current individual to guide the mutation process of DE. To evaluate the effectiveness of the proposed approach, IPSDE is applied to several original DE algorithms and advanced DE variants. Experimental results have shown that IPSDE is an effective framework to enhance the performance of most DE algorithms studied.
    Keywords: differential evolution; mutation operator; neighbourhood information; intelligent parents selection.
    DOI: 10.1504/IJCSE.2016.10002299

  • Modelling method of dynamic business process based on pi-calculus
    by Yaya Liu, Jiulei Jiang, Weimin Li 
    Abstract: The formal modelling of a dynamic business process is to make the collaborative relationship between organisations more detailed and explicit. It is convenient for people to analyse the structure and interaction of cross-organisational business processes, especially dynamic business processes, and assure the optimisation of the system architecture. Based on the channel mobility of pi-calculus, a new modelling method of the dynamic business process is proposed by combining with the extended directed acyclic graph. It is mainly discussed from three aspects: the selection of the interactive paths, the transition of business objects and the validation of accuracy. Meanwhile, a concrete example with multiple roles is presented to assist in the implementation of the method. It concludes that the method can effectively distinguish the collaborative relationship between organisations, and also be used to build formal models of complicated and dynamic business processes with the mature technology.
    Keywords: dynamic business process; cross-organisational business process; channel mobility; pi-calculus; extended directed acyclic graph.

  • Unsupervised metric learning for person re-identification by image re-ranking
    by Dengyi Zhang, Qian Wang, Xiaoping Wu, Yu Cao 
    Abstract: In a multi-camera video surveillance system with non-overlapping areas, the same person may appear different in different cameras; also, different people may look the same. This makes person re-identification an important and challenging problem. Most current person re-identification methods are based on supervised distance metric learning, which labels the same person from many cameras as positive samples for distance metric learning; this is hard to do manually for large numbers of cameras. Thus, this paper describes an unsupervised distance metric learning method based on image re-ranking. It calculates the original distance matrix for person samples from two cameras using the original distance metric function, re-ranks the distance matrix by the image re-ranking method to acquire a better distance function, and then uses it to calculate the new distance rank matrix. This matrix is used to label positive and negative samples automatically for unsupervised distance metric learning, and thus to acquire a better Mahalanobis distance metric function without the need to manually label person samples across different cameras. Experiments were performed on the public datasets VIPeR, i-LIDS, GRID and CAVIAR4REID, and the results were compared with current distance learning methods. The results are evaluated by CMC, which indicates that this algorithm can overcome the difficulty of labelling large numbers of person samples from cameras in distance metric learning, with a better re-identification rate.
    Keywords: video surveillance; non-overlapping area; person re-identification; unsupervised metric learning; image re-ranking

  • Discovery of continuous coherent evolution biclusters in time series data
    by Meihang Li, Yun Xue, Haolan Zhang, Bo Ma, Jie Luo, WenSheng Chen, Zhengling Liao 
    Abstract: Most traditional biclustering algorithms focus on the biclustering model of non-continuous columns, which is unsuitable for analysis of time series gene expression data. We propose an effective and exact algorithm that can be used to mine biclusters with coherent evolution on contiguous columns, as well as complementary and time-lagged biclusters in time series gene expression matrices. Experimental results show that the algorithm can detect biclusters with statistical significance and strong biological relevance. The algorithm is also applied to currency data analysis, in which meaningful results are obtained.
    Keywords: time series data; bicluster; coherent evolution; complementary; time-lagged

  • Empirical rules based views abstraction for distributed model-driven development
    by Yucong Duan, Jiaxuan Li, Qiang Duan, Lixin Luo, Liang Huang 
    Abstract: UML view integration has been extensively studied in the area of model transformation in model-driven engineering. Empirical processing rules are among the most widely employed approaches for processing view abstraction, which can support model simplification, consistency checking, and management complexity reduction. However, empirical rules face some challenges, such as completeness validation, consistency among rules, and composition priority arrangement. The challenge of rule composition is enlarged in the environment of distributed model-driven development for web service-based systems, where redundant information/data is emphasised. The same redundant information can be expressed in different forms that comprise different topological structures for entity relationship network representing the same part of the system. Such variation will result in choosing different compositions of the rules executed in different orders, which will increase the severity of the current non-determinism from the empirical probability of some rules. In this paper, we investigate the effect of redundancy on rule application through designing a simulated distributed storage for an example diagram model. We propose a formal solution for addressing this challenge through constructing a finite-state automaton to unify empirical abstraction rules while relieving the side effects caused by redundancy. We also show the results obtained from a prototype implementation.
    Keywords: UML; model transformation; view abstraction; finite-state automaton

  • Populating parameters of web services by automatic composition using search precision and WSDL weight matrix
    by Sumathi Pawar, Niranjan Chiplunkar 
    Abstract: Web service composition is meant for connecting different web services according to the requirement. The absence of public Universal Description, Discovery, and Integration (UDDI) has made it difficult to get QoS information about web services unless they are checked by execution. This research implements a system for invoking and composing web services according to the user requirements by searching for the required web services dynamically using the Bingo search engine. The user may not know the values of the input parameters of the required web services, and these unknown parameters are populated by composing available web services automatically and dynamically. The methodology used here consists of searching for the requested web services according to the functional word, finding the search precision with support and confidence values of the search results, computing the Web Service Description Language (WSDL) weight matrix to select suitable web services for user satisfaction, and populating unknown input parameter values by composing the web services. Composable web services are found by intra-cluster search and inter-cluster search among different operation elements of community web services. A composition rule is framed for composable web services according to the order of composition. Pre-condition and effect elements are checked before execution of the composition plan. Finally, web services are invoked according to the composition rule.
    Keywords: service composition; WSDL; match-making algorithm; service discovery; WSDL-S.

  • Fast elliptic curve scalar multiplication for resisting against SPA
    by Shuanggen Liu 
    Abstract: This paper analyses the computation of the Symbolic Ternary Form (STF) elliptic curve scalar multiplication algorithm and the binary scalar multiplication algorithm. Compared with the binary scalar multiplication algorithm, the efficiency of the STF scalar multiplication algorithm is increased by 5.9% on average and has a corresponding advantage. For this reason, we improve the structure of the STF scalar multiplication algorithm and make the performance more "smooth" by constructing an indistinguishable operation between points addition (A) and triple point (T) and thus resist against the simple power analysis (SPA) attacks. At the same time, we propose the Highest-weight Symbolic Ternary Form (HSTF), which makes a scalar k transform into the highest-weight form. Thus, every cycle has a fixed pattern to resist SPA attack. With respect to binary scalar multiplication algorithm with anti-SPA, the average efficiency is enhanced by 17.7%.
    Keywords: elliptic curve scalar multiplication; simple power analysis; highest-weight symbolic ternary form

  • Predicting rainfall using neural nets
    by Kyaw Kyaw Htike 
    Abstract: Successfully predicting rainfall is one of the most crucial factors in strategic decision-making and planning in countries that rely on agriculture. Despite its clear importance, forecasting rainfall remains a big challenge owing to the highly dynamic nature of the climate process and its associated seemingly random fluctuations. A wide variety of models have been proposed to predict rainfall, among which statistical models have been some of the most successful. In this paper, we propose a novel rainfall forecasting model using Focused Time-Delay Neural Networks (FTDNNs). In addition, we compare rainfall forecasting performance, using FTDNNs, for different prediction time scales, namely: monthly, quarterly, bi-annually and yearly. We present the optimal neural network architecture parameters automatically found for each of the aforementioned time scales. Our models are trained to perform one-step-ahead predictions and we demonstrate and evaluate our results, measured by mean absolute percentage error, on the rainfall dataset obtained from the Malaysian Meteorological Department (MMD) covering a period of close to thirty years. For test data, we found that the most accurate result was obtained by our method on the yearly rainfall dataset (94.25%). For future work, dynamic meteorological parameters such as sunshine data, air pressure, cloudiness, relative humidity and wet bulb temperature can be integrated as additional features into the model for even higher prediction performance.
    Keywords: rainfall prediction; forecasting; statistical prediction models; artificial neural networks; focused time-delay networks.
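
The abstract above evaluates forecasts by mean absolute percentage error (MAPE). As a reminder of how that metric is computed, here is a minimal sketch using the standard definition, 100/n times the sum of |(actual - predicted) / actual|. The function name and the sample rainfall values are hypothetical and are not taken from the paper's dataset.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical observed and forecast rainfall totals (mm)
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(round(mape(observed, forecast), 2))
```

Since each term divides by the observed value, the metric is undefined for periods with zero rainfall, which is one reason it suits coarser time scales such as yearly totals better than very fine ones.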

  • Overview of information visualisation in science education
    by Chun Hua Wang, Dong Han, Wen-Kuang Chou 
    Abstract: Developed from computer-assisted instruction, visual education is a new teaching method based on visual design with computer techniques and aimed at education. Based on an overview of previous studies, this paper expounds the main features of education visualisation, outlines its theoretical basis, summarises the empirical study of science education visualisation, and refines the application scenarios and points requiring attention in science education visualisation, using static and dynamic visualisation as the basis for classification. The paper concludes that the success of education visualisation depends on the students' knowledge background, visual perception and comprehension ability. Therefore, the design of education visualisation must ensure that the objects and contents of visualisation can adapt to the specific conditions and instructional objectives.
    Keywords: science education visualisation; static visualisation; dynamic visualisation

  • An automation approach for architecture discovery in software design using genetic algorithm
    by Sushama C, A Rama Mohan Reddy 
    Abstract: Software architectures are treated as valuable artifacts in software engineering. The functionality of the software is dependent on the software architectures. The software architectures provide high-level analysis whenever the architects need to analyse the dynamic structure of the design. Modifications to the designs are made manually; this is a very complicated process and sometimes it does not solve the problem completely. This paper presents a genetic algorithm for the discovery of the underlying architectures of software design. The genetic algorithm is carried out with different modules such as encoding, fitness function, and mutation. The algorithm was tested with real-time projects and the complete experimental study is reported.
    Keywords: genetic algorithm; components; interactions; relations; search-based software engineering.

  • A modified electromagnetism-like mechanism algorithm with pattern search for global optimization
    by Qing Wu, Chunjiang Zhang, Liang Gao 
    Abstract: The solution space of most global optimisation problems is very complex, which places high demands on the search performance of algorithms. The electromagnetism-like mechanism (EM) algorithm is an emerging global optimisation method. However, the intensification and the diversification of the original EM are not very efficient. This paper proposes a modified EM algorithm. To improve the intensification ability, a more effective variable step-size pattern search is applied to replace the original random line search at the local search stage. Meanwhile, a perturbing point is used to increase the diversity. In addition, the formula for calculating the total force is simplified to accelerate the algorithm's search process. Numerical experiments are conducted to compare the proposed algorithm with other variants of EM algorithms and different variants of particle swarm optimisation algorithms. The results show that the approach is competitive.
    Keywords: electromagnetism-like mechanism algorithm; pattern search; global optimisation; meta-heuristic algorithm; local search

  • Various GPU memory utilisation exploration for large RDF search   Order a copy of this article
    by Chantana Chantrapornchai 
    Abstract: Graphics processing units (GPUs) are important accelerators in today's desktop computers. They contain thousands of processing units that can run a program simultaneously, and various memory types, with different sizes and access times, which are connected in a hierarchy. However, GPUs have a much smaller internal memory than a typical computer, which can be an obstacle to big data processing. In this paper, we study the use of various GPU memory types (global, texture, constant, and shared memories) in simultaneously searching large Resource Description Framework (RDF) data, which are commonly used on the internet to link WWW data. Using suitable memory types and properly managing the data transfer can lead to better performance when processing such data. The results show that a parallel search of 45 gigabytes of RDF data on multiple GPUs, using the global memory to store large texts and the shared memory to store multiple keywords, can run about 14 times faster than the sequential search on a low-cost desktop.
    Keywords: graphic processing units; large RDF; parallel string search

  • Hough transform-based cubic spline recognition for natural shapes   Order a copy of this article
    by Cheng-Huang Tung, Wei-Jyun Syu, Wei-Cheng Huang 
    Abstract: A two-stage cubic spline recognition method based on the generalised Hough transform (GHT) is proposed for recognising flexible natural shapes. First, the proposed method uses cubic splines to interpolate a flexible natural shape, and a sequence of connected boundary points is generated from the cubic splines. Each such point has accurate tangent and curvature features. At the first recognition stage, the proposed method uses the modified GHT to adjust the scale and orientation factors of the input shape with respect to each reference model. At the second recognition stage, the proposed point-based matching technique calculates the difference between each specific reference model and its corresponding adjusted input shape at the point level. In experiments on recognising 15 categories of natural shapes, including fruits and vegetables, the recognition rate of the proposed two-stage method is 97.3%, much higher than the 79.3% measured for the standard GHT.
    Keywords: Hough transform, GHT, cubic spline, natural shape, curvature, tangent, point-based matching, recognition method, template database, boundary point.

  • Personalised service recommendation process based on service clustering   Order a copy of this article
    by Xiaona Xia 
    Abstract: Personalised service recommendation is a key technology for service platforms, and the demand preferences of users are important factors in personalised recommendation. First, in order to improve the accuracy and adaptability of service recommendation, services need to be initialised before being recommended and selected; they are then classified and clustered according to demand preferences, and service clusters are defined and demonstrated. To address the sparsity of the service function matrix, historical and potential preferences are expressed as double matrices. Second, a service cluster is viewed as the basic business unit, and we optimise the graph summarisation algorithm and construct the service recommendation algorithm SCRP. Experiments over a variety of parameters show that SCRP has advantages over other algorithms. Third, we select fuzzy degree and difference as the two key indicators, and use some service clusters to complete the simulation and analyse the algorithm performance. The results show that our service selection and recommendation method performs better than the others, and might effectively improve the quality of service recommendation.
    Keywords: service clustering; service recommendation; graph summarisation algorithm; personalisation; preference matrix

  • Power-aware high level evaluation model of interconnect length of on-chip memory network topology   Order a copy of this article
    by XiaoJun Wang, Feng Shi, Yi-Zhuo Wang, Hong Zhang, Xu Chen, Wen-Fei Fu 
    Abstract: Interconnect power is the factor that dominates power consumption in on-chip memory architectures. Almost all dedicated wires and buses have been replaced with packet-switching interconnection networks, which have become the standard approach to on-chip interconnection. Unfortunately, rapid advances in technology are making it more difficult to assess the interconnect power consumption of a network-on-chip (NoC). To resolve this problem, a new evaluation methodology, Interconnect Power Evaluation based on Topology of On-chip Memory (IP-ETOM), is proposed in this paper. To validate this method, two multicore architectures, 2D-Mesh and Triplet-based Architecture (TriBA), are evaluated in this research work. The on-chip memory network model is evaluated based on the characteristics of on-chip architecture interconnection. Matlab is used to conduct the experiment that evaluates the interconnection power of TriBA and 2D-Mesh.
    Keywords: power evaluation; on-chip memory network topology; NoC interconnects; IP-ETOM

  • Optimising data access latencies of virtual machine placement based on greedy algorithm in datacentre   Order a copy of this article
    by Xinyan Zhang, Keqiu Li, Yong Zhang 
    Abstract: The total completion time of a task is a major bottleneck in big data processing applications based on parallel computation, since the computation and data are distributed over more and more nodes. Therefore, the total completion time of a task is an important index for evaluating cloud performance. The access latency between nodes is one of the key factors affecting task completion time for cloud datacentre applications. Additionally, minimising total access time can reduce the overall bandwidth cost of running the job. This paper proposes an optimisation model that places virtual machines (VMs) so as to minimise the total data access latency, given where the datasets are located. Under the proposed model, the VM placement problem is a linear programming problem. We therefore obtain the optimum solution of our model with a branch-and-bound algorithm whose time complexity is O(2^{NM}). We also present a greedy algorithm, with time complexity O(NM), to solve our model. Finally, the simulation results show that all of the solutions of our model are superior to existing models and close to the optimal value.
    Keywords: datacentre; cloud environment; access latency; virtual machine placement; greedy algorithm

  • An empirical study of disclosure effects in listed biotechnology and medicine industry using MLR model   Order a copy of this article
    by Chiung-Lin Chiu, You-Shyang Chen 
    Abstract: This research employs the multiple linear regression model to investigate the relationship between voluntary disclosure and firm performance in the biotechnology and medicine industry in Taiwan. Using 44 firm-year observations collected from the Information Transparency and Disclosure Ranking System and the Taiwan Economic Journal financial database for companies listed on the Taiwan Stock Exchange and Taipei Exchange Market, the regression results reveal a positive and significant relationship between voluntary disclosure and firm performance. Firms with better voluntary disclosure perform better than firms without voluntary disclosure. The results suggest that companies should pay more attention to voluntary disclosure as additional information, which investors also consider valuable when making their investment decisions.
    Keywords: voluntary disclosure; firm performance; investment decision; MLR; multiple linear regression model; biotechnology and medicine industry; TSE; Taiwan Stock Exchange; ITDRS; information transparency and disclosure ranking system

  • A static analytical performance model for GPU kernel   Order a copy of this article
    by Jinjing Li 
    Abstract: Graphics processing units (GPUs) have shown increased popularity and play an important role as a kind of coprocessor in heterogeneous co-processing environments. Heavily data-parallel problems can be solved efficiently by tens of thousands of threads working collaboratively in parallel in the GPU architecture. The achieved performance, therefore, depends on the capability of multiple threads to collaborate in parallel, the effectiveness of latency hiding, and the use of multiprocessors. In this paper, a static analytical kernel performance model (SAKP) is proposed, based on this performance principle, to estimate the execution time of a GPU kernel. Specifically, a set of kernel and device features for the target GPU is generated in the proposed model. We determine the performance-limiting factors and generate an estimate of the kernel execution time with this model. Matrix Multiplication (MM) and Histogram Generation (HG) were run on an NVIDIA GTX680 GPU card to verify the proposed model, and showed an absolute prediction error of less than 6.8%.
    Keywords: GPU; co-processing; static analytical kernel performance model; kernel and device features; absolute error.

  • Syntactic parsing of clause constituents for statistical machine translation   Order a copy of this article
    by Jianjun Ma, Jiahuan Pei, Degen Huang, Dingxin Song 
    Abstract: The clause, a structure between a chunk and a sentence, is considered to be the basic unit of grammar in linguistics. Clause constituents, therefore, are an important kind of linguistically valid syntactic phrase. This paper adopts the CRFs model to recognise English clause constituents with their syntactic functions, and tests their effect on machine translation by applying this syntactic information to an English-Chinese PBSMT system, evaluated on a business-domain corpus. Clause constituents are mainly classified into six kinds: subject, predicator, complement, adjunct, residues of predicator, and residues of complement. Results show that our rich-feature CRFs model achieves an F-measure of 93.31%, a precision of 93.26%, and a recall of 93.04%. This syntactic knowledge in the source language is further combined with the NiuTrans phrasal SMT system, which slightly improves the English-Chinese translation accuracy.
    Keywords: syntactic parsing; clause constituents; PBSMT.
    DOI: 10.1504/IJCSE.2016.10004598

  • A universal compression strategy using sorting transformation   Order a copy of this article
    by Bo Liu, Xi Huang, Xiaoguang Liu, Gang Wang, Ming Xu 
    Abstract: Although traditional universal compression algorithms can effectively exploit repetition located within a sliding window, they cannot take advantage of message sources in which similar messages are distributed uniformly. In this paper, we propose a universal segmenting-sorting compression algorithm to solve this problem. The key idea is to reorder the message source before compressing it with the LZ77 algorithm. We design transformation methods for two common data types: webpage corpora and access logs. The experimental results show that the segmenting-sorting transformation is truly beneficial to the compression ratio. Our new algorithm makes the compression ratio 20% to 50% lower than the naive LZ77 algorithm does and takes almost the same decompression time. For some read-heavy sources, segmenting-sorting compression can reduce space cost while guaranteeing throughput.
    Keywords: segmenting; sorting; LZ77; compression; universal compression method.
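
The reorder-then-compress idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not detail the segmenting-sorting transformation, so a plain lexicographic sort of whole lines stands in for it, and Python's `zlib` (DEFLATE, which combines LZ77 with Huffman coding and a 32 KB window) stands in for LZ77. The synthetic log produced by `make_log` and all function names are hypothetical.

```python
import random
import zlib


def make_log(num_templates=200, repeats=10, payload_chars=400, seed=0):
    """Build a synthetic message source (hypothetical data, not from the paper).

    Copies of the same message are spread evenly through the log, so two
    similar messages are always further apart than DEFLATE's 32 KB window.
    """
    rng = random.Random(seed)
    templates = [
        "key%03d %s" % (i, "".join(rng.choice("0123456789abcdef")
                                   for _ in range(payload_chars)))
        for i in range(num_templates)
    ]
    # Round-robin interleaving: t0, t1, ..., t199, t0, t1, ...
    return [templates[i] for _ in range(repeats) for i in range(num_templates)]


def compress_plain(lines):
    """Baseline: compress the messages in their original order."""
    return zlib.compress("\n".join(lines).encode("ascii"), 9)


def compress_sorted(lines):
    """Sort the messages so similar ones become adjacent, then compress.

    The sort permutation is stored in front of the body so that the
    original order can be restored after decompression.
    """
    order = sorted(range(len(lines)), key=lambda i: lines[i])
    perm = ",".join(str(i) for i in order)
    body = "\n".join(lines[i] for i in order)
    return zlib.compress((perm + "\n\n" + body).encode("ascii"), 9)


def decompress_sorted(blob):
    """Invert compress_sorted and return the messages in original order."""
    perm, body = zlib.decompress(blob).decode("ascii").split("\n\n", 1)
    order = [int(i) for i in perm.split(",")]
    sorted_lines = body.split("\n")
    lines = [None] * len(order)
    for position, original_index in enumerate(order):
        lines[original_index] = sorted_lines[position]
    return lines


if __name__ == "__main__":
    log = make_log()
    print("plain :", len(compress_plain(log)), "bytes")
    print("sorted:", len(compress_sorted(log)), "bytes")
```

On this artificial input the sorted variant compresses better because the repeated messages fall inside the matching window; the gain on real data depends on how the source is segmented and ordered, which is what the paper studies.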

  • Executing time and cost-aware task scheduling in hybrid cloud using a modified DE algorithm   Order a copy of this article
    by Yuanyuan Fan, Qingzhong Liang, Yunsong Chen 
    Abstract: Task scheduling is one of the basic problems in cloud computing. In a hybrid cloud, task scheduling faces new challenges. In this paper, we propose the GaDE algorithm, based on a differential evolution algorithm, to improve the single-objective scheduling performance of a hybrid cloud. To better handle multi-objective task scheduling optimisation in hybrid clouds, we present a multi-objective algorithm, named NSjDE, built on GaDE and a quick-sorting method for the Pareto optimum. This algorithm also reduces the frequency of evaluation. Experiments comparing the Min-Min, GaDE and NSjDE algorithms show that, for single-objective task scheduling, GaDE and NSjDE perform better at reaching an approximately optimal solution. The multi-objective NSjDE algorithm optimises faster than the single-objective jDE algorithm, and NSjDE can produce more than one non-dominated solution that meets the requirements, giving the user more options.
    Keywords: hybrid cloud; task scheduling; executing time-aware; cost-aware

  • A dynamic cold-start recommendation method based on incremental graph pattern matching   Order a copy of this article
    by Yanan Zhang, Guisheng Yin, Deyun Chen 
    Abstract: In order to give accurate recommendations to a cold-start user who has few records, researchers find similar users for the cold-start user through social networks. However, these efforts assume that the cold-start user's social relationships are static and ignore the fact that updating social relationships in large-scale social networks is time consuming. In social networks, cold-start users and other users may change their social relationships as time goes by. In order to give accurate and timely recommendations to cold-start users, it is necessary to continuously update the set of users similar to the cold-start user according to his or her latest social relationships. In this paper, an incremental graph pattern matching based dynamic cold-start recommendation method (IGPMDCR) is proposed, which updates the similar users of a cold-start user based on the topology of social networks, and gives recommendations according to the latest similar users. The experimental results show that IGPMDCR can give accurate and timely recommendations to cold-start users.
    Keywords: dynamic cold-start recommendation; social network; incremental graph pattern matching; topology of social network.

  • Modelling and simulation research of vehicle engines based on computational intelligence methods   Order a copy of this article
    by Ling-ge Sui, Lan Huang 
    Abstract: We assess the feasibility of two widely used artificial neural network (ANN) technologies in the field of transient emission simulation. In this work, the back-propagation feedforward neural network (BPNN) is shown to be more suitable than the radial basis function neural network (RBFNN). Considering the transient change rule of a transient operation, the composite transient rate, which is composed of the torque transient rate and the air-fuel ratio (AFR) transient rate, is adopted as an input variable to the BPNN transient emission model. Thus, a whole-process transient simulation platform for a test diesel engine, based on multi-software coupling technology, is established. Through a transient emission simulation, the veracity and generalisation ability of the simulation platform are confirmed. The simulation platform can correctly predict the change trends, with a peak value difference within 8%. Our findings suggest that the simulation platform can be applied to studying control strategies for typical transient operations.
    Keywords: transient emission; simulation; back-propagation feedforward neural network; radial basis function neural network; diesel engine.

  • Institution-based UML activity diagram transformation with semantic preservation   Order a copy of this article
    by Amine Achouri, Yousra Bendaly Hlaoui, Leila Jemni Ben Ayed 
    Abstract: This paper presents a specific tool, called MAV-UML-AD, for the specification and verification of workflow models using UML Activity Diagrams (UML AD) and Event-B, based on institutions. The developed tool translates an activity diagram model into an equivalent Event-B specification according to a mathematical semantics. The transformation approach for UML AD models is based on the theory of institutions. In fact, each UML AD and Event-B specification is defined by an instance of its corresponding institution. The transformation approach is represented by an institution co-morphism, which is defined between the two institutions. Institution theory is adopted as the theoretical framework of the tool essentially for two reasons. First, it gives a locally mathematical semantics for UML AD and Event-B. Second, it defines a semantics-preserving mapping between a UML AD specification and an Event-B machine. Thanks to the B theorem prover, functional properties such as liveness and fairness can be formally checked. This paper highlights the core of the model transformation approach and shows how institution concepts such as category, co-morphism and signature are presented in the two formalisms used. It also illustrates the use of the developed tool MAV-UML-AD through an example of specification and verification.
    Keywords: formal semantics; model-driven engineering; institution theory; Event-B; UML activity diagram; formal verification

  • The analysis of evolutionary optimization on the TSP(1,2) problem   Order a copy of this article
    by Xiaoyun Xia, Xinsheng Lai, Chenfu Yi 
    Abstract: The TSP(1,2) problem is a special case of the travelling salesperson problem, which is NP-hard. Many heuristics including evolutionary algorithms (EAs) are proposed to solve the TSP(1,2) problem. However, we know little about the performance of the EAs for the TSP(1,2) problem. This paper presents an approximation analysis of the (1+1) EA on this problem. It is shown that both the (1+1) EA and the $(\mu+\lambda)$ EA can obtain a $3/2$ approximation ratio for this problem in expected polynomial runtime $O(n^3)$ and $O\left(\left(\frac{\mu}{\lambda}\right)n^3+n\right)$, respectively. Furthermore, we prove that the (1+1) EA can provide a much tighter upper bound than a simple ACO on the TSP(1,2) problem.
    Keywords: evolutionary algorithms; TSP(1,2); approximation performance; analysis of algorithm; computational complexity.
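
For readers unfamiliar with the setting, the following is a minimal sketch of a (1+1) EA on a TSP(1,2) instance, that is, a complete graph whose edge weights are all 1 or 2. The abstract does not state the mutation operator analysed in the paper; a single random 2-opt segment reversal per generation is an assumption made here, and the instance generator and all function names are hypothetical.

```python
import random


def random_tsp12(n, rng):
    """Symmetric complete graph on n cities with every edge weight 1 or 2."""
    weight = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weight[i][j] = weight[j][i] = rng.choice((1, 2))
    return weight


def tour_cost(tour, weight):
    """Total weight of the closed tour visiting the cities in the given order."""
    n = len(tour)
    return sum(weight[tour[i]][tour[(i + 1) % n]] for i in range(n))


def one_plus_one_ea(weight, iterations, rng):
    """(1+1) EA: one parent, one offspring per generation, elitist acceptance."""
    n = len(weight)
    tour = list(range(n))
    rng.shuffle(tour)
    cost = tour_cost(tour, weight)
    for _ in range(iterations):
        # Mutation (assumed operator): reverse a random segment, i.e. a 2-opt move.
        i, j = sorted(rng.sample(range(n), 2))
        child = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        child_cost = tour_cost(child, weight)
        # Selection: keep the offspring if it is not worse than the parent.
        if child_cost <= cost:
            tour, cost = child, child_cost
    return tour, cost


if __name__ == "__main__":
    rng = random.Random(1)
    weight = random_tsp12(12, rng)
    tour, cost = one_plus_one_ea(weight, 2000, rng)
    print(tour, cost)
```

Because every edge weighs 1 or 2, any tour on n cities costs between n and 2n, which is why constant-factor approximation ratios such as the 3/2 quoted above are the natural quantity to analyse.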

  • A novel rural microcredit decision model and solving via binary differential evolution algorithm   Order a copy of this article
    by Dazhi Jiang, Jiali Lin, Kangshun Li 
    Abstract: Generally, as an economic means of lifting people out of poverty, microcredit has been accepted as an effective method for empowering both individuals and communities. However, risk control is still a core part of running an agriculture-related loan business for microcredit companies. In this paper, a rural microcredit decision model is presented based on maximising the profit while minimising the risk. Then, a binary differential evolution algorithm is applied to solve the decision model. The results show that the proposed model and method are sound and easy to operate, and can serve as a reference solution for decision management in microcredit companies.
    Keywords: risk control; microcredit; decision model; binary differential evolution

  • Q-grams-imp: an improved q-grams algorithm aimed at edit similarity join   Order a copy of this article
    by Zhaobin Liu, Yunxia Liu 
    Abstract: Similarity join is increasingly important and has attracted widespread attention from scholars and communities. It has been used in many applications, such as spell checking, copy detection, entity linking and pattern recognition. In many web and enterprise scenarios, where typos and misspellings often occur, an efficient algorithm is needed to handle these situations. In this paper, we propose an improved q-grams algorithm, called q-grams-imp, aimed at solving edit similarity join. The algorithm reduces the number of tokens and thus the space cost; it is best suited to strings of the same size, while strings of different sizes must first be processed to fit the algorithm. Finally, our results indicate that the proposed algorithm performs better than the traditional method.
    Keywords: similarity join; q-grams algorithm; edit distance.
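
As background for this abstract, the sketch below shows the classic q-gram count filter for edit similarity join, not the q-grams-imp algorithm itself, whose details the abstract does not give. It relies on the standard observation that one edit operation changes at most q padded q-grams, so two strings within edit distance tau must share at least max(|s|, |t|) + q - 1 - q*tau q-grams. The padding characters and all function names are assumptions for illustration.

```python
from collections import Counter


def qgrams(s, q=2):
    """Multiset of q-grams of s, padded with q-1 sentinel characters per side."""
    padded = "#" * (q - 1) + s + "$" * (q - 1)
    return Counter(padded[i:i + q] for i in range(len(padded) - q + 1))


def edit_distance(s, t):
    """Levenshtein distance via the standard two-row dynamic programme."""
    previous = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        current = [i]
        for j, b in enumerate(t, start=1):
            current.append(min(previous[j] + 1,              # deletion
                               current[j - 1] + 1,           # insertion
                               previous[j - 1] + (a != b)))  # substitution
        previous = current
    return previous[-1]


def passes_count_filter(s, t, tau, q=2):
    """Necessary condition for edit_distance(s, t) <= tau (count filter)."""
    shared = sum((qgrams(s, q) & qgrams(t, q)).values())
    return shared >= max(len(s), len(t)) + q - 1 - q * tau


def similarity_join(strings, tau, q=2):
    """Return index pairs (i, j), i < j, whose edit distance is at most tau."""
    result = []
    for i in range(len(strings)):
        for j in range(i + 1, len(strings)):
            s, t = strings[i], strings[j]
            # Cheap filter first; verify survivors with the exact distance.
            if passes_count_filter(s, t, tau, q) and edit_distance(s, t) <= tau:
                result.append((i, j))
    return result


if __name__ == "__main__":
    words = ["kitten", "sitten", "mitten", "banana"]
    print(similarity_join(words, tau=1))
```

The filter only prunes candidate pairs; every surviving pair is still verified with the exact edit distance, so the join result does not depend on the choice of q.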

  • An algorithm based on differential evolution for satellite data transmission scheduling   Order a copy of this article
    by Qingzhong Liang, Yuanyuan Fan, Xuesong Yan, Ye yan 
    Abstract: Data transmission task scheduling is one of the important problems in satellite communication. It can be considered as a combinatorial optimisation problem over satellite data transmission demands, visible time windows and ground station resources, and it is NP-complete. In this paper, we propose a satellite data transmission task scheduling algorithm that searches for an optimised solution within a differential evolution framework. During evolution, the procedure for evaluating individuals is improved by a method based on a modified 0/1 knapsack problem. Extensive experiments are conducted to examine the effectiveness and performance of the proposed scheduling algorithm. Experimental results show that the schedules generated by the algorithm satisfy the scheduling constraints and are consistent with expectations.
    Keywords: data transmission; task scheduling; differential evolution; knapsack problem

  • Dynamic load balance strategy for parallel rendering based on deferred shading   Order a copy of this article
    by Mingqiang Yin, Dan Sun, Hui Sun 
    Abstract: To solve the problem of low efficiency in rendering large scenes with a complex illumination model, a new deferred shading method is proposed and applied to a parallel rendering system. In order to make the rendering times of the slave nodes in the parallel rendering system equal to each other, an algorithm for rendering task assignment is designed. In the deferred shading method, the process of rendering every frame is divided into two phases. The first, called the geometrical process, is responsible for visibility detection. In this phase, the primitives are distributed evenly to the rendering nodes and are rendered without illumination. The pixels that should be shaded and their corresponding primitives are found. The second phase, called pixel shading, is responsible for colouring the pixels found in the first phase. The pixels are assigned evenly to the rendering nodes according to the image of the last frame. As the rendering tasks in the two phases are assigned evenly, the rendering times of the nodes in the cluster system are roughly equal to each other. Experiments show that this method can improve the rendering efficiency of the parallel rendering system.
    Keywords: parallel rendering; deferred shading; load balance.

  • Big data automatic analysis system and its applications in rockburst experiment   Order a copy of this article
    by Yu Zhang 
    Abstract: In 2006, the State Key Laboratory for GeoMechanics and Deep Underground Engineering (GDLab for short) successfully reconstructed the rockburst procedure indoors. Since then, a series of valuable research results has been obtained on the rockburst mechanism. At the same time, several dilemmas have emerged, concerning data storage, data analysis and prediction accuracy. GDLab has accumulated more than 500 TB of data from rockburst experiments, but so far less than 5% of it has been analysed. The primary cause of these dilemmas is that the study of rockburst produces a large amount of experimental data. In this paper, a novel big data automatic analysis system for rockburst experiments is proposed. Various modules and algorithms were designed and implemented. Theoretical analysis and experimental research show that the system can improve the existing research mechanism of rockburst and can enable analyses that were previously impractical. The work of this paper lays a theoretical foundation for rockburst mechanism research.
    Keywords: rock burst; experiment data; big data; automatic analysis

  • Training auto-encoders effectively via eliminating task-irrelevant input variables   Order a copy of this article
    by Hui Shen, Dehua Li, Zhaoxiang Zang, Hong Wu 
    Abstract: Auto-encoders are often used as building blocks of deep network classifiers to learn feature extractors, but task-irrelevant information in the input data may lead to bad extractors and result in poor generalisation performance of the network. In this paper, we show that dropping the task-irrelevant input variables can clearly improve the performance of auto-encoders. Specifically, an importance-based variable selection method is proposed to find the task-irrelevant input variables and drop them. The method first estimates the importance of each variable, and then drops the variables whose importance value is lower than a threshold. In order to obtain better performance, the method can be employed for each layer of stacked auto-encoders. Experimental results show that, when combined with our method, stacked denoising auto-encoders achieve significantly improved performance on three challenging datasets.
    Keywords: feature learning; deep learning; neural network; auto-encoder; stacked auto-encoders; variable selection; feature selection; unsupervised training

  • Model-checking software product lines based on feature slicing   Order a copy of this article
    by Mingyu Huang, Yumei Liu 
    Abstract: The feature model is a popular formalism for describing the commonality and variability of a software product line in terms of features. Feature models represent the space of possible application configurations, and can be customised based on specific domain requirements and stakeholder goals. As feature models are becoming increasingly complex, automatic support is desirable for customised analysis and verification based on the specific goals and requirements of stakeholders. This paper first presents feature model slicing based on the requirements of the users. We then introduce a three-valued abstraction of behaviour models based on the slicing unit. Finally, a case study based on a multi-valued model checker illustrates the effectiveness of our approach.
    Keywords: feature model; slicing; three-valued model; model checking

  • Decomposition-based multi-objective comprehensive learning particle swarm optimisation   Order a copy of this article
    by Xiang Yu, Hui Wang, Hui Sun 
    Abstract: This paper proposes decomposition-based comprehensive learning particle swarm optimisation (DCLPSO) for multi-objective optimisation. DCLPSO uses multiple swarms, with each swarm optimising a separate objective. Two sequential phases are conducted: independent search and then cooperative search. Important information related to extreme points of the Pareto front can often be found in the independent search phase. In the cooperative search phase, a particle randomly learns from its personal best position or an elitist on each dimension. Elitists are non-dominated solutions and are stored in an external repository shared by all the swarms. Mutation is applied to each elitist in this phase to help escape from local Pareto fronts. Experiments conducted on various benchmark problems demonstrate that DCLPSO is competitive in terms of convergence and diversity of the resulting non-dominated solutions.
    Keywords: particle swarm optimisation; comprehensive learning; decomposition; multi-objective optimisation.

  • Applicability evaluation of different algorithms for daily reference evapotranspiration model in KBE system   Order a copy of this article
    by Yubin Zhang, Zhengying Wei, Lei Zhang, Jun Du 
    Abstract: An irrigation decision-making system based on Knowledge-based Engineering (KBE) is reported in this paper. It can accurately predict water and fertiliser requirements and achieve intelligent irrigation diagnosis and decision support. However, the basis of the KBE was knowledge of reference crop evapotranspiration (ET0). Therefore, the research examined the accuracy of support vector machines (SVMs) in modelling ET0. The main obstacles to computing ET0 with the Penman-Monteith model were the complicated nonlinear process and the many climate variables required; furthermore, these were calculated from the original meteorological data, and there was no single calculation standard. Thus, the SVM models are applied with the original or limited data, especially in developing countries. The flexibility of the SVMs in ET0 modelling was assessed using the original meteorological data (Tmax, Tm, Tmin, n, Uh, RHm, φ, Z) of the years 1990-2014 from five stations in Shaanxi, China. Those eight parameters were used as the input, while the reference evapotranspiration values were the output. In the first part of the study, the SVMs were compared with the FAO-24, Hargreaves, McCloud, Priestley-Taylor and Makkink models. The comparison results indicated that the SVMs performed better than the other models. In the second part, the total ET0 estimation of the SVMs was compared with the other models in the validation. It was found that the SVM models were superior to the others in terms of relative error. A further assessment of the SVMs confirmed that the models could provide a powerful tool for KBE irrigation when meteorological data are lacking. This research could provide a reference for accurate ET0 estimation for decision-making in KBE irrigation systems based on data collected from humidity sensors and weather stations in the field.
    Keywords: reference evapotranspiration; support vector machines; knowledge-based engineering; original meteorological data.

  • Multi hidden layer extreme learning machine optimised with batch intrinsic plasticity   Order a copy of this article
    by Shan Pang, Xinyi Yang 
    Abstract: Extreme learning machine (ELM) is a novel learning algorithm in which training is restricted to the output weights to achieve a fast learning speed. However, ELM tends to require more neurons in the hidden layer and sometimes leads to ill-conditioned problems owing to the random selection of input weights and hidden biases. To address these problems, we propose a multi hidden layer ELM optimised with a batch intrinsic plasticity (BIP) scheme. The proposed algorithm has a deep structure and thus learns features more efficiently. The combination with the BIP scheme helps to achieve better generalisation ability. Comparisons with some state-of-the-art ELM algorithms on both regression and classification problems have verified the performance and effectiveness of our proposed algorithm.
    Keywords: neural network; extreme learning machine; batch intrinsic plasticity; multi hidden layers.

  • Chaotic artificial bee colony with elite opposition-based learning strategy   Order a copy of this article
    by Zhaolu Guo, Jinxiao Shi, Xiaofeng Xiong, Xiaoyun Xia, Xiaosheng Liu 
    Abstract: Artificial bee colony (ABC) algorithm is a promising evolutionary algorithm inspired by the foraging behaviour of honey bee swarms, which has obtained satisfactory solutions in diverse applications. However, the basic ABC demonstrates insufficient exploitation capability in some cases. To address this issue, a chaotic artificial bee colony with elite opposition-based learning strategy (CEOABC) is proposed in this paper. During the search process, CEOABC employs the chaotic local search to promote the exploitation ability. Moreover, the elite opposition-based learning strategy is used to exploit the potential information of the exhausted solution. Experimental results compared with several ABC variants show that CEOABC is a competitive approach for global optimisation.
    Keywords: artificial bee colony; chaotic local search; opposition-based learning; elite strategy.
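    The opposition step described above can be sketched as follows. The abstract does not give a formula, so this assumes a commonly used form of elite opposition-based learning: the opposite of x in dimension j is k * (lo_j + hi_j) - x_j, where lo_j and hi_j are the bounds spanned by the elite solutions and k is a random factor in [0, 1]. The function name and the repair rule are hypothetical, and the chaotic local search is not shown.

```python
import random


def elite_opposite(x, elites, k=None, rng=random):
    """Return an elite-opposition candidate for solution x (illustrative only).

    x      : list of floats, the solution to reflect
    elites : list of solutions whose per-dimension min/max define the bounds
    k      : scaling factor in [0, 1]; drawn at random when not supplied
    """
    if k is None:
        k = rng.random()
    opposite = []
    for j, xj in enumerate(x):
        lo = min(e[j] for e in elites)  # dynamic lower bound from the elites
        hi = max(e[j] for e in elites)  # dynamic upper bound from the elites
        oj = k * (lo + hi) - xj
        if not lo <= oj <= hi:
            # Assumed repair rule: resample inside the elite bounds.
            oj = rng.uniform(lo, hi)
        opposite.append(oj)
    return opposite
```

    A caller would typically evaluate both x and its opposite and keep the better of the two.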

  • Numerical simulations of electromagnetic wave logging instrument response based on self-adaptive hp finite element method   Order a copy of this article
    by L.I. Hui, Zhu Xifang, Liu Changbo 
    Abstract: Numerical simulation of instrument response is an important method to calibrate instrument parameters, evaluate detection performance, and verify complex system theory. Measurement results of electrical well logging are important for the interpretation of measurement data and characterisation of oil reservoirs, especially in horizontal directional drilling and shale gas and oil development. In this paper, a self-adaptive hp finite element method has been used to investigate the electrical well logging instrument responses, such as the electromagnetic wave resistivity logging-while-drilling (LWD) tool and the through-casing resistivity logging (TCRL) tool. Measurement results illustrate the efficiency of the methods, and provide physical interpretation of resistivity measurements obtained with the LWD and TCRL tools. Numerical simulation examples are provided to show the validity, accuracy, and efficiency of the self-adaptive hp finite element method. The high-accuracy simulation results are of great importance for the calibration of electrical well logging tools and the interpretation of logging data.
    Keywords: numerical simulation; parameters calibration; electromagnetic wave resistivity logging-while-drilling; through-casing resistivity logging; self-adaptive hp finite element method.

  • Upgrading event and pattern detection to big data   Order a copy of this article
    by Soumaya Cherichi, Rim Faiz 
    Abstract: One of the marvels of our time is the unprecedented development and use of technologies that support social interaction. Social mediating technologies have engendered radically new ways of sharing information and communicating, particularly during events such as natural disasters (for example, earthquakes and tsunamis) and the American presidential election. This paper is based on data obtained from Twitter because of its popularity and sheer data volume. This content can be combined and processed to detect events, entities and popular moods to feed various new large-scale data-analysis applications. On the downside, these content items are very noisy and highly informal, making it difficult to extract sense out of the stream. Taking into account all the difficulties, we propose a new event detection approach combining linguistic features and Twitter features. Finally, we present our event detection system for microblogs, which aims (1) to detect new events, (2) to recognise the temporal marker patterns of an event, and (3) to classify important events according to thematic pertinence, author pertinence and tweet volume.
    Keywords: microblogs; event detection; temporal markers; patterns; social network analysis.

  • A security ensemble framework for securing a file in cloud computing environments   Order a copy of this article
    by Sharon Moses J, Nirmala M 
    Abstract: Scalability and on-demand features of cloud computing have revolutionised the IT industry. Cloud computing provides flexibility to the user in several aspects, including pay as you use. The entire burden of computing, managing resources and file storage is moved to the cloud service provider end. File storage in clouds is an important issue for both service providers and the end users. Securing the file stored in cloud storage from internal and external attacks has become a primary concern for cloud storage providers. Owing to the accumulation of enormous amounts of personal and confidential information in cloud storage, it draws hackers and data-pirates to steal the information at any cost. Once a file gets stored in cloud storage, the user has neither authority over the file nor any knowledge of its physical location. In this paper, the threats involved in file storage and a secure way of protecting the stored files using a novel ensemble of security strategies are presented. An encryption ensemble module is incorporated over an OpenStack cloud infrastructure for protecting the file. Five symmetric block ciphers are used in the encryption module to encrypt and decrypt the file without disturbing existing security measures provided to a file. This proposed strategy helps service providers as well as users to secure the file in cloud storage more efficiently.
    Keywords: cloud storage; file privacy; file security; Swift storage; OpenStack security; security ensemble.

  • Virtual guitar: using real-time finger tracking for musical instruments   Order a copy of this article
    by Noorkholis Luthfil Hakim, Shih-Wei Sun, Mu-Hsen Hsu, Timothy K. Shih, Shih-Jung Wu 
    Abstract: Kinect, a 3D sensing device from Microsoft, has spurred the evolution of Human Computer Interaction (HCI) research. Kinect has been implemented in many areas, including music. One implementation was in a Virtual Musical Instrument (VMI) system, which uses natural gestures to produce synthetic sounds similar to a real musical instrument. From related work, we found that the use of a large joint, such as hand, arm or leg, is inconvenient and limits the way of playing VMI. Thus, this study proposes a fast and reliable finger tracking algorithm suitable for VMI playing. In addition, a virtual guitar system application was developed as an implementation of the proposed algorithm. Experimental results show that the proposed method can be used to play a variety of tunes with an acceptable quality. Furthermore, the proposed application could be used by a beginner who does not have any experience in music or playing a real musical instrument.
    Keywords: virtual guitar; finger tracking; musical instrument; human computer interaction; HCI; hand detection; hand tracking; hand gesture recognition; virtual musical instrument; VMI; depth camera.

  • A cloud computing price model based on virtual machine performance degradation   Order a copy of this article
    by Dionisio Machado Leite, Maycon Peixoto, Carlos Ferreira, Bruno Batista, Danilo Costa, Marcos Santana, Regina Santana 
    Abstract: This paper reports the interference effects on the performance of virtual machines running higher workloads, with the aim of improving how resources are paid for in cloud computing. The objective is to produce an acceptable pay-as-you-go model to be used by cloud computing providers. Presently, the price in the pay-as-you-go model is based on virtual machine usage per unit of time. However, this scheme does not consider the interference caused by virtual machines running concurrently, which may cause performance degradation. In order to obtain a fair charging model, this paper proposes an approach that applies a recovery over the initial price, taking the virtual machine performance interference into account. Results showed benefits of a fair pay-as-you-go model, ensuring the effective user requirement. This novel model contributes a fair and transparent price composition to cloud computing.
    Keywords: cloud computing; pay-as-you-go; virtualisation; quality of service.

  • Designing scrubbing strategy for memories suffering MCUs through the selection of optimal interleaving distance   Order a copy of this article
    by Wei Zhou, Hong Zhang, Hui Wang, Yun Wang 
    Abstract: As technology scales, multiple cell upsets (MCUs) have shown prominent effect, thus affecting the reliability of memory to a great extent. Ideally, the interleaving distance (ID) should be chosen as the maximum expected MCU size. In order to mitigate MCU errors, interleaving schemes together with single error correction (SEC) codes can be used to provide the greatest protection. In this paper, we propose the use of scrubbing sequences to improve memory reliability. The key idea is to exploit the locality of the errors caused by an MCU to make scrubbing more efficient. The single error correction, double error detection, and double adjacent error correction (SEC-DED-DAEC) codes have also been used. A procedure is presented to determine a scrubbing sequence that maximizes reliability. An algorithm of scrubbing strategy, which keeps the area overhead and complexity as low as possible without compromising memory reliability, is proposed for the optimal interleaving distance, which should be maximized under some conditions. The approach is further applied to a case study and results show a significant increase in the Mean Time To Failure (MTTF) compared with traditional scrubbing.
    Keywords: interleaving distance; memory; multiple cell upsets (MCUs); soft error; reliability; scrubbing; radiation.
    DOI: 10.1504/IJCSE.2016.10004753
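    The statement that the interleaving distance should match the maximum expected MCU size can be illustrated with a small sketch. It assumes a simplified layout that the abstract does not specify: a single memory row in which the bits of ID logical words are interleaved column by column, and an MCU modelled as a run of physically adjacent cells. The function names are hypothetical.

```python
def logical_position(column, interleave_distance):
    """Map a physical column to (word index, bit index) under interleaving."""
    return column % interleave_distance, column // interleave_distance


def errors_per_word(first_column, mcu_size, interleave_distance):
    """Count how many bits of each logical word one horizontal MCU flips."""
    counts = {}
    for column in range(first_column, first_column + mcu_size):
        word, _bit = logical_position(column, interleave_distance)
        counts[word] = counts.get(word, 0) + 1
    return counts


# An MCU no wider than the interleaving distance hits each word at most once,
# so a SEC code can still correct every affected word.
narrow = errors_per_word(first_column=5, mcu_size=4, interleave_distance=4)
# A wider MCU wraps around and puts two errors into the same word.
wide = errors_per_word(first_column=5, mcu_size=5, interleave_distance=4)
```

    In this model, `narrow` contains only single-bit errors, while `wide` contains one word with two flipped bits, which is the case that scrubbing and stronger codes have to deal with.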
  • A model of mining approximate frequent itemsets using rough set theory   Order a copy of this article
    by Yu Xiaomei, Wang Hong, Zheng Xiangwei 
    Abstract: Datasets can be described by decision tables. In real-life applications, data are usually incomplete and uncertain, which poses big challenges for mining frequent itemsets in imprecise databases. This paper presents a novel model of mining approximate frequent itemsets using the theory of rough sets. With a transactional information system constructed on the dataset under consideration, a transactional decision table is put forward; lower and upper approximations of support can then be easily computed from the indiscernibility relations. Finally, using a divide-and-conquer approach, the approximate frequent itemsets are discovered, taking into account the defined support-based accuracy and coverage. The evaluation of the novel model is conducted on both synthetic datasets and real-life applications. The experimental results demonstrate its usability and validity.
    Keywords: rough set theory; data mining; decision table; approximate frequent itemsets; indiscernibility relation.

  • Improved predicting algorithm of RNA pseudoknotted structure   Order a copy of this article
    by Zhendong Liu, Daming Zhu, Qionghai Dai 
    Abstract: The prediction of RNA structure with pseudoknots is an NP-hard problem. According to minimum free energy models and computational methods, we investigate the RNA pseudoknotted structure. The paper presents an efficient algorithm for predicting RNA structure with pseudoknots, and the algorithm takes O(n^3) time and O(n^2) space. The experimental tests in Rfam10.1 and PseudoBase indicate that the algorithm is more effective and precise, and the algorithm can predict arbitrary pseudoknots. Moreover, there exists a 1+ε (ε>0) polynomial time approximation scheme for searching the maximum number of stackings, and we give the proof of the approximation scheme in RNA pseudoknotted structure.
    Keywords: RNA pseudoknotted structure; predicting algorithm; PTAS; pseudoknots; minimum free energy.

  • An efficient algorithm for modelling and dynamic prediction of network traffic   Order a copy of this article
    by Wenjie Fan 
    Abstract: Network node degradation is an important problem in the internet of things, given the ubiquitous high number of personal computers, tablets, phones and other equipment present nowadays. In order to verify the network traffic degradation as one or multiple nodes in a network fail, this paper proposes an algorithm based on Product Form Results (PFR) for the Fractionally Auto Regressive Integrated Moving Average (FARIMA) model, namely PFRF. In this algorithm, the prediction method is established by the FARIMA model, through equations for queuing situation and average queue length in steady state derived from queuing theory. Experimental simulations were conducted to investigate the relationships between average queue length and service rate. Results demonstrated that the algorithm not only has good adaptability but also achieves a standard deviation of 9.87, which indicates high prediction accuracy given the small difference between the original values and the values produced by the algorithm.
    Keywords: prediction; product form results; FARIMA model; average length of queue.

  • Reversible image watermarking based on texture analysis of grey level co-occurrence matrix   Order a copy of this article
    by Shu-zhi Li, Qin Hu, Xiao-hong Deng, Zhaoquan Cai 
    Abstract: Embedding the watermark in the complex area of the image can effectively improve concealment. However, most methods simply use the mean squared error (MSE) and some simple methods to judge the texture complexity. In this paper, we propose a new texture analysis method based on grey level co-occurrence matrix (GLCM) and provide an in-depth discussion on how to accurately choose a complex region. This new method is applied to reversible image watermarking. Firstly, the original host image is divided into 128 × 128 sub-blocks. Then, the mean squared error is used to assign the weights of the four texture feature parameters to establish the relationship between the characteristic parameters and the complexity of an image sub-block. Using this relationship, we calculate the complexity of each sub-block and select the sub-block with the maximum texture complexity. If its embedding capacity is insufficient, we select the sub-block with the next highest complexity for embedding, and so on until a satisfactory embedding capacity is reached. Pairwise prediction error extend (PPEE) is used to hide the data.
    Keywords: grey level co-occurrence matrix; image sub block; texture complexity; reversible image watermarking.
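    As a rough illustration of the texture analysis step, the sketch below builds a normalised grey level co-occurrence matrix for one pixel offset and computes contrast from it. Contrast is one commonly used GLCM feature; the abstract does not say which four features the authors use or how they are weighted, so the function names, the offset and the choice of feature are assumptions. The image is a list of rows of integer grey levels, and at least one pixel pair must exist for the chosen offset.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised grey level co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count the pair (level at pixel, level at offset neighbour).
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """GLCM contrast: large when neighbouring grey levels differ strongly."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p)
               for j, v in enumerate(row))


flat_block = [[0, 0], [1, 1]]      # horizontal neighbours are identical
textured_block = [[0, 1], [0, 1]]  # horizontal neighbours always differ
```

    With the horizontal offset used here, `contrast(glcm(flat_block, 2))` is smaller than `contrast(glcm(textured_block, 2))`, so a selection rule based on such features would prefer the textured block for embedding.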

  • A semantic recommender algorithm for 3D model retrieval based on deep belief networks   Order a copy of this article
    by Li Chen, Hong Liu, Philip Moore 
    Abstract: Interest in 3D modelling is growing; however, the retrieval results achieved for semantic-based 3D model retrieval systems have been disappointing. In this paper we propose a novel semantic recommendation algorithm based on a Deep Belief Network (DBN-SRA) to implement semantic retrieval with potential semantic correlations [between models] being achieved using deep learning from known model samples. The algorithm uses the feature correlation [between the models] as the conditions to enable semantic matching of 3D models to obtain the final recommended retrieval result. Our proposed approach has been shown to improve the effectiveness of 3D model retrieval, in terms of both retrieval time and, importantly, accuracy. Additionally, our study and our reported results suggest that our posited approach will generalise to recommender systems in other domains that are characterised by multiple feature relationships.
    Keywords: deep belief network; 3D model retrieval; recommender algorithm; cluster analysis.

  • Differential evolution with spatially neighbourhood best search in dynamic environment   Order a copy of this article
    by Dingcai Shen, Longyin Zhu 
    Abstract: In recent years, there has been a growing interest in applying differential evolution (DE) to optimisation problems in a dynamic environment. The ability to track a changing optimum over time is the central concern in dynamic optimisation problems (DOPs). In this study, an improved niching-based scheme, named spatially neighbourhood best search DE (SnDE), for DOPs is proposed. The SnDE adopts DE with the DE/best/1/bin scheme. The best individual in the selected scheme is searched for around the considered individual within a predefined neighbourhood size, thus keeping a balance between exploitation ability and exploration ability. A comparative study with several algorithms with different characteristics on a common platform, using the moving peaks benchmark (MPB) and various problem settings, is presented in this paper. The results indicate that the proposed algorithm can track the changing optimum in each circumstance effectively on the selected benchmark function.
    Keywords: differential evolution; dynamic optimisation problem; neighbourhood search; niching.
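    A minimal sketch of the idea, not the authors' implementation: DE/best/1/bin builds a trial vector as best + F * (x_r1 - x_r2) with binomial crossover, and here "best" is taken from the nearest neighbours of the current individual instead of the whole population. The sketch assumes Euclidean distance, minimisation, a neighbourhood that includes the individual itself, and a population of at least four; all names and default parameter values are hypothetical.

```python
import math
import random


def neighbourhood_best(pop, fitness, i, size):
    """Index of the fittest individual among the `size` nearest to pop[i]."""
    order = sorted(range(len(pop)), key=lambda j: math.dist(pop[i], pop[j]))
    # Lower fitness is better (minimisation is an assumption of this sketch).
    return min(order[:size], key=lambda j: fitness[j])


def trial_vector(pop, fitness, i, size, f=0.5, cr=0.9, rng=random):
    """DE/best/1/bin trial vector using the neighbourhood best as base."""
    best = neighbourhood_best(pop, fitness, i, size)
    candidates = [j for j in range(len(pop)) if j not in (i, best)]
    r1, r2 = rng.sample(candidates, 2)
    dim = len(pop[i])
    forced = rng.randrange(dim)  # at least one dimension comes from the mutant
    trial = []
    for d in range(dim):
        if d == forced or rng.random() < cr:
            trial.append(pop[best][d] + f * (pop[r1][d] - pop[r2][d]))
        else:
            trial.append(pop[i][d])
    return trial


population = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [0.5, 0.5]]
scores = [3.0, 2.0, 0.0, 5.0]
```

    For individual 0 with a neighbourhood of three, the base vector is individual 1 rather than the distant global best (individual 2), which is how a neighbourhood restriction can keep several peaks populated.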

  • Optimal anti-interception orbit design based on genetic algorithm   Order a copy of this article
    by Yifang Liu 
    Abstract: The space defence three-player problem with impulsive thrust is studied in this work. Interceptor spacecraft and anti-interceptor spacecraft have only one chance to manoeuvre, while target spacecraft just keeps running in the target orbit without the ability to manoeuvre. Based on the Lambert theorem, the space defence three-player problem is modelled and divided into two layers. The internal layer is an interception problem in which the interceptor spacecraft tries to intercept the target spacecraft. The external layer is an anti-interception problem in which the anti-interceptor spacecraft tries to defend against the interceptor spacecraft. Because it can get the global solution and does not need the gradient information that is required in traditional optimisation methods, the genetic algorithm is employed to solve the resulting parameter optimisation problem in the interception/anti-interception problem. A numerical simulation is provided to verify the availability of the obtained solution, and the results show that this work is useful for some practical applications.
    Keywords: space three-player problem; anti-interception orbit design; impulsive thrust; parameter optimisation problem; genetic algorithm.

  • Detecting sparse rating spammer for accurate ranking of online recommendation   Order a copy of this article
    by Hong Wang, Xiaomei Yu, Yuanjie Zheng 
    Abstract: Ranking for online recommendation systems is challenging owing to rating sparsity and spam rating attacks. The former can cause the well-known cold start problem, while the latter complicates the recommendation task by requiring the detection of unreasonable or biased ratings. In this paper, we treat the spam ratings as 'corruptions', which spatially distribute in a sparse pattern, and model them with an L1 norm and an L2,1 norm. We show that these models can characterise the property of the original ratings by removing spam ratings and help to resolve the cold start problem. Furthermore, we propose a group reputation-based method to re-weight the rating matrix and an iterative programming-based technique for optimising the ranking for online recommendation. We show that our optimisation methods outperform other recommendation approaches. Experimental results on four famous datasets show the superior performances of our methods.
    Keywords: ranking; group-based reputation; sparsity; spam rating; collaborative recommendation.
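    For readers unfamiliar with the two norms named in the abstract, the sketch below computes them for a small matrix. The L1 norm sums the absolute values of all entries, so it favours corruptions that are sparse entry by entry. The L2,1 norm sums the Euclidean norms of whole rows, so it favours corruptions confined to a few rows. Conventions differ on whether rows or columns are grouped, and the abstract does not say which is used or what a row represents; grouping by rows is an assumption here, and the function names are hypothetical.

```python
import math


def l1_norm(matrix):
    """Sum of absolute values of all entries."""
    return sum(abs(v) for row in matrix for v in row)


def l21_norm(matrix):
    """Sum over rows of each row's Euclidean norm (row-wise convention)."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


one_row_corruption = [[3, -4], [0, 0]]  # all corruption sits in a single row
spread_corruption = [[3, 0], [0, -4]]   # same entries spread over two rows
```

    Both matrices have the same L1 norm, but the L2,1 norm of `one_row_corruption` is smaller, which shows how the L2,1 penalty prefers explanations that concentrate corruption in few rows.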

  • Differential evolution with dynamic neighborhood learning strategy based mutation operators   Order a copy of this article
    by Guo Sun, Yiqiao Cai 
    Abstract: As the core operator of differential evolution (DE), mutation is crucial for guiding the search. However, in most DE algorithms, the parents in the mutation operator are randomly selected from the current population, which may lead to DE being slow to exploit solutions when faced with complex problems. In this study, a dynamic neighborhood learning (DNL) strategy is proposed for DE to alleviate this drawback. The new proposed DE framework is named DE with DNL-based mutation operators (DNL-DE). Unlike the original DE algorithms, DNL-DE uses DNL to dynamically construct neighborhood for each individual during the evolutionary process and intelligently select parents for mutation from the defined neighborhood. In this way, the neighborhood information can be effectively used to improve the performance of DE. Furthermore, two instantiations of DNL-DE with different parent selection methods are presented. To evaluate the effectiveness of the proposed algorithm, DNL-DE is applied to the original DE algorithms, as well as several advanced DE variants. The experimental results demonstrate the high performance of DNL-DE when compared with other DE algorithms.
    Keywords: differential evolution; dynamic neighborhood; learning strategy; mutation operator; numerical optimisation.

  • A word-frequency-preserving steganographic method based on synonym substitution   Order a copy of this article
    by Lingyun Xiang, Xiao Yang, Jiahe Zhang, Weizheng Wang 
    Abstract: Text steganography is a widely used technique to protect communication privacy but it still suffers a variety of challenges. One of these challenges is that a synonym substitution based method may change the statistical characteristics of the content, which may be easily detected by steganalysis. In order to overcome this disadvantage, this paper proposes a synonym substitution based steganographic method taking the word frequency into account. This method dynamically divides the synonyms appearing in the text into groups, and substitutes some synonyms to alter the positions of the relatively low frequency synonyms in each group to encode the secret information. By maintaining the number of relatively low frequency synonyms across the substitutions, it preserves some characteristics of the synonyms with various frequencies in the stego and the original cover texts. The experimental results illustrate that the proposed method can effectively resist attack from the detection using relative frequency analysis of synonyms.
    Keywords: synonym substitution; steganography; word-frequency-preserving; multiple-base coding; steganalysis.

  • A personalised ontology ranking model based on analytic hierarchy process   Order a copy of this article
    by Jianghua Li, Chen Qiu 
    Abstract: Ontology ranking is one of the important functions of ontology search engines, which rank searched ontologies based on the ranking model applied. A good ranking method can help users to efficiently acquire exactly the required ontology from a considerable number of search results. Existing approaches that rank ontologies take only a single aspect into consideration and ignore users' personalised demands, and hence produce unsatisfactory results. It is believed that the factors that influence ontology importance and the users' demands both need to be considered comprehensively in ontology ranking. A personalised ontology ranking model based on the hierarchical analysis approach is proposed in this paper. We build a hierarchically analytical model and apply an analytic hierarchy process to quantify ranking indexes and assign weights to them. The experimental results show that the proposed method can rank ontologies effectively and meet users' personalised demands.
    Keywords: hierarchical analysis approach; ontology ranking; personalised demands; weights assignment.
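    The weight assignment step of an analytic hierarchy process can be sketched as follows. The input is a pairwise comparison matrix in which entry (i, j) states how much more important criterion i is than criterion j. The sketch uses the row geometric mean method, a common approximation to the principal eigenvector; the abstract does not state which method or which ranking indexes the authors use, so the function name and the three-criterion example are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric mean method."""
    n = len(pairwise)
    means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(means)
    return [m / total for m in means]


# Hypothetical comparison of three ranking indexes: the first is judged twice
# as important as the second and four times as important as the third.
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(comparisons)
```

    Because this example matrix is perfectly consistent, the resulting weights are in the ratio 4:2:1 and sum to one. Real judgements are usually inconsistent to some degree, which is why AHP is normally paired with a consistency check.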

  • Deploying parallelised ciphertext-policy attributed-based encryption in clouds   Order a copy of this article
    by Hai Jiang 
    Abstract: In recent years, cloud storage has become an attractive solution owing to its elasticity, availability and scalability. However, security concerns have begun to hinder the wider adoption of public clouds. Traditional encryption algorithms (both symmetric and asymmetric ones) fail to support effective secure cloud storage owing to severe issues such as complex key management and heavy redundancy. The Ciphertext-Policy Attribute Based Encryption (CP-ABE) scheme overcomes the aforementioned issues and provides fine-grained access control as well as deduplication features. CP-ABE has become a possible solution to cloud storage. However, its high complexity has prevented it from being widely adopted. This paper parallelises CP-ABE, considers the issues involved in ensuring secure cloud storage, and deploys the scheme in cloud storage environments. Major performance bottlenecks, such as key management and the encryption/decryption process, are identified and accelerated, and a new AES encryption operation mode is adopted for further performance gains. Experimental results have demonstrated the effectiveness and promise of such a design.
    Keywords: CP-ABE; cloud storage; parallelisation; authentication.

  • Collective intelligence value discovery based on citation of science article   Order a copy of this article
    by Yi Zhao, Zhao Li, Bitao Li, Keqing He, Junfei Guo 
    Abstract: One of the tasks in scientific paper writing is recommending citations. When the number of references increases and there is no clear classification, the similarity measure of the recommendation system shows poor performance. In this work, we propose a novel recommendation research approach using classification, clustering and recommendation models integrated into the system. In an evaluation on the ACL Anthology papers network data, we effectively use a complex network of knowledge tree node degrees (which refer to the number of papers) to enhance the accuracy of recommendation. The experimental results show that our model generates better recommended citations, achieving 10% higher accuracy and 8% higher F-score than the keyword match method when the data is big enough. We make full use of the collective intelligence to serve the public.
    Keywords: citation recommendation; classification; clustering; similarity; citation network.

  • Differential evolution with k-nearest-neighbour-based mutation operator   Order a copy of this article
    by Gang Liu, Cong Wu 
    Abstract: Differential evolution (DE) is one of the most powerful global numerical optimisation algorithms in the evolutionary algorithm family, and it is popular for its simplicity and effectiveness in solving numerous real-world optimisation problems in real-valued spaces. The performance of DE depends on its mutation strategy. However, the traditional mutation operators have difficulty in balancing exploration and exploitation. To address this issue, in this paper, a k-nearest-neighbour-based mutation operator is proposed for improving the search ability of DE. This operator is used to search in the areas in which the vector density distribution is sparse. This method enhances the exploitation of DE and accelerates the convergence of the algorithm. In order to evaluate the effectiveness of our proposed mutation operator on DE, this paper compares other state-of-the-art evolutionary algorithms with the proposed algorithm. Experimental verifications are conducted on the CEC05 competition and two real-world problems. Experimental results indicate that our proposed mutation operator is able to enhance the performance of DE and can perform significantly better than, or at least comparably with, several state-of-the-art DE variants.
    Keywords: differential evolution; unilateral sort; k-nearest-neighbour-based mutation; global optimisation.

  • Topic-specific image indexing and presentation for MEDLINE abstract   Order a copy of this article
    by Lan Huang, Ye Wang, Leiguang Gong, Tian Bai 
    Abstract: MEDLINE is one of the largest databases of biomedical literature. The search results from MEDLINE for medical terms are in the form of lists of articles with PubMed IDs. To further explore and select articles that may help to identify potentially interesting interactions between terms, users need to navigate through the lists of URLs to retrieve and read actual articles to find relevancies among these terms. Such work becomes extremely time consuming and unbearably tedious when each query returns tens of thousands of results with an uncertain recall rate. To overcome this problem, we develop a topic-specific image indexing and presentation method for discovering interactions or relatedness of medical terms from MEDLINE, based on which a prototype tool is implemented to help discover interactions between terms of types of disease. The merits of the method are illustrated by search examples using the tool and MEDLINE abstract dataset.
    Keywords: MEDLINE; data visualisation; customised retrieval.

  • Simultaneous multiple low-dimensional subspace dimensionality reduction and classification   Order a copy of this article
    by Lijun Dou, Rui Yan, Qiaolin Ye 
    Abstract: Fisher linear discriminant (FLD) for supervised learning has recently emerged as a computationally powerful tool for extracting features for a variety of pattern classification problems. However, it works poorly with multimodal data. Local Fisher linear discriminant (LFLD) was proposed to reduce the dimensionality of multimodal data. In experiments on multimodal but binary datasets created from several multi-class datasets, it has been shown to perform better than FLD. However, LFLD has a serious limitation: it can only be used on small-scale datasets. In order to address the above disadvantages, in this paper we develop a Multiple low-dimensional Subspace Dimensionality Reduction technique (MSDR) for performing the dimensionality reduction (DR) of input data. In contrast to FLD and LFLD, which find a single optimal low-dimensional subspace, the new algorithm seeks multiple optimal low-dimensional subspaces that best make the data sharing the same labels more compact. Inheriting the advantages of NC, MSDR reduces the dimensionality of data and directly performs classification tasks without the need to train the model. Experiments comparing MSDR with existing traditional approaches on UCI datasets show the effectiveness and efficiency of MSDR.
    Keywords: Fisher linear discriminant; local FLD; dimensionality reduction; multiple low-dimensional subspaces.

  • Using Gaussian mixture model to fix errors in SFS approach based on propagation   Order a copy of this article
    by Huang WenMin 
    Abstract: A new Gaussian mixture model is used to improve the quality of the propagation method for shape from shading (SFS) in this paper. The improved algorithm can overcome most difficulties of the method, including slow convergence, interdependence of propagation nodes and error accumulation. To address slow convergence and the interdependence of propagation nodes, a stable propagation source and integration path are used to make sure that the reconstruction of each pixel in the image is independent. A Gaussian mixture model based on prior conditions is proposed to fix the error of integration. Good results have been achieved in the experiment for the Lambert composite image of front illumination.
    Keywords: shape from shading; propagation method; silhouette; Gaussian mixture model; surface reconstruction.

  • Sign fusion of multiple QPNs based on qualitative mutual information   Order a copy of this article
    by Yali Lv, Jiye Liang, Yuhua Qian 
    Abstract: In the era of big data, the fusion of uncertain information from different data sources is a crucial issue in various applications. In this paper, a sign fusion method of multiple Qualitative Probabilistic Networks (QPNs) with the same structure from different data sources is proposed. Specifically, firstly, the definition of parallel path in multiple QPNs is given and the problem of fusion ambiguity is described. Secondly, the fusion operator theorem has been introduced in detail, including its proof and algebraic properties. Further, an efficient sign fusion algorithm is proposed. Finally, experimental results demonstrate that our fusion algorithm is feasible and efficient.
    Keywords: qualitative probabilistic reasoning; QPNs; Bayesian networks; sign fusion; qualitative mutual information.

  • Estimation of distribution algorithms based on increment clustering for multiple optima in dynamic environments   Order a copy of this article
    by Bolin Yu 
    Abstract: Aiming to locate and track multiple optima in dynamic multimodal environments, an estimation of distribution algorithm based on incremental clustering is proposed. The main idea of the proposed algorithm is to construct several probability models based on incremental clustering, which improves performance in locating multiple local optima and helps to find the global optimal solution quickly for dynamic multimodal problems. Meanwhile, a policy of diffusion search is introduced to enhance the diversity of the population in a guided fashion when the environment changes. The policy uses both the current population information and part of the available historical information on the optimal solutions. Experimental studies on the moving peaks benchmark are carried out to evaluate the performance of the proposed algorithm in comparison with several state-of-the-art algorithms from the literature. The results show that the proposed algorithm is effective for functions with a moving optimum and can adapt to dynamic environments rapidly.
    Keywords: EDAs; dynamic multimodal problems; diffusion policy; incremental clustering.

  • A blind image watermarking algorithm based on amalgamation domain method   Order a copy of this article
    by Qingtang Su 
    Abstract: Combining the spatial domain and the frequency domain, a novel blind digital image watermarking algorithm is proposed in this paper to address the copyright protection problem. To embed a watermark, the generation principle and distribution features of the direct current (DC) coefficient are used to directly modify the pixel values in the spatial domain, and four different sub-watermarks are each embedded four times into different areas of the host image. When extracting the watermark, the sub-watermarks are extracted in a blind manner according to the DC coefficients of the watermarked image and the key-based quantisation step, and then a statistical rule and a "first select, then combine" rule are proposed to form the final watermark. Hence, the proposed algorithm has not only the simplicity and speed of the spatial domain but also the high robustness of the DCT domain. Experimental results show that the proposed watermarking algorithm offers good watermark invisibility and strong robustness against many attacks, e.g., JPEG compression, cropping and added noise. Comparison results also show the superiority of the proposed algorithm.
    Keywords: information security; digital watermarking; combine domain; direct current.

  • A data cleaning method for heterogeneous attribute fusion and record linkage   Order a copy of this article
    by Huijuan Zhu, Tonghai Jiang, Yi Wang, Li Cheng, Bo Ma, Fan Zhao 
    Abstract: In the big data era, when massive heterogeneous data are generated from various data sources, the cleaning of dirty data is critical for reliable data analysis. Existing rule-based methods are generally developed in a single data source environment, so issues such as data standardisation and duplication detection for attributes of different data types are not fully studied. In order to address these challenges, we first introduce a method based on dynamic configurable rules which can integrate data detection, modification and transformation together. Secondly, we propose a type-based blocking and a varying window size selection mechanism based on a classic sorted-neighborhood algorithm. We present a reference implementation of our method in a real-life data fusion system and validate its effectiveness and efficiency using recall and precision metrics. Experimental results indicate that our method is suitable for scenarios with multiple data sources and heterogeneous attribute properties.
    Keywords: big data; varying window; data cleaning; record linkage; record similarity; SNM; type-based blocking.

  • Chinese question speech recognition integrated with domain characteristics   Order a copy of this article
    by Shengxiang Gao, Dewei Kong, Zhengtao Yu, Jianyi Guo, Yantuan Xian 
    Abstract: Aiming at domain adaptation in speech recognition, we propose a speech recognition method for Chinese question sentences based on domain characteristics. Firstly, by virtue of the syllable association characteristics implied in domain terms, syllable feature sequences of domain terms are used to construct the domain acoustic model. Secondly, in the decoding process of domain-specific Chinese question speech recognition, we use a domain knowledge relationship to optimise and prune the speech decoding network generated by the language model, to improve continuous speech recognition. The experiments on the tourist domain corpus show that the proposed method has an accuracy of 80.50% on Chinese question speech recognition and of 91.50% on domain term recognition.
    Keywords: Chinese question speech recognition; speech recognition; domain characteristic; acoustic model library; domain terms; language model; domain knowledge library.

  • Original image tracing with image relational graph for near-duplicate image elimination   Order a copy of this article
    by Fang Huang, Zhili Zhou, Ching-Nung Yang, Xiya Liu 
    Abstract: This paper proposes a novel method for near-duplicate image elimination, by tracing the original image of each near-duplicate image cluster. For this purpose, image clustering based on the combination of global feature and local feature is firstly achieved in a coarse-to-fine way. To accurately eliminate redundant images of each cluster, an image relational graph is constructed to reflect the contextual relationship between images, and the PageRank algorithm is adopted to analyse this contextual relationship. Then the original image will be correctly traced with the highest rank, while other redundant near-duplicate images in the cluster will be eliminated. Experiments show that our method achieves better performance both in image clustering and redundancy elimination, compared with the state-of-the-art methods.
    Keywords: near-duplicate image clustering; near-duplicate image elimination; image retrieval; image search; near-duplicate image retrieval; partial-duplicate image retrieval; image copy detection; local feature; contextual relationship.

  • IFOA: an improved forest algorithm for continuous nonlinear optimisation   Order a copy of this article
    by Borong Ma, Zhixin Ma, Dagan Nie, Xianbo Li 
    Abstract: The Forest Optimisation Algorithm (FOA) is a new evolutionary optimisation algorithm which is inspired by seed dispersal procedure in forests, and is suitable for continuous nonlinear optimisation problems. In this paper, an Improved Forest Optimisation Algorithm (IFOA) is introduced to improve convergence speed and the accuracy of the FOA, and four improvement strategies, including the greedy strategy, waveform step, preferential treatment of best tree and new-type global seeding, are proposed to solve continuous nonlinear optimisation problems better. The capability of IFOA has been investigated through the performance of several experiments on well-known test problems, and the results prove that IFOA is able to perform global optimisation effectively with high accuracy and convergence speed.
    Keywords: forest optimisation algorithm; evolutionary algorithm; continuous nonlinear optimisation; scientific decision-making.

  • A location-aware matrix factorisation approach for collaborative web service QoS prediction   Order a copy of this article
    by Zhen Chen, Limin Shen, Dianlong You, Chuan Ma, Feng Li 
    Abstract: Predicting the unknown QoS is often required because most users would have invoked only a small fraction of web services. Previous prediction methods benefit from mining neighborhood interest from explicit user QoS ratings. However, the implicitly existing but significant location information that would potentially tackle the data sparsity problem is overlooked. In this paper, we propose a unified matrix factorisation model that fully capitalises on the advantages of both location-aware neighborhood and latent factor approach. We first develop a multiview-based neighborhood selection method that clusters neighbours from the views of both geographical distance and rating similarity relationships. Then a personalised prediction model is built up by transforming the wisdom of neighborhoods. Experimental results have demonstrated that our method can achieve higher prediction accuracy than other competitive approaches and also better alleviate the concerned data sparsity issue.
    Keywords: service computing; web service; QoS prediction; matrix factorisation; location awareness.

  • Pairing-free certificateless signature with revocation   Order a copy of this article
    by Sun Yinxia, Shen Limin 
    Abstract: How to revoke a user is an important problem in public key cryptosystems. Free of costly certificate management and key escrow, certificateless public key cryptography (CLPKC) is advantageous over the traditional public key system and the identity-based public key system. However, there are few solutions to the revocation problem in CLPKC. In this paper, we present an efficient revocable certificateless signature scheme. This new scheme can revoke a user with high efficiency. We also give a method to improve the scheme to be signing-key-exposure-resilient. Based on the discrete logarithm problem, our scheme is provably secure.
    Keywords: revocation; certificateless signature; without pairing; discrete logarithm problem.

  • Large universe multi-authority attribute-based PHR sharing with user revocation   Order a copy of this article
    by Enting Dong, Jianfeng Wang, Zhenhua Liu, Hua Ma 
    Abstract: In the patient-centric model of health information exchange, personal health records (PHRs) are often outsourced to third parties, such as cloud service providers (CSPs). Attribute-based encryption (ABE) can be used to realise flexible access control on PHRs in the cloud environment. Nevertheless, the issues of scalability in key management, user revocation and flexible attributes remain to be addressed. In this paper, we propose a large-universe multi-authority ciphertext-policy ABE system with user revocation. The proposed scheme achieves scalable and fine-grained access control on PHRs. Our scheme comprises a central authority (CA) and multiple attribute authorities (AAs). When a user is revoked, the system public key and the other users' secret keys need not be updated. Furthermore, because our scheme supports a large attribute universe, the number of attributes is not polynomially bounded and the public parameter size does not grow linearly with the number of attributes. Our system is constructed on prime order groups and proven selectively secure in the standard model.
    Keywords: attribute-based encryption; large universe; multi-authority; personal health record; user revocation.

  • A multi-objective optimisation multicast routing algorithm with diversity rate in cognitive wireless mesh networks   Order a copy of this article
    by Zhufang Kuang 
    Abstract: Cognitive Wireless Mesh Networks (CWMNs) were developed to improve the usage ratio of the licensed spectrum. Since the spectrum opportunities for users vary over time and location, enhancing the spectrum effectiveness is a goal and also a challenge for CWMNs. Multimedia applications have recently generated much interest in CWMNs supporting Quality-Of-Service (QoS) communications. Multicast routing and spectrum allocation are important challenges in CWMNs. In this paper, we design an effective multicast routing algorithm based on diversity rate with respect to load balancing and the number of transmissions for CWMNs. A Load Balancing wireless link weight computing function and computing algorithm based on Diversity Rate (LBDR) are proposed, and a load balancing Channel and Rate Allocating algorithm based on Diversity Rate (CRADR) is proposed. On this basis, a Load balancing joint Multicast Routing, channel and Rate allocation algorithm based on Diversity rate with QoS constraints for CWMNs (LMR2D) is proposed. The objectives of LMR2D are to balance the load of nodes and channels and to minimise the number of transmissions of the multicast tree. Firstly, LMR2D computes the weight of wireless links using LBDR and uses the Dijkstra algorithm to construct the load balancing multicast tree step by step. Secondly, LMR2D uses CRADR, which is based on the Wireless Broadcast Advantage (WBA), to allocate channels and rates to links. Simulation results show that LMR2D can achieve the expected goal: it not only balances the load of nodes and channels, but also needs fewer transmissions for the multicast tree.
    Keywords: cognitive wireless mesh networks; multicast routing; spectrum allocation; load balanced; diversity rate.

  • Online multi-label learning with cost-sensitive budgeted SVM   Order a copy of this article
    by Jing Liu, Zhongwen Guo, Ling Jian, Like Qiu, Xupeng Wang 
    Abstract: Multi-label learning deals with data associated with multiple labels simultaneously. It has been extensively studied in diverse areas such as information retrieval, bioinformatics and image annotation. The explosive growth of multi-label data has brought the challenges of how to learn from labelled data efficiently and how to label unlabelled data automatically. In this paper, we propose an online learning algorithm which processes data arriving in a streaming fashion. It is space-saving and scalable to large-scale problems. Specifically, to tackle the class imbalance problem, we exploit the label prior to construct a cost-sensitive function for the sub-classification problems. Experimental studies corroborate the performance of our approach on datasets drawn from diverse domains, and demonstrate that our proposed algorithm is a strong candidate for processing streaming data and dealing with online multi-label learning tasks.
    Keywords: online learning; budgeted SVM; multi-label learning; cost-sensitive; stochastic gradient descent.

  • Nominal data similarity: a hierarchical measure   Order a copy of this article
    by Hao Yu, Gen Zhang 
    Abstract: Similarity of nominal data plays a fundamental role in numerous fields of machine learning and data mining. Unlike that of numerical data, the similarity of nominal data is much more difficult to describe, and little effort has been devoted to it. Although existing nominal similarity measures can reveal a part of data properties, they suffer from low accuracy owing to ignoring value relationships or integrating multi-view relationships inappropriately. In this paper, we propose a novel hierarchical measure for nominal data similarity (HNS). The HNS leverages the intrinsic data characteristics by considering low-level information both within and between attributes, and hierarchically captures the value distributions, attribute interactions and attribute-to-object contributions. Meanwhile, it aggregates multi-view relationships through a bottom-up framework, retaining consistency as well as complementary details. We analyse this measure theoretically, and experiments on six UCI datasets demonstrate that the HNS outperforms the state-of-the-art nominal similarity measures in terms of target alignment and clustering accuracy.
    Keywords: similarity measure; nominal data; metric learning.

  • Context discriminative dictionary construction for topic representation   Order a copy of this article
    by Shufang Wu 
    Abstract: The construction of a discriminative topic dictionary is important for describing the topic and increasing the accuracy of topic detection and tracking. In our method, we rank the mutual information of words, and the top few words with the maximum mutual information are selected to construct the discriminative topic dictionaries. Since context words can provide a more accurate expression of the topic, during word selection we consider both the differences between topics and the context words that appear in the stories. Since a news topic changes over time, it is not reasonable to keep the topic dictionary unchanged, so a dictionary updating method is also proposed. Experiments were carried out on the TDT4 corpus, and we adopt miss probability and false alarm probability as evaluation criteria to compare the performance of incremental TF-IDF and the proposed method. Extensive experiments show that our method provides better results.
    Keywords: discriminative dictionary; context word; topic representation; word selection.

  • Demystifying echo state network with deterministic simple topologies   Order a copy of this article
    by Duaa Elsarraj, Maha Al Qisi, Ali Rodan, Nadim Obeid, Ahmad Sharieh, Hossam Faris 
    Abstract: Echo State Networks (ESN) are a special type of Recurrent Neural Networks (RNN) with distinct performance in the field of reservoir computing. The state space of the ESN is initially randomised and the reservoir weights are fixed, with training done only on the state readout. Despite the advantages of ESN, there remains some opacity in the dynamic properties of the reservoir owing to the presence of randomisation. Our aims in this paper are to demystify the ESN model by using a completely deterministic structure with different proposed reservoir structures (topologies) and to compare their performance with the random ESN on different benchmark datasets. All applied topologies maintain the simplicity of the random ESN's computational complexity. Most of the topologies showed comparable or even better performance.
    Keywords: echo state network; reservoir computing; reservoir structure topology; memory capacity; echo state network algorithm; complexity.

  • A state space distribution approach based on system behaviour   Order a copy of this article
    by Imene Bensetira, Djamel Eddine Saidouni, Mahfud Al-la Alamin 
    Abstract: In this paper, we propose a novel approach to deal with the state space explosion problem occurring in model checking. We propose an off-line algorithm for distributed state space construction. This is carried out by reviewing the behaviour of the constructed system and redistributing the state space according to the accumulated information about the optimal considered behaviour. Therefore, the distribution is guided by the system's behaviour. The proposed policy maintains the spatial-time balance. The simulation and implementation of our system are based on a multi-agent technique, which is well suited to the development of distributed systems. The experimental measurements performed on a cluster of machines have shown very promising results for both workload balance and communication overhead.
    Keywords: model checking; combinatorial state space explosion; distributed state space construction; graph distribution; system behaviour; distributed algorithms; reachability analysis.

  • Consensus RNA secondary structure prediction using information of neighbouring columns and principal component analysis   Order a copy of this article
    by Tianhang Liu, Jianping Yin, Long Gao, Wei Chen, Minghui Qiu 
    Abstract: RNA is a family of biological macromolecules. It is important to all kinds of biological processes. RNA structures are closely related to their functions. Hence, determining the structure is invaluable in understanding genetic diseases and creating drugs. Nowadays, RNA secondary structure prediction is a field yet to be researched. In this paper, we present a novel method using an RNA sequence alignment to predict a consensus RNA secondary structure. In essence, the goal of the method is to give a prediction about whether any two columns of an alignment correspond to a base pair or not, using the information provided by the alignment. The information includes the covariation score, the fraction of complementary nucleotides and the consensus probability matrix of the column pair and those of its neighbours. Then principal component analysis is applied to overcome the problem of over-fitting. A comparison of our method and other consensus RNA secondary structure prediction methods, including NeCFold, ELMFold, KnetFold, PFold and RNAalifold, in 47 families from Rfam (version 11.0), is performed. Results show that our method surpasses the other methods in terms of Matthews correlation coefficient, sensitivity and selectivity.
    Keywords: RNA secondary structure prediction; comparative sequence analysis; principal component analysis; information of neighbouring columns.

  • Research on RSA and Hill hybrid encryption algorithm   Order a copy of this article
    by Hongyu Yang, Yuguang Ning, Yue Wang 
    Abstract: An RSA-Hill hybrid encryption algorithm model based on random division of plaintext is proposed. First, the key of the Hill cipher is replaced by a Pascal matrix. Secondly, the session key of the model is replaced by random numbers of plaintext division, and encrypted by the RSA cipher. Finally, the dummy problem in the Hill cipher can be solved, and the model can achieve the one-time pad. Security analysis and experimental results show that our method has better encryption efficiency and stronger anti-attack capacity.
    Keywords: hybrid encryption; plaintext division; Pascal matrix; RSA cipher; Hill cipher.

  • An auction mechanism for cloud resource allocation with time-discounting values   Order a copy of this article
    by Yonglong Zhang 
    Abstract: Group-buying has emerged as a new trading paradigm and has become more attractive. Both sides of the transaction benefit from group-buying: buyers enjoy a lower price and sellers receive more demanding orders. In this paper, we investigate an auction mechanism for cloud resource allocation with time discounting values via group-buying, called TDVG. TDVG consists of two steps: winning seller and buyer selection, and pricing. In the first step, we choose the winning sellers and buyers in a greedy manner according to some criterion; in the second step, we calculate the payment for each winning seller and buyer. Rigorous proof demonstrates that TDVG satisfies the properties of truthfulness, budget balance and individual rationality. Our experimental results show that TDVG achieves better total utility, matching rate and commodities use than existing works.
    Keywords: cloud resource allocation; auction; time discounting values; group-buying.

  • Study on data sparsity in social network-based recommender system   Order a copy of this article
    by Ru Jia, Ru Li, Meng Gao 
    Abstract: With the development of information technology and the expansion of information resources, it is more difficult for people to get the information that they are really interested in, a problem known as information overload. Recommender systems are regarded as an important approach to dealing with information overload, because they can predict users' preferences according to users' records. Matrix factorisation is very successful in recommender systems, but it faces the problem of data sparsity. This paper addresses the sparsity problem by adding more kinds of information from social networks, such as friendships and tags, into the recommending model. The paper also validates, through experiments on a real-life dataset, the impact of users' friendships, tags and neighbours of items on reducing the sparseness of the data and improving recommendation accuracy.
    Keywords: social network-based recommender systems; matrix factorisation; data sparsity.

  • A novel virtual disk bandwidth allocation framework for data-intensive applications in cloud environments
    by Peng Xiao, Changsong Liu 
    Abstract: Recently, cloud computing has become a promising distributed processing paradigm for deploying various kinds of non-trivial applications. Most of these applications are considered data-intensive and therefore require the cloud system to provide massive storage space as well as desirable I/O performance. As a result, the virtual disk technique has been widely applied in many real-world platforms to meet the requirements of these applications. Therefore, how to allocate the virtual disk bandwidth efficiently has become an important issue that needs to be addressed. In this paper, we present a novel virtual disk bandwidth allocation framework, in which a set of virtual bandwidth brokers are introduced to make allocation decisions by playing two game models. Theoretical analysis and solutions are presented to prove the effectiveness of the proposed game models. Extensive experiments are conducted on a real-world cloud platform, and the results indicate that the proposed framework can significantly improve the utilisation of virtual disk bandwidth compared with other existing approaches.
    Keywords: cloud computing; bandwidth reservation; quality of service; queue model; game theory.

Special Issue on: ICACCI-2013 and ISI-2013 Swarm and Artificial Intelligence

  • Particle swarm optimisation with time-varying cognitive avoidance component   Order a copy of this article
    by Anupam Biswas, Bhaskar Biswas, Anoj Kumar, Krishn Mishra 
    Abstract: Interactive cooperation of local best or global best solution encourages particles to move towards them, hoping that a better solution may be present in the neighbouring positions around the local best or global best. This encouragement does not guarantee that movements taken by particles will always be suitable. Sometimes particles may be misled in a wrong direction towards the worst solution. Prior knowledge of worst solutions may predict such misguidance and avoid such moves. Worst solutions cannot be known in advance and can be known only through experience. This paper introduces a cognitive avoidance scheme to the particle swarm optimisation method. Worst solutions are incorporated into the strategic movement of particles through a mechanism very similar to the one used to incorporate best solutions. A time-varying approach is also applied to the cognitive avoidance scheme to deal with negative effects. The proposed approach is tested with 25 benchmark functions of the CEC 2005 special session on real parameter optimisation, as well as with four other very popular benchmark functions.
    Keywords: optimisation; particle swarm optimisation; differential evolution; heuristics.

Special Issue on: Cloud Computing Services Brokering, SLA and Security

  • Coordinated scan detection algorithm based on the global characteristics of time sequence   Order a copy of this article
    by Yanli Lv, Yuanlong Li, Shuang Xiang, Chunhe Xia 
    Abstract: Scanning is a kind of activity or action for the purpose of acquiring the target host status information. In order to obtain the information more efficiently and more secretly, attackers in the network often use coordinated scans to scan the target host or network. At present, there are no effective methods to detect coordinated scans. We treat scan sequences as time series and extract the general characteristics of these time series. Then, using a clustering approach based on these time series features, we identify the coordinated scans governed by the same controller. Simulation and experimental results show that the methods we propose are better than the existing methods in accuracy and efficiency.
    Keywords: scan; scan detection; coordinated scan; general feature; clustering analysis

  • The intensional semantic conceptual graph matching algorithm based on conceptual sub-graph weight self-adjustment   Order a copy of this article
    by Zeng Hui, Xiong Liyan, Chen Jianjun 
    Abstract: Semantic computing is an important task in the research on natural language processing. To address the problem of inaccurate conceptual graph matching, this paper proposes an algorithm to compute the similarity of conceptual graphs, based on conceptual sub-graph weight self-adjustment. The algorithm builds on the intensional logic model of Chinese concept connotation, uses the intensional semantic conceptual graph as a knowledge representation method and combines it with the computation method of E-A-V structures. When computing the similarity of conceptual graphs, the algorithm assigns each sub-graph a weight in proportion to the share of the whole conceptual graph's information that the sub-graph contains. Therefore, it can achieve better similarity results, as confirmed by the experiments described in this paper.
    Keywords: Chinese semantic analysis; intensional semantic conceptual graph; E-A-V conceptual structures similarity; conceptual sub-graph weight self-adjustment

Special Issue on: High-Performance Information Technologies for Engineering Applications

  • Development and evaluation of the cloudlet technology within the Raspberry Pi   Order a copy of this article
    by Nawel Kortas, Anis Ben Arbia 
    Abstract: Nowadays, communication devices, such as laptops, computers, smartphones and personal media players, have greatly increased in popularity thanks to the rich set of cloud services that they allow users to access. This paper focuses on reducing network latency for communication devices by the use of cloudlets. This work also proposes a design for a local datacentre that allows users to connect to their data from any point and through any device by the use of the Raspberry Pi. We also present performance results for the resource utilisation rate, the average execution time, the latency, the throughput and the lost packets, which show the advantage of the cloudless application for local and distant connections. Furthermore, we present an evaluation of cloudless by comparing it with similar services and by reporting simulation results obtained with the CloudSim simulator.
    Keywords: cloudlets; cloud computing; cloudless; Raspberry Pi; datacentre; device communication; file-sharing services.

  • Parallel data processing approaches for effective intensive care units with the internet of things   Order a copy of this article
    by N. Manikandan, S Subha 
    Abstract: Computerisation in health care is becoming more general, and monitoring Intensive Care Units (ICU) is especially significant and life-critical. Accurate monitoring in an ICU is essential. Failing to take the right decisions at the right time may prove fatal. Similarly, a timely decision can save people's lives in various critical situations. In order to increase the accuracy and timeliness of ICU monitoring, two major technologies can be used, namely parallel processing through vectorisation of ICU data and data communication through the Internet of Things (IoT). With our approach, we can improve efficiency and accuracy in data processing. This paper proposes a parallel decision tree algorithm for ICU data to take faster and more accurate decisions on data selection. The use of parallelised algorithms optimises the process of collecting large sets of patient information. A decision tree algorithm is used for examining and extracting knowledge-based data from large databases. Finalised information will be transferred to the concerned medical experts in cases of medical emergency using the IoT. The decision tree algorithm is parallelised with threads, and output data is stored in local IoT tables for further processing.
    Keywords: medical data processing; internet of things; ICU data; vectorisation; multicore architecture; parallel data processing.

  • Study of runtime performance for Java-multithread PSO on multiCore machines   Order a copy of this article
    by Imed Bennour, Monia Ettouil, Rim Zarrouk, Abderrazak Jemai 
    Abstract: Optimisation meta-heuristics, such as Particle Swarm Optimization (PSO), require high-performance computing (HPC). The use of software parallelism and hardware parallelism is mandatory to achieve HPC. Thread-level parallelism is a common software solution for programming on multicore systems. The Java language, which includes important aspects such as its portability and architecture neutrality, its multithreading facilities and its distributed nature, makes it an interesting language for parallel PSO. However, many factors may impact the runtime performance: the coding styles, the threads-synchronisation levels, the harmony between the software parallelism injected into the code and the available hardware parallelism, the Java networking APIs, etc. This paper analyses the Java runtime performance on handling multithread PSO over general purpose multicore machines and networked machines. Synchronous, asynchronous, single-swarm and multi-swarm PSO variants are considered.
    Keywords: high-performance computing; particle swarm optimisation; multicore; multithread; performance; simulation.

  • Execution of scientific workflows on IaaS cloud by PBRR algorithm   Order a copy of this article
    by S.A. Sundararaman 
    Abstract: Job scheduling of scientific workflow applications in IaaS cloud is a challenging task. Optimal resource mapping of jobs to virtual machines is calculated considering schedule constraints such as timeline and cost. Determining the required number of virtual machines to execute the jobs is key in finding the optimal schedule makespan with minimal cost. In this paper, the VMPROV algorithm is proposed to find the required virtual machines. A priority-based round robin (PBRR) algorithm is proposed for finding the job-to-resource mapping with minimal makespan and cost. The execution times of four real-world scientific application jobs under the PBRR algorithm are compared with those under the MINMIN, MAXMIN, MCT and round robin algorithms. The results show that the proposed PBRR algorithm can predict the mapping of tasks to virtual machines in a better way than the other classic algorithms.
    Keywords: cloud job scheduling; virtual machine provisioning; IaaS

Special Issue on: Technologies and Applications in the Big Data Era

  • Research on implementation of digital forensics in cloud computing environment   Order a copy of this article
    by Hai-Yan Chen 
    Abstract: Cloud computing is a promising next-generation computing paradigm which integrates multiple existing and new technologies. With the maturing and wide application of cloud computing technology, more and more crimes are occurring in the cloud computing environment, so the effective investigation of evidence against these crimes is extremely important and urgently needed. Because of the characteristics of the virtual computing environment (mass storage and distribution of data, and multi-tenancy), cloud computing sets extremely hard conditions for the investigation of evidence. For this purpose, in this paper, we propose a digital forensics reference model for the cloud environment. First, we divide cloud forensics into four steps and give an implementation scheme for each. Secondly, a cloud platform trusted evidence collection mechanism based on a trusted evidence collection agent is put forward. Finally, methods of using various data mining algorithms in the analysis of evidence are introduced. The experiment and simulation on real data show the accuracy and effectiveness of the proposed method.
    Keywords: cloud computing; digital forensics; cloud environment; digital evidence

  • Building a large-scale testing dataset for conceptual semantic annotation of text   Order a copy of this article
    by Xiao Wei, Daniel Dajun Zeng, Xiangfeng Luo, Wei Wu 
    Abstract: One major obstacle facing the research on semantic annotation is the lack of large-scale testing datasets. In this paper, we develop a systematic approach to constructing such datasets. This approach is based on guided ontology auto-construction and annotation methods that use little a priori domain knowledge and little user knowledge in documents. We demonstrate the efficacy of the proposed approach by developing a large-scale testing dataset using information available from MeSH and PubMed. The developed testing dataset consists of a large-scale ontology, a large-scale set of annotated documents, and the baselines to evaluate the target algorithm, which can be employed to evaluate both ontology construction algorithms and semantic annotation algorithms.
    Keywords: semantic annotation; ontology concept learning; testing dataset; evaluation baseline; ontology auto-construction; priori knowledge; MeSH; PubMed

Special Issue on: Advanced Information Processing in Communication

  • Hybrid genetic, variable neighbourhood search and particle swarm optimisation based job scheduling for cloud computing   Order a copy of this article
    by Rachhpal Singh 
    Abstract: In a Cloud Computing Environment (CCE), many scheduling mechanisms have been proposed to balance the load between the given set of distributed servers. Genetic Algorithm (GA) has been verified to be the best technique to reduce the energy consumed by distributed servers, but it becomes unsuccessful to strengthen the exploration in the rising areas. The performance of Particle Swarm Optimisation (PSO) depends on initially selected random particles, i.e. wrongly selected particles may produce poor results. The Variable Neighbourhood Search (VNS) can be used to set the stability of non-local searching and local utilisation for an evolutionary processing period. Therefore, this paper proposes a hybrid VNS, GA and PSO, called HGVP, in order to overcome the constraint of a poorly selected initial set of particles in the case of PSO-based scheduling for CCE. The simulation results of the proposed technique have shown effective results over the available techniques, especially in terms of energy consumption.
    Keywords: cloud computing environment; job scheduling; particle swarm optimisation; genetic algorithm; variable neighbourhood search.

  • Secured image compression using AES in bandelet domain   Order a copy of this article
    by S.P. Raja, A. Suruliandi 
    Abstract: Compression and encryption are jointly used in network systems to improve efficiency and security. A secure and reliable means for communicating images and video is, consequently, indispensable for networks. In this paper, a new methodology is proposed for secure image compression. Initially, a bandelet transform is applied to the input image to obtain coefficients, and kernel matching pursuits (KMP) are used to choose the key bandelet coefficients. The coefficients obtained from the KMP are encrypted using the advanced encryption standard (AES) and encoded using the listless set partitioning embedded block (listless SPECK) image compression encoding technique. For performance evaluation, the peak signal to noise ratio (PSNR), mean square error (MSE), structural similarity index (SSIM) and image quality index (IQI) are taken. From the experimental results and performance evaluation, it is shown that the proposed approach produces high PSNR values and compresses images securely.
    Keywords: bandelet transform; KMP; AES; listless SPECK.

  • A semantic layer to improve collaborative filtering systems   Order a copy of this article
    by Sahraoui Kharroubi, Youcef Dahmani, Omar Nouali 
    Abstract: According to IBM statistics, the internet generates 2.5 trillion items of heterogeneous data on a daily basis. Known as big data, this degrades the performance of search engines and reduces their ability to satisfy requests. Filtering systems such as Netflix, eBay, iTunes and others are widely used on the web to select and distribute interesting resources to users. Most of these systems recommend only one kind of resource, which limits the ambitions of their users. In this paper, we propose a hybrid recommendation system that includes a variety of resources (books, films, music, etc.). A similarity process was applied to group users and resources on the basis of appropriate metadata. We have also used a graph data model known as a Resource Description Framework (RDF) to represent the different modules of the system. RDF syntax allows for perfect integration and data exchange via the SPARQL query language. Real data sets are used to perform the experiments, showing promising results in terms of performance and accuracy.
    Keywords: big data; namespace; rating; relevant item; RDF vocabulary; sparsity; user’s relationship

  • A fuzzy trust-based routing model for mitigating the misbehaving nodes in MANETs   Order a copy of this article
    by Abdesselem Beghriche, Azeddine Bilami 
    Abstract: Although security issues in Mobile Ad-hoc NETworks (MANETs) have been a major focus in recent years, the development of fully secure schemes for these networks has not been entirely achieved till now. This paper proposes a novel trusted routing model for mitigating attacks in MANETs. The proposed model incorporates the concept of trust into MANETs and applies Gray Relational Analysis (GRA) theory combined with fuzzy sets to calculate a node's trust level based on observations from neighbour nodes' trust levels; these trust levels are then used in the routing decision-making process. In order to prove the applicability of the proposed solution, extensive experiments were conducted to evaluate the efficiency of our model, aiming at improving the network interaction quality, malicious node mitigation and enhancements of the system's security.
    Keywords: mobile ad-hoc networks; security; routing protocol; misbehaviour; trust model; trust management; fuzzy set; gray relational analysis; gray clustering method.

  • QoS-aware web service selection based on self-organising migrating algorithm and fuzzy dominance   Order a copy of this article
    by Amal Halfaoui, Fethallah Hadjila, Fedoua Didi 
    Abstract: Web service composition consists of creating a new complex web service by combining existing ones. The selection of composite services is a very complex and challenging task, especially with the increasing number of services offering the same functionality. The web service selection can be considered as a combinatorial problem that focuses on delivering the optimal composition that satisfies the user's requirements (functional and non-functional needs). Several optimisation algorithms have been proposed in the literature to tackle web service selection. In this work, we propose an approach that adapts a recent stochastic optimisation algorithm called Self Organising Migrating Algorithm (SOMA) for QoS-aware web service selection. Furthermore, we propose a fuzzification of the Pareto dominance and use it to improve SOMA by comparing the services within the local search. The proposed approach is applicable to any combinatorial workflow with parallel, choice and loop patterns. We test our algorithm with a set of synthetic datasets and compare it with the most recently used algorithm (PSO). The comparative study shows that SOMA produces promising results and therefore it is able to select the user's composition in an efficient manner.
    Keywords: web service selection; SOMA; fuzzy dominance; swarm-based optimisation algorithms.

  • Fault detection and behavioural prediction of a constrained complex system using cellular automata   Order a copy of this article
    by Priya Radha, Elizabeth Sherly 
    Abstract: Functionality-based failure analysis and validation during the design process in a constrained complex system is challenging. In this paper, we advocate a model to validate the functionality of a constrained complex control system with its structural behaviour. An object-constrained model is proposed for validation of any component of a complex system with constraints, and its state of safeness is predicted using cellular automata. The model consists of two sub-systems: an inference engine that functions based on a rule-based expert system, and a failure forecast engine based on cellular automata. The system is tested against a thermal power plant for early detection of failure in the system, which enhances the process efficiency of power generation.
    Keywords: complex system; constrained objects; cellular automata; control system; prediction engine; failure forecast engine.

  • Distributed diagnosis based on distributed probability propagation nets   Order a copy of this article
    by Yasser Moussa Berghout, Hammadi Bennoui 
    Abstract: This paper addresses the problem of modelling uncertainty in the distributed context. It is situated in the field of diagnosis; more precisely, model-based diagnosis of distributed systems. A special focus is given to modelling uncertainty and probabilistic reasoning. Thus, this work is based on a probabilistic modelling formalism called "probability propagation nets" (PPNs), which are designed for centralised systems. Hence, an extension of this model is proposed to suit the distributed context. Distributed probability propagation nets (DPPNs), the proposed extension, were conceived to consider the distributed systems' particularities. So, the system we consider is a set of interacting subsystems, each of which is modelled by a DPPN. The interaction among the subsystems is modelled through the firing of common transitions belonging to more than one subsystem. All of that is logically supported by means of probabilistic Horn abductions (PHAs). Furthermore, the diagnostic process is done by exploiting transition-invariants, a diagnostic technique developed for Petri nets. The proposed extension is illustrated through a real-life example.
    Keywords: model-based diagnosis; distributed systems; probabilistic reasoning; probability propagation nets; probabilistic Horn abduction; Petri nets.

  • Novel automatic seed selection approach for mass detection in mammograms   Order a copy of this article
    by Ahlem Melouah, Soumai Layachi 
    Abstract: The success of mass detection using seeded region growing segmentation depends on the seed point selection operation. The seed point is the first point from which the process of aggregation starts. This point must be inside the mass, otherwise the segmentation fails. There are two principal ways to perform the seed point selection. The first one is manual, performed by a medical expert who manually outlines the point of interest using a pointer device. The second one is automatic; in this case the whole process is performed without any user interaction. This paper proposes a novel approach to automatically select the seed point for further region growing expansion. Firstly, suspicious regions are extracted by a thresholding technique. Secondly, the suspicious region whose features match the predefined mass features is identified as the region of interest. Finally, the seed point is placed inside the region of interest. The proposed method is tested using the IRMA database and the MIAS database. The experimental results show the performance and robustness of the proposed method.
    Keywords: breast cancer; masses detection; mammograms; segmentation; seeded region growing; automatic seed selection; region of interest; features; thresholding.

  • Combining topic-based model and text categorisation approach for utterance understanding in human-machine dialogue   Order a copy of this article
    by Mohamed Lichouri, Rachida Djeradi, Amar Djeradi 
    Abstract: In the present paper, we suggest an implementation of an automatic understanding system of the statement in human-machine communication. The architecture we adopt was based on a stochastic approach that assumes that the understanding of a statement is nothing but a simple theme identification process. Therefore, we present a new theme identification method based on a documentary retrieval technique which is text (document) classification [1]. The method we suggest was validated on a basic platform that gives information related to university schooling management (querying a student database), taking into consideration a textual input in French. This method has achieved a theme identification rate of 95% and a correct utterance understanding rate of about 91.66%.
    Keywords: communication; human-machine dialogue; understanding; utterance; thematic; text classification; topic model.

  • A Manhattan distance based binary bat algorithm vs integer ant colony optimisation for intrusion detection in audit trails   Order a copy of this article
    by Wassila Guendouzi, Abdelmadjid Boukra 
    Abstract: An intrusion detection system (IDS) monitors and analyses security activities occurring in computer or network systems. The detection method is the brain of an IDS and it can perform either anomaly-based or misuse-based detection. The misuse mechanism aims to detect predefined attack scenarios in the audit trails, whereas the anomaly detection mechanism aims to detect deviations from normal user behaviour. In this paper, we deal with misuse detection. We propose two approaches to solve the NP-hard security audit trail analysis problem. Both rely on the Manhattan distance measure to improve the intrusion detection quality. The first proposed method, named Enhanced Binary Bat Algorithm (EBBA), is an improvement of Bat Algorithm (BA) that uses a binary coding and a fitness function defined as the global attacks risk. This fitness function is used in conjunction with the Manhattan distance. In this approach, new operators are adapted to the problem of interest, namely the solution transformation, vertical permutation and horizontal permutation operators. The second proposed approach, named Enhanced Integer Ant Colony Optimisation (EIACS), is a combination of two metaheuristics: Ant Colony System (ACS), which uses a new pheromone update method, and Simulated Annealing (SA), which uses a new neighbourhood generation mechanism. This approach uses an integer coding and a new fitness function based on the Manhattan distance measure. Experiments on different problem sizes (small, medium and large) are carried out to evaluate the effectiveness of the two approaches. The results indicate that for small and medium sizes the two algorithms have similar performance in terms of detection quality. For large problem sizes, the performance of EIACS is better than that of EBBA.
    Keywords: intrusion detection; security audit trail analysis; combinatorial optimisation problem; NP-hard; Manhattan distance; bat algorithm; ant colony system; simulated annealing.

  • An approach for managing the dynamic reconfiguration of software architectures   Order a copy of this article
    by Abdelfetah Saadi, Mourad Chabane Oussalah, Abderrazak Henni 
    Abstract: Currently, most software systems have a dynamic nature and need to evolve at runtime. The dynamic reconfiguration of software systems is a mechanism that must be dealt with to enable the creation and destruction of component instances and their links. To be reconfigured, a software system must be stopped, patched and restarted; this causes unavailability periods which are always a problem for highly available systems. In order to address these problems, this paper presents an approach called software architecture reconfiguration approach (SAREA). We define for this approach a set of intelligent agents, each of which has a precise role in the functioning and the control of software. Our approach implements a mechanism for restoring the software architecture to a fully functional state after the failure of one or more reconfiguration operations; it also proposes a reconfiguration mechanism which describes the execution process of reconfigurations.
    Keywords: software architecture; dynamic reconfiguration; evolution; intelligent agents; component model; model driven architecture; MDA; meta-model.

Special Issue on: ICA3PP 2015 and PRDC 2015 Dependable Computing and Parallel Computing

  • A locality constrained self-representation approach for unsupervised feature selection   Order a copy of this article
    by Cuihua Wang, Shuyi Ma, Chao Bi, Hui Sun, Yugen Yi 
    Abstract: Recently, Regularized Self-Representation (RSR) has been proposed as an efficient unsupervised feature selection algorithm. However, RSR only takes the self-representation ability of features into account, and neglects the locality structure preserving ability of features, which may adversely affect its performance. To overcome this limitation, a novel algorithm termed Locality Constrained Regularized Self-Representation (LCRSR) is proposed in this paper. In our algorithm, a local scatter matrix is introduced to encode the locality geometric structure of high-dimensional data. Therefore, the locality information of the input database can be well preserved. Moreover, a simple yet efficient iterative update algorithm is developed to solve the proposed LCRSR. Extensive experiments are conducted on five publicly available databases (JAFFE, ORL, AR, COIL20 and SRBCT) to demonstrate the efficiency of the proposed algorithm. Experimental results show that LCRSR obtains better clustering performance than some other state-of-the-art approaches.
    Keywords: unsupervised feature selection; self-representation; local structure; clustering

  • Fault masking issue on a dependable processor using BIST under highly electromagnetic environment   Order a copy of this article
    by Aromhack Saysanasongkham, Satoshi Fukumoto, Masayuki Arai 
    Abstract: Although power converter and inverter circuits have been enhanced for higher switching speed, higher voltage and higher power density, problems arise relating to the effect of near-field noise due to high-current pulse, which can severely affect the operations of the controllers for power converters and the surrounding logic circuits as multi-bit transient faults. A scheme to construct highly reliable processors that measures the noise duration by BIST (Built-in Self Test) and avoids its effect by clock mitigation was proposed in our previous work. However, in some cases the upper bounds of the noise duration distribution are underestimated owing to insufficient testing and fault masking, resulting in faulty operations. In this paper, we further investigate these underestimating situations and then propose a test method to overcome the problem. We use SPICE simulation to evaluate the effectiveness of the proposed scheme.
    Keywords: fault masking; online BIST; built-in self test; periodic multi-bit transient fault; noise avoidance; dependable processor; electromagnetic radiation

  • A collaborative filtering recommendation method based on TagIEA expert degree model   Order a copy of this article
    by Weimin Li, Bin Wang, Jianbo Zou, Jinfang Sheng 
    Abstract: In recent years, social networking services and e-commerce have been developing rapidly. Research on recommendation in e-commerce services has mainly focused on the collaborative filtering algorithm. However, this algorithm suffers from the limitations of data sparsity and cold start. This paper presents a model using TagIEA expert degree metrics in the context of social e-commerce services, where Tag and expert degree information are integrated into the collaborative filtering algorithm. The comprehensive recommendation based on the TagIEA expert degree can effectively mitigate the problems of cold start and data sparsity. Finally, this paper verifies the effectiveness of the improved collaborative filtering algorithm by experiments.
    Keywords: collaborative filtering; tag; expert degree; social networking services.

  • Data or index: a trade-off in mobile delay-tolerant networks   Order a copy of this article
    by Hong Yao, Hang Zhang, Changkai Zhang, Deze Zeng, Jie Wu, Huanyang Zheng 
    Abstract: Acquiring content through mobile networks is a basic and general topic. Mobile nodes have two different ways of obtaining data. The first method is to download data quickly through 3G/4G networks, which are expensive. The second way is to get data from other nodes by means of delay-tolerant networks (DTN), which are much cheaper, but are time-consuming. Throwboxes deployed in DTN act as fixed ferry nodes. The index records the historical encounter information, in order to give the mobile nodes predictive abilities regarding future encounter events. We compare the effectiveness of replacing some of the space used for data with an index. We bring forward an index-based buffer space management mechanism for throwboxes, by which mobile nodes can have the chance to fetch data at a lower total cost. Preliminary simulations demonstrate that the buffer space allocation strategy is affected by some system parameters, and that replacing some space for data with an index can lower the system total cost significantly in most cases. Simulation results also show that the index-based buffer space management mechanism outperforms other mechanisms, which only store data items or hold an index of static size.
    Keywords: mobile networks; delay-tolerant networks; throwbox; index

  • A differential game-theoretic model of auditing for data storage in cloud computing   Order a copy of this article
    by Zhi Li, Yanzhu Liu 
    Abstract: Cloud computing is a novel computing model that enables convenient and on-demand access to a shared pool of configurable computing resources. Owing to data outsourcing, this new paradigm of data hosting services is also facing many new security challenges. Auditing is essential to make sure that the user's data is correctly stored in the cloud. In this article, the interaction between cloud users and cloud service providers (CSPs) is studied as a non-cooperative differential game. In this game formulation, a feedback Nash equilibrium of the game is reviewed, and complex decision making processes and interactions between cloud users and CSPs are analysed. The simulation results provide a reference for users and CSPs in setting appropriate energy consumption and optimal strategies.
    Keywords: differential game; cloud auditing; data integrity checking; feedback Nash equilibrium; cloud security

  • Package balancing K-means algorithm for physical distribution   Order a copy of this article
    by Yinglong Dai, Wang Yang, Guojun Wang 
    Abstract: In the flourishing express delivery area, it becomes a challenge to assign a set of tasks to a set of carriers, as it is necessary to aggregate neighbouring tasks to a carrier and keep the loads of carriers balanced simultaneously. The k-means clustering algorithm is an effective way to split the delivery tasks. However, it cannot solve the load balancing problem. This paper proposes an extended clustering algorithm, package balancing k-means. By adding a weight metric to the standard k-means algorithm, it can handle delivery task data that contain not only features for similarity measure but also additional weight information for load balancing. Analyses and experiments show that package balancing k-means can solve the load balancing clustering problem efficiently. To our surprise, the time cost experiments show that the extended algorithm runs even faster than k-means on average. We infer that it can accelerate the convergence rate.
    Keywords: clustering; K-means; logistics; physical distribution; load balancing

  • Top-k keyword search with recursive semantics in relational databases   Order a copy of this article
    by Dingjia Liu, Guohua Liu, Wei Zhao, Yu Hou 
    Abstract: Existing solutions for keyword search over relational databases focus on finding joined tuple structures from a data graph. We observe that such a graph using tuples as nodes and foreign-key references as edges cannot describe the joining connections between tuples within a single relation, and thus cannot support recursive query semantics over a relational database. To solve this problem, in our approach, we firstly model a weighted data graph considering both foreign key references and tuple joining connections within a single relation. Secondly, we discuss the ranking strategy for both nodes and edges supporting the recursive semantics by incorporating PageRank methods. Finally, an approximation algorithm as well as a top-k enumeration algorithm is presented by running the Dijkstra algorithm based on dynamic programming strategy to enumerate result tuple trees. At the end of this paper, we conduct an experimental study and report the findings.
    Keywords: relational database; keyword search; recursive semantics; graph; top-k; enumeration; shortest path; steiner tree problem; group steiner tree problem; pagerank; datalog
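
    Illustrative note on the entry above: the abstract mentions running the Dijkstra algorithm over a weighted data graph whose nodes are tuples and whose edges include both foreign-key references and joins within a single relation. The following is only a minimal sketch of that general idea and not the authors' algorithm; the node names, the edge weights and the `dijkstra` helper are hypothetical and chosen purely for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Return the cheapest distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already settled
        for neighbour, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return dist

# Hypothetical data graph: each node is a tuple named "relation:key".
# emp -> dept edges stand for foreign-key references (assumed weight 1.0);
# emp -> emp edges stand for joins inside one relation, e.g. a manager link
# (assumed weight 0.5).
graph = {
    "emp:1": [("dept:10", 1.0), ("emp:2", 0.5)],
    "emp:2": [("emp:1", 0.5), ("emp:3", 0.5)],
    "emp:3": [("emp:2", 0.5), ("dept:20", 1.0)],
    "dept:10": [("emp:1", 1.0)],
    "dept:20": [("emp:3", 1.0)],
}

distances = dijkstra(graph, "dept:10")
```

    In this toy graph the two department tuples are connected only through the chain of same-relation employee edges, which is the kind of connection a graph built from foreign-key edges alone would miss.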

Special Issue on: New Techniques for Secure Internet and Cloud Computation

  • Self and social network behaviours of users in cultural spaces   Order a copy of this article
    by Angelo Chianese, Salvatore Cuomo, Pasquale De Michele, Francesco Piccialli 
    Abstract: Many cultural spaces offer their visitors the use of ICT tools to enhance their visit experience. Data collected within such spaces can be analysed in order to discover hidden information related to visitors' behaviours and needs. In this paper, a computational model inspired by neuroscience simulating the personalised interactions of users with cultural heritage objects is presented. We compare a strengthened validation approach for neural networks based on classification techniques with a novel one based on clustering strategies. Such approaches allow us to identify natural user groups in data and to verify the model responses in terms of user interests. Finally, the presented model has been extended to simulate social behaviours in a community, through the sharing of interests and opinions related to cultural heritage assets. This data propagation has been further analysed in order to reproduce applicative scenarios on social networks.
    Keywords: social network; clustering techniques; cultural heritage; internet of things; user behaviours

  • A perspective on applications of in-memory and associative approaches supporting cultural big data analytics   Order a copy of this article
    by Francesco Piccialli, Angelo Chianese 
    Abstract: Business intelligence, advanced analytics, big data, in-memory database and associative technologies are currently the key enablers for enhanced business decision-making. In this paper, we provide a perspective on applications of in-memory approaches supporting analytics in the field of Cultural Heritage (CH), applied to information resources including structured and unstructured contents, geo-spatial and social network data, multimedia, multiple domain vocabularies, classifiers and ontologies. The proposed approach is implemented in an information system exploiting associative in-memory technologies in a cloud context, as well as integrating semantic technologies for merging and analysing information coming from heterogeneous sources. We analyse and describe the application of this system to trace a behavioural and interest profile of users and visitors for cultural events (exhibitions, museums, etc.) and territories (touristic areas and routes including cultural resources, historical downtown, archaeological sites). The results of ongoing experimentation encourage a business intelligence approach that is suitable for supporting CH asset crowdsourcing, promotion, publication, management and usage.
    Keywords: in-memory database systems; big data; social analytics; business intelligence; cultural heritage; internet of things.

  • Data security and privacy information challenges in cloud computing   Order a copy of this article
    by Weiwei Kong, Yang Lei, Jing Ma 
    Abstract: Cloud computing has become a hotspot in the area of information technology. However, while enjoying its convenience and strong data processing ability, we also find that great challenges arise in terms of data security and privacy information protection. In this paper, a summary of the current security and privacy information challenges is presented. The current security measures are summarized as well.
    Keywords: cloud computing; data security; privacy information; cloud computing provider

  • Load balancing algorithm based on multiple linear regression analysis in multi-agent systems   Order a copy of this article
    by Xiao-hui Zeng 
    Abstract: With the increase of agents involved in applications of multi-agent systems (MAS), the problem of load balancing is more and more prominent. This paper proposes a novel load balancing algorithm based on multiple linear regression analysis (LBAMLR). By using parallel computing on all servers and using partial information about agents' communication, our algorithm can effectively choose the optimal set of agents and the suitable destination servers. The simulation results show that our proposed algorithm can shorten the computing time and increase the total performance in MAS.
    Keywords: distributed computing; multi-agent systems; load balancing; multiple linear regression analysis

  • TERS: a traffic-efficient repair scheme for repairing multiple losses in erasure-coded distributed storage systems   Order a copy of this article
    by Zheng Liming 
    Abstract: Erasure coding has received considerable attention owing to the better tradeoff between the space efficiency and reliability. However, the high repair traffic and the long repair time of erasure coding have posed a new challenge: how to minimise the amount of data transferred among nodes and reduce the repair time when repairing the lost data. Existing schemes are mostly designed for single node failures, which incur high network traffic and result in low efficiency. In this paper, we propose a traffic-efficient repair scheme (TERS) suitable for repairing data losses when multiple nodes fail. TERS reduces the repair traffic by using the overlap of data accessing and computation between node repairs. To reduce the repair time, TERS uses multiple threads during the computation, and pipelines the data transmission during the repair. To evaluate the repair cost and the repair time, we provide an implementation of integrating TERS into HDFS-RAID. The numerical results confirm that TERS reduces the repair traffic by 44% on average compared with the traditional erasure codes and regenerating codes. Theoretical analysis shows that TERS effectively reduces the repair time. Moreover, the experimental results show that compared with current typical repair methods, such as TEC, MSR and TSR, the repair time of TERS is reduced by 25%, 20% and 16%, respectively.
    Keywords: distributed storage; erasure coding; repair traffic; repair time; multiple losses.

  • A sound abstract memory model for static analysis of C programs   Order a copy of this article
    by Yukun Dong 
    Abstract: The abstract memory model plays an important role in the static analysis of programs. This paper proposes a region-based symbolic three-valued logic (RSTVL) to guarantee the soundness of static analysis, which uses abstract regions to simulate blocks of the concrete memory. RSTVL applies symbolic expressions to express the values of memory objects, and the interval domain to describe the value of each symbol in those expressions. Various operations on memory objects can be mapped to operations on regions. RSTVL can describe the shape information of data structures in memory, the storage state of memory objects, and a variety of associative addressable expressions, including points-to relations and hierarchical and valued logic relations. We have built a prototype tool, DTSC_RSTVL, that detects code-level defects in C programs. Five popular C programs are analysed, and the results indicate that the analysis is sufficiently sound to detect code-level defects with a zero false negative rate.
    Keywords: software quality; static analysis; abstract memory model; memory object; defect detection.

Special Issue on: Advances in Evolutionary Computation and Its Applications

  • Variable penalty factors: a new GEP automatic clustering algorithm   Order a copy of this article
    by Yan Chen, Kangshun Li, Haohua Huang 
    Abstract: Clustering is considered an important and basic method in data mining and interdisciplinary research. Traditional clustering algorithms still suffer from several problems: sensitivity to the selection of the initial clustering centres, a tendency to fall into local optima, poor global search capability, and the need for prior knowledge to determine the number of clusters. This paper adopts a GEP (Gene Expression Programming) automatic clustering algorithm with variable penalty factors. It combines penalty factors with the GEP clustering algorithm, requires no prior knowledge of the dataset, divides clusters automatically, and better handles the impact of isolated points and noise points. Simulation experiments further demonstrate the effectiveness of the algorithm.
    Keywords: variable penalty factor; gene expression programming; clustering algorithm

  • A dynamic search space strategy for swarm intelligence   Order a copy of this article
    by Zhang Shui-ping, Wang Bi, Wang Xue-jiao 
    Abstract: A novel strategy named dynamic search spaces (DS) is proposed as an add-on, designed to be embedded in a complete swarm intelligence algorithm, to deal with the premature convergence of such algorithms. To shrink the search space, the differences or distances between the individuals and the global best are used to form a threshold for a self-adaptive system. When the proportion of individuals located near the global best reaches a stated percentage, the system readjusts the borders of the search space around the position of the global best, compressing the search space so that the global best becomes its centre. After each readjustment, the individuals are re-initialised across the whole search space to restore their vitality, which moves the population away from premature convergence and improves the performance of each individual. A simple verification is also provided. Improved results are shown when the strategy is embedded in the genetic algorithm, particle swarm optimisation and differential evolution, indicating that this dynamic search space scheme can be easily embedded in most swarm intelligence algorithms.
    Keywords: swarm intelligence, self-adaption, search space, particle swarm optimisation
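
The readjustment step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name, the `radius`, `trigger` and `shrink` parameters, and the use of the Chebyshev distance to decide what counts as "near" are all hypothetical choices.

```python
import random

def readjust_search_space(population, best, bounds, radius, trigger=0.6, shrink=0.5):
    """Shrink the search space around `best` once enough individuals crowd it.

    population: list of positions (lists of floats)
    best:       position of the global best
    bounds:     list of (low, high) pairs, one per dimension
    radius:     distance below which an individual counts as "near" (assumed)
    trigger:    fraction of near individuals that triggers readjustment (assumed)
    shrink:     factor applied to the width of every dimension (assumed)
    """
    # Count individuals within `radius` of the global best (Chebyshev distance).
    near = sum(
        1 for x in population
        if max(abs(a - b) for a, b in zip(x, best)) <= radius
    )
    if near / len(population) < trigger:
        return bounds, population  # not crowded enough: leave everything as is

    # Centre the new, narrower borders on the global best, clipped to the old ones.
    new_bounds = []
    for (low, high), centre in zip(bounds, best):
        half = (high - low) * shrink / 2
        new_bounds.append((max(low, centre - half), min(high, centre + half)))

    # Re-initialise every individual uniformly inside the new search space.
    new_population = [
        [random.uniform(low, high) for low, high in new_bounds]
        for _ in population
    ]
    return new_bounds, new_population
```

A host algorithm (GA, PSO or DE) would call such a function once per generation and continue with whatever bounds and population it returns.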

  • Complex splitting of context-aware recommendations   Order a copy of this article
    by Shuxin Yang, Qiuying Peng, Le Chen 
    Abstract: Item splitting is an effective approach to improve the prediction accuracy of contextual recommendations. In this approach, an item is split into two items under two alternative contextual conditions, respectively. In this work, complex splitting is proposed to get more specialised rating data and further improve the accuracy of the recommendations. The key to the approach is to select multiple contextual conditions for splitting the user or item. We translate this into a combinatorial optimisation problem over contextual conditions, solved with a discrete binary particle swarm optimisation algorithm. The item or user is split into two different items or users according to the contextual conditions in the optimal combination: one is rated in a context that meets all the contextual conditions of the best combination, and the other is rated in a context that does not. In this way, more specialised rating data can be obtained, which results in a more accurate recommendation when the data is input into the recommendation algorithm. We evaluate our algorithm using a real-world dataset, and the experimental results demonstrate its validity and reliability.
    Keywords: context-aware recommendation; complex splitting; particle swarm optimisation; collaborative filtering

  • MEMS-based seismic intensity instrument for earthquake early warning   Order a copy of this article
    by Wei Ding, Hao Wang 
    Abstract: Micro-electro-mechanical systems (MEMS)-based tri-axial seismometry can be employed to design intensity instruments and construct an early warning network. Using average period τc and peak amplitude of displacement Pd methods, the estimation relationships between seismometer and magnitude can be calculated. This work outlines the algorithm and first-hand experience using a low-cost MEMS-based seismometer in a seismic intensity instrument. τc and Pd methods are employed to calculate magnitude estimation and an STA/LTA trigger algorithm is used to identify a seismic event. A set of simulated earthquake wave experiments on a horizontal vibration table are used to assess the ability of P-wave identification. The results show that the accuracy of P-wave identification and seismic intensity judgement is over 95%, which proves the usability of improving implementations for the MEMS-based seismometer in earthquake early warning applications.
    Keywords: seismic intensity instrument; earthquake early warning; micro-electro-mechanical systems; P-wave identification.
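
The STA/LTA trigger mentioned in this abstract compares a short-term average of signal energy with a long-term average and declares an event when their ratio exceeds a threshold. A minimal sketch of the classic form follows; the window lengths, the threshold and the function name are illustrative assumptions rather than values from the paper.

```python
def sta_lta_trigger(samples, sta_len, lta_len, threshold):
    """Return the index of the first sample at which STA/LTA exceeds `threshold`.

    Both windows end at the current sample; `sta_len` must not exceed `lta_len`.
    Returns None if the ratio never exceeds the threshold.
    """
    energy = [s * s for s in samples]  # squared amplitude as the characteristic function
    for i in range(lta_len - 1, len(energy)):
        sta = sum(energy[i - sta_len + 1:i + 1]) / sta_len  # short-term average
        lta = sum(energy[i - lta_len + 1:i + 1]) / lta_len  # long-term average
        if lta > 0 and sta / lta > threshold:
            return i
    return None
```

The windows are recomputed at every sample for clarity; a real-time instrument would more likely update both averages recursively.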

  • An improved floc image segmentation algorithm based on Otsu and particle swarm optimisation   Order a copy of this article
    by Xin Xie, Huandong Xiong, Jianbin Wang, Nan Jiang, Fengping Hu 
    Abstract: Image segmentation algorithm research is of great significance in the process of flocs detection. This paper proposes an improved floc image segmentation algorithm based on particle swarm optimisation (PSO) and Otsu, which takes into account both the motion characteristics of flocs and the real-time requirements of water treatment. Our research process goes as follows. Firstly, a gray stretch technique is used to enhance the contrast between the flocs and the background. Then, the segmentation threshold is obtained by using the adaption characteristics of PSO. Finally, our algorithm uses morphological filtering, including the opening and closing operation, to handle the segmented flocs image. The purpose is to remove the edge fuzzy zone. Experiments show that the algorithm realises flocs image segmentation accurately and rapidly, which greatly simplifies the calculation of equivalent size and quantity of flocs.
    Keywords: Otsu; particle swarm optimisation; floc; image segmentation
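
For orientation, the Otsu criterion that the threshold search optimises can be sketched as below. Note the swap: this sketch searches all grey levels exhaustively, whereas the paper uses PSO to find the threshold. The function name and the flat-list image representation are assumptions made for the example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance.

    pixels: iterable of integer grey values in [0, levels).
    Pixels <= t form one class (e.g. background), pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0     # number of pixels in the lower class
    sum0 = 0   # sum of grey values in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

A PSO-based variant would treat `between` as the fitness of a candidate threshold instead of scanning every level.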

  • Application of interactive evolutionary strategy in fault-tolerant system capable of online self-repairing   Order a copy of this article
    by Xiaoyan Yang, Yuanxiang Li, Yala Tong 
    Abstract: An evolution mechanism is widely used in triple different-structure modular redundant (D-TMR) fault-tolerant systems to protect different modules of the system from single event upsets (SEU) implemented on SRAM-based FPGAs and to realise online self-repairing. How to enhance the efficiency and diversity of the modules generated by an evolution strategy is among the most commonly discussed topics. This paper puts forward a two-stage mutation evolution strategy (TMES) based on an improved virtual reconfigurable architecture platform to evolve a combinational logic circuit. In order to increase the scalability and improve the diversity of the generated circuit, an interactive evolution strategy and heterogeneous degree evaluation are introduced into TMES, and then interactive TMES (ITMES) and improved ITMES (IITMES) are presented. The proposed schemes are tested with the evolution of a two-bit multiplier, a three-bit multiplier, and a three-bit full adder. The obtained results demonstrate that the IITMES scheme has a better average generation and more circuit types among the successfully generated individuals than the other existing evolutionary strategies. This approach has the potential to strengthen the capability of online self-repair and reduce the possibility of errors occurring at the same time in the D-TMR system.
    Keywords: evolvable hardware, triple different-structure modular redundancy, VRA, interactive evolution strategies.

  • Building business process ontology based on concept hierarchy model   Order a copy of this article
    by Ying Huang, Xianwen He 
    Abstract: The agility and efficiency of business processes have great influence on the company's competitiveness. However, current detection of business process management system reveals the problem that it does not satisfy the customer requirements, because of a lack of sufficient semantic information between business processes. This paper proposes an approach for ontology extraction on business process by incorporating concept hierarchy as background knowledge. Incorporating the background knowledge in the procedure of the process ontology has two main advantages: (1) background knowledge accelerates the building process, thereby minimising the conversion cost; (2) background knowledge guides the extraction of knowledge that is hidden in databases. To validate the method given in this paper, we use part of the sales order processes from SAP reference process models to construct the business process ontology. The gold standard experiments show that this method can correctly and effectively construct a process ontology.
    Keywords: statistical modelling ontology; business process; concept hierarchy model; process ontology

  • Optimised placement of wireless sensor networks by evolutionary algorithm   Order a copy of this article
    by Kangshun Li, Zhichao Wen, Zhaopeng Wang, Shen Li 
    Abstract: In wireless sensor networks, the sensor nodes are small, functionally simple, densely distributed and limited in energy. In many applications, such as forest fire monitoring, the monitoring area is so large that manual deployment is not convenient, and the sensor nodes are randomly deployed from the air. In this situation, some sensor nodes are redundant; that is, switching on only a subset of the nodes is enough to cover the entire required monitoring area. From the perspective of resource saving, the number of active sensor nodes should be as small as possible while maintaining the network's connectivity and the specified coverage. Computing the set of sensor nodes that meets these requirements is called the problem of network coverage optimisation for wireless sensor networks; it is also called the problem of the minimum connected covering node set. The innovation points of this article are: (1) it analyses the deficiencies of traditional evolutionary algorithm fitness functions, puts forward an improved fitness function design scheme, and shows that it has an advantage in solving the problem of wireless sensor network coverage optimisation; (2) it applies the method of control variables to compare and analyse the influence of the various operations and parameter selections in the evolutionary algorithm on the optimisation results and performance, and then points out how to design algorithms that balance optimisation effect and performance.
    Keywords: wireless sensor network; evolutionary algorithm; coverage optimization

  • Multi-objective cluster head election in cluster-based wireless sensor networks   Order a copy of this article
    by Xiaoyu Hong, Ming Tao 
    Abstract: Owing to good coverage preservation, cluster-based wireless sensor networks (WSN) have been widely explored in the context of various potential applications. Yet, selecting the optimal candidates as the cluster heads in clusters that require complete coverage of the monitored area over long periods of time still remains a significant challenge, and needs to be reasonably solved to operate large-scale WSN systems in an optimal fashion. To address this issue, a multi-objective cluster head election scheme is proposed in this paper, in which, by taking the network coverage and load balance as the optimisation objectives and deducing the objective ranges, cluster head election is formulated as a multi-objective combinatorial optimisation problem. Then, by introducing the Metropolis rule of the simulated annealing algorithm, an improved particle swarm algorithm is developed to solve this problem, which restricts the position change between original and new particles in the iteration process and accelerates the convergence of the algorithm. Simulation and analytical results demonstrate the performance in terms of coverage optimisation and load balance.
    Keywords: multi-objective; cluster head election; cluster; WSN
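
The Metropolis rule borrowed from simulated annealing accepts an improving move unconditionally and a worsening move with probability exp(-Δ/T). A minimal sketch follows; how the paper couples this rule to particle position updates is not given in the abstract, so the function name and the idea of applying it to a particle's old and new cost are assumptions.

```python
import math
import random

def metropolis_accept(old_cost, new_cost, temperature, rng=random):
    """Decide whether to move from a solution with old_cost to one with new_cost.

    Improvements are always accepted; a worse candidate is accepted with
    probability exp(-(new_cost - old_cost) / temperature).
    """
    if new_cost <= old_cost:
        return True
    return rng.random() < math.exp(-(new_cost - old_cost) / temperature)
```

In a PSO loop, a particle would keep its previous position whenever `metropolis_accept` returns False for the candidate position, with `temperature` lowered over the iterations.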

  • A novel chaotic key-based algorithm for still images   Order a copy of this article
    by Jinglong Zuo 
    Abstract: The data transmission of still images has to satisfy two objectives, namely, reducing the volume of data so as to limit the load on public communication networks, and maximising the safety of the transmitted information. In order to satisfy the two objectives, this paper proposes a novel chaotic key-based algorithm (NCKBA). In contrast to the recently proposed chaotic key-based algorithm (CKBA), several enhancements are applied in this algorithm: the key size is increased to 128 bits, and the 1-D logistic map in the original CKBA is replaced by a piecewise linear chaotic map (PWLCM) to improve the balance property of the chaotic map. Experiments are carried out in C++, and the results show that our algorithm is efficient and secure, and greatly improves the compression ratio of the image while maintaining the performance of the encryption algorithm.
    Keywords: JPEG2000; NCKBA; Scalability; PWLCM; CBC; Sp-network
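
For readers unfamiliar with the piecewise linear chaotic map, a minimal sketch of its standard form is given below. The abstract does not say how NCKBA derives key bits from the map, so the `keystream_bits` helper and its "compare with 0.5" rule are purely hypothetical illustrations.

```python
def pwlcm(x, p):
    """One iteration of the piecewise linear chaotic map.

    x is the state in [0, 1); p is the control parameter in (0, 0.5).
    The map is symmetric about x = 0.5.
    """
    if x >= 0.5:
        x = 1.0 - x              # fold the upper half onto the lower half
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)

def keystream_bits(x0, p, n):
    """Hypothetical bit extraction: emit 1 when the iterate is at least 0.5."""
    bits = []
    x = x0
    for _ in range(n):
        x = pwlcm(x, p)
        bits.append(1 if x >= 0.5 else 0)
    return bits
```

Because the map is piecewise linear with a uniform invariant density, it is often preferred over the logistic map when balanced output is required, which matches the motivation stated in the abstract.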

  • An improved adaptive selection search for block motion estimation   Order a copy of this article
    by Fu Mo, E-Liang Chen 
    Abstract: Motion estimation has an important effect in H.265 video coding systems because it occupies a large amount of time in the encoding system. So the quality of motion search selection affects the entire encoding efficiency directly. In this paper, a novel search algorithm that uses an adaptive hexagon and small diamond search is proposed to overcome the drawbacks of the traditional block matching selection implemented in most current video coding standards. The adaptive search algorithm is chosen according to the motion strength of the current block. When the block is in active motion, the hexagon search provides an efficient search method; when the block is inactive, the small diamond search is adopted. Experimental simulation results showed that our adaptive search algorithm can speed up the search process with little effect on distortion performance compared with other traditional approaches.
    Keywords: adaptive hexagon search; diamond search; motion estimation; motion strength.

  • The development and application of the ontology for tractor fault diagnosis   Order a copy of this article
    by Chunyin Wu, Qing Ouyang, Shouhua Yu, Chengjian Deng, Xiaojuan Mao, Tiansheng Hong 
    Abstract: This paper develops an ontology for tractor fault diagnosis, and designs the algorithms for reasoning and explanation. Furthermore, it applies the ontology to develop an expert system for tractor fault diagnosis. First of all, the ontology is developed with the method of fault tree analysis, which supports bidirectional reasoning. Then, the algorithm of reasoning is designed based on connected tractor fault trees, which is actually a directed acyclic graph structure in this case. Meanwhile, the algorithm of explanation is designed based on backward reasoning, which could provide explanations for the deduced results. Finally, the ontology-based expert system is designed and implemented with web technologies. The expert system could provide service for tractor fault diagnosis via internet to tractor maintenance personnel and tractor drivers located in the wide rural areas in China.
    Keywords: ontology, expert system, tractor, fault diagnosis, fault tree analysis, web, agricultural machinery, China

  • Research on modified teaching-learning-based optimisation algorithm for estimating parameters of Van Genuchten equation   Order a copy of this article
    by Fahui Gu 
    Abstract: In this paper, a modified teaching-learning-based optimisation (MTLBO) algorithm is proposed for estimating parameters of the Van Genuchten equation. A self-adaptive mutation strategy is used to overcome local convergence of the teaching-learning-based optimisation (TLBO). The proposed method is applied to a Matlab test system and the simulation results are compared with those of other algorithms, which verifies the superiority of the proposed method. It shows that the MTLBO algorithm is an effective method for estimating parameters of the Van Genuchten equation.
    Keywords: Van Genuchten equation; modified teaching-learning-based optimisation algorithm; estimating parameters; self-adaptive mutation strategy

Special Issue on: Computational Imaging and Multimedia Processing

  • Underwater image segmentation based on fast level set method   Order a copy of this article
    by Yujie Li, Huiliang Xu, Yun Li, Huimin Lu, Seiichi Serikawa 
    Abstract: Image segmentation is a fundamental process in image processing that has found application in many fields, such as neural image analysis and underwater image analysis. In this paper, we propose a novel fast level set method (FLSM)-based underwater image segmentation method that improves on traditional level set methods by avoiding the calculation of the signed distance function (SDF). The proposed method speeds up the computation by removing the need for re-initialisation. We also provide a fast semi-implicit additive operator splitting (AOS) algorithm to reduce the computational complexity. The experiments show that the proposed FLSM performs well in selecting local or global segmentation regions.
    Keywords: underwater imaging; level set; image segmentation

  • Pseudo Zernike moments based approach for text detection and localisation from lecture videos   Order a copy of this article
    by Soundes Belkacem, Larbi Guezouli, Samir Zidat 
    Abstract: Text information embedded in videos is an important clue for retrieval and indexation of images and videos. Scene text presents challenging characteristics mainly related to acquisition circumstances and environmental changes, resulting in low-quality videos. In this paper, we present a scene text detection algorithm based on Pseudo Zernike Moments (PZMs) and stroke features from low-resolution lecture videos. The algorithm mainly consists of three steps: slide detection, text detection and segmentation, and non-text filtering. In lecture videos, the slide region is a key object carrying almost all the important information; hence the slide region has to be extracted and segmented from other scene objects, which are considered as background for later treatments. Slide region detection and segmentation is done by applying PZMs to RGB frames. Text detection and extraction is performed using PZM segmentation over the V channel of the HSV colour space, and then the stroke feature is used to filter out non-text regions and remove false positives. PZMs are powerful shape descriptors; they present several strong advantages such as robustness to noise, rotation invariance, and multilevel feature representation. The PZM-based segmentation process consists of two steps: feature extraction and clustering. First, a video frame is partitioned into equal-size windows, then the coordinates of each window are normalised to a polar system, then PZMs are computed over the normalised coordinates as region descriptors. Finally, a clustering step using K-means is performed, in which each window is labelled as a text or non-text region. The algorithm is shown to be robust to illumination, low resolution and uneven luminance from compressed videos. The effectiveness of the PZM description leads to very few false positives compared with other approaches. Moreover, the resultant images can be used directly by OCR engines and no further processing is needed.
    Keywords: text localisation, text detection, pseudo Zernike moments, slide region detection.

  • Tracking multiple targets based on min-cost network flows with detection in RGB-D data   Order a copy of this article
    by Mingxin Jiang 
    Abstract: Visual multi-target tracking is a challenging problem in computer vision. This study proposes a novel approach for multi-target tracking based on min-cost network flows in RGB-D data with a tracking-by-detection scheme. Firstly, the moving objects are detected by fusing RGB information and depth information. Then, we formulate the multi-target tracking problem as a maximum a posteriori (MAP) estimation problem with specific constraints, and the problem is converted into a cost-flow network. Finally, using a min-cost flow algorithm, we obtain the tracking results. Extensive experimental results show that the proposed algorithm greatly improves the robustness and accuracy of tracking and outperforms the state of the art significantly.
    Keywords: combined multi-target detection, min-cost network flows, MAP, RGB-D sensor.

Special Issue on: Big Data-oriented Science, Technologies and Applications

  • Time constraint influence maximisation algorithm in the age of big data   Order a copy of this article
    by Meng Han, Zhuojun Duan, Chunyu Ai, Forrest Wong Lybarger, Yingshu Li 
    Abstract: The new generation of social networks contains billions of nodes and edges. Managing and mining this data is a new academic and industrial challenge. Influence maximisation is the problem of finding a set of nodes in a social network that results in the highest amount of influence diffusion. Independent Cascade (IC) and Linear Threshold (LT) are two classical approaches that model the influence diffusion process in social networks. Many previous works build on IC and LT, but they focus exclusively on the efficiency of algorithms and overlook features of social network data itself, such as time sensitivity and practicality at large scale. Although much research has been done on this topic, including on the hardness of the problem itself (computing the influence spread for a given seed set is #P-hard), most of the literature cannot handle real large-scale social data. Furthermore, the new era of 'big data' is changing dramatically right before our eyes: the growth of big data gives researchers many challenges as well as opportunities. As more and more data is generated from social networks in this new age, this paper proposes two new models, TIC and TLT, which incorporate the dynamism of networks by considering the time constraint on the influence spreading process in practice. To address the challenge of large-scale data, we take a first step by designing an efficient influence maximisation framework based on the proposed models, and a systematic theoretical analysis shows that the proposed algorithms achieve provable approximation guarantees. We also applied our models to the most notable big data frameworks, Hadoop and Spark. Empirical studies on different synthetic and real large-scale social networks demonstrate that our model, together with solutions on both platforms, provides better practicality and offers a regulatory mechanism for enhancing influence maximisation. It also outperforms most existing alternative algorithms.
    Keywords: influence maximisation; cloud computing; data mining; data modelling.
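
As background, the classical Independent Cascade process can be simulated in a few lines. The sketch below simply truncates the cascade after a fixed number of rounds to illustrate what a time constraint might look like; the abstract does not define TIC, so this truncation, the uniform activation probability and the function name are all assumptions rather than the authors' model.

```python
import random

def simulate_ic(graph, seeds, prob, max_steps, rng):
    """Run one Independent Cascade simulation with a deadline.

    graph:     dict mapping a node to the list of its out-neighbours
    seeds:     initially active nodes
    prob:      activation probability applied to every edge (assumed uniform)
    max_steps: number of diffusion rounds allowed (the assumed time constraint)
    rng:       a random.Random instance, for reproducibility
    Returns the set of nodes active when the deadline is reached.
    """
    active = set(seeds)
    frontier = list(seeds)
    for _ in range(max_steps):
        newly_active = []
        for u in frontier:
            for v in graph.get(u, []):
                # Each newly active node gets a single chance per inactive neighbour.
                if v not in active and rng.random() < prob:
                    active.add(v)
                    newly_active.append(v)
        if not newly_active:
            break
        frontier = newly_active
    return active
```

Averaging `len(simulate_ic(...))` over many runs gives a Monte Carlo estimate of the influence spread of a seed set, which a greedy seed selection procedure would then maximise.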

  • Multi-criteria decisional approach for extracting relevant association rules   Order a copy of this article
    by Addi Ait-Mlouk, Fatima Gharnati, Tarik Agouti 
    Abstract: Association rule mining plays a vital role in knowledge discovery in databases. The difficult task is mining useful and non-redundant rules: in most cases, real datasets lead to a huge number of rules, which does not allow users to select the most relevant ones themselves. Several techniques have been proposed, such as rule clustering, the informative cover method, and quality measurements. Another way is to select relevant association rules, and we believe it is necessary to integrate a decisional approach within the knowledge discovery process. To solve the problem, we propose an approach to discover a category of relevant association rules based on multi-criteria analysis, using association rules as actions and quality measurements as criteria. Finally, we conclude our work with an empirical study to illustrate the performance of our proposed approach.
    Keywords: data mining; knowledge discovery in database; association rules; quality measurements; multi-criteria analysis; decision-making system; ELECTRE TRI

  • Analysing user retweeting behaviour on microblogs: prediction model and influencing features   Order a copy of this article
    by Chenglong Lin, Yanyan Li, Ting-Wen Chang, Kinshuk  
    Abstract: This paper explores the feasibility of predicting users' retweeting behaviour and ranks the features that influence that behaviour. Four first-dimension features, namely author, text, recipient and relationship, are extracted and split into 39 second-dimension features. This study then applies a support vector machine (SVM) to build the prediction model. Data samples extracted from the Sina Microblog platform are subsequently used to evaluate this prediction model and rank the 39 second-dimension features. The results show that the recall rate of this model is 58.67%, the precision rate is 82.19%, and the F1 value is 68.46%, which indicates that the performance of the prediction model is highly satisfactory. Moreover, the ranking results highlight four features that affect the retweeting behaviour of users: the active degree of the microblog author, the similarity of interests between the author and the recipient, the active degree of the microblog recipient, and the similarity between the theme of the microblog and the recipient's interest.
    Keywords: microblog; retweeting behaviour; prediction model; influence ranking; support vector machine; information gain
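
The reported F1 value can be checked against the reported precision and recall, since F1 is their harmonic mean. A small sketch (the function name is ours):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision 82.19% and recall 58.67%, as reported in the abstract.
f1 = f1_score(0.8219, 0.5867)
print(f"F1 = {f1:.2%}")
```

The result agrees with the reported 68.46% to within about 0.01 percentage points; a gap of that size is what rounding of the published precision and recall would produce.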

  • System architecture of coastal remote sensing data mining and services based on cloud computing   Order a copy of this article
    by Xuerong Li, Xiujuan Wang, Lingling Wu 
    Abstract: Coastal remote sensing images have become big data, characterised by volume, variety, complexity and specialisation. The ability to effectively integrate these massive remote sensing data, and to rapidly extract and mine knowledge and information from them, lags far behind the requirements of professional coastal applications. In this paper, drawing on data mining, remote sensing theory, spatial information and cloud computing technology, and aiming at a service system for coastal zone remote sensing data integration and data mining, we design a meta-data model, a data storage model, a data mining framework and a web service framework model, among others. Finally, a prototype system of remote sensing data mining services in the cloud computing environment is designed and developed using system integration, and is demonstrated and verified through professional applications. It is valuable for the development of coastal zone monitoring, planning, integrated management and other fields.
    Keywords: remote sensing image; data mining; system architecture; cloud computing; coastal zone

  • Collating multisource geospatial data for vegetation detection using Bayesian network: a case study of Yellow River Delta   Order a copy of this article
    by Dingyuan Mo, Liangju Yu, Meng Gao 
    Abstract: Multisource geospatial data contains a lot of information that can be used for environment assessment and management. In this paper, four environmental indicators that represent typical human activities in Yellow River Delta, China are extracted from multisource geospatial data. By analysing the causal relationship between these human-related indicators and NDVI, a Bayesian Network (BN) model is developed. Part of the raster data pre-processed using GIS is used for training the BN model, and the other data is used for a model test. Sensitivity analysis and performance assessment showed that the BN model was good enough to reveal the impacts of human activities on land vegetation. With the trained BN model, the vegetation change under three different scenarios was also predicted. The results showed that multisource geospatial data could be successfully collated using the GIS-BN framework for vegetation detection.
    Keywords: GIS; NDVI; human activity; oil exploitation; urbanisation; road construction; Bayesian network.

  • A comparative study on disease risk model in exploratory spatial analysis   Order a copy of this article
    by Zhisheng Zhao, Yang Liu, Jing Li, Junhua Liang, Jiawei Wang 
    Abstract: The present work mainly focuses on the issue of risk models in spatial data analysis. Through the analysis of morbidity data of influenza A (H1N1) across China's administrative regions from 2009 to 2012, a comparative study was carried out among the Poisson model, the Poisson-Gamma model, the log-normal model, the EB estimator of moment and the Bayesian hierarchical model. Using the R programming language, the feasibility of the above analysis methods was verified and the variability of the estimated values generated by each model was calculated. The Bayesian model for spatial disease analysis was improved, and the estimator considering the uncorrelated spatial model, the correlated spatial model and covariate factors was shown to be the best by comparing the DIC values of the models. By using the Markov chain for simulative iteration, iterative convergence was illustrated by graphs of the iteration track, autocorrelation function, kernel density and quantile estimation. The research on spatial variability of disease morbidity is helpful in detecting the epidemic area and forewarning the pathophoresis of prospective epidemic disease.
    Keywords: spatial disease analysis; Bayesian hierarchical model; Poisson-Gamma model; EB estimator of moment; R programming

  • A robust video watermarking scheme using sparse principal component analysis and wavelet transform   Order a copy of this article
    by Shankar Thirunarayanan, Yamuna Govindarajan 
    Abstract: The spread of internet facilities has made digital data such as audio, images and videos easily available to the general public. Watermarking is a technique developed to secure such data and to protect the copyright of digital content. This paper proposes a blind scheme for digital video watermarking. A discrete wavelet domain watermarking is adopted for hiding a large amount of data with high security, good invisibility and no loss to the secret message. First, Dual Tree Complex Wavelet Transform (DTCWT) is applied to each frame, decomposing it into a number of sub-bands. Then, the holo entropy of each sub-band is calculated and the maximum entropy blocks are selected. The selected blocks are transformed using sparse principal component analysis (SPCA). The maximum coefficient of the SPCA blocks of each sub-band is quantised using Quantization Index Modulation (QIM). The watermark bit is embedded into the appropriate quantiser values. The same steps are repeated in the extraction process. The proposed video watermarking scheme is evaluated using measures such as the Normalised Correlation (NC) and the Peak Signal to Noise Ratio (PSNR), and the embedding quality is maintained with an average PSNR value of 53 dB.
    Keywords: watermarking; quantisation; SPCA; holo entropy; dual tree complex wavelet transform

  • Optimisation for video watermarking using ABC algorithm   Order a copy of this article
    by Sundararajan Madhavan, Yamuna Govindarajan 
    Abstract: Video watermarking is a relatively innovative tool that has been proposed to solve the problem of illegal manipulation and sharing of digital video. It is the process of embedding copyright information into video. In this paper, an Artificial Bee Colony (ABC) algorithm is used to find the frame and location for embedding a gray-scale image into the video; it searches for the scene and location into which a particular part of the watermark is best embedded. The number of shot frames and locations is identified using the ABC algorithm. Once the best frame and locations are identified, the embedding and extraction procedure is carried out. The performance of the proposed algorithm is compared with the existing technique using PSNR and NC. This technique is tested against different attacks and the results obtained are encouraging.
    Keywords: digital video watermarking; discrete wavelet transform; artificial bee colony; peak signal to noise ratio; normalised correlation.

  • DWT based gray-scale image watermarking using area of best fit equation and cuckoo search algorithm   Order a copy of this article
    by Sundararajan Madhavan, Yamuna Govindarajan 
    Abstract: This manuscript explains the salient features of a recently presented Nature Inspired Algorithm (NIA) for the improvement of digital image watermarking used in copyright protection. In the embedding process, the gray image is divided into four sub-bands using the discrete wavelet transform (DWT), and the desired sub-bands are selected. In the two selected sub-bands (LH, HL), a mathematical equation of area of best fit is applied. The cuckoo search algorithm is used to identify optimal positions in the DWT domain for watermark enclosure in the binary image. The results show the superiority of this algorithm for watermarking techniques aimed at copyright protection: with the lowest effect on the PSNR values, the optimum positions are obtained even for the watermarked images.
    Keywords: cuckoo search algorithm; area of the best fit equation; gray-scale image watermarking; discrete wavelet transform.

  • Term extraction and correlation analysis based on massive scientific and technical literature   Order a copy of this article
    by Wen Zeng 
    Abstract: Scientific and technical terms are the basic units of knowledge discovery and organisation construction. Correlation analysis is one of the important technologies for the deep data mining of massive, different scientific and technical literature. Based on freely available digital library resources, this study adopts natural language processing technology to analyse the linguistic characteristics of terms, and combines it with statistical analyses to extract the terms from scientific and technical literature. Using the results of term extraction, the paper proposes an improved VSM algorithm for correlation calculation to analyse different scientific and technical literature. According to the experimental results, the approach offers a new way to automatically extract terms and realise correlation analysis for different sources of massive scientific and technical literature. Our method is superior to the method that does not adopt linguistic rules and MI calculation. The accuracy of terms is about 73.5%. Compared with the traditional VSM based on terms, the correct rate of correlation calculation is increased by 12%.
    Keywords: term extraction; correlation analysis; scientific and technical literature; knowledge discovery and organization; big data.

  • Hybrid fuzzy collaborative filtering: an integration of item-based and user-based clustering techniques   Order a copy of this article
    by Pratibha Yadav, Shweta Tyagi 
    Abstract: Collaborative filtering is the most widely adopted technique in recommender systems, presenting individualised information based on the analysis of users' past behaviour and selections. In the literature, numerous collaborative filtering approaches have been put forward. Clustering is one of the successful approaches of the model-based collaborative filtering techniques that deals with the problem of sparsity and provides quality recommendations. The problem with the clustering approach is the fact that it imposes a unique membership constraint on the users/items. This issue is addressed in the literature by employing fuzzy c-means clustering, a soft clustering technique which allows an element to belong to more than one cluster. Traditionally, the fuzzy c-means clustering technique is adopted with collaborative filtering to first produce item-based fuzzy clusters and then to generate recommendations. In the proposed work, the fuzzy c-means clustering technique is adopted in order to produce item-based clusters as well as user-based clusters. Subsequently, the collaborative filtering technique explores the item-based and user-based clusters and generates the list of item-based and user-based predictions, respectively. Further, to enhance the quality of recommendations, a novel weighted hybrid scheme is designed which integrates the user-based and item-based predictions to capture the influence of each active user towards item-based and user-based predictions. The proposed schemes are further categorised on the basis of re-clustering and without re-clustering under different similarity measures over sparse and dense datasets. The experimental results reveal that the variants of the proposed hybrid schemes consistently generate better results in comparison with the corresponding variants of proposed user-based schemes and the traditional item-based schemes.
    Keywords: recommender system; collaborative filtering; fuzzy C-means clustering; sparsity

  • Towards patent text analysis based on semantic role labelling   Order a copy of this article
    by Yanqing He, Ying Li, Ling'en Meng, Hongjiao Xu 
    Abstract: Mining patent texts can yield valuable technical information and competitive intelligence, which is important for the development of technology and business. The current patent text-mining approaches suffer from a lack of effective, automatic, accurate and wide-coverage techniques that can annotate natural language texts with semantic argument structure. It is helpful for text mining to derive a more meaningful semantic relationship from semantic role labelling (SRL) results of patents. This paper uses Word2Vec to learn real-valued word vectors and designs features related to word vectors to train an SRL parser. Based on the SRL parser, two patent text mining methods are then given: patent topic extraction and automatic construction of a patent technical effect matrix (PTEM). Experiments show that semantic role labelling helps to achieve satisfactory results and saves manpower.
    Keywords: patent technical effect matrix; semantic role labeling; IPC; patent analysis; word vector; patent topic extraction; semantic analysis; text mining

  • Efficient attribute selection strategies for association rule mining in high dimensional data   Order a copy of this article
    by Sandhya Harikumar, Divya Usha Dilipkumar, M. Ramachandran Kaimal 
    Abstract: This paper presents a new computational approach to discover interesting relations between variables, called association rules, in large and high dimensional datasets. State of the art techniques are computationally expensive for reasons such as high dimensions, generation of a huge number of candidate sets, and multiple database scans. In general, most of the enormous number of discovered patterns are obvious, redundant or uninteresting to the user. The aim of this paper is therefore to improve the Apriori algorithm to find association rules pertaining to only important attributes from high dimensional data. We employ an information theoretic method together with the concept of QR decomposition to represent the data in its proper substructure form without losing its semantics. Specifically, we present a feature selection approach based on an entropy measure which is leveraged into the process of QR decomposition for finding significant attributes. This helps in expressing the dataset in compact form by projecting into different subspaces. The association rule mining based on these significant attributes leads to improvement of the traditional Apriori algorithm in terms of candidate set generation and rules mined, as well as time complexity. Experiments on real datasets and comparison with the existing technique reveal that the proposed strategy is always computationally faster than, and statistically comparable with, the classic algorithms.
    Keywords: association rule mining; Apriori algorithm; entropy; QR decomposition.
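
As a rough illustration of the entropy-based attribute ranking mentioned in the abstract above, the following Python sketch scores each attribute by the Shannon entropy of its observed values. It is not the authors' method: the QR decomposition stage and the Apriori mining are omitted, and the function names, the toy rows and the attribute names are all hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def rank_attributes_by_entropy(rows, names):
    """Return (name, entropy) pairs sorted from highest to lowest entropy."""
    columns = list(zip(*rows))  # transpose rows into per-attribute columns
    scored = [(name, shannon_entropy(col)) for name, col in zip(names, columns)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy dataset: one tuple per record, one position per attribute.
rows = [
    ("yes", "a", "x"),
    ("yes", "b", "x"),
    ("no", "c", "x"),
    ("no", "d", "x"),
]
names = ["binary_attr", "id_like_attr", "constant_attr"]

ranking = rank_attributes_by_entropy(rows, names)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

A constant attribute scores zero bits and can never appear in an informative rule, so it could be dropped before candidate generation. How an entropy score should be thresholded, or combined with other criteria, is a design choice this sketch leaves open.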

Special Issue on: Advanced Cooperative Computing

  • Towards optimisation of replicated erasure codes for efficient cooperative repair in cloud storage systems   Order a copy of this article
    by Guangping Xu, Qunfang Mao, Huan Li 
    Abstract: The study of erasure codes in distributed storage systems has two aspects: one is to reduce the data redundancy and the other is to save the bandwidth cost during the repair process. Repair-efficient codes are investigated to improve the repair performance. However, the research is mostly at the theoretical stage and is hardly applied in practical distributed storage systems such as cloud storage. In this paper, we present a unified framework to describe some repair-efficient regenerating codes in order to reduce the bandwidth cost in regenerating the lost data. We build an evaluation system to measure the performance of these codes during file encoding, file decoding and individual failure repairing with given feasible parameters. Through experimental comparison and analysis, we validate that the repair-efficient regenerating codes can save significantly more repair time than traditional erasure codes during the repair process at the same storage cost; in particular, some replication-based erasure codes can perform better than others in some cases. Our experiments can help researchers to decide which kind of erasure codes to use in building distributed storage systems.
    Keywords: erasure codes; distributed storage systems; data recovery; repair-efficient codes

  • Signal prediction based on boosting and decision stump   Order a copy of this article
    by Lei Shi 
    Abstract: Signal prediction has attracted more and more attention from the data mining and machine learning communities. A decision stump is a one-level decision tree, and it classifies instances by sorting them based on feature values. Boosting is a powerful ensemble method and can improve the performance of prediction significantly. In this paper, boosting and the decision stump algorithm are combined to analyse and predict the signal data. An experimental evaluation is carried out on a public signal dataset and the experimental results show that the boosting and decision stump-based algorithm improves the performance of signal prediction significantly.
    Keywords: decision stump; boosting; signal prediction

  • A matching approach to business services and software services   Order a copy of this article
    by Junfeng Zhao 
    Abstract: Recent studies have shown that Service-Oriented Architecture (SOA) has the potential to revive enterprise legacy systems [1-10], making their continued service in the corporate world viable. In the process of reengineering legacy systems to SOA, some software services extracted from legacy systems can be reused to implement business services in target systems. In order to achieve efficient reuse of software services, a matching approach is proposed to extract the software services related to specified business services, where service semantics and structure similarity measures are integrated to evaluate the similarity degree between business services and software services. Experiments indicate that the approach can efficiently map business services to relevant software services, so that legacy systems can be reused as much as possible.
    Keywords: software service; business service; matching approach; semantics similarity measure; structure similarity measure

  • A new model of vehicular ad hoc networks based on artificial immune theory   Order a copy of this article
    by Yizhe Zhou, Depin Peng 
    Abstract: Vehicular ad hoc networks (VANETs) are highly mobile and wireless networks intended to aid vehicular safety and traffic monitoring. To achieve these goals, we propose a VANET model based on immune network theory. Our model outperforms the Delay Tolerant Mobility Sensor Network (DTMSN) model over a range of node numbers in terms of data packet arrival delay, arrival ratio, and throughput. These findings held true for the on-demand distance vector and connection-based restricted forwarding routing protocols. The model performed satisfactorily on a real road network.
    Keywords: networking model; vehicular ad hoc networks; artificial immune theory; real-time capacity.

  • Feature binding pulse-coupled neural network model using a double color space   Order a copy of this article
    by Hongxia Deng, Han Li, Sha Chang, Jie Xu, Haifang Li 
    Abstract: The feature binding problem is one of the central issues in cognitive science and neuroscience. To implement a bundled identification of the colour and shape of a colour image, a double-space vector feature binding PCNN (DVFB-PCNN) model was proposed based on the traditional pulse-coupled neural network (PCNN). In this model, the method of combining RGB colour space with HSI colour space successfully solved the problem that all colours cannot always be separated completely. Through the first pulse emission time of the neurons, the different characteristics were separated successfully. Through the colour sequence produced by this process, the different characteristics belonging to the same perceived object were bound together. Experiments showed that the model can successfully achieve separation and binding of image features and will be a valuable tool for PCNN in the feature binding of colour images.
    Keywords: feature binding; double-space; pulse emission time.

  • Using online dictionary learning to improve Bayer pattern image coding   Order a copy of this article
    by Tingyi Zheng, Li Wang 
    Abstract: Image quality is a fundamental concern in image compression. There is a lot of noise in the image compression process, which may prevent users from identifying image content precisely. Noise has nevertheless been neglected in past research on image compression. In fact, noise plays a beneficial role in image reconstruction. In this paper, we take noise into consideration and propose a coding method for Bayer pattern images based on online dictionary learning. Experiments show that the proposed method can improve the rate-distortion performance of Bayer pattern image coding at any rate.
    Keywords: Bayer pattern image; online dictionary learning; rate distortion.

Special Issue on: ICNC-FSKD'15 Machine Learning, Data Mining and Knowledge Management

  • An improved ORNAM representation of gray images   Order a copy of this article
    by Yunping Zheng, Mudar Sarem 
    Abstract: An efficient image representation can save space and facilitate the manipulation of the acquired images. In order to further enhance the reconstructed image quality and reduce the number of homogeneous blocks of the overlapping rectangular non-symmetry and anti-packing model (ORNAM) representation, in this paper we propose an improved overlapping rectangular non-symmetry and anti-packing model representation (IORNAM) of gray images. Compared with most of the up-to-date and state-of-the-art hierarchical representation methods, the new IORNAM representation is characterised by two properties. (1) It adopts a ratio parameter of the length and the width of a homogeneous block to improve the reconstructed image quality. (2) It uses a new expansion method to anti-pack the subpatterns of gray images to further decrease the number of homogeneous blocks, which is important for improving the compression ratios of image representation and reducing the complexities of many image manipulation algorithms. The experimental results presented in this paper demonstrate that (1) the new IORNAM representation is able to achieve high representation efficiency for gray images and (2) the new IORNAM representation outperforms most of the up-to-date and state-of-the-art hierarchical representation methods of gray images.
    Keywords: gray image representation; extended Gouraud shading approach; overlapping rectangular NAM; ORNAM; spatial data structures; S-Tree coding; spatial- and DCT-based.

  • Genetic or non-genetic prognostic factors for colon cancer classification   Order a copy of this article
    by Meng Pan, Jie Zhang 
    Abstract: Many studies have addressed patient classification using prognostic factors or gene expression profiles (GEPs). This study tried to identify whether a prognostic factor was genetic by using GEPs. If a significant GEP difference was observed between two statuses of a factor, the factor might be genetic. If the GEP difference was much less significant than the survival difference, the survival difference might not be due to the genes; then, the factor might be non-genetic or partly non-genetic. A case study was carried out using the public dataset GSE40967, which contains GEP data of 566 colon cancer patients, information on tumor-node-metastasis (TNM) staging, etc. Prognostic factors T, N, M, and TNM were observed to be non-genetic or partly non-genetic, and should therefore complement future gene expression classifiers.
    Keywords: gene expression profiles; prognostic factor; colon cancer; classification; survival

  • A medical training system for the operation of heart-lung machine   Order a copy of this article
    by Ren Kanehira 
    Abstract: There has been a strong tendency to use Information Communication Technology (ICT) to construct various education/training systems to help students or other learners master necessary skills more easily. Among such systems, those that provide operational practice are particularly welcome, in addition to conventional e-learning systems that mainly convey textbook-like knowledge. In this study, we propose a medical training system for the operation of a heart-lung machine. Two training contents, one for basic operation and another for troubleshooting, are constructed in the system and their effects are tested.
    Keywords: computer-aided training; skill science; medical training; heart-lung machine; operation supporting; e-learning; clinic engineer.

Special Issue on: BDA 2014 and 2015 Conferences and DNIS 2014 and 2015 Workshops Data Modelling and Information Infrastructure in Big Data Analytics

  • Automatic identification and classification of Palomar Transient Factory astrophysical objects in GLADE   Order a copy of this article
    by Weijie Zhao, Florin Rusu, John Wu, Peter Nugent 
    Abstract: Palomar Transient Factory is a comprehensive detection system for the identification and classification of transient astrophysical objects. The central piece in the identification pipeline is represented by an automated classifier that distinguishes between real and bogus objects with high accuracy. The classifier consists of two components, namely real-time and offline. Response time is the critical characteristic of the real-time component, whereas accuracy is representative for the offline in-depth analysis. In this paper, we make two significant contributions. First, we present an experimental study that evaluates a novel implementation of the real-time classifier in GLADE, a parallel data processing system that combines the efficiency of a database with the extensibility of Map-Reduce. We show how each stage in the classifier - candidate identification, pruning, and contextual realbogus - maps optimally into GLADE tasks by taking advantage of the unique features of the system: range-based data partitioning, columnar storage, multi-query execution, and in-database support for complex aggregate computation. The result is an efficient classifier implementation capable of processing a new set of acquired images in a matter of minutes, even on a low-end server. For comparison, an optimised PostgreSQL implementation of the classifier takes hours on the same machine. Second, we introduce a novel parallel similarity join algorithm for advanced transient classification. This algorithm operates offline and considers the entire candidate dataset consisting of all the objects extracted over the lifetime of the Palomar Transient Factory survey. We implement the similarity join algorithm in GLADE and execute it on a massive supercomputer with more than 3000 threads. We achieve more than three orders of magnitude improvement over the optimised PostgreSQL solution.
    Keywords: parallel databases; multi-query processing; scientific data analysis; similarity join; astronomical surveys; transient identification

  • Trust and reputation based multi-agent recommender system   Order a copy of this article
    by Punam Bedi, Sumit Agarwal, Richa Singh 
    Abstract: User profile modelling for the domain of tourism is different from most of the other domains, such as books or movies. The structure of a tourist product is more complex than a movie or a book. Moreover, the frequency of activities and ratings in the tourism domain is also smaller than the other domains. To address these challenges, this study proposes a Trust and Reputation based Collaborative Filtering (TRbCF) algorithm. It augments a notion of dynamic trust between users and reputation of items to an existing collaborative approach for generating relevant recommendations. A Multi-Agent Recommender System for e-Tourism (MARST) for recommending tourism services using the TRbCF algorithm is designed and a prototype is developed. TRbCF also helps to handle the new user cold-start problem. The developed system can generate recommendations for hotels, places to visit and restaurants at a single place, whereas most of the existing recommender systems focus on one service at a time.
    Keywords: multi-agent system; recommender system; e-tourism; trust; reputation

  • Anomaly-free search using multi-table entity attribute value data model   Order a copy of this article
    by Shivani Batra, Shelly Sachdeva 
    Abstract: This paper proposes a principled extension of Dynamic Tables (DT). It is termed the Multi-Table Entity Attribute Value (MTEAV) model, which offers a search-efficient avenue for storing a database. The paper presents precise semantics of MTEAV and demonstrates the following aspects: (1) MTEAV possesses consistency and availability; (2) MTEAV outperforms other existing models (Entity Attribute Value Model, Dynamic Tables, Optimized Entity Attribute Value and Optimized Column Oriented Model) under various query scenarios and varying dataset sizes; (3) MTEAV retains the flavour of EAV in terms of handling sparseness and self-adapting schema-changing capability. To heighten the adaptability of MTEAV, a translation layer is implemented over an existing SQL engine in a non-intrusive way. The translation layer transforms a conventional SQL query (as per horizontal row representation) to a new SQL query (as per MTEAV structure) to maintain user friendliness. The translation layer makes users feel as if they are interacting with the conventional horizontal row approach. The paper also critically analyses the maximum percentage of non-null density appropriate for choosing MTEAV as a storage option.
    Keywords: database; dynamic tables; entity attribute value model; optimised entity attribute value; optimised column-oriented model; search efficiency; storage efficiency.

  • Secure k-objects selection for a keyword query based on MapReduce skyline algorithm   Order a copy of this article
    by Asif Zaman, Md. Anisuzzaman Siddique, Annisa, Yasuhiko Morimoto 
    Abstract: The keyword query interface has become a de-facto standard in information retrieval and such systems have been used by the community for decades. The user gives a keyword, and objects that are closely related to that keyword are returned to the user. The process of selecting necessary objects for a keyword query has been considered one of the most important query problems. The top-k query is one of the popular methods to select important objects from a large number of candidates. A user specifies a scoring function and k, the number of objects to be retrieved. Based on the user's scoring function, k objects are then selected by the top-k query. However, users' scoring functions may not be identical, which implies that the top-k objects are valuable only for users whose scoring functions are similar. Meanwhile, the privacy of data during the selection processing is also a burning issue. In some cases, especially in multi-party computing, parties may not want to disclose any information during the processing. In this paper, we propose a k-object selection procedure that selects various k objects that are preferable for all users whose scoring functions are not identical. During the selection of k objects, the proposed method prevents disclosure of sensitive values. The idea of skyline and top-k query along with perturbed cipher has been used to select the k objects securely. We realise this secure computation efficiently by using the MapReduce framework.
    Keywords: skyline query; top-k query; data privacy; MapReduce; mobile phone interface.

  • High performance adaptive traffic control for efficient response in vehicular ad hoc networks   Order a copy of this article
    by Vinita Jindal, Punam Bedi 
    Abstract: Nowadays, with the invention of CUDA, a parallel computing platform and programming model, there is a dramatic increase in computing performance by harnessing the power of the GPU. GPU computing with CUDA can be used to find efficient solutions for many real-world complex problems. One such problem is traffic signal control, which takes care of conflicting movements at intersections to avoid accidents and ensure smooth flow of traffic in a safe and efficient manner. The Adaptive Traffic Control (ATC) algorithm is used in the literature to reduce the average queue length at intersections. This algorithm has a serial implementation on a single CPU and hence takes a long computation time. In this paper, we propose a high performance ATC for providing efficient responses and hence reducing the average queue length, which results in a decrease in the overall waiting time at intersections. We tested our proposed approach with varying numbers of vehicles for two real world networks. The performance of the proposed algorithm is compared with its serial counterpart.
    Keywords: VANETs; GPU; CUDA; adaptive control; traffic signals.

  • Smart city workflow patterns for qualitative aggregate information retrieval from distributed public information resources   Order a copy of this article
    by Wanming Chu 
    Abstract: We examine a workflow pattern system for public information from multiple resources. This system aggregates timetable information from bus companies, city information from the internet, and the public facilities map of the city, to generate geographic data. Multiple query methods are used to obtain the target information. For example, one of the search results can be set as the origin or destination of a bus route. Next, the shortest bus route with the minimum number of bus stops between the origin and destination can be found by using the bus routing function. The query results and the shortest bus route are visualised on the embedded map. The detailed search information is shown in the side-bar. This system finds city information and transportation routes. It is helpful for residents and visitors. They can use the city public transportation more efficiently for their daily life, business, and travel planning.
    Keywords: GIS; query interface; routing query over heterogeneous information resources.

  • Computational intelligence methods for data mining of causality extent in time series   Order a copy of this article
    by Lukas Pichl, Taisei Kaizoji 
    Abstract: Data mining of causality extent in the time series of economic data is an important area of computational intelligence research with direct applications to algorithmic trading or risk diversification strategies. Based on the particular market and the time scale employed, the causal rates are expected to vary widely. In this work we adopt the Support Vector Machine (SVM) and Artificial Neural Network (ANN) for causality rate extraction. The dataset records all details of the futures contracts on the commodity of gasoline traded in Japan. By sampling the tick data at 1 min, 5 min, 10 min, 30 min, 1 hour and 1 day scales, we derive time series of varying causal degree. Trend predictions are computed by using the SVM binary classifier trained on 66.6% of the data using a five-step-back moving window, which samples the log returns as the predictor data. From the testing data we extract varying rates of causality degree, starting from the borderline of 50% up to the order of 60% in rare cases. The trend prediction analysis is complemented by the ANN method with four hidden layers. We find that whereas the SVM outperforms the ANN in most cases, the opposite may also be true on occasions. In general, whereas considerable causality rates are observed at some high-frequency sampled data segments, returns at the longer time scales are predictable to a lesser extent. Overall, the market of the gasoline futures in Japan is found to be rather close to the efficient market hypothesis in comparison with other commodities markets.
    Keywords: financial futures; artificial neural network; support vector machine; trend prediction; causality extraction.

  • A dataflow platform for applications based on linked data   Order a copy of this article
    by Miguel Ceriani, Paolo Bottoni 
    Abstract: Modern software applications increasingly benefit from accessing the multifarious and heterogeneous Web of Data, thanks to the use of web APIs and linked data principles. In previous work, the authors proposed a platform to develop applications consuming linked data in a declarative and modular way. This paper describes in detail the functional language the platform gives access to, which is based on SPARQL (the standard query language for linked data) and on the dataflow paradigm. The language features interactive and meta-programming capabilities so that complex modules/applications can be developed. By adopting a declarative style, it favours the development of modules that can be reused in various specific execution contexts.
    Keywords: linked data; Semantic Web; SPARQL; RDF; dataflow; declarative programming.

Special Issue on: ICICS 2016 Next Generation Information and Communication Systems

  • Is a picture worth a thousand words? A computational investigation of the modality effect
    by Naser Al Madi, Javed Khan 
    Abstract: The modality effect is a term that refers to differences in learning performance in relation to the mode of presentation. It is an interesting phenomenon that impacts education, online learning, and marketing, among many other areas of life. In this study, we use Electroencephalography (EEG Alpha, Beta, and Theta) and computational modelling of comprehension to study the modality effect in text and multimedia. First, we provide a framework for evaluating learning performance, working memory, and emotions during learning. Second, we apply these tools to investigate the modality effect computationally, focusing on text in contrast to multimedia. This study is based on a dataset that we have collected through a human experiment involving 16 participants. Our results are important for future learning systems that incorporate learning performance, working memory, and emotions in a continuous feedback system that measures and optimises learning during, rather than after, learning.
    Keywords: modality effect; comprehension; electroencephalography; learning; education; text; multimedia; semantic networks; recall; emotions.

  • Automated labelling and severity prediction of software bug reports
    by Ahmed Otoom, Doaa Al-Shdaifat, Maen Hammad, Emad Abdallah, Ashraf Aljammal 
    Abstract: We target two research problems that are related to bug tracking systems: bug severity prediction and automated bug labelling. Our main aim is to develop an intelligent classifier that is capable of predicting the severity and label (type) of a newly submitted bug report through a bug tracking system. For this purpose, we build two datasets that are based on 350 bug reports from the open-source community (Eclipse, Mozilla, and Gnome). These datasets are characterised by various textual features that are extracted from the summary and description of bug reports of the aforementioned projects. Based on this information, we train a variety of discriminative models that can be used for automated labelling and severity prediction of a newly submitted bug report. A boosting algorithm is also implemented for enhanced performance. The classification performance is measured using accuracy and a set of other measures including precision, recall, F-measure and the area under the Receiver Operating Characteristic (ROC) curve. For automated labelling, the accuracy reaches around 91% with the AdaBoost algorithm and cross-validation test. On the other hand, for severity prediction, our results show that the proposed feature set has proved successful with a classification accuracy of around 67% with the AdaBoost algorithm and cross-validation test. Experimental results with the variation of training set size are also presented. Overall, the results are encouraging and show the effectiveness of the proposed feature sets.
    Keywords: severity prediction; software bugs; machine learning; bug labelling.

Special Issue on: Novel Strategies for Programming Accelerators

  • Evaluating attainable memory bandwidth of parallel programming models via BabelStream
    by Tom Deakin, James Price, Matt Martineau, Simon McIntosh-Smith 
    Abstract: Many scientific codes consist of memory-bandwidth-bound kernels: the dominating factor of the runtime is the speed at which data can be loaded from memory into the arithmetic logic units, before results are written back to memory. One major advantage of many-core devices such as General Purpose Graphics Processing Units (GPGPUs) and the Intel Xeon Phi is their focus on providing increased memory bandwidth over traditional CPU architectures. However, as with CPUs, this peak memory bandwidth is usually unachievable in practice and so benchmarks are required to measure a practical upper bound on expected performance. We augment the standard set of STREAM kernels with a dot product kernel to investigate the performance of simple reduction operations on large arrays. Such kernels are usually present in scientific codes and are still memory-bandwidth bound. The choice of one programming model over another should ideally not limit the performance that can be achieved on a device. BabelStream (formerly GPU-STREAM) has been updated to incorporate a wide variety of the latest parallel programming models, all implementing the same parallel scheme. As such, this tool can be used as a kind of 'Rosetta Stone' that provides both a cross-platform and cross-programming model array of results of achievable memory bandwidth.
    Keywords: performance portability; many-core; parallel programming models; memory bandwidth benchmark.

  • Array streaming for array programming
    by Mads Kristensen, James Avery 
    Abstract: A barrier to efficient array programming, for example in Python/NumPy, is that algorithms written as pure array operations completely without loops, while most efficient on small input, can lead to explosions in memory use. The present paper presents a solution to this problem using array streaming, implemented in the automatic parallelisation high-performance framework Bohrium. This makes it possible to use array programming in Python/NumPy code directly, even when the apparent memory requirement exceeds the machine capacity, since the automatic streaming eliminates the temporary memory overhead by performing calculations in per-thread registers. Using Bohrium, we automatically fuse, JIT-compile, and execute NumPy array operations on GPGPUs without modification to the user programs. We present performance evaluations of three benchmarks, all of which show dramatic reductions in memory use from streaming, yielding corresponding improvements in speed and use of GPGPU-cores. The streaming-enabled Bohrium effortlessly runs programs on input sizes much beyond sizes that crash on pure NumPy owing to exhausting system memory.
    Keywords: JIT-compilation; high productivity; Python; OpenCL; OpenMP; Bohrium; NumPy; GP-GPU.
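
The memory problem that the abstract above describes can be illustrated with a minimal sketch. It is only an analogy and not Bohrium's implementation: plain Python lists stand in for NumPy arrays, generator fusion stands in for JIT-compiled fused kernels, and the function names `reduce_materialised` and `reduce_streamed` are hypothetical.

```python
# Illustrative analogy only: plain Python lists stand in for NumPy arrays,
# and lazy generators stand in for fused, streamed array operations.


def reduce_materialised(n):
    """Array-style evaluation: every operation builds a full-size temporary."""
    x = list(range(n))                   # input array, n elements
    squared = [v * v for v in x]         # temporary 1, n elements
    shifted = [v + 1 for v in squared]   # temporary 2, n elements
    return sum(shifted)                  # reduction over the last temporary


def reduce_streamed(n):
    """Fused evaluation: each element passes through all operations in turn."""
    x = range(n)                         # lazy input, no n-element buffer
    squared = (v * v for v in x)         # lazy, no temporary is stored
    shifted = (v + 1 for v in squared)   # lazy, no temporary is stored
    return sum(shifted)                  # elements are consumed one at a time


if __name__ == "__main__":
    # Both strategies compute the same value; only peak memory use differs.
    print(reduce_materialised(10), reduce_streamed(10))
```

Both functions return the same result, but the first holds several full-length intermediates in memory at once, while the second never stores more than one element of each intermediate.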

Special Issue on: IEEE ISPA-16 Parallel and Distributed Computing and Applications

  • Method of key node identification in command and control networks based on level flow betweenness
    by Wang Yunming, Pan Cheng-Sheng, Chen Bo, Zhang Duo-Ping 
    Abstract: Key node identification in command and control (C2) networks is an appealing problem that has attracted increasing attention. Owing to the particular nature of C2 networks, the traditional algorithms for key node identification have problems with high complexity and unsatisfactory adaptability. A new method of key node identification based on level flow betweenness (LFB) is proposed, which is suitable for C2 networks. The proposed method first provides the definition of LFB by analysing the characteristics of a C2 network. Then, it designs an algorithm for key node identification based on LFB, and theoretically derives the complexity of this algorithm. Finally, a number of numerical simulation experiments are carried out, and the results demonstrate that this method reduces algorithm complexity, improves identification accuracy and enhances adaptability for C2 networks.
    Keywords: command and control network; complex network; key node identification; level flow betweenness.

Special Issue on: Advances in Evolutionary Computation and its Applications

  • A new group search optimiser integrating multiple strategies
    by Chengwang Xie, Wenjing Chen, Weiwei Yu 
    Abstract: Group search optimiser (GSO) is a recently developed heuristic inspired by the resource-searching behaviour of biological groups. However, it still has some defects, such as slow convergence speed and poor accuracy of solution. In order to improve the performance of GSO in solving complex optimisation problems, opposition-based learning (OBL) and differential evolution (DE) are integrated into GSO to form a hybrid GSO. In this paper, the strategy of OBL is used to enlarge the search region to facilitate jumping out of the local optimal trap, and the approach of DE is used to enhance local search and then improve the accuracy of solution. Comparison experiments based on 13 benchmark test functions have demonstrated that our hybrid GSO has advantages over the other peer optimisers.
    Keywords: group search optimiser; opposition-based learning; differential evolution; hybrid group search optimiser.
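
The opposition-based learning step mentioned in the abstract above can be sketched as follows. This shows the commonly used definition of an opposite point (for x in [a, b], the opposite is a + b - x) and a simple selection step for minimisation. The abstract does not say how the authors combine OBL with GSO, so the function names `opposite`, `obl_step` and the `sphere` test function are assumptions made for illustration.

```python
import random


def opposite(x, lower, upper):
    """Return the opposite point of x inside the box [lower, upper]."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def obl_step(population, lower, upper, fitness):
    """Evaluate each member and its opposite, keep the best (minimisation).

    The returned population has the same size as the input population.
    """
    candidates = population + [opposite(x, lower, upper) for x in population]
    candidates.sort(key=fitness)
    return candidates[:len(population)]


def sphere(x):
    """Hypothetical test objective: sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    lower, upper = [-5.0, -5.0], [10.0, 10.0]
    population = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
                  for _ in range(5)]
    improved = obl_step(population, lower, upper, sphere)
    # The best member can never get worse, because originals are kept as candidates.
    print(min(map(sphere, improved)) <= min(map(sphere, population)))
```

Because the original members stay in the candidate pool, the step never loses the current best solution. It only adds points from the mirrored side of the search box, which is one way to widen the search region.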

Special Issue on: CSS 2013 Advances in Cyberspace Safety and Security

  • Enhancing the performance of process level redundancy with coprocessors in symmetric multiprocessors
    by Hongjun Dai 
    Abstract: Transient faults are rising as a crucial concern in the reliability of computer systems. The emerging trend of integrating coprocessors into symmetric multiprocessors offers a better choice for software-oriented fault tolerance approaches. This paper presents coprocessor-based Process Level Redundancy (PLR), which makes use of coprocessors and frees CPU cycles for other tasks. The experiment compares the performance of a CPU version of PLR with that of a coprocessor version of PLR using a subset of the optimised SPEC CPU2006 benchmark. It shows that the proposed approach enhances performance by 32.6% on average. The performance can be enhanced further if an application contains more system calls. This technique can be adapted to other software-based fault tolerance approaches as well.
    Keywords: fault tolerance; symmetric multiprocessors; process-level redundancy; coprocessor.

  • Improving stability of PCA-based network anomaly detection by means of Kernel-PCA
    by Christian Callegari, Lisa Donatini, Stefano Giordano, Michele Pagano 
    Abstract: In recent years, the problem of detecting anomalies and attacks by statistically inspecting the network traffic has been attracting more and more research efforts. As a result, many different solutions have been proposed. Nonetheless, the poor performance offered by the proposed detection methods, as well as the difficulty of properly tuning and training these systems, make the detection of network anomalies still an open issue. In this paper we tackle the problem by proposing a way to improve the performance of anomaly detection. In more detail, we propose a novel network anomaly detection method that, by means of Kernel-PCA, is able to overcome the limitations of the 'classical' PCA-based methods, while retaining good performance in detecting network attacks and anomalies.
    Keywords: intrusion detection system; network anomaly detection; Kernel-PCA.

  • Applying transmission-coverage algorithms for secure geocasting in VANETs
    by Antonio Prado, Sushmita Ruj, Milos Stojmenovic, Amiya Nayak 
    Abstract: Existing geocasting algorithms for VANETs provide either high availability or security, but fail to achieve both together. Most of the privacy-preserving algorithms for VANETs have low availability and involve high communication and computation overheads. The reliable protocols do not guarantee secrecy and privacy. We propose a secure, privacy-preserving geocasting algorithm for VANETs, which uses direction-based dissemination. Privacy and security are achieved using public key encryption, authentication and pseudonyms. To reduce communication overheads resulting from duplication of messages, we adapt a transmission-coverage algorithm used in mobile sensor networks, where nodes delay forwarding messages based on their uncovered transmission perimeter after neighbouring nodes have broadcast the message. Our analysis shows that our protocol achieves a high delivery rate, with reasonable computation and communication overheads.
    Keywords: geocasting; privacy; coverage; VANET.