Forthcoming articles

 


International Journal of Computational Science and Engineering

 

These articles have been peer-reviewed and accepted for publication in IJCSE, but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.

 

Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.

 

Articles marked with this shopping trolley icon are available for purchase - click on the icon to send an email request to purchase.

 

Articles marked with this Open Access icon are freely available and openly accessible to all without any restriction except the ones stated in their respective CC licenses.

 

Register for our alerting service, which notifies you by email when new issues of IJCSE are published online.

 

We also offer RSS feeds which provide timely updates of tables of contents, newly published articles and calls for papers.

 

International Journal of Computational Science and Engineering (201 papers in press)

 

Regular Issues

 

  • Enhancing the performance of process level redundancy with coprocessors in symmetric multiprocessors   Order a copy of this article
    by Hongjun Dai 
    Abstract: Transient faults are becoming a crucial concern for the reliability of computer systems. With the emerging trend of integrating coprocessors into symmetric multiprocessors, a better platform is available for software-oriented fault tolerance approaches. This paper presents coprocessor-based Process Level Redundancy (PLR), which makes use of coprocessors and frees CPU cycles for other tasks. The experiment compares the performance of a CPU version of PLR and a coprocessor version of PLR using a subset of the optimised SPEC CPU2006 benchmark. It shows that the proposed approach enhances performance by 32.6% on average; the gain is larger when an application contains more system calls. This general technique can be adapted to other software-based fault tolerance approaches as well.
    Keywords: fault tolerance; symmetric multiprocessors; process-level redundancy; coprocessor.
    DOI: 10.1504/IJCSE.2015.10006159
     
  • Applying transmission-coverage algorithms for secure geocasting in VANETs   Order a copy of this article
    by Antonio Prado, Sushmita Ruj, Milos Stojmenovic, Amiya Nayak 
    Abstract: Existing geocasting algorithms for VANETs provide either high availability or security, but fail to achieve both together. Most privacy-preserving algorithms for VANETs have low availability and involve high communication and computation overheads, while the reliable protocols do not guarantee secrecy and privacy. We propose a secure, privacy-preserving geocasting algorithm for VANETs which uses direction-based dissemination. Privacy and security are achieved using public-key encryption, authentication and pseudonyms. To reduce the communication overheads resulting from duplication of messages, we adapt a transmission-coverage algorithm used in mobile sensor networks, where nodes delay forwarding messages based on their uncovered transmission perimeters after neighbouring nodes have broadcast the message. Our analysis shows that our protocol achieves a high delivery rate, with reasonable computation and communication overheads.
    Keywords: geocasting; privacy; coverage; VANET.
    DOI: 10.1504/IJCSE.2014.10006161
     
  • High quality multi-core multi-level algorithm for community detection   Order a copy of this article
    by Suely Oliveira, Rahil Sharma 
    Abstract: One of the most relevant and widely studied structural properties of networks is their community structure or clustering. Detecting communities is of great importance in various disciplines where systems are often represented as graphs. Different community detection algorithms have been introduced in the past few years, which look at the problem from different perspectives. Most of these algorithms, however, have expensive computational time, which makes them impractical for the large graphs found in the real world. Maintaining a good balance between computational time and the quality of the discovered communities is a well-known open problem in this area. In this paper, we propose a multi-core multi-level (MCML) community detection algorithm based on the topology of the graph, which contributes towards solving this problem. Applying MCML to two benchmark datasets results in the detection of accurate communities. We detect high-modularity communities by applying MCML to a Facebook forum dataset, to find users with similar interests, and to an Amazon product dataset. We also show the scalability of MCML on these large datasets using 16 Xeon Phi cores.
    Keywords: parallel algorithm; multi-level; multi-core; community detection; Facebook user interaction; big data.

  • An improved indoor localisation algorithm based on wireless sensor network   Order a copy of this article
    by Min-Yi Guo, Chen Li, Jianzhong Wu, Jianping Cai, Zengwei Zheng, Jin Lv 
    Abstract: Many sensor network applications require location awareness. In this paper, an improved positioning algorithm based on fingerprinting is presented for indoor environments. Unlike the traditional fingerprint recognition algorithm, the improved algorithm does not require offline fingerprint collection. It is robust in complex indoor environments and can effectively handle the failure of beacon nodes. When new nodes are added to the wireless sensor network, the algorithm makes use of them by generating new fingerprints, ensuring the positioning performance of the algorithm.
    Keywords: wireless sensor network; indoor localisation; distributed database.

  • Improved artificial bee colony algorithm with differential evolution for numerical optimisation problems   Order a copy of this article
    by Jiongming Jiang, Yu Xue, Tinghuai Ma, Zhongyang Chen 
    Abstract: Evolutionary algorithms have been widely used in recent years. The Artificial Bee Colony (ABC) algorithm is an evolutionary algorithm for numerical optimisation problems, and more and more researchers have recently shown interest in it. Previous studies have shown that it is an efficient, effective and robust evolutionary optimisation method. However, the convergence rate of the ABC algorithm still does not meet our requirements, and it is necessary to improve it. In this paper, several local search operations are embedded into the ABC algorithm. This modification enables the algorithm to strike a better balance between convergence rate and robustness, making it possible to increase the convergence speed and thereby obtain an acceptable solution sooner. Such an improvement can be advantageous in many real-world problems. This paper focuses on the performance of the ABC algorithm improved with a differential evolution strategy on numerical optimisation problems. The proposed algorithm has been tested on 18 benchmark functions from the relevant literature. The experimental results indicate that the performance of the improved ABC algorithm is better than that of the original ABC algorithm and some other classical algorithms.
    Keywords: artificial bee colony; numerical optimisation; differential algorithm.

  • A study of cooperative advertising in a one-manufacturer two-retailers supply chain based on the multi-stage dynamic game theory   Order a copy of this article
    by Hong Zhang, Quanju Zhang 
    Abstract: In this paper, the coordination of cooperative advertising decisions is analysed in a supply chain with one manufacturer and two retailers. The manufacturer invests in national advertising and one retailer invests in local advertising; the manufacturer agrees to share part of the local advertising cost with that retailer, while the other retailer refuses to take part in cooperative advertising. The manufacturer and the retailer who invest in cooperative advertising can each choose a cooperative or a non-cooperative attitude, whereas the other retailer always chooses a non-cooperative attitude. We select four decision variables, including the local advertising effort, the two retailers' marginal profits and the product price, and discuss seven three-stage dynamic game models according to whether the parties' attitudes are cooperative or not. The seven game models, comprising one non-cooperative model, five partially cooperative models and one cooperative model, are investigated in detail through a complete mathematical analysis. By comparing the seven models, several interesting propositions are obtained, and the corresponding results are derived from them.
    Keywords: cooperative advertising; multi-stage dynamic model; supply chain; game theory.

  • Cricket chirping algorithm: an efficient meta-heuristic for numerical function optimisation   Order a copy of this article
    by Jonti Deuri, Siva Sathya Sundaram 
    Abstract: Nature-inspired meta-heuristic algorithms have proved to be very powerful in solving complex optimisation problems in recent times. The literature reports several inspirations from nature, exploited to solve computational problems. This paper is yet another step in the journey towards the use of natural phenomena for seeking solutions to complex optimisation problems. In this paper, a new meta-heuristic algorithm based on the chirping behaviour of crickets is formulated to solve optimisation problems. It is validated against various benchmark test functions and then compared with popular state-of-the-art optimisation algorithms, such as genetic algorithm, particle swarm optimisation, bat algorithm, artificial bee colony algorithm and cuckoo search algorithm for performance efficiency. Simulation results show that the proposed algorithm outperforms its counterparts in terms of speed and accuracy. The implication of the results and suggestions for further research are also discussed.
    Keywords: optimisation; meta-heuristic algorithm; numerical function; cuckoo search; artificial bee colony; particle swarm optimisation; genetic algorithm; cricket chirping algorithm; calling chirp; aggressive chirp.

  • Optimising the stiffness matrix integration of n-noded 3D finite elements   Order a copy of this article
    by J.C. Osorio, M. Cerrolaza, M. Perez 
    Abstract: The integration of the stiffness and mass matrices in finite element analysis is a time-consuming task. When dealing with large problems having very fine discretisations, the finite element mesh becomes very large and several thousands of elements are usually needed. Moreover, when dealing with nonlinear dynamic problems, the CPU time required to obtain the solution increases dramatically because of the large number of times the global matrix has to be computed and assembled. This is why any reduction in computer time (however small) when evaluating the problem matrices is of concern to engineers and analysts. The integration of the stiffness matrix of n-noded high-order hexahedral finite elements is carried out by taking advantage of mathematical relations found among the nine terms of the nodal stiffness matrix, previously found for the simpler brick element. Significant time savings were obtained in the 20-noded finite element example case.
    Keywords: stiffness matrix; finite elements; n-noded hexahedral elements; saving CPU time.

  • A cost-effective graph-based partitioning algorithm for a system of linear equations   Order a copy of this article
    by Hiroaki Yui, Satoshi Nishimura 
    Abstract: There are many techniques for reducing the number of operations in directly solving a system of sparse linear equations. One such method is nested dissection (ND). In numerical analysis, the ND algorithm heuristically divides and conquers a system of linear equations, based on graph partitioning. In this article, we present a new algorithm for the first level of such graph partitioning, which splits a graph into two roughly equal-sized subgraphs. The algorithm runs in almost linear time. We evaluate and discuss the solving costs by applying the proposed algorithm to various matrices.
    Keywords: sparse matrix; nested dissection; graph partitioning; graph algorithm; Kruskal’s algorithm; Gaussian elimination; bit vector; adjacent list; refinement; system of equations.

  • A based-on-set-partitioning exact approach to multi-trip of picking up and delivering customers to airports   Order a copy of this article
    by Wei Sun, Yang Yu, Jia Li 
    Abstract: Picking up and delivering customers to airports (PDCA) is a new service provided in China. The multi-trip mode of PDCA (MTM-PDCA) is a promising way to reduce operation costs. To obtain the exact solution, we propose a novel two-stage modelling approach. In the first stage, all feasible trips for each subset of the customer point set are produced, from which the two local optimum trips of each subset can be obtained easily. Subsequently, using the local optimum trips obtained in the first stage, we establish a novel trip-oriented set-partitioning (TO-SP) model to formulate MTM-PDCA, which can then be solved exactly by CPLEX. By testing extensive instances, we summarise several managerial insights that can be used to reduce the costs of PDCA through the multi-trip mode.
    Keywords: multi-trip; single-trip; set-partitioning; exact approach.

  • Reliability prediction and QoS selection for web service composition   Order a copy of this article
    by Liping Chen, Weitao Ha 
    Abstract: Web service composition is a distributed model for constructing new web services on top of existing primitive or composite web services. The key issues in the development of web service composition are dynamic, efficient reliability prediction and the appropriate selection of component services. However, the reliability of service-oriented systems depends heavily on remote web services as well as the unpredictable internet, which makes system reliability hard to predict. In addition, there are many reliable, functionally equivalent partner services for the same composite service which differ in their Quality of Service (QoS). It is important to identify the best QoS candidate web services from a set of functionally equivalent services, but efficient selection from the large number of candidates challenges existing methods. In this paper, we address web service composition in two ways: reliability prediction and QoS-optimal selection. First, we propose a reliability prediction model based on Petri nets. For atomic services, a staged reliability model is provided which predicts reliability from network environment availability, remote equipment availability, discovery reliability and binding reliability. To address the complex connecting relationships among subservices, the input and output places of the basic Petri net are extended to subtypes for multi-source input places and multi-use output places. Secondly, we use a new skyline algorithm based on an R-tree index. The index tree is traversed to judge whether a node is dominated by the candidate skyline sets; the leaf nodes store optimal component services. Experimental evaluation on real and synthetic data shows the effectiveness and efficiency of the proposed approach. The approach has been implemented and used in the context of travel process mining. Although the results are presented in the context of Petri nets, the approach can be applied to any process modelling language with executable semantics.
    Keywords: web service composition; atomic services; reliability prediction; QoS; skyline; optimisation.
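    The skyline selection step described in the abstract rests on QoS dominance. The brute-force filter below shows the dominance test and the skyline set it produces; the paper's R-tree-indexed algorithm computes the same set while pruning dominated subtrees early. The attribute layout (tuples normalised so lower is better) is an assumption made for the sketch.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every QoS attribute and strictly
    better in at least one (attributes normalised so lower is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(services):
    """Return the services not dominated by any other candidate.

    A brute-force O(n^2) filter, kept simple for illustration; an R-tree
    index reaches the same set while skipping dominated regions.
    """
    return [s for s in services
            if not any(dominates(t, s) for t in services if t is not s)]
```

For instance, with services described by (response time, cost) tuples, `skyline([(1, 5), (2, 3), (3, 1), (2, 4), (4, 4)])` keeps only the three non-dominated trade-offs.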

  • Cost-sensitive ensemble classification algorithm for medical image   Order a copy of this article
    by Minghui Zhang, Haiwei Pan, Niu Zhang, Xiaoqin Xie, Zhiqiang Zhang, Xiaoning Feng 
    Abstract: Medical image classification is an important part of domain-specific image mining. In this paper, we quantify domain knowledge about medical images for feature extraction and propose a cost-sensitive ensemble classification algorithm (CEC), which uses a new training method and a new way of acquiring parameters. In the weak classifier training process, we mark the samples that were wrongly classified in the previous iteration, re-sample among the correctly classified samples, and include all the wrongly classified samples in the next round of training. The classifier can thus pay more attention to the samples that are hard to classify. The weight parameters of the weak classifiers are determined not only by their error rates, but also by their ability to recognise positive samples. Experimental results show that our algorithm is more efficient for medical image classification.
    Keywords: medical image; domain knowledge; cost-sensitive learning; ensemble classification.

  • Mining balanced API protocols   Order a copy of this article
    by Deng Chen, Yanduo Zhang, Wei Wei, Rongcun Wang, Huabing Zhou, Xun Li, Binbin Qu 
    Abstract: API protocols can be used in many aspects of software engineering, such as software testing, program validation and software documentation. Mining API protocols based on probabilistic models has proved to be an effective approach to obtaining protocols automatically. However, it often yields unbalanced protocols, that is, protocols described using probabilistic models with unexpectedly extreme high and low probabilities. In this paper, we discuss the unbalanced probability problem and propose to address it by preprocessing the method call sequences used for training. Our method first finds tandem arrays in method call sequences using a suffix tree. Then, it substitutes each tandem array with a single tandem repeat. Since repeated method call subsequences are eliminated, balanced API protocols can be achieved. In order to investigate the feasibility and effectiveness of our approach, we implemented it in our prototype tool ISpecMiner, and used the tool to perform a comparison test on several real-world applications. Experimental results show that our approach achieves more balanced API protocols than existing approaches, which is essential for mining valid and precise API protocols.
    Keywords: mining API protocol; suffix tree; probability balance; method call sequence; Markov model; tandem array.
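    The preprocessing step, substituting each tandem array with a single tandem repeat, can be illustrated without a suffix tree. The quadratic scan below is a simplification of the paper's suffix-tree approach, written only to show the effect on a method call sequence:

```python
def collapse_tandem_arrays(calls):
    """Replace each tandem array (two or more consecutive copies of a
    subsequence) with a single copy of that subsequence.

    The paper locates tandem arrays with a suffix tree; this sketch uses a
    direct quadratic scan, which is enough to show the preprocessing.
    """
    out = []
    i = 0
    n = len(calls)
    while i < n:
        collapsed = False
        # try the shortest repeating unit first
        for length in range(1, (n - i) // 2 + 1):
            unit = calls[i:i + length]
            j = i + length
            reps = 1
            while calls[j:j + length] == unit:
                reps += 1
                j += length
            if reps >= 2:
                out.extend(unit)   # keep one copy of the repeating unit
                i = j
                collapsed = True
                break
        if not collapsed:
            out.append(calls[i])
            i += 1
    return out
```

For example, the trace `open, read, read, read, close` collapses to `open, read, close`, removing the probability mass that repeated `read` calls would otherwise concentrate on one transition.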

  • Advanced DDOS detection and mitigation technique for securing cloud   Order a copy of this article
    by Masoumeh Zareapoor, Pourya Shamsolmoali, M.Afshar Alam 
    Abstract: Distributed Denial of Service (DDoS) attacks have become a serious problem for internet security and cloud computing. They are the most complex form of DoS (Denial of Service) attacks and can spoof the source address, disguising the real location of the attack. DDoS attacks are therefore a significant challenge for network security. In this paper, we present a model to detect and mitigate DDoS attacks in cloud computing. The proposed model requires very little storage and is capable of fast detection. The experimental results show that the system is able to mitigate most attacks. Detection accuracy and processing time were the metrics used to evaluate the performance of the proposed model. From the results, it is evident that the system achieves high detection accuracy (97%) with few false alarms.
    Keywords: distributed denial-of-service; DDoS; information divergence; cloud security; filtering.
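    As a rough illustration of divergence-based detection (cf. the keyword 'information divergence'), the sketch below scores a traffic window by its Kullback-Leibler divergence from a learned baseline distribution. The feature choice (per-bin packet counts) and the threshold are assumptions made for the example; the abstract does not specify the paper's exact measure.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """D_KL(p || q) for two discrete distributions over the same bins.

    eps smoothing avoids log-of-zero for empty bins.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def is_attack(window_counts, baseline_counts, threshold=0.5):
    """Flag a traffic window whose packet distribution (e.g. per source
    prefix) diverges too far from the learned baseline.

    Both the binning and the 0.5 cut-off are illustrative assumptions.
    """
    def norm(counts):
        total = sum(counts) or 1
        return [c / total for c in counts]
    return kl_divergence(norm(window_counts), norm(baseline_counts)) > threshold
```

A window that concentrates nearly all packets in one bin (a typical flood signature) diverges strongly from a uniform baseline and is flagged, while small fluctuations are not.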

  • Global and local optimisation based hybrid approach for cloud service composition   Order a copy of this article
    by Jyothi Shetty, Demian Antony D'Mello 
    Abstract: The goal of service composition is to find the best set of services to meet the user's requirements. Efficient local optimisation methods may fail to satisfy the user's end-to-end requirements, so global optimisation methods are popular when end-to-end requirements must be satisfied. However, optimal composition under end-to-end requirements consumes exponential time in the case of a large search space, and metaheuristic methods are therefore used, giving near-optimal solutions. This paper proposes an approach in which both local and global optimisation are used. In order to avoid local optima during local optimisation, the proposed work selects a set of best services for each task and then uses a global optimisation method on the smaller search space to select the best composition. In order to reduce communication costs, the optimal solution identifies the minimum number of clouds for the composition.
    Keywords: cloud service; service composition; task level selection; global optimisation; local optimisation; exact algorithm.

  • Homomorphisms between the covering information systems   Order a copy of this article
    by Zengtai Gong, Runli Chai, Yongping Guo 
    Abstract: The information system is an important mathematical model in many fields, such as data mining, artificial intelligence and machine learning. Relations and mappings are popular tools for exploring the communication between two information systems. In this paper, we first introduce the concepts of the covering relation (mapping) and the inverse covering relation (mapping) between two covering information systems and investigate their properties. Then, we propose a notion of homomorphism between covering information systems based on the covering relation. Moreover, we prove that attribute reductions in the image system and the original system are equivalent to each other under the homomorphism conditions given in this paper.
    Keywords: covering-based rough sets; homomorphism of information; attribute reductions.

  • The remote farmland environment monitoring system based on ZigBee sensor network   Order a copy of this article
    by Yongfei Ye, Xinghua Sun, Minghe Liu, Zhisheng Zhao, Xiao Zhang, Hongxi Wu 
    Abstract: In order to change the traditional management of agricultural production, ZigBee short-distance wireless transmission technology is used to design an intelligent remote monitoring system for the farmland environment, which integrates communication, computing and networking technology. Real-time, accurate collection of farmland soil pH, the temperature and humidity around the plants, illumination intensity and crop chlorophyll content provides reliable data for intelligent agricultural production, thereby raising the level of intelligence of agricultural management. Guided by such precise data, irrigation becomes intelligent, avoiding the waste of water resources and the low utilisation rate caused by unguided operation. At the same time, it promotes the modernisation of agricultural production processes.
    Keywords: farmland environment; remote monitoring; ZigBee technology; sensor network; intelligent farmland environment; precise agriculture; agricultural information; data collection; data transmission; real time; agricultural knowledge; computational science; computational engineering.

  • Optimising order selection algorithm based on online taxi-hailing applications   Order a copy of this article
    by Tian Wang, Wenhua Wang, Yongxuan Lai, Diwen Xu, Haixing Miao, Qun Wu 
    Abstract: Nowadays, with the widespread use of smart devices and networking technologies, taxi-hailing applications are becoming more and more popular in daily life. However, drivers grabbing orders while driving poses serious traffic safety risks. Considering the characteristics and deficiencies of the mainstream taxi-hailing apps on smart devices, this paper studies the order selection problem from the driver's end. According to different customers' requirements, an order auto-selection algorithm is proposed. It adopts a time buffer mechanism to avoid time conflicts among orders, and a new concept, the 'efficiency value' of an order, is proposed to evaluate order profit. The algorithm auto-selects orders for drivers according to their qualities, which not only improves safety but also maximises the drivers' revenue. Extensive simulations validate the performance of the proposed method.
    Keywords: taxi-hailing application; order selection algorithm; biggest profit; greedy algorithm; safety; efficiency value of orders.

  • Towards UNL based machine translation for Moroccan Amazigh language   Order a copy of this article
    by Imane Taghbalout, Fadoua Ataa Allah, Mohamed El Marraki 
    Abstract: Amazigh languages, also called Berber, belong to the Afro-Asiatic (Hamito-Semitic) language family. They are a family of similar and closely related languages and dialects indigenous to North Africa, spoken in Morocco, Algeria, and by some populations in Libya, Tunisia, northern Mali, western and northern Niger, northern Burkina Faso, Mauritania, and the Siwa Oasis of Egypt. Large Berber-speaking migrant communities have lived in Western Europe since the 1950s. In this paper, we study Standard Moroccan Amazigh, which became a constitutionally official language of Morocco in 2011 but is still considered a less-resourced language. It is therefore time to develop linguistic resources and applications for automatically processing this language, in order to ensure its survival and promotion by integrating it into the new information and communication technologies (NICT). In this context, and with the aim of producing a Universal Networking Language (UNL) based machine translation system for this language, we have undertaken the creation of the Amazigh-UNL dictionary as a first step in developing the linguistic resources required by the UNL system to achieve translation. This paper therefore focuses on the implementation of linguistic features of the Amazigh languages, such as morphological, syntactical and semantic information.
    Keywords: Amazigh language; machine translation; Universal Networking Language; Amazigh-UNL dictionary; inflectional paradigm; subcategorisation frame; Universal Word.

  • Population diversity of particle swarm optimisation algorithms for solving multimodal optimisation problems   Order a copy of this article
    by Shi Cheng, Junfeng Chen, Quande Qin, Yuhui Shi 
    Abstract: The aim of multimodal optimisation is to locate multiple peaks/optima in a single run and to maintain these found optima until the end of the run. In this paper, seven variants of particle swarm optimisation (PSO) algorithms, which include PSO with star structure, PSO with ring structure, PSO with four-clusters structure, PSO with Von Neumann structure, social-only PSO with star structure, social-only PSO with ring structure, and cognition-only PSO, are used to solve multimodal optimisation problems. The population diversity, or more specifically the position diversity, is used to measure the candidate solutions during the search process. Our goal is to measure the performance and effectiveness of the PSO variants and to investigate why an algorithm performs effectively from the perspective of population diversity. The experimental tests are conducted on eight benchmark functions. Based on the experimental results, it can be concluded that PSO with ring structure and social-only PSO with ring structure perform better than the other PSO variants on multimodal optimisation. The population diversity measurements show that, to perform well on multimodal optimisation problems, an algorithm needs to balance its global search ability and its ability to maintain solutions: the population diversity should converge to a certain level quickly and be kept at that level during the whole search process.
    Keywords: swarm intelligence algorithm; multimodal optimisation; particle swarm optimisation; population diversity; nonlinear equation systems.
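    Position diversity, the measure used throughout this study, is commonly defined from the deviation of the particles around the swarm's mean position. One such L1 formulation is sketched below; the paper's exact normalisation may differ.

```python
def position_diversity(swarm):
    """Dimension-wise position diversity: mean absolute deviation of the
    particles from the swarm's mean position, averaged over dimensions.

    One common formulation from the population-diversity literature,
    shown here as an illustration.
    """
    n = len(swarm)
    dim = len(swarm[0])
    mean = [sum(p[d] for p in swarm) / n for d in range(dim)]
    per_dim = [sum(abs(p[d] - mean[d]) for p in swarm) / n for d in range(dim)]
    return sum(per_dim) / dim
```

A fully converged swarm has diversity 0; tracking this value over iterations shows whether an algorithm keeps enough spread to locate several optima at once.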

  • A pseudo nearest centroid neighbour classifier   Order a copy of this article
    by Hongxing Ma, Jianping Gou, Xili Wang 
    Abstract: In this paper, we propose a new reliable classification approach, called the pseudo nearest centroid neighbour (PNCN) rule, which is based on the pseudo nearest neighbour rule (PNN) and the nearest centroid neighbourhood (NCN). In the proposed PNCN, the nearest centroid neighbours per class, rather than the nearest neighbours, are first searched by means of NCN. Then, we calculate k categorical local mean vectors corresponding to the k nearest centroid neighbours, and assign a weight to each local mean vector. Using the weighted k local mean vectors for each class, PNCN designs the corresponding pseudo nearest centroid neighbour and decides the class label of the query pattern according to the closest pseudo nearest centroid neighbour among all classes. The classification performance of the proposed PNCN is evaluated on real and artificial datasets in terms of classification accuracy. The experimental results demonstrate the effectiveness and robustness of PNCN over the competing methods in many practical classification problems.
    Keywords: K-nearest neighbour rule; nearest centroid neighbourhood; pseudo nearest centroid neighbour rule; local mean vector; pattern classification.
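    The decision rule described in the abstract can be sketched directly. In the toy implementation below, the 1/i weights on the local mean vectors mirror the pseudo nearest neighbour rule; the authors' exact weighting scheme may differ, and all names are illustrative.

```python
import math

def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def _centroid(points):
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def ncn(query, candidates, k):
    """Nearest centroid neighbours: greedily pick the point whose inclusion
    keeps the centroid of the selected set closest to the query."""
    chosen = []
    rest = list(candidates)
    for _ in range(min(k, len(rest))):
        best = min(rest, key=lambda p: _dist(query, _centroid(chosen + [p])))
        chosen.append(best)
        rest.remove(best)
    return chosen

def pncn_classify(query, data, k=3):
    """Pseudo nearest centroid neighbour rule (sketch).

    For each class: find k NCNs, form the i-th local mean vector from the
    first i NCNs, weight its distance to the query by 1/i, and sum.  The
    query gets the class with the smallest weighted sum.
    """
    scores = {}
    for label, points in data.items():
        neigh = ncn(query, points, k)
        scores[label] = sum(
            _dist(query, _centroid(neigh[:i])) / i
            for i in range(1, len(neigh) + 1))
    return min(scores, key=scores.get)
```

With `data` mapping class labels to training points, a query near one class's cluster receives that label because its weighted centroid distances are uniformly smaller.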

  • A comparative study of mixed least-squares FEMs for the incompressible Navier-Stokes equations   Order a copy of this article
    by Alexander Schwarz, Masoud Nickaeen, Serdar Serdas, Abderrahim Ouazzi, Jörg Schröder, Stefan Turek, Carina Nisters 
    Abstract: In the present contribution we compare (quantitatively) different mixed least-squares finite element methods (LSFEMs) with respect to computational costs and accuracy. In detail, we consider an approach for Newtonian fluid flows, which are described by the incompressible Navier-Stokes equations. Various first-order systems are derived based on the residual forms of the equilibrium equation and the continuity condition. From these systems L^2-norm least-squares functionals are constructed, which are the basis for the associated minimisation problems. The first formulation under consideration is a div-grad first-order system resulting in a three-field formulation with total stresses, velocities, and pressure (S-V-P) as unknowns. Here, the variables are approximated in H(div) x H^1 x L^2 on triangles and in H^1 x H^1 x L^2 on quadrilaterals. In addition to that a reduced stress-velocity (S-V) formulation is derived and investigated. An advantage of this formulation is a smaller system matrix due to the absence of the pressure degree of freedom, which is eliminated in this approach. S-V-P and S-V formulations are promising approaches when the stresses are of special interest, e.g. for non-Newtonian, multiphase or turbulent flows. Furthermore, since in the total stress approach the pressure is approximated instead of its gradient, the proposed S-V-P formulation could be used in formulations with discontinuous pressure interpolation. For comparison the well-known first-order vorticity-velocity-pressure (V-V-P) formulation is investigated. In here, all unknowns are approximated in H^1 on quadrilaterals. Besides some numerical advantages, as e.g. an inherent symmetric structure of the system of equations and a directly available error estimator, it is known that least-squares methods have a drawback concerning mass conservation, especially when lower-order elements are used. 
Therefore, the main focus of this work is on performance and accuracy aspects, on the one hand for finite elements with different interpolation orders, and on the other hand for the use of efficient solvers, for instance of Krylov-subspace or multigrid type. Finally, two well-known benchmark problems are presented and the results are compared for the different first-order formulations.
    Keywords: least-squares FEM; V-V-P formulation; S-V-P formulation; S-V formulation; Navier-Stokes; multigrid.
    DOI: 10.1504/IJCSE.2016.10006921
     
  • Enhanced differential evolution with modified parent selection technique for numerical optimisation   Order a copy of this article
    by Xiang Li 
    Abstract: Differential evolution (DE) is considered to be one of the most prominent evolutionary algorithms for numerical optimisation. However, it may suffer from a slow convergence rate, especially in the late stage of the evolutionary process. The reason might be that the parents in the mutation operator are randomly selected from the parent population. To remedy this limitation and to enhance the performance of DE, this paper proposes a modified parent selection technique, in which the parents in the mutation operator are chosen based on their previous successful experiences. The major advantages of the proposed parent selection technique are its simplicity and generality: it does not destroy the simple structure of DE, and it can be used in most DE variants. To verify the performance of the proposed technique, it is integrated into the classical DE algorithm and three advanced DE variants. Thirteen widely used benchmark functions are used as the test suite. Experimental results indicate that the proposed technique is able to enhance the performance of the classical DE and advanced DE algorithms in terms of both the quality of final solutions and the convergence rate.
    Keywords: differential evolution; parent selection; mutation operator; numerical optimisation
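The abstract above does not give the exact selection rule, but the idea of biasing DE's parent choice by past success can be sketched as follows. The success-count weighting, the DE/rand/1/bin operators and all parameter values here are illustrative assumptions, not the paper's published algorithm.

```python
import random

def de_step(pop, fitness, f=0.5, cr=0.9, success=None):
    """One generation of DE/rand/1/bin with success-biased parent selection.

    success[i] counts how often individual i served as a parent of an
    improving trial vector; parents are drawn with probability proportional
    to 1 + success[i] (an illustrative rule, not the paper's exact one).
    """
    n, dim = len(pop), len(pop[0])
    success = success or [0] * n
    weights = [1 + s for s in success]
    new_pop = []
    for i in range(n):
        # choose three distinct parents (all != i), biased towards past successes
        idxs = set()
        while len(idxs) < 3:
            j = random.choices(range(n), weights=weights)[0]
            if j != i:
                idxs.add(j)
        r1, r2, r3 = idxs
        mutant = [pop[r1][d] + f * (pop[r2][d] - pop[r3][d]) for d in range(dim)]
        jrand = random.randrange(dim)  # guarantee at least one mutant component
        trial = [mutant[d] if (random.random() < cr or d == jrand) else pop[i][d]
                 for d in range(dim)]
        if fitness(trial) < fitness(pop[i]):   # minimisation: keep the better vector
            new_pop.append(trial)
            for r in (r1, r2, r3):
                success[r] += 1                # credit the parents of the success
        else:
            new_pop.append(pop[i])
    return new_pop, success
```

Repeated calls reuse the returned success counts, so frequently successful parents are sampled more often in later generations.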

  • Intelligent selection of parents for mutation in differential evolution   Order a copy of this article
    by Meng Zhao, Yiqiao Cai 
    Abstract: In most DE algorithms, the parents for mutation are randomly selected from the current population. As a result, all vectors involved in mutation are equally likely to be selected as parents, without any selective pressure. Although such a mutation strategy is easy to use, it is inefficient for solving complex problems. To address this issue, we present an intelligent parent selection strategy (IPS) for DE. The new algorithmic framework is named DE with IPS-based mutation (IPSDE). In IPSDE, the neighbourhood of each individual is first constructed with a population topology. Then, all the neighbours of each individual are partitioned into two groups based on their fitness values, and the probability of each neighbour being selected as a parent in its respective group is calculated based on its distance from the current individual. With these probability values, IPS selects the parents from the neighbourhood of the current individual to guide the mutation process of DE. To evaluate the effectiveness of the proposed approach, IPSDE is applied to several original DE algorithms and advanced DE variants. Experimental results show that IPSDE is an effective framework for enhancing the performance of most of the DE algorithms studied.
    Keywords: differential evolution; mutation operator; neighbourhood information; intelligent parents selection.
    DOI: 10.1504/IJCSE.2016.10002299
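The neighbour-probability computation described above can be roughly sketched as follows. The inverse-distance weighting and the better/worse partition rule are illustrative assumptions, since the abstract does not specify the exact formulas.

```python
import math

def parent_probabilities(i, pop, fit, neighbours):
    """Split the neighbours of individual i into superior/inferior groups by
    fitness (minimisation), and assign each neighbour a selection probability
    that decays with its Euclidean distance from individual i.
    Illustrative rule only; the paper's exact weighting may differ."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))) or 1e-12
    better = [j for j in neighbours if fit[j] < fit[i]]
    worse = [j for j in neighbours if fit[j] >= fit[i]]
    def probs(group):
        if not group:
            return {}
        w = {j: 1.0 / dist(pop[i], pop[j]) for j in group}
        total = sum(w.values())
        return {j: w[j] / total for j in group}
    return probs(better), probs(worse)
```

A mutation operator would then draw its parents from these two distributions instead of uniformly from the whole population.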
     
  • Modelling method of dynamic business process based on pi-calculus   Order a copy of this article
    by Yaya Liu, Jiulei Jiang, Weimin Li 
    Abstract: Formal modelling of dynamic business processes makes the collaborative relationships between organisations more detailed and explicit. It is convenient for analysing the structure and interaction of cross-organisational business processes, especially dynamic ones, and helps assure the optimisation of the system architecture. Based on the channel mobility of pi-calculus, a new modelling method for dynamic business processes is proposed by combining pi-calculus with an extended directed acyclic graph. The method is discussed mainly from three aspects: the selection of interactive paths, the transition of business objects, and the validation of accuracy. Meanwhile, a concrete example with multiple roles is presented to assist the implementation of the method. We conclude that the method can effectively distinguish the collaborative relationships between organisations, and can also be used to build formal models of complicated, dynamic business processes with mature technology.
    Keywords: dynamic business process; cross-organisational business process; channel mobility; pi-calculus; extended directed acyclic graph.

  • Unsupervised metric learning for person re-identification by image re-ranking   Order a copy of this article
    by Dengyi Zhang, Qian Wang, Xiaoping Wu, Yu Cao 
    Abstract: In a multi-camera video surveillance system with non-overlapping areas, the same person may appear different under different cameras, and different people may look the same. This makes person re-identification an important and challenging problem. Most current person re-identification methods are based on supervised distance metric learning, which labels the same person across many cameras as positive samples for distance metric learning; this labelling is hard to do manually for large numbers of cameras. This paper therefore describes an unsupervised distance metric learning method based on image re-ranking: the original distance matrix for person samples from two cameras is calculated using the original distance metric function, and the distance matrix is re-ranked by the image re-ranking method to acquire a better distance function, which is then used to calculate the new distance rank matrix. This matrix is used to label positive and negative samples automatically for unsupervised distance metric learning, and thus to acquire a better Mahalanobis distance metric function, without the need to manually label person samples across different cameras. Experiments were performed on the public datasets VIPeR, i-LIDS, GRID and CAVIAR4REID, and the results were compared with current distance learning methods. The results, evaluated by CMC, indicate that this algorithm can overcome the difficulty of labelling large numbers of person samples from cameras in distance metric learning, with a better re-identification rate.
    Keywords: video surveillance; non-overlapping area; person re-identification; unsupervised metric learning; image re-ranking

  • Discovery of continuous coherent evolution biclusters in time series data   Order a copy of this article
    by Meihang Li, Yun Xue, Haolan Zhang, Bo Ma, Jie Luo, WenSheng Chen, Zhengling Liao 
    Abstract: Most traditional biclustering algorithms focus on the biclustering model of non-continuous columns, which is unsuitable for analysis of time series gene expression data. We propose an effective and exact algorithm that can be used to mine biclusters with coherent evolution on contiguous columns, as well as complementary and time-lagged biclusters in time series gene expression matrices. Experimental results show that the algorithm can detect biclusters with statistical significance and strong biological relevance. The algorithm is also applied to currency data analysis, in which meaningful results are obtained.
    Keywords: time series data; bicluster; coherent evolution; complementary; time-lagged
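A minimal sketch of what a coherent-evolution bicluster on contiguous columns looks like, assuming a simple up/down/flat sign pattern as the coherence criterion; the paper's exact and exhaustive algorithm (including time-lagged patterns) is not reproduced here.

```python
def sign_pattern(row, start, width):
    """Up/down/flat pattern of a row over contiguous columns [start, start+width)."""
    seg = row[start:start + width]
    # (x > y) - (x < y) is the sign of each consecutive difference
    return tuple((x > y) - (x < y) for x, y in zip(seg[1:], seg))

def coherent_biclusters(matrix, start, width, min_rows=2):
    """Group rows whose expression rises and falls in step over a contiguous
    column window; rows with the exactly opposite pattern are 'complementary'.
    A simplified sketch of contiguous-column coherent-evolution mining."""
    groups = {}
    for r, row in enumerate(matrix):
        groups.setdefault(sign_pattern(row, start, width), []).append(r)
    biclusters = []
    for pat, rows in groups.items():
        if len(rows) >= min_rows:
            comp = groups.get(tuple(-s for s in pat), [])
            biclusters.append({'pattern': pat, 'rows': rows, 'complementary': comp})
    return biclusters
```

Scanning all window positions and widths, as an exact algorithm would, amounts to calling this for every (start, width) pair.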

  • Empirical rules based views abstraction for distributed model-driven development   Order a copy of this article
    by Yucong Duan, Jiaxuan Li, Qiang Duan, Lixin Luo, Liang Huang 
    Abstract: UML view integration has been extensively studied in the area of model transformation in model-driven engineering. Empirical processing rules are among the most widely employed approaches for processing view abstraction, which can support model simplification, consistency checking, and management complexity reduction. However, empirical rules face challenges such as completeness validation, consistency among rules, and composition priority arrangement. The challenge of rule composition is magnified in distributed model-driven development for web service-based systems, where redundant information/data is emphasised. The same redundant information can be expressed in different forms that comprise different topological structures of the entity-relationship network representing the same part of the system. Such variation results in different compositions of the rules being executed in different orders, which increases the severity of the existing non-determinism arising from the empirical probability of some rules. In this paper, we investigate the effect of redundancy on rule application by designing a simulated distributed store for an example diagram model. We propose a formal solution to this challenge by constructing a finite-state automaton to unify the empirical abstraction rules while relieving the side effects caused by redundancy. We also show results obtained from a prototype implementation.
    Keywords: UML; model transformation; view abstraction; finite-state automaton

  • Populating parameters of web services by automatic composition using search precision and WSDL weight matrix   Order a copy of this article
    by Sumathi Pawar, Niranjan Chiplunkar 
    Abstract: Web service composition connects different web services according to requirements. The absence of a public Universal Description, Discovery and Integration (UDDI) registry makes it difficult to obtain QoS information about web services unless they are checked by execution. This research implements a system for invoking and composing web services according to user requirements, searching for the required web services dynamically using the Bingo search engine. The user may not know the values of the input parameters of the required web services, and these unknown parameters are populated by composing available web services automatically and dynamically. The methodology used here is: searching for the requested web services according to a functional word; finding the search precision with support and confidence values of the search results; computing a Web Service Description Language (WSDL) weight matrix to select suitable web services for user satisfaction; and populating unknown input parameter values by composing the web services. Composable web services are found by intra-cluster and inter-cluster search among the different operation elements of community web services. A composition rule is framed for composable web services according to the order of composition. Pre-condition and effect elements are checked before execution of the composition plan. Finally, the web services are invoked according to the composition rule.
    Keywords: service composition; WSDL; match-making algorithm; service discovery; WSDL-S.
    DOI: 10.1504/IJCSE.2016.10007953
     
  • Fast elliptic curve scalar multiplication for resisting against SPA   Order a copy of this article
    by Shuanggen Liu 
    Abstract: This paper analyses the computation of the Symbolic Ternary Form (STF) elliptic curve scalar multiplication algorithm and the binary scalar multiplication algorithm. Compared with the binary scalar multiplication algorithm, the efficiency of the STF scalar multiplication algorithm is increased by 5.9% on average, a clear advantage. For this reason, we improve the structure of the STF scalar multiplication algorithm and make its performance profile more "smooth" by constructing an operation that makes point addition (A) and point tripling (T) indistinguishable, thereby resisting simple power analysis (SPA) attacks. At the same time, we propose the Highest-weight Symbolic Ternary Form (HSTF), which transforms a scalar k into its highest-weight form, so that every cycle has a fixed pattern of operations to resist SPA attacks. Compared with the binary scalar multiplication algorithm with anti-SPA protection, the average efficiency is enhanced by 17.7%.
    Keywords: elliptic curve scalar multiplication; simple power analysis; highest-weight symbolic ternary form
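The abstract does not define the STF/HSTF recoding in full, but the general triple-and-add structure behind ternary scalar multiplication can be sketched with a standard balanced-ternary expansion. Plain integers stand in for curve points here, and the recoding shown is an assumption rather than the paper's exact form.

```python
def signed_ternary(k):
    """Balanced ternary digits of k > 0 (least significant first), each in
    {-1, 0, 1}. A stand-in for the paper's symbolic ternary form (STF)."""
    digits = []
    while k:
        r = k % 3
        if r == 2:          # rewrite digit 2 as -1 with a carry
            r = -1
            k += 1
        digits.append(r)
        k //= 3
    return digits

def scalar_mul(k, P, add, neg, triple):
    """Evaluate k*P with one 'triple' per digit plus an add/subtract for each
    non-zero digit -- the triple-and-add skeleton behind ternary scalar
    multiplication. Group operations are passed in, so integers can stand in
    for curve points when testing."""
    Q = None                # identity / point at infinity
    for d in reversed(signed_ternary(k)):
        Q = triple(Q) if Q is not None else None
        if d == 1:
            Q = add(Q, P) if Q is not None else P
        elif d == -1:
            Q = add(Q, neg(P)) if Q is not None else neg(P)
    return Q
```

On a real curve, `add`, `neg` and `triple` would be the curve's point operations; the paper's SPA countermeasure additionally makes A and T indistinguishable to a power trace.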

  • Predicting rainfall using neural nets   Order a copy of this article
    by Kyaw Kyaw Htike 
    Abstract: One of the most crucial factors that can help strategic decision-making and planning in countries that rely on agriculture is successful rainfall prediction. Despite its clear importance, forecasting rainfall remains a major challenge owing to the highly dynamic nature of the climate process and its associated seemingly random fluctuations. A wide variety of models have been proposed to predict rainfall, among which statistical models have been among the most successful. In this paper, we propose a novel rainfall forecasting model using Focused Time-Delay Neural Networks (FTDNNs). In addition, we compare rainfall forecasting performance using FTDNNs for different prediction time scales, namely monthly, quarterly, bi-annually and yearly. We present the optimal neural network architecture parameters found automatically for each of these time scales. Our models are trained to perform one-step-ahead predictions, and we evaluate our results, measured by mean absolute percentage error, on a rainfall dataset obtained from the Malaysian Meteorological Department (MMD) covering close to thirty years. On test data, the most accurate result was obtained by our method on the yearly rainfall dataset (94.25%). For future work, dynamic meteorological parameters such as sunshine data, air pressure, cloudiness, relative humidity and wet-bulb temperature can be integrated as additional features into the model for even higher prediction performance.
    Keywords: rainfall prediction; forecasting; statistical prediction models; artificial neural networks; focused time-delay networks.
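As a rough illustration of the tapped-delay-line idea behind an FTDNN (one-step-ahead prediction from the previous d values), here is a dependency-free sketch that substitutes a linear autoregressive fit for the network's non-linear layers; the data, lag depth and linear model are all illustrative assumptions, not the paper's setup.

```python
def delay_embed(series, d):
    """Tapped delay line: (previous d values, next value) training pairs --
    the windowing an FTDNN applies to a univariate series before its
    feed-forward layers."""
    return [(tuple(series[t - d:t]), series[t]) for t in range(d, len(series))]

def fit_linear_ar2(series):
    """Fit x_t = a*x_{t-1} + b*x_{t-2} by ordinary least squares (normal
    equations solved with Cramer's rule) -- a linear stand-in for the
    network's non-linear mapping."""
    s11 = s22 = s12 = r1 = r2 = 0.0
    for (x2, x1), y in delay_embed(series, 2):   # x1 = x_{t-1}, x2 = x_{t-2}
        s11 += x1 * x1; s22 += x2 * x2; s12 += x1 * x2
        r1 += x1 * y;  r2 += x2 * y
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (r2 * s11 - r1 * s12) / det
    return a, b

def mape(actual, predicted):
    """Mean absolute percentage error, the accuracy measure quoted above."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

The same delay embedding feeds an actual FTDNN; only the mapping from the lag window to the prediction changes.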

  • Overview of information visualisation in science education   Order a copy of this article
    by Chun Hua Wang, Dong Han, Wen-Kuang Chou 
    Abstract: Developed from computer-assisted instruction, visual education is a new teaching method that applies computer-based visual design to education. Based on an overview of previous studies, this paper expounds the main features of education visualisation, outlines its theoretical basis, summarises empirical studies of science education visualisation, and distils the application scenarios and points requiring attention in science education visualisation, using static and dynamic visualisation as the classification criteria. The paper concludes that the success of education visualisation depends on students' knowledge background, visual perception and comprehension ability. Therefore, the design of education visualisation must ensure that the objects and contents of visualisation are adapted to the specific conditions and instructional objectives.
    Keywords: science education visualisation; static visualisation; dynamic visualisation.
    DOI: 10.1504/IJCSE.2016.10005643
     
  • An automation approach for architecture discovery in software design using genetic algorithm   Order a copy of this article
    by Sushama C, A Rama Mohan Reddy 
    Abstract: Software architectures are treated as valuable artifacts in software engineering, and the functionality of software depends on them. Software architectures provide high-level analysis whenever architects need to analyse the dynamic structure of a design. Modifications to designs are typically made manually, which is a very complicated process and sometimes does not solve the problem completely. This paper presents a genetic algorithm for the discovery of the underlying architectures of software designs. The genetic algorithm comprises modules such as encoding, the fitness function, and mutation. The algorithm was tested on real projects, and the complete experimental study is presented.
    Keywords: genetic algorithm; components; interactions; relations; search-based software engineering.

  • A modified electromagnetism-like mechanism algorithm with pattern search for global optimization   Order a copy of this article
    by Qing Wu, Chunjiang Zhang, Liang Gao 
    Abstract: The solution space of most global optimisation problems is very complex, which places high demands on the search performance of algorithms. The electromagnetism-like mechanism (EM) algorithm is an emerging global optimisation method. However, the intensification and diversification of the original EM are not very efficient. This paper proposes a modified EM algorithm. To improve the intensification ability, a more effective variable step-size pattern search is applied to replace the original random line search at the local search stage. Meanwhile, a perturbed point is used to increase diversity. In addition, the formula for calculating the total force is simplified to accelerate the algorithm's search process. Numerical experiments are conducted to compare the proposed algorithm with other variants of the EM algorithm and with different variants of particle swarm optimisation. The results show that the approach is competitive.
    Keywords: electromagnetism-like mechanism algorithm; pattern search; global optimisation; meta-heuristic algorithm; local search

  • Various GPU memory utilisation exploration for large RDF search   Order a copy of this article
    by Chantana Chantrapornchai 
    Abstract: Graphics Processing Units (GPUs) are important accelerators in today's desktop computers. They contain thousands of processing units that can run a program simultaneously, and various memory types, with different sizes and access times, connected in a hierarchy. However, GPUs have a much smaller internal memory size than a typical computer, which can be an obstacle to big data processing. In this paper, we study the use of various GPU memory types: global, texture, constant, and shared memory, for simultaneously searching large Resource Description Framework (RDF) data, a format commonly used on the internet to link web data. Using suitable memory types and properly managing data transfer can lead to better performance when processing such data. The results show that a parallel search over 45 gigabytes of RDF data on multiple GPUs, using global memory to store large texts and shared memory to store multiple keywords, can run about 14 times faster than a sequential search on a low-cost desktop.
    Keywords: graphic processing units; large RDF; parallel string search

  • Hough transform-based cubic spline recognition for natural shapes   Order a copy of this article
    by Cheng-Huang Tung, Wei-Jyun Syu, Wei-Cheng Huang 
    Abstract: A two-stage GHT-based cubic spline recognition method is proposed for recognising flexible natural shapes. First, the proposed method uses cubic splines to interpolate a flexible natural shape, and a sequence of connected boundary points is generated from the cubic splines; each such point has accurate tangent and curvature features. At the first recognition stage, the method uses the modified GHT to adjust the scale and orientation factors of the input shape with respect to each reference model. At the second recognition stage, the proposed point-based matching technique calculates the difference between each reference model and its corresponding adjusted input shape at the point level. In experiments recognising 15 categories of natural shapes, including fruits and vegetables, the recognition rate of the proposed two-stage method is 97.3%, much higher than the 79.3% measured for the standard GHT.
    Keywords: Hough transform; GHT; cubic spline; natural shape; curvature; tangent; point-based matching; recognition method; template database; boundary point.

  • Personalised service recommendation process based on service clustering   Order a copy of this article
    by Xiaona Xia 
    Abstract: Personalised service recommendation is a key technology for service platforms, and the demand preferences of users are important factors in personalised recommendation. First, in order to improve the accuracy and adaptability of service recommendation, services need to be initialised before being recommended and selected; they are then classified and clustered according to demand preferences, and service clusters are defined and demonstrated. For the sparsity problems of the service function matrix, historical and potential preferences are expressed as a pair of matrices. Second, a service cluster is viewed as the basic business unit; we optimise the graph summarisation algorithm and construct the service recommendation algorithm SCRP. Experiments with various parameter settings show that SCRP has advantages over other algorithms. Third, we select fuzzy degree and difference as the two key indicators, and use several service clusters to complete the simulation and analyse the algorithm's performance. The results show that our service selection and recommendation method outperforms others and can effectively improve the quality of service recommendation.
    Keywords: service clustering; service recommendation; graph summarisation algorithm; personalisation; preference matrix

  • Power-aware high level evaluation model of interconnect length of on-chip memory network topology   Order a copy of this article
    by XiaoJun Wang, Feng Shi, Yi-Zhuo Wang, Hong Zhang, Xu Chen, Wen-Fei Fu 
    Abstract: Interconnect power is the factor that dominates power consumption in on-chip memory architectures. Almost all dedicated wires and buses have been replaced with packet-switching interconnection networks, which have become the standard approach to on-chip interconnection. Unfortunately, rapid advances in technology are making it more difficult to assess the interconnect power consumption of a NoC. To address this problem, a new evaluation methodology, Interconnect Power Evaluation based on Topology of On-chip Memory (IP-ETOM), is proposed in this paper. To validate the method, two multicore architectures, 2D-Mesh and the Triplet-based Architecture (TriBA), are evaluated in this research. The on-chip memory network model is evaluated based on the characteristics of the on-chip architecture interconnection. Matlab is used to conduct the experiment that evaluates the interconnect power of TriBA and 2D-Mesh.
    Keywords: power evaluation; on-chip memory network topology; NoC interconnects; IP-ETOM

  • Optimising data access latencies of virtual machine placement based on greedy algorithm in datacentre   Order a copy of this article
    by Xinyan Zhang, Keqiu Li, Yong Zhang 
    Abstract: The total completion time of a task is a major bottleneck in big data processing applications based on parallel computation, since the computation and data are distributed over more and more nodes. The total completion time of a task is therefore an important index for evaluating cloud performance. The access latency between nodes is one of the key factors affecting task completion time for cloud datacentre applications. Additionally, minimising total access time can reduce the overall bandwidth cost of running the job. This paper proposes an optimisation model focused on optimising the placement of virtual machines (VMs) so as to minimise the total data access latency, given where the datasets are located. Under the proposed model, the VM placement problem is a linear programming problem. We therefore obtain the optimum solution of our model with the branch-and-bound algorithm, whose time complexity is O(2^{NM}). We also present a greedy algorithm, with time complexity O(NM), to solve our model. Finally, the simulation results show that the solutions of our model are superior to those of existing models and close to the optimal value.
    Keywords: datacentre; cloud environment; access latency; virtual machine placement; greedy algorithm
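A minimal sketch of a greedy placement heuristic in this spirit, under the assumptions that latency[v][h] gives VM v's total data-access latency on host h and that hosts have simple slot capacities summing to at least the number of VMs; the paper's exact greedy procedure may differ.

```python
def greedy_placement(latency, capacity):
    """Greedily place each VM on a remaining host with the lowest total
    data-access latency, handling VMs with the cheapest best option first.

    latency[v][h]: total access latency of VM v if placed on host h.
    capacity[h]:   number of VM slots on host h (assumed sufficient overall).
    """
    cap = list(capacity)
    # process VMs in order of their best achievable latency
    order = sorted(range(len(latency)), key=lambda v: min(latency[v]))
    placement, total = {}, 0
    for v in order:
        feasible = [h for h in range(len(cap)) if cap[h] > 0]
        h = min(feasible, key=lambda hh: latency[v][hh])
        placement[v], cap[h] = h, cap[h] - 1
        total += latency[v][h]
    return placement, total
```

The branch-and-bound solver mentioned above would instead search over all feasible assignments, trading the heuristic's O(NM) cost for an exact optimum.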

  • An empirical study of disclosure effects in listed biotechnology and medicine industry using MLR model   Order a copy of this article
    by Chiung-Lin Chiu, You-Shyang Chen 
    Abstract: This research employs the multiple linear regression model to investigate the relationship between voluntary disclosure and firm performance in biotechnology and medicine industry in Taiwan. Using 44 firm-year observations collected from Information Transparency and Disclosure Ranking System and Taiwan Economic Journal financial database for companies listed in the Taiwan Stock Exchange and Taipei Exchange Market, the regression results reveal that there is a positive and significant relationship between voluntary disclosure and firm performance. Firms with better voluntary disclosure have better performance than do firms without voluntary disclosure. The results suggest that companies should pay more attention to voluntary disclosure as additional information. It is also considered by investors as valuable information when making their investment decision.
    Keywords: voluntary disclosure; firm performance; investment decision; MLR; multiple linear regression model; biotechnology and medicine industry; TSE; Taiwan Stock Exchange; ITDRS; information transparency and disclosure ranking system
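The single-regressor core of such a multiple linear regression can be sketched as follows, with hypothetical numbers (not the study's data) and the control variables omitted.

```python
def simple_ols(x, y):
    """Ordinary least squares for y = b0 + b1*x -- the single-regressor core
    of a multiple linear regression. With x a 0/1 disclosure dummy, b1 is the
    mean performance difference between disclosing and non-disclosing firms."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    b0 = my - b1 * mx
    return b0, b1
```

The full study regresses performance on a disclosure ranking plus controls; the estimation principle is the same with more columns in the design matrix.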

  • A static analytical performance model for GPU kernel   Order a copy of this article
    by Jinjing Li 
    Abstract: Graphics processing units (GPUs) have shown increased popularity and play an important role as coprocessors in heterogeneous co-processing environments. Heavily data-parallel problems can be solved efficiently by tens of thousands of threads working collaboratively in parallel on a GPU architecture. The achieved performance therefore depends on the capability of multiple threads to collaborate in parallel, the effectiveness of latency hiding, and the use of the multiprocessors. In this paper, a static analytical kernel performance model (SAKP) is proposed, based on this performance principle, to estimate the execution time of a GPU kernel. Specifically, a set of kernel and device features for the target GPU is generated in the proposed model. We determine the performance-limiting factors and generate an estimate of the kernel execution time with this model. Matrix Multiplication (MM) and Histogram Generation (HG) were run on an NVIDIA GTX680 GPU card to verify the proposed model, which showed an absolute prediction error of less than 6.8%.
    Keywords: GPU; co-processing; static analytical kernel performance model; kernel and device features; absolute error.

  • Syntactic parsing of clause constituents for statistical machine translation   Order a copy of this article
    by Jianjun Ma, Jiahuan Pei, Degen Huang, Dingxin Song 
    Abstract: The clause is considered to be the basic unit of grammar in linguistics, a structure between a chunk and a sentence. Clause constituents, therefore, are an important kind of linguistically valid syntactic phrase. This paper adopts the CRFs model to recognise English clause constituents with their syntactic functions, and verifies their effect on machine translation by applying this syntactic information to an English-Chinese PBSMT system, evaluated on a corpus from the business domain. Clause constituents are classified into six main kinds: subject, predicator, complement, adjunct, residue of predicator, and residue of complement. Results show that our rich-feature CRFs model achieves an F-measure of 93.31%, a precision of 93.26%, and a recall of 93.04%. This syntactic knowledge of the source language is further combined with the NiuTrans phrasal SMT system, which slightly improves English-Chinese translation accuracy.
    Keywords: syntactic parsing; clause constituents; PBSMT.
    DOI: 10.1504/IJCSE.2016.10004598
     
  • A universal compression strategy using sorting transformation   Order a copy of this article
    by Bo Liu, Xi Huang, Xiaoguang Liu, Gang Wang, Ming Xu 
    Abstract: Although traditional universal compression algorithms can effectively exploit repetition located within a sliding window, they cannot take advantage of message sources in which similar messages are distributed far apart. In this paper, we propose a universal segmenting-sorting compression algorithm to solve this problem. The key idea is to reorder the message source before compressing it with the Lz77 algorithm. We design transformation methods for two common data types: corpora of webpages and access logs. The experimental results show that the segmenting-sorting transformation is genuinely beneficial to the compression ratio. Our new algorithm achieves a compression ratio 20% to 50% lower than that of the naive Lz77 algorithm, with almost the same decompression time. For some read-heavy sources, segmenting-sorting compression can reduce space cost while guaranteeing throughput.
    Keywords: segmenting; sorting; Lz77; compression; universal compression method.
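The sort-before-compress idea can be demonstrated with zlib (an LZ77-family compressor with a 32 KB window) on synthetic records arranged so that near-duplicates sit farther apart than the window. The record layout is an illustrative assumption, and the permutation needed to invert the sort is omitted.

```python
import random
import string
import zlib

# Build ~200 KB of records where duplicates sit far apart: 400 distinct
# 100-character templates, each appearing 5 times, interleaved so that
# repeats are ~40 KB apart -- beyond zlib's 32 KB LZ77 window.
random.seed(42)
templates = [''.join(random.choices(string.ascii_lowercase, k=100))
             for _ in range(400)]
records = [templates[i] for _ in range(5) for i in range(400)]  # interleaved

def segment_sort_compress(recs):
    """Sort the records so near-duplicates become adjacent, then compress.
    A sketch of the segmenting-sorting transformation; a real implementation
    would also store the permutation so the transform can be inverted."""
    return zlib.compress('\n'.join(sorted(recs)).encode(), 9)

plain = zlib.compress('\n'.join(records).encode(), 9)
transformed = segment_sort_compress(records)
```

Because sorting moves the five copies of each template next to each other, the LZ77 stage can match them; in the interleaved order it cannot, so `transformed` comes out much smaller than `plain`.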

  • Executing time and cost-aware task scheduling in hybrid cloud using a modified DE algorithm   Order a copy of this article
    by Yuanyuan Fan, Qingzhong Liang, Yunsong Chen 
    Abstract: Task scheduling is one of the basic problems in cloud computing. In a hybrid cloud, task scheduling faces new challenges. In this paper, we propose the GaDE algorithm, based on differential evolution, to improve single-objective scheduling performance in a hybrid cloud. To better handle multi-objective task scheduling optimisation in hybrid clouds, we build on GaDE and the Pareto-optimal quick sorting method to present a multi-objective algorithm named NSjDE; this algorithm also reduces the frequency of evaluation. Experiments comparing the Min-Min, GaDE and NSjDE algorithms show that, for single-objective task scheduling, the GaDE and NSjDE algorithms are better at finding approximately optimal solutions. The optimisation speed of the multi-objective NSjDE algorithm is faster than that of the single-objective jDE algorithm, and NSjDE can produce more than one non-dominated solution meeting the requirements, providing more options to the user.
    Keywords: hybrid cloud; task scheduling; executing time-aware; cost-aware

  • A dynamic cold-start recommendation method based on incremental graph pattern matching   Order a copy of this article
    by Yanan Zhang, Guisheng Yin, Deyun Chen 
    Abstract: In order to give accurate recommendations to a cold-start user who has few records, researchers find similar users for the cold-start user according to social networks. However, these efforts assume that the cold-start user's social relationships are static, and ignore the fact that updating social relationships in large-scale social networks is time consuming. In social networks, cold-start users and other users may change their social relationships as time goes by. In order to give accurate and timely recommendations to cold-start users, it is necessary to continuously update the users similar to a cold-start user according to that user's latest social relationships. In this paper, an incremental graph pattern matching based dynamic cold-start recommendation method (IGPMDCR) is proposed, which updates the similar users for a cold-start user based on the topology of the social network, and gives recommendations according to the latest similar users. The experimental results show that IGPMDCR can give accurate and timely recommendations to cold-start users.
    Keywords: dynamic cold-start recommendation; social network; incremental graph pattern matching; topology of social network.
    DOI: 10.1504/IJCSE.2016.10006198
     
  • Modelling and simulation research of vehicle engines based on computational intelligence methods   Order a copy of this article
    by Ling-ge Sui, Lan Huang 
    Abstract: We assess the feasibility of two widely used artificial neural network (ANN) technologies in the field of transient emission simulation. In this work, the back-propagation feedforward neural network (BPNN) is shown to be more suitable than the radial basis function neural network (RBFNN). Considering the transient change rule of a transient operation, the composite transient rate, composed of the torque transient rate and the air-fuel ratio (AFR) transient rate, is innovatively adopted as an input variable to the BPNN transient emission model. On this basis, a whole-process transient simulation platform for a test diesel engine is established using multi-soft coupling technology. Through a transient emission simulation, the veracity and generalisation ability of the simulation platform are confirmed: the platform correctly predicts the change trends, with peak-value differences within 8%. Our findings suggest that the simulation platform can be applied to the study of control strategies for typical transient operations.
    Keywords: transient emission; simulation; back-propagation feedforward neural network; radial basis function neural network; diesel engine.
    DOI: 10.1504/IJCSE.2018.10006094
     
  • Institution-based UML activity diagram transformation with semantic preservation   Order a copy of this article
    by Amine Achouri, Yousra Bendaly Hlaoui, Leila Jemni Ben Ayed 
    Abstract: This paper presents a specific tool, called MAV-UML-AD, allowing the specification and verification of workflow models using UML Activity Diagrams (UML AD) and Event-B, based on the theory of institutions. The developed tool translates an activity diagram model into an equivalent Event-B specification according to a mathematical semantics. The transformation approach for UML AD models is based on institution theory: each of UML AD and Event-B is defined by an instance of its corresponding institution, and the transformation is represented by an institution co-morphism defined between the two institutions. Institution theory is adopted as the theoretical framework of the tool for two main reasons: first, it gives a local mathematical semantics for UML AD and Event-B; second, it allows the definition of a semantics-preserving mapping between a UML AD specification and an Event-B machine. Thanks to the B theorem prover, functional properties such as liveness and fairness can be formally checked. The paper highlights the core of the model transformation approach and shows how institution concepts such as category, co-morphism and signature are represented in the two formalisms. It also illustrates the use of the developed tool MAV-UML-AD through an example of specification and verification.
    Keywords: formal semantics; model-driven engineering; institution theory; Event-B; UML activity diagram; formal verification

  • The analysis of evolutionary optimisation on the TSP(1,2) problem   Order a copy of this article
    by Xiaoyun Xia, Xinsheng Lai, Chenfu Yi 
    Abstract: The TSP(1,2) problem is a special case of the travelling salesperson problem, and is NP-hard. Many heuristics, including evolutionary algorithms (EAs), have been proposed to solve the TSP(1,2) problem; however, little is known about the performance of EAs on it. This paper presents an approximation analysis of the (1+1) EA on this problem. It is shown that both the (1+1) EA and the (μ+λ) EA can obtain a 3/2 approximation ratio for this problem in expected polynomial runtimes O(n³) and O((μ/λ)n³ + n), respectively. Furthermore, we prove that the (1+1) EA admits a much tighter upper bound than a simple ACO on the TSP(1,2) problem.
    Keywords: evolutionary algorithms; TSP(1,2); approximation performance; analysis of algorithm; computational complexity.
    DOI: 10.1504/IJCSE.2016.10007955
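    As context for such analyses, a toy (1+1) EA on a TSP(1,2) instance (edge weights 1 or 2) can be sketched with a single segment-reversal (2-opt style) mutation per step; this simplified operator is an illustration, not the exact mutation analysed in the paper:

```python
import random

def tour_cost(tour, w):
    """Cost of a cyclic tour under weight matrix w (entries 1 or 2)."""
    n = len(tour)
    return sum(w[tour[i]][tour[(i + 1) % n]] for i in range(n))

def one_plus_one_ea(w, steps=2000, seed=1):
    """(1+1) EA: start from a random tour; repeatedly reverse a random
    segment and keep the offspring if it is no worse than the parent."""
    rng = random.Random(seed)
    n = len(w)
    tour = list(range(n))
    rng.shuffle(tour)
    for _ in range(steps):
        i, j = sorted(rng.sample(range(n), 2))
        child = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        if tour_cost(child, w) <= tour_cost(tour, w):
            tour = child
    return tour
```

    On a TSP(1,2) instance every tour costs between n and 2n, which is why a 3/2 approximation ratio is the interesting threshold.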
     
  • A novel rural microcredit decision model and solving via binary differential evolution algorithm   Order a copy of this article
    by Dazhi Jiang, Jiali Lin, Kangshun Li 
    Abstract: Generally, as an economic means of lifting people out of poverty, microcredit has been accepted as an effective method for empowering both individuals and communities. However, risk control remains a core part of implementing the agriculture-related loan business in microcredit companies. In this paper, a rural microcredit decision model is presented based on maximising profit while minimising risk. A binary differential evolution algorithm is then applied to solve the decision model. The results show that the proposed method and model are sound and easy to operate, and can provide a reference solution for decision management in microcredit companies.
    Keywords: risk control; microcredit; decision model; binary differential evolution

  • Q-grams-imp: an improved q-grams algorithm aimed at edit similarity join   Order a copy of this article
    by Zhaobin Liu, Yunxia Liu 
    Abstract: Similarity join is increasingly important in many applications and has attracted widespread attention from scholars and communities. It has been used in applications such as spell checking, copy detection, entity linking and pattern recognition. In many web and enterprise scenarios, where typos and misspellings often occur, an efficient algorithm is needed to handle such data. In this paper, we propose an improved q-grams algorithm, called q-grams-imp, aimed at solving edit similarity join. The algorithm reduces the number of tokens and thus the space cost; it is best suited to strings of equal size, while strings of different sizes require additional handling to fit the algorithm. Experimental results show that the proposed algorithm outperforms the traditional method.
    Keywords: similarity join; q-grams algorithm; edit distance.
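    The idea behind q-gram-based edit similarity join can be illustrated with the standard count filter: one edit operation destroys at most q q-grams, so candidate pairs sharing too few q-grams can be pruned before the expensive edit-distance computation. The sketch below shows this generic filter, not the authors' improved q-grams-imp algorithm:

```python
from collections import Counter

def qgrams(s, q=2):
    """Multiset of q-grams of s (positional padding omitted for brevity)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))

def passes_count_filter(s, t, tau, q=2):
    """Count filter: if ed(s, t) <= tau, the strings must share at least
    max(|s|, |t|) - q + 1 - q * tau q-grams (multiset overlap).
    Pairs failing this bound cannot be within edit distance tau."""
    common = sum((qgrams(s, q) & qgrams(t, q)).values())
    return common >= max(len(s), len(t)) - q + 1 - q * tau

def edit_distance(s, t):
    # classic dynamic-programming edit distance for the verification step
    dp = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        prev, dp[0] = dp[0], i
        for j, ct in enumerate(t, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (cs != ct))
    return dp[len(t)]
```

    In a join, the cheap filter runs on all candidate pairs and the quadratic edit distance only on survivors.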

  • An algorithm based on differential evolution for satellite data Transmission Scheduling   Order a copy of this article
    by Qingzhong Liang, Yuanyuan Fan, Xuesong Yan, Ye yan 
    Abstract: Data transmission task scheduling is one of the important problems in satellite communication. It can be considered a combinatorial optimisation problem over satellite data transmission demands, visible time windows and ground station resources, and is NP-complete. In this paper, we propose a satellite data transmission task scheduling algorithm that searches for an optimised solution within a differential evolution framework. During evolution, the individual evaluation procedure is improved by a modified 0/1 knapsack-based method. Extensive experiments are conducted to examine the effectiveness and performance of the proposed scheduling algorithm. Experimental results show that the schedules generated by the algorithm satisfy the scheduling constraints and are consistent with expectations.
    Keywords: data transmission; task scheduling; differential evolution; knapsack problem
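    The modified knapsack-based evaluation is the paper's own contribution, but the classic 0/1 knapsack dynamic programme it presumably builds on is standard. Here, purely as an illustration, an "item" could be a transmission task (weight = required window time, value = task priority) and the capacity a station's visible time window:

```python
def knapsack_01(values, weights, capacity):
    """Classic 0/1 knapsack DP: maximum total value of items whose
    total weight fits in `capacity`. dp[c] holds the best value
    achievable with capacity c using the items seen so far."""
    dp = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):  # descend so each item is used once
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]
```

    The descending inner loop is what distinguishes 0/1 knapsack from the unbounded variant: each item can contribute at most once.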

  • Dynamic load balance strategy for parallel rendering based on deferred shading   Order a copy of this article
    by Mingqiang Yin, Dan Sun, Hui Sun 
    Abstract: To solve the problem of low rendering efficiency for large scenes with a complex illumination model, a new deferred shading method is proposed and applied to a parallel rendering system. An algorithm for rendering-task assignment is designed so that the rendering times of the slave nodes in the parallel rendering system are equal to each other. In the deferred shading method, the rendering of every frame is divided into two phases. The first, geometry processing, is responsible for visibility detection: the primitives are distributed evenly to the rendering nodes and rendered without illumination, and the pixels that should be shaded, together with their corresponding primitives, are found. The second, pixel shading, colours the pixels found in the first phase; the pixels are assigned evenly to the rendering nodes according to the image of the previous frame. As the rendering tasks in both phases are assigned evenly, the rendering times of the nodes in the cluster are roughly equal. Experiments show that this method can improve the rendering efficiency of the parallel rendering system.
    Keywords: parallel rendering; deferred shading; load balance.

  • Big data automatic analysis system and its applications in rockburst experiment   Order a copy of this article
    by Yu Zhang 
    Abstract: In 2006, the State Key Laboratory for GeoMechanics and Deep Underground Engineering (GDLab for short) successfully reproduced the rockburst process indoors. Since then, a series of valuable research results has been obtained on the rockburst mechanism. At the same time, several dilemmas have emerged, concerning data storage, data analysis and prediction accuracy. GDLab has accumulated more than 500 TB of rockburst experiment data, but so far less than 5% of it has been analysed. The primary cause of these dilemmas is the large amount of experimental data produced during rockburst studies. In this paper, a novel big data automatic analysis system for rockburst experiments is proposed, and its modules and algorithms are designed and realised. Theoretical analysis and experimental research show that the system can improve the existing research mechanism of rockburst and make previously infeasible analyses possible. This work lays a theoretical foundation for rockburst mechanism research.
    Keywords: rock burst; experiment data; big data; automatic analysis

  • Training auto-encoders effectively via eliminating task-irrelevant input variables   Order a copy of this article
    by Hui Shen, Dehua Li, Zhaoxiang Zang, Hong Wu 
    Abstract: Auto-encoders are often used as building blocks of deep network classifiers to learn feature extractors, but task-irrelevant information in the input data may lead to bad extractors and poor generalisation performance of the network. In this paper, we show that dropping task-irrelevant input variables can markedly improve the performance of auto-encoders. Specifically, an importance-based variable selection method is proposed to find task-irrelevant input variables and drop them: the importance of each variable is first estimated, and variables with importance below a threshold are then dropped. To obtain better performance, the method can be applied to each layer of stacked auto-encoders. Experimental results show that, when combined with our method, stacked denoising auto-encoders achieve significantly improved performance on three challenging datasets.
    Keywords: feature learning; deep learning; neural network; auto-encoder; stacked auto-encoders; variable selection; feature selection; unsupervised training
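    The paper's importance measure is its own contribution; the thresholded dropping step itself is generic and can be sketched as follows, with per-feature variance used purely as an illustrative stand-in for the learned importance values:

```python
import numpy as np

def drop_low_importance(X, importance, threshold):
    """Drop input variables whose importance falls below `threshold`,
    returning the reduced data and the kept column indices. The
    importance vector is supplied by the caller (the paper derives
    its own estimator; any per-feature score fits this interface)."""
    keep = np.flatnonzero(importance >= threshold)
    return X[:, keep], keep

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 2] *= 0.01            # a nearly constant, task-irrelevant column
imp = X.var(axis=0)        # stand-in importance: per-feature variance
X_red, kept = drop_low_importance(X, imp, threshold=0.1)
```

    For stacked auto-encoders, the same step would be repeated on each layer's input representation before training that layer.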

  • Model-checking software product lines based on feature slicing   Order a copy of this article
    by Mingyu Huang, Yumei Liu 
    Abstract: The feature model is a popular formalism for describing the commonality and variability of a software product line in terms of features. Feature models represent the space of possible application configurations, and can be customised based on specific domain requirements and stakeholder goals. As feature models become increasingly complex, it is desirable to provide automatic support for customised analysis and verification based on the specific goals and requirements of stakeholders. This paper first presents feature model slicing based on user requirements. We then introduce a three-valued abstraction of behaviour models based on the slicing unit. Finally, using a multi-valued model checker, a case study is conducted to illustrate the effectiveness of our approach.
    Keywords: feature model; slicing; three-valued model; model checking

  • Decomposition-based multi-objective comprehensive learning particle swarm optimisation   Order a copy of this article
    by Xiang Yu, Hui Wang, Hui Sun 
    Abstract: This paper proposes decomposition-based comprehensive learning particle swarm optimisation (DCLPSO) for multi-objective optimisation. DCLPSO uses multiple swarms, with each swarm optimising a separate objective. Two sequential phases are conducted: independent search and then cooperative search. Important information related to the extreme points of the Pareto front can often be found in the independent search phase. In the cooperative search phase, a particle randomly learns from its personal best position or an elitist on each dimension. Elitists are non-dominated solutions and are stored in an external repository shared by all the swarms. Mutation is applied to each elitist in this phase to help escape local Pareto fronts. Experiments conducted on various benchmark problems demonstrate that DCLPSO is competitive in terms of convergence and diversity of the resulting non-dominated solutions.
    Keywords: particle swarm optimisation; comprehensive learning; decomposition; multi-objective optimisation.
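    DCLPSO maintains an external repository of non-dominated elitists. Independently of the paper's particular swarm mechanics, the standard dominance test and repository update this relies on can be sketched as:

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation):
    no worse in every objective and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def update_repository(repo, candidate):
    """Maintain the external repository of elitists: reject the
    candidate if any member dominates it; otherwise drop the members
    it dominates and add it."""
    if any(dominates(r, candidate) for r in repo):
        return repo
    return [r for r in repo if not dominates(candidate, r)] + [candidate]
```

    Real implementations also bound the repository size (e.g. by crowding distance); that pruning step is omitted here.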

  • Applicability evaluation of different algorithms for daily reference evapotranspiration model in KBE system   Order a copy of this article
    by Yubin Zhang, Zhengying Wei, Lei Zhang, Jun Du 
    Abstract: An irrigation decision-making system based on knowledge-based engineering (KBE) is reported in this paper. It can accurately predict water and fertiliser requirements and provide intelligent irrigation diagnosis and decision support. The basis of the KBE system, however, is knowledge of reference crop evapotranspiration (ET0). The research therefore examined the accuracy of support vector machines (SVMs) in modelling ET0. The main obstacles to computing ET0 with the Penman-Monteith model are its complicated nonlinear form and the many climate variables it requires; moreover, these variables are calculated from the original meteorological data and there is no single calculation standard. SVM models can thus be applied with original or limited data, which is especially valuable in developing countries. The flexibility of SVMs in ET0 modelling was assessed using the original meteorological data (Tmax, Tm, Tmin, n, Uh, RHm, φ, Z) of the years 1990-2014 at five stations in Shaanxi, China. These eight parameters were used as the input, and the reference evapotranspiration values were the output. In the first part of the study, the SVMs were compared with the FAO-24, Hargreaves, McCloud, Priestley-Taylor and Makkink models; the comparison indicated that the SVMs performed better. In the second part, the total ET0 estimates of the SVMs were compared with those of the other models in validation, and the SVM models were found to be superior in terms of relative error. Further assessment confirmed that the models can provide a powerful tool for KBE irrigation when meteorological data are lacking. This research could serve as a reference for accurate ET0 estimation in decision-making for KBE irrigation systems based on data collected from humidity sensors and weather stations in the field.
    Keywords: reference evapotranspiration; support vector machines; knowledge-based engineering; original meteorological data.

  • Multi hidden layer extreme learning machine optimised with batch intrinsic plasticity   Order a copy of this article
    by Shan Pang, Xinyi Yang 
    Abstract: The extreme learning machine (ELM) is a novel learning algorithm in which training is restricted to the output weights to achieve a fast learning speed. However, ELM tends to require more neurons in the hidden layer and sometimes suffers from ill-conditioning owing to the random selection of input weights and hidden biases. To address these problems, we propose a multi-hidden-layer ELM optimised with the batch intrinsic plasticity (BIP) scheme. The proposed algorithm has a deep structure and thus learns features more efficiently, and the combination with the BIP scheme helps to achieve better generalisation ability. Comparisons with some state-of-the-art ELM algorithms on both regression and classification problems verify the performance and effectiveness of the proposed algorithm.
    Keywords: neural network; extreme learning machine; batch intrinsic plasticity; multi hidden layers.
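    The ELM principle the paper builds on is that the input weights are random and frozen while the output weights are fitted in closed form. A single-hidden-layer sketch of that principle (the paper's multi-hidden-layer and BIP extensions are not reproduced here) might be:

```python
import numpy as np

def train_elm(X, y, hidden=50, seed=0):
    """Single-hidden-layer ELM: draw random input weights and biases,
    freeze them, and fit only the output weights via the
    Moore-Penrose pseudoinverse of the hidden-layer activations."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = np.tanh(X @ W + b)           # random feature map
    beta = np.linalg.pinv(H) @ y     # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

    Because only a linear least-squares problem is solved, training is orders of magnitude faster than back-propagation, at the cost of needing more hidden neurons, exactly the trade-off the abstract describes.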

  • Chaotic artificial bee colony with elite opposition-based learning strategy   Order a copy of this article
    by Zhaolu Guo, Jinxiao Shi, Xiaofeng Xiong, Xiaoyun Xia, Xiaosheng Liu 
    Abstract: The artificial bee colony (ABC) algorithm is a promising evolutionary algorithm inspired by the foraging behaviour of honey bee swarms, which has obtained satisfactory solutions in diverse applications. However, the basic ABC demonstrates insufficient exploitation capability in some cases. To address this issue, a chaotic artificial bee colony with an elite opposition-based learning strategy (CEOABC) is proposed in this paper. During the search process, CEOABC employs chaotic local search to promote the exploitation ability, and the elite opposition-based learning strategy is used to exploit the potential information of exhausted solutions. Experimental comparisons with several ABC variants show that CEOABC is a competitive approach for global optimisation.
    Keywords: artificial bee colony; chaotic local search; opposition-based learning; elite strategy.
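    The two ingredients named in the abstract have standard textbook forms. A sketch follows, with fixed per-dimension bounds standing in for the elite-defined bounds of the actual CEOABC, and the logistic map as a typical choice of chaotic driver:

```python
def logistic_map(x):
    """Logistic chaotic map on (0, 1), a common driver for chaotic search."""
    return 4.0 * x * (1.0 - x)

def elite_opposite(x, lo, hi):
    """Opposition-based learning: reflect a solution through the search
    bounds, x_i -> lo_i + hi_i - x_i. In elite OBL the bounds come from
    the region spanned by the current elites (simplified to fixed
    bounds here)."""
    return [l + h - xi for xi, l, h in zip(x, lo, hi)]

def chaotic_perturb(x, chaos, scale=0.1):
    """Chaotic local search step: shift each coordinate by a chaotic
    offset in [-scale/2, +scale/2]; returns the new point and the
    advanced chaotic state."""
    out = []
    for xi in x:
        chaos = logistic_map(chaos)
        out.append(xi + scale * (chaos - 0.5))
    return out, chaos
```

    In the full algorithm, the opposite point replaces the original only if it evaluates better, so both steps preserve greedy improvement.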

  • Numerical simulations of electromagnetic wave logging instrument response based on self-adaptive hp finite element method   Order a copy of this article
    by L.I. Hui, Zhu Xifang, Liu Changbo 
    Abstract: Numerical simulation of instrument response is an important method to calibrate instrument parameters, evaluate detection performance and verify complex system theory. Measurement results of electrical well logging are important for the interpretation of measurement data and the characterisation of oil reservoirs, especially in horizontal directional drilling and shale gas and oil development. In this paper, a self-adaptive hp finite element method is used to investigate electrical well logging instrument responses, such as those of the electromagnetic wave resistivity logging-while-drilling (LWD) tool and the through-casing resistivity logging (TCRL) tool. The results illustrate the efficiency of the method and provide physical interpretation of the resistivity measurements obtained with the LWD and TCRL tools. Numerical simulation examples are provided to show the validity, accuracy and efficiency of the self-adaptive hp finite element method. The high-accuracy simulation results are of great importance for the calibration of electrical well logging tools and the interpretation of logging data.
    Keywords: numerical simulation; parameters calibration; electromagnetic wave resistivity logging-while-drilling; through-casing resistivity logging; self-adaptive hp finite element method.

  • Upgrading event and pattern detection to big data   Order a copy of this article
    by Soumaya Cherichi, Rim Faiz 
    Abstract: One of the marvels of our time is the unprecedented development and use of technologies that support social interaction. Social mediating technologies have engendered radically new ways of sharing information and communicating, particularly during events such as natural disasters (earthquakes and tsunamis) and the American presidential election. This paper is based on data obtained from Twitter, because of its popularity and sheer data volume. This content can be combined and processed to detect events, entities and popular moods, feeding various new large-scale data-analysis applications. On the downside, these content items are very noisy and highly informal, making it difficult to extract sense from the stream. Taking these difficulties into account, we propose a new event detection approach combining linguistic features and Twitter features. Finally, we present our event detection system for microblogs, which aims to (1) detect new events, (2) recognise the temporal marker patterns of an event, and (3) classify important events according to thematic pertinence, author pertinence and tweet volume.
    Keywords: microblogs; event detection; temporal markers; patterns; social network analysis.

  • A security ensemble framework for securing a file in cloud computing environments   Order a copy of this article
    by Sharon Moses J, Nirmala M 
    Abstract: The scalability and on-demand features of cloud computing have revolutionised the IT industry. Cloud computing offers the user flexibility in several respects, including pay-as-you-use charging, while the entire burden of computing, resource management and file storage is moved to the cloud service provider. File storage in clouds is an important issue for both service providers and end users, and securing stored files against internal and external attacks has become a primary concern for cloud storage providers. The enormous amounts of personal and confidential information accumulated in cloud storage draw hackers and data pirates intent on stealing the information at any cost. Once a file is stored in cloud storage, the user has neither authority over the file nor knowledge of its physical location. In this paper, the threats involved in file storage are examined and a secure way of protecting stored files using a novel ensemble of security strategies is presented. An encryption ensemble module is incorporated over an OpenStack cloud infrastructure to protect files: five symmetric block ciphers are used in the encryption module to encrypt and decrypt a file without disturbing the existing security measures applied to it. The proposed strategy helps service providers as well as users to secure files in cloud storage more efficiently.
    Keywords: Cloud Storage; File Privacy; File Security; Swift storage; OpenStack security; Security ensemble.

  • Virtual guitar: using real-time finger tracking for musical instruments   Order a copy of this article
    by Noorkholis Luthfil Hakim, Shih-Wei Sun, Mu-Hsen Hsu, Timothy K. Shih, Shih-Jung Wu 
    Abstract: Kinect, a 3D sensing device from Microsoft, has spurred the evolution of Human Computer Interaction (HCI) research. Kinect has been applied in many areas, including music. One application is the Virtual Musical Instrument (VMI) system, which uses natural gestures to produce synthetic sounds similar to those of a real musical instrument. From related work, we found that the use of large joints, such as the hand, arm or leg, is inconvenient and limits the way a VMI can be played. This study therefore proposes a fast and reliable finger tracking algorithm suitable for VMI playing. In addition, a virtual guitar application was developed as an implementation of the proposed algorithm. Experimental results show that the proposed method can be used to play a variety of tunes with acceptable quality. Furthermore, the proposed application could be used by beginners who have no experience in music or in playing a real musical instrument.
    Keywords: virtual guitar; finger tracking; musical instrument; human computer interaction; HCI; hand detection; hand tracking; hand gesture recognition; virtual musical instrument; VMI; depth camera.

  • A cloud computing price model based on virtual machine performance degradation   Order a copy of this article
    by Dionisio Machado Leite, Maycon Peixoto, Carlos Ferreira, Bruno Batista, Danilo Costa, Marcos Santana, Regina Santana 
    Abstract: This paper examines the interference effects on virtual machine performance under heavy workloads, with the aim of improving resource charging in cloud computing. The objective is to produce an acceptable pay-as-you-go model for use by cloud computing providers. Presently, pay-as-you-go pricing is based on virtual machine usage per unit of time. However, this scheme does not consider the interference caused by virtual machines running concurrently, which may cause performance degradation. To obtain a fair charging model, this paper proposes an approach that adjusts the initial price to account for virtual machine performance interference. Results show the benefits of a fair pay-as-you-go model that meets users' actual requirements. This novel model contributes to cloud computing with a fair and transparent price composition.
    Keywords: cloud computing; pay-as-you-go; virtualisation; quality of service.

  • Designing scrubbing strategy for memories suffering MCUs through the selection of optimal interleaving distance   Order a copy of this article
    by Wei Zhou, Hong Zhang, Hui Wang, Yun Wang 
    Abstract: As technology scales, multiple cell upsets (MCUs) have become prominent, greatly affecting memory reliability. Ideally, the interleaving distance (ID) should be chosen as the maximum expected MCU size; to mitigate MCU errors, interleaving schemes together with single error correction (SEC) codes can be used to provide the greatest protection. In this paper, we propose the use of scrubbing sequences to improve memory reliability. The key idea is to exploit the locality of the errors caused by an MCU to make scrubbing more efficient. Single error correction, double error detection and double adjacent error correction (SEC-DED-DAEC) codes are also used. A procedure is presented to determine a scrubbing sequence that maximises reliability, and a scrubbing-strategy algorithm, which keeps the area overhead and complexity as low as possible without compromising memory reliability, is proposed for selecting the optimal interleaving distance. The approach is applied to a case study, and the results show a significant increase in the Mean Time To Failure (MTTF) compared with traditional scrubbing.
    Keywords: interleaving distance; memory; multiple cell upsets (MCUs); soft error; reliability; scrubbing; radiation.
    DOI: 10.1504/IJCSE.2016.10004753
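    The role of the interleaving distance can be illustrated with a toy model: with ID = D, physically adjacent bits are assigned to D different logical words, so an MCU spanning up to D adjacent cells flips at most one bit per word, which a SEC code can then correct. A hypothetical one-dimensional sketch:

```python
def interleaved_word(physical_index, interleaving_distance):
    """Map a physical bit position in a row to its logical word:
    round-robin assignment means adjacent cells land in different words."""
    return physical_index % interleaving_distance

def mcu_correctable(first_hit, size, interleaving_distance):
    """True if an MCU flipping `size` adjacent bits starting at
    `first_hit` upsets no logical word more than once (i.e. every
    affected word stays within SEC's single-error budget)."""
    hits = [interleaved_word(first_hit + k, interleaving_distance)
            for k in range(size)]
    return len(hits) == len(set(hits))
```

    By the pigeonhole principle, any MCU of size at most D is correctable and any larger one is not, which is why the ID should match the maximum expected MCU size.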
     
  • A model of mining approximate frequent itemsets using rough set theory   Order a copy of this article
    by Yu Xiaomei, Wang Hong, Zheng Xiangwei 
    Abstract: Datasets can be described by decision tables. In real-life applications, data are usually incomplete and uncertain, which poses big challenges for mining frequent itemsets in imprecise databases. This paper presents a novel model for mining approximate frequent itemsets using the theory of rough sets. With a transactional information system constructed on the dataset under consideration, a transactional decision table is put forward; lower and upper approximations of support are then available that can be easily computed from the indiscernibility relations. Finally, in a divide-and-conquer manner, the approximate frequent itemsets are discovered, taking into account the defined support-based accuracy and coverage. The novel model is evaluated on both synthetic datasets and real-life applications, and the experimental results demonstrate its usability and validity.
    Keywords: rough set theory; data mining; decision table; approximate frequent itemsets; indiscernibility relation.

  • Improved predicting algorithm of RNA pseudoknotted structure   Order a copy of this article
    by Zhendong Liu, Daming Zhu, Qionghai Dai 
    Abstract: The prediction of RNA structure with pseudoknots is an NP-hard problem. Based on minimum free energy models and computational methods, we investigate the RNA pseudoknotted structure. The paper presents an efficient algorithm for predicting RNA structure with pseudoknots; the algorithm takes O(n³) time and O(n²) space. Experimental tests on Rfam 10.1 and PseudoBase indicate that the algorithm is effective and precise, and that it can predict arbitrary pseudoknots. Furthermore, there exists a (1+ε) (ε>0) polynomial-time approximation scheme for searching the maximum number of stackings, and we prove this approximation scheme for the RNA pseudoknotted structure.
    Keywords: RNA pseudoknotted structure; predicting algorithm; PTAS; pseudoknots; minimum free energy.

  • An efficient algorithm for modelling and dynamic prediction of network traffic   Order a copy of this article
    by Wenjie Fan 
    Abstract: Network node degradation is an important problem in the internet of things, given the ubiquity of personal computers, tablets, phones and other equipment nowadays. To characterise network traffic degradation as one or multiple nodes in a network fail, this paper proposes an algorithm based on Product Form Results (PRF) for the Fractionally Auto Regressive Integrated Moving Average (FARIMA) model, namely PFRF. In this algorithm, the prediction method is established by the FARIMA model, through queuing-theory equations for the steady-state queuing behaviour and average queue length. Experimental simulations were conducted to investigate the relationships between average queue length and service rate. The results demonstrate that the algorithm not only has good adaptability, but also achieves a standard deviation of 9.87, indicating high prediction accuracy given the small difference between the original values and the algorithm's predictions.
    Keywords: prediction; product form results; FARIMA model; average length of queue.

  • Reversible image watermarking based on texture analysis of grey level co-occurrence matrix   Order a copy of this article
    by Shu-zhi Li, Qin Hu, Xiao-hong Deng, Zhaoquan Cai 
    Abstract: Embedding the watermark in complex areas of the image can effectively improve concealment. However, most methods simply use the mean squared error (MSE) and other simple measures to judge texture complexity. In this paper, we propose a new texture analysis method based on the grey level co-occurrence matrix (GLCM) and provide an in-depth discussion on how to accurately choose a complex region; the method is then applied to reversible image watermarking. Firstly, the original host image is divided into 128 × 128 sub-blocks. The mean squared error is then used to weight the four texture feature parameters, establishing the relationship between these parameters and the complexity of an image sub-block. Applying this formula, we calculate the complexity of each sub-block and select the sub-block with the highest texture complexity. If the embedding positions are insufficient, we select the sub-block with the next-highest complexity to embed the watermark, until a satisfactory embedding capacity is reached. Pairwise prediction error expansion (PPEE) is used to hide the data.
    Keywords: grey level co-occurrence matrix; image sub block; texture complexity; reversible image watermarking.
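    A GLCM and the texture descriptors commonly derived from it can be computed directly. The sketch below uses a single horizontal offset and three standard descriptors (the paper combines four feature parameters with MSE-derived weights, which is not reproduced here):

```python
import numpy as np

def glcm_horizontal(img, levels):
    """Grey level co-occurrence matrix for the (0, 1) offset, i.e.
    horizontally adjacent pixel pairs, normalised to probabilities."""
    g = np.zeros((levels, levels))
    for row in img:
        for a, b in zip(row[:-1], row[1:]):
            g[a, b] += 1
    return g / g.sum()

def texture_features(g):
    """Common GLCM texture descriptors: contrast, energy, homogeneity."""
    i, j = np.indices(g.shape)
    return {
        "contrast": float(((i - j) ** 2 * g).sum()),
        "energy": float((g ** 2).sum()),
        "homogeneity": float((g / (1.0 + np.abs(i - j))).sum()),
    }
```

    A flat sub-block yields contrast 0 and energy 1, while busier textures score higher contrast, which is the kind of signal used to rank sub-blocks by complexity.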

  • A semantic recommender algorithm for 3D model retrieval based on deep belief networks   Order a copy of this article
    by Li Chen, Hong Liu, Philip Moore 
    Abstract: Interest in 3D modelling is growing; however, the retrieval results achieved by semantic-based 3D model retrieval systems have been disappointing. In this paper, we propose a novel semantic recommendation algorithm based on a Deep Belief Network (DBN-SRA) to implement semantic retrieval, with the potential semantic correlations between models learned from known model samples via deep learning. The algorithm uses the feature correlations between models as conditions to enable semantic matching of 3D models and to obtain the final recommended retrieval result. Our approach has been shown to improve the effectiveness of 3D model retrieval, in terms of both retrieval time and, importantly, accuracy. Additionally, our results suggest that the posited approach will generalise to recommender systems in other domains characterised by multiple feature relationships.
    Keywords: deep belief network; 3D model retrieval; recommender algorithm; cluster analysis.

  • Differential evolution with spatially neighbourhood best search in dynamic environment   Order a copy of this article
    by Dingcai Shen, Longyin Zhu 
    Abstract: In recent years, there has been growing interest in applying differential evolution (DE) to optimisation problems in dynamic environments. A central concern in dynamic optimisation problems (DOPs) is the ability to track a changing optimum over time. In this study, an improved niching-based scheme for DOPs, named spatially neighbourhood best search DE (SnDE), is proposed. SnDE adopts DE with the DE/best/1/bin scheme, but the best individual is searched for within a predefined neighbourhood around each considered individual, thus keeping a balance between exploitation and exploration. A comparative study with several algorithms with different characteristics on a common platform, using the moving peaks benchmark (MPB) and various problem settings, is presented in this paper. The results indicate that the proposed algorithm can effectively track the changing optimum in each scenario on the selected benchmark function.
    Keywords: differential evolution; dynamic optimisation problem; neighbourhood search; niching.
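    A minimal sketch of the neighbourhood-best idea, assuming a DE/best/1/bin step in which the base vector is the fittest of each target's k spatially nearest neighbours; the parameter values `F`, `CR` and `k` are illustrative, not the paper's settings.

```python
import numpy as np

def snde_step(pop, fitness, f, F=0.5, CR=0.9, k=5, rng=None):
    """One generation of DE/best/1/bin where 'best' is the best individual
    among the k spatially nearest neighbours of each target vector."""
    rng = rng or np.random.default_rng()
    n, d = pop.shape
    new_pop, new_fit = pop.copy(), fitness.copy()
    for i in range(n):
        # k nearest neighbours of individual i in decision space
        dist = np.linalg.norm(pop - pop[i], axis=1)
        neigh = np.argsort(dist)[:k]
        nbest = neigh[np.argmin(fitness[neigh])]
        # DE/best/1 mutation with the neighbourhood best as base vector
        r1, r2 = rng.choice([j for j in range(n) if j != i], 2, replace=False)
        mutant = pop[nbest] + F * (pop[r1] - pop[r2])
        # binomial crossover with at least one mutant component
        cross = rng.random(d) < CR
        cross[rng.integers(d)] = True
        trial = np.where(cross, mutant, pop[i])
        # greedy selection
        ft = f(trial)
        if ft <= fitness[i]:
            new_pop[i], new_fit[i] = trial, ft
    return new_pop, new_fit
```

    On a static sphere function this behaves like ordinary DE; the neighbourhood restriction matters once several peaks must be tracked simultaneously.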

  • Optimal anti-interception orbit design based on genetic algorithm   Order a copy of this article
    by Yifang Liu 
    Abstract: The space defence three-player problem with impulsive thrust is studied in this work. The interceptor spacecraft and anti-interceptor spacecraft each have only one chance to manoeuvre, while the target spacecraft simply keeps running in the target orbit without the ability to manoeuvre. Based on the Lambert theorem, the space defence three-player problem is modelled and divided into two layers. The internal layer is an interception problem in which the interceptor spacecraft tries to intercept the target spacecraft; the external layer is an anti-interception problem in which the anti-interceptor spacecraft tries to defend against the interceptor spacecraft. Because it can find a global solution and does not need the gradient information required by traditional optimisation methods, a genetic algorithm is employed to solve the resulting parameter optimisation problem in the interception/anti-interception problem. A numerical simulation is provided to verify the validity of the obtained solution, and the results show that this work is useful for some practical applications.
    Keywords: space three-player problem; anti-interception orbit design; impulsive thrust; parameter optimisation problem; genetic algorithm.
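    The abstract's two-layer Lambert model is beyond a short sketch, but the underlying parameter search can be illustrated with a generic real-coded genetic algorithm (tournament selection, blend crossover, Gaussian mutation, elitist replacement). All operator choices below are assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np

def ga_minimise(f, bounds, pop_size=40, gens=100, pc=0.9, pm=0.1, rng=None):
    """Generic real-coded genetic algorithm minimising f over box bounds."""
    rng = rng or np.random.default_rng()
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    d = lo.size
    pop = rng.uniform(lo, hi, (pop_size, d))
    fit = np.apply_along_axis(f, 1, pop)
    for _ in range(gens):
        children = []
        while len(children) < pop_size:
            # binary tournament selection of two parents
            i1, i2 = rng.integers(pop_size, size=2)
            p1 = pop[i1] if fit[i1] < fit[i2] else pop[i2]
            i1, i2 = rng.integers(pop_size, size=2)
            p2 = pop[i1] if fit[i1] < fit[i2] else pop[i2]
            # arithmetic (blend) crossover
            if rng.random() < pc:
                a = rng.random(d)
                c = a * p1 + (1 - a) * p2
            else:
                c = p1.copy()
            # Gaussian mutation, clipped to the search bounds
            mask = rng.random(d) < pm
            c[mask] += rng.normal(0, 0.1 * (hi - lo)[mask])
            children.append(np.clip(c, lo, hi))
        cpop = np.array(children)
        cfit = np.apply_along_axis(f, 1, cpop)
        # elitist merge: keep the best pop_size of parents + children
        allpop, allfit = np.vstack([pop, cpop]), np.concatenate([fit, cfit])
        keep = np.argsort(allfit)[:pop_size]
        pop, fit = allpop[keep], allfit[keep]
    return pop[0], fit[0]
```

    In the paper's setting, `f` would score a candidate manoeuvre (impulse time and direction) via the Lambert-based interception model rather than a test function.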

  • Detecting sparse rating spammer for accurate ranking of online recommendation   Order a copy of this article
    by Hong Wang, Xiaomei Yu, Yuanjie Zheng 
    Abstract: Ranking for online recommendation systems is challenging owing to rating sparsity and spam rating attacks. The former causes the well-known cold start problem, while the latter complicates the recommendation task by requiring the detection of unreasonable or biased ratings. In this paper, we treat spam ratings as 'corruptions' that are spatially distributed in a sparse pattern, and model them with an L1 norm and an L2,1 norm. We show that these models can characterise the properties of the original ratings by removing spam ratings, and help to resolve the cold start problem. Furthermore, we propose a group reputation-based method to re-weight the rating matrix and an iterative programming-based technique for optimising the ranking for online recommendation. Experimental results on four well-known datasets show that our methods outperform other recommendation approaches.
    Keywords: ranking; group-based reputation; sparsity; spam rating; collaborative recommendation.
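    The sparse-corruption idea can be sketched with the proximal operator of the L1 norm (soft-thresholding): residuals larger than the threshold are absorbed into a sparse spam component. The column-mean 'clean' model and the threshold `lam` below are stand-in assumptions, not the paper's full L1/L2,1 formulation with group reputation.

```python
import numpy as np

def detect_spam(R, lam=1.0, iters=20):
    """Decompose a rating matrix R into a 'clean' part plus a sparse spam
    component S, by alternating a per-item column-mean fit with L1
    soft-thresholding of the residuals (the L1 proximal operator)."""
    S = np.zeros_like(R, dtype=float)
    for _ in range(iters):
        # fit the clean part as per-item column means of R - S
        clean = np.tile(np.mean(R - S, axis=0), (R.shape[0], 1))
        # sparse update: soft-threshold the residual at lam
        resid = R - clean
        S = np.sign(resid) * np.maximum(np.abs(resid) - lam, 0.0)
    return clean, S
```

    A user whose ratings deviate far more than `lam` from the consensus ends up with non-zero rows in `S` and can be flagged as a potential spammer.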

  • Differential evolution with dynamic neighborhood learning strategy based mutation operators   Order a copy of this article
    by Guo Sun, Yiqiao Cai 
    Abstract: As the core operator of differential evolution (DE), mutation is crucial for guiding the search. However, in most DE algorithms, the parents in the mutation operator are randomly selected from the current population, which may lead to DE being slow to exploit solutions when faced with complex problems. In this study, a dynamic neighborhood learning (DNL) strategy is proposed for DE to alleviate this drawback. The new proposed DE framework is named DE with DNL-based mutation operators (DNL-DE). Unlike the original DE algorithms, DNL-DE uses DNL to dynamically construct a neighborhood for each individual during the evolutionary process and intelligently selects parents for mutation from the defined neighborhood. In this way, the neighborhood information can be effectively used to improve the performance of DE. Furthermore, two instantiations of DNL-DE with different parent selection methods are presented. To evaluate the effectiveness of the proposed algorithm, DNL-DE is applied to the original DE algorithms, as well as to several advanced DE variants. The experimental results demonstrate the high performance of DNL-DE when compared with other DE algorithms.
    Keywords: differential evolution; dynamic neighborhood; learning strategy; mutation operator; numerical optimisation.
    DOI: 10.1504/IJCSE.2016.10005940
     
  • A word-frequency-preserving steganographic method based on synonym substitution   Order a copy of this article
    by Lingyun Xiang, Xiao Yang, Jiahe Zhang, Weizheng Wang 
    Abstract: Text steganography is a widely used technique to protect communication privacy, but it still faces a variety of challenges. One of these challenges is that a synonym-substitution-based method may change the statistical characteristics of the content, which can be easily detected by steganalysis. To overcome this disadvantage, this paper proposes a synonym-substitution-based steganographic method that takes word frequency into account. The method dynamically divides the synonyms appearing in the text into groups, and substitutes some synonyms to alter the positions of the relatively low-frequency synonyms in each group to encode the secret information. By maintaining the number of relatively low-frequency synonyms across the substitutions, it preserves some characteristics of the synonyms with various frequencies between the stego and original cover texts. The experimental results illustrate that the proposed method can effectively resist detection based on relative frequency analysis of synonyms.
    Keywords: synonym substitution; steganography; word-frequency-preserving; multiple-base coding; steganalysis.
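    A bare-bones synonym-substitution embed/extract cycle, assuming a toy synonym table with made-up frequencies: each substitutable word encodes one bit through its rank inside its frequency-sorted synonym group. The paper's actual contribution, dynamic frequency-preserving grouping with multiple-base coding, is not reproduced here.

```python
# Hypothetical synonym groups with illustrative corpus frequencies.
SYN = {
    "big":   [("big", 0.7), ("large", 0.3)],
    "large": [("big", 0.7), ("large", 0.3)],
    "fast":  [("fast", 0.6), ("quick", 0.4)],
    "quick": [("fast", 0.6), ("quick", 0.4)],
}

def embed(words, bits):
    """Replace each substitutable word so that its rank inside its
    frequency-sorted synonym group encodes one secret bit."""
    out, i = [], 0
    for w in words:
        if w in SYN and i < len(bits):
            group = [s for s, _ in sorted(SYN[w], key=lambda t: -t[1])]
            out.append(group[bits[i]])   # the bit selects the rank
            i += 1
        else:
            out.append(w)
    return out

def extract(words, nbits):
    """Recover the bits from the ranks of the observed synonyms."""
    bits = []
    for w in words:
        if w in SYN and len(bits) < nbits:
            group = [s for s, _ in sorted(SYN[w], key=lambda t: -t[1])]
            bits.append(group.index(w))
    return bits
```

    Naive rank coding like this shifts synonym frequencies, which is exactly the statistical trace the paper's frequency-preserving scheme is designed to avoid.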

  • A personalised ontology ranking model based on analytic hierarchy process   Order a copy of this article
    by Jianghua Li, Chen Qiu 
    Abstract: Ontology ranking is one of the important functions of ontology search engines, which rank searched ontologies based on the ranking model applied. A good ranking method can help users efficiently acquire exactly the required ontology from a considerable number of search results. Existing approaches to ranking ontologies take only a single aspect into consideration and ignore users' personalised demands, hence producing unsatisfactory results. We believe that both the factors that influence ontology importance and users' demands need to be considered comprehensively in ontology ranking. A personalised ontology ranking model based on the hierarchical analysis approach is proposed in this paper. We build a hierarchically analytical model and apply the analytic hierarchy process to quantify ranking indexes and assign weights to them. The experimental results show that the proposed method can rank ontologies effectively and meet users' personalised demands.
    Keywords: hierarchical analysis approach; ontology ranking; personalised demands; weights assignment.
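    The analytic hierarchy process step can be sketched directly: criterion weights come from the principal eigenvector of a pairwise comparison matrix, and Saaty's consistency ratio flags inconsistent judgements. This is generic AHP with the standard random-index table, not the paper's specific index hierarchy.

```python
import numpy as np

def ahp_weights(A):
    """Derive criterion weights from a pairwise comparison matrix via its
    principal eigenvector, and report the consistency ratio (CR)."""
    A = np.asarray(A, float)
    n = A.shape[0]
    vals, vecs = np.linalg.eig(A)
    k = np.argmax(vals.real)
    w = np.abs(vecs[:, k].real)
    w /= w.sum()
    # Saaty's consistency index against standard random-index values
    lam = vals[k].real
    ci = (lam - n) / (n - 1) if n > 2 else 0.0
    ri = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}.get(n, 1.0)
    return w, ci / ri
```

    A CR above about 0.1 conventionally means the pairwise judgements should be revised before the weights are trusted.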

  • Deploying parallelised ciphertext-policy attributed-based encryption in clouds   Order a copy of this article
    by Hai Jiang 
    Abstract: In recent years, cloud storage has become an attractive solution owing to its elasticity, availability and scalability. However, security concerns have begun to hinder the wider adoption of public clouds. Traditional encryption algorithms (both symmetric and asymmetric) fail to support effective secure cloud storage owing to severe issues such as complex key management and heavy redundancy. The ciphertext-policy attribute-based encryption (CP-ABE) scheme overcomes the aforementioned issues and provides fine-grained access control as well as deduplication features, making it a possible solution for cloud storage. However, its high complexity has prevented it from being widely adopted. This paper parallelises CP-ABE and deploys it in cloud storage environments, with the issues of ensuring secure cloud storage taken into account. Major performance bottlenecks, such as key management and the encryption/decryption process, are identified and accelerated, and a new AES encryption operation mode is adopted for further performance gains. Experimental results demonstrate the effectiveness and promise of the design.
    Keywords: CP-ABE; cloud storage; parallelisation; authentication.

  • Collective intelligence value discovery based on citation of science article   Order a copy of this article
    by Yi Zhao, Zhao Li, Bitao Li, Keqing He, Junfei Guo 
    Abstract: Citation recommendation is one of the tasks involved in scientific paper writing. As the number of references increases without clear classification, the similarity measure of a recommendation system shows poor performance. In this work, we propose a novel recommendation approach that integrates classification, clustering and recommendation models into one system. In an evaluation on the ACL Anthology paper network data, we effectively use the node degrees of a complex knowledge-tree network (referring to the number of papers) to enhance recommendation accuracy. The experimental results show that our model generates better recommended citations, achieving 10% higher accuracy and 8% higher F-score than the keyword match method when the data is big enough. We make full use of collective intelligence to serve the public.
    Keywords: citation recommendation; classification; clustering; similarity; citation network.

  • Differential evolution with k-nearest-neighbour-based mutation operator   Order a copy of this article
    by Gang Liu, Cong Wu 
    Abstract: Differential evolution (DE) is one of the most powerful global numerical optimisation algorithms in the evolutionary algorithm family, popular for its simplicity and effectiveness in solving numerous real-world optimisation problems in real-valued spaces. The performance of DE depends on its mutation strategy; however, traditional mutation operators have difficulty in balancing exploration and exploitation. To address this issue, a k-nearest-neighbour-based mutation operator is proposed in this paper to improve the search ability of DE. This operator searches the areas in which the vector density distribution is sparse, enhancing the exploitation of DE and accelerating the convergence of the algorithm. To evaluate the effectiveness of the proposed mutation operator, this paper compares the proposed algorithm with other state-of-the-art evolutionary algorithms. Experimental verifications are conducted on the CEC05 competition benchmarks and two real-world problems. The results indicate that the proposed mutation operator is able to enhance the performance of DE and performs significantly better than, or at least comparably with, several state-of-the-art DE variants.
    Keywords: differential evolution; unilateral sort; k-nearest-neighbour-based mutation; global optimisation.

  • Topic-specific image indexing and presentation for MEDLINE abstract   Order a copy of this article
    by Lan Huang, Ye Wang, Leiguang Gong, Tian Bai 
    Abstract: MEDLINE is one of the largest databases of biomedical literature. The search results from MEDLINE for medical terms take the form of lists of articles with PubMed IDs. To further explore and select articles that may help to identify potentially interesting interactions between terms, users need to navigate through lists of URLs to retrieve and read the actual articles to find relevancies among these terms. Such work becomes extremely time-consuming and tedious when each query returns tens of thousands of results with an uncertain recall rate. To overcome this problem, we develop a topic-specific image indexing and presentation method for discovering interactions or relatedness of medical terms from MEDLINE, based on which a prototype tool is implemented to help discover interactions between terms for types of disease. The merits of the method are illustrated by search examples using the tool and the MEDLINE abstract dataset.
    Keywords: MEDLINE; data visualisation; customised retrieval.

  • Simultaneous multiple low-dimensional subspace dimensionality reduction and classification   Order a copy of this article
    by Lijun Dou, Rui Yan, Qiaolin Ye 
    Abstract: The Fisher linear discriminant (FLD) for supervised learning has recently emerged as a computationally powerful tool for extracting features for a variety of pattern classification problems. However, it works poorly with multimodal data. The local Fisher linear discriminant (LFLD) was proposed to reduce the dimensionality of multimodal data; through experiments on multimodal binary datasets created from several multi-class datasets, it has been shown to outperform FLD. However, LFLD has a serious limitation: it is restricted to small-scale datasets. To address these disadvantages, in this paper we develop a multiple low-dimensional subspace dimensionality reduction technique (MSDR) for performing dimensionality reduction (DR) of input data. In contrast to FLD and LFLD, which find a single optimal low-dimensional subspace, the new algorithm attempts to seek multiple optimal low-dimensional subspaces that best make the data sharing the same labels more compact. Inheriting the advantages of NC, MSDR reduces the dimensionality of data and directly performs classification without the need to train a model. Experiments comparing MSDR with existing traditional approaches on UCI datasets show its effectiveness and efficiency.
    Keywords: Fisher linear discriminant; local FLD; dimensionality reduction; multiple low-dimensional subspaces.

  • Using Gaussian mixture model to fix errors in SFS approach based on propagation   Order a copy of this article
    by Huang WenMin 
    Abstract: A new Gaussian mixture model is used in this paper to improve the quality of the propagation method for shape from shading (SFS). The improved algorithm can overcome most difficulties of the method, including slow convergence, interdependence of propagation nodes and error accumulation. To address slow convergence and the interdependence of propagation nodes, a stable propagation source and integration path are used to make the reconstruction of each pixel in the image independent. A Gaussian mixture model based on prior conditions is proposed to fix the error of integration. Good results have been achieved in experiments on Lambertian composite images under frontal illumination.
    Keywords: shape from shading; propagation method; silhouette; Gaussian mixture model; surface reconstruction.

  • Sign fusion of multiple QPNs based on qualitative mutual information   Order a copy of this article
    by Yali Lv, Jiye Liang, Yuhua Qian 
    Abstract: In the era of big data, the fusion of uncertain information from different data sources is a crucial issue in various applications. In this paper, a sign fusion method for multiple Qualitative Probabilistic Networks (QPNs) with the same structure but built from different data sources is proposed. Specifically, the definition of parallel paths in multiple QPNs is first given and the problem of fusion ambiguity is described. Secondly, the fusion operator theorem is introduced in detail, including its proof and algebraic properties. An efficient sign fusion algorithm is then proposed. Finally, experimental results demonstrate that our fusion algorithm is feasible and efficient.
    Keywords: qualitative probabilistic reasoning; QPNs; Bayesian networks; sign fusion; qualitative mutual information.
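    For context, the classical QPN sign algebra that any fusion operator builds on can be written down directly. This is Wellman's standard sign multiplication (chaining influences along a trail) and sign addition (combining parallel trails), not the paper's qualitative-mutual-information operator, which is designed to resolve the '?' ambiguities shown here.

```python
def sign_mul(a, b):
    """Chain two qualitative influences along a trail."""
    if a == '0' or b == '0':
        return '0'
    if a == '?' or b == '?':
        return '?'
    return '+' if a == b else '-'

def sign_add(a, b):
    """Combine two parallel trails; opposite signs are ambiguous."""
    if a == '0':
        return b
    if b == '0':
        return a
    if a == b:
        return a
    return '?'   # '+' combined with '-' (or any '?') cannot be resolved
```

    The ambiguity case `sign_add('+', '-') == '?'` is exactly the parallel-path fusion problem the abstract describes.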

  • Estimation of distribution algorithms based on increment clustering for multiple optima in dynamic environments   Order a copy of this article
    by Bolin Yu 
    Abstract: Aiming to locate and track multiple optima in dynamic multimodal environments, an estimation of distribution algorithm based on incremental clustering is proposed. The main idea of the proposed algorithm is to construct several probability models based on incremental clustering, which improves the ability to locate multiple local optima and helps to find the global optimal solution quickly for dynamic multimodal problems. Meanwhile, a diffusion search policy is introduced to enhance the diversity of the population in a guided fashion when the environment changes. The policy uses both the current population information and part of the historical information about the optimal solutions available. Experimental studies on the moving peaks benchmark are carried out to evaluate the performance of the proposed algorithm in comparison with several state-of-the-art algorithms from the literature. The results show that the proposed algorithm is effective for functions with moving optima and can adapt to dynamic environments rapidly.
    Keywords: EDAs; dynamic multimodal problems; diffusion policy; incremental clustering.
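    A toy sketch of the clustering-plus-probability-model loop: cluster the elite individuals, fit an independent Gaussian per cluster, and sample the next population from the mixture. The k-means-style clustering and all parameters here are illustrative assumptions; the paper's incremental clustering and diffusion policy are not reproduced.

```python
import numpy as np

def cluster_eda_step(pop, f, n_clusters=3, elite_frac=0.5, rng=None):
    """One generation of a clustering-based EDA sketch."""
    rng = rng or np.random.default_rng()
    n, d = pop.shape
    fit = np.apply_along_axis(f, 1, pop)
    elite = pop[np.argsort(fit)[:max(2, int(elite_frac * n))]]
    # crude k-means-style assignment, seeded from random elite points
    centres = elite[rng.choice(len(elite), n_clusters, replace=False)]
    for _ in range(5):
        lab = np.argmin(((elite[:, None] - centres[None]) ** 2).sum(-1), axis=1)
        for c in range(n_clusters):
            if np.any(lab == c):
                centres[c] = elite[lab == c].mean(axis=0)
    # fit an independent Gaussian per cluster and sample the new population
    new = []
    for c in range(n_clusters):
        pts = elite[lab == c]
        if len(pts) == 0:
            pts = elite
        mu, sd = pts.mean(axis=0), pts.std(axis=0) + 1e-3
        new.append(rng.normal(mu, sd, (n // n_clusters + 1, d)))
    return np.vstack(new)[:n]
```

    Keeping one Gaussian per cluster is what lets the algorithm hold several peaks at once instead of collapsing onto a single optimum.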

  • A blind image watermarking algorithm based on amalgamation domain method   Order a copy of this article
    by Qingtang Su 
    Abstract: Combining the spatial domain and the frequency domain, a novel blind digital image watermarking algorithm is proposed in this paper to address the copyright protection problem. To embed the watermark, the generation principle and distribution features of the direct current (DC) coefficient are used to directly modify pixel values in the spatial domain; then four different sub-watermarks are embedded into different areas of the host image. When extracting the watermark, the sub-watermarks are extracted in a blind manner according to the DC coefficients of the watermarked image and the key-based quantisation step, and then a statistical rule of 'first to select, second to combine' is proposed to form the final watermark. Hence, the proposed algorithm not only has the simplicity and speed of the spatial domain but also the high robustness of the DCT domain. Many experimental results have proved that the proposed watermarking algorithm has good watermark invisibility and strong robustness against many attacks, e.g., JPEG compression, cropping and noise addition. Comparison results also show the superiority of the proposed algorithm.
    Keywords: information security; digital watermarking; combine domain; direct current.
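    Since the DC coefficient of a block is proportional to the block mean, quantisation-based DC embedding can be performed entirely in the spatial domain, as the abstract describes. A minimal sketch; the quantisation step `q` is an illustrative parameter, and the paper's sub-watermark repetition and 'first to select, second to combine' voting are omitted.

```python
import numpy as np

def embed_bit(block, bit, q=16.0):
    """Embed one bit in a block by quantisation-index modulation of its
    mean: move the mean to the centre of a quantisation cell whose index
    parity equals the bit.  Shifting every pixel equally changes only the
    DC coefficient of the block's DCT."""
    m = block.mean()
    idx = int(np.floor(m / q))
    if idx % 2 != bit:
        idx += 1
    target = (idx + 0.5) * q      # centre of the chosen quantisation cell
    return block + (target - m)

def extract_bit(block, q=16.0):
    """Blind extraction: read the parity of the quantised block mean."""
    return int(np.floor(block.mean() / q)) % 2
```

    Placing the mean at the cell centre gives a q/2 noise margin, which is the source of the robustness the abstract claims for DC-based embedding.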

  • A data cleaning method for heterogeneous attribute fusion and record linkage   Order a copy of this article
    by Huijuan Zhu, Tonghai Jiang, Yi Wang, Li Cheng, Bo Ma, Fan Zhao 
    Abstract: In the big data era, when massive heterogeneous data are generated from various data sources, cleaning dirty data is critical for reliable data analysis. Existing rule-based methods are generally developed in single-data-source environments, so issues such as data standardisation and duplicate detection for attributes of different data types are not fully studied. To address these challenges, we first introduce a method based on dynamically configurable rules that integrates data detection, modification and transformation. Secondly, we propose a type-based blocking and varying-window-size selection mechanism based on the classic sorted-neighborhood algorithm. We present a reference implementation of our method in a real-life data fusion system and validate its effectiveness and efficiency using recall and precision metrics. Experimental results indicate that our method is suitable for scenarios with multiple data sources and heterogeneous attribute properties.
    Keywords: big data; varying window; data cleaning; record linkage; record similarity; SNM; type-based blocking.
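    The classic sorted-neighborhood method (SNM) the abstract builds on can be sketched in a few lines: sort records by a blocking key, then compare only records that fall within a sliding window. The fixed `window` here is a simplification of the paper's varying-window mechanism, and the key and similarity functions are illustrative.

```python
def sorted_neighbourhood(records, key, window=3, similar=None):
    """Sorted-neighborhood duplicate detection: sort records by a
    blocking key and compare only records inside a sliding window,
    avoiding the quadratic all-pairs comparison."""
    recs = sorted(records, key=key)
    pairs = set()
    for i in range(len(recs)):
        for j in range(i + 1, min(i + window, len(recs))):
            if similar is None or similar(recs[i], recs[j]):
                pairs.add((recs[i], recs[j]))
    return pairs
```

    A varying-window variant would grow or shrink `window` per block, e.g. based on how many records share the same key prefix.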

  • Chinese question speech recognition integrated with domain characteristics   Order a copy of this article
    by Shengxiang Gao, Dewei Kong, Zhengtao Yu, Jianyi Guo, Yantuan Xian 
    Abstract: Aiming at domain adaptation in speech recognition, we propose a speech recognition method for Chinese question sentences based on domain characteristics. Firstly, by virtue of the syllable association characteristics implied in domain terms, syllable feature sequences of domain terms are used to construct the domain acoustic model. Secondly, in the decoding process of domain-specific Chinese question speech recognition, we use domain knowledge relationships to optimise and prune the speech decoding network generated by the language model, improving continuous speech recognition. Experiments on a tourism-domain corpus show that the proposed method achieves an accuracy of 80.50% on Chinese question speech recognition and 91.50% on domain term recognition.
    Keywords: Chinese question speech recognition; speech recognition; domain characteristic; acoustic model library; domain terms; language model; domain knowledge library.

  • Original image tracing with image relational graph for near-duplicate image elimination   Order a copy of this article
    by Fang Huang, Zhili Zhou, Ching-Nung Yang, Xiya Liu 
    Abstract: This paper proposes a novel method for near-duplicate image elimination, by tracing the original image of each near-duplicate image cluster. For this purpose, image clustering based on the combination of global and local features is first performed in a coarse-to-fine way. To accurately eliminate redundant images in each cluster, an image relational graph is constructed to reflect the contextual relationships between images, and the PageRank algorithm is adopted to analyse these relationships. The original image, which receives the highest rank, is then correctly traced, while the other redundant near-duplicate images in the cluster are eliminated. Experiments show that our method achieves better performance in both image clustering and redundancy elimination, compared with state-of-the-art methods.
    Keywords: near-duplicate image clustering; near-duplicate image elimination; image retrieval; image search; near-duplicate image retrieval; partial-duplicate image retrieval; image copy detection; local feature; contextual relationship.
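    The PageRank step is standard power iteration over the image relational graph; the highest-ranked node is taken as the traced original. A sketch, assuming the convention `adj[i, j] = 1` means image j points to image i (e.g. j was derived from i):

```python
import numpy as np

def pagerank(adj, d=0.85, iters=100):
    """Power-iteration PageRank on an adjacency matrix."""
    n = adj.shape[0]
    # column-stochastic transition matrix (dangling nodes -> uniform)
    col = adj.sum(axis=0, keepdims=True)
    M = np.where(col > 0, adj / np.maximum(col, 1e-12), 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - d) / n + d * M @ r
    return r

def trace_original(adj):
    """The highest-ranked node of the relational graph is kept as the
    original; the rest of its cluster can be eliminated."""
    return int(np.argmax(pagerank(adj)))
```

    In a cluster where every edited copy points back to its source, the rank mass accumulates at the source image.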

  • IFOA: an improved forest algorithm for continuous nonlinear optimisation   Order a copy of this article
    by Borong Ma, Zhixin Ma, Dagan Nie, Xianbo Li 
    Abstract: The Forest Optimisation Algorithm (FOA) is a new evolutionary optimisation algorithm which is inspired by seed dispersal procedure in forests, and is suitable for continuous nonlinear optimisation problems. In this paper, an Improved Forest Optimisation Algorithm (IFOA) is introduced to improve convergence speed and the accuracy of the FOA, and four improvement strategies, including the greedy strategy, waveform step, preferential treatment of best tree and new-type global seeding, are proposed to solve continuous nonlinear optimisation problems better. The capability of IFOA has been investigated through the performance of several experiments on well-known test problems, and the results prove that IFOA is able to perform global optimisation effectively with high accuracy and convergence speed.
    Keywords: forest optimisation algorithm; evolutionary algorithm; continuous nonlinear optimisation; scientific decision-making.

  • A location-aware matrix factorisation approach for collaborative web service QoS prediction   Order a copy of this article
    by Zhen Chen, Limin Shen, Dianlong You, Chuan Ma, Feng Li 
    Abstract: Predicting unknown QoS values is often required because most users have invoked only a small fraction of web services. Previous prediction methods benefit from mining neighborhood interest from explicit user QoS ratings; however, the implicitly existing but significant location information, which could potentially tackle the data sparsity problem, is overlooked. In this paper, we propose a unified matrix factorisation model that fully capitalises on the advantages of both the location-aware neighborhood and latent factor approaches. We first develop a multiview-based neighborhood selection method that clusters neighbours from the views of both geographical distance and rating similarity. Then a personalised prediction model is built up by transferring the wisdom of neighborhoods. Experimental results demonstrate that our method achieves higher prediction accuracy than other competitive approaches and better alleviates the data sparsity issue.
    Keywords: service computing; web service; QoS prediction; matrix factorisation; location awareness.
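    The latent-factor backbone can be sketched as plain matrix factorisation trained by SGD on the observed entries; the location-aware neighborhood re-weighting that is the paper's contribution is omitted, and all hyperparameters are illustrative.

```python
import numpy as np

def mf_sgd(R, mask, k=4, lr=0.02, reg=0.01, epochs=1000, rng=None):
    """Factorise R ~ U @ V.T using SGD over observed entries only.
    R is the (sparse) user-service QoS matrix; mask marks which
    entries were actually observed."""
    rng = rng or np.random.default_rng(0)
    n, m = R.shape
    U = 0.1 * rng.standard_normal((n, k))
    V = 0.1 * rng.standard_normal((m, k))
    obs = np.argwhere(mask)
    for _ in range(epochs):
        rng.shuffle(obs)
        for i, j in obs:
            e = R[i, j] - U[i] @ V[j]          # prediction error
            U[i] += lr * (e * V[j] - reg * U[i])
            V[j] += lr * (e * U[i] - reg * V[j])
    return U, V
```

    Unobserved QoS values are then predicted as `U[i] @ V[j]`; the paper additionally biases this fit towards each user's location-aware neighbours.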

  • Pairing-free certificateless signature with revocation   Order a copy of this article
    by Sun Yinxia, Shen Limin 
    Abstract: How to revoke a user is an important problem in public key cryptosystems. Free of costly certificate management and key escrow, certificateless public key cryptography (CLPKC) is advantageous over the traditional public key system and the identity-based public key system. However, there are few solutions to the revocation problem in CLPKC. In this paper, we present an efficient revocable certificateless signature scheme, which can revoke a user with high efficiency. We also give a method to make the scheme resilient to signing-key exposure. Our scheme is provably secure based on the discrete logarithm problem.
    Keywords: revocation; certificateless signature; without pairing; discrete logarithm problem.

  • Large universe multi-authority attribute-based PHR sharing with user revocation   Order a copy of this article
    by Enting Dong, Jianfeng Wang, Zhenhua Liu, Hua Ma 
    Abstract: In the patient-centric model of health information exchange, personal health records (PHRs) are often outsourced to third parties, such as cloud service providers (CSPs). Attribute-based encryption (ABE) can be used to realise flexible access control on PHRs in the cloud environment. Nevertheless, the issues of scalability in key management, user revocation and flexible attributes remain to be addressed. In this paper, we propose a large-universe multi-authority ciphertext-policy ABE system with user revocation. The proposed scheme achieves scalable and fine-grained access control on PHRs. In our scheme, there are a central authority (CA) and multiple attribute authorities (AAs). When a user is revoked, the system public key and the other users' secret keys need not be updated. Furthermore, because our scheme supports a large attribute universe, the number of attributes is not polynomially bounded and the public parameter size does not linearly grow with the number of attributes. Our system is constructed on prime order groups and proven selectively secure in the standard model.
    Keywords: attribute-based encryption; large universe; multi-authority; personal health record; user revocation.

  • A multi-objective optimisation multicast routing algorithm with diversity rate in cognitive wireless mesh networks   Order a copy of this article
    by Zhufang Kuang 
    Abstract: Cognitive Wireless Mesh Networks (CWMNs) were developed to improve the usage ratio of the licensed spectrum. Since the spectrum opportunities for users vary over time and location, enhancing spectrum effectiveness is a goal and also a challenge for CWMNs. Multimedia applications have recently generated much interest in CWMNs supporting Quality-of-Service (QoS) communications, in which multicast routing and spectrum allocation are an important challenge. In this paper, we design an effective multicast routing algorithm based on diversity rate, with respect to load balancing and the number of transmissions, for CWMNs. A Load-Balancing wireless-link weight computing function and algorithm based on Diversity Rate (LBDR) are proposed, along with a load-balancing Channel and Rate Allocating algorithm based on Diversity Rate (CRADR). On this basis, a Load-balancing joint Multicast Routing, channel and Rate allocation algorithm based on Diversity rate with QoS constraints for CWMNs (LMR2D) is proposed. Its objectives are to balance the load of nodes and channels and to minimise the number of transmissions of the multicast tree. Firstly, LMR2D computes the weight of wireless links using LBDR and constructs the load-balancing multicast tree step by step with the Dijkstra algorithm. Secondly, LMR2D uses CRADR to allocate channels and rates to the links of the tree, based on the Wireless Broadcast Advantage (WBA). Simulation results show that LMR2D achieves the expected goals: it not only balances the load of nodes and channels, but also needs fewer transmissions for the multicast tree.
    Keywords: cognitive wireless mesh networks; multicast routing; spectrum allocation; load balanced; diversity rate.

  • Online multi-label learning with cost-sensitive budgeted SVM   Order a copy of this article
    by Jing Liu, Zhongwen Guo, Ling Jian, Like Qiu, Xupeng Wang 
    Abstract: Multi-label learning deals with data associated with multiple labels simultaneously. It has been extensively studied in diverse areas such as information retrieval, bioinformatics and image annotation. The explosive growth of multi-label data has brought the challenges of how to efficiently learn from labelled data and automatically label unlabelled data. In this paper, we propose an online learning algorithm that processes data arriving in a streaming fashion; it is space-saving and scalable to large-scale problems. Specifically, to tackle the class imbalance problem, we exploit the label prior to construct a cost-sensitive loss function for each sub-classification problem. Experimental studies corroborate the performance of our approach on datasets drawn from diverse domains, and demonstrate that the proposed algorithm is an ideal candidate for processing streaming data and handling online multi-label learning tasks.
    Keywords: online learning; budgeted SVM; multi-label learning; cost-sensitive; stochastic gradient descent.
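    The cost-sensitive idea can be sketched as an online linear SVM trained by SGD on a weighted hinge loss, where errors on the rare positive class cost more than errors on negatives. The class costs, learning rate and regulariser below are illustrative, and the paper's budget maintenance for the support set is omitted.

```python
import numpy as np

def cost_sensitive_sgd(X, y, c_pos=3.0, c_neg=1.0, lam=0.01, lr=0.05,
                       epochs=50, rng=None):
    """Online linear SVM via SGD on a cost-sensitive hinge loss:
    minimise lam/2 * ||w||^2 + mean(c_y * max(0, 1 - y (w.x + b)))."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):       # one pass in random order
            cost = c_pos if y[i] > 0 else c_neg
            if y[i] * (X[i] @ w + b) < 1:  # hinge subgradient step
                w += lr * (cost * y[i] * X[i] - lam * w)
                b += lr * cost * y[i]
            else:                          # only regularisation shrinkage
                w -= lr * lam * w
    return w, b
```

    Setting `c_pos` from the label prior (rarer label, larger cost) pushes the separating hyperplane away from the minority class, which is the imbalance handling the abstract refers to.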

  • Nominal data similarity: a hierarchical measure   Order a copy of this article
    by Hao Yu, Gen Zhang 
    Abstract: The similarity of nominal data plays a fundamental role in numerous fields of machine learning and data mining. Unlike numerical data, the similarity of nominal data is much more difficult to describe, and few efforts have been devoted to it. Although existing nominal similarity measures can reveal part of the data properties, they suffer low accuracy owing to ignoring value relationships or integrating multi-view relationships inappropriately. In this paper, we propose a novel hierarchical measure for nominal data similarity (HNS). The HNS leverages the intrinsic data characteristics by considering low-level information both within and between attributes, and hierarchically captures value distributions, attribute interactions and attribute-to-object contributions. Meanwhile, it aggregates multi-view relationships through a bottom-up framework, retaining consistency as well as complementary details. We analyse this measure theoretically, and experiments on six UCI datasets demonstrate that the HNS outperforms state-of-the-art nominal similarity measures in terms of target alignment and clustering accuracy.
    Keywords: similarity measure; nominal data; metric learning.

  • Context discriminative dictionary construction for topic representation   Order a copy of this article
    by Shufang Wu 
    Abstract: The construction of a discriminative topic dictionary is important for describing a topic and increasing the accuracy of topic detection and tracking. In the proposed method, we rank words by mutual information, and the top few words with the maximum mutual information are selected to construct the discriminative topic dictionaries. Since context words can express a topic more accurately, during word selection we consider both the differences between topics and the context words that appear in the stories. Because a news topic evolves over time, it is not reasonable to keep the topic dictionary unchanged, so a dictionary updating method is also proposed. Experiments were carried out on the TDT4 corpus, adopting miss probability and false alarm probability as evaluation criteria to compare the performance of incremental TF-IDF and the proposed method. Extensive experiments show that our method provides better results.
    Keywords: discriminative dictionary; context word; topic representation; word selection.

  • Demystifying echo state network with deterministic simple topologies   Order a copy of this article
    by Duaa Elsarraj, Maha Al Qisi, Ali Rodan, Nadim Obeid, Ahmad Sharieh, Hossam Faris 
    Abstract: Echo State Networks (ESN) are a special type of Recurrent Neural Network (RNN) with distinctive performance in the field of reservoir computing. The state space of the ESN is initially randomised and the reservoir weights are fixed, with training applied only to the state readout. Despite the advantages of ESN, some opacity remains in the dynamic properties of the reservoir owing to this randomisation. Our aims in this paper are to demystify the ESN model with a completely deterministic structure, using several proposed reservoir structures (topologies), and to compare their performance with the random ESN on different benchmark datasets. All proposed topologies preserve the computational simplicity of the random ESN, and most showed comparable or even better performance.
    Keywords: echo state network; reservoir computing; reservoir structure topology; memory capacity; echo state network algorithm; complexity.

  • A state space distribution approach based on system behaviour   Order a copy of this article
    by Imene Bensetira, Djamel Eddine Saidouni, Mahfud Al-la Alamin 
    Abstract: In this paper, we propose a novel approach to deal with the state space explosion problem occurring in model checking. We propose an off-line algorithm for distributed state space construction, which reviews the behaviour of the constructed system and redistributes the state space according to the accumulated information about the observed behaviour. The distribution is therefore guided by the system's behaviour, and the proposed policy maintains the space-time balance. The simulation and implementation of our system are based on a multi-agent technique, which fits the development of distributed systems very well. Experimental measures performed on a cluster of machines show very promising results for both workload balance and communication overhead.
    Keywords: model checking; combinatorial state space explosion; distributed state space construction; graph distribution; system behaviour; distributed algorithms; reachability analysis.

  • Consensus RNA secondary structure prediction using information of neighbouring columns and principal component analysis   Order a copy of this article
    by Tianhang Liu, Jianping Yin, Long Gao, Wei Chen, Minghui Qiu 
    Abstract: RNA is a family of biological macromolecules that is important to all kinds of biological processes. RNA structures are closely related to their functions; hence, determining the structure is invaluable in understanding genetic diseases and creating drugs. RNA secondary structure prediction remains an open research field. In this paper, we present a novel method that uses an RNA sequence alignment to predict a consensus RNA secondary structure. In essence, the method predicts whether any two columns of an alignment correspond to a base pair, using the information provided by the alignment: the covariation score, the fraction of complementary nucleotides, and the consensus probability matrix of the column pair and of its neighbours. Principal component analysis is then applied to overcome the problem of over-fitting. We compare our method with other consensus RNA secondary structure prediction methods, including NeCFold, ELMFold, KnetFold, PFold and RNAalifold, on 47 families from Rfam (version 11.0). Results show that our method surpasses the other methods in terms of Matthews correlation coefficient, sensitivity and selectivity.
    Keywords: RNA secondary structure prediction; comparative sequence analysis; principal component analysis; information of neighbouring columns.

  • Research on RSA and Hill hybrid encryption algorithm   Order a copy of this article
    by Hongyu Yang, Yuguang Ning, Yue Wang 
    Abstract: An RSA-Hill hybrid encryption model based on random division of the plaintext is proposed. First, the key of the Hill cipher is replaced by a Pascal matrix. Second, the session key of the model is replaced by the random numbers used for plaintext division, which are encrypted with the RSA cipher. Finally, the dummy problem of the Hill cipher is solved, and the model achieves a one-time pad. Security analysis and experimental results show that our method has better encryption efficiency and stronger anti-attack capacity.
    Keywords: hybrid encryption; plaintext division; Pascal matrix; RSA cipher; Hill cipher.
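The Pascal-matrix Hill step can be illustrated concretely. The sketch below is not the authors' implementation; it assumes standard mod-26 Hill arithmetic with an illustrative block size of 3, and uses the fact that the lower-triangular Pascal matrix has determinant 1, so it is invertible modulo 26 with the closed-form inverse (-1)^(i+j) C(i, j):

```python
import math

def pascal_matrix(n):
    # Lower-triangular Pascal matrix: P[i][j] = C(i, j).
    return [[math.comb(i, j) for j in range(n)] for i in range(n)]

def pascal_inverse(n, m=26):
    # det(P) = 1, so P is invertible mod m; its inverse is
    # (-1)^(i+j) * C(i, j), reduced modulo m.
    return [[((-1) ** (i + j) * math.comb(i, j)) % m for j in range(n)]
            for i in range(n)]

def matvec_mod(M, v, m=26):
    # One Hill-cipher block: multiply by the key matrix modulo m.
    return [sum(a * b for a, b in zip(row, v)) % m for row in M]

# Encrypt and decrypt one 3-letter block ('h', 'e', 'l' -> 7, 4, 11).
block = [7, 4, 11]
cipher = matvec_mod(pascal_matrix(3), block)
plain = matvec_mod(pascal_inverse(3), cipher)
```

The RSA encryption of the random division parameters, which is the other half of the hybrid scheme, is omitted here.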

  • An auction mechanism for cloud resource allocation with time-discounting values   Order a copy of this article
    by Yonglong Zhang 
    Abstract: Group-buying has emerged as a new trading paradigm and has become increasingly attractive. Both sides of a transaction benefit from group-buying: buyers enjoy a lower price and sellers receive more orders. In this paper, we investigate an auction mechanism for cloud resource allocation with time-discounting values via group-buying, called TDVG. TDVG consists of two steps: winning seller and buyer selection, and pricing. In the first step, we choose the winning sellers and buyers in a greedy manner according to a given criterion; in the second step, we calculate the payment for each winning seller and buyer. Rigorous proof demonstrates that TDVG satisfies truthfulness, budget balance and individual rationality. Our experimental results show that TDVG achieves better total utility, matching rate and commodity use than existing works.
    Keywords: cloud resource allocation; auction; time discounting values; group-buying.

  • Study on data sparsity in social network-based recommender system   Order a copy of this article
    by Ru Jia, Ru Li, Meng Gao 
    Abstract: With the development of information technology and the expansion of information resources, it is increasingly difficult for people to find the information they are really interested in, a problem known as information overload. Recommender systems are regarded as an important approach to dealing with information overload because they can predict users' preferences according to their records. Matrix factorisation is very successful in recommender systems, but it faces the problem of data sparsity. This paper addresses the sparsity problem by adding further kinds of information from social networks, such as friendships and tags, into the recommendation model. Through experiments on a real-life dataset, the paper also validates the impact of users' friendships, tags and neighbours of items on reducing data sparseness and improving recommendation accuracy.
    Keywords: social network-based recommender systems; matrix factorisation; data sparsity.
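As a rough illustration of the idea (not the authors' exact model), the sketch below trains a plain SGD matrix factorisation and adds a hypothetical social-regularisation step that pulls a user's latent factors toward the average of their friends' factors, which is one common way of injecting friendship information into the model:

```python
import random

def train_mf(ratings, n_users, n_items, friends=None, k=4,
             lr=0.02, reg=0.02, social=0.05, epochs=800, seed=0):
    """SGD matrix factorisation on (user, item, rating) triples, with an
    optional social term pulling each user toward their friends' mean
    factors (a generic stand-in, not the paper's exact formulation)."""
    rnd = random.Random(seed)
    U = [[rnd.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rnd.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            e = r - sum(U[u][f] * V[i][f] for f in range(k))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                U[u][f] += lr * (e * vf - reg * uf)   # gradient step
                V[i][f] += lr * (e * uf - reg * vf)
            if friends and friends.get(u):
                for f in range(k):
                    # Pull the user's factor toward the friends' mean.
                    mean_f = sum(U[v][f] for v in friends[u]) / len(friends[u])
                    U[u][f] += lr * social * (mean_f - U[u][f])
    return U, V

def predict(U, V, u, i):
    return sum(a * b for a, b in zip(U[u], V[i]))
```

For a cold user with few ratings, the social term supplies the signal that the sparse rating matrix cannot, which is the intuition behind using friendships against sparsity.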

  • A novel virtual disk bandwidth allocation framework for data-intensive applications in cloud environments   Order a copy of this article
    by Peng Xiao, Changsong Liu 
    Abstract: Recently, cloud computing has become a promising distributed processing paradigm for deploying various kinds of non-trivial applications. Most of these applications are data-intensive and therefore require the cloud system to provide massive storage space as well as desirable I/O performance. As a result, virtual disk techniques have been widely applied in many real-world platforms to meet the requirements of such applications, and how to allocate virtual disk bandwidth efficiently becomes an important issue that needs to be addressed. In this paper, we present a novel virtual disk bandwidth allocation framework in which a set of virtual bandwidth brokers make allocation decisions by playing two game models. Theoretical analysis and solutions are presented to prove the effectiveness of the proposed game models. Extensive experiments are conducted on a real-world cloud platform, and the results indicate that the proposed framework can significantly improve the use of virtual disk bandwidth compared with existing approaches.
    Keywords: cloud computing; bandwidth reservation; quality of service; queue model; gaming theory.

  • Academic research trend analysis based on big data technology   Order a copy of this article
    by Weiwei Lin, Zilong Zhang, Shaoliang Peng 
    Abstract: Big data technology can well support the analysis of academic research trends, which requires the ability to process an enormous amount of metadata efficiently. To this end, we propose an academic trend analysis method that exploits a popular topic model for paper feature extraction and an influence propagation model for field influence evaluation. We also propose a parallel association rule mining algorithm based on Spark to accelerate the trend analysis process. For the experiments, a vast amount of paper metadata was collected from four popular digital libraries: ACM, IEEE, Science Direct and Springer, serving as the raw data for our final feature dataset. Focusing on the hotspot of cloud computing, our results demonstrate that the topics most relevant to cloud computing have shifted in recent years from basic research to applied research and, from a microscopic point of view, that the development of cloud-computing-related fields presents a certain periodicity.
    Keywords: big data; association rule mining; Spark; Apriori; technology convergence.

  • The discovery in uncertain-social-relationship communities of opportunistic network   Order a copy of this article
    by Xu Gang, Wang Jia-Yi, Jin Hai-He, Mu Peng-Fei 
    Abstract: Current studies of community division in opportunistic networks usually take deterministic social relations as input. In practical application scenarios, however, communications are frequently disturbed and node movements are random, so social relations are in uncertain states and community division based on deterministic social relations is impractical. To obtain accurate communities under uncertain social relations, we propose an uncertain-social-relation model of the opportunistic network in this paper. We analyse the probability distribution of the uncertain social relations, propose a community division algorithm based on social cohesion, and then divide communities according to the uncertain social relations of the opportunistic network. The experimental results show that the proposed Clique_detection_Based_SoH community division algorithm, which is based on social cohesion, matches practical communities better than the traditional K-clique community division algorithm.
    Keywords: opportunistic network; uncertain social relations; k-clique algorithm; social cohesion; key node.

  • Tag recommendation based on topic hierarchy of folksonomy   Order a copy of this article
    by Han Xue, Bing Qin, Ting Liu, Shen Liu 
    Abstract: As a recommendation problem, tag recommendation has been receiving increasing attention from both the business and academic communities. Traditional recommendation methods are inappropriate for folksonomy because their underlying knowledge is not updated in time owing to the knowledge-acquisition bottleneck. We therefore propose a novel tag recommendation method based on the topic hierarchy of folksonomy. The method applies a topic tag hierarchy, constructed automatically from the folksonomy, to tag recommendation using the proposed strategy; it can improve the quality of the folksonomy and can evaluate the topic tag hierarchy through tag recommendation. The precision of tag recommendation reaches 0.892. The experimental results show that the proposed method significantly outperforms state-of-the-art methods (t-test, p-value < 0.0001) and demonstrates effectiveness with respect to data sources on tag recommendation.
    Keywords: tag recommendation; topic hierarchy; folksonomy.

  • Collective entity linking via greedy search and Monte Carlo calculation   Order a copy of this article
    by Lei Chen, Chong Wu 
    Abstract: Owing to the large number of entities appearing on the web, entity linking has become popular recently. It assigns a resource entry to each entity to help users grasp the entity's meaning. Entities that frequently co-occur are related and can be considered together to find their best assignments; this approach is called collective entity linking and is often conducted on entity graphs. However, traditional collective entity linking methods either consume much time owing to the large scale of the entity graph, or obtain low accuracy owing to simplifying the graph to boost speed. To improve both accuracy and efficiency, this paper proposes a novel collective entity linking method based on greedy search and Monte Carlo calculation. Experimental results show that our linking algorithm obtains accurate results with low running time.
    Keywords: collective entity linking; relationship calculation; Monte Carlo calculation; greedy search.
    DOI: 10.1504/IJCSE.2017.10006975
     
  • Incremental processing for string similarity join   Order a copy of this article
    by Cairong Yan, Bin Zhu 
    Abstract: String similarity join is an essential operation in data quality management and a key step in extracting the value of data. In the era of big data, existing methods cannot meet the demands of incremental processing. Using a string partition technique, an incremental processing framework for string similarity join is proposed in this paper. The framework treats the inverted index of strings as a state that is updated after each string similarity match. Compared with the batch processing model, this framework avoids the heavy time and space costs caused by duplicate similarity computation among historical strings, and is suitable for processing data streams. We implement two algorithms, Inc-join and Inp-join: Inc-join runs on a stand-alone machine, while Inp-join runs on a cluster in a Spark environment. The experimental results show that the incremental processing framework reduces the number of string matches without affecting the join accuracy, and improves the response time for streaming data joins compared with the batch computation model. When the data quantity becomes large, Inp-join can take full advantage of parallel processing and obtain better performance than Inc-join.
    Keywords: string similarity join; incremental processing; parallel processing; string matching.
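The paper's framework is not reproduced here, but the core idea of treating the inverted index as updatable state can be sketched with a token-based Jaccard join; the threshold and whitespace tokenisation below are illustrative choices:

```python
from collections import defaultdict

class IncrementalJoin:
    """Incremental set-similarity join sketch: the token inverted index
    is the state, updated after each arriving string is matched."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.strings = []              # all strings seen so far
        self.index = defaultdict(set)  # token -> ids of strings containing it

    def insert(self, s):
        tokens = set(s.split())
        # Probe the index: only strings sharing a token can match,
        # so historical strings are never re-compared wholesale.
        candidates = set().union(*(self.index[t] for t in tokens))
        matches = []
        for cid in candidates:
            other = set(self.strings[cid].split())
            if len(tokens & other) / len(tokens | other) >= self.threshold:
                matches.append(self.strings[cid])
        # Update the state with the new string.
        sid = len(self.strings)
        self.strings.append(s)
        for t in tokens:
            self.index[t].add(sid)
        return matches
```

Each `insert` both answers the join for the new string and advances the state, which is what makes the scheme stream-friendly.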

  • A hybrid filtering-based network document recommendation system in cloud storage   Order a copy of this article
    by Wu Yuezhong, Liu Qin, Li Changyun, Wang Guojun 
    Abstract: Since the key requirement of users is to obtain personalised services efficiently from mass network document resources, a hybrid filtering-based network document recommendation system is designed by combining content-based recommendation and collaborative filtering, building on the powerful and extensible storage and computing capacity of cloud storage. The proposed system realises the main service modules on the Hadoop and Mahout platforms, and processes documents containing user interest information by applying an AHP-based attribute weighted fusion method. Based on network interaction, the proposed system not only offers extensible storage space and high recommendation precision, but also plays an essential role in realising network resource sharing and personalised recommendation.
    Keywords: user interest model; collaborative filtering; recommendation system; cloud storage.

  • Multiobjective evolutionary algorithm on simplified biobjective minimum weight minimum label spanning tree problems   Order a copy of this article
    by Xinsheng Lai, Xiaoyun Xia 
    Abstract: As general-purpose optimisation methods, evolutionary algorithms have been used efficiently to solve multiobjective combinatorial optimisation problems. However, few theoretical investigations have been conducted to understand the efficiency of evolutionary algorithms on such problems, and even fewer on multiobjective combinatorial optimisation problems coming from the real world. In this paper, we analyse the performance of a simple multiobjective evolutionary algorithm on two simplified instances of the biobjective minimum weight minimum label spanning tree problem, which arises in the real world. The problem is to find spanning trees that simultaneously minimise the total weight and the total number of distinct labels in a connected graph where each edge has a label and a weight. Though the two instances are similar, the analysis shows that the simple multiobjective evolutionary algorithm is efficient on one instance but may be inefficient on the other. Based on the analysis of the second instance, we suggest that a restart strategy may make the multiobjective evolutionary algorithm more efficient on the biobjective problem.
    Keywords: multiobjective evolutionary algorithm; biobjective; spanning tree problem; minimum weight; minimum label.

  • High dimensional Arnold inverse transformation for multiple images scrambling   Order a copy of this article
    by Weigang Zou, Wei Li, Zhaoquan Cai 
    Abstract: Traditional scrambling technology based on the low dimensional Arnold transformation (AT) cannot ensure the security of images during transmission, since the key space of the low dimensional AT is small and the scrambling period is short. The Arnold inverse transformation (AIT) is also a good image scrambling technique. Used for image scrambling, the high dimensional AIT overcomes the shortcomings of low dimensional geometric transformations, achieves a good scrambling effect and accomplishes image encryption, which enriches the theory and application of image scrambling. Since an image has both a location space and a colour space, the high dimensional AIT improves the anti-attack ability of image scrambling because the combination of location space coordinates and colour space components is very flexible. We investigated the properties and application of the five- and six-dimensional AIT in digital image scrambling, and propose the theory of the n-dimensional AIT. Our investigations show that the technique, with its larger key space, has a good scrambling effect and a certain application value.
    Keywords: information hiding; image scrambling; high dimensional transformation; Arnold transformation; Arnold inverse transformation; periodicity.
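The paper works with five-, six- and n-dimensional AIT; the classical two-dimensional case below illustrates the mechanics it builds on: the forward map, its determinant-1 inverse, and the short period that motivates moving to higher dimensions:

```python
def arnold(x, y, n):
    # Forward 2-D Arnold (cat) map on an n x n image grid:
    # key matrix [[1, 1], [1, 2]] applied modulo n.
    return (x + y) % n, (x + 2 * y) % n

def arnold_inverse(x, y, n):
    # The key matrix has determinant 1, so its integer inverse
    # [[2, -1], [-1, 1]] descrambles exactly, modulo n.
    return (2 * x - y) % n, (y - x) % n

def period(n):
    # Number of iterations after which every pixel returns home;
    # such short periods are the weakness that the high dimensional
    # transforms are designed to overcome.
    pts = [(x, y) for x in range(n) for y in range(n)]
    cur, t = pts, 0
    while True:
        cur = [arnold(x, y, n) for x, y in cur]
        t += 1
        if cur == pts:
            return t
```

For example, a 2 x 2 grid repeats after only 3 iterations and a 5 x 5 grid after 10, so an attacker could unscramble a low dimensional AT simply by iterating it.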

  • CAT: a context-aware teller for supporting tourist experiences   Order a copy of this article
    by Francesco Colace, Massimo De Santo, Saverio Lemma, Marco Lombardi, Mario Casillo 
    Abstract: This paper introduces a methodology for the dynamic creation of an adaptive generator of stories related to a tourist context. The proposed approach selects the most suitable contents for the user and builds a context-aware teller that can support users while they explore the context, making it more appealing and immersive. The tourist accesses the system through a hybrid app. The dynamic context-aware telling engine draws its contents from a knowledge base that combines curated data with data from the web, and the user profile is updated with information obtained during the visit and from social networks. A case study and some experimental results are presented and discussed.
    Keywords: context-aware; storyteller; social content; pervasive systems.

  • Saving energy consumption for mixed workloads in cloud platforms   Order a copy of this article
    by Dongbo Liu, Peng Xiao, Yongjian Li 
    Abstract: Virtualisation technology has been widely applied in cloud systems; however, it also introduces many energy-efficiency losses, especially where the I/O virtualisation mechanism is concerned. In this paper, we present an energy-efficiency enhanced virtual machine (VM) scheduling policy, namely Share-Reclaiming with Collective I/O (SRC-I/O), with the aim of reducing the energy-efficiency losses caused by I/O virtualisation. The SRC-I/O scheduler allows running VMs to reclaim extra CPU shares under certain conditions so as to increase CPU use. Meanwhile, the SRC-I/O policy separates I/O-intensive VMs from CPU-intensive ones and schedules them in a batch manner, so as to reduce the context-switching costs of scheduling mixed workloads. Extensive experiments are conducted on various platforms using different benchmarks to investigate the performance of the proposed policy. The results indicate that when the virtualisation platform faces mixed workloads, the SRC-I/O scheduler outperforms existing VM schedulers in terms of energy efficiency and I/O responsiveness.
    Keywords: cloud computing; virtual machine; energy efficiency; mixed workload; task scheduling.

  • The extraction of security situation in heterogeneous log based on Str-FSFDP density peak cluster   Order a copy of this article
    by Chundong Wang, Tong Zhao, Xiuliang Mo 
    Abstract: Log analysis has been widely developed for identifying intrusions at the host or network level. To reduce the false alarm rate in security event extraction and to discover a wide range of anomalies by scrutinising various logs, an improvement of the Str-FSFDP (fast search and find of density peaks for data streams) clustering algorithm for heterogeneous log analysis is presented. Owing to its advantages in analysing attribute relationships for mixed-attribute data, the algorithm classifies log data into two types, for which corresponding distance metrics are designed. To apply Str-FSFDP to various logs, 12 attributes are defined in a unified XML format for clustering; these attributes are chosen according to the characteristics of each type of log and their importance in expressing a security event. To match the new micro-cluster characteristic vector of the Str-FSFDP algorithm, this paper uses a time gap, designed as a threshold value based on the micro-cluster strategy, to improve the UHAD (unsupervised anomaly detection) framework. Experimental results reveal that the framework using the Str-FSFDP clustering algorithm with a time threshold improves the aggregation rate of log events and reduces the false alarm rate. As the algorithm analyses attribute correlation, the connections between different IP addresses were tested in the experiment; this helps to find the same attacker's exploitation traces even if the IP addresses are faked, and increases the degree of aggregation within the same event. Based on our analysis of each cluster, some serious attacks in the experiment were summarised along the timeline.
    Keywords: heterogeneous log; micro cluster; mixed attributes; unsupervised anomaly detection.

  • An improved KNN text classification method   Order a copy of this article
    by Fengfei Wang, Zhen Liu, Chundong Wang 
    Abstract: A text classification method based on an improved SOM combined with KNN is introduced in this paper. To overcome the shortcomings of KNN in the text vector space model, the SOM neural network is used to optimise text classification, and an improved model combining SOM with the KNN algorithm is presented. The SOM neural network weights for each dimension of the vector space model are calculated, and the network performs self-organisation and self-learning on the samples in an unsupervised manner, without prior knowledge, to evaluate and classify them. Combining the SOM neural network with the KNN algorithm effectively reduces the dimensionality of the vectors, improves clustering accuracy and speed, and can effectively improve the efficiency of text classification.
    Keywords: text classification; KNN; SOM; neural network.

  • On the evaluation of machine-learned network traffic classifiers   Order a copy of this article
    by Junhao Xu, Yu Wang 
    Abstract: Recent years have seen extensive work on using machine learning techniques to classify network traffic based on flow-level and packet-level characteristics. Previous studies reported promising results in which the machine-learned classifiers generally achieved highly accurate predictions. However, some properties of the classifiers remain unexplored, the most critical of which is the ability to identify unknown traffic. In this paper, we present an evaluation study on this issue. We show that most of the training and testing schemes in previous work are unrealistic, as they assume that all classes are known a priori and that sufficient training data is available for each class; the fact that these classifiers were incapable of dealing with unseen traffic was thus overlooked. Experimental results obtained on two real-world internet traffic datasets are presented to illustrate the whole picture of the effectiveness of machine-learned traffic classifiers.
    Keywords: machine learning; traffic analysis; data sharing; performance evaluation.

  • Privacy-preserving location-based service protocols with flexible access   Order a copy of this article
    by Shuyang Tang, Shengli Liu, Xinyi Huang, Zhiqiang Liu 
    Abstract: We propose an efficient privacy-preserving, content-protecting location-based service (LBS) scheme. Our proposal gives a refined data classification and uses generalised ElGamal to support flexible access to different data classes. We also make use of a pseudo-random function (PRF) to protect users' position queries. Since a PRF is a lightweight primitive, our proposal enables the cloud server to locate positions efficiently while preserving the privacy of the queried position.
    Keywords: location-based services; outsourced cloud; security; privacy preserving.
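The PRF component can be pictured with a small sketch. This is not the paper's protocol: it instantiates the PRF with HMAC-SHA256 (a standard choice) and uses hypothetical location-cell identifiers to show how a server can index positions without learning them:

```python
import hmac
import hashlib

def prf(key, cell):
    # PRF instantiated as HMAC-SHA256 over a cell identifier.
    return hmac.new(key, cell.encode(), hashlib.sha256).hexdigest()

def build_index(key, cell_data):
    # The data owner stores each cell's payload under its PRF tag,
    # so the cloud server never sees plaintext locations.
    return {prf(key, cell): info for cell, info in cell_data.items()}

def query(index, key, cell):
    # A querier holding the shared key recomputes the tag to look
    # up a position; without the key the tags are unpredictable.
    return index.get(prf(key, cell))
```

The efficiency claim in the abstract corresponds to the fact that each lookup costs one PRF evaluation plus a hash-table probe, rather than any public-key operation.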

  • On providing on-the-fly resizing of the elasticity grain when executing HPC applications in the cloud   Order a copy of this article
    by Rodrigo Righi, Cristiano Costa, Vinicius Facco, Luis Cunha 
    Abstract: Today, cloud infrastructures are gaining more and more ground for executing HPC (high performance computing) applications. Unlike clusters and grids, the cloud offers elasticity: the ability to enlarge or reduce the number of resources (and, consequently, processes) to match as closely as possible the needs of a particular moment of the execution. To the best of our knowledge, current initiatives explore the elasticity and HPC duet by always handling the same number of resources at each scaling-in or scaling-out operation. This fixed elasticity grain commonly produces a stair-shaped behaviour, where successive elasticity operations take place to follow the load curve. In this context, this article presents GrainElastic, an elasticity model for executing HPC applications that can adapt the elasticity grain to the requirements of each elasticity operation. Its contribution is a mathematical formalism that uses historical execution traces and the ARIMA time series model to predict the number of resources (in our case, VMs) required at a reconfiguration point. Based on the proposed model, we developed a prototype that was compared with two other scenarios: (i) a non-elastic application and (ii) an elastic middleware with a fixed grain. The results show gains of up to 30% in favour of GrainElastic, demonstrating the relevance of adapting the elasticity grain to enhance system reactivity and performance.
    Keywords: elasticity; resource management; HPC; cloud computing; elasticity grain; adaptivity.
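To make the prediction step concrete: GrainElastic uses ARIMA, which is not reproduced here, but the flavour of forecasting VM demand from a historical trace can be sketched with a first-order autoregressive fit, a simple special case of the ARIMA family, on an illustrative trace:

```python
import math

def fit_ar1(series):
    # Least-squares fit of y_t = c + phi * y_{t-1}: the AR(1)
    # special case of the ARIMA family used by GrainElastic.
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    phi = (sum((a - mx) * (b - my) for a, b in zip(x, y)) /
           sum((a - mx) ** 2 for a in x))
    return my - phi * mx, phi

def predict_vms(series):
    # Forecast the next load value and round up to whole VMs,
    # giving a per-operation grain instead of a fixed one.
    c, phi = fit_ar1(series)
    return math.ceil(c + phi * series[-1])
```

A fixed-grain scheduler would add, say, one VM per operation regardless of this forecast; adapting the grain means allocating the predicted difference in one reconfiguration.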

  • Can the hybrid colouring algorithm take advantage of multi-core architectures?   Order a copy of this article
    by João Fabrício Filho, Luis Gustavo Araujo Rodriguez, Anderson Faustino Da Silva 
    Abstract: Graph colouring is a complex computational problem that consists of colouring all vertices of a given graph using a minimum number of colours, with the restriction that adjacent vertices may not receive the same colour. Over recent decades, various algorithms have been proposed and implemented to solve this problem. An interesting one is the Hybrid Coloring Algorithm (HCA), developed in 1999 by Philippe Galinier and Jin-Kao Hao and widely regarded at the time as one of the best-performing algorithms for graph colouring. Nowadays, high-performance out-of-order multi-cores execute applications faster and more efficiently. The objective of this paper is therefore to analyse whether the HCA can take advantage of multi-core architectures in terms of performance. For this purpose, we propose and implement a parallel version of the HCA that takes advantage of all hardware resources. Several experiments were performed on a machine with two Intel(R) Xeon(R) CPU E5-2630 processors, giving a total of 24 cores. The experiments show that the parallel HCA on multi-core architectures is a significant improvement over the original, achieving enhancements of up to 40% in terms of the distance to the best chromatic number found in the literature. The expected contribution of this paper is to encourage developers to take advantage of high-performance out-of-order multi-cores to solve complex computational problems.
    Keywords: metaheuristics; hybrid colouring algorithm; graph colouring problem; architecture of modern computers.

  • Learning pattern of hurricane damage levels using semantic web resources   Order a copy of this article
    by Quang-Khai Tran, Sa-kwang Song 
    Abstract: This paper proposes an approach for hurricane damage level prediction using semantic web resources and matrix completion algorithms. Based on the statistical unit node set framework, streaming data from five hurricanes and damage levels from 48 counties in the USA were collected from the SRBench dataset and other web resources, and then transcoded into matrices. At a time t, the pattern of the possible highest damage levels 6 hours into the future was estimated using a multivariate regression procedure based on singular value decomposition. We also applied the Soft-Impute algorithm and the k-nearest-neighbours concept to improve the statistical unit node set framework in this research domain. Results showed that the model could deal with inaccurate, inconsistent and incomplete streaming data that were highly sparse, learning future damage patterns and forecasting in near real time. It was able to estimate the damage levels in several scenarios even when two-thirds of the relevant weather information was unavailable. The contributions of this work can promote the applicability of the semantic web in the context of climate change.
    Keywords: hurricane damage; statistical unit node set; matrix completion; SRBench dataset; streaming data.
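The matrix-completion step can be illustrated with a minimal Soft-Impute-style iteration: repeatedly fill the missing entries from the current low-rank estimate, take an SVD, and soft-threshold the singular values. This is a sketch under assumed parameters (`lam`, `n_iters`), not the paper's full statistical unit node set model:

```python
import numpy as np

def soft_impute(X, mask, lam=0.01, n_iters=500):
    """Fill missing entries of X (where mask is False) by iterating
    a soft-thresholded SVD, in the spirit of the Soft-Impute algorithm."""
    Z = np.where(mask, X, 0.0)                     # start with zeros in the gaps
    for _ in range(n_iters):
        filled = np.where(mask, X, Z)              # keep observed entries fixed
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        Z = (U * np.maximum(s - lam, 0.0)) @ Vt    # shrink singular values
    return Z

# Rank-1 demo: hide one entry of a rank-1 matrix; the completion fills it
# with a value consistent with the low-rank structure (X[0, 0] == 1 here).
X = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
mask = np.ones_like(X, dtype=bool)
mask[0, 0] = False
Z = soft_impute(X, mask)
```

With a small shrinkage parameter the observed entries are nearly preserved, while the hidden entries are estimated from the low-rank structure of what was observed.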

  • CUDA GPU libraries and novel sparse matrix-vector multiplication implementation and performance enhancement in unstructured finite element computations   Order a copy of this article
    by Richard Haney, Ram V. Mohan 
    Abstract: The efficient solution of systems of linear and non-linear equations arising from sparse matrix operations is a ubiquitous challenge for computing applications, one that can be exacerbated by the use of heterogeneous architectures such as CPU-GPU computing systems. Many unstructured finite element-based computations in physics-based modelling share the need for an efficient, high-performance solution of sparse systems of linear equations. This paper presents our implementation of a novel sparse matrix-vector multiplication (a significant compute load in the iterative solution via preconditioned conjugate gradient methods) employing LightSpMV with the Compressed Sparse Row (CSR) format, and the resulting performance characteristics. The results discussed in the present paper employ an unstructured finite element-based computational simulation involving multiple calls to an iterative preconditioned conjugate gradient algorithm for the solution of a linear system of equations, run on a single CPU-GPU computing system using NVIDIA Compute Unified Device Architecture libraries. The matrix-vector product implementation is examined within the context of a resin transfer molding simulation code. Results from the present work apply without loss of generality to many other unstructured finite element-based computational modelling applications in science and engineering that solve sparse linear and non-linear systems of equations on CPU-GPU architectures. The computational performance analysed indicates that LightSpMV can boost performance for these computational modelling applications. This work also investigates potential improvements to the LightSpMV algorithm using CUDA compute capability 3.5 intrinsics, which yield an additional performance boost of 1%. While this may not be significant, it supports the idea that LightSpMV can potentially be used for other full-solution finite element-based computational implementations.
    Keywords: general purpose GPU computing; sparse matrix-vector; finite element method; CUDA; performance analysis.
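For reference, this is the serial form of the CSR sparse matrix-vector product that LightSpMV parallelises on the GPU (the GPU kernel distributes rows across threads and warps; this minimal Python sketch shows only the CSR indexing scheme):

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in Compressed Sparse Row
    format: values holds the non-zeros row by row, col_idx their column
    indices, and row_ptr[r]:row_ptr[r+1] spans row r's non-zeros."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y

# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
values  = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(csr_spmv(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [3.0, 3.0, 9.0]
```

Because each output row depends only on its own slice of `values`, rows can be processed independently, which is what makes CSR SpMV a natural fit for GPU parallelisation.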

  • Line integral convolution based non-local structure tensor   Order a copy of this article
    by Yuhui Zheng, Kai Ma, Shunfeng Wang, Jing Sun, Jianwei Zhang 
    Abstract: The non-local structure tensors have received much attention recently. However, the current computation methods of non-local structure tensor fail to fully use the anisotropic characteristic of tensors, hence resulting in limited performance. To address this problem, we present a novel anisotropic non-local regularisation scheme that integrates the atomic decomposition strategy with an extended line integral convolution method using non-local means filtering technique, in order to sufficiently use the spatial direction relevancy of tensors for their anisotropic smoothing. Experimental results on the test images show that our proposed anisotropic non-local structure tensor is superior to the current representative nonlinear structure tensors in corner detection.
    Keywords: non-local structure tensor; image structure analysis; tensor field regularisation.

  • Rational e-voting based on network evolution in the cloud   Order a copy of this article
    by Tao Li, Shaojing Li 
    Abstract: Physically distributed voters can vote online through an electronic voting (e-voting) system, which can outsource the counting work to the cloud when the system is overloaded. However, this kind of outsourcing may lead to security problems concerning anonymity, privacy, fairness, etc. If servers in the cloud have no incentive to deviate from the e-voting system, these security problems can be effectively solved. In this paper, we assume that servers in the cloud are rational and try to maximise their utilities, and we look for incentives for rational servers not to deviate from the e-voting system. Here, no deviation means that rational servers prefer to cooperate in the e-voting system. Simulation results of our evolution model show that the cooperation level is high after a certain number of rounds. Finally, we put forward a rational e-voting protocol based on the above results and prove that the system is secure under proper assumptions.
    Keywords: electronic voting; utility; cloud computing; rational secret sharing.

  • Water contamination monitoring system based on big data: a case study   Order a copy of this article
    by Gaofeng Zhang, Yingnan Yan, Yunsheng Tian, Yang Liu, Yan Li, Qingguo Zhou, Rui Zhou, Kuan-Ching Li 
    Abstract: Water plays a vital role in people's lives, and individuals cannot survive without it. However, water contamination has become a serious issue with the development of industry and agriculture, and a threat to people's daily lives. Moreover, the amount of data people need to process has become excessively complex and huge in the big data era, so data management is an increasingly difficult task. There is an urgent need for a system that identifies major changes in water quality by monitoring and managing water quality variables. In this paper, we develop a data monitoring system named Monitoring and Managing Data Center (MMDC) for monitoring, downloading, sharing, and time-series analysis based on big data technology. In order to reflect a real hydrological ecosystem, water quality variable data collected from Taihu Lake in China are used to verify the effectiveness of MMDC. Results show that MMDC is effective for the monitoring and management of massive data. Although this investigation focuses on Taihu Lake, MMDC is applicable as a general monitoring system for other similar natural resources.
    Keywords: water contamination; big data; MMDC; monitoring; data analysis.

Special Issue on: ICACCI-2013 and ISI-2013 Swarm and Artificial Intelligence

  • Particle swarm optimisation with time-varying cognitive avoidance component   Order a copy of this article
    by Anupam Biswas, Bhaskar Biswas, Anoj Kumar, Krishn Mishra 
    Abstract: Interactive cooperation with the local best or global best solution encourages particles to move towards them, in the hope that a better solution may be present in the neighbouring positions around the local best or global best. This encouragement does not guarantee that the movements taken by particles will always be suitable: sometimes particles may be misled in a wrong direction, towards the worst solution. Prior knowledge of worst solutions may predict such misguidance and avoid such moves. Worst solutions cannot be known in advance; they can be known only through experience. This paper introduces a cognitive avoidance scheme into the particle swarm optimisation method. Worst solutions are incorporated into the strategic movement of particles by a mechanism very similar to the one used to incorporate best solutions. A time-varying approach is also applied to the cognitive avoidance scheme to deal with negative effects. The proposed approach is tested with 25 benchmark functions of the CEC 2005 special session on real parameter optimisation, as well as with four other very popular benchmark functions.
    Keywords: optimisation; particle swarm optimisation; differential evolution; heuristics.
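A minimal sketch of the idea, assuming a standard PSO velocity update extended with a repulsive term away from each particle's personal worst position; the linear decay schedule for the avoidance coefficient is an illustrative assumption, not the paper's exact rule:

```python
import random

def velocity_update(v, x, pbest, gbest, pworst, t, t_max,
                    w=0.7, c1=1.5, c2=1.5, c3_max=1.0):
    """One-dimensional PSO velocity update with a cognitive avoidance term.
    pworst is the worst position the particle has visited so far."""
    # Time-varying avoidance weight: strong early, decaying to 0 (assumed schedule).
    c3 = c3_max * (1.0 - t / t_max)
    r1, r2, r3 = (random.random() for _ in range(3))
    return (w * v
            + c1 * r1 * (pbest - x)     # cognitive attraction to personal best
            + c2 * r2 * (gbest - x)     # social attraction to global best
            - c3 * r3 * (pworst - x))   # cognitive avoidance of personal worst
```

The negative sign on the third term pushes the particle away from its remembered worst position, while the decaying `c3` lets avoidance dominate early exploration without disturbing late convergence.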

Special Issue on: Cloud Computing Services Brokering, SLA and Security

  • Coordinated scan detection algorithm based on the global characteristics of time sequence   Order a copy of this article
    by Yanli Lv, Yuanlong Li, Shuang Xiang, Chunhe Xia 
    Abstract: Scanning is an activity or action aimed at acquiring status information about a target host. In order to obtain this information more efficiently and more covertly, attackers in the network often use coordinated scans against the target host or network. At present, there are no effective methods to detect coordinated scans. We treat scan sequences as a time series and combine the general characteristics of time series; then, based on a feature-based time-series clustering approach, we identify the coordinated scans governed by the same controller. Simulation and experimental results show that the methods we propose are better than existing methods in accuracy and efficiency.
    Keywords: scan; scan detection; coordinated scan; general feature; clustering analysis

  • The intensional semantic conceptual graph matching algorithm based on conceptual sub-graph weight self-adjustment   Order a copy of this article
    by Zeng Hui, Xiong Liyan, Chen Jianjun 
    Abstract: Semantic computing is an important task in research on natural language processing. To address the problem of inaccurate conceptual graph matching, this paper proposes an algorithm for computing the similarity of conceptual graphs, based on conceptual sub-graph weight self-adjustment. The algorithm builds on the intensional logic model of Chinese concept connotation, using the intensional semantic conceptual graph as a knowledge representation method and combining it with the computation method of E-A-V structures. When computing the similarity of conceptual graphs, the algorithm assigns each sub-graph a weight according to the proportion of information the sub-graph contains within the whole conceptual graph. It can therefore achieve better similarity results, as the experiments described in this paper confirm.
    Keywords: Chinese semantic analysis; intensional semantic conceptual graph; E-A-V conceptual structures similarity; conceptual sub-graph weight self-adjustment

Special Issue on: High-Performance Information Technologies for Engineering Applications

  • Development and evaluation of the cloudlet technology within the Raspberry Pi   Order a copy of this article
    by Nawel Kortas, Anis Ben Arbia 
    Abstract: Nowadays, communication devices such as laptops, computers, smartphones and personal media players have greatly increased in popularity thanks to the rich set of cloud services that they allow users to access. This paper focuses on solutions to network latency for communication devices through the use of cloudlets. This work also proposes the design of a local datacentre that allows users to connect to their data from any point and through any device by means of the Raspberry Pi. We present performance results for the resource utilisation rate, the average execution time, the latency, the throughput and the lost packets, which demonstrate the main advantage of the cloudless application for local and distant connections. Furthermore, we evaluate cloudless by comparing it with similar services and by obtaining simulation results through the CloudSim simulator.
    Keywords: cloudlets; cloud computing; cloudless; Raspberry Pi; datacentre; device communication; file-sharing services.

  • Parallel data processing approaches for effective intensive care units with the internet of things   Order a copy of this article
    by N. Manikandan, S Subha 
    Abstract: Computerisation in health care is now common, and the monitoring of Intensive Care Units (ICUs) is significant and life-critical. Accurate monitoring in an ICU is essential: failing to take the right decisions at the right time may prove fatal, while a timely decision can save people's lives in various critical situations. In order to increase the accuracy and timeliness of ICU monitoring, two major technologies can be used, namely parallel processing through vectorisation of ICU data and data communication through the Internet of Things (IoT). With our approach, we can improve efficiency and accuracy in data processing. This paper proposes a parallel decision tree algorithm over ICU data to take faster and more accurate decisions on data selection. The use of parallelised algorithms optimises the process of collecting large sets of patient information. A decision tree algorithm is used for examining and extracting knowledge-based data from large databases. The finalised information is transferred to the concerned medical experts in cases of medical emergency using the IoT. The parallel decision tree algorithm is implemented with threads, and output data are stored in local IoT tables for further processing.
    Keywords: medical data processing; internet of things; ICU data; vectorisation; multicore architecture; parallel data processing.

  • Study of runtime performance for Java-multithread PSO on multiCore machines   Order a copy of this article
    by Imed Bennour, Monia Ettouil, Rim Zarrouk, Abderrazak Jemai 
    Abstract: Optimisation metaheuristics, such as Particle Swarm Optimisation (PSO), require high-performance computing (HPC), and the use of both software and hardware parallelism is mandatory to achieve it. Thread-level parallelism is a common software solution for programming on multicore systems. Important aspects of the Java language, such as its portability and architecture neutrality, its multithreading facilities and its distributed nature, make it an interesting language for parallel PSO. However, many factors may impact runtime performance: the coding style, the thread-synchronisation levels, the harmony between the software parallelism injected into the code and the available hardware parallelism, the Java networking APIs, etc. This paper analyses Java runtime performance in handling multithreaded PSO over general-purpose multicore machines and networked machines. Synchronous, asynchronous, single-swarm and multi-swarm PSO variants are considered.
    Keywords: high-performance computing; particle swarm optimisation; multicore; multithread; performance; simulation.

  • Execution of scientific workflows on IaaS cloud by PBRR algorithm   Order a copy of this article
    by S.A. Sundararaman 
    Abstract: Job scheduling of scientific workflow applications in IaaS clouds is a challenging task. An optimal mapping of jobs to virtual machines is calculated considering schedule constraints such as timeline and cost. Determining the number of virtual machines required to execute the jobs is key to finding the optimal schedule makespan with minimal cost. In this paper, the VMPROV algorithm is proposed to find the required virtual machines, and the priority-based round robin (PBRR) algorithm is proposed for finding a job-to-resource mapping with minimal makespan and cost. The execution times of four real-world scientific application jobs under PBRR are compared with those of the MINMIN, MAXMIN, MCT, and round robin algorithms. The results show that the proposed PBRR algorithm can predict the mapping of tasks to virtual machines better than the other classic algorithms.
    Keywords: cloud job scheduling; virtual machine provisioning; IaaS
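The abstract does not detail PBRR's internals, so the following is a hypothetical sketch of one plausible priority-based round-robin mapping; the priority key (longest estimated job first) and the helper name `pbrr_schedule` are illustrative assumptions, not the paper's definition:

```python
def pbrr_schedule(jobs, n_vms):
    """jobs: dict of job name -> estimated length. Returns a dict
    mapping each job to a VM index. Jobs are ordered by priority
    (longest first, an assumed key) and dealt out round-robin."""
    ordered = sorted(jobs, key=jobs.get, reverse=True)
    return {job: i % n_vms for i, job in enumerate(ordered)}

schedule = pbrr_schedule({"a": 10, "b": 40, "c": 20, "d": 30}, 2)
# b(40)->vm0, d(30)->vm1, c(20)->vm0, a(10)->vm1
```

Ordering by priority before the round-robin pass spreads the heaviest jobs across different VMs first, which tends to balance makespan better than plain round robin on arrival order.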

Special Issue on: Technologies and Applications in the Big Data Era

  • Research on implementation of digital forensics in cloud computing environment   Order a copy of this article
    by Hai-Yan Chen 
    Abstract: Cloud computing is a promising next-generation computing paradigm that integrates multiple existing and new technologies. With the maturing and wide application of cloud computing technology, more and more crimes occur in cloud computing environments, so effective investigation of evidence against these crimes is extremely important and urgently needed. Because of the characteristics of the virtual computing environment (mass storage, distribution of data, and multi-tenancy), cloud computing makes the investigation of evidence extremely hard. For this purpose, we propose a digital forensics reference model for the cloud environment. First, we divide cloud forensics into four steps and give an implementation scheme for each. Secondly, a trusted evidence collection mechanism for the cloud platform, based on a trusted evidence collection agent, is put forward. Finally, methods of using various data mining algorithms in evidence analysis are introduced. Experiments and simulation on real data show the accuracy and effectiveness of the proposed method.
    Keywords: cloud computing; digital forensics; cloud environment; digital evidence

  • Building a large-scale testing dataset for conceptual semantic annotation of text   Order a copy of this article
    by Xiao Wei, Daniel Dajun Zeng, Xiangfeng Luo, Wei Wu 
    Abstract: One major obstacle facing research on semantic annotation is the lack of large-scale testing datasets. In this paper, we develop a systematic approach to constructing such datasets. This approach is based on guided ontology auto-construction and annotation methods that use little a priori domain knowledge and little user knowledge of the documents. We demonstrate the efficacy of the proposed approach by developing a large-scale testing dataset using information available from MeSH and PubMed. The developed testing dataset consists of a large-scale ontology, a large-scale set of annotated documents, and baselines for evaluating target algorithms; it can be employed to evaluate both ontology construction algorithms and semantic annotation algorithms.
    Keywords: semantic annotation; ontology concept learning; testing dataset; evaluation baseline; ontology auto-construction; a priori knowledge; MeSH; PubMed.

Special Issue on: Advanced Information Processing in Communication

  • Hybrid genetic, variable neighbourhood search and particle swarm optimisation based job scheduling for cloud computing   Order a copy of this article
    by Rachhpal Singh 
    Abstract: In a Cloud Computing Environment (CCE), many scheduling mechanisms have been proposed to balance the load between a given set of distributed servers. The Genetic Algorithm (GA) has been verified to be the best technique to reduce the energy consumed by distributed servers, but it fails to strengthen exploration in promising areas. The performance of Particle Swarm Optimisation (PSO) depends on the initially selected random particles, i.e. wrongly selected particles may produce poor results. Variable Neighbourhood Search (VNS) can be used to balance non-local searching and local exploitation during an evolutionary processing period. Therefore, this paper proposes a hybrid of VNS, GA and PSO, called HGVP, in order to overcome the constraint of a poorly selected initial set of particles in PSO-based scheduling for CCE. Simulation results show that the proposed technique is effective compared with the available techniques, especially in terms of energy consumption.
    Keywords: cloud computing environment; job scheduling; particle swarm optimisation; genetic algorithm; variable neighbourhood search.

  • Secured image compression using AES in bandelet domain   Order a copy of this article
    by S.P. Raja, A. Suruliandi 
    Abstract: Compression and encryption are jointly used in network systems to improve efficiency and security. A secure and reliable means of communicating images and video is, consequently, indispensable for networks. In this paper, a new methodology is proposed for secure image compression. Initially, a bandelet transform is applied to the input image to obtain coefficients, and kernel matching pursuit (KMP) is used to choose key bandelet coefficients. The coefficients obtained from the KMP are encrypted using the advanced encryption standard (AES) and encoded using the listless set partitioning embedded block (listless SPECK) image compression encoding technique. For performance evaluation, the peak signal-to-noise ratio (PSNR), mean square error (MSE), structural similarity index (SSIM) and image quality index (IQI) are taken. The experimental results and performance evaluation show that the proposed approach produces high PSNR values and compresses images securely.
    Keywords: bandelet transform; KMP; AES; listless SPECK.

  • A semantic layer to improve collaborative filtering systems   Order a copy of this article
    by Sahraoui Kharroubi, Youcef Dahmani, Omar Nouali 
    Abstract: According to IBM statistics, the internet generates 2.5 quintillion bytes of heterogeneous data on a daily basis. Known as big data, this degrades the performance of search engines and reduces their ability to satisfy requests. Filtering systems such as Netflix, eBay, iTunes and others are widely used on the web to select and distribute interesting resources to users. Most of these systems recommend only one kind of resource, which limits the ambitions of their users. In this paper, we propose a hybrid recommendation system that includes a variety of resources (books, films, music, etc.). A similarity process is applied to group users and resources on the basis of appropriate metadata. We also use a graph data model, the Resource Description Framework (RDF), to represent the different modules of the system. RDF syntax allows for seamless integration and data exchange via the SPARQL query language. Real datasets are used to perform the experiments, showing promising results in terms of performance and accuracy.
    Keywords: big data; namespace; rating; relevant item; RDF vocabulary; sparsity; user's relationship.

  • QoS-aware web service selection based on self-organising migrating algorithm and fuzzy dominance.   Order a copy of this article
    by Amal Halfaoui, Fethallah Hadjila, Fedoua Didi 
    Abstract: Web service composition consists of creating a new complex web service by combining existing ones. The selection of composite services is a very complex and challenging task, especially with the increasing number of services offering the same functionality. Web service selection can be considered a combinatorial problem that focuses on delivering the optimal composition satisfying the user's requirements (functional and non-functional needs). Several optimisation algorithms have been proposed in the literature to tackle web service selection. In this work, we propose an approach that adapts a recent stochastic optimisation algorithm called the Self-Organising Migrating Algorithm (SOMA) to QoS-aware web service selection. Furthermore, we propose a fuzzification of the Pareto dominance and use it to improve SOMA by comparing services within the local search. The proposed approach is applicable to any combinatorial workflow with parallel, choice and loop patterns. We test our algorithm on a set of synthetic datasets and compare it with the most commonly used algorithm (PSO). The comparative study shows that SOMA produces promising results and is therefore able to select the user's composition in an efficient manner.
    Keywords: web service selection; SOMA; fuzzy dominance; swarm-based optimisation algorithms.

  • Fault detection and behavioural prediction of a constrained complex system using cellular automata   Order a copy of this article
    by Priya Radha, Elizabeth Sherly 
    Abstract: Functionality-based failure analysis and validation during the design process in a constrained complex system is challenging. In this paper, we advocate a model to validate the functionality of a constrained complex control system with its structural behaviour. An object-constrained model is proposed for validation of any component of a complex system with constraints, and its state of safeness is predicted using cellular automata. The model consists of two sub-systems: an inference engine that functions based on a rule-based expert system, and a failure forecast engine based on cellular automata. The system is tested against a thermal power plant for early detection of failure in the system, which enhances the process efficiency of power generation.
    Keywords: complex system; constrained objects; cellular automata; control system; prediction engine; failure forecast engine.

  • Distributed diagnosis based on distributed probability propagation nets   Order a copy of this article
    by Yasser Moussa Berghout, Hammadi Bennoui 
    Abstract: This paper addresses the problem of modelling uncertainty in a distributed context. It is situated in the field of diagnosis, more precisely model-based diagnosis of distributed systems, with a special focus on modelling uncertainty and probabilistic reasoning. This work is based on a probabilistic modelling formalism called probability propagation nets (PPNs), which was designed for centralised systems; hence, we propose an extension of this model to suit the distributed context. Distributed probability propagation nets (DPPNs), the proposed extension, are conceived to account for the particularities of distributed systems. The setting we consider is a set of interacting subsystems, each of which is modelled by a DPPN. The interaction among the subsystems is modelled through the firing of common transitions belonging to more than one subsystem. All of this is logically supported by means of probabilistic Horn abduction (PHA). Furthermore, the diagnostic process exploits transition invariants, a diagnostic technique developed for Petri nets. The proposed extension is illustrated through a real-life example.
    Keywords: model-based diagnosis; distributed systems; probabilistic reasoning; probability propagation nets; probabilistic Horn abduction; Petri nets.

  • Novel automatic seed selection approach for mass detection in mammograms   Order a copy of this article
    by Ahlem Melouah, Soumai Layachi 
    Abstract: The success of mass detection using seeded region growing segmentation depends on the seed point selection operation. The seed point is the first point from which the process of aggregation starts; this point must be inside the mass, otherwise the segmentation fails. There are two principal ways to perform seed point selection. The first is manual, performed by a medical expert who outlines the point of interest using a pointer device. The second is automatic; in this case the whole process is performed without any user interaction. This paper proposes a novel approach to automatically select the seed point for further region growing expansion. Firstly, suspicious regions are extracted by a thresholding technique. Secondly, the suspicious region whose features match the predefined mass features is identified as the region of interest. Finally, the seed point is placed inside the region of interest. The proposed method is tested using the IRMA database and the MIAS database. The experimental results show the performance and robustness of the proposed method.
    Keywords: breast cancer; mass detection; mammograms; segmentation; seeded region growing; automatic seed selection; region of interest; features; thresholding.

  • Combining topic-based model and text categorisation approach for utterance understanding in human-machine dialogue   Order a copy of this article
    by Mohamed Lichouri, Rachida Djeradi, Amar Djeradi 
    Abstract: In the present paper, we suggest an implementation of an automatic system for understanding utterances in human-machine communication. The architecture we adopt is based on a stochastic approach that assumes the understanding of an utterance is essentially a theme identification process. We therefore present a new theme identification method based on a document retrieval technique, namely text (document) classification [1]. The suggested method was validated on a basic platform that provides information related to university schooling management (querying a student database), taking textual input in French. This method achieved a theme identification rate of 95% and a correct utterance understanding rate of about 91.66%.
    Keywords: communication; human-machine dialogue; understanding; utterance; thematic; text classification; topic model.

  • A Manhattan distance based binary bat algorithm vs integer ant colony optimisation for intrusion detection in audit trails.   Order a copy of this article
    by Wassila Guendouzi, Abdelmadjid Boukra 
    Abstract: An intrusion detection system (IDS) monitors and analyses security activities occurring in a computer or network system. The detection method is the brain of an IDS and can perform either anomaly-based or misuse-based detection. The misuse mechanism aims to detect predefined attack scenarios in the audit trails, whereas the anomaly detection mechanism aims to detect deviations from normal user behaviour. In this paper, we deal with misuse detection. We propose two approaches to solve the NP-hard security audit trail analysis problem, both of which rely on the Manhattan distance measure to improve intrusion detection quality. The first, named Enhanced Binary Bat Algorithm (EBBA), is an improvement of the Bat Algorithm (BA) that uses a binary coding and a fitness function defined as the global attack risk, used in conjunction with the Manhattan distance. In this approach, new operators are adapted to the problem of interest: solution transformation, vertical permutation and horizontal permutation. The second, named Enhanced Integer Ant Colony System (EIACS), is a combination of two metaheuristics: Ant Colony System (ACS), which uses a new pheromone update method, and Simulated Annealing (SA), which uses a new neighbourhood generation mechanism. This approach uses an integer coding and a new fitness function based on the Manhattan distance measure. Experiments on different problem sizes (small, medium and large) are carried out to evaluate the effectiveness of the two approaches. The results indicate that for small and medium sizes the two algorithms have similar performance in terms of detection quality; for large problem sizes, EIACS performs significantly better than EBBA.
    Keywords: intrusion detection; security audit trail analysis; combinatorial optimisation problem; NP-hard; Manhattan distance; bat algorithm; ant colony system; simulated annealing.
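The Manhattan distance underpinning both fitness functions can be sketched as an L1 comparison between event-count vectors; the event types and signature values below are hypothetical illustrations, not data from the paper:

```python
def manhattan(u, v):
    """L1 (Manhattan) distance between two event-count vectors
    of equal length: the sum of absolute component differences."""
    return sum(abs(a - b) for a, b in zip(u, v))

observed  = [4, 0, 2, 7]   # counts of each audit-event type in the trail
signature = [3, 1, 2, 5]   # hypothetical attack-scenario profile
print(manhattan(observed, signature))  # |4-3| + |0-1| + |2-2| + |7-5| = 4
```

A small distance between the observed counts and an attack signature indicates that the audit trail closely matches that predefined scenario, which is the basis of misuse detection.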

  • An approach for managing the dynamic reconfiguration of software architectures   Order a copy of this article
    by Abdelfetah Saadi, Mourad Chabane Oussalah, Abderrazak Henni 
    Abstract: Currently, most software systems have a dynamic nature and need to evolve at runtime. The dynamic reconfiguration of software systems is a mechanism that must be supported to enable the creation and destruction of component instances and their links. To reconfigure a software system, it must typically be stopped, patched and restarted; this causes periods of unavailability, which are always a problem for highly available systems. In order to address these problems, this paper presents the software architecture reconfiguration approach (SAREA). For this approach we define a set of intelligent agents, each of which has a precise role in the functioning and control of the software. Our approach implements a mechanism for restoring a software architecture to a fully functional state after the failure of one or more reconfiguration operations; it also proposes a reconfiguration mechanism that describes the execution process of reconfigurations.
    Keywords: software architecture; dynamic reconfiguration; evolution; intelligent agents; component model; model driven architecture; MDA; meta-model.

Special Issue on: New Techniques for Secure Internet and Cloud Computation

  • Self and social network behaviours of users in cultural spaces   Order a copy of this article
    by Angelo Chianese, Salvatore Cuomo, Pasquale De Michele, Francesco Piccialli 
    Abstract: Many cultural spaces offer their visitors the use of ICT tools to enhance their visit experience. Data collected within such spaces can be analysed in order to discover hidden information related to visitors' behaviours and needs. In this paper, a computational model inspired by neuroscience, simulating the personalised interactions of users with cultural heritage objects, is presented. We compare a strengthened validation approach for neural networks based on classification techniques with a novel one based on clustering strategies. Such approaches allow us to identify natural user groups in the data and to verify the model responses in terms of user interests. Finally, the presented model has been extended to simulate social behaviours in a community through the sharing of interests and opinions related to cultural heritage assets. This data propagation has been further analysed in order to reproduce applicative scenarios on social networks.
    Keywords: social network; clustering techniques; cultural heritage; internet of things; user behaviours.

  • A perspective on applications of in-memory and associative approaches supporting cultural big data analytics   Order a copy of this article
    by Francesco Piccialli, Angelo Chianese 
    Abstract: Business intelligence, advanced analytics, big data, in-memory database and associative technologies are currently the key enablers of enhanced business decision-making. In this paper, we provide a perspective on applications of in-memory approaches supporting analytics in the field of Cultural Heritage (CH), applied to information resources including structured and unstructured contents, geo-spatial and social network data, multimedia, multiple domain vocabularies, classifiers and ontologies. The proposed approach is implemented in an information system exploiting associative in-memory technologies in a cloud context, as well as integrating semantic technologies for merging and analysing information coming from heterogeneous sources. We analyse and describe the application of this system to trace a behavioural and interest profile of users and visitors for cultural events (exhibitions, museums, etc.) and territorial contexts (tourist areas and routes including cultural resources, historical downtowns, archaeological sites). The results of the ongoing experimentation encourage a business intelligence approach that is suitable for supporting CH asset crowdsourcing, promotion, publication, management and usage.
    Keywords: in-memory database systems; big data; social analytics; business intelligence; cultural heritage; internet of things.

  • Data security and privacy information challenges in cloud computing   Order a copy of this article
    by Weiwei Kong, Yang Lei, Jing Ma 
    Abstract: Cloud computing has become a hotspot in the area of information technology. However, while enjoying its convenience and strong data-processing ability, we also find that great challenges arise in terms of data security and privacy information protection. In this paper, a summary of the current security and privacy information challenges is presented, and the current security measures are summarised as well.
    Keywords: cloud computing; data security; privacy information; cloud computing provider

  • Load balancing algorithm based on multiple linear regression analysis in multi-agent systems   Order a copy of this article
    by Xiao-hui Zeng 
    Abstract: As the number of agents involved in applications of multi-agent systems (MAS) increases, the problem of load balancing becomes more and more prominent. This paper proposes a novel load balancing algorithm based on multiple linear regression analysis (LBAMLR). By using parallel computing on all servers and partial information about agents' communication, our algorithm can effectively choose the optimal set of agents and the suitable destination servers. The simulation results show that our proposed algorithm can shorten the computing time and improve the overall performance of the MAS.
    Keywords: distributed computing; multi-agent systems; load balancing; multiple linear regression analysis.
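
As a minimal sketch of the regression step (not the paper's LBAMLR algorithm itself), the snippet below fits a multiple linear regression by solving the normal equations X'X b = X'y with Gaussian elimination, then uses the coefficients to predict a server's load from per-agent statistics. The feature names and synthetic data are illustrative assumptions.

```python
# Multiple linear regression via the normal equations (pure stdlib).

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit(X, y):
    X = [[1.0] + row for row in X]          # prepend intercept column
    cols = range(len(X[0]))
    XtX = [[sum(xi[j] * xi[k] for xi in X) for k in cols] for j in cols]
    Xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in cols]
    return solve(XtX, Xty)

# Synthetic, noise-free server data: load = 2 + 3*agents + 0.5*messages.
X = [[1, 2], [2, 4], [3, 1], [4, 8], [5, 5]]
y = [2 + 3 * a + 0.5 * m for a, m in X]
b = fit(X, y)
print([round(v, 3) for v in b])             # ≈ [2.0, 3.0, 0.5]
```

With the fitted coefficients, the predicted load of a candidate destination server is just b[0] + b[1]*agents + b[2]*messages, which is what a balancer would compare across servers.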

  • TERS: a traffic-efficient repair scheme for repairing multiple losses in erasure-coded distributed storage systems   Order a copy of this article
    by Zheng Liming 
    Abstract: Erasure coding has received considerable attention owing to its better trade-off between space efficiency and reliability. However, the high repair traffic and long repair time of erasure coding pose a new challenge: how to minimise the amount of data transferred among nodes and reduce the repair time when repairing lost data. Existing schemes are mostly designed for single node failures, and they incur high network traffic and low efficiency when multiple nodes fail. In this paper, we propose a traffic-efficient repair scheme (TERS) suitable for repairing data losses when multiple nodes fail. TERS reduces the repair traffic by exploiting the overlap of data access and computation between node repairs. To reduce the repair time, TERS uses multiple threads during the computation and pipelines the data transmission during the repair. To evaluate the repair cost and the repair time, we implement TERS in HDFS-RAID. The numerical results confirm that TERS reduces the repair traffic by 44% on average compared with traditional erasure codes and regenerating codes. Theoretical analysis shows that TERS effectively reduces the repair time. Moreover, the experimental results show that compared with current typical repair methods, such as TEC, MSR and TSR, the repair time of TERS is reduced by 25%, 20% and 16%, respectively.
    Keywords: distributed storage; erasure coding; repair traffic; repair time; multiple losses.
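
A back-of-the-envelope sketch (our illustration, not TERS itself) of why repairing multiple losses jointly saves traffic: a (k, m) Reed-Solomon code rebuilds each lost block from any k surviving blocks, so repairing r losses independently reads r*k blocks, while a joint repair can fetch the k survivor blocks once and reuse them for every lost block. The parameter values are arbitrary assumptions.

```python
# Repair-traffic comparison for r simultaneous losses in a (k, m) RS code.

def independent_repair_traffic(k, r, block):
    return r * k * block           # each loss re-reads k survivor blocks

def joint_repair_traffic(k, r, block):
    return k * block               # survivors fetched once, reused r times

k, r, block_mb = 10, 3, 64         # illustrative parameters
ind = independent_repair_traffic(k, r, block_mb)
joint = joint_repair_traffic(k, r, block_mb)
print(ind, joint, f"{1 - joint / ind:.0%} saved")   # → 1920 640 67% saved
```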

  • A sound abstract memory model for static analysis of C programs   Order a copy of this article
    by Yukun Dong 
    Abstract: An abstract memory model plays an important role in the static analysis of programs. This paper proposes a region-based symbolic three-valued logic (RSTVL) to guarantee the soundness of static analysis, which uses abstract regions to simulate blocks of the concrete memory. RSTVL applies symbolic expressions to express the values of memory objects, and the interval domain to describe the value of each symbol of a symbolic expression. Various operations on memory objects can be mapped to operations on regions. RSTVL can describe the shape information of data structures in memory and the storage state of memory objects, as well as a variety of associative addressable expressions, including point-to relations and hierarchical and valued logic relations. We have built a prototype tool, DTSC_RSTVL, that detects code-level defects in C programs. Five popular C programs are analysed; the results indicate that the analysis is sufficiently sound to detect code-level defects with a zero false negative rate.
    Keywords: software quality; static analysis; abstract memory model; memory object; defect detection.
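
To illustrate the interval domain the abstract mentions, here is a generic sketch (not the RSTVL implementation): each symbol's possible values are abstracted to an interval, arithmetic on memory objects maps to interval arithmetic, and an analyser can flag a potential out-of-bounds access without executing the program.

```python
# Minimal interval-domain sketch for static analysis.

class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __mul__(self, o):
        ps = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(ps), max(ps))
    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

# idx in [0, 9] plus offset in [2, 5] may reach index 14 in a 10-element
# buffer, so a sound analyser must report a possible overflow.
idx, off = Interval(0, 9), Interval(2, 5)
access = idx + off
print(access, "possible overflow" if access.hi >= 10 else "safe")
```

Soundness here means the abstract interval always covers every concrete value, which is why the analyser can claim a zero false negative rate at the cost of possible false positives.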

Special Issue on: Computational Imaging and Multimedia Processing

  • Underwater image segmentation based on fast level set method   Order a copy of this article
    by Yujie Li, Huiliang Xu, Yun Li, Huimin Lu, Seiichi Serikawa 
    Abstract: Image segmentation is a fundamental process in image processing that has found application in many fields, such as neural image analysis and underwater image analysis. In this paper, we propose a novel fast level set method (FLSM) for underwater image segmentation that improves on traditional level set methods by avoiding the calculation of the signed distance function (SDF). The proposed method reduces the computational cost and requires no re-initialisation. We also provide a fast semi-implicit additive operator splitting (AOS) algorithm to further reduce the computational complexity. The experiments show that the proposed FLSM performs well in selecting local or global segmentation regions.
    Keywords: underwater imaging; level set; image segmentation

  • Pseudo Zernike moments based approach for text detection and localisation from lecture videos   Order a copy of this article
    by Soundes Belkacem, Larbi Guezouli, Samir Zidat 
    Abstract: Text information embedded in videos is an important clue for the retrieval and indexing of images and videos. Scene text presents challenging characteristics mainly related to acquisition circumstances and environmental changes, resulting in low-quality videos. In this paper, we present a scene text detection algorithm based on Pseudo Zernike Moments (PZMs) and stroke features for low-resolution lecture videos. The algorithm mainly consists of three steps: slide detection, text detection and segmentation, and non-text filtering. In lecture videos, the slide region is a key object carrying almost all the important information; hence the slide region has to be extracted and segmented from other scene objects, which are considered as background for subsequent processing. Slide region detection and segmentation is done by applying PZMs to RGB frames. Text detection and extraction is performed using PZM segmentation over the V channel of the HSV colour space, and then a stroke feature is used to filter out non-text regions and remove false positives. PZMs are powerful shape descriptors; they present several strong advantages, such as robustness to noise, rotation invariance, and multilevel feature representation. The PZM-based segmentation process consists of two steps: feature extraction and clustering. First, a video frame is partitioned into equal-size windows and the coordinates of each window are normalised to a polar system; PZMs are then computed over the normalised coordinates as region descriptors. Finally, a clustering step using K-means is performed, in which each window is labelled as a text or non-text region. The algorithm is shown to be robust to illumination, low resolution and uneven luminance in compressed videos. The effectiveness of the PZM description leads to very few false positives compared with other approaches. Moreover, the resultant images can be used directly by OCR engines and no further processing is needed.
    Keywords: text localisation; text detection; pseudo Zernike moments; slide region detection.

  • Tracking multiple targets based on min-cost network flows with detection in RGB-D data   Order a copy of this article
    by Mingxin Jiang 
    Abstract: Visual multi-target tracking is a challenging problem in computer vision. This study proposes a novel approach for multi-target tracking based on min-cost network flows in RGB-D data with a tracking-by-detection scheme. Firstly, the moving objects are detected by fusing RGB information and depth information. Then, we formulate the multi-target tracking problem as a maximum a posteriori (MAP) estimation problem with specific constraints, and convert the problem into a cost-flow network. Finally, using a min-cost flow algorithm, we obtain the tracking results. Extensive experimental results show that the proposed algorithm greatly improves robustness and accuracy and significantly outperforms the state of the art.
    Keywords: combined multi-target detection; min-cost network flows; MAP; RGB-D sensor.
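
A sketch of the data-association core of such tracking-by-detection schemes: MAP assignment of detections to existing tracks minimises a total cost (for example the negative log-likelihood of each match). For small instances the min-cost network flow reduces to an assignment problem, solved here by brute force over permutations; the cost matrix below is an illustrative stand-in for likelihoods computed from fused RGB and depth cues, not the paper's actual model.

```python
# MAP data association as a minimum-cost assignment (toy instance).
from itertools import permutations

def best_assignment(cost):
    """Return the detection index assigned to each track, minimising
    the summed cost (brute force; a min-cost flow solver scales better)."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))

cost = [[1.0, 4.0, 5.0],    # cost[track][detection]
        [6.0, 2.0, 7.0],
        [5.0, 8.0, 1.5]]
print(best_assignment(cost))   # → (0, 1, 2): each track keeps its match
```

In the full formulation, extra source/sink arcs in the flow network also model track births and terminations, which plain assignment cannot express.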

Special Issue on: Big Data-oriented Science, Technologies and Applications

  • Time constraint influence maximisation algorithm in the age of big data   Order a copy of this article
    by Meng Han, Zhuojun Duan, Chunyu Ai, Forrest Wong Lybarger, Yingshu Li 
    Abstract: The new generation of social networks contains billions of nodes and edges, and managing and mining this data is a new academic and industrial challenge. Influence maximisation is the problem of finding a set of nodes in a social network that results in the highest amount of influence diffusion. Independent Cascade (IC) and Linear Threshold (LT) are two classical models of the influence diffusion process in social networks. Much previous work builds on IC and LT but focuses exclusively on algorithmic efficiency, overlooking features of the social network data itself, such as time sensitivity and practicality at large scale. Although much research on this topic has been proposed, owing in part to the hardness of the problem itself (computing the influence spread of a given seed set is #P-hard), most of the literature cannot handle real large-scale social data. Furthermore, the new era of 'big data' is changing dramatically before our eyes: the growth of big data presents researchers with many challenges as well as opportunities. As more and more data is generated from social networks in this new age, this paper proposes two new models, TIC and TLT, which incorporate the dynamism of networks by considering the time constraint during the influence spreading process in practice. To address the challenge of large-scale data, we take a first step in designing an efficient influence maximisation framework based on the proposed models, and systematic theoretical analysis shows that our algorithms achieve provable approximation guarantees. We also apply our models to the most notable big data frameworks, Hadoop and Spark, respectively. Empirical studies on different synthetic and real large-scale social networks demonstrate that our model, together with solutions on both platforms, provides better practicality, gives a regulatory mechanism for enhancing influence maximisation, and outperforms most existing alternative algorithms.
    Keywords: influence maximisation; cloud computing; data mining; data modelling.
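
For readers unfamiliar with the baseline the paper extends, here is a sketch of the classical Independent Cascade (IC) process: each newly activated node gets one chance to activate each inactive neighbour with probability p, and influence spread is estimated by Monte Carlo simulation. (The paper's TIC/TLT models additionally attach time constraints to this process; those are omitted here, and the graph and parameters are illustrative.)

```python
# Monte Carlo estimate of influence spread under the IC model.
import random

def ic_spread(graph, seeds, p=0.3, runs=2000, rng=random.Random(7)):
    total = 0
    for _ in range(runs):
        active, frontier = set(seeds), set(seeds)
        while frontier:
            # Each edge out of the frontier is tried exactly once.
            nxt = {v for u in frontier for v in graph.get(u, [])
                   if v not in active and rng.random() < p}
            active |= nxt
            frontier = nxt
        total += len(active)
    return total / runs

graph = {0: [1, 2], 1: [3], 2: [3], 3: [4]}
print(round(ic_spread(graph, [0]), 2))   # estimated spread from seed 0
```

A greedy influence-maximisation algorithm repeatedly adds the node whose inclusion raises this estimated spread the most; the #P-hardness mentioned in the abstract is precisely why the spread must be estimated by simulation rather than computed exactly.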

  • Multi-criteria decisional approach for extracting relevant association rules   Order a copy of this article
    by Addi Ait-Mlouk, Fatima Gharnati, Tarik Agouti 
    Abstract: Association rule mining plays a vital role in knowledge discovery in databases. The difficult task is mining useful and non-redundant rules; in most cases, real datasets lead to a huge number of rules, which prevents users from selecting the most relevant ones themselves. Several techniques have been proposed, such as rule clustering, the informative cover method, and quality measurements. Another way is to select relevant association rules, and we believe it is necessary to integrate a decisional approach within the knowledge discovery process. To solve this problem, we propose an approach to discover a category of relevant association rules based on multi-criteria analysis, using association rules as actions and quality measurements as criteria. Finally, we conclude our work with an empirical study to illustrate the performance of our proposed approach.
    Keywords: data mining; knowledge discovery in database; association rules; quality measurements; multi-criteria analysis; decision-making system; ELECTRE TRI
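
An illustrative sketch of the decisional idea: each mined rule is an "action" evaluated on several quality measures (the criteria), and a multi-criteria score selects the relevant rules. The simple weighted-sum filter below is a stand-in for the ELECTRE TRI assignment used in the paper, and all rules, measures and weights are invented for illustration.

```python
# Multi-criteria filtering of association rules (weighted-sum stand-in).

def score(rule, weights):
    return sum(weights[c] * rule[c] for c in weights)

rules = [
    {"name": "A=>B", "support": 0.40, "confidence": 0.90, "lift": 1.8},
    {"name": "C=>D", "support": 0.05, "confidence": 0.95, "lift": 1.1},
    {"name": "E=>F", "support": 0.30, "confidence": 0.60, "lift": 0.9},
]
weights = {"support": 0.3, "confidence": 0.4, "lift": 0.3}
relevant = [r["name"] for r in rules if score(r, weights) >= 1.0]
print(relevant)   # → ['A=>B']: high on all criteria, not just one
```

ELECTRE TRI refines this by comparing each rule against category profiles with concordance and discordance tests instead of collapsing the criteria into a single sum.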

  • Analysing user retweeting behaviour on microblogs: prediction model and influencing features   Order a copy of this article
    by Chenglong Lin, Yanyan Li, Ting-Wen Chang, Kinshuk  
    Abstract: This paper explores the feasibility of predicting users' retweeting behaviour and ranks the features influencing that behaviour. Four first-dimension features, namely author, text, recipient and relationship, are extracted and split into 39 second-dimension features. This study then applies a support vector machine (SVM) to build the prediction model. Data samples extracted from the Sina Microblog platform are subsequently used to evaluate this prediction model and rank the 39 second-dimension features. The results show that the recall rate of this model is 58.67%, the precision rate is 82.19%, and the F1 test value is 68.46%, which indicates that the performance of the prediction model is highly satisfactory. Moreover, the ranking results indicate the four features that most affect the retweeting behaviour of users: the activity level of the microblog author, the similarity of interests between the author and the recipient, the activity level of the microblog recipient, and the similarity between the theme of the microblog and the recipient's interests.
    Keywords: microblog; retweeting behaviour; prediction model; influence ranking; support vector machine; information gain
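
The keywords mention information gain, the standard way to rank such features: the gain of a feature is the entropy of the retweet label minus the conditional entropy after splitting on the feature. A minimal sketch on a tiny synthetic sample (not the Sina Microblog data):

```python
# Information-gain ranking of binary features against a binary label.
from math import log2
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(feature, labels):
    n = len(labels)
    cond = sum(len(g) / n * entropy(g)
               for v in set(feature)
               for g in [[l for f, l in zip(feature, labels) if f == v]])
    return entropy(labels) - cond

retweeted    = [1, 1, 0, 0, 1, 0]
author_act   = [1, 1, 0, 0, 1, 0]   # perfectly predictive feature
text_has_url = [1, 1, 1, 1, 0, 0]   # uninformative feature
print(round(info_gain(author_act, retweeted), 3),
      round(info_gain(text_has_url, retweeted), 3))   # → 1.0 0.0
```

Ranking the 39 second-dimension features by this gain is what surfaces the four dominant features listed in the abstract.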

  • System architecture of coastal remote sensing data mining and services based on cloud computing   Order a copy of this article
    by Xuerong Li, Xiujuan Wang, Lingling Wu 
    Abstract: Coastal remote sensing images have become big data, characterised by volume, variety, complexity and specialisation. Effective data integration, fast extraction, and mining of knowledge and information from these massive remote sensing data still lag far behind the requirements of professional coastal applications. In this paper, based on data mining, remote sensing theory, spatial information and cloud computing technology, and towards the goal of a coastal zone remote sensing data integration and data mining service system, the meta-data model, data storage model, data mining framework, web service framework model, etc., are provided and designed. Finally, a prototype system for remote sensing data mining services in a cloud computing environment is designed and developed using system integration, and is demonstrated and verified through professional applications. It is valuable for coastal zone monitoring, planning, integrated management and other fields.
    Keywords: remote sensing image; data mining; system architecture; cloud computing; coastal zone

  • Collating multisource geospatial data for vegetation detection using Bayesian network: a case study of Yellow River Delta   Order a copy of this article
    by Dingyuan Mo, Liangju Yu, Meng Gao 
    Abstract: Multisource geospatial data contains a lot of information that can be used for environmental assessment and management. In this paper, four environmental indicators that represent typical human activities in the Yellow River Delta, China, are extracted from multisource geospatial data. By analysing the causal relationships between these human-related indicators and NDVI, a Bayesian Network (BN) model is developed. Part of the raster data, pre-processed using GIS, is used for training the BN model, and the rest is used for model testing. Sensitivity analysis and performance assessment showed that the BN model was good enough to reveal the impacts of human activities on land vegetation. With the trained BN model, vegetation change under three different scenarios was also predicted. The results showed that multisource geospatial data can be successfully collated using the GIS-BN framework for vegetation detection.
    Keywords: GIS; NDVI; human activity; oil exploitation; urbanisation; road construction; Bayesian network.

  • A comparative study on disease risk model in exploratory spatial analysis   Order a copy of this article
    by Zhisheng Zhao, Yang Liu, Jing Li, Junhua Liang, Jiawei Wang 
    Abstract: The present work focuses on risk models in spatial data analysis. Through analysis of morbidity data for influenza A (H1N1) across China's administrative regions from 2009 to 2012, a comparative study was carried out among the Poisson model, the Poisson-Gamma model, the log-normal model, the EB estimator of moment and the Bayesian hierarchical model. Using the R programming language, the feasibility of the above analysis methods was verified, the variability of the estimated values generated by each model was calculated, the Bayesian model for spatial disease analysis was improved, and the estimator considering the uncorrelated spatial model, correlated spatial model and covariate factors was proved to be the best by comparing the DIC values of the models. Using a Markov chain for simulative iteration, iterative convergence was illustrated by graphs of the iteration track, autocorrelation function, kernel density and quantile estimation. Research on the spatial variability of disease morbidity is helpful in detecting epidemic areas and forewarning the pathophoresis of prospective epidemic diseases.
    Keywords: spatial disease analysis; Bayesian hierarchical model; Poisson-Gamma model; EB estimator of moment; R programming
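
To make the Poisson-Gamma model from the comparison concrete: with observed cases O_i, expected cases E_i and a Gamma(a, b) prior on relative risk, the posterior mean risk is (a + O_i) / (b + E_i), which shrinks unstable small-area estimates toward the prior mean a/b. The sketch below uses synthetic numbers, not the H1N1 data analysed in the paper.

```python
# Poisson-Gamma (empirical Bayes) smoothing of relative risks.

def poisson_gamma_risk(obs, exp, a=2.0, b=2.0):
    """Posterior mean relative risk per region under a Gamma(a, b) prior."""
    return [(a + o) / (b + e) for o, e in zip(obs, exp)]

obs = [0, 3, 50]          # observed cases per region
exp = [1.0, 2.0, 48.0]    # expected cases from population size
raw = [o / e for o, e in zip(obs, exp)]
smoothed = poisson_gamma_risk(obs, exp)
print([round(r, 2) for r in raw])       # → [0.0, 1.5, 1.04]
print([round(r, 2) for r in smoothed])  # → [0.67, 1.25, 1.04]
```

Note how the small regions (few expected cases) are pulled strongly toward the prior mean of 1, while the large region is barely changed; this is the stabilising effect the Bayesian hierarchical models in the study generalise spatially.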

  • A robust video watermarking scheme using sparse principal component analysis and wavelet transform   Order a copy of this article
    by Shankar Thirunarayanan, Yamuna Govindarajan 
    Abstract: The expansion of internet facilities makes digital data such as audio, images and videos readily available to the general public. Watermarking is a technique developed for the security of such data as well as for the copyright protection of digital contents. This paper proposes a blind scheme for digital video watermarking. Discrete wavelet domain watermarking is adopted for hiding a large amount of data with high security, good invisibility and no loss of the secret message. First, the Dual Tree Complex Wavelet Transform (DTCWT) is applied to each frame, decomposing it into a number of sub-bands. Then, the holo-entropy of each sub-band is calculated and the maximum-entropy blocks are selected. The selected blocks are transformed using sparse principal component analysis (SPCA). The maximum coefficient of the SPCA blocks of each sub-band is quantised using Quantisation Index Modulation (QIM). The watermark bit is embedded into the appropriate quantiser values. The same process is repeated in the extraction stage. The proposed video watermarking scheme is analysed through various measures, such as the Normalised Correlation (NC) and the Peak Signal to Noise Ratio (PSNR), and the embedding quality is maintained with an average PSNR value of 53 dB.
    Keywords: watermarking; quantisation; SPCA; holo entropy; dual tree complex wavelet transform
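
A minimal sketch of the Quantisation Index Modulation (QIM) step in isolation: a bit is embedded by quantising a coefficient with one of two interleaved quantisers (offset by half the step), and recovered by checking which quantiser the received value lies closer to. The DTCWT/SPCA coefficient selection from the paper is omitted, and the step size and coefficients are illustrative.

```python
# QIM embedding and blind extraction of one bit per coefficient.

def qim_embed(coeff, bit, step=8.0):
    offset = 0.0 if bit == 0 else step / 2
    return round((coeff - offset) / step) * step + offset

def qim_extract(coeff, step=8.0):
    d0 = abs(coeff - round(coeff / step) * step)
    d1 = abs(coeff - (round((coeff - step / 2) / step) * step + step / 2))
    return 0 if d0 <= d1 else 1

bits = [1, 0, 1, 1]
coeffs = [13.2, 41.7, -6.3, 102.9]
marked = [qim_embed(c, b) for c, b in zip(coeffs, bits)]
noisy = [c + 1.5 for c in marked]           # mild distortion < step/4
print([qim_extract(c) for c in noisy])      # → [1, 0, 1, 1] recovered
```

The step size trades robustness against imperceptibility: any distortion below step/4 leaves every bit recoverable, while a larger step perturbs the host coefficients (and hence PSNR) more.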

  • Optimisation for video watermarking using ABC algorithm   Order a copy of this article
    by Sundararajan Madhavan, Yamuna Govindarajan 
    Abstract: Video watermarking is a relatively innovative tool that has been proposed to solve the problem of illegal manipulation and sharing of digital video. It is the process of embedding copyright information into the video. In this paper, an Artificial Bee Colony (ABC) algorithm is used for finding the frames and locations for embedding a gray-scale image into the video, searching for the scene and location into which each particular part of the watermark is best embedded. The number of shot frames and the locations are identified using the ABC algorithm. Once the best frames and locations are identified, the embedding and extraction procedure is carried out. The performance of the proposed algorithm is compared with an existing technique using PSNR and NC. The technique is tested against different attacks and the results obtained are encouraging.
    Keywords: digital video watermarking; discrete wavelet transform; artificial bee colony; peak signal to noise ratio; normalised correlation.

  • DWT based gray-scale image watermarking using area of best fit equation and cuckoo search algorithm   Order a copy of this article
    by Sundararajan Madhavan, Yamuna Govindarajan 
    Abstract: This manuscript explains the salient features of a recently presented Nature Inspired Algorithm (NIA) for the improvement of digital image watermarking used in copyright protection. In the embedding process, the gray image is divided into four sub-bands using the discrete wavelet transform (DWT), and the desired sub-bands are selected. In the two selected sub-bands (LH, HL), a mathematical equation for the area of best fit is applied. The cuckoo search algorithm is then used to identify the optimal positions in the DWT domain for enclosing the binary watermark image. The results display the superiority of this algorithm for copyright-protection watermarking: the optimum positions are obtained with the lowest effect on the PSNR values, even for the watermarked images.
    Keywords: cuckoo search algorithm; area of the best fit equation; gray-scale image watermarking; discrete wavelet transform.

  • Term extraction and correlation analysis based on massive scientific and technical literature   Order a copy of this article
    by Wen Zeng 
    Abstract: Scientific and technical terms are the basic units of knowledge discovery and organisation construction. Correlation analysis is one of the important technologies for deep data mining of massive and heterogeneous scientific and technical literature. Based on freely available digital library resources, this study adopts natural language processing technology to analyse the linguistic characteristics of terms, and combines it with statistical analysis to extract terms from scientific and technical literature. Using the results of term extraction, the paper proposes an improved VSM algorithm for calculating the correlation between different scientific and technical documents. The experimental results suggest a new way to automatically extract terms and perform correlation analysis for massive scientific and technical literature from different sources. Our method is superior to the method that does not adopt linguistic rules and MI calculation. The accuracy of the extracted terms is about 73.5%. Compared with the traditional term-based VSM, the correct rate of the correlation calculation is increased by 12%.
    Keywords: term extraction; correlation analysis; scientific and technical literature; knowledge discovery and organisation; big data.
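
The core of any VSM correlation calculation is cosine similarity between term vectors. The sketch below uses raw term frequencies for illustration; the paper's improvement re-weights terms extracted with linguistic rules, which is omitted here, and the example documents are invented.

```python
# Cosine similarity between term-frequency vectors (plain VSM).
from math import sqrt
from collections import Counter

def cosine(doc_a, doc_b):
    a, b = Counter(doc_a.split()), Counter(doc_b.split())
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

d1 = "neural network training neural model"
d2 = "neural model inference"
d3 = "coastal sediment transport"
print(round(cosine(d1, d2), 3), round(cosine(d1, d3), 3))  # → 0.655 0.0
```

Restricting the vector dimensions to extracted technical terms, rather than all tokens, is what lets such a measure compare documents across heterogeneous literature sources.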

  • Hybrid fuzzy collaborative filtering: an integration of item-based and user-based clustering techniques   Order a copy of this article
    by Pratibha Yadav, Shweta Tyagi 
    Abstract: Collaborative filtering is the most widely adopted technique in recommender systems, presenting individualised information based on analysis of users' past behaviour and selections. In the literature, numerous collaborative filtering approaches have been put forward. Clustering is one of the successful model-based collaborative filtering techniques that deals with the problem of sparsity and provides quality recommendations. The problem with the clustering approach is that it imposes a unique membership constraint on the users/items. This issue is addressed in the literature by employing fuzzy c-means clustering, a soft clustering technique which allows an element to belong to more than one cluster. Traditionally, the fuzzy c-means clustering technique is adopted with collaborative filtering to first produce item-based fuzzy clusters and then to generate recommendations. In the proposed work, the fuzzy c-means clustering technique is adopted to produce item-based clusters as well as user-based clusters. Subsequently, the collaborative filtering technique explores the item-based and user-based clusters and generates lists of item-based and user-based predictions, respectively. Further, to enhance the quality of recommendations, a novel weighted hybrid scheme is designed which integrates the user-based and item-based predictions to capture the influence of each active user on the item-based and user-based predictions. The proposed schemes are further categorised on the basis of re-clustering and no re-clustering, under different similarity measures, over sparse and dense datasets. The experimental results reveal that the variants of the proposed hybrid schemes consistently generate better results than the corresponding variants of the proposed user-based schemes and the traditional item-based schemes.
    Keywords: recommender system; collaborative filtering; fuzzy C-means clustering; sparsity
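
The soft-membership property the hybrid scheme relies on comes from the fuzzy c-means membership rule: each user/item receives a degree of membership in every cluster, u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)). The sketch below evaluates only this rule with fixed 1-D centres for illustration; a full FCM run would alternate membership and centre updates over real rating profiles.

```python
# Fuzzy c-means membership degrees for a point against fixed centres.

def memberships(x, centres, m=2.0):
    d = [abs(x - c) for c in centres]
    if 0.0 in d:                       # point sits exactly on a centre
        return [1.0 if di == 0 else 0.0 for di in d]
    return [1.0 / sum((d[i] / d[j]) ** (2 / (m - 1)) for j in range(len(d)))
            for i in range(len(d))]

centres = [1.0, 5.0]                   # two illustrative cluster centres
for x in (1.0, 2.0, 3.0):
    print(x, [round(u, 3) for u in memberships(x, centres)])
# → 1.0 [1.0, 0.0]   2.0 [0.9, 0.1]   3.0 [0.5, 0.5]
```

Unlike hard k-means, the point midway between centres belongs equally to both clusters, which is exactly what lets a user contribute to predictions from more than one neighbourhood.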

  • Towards patent text analysis based on semantic role labelling   Order a copy of this article
    by Yanqing He, Ying Li, Ling'en Meng, Hongjiao Xu 
    Abstract: Mining patent texts can yield valuable technical information and competitive intelligence, which is important for the development of technology and business. Current patent text-mining approaches suffer from a lack of effective, automatic, accurate and wide-coverage techniques that can annotate natural language texts with semantic argument structure. Deriving more meaningful semantic relationships from the semantic role labelling (SRL) results of patents is therefore helpful for text mining. This paper uses Word2Vec to learn real-valued word vectors and designs word-vector features to train an SRL parser. Based on the SRL parser, two patent text mining methods are then given: patent topic extraction and automatic construction of a patent technical effect matrix (PTEM). Experiments show that semantic role labelling helps to achieve satisfactory results and saves manpower.
    Keywords: patent technical effect matrix; semantic role labelling; IPC; patent analysis; word vector; patent topic extraction; semantic analysis; text mining.

  • Efficient attribute selection strategies for association rule mining in high dimensional data   Order a copy of this article
    by Sandhya Harikumar, Divya Usha Dilipkumar, M. Ramachandran Kaimal 
    Abstract: This paper presents a new computational approach to discovering interesting relations between variables, called association rules, in large and high-dimensional datasets. State-of-the-art techniques are computationally expensive for reasons such as high dimensionality, generation of a huge number of candidate sets, and multiple database scans. In general, most of the enormous number of discovered patterns are obvious, redundant or uninteresting to the user. The aim of this paper is therefore to improve the Apriori algorithm to find association rules pertaining only to the important attributes of high-dimensional data. We employ an information-theoretic method together with QR decomposition to represent the data in its proper substructure form without losing its semantics. Specifically, we present a feature selection approach based on an entropy measure, which is leveraged in the QR decomposition process to find significant attributes. This helps to express the dataset in compact form by projecting it into different subspaces. Association rule mining based on these significant attributes improves on the traditional Apriori algorithm in terms of candidate set generation, rules mined, and time complexity. Experiments on real datasets and comparison with an existing technique reveal that the proposed strategy is computationally always faster than, and statistically always comparable with, the classic algorithms.
    Keywords: association rule mining; Apriori algorithm; entropy; QR decomposition.
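
For reference, here is a sketch of the mining stage that the attribute selection feeds: a plain Apriori pass that keeps only itemsets meeting a minimum support. In the paper's approach this would run over the significant attributes retained by the entropy/QR step; here it runs on a toy transaction set.

```python
# Plain Apriori frequent-itemset mining (toy transaction set).
from itertools import combinations  # kept for rule generation extensions

def apriori(transactions, min_sup):
    items = sorted({i for t in transactions for i in t})
    frequent, k = [], 1
    level = [frozenset([i]) for i in items]
    while level:
        counts = {c: sum(c <= t for t in transactions) for c in level}
        kept = [c for c, n in counts.items() if n >= min_sup]
        frequent += kept
        k += 1
        # Next candidates: unions of kept sets that have exactly k items.
        level = list({a | b for a in kept for b in kept if len(a | b) == k})
    return frequent

tx = [frozenset(t) for t in
      [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]]
for s in apriori(tx, min_sup=3):
    print(sorted(s))   # all singletons and pairs survive; {a,b,c} does not
```

Shrinking the attribute set before this pass attacks exactly the cost named in the abstract: the candidate `level` grows combinatorially with the number of attributes.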

Special Issue on: Advanced Cooperative Computing

  • Towards optimisation of replicated erasure codes for efficient cooperative repair in cloud storage systems   Order a copy of this article
    by Guangping Xu, Qunfang Mao, Huan Li 
    Abstract: The study of erasure codes in distributed storage systems has two aspects: one is to reduce data redundancy and the other is to save bandwidth cost during the repair process. Repair-efficient codes have been investigated to improve repair performance; however, this research is mostly at the theoretical stage and is hardly applied in practical distributed storage systems such as cloud storage. In this paper, we present a unified framework to describe several repair-efficient regenerating codes that reduce the bandwidth cost of regenerating lost data. We build an evaluation system to measure the performance of these codes during file encoding, file decoding and individual failure repair with given feasible parameters. Through experimental comparison and analysis, we validate that repair-efficient regenerating codes can save significantly more repair time than traditional erasure codes at the same storage cost; in particular, some replication-based erasure codes perform better than others in some cases. Our experiments can help researchers decide which kind of erasure codes to use when building distributed storage systems.
    Keywords: erasure codes; distributed storage systems; data recovery; repair-efficient codes
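The bandwidth trade-off at stake can be illustrated with a back-of-the-envelope repair-cost model; the schemes, parameters and the MSR-style cost formula here are textbook simplifications, not the paper's evaluation framework:

```python
def repair_cost(block_size_mb, scheme):
    """Data transferred (MB) to repair one lost block under a naive model."""
    if scheme == "replication-3x":
        return block_size_mb                    # re-fetch a surviving replica
    if scheme == "rs-(6,3)":
        return 6 * block_size_mb                # read k = 6 blocks, re-encode
    if scheme == "regenerating-d9":
        d, k = 9, 6                             # contact d helpers, partial reads
        return d * block_size_mb / (d - k + 1)  # MSR bound: d*B/(d-k+1)
    raise ValueError(scheme)

for scheme in ("replication-3x", "rs-(6,3)", "regenerating-d9"):
    print(scheme, repair_cost(64, scheme), "MB")  # repairing one 64 MB block
```

Replication repairs cheapest but stores the most; classic Reed-Solomon stores least but must read k whole blocks; regenerating codes sit between the two, which is exactly the region the paper's evaluation explores.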

  • A matching approach to business services and software services   Order a copy of this article
    by Junfeng Zhao 
    Abstract: Recent studies have shown that Service-Oriented Architecture (SOA) has the potential to revive enterprise legacy systems [1-10], making their continued service in the corporate world viable. In the process of reengineering legacy systems to SOA, some software services extracted from a legacy system can be reused to implement business services in the target system. In order to achieve efficient reuse of software services, a matching approach is proposed to extract the software services related to specified business services, in which semantic and structural similarity measures are integrated to evaluate the degree of similarity between business services and software services. Experiments indicate that the approach can efficiently map business services to relevant software services, so that legacy systems can be reused as much as possible.
    Keywords: software service; business service; matching approach; semantics similarity measure; structure similarity measure

  • A new model of vehicular ad hoc networks based on artificial immune theory   Order a copy of this article
    by Yizhe Zhou, Depin Peng 
    Abstract: Vehicular ad hoc networks (VANETs) are highly mobile and wireless networks intended to aid vehicular safety and traffic monitoring. To achieve these goals, we propose a VANET model based on immune network theory. Our model outperforms the Delay Tolerant Mobility Sensor Network (DTMSN) model over a range of node numbers in terms of data packet arrival delay, arrival ratio, and throughput. These findings held true for the on-demand distance vector and connection-based restricted forwarding routing protocols. The model performed satisfactorily on a real road network.
    Keywords: networking model; vehicular ad hoc networks; artificial immune theory; real-time capacity.

  • Feature binding pulse-coupled neural network model using a double color space   Order a copy of this article
    by Hongxia Deng, Han Li, Sha Chang, Jie Xu, Haifang Li 
    Abstract: The feature binding problem is one of the central issues in cognitive science and neuroscience. To bind the colour and shape features of a colour image, a double-space vector feature binding PCNN (DVFB-PCNN) model is proposed based on the traditional pulse-coupled neural network (PCNN). In this model, combining the RGB colour space with the HSI colour space solves the problem that not all colours can be separated completely in a single space. Through the first pulse emission time of the neurons, the different features are separated successfully; through the colour sequence produced by this process, the features belonging to the same perceived object are bound together. Experiments show that the model can successfully separate and bind image features and will be a valuable tool for applying PCNNs to feature binding in colour images.
    Keywords: feature binding; double-space; pulse emission time.

  • Signal prediction based on boosting and decision stump   Order a copy of this article
    by Lei Shi 
    Abstract: Signal prediction has attracted increasing attention from the data mining and machine learning communities. A decision stump is a one-level decision tree that classifies instances by sorting them based on feature values. Boosting is a powerful ensemble method that can significantly improve prediction performance. In this paper, boosting and the decision stump algorithm are combined to analyse and predict signal data. An experimental evaluation carried out on a public signal dataset shows that the boosting and decision stump-based algorithm improves the performance of signal prediction significantly.
    Keywords: decision stump; boosting; signal prediction.
    DOI: 10.1504/IJCSE.2016.10006637
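A minimal from-scratch sketch of the combination, pairing AdaBoost with one-level decision trees on toy data (illustrative only; the paper's dataset and implementation are not reproduced):

```python
import numpy as np

def fit_stump(X, y, w):
    """Exhaustively pick the best weighted one-level decision tree."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = np.where(polarity * (X[:, j] - t) >= 0, 1, -1)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def adaboost(X, y, rounds=10):
    """AdaBoost.M1 over decision stumps; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        err, j, t, pol = fit_stump(X, y, w)
        err = max(err, 1e-10)                     # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
        w = w * np.exp(-alpha * y * pred)         # upweight the mistakes
        w /= w.sum()
        ensemble.append((alpha, j, t, pol))
    return ensemble

def predict(ensemble, X):
    score = sum(a * np.where(p * (X[:, j] - t) >= 0, 1, -1)
                for a, j, t, p in ensemble)
    return np.sign(score)

# Toy "signal": the label is the sign of feature 0 relative to 3.5;
# feature 1 is an irrelevant distractor.
X = np.array([[1, 5], [2, 1], [3, 8], [4, 2], [5, 9], [6, 3]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])
model = adaboost(X, y, rounds=5)
print((predict(model, X) == y).all())  # → True
```

The weighted vote of stumps is what boosting adds over a single one-level tree: each round reweights the examples the previous stump got wrong.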
     
  • Using online dictionary learning to improve Bayer pattern image coding   Order a copy of this article
    by Tingyi Zheng, Li Wang 
    Abstract: Image quality is a fundamental concern in image compression. The compression process introduces noise, which can prevent users from obtaining precise identification, and noise has therefore usually been neglected in past image compression research. In fact, noise can play a beneficial role in image reconstruction. In this paper, we take noise into consideration and propose a coding method for Bayer pattern images based on online dictionary learning. Our investigations show that the proposed method can improve the rate-distortion performance of Bayer pattern image coding at any rate.
    Keywords: Bayer pattern image; online dictionary learning; rate distortion.

Special Issue on: ICNC-FSKD'15 Machine Learning, Data Mining and Knowledge Management

  • An improved ORNAM representation of gray images   Order a copy of this article
    by Yunping Zheng, Mudar Sarem 
    Abstract: An efficient image representation can save space and facilitate the manipulation of acquired images. In order to further enhance reconstructed image quality and reduce the number of homogeneous blocks in the overlapping rectangular non-symmetry and anti-packing model (ORNAM) representation, in this paper we propose an improved overlapping rectangular non-symmetry and anti-packing model representation (IORNAM) of gray images. Compared with most up-to-date, state-of-the-art hierarchical representation methods, the new IORNAM representation is characterised by two properties: (1) it adopts a ratio parameter of the length and width of a homogeneous block to improve the reconstructed image quality; and (2) it uses a new expansion method to anti-pack the subpatterns of gray images to further decrease the number of homogeneous blocks, which is important for improving the compression ratios of image representation and reducing the complexity of many image manipulation algorithms. The experimental results presented in this paper demonstrate that the new IORNAM representation achieves high representation efficiency for gray images and outperforms most up-to-date, state-of-the-art hierarchical representation methods of gray images.
    Keywords: gray image representation; extended Gouraud shading approach; overlapping rectangular NAM; ORNAM; spatial data structures; S-Tree coding; spatial- and DCT-based.

  • Genetic or non-genetic prognostic factors for colon cancer classification   Order a copy of this article
    by Meng Pan, Jie Zhang 
    Abstract: Many studies have addressed patient classification using prognostic factors or gene expression profiles (GEPs). This study tried to identify whether a prognostic factor is genetic by using GEPs. If a significant GEP difference is observed between the two statuses of a factor, the factor may be genetic. If the GEP difference is largely less significant than the survival difference, the survival difference may not be due to the genes, and the factor may be non-genetic or partly non-genetic. A case study was carried out using the public dataset GSE40967, which contains GEP data for 566 colon cancer patients together with tumor-node-metastasis (TNM) staging information. The prognostic factors T, N, M and TNM were observed to be non-genetic or partly non-genetic, and should therefore complement future gene expression classifiers.
    Keywords: gene expression profiles; prognostic factor; colon cancer; classification; survival

  • A medical training system for the operation of heart-lung machine   Order a copy of this article
    by Ren Kanehira 
    Abstract: There is a strong tendency to use Information and Communication Technology (ICT) to construct education and training systems that help students and other learners master necessary skills more easily. Among such systems, those providing operational practice are particularly welcome, in addition to conventional e-learning systems aimed mainly at textbook-like knowledge. In this study, we propose a medical training system for the operation of a heart-lung machine. Two training modules, one for basic operation and another for troubleshooting, are constructed in the system and their effects are tested.
    Keywords: computer-aided training; skill science; medical training; heart-lung machine; operation supporting; e-learning; clinic engineer.

Special Issue on: BDA 2014 and 2015 Conferences and DNIS 2014 and 2015 Workshops Data Modelling and Information Infrastructure in Big Data Analytics

  • Automatic identification and classification of Palomar Transient Factory astrophysical objects in GLADE   Order a copy of this article
    by Weijie Zhao, Florin Rusu, John Wu, Peter Nugent 
    Abstract: The Palomar Transient Factory is a comprehensive detection system for the identification and classification of transient astrophysical objects. The central piece of the identification pipeline is an automated classifier that distinguishes between real and bogus objects with high accuracy. The classifier consists of two components: real-time and offline. Response time is the critical characteristic of the real-time component, whereas accuracy is representative of the offline in-depth analysis. In this paper, we make two significant contributions. First, we present an experimental study that evaluates a novel implementation of the real-time classifier in GLADE, a parallel data processing system that combines the efficiency of a database with the extensibility of Map-Reduce. We show how each stage in the classifier - candidate identification, pruning, and contextual real-bogus - maps optimally onto GLADE tasks by taking advantage of the unique features of the system: range-based data partitioning, columnar storage, multi-query execution, and in-database support for complex aggregate computation. The result is an efficient classifier implementation capable of processing a new set of acquired images in a matter of minutes, even on a low-end server; for comparison, an optimised PostgreSQL implementation of the classifier takes hours on the same machine. Second, we introduce a novel parallel similarity join algorithm for advanced transient classification. This algorithm operates offline and considers the entire candidate dataset, consisting of all the objects extracted over the lifetime of the Palomar Transient Factory survey. We implement the similarity join algorithm in GLADE and execute it on a massive supercomputer with more than 3000 threads, achieving more than three orders of magnitude improvement over the optimised PostgreSQL solution.
    Keywords: parallel databases; multi-query processing; scientific data analysis; similarity join; astronomical surveys; transient identification

  • Trust and reputation based multi-agent recommender system   Order a copy of this article
    by Punam Bedi, Sumit Agarwal, Richa Singh 
    Abstract: User profile modelling for the tourism domain differs from that for most other domains, such as books or movies. The structure of a tourist product is more complex than that of a movie or a book, and the frequency of activities and ratings in the tourism domain is also lower than in other domains. To address these challenges, this study proposes a Trust and Reputation based Collaborative Filtering (TRbCF) algorithm, which augments an existing collaborative approach with a notion of dynamic trust between users and the reputation of items to generate relevant recommendations. A Multi-Agent Recommender System for e-Tourism (MARST) for recommending tourism services using the TRbCF algorithm is designed and a prototype is developed. TRbCF also helps to handle the new-user cold-start problem. The developed system can generate recommendations for hotels, places to visit and restaurants in a single place, whereas most existing recommender systems focus on one service at a time.
    Keywords: multi-agent system; recommender system; e-tourism; trust; reputation

  • Anomaly-free search using multi-table entity attribute value data model   Order a copy of this article
    by Shivani Batra, Shelly Sachdeva 
    Abstract: This paper proposes a principled extension of Dynamic Tables (DT), termed the Multi-Table Entity Attribute Value (MTEAV) model, which offers a search-efficient avenue for storing a database. The paper presents precise semantics for MTEAV and demonstrates the following aspects: (1) MTEAV possesses consistency and availability; (2) MTEAV outperforms other existing models (the Entity Attribute Value model, Dynamic Tables, Optimised Entity Attribute Value and the Optimised Column Oriented Model) under various query scenarios and varying dataset sizes; (3) MTEAV retains the flavour of EAV in terms of handling sparseness and its self-adapting schema-changing capability. To heighten the adaptability of MTEAV, a translation layer is implemented over an existing SQL engine in a non-intrusive way. The translation layer transforms a conventional SQL query (as per the horizontal row representation) into a new SQL query (as per the MTEAV structure) to maintain user friendliness, making users feel as if they are interacting with the conventional horizontal row approach. The paper also critically analyses the maximum percentage of non-null density appropriate for choosing MTEAV as a storage option.
    Keywords: database; dynamic tables; entity attribute value model; optimised entity attribute value; optimised column-oriented model; search efficiency; storage efficiency.
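The flavour of a multi-table EAV layout and the query translation it demands can be sketched with SQLite; the schema, table names and `translate` helper below are invented for illustration and are far simpler than MTEAV's actual translation layer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Multi-table EAV: one narrow (entity, value) table per attribute, so a
# missing attribute is simply never stored (sparseness handling).
cur.executescript("""
    CREATE TABLE attr_name (entity INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE attr_age  (entity INTEGER PRIMARY KEY, value INTEGER);
    INSERT INTO attr_name VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO attr_age  VALUES (1, 52), (3, 44);      -- bob's age unknown
""")

def translate(columns, predicate_column, predicate):
    """Rewrite a conventional 'horizontal row' projection + filter into a
    join over the per-attribute tables (the job of the translation layer)."""
    joins = " ".join(
        f"LEFT JOIN attr_{c} ON attr_{c}.entity = e.entity" for c in columns
    )
    select = ", ".join(f"attr_{c}.value" for c in columns)
    return (f"SELECT {select} FROM attr_{predicate_column} AS e {joins} "
            f"WHERE attr_{predicate_column}.value {predicate}")

# User's intent: SELECT name, age FROM person WHERE age > 40
rows = cur.execute(translate(["name", "age"], "age", "> 40")).fetchall()
print(sorted(rows))  # → [('alice', 52), ('carol', 44)]
```

Adding a new attribute means creating one small table rather than altering a wide schema, which is the self-adapting property the abstract refers to.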

  • Secure k-objects selection for a keyword query based on MapReduce skyline algorithm   Order a copy of this article
    by Asif Zaman, Md. Anisuzzaman Siddique, Annisa, Yasuhiko Morimoto 
    Abstract: The keyword query interface has become a de facto standard in information retrieval, and such systems have been used by the community for decades: the user gives a keyword, and objects closely related to that keyword are returned. Selecting the necessary objects for a keyword query is considered one of the most important query problems. The top-k query is a popular method for selecting important objects from a large number of candidates: a user specifies a scoring function and k, the number of objects to be retrieved, and k objects are then selected according to that scoring function. However, users' scoring functions may not be identical, which implies that the top-k objects are valuable only for users whose scoring functions are similar. Meanwhile, the privacy of data during the selection process is also a pressing issue; in some cases, especially in multi-party computation, parties may not want to disclose any information during processing. In this paper, we propose a k-object selection procedure that selects k objects that are preferable for all users, even when their scoring functions are not identical. During the selection of the k objects, the proposed method prevents the disclosure of sensitive values. The ideas of the skyline and top-k queries, along with a perturbed cipher, are used to select the k objects securely, and we realise this secure computation efficiently using the MapReduce framework.
    Keywords: skyline query; top-k query; data privacy; MapReduce; mobile phone interface.
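The skyline building block can be sketched as a plain dominance test (illustration only; the paper's contribution lies in performing this securely with perturbed ciphers inside MapReduce):

```python
def dominates(a, b):
    """a dominates b if a is no worse in every dimension and strictly
    better in at least one (here, smaller is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def skyline(points):
    """Objects not dominated by any other object; any user with a monotone
    scoring function finds a personal favourite inside this set."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# (price, distance) pairs for candidate objects; smaller is better in both.
hotels = [(50, 8), (35, 10), (60, 2), (45, 7), (55, 3), (35, 12)]
print(skyline(hotels))  # → [(35, 10), (60, 2), (45, 7), (55, 3)]
```

Because the skyline contains every possible top-1 under monotone scoring, returning k objects drawn from it serves all users at once, regardless of their individual scoring functions.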

  • High performance adaptive traffic control for efficient response in vehicular ad hoc networks   Order a copy of this article
    by Vinita Jindal, Punam Bedi 
    Abstract: Nowadays, with the advent of CUDA, a parallel computing platform and programming model, there has been a dramatic increase in computing performance through harnessing the power of the GPU. GPU computing with CUDA can be used to find efficient solutions to many complex real-world problems. One such problem is traffic signal control, which manages conflicting movements at intersections to avoid accidents and ensure a smooth, safe and efficient flow of traffic. The Adaptive Traffic Control (ATC) algorithm is used in the literature to reduce the average queue length at intersections; however, it has a serial implementation on a single CPU and hence requires substantial computation time. In this paper, we propose a high-performance ATC that provides efficient responses, reducing the average queue length and hence the overall waiting time at intersections. We tested our proposed approach with varying numbers of vehicles on two real-world networks, and compared the performance of the proposed algorithm with its serial counterpart.
    Keywords: VANETs; GPU; CUDA; adaptive control; traffic signals.

  • Smart city workflow patterns for qualitative aggregate information retrieval from distributed public information resources   Order a copy of this article
    by Wanming Chu 
    Abstract: We examine a workflow pattern system for public information drawn from multiple resources. The system aggregates timetable information from bus companies, city information from the internet, and the city's public facilities map to generate geographic data. Multiple query methods are used to obtain the target information: for example, one of the search results can be set as the origin or destination of a bus route, and the shortest bus route with the minimum number of bus stops between the origin and destination can then be found using the bus routing function. The query results and the shortest bus route are visualised on the embedded map, with detailed search information shown in the sidebar. By finding city information and transportation routes, the system helps residents and visitors use the city's public transportation more efficiently for daily life, business, and travel planning.
    Keywords: GIS; query interface; routing query over heterogeneous information resources.

  • Computational intelligence methods for data mining of causality extent in time series   Order a copy of this article
    by Lukas Pichl, Taisei Kaizoji 
    Abstract: Data mining of causality extent in the time series of economic data is an important area of computational intelligence research with direct applications to algorithmic trading or risk diversification strategies. Based on the particular market and the time scale employed, the causal rates are expected to vary widely. In this work we adopt the Support Vector Machine (SVM) and Artificial Neural Network (ANN) for causality rate extraction. The dataset records all details of the futures contracts on the commodity of gasoline traded in Japan. By sampling the tick data at 1 min, 5 min, 10 min, 30 min, 1 hour and 1 day scales, we derive time series of varying causal degree. Trend predictions are computed by using the SVM binary classifier trained on 66.6% of the data using a five-step-back moving window, which samples the log returns as the predictor data. From the testing data we extract varying rates of causality degree, starting from the borderline of 50% up to the order of 60% in rare cases. The trend prediction analysis is complemented by the ANN method with four hidden layers. We find that whereas the SVM outperforms the ANN in most cases, the opposite may also be true on occasions. In general, whereas considerable causality rates are observed at some high-frequency sampled data segments, returns at the longer time scales are predictable to a lesser extent. Overall, the market of the gasoline futures in Japan is found to be rather close to the efficient market hypothesis in comparison with other commodities markets.
    Keywords: financial futures; artificial neural network; support vector machine; trend prediction; causality extraction.
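The five-step-back moving-window construction described above can be sketched as follows; the classifier is replaced here by a trivial majority-sign baseline, since the paper's SVM/ANN setup is not reproduced:

```python
import numpy as np

def make_windows(prices, lags=5):
    """Build (X, y): each row of X holds `lags` consecutive log returns,
    y is the direction (+1/-1) of the following return."""
    r = np.diff(np.log(prices))
    X = np.array([r[i:i + lags] for i in range(len(r) - lags)])
    y = np.sign(r[lags:])
    return X, y

def majority_sign_predict(X):
    """Toy baseline: predict whichever direction dominates the window."""
    return np.where(X.sum(axis=1) >= 0, 1.0, -1.0)

prices = np.array([100, 101, 103, 102, 104, 107, 108, 107, 109, 112, 113.0])
X, y = make_windows(prices, lags=5)
print(X.shape, y.shape)                        # → (5, 5) (5,)
print((majority_sign_predict(X) == y).mean())  # in-sample hit rate
```

A causality rate near 50% on such windows is what an efficient market would produce; the paper's measured deviation above 50% is the causal degree extracted by the SVM/ANN.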

  • A dataflow platform for applications based on linked data   Order a copy of this article
    by Miguel Ceriani, Paolo Bottoni 
    Abstract: Modern software applications increasingly benefit from accessing the multifarious and heterogeneous Web of Data, thanks to the use of web APIs and linked data principles. In previous work, the authors proposed a platform to develop applications consuming linked data in a declarative and modular way. This paper describes in detail the functional language the platform gives access to, which is based on SPARQL (the standard query language for linked data) and on the dataflow paradigm. The language features interactive and meta-programming capabilities so that complex modules/applications can be developed. By adopting a declarative style, it favours the development of modules that can be reused in various specific execution contexts.
    Keywords: linked data; Semantic Web; SPARQL; RDF; dataflow; declarative programming.

Special Issue on: ICICS 2016 Next Generation Information and Communication Systems

  • Is a picture worth a thousand words? A computational investigation of the modality effect   Order a copy of this article
    by Naser Al Madi, Javed Khan 
    Abstract: The modality effect refers to differences in learning performance in relation to the mode of presentation. It is an interesting phenomenon that impacts education, online learning, and marketing, among many other areas of life. In this study, we use electroencephalography (EEG Alpha, Beta, and Theta bands) and computational modelling of comprehension to study the modality effect in text and multimedia. First, we provide a framework for evaluating learning performance, working memory, and emotions during learning. Second, we apply these tools to investigate the modality effect computationally, focusing on text in contrast to multimedia. This study is based on a dataset that we collected through a human experiment involving 16 participants. Our results are important for future learning systems that incorporate learning performance, working memory, and emotions in a continuous feedback system that measures and optimises learning as it happens rather than after the fact.
    Keywords: modality effect; comprehension; electroencephalography; learning; education; text; multimedia; semantic networks; recall; emotions.

  • Automated labelling and severity prediction of software bug reports   Order a copy of this article
    by Ahmed Otoom, Doaa Al-Shdaifat, Maen Hammad, Emad Abdallah, Ashraf Aljammal 
    Abstract: We target two research problems related to bug tracking systems: bug severity prediction and automated bug labelling. Our main aim is to develop an intelligent classifier capable of predicting the severity and label (type) of a newly submitted bug report in a bug tracking system. For this purpose, we build two datasets based on 350 bug reports from the open-source community (Eclipse, Mozilla, and Gnome). These datasets are characterised by various textual features extracted from the summary and description of the bug reports of the aforementioned projects. Based on this information, we train a variety of discriminative models that can be used for automated labelling and severity prediction of a newly submitted bug report. A boosting algorithm is also implemented for enhanced performance. The classification performance is measured using accuracy and a set of other measures, including precision, recall, F-measure and the area under the Receiver Operating Characteristic (ROC) curve. For automated labelling, the accuracy reaches around 91% with the AdaBoost algorithm under a cross-validation test; for severity prediction, the proposed feature set proves successful with a classification accuracy of around 67% under the same setting. Experimental results for varying training set sizes are also presented. Overall, the results are encouraging and show the effectiveness of the proposed feature sets.
    Keywords: severity prediction; software bugs; machine learning; bug labelling.

Special Issue on: ISTA'16 Intelligent Systems Technologies and Applications

  • FS-CARS: fast and scalable context-aware news recommender system using tensor factorisation   Order a copy of this article
    by Anjali Gautam, Punam Bedi 
    Abstract: Matrix factorisation is a widely adopted approach to collaborative filtering that factorises the user-item rating matrix to generate recommendations. The user-item rating matrix can be extended to incorporate the user's context, resulting in a rating tensor that can be factorised to generate better-quality context-aware recommendations. Tensor factorisation is a computationally intensive task, and its computational time can be significantly reduced using a distributed and scalable framework. This paper proposes a context-aware news recommender system that classifies news items into different categories and incorporates the user's context, resulting in a rating tensor that is then factorised to generate recommendations. News items are highly dynamic and are generated in large numbers, which can further increase the computational time. To keep the computation time of the process stable, the proposed system is implemented on the distributed and scalable framework of Apache Spark using the MLlib library. The proposed recommender system is evaluated for performance and computational time.
    Keywords: context-aware RS; tensor factorisation; matrix factorisation; Apache Spark.
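The tensor factorisation at the core of the system can be sketched with a plain alternating-least-squares CP decomposition in NumPy (a single-machine illustration; the paper's point is precisely to run this kind of computation distributed on Spark):

```python
import numpy as np

def cp_als(X, rank, iters=500, seed=0):
    """CP decomposition of a 3-way tensor by alternating least squares:
    each factor matrix is updated in closed form with the other two fixed."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    for _ in range(iters):
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def reconstruct(A, B, C):
    """Rebuild the full tensor from the factor matrices."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

# Synthetic 6 users x 5 items x 4 contexts rating tensor of exact rank 2.
rng = np.random.default_rng(1)
X = reconstruct(rng.standard_normal((6, 2)),
                rng.standard_normal((5, 2)),
                rng.standard_normal((4, 2)))
A, B, C = cp_als(X, rank=2)
rel_err = np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)
print(rel_err)   # should be tiny for an exact low-rank tensor
```

Missing ratings are predicted from the reconstructed tensor; distributing the three einsum-style updates over partitions of X is what the Spark implementation buys.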

Special Issue on: Novel Strategies for Programming Accelerators

  • Evaluating attainable memory bandwidth of parallel programming models via BabelStream   Order a copy of this article
    by Tom Deakin, James Price, Matt Martineau, Simon McIntosh-Smith 
    Abstract: Many scientific codes consist of memory-bandwidth-bound kernels: the dominant factor in the runtime is the speed at which data can be loaded from memory into the arithmetic logic units, before results are written back to memory. One major advantage of many-core devices such as General Purpose Graphics Processing Units (GPGPUs) and the Intel Xeon Phi is their focus on providing increased memory bandwidth over traditional CPU architectures. However, as with CPUs, this peak memory bandwidth is usually unachievable in practice, and so benchmarks are required to measure a practical upper bound on expected performance. We augment the standard set of STREAM kernels with a dot product kernel to investigate the performance of simple reduction operations on large arrays; such kernels are usually present in scientific codes and are still memory-bandwidth bound. The choice of one programming model over another should ideally not limit the performance that can be achieved on a device. BabelStream (formerly GPU-STREAM) has been updated to incorporate a wide variety of the latest parallel programming models, all implementing the same parallel scheme. As such, this tool can be used as a kind of 'Rosetta Stone' that provides a cross-platform, cross-programming-model array of achievable memory bandwidth results.
    Keywords: performance portability; many-core; parallel programming models; memory bandwidth benchmark.
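The spirit of the STREAM kernels plus the added dot product reduction can be sketched in NumPy (a rough single-node illustration, not BabelStream itself; note that NumPy materialises the `s * c` temporary in triad, so the measured traffic is approximate):

```python
import time
import numpy as np

def bandwidth_gbs(bytes_moved, seconds):
    """Sustained memory bandwidth in GB/s."""
    return bytes_moved / seconds / 1e9

n = 4_000_000                        # array length; grow it to exceed caches
a = np.zeros(n)
b = np.full(n, 2.0)
c = np.full(n, 0.5)
s = 3.0

kernels = {                          # name: (body, bytes moved per element)
    "copy":  (lambda: np.copyto(a, b),          2 * 8),
    "triad": (lambda: np.add(b, s * c, out=a),  3 * 8),
    "dot":   (lambda: float(np.dot(b, c)),      2 * 8),
}

for name, (run, bytes_per_elem) in kernels.items():
    run()                            # warm-up pass
    t0 = time.perf_counter()
    run()
    dt = time.perf_counter() - t0
    print(f"{name:5s} {bandwidth_gbs(n * bytes_per_elem, dt):8.1f} GB/s")
```

The dot kernel moves the same data as copy but reduces to a scalar, which is why reductions deserve their own entry alongside the classic STREAM set.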

  • Array streaming for array programming   Order a copy of this article
    by Mads Kristensen, James Avery 
    Abstract: A barrier to efficient array programming, for example in Python/NumPy, is that algorithms written as pure array operations entirely without loops, while most efficient on small inputs, can lead to explosions in memory use. This paper presents a solution to this problem using array streaming, implemented in Bohrium, a high-performance framework for automatic parallelisation. This makes it possible to use array programming in Python/NumPy code directly, even when the apparent memory requirement exceeds the machine's capacity, since automatic streaming eliminates the temporary memory overhead by performing calculations in per-thread registers. Using Bohrium, we automatically fuse, JIT-compile, and execute NumPy array operations on GPGPUs without modification to the user programs. We present performance evaluations of three benchmarks, all of which show dramatic reductions in memory use from streaming, yielding corresponding improvements in speed and use of GPGPU cores. The streaming-enabled Bohrium effortlessly runs programs on input sizes far beyond those that crash pure NumPy by exhausting system memory.
    Keywords: JIT-compilation; high productivity; Python; OpenCL; OpenMP; Bohrium; Numpy; GP-GPU.
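The effect of streaming can be imitated by hand in plain NumPy: reduce chunk by chunk instead of materialising a full-size temporary (a manual sketch of what Bohrium does automatically through fusion and JIT compilation):

```python
import numpy as np

def streamed_sum_of_squares(x, chunk=1_000_000):
    """sum(x**2) without materialising the full x**2 temporary: the
    reduction is applied chunk by chunk, so peak extra memory stays at
    `chunk` elements no matter how long x is."""
    total = 0.0
    for i in range(0, len(x), chunk):
        part = x[i:i + chunk]
        total += float(np.dot(part, part))   # fused multiply-reduce per chunk
    return total

x = np.arange(2_000_001, dtype=np.float64)
naive = float((x ** 2).sum())        # allocates a full-size temporary
streamed = streamed_sum_of_squares(x)
print(np.isclose(naive, streamed))   # → True
```

The naive expression allocates an array as large as the input before reducing it; the streamed version caps extra memory at one chunk, which is the overhead Bohrium's per-thread-register streaming removes entirely.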

  • Applicability of the software cost model COCOMO II to HPC projects   Order a copy of this article
    by Julian Miller, Sandra Wienke, Michael Schlottke-Lakemper, Matthias Meinke, Matthias S. Müller 
    Abstract: The complexity of parallel computer architectures continuously increases with the pursuit of exaflop computing, which makes accurate development effort estimation and modelling more important than ever. While sophisticated cost models are widely used in traditional software engineering, they have rarely been investigated for the performance-oriented HPC domain. Therefore, we evaluate the fit and accuracy of the popular COCOMO II model to HPC setups. We lay out a general methodology to evaluate HPC projects with COCOMO II and analyse its cost parameters for the investigated parallelisation projects with OpenACC on NVIDIA GPUs. Further, we evaluate the accuracy of the model in comparison with the reported efforts of the projects, and investigate the impact of inaccuracies in the cost parameter ratings by means of a global sensitivity analysis.
    Keywords: COCOMO; OpenACC; GPU; development effort; effort estimation; sensitivity analysis.
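For reference, COCOMO II's post-architecture effort equation, which the accuracy evaluation rates against reported efforts, is PM = A · Size^E · Π EM_i with E = B + 0.01 · Σ SF_j. A minimal calculator using the published COCOMO II.2000 calibration constants (A = 2.94, B = 0.91) follows; the example ratings are illustrative, not the paper's:

```python
def cocomo_ii_effort(ksloc, scale_factors, effort_multipliers,
                     a=2.94, b=0.91):
    """COCOMO II post-architecture estimate in person-months:
    PM = A * Size^E * prod(EM), where E = B + 0.01 * sum(SF)."""
    e = b + 0.01 * sum(scale_factors)
    pm = a * ksloc ** e
    for em in effort_multipliers:
        pm *= em
    return pm

# Hypothetical 10 KSLOC GPU port: all five scale factors at nominal
# (summing to 18.97) and CPLX rated 'very high' (multiplier 1.34).
print(round(cocomo_ii_effort(10, [18.97], [1.34]), 1))  # → 49.6
```

Because Size enters through the exponent E and each multiplier enters linearly, rating errors in the cost drivers propagate multiplicatively into PM, which is what the paper's global sensitivity analysis quantifies.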

  • Porting the MPI-parallelized LES model PALM to multi-GPU systems and many integrated core processors: an experience report   Order a copy of this article
    by Helge Knoop, Tobias Gronemeier, Matthias Sühring, Peter Steinbach, Matthias Noack, Florian Wende, Thomas Steinke, Christoph Knigge, Siegfried Raasch, Klaus Ketelsen 
    Abstract: The computational power and availability of graphics processing units (GPUs), such as the Nvidia Tesla, and Many Integrated Core (MIC) processors, such as the Intel Xeon Phi, on high performance computing (HPC) systems are rapidly evolving. However, HPC applications need to be ported to take advantage of such hardware. This paper is a report on our experience of porting the MPI+OpenMP parallelised large-eddy simulation model (PALM) to multi-GPU as well as to MIC processor environments using the directive-based high-level programming paradigms OpenACC and OpenMP, respectively. PALM is a Fortran-based computational fluid dynamics software package, used for the simulation of atmospheric and oceanic boundary layers to answer questions linked to fundamental atmospheric turbulence research, urban modelling, aircraft safety and cloud physics. Development of PALM started in 1997; the project currently entails 140 kLOC and is used on HPC farms of up to 43,200 cores. The main challenges we faced during the porting process are the size and complexity of the PALM code base, its inconsistent modularisation and the complete lack of a unit-test suite. We report the methods used to identify performance issues as well as our experiences with state-of-the-art profiling tools. Moreover, we outline the porting steps required to properly execute our code on GPUs and MIC processors, describe the problems and bottlenecks that we encountered during the porting process, and present separate performance tests for both architectures. These performance tests, however, do not provide any benchmark information that compares the performance of the ported code between the two architectures.
    Keywords: computational fluid dynamics; graphics processing unit; many integrated core processors; Xeon Phi; high performance computing; large-eddy simulation; MPI; OpenMP; OpenACC; porting.

  • Task-based Cholesky decomposition on Xeon Phi architectures using OpenMP   Order a copy of this article
    by Joseph Dorris, Asim YarKhan, Jakub Kurzak, Piotr Luszczek, Jack Dongarra 
    Abstract: The increasing number of computational cores in modern many-core processors, as represented by the Intel Xeon Phi architectures, has created the need for an open-source, high-performance and scalable task-based dense linear algebra package that can efficiently use this type of many-core hardware. In this paper, we examine the design modifications necessary when porting PLASMA, a task-based dense linear algebra library, to run effectively on two generations of Intel's Xeon Phi architecture, known as Knights Corner (KNC) and Knights Landing (KNL). First, we modified PLASMA's tiled Cholesky decomposition to use OpenMP tasks for its scheduling mechanism to enable Xeon Phi compatibility. We then compared the performance of our modified code with that of the original dynamic scheduler running on an Intel Xeon Sandy Bridge CPU. Finally, we looked at the performance of our OpenMP tiled Cholesky decomposition on Knights Corner and Knights Landing processors. We detail the optimisations required to obtain performance on these platforms and compare with the highly tuned Intel MKL math library.
    Keywords: Task-based programming; tile algorithms; Xeon Phi Knights Landing; KNL; Cholesky decomposition; linear algebra; OpenMP.
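The tile algorithm behind the abstract above can be sketched in a few lines. The following is not PLASMA code: it is a minimal sequential NumPy illustration of a blocked Cholesky factorisation in which each kernel call (POTRF, TRSM, SYRK/GEMM) corresponds to one task that a task-based runtime such as OpenMP would schedule; the tile size and matrix are illustrative.

```python
import numpy as np

def tiled_cholesky(A, nb):
    """In-place blocked Cholesky; the lower triangle of A becomes L.

    Each kernel call below is one 'task' in a task-based runtime
    (e.g. an OpenMP task with in/out data dependences on the tiles).
    """
    n = A.shape[0]
    nt = n // nb                          # number of tile rows/columns
    t = lambda i: slice(i * nb, (i + 1) * nb)
    for k in range(nt):
        # POTRF: factor the diagonal tile
        A[t(k), t(k)] = np.linalg.cholesky(A[t(k), t(k)])
        Lkk = A[t(k), t(k)]
        for i in range(k + 1, nt):
            # TRSM: A_ik <- A_ik * Lkk^{-T}
            A[t(i), t(k)] = np.linalg.solve(Lkk, A[t(i), t(k)].T).T
        for i in range(k + 1, nt):
            for j in range(k + 1, i + 1):
                # SYRK (i == j) / GEMM (i > j): trailing-matrix update
                A[t(i), t(j)] -= A[t(i), t(k)] @ A[t(j), t(k)].T
    return np.tril(A)

# Usage: factor a small SPD matrix and compare against NumPy's cholesky.
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 8))
spd = B @ B.T + 8 * np.eye(8)
L = tiled_cholesky(spd.copy(), nb=2)
```

In a task-based version, the POTRF/TRSM/GEMM calls become independent tasks whose ordering is enforced only by data dependences on the tiles, which is what allows the scheduler to overlap them across cores.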

Special Issue on: IEEE ISPA-16 Parallel and Distributed Computing and Applications

  • Method of key node identification in command and control networks based on level flow betweenness   Order a copy of this article
    by Wang Yunming, Pan Cheng-Sheng, Chen Bo, Zhang Duo-Ping 
    Abstract: Key node identification in command and control (C2) networks is an appealing problem that has attracted increasing attention. Owing to the particular nature of C2 networks, traditional algorithms for key node identification suffer from high complexity and unsatisfactory adaptability. A new method of key node identification based on level flow betweenness (LFB) is proposed, which is suited to C2 networks. The proposed method first defines LFB by analysing the characteristics of a C2 network. It then designs an algorithm for key node identification based on LFB, and theoretically derives the algorithm's complexity. Finally, a number of numerical simulation experiments are carried out, and the results demonstrate that this method reduces algorithm complexity, improves identification accuracy and enhances adaptability for C2 networks.
    Keywords: command and control network; complex network; key node identification; level flow betweenness.

  • CODM: an outlier detection method for medical insurance claims fraud   Order a copy of this article
    by Yongchang Gao, Haowen Guan, Bin Gong 
    Abstract: Data is high dimensional in medical insurance claims management, and these datasets contain both dense and sparse regions, so traditional outlier detection methods are not suitable for them. In this paper, we propose a novel method to detect abnormal medical insurance claims as outliers. Our method consists of three core steps: feature bagging to reduce the dimensionality of the data, calculating the core of each object's k-nearest neighbours, and computing an outlier score for each object by measuring how far the core moves as k is sequentially increased. Experimental results demonstrate that our method is a promising approach to this problem.
    Keywords: data mining; outlier detection; medical insurance claims fraud.
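The last two steps above can be sketched roughly as follows. This is not the authors' CODM implementation: it is a toy NumPy illustration of the k-NN core and the core-movement score, omitting the feature-bagging step, with all names, the distance metric and the k range chosen purely for illustration.

```python
import numpy as np

def knn_core(X, idx, k):
    """Core of point idx: the mean of its k nearest neighbours."""
    d = np.linalg.norm(X - X[idx], axis=1)
    d[idx] = np.inf                       # exclude the point itself
    nn = np.argsort(d)[:k]
    return X[nn].mean(axis=0)

def core_movement_scores(X, k_min=2, k_max=5):
    """Outlier score: total distance the k-NN core moves as k grows."""
    n = X.shape[0]
    scores = np.zeros(n)
    for i in range(n):
        prev = knn_core(X, i, k_min)
        for k in range(k_min + 1, k_max + 1):
            cur = knn_core(X, i, k)
            scores[i] += np.linalg.norm(cur - prev)
            prev = cur
    return scores

# Usage on a toy dataset: a tight cluster plus one distant point.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, size=(20, 2)), [[5.0, 5.0]]])
scores = core_movement_scores(X)
```

The intuition is that for points in sparse regions the core jumps around as k grows, while for points in dense regions it stays put; the paper's actual scoring and parameter choices may differ.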

Special Issue on: Advanced Computer Science and Information Technology

  • MigrateSDN: efficient approach to integrate OpenFlow networks with STP-enabled networks   Order a copy of this article
    by Po-Wen Chi, Ming-Hung Wang, Jing-Wei Guo, Chin-Laung Lei 
    Abstract: Software defined networking (SDN) is a paradigm-shifting technology in networking. However, in current network infrastructures, removing existing networks to build pure SDN networks or replacing all operating network devices with SDN-enabled devices is impractical because of the time and cost involved. Therefore, SDN migration, which implies the use of co-existence techniques and a gradual move to SDN, is an important issue. In this paper, we focus on how SDN networks can be integrated with legacy networks that use the spanning tree protocol (STP). Our approach demonstrates three advantages. First, it does not require an SDN controller to apply the STP exchange on all switches, but only on boundary switches. Second, it enables legacy networks to concurrently use multiple links, all but one of which STP would otherwise block to avoid loops. Third, it reduces the number of bridge protocol data unit (BPDU) frames used in STP construction and topology change.
    Keywords: software defined networking; spanning tree protocol; network migration.

Special Issue on: Smart X 2016 Smart Everything

  • A New Wolf Colony Search Algorithm Based on Search Strategy for Solving Traveling Salesman Problem   Order a copy of this article
    by Yang Sun, Shoulin Yin, Hang Li, Lin Teng 
    Abstract: The wolf colony search algorithm is abstracted from the behaviour of the wolf pack, which exhibits remarkable skills and strategies. However, traditional wolf colony search algorithms have some disadvantages, such as slow convergence and easily falling into local optima, with low efficiency and accuracy. Although many intelligence algorithms have been applied to the travelling salesman problem (TSP), the main objective of this paper is to present a new approach that obtains significant improvements. To overcome the shortcomings of the classic wolf colony search algorithm, this paper proposes an improved wolf colony search algorithm based on a search strategy. First, we introduce an interaction strategy into the travel behaviour and calling behaviour to promote communication between artificial wolves, which improves the wolves' information acquisition and enhances their exploring ability. Second, we present an adaptive siege strategy for the siege behaviour, which guarantees that the new algorithm obtains better collaborative search features. With the new strategy, the range of the wolf siege constantly decreases and the mining ability of the algorithm increases. Finally, experiments are carried out to verify the effectiveness and performance of our new method by comparison with other swarm intelligence algorithms on instances from the TSP library (TSPLIB). The results show that the improved wolf colony search algorithm has higher solving accuracy and faster convergence. Furthermore, it outperforms other wolf colony search algorithms in accuracy, computational robustness and iteration count.
    Keywords: wolf algorithm; search strategy; interaction strategy; adaptive siege strategy; siege behaviour; travelling salesman problem.

  • New intelligent interface study based on K-means gaze tracking   Order a copy of this article
    by Jing Yu, Hang Li, Shoulin Yin 
    Abstract: A user interface (UI) is an interaction and information exchange medium between a system and its users. It is designed for mutual communication, enabling users to operate hardware easily and effectively to achieve bidirectional interaction. Traditional UIs struggle to satisfy users' requirements. Therefore, this paper proposes a new intelligent interface scheme based on K-means gaze tracking. First, it uniformly describes the user, interface and system in an intelligent interface interaction framework based on a visual attention selection mechanism. Second, it uses the K-means method to calculate the attention degree value on the interface, on the basis of the mapping relation between user, interface and system. Third, it adopts a visual attention allocation strategy to predict the user's degree of interest in the interface. We conduct experiments to verify the performance of the new scheme. The results show that the accuracy with which the intelligent interface predicts user decision intention is very high. This method is a selectable solution for automatically marking the user's interest goals on an intelligent interface based on K-means gaze tracking. What's more, it can effectively improve the quality of gaze tracking.
    Keywords: UI; K-means method; gaze tracking; attention degree; mapping relation.

  • The wisdom of the few: a provable approach   Order a copy of this article
    by Xiao-Yu Huang, Xian-Hong Xiang 
    Abstract: In recent years, the Wisdom Of the Few (WOF) model has attracted substantial research interest. The WOF refers to the finding that in some collaborative prediction tasks, e.g. Collaborative Filtering (CF), the ratings from a small set of expert users nearly suffice to predict the unobserved ratings of a much larger number of other users. In this paper, we propose a WOF algorithm for the CF problem, and prove that under some mild statistical assumptions, the algorithm predicts the users' missing ratings correctly with high probability. We also conduct CF experiments with the proposed algorithm on real datasets, and the results show that our algorithm is competitive with conventional CF algorithms.
    Keywords: collaborative filtering; crowdsourcing; expert systems; wisdom of the crowd.

  • Robust and graph regularised non-negative matrix factorisation for heterogeneous co-transfer clustering   Order a copy of this article
    by Yu Ma, Zhikui Chen, Xiru Qiu, Liang Zhao 
    Abstract: Transfer learning is proposed to tackle the problem where target instances are too scarce to train an accurate model. Most existing transfer learning algorithms are designed for supervised learning and cannot obtain transfer results on multiple heterogeneous domains simultaneously. Moreover, the performance of transfer learning can be seriously degraded by noises and corruptions. In this paper, a robust non-negative collective matrix factorisation model is proposed for heterogeneous co-transfer clustering, which introduces error matrices to capture sparsely distributed noises. The heterogeneous clustering tasks are handled simultaneously, and graph regularisation is enforced on the collective matrix factorisation model to preserve the intrinsic geometric structure of the different domains. Experimental results on a real-world dataset show that the proposed algorithm outperforms the baselines.
    Keywords: transfer learning; non-negative matrix factorisation; error matrix; graph regularisation; clustering.

  • A risk analysis and prediction model of electric power GIS based on deep learning   Order a copy of this article
    by Jianyong Xue, Kehe Wu, Yan Zhou 
    Abstract: In the distribution and supply of electric power, regional grids and users are diverse and complicated, which ties the operation of power systems ever more closely to their geographic information. Geographic information systems (GIS) are becoming an indispensable part of the Power Information Management System (PIMS). Aided by dynamic equipment analysis in GIS and by deep learning of nonlinear network structures, complex functional models can simulate the state of power grid equipment more efficiently. Based on these models, we can predict the risk of the entire power grid and provide decision support for grid management. We collected multiple sets of historical grid-runtime data from provincial power grid systems as the input of the model, and combined them with prior standard training data to improve the accuracy of the risk prediction model. Experiments demonstrate that the model has high prediction accuracy and is fully capable of achieving better results than other modern optimisation algorithms.
    Keywords: electric power GIS; risk analysis; deep learning; prediction model.

  • CDLB: a cross-domain load balancing mechanism for software-defined networks in cloud data centres   Order a copy of this article
    by Weiyang Wang, Mianxiong Dong, Kaoru Ota, Jun Wu, Jianhua Li, Gaolei Li 
    Abstract: Currently, cross-domain load balancing, which can optimise resource allocation, is one of the core issues for software-defined networks (SDN) in cloud data centres. In this paper, we propose a cross-domain load balancing mechanism, CDLB, based on the Extensible Messaging and Presence Protocol (XMPP) for SDN in cloud data centres. In contrast to polling, an XMPP-based push model is introduced in the proposed scheme, which avoids wasting network and computing resources in large-scale distributed network environments. The proposed scheme enables all the controllers in the flat distributed control plane to share the same consistent global-view network information in real time through XMPP and the XMPP publish/subscribe extension. Thus, the problem of non-real-time information synchronisation can be resolved, and cross-domain load balancing can be realised. Simulations show the efficiency of the proposed scheme.
    Keywords: cloud data centre; XMPP; push model.

  • Logistic regression for imbalanced learning based on clustering   Order a copy of this article
    by Huaping Guo, Tao Wei 
    Abstract: Class imbalance is very common in the real world, and traditional state-of-the-art classifiers do not work well on imbalanced datasets. In this paper, we apply the well-known statistical model of logistic regression to the imbalanced learning problem and, in order to improve its performance, use clustering algorithms as a data pre-processing step to partition the majority-class data into clusters. Logistic regression is then learned on the corresponding rebalanced datasets. Experimental results show that, compared with other state-of-the-art methods, the proposed one performs significantly better on the measures of recall, g-mean, f-measure, AUC and accuracy.
    Keywords: class imbalance; logistic regression; clustering.
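The pre-processing idea above can be sketched as follows. This is not the paper's exact procedure: it is a minimal NumPy illustration under assumed details, where a basic k-means partitions the majority class, a few points are drawn from each cluster so the classes end up roughly balanced, and a plain logistic regression is fit by gradient descent; the cluster count, sampling rule and data are all illustrative.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Basic Lloyd's k-means; returns a cluster label per row of X."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]      # initial centroids
    for _ in range(iters):
        lab = np.argmin(((X[:, None] - C) ** 2).sum(-1), axis=1)
        C = np.array([X[lab == j].mean(0) if (lab == j).any() else C[j]
                      for j in range(k)])
    return lab

def rebalance(X_maj, X_min, k=3, seed=0):
    """Undersample the majority class cluster by cluster."""
    rng = np.random.default_rng(seed)
    lab = kmeans(X_maj, k)
    per = max(1, len(X_min) // k)                    # samples per cluster
    keep = []
    for j in range(k):
        idx = np.where(lab == j)[0]
        if len(idx):
            keep.append(rng.choice(idx, min(per, len(idx)), replace=False))
    keep = np.concatenate(keep)
    X = np.vstack([X_maj[keep], X_min])
    y = np.concatenate([np.zeros(len(keep)), np.ones(len(X_min))])
    return X, y

def fit_logreg(X, y, lr=0.1, iters=500):
    """Plain logistic regression via batch gradient descent."""
    Xb = np.hstack([X, np.ones((len(X), 1))])        # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

# Usage: a large majority class at the origin, a rare offset minority.
rng = np.random.default_rng(2)
X_maj = rng.normal(0, 1, size=(200, 2))
X_min = rng.normal(4, 0.5, size=(10, 2))
Xb, yb = rebalance(X_maj, X_min)
w = fit_logreg(Xb, yb)
pred_min = 1 / (1 + np.exp(-np.hstack([X_min, np.ones((10, 1))]) @ w)) > 0.5
```

Clustering the majority class before undersampling keeps one representative subsample per region, so the rebalanced set preserves the majority class's structure instead of discarding it at random.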

Special Issue on: CSS 2013 Advances in Cyberspace Safety and Security

  • Improving stability of PCA-based network anomaly detection by means of Kernel-PCA   Order a copy of this article
    by Christian Callegari, Lisa Donatini, Stefano Giordano, Michele Pagano 
    Abstract: In recent years, the problem of detecting anomalies and attacks by statistically inspecting the network traffic has been attracting more and more research efforts. As a result, many different solutions have been proposed. Nonetheless, the poor performance offered by the proposed detection methods, as well as the difficulty of properly tuning and training these systems, make the detection of network anomalies still an open issue. In this paper we tackle the problem by proposing a way to improve the performance of anomaly detection. In more detail, we propose a novel network anomaly detection method that, by means of Kernel-PCA, is able to overcome the limitations of the 'classical' PCA-based methods, while retaining good performance in detecting network attacks and anomalies.
    Keywords: intrusion detection system; network anomaly detection; Kernel-PCA.
    DOI: 10.1504/IJCSE.2015.10006160
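As a rough illustration of the idea (not the authors' detector), the following NumPy sketch fits RBF kernel PCA on anomaly-free traffic features and scores new points by their reconstruction error in feature space: the distance between a point's (centred) feature-space image and its projection onto the principal subspace. The kernel width, component count and synthetic data are all assumptions.

```python
import numpy as np

def rbf(X, Y, gamma):
    """RBF kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    d2 = ((X[:, None] - Y[None]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class KernelPCADetector:
    """Anomaly score: feature-space reconstruction error under kernel PCA."""
    def __init__(self, gamma=0.5, n_components=2):
        self.gamma, self.m = gamma, n_components

    def fit(self, X):
        self.X = X
        n = len(X)
        K = rbf(X, X, self.gamma)
        H = np.eye(n) - np.ones((n, n)) / n      # centring matrix
        lam, V = np.linalg.eigh(H @ K @ H)
        lam, V = lam[::-1][:self.m], V[:, ::-1][:, :self.m]
        self.alpha = V / np.sqrt(lam)            # unit-norm feature-space axes
        self.K_mean_rows = K.mean(axis=0)
        self.K_mean = K.mean()
        return self

    def score(self, Z):
        Kz = rbf(Z, self.X, self.gamma)
        # centred cross-kernel k~(z, x_i) and centred k~(z, z)
        Kzc = (Kz - Kz.mean(1, keepdims=True)
               - self.K_mean_rows + self.K_mean)
        kzz = 1.0 - 2 * Kz.mean(1) + self.K_mean  # rbf(z, z) == 1
        proj = Kzc @ self.alpha                  # coordinates in the subspace
        return kzz - (proj ** 2).sum(1)          # reconstruction error

# Usage: train on 'normal' traffic features, score normal and anomalous points.
rng = np.random.default_rng(3)
normal = rng.normal(0, 0.3, size=(60, 2))        # anomaly-free training set
det = KernelPCADetector().fit(normal)
s_norm = det.score(rng.normal(0, 0.3, size=(20, 2)))
s_anom = det.score(np.array([[4.0, 4.0]]))
```

Thresholding this score is one common way to turn kernel PCA into a detector; because the subspace is learned in the kernel-induced feature space, it can capture the nonlinear structure of normal traffic that linear PCA misses, which is the limitation the abstract refers to.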