International Journal of Computational Science and Engineering (63 papers in press)
Distributed nested streamed models of tsunami waves
by Kensaku Hayashi, Alexander Vazhenin, Andrey Marchuk
Abstract: This research focuses on designing a high-speed scheme for tsunami modelling using nested computing. Computations are carried out on a sequence of grids covering geographical areas of differing resolution, each embedded within another. This decreases the total number of calculations by excluding unimportant coastal areas from the process. The paper describes a distributed streaming computational scheme allowing for flexible reconfiguration of heterogeneous computing resources with a variable set of modelling zones. Computations are implemented by distributing these areas over modelling components and by synchronising the transfer of boundary data between them. Results of numerical modelling experiments are also presented.
Keywords: tsunami modelling; nested grids; distributed systems; coarse-grained parallelisation; streaming computing; communicating processes; process synchronisation; task parallelism; programming model; component-based software engineering.
The mining method of trigger words for food nutrition matching
by Shunxiang Zhang
Abstract: Rational food nutrition matching plays a dual role in health and diet for humans. Trigger words related to food nutrition matching help classify food nutrition matching into two types: reasonable and unreasonable nutrition matching. This paper proposes a mining method of trigger words for food nutrition matching. First, a food information frequency vector is extracted from the number of food names, the number of nutrition ingredients and the number of matching effects in a sentence. By judging whether each component of the food information frequency vector is 0 or not, sentences unrelated to food nutrition matching can be filtered out. Then, two food verb-noun joint probability matrices are constructed. The columns of the first matrix are food names, and the rows are verbs; the columns of the second matrix are nutrition ingredients and matching effects, and the rows are verbs. By comparing the row mean values of the two matrices, whether a verb is a trigger word can be judged. Lastly, given the commendatory and derogatory probabilities of the trigger words, food nutrition matching can be classified into the two types by naive Bayes. The experiments show that the proposed method effectively detects trigger words related to food nutrition matching.
Keywords: food nutrition matching; food information frequency vector; food verb-noun joint probability matrix.
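The filtering and classification steps described in the abstract can be sketched as follows. This is a minimal illustration with invented word lists and probabilities, not the authors' code; the function names and the simple word-matching are assumptions made purely for exposition.

```python
# Hypothetical sketch: sentences whose food information frequency vector
# contains a zero component are filtered out, and the remaining sentences
# are labelled by a naive Bayes-style comparison of commendatory vs.
# derogatory trigger-word probabilities.

def frequency_vector(sentence, food_names, nutrients, effects):
    """Count food names, nutrition ingredients and matching effects."""
    words = sentence.split()
    return (sum(w in food_names for w in words),
            sum(w in nutrients for w in words),
            sum(w in effects for w in words))

def is_related(vector):
    """A sentence is kept only if every component is non-zero."""
    return all(c > 0 for c in vector)

def classify(trigger_probs):
    """trigger_probs: (P(commendatory), P(derogatory)) per trigger word.
    Naive Bayes assumption: multiply the per-word probabilities."""
    p_pos = p_neg = 1.0
    for pc, pd in trigger_probs:
        p_pos *= pc
        p_neg *= pd
    return 'reasonable' if p_pos >= p_neg else 'unreasonable'
```

For example, a sentence mentioning a food, a nutrient and an effect yields a vector with no zero component and survives the filter; the trigger words then decide the final label.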
The clothing image classification algorithm based on the improved Xception model
by Zhuoyi Tan, Yuping Hu, Dongjun Luo, Man Hu, Kaihang Liu
Abstract: This paper proposes a clothing image classification algorithm based on an improved Xception model. Firstly, the last fully connected layer of the original network is replaced with another fully connected layer to recognise eight classes instead of 1000 classes. Secondly, the network adopts both the Exponential Linear Unit (ELU) and the Rectified Linear Unit (ReLU) as activation functions, which improves the nonlinearity and learning characteristics of the network. Thirdly, in order to enhance the anti-disturbance capability of the network, we employ the L2 regularisation method. Fourthly, we perform data augmentation on the training images to reduce over-fitting. Finally, the learning rate is set to zero in the layers of the first two modules of the network and the network is fine-tuned. The experimental results show that the top-1 accuracy of the proposed algorithm is 92.19%, which is better than the state-of-the-art models Inception-v3, Inception-ResNet-v2 and Xception.
Keywords: clothing image classification; transfer learning; deep convolutional neural network; Xception.
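The two activations the abstract combines can be contrasted in a few lines. This is illustrative only (plain NumPy, not the authors' Keras/Xception code): ELU keeps a smooth negative tail, which preserves gradient flow for negative inputs, while ReLU zeroes them out.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: max(0, x)."""
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    """Exponential Linear Unit: x for x > 0, alpha*(e^x - 1) otherwise."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.array([-2.0, -1.0, 0.0, 1.0])
# ReLU discards all negative information, while ELU maps -1 to about
# -0.632, keeping a non-zero gradient in the negative region.
```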
A short text conversation generation model combining BERT and context attention mechanism
by Huan Zhao, Jian Lu, Jie Cao
Abstract: The standard Seq2Seq neural network model tends to generate general and safe responses (e.g., "I don't know") regardless of the input in the field of short-text conversation generation. To address this problem, we propose a novel model that combines the standard Seq2Seq model with the BERT module (a pre-trained model) to improve the quality of responses. Specifically, the encoder of the model is divided into two parts: one is the standard Seq2Seq encoder, which generates a context attention vector; the other is the improved BERT module, which encodes the input sentence into a semantic vector. Then, through a fusion unit, the vectors generated by the two parts are fused to generate a new attention vector. Finally, the new attention vector is transmitted to the decoder. In particular, we describe two ways to acquire the new attention vector in the fusion unit. Empirical results from automatic and human evaluations demonstrate that our model significantly improves the quality and diversity of the responses.
Keywords: Seq2Seq; short text conversation generation; BERT; attention mechanism; fusion unit.
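The fusion unit's job, merging a Seq2Seq context attention vector with a BERT semantic vector, can be sketched abstractly. The two variants below (concatenate-and-project, and a gated sum) are plausible fusion schemes assumed for illustration; the paper's actual two methods and learned weights are not reproduced here, and all values are random toy data.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
c = rng.standard_normal(d)          # Seq2Seq context attention vector
b = rng.standard_normal(d)          # BERT semantic vector

# Variant 1: concatenation followed by a linear projection W (hypothetical,
# randomly initialised here; in a real model W would be learned).
W = rng.standard_normal((d, 2 * d))
fused_concat = W @ np.concatenate([c, b])

# Variant 2: an element-wise sigmoid gate decides how much of each
# vector contributes to the fused attention vector.
g = 1.0 / (1.0 + np.exp(-(c + b)))
fused_gated = g * c + (1.0 - g) * b
```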
Implicit emotional tendency recognition based on disconnected recurrent neural networks
by Yiting Yan, Zhenghong Xiao, Zhenyu Xuan, Yangjia Ou
Abstract: Implicit sentiment orientation recognition classifies the emotions expressed in text. The development of the internet has diversified the information presented by text data. In most cases, text information is positive, negative, or neutral. However, inaccurate word segmentation, the lack of a standard complete sentiment lexicon, and the negation of words make implicit emotion recognition difficult. Text data also contain rich and fine-grained information and have thus become a difficult research point in natural language processing. This study proposes a hierarchical disconnected recurrent neural network to overcome the lack of emotional information in implicit sentiment sentence recognition. The network encodes the words and characters in a sentence using a disconnected recurrent neural network and fuses the context information of the implicit sentiment sentence through a hierarchical structure. Using this context information, a capsule network constructs different fine-grained context information to extract high-level features and provide additional semantic information for emotion recognition, improving the accuracy of implicit emotion recognition. Experimental results show that the model outperforms some current mainstream models: its F1 value reaches 81.5%, which is 2 to 3 percentage points higher than those of the current mainstream models.
Keywords: hierarchical disconnected recurrent network; implicit emotion; capsule network; sentiment orientation identification.
Client-side ciphertext deduplication scheme with flexible access control
by Ying Xie, Guohua Tian, Haoran Yuan, Chong Jiang, Jianfeng Wang
Abstract: Data deduplication with fine-grained access control has been applied in practice to realise data sharing and reduce storage space. However, many existing schemes can only achieve server-side deduplication, which greatly wastes network bandwidth, especially when the transmitted data are large. Moreover, few existing schemes consider attribute revocation, so forward and backward secrecy cannot be guaranteed. To address these problems, we introduce a client-side ciphertext deduplication scheme with more flexible access control. Specifically, we divide data owners into different domains and distribute corresponding domain keys to them through a secure channel, achieving PoW verification in client-side deduplication. Besides, we realise attribute revocation through proxy re-encryption, which does not require presetting the maximum number of clients at system initialisation. Security and performance analysis shows that our scheme achieves the desired security requirements while realising efficient client-side deduplication and attribute revocation.
Keywords: client-side deduplication; flexible access control; attribute revocation; random tag.
LMA: label-based multi-head attentive model for long-tail web service classification
by Guobing Zou, Hao Wu, Song Yang, Ming Jiang, Bofeng Zhang, Yanglan Gan
Abstract: With the rapid growth of web services, service classification is widely used to facilitate service discovery, selection, composition and recommendation. Although there is much research on service classification, little work focuses on the long-tail problem, i.e., improving the accuracy of those categories that have fewer services. In this paper, we propose a novel label-based attentive model, LMA, with a multi-head structure for long-tail service classification. It learns word-label attention in multiple subspaces with a multi-head mechanism and concatenates the heads to obtain high-level service features. To demonstrate the effectiveness of LMA, extensive experiments are conducted on 14,616 real-world services with 80 categories crawled from the service repository ProgrammableWeb. The results show that LMA outperforms state-of-the-art approaches for long-tail service classification in terms of multiple evaluation metrics.
Keywords: service classification; service feature extraction; long tail; label embedding; attention.
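A single head of label-based attention can be sketched as follows: attention weights between word embeddings and a category-label embedding pick out the words most indicative of that category. The LMA model runs several such heads in different subspaces and concatenates them; this sketch uses random toy embeddings and is not the authors' implementation.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
words = rng.standard_normal((5, 8))   # 5 words, 8-dim embeddings
label = rng.standard_normal(8)        # one category-label embedding

scores = words @ label                # word-label affinity scores
weights = softmax(scores)             # attention distribution over words
service_feature = weights @ words     # attention-pooled service feature
```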
Synthetic data augmentation rules for maritime object detection
by Zeyu Chen, Xiangfeng Luo, Yan Sun
Abstract: The performance of deep neural networks for object detection depends on the amount of training data. In the field of maritime object detection, the diversity of weather, target scale, position and orientation makes real data acquisition hard and expensive. Recently, generating synthetic data has become a new trend for enriching the training set. However, synthetic data do not always improve detection accuracy, and two problems remain unsolved: 1) what kind of data needs to be augmented? 2) how should synthetic data be augmented? In this paper, we use knowledge-based rules to constrain the process of data augmentation and to seek effective synthetic samples. We propose two synthetic data augmentation rules: 1) what to augment depends on the gap between the training and real data distributions; 2) the robustness and effectiveness of synthetic data depend on a proper proportion and domain randomisation. The experiments show that the average accuracy of boat classification increases by 3% with our synthetic data on the Pascal VOC test set.
Keywords: data augmentation; synthetic data; object detection; synthetic data augmentation rules.
General process of big data analysis and visualisation
by HongZhang Lv, Guang Sun, WangDong Jiang, FengHua Li
Abstract: Innumerable data are generated on the internet every day, and they can hardly be analysed effectively by traditional means because of their volume and complexity. Not only are these data huge, but they also involve complex relationships between different kinds of datasets. In addition, if people want to know how the data change during a certain period of time, the time factor must be taken into consideration, which leads to the problem of analysing dynamic data. This kind of data is called big data. In the big data era, a new process for dealing with such data should be conceived. This process contains five steps: collecting data, cleaning data, storing data, analysing data, and further analysis of the data. The aim of this paper is to illustrate every step; the fourth and fifth steps are introduced in detail.
Keywords: big data; visualisation; analysis; process; graph.
Design and implementation of food supply chain traceability system based on Hyperledger Fabric
by Kui Gao, Yang Liu, Heyang Xu, Tingting Han
Abstract: Food safety problems cause widespread concern and panic when food-related incidents occur around the globe. Establishing a credible food traceability system is an effective solution to this issue. Most existing blockchain-based traceability systems are not convincing because the traceability information stored on the chain comes from a single organisation. Without the upstream and downstream trading information of the supply chain, even blockchain-based systems, with their immutability and decentralised trustworthiness, cannot guarantee accurate traceability for customers. In this paper, we establish a food supply chain traceability system called FSCTS, which aggregates all the enterprises and organisations along the food supply chain to make deals and transactions on the blockchain. By analysing the trading data covering the whole food circulation from production to consumption, reliable transaction-based traceability can be achieved to provide trusted food tracing. We implement the system on Hyperledger Fabric and demonstrate the effectiveness and superiority of FSCTS through extensive comparison experiments with similar traceability systems.
Keywords: food safety; food traceability; food trading; food supply chain; blockchain; consortium blockchain; hyperledger fabric.
Automatic recommendation of user interface examples for mobile app development
by Xiaohong Shi, Xiangping Chen, Rongsheng Rao, Kaiyuan Li, Zhensheng Xu, Jingzhong Zhang
Abstract: Consulting existing examples is an efficient development practice for user interface (UI) developers. We propose an approach for the automatic recommendation of UI examples for mobile app development. We first introduce a search engine for UI components of mobile applications based on their descriptions, graphical views and source code. From the search results, an algorithm, density-based clustering with maximum intra-cluster distance (DBCMID), is proposed to automatically recommend examples. A comparison between the examples recommended by our approach and existing summarised examples shows that for 83.33% of summarised examples, there are completely or partly matched examples in our recommended results. In addition, 39 new valuable examples were found based on the search results of six queries.
Keywords: user interface search; user interface development; example recommendation.
A DDoS attack detection method based on SVM and K-nearest neighbour in SDN environment
by Zhaohui Ma, Bohong Li
Abstract: This paper presents a detection method for DDoS attacks in SDN based on the k-nearest neighbour (KNN) algorithm and the support vector machine (SVM). The method exploits SDN's centralised control to collect flow characteristic information efficiently, classify flows, screen out attack flows, and determine whether the system is under attack. Experiments show that the method has high accuracy.
Keywords: software-defined network; controller; detection method; DDoS attack; KNN; SVM.
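A toy version of the two-classifier idea can be shown with scikit-learn. The flow features below (packet rate, flow duration) and their values are invented for illustration, not the paper's collected SDN flow statistics.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Flows described by (packets per second, flow duration in seconds);
# label 1 = DDoS attack traffic, label 0 = normal traffic.
X = [[10, 5.0], [12, 4.0], [15, 6.0],       # normal flows
     [900, 0.2], [1100, 0.1], [950, 0.3]]   # attack flows
y = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)
svm = SVC(kernel='linear').fit(X, y)

flow = [[1000, 0.2]]   # a new flow collected by the controller
# Both classifiers flag the high-rate, short-duration flow as an attack.
```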
The sentiments of open financial information, public mood and stock returns: an empirical study on Chinese Growth Enterprise Market
by Qingqing Chang
Abstract: This study links public mood to stock performance and examines the moderating role of co-occurring sentiments expressed on open financial information platforms in this relationship. Drawing on agenda-setting and source credibility theories, we developed hypotheses using 345 stocks listed on the Chinese Growth Enterprise Market and data on public mood and open financial information sentiments collected between 1 October 2012 and 30 September 2015. Our findings suggest that public mood has a significant, positive impact on stock returns; more interestingly, we found that public mood has a stronger positive impact on stock performance than open financial information sentiments. Furthermore, the study finds a positive interactive effect between public mood and open financial information sentiments, and determines that variation in public mood is a driving force behind market reactions, while co-occurring open financial information sentiments amplify the effect of public mood on stock returns.
Keywords: sentiment analysis; public mood; open financial information sentiments.
Digital watermarking for health-care: a survey of ECG watermarking methods in telemedicine
by Maria Rizzi, Matteo D'Aloia, Annalisa Longo
Abstract: Innovations in healthcare have introduced a radical change in the medical environment, including facilities for processing patient diagnostic data and biological signals. The adoption of telemedicine services usually leads to an increasing volume of sensitive electronic data transmitted over insecure infrastructures. Since integrity, authenticity and confidentiality are mandatory features in telemedicine, the need arises to guarantee these requirements with end-to-end control. Among the various techniques implemented for data security, digital watermarking has gained considerable popularity in healthcare-oriented applications. The challenge watermark insertion has to overcome is to avoid changing a patient's health and medical history to a level where a decision maker could make a misdiagnosis. This paper presents a survey of different applications of electrocardiogram watermarking for telemedicine. The most recent and significant electrocardiogram watermarking schemes are reviewed, various issues related to each approach are discussed, and some aspects of the adopted techniques, including classification and performance measures, are analysed.
Keywords: watermarking; electrocardiogram; telemedicine; data security; healthcare; integrity verification; authentication; patient record hiding; smart health.
Web services classification via combining Doc2vec and LINE model
by Hongfan Ye, Buqing Cao, Jinkun Geng, Yiping Wen
Abstract: With the rapid increase in the number of web services, web service discovery is becoming a challenging task. Classifying web services with similar functionality out of a tremendous number of web services can significantly improve the efficiency of service discovery. Current web services classification research mainly focuses on independently mining either the hidden content semantic information or the network structure information in web service characterisation documents, but few studies integrate the two sets of information comprehensively to achieve better classification performance. To this end, we propose a web service classification method that combines content semantic information and network structure information.
Keywords: web services classification; content semantic; network structure; LINE; Doc2Vec.
Discrete stationary wavelet transform and SVD-based digital image watermarking for improved security
by Rajakumar Chellappan, S. Satheeskumaran, C. Venkatesan, S. Saravanan
Abstract: Digital image watermarking plays an important role in digital content protection and security-related applications. Embedding a watermark helps identify the copyright of an image or the ownership of digital multimedia content. Both grey images and colour images are used in digital image watermarking. In this work, the discrete stationary wavelet transform and singular value decomposition (SVD) are used to embed a watermark into an image. One colour image and one watermark image are considered here for watermarking. Three-level wavelet decomposition and SVD are applied, and the watermarked image is tested under various attacks, such as noise attacks, filtering attacks and geometric transformations. The proposed work exhibits good robustness against these attacks, and the simulation results show that the proposed approach is better than existing methods in terms of bit error rate, normalised cross-correlation coefficient and peak signal-to-noise ratio.
Keywords: digital image watermarking; discrete stationary wavelet transform; wavelet decomposition; singular value decomposition; peak signal to noise ratio.
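The SVD-domain embedding step can be sketched in isolation: the watermark perturbs the host's singular values by a small strength alpha and is recovered by comparing singular values. The stationary wavelet stage the paper applies before this step is omitted for brevity, and the host matrix and watermark are toy data, not images.

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy "host image": well-separated singular values so the embedding
# does not reorder them (a real host sub-band plays this role).
host = np.diag(np.arange(8.0, 0.0, -1.0)) + 0.05 * rng.random((8, 8))
w = rng.random(8)        # watermark signal
alpha = 0.01             # embedding strength (small, for imperceptibility)

U, s, Vt = np.linalg.svd(host)
s_marked = s + alpha * w             # embed into the singular values
marked = U @ np.diag(s_marked) @ Vt  # watermarked host

# Extraction: compare singular values of the marked host with the originals.
s_extracted = np.linalg.svd(marked, compute_uv=False)
w_recovered = (s_extracted - s) / alpha
```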
Disaster management using D2D communication with ANFIS-genetic algorithm-based CH selection and efficient routing by seagull optimisation
by Lithungo K. Murry, R. Kumar, Themrichon Tuithung
Abstract: Next generation networks and public safety communication strategies are at a crossroads in the effort to deliver the best applications and solutions for proficient disaster management. Three major challenges are considered in this paper: (i) disproportionate disaster management scheduling between bottom-up and top-down strategies; (ii) excessive attention to the disaster emergency response phase and a lack of management across the complete disaster management cycle; and (iii) the absence of a long-term recovery procedure, which results in low stakeholder and community resilience. In this paper, a new strategy is proposed for disaster management. A hybrid adaptive neuro-fuzzy inference network based genetic algorithm (D2D ANFIS-GA) is used for selecting cluster heads, and the seagull optimisation algorithm (SOA) is used for efficient routing. Implementation is done on the MATLAB platform. Performance metrics such as energy use, average battery lifetime, battery lifetime probability, average residual energy, delivery probability and overhead ratio were used to evaluate the approach. Experimental results are compared with two existing approaches, Epidemic and FINDER, and the proposed approach gives better results.
Keywords: disaster management; adaptive neuro-fuzzy inference network; residual energy; device-to-device communication; seagull optimisation algorithm.
Design and implementation of chicken egg incubator for hatching using IoT
by Niranjan Lakshmappa, C. Venkatesan, Suhas A R, S. Satheeskumaran, Aaquib Nawaz S
Abstract: Egg fertilisation is one of the major factors to be considered in poultry farms. This paper describes a smart incubation system that combines IoT technology with a smartphone to make the system more convenient for the user in monitoring and operating the incubator. The incubator is designed with both the setter and the hatcher in one unit, incorporating both still-air and forced-air incubation, controlled and monitored by the controller with four factors in mind: temperature, humidity, ventilation and the egg turning system. Three temperatures, 36.5°C, 37.5°C and 38°C, are set for experimental purposes. The environment is kept the same in all three cases and the best temperature for incubating chicken eggs is noted.
Keywords: IoT; poultry farms; embryo; brooder; hatchery; Blynk App.
A sinkhole prevention mechanism for RPL in IoT
by Alekha Kumar Mishra, Maitreyee Sinha, Asis Kumar Tripathy
Abstract: A sinkhole node can redirect all traffic routes from IoT nodes to the root (sink) node through itself via false rank advertisement. Unfortunately, the RPL protocol gives a node no way to verify the actual rank a claiming parent received from its own parent. A number of sinkhole and rank spoofing detection mechanisms have been proposed in the literature. The works claiming higher detection rates mostly use cryptography-based operations, which incur additional computational overhead. A majority of the mechanisms also consider only a single network metric for evaluating the trust level of a parent node and for detecting a sinkhole. In practice, a single network metric may not be sufficient to detect anomalous behaviour and may lead to false positives. In this paper, a prevention mechanism is proposed that decides the legitimacy of a node in the neighbourhood by considering three network metrics: hop count, residual energy, and expected transmission count. The mechanism relies on the fact that all the nodes in a neighbourhood have similar network metrics with respect to the position of the root node in the network. Therefore, a node claiming metric values quite different from the neighbourhood mean values is identified as a sinkhole. The experimental results show that the proposed mechanism can reliably distinguish a sinkhole node from genuine ones at an arbitrary location in the network.
Keywords: IoT; security; RPL; sinkhole; rank spoofing; degree of membership.
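The neighbourhood check can be sketched as a mean-deviation test over the three metrics named in the abstract. The tolerance value and all metric readings below are illustrative assumptions, not taken from the paper.

```python
# A node whose hop count, residual energy or expected transmission count
# (ETX) deviates too far from the neighbourhood mean is flagged as a
# suspected sinkhole.

def is_suspect(node, neighbours, tolerance=0.5):
    """node and neighbours are dicts with 'hops', 'energy' and 'etx'."""
    for metric in ('hops', 'energy', 'etx'):
        mean = sum(n[metric] for n in neighbours) / len(neighbours)
        if abs(node[metric] - mean) > tolerance * mean:
            return True   # metric far from the neighbourhood mean
    return False

neighbours = [{'hops': 4, 'energy': 0.8, 'etx': 6},
              {'hops': 5, 'energy': 0.7, 'etx': 7},
              {'hops': 4, 'energy': 0.9, 'etx': 6}]
honest = {'hops': 5, 'energy': 0.8, 'etx': 7}
sinkhole = {'hops': 1, 'energy': 0.8, 'etx': 2}  # advertises a falsely low rank
```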
A comparative linguistic analysis of English news headlines in China, America, England, and ASEAN countries
by Yusha Zhang, Xiaoming Lu, Yingwen Fu, Shengyi Jiang
Abstract: This paper conducts a comparative study of English news headlines in China, America, England, and ASEAN countries, mainly investigating how the composition of news headlines is interrelated with linguistic factors, such as part-of-speech, length and the frequency of the most common words, depending on the country in which the news is published. The linguistic comparison is performed on the headlines alone, without going through the whole articles. For this purpose, 13 sets of data were collected from major online news sites in the above-mentioned countries. The comparison results reveal that headlines in different countries comply with news writing rules in slightly different ways while exhibiting distinctive features. These differences are attributed to consideration of the target audiences' multi-faceted states, such as knowledge states, beliefs, or interests. To better interpret the results, the headlines were also read with care. The proposed method begins with data collection and pre-processing: news headlines are fetched from the different news sources using a crawler and processed with the Natural Language Toolkit (NLTK).
Keywords: headline; part-of-speech; length of headlines; cluster.
A new unsupervised method for boundary perception and word-like segmentation of sequence
by Arko Banerjee, Arun K. Pujari, Chhabi Rani Panigrahi, Bibudhendu Pati
Abstract: In cognitive science research on natural language processing, motor learning and visual perception, perceiving boundary points and segmenting a continuous string or sequence is one of the fundamental problems. Boundary perception can also be viewed as a machine learning problem, supervised or unsupervised. A supervised learning approach to determining boundary points for segmenting a sequence requires pre-segmented training examples. In an unsupervised mode, the learning is accomplished without any training data, so the frequency of occurrence of symbols within the sequence is normally used as the cue. Most earlier algorithms use this cue while scanning the sequence in the forward direction. In this paper, we propose a novel approach to extracting possible boundary points by scanning the sequence bidirectionally. We show that such an extension from unidirectional to bidirectional is not trivial and requires judicious choice of data structure and algorithm. We propose a new algorithm that traverses the sequence unidirectionally but extracts information bidirectionally. Our method yields better segmentation, as demonstrated by rigorous experimentation on several datasets.
Keywords: boundary perception; sequence segmentation; trie data structure.
Application of light gradient boosting machine in mine water inrush source type online discriminant
by Yang Yong, Li Jing, Zhang Jing, Liu Yang, Zhao Li, Guo Ruxue
Abstract: Water inrush is a kind of mine geological disaster that threatens mining safety. Recognising the type of a water inrush source is an effective auxiliary method for forecasting water inrush disasters. However, the current hydrochemistry methodology spends a large amount of time on sample collection. Considering this problem, it is urgent to propose a novel method that discriminates water inrush source types online, gaining much more time for evacuation before a disaster. This paper proposes an in-situ mine water source discrimination model based on the light gradient boosting machine (LightGBM), which combines light gradient boosting (LGB) with decision trees (DT) to improve integrated learning ability and enhance model generalisation. The data were collected from in-situ sensors measuring pH, conductivity, and Ca, Na, Mg and CO3 components in different water bodies of the LiJiaZui Coal Mine in HuaiNan. The results show that the proposed method achieves 99.63% accuracy in recognising water sources in the mine. Thus, the proposed discrimination model is a timely and effective online way to recognise the source types of water in mines.
Keywords: water inrush source; light gradient boosting machine; online water sources discrimination.
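The online discrimination idea, mapping in-situ sensor readings directly to a water source type with a gradient boosting classifier, can be illustrated with toy data. The paper uses LightGBM; scikit-learn's gradient boosting is substituted below purely so the sketch stays self-contained, and all readings and class names are invented, not from the LiJiaZui mine dataset.

```python
from sklearn.ensemble import GradientBoostingClassifier

# Toy sensor readings: (pH, conductivity, Ca concentration).
X = [[7.2, 300, 40], [7.1, 320, 45], [7.3, 310, 42],        # type 0 water
     [8.5, 900, 120], [8.6, 950, 130], [8.4, 920, 125]]     # type 1 water
y = [0, 0, 0, 1, 1, 1]

model = GradientBoostingClassifier(random_state=0).fit(X, y)

sample = [[8.5, 930, 122]]   # a new in-situ reading arriving online
# The trained model classifies the incoming reading immediately,
# with no laboratory sample collection in the loop.
```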
Unmanned surface vehicle adaptive decision model for changing weather
by Han Zhang, Xinzhi Wang, Xiangfeng Luo, Shaorong Xie, Shixiong Zhu
Abstract: The autonomous decision-making capability of an unmanned surface vehicle (USV) is the basis for many tasks, such as obstacle avoidance, tracking and navigation. Most existing works ignore the variability of the scene when making behavioural decisions. For example, traditional decision-making methods are not adaptable to dynamic environments, especially the changing weather a USV is likely to encounter. To solve the low adaptability of a USV using a single decision model for autonomous decision-making in changing weather, we propose an adaptive model based on the human memory cognitive process. It uses deep learning algorithms to classify weather and reinforcement learning algorithms to make decisions. Simulated experiments were carried out on a USV obstacle avoidance decision task in a Unity3D ocean scene to test our model. Experiments show that our model's decision-making accuracy in changing weather is 27% higher than when using only a single decision model.
Keywords: brain memory cognitive process; reinforcement learning; weather classification; adaptive model.
Wireless energy consumption optimisation using node coverage controlling in linear-type network
by Gaifang Xin, Jun Zhu, Chengming Luo, Jing Tang, Wei Li
Abstract: With the mushrooming development of automatic, intelligent and unmanned technologies, the application of wireless sensor networks has become a hot topic. In narrow-band structures such as corridors and tunnels, wireless sensor networks are used to detect environmental parameters. To address the unbalanced energy consumption of wireless nodes along the length direction, this paper proposes a coverage controlling strategy and develops a linear-type network using several sensor nodes, with a base station node equipped in the narrow-band structure. Firstly, the wireless perception models, covering routing path, coverage model, link load and data credible rate, are analysed for these special monitoring environments; secondly, the survival lifetime of the linear-type network is derived from the energy consumption of every wireless node; thirdly, in consideration of the accidental death of deployed nodes, a time-sharing scheduling strategy is used to guarantee the stability of the monitoring network; finally, experimental results show that the proposed coverage control strategy can optimise the survival time and energy efficiency of the linear-type network, providing a reference for target positioning, operation safety, environmental monitoring, and disaster assessment.
Keywords: linear-type structure; wireless sensor network; energy efficiency; coverage controlling.
FACF: fuzzy areas-based collaborative filtering for point-of-interest recommendation
by Ive Tourinho, Tatiane Rios
Abstract: Several online social networks collect information from their users' interactions (co-tagging of photos, co-rating of products, etc.), producing a large amount of activity-based data. As a consequence, this kind of information is used by these social networks to provide their users with recommendations about new products or friends. Moreover, recommendation systems (RS) are able to predict a person's activity with no special infrastructure or hardware, such as RFID tags, video or audio. In this sense, we propose a technique to provide personalised point-of-interest (POI) recommendations for users of location-based social networks (LBSN). Our technique assumes users' preferences can be characterised by their visited locations, which they share on an LBSN, collaboratively exposing important features such as areas-of-interest (AOI) and POI popularity. Our technique, named fuzzy areas-based collaborative filtering (FACF), uses users' activities to model their preferences and recommend their next visits. We performed experiments on two real LBSN datasets, and the results show our technique outperforms location collaborative filtering in almost all of the experimental evaluations. By fuzzy clustering of AOI, FACF is able to exploit the popularity of POI to improve POI recommendations.
Keywords: recommendation systems; fuzzy clustering; location; points-of-interest.
ELBA-NoC: ensemble learning-based accelerator for 2D and 3D network-on-chip architectures
by Anil Kumar, Basavaraj Talawar
Abstract: Networks-on-Chip (NoCs) have emerged as a scalable alternative to traditional bus and point-to-point architectures. The overall performance of NoCs becomes highly sensitive as the number of cores increases. NoC research and development will thus play a key role in the design of chips with hundreds to thousands of cores in the near future. Simulation is one of the main tools used in NoC research for analysing and testing new architectures. To achieve the best performance vs. cost trade-off, simulations are important for both the interconnect designer and the system designer. However, software simulators are too slow for evaluating medium- and large-scale NoCs. This paper presents a fast, accurate and reliable learning framework for analysing the performance, area and power parameters of 2D and 3D NoC architectures. The framework, named the Ensemble Learning-Based Accelerator (ELBA-NoC), is built using the random forest regression algorithm to predict NoC parameters under different synthetic traffic patterns. ELBA-NoC was tested on 2D and 3D Mesh, Torus and Cmesh NoC architectures, and the results were compared with the widely used cycle-accurate Booksim NoC simulator. Experiments with different virtual channels, traffic patterns and injection rates were performed by varying topology sizes. The framework showed an approximate prediction error of less than 5% and an overall speedup of up to 16K.
Keywords: network-on-chip; 2D NoC; 3D NoC; performance modelling; machine learning; regression; ensemble learning; random forest; Booksim; router; traffic pattern.
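The abstract above describes random-forest regression over simulator-generated samples; the paper's actual feature set and model are not reproduced here. As a minimal pure-Python sketch of the underlying ensemble (bagging) idea, the toy below averages nearest-neighbour base learners fitted on bootstrap samples of hypothetical (injection rate, latency) pairs:

```python
import random

def bootstrap_sample(data, rng):
    """Draw a bootstrap sample (with replacement) of the same size."""
    return [rng.choice(data) for _ in data]

def nearest_neighbour_predict(sample, x):
    """Base learner: return the target of the closest training point."""
    return min(sample, key=lambda p: abs(p[0] - x))[1]

def bagged_predict(data, x, n_estimators=25, seed=0):
    """Ensemble prediction: average base learners trained on bootstrap
    samples, the bagging principle behind random-forest regression."""
    rng = random.Random(seed)
    preds = [nearest_neighbour_predict(bootstrap_sample(data, rng), x)
             for _ in range(n_estimators)]
    return sum(preds) / len(preds)

# Toy 'traffic' data: (injection rate, observed packet latency in cycles).
train = [(0.01, 20.0), (0.05, 22.0), (0.10, 25.0),
         (0.20, 33.0), (0.30, 55.0), (0.40, 140.0)]
estimate = bagged_predict(train, 0.25)
```

A real setup would use a proper random-forest library over the NoC parameters named in the paper; the data and base learner here are purely illustrative.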
A blockchain-based authority management framework in traceability systems
by Jiangfeng Li, Yifan Yu, Shili Hu, Yang Shi, Shengjie Zhao, Chenxi Zhang
Abstract: The frequent occurrence of product quality and food safety incidents in recent years has greatly eroded consumer trust. Traceability systems are developed to trace the status of products through production, transportation and sales. However, tracing data stored in a traceability system's centralised database can be tampered with. In this paper, a blockchain-based authority management framework for traceability systems is proposed. Tracing data are stored on Hyperledger Fabric and the InterPlanetary File System (IPFS) to reduce data storage space and improve data privacy protection on the blockchain. In the framework, using the Role-Based Access Control (RBAC) mechanism, a blockchain-based RBAC model is presented by defining entities, functions and rules. Additionally, components in four layers are designed in the framework, and strategies for operation flows are presented to achieve authority management in business applications. The framework not only guarantees the integrity of tracing data but also prevents confidential information from being leaked. Experiments show that the framework performs better in time and storage than existing approaches.
Keywords: blockchain; authority management; RBAC model; Hyperledger Fabric; IPFS.
The analysis of stego image visual quality for a data-hiding scheme based on a two-layer turtle shell matrix
by Ji-Hwei Horng, Xiao-zhu Xie, Chin-Chen Chang
Abstract: In 2018, Xie et al. proposed a novel data-hiding scheme based on a two-layer turtle shell matrix, claiming that its stego image visual quality is superior to that of state-of-the-art methods regardless of the features of the cover images. In this research note, we give a theoretical analysis of the stego image quality of Xie et al.'s method based on the symmetrical characteristic of the matrix. We find that their simulation outcomes do not support the claim that their embedding capacity is larger than those of previously proposed data-hiding methods at the same stego image visual quality. Additional simulations show that our experimental outcomes coincide with the results of our theoretical analysis, and the experimental results reported by Xie et al. are corrected accordingly.
Keywords: theoretical analysis; two-layer turtle shell; data hiding.
Coupling model based on grey relational analysis and stepwise discriminant analysis for subsidence discrimination of foundations in soft clay areas
by Bo Li, Nian Liu, Wei Wang
Abstract: We selected grey relational analysis and stepwise discriminant analysis as basic models, proposed a coupling discrimination method, and established a coupling discriminant model of foundation subsidence in soft soil areas to address the challenges in discriminant analysis of the seismic subsidence grade of soft soil. In this model, the samples for discrimination and the reference samples are first analysed by indicator relational analysis; seismic subsidence grades are ranked according to the correlation, and candidate discriminant grades are screened. Finally, the seismic subsidence grades of the samples that meet the criteria are confirmed with stepwise discriminant analysis. Actual sample data were calculated, and the discriminant results were compared with those of the traditional model to verify the applicability and accuracy of the coupling model. We obtained good evaluation results, which provide a new method for discriminant analysis of soft soil seismic subsidence grades.
Keywords: soft clay; grey relational analysis; stepwise discriminant analysis; coupling model.
Efficient self-adaptive access control for personal medical data in emergency setting
by Yifan Wang, Jianfeng Wang
Abstract: The notion of access control allows data owners to outsource their data to cloud servers while enabling the sharing of data with legally authorised users. Note that traditional access control techniques only allow authorised users to access the shared data. However, it is intractable to obtain the required data when the data owner encounters an emergency, such as needing medical first aid. Recently, Yang et al. proposed a self-adaptive access control scheme that ensures secure data sharing in both normal and emergency medical scenarios. However, their construction involves an emergency contact person, and we argue that their scheme suffers from two weaknesses: (i) it is vulnerable to a single point of failure when the emergency contact person is offline; (ii) the two-cloud model brings extra computation and communication overhead. To overcome these shortcomings, we present a new, efficient self-adaptive medical data access control scheme that integrates fuzzy identity-based encryption and convergent encryption. Specifically, our construction allows patients' data to be accessed via their fingerprint in an emergency setting. Furthermore, the proposed scheme supports cross-user data deduplication and improves system performance through convergent encryption. Experimental results show that our scheme has an advantage in efficiency.
Keywords: self-adaptive access control; privacy-preserving; medical data storage; secure deduplication.
Edge servers placement in mobile edge computing using stochastic Petri nets
by Daniel Carvalho, Francisco Airton Silva
Abstract: Mobile Edge Computing (MEC) is a network architecture that takes advantage of cloud computing features (such as high availability and elasticity) and makes use of computational resources available at the edge of the network to enhance the mobile user experience by decreasing service latency. MEC solutions need to dynamically allocate requests as close as possible to their users. However, request placement depends not only on the geographical location of the servers but also on their requirements. Based on this fact, this paper proposes a Stochastic Petri Net (SPN) model to represent a MEC scenario and analyses its performance, focusing on the parameters that directly impact the service Mean Response Time (MRT) and resource usage level. To demonstrate the applicability of our work, we present three case studies with numerical analysis using real-world values. The main objective is to provide a practical guide to help infrastructure administrators adapt their architectures, finding a trade-off between MRT and level of resource usage.
Keywords: mobile edge computing; internet of things; stochastic models; server placement.
Application of particle swarm optimisation for coverage estimation in software testing
by Boopathi Muthusamy, Sujatha Ramalingam, C. Senthil Kumar
Abstract: A Markov approach for test case generation and code coverage estimation using particle swarm optimisation is proposed. Initially, the dd-graph is derived from the control flow graph of the software code by joining decision-to-decision paths. In the dd-graph, sequences of independent paths are identified using c-uses and p-uses based on a set-theoretic approach and compared with the cyclomatic complexity. Test cases are generated automatically, comprising integer, float and Boolean variables. Using this initial test suite, the code coverage summary is generated with the gcov code coverage analysis tool, and the branch probability percentage is taken as the TPM value for each branch in the dd-graph. Path coverage, the product of node coverage and TPM values, is used as the fitness function. The algorithm is iterated until it reaches 100% code coverage along each independent test path. The randomness of the proposed approach is compared with a genetic algorithm.
Keywords: particle swarm optimisation; dd-graph; mixed data type variables; branch percentage; TPM-based fitness function; most critical paths.
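The fitness function described above, path coverage as the product of node coverage and per-branch TPM values, can be sketched directly; the dd-graph and probabilities below are hypothetical, not the paper's:

```python
def path_fitness(path, tpm, total_nodes):
    """Fitness of a test path: fraction of nodes covered, multiplied by
    the product of branch transition probabilities (TPM values) along
    the path."""
    coverage = len(set(path)) / total_nodes
    prob = 1.0
    for src, dst in zip(path, path[1:]):
        prob *= tpm[(src, dst)]
    return coverage * prob

# Hypothetical 4-node dd-graph; TPM values as branch probabilities
# taken from a coverage report.
tpm = {(1, 2): 0.6, (1, 3): 0.4, (2, 4): 1.0, (3, 4): 1.0}
f = path_fitness([1, 2, 4], tpm, total_nodes=4)
```

In a PSO setting, each particle would encode test inputs, and this value would be maximised per independent path until full coverage is reached.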
Enhancing user and transaction privacy in bitcoin with unlinkable coin mixing scheme
by Albert Kofi Kwansah Ansah, Daniel Adu-Gyamfi
Abstract: The concept of coin mixing is significant in blockchain, achieves anonymity and has merited application in bitcoin. Although several coin mixing schemes have been proposed, we point out that they either hoard input transactions and address mappings or do not fully satisfy all the requirements of practical anonymity. This paper proposes a coin mixing scheme (combining a mixing countersignature scheme, a ring signature and a coin mixing approach) that allows users to transact untraceably and unlinkably without having to trust a third party to keep their coins safe. Simulation results prove the correctness of our novel countersignature scheme, with an average running time of 4 s using PBC Type A.80. The scheme's security and privacy are ensured by the standard ring signature, ECDSA unforgeability and our countersignature. We demonstrate the efficiency of the mixing scheme using Bitcoin Core's regtest mode to set up a private Bitcoin network: the mix takes 80, 160, 320, 640 and 800 seconds to service 500, 1000, 2000, 4000 and 5000 users, respectively. The average running time thus scales linearly with the number of users.
Keywords: bitcoin blockchain; user and transaction privacy; coin mixing; bilinear pairing and ECDSA; ring signature.
NO2 pollutant concentration forecasting for air quality monitoring by using an optimised deep learning bidirectional GRU model
by Shilpa Sonawani, Kailas Patil, Prawit Chumchu
Abstract: Air pollution is a crucial environmental problem, as it has adverse effects on human health and agriculture and is also responsible for climate change and global warming. Several observations have warned about the increasing level of the pollutant nitrogen dioxide (NO2) in the atmosphere in many regions. Studies have shown that nitrogen dioxide is associated with diseases such as diabetes mellitus, hypertension, stroke, chronic obstructive pulmonary disease (COPD), asthma, bronchitis and pneumonia, and that high levels can lead to death due to asphyxiation from fluid in the lungs. It can also have a negative effect on vegetation, leading to reduced growth and damage to leaves. Considering these devastating effects, an optimised bidirectional GRU model is proposed to estimate and monitor NO2 concentration. Its performance is evaluated against other models, including time-series methods, scikit-learn machine learning regression methods, AutoML frameworks, and advanced and hybrid deep learning techniques. The model is further optimised for the number of features, number of neurons, number of lookbacks and number of epochs, and is applied to a real-time dataset for the city of Pune, India. The model can help government and central authorities prevent excessive pollution levels and their adverse effects, and can support pollution control in smart homes.
Keywords: air pollution; air quality; AUTOML; bidirectional GRU; deep learning; nitrogen dioxide; NO2; timeseries forecasting.
A knowledge elicitation framework in ranking healthcare providers using rough set with formal concept analysis
by Arati Mohapatro, S.K. Mahendran, Tapan Kumar Das
Abstract: A comparison of healthcare institutions by ranking involves generating their relative scores based on the infrastructure, process and other quality dynamics. Being a top-ranking institute depends on the overall score secured against the hospital quality parameters that are being assessed for ranking. However, the parameters are not equally important when it comes to ranking. Hence, the objective of this research is to explore the parameters that are vital as they significantly influence the ranking score. In this paper, a hybrid model is presented for knowledge extraction, which employs techniques of rough set on intuitionistic fuzzy approximation space (RSIFAS) for classification, Learning from Examples Module 2 (LEM2) algorithm for generating decision rules, and formal concept analysis (FCA) for attribute exploration. The model is discussed using AHA US News score data for cancer specialisation. The result signifies the connection between quality attributes and ranking. Finally, the leading attribute and its particular values are identified for different states of ranking.
Keywords: rough set with intuitionistic fuzzy approximation space; formal concept analysis; hospital ranking; knowledge mining; attribute exploration.
Real-time segmentation of weeds in cornfields based on depthwise separable convolution residual network
by Hao Guo, Shengsheng Wang
Abstract: Traditional manual spraying of pesticides leads not only to greater pesticide use but also to environmental pollution. Intelligent weeding devices, in contrast, can identify weeds and crops through sensing devices for selective spraying, which effectively reduces pesticide use. Accurate and efficient identification of crops and weeds is therefore crucial to the development of mechanised weeding. To improve the segmentation accuracy and real-time performance on crops and weeds, we propose a lightweight network based on the encoder-decoder architecture, named SResNet. A shuffle-split-separable-residual block is employed to compress the model while increasing the number of network layers, thereby extracting richer pixel category information. In addition, the model is optimised with a weighted cross-entropy loss function to counter the imbalanced pixel ratios of background, crops and weeds. Experimental results show that the presented method greatly improves segmentation accuracy and real-time segmentation speed on the corn and weed dataset.
Keywords: weed segmentation; convolutional network; residual network; machine vision; image recognition.
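A weighted cross-entropy loss of the kind mentioned above can be written in a few lines; the class weights and per-pixel probabilities below are illustrative stand-ins, not values from the paper:

```python
import math

def weighted_cross_entropy(probs, labels, class_weights):
    """Mean weighted cross-entropy over pixels; the rare classes (crop,
    weed) get larger weights than the dominant background class."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += -class_weights[y] * math.log(p[y])
    return total / len(labels)

# Per-pixel predicted distributions over (background, crop, weed);
# weights chosen inversely to assumed class frequency.
probs = [(0.8, 0.1, 0.1), (0.2, 0.7, 0.1), (0.3, 0.2, 0.5)]
labels = [0, 1, 2]
weights = {0: 0.2, 1: 1.0, 2: 1.5}
loss = weighted_cross_entropy(probs, labels, weights)
```

Down-weighting the background class keeps the abundant background pixels from dominating the gradient, which is the imbalance problem the abstract describes.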
Array manifold matching algorithm based on fourth-order cumulant for 2D DOA estimation with two parallel nested arrays
by Sheng Liu, Jing Zhao, Yu Zhang
Abstract: In this paper, a two-dimensional (2D) direction-of-arrival (DOA) estimation algorithm using two parallel nested arrays is developed. Firstly, a construction method for fourth-order cumulant (FOC) matrices is given according to the distribution of the sensors. Then, an existing DOA estimation technique is used to estimate the elevation angles, and an improved unilateral array manifold matching (AMM) algorithm is used to estimate the azimuth angles. Compared with some classical 2D DOA estimation algorithms, the proposed algorithm has much better estimation performance, particularly in low-SNR environments. Compared with some traditional FOC-based algorithms, it has higher estimation precision. Simulation results illustrate the validity of the proposed algorithm.
Keywords: DOA estimation; fourth-order cumulant; array manifold matching; two parallel nested arrays.
Feature weighting for naive Bayes using multi-objective artificial bee colony algorithm
by Abhilasha Chaudhuri, Tirath Sahu
Abstract: Naive Bayes (NB) is a widely used classifier in the field of machine learning. However, its conditional independence assumption does not hold true in real-world applications. In the literature, various feature-weighting approaches have attempted to alleviate this assumption. Almost all of these approaches consider the feature-class relationship (relevancy) and the feature-feature relationship (redundancy) independently when determining feature weights. We argue that these two relationships are mutually dependent and cannot both be improved simultaneously, i.e., they form a trade-off. Multi-objective optimisation (MOO) techniques are used to solve this type of problem. In this paper, we propose a new paradigm for determining feature weights: feature weighting is formulated as an MOO problem to balance the trade-off between relevancy and redundancy, and a multi-objective artificial bee colony-based feature weighting technique for naive Bayes is proposed.
Keywords: naive Bayes; feature weighting; multi-objective optimisation; artificial bee colony.
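One common way to realise feature-weighted naive Bayes, and presumably the form whose weights the bee colony would search over, is to scale each feature's log-likelihood by its weight. A minimal sketch with made-up probabilities:

```python
import math

def weighted_nb_log_posterior(x, prior, likelihood, weights):
    """Log-posterior of one class under feature-weighted naive Bayes:
    each feature's log-likelihood is multiplied by its weight."""
    score = math.log(prior)
    for i, xi in enumerate(x):
        score += weights[i] * math.log(likelihood[i][xi])
    return score

def classify(x, priors, likelihoods, weights):
    """Pick the class with the highest weighted log-posterior."""
    return max(priors, key=lambda c: weighted_nb_log_posterior(
        x, priors[c], likelihoods[c], weights))

# Two classes, two binary features; likelihood[i] maps value -> P(x_i | c).
priors = {'pos': 0.5, 'neg': 0.5}
likelihoods = {
    'pos': [{0: 0.2, 1: 0.8}, {0: 0.6, 1: 0.4}],
    'neg': [{0: 0.7, 1: 0.3}, {0: 0.4, 1: 0.6}],
}
weights = [1.0, 0.5]   # feature 1 judged partly redundant, down-weighted
label = classify([1, 0], priors, likelihoods, weights)
```

A weight of 0 removes a feature entirely and a weight of 1 recovers standard NB, so the MOO search space smoothly spans feature selection and plain NB.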
Hyperspectral endmember extraction using Pearson's correlation coefficient
by Dharambhai Shah, Tanish Zaveri
Abstract: Hyperspectral unmixing is a source separation problem. The spectral unmixing process is the composition of a three-step chain: subspace identification, endmember extraction and abundance estimation. A critical step in this chain is endmember extraction, which finds endmembers in the image for the estimation of abundances. In this paper, a novel framework is proposed that uses the concepts of Pearson's correlation coefficient and convex geometry. The framework extracts endmembers from the convex set of two bands selected using Pearson's correlation coefficient, and is therefore named PCGE (Pearson's correlation coefficient-based Convex Geometry for Endmember extraction). PCGE differs from other commonly used frameworks in that the convex geometry involves only two bands, so its computation time is lower. The framework is applied to a synthetic dataset and four popular real hyperspectral datasets. In the simulation results, the proposed framework is compared with other popular frameworks using standard evaluation parameters (spectral angle error, spectral information divergence and normalised cross-correlation). The results show that the proposed framework outperforms the popular frameworks and takes less time than the others to extract endmembers.
Keywords: endmember extraction; hyperspectral image; Pearson’s correlation coefficient; spectral unmixing.
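Pearson's correlation coefficient between two bands, the selection criterion named above, is straightforward to compute; the band vectors below are toy values:

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two spectral bands,
    each given as a flat vector of pixel values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

band_a = [0.10, 0.20, 0.30, 0.40]
band_b = [0.12, 0.19, 0.31, 0.42]   # nearly collinear with band_a
band_c = [0.40, 0.10, 0.35, 0.15]
r_ab = pearson(band_a, band_b)
r_ac = pearson(band_a, band_c)
```

Intuitively, a pair of weakly correlated bands spans a more informative 2D space for the convex-geometry step than a nearly collinear pair such as band_a and band_b.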
An image fusion dehazing algorithm based on dark channel prior and Retinex
by Zhongliang Wei, Guangli Zhu, Xingzhu Liang, Wenjuan Liu
Abstract: Dehazing algorithms are widely used in image processing to improve image quality by reducing the impact of hazy weather. The typical dehazing algorithm based on the Dark Channel Prior (DCP) can dehaze images efficiently. However, an image processed with DCP alone suffers from low lightness and weak colour, which to some extent fails to match the visual characteristics of the human eye. To solve this problem, this paper proposes an image fusion dehazing algorithm based on DCP and Retinex. First, the original image is processed by the DCP dehazing algorithm to obtain the dehazed image. Then, the dehazed image is processed by the Retinex algorithm to obtain an enhanced image. Finally, the dehazed image and the enhanced image are fused using a linear fusion method, which improves the lightness and colour of the image after dehazing. Experimental results show that the proposed algorithm not only achieves good dehazing results but also enhances the lightness and colour of the image.
Keywords: image dehazing; dark channel prior; Retinex; image fusion.
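The dark channel prior named above takes, for each pixel, the minimum intensity over the colour channels within a local patch. A minimal sketch on a toy image (the patch size and pixel values are illustrative):

```python
def dark_channel(image, patch=3):
    """Dark channel prior: per pixel, the minimum intensity over all
    colour channels within a local patch x patch window."""
    h, w = len(image), len(image[0])
    r = patch // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [min(image[y][x])              # min over RGB channels
                    for y in range(max(0, i - r), min(h, i + r + 1))
                    for x in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = min(vals)                 # min over the patch
    return out

# A 2x2 'haze-free' toy image: every patch contains a near-zero channel,
# so the dark channel is close to zero, as the prior predicts.
img = [[(0.9, 0.1, 0.8), (0.7, 0.6, 0.05)],
       [(0.2, 0.9, 0.9), (0.8, 0.8, 0.0)]]
dc = dark_channel(img)
```

On hazy images the dark channel rises with haze density, which is what DCP-based methods exploit to estimate the transmission map.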
A dynamic slicing-based approach for an effective SBFL technique
by Debolina Ghosh, Jagannath Singh
Abstract: Fault localisation is the activity of locating the fault or bug present in a software system. It is a time-consuming job and needs much effort if done manually. Hence, automated fault localisation, which reduces human effort and makes the task more accurate, is always in high demand. Among existing debugging techniques, spectrum-based debugging is the most efficient for automated fault localisation. Dynamic program slicing is another technique that can reduce debugging time by excluding source code unaffected by the slicing criterion. In this paper, we present a spectrum-based fault localisation technique that uses dynamic slicing. Context-sensitive slicing is used to reduce the fault localisation time and make the process more effective. SBFL metrics are applied to the sliced program to compute the suspiciousness score of individual program statements. The efficiency of the proposed approach is evaluated on three open-source programs. The results show that, owing to dynamic slicing, the technique takes less time to compute the suspiciousness scores in the sliced program than in the original program, and the programmer needs to inspect less source code to detect the buggy statement. The results indicate that the proposed approach outperforms pure spectrum-based fault localisation techniques.
Keywords: program slicing; spectrum-based fault localisation; statistical formula; Java; context-sensitive slicing.
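The abstract does not name its SBFL metrics; as one standard example, the Tarantula suspiciousness score for a statement can be computed from its pass/fail execution counts:

```python
def tarantula(failed_cov, passed_cov, total_failed, total_passed):
    """Tarantula suspiciousness for one statement, given how many failed
    and passed tests executed it. This is a classic SBFL metric used
    here for illustration; the paper may use other formulas."""
    f = failed_cov / total_failed if total_failed else 0.0
    p = passed_cov / total_passed if total_passed else 0.0
    return f / (f + p) if (f + p) else 0.0

# A statement executed by both failing tests but only 1 of 4 passing tests
# is highly suspicious.
score = tarantula(failed_cov=2, passed_cov=1, total_failed=2, total_passed=4)
```

In the proposed approach, such scores would be computed only for statements surviving the dynamic slice, which is where the time saving comes from.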
Open data integration model using a polystore system for large scale scientific data archives in astronomy
by Shashank Shrestha, Manoj Poudel, Rashmi Sarode, Wanming Chu, Subhash Bhalla
Abstract: Polystore systems have recently been proposed as a new data integration model to provide integrated access to heterogeneous data stores through a unified single query language. There is a growing interest in the database community in managing large-scale unstructured data from multiple heterogeneous data stores. Special attention is given to this problem owing to the growth in data size, the rate at which data accumulate, and the emergence of various data types in different scientific data archives. Astronomy, as a scientific domain, produces huge amounts of data, which are stored in data archives provided by NASA and its subsidiaries; the data mostly consist of images, unstructured texts and structured data (relations, key-values). This paper articulates the problems of integrating multiple data stores to manage heterogeneous data and presents a polystore architecture as a solution. A method of managing a local data store and communicating with a remote cloud data store through a web-based query system is defined.
Keywords: astronomical data; heterogeneous data; data integration; workflow system.
Adaptive online learning for classification under concept drift
by Kanu Goel, Shalini Batra
Abstract: In machine learning and predictive analytics, the underlying data distributions tend to change over time, a phenomenon known as concept drift. Accurate labelling is essential for supervised learning algorithms to build consistent ensemble models. However, several real-world applications suffer from drifting data concepts, which degrades the performance of prediction systems. To tackle these challenges, we study various concept drift handling approaches that identify the major types of drift pattern in drifting data streams, such as abrupt, gradual and recurring drift. This study also highlights the need for adaptive algorithms and compares various state-of-the-art drift handling techniques by analysing their classification accuracy on artificially generated drifting data streams and real datasets.
Keywords: concept drift; ensemble learning; classification; non-stationary; adaptive algorithms; machine learning.
Flexible human motion transition via hybrid deep neural network and quadruple-like structure learning
by Shu-Juan Peng, Liang-Yu Zhang, Xin Liu
Abstract: Skeletal motion transition is of crucial importance to animation creation. In this paper, we propose a hybrid deep learning framework for efficient human motion transition. First, we integrate a convolutional restricted Boltzmann machine with a deep belief network to extract the spatio-temporal features of each motion style, enabling appropriate detection of transition points. Then, a quadruple-like data structure is exploited for motion graph building, motion splitting and indexing, so that similar frames fulfilling the transition segments can be efficiently retrieved. Meanwhile, the transition length is computed according to the average speed of the motion joints. As a result, diverse kinds of motion can be transited smoothly with satisfactory performance. Experimental results show that the proposed transition approach brings substantial improvements over state-of-the-art methods.
Keywords: skeletal motion transition; hybrid deep learning; convolutional restricted Boltzmann machine; quadruple-like data structure.
Flow-based machine learning approach for slow HTTP distributed denial of service attack classification
by Muraleedharan Navarikuth, Janet B.
Abstract: The Distributed Denial of Service (DDoS) attack is one of the common threats to the availability of services on the internet. DDoS attacks have evolved from volumetric attacks to slow DDoS, whose traffic rate looks similar to that of normal traffic; hence it is difficult to detect using traditional security mechanisms. In this paper, we propose a flow-based classification model for slow HTTP DDoS traffic. The important flow-level features were selected using the CICIDS2017 dataset, and the impact of time, packet length and transmission rate on slow DDoS was analysed. Using the selected features, three classification models were trained and evaluated on two benchmark datasets. The results reveal that the proposed approach can achieve an accuracy of 0.997 using the RF classifier. A comparison with state-of-the-art approaches shows that the proposed approach can improve the detection rate by 19.7%.
Keywords: denial of service; slow DDoS; application layer DoS; machine learning; network flow; slow HTTP DDoS; slow loris; slow read.
A decision system based on intelligent perception and decision for scene ventilation safety
by Jingzhao Li, Tengfei Li
Abstract: Many hidden safety hazards in the mine ventilation process cannot be dealt with in time, because coal mine types and mining conditions are complex and changeable and the level of safety decision-making is low when coal mine ventilation is abnormal. To solve these problems, this paper presents a decision system for scene ventilation safety based on intelligent perception and decision-making. First, grey correlation analysis and rough set theory are used to reduce the decision table horizontally and vertically. Then, the reduced data are input into a mine ventilation safety decision model based on an improved capsule network to make ventilation safety decisions. Experimental results show that the system significantly improves the accuracy of mine ventilation safety decisions, offers strong information perception ability and accurate decisions, and provides an important guarantee for mine ventilation safety.
Keywords: grey correlation analysis; rough set; mine ventilation; capsule network; attribute reduction; intelligent decision making.
Satellite image fusion using undecimated rotated wavelet transform
by Rishikesh Tambe, Sanjay Talbar, Satishkumar Chavan
Abstract: This paper presents two satellite image fusion algorithms, namely the decimated/subsampled rotated wavelet transform (SSRWT) and the undecimated/non-subsampled rotated wavelet transform (NSRWT), which use 2D rotated wavelet filters to extract relevant and pragmatic information from MS and PAN images. Three major visual artefacts (colour distortion, shifting effects and shift distortion) are identified in the fused images obtained using SSRWT; these are addressed by NSRWT. The proposed NSRWT algorithm preserves the spatial and spectral features of the source MS and PAN images, resulting in a fused image with better fusion performance. The final fused image provides richer information, in terms of spatial and spectral quality, than the original input images. The experimental results show that the undecimated fusion algorithm not only performs better than the decimated fusion algorithm but also improves the spatial and spectral quality of the fused images.
Keywords: satellite image fusion; feature extraction; rotated wavelet filters; subsampled rotated wavelet transform; nonsubsampled rotated wavelet transform; MS images; PAN images; shift distortion; shifting effect; fusion metrics.
SE-SqueezeNet: SqueezeNet extension with Squeeze-and-Excitation block
by Supasit Kajkamhaeng, Chantana Chantrapornchai
Abstract: Convolutional neural networks have been popularly used for image recognition tasks. They are built from stacks of convolutional operations that extract hierarchical features from images. It is known that a deep convolutional neural network can yield high recognition accuracy, but training it can be very time-consuming. AlexNet was one of the very first networks shown to be effective for an image classification task. It contains only five convolutional layers and three fully connected layers; however, owing to its large kernel sizes and fully connected layers, its training time is significant. SqueezeNet is known as a small network that yields the same performance as AlexNet (Krizhevsky et al., 2012). The key element in the network is the Fire module, which contains squeeze and expand filters that reduce the number of parameters significantly. Based on SqueezeNet, we are interested in supplementing it with other modules that can further improve performance. The Squeeze-and-Excitation (SE) module yielded promising results in ILSVRC2017. In this paper, we explore the effective insertion of SE modules into SqueezeNet: the methodology and pattern of module insertion are explored, and we further propose combining residual operations with SE modules to improve accuracy. The effects on model size and accuracy are reported on popular image classification datasets, including CIFAR-100 and ILSVRC2012. The results show improvements in CIFAR-100 and ILSVRC2012 top-1 accuracy of 1.55% and 3.32% respectively, while the model size grows by up to 16% and 10% for CIFAR-100 and ILSVRC2012, respectively.
Keywords: convolutional neural network; deep learning; image classification; residual network; SENet; SqueezeNet.
Image of plant disease segmentation model based on improved pulse-coupled neural network
by Xiaoyan Guo, Ming Zhang
Abstract: Image segmentation is a key step in feature extraction and disease recognition for plant disease images. To avoid the subjectivity of a pulse-coupled neural network (PCNN), whose parameters are conventionally configured through manual exploration, an improved image segmentation model called SFLA-PCNN is proposed in this paper. The shuffled frog-leaping algorithm (SFLA) is used to optimise the parameters (β, αθ, Vθ) of the PCNN to improve its performance. Segmentation experiments on a series of plant disease images show that SFLA-PCNN is more accurate than the other methods considered in this paper and can effectively extract lesion regions from the background, providing a foundation for subsequent disease diagnosis.
Keywords: shuffled frog-leaping algorithm; pulse-coupled neural network; PCNN; plant disease.
An improved Sudoku-based data hiding scheme using greedy method
by Chin-Chen Chang, Guo-Dong Su, Chia-Chen Lin
Abstract: Inspired by Chang et al.'s scheme, an improved Sudoku-based data hiding scheme is proposed here. The major idea of our improved scheme is to find the approximate optimal solution of Sudoku using the greedy method instead of via a brute-force search for an optimal solution. Later, the found approximate optimal solution of Sudoku is used to offer satisfactory visual stego-image quality with a lower execution time during the embedding procedure. Simulation results confirmed that the average stego-image quality is enhanced by around 90.51% compared to Hong et al.'s scheme, with relatively less execution time compared to a brute-force search method.
Keywords: data hiding; Sudoku; greedy method; brute-force search method; approximate optimal solution.
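The general Sudoku-embedding idea can be sketched roughly as follows (this is a generic sketch of the technique family, not Chang et al.'s scheme or the authors' greedy search; the reference matrix `M` and search `radius` are illustrative assumptions): a base-9 secret digit is hidden in a pixel pair by nudging the pair to the nearest cell of a Sudoku-derived reference matrix holding that digit.

```python
def embed_pair(p1, p2, digit, M, radius=4):
    """Embed one base-9 secret digit into pixel pair (p1, p2) using a
    256x256 reference matrix M: move the pair to the nearest cell in a
    local window whose matrix value equals the digit.  Smaller moves
    mean less distortion, hence better stego-image quality."""
    best = None
    for x in range(max(0, p1 - radius), min(256, p1 + radius + 1)):
        for y in range(max(0, p2 - radius), min(256, p2 + radius + 1)):
            if M[x][y] == digit:
                d = (x - p1) ** 2 + (y - p2) ** 2
                if best is None or d < best[0]:
                    best = (d, x, y)
    return (best[1], best[2]) if best else (p1, p2)
```

Optimising the reference matrix (here via a greedy method rather than brute force) amounts to choosing `M` so that, on average, a matching cell lies as close as possible to every pixel pair.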
DDoS attack detection method based on network abnormal behaviour in big data environment
by Jing Chen, Xiangyan Tang, Jieren Cheng, Fengkai Wang, Ruomeng Xu
Abstract: Distributed denial of service (DDoS) attacks have become a rapidly growing problem with the fast development of the internet. Existing DDoS attack detection methods suffer from detection delay and low detection rates. This paper presents a DDoS attack detection method based on network abnormal behaviour in a big data environment. Based on the characteristics of flood attacks, the method filters network flows to keep only the 'many-to-one' flows, reducing interference from normal traffic and improving detection accuracy. We define the network abnormal feature value (NAFV) to reflect the state changes of old and new IP addresses in 'many-to-one' network flows. Finally, a DDoS attack detection method based on real-time NAFV series is built to identify the abnormal network flow states caused by DDoS attacks. The experiments show that, compared with similar methods, this method achieves a higher detection rate and lower false-alarm and miss rates.
Keywords: distributed denial of service; DDoS; time series; auto regressive integrated moving average; ARIMA; big data; forecast.
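A loose sketch of the two ingredients above: the 'many-to-one' filter, and a plausible abnormal-feature value built from new versus returning source IPs (the paper's exact NAFV formula is not reproduced here; `nafv` below is an assumed simplification):

```python
def many_to_one_flows(flows, min_sources=3):
    """Keep only destinations contacted by many distinct sources --
    the 'many-to-one' pattern characteristic of flood attacks.
    flows: iterable of (src_ip, dst_ip) pairs."""
    by_dst = {}
    for src, dst in flows:
        by_dst.setdefault(dst, set()).add(src)
    return {dst: srcs for dst, srcs in by_dst.items()
            if len(srcs) >= min_sources}

def nafv(current_sources, known_sources):
    """Hypothetical abnormal-feature value: sources never seen before
    minus returning (old) sources.  A surge of new IPs converging on
    one destination suggests a DDoS flood."""
    new = len(current_sources - known_sources)
    old = len(current_sources & known_sources)
    return new - old
```

Tracking this value as a time series per destination is what turns the per-window counts into a real-time detection signal.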
A parallel adaptive-resolution hydraulic flood inundation model for flood hazard mapping
by Wencong Lai, Abdul A. Khan
Abstract: There is a growing demand for improved high-resolution flood inundation modelling in large-scale watersheds for sustainable planning and management. In this work, a parallel adaptive-resolution hydraulic flood inundation model is proposed for large-scale unregulated rivers. The model uses the best publicly available topographic data and streamflow statistics from the USGS. An adaptive triangular mesh is generated with fine resolution (~30 m) around streams and coarse resolution (~200 m) away from streams. The river flood-peak discharges are estimated using the regression equations from the National Streamflow Statistics (NSS) Program, based on watershed and climatic characteristics. The hydraulic simulation is performed using a discontinuous Galerkin solver for the 2D shallow-water flow equations, run in parallel with the global domain partitioned by stream link and stream length. The proposed model is used to predict flooding in the Muskingum River Basin and the Kentucky River Basin. The simulated inundation maps are compared with FEMA maps and evaluated using three statistical indices. The results demonstrate that the model can predict flood maps for large-scale unregulated rivers with acceptable accuracy.
Keywords: flood inundation; flood mapping; unregulated rivers.
A hidden Markov model to characterise motivation level in MOOCs learning
by Yuan Chen, Dongmei Han, Lihua Xia
Abstract: The effect of MOOC learning is closely related to the learning ability of learners. To study changes in learners' learning ability, this paper uses a hidden Markov model to analyse the continuous learning process of MOOC learners. The model is established on data from the learning platform www.shlll.net. The empirical results show that the model clearly distinguishes learners with high learning ability from those with low learning ability. Based on these findings, the paper further analyses the differences in learning behaviour of participants in learning activities and in continuous learning. The proposed method provides a new way to study MOOC learning and is helpful for its development.
Keywords: massive open online courses; MOOCs; motivation level; learning ability; hidden Markov model.
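The HMM machinery such a study relies on can be illustrated with the standard forward algorithm, which scores an observed activity sequence under hidden ability states (the two states and all probabilities below are invented for illustration, not estimated from the www.shlll.net data):

```python
def forward(obs, pi, A, B):
    """Forward algorithm: P(observation sequence) under an HMM.
    pi: initial state probabilities; A[i][j]: transition prob i -> j;
    B[i][o]: probability of emitting observation o in hidden state i."""
    n = len(pi)
    # alpha[j] = P(obs so far, current hidden state = j)
    alpha = [pi[j] * B[j][obs[0]] for j in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)
```

With state 0 = low ability, state 1 = high ability and observations 0 = fail, 1 = pass, sequences of repeated successes score higher than mixed ones, which is the basis for separating the two learner groups.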
Greedy algorithm for image quality optimisation based on turtle-shell steganography
by Guo-Hua Qiu, Chin-Feng Lee, Chin-Chen Chang
Abstract: Information hiding, also known as data hiding, is an emerging field that combines multiple theories and technologies. In recent years, Chang et al. and Liu et al. have proposed new data hiding schemes based on Sudoku, a turtle-shell, etc. These proposed schemes have their own advantages in terms of visual quality and embedded capacity. However, the reference matrices used in these schemes are not optimal. Based on the characteristics of these schemes, Jin et al. employed particle swarm optimisation to select the reference matrix and achieved approximately optimal results in reducing the distortion of the stego-image. However, the complexity is high. In this paper, a turtle-shell matrix optimisation scheme is proposed using a greedy algorithm. The experimental results show that our proposed greedy algorithm is better than the particle swarm optimisation scheme at finding a near-optimal matrix and achieving better stego-image quality, and it outperforms the particle swarm optimisation scheme in terms of computational amount and efficiency.
Keywords: data hiding; turtle-shell steganography; particle swarm optimisation; PSO; greedy algorithm.
An efficient memetic algorithm using approximation scheme for solving nonlinear integer bilevel programming problems
by Yuhui Liu, Hecheng Li, Huafei Chen, Jing Huang
Abstract: Nonlinear integer bilevel programming problems (NIBLPPs) are mathematical models with a hierarchical structure and are known to be strongly NP-hard. In general, they are extremely hard to solve because they are non-convex and non-differentiable, especially when integer constraints are involved. In this manuscript, a memetic algorithm based on a simplified branch-and-bound method and an interpolation technique is developed to solve NIBLPPs. Firstly, the leader's variable values are taken as individuals in the population; for each individual in the initial population, a simplified branch-and-bound method is adopted to obtain the follower's optimal solutions. Then, to reduce the computational cost of repeatedly solving the follower's problems for the many offspring generated during evolution, an interpolation method is applied to approximate the solutions of the follower's problem for each individual in the population. In addition, among these approximated points, only potentially better points are chosen to undergo a further optimisation procedure, so as to obtain precise optimal solutions to the follower's problems. Simulation results show that the proposed memetic algorithm is efficient in dealing with NIBLPPs.
Keywords: nonlinear integer bilevel programming problem; NIBLPP; memetic algorithm; branch and bound method; interpolation function; optimal solutions.
MESRG: multi-entity summarisation in RDF graph
by Ze Zheng, Xiangfeng Luo, Hao Wang
Abstract: Entity summarisation has drawn a lot of attention in recent years, but some problems remain. Firstly, most previous works focus on summarising individual entities while ignoring the effect of their neighbours. Secondly, external resources, which may be unavailable in practice, are frequently used to calculate the similarity between resource description framework (RDF) triples. To solve these two problems, this paper focuses on multi-entity summarisation. A topic-model-based method, multi-entity summarisation in RDF graph (MESRG), is proposed; it extracts informative and diverse summaries through a two-phase process: 1) to select the more important RDF triples, we propose an improved topic model that ranks triples by probability values; 2) to select diverse RDF triples, we use a graph embedding method to calculate the similarity between triples and obtain the top-k distinctive triples. Experiments on benchmark datasets demonstrate the effectiveness of our model.
Keywords: semantic web; knowledge graph; multi-entity summarisation; extract subgraph; rank triples; RDF graph; topic model; Gibbs sampling; deep walk; graph embedding.
Dynamic multiple copies adaptive audit scheme based on DITS
by Xiaoxue Ma, Pengliang Shi
Abstract: Multi-copy storage can effectively improve the security of cloud storage. To address the problem of dynamic security auditing of multi-copy data in the cloud, this paper proposes a dynamic multi-copy adaptive audit scheme (DMCAA) based on a dynamic index table structure (DITS). To check the correctness and completeness of multiple copies, the data blocks and their corresponding location index information are concatenated to generate a duplicate file. The audit process is divided into two parts: third-party audits and client audits. During auditing, to prevent collusion attacks and improve audit accuracy, the third-party auditors apply a challenge-response mode to check the data block labels, while the client auditor applies an adaptive audit algorithm to retrieve the index information of the data blocks. Finally, to prevent replacement and replay attacks, the version number and block number of the file are added to the block label. Theoretical analysis and experimental comparison show that the scheme is more secure in verifying the integrity of dynamic data and the correctness of multi-copy data, and can effectively prevent existing data threats.
Keywords: cloud storage; dynamic auditing; multiple copies; integrity.
Basins of attraction and critical curves for Newton-type methods in a phase equilibrium problem
by Gustavo Mendes Platt, Fran Sérgio Lobato, Gustavo Barbosa Libotte, Francisco Duarte Moura Neto
Abstract: Many engineering problems are described by systems of nonlinear equations that may exhibit multiple solutions, a challenging situation for root-finding algorithms. The existence of several solutions may give rise to complex basins of attraction, with severe influence on the convergence behaviour of the algorithms. In this work, we explore the relationship between the basins of attraction and the critical curves (the locus of the singular points of the Jacobian of the system) in a phase equilibrium problem in the plane with two solutions, namely the calculation of a double azeotrope in a binary mixture. The results indicate that the joint use of basins of attraction and critical curves can be a useful tool for selecting the most suitable algorithm for a specific problem.
Keywords: Newton's methods; basins of attraction; nonlinear systems; phase equilibrium.
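A minimal sketch of how basins of attraction relate to a critical curve, on a toy system F(x, y) = (x² − 1, y) chosen purely for illustration (not the azeotrope model): the Jacobian is singular on the line x = 0, which here is exactly the boundary between the basins of the two roots (±1, 0).

```python
def newton_basin(x, y, roots, tol=1e-8, max_iter=50):
    """Newton's method for F(x, y) = (x^2 - 1, y), which has two roots
    (+1, 0) and (-1, 0).  Returns the index of the root reached, or
    None on failure.  det J = 2x vanishes on the critical curve x = 0,
    where the Newton step is undefined."""
    for _ in range(max_iter):
        det = 2.0 * x                 # determinant of the Jacobian
        if abs(det) < 1e-14:          # starting on the critical curve
            return None
        # Newton update: x <- x - (x^2 - 1)/(2x); the y-equation is
        # linear, so y is driven to 0 in a single step.
        x, y = x - (x * x - 1.0) / (2.0 * x), 0.0
        for k, (rx, ry) in enumerate(roots):
            if abs(x - rx) < tol and abs(y - ry) < tol:
                return k
    return None
```

Scanning a grid of initial guesses and colouring each point by the returned index reproduces the basin maps discussed in the paper; in this toy case the basin boundary and the critical curve coincide.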
Special Issue on: Recent Advancements in Machine Learning Techniques for Big Data and Cloud Computing
Research of the micro grid renewable energy control system based on renewable related data mining and forecasting technology
by Lin Yue, Yao-jun Qu, Yan-xia Song, Kanae Shunshoku, Jing Bai
Abstract: The output power of renewable energy sources fluctuates randomly and is unstable, which harms the stability of renewable power grids and leads to a low usage ratio of renewable output power. This paper therefore proposes a method to predict renewable energy output power based on data mining technology. Firstly, the prediction accuracy of three algorithms, linear regression, decision tree and random forest, is obtained and compared for renewable generation power. Secondly, by applying the prediction results in the power dispatch control system, grid-connected renewable power is consumed by the grid-connected load, improving the usage ratio of renewable power. A simulation model and an experiment platform are established to verify and analyse the usefulness of the prediction. The experiments show that the random forest algorithm achieves the highest prediction accuracy. The tendency of renewable energy output power within a period can be calculated using data mining, and the designed experiment platform can adjust its working state automatically by following instructions derived from the data mining results, which increases the usage ratio of renewable energy output power and improves the stability of the renewable power grid.
Keywords: data mining; micro grid; renewable energy.
Research on advertising content recognition based on convolutional neural network and recurrent neural network
by Xiaomei Liu, Fazhi Qi
Abstract: The problem tackled in this paper is identifying text advertisements published by users on a medium-sized social networking website. First, the text is segmented; it is then transformed into a sequence tensor using a word vector representation and fed into a deep neural network. Compared with other neural networks, RNNs are good at processing training samples that form continuous input sequences of varying length. Although RNNs can in theory handle the training of sequential data elegantly, they suffer from the vanishing gradient problem, so in practice the widely used model is the LSTM, a variant of the RNN. In the experiments, a convolutional neural network is also used to process text sequences, treating time as a spatial dimension. Finally, the paper briefly introduces the use of universal language model fine-tuning for text classification.
Keywords: RNN; LSTM; CNN; word vector; text classification.
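The gating that lets an LSTM sidestep vanishing gradients can be shown in a toy, scalar cell step (the per-gate weights `W` and the scalar formulation are illustrative assumptions, not the paper's network):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W):
    """One scalar LSTM cell step.  The gates decide what the cell
    state c forgets, admits and exposes; because c is updated
    additively (f*c + i*g), gradients flow through it more easily
    than through a plain RNN's repeated matrix products.
    W holds (wx, wh, b) triples for gates 'i', 'f', 'o' and
    candidate 'g'."""
    i = sigmoid(W['i'][0] * x + W['i'][1] * h + W['i'][2])    # input gate
    f = sigmoid(W['f'][0] * x + W['f'][1] * h + W['f'][2])    # forget gate
    o = sigmoid(W['o'][0] * x + W['o'][1] * h + W['o'][2])    # output gate
    g = math.tanh(W['g'][0] * x + W['g'][1] * h + W['g'][2])  # candidate
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new
```

Iterating this step over the word-vector sequence, then classifying from the final hidden state, is the usual shape of the LSTM text classifier the abstract refers to.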
Special Issue on: Cloud Computing and Networking for Intelligent Data Analytics in Smart City
Real time ECG signal preprocessing and neuro-fuzzy-based CHD risk prediction
by S. Satheeskumaran, C. Venkatesan, S. Saravanan
Abstract: Coronary heart disease (CHD) is a major chronic disease that is directly responsible for myocardial infarction. Heart rate variability (HRV) has been used to predict CHD risk in human beings. In this work, neuro-fuzzy-based CHD risk prediction is performed after preprocessing and HRV feature extraction. The preprocessing removes high-frequency noise, which is modelled as white Gaussian noise. Real-time ECG signal acquisition, preprocessing and HRV feature extraction are performed using NI LabVIEW and a DAQ board. A 30-second recording of the ECG signal was selected from both smokers and non-smokers. Various statistical parameters are extracted from the HRV to predict CHD risk among the subjects. The HRV-extracted signals are classified into normal and CHD-risk subjects using a neuro-fuzzy classifier, whose classification performance is compared with ANN, KNN and decision tree classifiers.
Keywords: electrocardiogram; Gaussian noise; wavelet transform; heart rate variability; neuro-fuzzy technique; coronary heart disease.
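The statistical parameters extracted from HRV are typically time-domain measures over the RR intervals; the paper does not list its exact feature set, so the features below (mean RR, SDNN, RMSSD, pNN50) are standard assumed choices:

```python
import math

def hrv_features(rr_ms):
    """Common time-domain HRV statistics from RR intervals in ms:
    mean RR, SDNN (sample std of intervals), RMSSD (root mean square
    of successive differences) and pNN50 (% of successive differences
    exceeding 50 ms).  Requires at least two intervals."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    sdnn = math.sqrt(sum((r - mean_rr) ** 2 for r in rr_ms) / (n - 1))
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    pnn50 = 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)
    return {"mean_rr": mean_rr, "sdnn": sdnn, "rmssd": rmssd, "pnn50": pnn50}
```

Feature vectors of this kind are what the neuro-fuzzy classifier would consume when separating normal from CHD-risk subjects.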
An intelligent block matching approach for localisation of copy-move forgery in digital images
by Gulivindala Suresh, Chanamallu Srinivasa Rao
Abstract: Block-based copy-move forgery detection (CMFD) methods work with features from overlapping blocks. As overlapping blocks are involved, thresholds on similarity and physical distance are defined to identify the duplicated regions. However, these thresholds are tuned manually when localising the forged regions. To overcome this, an intelligent block matching approach for localisation is proposed using colour and texture features (CTF) through the firefly algorithm. The proposed CTF method was evaluated on a standard database, achieving an average true detection rate of 0.98 and an average false detection rate of 0.07. The method is robust against brightness change, colour reduction, blurring, contrast adjustment attacks, and additive white Gaussian noise. Performance analysis validates its superiority over existing methods.
Keywords: digital forensics; copy-move forgery detection; intelligent block matching; firefly algorithm.
Optimised fuzzy clustering-based resource scheduling and dynamic load-balancing algorithm for fog computing environment
by Bikash Sarma, R. Kumar, Themrichon Tuithung
Abstract: Fog computing, an extension of cloud computing, runs applications of the Internet of Things (IoT). At the network edge, IoT applications can be implemented by fog computing, an emerging technology in the cloud computing infrastructure. A distinctive mechanism in fog computing is the resource-scheduling process: resource allocation by the fog minimises the load on the cloud. The goals of a load-balancing algorithm are to maximise throughput, optimise the available resources, reduce response time and avoid overloading any single resource. This paper proposes an optimised fuzzy clustering-based resource scheduling and dynamic load balancing (OFCRS-DLB) procedure for fog computing. For resource scheduling, it combines an enhanced form of fast fuzzy C-means (FFCM) with the crow search optimisation (CSO) algorithm. Finally, loads or requests are balanced by applying a scalability decision technique in the load-balancing algorithm. The proposed method is evaluated on standard measures, including response time, processing time, latency ratio, reliability, resource use and energy consumption, and its proficiency is demonstrated by comparison with other evolutionary methods.
Keywords: fog computing; fast fuzzy C-means clustering; crow search optimisation algorithm; scalability decision for load balancing.
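The fuzzy C-means membership update at the core of FFCM can be sketched as follows (a plain FCM step for illustration, not the paper's enhanced variant or its coupling with crow search optimisation):

```python
def fcm_memberships(points, centers, m=2.0):
    """One fuzzy C-means membership update: u[i][j] is the degree to
    which point j belongs to cluster i, computed from its distances to
    all cluster centres.  For each point, memberships over clusters
    sum to 1; m > 1 controls the fuzziness of the assignment."""
    def dist(p, c):
        # Euclidean distance, clamped to avoid division by zero when a
        # point coincides with a centre.
        return max(1e-12, sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5)

    exp = 2.0 / (m - 1.0)
    u = []
    for ci in centers:
        row = []
        for p in points:
            d_i = dist(p, ci)
            row.append(1.0 / sum((d_i / dist(p, ck)) ** exp
                                 for ck in centers))
        u.append(row)
    return u
```

In a scheduling setting, the "points" would be task or resource feature vectors and the soft memberships decide how requests are spread across fog nodes; alternating this update with centre recomputation is the standard FCM loop.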