International Journal of Computational Science and Engineering (74 papers in press)
Constructive system for double-spend data detection and prevention in inter- and intra-block of blockchain
by Vijayalakshmi Jayaraman, Murugan Annamalai
Abstract: The global financial market currently faces considerable difficulty owing to the migration from fiat currency to cryptocurrency and its underlying blockchain technology. Blockchain provides trust in a decentralised way for storing, managing, and retrieving transactions. The double-spending issue arises owing to the erroneous transaction verification mechanism in the blockchain. Research has shown that transaction malleability, such as double-spending, creates millions of bitcoin losses for owners as well as for a few bitcoin exchanges. This research aims to detect and prevent the double-spending of bitcoins in single and multiple blocks. In this context, double-spend data in a single block is identified using the DPL2A method. Further, the original transaction in the double-spend transaction list is identified using the ACRT method, which prevents future occurrences of the double-spend. Similarly, double-spend data in multiple blocks is identified using MBDTD along with the Cognizant Merkle tree. Finally, a system named F2DP is constructed to detect and prevent double-spend data in inter- and intra-blocks of the blockchain. The results indicate that these methods work best for double-spend detection and prevention with a limited set of transaction records. Further research is needed to scale to larger sets of transaction records.
Keywords: cryptocurrency; bitcoin; double-spending; UTXO; Merkle.
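The abstract does not describe the internals of DPL2A, so the following is only a generic illustration of intra-block double-spend detection in a UTXO model: it flags any previous output that is referenced as an input more than once within a single block. The block layout and the function name `find_double_spends` are assumptions made for this sketch, not the authors' method.

```python
from collections import defaultdict


def find_double_spends(block):
    """Return {outpoint: [txids]} for outpoints spent more than once in a block.

    `block` is a list of (txid, inputs) pairs, where each input is a
    (previous_txid, output_index) outpoint. This layout is hypothetical.
    """
    spenders = defaultdict(list)
    for txid, inputs in block:
        for outpoint in inputs:
            spenders[outpoint].append(txid)
    # Keep only outpoints that more than one input refers to.
    return {op: txs for op, txs in spenders.items() if len(txs) > 1}


block = [
    ("tx1", [("coinbase0", 0)]),
    ("tx2", [("tx0", 1)]),
    ("tx3", [("coinbase0", 0), ("tx0", 2)]),  # reuses the outpoint spent by tx1
]
conflicts = find_double_spends(block)
print(conflicts)  # {('coinbase0', 0): ['tx1', 'tx3']}
```

Deciding which of the conflicting transactions is the original one (the role the abstract assigns to ACRT) would need extra information such as arrival order, which this sketch does not model.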
Workflow scheduling in cloud computing environment with classification ordinal optimisation using support vector machine
by Debajyoti Mukhopadhyay, Vahab Samandi
Abstract: Every day, civilisation generates more and more data. The processing cost and performance issues of this massive set of data have become a challenge in distributed computing. Processing multitask workloads for big data in a dynamic environment requires real-time scheduling and the additional complication of generating optimal schedules in a large search space with high overhead. In this paper, we propose an adaptive workflow management system that uses ordinal optimisation to acquire suboptimal schedules in much less time. We then introduce a prediction-based workflow scheduler model that predicts the execution time of the next incoming workflow by using a support vector machine. We used a real application, the Montage workflow for large-volume image data, and the experimental results show that our classification ordinal optimisation outperforms other existing methods.
Keywords: cloud computing; workflow scheduling; classification ordinal optimisation; support vector machine; ordinal optimisation; Montage workflow; big data.
A local region enhanced multi-objective fireworks algorithm with subpopulation cooperative selection
by Xiaoning Shen, Xuan You, Yao Huang, Yinan Guo
Abstract: A local region enhanced multi-objective fireworks algorithm with subpopulation cooperative selection (LREMOFWA) is proposed for multi-objective optimisation. In LREMOFWA, the ranking based on the non-dominated sorting and the crowding distance is taken as the fitness evaluation indicator. A novel way to calculate the explosion amplitude is designed to enhance the search for the local region. The concept of subpopulation is introduced, and the selection operation is performed by the elites in the archive cooperating with the optimal subpopulation sparks. The differential mutation operator is used to deal with the repeated individuals in the fireworks, which prevents the algorithm from falling into the local optimum. The proposed algorithm is compared with five state-of-the-art algorithms on the WFG test functions. Experimental results show that the proposed algorithm has better performance with respect to the searching accuracy and diversity. It is suitable for solving multi-objective function optimisation problems with various complex characteristics.
Keywords: multi-objective optimisation; fireworks algorithm; explosion amplitude; cooperative selection; differential mutation.
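The fitness indicator named in the abstract, a ranking by non-dominated sorting combined with crowding distance, can be sketched as follows. This is a plain textbook version for minimisation problems; the function names are illustrative, and how LREMOFWA combines the two quantities into one ranking is not stated in the abstract.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Split the indices of `points` into successive non-dominated fronts."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Crowding distance of each index in one front (boundary points get inf)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][m])
        lo, hi = points[ordered[0]][m], points[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue  # all values equal on this objective, nothing to add
        for k in range(1, len(ordered) - 1):
            gap = points[ordered[k + 1]][m] - points[ordered[k - 1]][m]
            dist[ordered[k]] += gap / (hi - lo)
    return dist


points = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0), (5.0, 5.0)]
fronts = non_dominated_sort(points)
print(fronts)                                   # [[0, 1, 2], [3], [4]]
print(crowding_distance(points, fronts[0])[1])  # 2.0
```

A lower front index means a better rank, and within a front a larger crowding distance marks a less crowded, more diverse solution.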
High order random coefficient INAR model and simulations
by Qingqing Zhan, Yunyan Wang
Abstract: This paper develops a high-order INAR model based on random environmental processes. The conditional expectation, conditional variance, and correlation structure of this new process are discussed, Yule-Walker estimators of the parameters in the new model are obtained, and strong consistency of the Yule-Walker estimators is proved. Numerical simulation is studied to test the performance of the estimators in a finite sample.
Keywords: random environmental process; generalised signed thinning operator; integer valued time series; Yule-Walker estimation.
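As background for the model class only, the sketch below simulates the classical first-order INAR process with binomial thinning and Poisson innovations, and recovers the thinning parameter with the Yule-Walker estimator (the lag-1 sample autocorrelation). It is a deliberately simpler stand-in: the paper's model is high-order, has random coefficients driven by an environmental process, and uses a generalised signed thinning operator, none of which is reproduced here. All function names are illustrative.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def thin(rng, alpha, x):
    """Binomial thinning: each of the x units survives with probability alpha."""
    return sum(rng.random() < alpha for _ in range(x))


def simulate_inar1(alpha, lam, n, seed=0):
    """Simulate X_t = (alpha thinned X_{t-1}) + e_t with e_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    x, series = 0, []
    for _ in range(n):
        x = thin(rng, alpha, x) + poisson(rng, lam)
        series.append(x)
    return series


def yule_walker_alpha(series):
    """Lag-1 sample autocorrelation, the Yule-Walker estimate of alpha for INAR(1)."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((v - mean) ** 2 for v in series)
    c1 = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    return c1 / c0


series = simulate_inar1(alpha=0.5, lam=2.0, n=5000)
alpha_hat = yule_walker_alpha(series)
print(round(alpha_hat, 3))
```

With a few thousand observations the estimate should land close to the true value of 0.5, which is the kind of finite-sample check the abstract refers to.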
GDC-a-CGI: efficient algorithms for dynamic graph data cleaning and indexing
by Santhosh Kumar D K, Demian Antony D'Mello
Abstract: The era of big data has led graph data collection and analytics to grow rapidly in numerous fields. Data quality and data access are the two decisive factors of performance (accuracy and efficiency) for the graph data analytics model. This paper proposes a Graph Data Cleaning (GDC) technique, which removes erroneous, messy data, leading to better data quality. The GDC is a dynamic cleaning technique that lets the user update rule expressions at runtime and supports inheriting rules across domains. In addition to cleaning, GDC verifies and validates the graph data. The paper presents the Cache-based Graph Indexing (CGI) technique to address data access, which is built using the tree structure CSS-Tree on the Hadoop distributed framework. The CGI is a scalable index construction technique, which builds efficient indexing for an extensive graph dataset. We carried out experiments with different graph data sets, and the results reveal that the proposed GDC and CGI techniques outperform the state of the art.
Keywords: data mining; graph data cleaning; graph data indexing; big data; Hadoop; graph data analytics; dynamic cleaning; cache-based indexing.
A multi-hop cross-blockchain transaction model based on improved hash-locking
by Bingrong Dai, Shengming Jiang, Chao Li, Menglu Zhu, Sasa Wang
Abstract: Blockchain is a decentralised, trust-free distributed ledger technology that has been applied in various fields such as finance, supply chain, and asset management. However, the network isolation between blockchains has limited their interoperability in asset exchange and business collaboration since it forms blockchain islands. Cross-blockchain is an important technology aiming to realise the interoperability between blockchains, and has become one of the hottest research topics in this area. This paper proposes a multi-hop cross-blockchain transaction model based on an improved hash-locking consulted by the notary and users. It can solve the security problems in the traditional hash-locking, and prevent malicious participants from creating a large number of transactions to block the cross-blockchain system. Moreover, a notary multi-signature scheme is designed to solve the problem of lack of trust in the traditional model. A multi-hop cross-blockchain transaction loop is designed based on the loop detection method of directed graphs. The transaction process of key agreement, asset locking, lock releasing, and security analysis based on the model is discussed in detail. Experiments on cross-blockchain transactions are carried out on an Ethereum private chain, and show that the proposed model has good applicability.
Keywords: blockchain; cross-blockchain; hash-locking; notary schemes; Diffie-Hellman algorithm; multi-hop transaction.
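For readers unfamiliar with the baseline that the paper improves on, traditional hash-locking can be sketched as a toy hash time-locked deposit: funds are released to whoever presents the preimage of a published hash before a deadline, and refunded to the depositor afterwards. The class `HashTimeLock` and its fields are hypothetical, time is a plain integer, and the paper's notary negotiation, multi-signature scheme, and multi-hop loop are not modelled.

```python
import hashlib
import secrets


class HashTimeLock:
    """Toy hash time-locked deposit: claim with the preimage before the deadline, else refund."""

    def __init__(self, amount, hash_lock, deadline):
        self.amount = amount
        self.hash_lock = hash_lock  # SHA-256 digest of the secret
        self.deadline = deadline    # logical time from which a refund is allowed
        self.state = "locked"

    def claim(self, preimage, now):
        ok = (
            self.state == "locked"
            and now < self.deadline
            and hashlib.sha256(preimage).digest() == self.hash_lock
        )
        if ok:
            self.state = "claimed"
        return ok

    def refund(self, now):
        ok = self.state == "locked" and now >= self.deadline
        if ok:
            self.state = "refunded"
        return ok


secret = secrets.token_bytes(32)
lock = hashlib.sha256(secret).digest()
on_chain_a = HashTimeLock(amount=5, hash_lock=lock, deadline=20)  # funded by the initiator
on_chain_b = HashTimeLock(amount=5, hash_lock=lock, deadline=10)  # funded by the counterparty
claimed_b = on_chain_b.claim(secret, now=3)  # initiator reveals the secret on chain B
claimed_a = on_chain_a.claim(secret, now=4)  # counterparty reuses the revealed secret on chain A
print(claimed_b, claimed_a)  # True True
```

The deadline on chain B is shorter than on chain A so that the counterparty still has time to use the revealed secret; a participant who simply never claims can tie up locked funds until the deadline, which is the kind of blocking behaviour the abstract says the improved scheme is meant to prevent.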
Workflow scheduling optimisation for distributed environment using artificial neural networks and reinforcement learning
by K. Jairam Naik, Mounish Pedagandham, Amrita Mishra
Abstract: The objective of this research article is to find an optimal schedule that can reduce the makespan of workflow. The workflow scheduling was enhanced by realising Artificial Neural Networks (NN) and reinforcement Q-learning standards. An optimised NN-based scheduling algorithm (WfSo_ANRL) that represents an agent that can effectively schedule the tasks among computational nodes is presented in this article.
Keywords: workflow scheduling; optimisation; distributed environment; artificial neural network; makespan time; Q-learning.
Causal event extraction using causal event element-oriented neural network
by Kai Xu, Peng Wang, Xue Chen, Xiangfeng Luo, Jianqi Gao
Abstract: Causal event extraction plays an important role in natural language processing (NLP) tasks such as question answering, decision making and event prediction. Previous work extracts causal events using template-matching methods, machine-learning methods, or deep-learning methods. However, these methods ignore the guiding role of specific causal patterns on causal event extraction. In this paper, we propose a causal event element-oriented neural network (CEEONN) to extract causal events. Firstly, we construct a causal event element knowledge base (CEEKB) from domain causal text. Then we construct a neural network by incorporating both the entire sentence and associated causal patterns into a better semantic representation. With domain-based CEEKB, the proposed CEEONN can be better guided to identify specific causal patterns. Experiments show that CEEONN achieves competitive results compared with previous work.
Keywords: causal event elements; causal event extraction; causal patterns.
Enhancing the energy efficiency by LEACH protocol in the internet of things
by Meghana Lokhande, Dipti Patil
Abstract: The Internet of Things (IoT) is one of the active applications of Wireless Sensor Networks (WSNs) with different objects or devices that can be connected over the internet. Limited battery life is the main concern for WSNs, as it affects network life. Various medical devices and applications have benefitted from machine-to-machine (M2M) connectivity. In tele-robotic surgery, M2M communication between medical devices provides visual assistance to doctors during internal procedures and gives feedback on the progress of the operation from sensors embedded in surgical instruments. Many researchers have worked on reducing energy consumption in M2M networks. These studies presented energy-efficient routing protocols (EERP) that use the LEACH protocol to enhance energy efficiency and prolong sensor node life. LEACH is a hierarchical protocol that converts sensor nodes (SN) to cluster heads (CH) based on current energy; the CH collects and compresses data, and sends it to the destination node. The energy of a node dissipates each time it receives data or transmits data to the base station. In the medical field, surgeons and patients are located at different places and connected through public networks. To make this communication reliable and robust, this research presents the design of a medical sensor node network with the LEACH protocol. After designing the network, a denial-of-service or man-in-the-middle attack is performed to analyse its impact on network performance. The system performance is measured using performance parameters. Based on the experimental results, the LEACH protocol can significantly extend the life of the WSN, thus improving energy efficiency. This makes the system more robust and reliable in the emerging area of medical science.
Keywords: machine-to-machine communication; internet of things; security.
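The cluster-head rotation at the heart of LEACH can be sketched with the classic threshold rule: in round r, a node that has not yet served as cluster head in the current epoch becomes one if a uniform random draw falls below T = p / (1 - p (r mod 1/p)), where p is the desired fraction of cluster heads. This is the original probabilistic rule; the energy-based selection mentioned in the abstract is a variant whose exact form is not given, so it is not modelled. Function and parameter names are illustrative.

```python
import random


def leach_threshold(p, round_no):
    """Classic LEACH threshold for a node that is still eligible in this round."""
    period = round(1 / p)  # number of rounds in one epoch
    return p / (1 - p * (round_no % period))


def elect_cluster_heads(node_ids, p, round_no, recent_heads, rng):
    """Elect heads among nodes that have not served during the current epoch."""
    t = leach_threshold(p, round_no)
    return [n for n in node_ids if n not in recent_heads and rng.random() < t]


rng = random.Random(1)
print(leach_threshold(0.1, 0))  # 0.1 at the start of an epoch
# In the last round of an epoch the threshold reaches 1, so every node that
# has not yet served is elected.
last_round_heads = elect_cluster_heads(range(5), 0.1, 9, recent_heads={0, 1}, rng=rng)
print(last_round_heads)         # [2, 3, 4]
```

Because the threshold grows over the epoch, every node serves as cluster head about once per 1/p rounds, which spreads the energy cost of aggregation and long-range transmission across the network.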
Predicting stock price movement using a stack of multi-sized filter maps and convolutional neural networks
by Yash Thesia, Vidhey Oza, Priyank Thakkar
Abstract: This paper explores the use of Convolutional Neural Networks (CNN) to predict the movement of the stock market from a classification perspective. Standard methods of classification yield results with quite low confidence and precision. We, therefore, propose a CNN enhanced by multi-sized feature maps and spatial mapping, providing more accurate two-way classification on a set of stocks. We also propose transforming stock indicators and data into a spatial map/image so that they can be processed using CNN. Our model and mapping achieve an average weighted F1 score of 80% for a two-way classification of market movement. A trading strategy is also employed and returns are compared with benchmarks. Our returns from the trading strategy from 2017 to 2020 outperform the previous benchmarks.
Keywords: convolutional neural network; stock market; stock price movement; Inception; technical indicators.
Stacked auto-encoder for Arabic handwriting word recognition
by Benbakreti Samir, Benouis Mohamed, Roumane Ahmed, Benbakreti Soumia
Abstract: Arabic handwriting recognition systems face several challenges, such as very diverse scripting styles, the presence of pseudo-words, and the position-dependent shape of a character inside a given word. These characteristics complicate the task of feature extraction. Our proposed solution to this problem is a Stacked Auto-Encoder (SAE) unsupervised learning approach applied to unconstrained Arabic handwritten word recognition. Our strategy consists of using an unsupervised pre-training stage, i.e. SAE, which extracts the features layer by layer; then, through fine-tuning, the global system is used for the classification task. By exploiting this, our system gains the advantage of applying a holistic approach, i.e. without word segmentation. In order to train our model, we have enhanced the NOUN v3 hybrid (i.e. offline and online) database, which contains 9600 handwritten Arabic words and 4800 characters. However, this work focuses on the offline recognition of Arabic word handwriting using an SAE-based architecture for image classification. Our experimental study shows that after a careful tuning of the main SAE parameters we obtained good results (98.03%).
Keywords: Arabic handwriting; offline recognition; deep learning; auto-encoder; SAE.
Prior distributions based data augmentation for object detection
by Ke Sun, Xiangfeng Luo, Liyan Ma, Shixiong Zhu
Abstract: Object detection models based on deep convolutional neural networks require extensive labelled data as a training set, while the collection and annotation of these data are laborious and costly. To solve the problem, data augmentation methods based on cut-and-paste that can explore the visual context are widely used. However, these methods either limit the expansion of the instance diversity of the dataset or increase the computational burden. In this paper, we propose a novel data augmentation strategy based on prior distributions, which can be used to guide data augmentation for object detection. On the one hand, the method can effectively capture the relationship between the foreground instance and the visual context. On the other hand, it can increase the instance diversity of the original dataset as much as possible. Experimental results show that the performance of the popular object detection model can be effectively improved by expanding the original dataset with our method. Compared with the baseline, our method improves by 0.8 percentage points on PASCAL VOC and by 1.1 percentage points on the cross-data test set.
Keywords: data augmentation; visual context; prior distributions; object detection.
Side-path FPN based multi-scale object detection
by Weixian Wan, Xiangfeng Luo, Liyan Ma, Shaorong Xie
Abstract: Multi-scale object detection faces the problem of how to obtain distinguishable features. Feature pyramid network (FPN) is the most typical method to construct a feature pyramid to obtain multi-scale features. FPN is beneficial for multi-scale object detection tasks to improve the mean Average Precision (mAP) of the detectors. However, owing to the lack of feature selection to eliminate redundant information, FPN cannot make full use of multi-scale features. In this paper, side-path FPN is proposed to address this problem. Side-path FPN contains two components: feature alignment and feature fusion. The feature alignment component uses the best operator to extract features. The feature fusion component can enhance features that are helpful for detection and reduce redundant information. With ResNet-50 as the backbone, compared with the original FPN, side-path FPN improves mAP by 1.8 points on the VOC2007 test data set and 1.0 point on the COCO 2017 test data set with MS COCO metrics.
Keywords: object detection; multiple scale; feature selection.
On managing security in smart e-health applications
by Fiammetta Marulli, Stefano Marrone, Emanuele Bellini
Abstract: Distributed machine learning can give an adaptable but strong shared condition for the design of trusted AI applications; this is mainly due to the lack of privacy of centralised remote learning mechanisms. This notwithstanding, distributed approaches have also been compromised by several attack models (mainly data poisoning): in such a situation, a malicious member of the learning party may inject bad data. As such applications are growing in criticality, learning models must face issues of security and protection as well as versatility. The aim of this paper is to improve these applications by providing extra security features for distributed and federated learning mechanisms: in more detail, the paper examines specific concerns such as the use of blockchain, homomorphic cryptography and meta-modelling techniques to ensure protection as well as other non-functional properties.
Keywords: federated learning; cloud computing; security in machine learning; adversarial attacks.
Transfer learning approach in deep neural networks for uterine fibroid detection
by Sumod Sundar, Sumathy Subramanian
Abstract: Convolutional Neural Network (CNN) is a deep learning algorithm that takes images as input and automatically extracts features for effective class prediction. Much research is under way in medical imaging diagnosis using deep learning techniques. The performance of a CNN architecture is a major concern when only limited data are available. Traditional CNN architectures such as ImageNet, AlexNet, and GoogleNet are trained with a large quantity of data. Also, CNN architectures such as NiftyNet, UNet, and SegNet are not designed using uterine fibroid images. The idea of transfer learning is used in this work by combining the pre-trained model Inception-V4 and a Support Vector Machine (SVM) classifier for better performance with limited data. The goal of the proposed approach is to efficiently detect the presence of fibroids in uterus MRI images. The features of fibroid-affected uterus images are extracted using the initial layers of Inception-V4 and transferred to the SVM during training. Several combinations of various network classifiers are tested, and the performance metrics are evaluated. Experimental validation of the proposed model attained an accuracy of 81.05% with a U-kappa score of 0.402 on predicting fibroid images, and a 2.65% accuracy improvement compared with the Fully Connected Neural Network (FCNN) used for fibroid detection.
Keywords: CNN; transfer learning; deep learning; uterine fibroid; Inception-V4.
Numerical treatment and analysis for a class of time-fractional Burgers equation with the Dirichlet boundary conditions
by A.S.V Ravi Kanth, Neetu Garg
Abstract: This paper aims to study a class of time-fractional Burgers equations with the Dirichlet boundary conditions in the Caputo sense. The Burgers equation occurs in the study of fluid dynamics, turbulent flows, acoustic waves, and heat conduction. We discretise the equation by employing the Crank-Nicolson finite difference quadrature formula in the direction of time. We then discretise the resulting equations in the space domain using the exponential B-splines. A rigorous stability and convergence analysis is carried out. Several test problems are studied to illustrate the efficacy and feasibility of the proposed method. Numerical simulations confirm the coherence with the theoretical analysis. Comparisons with the other existing results in the literature indicate the effectiveness of the method.
Keywords: exponential B-spline; time-fractional Burgers equation; Caputo fractional derivative.
Multiclass classification using convolution neural networks for plant leaf recognition of ayurvedic plants
by K.V.N. Rajesh, Lalitha Bhaskari Dhavala
Abstract: Ayurveda is the traditional medicine system of India. The ingredients from which ayurvedic medicines are made are mostly herbal and mineral in nature. Also, there are many herbal home remedies in India for general ailments. This knowledge has been passed down from generation to generation in large joint families. This knowledge is slowly fading away in the current generation of nuclear families. The current generation is unable to identify even locally available plants. The authors have come up with the idea of using convolution neural networks for solving this problem. In this solution, the images of leaves are used to identify the plant. This problem is a case of multiclass classification. A leaf image database is created and a neural network model is built using a Convolutional Neural Network (CNN). The Keras deep learning framework, with TensorFlow as backend, is used for this purpose. The work presented in this paper is part of a larger research work in this area. This paper explains the developed CNN model and presents the results corresponding to six ayurvedic leaves commonly available in and around the city of Visakhapatnam in the state of Andhra Pradesh.
Keywords: convolution neural networks; multiclass classification; plant leaf recognition; leaf feature extraction.
Improved ELBP descriptors for face recognition
by Shekhar Karanwal, Manoj Diwakar
Abstract: In this work, three novel descriptors are introduced for Face Recognition (FR): Sobel Horizontal Elliptical Local Binary Pattern (SHELBP), Sobel Vertical ELBP (SVELBP) and Sobel ELBP (SELBP). All three proposed descriptors are extensions of the work of Nguyen et al. (2012), who proposed three descriptors for FR called HELBP, VELBP and ELBP. In HELBP and VELBP the horizontal neighbourhood pixels (aligned elliptically) and vertical neighbourhood pixels (aligned elliptically) are compared with the centre pixel to produce their feature sizes, and ELBP is the combined histogram extracted from both descriptors. These descriptors are not effective under illumination variations (without pre-processing), as is shown experimentally in this work. To compensate for that, the Sobel operator is applied as image pre-processing before feature extraction is performed. The features extracted from Sobel magnitude and directional gradients eliminate this problem very effectively.
Keywords: image pre-processing; feature extraction; dimension reduction; classification.
Feature reduction of rich features for universal steganalysis using a meta-heuristic approach
by Ankita Gupta, Rita Chhikara, Prabha Sharma
Abstract: The development of content adaptive steganographies has become a challenge for steganalysis. This led researchers towards extraction of a rich space of features. The detection of stego images based on Spatial Rich Model (SRM) features and its variants is a promising research area in the field of universal steganalysis. SRM features are extracted as 106 submodels that collectively provide 34,671 features. So, one of the most significant challenges in universal steganalysis is feature selection. In this paper, an improved binary particle swarm optimisation, Global and Local Best Particle Swarm Optimization (GLBPSO) with Fisher Linear Discriminant classifier is used to identify relevant feature submodels that improve the efficiency of a steganalyser. A significant reduction rate of more than 70% is achieved by the proposed approach. This further helps in reducing computational complexity without much affecting the detection capability. The proposed methodology gives superior results when compared with state-of-the-art algorithms.
Keywords: steganalysis; spatial rich model; GLBPSO; Fisher linear discriminant; ensemble; submodels; classification accuracy; steganography; meta-heuristic; optimisation.
Self-similarity single image super-resolution based on blur kernel estimation for texture reconstruction
by Kawther Aarizou, Abdelhamid Loukil
Abstract: Most recent Single Image Super Resolution (SISR) reconstruction methods adopt simple bicubic downsampling to construct low-resolution (LR) and high-resolution (HR) pairs for training. Those models learn an inverted version of an ideal degradation operation, which leads to generating less realistic SR images. The obtained details are either blurred or not reminiscent of the usually observed textures (Du et al., 2020). The generation of an SR image from a single LR image with faithful ground-truth texture and no external information remains an issue, especially when the degradation model is not defined (not necessarily bicubic downscaling). To overcome this issue, we focus on designing a single-image SR reconstruction framework for real-world scenarios by injecting the image-specific degradation kernel in the training process. Our method combines the advantages of both SISR and Multiple-Image Super Resolution (MISR) techniques by generating a dataset from the internal statistics of the LR image. A small CNN is trained over this internal dataset and requires no additional or external data. Our method is shown to recover more textural details in the generated outcome, and outperforms state-of-the-art deep models.
Keywords: unsupervised single-image super-resolution; internal learning; image-specific super-resolution; data-augmentation; kernel estimation.
Distributed energy management study based on blockchain technology
by Jingzhao Li, Lei Wang, Xiaowei Qin
Abstract: This article proposes a power network management system based on blockchain technology to address the difficulties in distributed energy management in smart grids and power resource dispatching. Ethereum is used as the development platform to build a blockchain for power interaction management that mainly involves distributed energy transactions. The consensus mechanism for electricity generation and consumption in the blockchain is designed based on the proof of stake (PoS) mutation algorithm. The consensus mechanism is also based on a selection function to determine the bookkeeper among the two sides of the transaction. In order to guarantee the rationality of energy transactions in the grid, this article presents a distributed power dispatching strategy using K-means clustering algorithm and particle swarm optimisation (PSO) algorithm. Finally, the distributed power matching transactions are completed in the power interaction management blockchain with the presence of power operation smart contracts. The experimental results show that the consensus mechanism and power dispatching strategy designed in this paper effectively solve the matching problem in distributed power trading. The application of the power operation smart contract further promotes the success rate of the transaction and effectively reduces its time consumption.
Keywords: smart grid; distributed energy; power network management system; power interactive management blockchain; smart contract.
A genetic algorithm for real time demand side management in smart microgrids
by Salvatore Venticinque, Massimiliano Diodati
Abstract: One of the main drawbacks in the management of renewable resources, including wind and solar energies, is the issue of uncertainty in their behaviour. Demand side management (DSM) shifts loads of a household from times characterised by a surplus in consumption to times with photovoltaic production surplus. In this paper, we propose the use of a genetic algorithm to find the best schedule of energy loads that best matches the energy production by photovoltaic panels. We aim at optimising self-consumption, but satisfying real-time constraints, which allow for addressing unforeseen changes of the planned schedule or unpredictable variations of renewable energy production. We designed specialised genetic operators to accelerate, already in the first iterations, the convergence to a local minimum of the solution space, and evaluated how such improvements affect the optimality of results.
Keywords: smart microgrid; optimisation; genetic algorithm; demand side management.
Empowerment of cluster and grid load-balancing algorithms to support distributed exascale computing systems with high compatibility
by Faezeh Mollasalehi, Shirin Shahrabi, Elham Adibi, Ehsan Mousavi Khaneghah
Abstract: The occurrence of dynamic and interactive events in processes leads to changes in the state of their necessities and in the state of computing elements of the system, leading to changes in the state of the load balancer. This impact may render the load balancer unable to manage the load balancing of the system. On the other hand, the nature of the scientific applications that need distributed exascale systems means that both traditional and distributed exascale systems programs are required. As a result, the load balancer must not only support traditional patterns and mechanisms; these mechanisms must also be empowered to support the states caused by the occurrence of events with a dynamic and interactive nature. In this paper, in addition to the examination of events with a dynamic and interactive nature, a mathematical model is presented to examine the impact of this concept on the load balancer. This mathematical model examines which characteristics the traditional load balancer used in cluster and grid computing systems should support in order to manage the dynamic and interactive nature of processes and their execution in distributed exascale computing systems. Based on the model and the definition of global activity, the mechanisms used in cluster and grid systems are examined in distributed exascale systems. In cluster and grid systems that can support the characteristics specified in this article, in 60% of the cases, the load balancer can manage events with a dynamic and interactive nature and use the mathematical model as the load-balancing mechanism in the distributed exascale system.
Keywords: distributed exascale computing; load balancing; events of dynamic and interactive nature; load-balancing algorithms; cluster computing; grid computing.
An improved method for K-means clustering based on internal validity indexes and inter-cluster variance
by Guangli Zhu, Xiaoqing Li, Shunxiang Zhang, Xin Xu, Biao Zhang
Abstract: With the traditional internal validity indexes of the k-means clustering algorithm, it is sometimes difficult to determine the best cluster number, so a good clustering result cannot always be obtained. To solve this problem, this paper proposes an improved method for k-means clustering based on internal validity indexes and inter-cluster variance. Firstly, the method sets different initial cluster numbers, all of which are integers selected from the interval, and the same data set is clustered under each selected cluster number to obtain the clustering results. Secondly, the obtained clustering results are evaluated by the internal validity indexes. Finally, if the internal validity values are similar, the inter-cluster variances of the candidate cluster numbers are compared to get the best clustering result. Experimental results show that the improved method can obtain a better clustering result under certain conditions.
Keywords: K-means clustering; internal validity indexes; inter-cluster variance.
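The final selection step described in the abstract can be sketched as follows, assuming each candidate cluster number has already been clustered and scored. Several details are assumptions made for illustration because the abstract does not state them: a larger validity value is taken to be better (as with the silhouette index), "similar" is modelled by a fixed tolerance, the tie is broken in favour of the larger inter-cluster variance, and the inter-cluster variance is taken to be the size-weighted variance of the centroids.

```python
def inter_cluster_variance(centroids, sizes):
    """Size-weighted variance of cluster centroids around the global centroid."""
    total = sum(sizes)
    dim = len(centroids[0])
    overall = [sum(c[d] * s for c, s in zip(centroids, sizes)) / total for d in range(dim)]
    return sum(
        s * sum((c[d] - overall[d]) ** 2 for d in range(dim))
        for c, s in zip(centroids, sizes)
    ) / total


def choose_k(candidates, tolerance=0.02):
    """Pick k from {k: (validity, variance)}, where larger validity is better.

    Candidates whose validity lies within `tolerance` of the best one are
    treated as tied, and the tie is broken by the larger inter-cluster variance.
    """
    best = max(v for v, _ in candidates.values())
    tied = [k for k, (v, _) in candidates.items() if best - v <= tolerance]
    return max(tied, key=lambda k: candidates[k][1])


print(inter_cluster_variance([(0.0, 0.0), (4.0, 0.0)], [1, 1]))  # 4.0

# Hypothetical scores: k=3 and k=4 have nearly identical validity values,
# so the inter-cluster variance decides between them.
candidates = {2: (0.61, 4.0), 3: (0.72, 6.5), 4: (0.71, 7.1), 5: (0.55, 7.4)}
print(choose_k(candidates))  # 4
```

Note that k=5 has the largest variance but is never considered, because its validity value is far from the best; the variance comparison only applies among candidates the validity index cannot separate.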
Recommendation Service for hotel applications on blockchain
by Meng-Yen Hsieh, Pei-Wei Wang, Chih-Hong Kao
Abstract: Recommendation mechanisms for processing user data are available on cloud computing platforms to improve the modelling of user preferences. Using recommendation APIs available in cloud computing, our work focuses on developing hotel or lodging web applications with a trust-based recommendation service. The recommendation service, which takes into account trust relationships among users, is extended to reduce the problems of cold-start users and rating-data sparsity. Additionally, a blockchain service supports an online room-booking service. We propose an architecture for hotel or lodging applications that incorporates a number of required modules. A prototype is built from the proposed modules on a cluster platform and a blockchain network. The experimental results show that the trust-based recommender of the prototype achieves higher accuracy than general recommenders that use only explicit rating data. A smart contract for the online room-booking service is implemented, executed, and evaluated on a blockchain test network.
Keywords: recommendation; booking; trust; blockchain.
Human interactive behaviour recognition method based on multi-feature fusion
by Qing Ye, Rui Li, Hang Yang, Xinran Guo
Abstract: Recently, the selection of the overall and individual characteristics in interactive actions and the high-dimensional complexity of features are still important factors affecting the recognition accuracy. In this paper, we propose a human interactive behaviour recognition method based on multi-feature fusion, which includes two parts, feature extraction and behaviour recognition. Firstly, we use histogram feature descriptors to form a three-dimensional gradient histogram of local space-time feature (3D-HOG) and a histogram of global optical flow feature (HOF). Then the bag-of-words model is used to reduce the dimensions, and the classification matrix is obtained through multilayer perceptron (MLP) classifiers. In the second part, we use recurrent neural network (RNN) to get connections in time. Considering the information of interactive behaviour will be different at different stages, an improved Gauss neural network is proposed for interactive behaviour recognition. The experimental results show that the algorithm can effectively improve the accuracy in the UT-interaction dataset.
Keywords: multi-feature fusion; bag-of-words model; multilayer perceptron classifiers; an improved Gauss neural network; interactive behavior recognition.
Optimised implementation of AVR system using particle swarm optimisation (PSO)
by Amin Jarrah, Mohammad Zaitoun
Abstract: Several techniques have been developed to improve the control quality and deliver optimised products in many industrial process domains. This work aims to propose an optimised automatic voltage regulator (AVR) system implementation by applying a nature-inspired algorithm called Particle Swarm Optimization (PSO) to design a proportional-integral-derivative (PID) controller for the AVR system. The proposed system consists of two controllers to deal with both the transient state and the time response. Various parallelisation and optimisation techniques, such as loop unrolling, loop pipelining, dataflow, and loop flattening, were adopted and applied to investigate the opportunities of creating a much more effective design. The proposed system achieves better results for the settling time and the overshoot, which makes the proposed system a suitable choice for zero overshoot industry applications.
Keywords: optimisation techniques; particle swarm optimisation; PID controller; AVR system; time response; optimal control.
A secure hash function based on sponge construction and chaotic maps
by Amine Zellagui, Naima Hadj-said, Adda Ali Pacha
Abstract: This work introduces a new hash function based on the sponge structure and two chaotic maps. It aims to avoid the major problems of Merkle-Damgård
Keywords: PWLCM; hash function; chaotic maps; sponge construction; cloud computing; collision; password.
A Bayesian network correlation-based classifier chain algorithm for multilabel learning
by Hao Zhang, Kai-Biao Lin, Wei Weng, Juan Wen, Chin-Ling Chen
Abstract: In recent years, researchers have proposed many multilabel classification algorithms to solve the problem of multilabel classification. Among them, the classifier chain (CC) algorithm is widely studied because it fully considers the correlation between labels, the model size is linear with the number of labels, and the training steps can be executed in parallel. However, the chain label sequence of the CC algorithm is random; if the prediction results of the labels in front of the chain are not correct, the impact will spread throughout the rest of the chain, which will greatly affect the performance of the CC algorithm. To solve this problem, we propose a new multilabel learning method, the Bayesian network correlation-based CC (BNCC) algorithm, to decrease the uncertainty in the label order from the CC algorithm. It uses a neural network constructed in TensorFlow as the classifier of all labels and calculates the corresponding error function, which is used to eliminate the influence of the feature set on all labels. A directed acyclic graph (DAG) Bayesian network is constructed by using the error function to identify the correlations between the labels. The optimal correlation label is identified via topological sorting. Finally, the sorted sequence is used as the chain order of the CC. The experimental results demonstrate that the proposed method is superior to the unordered CC model and other multilabel learning algorithms on several benchmark datasets.
Keywords: multilabel learning; Bayesian network; classifier chain; label relationship.
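The final step described in the abstract above, turning a label DAG into a chain order by topological sorting, can be illustrated with a minimal sketch. This is a generic Kahn's-algorithm implementation, not the authors' code; the `parents` mapping and the example label names are hypothetical, and labels are assumed to be sortable (for example strings) so that ties resolve deterministically.

```python
from collections import deque


def chain_order(parents):
    """Return a label order in which every label follows its parents.

    parents maps each label to the labels it depends on in the DAG.
    Raises ValueError if the graph contains a cycle.
    """
    labels = set(parents)
    for deps in parents.values():
        labels.update(deps)
    indegree = {label: 0 for label in labels}
    children = {label: [] for label in labels}
    for label, deps in parents.items():
        for dep in deps:
            children[dep].append(label)
            indegree[label] += 1
    # Sort so the result is deterministic when several labels are ready.
    ready = deque(sorted(label for label in labels if indegree[label] == 0))
    order = []
    while ready:
        label = ready.popleft()
        order.append(label)
        for child in sorted(children[label]):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(labels):
        raise ValueError("label graph contains a cycle")
    return order
```

Under these assumptions, the returned list could serve as the order in which a classifier chain visits the labels, so that each label's classifier sees the predictions of the labels it depends on.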
A multi-objective computation offloading algorithm in MEC environments
by Li Liu, Xuemei Lei, Qian Wang
Abstract: Mobile Edge Computing (MEC) is able to provide cloud computing capabilities at network edges by offloading computation tasks to MEC servers deployed in the proximity of edge nodes. Therefore, how to make offloading decisions for mobile users has become a critical issue. In this paper, we propose a multi-objective computation offloading algorithm combining Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D) with Invasive Weed Optimisation (IWO) and Differential Evolution (DE). Considering that IWO is a numerical stochastic optimisation method imitating weeds' behaviour in nature and enjoys great robustness, we further improve its searching abilities. In order to reduce computing time, single-objective problems can be clustered into several groups in which only one problem is optimised by IWO and the others are optimised by DE. Experimental results show the competitive performance of our proposed algorithm for computation offloading in MEC environments.
Keywords: mobile edge computing; computation offloading; MOEA/D; invasive weed optimisation; offloading decision.
An improved motion estimation criterion for temporal coding of video
by Awanish Mishra, Narendra Kohli
Abstract: The size of video data is growing exponentially worldwide and hence there is a need for better video coding standards. There are many video coding standards given by MPEG and H.26X. The latest and most effective video coding standards are AVC, HEVC and AV1. MPEG and H.26X use block matching techniques for temporal coding, and these block matching techniques mostly use mean absolute difference (MAD) as the block matching criterion. MAD is very simple and has low implementation complexity, but it sometimes results in spurious selection of the matched block owing to different transformations in sequential images and the noise introduced in the frame. To overcome this problem, many matching criteria have been proposed, such as vector matching criteria (VMC), smooth constrained mean absolute error (SC-MAE) and scaled value criterion (SVC). The criteria defined so far do not consider the noise introduced in the frame and hence still produce incorrect selection or rejection of blocks. In this paper, a new motion estimation criterion is suggested and compared with four existing criteria in terms of PSNR, average number of evaluated search points per block, and average MAD per pixel. The MAD per pixel improves by nearly 70% for the proposed matching criterion.
Keywords: motion estimation; matching criterion; block matching; search window; source block.
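For orientation, the baseline MAD criterion that the abstract above sets out to improve can be sketched as an exhaustive block search. This illustrates only the conventional criterion, not the proposed one; the function names, the nested-list frame representation, and the full-search strategy are assumptions made for this example.

```python
def mad(block_a, block_b):
    """Mean absolute difference between two equally sized 2-D blocks."""
    total = 0
    count = 0
    for row_a, row_b in zip(block_a, block_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(current, reference, top, left, size, search_range):
    """Search the reference frame for the best match to a source block.

    The source block sits at (top, left) in the current frame. Returns
    (best_mad, (dy, dx)), where (dy, dx) is the motion vector.
    """
    source = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = (float("inf"), (0, 0))
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            score = mad(source, get_block(reference, y, x, size))
            if score < best[0]:
                best = (score, (dy, dx))
    return best
```

Because MAD compares raw pixel values, noise or a brightness change between frames can make a wrong candidate score lowest, which is the failure mode the abstract describes.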
Cyberbullying detection: an ensemble learning approach
by Pradeep Kumar Roy, Ashish Singh, Asis Kumar Tripathy, Tapan Kumar Das
Abstract: Online social networking platforms have become a common choice for people to communicate with friends, relatives, or business partners. These platforms allow users to share life achievements, successes, and much more. In parallel, they have also invited hidden issues such as web-spamming, cyberbullying, cybercrime, and others. This paper addresses the issue of cyberbullying using an ensemble machine learning model. The complete experiment works in two phases: firstly, k-nearest neighbour, logistic regression and decision tree classifiers are used to detect bullying posts. Secondly, the prediction outcomes of these classifiers are passed to a voting-based ensemble learning model for the final predictions. The experimental outcomes confirmed that the ensemble model detects bullying posts with good accuracy.
Keywords: cyberbullying; Twitter; social network; ensemble learning; classification.
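The second phase described above, combining the outputs of several classifiers by voting, can be sketched as a hard majority vote. The abstract does not say whether the authors use hard or soft voting, so hard voting is an assumption here, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per post.

    predictions is a list of lists, one inner list per classifier, all
    of equal length. Ties go to the label that appears first among the
    classifiers' votes for that post.
    """
    combined = []
    for votes in zip(*predictions):
        # Counter.most_common orders equal counts by first occurrence.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined
```

For example, if three classifiers label four posts as `[1, 0, 1, 0]`, `[1, 1, 0, 0]` and `[0, 1, 1, 0]`, the ensemble label for each post is the one at least two of them agree on.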
T-PdM: tripartite predictive maintenance framework using machine learning algorithms
by Ozlem Ece Yurek, Derya Birant, Alp Kut
Abstract: The purpose of this paper is to propose a new predictive maintenance (PdM) framework that has three aspects: (i) estimating the remaining useful life (RUL) of a machine, (ii) classifying machine health status (failure/non-failure), and (iii) discovering the relationship between the errors and component failures of machines by using machine learning (ML) techniques. This is the first PdM framework that integrates three ML paradigms (regression, classification, and association rule mining) in a single platform. It compares six different ML algorithms. The results indicate that the proposed framework can be successfully used to get valuable knowledge about machines and to build a consistent maintenance strategy to improve machine usage in the industry sector. The existing PdM studies usually use only one ML paradigm, which is insufficient for prediction. To overcome this limitation and improve prediction accuracy, a novel tripartite predictive maintenance framework (T-PdM) is proposed in this study.
Keywords: predictive maintenance; machine learning; classification; regression; association rule mining.
Automated network intrusion detection using multimodal networks
by Subhash Pingale, Sanjay R. Sutar
Abstract: Intrusion detection requires accurate and timely detection of any bad connection that intends to exploit network vulnerabilities. Previous approaches have focused on deriving statistical features based on domain knowledge, followed by primitive machine learning and ensemble techniques. Grouping all the parameters as a single input to a model may not always be effective. In this paper, we propose using multimodal networks for network intrusion detection. The input logs are segregated into multiple subgroups trained differently. Their intermediate representations are combined to produce the final prediction. This approach handles the strengths of individual features better than normalization. The system is evaluated on the NSL-KDD dataset and is compared with standard methods across multiple performance metrics. The proposed system achieves an accuracy of 83.5, the highest compared with other approaches. Channeling inputs for richer feature extraction is fast gaining traction, and we extend the same in cybersecurity.
Keywords: intrusion detection system; multimodal networks; NSL-KDD dataset; cybersecurity.
A hybrid of local and global atmospheric scattering models for depth prediction via the cross Bayesian model
by Qianjin Zhao, Haitao Zhang, Jianhua Cui, Yanguang Sun, Songsong Duan, Chenxing Xia
Abstract: Monocular depth estimation is a fascinating and challenging problem in virtual vision. However, the training of networks based on deep learning largely depends on the training data. This paper proposes a depth prediction method based on the depth cue: atmospheric light scattering, which can effectively predict the depth in different atmospheric light scenarios. But the assumption of global atmospheric light constancy produces unavoidable error. Especially for complex scenes, the complex reflected light of the scene leads to uneven distribution of atmospheric light. This paper proposes a new local atmospheric light estimation method, which can more effectively simulate the real distribution of atmospheric light scattering in the air. The experiment found that the two models are complementary. In order to fuse the intrinsic real information of the two models, this paper adopts the fusion strategy based on the cross Bayesian model, and edge preserving filtering is used to preserve the detailed information.
Keywords: depth estimation; global atmospheric light; local atmospheric light; cross Bayesian model.
Research on online emotion of COVID-19 based on text sentiment analysis
by Zhenyu Gu, Yao Lin, Yonghui Dai, Chenxiao Niu
Abstract: The growth of internet users and the convenience of internet communication provide a foundation for the formation of internet emotions. As the internet and real-life interactions become closer, the influence of internet emotions on society is increasing. Therefore, taking the spread of COVID-19 in Xinjiang in 2020 as an example, 43,111 related micro-blog texts were collected. After a series of operations, such as Chinese word segmentation, POS tagging, data cleaning, text representation, feature extraction and so on, thematic extraction and text sentiment analysis were carried out to get people's comment themes, emotional tendencies and COVID-19's network emotional situation. The results show that the public will have a better understanding of the cause of COVID-19 disease and its infectiousness, preventive measures and cure as time goes on. The research of this paper can help the relevant government departments to perceive and guide the network emotional situation.
Keywords: COVID-19; text sentiment analysis; web sentiment; feature extraction; topic mining.
A named entity recognition method towards product reviews based on BiLSTM-Attention-CRF
by Shunxiang Zhang, Haiyang Zhu, Hanqing Xu, Guangli Zhu
Abstract: Named Entity Recognition (NER) towards product reviews is intended to identify domain-dependent named entities (e.g., organisation name, product name, etc.) from product reviews. Owing to the fragmented and unstructured nature of product reviews, traditional methods struggle to capture the domain feature information and dependencies precisely. To solve the problem, we propose an NER method towards product reviews based on BiLSTM-Attention-CRF. Firstly, three kinds of feature (characters, words and parts of speech) are integrated into the feature representation of texts. The final feature vector is obtained through training, mapping and linking the selected features. Then, the BiLSTM network is built to extract text features, and the Attention Mechanism is adopted to strengthen the capture of local features. Finally, CRF is applied to annotate and identify the entity. Compared with existing models, it is demonstrated that the proposed method can effectively recognise named entities from product reviews.
Keywords: NER; product reviews; BiLSTM; Attention Mechanism; CRF.
Efficient and non-interactive ciphertext range query based on differential privacy
by Peirou Feng, Qitian Sheng, Jianfeng Wang
Abstract: Differential private range query schemes satisfy differential privacy by adding or deleting records during the process of creating the index, which suffers from the weakness of data loss in query results owing to the negative noise. Recently, Sahin et al. proposed a differential private index with overflow arrays in ICDE 2018, which ensures the integrity of query results. However, this scheme suffers from two drawbacks: (i) some private information (e.g., query requests or frequency) may be leaked because of querying over plaintext index; (ii) the overflow arrays bring extra storage overhead. To this end, we present a non-interactive ciphertext range query based on differential privacy and comparable encryption. Our scheme can protect the query privacy since the query is performed over ciphertext based on comparable encryption. The experiment results show that our proposed scheme can save the storage overhead.
Keywords: range query; differential privacy; short comparable encryption.
Joint training with the edge detection network for salient object detection
by Zongyun Gu, Junling Kan, Chun Ma, Wang Qing, Fangfang Li
Abstract: The U-shaped network has great advantages in object detection tasks. However, most of the previous salient object detection studies suffered from inaccurate predictions affected by unclear object boundaries. Considering the complementarity of the information between salient object and salient edge, we designed a new kind of network to effectively perform the joint training with edge detection tasks in three steps. Firstly, we added a prediction branch on the bottom-up pathway for capturing the edge of salient objects. Secondly, salient object features, global context, integrated low-level details, and high-level semantic information are extracted by the method of progressive fusion. Finally, the feature of the salient edge is concatenated with that of the salient object on the last layer in the top-down pathway. Since the salient edge feature contains much information about edge and location, the feature fusion can locate salient objects more accurately. The results of experiments on five benchmark datasets demonstrate that the proposed approach achieves competitive performance.
Keywords: deep learning; salient object detection; U-shape architecture; edge detection; feature pyramid network.
Application of a deep learning approach for recognition of voiced Odia digits
by Prithviraj Mohanty, Jyoti Prakash Sahoo, Ajit Kumar Nayak
Abstract: Automatic speech recognition in a regional language such as Odia is a challenging field of research. Voiced Odia digit recognition helps in designing automatic voice dialler systems. In this study, a deep learning approach is used for the recognition of voiced Odia digits. The spectrogram representation of voiced samples is given as the input to the deep learning models after considering the feature extraction using MFCC. Various performance metrics are obtained by considering several experiments with different epoch sizes and variation in the dataset, using the train-validate-test ratio. Experimental outcomes reveal that the CNN model provides improved accuracy of 91.72% in epoch size of 500 with a split ratio of 80-10-10 as compared with the other two models that use VSL and DNN. The reported outcome reveals that the proposed CNN model has better average recognition accuracy than contemporary models such as HMM and SVM.
Keywords: ASR; CNN; DNN; MFCC; HMM; SVM; spectrogram.
Service recommendation through graph attention network in heterogeneous information networks
by Fenfang Xie, Yangjun Xu, Angyu Zheng, Liang Chen, Zibin Zheng
Abstract: Recommending suitable services to users autonomously has become the key to solve the problem of service information overload. Existing recommendation algorithms have some limitations, either discarding the side information of the node, or ignoring the information of the intermediate node, or omitting the feature information of the neighbour nodes, or not modelling the pairwise attentive interaction between users and services. To solve the above-mentioned limitations, this paper proposes a service recommendation approach by leveraging the graph attention network (GAT) and co-attention mechanism in heterogeneous information networks (HINs). Specifically, different types of meta-path are first constructed, and a feature expression is learned for each node in HINs. Then, the feature information of mashups/services are aggregated by the co-attention mechanism. Finally, the multi-layer perceptron (MLP) is applied to recommend suitable services for users. Experiments on a real-world dataset illustrate that the proposed method outperforms other state-of-the-art comparison methods.
Keywords: service recommendation; graph attention network; co-attention mechanism; heterogeneous information network.
Machine learning-based land usage identification using Haralick texture features of aerial images with Kekre's LUV colour space
by Sudeep Thepade, Shalakha Bang, Rik Das, Zahid Akhtar
Abstract: The study of gathering useful insights about our planet Earth, its natural, man-made, physical, and biological structures, is quite engrossing. Earth observation, despite being intuitive, also helps in mitigating the adverse impacts of human civilisation on our mother Earth. Multiple techniques that help in observing the Earth's surface include Earth Surveying Techniques, Remote Sensing technology, etc. The properties which are measured using Remote-sensing technology stimulate the study of Land Usage Identification, which refers to the purpose the land is used for. The rapid increase in population and immense growth in infrastructure and technology have led to massive urbanisation, posing a great number of challenges. The knowledge of Land Use Identification will help in developing strategies to drive off issues related to the depletion of forest areas, urban encroachment, monitoring of natural disasters, etc. This paper attempts to give a more robust approach towards Land Usage Identification that extracts Haralick texture features from input aerial images of the earth by considering their representation in two different colour spaces, namely RGB and Kekre-LUV. Comparing the results obtained by using different machine learning classification algorithms, it is found that an ensemble of simple logistic and random forest classifiers outputs maximum classification accuracy.
Keywords: grey level co-occurrence matrix; random forest; simple logistic regression; land usage identification; remote sensing.
Constrained-based power management algorithm for green cloud computing
by Sanjib Kumar Nayak, Sanjaya Kumar Panda, Satyabrata Das
Abstract: In green cloud computing (GCC), power management provides many advantages, such as reducing costs, saving the environment and improving system efficiency. It is adopted in various facilities, like datacenters, which are backed by non-renewable energy (NRE) sources. These sources are not only costly, but also drastically impact the environment. This paper introduces a constrained-based power management algorithm, which considers four NRE and RE power supplies of the datacenters, grid, photovoltaics (PV), wind and battery, to fulfill the cumulative load power demand of submitted user requests (URs). The URs are fulfilled in the order of PV, wind and battery, and grid, respectively. The simulation is carried out by taking NRE sources, RE sources and both, and ten instances. The simulation results are compared using overall cost, UR assigned to NRE and UR assigned to RE to show the performance in three scenarios of the proposed algorithm.
Keywords: green cloud computing; power management; non-renewable energy; renewable energy; fossil fuel; solar energy; wind energy; load balancing.
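The priority order stated in the abstract above (PV, then wind, then battery, then grid) can be illustrated with a simple greedy allocation. This is only a sketch of the ordering idea, not the authors' constrained algorithm; the function name, the dictionary layout, and the use of kilowatts as the unit are assumptions.

```python
def allocate_power(demand_kw, available_kw, order=("pv", "wind", "battery", "grid")):
    """Greedily cover a load demand from power sources in priority order.

    Returns (allocation, unmet), where allocation maps each source to the
    power drawn from it and unmet is any demand left uncovered.
    """
    allocation = {}
    remaining = demand_kw
    for source in order:
        # Draw as much as possible from this source before moving on.
        drawn = min(remaining, available_kw.get(source, 0))
        allocation[source] = drawn
        remaining -= drawn
    return allocation, remaining
```

With a demand of 100 kW and 40 kW of PV, 30 kW of wind and 20 kW of battery available, the renewable sources are exhausted first and only the last 10 kW is drawn from the grid.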
Optimisations of four imputation frameworks for performance exploring based on decision tree algorithms in big data analysis problems
by Jale Bektas, Turgay Ibrikci
Abstract: The phenomenon of how to treat missing values is a problem confronted in big data analysis. Therefore, various applications have been developed on imputation strategies. This study focused on four imputation frameworks proposing novel perspectives based on expectation-maximization (EM), self-organising map (SOM), K-means, and multilayer perceptron (MLP). Initially, several transformation steps such as normalized, standardised, interquartile range, and wavelet were applied. Then, imputed datasets were analysed using decision tree algorithms (DTAs) by optimising their parameters. These analyses showed that DTAs had not been strikingly affected by any data transformation techniques except interquartile range. Even though the dataset contains a missing value ratio of 33.73%, the EM imputation framework provided a performance increase of 0.42% to 3.14%. DTAs based on C4.5 and NBTree algorithms have been more stable in analysing all big imputed datasets. Furthermore, realistic performance measurement of any preprocessing experiment with a classification algorithm based on C4.5 can be proposed to avoid time complexity.
Keywords: preprocessing; data mining; multiple imputation; decision tree classifier; machine learning; big data analytics.
A new approach based on generalised multiquadric and compactly supported radial basis functions for solving two-dimensional Volterra-Fredholm integral equations
by Dalila Takouk
Abstract: This article describes a numerical scheme to solve two-dimensional nonlinear Volterra-Fredholm integral equations (IEs). The method estimates the solution by compactly supported radial basis functions and compares it with the approximation of the solution by the generalised multiquadric radial basis function with the optimal strategy for the exponent. Integrals appearing in the procedure of the solution are approximated using shifted Legendre-Gauss-Lobatto nodes and weights. The method is mathematically simple and truly meshless. It can be used for high-dimensional problems because it does not require any cell structures. Finally, numerical experiments are given to show and test the applicability of the presented approach and confirm the theoretical analysis.
Keywords: Volterra-Fredholm integral equations; two-dimensional integral equations; generalized multiquadric radial basis functions; compactly supported radial basis functions; interpolation method; shifted Legendre-Gauss-Lobatto nodes and weights.
KH-FC: krill herd-based fractional calculus algorithm for text document clustering using MapReduce structure
by Priyanka Shivaprasad More, Dr. Baljit Singh Saini
Abstract: In this paper, Krill Herd-based Fractional Calculus (KH-FC) using the MapReduce framework is proposed for effective text document clustering. Here, stop word removal and stemming are applied in the pre-processing step, which helps to remove redundant information and reduce the size of the data, and this in turn enhances the clustering accuracy. Furthermore, Term Frequency (TF) and Inverse Document Frequency (IDF) are employed for extracting significant features. Finally, the developed KH-FC model is used for clustering the text documents. The KH-FC algorithm is obtained by incorporating the FC concept into the KH technique. In this method, pre-processing and feature extraction are performed in the mapper phase, whereas the clustering process is executed in the reducer phase. The performance of the developed approach is evaluated based on performance metrics such as accuracy, Jaccard coefficient, and F-measure. The developed KH-FC approach achieved accuracy, Jaccard coefficient, and F-measure values of 0.983, 0.936 and 0.967, respectively.
Keywords: text document clustering; fractional calculus; krill herd algorithm; term frequency–inverse document frequency; Jaccard similarity.
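The TF-IDF feature extraction mentioned in the abstract above can be illustrated with a short sketch. The abstract does not state which weighting variant is used, so the length-normalised term frequency and the `log(N / df)` inverse document frequency below are assumptions, as is the `tf_idf` function name.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF weights for a list of tokenised documents.

    TF is the term count divided by the document length, and IDF is
    log(N / df), where df is the number of documents containing the term.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # Count each term once per document.
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

Under this variant a term that occurs in every document receives weight zero, so only terms that help to distinguish documents contribute to the clustering features.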
A comprehensive understanding of popular machine translation evaluation metrics
by Md. Adnanul Islam, Md. Saddam Hossain Mukta
Abstract: Machine translation is one of the pioneer applications of natural language processing and artificial intelligence. Automatic evaluation of the translation performance of machine translators is one of the most challenging tasks, as manual evaluation of large volumes of document translations is infeasible in practice. Thus, to facilitate the evaluation of translation performance automatically, several metrics have been introduced and used widely. Although these translation performance evaluation metrics cannot match the efficiency level of human evaluation, they are popularly employed in automatic evaluation of translation quality of texts across multifarious application domains. This article discusses three such widely used evaluation metrics, BLEU, METEOR, and TER, with relevant details by demonstrating step-by-step calculations. The main novelty of this article lies in the consideration of several example translations to present and clarify the calculation process of these three popular evaluation metrics for measuring the performance or quality of machine translation. Moreover, the article presents a comparative analysis among these three metrics using two different datasets to reveal their similarities and distinctions in terms of behaviour.
Keywords: evaluation metrics; translation performance; BLEU; METEOR; TER; machine translation.
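As an illustration of the kind of step-by-step calculation the abstract above refers to, the following is a minimal sketch of sentence-level BLEU against a single reference. It uses clipped n-gram precision, a geometric mean over n = 1 to 4, and the brevity penalty, without smoothing. The function names are hypothetical, and the article itself may present the calculation differently.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity = 1.0 if c > r else math.exp(1 - r / c)
    return brevity * math.exp(log_mean)
```

Clipping is what stops a degenerate candidate such as "the the the the the the the" from scoring well: against the reference "the cat is on the mat" its unigram precision is 2/7 rather than 7/7.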
Intelligent recommendation of personalised tourist routes based on improved discrete particle swarm
by Jie Luo, Xilian Duan
Abstract: In order to overcome the problems of low accuracy and long running time in traditional personalised travel route recommendation methods, this paper proposes an intelligent recommendation method for personalised tourist routes based on an improved discrete particle swarm. The method analyses the key problems of tourism recommendation according to personalised tourism characteristics, collects information on tourists' interests, and establishes a model of tourists' interests. On this basis, the discrete particle swarm optimisation algorithm is improved and used to select the personalised travel route, and the selection results are recommended to the passengers, so as to realise intelligent recommendation of personalised travel routes. The experimental results show that the recommendation accuracy of this method is between 82.5% and 96.9%, and the recommendation time is always less than 0.5 s, which can realise the accurate and rapid recommendation of personalised tourist routes.
Keywords: discrete particle swarm; personalised travel route; intelligent recommendation; passenger interest.
Web API service recommendation for mashup creation
by Gejing Xu, Sixian Lian, Mingdong Tang
Abstract: Mashup refers to a sort of Web application developed by reusing or combining Web API services, which are very popular software components for building various applications. As the number of open Web APIs increases, however, finding suitable Web APIs for mashup creation becomes a challenging issue. To address this issue, a number of Web API service recommendation methods have been proposed. Content-based methods rely on the description of the service candidates and the user's request to make recommendations. Collaborative filtering-based methods use the invocation records of a set of services generated by a set of users to make recommendations. There are also some studies employing both the description and invocation records of services to make recommendations. In this paper, we survey the state-of-the-art Web API service recommendation methods, and discuss their characteristics and differences. We also present some possible future research directions.
Keywords: web service; recommendation; collaborative filtering; mashup creation.
A novel dual-fusion algorithm of single image dehazing based on anisotropic diffusion and Gaussian filter
by Kaihan Xiao, Qingshan Tang, Si Liu, Sijie Li, Jiayi Huang, Tao Huang
Abstract: Dark channel prior (DCP) is a widely used method in single image dehazing technology. Here, we propose a novel dual-fusion algorithm of single image dehazing based on anisotropic diffusion and Gaussian filter to suppress the halo effect or colour distortion in traditional DCP algorithms. Anisotropic diffusion is used for edge-preserving image smoothing and a Gaussian filter is used to smooth the local white objects. A dual-fusion strategy is conducted to optimise the atmospheric veil. In addition, the fast explicit diffusion (FED) scheme is used to accelerate the numerical solution of the anisotropic diffusion to reduce time consumption. The subjective and objective evaluation of the experiment shows that the proposed algorithm can effectively suppress the halo effect and colour distortion, and has good dehazing performance on evaluation metrics. The proposed algorithm also reduces the time consumption by 54.2% compared with DCP with guided filter. This study provides an effective solution for single image dehazing.
Keywords: image dehazing; dark channel prior; anisotropic diffusion; fast explicit diffusion; image fusion.
Robust pedestrian detection using scale and illumination invariant mask R-CNN
by Ujwalla Gawande, Kamal Hajari, Yogesh Golhar
Abstract: In this paper, we address the challenge of detecting pedestrians under variations in scale and image illumination. Pedestrians captured under such variations exhibit diverse features, which strongly affects the performance of recent pedestrian detection methods. To overcome these challenges, we propose a new robust approach: a Scale and Illumination invariant Mask R-CNN (SII Mask-RCNN) framework. The first phase of the proposed framework handles illumination variations by performing a logarithmic transformation and adaptive illumination enhancement. In addition, the non-subsampled contourlet transform is used to decompose the image into multi-resolution components. Finally, we convolve the image with multi-scale masks to find corresponding points that are illumination and scale-invariant. Extensive evaluations on pedestrian benchmark databases illustrate the effectiveness and robustness of the proposed framework. The experimental results show notable performance improvements in pedestrian detection compared with the state-of-the-art approaches.
Keywords: deep learning; pedestrian detection; computer vision; neural network; CNN.
Special Issue on: CCPI'20 Smart Cloud Applications, Services and Technologies
A big data and cloud computing model architecture for a multi-class travel demand estimation through traffic measures: a real case application in Italy
by Armando Cartenì, Ilaria Henke, Assunta Errico, Marida Di Bartolomeo
Abstract: Big data and cloud computing offer an extraordinary opportunity to implement multipurpose smart applications for the management and control of transport systems. The aim of this paper is to propose a big data and cloud computing model architecture for multi-class origin-destination demand estimation, based on a bi-level transport algorithm that uses traffic counts on a congested network, and to propose sustainable policies at urban scale. The proposed methodology has been applied to a real case study of travel demand estimation within the city of Naples (Italy), also aiming to verify the effectiveness of a sustainable policy in terms of reducing traffic congestion by about 20% through en-route travel information. The obtained results, although preliminary, suggest the usefulness of the proposed methodology in terms of its ability to estimate traffic demand in real time or over pre-fixed time periods.
Keywords: cloud computing; big data; virtualisation; smart city; internet of things; transportation planning; demand estimation; sustainable mobility; simulation model.
A methodology for introducing an energy-efficient component within the rail infrastructure access charges in Italy
by Marilisa Botte, Ilaria Tufano, Luca D'Acierno
Abstract: After the separation of rail infrastructure managers from rail service operators within the European Union in 1991, the necessity arose of defining an access charge framework to ensure non-discriminatory access to the rail market. Basically, it has to guarantee an economic balance for infrastructure manager accounts. Currently, in the Italian context, access charge schemes neglect the actual energy consumption of rail operators and the related traction energy costs for infrastructure managers. Therefore, we propose a methodology, integrating cloud-based tasks and simulation tools, for including such an aspect within the infrastructure toll, thus making the system more sustainable. Finally, to show the feasibility of the proposed approach, it has been applied to a real Italian rail context, i.e. the Rome-Naples high-speed railway line. Results have shown that customising the toll access charges, by considering the power supply required, may generate a virtuous loop with an increase in the energy efficiency of rail systems.
Keywords: cloud-based applications; rail infrastructure access charges; environmental component; energy-saving policies.
Edge analytics on resource-constrained devices
by Sean Savitz, Charith Perera, Omer Rana
Abstract: Video and image cameras have become an important type of sensor within the Internet of Things (IoT) sensing ecosystem. Camera sensors can measure our environment at high precision, providing the basis for detecting more complex phenomena than other sensors, e.g. temperature or humidity sensors. This comes at a high computational cost in CPU, memory and storage resources, and requires consideration of various deployment constraints, such as lighting and height of camera placement. Using benchmarks, this work evaluates object classification on resource-constrained devices, focusing on video feeds from IoT cameras. The models that have been used in this research include MobileNetV1, MobileNetV2 and Faster R-CNN, which can be combined with regression models for precise object localisation. We compare the models by their accuracy in classifying objects and the demand they impose on the computational resources of a Raspberry Pi.
Keywords: internet of things; edge computing; edge analytics; resource-constrained devices; camera sensing; deep learning; object detection.
Traffic control strategies based on internet of vehicles architectures for smart traffic management: centralised vs decentralised approach
by Houda Oulha, Roberta Di Pace, Rachid Ouafi, Stefano De Luca
Abstract: In order to reduce traffic congestion, real-time traffic control is one of the most widely adopted strategies. However, the effectiveness of this approach is constrained not only by the adopted framework but also by data. Indeed, the computational complexity may significantly affect this kind of application, thus the trade-off between the effectiveness and the efficiency must be analysed. In this context, the most appropriate traffic control strategy to be adopted must be accurately evaluated. In general, there are three main control approaches in the literature: centralised control, decentralised control and distributed control, which is an intermediate approach. In this paper, the effectiveness of a centralised and a decentralised approach is compared and applied to two network layouts. The results, evaluated not only in terms of performance index with reference to the network total delay but also in terms of emissions and fuel consumption, highlight that the considered centralised approach outperforms the adopted decentralised one and this is particularly evident in the case of more complex layouts.
Keywords: cloud computing; internet of vehicles; transportation; centralised control; decentralised control; emissions; fuel consumption.
ACSmI: a solution to address the challenges of cloud services federation and monitoring towards the cloud continuum
by Juncal Alonso, Maider Huarte, Leire Orue-Echevarria
Abstract: The evolution of cloud computing has changed the way in which cloud service providers offer their services and how cloud customers consume them, moving towards the usage of multiple cloud services, in what is called multi-cloud. Multi-cloud is gaining interest with the expansion of IoT, edge computing and the cloud continuum, where distributed cloud federation models are necessary for effective application deployment and operation. This work presents ACSmI (Advanced Cloud Service Meta-Intermediator), a solution that implements a cloud federation, supporting the seamless brokerage of cloud services. Technical details addressing the discovered shortcomings are presented, including a proof of concept built on JHipster, Java, InfluxDB, Telegraf and Grafana. ACSmI contributes to relevant elements of the European Gaia-X initiative, specifically to the federated catalogue, continuous monitoring, and certification of services. The experiments show that the proposed solution effectively saves up to 75% of the DevOps team's effort to discover, contract and monitor cloud services.
Keywords: cloud service broker; cloud services federation; cloud services brokerage; cloud services intermediation; hybrid cloud; cloud service monitoring; multi-cloud; DevOps; cloud service level agreement; cloud service discovery; multi-cloud service management; cloud continuum.
User perception and economic analysis of an e-mobility service: development of an electric bus service in Naples, Italy
by Ilaria Henke, Assunta Errico, Luigi Di Francesco
Abstract: Among the sustainable mobility policies, electric mobility seems to be one of the best choices to reach sustainable goals, but it has limits that could be partially overcome in local public transport. This research presents a methodology to design a new sustainable public transport service that meets users' needs while analysing its economic feasibility. This methodology is then applied to a real case study: renewing an 'old' bus fleet with an electric one charged by a photovoltaic system in the city of Naples (Southern Italy). Its effects on users' mobility choices were assessed through a mobility survey. The bus line and the photovoltaic system were designed. Finally, the economic feasibility of the project was assessed through a cost-benefit analysis. This research is placed in the field of smart mobility and new technologies that increasingly need to store, manage, and process large amounts of data typical of cloud computing and big data applications.
Keywords: e-mobility; electric bus services; cloud computing; user perception; economic analysis; cost-benefit analysis; photovoltaic system; sustainable mobility policies; sustainable goals; new technologies; local emissions; environmental impacts.
Special Issue on: Intelligent Self-Learning Algorithms with Deep Embedded Clustering
Application of virtual numerical control technology in cam processing
by Linjun Li
Abstract: Numerical control (NC) machining is an important processing method in the machinery manufacturing industry. In most cases, as the final processing procedure, NC machining directly determines the quality of the finished product. Cams are key components in many industries, such as automobiles, internal combustion engines and national defence, and the precision and efficiency of cam processing have a direct impact on the quality, life and energy-saving standard of the engine and related products. This paper takes cam NC grinding as the research object and the optimisation and intelligentisation of processing technology as the goal, and uses virtual NC technology to develop an intelligent process optimisation and NC machining software platform specifically for cam NC grinding. The software platform has a machine tool library, grinding wheel library, material library, coolant reservoir library, process accessory library and other basic technology libraries, as well as a process example library, meta-knowledge rule library, forecast model library and other process intelligence libraries. With the support of the database, the software platform can realise intelligent optimisation of the cam grinding process plan and automatic NC machining programming. Because the software platform involves many research topics, this paper mainly focuses on modelling the motion of the NC machining process system, the architecture of the intelligent platform software for cam NC machining, and the virtual NC machining simulation of the process system.
Keywords: cam grinding; numerical control grinding; intelligent platform software; process problems; virtual grinding.
Research on oil painting effect based on image edge numerical analysis
by Yansong Zhang
Abstract: With the continuous development of non-photorealistic rendering technology, oil painting effects are increasingly applied to images. The traditional oil painting effect does not adequately meet people's needs. Therefore, this paper studies the oil painting effect based on image edge numerical analysis, and constructs a corresponding algorithm for image edge numerical analysis and detection. A comparison experiment with the results of a traditional oil painting algorithm shows that the proposed algorithm can accurately analyse and detect image edges, and that the final rendering is more natural and smoother than the traditional oil painting effect.
Keywords: image; edge value; analysis; oil painting effect.
Research on multimedia and interactive teaching model of college English
by Zhang Juan
Abstract: Since current higher education focuses on cultivating comprehensive practical ability rather than simply inculcating theoretical ideas, English teaching should be adapted in terms of teaching purpose, teaching content and teaching strategy. A multiple interactive English teaching model is constructed to improve the information of a constructed method. Spatial reconstruction is used to extract and retrieve the information of multiple teaching resources, to optimise and control the allocation of resources under the condition of load balance, and to construct the data-mining model of college English teaching resources in the information technology environment. The results of information processing are used to maximise the enthusiasm and creativity of teachers and students, to continue the development of multimedia network resources, and to create a multiple interactive teaching environment that serves as a platform for students.
Keywords: information technology environment; college English; multiple interactive teaching mode.
Design and application of system platform in piano teaching based on feature comparison
by Tingting Rao
Abstract: Traditional piano teaching is managed mainly by hand, which leads to low management efficiency, management confusion and other problems that seriously restrict the development of piano teaching activities. In order to make up for the limitations of piano music teaching materials and the shortage of music teachers in some areas, automatic computer scoring is introduced into music learning, and a piano music singing system based on feature comparison is developed. The difference between this scoring system and the existing commercial music scoring systems on the internet lies in its educational orientation, which is mainly reflected in the design and implementation of the feedback evaluation module. The system uses melody feature extraction, similarity comparison and pitch data analysis to score singing automatically, locate the position of errors, estimate their causes and give learners detailed feedback and guidance. The application case study shows that the system has practical application value.
Keywords: piano music teaching material; similarity comparison; learning feedback.
Special Issue on: SIRS'20 Intelligent Recognition Techniques and Applications
A deep learning approach for detecting the behaviour of people having personality disorders towards Covid-19 from Twitter
by Mourad Ellouze, Seifeddine Mechti, Moez Krichen, Vinayakumar Ravi, Lamia Hadrich Belguith
Abstract: This paper proposes an architecture taking advantage of artificial intelligence and text mining techniques in order to: (i) detect paranoid people by classifying their set of tweets into two classes (paranoid/not-paranoid), and (ii) ensure the surveillance of these people by classifying their tweets about Covid-19 into two classes (person with normal behaviour/person with inappropriate behaviour). These objectives are achieved using an approach that takes advantage of different information related to the textual part, user and tweets for the feature selection task and a deep neural network for the classification task. We obtained an F-score of 70% for the detection of paranoid people and 73% for the detection of the behaviour of these people towards Covid-19. The obtained results are encouraging and motivate further improvement, given the interest and importance of this research direction.
Keywords: Covid-19; personality disorder; text mining; natural language processing; deep learning; Twitter.
A content-based image retrieval scheme with object detection and quantised colour histogram
by Yuvaraj Tankala, Joseph K. Paul, Manikandan V M
Abstract: Content-based image retrieval (CBIR) is an active area of research due to its wide applications. Most of the existing CBIR schemes focus on searching for images based on the texture, colour, or shape features extracted from the query image. In this manuscript, we propose an object detection based CBIR scheme with quantised colour histograms. In the proposed scheme, the meaningful objects are identified from the query image by using you only look once (YOLO) object detection techniques, and quantised histograms are computed for each of the object categories. The object lists, their count, and the area covered by the objects, along with the quantised colour histograms, are used during feature matching to retrieve the related images from the large image pool. The proposed scheme is evaluated on the Corel 1K and Caltech image datasets. We observed an average precision of 0.96 during the experimental study, which is quite high compared with the precision of well-known existing schemes.
Keywords: content-based image retrieval; object detection; colour histogram; you only look once; feature extraction.
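To make the idea of a quantised colour histogram concrete, the following is a minimal sketch under stated assumptions; it is not the scheme from the paper. It assumes 8-bit RGB pixels, a uniform quantisation of each channel into `levels` bins, and histogram intersection as the similarity measure. The names `quantised_histogram` and `histogram_intersection` are hypothetical, and object detection (YOLO) is deliberately left out.

```python
def quantised_histogram(pixels, levels=4):
    """Normalised histogram over levels**3 colour bins for 8-bit RGB pixels."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    hist = [0] * (levels ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..levels-1.
        rq, gq, bq = r * levels // 256, g * levels // 256, b * levels // 256
        hist[(rq * levels + gq) * levels + bq] += 1
    total = len(pixels)
    return [count / total for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


if __name__ == "__main__":
    # Made-up pixel lists standing in for two detected object regions.
    region_a = [(0, 0, 0), (255, 255, 255), (10, 10, 10), (200, 0, 0)]
    region_b = [(0, 0, 0)]
    ha = quantised_histogram(region_a, levels=2)
    hb = quantised_histogram(region_b, levels=2)
    print(ha)
    print(histogram_intersection(ha, hb))
```

Coarser quantisation (fewer levels) gives shorter feature vectors that are cheaper to match, at the cost of merging visually distinct colours into the same bin.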
Prediction of heart disease using hybrid optimisation techniques in data clustering
by N. Gomathi, Amolkumar N. Jadhav, Mukund B. Wagh
Abstract: Disease diagnosis in the medical field enables better medical service to patients and also leads to a decrease in their mortality rate. The prediction of the survival rate of patients depends on the accurate diagnosis of diseases, which remains a major challenge for physicians and the medical domain. Several researchers have conducted experiments on the prediction and classification of heart diseases, but their methods are ineffective in providing accurate results. In this research, the performance of clustering algorithms on a real-world heart dataset is analysed using the developed clustering methods. The three developed methods, namely kernel-based exponential grey wolf optimisation, enhanced kernel-based exponential grey wolf optimisation, and the whale grey clustering algorithm, obtained better performance and provided accurate results for the diagnosis of diseases. The performance analysis considers evaluation metrics such as the Jaccard coefficient, F-measure, MSE, and Rand coefficient.
Keywords: data clustering; disease diagnosis; medical image processing; grey wolf optimisation; kernel-based grey wolf optimiser.
Offline Arabic handwritten character recognition: from conventional machine learning system to deep learning approaches
by Soumia Faouci, Djamel Gaceb
Abstract: Researchers have made great strides in the area of Arabic handwritten character recognition in the last decades, especially with the fast development of deep learning algorithms. The characteristics of Arabic manuscript text pose several problems for a recognition system. This paper presents a conventional machine learning system based on the extraction of a set of preselected features and an SVM classifier. In the second part, a simplified Convolutional Neural Network (CNN) model is proposed, which is compared with six other CNN models based on the pre-trained architectures. The suggested methods were tested using three databases: two versions of the OIHACDB dataset and the AIA9K dataset. The experimental results show that the proposed CNN model obtained promising results, as it is able to recognise 94.7%, 98.3%, and 95.6% of the test set of the three databases OIHACDB-28, OIHACDB-40, and AIA9K, respectively.
Keywords: deep learning; convolutional neural network; Arabic handwritten character recognition; machine learning; support vector machine; transfer learning; features extractor; fine tuning.
Understanding the nonlinear dynamics of seizure and sleep EEG patterns generated using hierarchical chaotic neuronal network
by Sunitha Ramachandran, A. Sreedevi
Abstract: The purpose of this article is to describe how a chaotic biological neural network based on a mammalian olfactory system can be used to generate EEG patterns during seizures, REM and NREM sleep. The parameters governing the connection between each node at each layer of an olfactory system's K3 topology have been tuned to replicate low and high dimensional activities as well as periodic bursts corresponding to distinct brain states. The chaotic qualities of the simulated time series are evaluated against real recordings of EEG patterns generated during distinct brain states by computing the Hurst exponent, fractal dimension, and detrended fluctuation analysis. Our findings contribute to a better understanding of the complex cognitive tasks involved in various functional stages of the brain, as well as to the modelling of these activities using a biologically plausible hierarchical network of neurons.
Keywords: mammalian olfactory system; chaotic biological neuronal network; EEG; epilepsy; REM; NREM; power spectrum; fractal dimension; Hurst exponent; detrended fluctuation analysis.
Special Issue on: Computational Intelligence in Data Science
Topologisation of the situation geographical image in the aspect of control of local transport and economic activity
by Sergei Bidenko, Sergei Chernyi, Yuri Nikolashin, Evgeniy Borodin, Denis Milyakov
Abstract: The specific features of cartographic images are considered from the point of view of procedures for assessing the situation in the area of maritime transport activity and spatial planning. The tasks of spatial analysis that require a transition from cartographic to topological mapping of geographic reality are highlighted. The existing anamorphic techniques, their classification, and their advantages and disadvantages are considered. Models for constructing an anamorphosis of the terrain for topologising the geoimage of the real situation have been developed. An algorithm using an affine transformation, based on the distortion of the boundaries of the area relative to the centre of mass of the region, is proposed. A comparison of the proposed algorithm with the applicable Gastner-Newman algorithm is given.
Keywords: maritime territorial activity; territorial situation; analysis and assessment of the situation; base map; geospace; geoobject; anamorphosing; cartoid; anamorphosis.
Palm-print recognition based on quality estimation and feature dimension
by Poonam Poonia, Pawan K. Ajmera
Abstract: The use of biometric traits for human identification has become increasingly popular in recent years. Among the widely used biometric traits, palm-print is an important one because of its acquisition convenience and comparatively high recognition results. The paper proposes a palm-print recognition system based on quality estimation and feature dimensions. Initially, a quality assessment is applied to the extracted region of interest (ROI) images. A Gabor filter is employed to extract palm-print features at various scales and orientations. Kernel-based dimensionality reduction is applied in the full space, which reduces the high dimensional Gabor features. The experiments are conducted on the PolyU, IIT-Delhi and CASIA palm-print databases. The best recognition performance, an Equal Error Rate (EER) of 0.051% and a Recognition Rate (RR) of 98.34%, was achieved on the PolyU database. Experimental results demonstrate the effectiveness of the proposed approach.
Keywords: palm-print; pre-processing; quality control; dimensionality reduction; feature extraction.
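For readers unfamiliar with the Equal Error Rate reported above, the sketch below shows one simple way to approximate it from matching scores. It assumes similarity scores where higher means "more likely the same palm", sweeps every observed score as a threshold, and returns the mean of the false acceptance and false rejection rates at the threshold where the two are closest. The function name `equal_error_rate` and the sample scores are hypothetical and unrelated to the paper's data.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER from similarity scores (higher = more likely genuine)."""
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        # False acceptance: impostor scores at or above the threshold.
        far = sum(score >= threshold for score in impostor) / len(impostor)
        # False rejection: genuine scores below the threshold.
        frr = sum(score < threshold for score in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    genuine_scores = [0.9, 0.8, 0.7, 0.4]    # made-up same-palm comparisons
    impostor_scores = [0.1, 0.2, 0.3, 0.6]   # made-up different-palm comparisons
    print(equal_error_rate(genuine_scores, impostor_scores))
```

With only a handful of scores the estimate is coarse; evaluations on full databases typically interpolate between thresholds rather than picking the nearest one.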
Design and implementation of an efficient and cost-effective deep feature learning model for rice yield mapping
by Divakar M. Sarith, Elayidom M. Sudheep, R. Rajesh
Abstract: Crop yield prediction before harvest is essential to address the instability of crop prices and ensure food security. Existing approaches of crop yield forecasting focus on survey data and are expensive. Remote sensing-based crop yield forecasting is a promising approach, especially in areas where field data is scarce. Recent studies using machine learning and deep learning techniques used modern representation learning ideas instead of traditionally used features that discarded many spectral bands available from the satellite imagery. A deep feature learning model using convolutional LSTM cells is used for forecasting rice yield from remote sensing satellite imagery. Convolutional LSTM with convolutional input and recurrent transformations directly captures spatial and temporal features of the input data. Feature selection is performed using principal component analysis to reduce the dimension of input data without much loss in the performance. Results suggest that features learned are highly informative and our proposed model performed better than other existing techniques.
Keywords: precision agriculture; remote sensing; crop yield forecast; deep learning; recurrent neural network; long short term memory; convolutional LSTM network; PCA; MODIS.
Stock indices price prediction in real time data stream using deep learning with extra-tree ensemble optimisation
by Monika Arya, Hanumat Sastry G
Abstract: Stock price prediction has always been one of the favourite research topics in industry and academia. The patterns of stock markets follow random walk motion and are highly volatile. Stock traders usually forecast upcoming trends and decide whether to buy or sell a stock by performing fundamental or technical analysis. Earlier prediction models using machine learning, ensemble learning, neural networks and deep learning (DL) techniques for forecasting stock are more complex and less accurate. We propose a novel DL network with Extra-Tree Ensemble optimisation (DELETE) for predicting stock indices price trends in a real time data stream. We have applied an extra-tree ensemble for optimising the cross entropy loss function and derived highly predictive Stock Technical Indicators (STIs), thus improving prediction accuracy. In the proposed neuro-computational model we supplied these STIs as tensors, to make computation faster. The experiments were conducted in Python using the DL libraries TensorFlow and Keras. For performance evaluation, the data of three popular stock indices of the National Stock Exchange (NSE) of India were chosen. The daily prediction model achieved an accuracy of up to 78.9% and an average accuracy of 66.61%, which is up to 30.2% higher than benchmark models. The DL algorithm predicts what is likely to happen to prices, while the optimisation algorithm strengthens such predictions by improving their accuracy. The proposed model performed well in deriving correct monthly trends and achieved higher prediction accuracy for daily buy/sell decisions.
Keywords: deep learning; ensemble learning; extra-tree optimisation; machine learning; neural network; stock indices prediction; predictive model; real time data streams.
A two-stage text detection approach using gradient point adjacency and deep network
by Tauseef Khan, Ayatullah Faruk Mollah
Abstract: Accurate localisation of texts in complex scene environments is a pivotal problem in computer vision and image processing research. Although several methods have been reported, one may hardly find any method that performs adequately well in the wild, and most of the reported methods have employed Latin script. Indian regional scripts, which appear in diverse text patterns and orientations, have not received ample attention. In this paper, an attempt is made to design a simple yet robust and effective text detection method for both scene and computer-generated images under a multi-script environment in an Indian context. At first, a fine-scale edge-map is generated from the original image, and subsequently, adaptive clustering is applied to form clusters of edge-points based on their spatial density. Foreground objects are then extracted with the help of the appropriate cluster boundaries, and they are considered as prospective text proposals. Such text proposals are fed to a deep convolutional neural network for learning and prediction as text or non-text components. Finally, true-text components are aggregated as the final localised texts of the original image. The proposed method is evaluated on two popular benchmark datasets, viz. ICDAR 2017-MLT and ICDAR 2013 born-digital image. The results are found to surpass some other state-of-the-art methods, which demonstrates its strength and usefulness in both scene and born-digital environments.
Keywords: text detection; candidate text proposal; foreground object classification; text proposal aggregation; deep network.
RNN-BD: an approach for fraud visualisation and detection using deep learning
by G. Madhukar Rao, K. Srinivas
Abstract: The evolution of banking information systems considerably increases fraud activities, which can have a negative impact on banking financial services. The use of credit cards has increased significantly due to electronic funding, electronic services and e-commerce activities. Massive amounts of data from credit card transactions can result in big data. Researchers are now using machine learning algorithms to detect and analyse fraud in online transactions. One of the major concerns of the banking industry is the visualisation and detection of credit card fraud. Machine learning techniques only work well when the dataset is small and does not have complex models. Deep learning, on the other hand, processes large and complex datasets. The objective of this paper is to visualise and detect credit card fraud by incorporating deep learning and dimensionality reduction techniques. A real dataset is used to assess the effectiveness of the intended work. The results show that our proposed model is more efficient in identifying fraudulent transactions to reduce fraud and income loss. We found that our deep learning model can be used to identify fraudulent transactions and reduce fraud losses to protect customer interests.
Keywords: big data; credit card transaction fraud; deep learning; optimisation; visualisation.
Statistical heart rate variability analysis under rest and post-exercise
by Prashant Kumar, Ashis Kumar Das, Suman Halder
Abstract: Heart rate variability (HRV) analysis can track the variation of the inter-beat interval and can be employed for regular monitoring of the health of sportspersons. For the present work, 62 datasets (two from each participant) have been collected from people who were actively engaged in some sort of morning exercise or games, in order to investigate chronic fatigue and underperformance due to overtraining. Data were acquired by using BIOPAC MP45, and R peaks were detected in the pre-processed signals using the maximum overlap discrete wavelet transform (MODWT). Analysis of variance (ANOVA) and the Wilcoxon signed-rank test have been applied to differentiate HRV parameters in resting and post-exercise conditions. The ANOVA p-value for each HRV index suggests that there is no statistically significant difference between the two sets of data, so the null hypothesis is not rejected, but significant differences have been found for the standard deviation of heart rate and approximate entropy in the case of the Wilcoxon signed-rank test. The statistically significant difference between resting and post-exercise data may be due to overtraining during exercise, which is a very common issue for athletes and sportspersons. Overtraining can be monitored non-invasively with the help of biosignals.
Keywords: electrocardiogram; heart rate variability; heart rate; Poincare plot; approximate entropy; standard error of the mean; coefficient of variation; analysis of variance.
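The paired design described above (one resting and one post-exercise recording per participant) is what the Wilcoxon signed-rank test is built for. The sketch below computes only the signed-rank sums, assuming zero differences are dropped and tied absolute differences share their average rank; it does not compute a p-value, and it is not the authors' analysis code. The function name `wilcoxon_signed_rank` and the sample values are hypothetical.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus) for paired samples; zero differences are dropped."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    # Indices of the differences, ordered by absolute size.
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(ordered):
        # Extend the run while the absolute differences are tied.
        end = start
        while (end + 1 < len(ordered)
               and abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[ordered[k]] = average_rank
        start = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


if __name__ == "__main__":
    # Made-up values of one HRV index for five participants.
    resting = [10, 12, 9, 11, 14]
    post_exercise = [12, 11, 9, 15, 17]
    print(wilcoxon_signed_rank(resting, post_exercise))
```

The smaller of the two sums is usually taken as the test statistic and compared against a critical value or converted to a p-value, for which an established statistics library is the safer choice.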
Spatial and temporal trends reveal: hotspot identification of crimes using a machine learning approach
by Khushboo Sukhija, Shailendra Narayan Singh, Mukesh Kumar, Deepti Mehrotra
Abstract: With the escalation in criminal cases, a large part of the population across the country is becoming a victim of different types of crime, which is one of the major concerns in the evolution of society. Therefore, hotspot identification of crimes by analysing real-time datasets has become essential and will significantly benefit the public by accurately identifying dangerous locations. This paper aims to develop a framework model for identifying criminal hotspots using a modified KNN (K Nearest Neighbour) algorithm that considers different crime characteristics, such as the severity of the crime, the frequency of the crime and temporal crime data, and visualises hotspots using a Geographic Information System (GIS). This study analyses a real crime dataset for the most recent five years collected from the Commissioner Police of Gurgaon, Haryana. Data cleaning and pre-processing strategies have been applied to prepare the data for training the model. The results show the locations of the different hotspots based on the density of crime occurrences, and the hotspots are visualised on a GIS display using supervised and unsupervised classifiers. The proposed model, the modified KNN algorithm, reaches an overall accuracy of around 99% after appropriate tuning and optimisation of the parameters.
Keywords: hotspot; crime; GIS; supervised learning; unsupervised learning; spatial analysis; temporal analysis; network.