International Journal of Intelligent Information and Database Systems (13 papers in press)
Chronological Penguin Adam-based deep Long Short Term Memory classifier for stock market prediction
by Dattatray P. Gandhmal, Kumar K
Abstract: Forecasting individual stocks or indexes, whether the direction of market price movement or the level of future prices, from time-series data poses a challenging problem for the research community. Hence, analysing, forecasting, and understanding stock-market time-series data to predict future prices has attracted major attention from researchers. To ensure effective prediction in the stock market, an effective method named Chronological Penguin Adam-based deep Long Short Term Memory (CPAdam-based deep LSTM) is proposed in this research to forecast future prices in the stock market. Technical indicators are extracted from the stock-market input data, and a wrapper approach is used to select suitable features. The deep LSTM classifier performs the stock market prediction based on its associated memory cell, which acts as an accumulator to store the state information. Moreover, the deep LSTM is trained by the proposed CPAdam algorithm, which exploits the effectiveness of adaptive learning rates and uses the sparse gradient factor and step-size annealing to approximate the hyperparameters. The proposed CPAdam-based deep LSTM attained better performance in stock market prediction on dataset 2 in terms of metrics such as MSE (3.12E-05), prediction accuracy (68.576%), RMSE (0.006), return per trade (61.126%), and winning rate (82.14%).
Keywords: Penguin search optimization algorithm (PeSOA); Adam optimization algorithm; stock market prediction; wrapper approach; deep Long Short Term Memory (LSTM).
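The CPAdam training rule itself is specific to the paper, but the Adam update it builds on is standard and uses exactly the adaptive learning rates the abstract mentions. A minimal sketch of one Adam step follows; the learning rate, beta1, beta2, and eps values are the textbook defaults, not the paper's tuned settings:

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: bias-corrected moment estimates give an adaptive step size."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment (uncentred variance)
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimise f(x) = x^2 starting from x = 1.0 (gradient is 2x)
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.05)
```

The bias correction keeps the effective step size stable in the first iterations, when the moment estimates are still close to their zero initialisation.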
A roadmap to data science: background, future, and trends
by Moaiad Ahmad Khder, Samah Wael Fujo, Mohammad Adnan Sayfi
Abstract: Data science is a combination of several disciplines that aims to obtain accurate insights from data and to develop the technology and algorithms to solve complicated problems analytically. Today, data science plays a massive role in our lives, and researchers have realised how essential it is. Numerous research studies on data science have been published in recent years, but each focuses on a specific issue, such as the impact of data science on business, manufacturing, academia, or healthcare. This paper presents a roadmap of data science to help readers learn about this field and realise how essential it is in several areas. It also clarifies how data science works and what capabilities a data scientist needs to work in this field, and shows the trends and future work of data science.
Keywords: data science; big data; data science trends.
Predicting the possibility of COVID-19 infection using fuzzy logic system
by Shadab Hafiz Choudhury, Azmary Jannat Aurin, Tanbin Akter Mitaly, Rashedur M. Rahman
Abstract: Diagnosing COVID-19 in a fast and efficient manner is an ongoing problem. Currently, methods of detection involve physical tests, which have the disadvantage of requiring either test kits or medical equipment. This paper outlines the design of a type-2 fuzzy logic system that can help provide a preliminary diagnosis by computing the possibility that a patient is suffering from COVID-19 based on their external symptoms. It uses input information that can be gleaned without the need for medical procedures. Both the statistical data and the knowledge base were garnered from publicly available databases and datasets. The fuzzy logic system implemented here is functional, but it is fairly inaccurate and illustrates that more information, both symptomatic and epidemiological, is needed to predict COVID-19 infections through an expert system.
Keywords: iterative type-2 fuzzy logic system; Mamdani fuzzy inference; novel coronavirus; COVID-19.
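The paper's rule base is drawn from public datasets; as background on the type-2 machinery, an interval type-2 membership function can be sketched as a type-1 triangular set with a blurred footprint of uncertainty, so each input maps to a lower and an upper membership bound. The fever range below (in degrees Celsius) is an illustrative assumption, not the paper's rule base:

```python
def tri(x, a, b, c):
    """Type-1 triangular membership function rising on [a, b], falling on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def interval_type2(x, a, b, c, blur=0.5):
    """Interval type-2 set: lower/upper membership bounds around a type-1 set.
    The 'blur' widens or narrows the triangle's feet, creating the footprint
    of uncertainty that distinguishes type-2 from type-1 fuzzy sets."""
    lower = tri(x, a + blur, b, c - blur)   # narrower triangle -> lower bound
    upper = tri(x, a - blur, b, c + blur)   # wider triangle -> upper bound
    return lower, upper

# Hypothetical 'fever' set centred at 39.0 C, evaluated at a reading of 38.5 C
lo, hi = interval_type2(38.5, 37.0, 39.0, 41.0)
```

A type-2 inference engine would propagate these [lower, upper] intervals through the Mamdani rules and type-reduce them to a crisp possibility score.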
A method for blind image deblurring based on Gabor filter edge estimation
by Van Huan Nguyen, Trung Dung Do, Xuenan Cui, Hale Kim
Abstract: Edges are widely used in blind deconvolution methods as the information source for the deblurring process. In natural images, blur can appear on edges in any direction. However, previous edge-based deconvolution methods made use of vertical and/or horizontal edge information only in the recovery process, which lowers performance. In this paper, a novel blind deconvolution method is proposed that uses a Gabor filter to estimate edge information in all directions. The deblurring process then applies the fast iterative shrinkage-thresholding algorithm and iteratively reweighted least squares to recover the true sharp image. The paper also proposes a sharpness measurement, named the Haar defocus score and based on the Haar wavelet transform, to estimate the quality of the deblurred image in cases where no ground truth exists for comparison. Experiments on common public databases in the field show promising performance of the proposed method, with respect to both PSNR and the Haar defocus score, in comparison with previous methods.
Keywords: blind deconvolution; omnidirectional edge; deblurring; Gabor filter.
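The omnidirectional edge estimation rests on evaluating Gabor filters at several orientations. A minimal sketch of the standard real-valued Gabor kernel and an eight-orientation filter bank follows; the sigma, lambda, and gamma values are illustrative defaults, not the paper's settings:

```python
import math

def gabor_kernel(ksize, theta, sigma=2.0, lambd=4.0, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor kernel oriented at angle theta (radians):
    a Gaussian envelope modulating a cosine carrier along the rotated x-axis."""
    half = ksize // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)    # rotate coordinates
            yr = -x * math.sin(theta) + y * math.cos(theta)
            g = math.exp(-(xr * xr + gamma * gamma * yr * yr) / (2 * sigma * sigma))
            row.append(g * math.cos(2 * math.pi * xr / lambd + psi))
        kernel.append(row)
    return kernel

# A bank of 8 orientations (0, pi/8, ..., 7*pi/8) covers edges in all directions
bank = [gabor_kernel(7, k * math.pi / 8) for k in range(8)]
```

Convolving the blurred image with each kernel in the bank and taking the strongest response per pixel gives an orientation-agnostic edge map of the kind the method relies on.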
Cat Deep System for Intrusion Detection in Software Defined Networking
by Yogita Hande, Akkalashmi Muddana
Abstract: The development of software-defined networks (SDN) has made network control more convenient and networks easier to develop and manage. SDN provides a better capability of adapting to the changing demands of applications and to network conditions, and network maintenance becomes much easier with the improved security it brings. This paper proposes an intrusion detection system in SDN based on the newly developed Cat Deep System (CDS). Training is done using a deep convolutional neural network (DCNN) with stochastic gradient descent (SGD) integrated with cat swarm optimisation. The proposed system consists of three main components: a sniffer, a detector, and a sensor. All packets are inspected by the sniffer to extract features; the extracted features are used to detect abnormality with the DCNN and are then checked against a boundary to find the presence of an attack in the system. Experimentation is done using the KDD Cup99 database. The accuracy, precision, recall, and F1 measure of the Cat Deep System are 0.8954, 0.8928, 0.8668, and 0.7770, respectively, which implies the effectiveness of the proposed system.
Keywords: software-defined networks; Cat Deep System; cat swarm optimisation; duration; sniffer.
Convolutional recurrent neural network with attention for Vietnamese speech to text problem in the operating room
by Trinh Tan Dat, Le Tran Anh Dang, Vu Ngoc Thanh Sang, Le Nhi Lam Thuy, Pham The Bao
Abstract: We introduce an automatic Vietnamese speech recognition (ASR) system for converting Vietnamese speech to text under real operating-room ambient noise recorded during liver surgery. First, we propose a combination of a convolutional neural network (CNN) and a bidirectional long short-term memory (BLSTM) network for local speech-feature learning, sequence modelling, and transcription. We also extend the CNN-LSTM framework with an attention mechanism to decode the frames into a sequence of words. The CNN, LSTM, and attention models are combined into a unified architecture. In addition, we combine connectionist temporal classification (CTC) and attention loss functions in the training phase. The length of the output label sequence from CTC is applied to the attention-based decoder predictions to make the final label sequence. This process helps to decrease irregular alignments and speeds up label-sequence estimation during training and inference, instead of relying only on the data-driven attention-based encoder-decoder to estimate the label sequence in long sentences. The proposed system is evaluated using a real operating-room database. The results show that our method significantly enhances the performance of the ASR system, achieving a 13.05% WER and outperforming standard methods.
Keywords: Vietnamese speech recognition; convolutional neural network; CNN; bidirectional long short-term memory; BLSTM; attention; operating room.
Real-time long short-term glance-based fire detection using CNN-LSTM neural network
by Huan Van Nguyen, Xuan Thang Pham, Cuong Nguyen Le
Abstract: Vision-based fire detection has been widely studied recently to reduce the damage of fire disasters, thanks to the advantages of software-based methods compared with traditional hardware-based fire detection using sensors. This paper presents a novel method for fire detection that uses convolutional neural networks on image sequences from videos to extract both spatial and temporal information for fire classification. The system includes a CNN to extract image features, followed by short-term and long-term stages for classification. Experiments carried out on common public datasets show promising results in terms of performance in comparison with previous works.
Keywords: fire detection; long short-term memory; LSTM; temporal CNN.
Performance evaluation of reformulated query for information retrieval using real estate ontology
by Namrata Rastogi, Parul Verma, Pankaj Kumar
Abstract: Data over the internet is growing daily at an unprecedented rate, so efficient information retrieval has always been a challenge. The scenario becomes more tedious in an e-government sector like real estate, where novice users are unfamiliar with the legal documentation required for buying and selling land. The uncommon legal terminology also dilutes keyword-based retrieval. The proposed system therefore creates and utilises an ontology in this semantic web era. The performance of information retrieval is measured by calculating parameters such as mean average precision, precision@k, and normalised discounted cumulative gain for the general user query first, and then comparing them with those of the query reformulated by applying the real estate ontology. The experimental results are further statistically checked and depict an improvement in all parameters, indicating that ontological reformulation of an initial user query improves the efficiency of the information retrieval process.
Keywords: e-government; information retrieval; legal ontology; mean average precision; mAP; performance evaluation; precision; query reformulation; real estate; semantic web.
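The three evaluation metrics named above are standard and have short closed forms. A minimal sketch follows; the binary relevance list at the end is an illustrative example, not the paper's data:

```python
import math

def precision_at_k(rels, k):
    """Fraction of relevant documents among the top-k results (rels: 0/1 list)."""
    return sum(rels[:k]) / k

def average_precision(rels):
    """Mean of precision@k over the ranks k where a relevant document appears;
    averaging this over queries gives mean average precision (MAP)."""
    hits, score = 0, 0.0
    for i, r in enumerate(rels, start=1):
        if r:
            hits += 1
            score += hits / i
    return score / hits if hits else 0.0

def ndcg(rels, k):
    """Discounted cumulative gain at k, normalised by the ideal ordering."""
    dcg = sum(r / math.log2(i + 1) for i, r in enumerate(rels[:k], start=1))
    ideal = sorted(rels, reverse=True)
    idcg = sum(r / math.log2(i + 1) for i, r in enumerate(ideal[:k], start=1))
    return dcg / idcg if idcg else 0.0

# Hypothetical ranked list where results 1 and 3 are relevant
rels = [1, 0, 1, 0]
```

Computing these scores for the raw query and again for the ontology-reformulated query, then comparing, is exactly the evaluation protocol the abstract describes.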
A novel index retrieval and query optimisation method for private information retrieval in location-based service application
by K.M. Mahesh Kumar, Radhakrishna Bhat, N.R. Sunitha
Abstract: Location-based service (LBS) is a popular information and communications technology. Security, trust, and privacy are the major concerns preventing the wide deployment of LBS. In this paper, we address privacy issues by employing computational private information retrieval (PIR) schemes and highlight a few optimisation methods. We propose a novel index retrieval technique that helps users identify their grid id and the index value for the point-of-interest (POI) type of their interest, and an adaptive computation method (flip-optimisation) to reduce the multiplication cost of the PIR query used to retrieve the POI item at the specified index. The adaptive computation method proposed in this paper is generic and can be applied to any application that uses a PIR protocol to access data privately. We empirically evaluated the proposed method by implementing a PIR prototype and found it suitable for practical purposes.
Keywords: index retrieval; location-based service; LBS; location privacy; private information retrieval; PIR; quadratic residuosity assumption; QRA; query optimisation.
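The paper's scheme is a computational PIR built on the quadratic residuosity assumption, which is too involved to sketch here; as background on what PIR guarantees, the classic information-theoretic two-server construction fits in a few lines. The bit database below is illustrative, not from the paper:

```python
import secrets

def two_server_pir(db, index):
    """Information-theoretic 2-server PIR over a bit database: the user sends
    each (non-colluding) server a subset of indexes; each subset alone looks
    uniformly random, so neither server learns which index is wanted."""
    n = len(db)
    q1 = {i for i in range(n) if secrets.randbits(1)}   # uniformly random subset
    q2 = q1 ^ {index}             # differs from q1 only at the wanted index
    a1 = 0
    for i in q1:
        a1 ^= db[i]               # server 1 XORs the bits in its subset
    a2 = 0
    for i in q2:
        a2 ^= db[i]               # server 2 XORs the bits in its subset
    return a1 ^ a2                # shared bits cancel, leaving db[index]

# Hypothetical POI presence bits for 8 grid cells
db = [1, 0, 1, 1, 0, 0, 1, 0]
```

Single-server computational schemes like the QRA-based one replace the non-collusion assumption with a hardness assumption, at the cost of the per-query multiplications that the paper's flip-optimisation targets.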
SDMS: smart database management system for accessing heterogeneous databases
by Khaleel Mershad, Ali Hamieh
Abstract: Digital data are usually divided into structured and unstructured types. Although unstructured data are becoming a significant part of digital data, a large number of organisations still save their data in structured datastores. However, with many companies shifting to non-structural database management systems (DBMS), they find themselves obliged to translate and convert their old structured data into new forms. To avoid the consequences of such costly operations, we present the basics of a new database management system that allows a database user to access and query both structured and unstructured databases at the same time using a single query. We explain the details of the various types of queries (insert, update, delete, and search) performed by the proposed system, and discuss the tests we carried out to measure the end-to-end delays of these queries on MySQL and HBase databases.
Keywords: big data; structured data; unstructured data; smart database management system; SDBMS; mixed query language; MQL; heterogeneous databases.
Optimisation algorithm-based recurrent neural network for big data classification
by Md Mobin Akhtar, Danish Ahamad, Shabi Alam Hameed
Abstract: This paper introduces a technique for big data classification using an optimisation algorithm. The classification is performed in a Hadoop MapReduce framework, wherein the map and reduce functions are based on the proposed dragonfly rider optimisation algorithm (DROA), designed by integrating the dragonfly algorithm (DA) and the rider optimisation algorithm (ROA). The mapper uses the proposed optimisation to select the optimal features from the input big data, with a fitness function based on Renyi entropy. The selected features are then passed to the reducer phase, where the classification is performed using a DROA-based recurrent neural network (RNN), in which the RNN is trained by the proposed DROA. The results show that the proposed method achieved a maximal accuracy of 0.996, sensitivity of 0.995, and specificity of 0.995.
Keywords: big data classification; optimisation; MapReduce; recurrent neural network; RNN; Renyi entropy.
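Renyi entropy, used above as the fitness function for feature selection, has a short closed form that generalises Shannon entropy. A minimal sketch follows; the probability vectors are illustrative, and how the paper maps feature values to probabilities is not reproduced here:

```python
import math

def renyi_entropy(probs, alpha=2.0):
    """Renyi entropy of order alpha (in bits); tends to Shannon entropy
    as alpha -> 1. Higher values mean a more uniform distribution."""
    if abs(alpha - 1.0) < 1e-9:
        return -sum(p * math.log2(p) for p in probs if p > 0)  # Shannon limit
    return math.log2(sum(p ** alpha for p in probs)) / (1 - alpha)

# A uniform distribution carries more entropy than a peaked one
uniform = [0.25] * 4
peaked = [0.97, 0.01, 0.01, 0.01]
```

A fitness function built on this quantity would score a candidate feature subset by the entropy of the distribution it induces, steering the optimiser toward informative features.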
Leveraging app features to improve mobile app retrieval
by Messaoud Chaa, Omar Nouali, Patrice Bellot
Abstract: The continued increase in the use of smartphones and other mobile devices has led to a substantial increase in the demand for mobile applications. With the growing availability of mobile apps, retrieving the right application from a large set has become difficult. Existing term-based search engines tend to retrieve relevant apps based on query terms rather than considering the app features users really require, such as functionality, technical, or user-interface characteristics. The novelty of this paper lies in extracting app features from app descriptions and users' social reviews, extracting user-requested features, and matching between them to obtain a feature-based score. In addition, we propose effective techniques that extract and weight the features requested in the query. Finally, we combine the feature-based and term-based scores to obtain the overall app relevance score. The experimental results indicate that the proposed approach is effective and outperforms state-of-the-art retrieval models for app retrieval.
Keywords: app retrieval; feature extraction; social information retrieval; natural language processing; NLP; feature-based score; term-based score.
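The final step combines the feature-based and term-based scores into one relevance score; a common way to do this is linear interpolation. The weight alpha, the app names, and the example scores below are assumptions for illustration, not the paper's tuned values:

```python
def combined_score(term_score, feature_score, alpha=0.6):
    """Linear interpolation of a term-based and a feature-based relevance
    score; alpha controls how much weight the term-based score receives."""
    return alpha * term_score + (1 - alpha) * feature_score

# Hypothetical (term_score, feature_score) pairs for two candidate apps
apps = {"app_a": (0.8, 0.2), "app_b": (0.5, 0.9)}
ranked = sorted(apps, key=lambda a: combined_score(*apps[a]), reverse=True)
```

Note how the feature score reorders the list: app_b matches the requested features well enough to overtake app_a despite a weaker term match, which is the behaviour the paper's combination is designed to produce.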
Construction and application of knowledge-base in telecom fraud domain
by Rongchen Zhu, Han Ye, Haichun Sun, Xin Li, Yongchen Duan, Jiaqi Hou
Abstract: Nowadays, the number of telecom fraud crimes has been increasing rapidly in China. To improve people's awareness of prevention and to enhance the professional ability of public security police in investigating and combating such crimes, this paper proposes and implements a knowledge-base for the telecom fraud domain. Firstly, we improve the event-based telecom fraud ontology model by considering two perspectives: crime combat and prevention. We also conduct a fine-grained classification and a multi-dimensional fusion of the domain knowledge to improve data quality. Secondly, we introduce a new method to reuse domain business-process knowledge, which supplements the knowledge representation and enhances the quality of the knowledge-base. Finally, this paper shows two types of domain applications: a Q&A application and a case assistant. The experimental results show that both the public and the police can get advice and inspiration from the built knowledge-base.
Keywords: telecom fraud; knowledge-base; combat; prevention.