International Journal of Intelligent Information and Database Systems (6 papers in press)
SDMS: Smart Database Management System for Accessing Heterogeneous Databases
by Khaleel Mershad, Ali Hamieh
Abstract: During the last twenty years, the amount of stored digital data has witnessed enormous growth. These data are usually divided by the scientific community into structured and unstructured data. Although unstructured data are becoming a significant part of big data, there are still a large number of data generators that save their data into traditional structured (i.e., relational) databases. On the other hand, with the advantages that exist in adopting non-structural database systems to save various kinds of data, many companies and institutions are shifting into acquiring and utilizing a professional non-structural database management system (DBMS). These organizations find themselves obliged to translate and convert their old structured data into new forms and formats, in order to make them compatible with the new systems. In order to avoid the consequences of such costly operations, we present the basics of a new database management system that allows a database user to access and query both structural and non-structural databases at the same time using a single query. Our proposed system makes a database user view the two databases (structured and non-structured) as a single database (DB) on which a single operation is executed. In this paper, we present an example of our system that uses MySQL and HBase as the structured and non-structured databases. We explain the details of various types of queries (Insert, Update, Delete, and Search) that are performed by the proposed system, and discuss the tests that we made to measure the end-to-end delays of these queries on two similar MySQL and HBase databases.
Keywords: Big data; structured data; unstructured data; Smart Database Management System; Mixed Query Language; Heterogeneous Databases.
Optimisation algorithm-based recurrent neural network for big data classification
by Md Mobin Akhtar, Danish Ahamad, Shabi AlamHameed
Abstract: This paper introduces a technique for big data classification using an optimisation algorithm. Here, the classification of big data is performed in a Hadoop MapReduce framework, wherein the map and reduce functions are based on the proposed dragonfly rider optimisation algorithm (DROA), which is designed by integrating the dragonfly algorithm (DA) and rider optimisation algorithm (ROA). The mapper uses the proposed optimisation as a mapper function for selecting the optimal features from the input big data, for which the fitness function is based on Renyi entropy. Then, the selected features are passed to the reducer phase, where the classification of the big data is performed using the DROA-based recurrent neural network (RNN), in which the RNN is trained by the proposed DROA. The results show that the proposed method achieved a maximal accuracy of 0.996, a sensitivity of 0.995, and a specificity of 0.995.
Keywords: big data classification; optimisation; MapReduce; recurrent neural network; RNN; Renyi entropy.
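As a rough illustration of the Renyi entropy on which the fitness function above is said to be based, the sketch below computes the standard Renyi entropy of order alpha for one discrete feature column. The abstract does not give the actual fitness function, so the function name `renyi_entropy`, the default order `alpha=2.0`, and the idea of scoring a single column this way are assumptions made only for illustration.

```python
import math
from collections import Counter


def renyi_entropy(values, alpha=2.0):
    """Renyi entropy of order alpha for a sequence of discrete values.

    Uses the standard definition H_alpha = log(sum(p_i ** alpha)) / (1 - alpha),
    where p_i are the empirical frequencies. How DROA turns this quantity
    into a fitness score is not stated in the abstract.
    """
    if alpha <= 0 or alpha == 1.0:
        raise ValueError("alpha must be positive and different from 1")
    if not values:
        raise ValueError("values must not be empty")
    n = len(values)
    counts = Counter(values)
    power_sum = sum((c / n) ** alpha for c in counts.values())
    return math.log(power_sum) / (1.0 - alpha)


# Hypothetical feature columns: a balanced binary one and a constant one.
balanced = [0, 1, 0, 1]
constant = [7, 7, 7, 7]
print(renyi_entropy(balanced))  # entropy of a fair two-valued column
print(renyi_entropy(constant))  # a constant column carries no information
```

A feature whose values are spread evenly has higher entropy than one that is nearly constant, which is the kind of signal an entropy-based fitness function could use when ranking candidate feature subsets.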
Leveraging App Features to Improve Mobile App Retrieval
by Messaoud CHAA, Omar Nouali, Patrice Bellot
Abstract: The continued increase in the use of smartphones and other mobile devices has led to a substantial increase in the demand for mobile applications (apps). With the growing availability of mobile apps, retrieving the right application from a large set has become difficult. Existing term-based search engines tend to retrieve apps based on the keywords issued by the user rather than on the app features the user actually requires, such as functionalities, technical characteristics, or characteristics of the user interface. In this paper, we propose a Term and Feature-based approach that, in addition to terms, uses app features extracted from app descriptions and social users' reviews in order to retrieve the apps relevant to the query and meet the user's needs. The novelty of the proposed approach lies in representing both apps and queries by features and computing the relevance between them to obtain a feature-based score. In addition, our approach combines the feature-based score and the term-based score to obtain the relevance score of each app. Finally, we propose an effective technique that extracts and weights the features requested by the user in her query. The experimental results indicate that the proposed approach is effective and outperforms state-of-the-art retrieval models for app retrieval.
Keywords: App Retrieval; Feature Extraction; Social Information Retrieval; Natural Language Processing; Feature-based Score; Term-based Score.
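To make the idea of combining a feature-based score with a term-based score concrete, here is a minimal sketch. The abstract does not say how the two scores are computed or combined, so everything here is an assumption for illustration: the term score is a simple query-term overlap (a stand-in for a real retrieval model), the feature score is a weighted overlap of requested features, and the combination is a linear interpolation with a hypothetical parameter `lam`.

```python
def term_score(query_terms, app_terms):
    """Fraction of query terms found in the app text (stand-in for a real model)."""
    if not query_terms:
        return 0.0
    app_vocab = set(app_terms)
    hits = sum(1 for term in query_terms if term in app_vocab)
    return hits / len(query_terms)


def feature_score(query_features, app_features):
    """Weighted share of requested features that the app offers.

    query_features maps a feature string to a non-negative weight;
    app_features is the set of features extracted for the app.
    """
    total = sum(query_features.values())
    if total == 0:
        return 0.0
    matched = sum(w for f, w in query_features.items() if f in app_features)
    return matched / total


def relevance(query_terms, query_features, app_terms, app_features, lam=0.5):
    """Linear interpolation of the two scores (assumed combination rule)."""
    return (lam * feature_score(query_features, app_features)
            + (1.0 - lam) * term_score(query_terms, app_terms))


# Hypothetical query and app, used only to exercise the functions.
q_terms = ["photo", "editor"]
q_features = {"crop images": 2.0, "add filters": 1.0}
a_terms = ["photo", "gallery"]
a_features = {"crop images"}
print(relevance(q_terms, q_features, a_terms, a_features))
```

With `lam=0` the ranking falls back to pure term matching, and with `lam=1` it depends only on features, so the parameter makes the contribution of each signal explicit.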
Construction and Application of Knowledge Base in Telecom Fraud Domain
by Rongchen Zhu, Han Ye, Haichun Sun, Xin Li, Yongchen Duan, Jiaqi Hou
Abstract: The number of telecom fraud crimes has been increasing rapidly in China. To improve people's awareness of prevention and to enhance the professional ability of public security police in investigating and combating these crimes, this paper proposes and implements a knowledge base of the telecom fraud domain. Firstly, we improve the event-based telecom fraud ontology model by considering two perspectives: crime combat and crime prevention. We also conduct a fine-grained classification and complete a multi-dimensional fusion of the domain knowledge to improve data quality. Secondly, we introduce a new method to reuse knowledge of domain business processes, which supplements the knowledge representation and enhances the quality of the knowledge base. Finally, this paper shows two types of domain applications: a Q&A application and a case assistant. The experimental results show that the public and the police can get some advice and inspiration from the resulting knowledge base.
Keywords: telecom fraud; knowledge base; combat and prevention.
Chronological Penguin Adam-based deep Long Short Term Memory classifier for stock market prediction
by Dattatray P. Gandhmal, Kumar K
Abstract: Forecasting individual stocks or indexes from time-series data, whether the direction of market price movement or the level of future market prices, is a challenging problem for the research community. Analyzing, forecasting, and understanding stock market time-series data in order to predict future prices has therefore received considerable attention. To ensure effective prediction in the stock market, this research proposes a method named Chronological Penguin Adam-based deep Long Short Term Memory (CPAdam-based deep LSTM) to forecast future prices in the stock market. The technical indicators are extracted from the stock market input data, and a wrapper approach is used to select the suitable features. The deep Long Short Term Memory classifier performs the stock market prediction based on its associated memory cell, which acts as an accumulator to store the state information. Moreover, the deep LSTM is trained by the proposed CPAdam algorithm, which makes use of adaptive learning rates. It uses the sparse gradient factor and step size annealing for approximating the hyperparameter. Using data set-2, the proposed CPAdam-based deep LSTM attained better performance in stock market prediction, with an MSE of 3.12E-05, a prediction accuracy of 68.576%, an RMSE of 0.006, a return per trade of 61.126%, and a winning rate of 82.14%.
Keywords: Penguin search optimization algorithm (PeSOA); Adam optimization algorithm; stock market prediction; wrapper approach; deep Long Short Term Memory (LSTM).
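For readers unfamiliar with the Adam optimization algorithm named in the keywords, the sketch below shows the standard Adam update for a single scalar parameter, which is where the adaptive learning rate comes from. This is plain Adam only, not the proposed CPAdam: the chronological and penguin-search components are not described in the abstract and are not reproduced here. The function name `adam_step` and the toy objective are assumptions for illustration.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update for a scalar parameter.

    m and v are running averages of the gradient and squared gradient;
    t is the 1-based step count used for bias correction.
    """
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)  # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy objective f(x) = x**2 with gradient 2x, used only to exercise the update.
x, m, v = 1.0, 0.0, 0.0
for step in range(1, 301):
    x, m, v = adam_step(x, 2.0 * x, m, v, step)
print(x)
```

Because the step is scaled by the running estimate of the squared gradient, the effective step size adapts per parameter instead of being fixed by `lr` alone.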
A roadmap to data science: background, future, and trends
by Moaiad Ahmad Khder, Samah Wael Fujo, Mohammad Adnan Sayfi
Abstract: Data science combines several disciplines with the aim of extracting accurate insights from data and developing technology and algorithms to solve complicated problems analytically. Today, data science plays a massive role in our lives, and researchers recognise how essential it is. Numerous research studies on data science have been published in recent years, but each focuses on specific issues, such as data science and its impact on business, manufacturing, academia, and healthcare. This paper presents a roadmap of data science to help readers learn about this field and realise how essential it is in several areas. It also clarifies how data science works and what capabilities a data scientist needs in order to work in this field. Finally, it shows the trends and future work of data science.
Keywords: data science; big data; data science trends.