International Journal of Intelligent Information and Database Systems (10 papers in press)
Building Natural Language Responses from Natural Language Questions in the Spatio-Temporal Context
by Ghada Landoulsi, Khaoula Mahmoudi, Sami Faïz
Abstract: As research on geographic information systems (GIS) evolves, driven by their ability to support decision makers in different fields, there is a strong need to enable all users, specialists and non-specialists alike, to profit from this technology. However, the key impediment for non-specialists is the language used to interact with the GIS, and especially with its embedded Geographic Database (GDB), which requires SQL skills. In this paper, we explore a new approach that relieves nomad GIS users of any formatting effort by using only natural language as the means of communicating with the GDB. The process is broadly twofold: (1) formatting the natural language user query so that it can be processed by the GDB engine, and (2) translating the answer retrieved from the GDB into text that is easily interpreted by all GIS users. The resulting system was integrated into the OpenJump GIS, and its evaluation gave satisfactory results.
Keywords: Spatio-temporal data; Geographic Databases; Question Answering Systems; Structured Query Language; Natural Language Generation.
A Scalable Approach for Index in Generic Location-aware Rank Query
by Utharn Buranasaksee
Abstract: As location-aware information becomes more popular, researchers have proposed many effective approaches to location-aware rank queries. In this work, we identify the shortcomings of the existing method and propose an efficient method called Multi-IRS to optimize the query search at runtime. Our analysis points out how the optimization can be done. Multi-IRS uses an improved algorithm that addresses numeric and textual attributes. The Sparse Ordered Set is then proposed to reduce the index construction time. Finally, the External Attribute-based IR-Tree sort algorithm is proposed to solve the scalability issues. In extensive experiments, our proposed method significantly outperforms the existing method.
Keywords: scalable; generic; location-aware; query; search; keyword; spatial.
SDMS: Smart Database Management System for Accessing Heterogeneous Databases
by Khaleel Mershad, Ali Hamieh
Abstract: During the last twenty years, the amount of stored digital data has grown enormously. These data are usually divided by the scientific community into structured and unstructured data. Although unstructured data are becoming a significant part of big data, a large number of data generators still save their data in traditional structured (i.e., relational) databases. On the other hand, given the advantages of adopting non-structural database systems to store various kinds of data, many companies and institutions are moving towards acquiring and using a professional non-structural database management system (DBMS). These organizations find themselves obliged to translate and convert their old structured data into new forms and formats to make them compatible with the new systems. To avoid such costly operations, we present the basics of a new database management system that allows a database user to access and query both structural and non-structural databases at the same time using a single query. Our proposed system lets a database user view the two databases (structured and non-structured) as a single database (DB) on which a single operation is executed. In this paper, we present an example of our system that uses MySQL and HBase as the structured and non-structured databases, respectively. We explain the details of the various types of queries (Insert, Update, Delete, and Search) performed by the proposed system, and discuss the tests that we ran to measure the end-to-end delays of these queries on two similar MySQL and HBase databases.
Keywords: Big data; structured data; unstructured data; Smart Database Management System; Mixed Query Language; Heterogeneous Databases.
Enhancement for Graph Operations in Relational Database for Criminal Intelligence Domain
by Mateusz Piech, Marcin Los, Robert Marcjan
Abstract: The aim of this research was to improve the performance of graph operations on semi-structured data in a relational database. First, this required selecting a model that allows semi-structured data to be stored together with relations. For this purpose, we selected one of the existing solutions, which stores semi-structured data in a model built using JSON as a native type, and adapted it to our requirements. Second, we created an algorithm based on recursive Common Table Expressions, extended with JSON, to make more complex graph analysis of the data possible. We compared the proposed solutions with Neo4j, as a representative of graph databases. The results show an improvement in execution performance in some cases. Although we focused our use cases on the criminal intelligence domain, the research output can be applied to any domain with semi-structured data.
Keywords: Criminal Data; CTE; Graph Database; JSON; PostgreSQL.
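As an illustration of the recursive Common Table Expression idea mentioned in the abstract above, here is a minimal sketch of a bounded graph traversal over an edge table. It is not the authors' algorithm: it uses SQLite (through Python's standard `sqlite3` module) rather than PostgreSQL, it leaves out the JSON extension, and the `edges` table, its columns, the `reachable` helper, and the sample names are all hypothetical.

```python
import sqlite3


def reachable(conn, start, max_depth):
    """Return (node, hops) pairs reachable from `start` within `max_depth` hops."""
    query = """
        WITH RECURSIVE walk(node, depth) AS (
            SELECT ?, 0                      -- anchor row: the start node
            UNION
            SELECT e.dst, w.depth + 1        -- recursive step: follow one edge
            FROM walk AS w
            JOIN edges AS e ON e.src = w.node
            WHERE w.depth < ?                -- depth bound also stops cycles
        )
        SELECT node, MIN(depth) AS hops
        FROM walk
        GROUP BY node
        ORDER BY hops, node
    """
    return conn.execute(query, (start, max_depth)).fetchall()


# Hypothetical "who contacted whom" edges, including a cycle back to alice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("alice", "bob"), ("bob", "carol"), ("carol", "alice"), ("carol", "dave")],
)

result = reachable(conn, "alice", 3)
print(result)
```

The depth bound keeps the recursion finite even when the graph contains cycles, and `MIN(depth)` reports the shortest hop count found for each node.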
Middleware Based Fault Recovery Technique for Replicated DRTDBS
by Pratik Shrivastava, Udai Shanker
Abstract: Replication is the most widely accepted technique for enhancing the performance of a distributed real-time database system (DRTDBS). The benefits of replication require the replica sites to be in a consistent state. Research on replicated DRTDBS (RDRTDBS) has therefore focused on maintaining mutual consistency via a replication protocol (RPL), and very little research has addressed the recovery of failed replica sites. Our objective is to propose a recovery algorithm that allows admitted RTTs to be processed while a replica site is recovering. In this paper, the middleware-based RPL (Shrivastava P. & Shanker U., 2018) is extended with a feedback layer for recovering the failed replica site. The feedback sublayer consists of four algorithms: status collector, suspect confirmer, recovery algorithm, and log updater. These algorithms work together to identify a failed site and recover it from system failure. Quantitative evaluation indicates that the proposed algorithms are more effective and efficient than the traditional fault recovery technique.
Keywords: Middleware; Recovery Protocol; Replication Protocol; RDRTDBS.
Improving the Quality of Service of Real-Time Database Systems through a Semantics-Based Scheduling Strategy
by Fehima Achour, Emna Bouazizi, Wassim Jaziri
Abstract: An RTDBMS (Real-Time DataBase Management System) manages applications in which a large amount of data is accessed by update and user transactions that must meet time constraints. We are interested in improving the Quality of Service (QoS) in RTDBMSs by optimizing the execution of transactions so that more of them meet their deadlines. A new scheduling strategy is developed and evaluated within a QoS management architecture called FCS (Feedback Control Scheduling), proposed for RTDBMSs. The proposed strategy is based on the new AEDF-TAL-DSL (Advanced Earliest Deadline First based on Transactions Aggregation and Data Semantic Links) protocol that we developed. It introduces new parameters relating to the aggregation links that exist between transactions (defined according to the type of operations composing the transactions and the data they access), as well as to the semantic links that appear between users' queries. We also show the contributions of our approach through simulation results.
Keywords: Real-Time Database Management System; Transactions; Quality of Service; User Satisfaction; Scheduling; Feedback Control; Aggregation Links; Semantic Links; AEDF-TAL-DSL; Simulation.
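For readers unfamiliar with the baseline that the protocol name refers to, the following is a minimal sketch of plain Earliest Deadline First scheduling. It does not implement AEDF-TAL-DSL or the FCS architecture: aggregation and semantic links are not modelled, execution is non-preemptive, all transactions are assumed to arrive at time 0, and the `Transaction` class, `edf_schedule` function, and sample workload are hypothetical.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Transaction:
    deadline: float                          # only field used for ordering
    name: str = field(compare=False)
    exec_time: float = field(compare=False)


def edf_schedule(transactions, start=0.0):
    """Run transactions in deadline order; skip any that cannot finish in time."""
    heap = list(transactions)
    heapq.heapify(heap)                      # min-heap keyed on deadline
    clock = start
    met, missed = [], []
    while heap:
        txn = heapq.heappop(heap)            # earliest deadline first
        if clock + txn.exec_time <= txn.deadline:
            clock += txn.exec_time           # executed to completion
            met.append(txn.name)
        else:
            missed.append(txn.name)          # dropped without executing
    return met, missed


# Hypothetical workload: (deadline, name, execution time).
workload = [
    Transaction(4.0, "T1", 2.0),
    Transaction(3.0, "T2", 2.0),
    Transaction(10.0, "T3", 3.0),
    Transaction(5.0, "T4", 2.0),
]
met, missed = edf_schedule(workload)
print("met:", met, "missed:", missed)
```

In this workload T4 cannot finish by its deadline once T2 and T1 have run, so it is dropped and T3 still completes on time.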
Optimization algorithm based Recurrent Neural network for big data classification
by Md Mobin Akhtar, Danish Ahamad
Abstract: Advances in technology have led to the development of advanced applications for handling huge volumes of data. However, conventional software tools struggle to handle such large data. Moreover, the presence of imbalanced data is a major constraint in the research industry. This paper introduces a novel optimization technique that addresses the problems caused by imbalanced data in order to manage large datasets effectively. Here, big data classification is performed in a MapReduce framework, wherein the map and reduce functions are based on the proposed optimization technique. The proposed technique is the Dragonfly Rider Optimization Algorithm (DROA), which is designed by integrating the Dragonfly Algorithm (DA) and the Rider Optimization Algorithm (ROA). The mapper uses the proposed optimization as its mapper function to select the optimal features from the input big data, with a fitness function based on Renyi entropy. The selected features are passed to the reducer phase, where the big data is classified using the DROA-based Recurrent Neural Network (RNN). Thus, big data is categorized using the proposed DROA-RNN-based MapReduce framework. The results show that the proposed method achieved a maximal accuracy of 0.996, sensitivity of 0.995, and specificity of 0.995.
Keywords: Big data Classification; Optimization; MapReduce; Recurrent Neural Network (RNN); Renyi entropy.
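The abstract above does not say how Renyi entropy enters the fitness function, so the following is only a sketch of the standard definition, H_alpha(X) = log(sum_i p_i^alpha) / (1 - alpha), applied to the empirical distribution of a single discrete feature. The `renyi_entropy` function name, the default order alpha = 2, and the choice of base-2 logarithms are assumptions made for illustration.

```python
import math
from collections import Counter


def renyi_entropy(values, alpha=2.0):
    """Renyi entropy (in bits) of the empirical distribution of `values`."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    counts = Counter(values)
    total = sum(counts.values())
    # Sum of p_i ** alpha over the observed symbols.
    power_sum = sum((c / total) ** alpha for c in counts.values())
    return math.log2(power_sum) / (1.0 - alpha)


# Four equally likely symbols carry 2 bits for any order alpha.
uniform_feature = ["a", "b", "c", "d"]
# A constant feature carries no information.
constant_feature = ["x", "x", "x", "x"]

print(renyi_entropy(uniform_feature))
print(renyi_entropy(constant_feature))
```

A feature-selection fitness function could, for example, prefer features with higher entropy, but that design choice is not stated in the source.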
Leveraging App Features to Improve Mobile App Retrieval
by Messaoud CHAA, Omar Nouali, Patrice Bellot
Abstract: The continued increase in the use of smartphones and other mobile devices has led to a substantial increase in the demand for mobile applications (apps). With the growing availability of mobile apps, retrieving the right application from a large set has become difficult. Existing term-based search engines retrieve apps based on the keywords issued by the user rather than on the app features that users actually require, such as functionalities, technical characteristics, or characteristics related to the user interface. In this paper, we propose a Term and Feature-based approach that, in addition to terms, uses app features extracted from app descriptions and social users' reviews to retrieve the apps that are relevant to the query and meet the user's needs. The novelty of the proposed approach lies in representing both apps and queries by features and computing the relevance between them to obtain a feature-based score. In addition, our approach combines the feature-based score and the term-based score to obtain the relevance score of each app. Finally, we propose an effective technique that extracts and weights the features requested by the user in the query. The experimental results indicate that the proposed approach is effective and outperforms the state-of-the-art retrieval models for app retrieval.
Keywords: App Retrieval; Feature Extraction; Social Information Retrieval; Natural Language Processing; Feature-based Score; Term-based Score.
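The abstract above states that a term-based score and a feature-based score are combined into one relevance score, but not how. One common choice, shown here purely as an assumption, is linear interpolation with a mixing weight. The `combined_score` and `rank_apps` functions, the `weight` parameter, and the sample scores are hypothetical and are not taken from the paper.

```python
def combined_score(term_score, feature_score, weight=0.5):
    """Linear interpolation: `weight` is the share given to the feature-based score."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * feature_score + (1.0 - weight) * term_score


def rank_apps(scores, weight=0.5):
    """Rank app names by combined score, highest first.

    `scores` maps an app name to a (term_score, feature_score) pair.
    """
    return sorted(
        scores,
        key=lambda app: combined_score(scores[app][0], scores[app][1], weight),
        reverse=True,
    )


# Hypothetical, already-normalised scores for one query.
scores = {
    "notes": (0.9, 0.2),     # strong keyword match, weak feature match
    "scanner": (0.5, 0.8),   # moderate keyword match, strong feature match
    "camera": (0.3, 0.4),
}

balanced = rank_apps(scores, weight=0.5)
terms_only = rank_apps(scores, weight=0.0)
print(balanced)
print(terms_only)
```

With `weight=0.0` the ranking reduces to a purely term-based one, which makes the effect of the feature-based score easy to compare.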
Construction and Application of Knowledge Base in Telecom Fraud Domain
by Rongchen Zhu, Han Ye, Haichun Sun, Xin Li, Yongchen Duan, Jiaqi Hou
Abstract: The number of telecom fraud crimes has been increasing rapidly in China. To improve people's awareness of prevention and enhance the professional ability of public security police in investigating and combating such crimes, this paper proposes and implements a knowledge base for the telecom fraud domain. First, we improve the event-based telecom fraud ontology model by considering two perspectives: crime combat and crime prevention. We also conduct a fine-grained classification and a multi-dimensional fusion of the domain knowledge to improve data quality. Second, we introduce a new method for reusing knowledge of domain business processes, which supplements the knowledge representation and enhances the quality of the knowledge base. Finally, this paper presents two types of domain applications: a Q&A application and a case assistant. The experimental results show that the public and the police can get advice and inspiration from the resulting knowledge base.
Keywords: telecom fraud; knowledge base; combat and prevention.
Chronological Penguin Adam-based deep Long Short Term Memory classifier for stock market prediction
by Dattatray P. Gandhmal, Kumar K
Abstract: Forecasting individual stocks or indexes, whether the direction of market price movement or the level of future market prices, from time-series data is a challenging issue in the research community. Hence, analyzing, forecasting, and understanding stock market time-series data to predict future prices receives major attention in the research community. To make stock market prediction effective, this research proposes a method named Chronological Penguin Adam-based deep Long Short Term Memory (CPAdam-based deep LSTM) to forecast future prices in the stock market. Technical indicators are extracted from the stock market input data, and a wrapper approach is used to select suitable features. The deep Long Short Term Memory classifier performs the stock market prediction based on its associated memory cell, which acts as an accumulator that stores state information. Moreover, the deep LSTM is trained by the proposed CPAdam algorithm, which exploits adaptive learning rates. It uses the sparse gradient factor and step size annealing for approximating the hyperparameter. Using data set-2, the proposed CPAdam-based deep LSTM attained better performance in stock market prediction, with an MSE of 3.12E-05, a prediction accuracy of 68.576%, an RMSE of 0.006, a return per trade of 61.126%, and a winning rate of 82.14%.
Keywords: Penguin search optimization algorithm (PeSOA); Adam optimization algorithm; stock market prediction; wrapper approach; deep Long Short Term Memory (LSTM).
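The MSE and RMSE figures reported in the abstract above can be related with a short sketch. RMSE is by definition the square root of MSE, so, assuming both were computed on the same predictions, an MSE of 3.12E-05 implies an RMSE of roughly 0.0056, which is consistent with the reported 0.006 after rounding. The helper functions and the toy price series are hypothetical.

```python
import math


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


# Toy example with hypothetical normalised prices.
actual = [1.0, 2.0]
predicted = [1.0, 4.0]
print(mse(actual, predicted), rmse(actual, predicted))

# Consistency check on the reported figures (assumes the same evaluation set).
reported_mse = 3.12e-05
implied_rmse = math.sqrt(reported_mse)
print(round(implied_rmse, 3))
```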