Authors: M. Sreerama Murty; N. Naga MalleswaraRao
Addresses: Department of CSE, Acharya Nagarjuna University, Guntur, Andhra Pradesh, 522510, India; Department of IT, RVR & JC College of Engineering, Guntur, Andhra Pradesh, India
Abstract: This paper addresses loading and searching data held in local data nodes in a Hadoop environment. In general, loading and searching data with a query is complex because the dataset may be large. We propose a technique that handles the data in local nodes without overlap and retrieves it by script. The main task of the query is to store the information in a distributed environment and to search it without delay. We define the script so that it avoids redundant, duplicate data while searching and loading the data dynamically, and we also provide the Hadoop file system in a distributed environment. An Apache script, rather than an SQL mechanism, is used to load and search the information. We improve the performance of query execution and graph theory. The query can be split into three parts that search the data individually, and the results are combined during execution. We use the replica concept to store the data in the Hadoop file system while the query is executing. The script is executed in the environment where the Hadoop file system is located.
Keywords: HDFS; Hadoop distributed file system; replica; local; distributed; capacity; SQL; redundancy.
International Journal of Data Science, 2020 Vol.5 No.1, pp.41 - 52
Received: 20 Jan 2020
Accepted: 30 Mar 2020
Published online: 25 Aug 2020