Title: Automatic identification and classification of Palomar Transient Factory astrophysical objects in GLADE

Authors: Weijie Zhao; Florin Rusu; Kesheng Wu; Peter Nugent

Addresses: University of California Merced, 5200 N Lake Rd., Merced, CA 95343, USA; University of California Merced, 5200 N Lake Rd., Merced, CA 95343, USA; Lawrence Berkeley National Laboratory, 1 Cyclotron Rd., Berkeley, CA 94720, USA; Lawrence Berkeley National Laboratory, 1 Cyclotron Rd., Berkeley, CA 94720, USA

Abstract: Palomar Transient Factory (PTF) is a comprehensive detection system for the identification and classification of transient astrophysical objects. In this paper, we make two significant contributions to the PTF pipeline. First, we present an experimental study that evaluates a novel implementation of the real-time classifier in GLADE - a parallel data processing system that combines the efficiency of a database with the extensibility of map-reduce. We show how each stage in the classifier maps optimally into GLADE tasks by taking advantage of the unique features of the system - range-based data partitioning, columnar storage, multi-query execution, and in-database support for complex aggregate computation. Second, we introduce a novel parallel similarity join algorithm for advanced transient classification. We implement this algorithm in GLADE and execute it on a massive supercomputer with more than 3,000 threads, achieving more than three orders of magnitude improvement over the PostgreSQL solution.

Keywords: parallel databases; multi-query processing; scientific data analysis; similarity join; astronomical surveys; transient identification.

DOI: 10.1504/IJCSE.2018.093775

International Journal of Computational Science and Engineering, 2018 Vol.16 No.4, pp.337 - 349

Received: 13 Apr 2016
Accepted: 30 Jun 2016

Published online: 06 Aug 2018
