Title: Interval graph mining

Authors: Amina Kemmar; Yahia Lebbah; Samir Loudni

Addresses: LITIO, University of Oran 1 Ahmed Ben Bella, BP 1524 El M'Naouer, Oran, Algeria; ESE of Oran, BP 65 Ch 2 Achaba Hnifi USTO Oran, Algeria ' LITIO, University of Oran 1 Ahmed Ben Bella, BP 1524 El M'Naouer, Oran, Algeria ' GREYC (CNRS UMR 6072), University of Caen, France

Abstract: Frequent subgraph mining is a difficult data mining problem that aims to find the exact set of frequent subgraphs in a database of graphs. Current subgraph mining approaches rely on canonical encoding, which is one of their key operations. It is well known that canonical encodings have exponential time complexity. Consequently, mining all frequent patterns in large and dense graphs is computationally expensive. In this paper, we propose an interval approach to handle canonicity, leading to two encodings, a lower encoding and an upper encoding, each with polynomial time complexity, which together tightly enclose the exact set of frequent subgraphs. These two encodings lead to an interval graph mining algorithm in which two mining processes are launched in parallel: a lower mining using the lower encoding and an upper mining using the upper encoding. The interval graph mining approach has been implemented within the state-of-the-art Gaston miner. Experiments performed on synthetic and real graph databases drawn from stock market and biological datasets show that our interval graph mining is effective on dense graphs.
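
The cost gap the abstract refers to can be illustrated with a small sketch. It is not the paper's lower or upper encoding, which the abstract does not specify. The function `canonical_code` is a brute-force exact encoding (smallest adjacency string over all vertex orderings, hence factorial time), and `degree_code` is a hypothetical stand-in for a polynomial-time encoding, chosen here only because it is simple. Both names, and the unlabeled-graph representation as an edge list over vertices `0..n-1`, are assumptions made for illustration.

```python
from itertools import permutations


def canonical_code(n, edges):
    """Exact canonical code of an undirected, unlabeled graph.

    Tries all n! vertex orderings and keeps the lexicographically
    smallest upper-triangle adjacency string. Isomorphic graphs get
    the same code; non-isomorphic graphs get different codes.
    """
    edge_set = {frozenset(e) for e in edges}
    best = None
    for perm in permutations(range(n)):
        # perm[i] is the original vertex placed at position i.
        code = "".join(
            "1" if frozenset((perm[i], perm[j])) in edge_set else "0"
            for i in range(n)
            for j in range(i + 1, n)
        )
        if best is None or code < best:
            best = code
    return best


def degree_code(n, edges):
    """Illustrative polynomial-time encoding: the sorted degree sequence.

    It is invariant under relabeling, but it is not complete:
    some non-isomorphic graphs share the same code.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return tuple(sorted(deg))


# Two labelings of the same 5-vertex path, and a triangle plus a separate edge.
path_a = [(0, 1), (1, 2), (2, 3), (3, 4)]
path_b = [(2, 0), (0, 4), (4, 1), (1, 3)]
tri_edge = [(0, 1), (1, 2), (0, 2), (3, 4)]

print(canonical_code(5, path_a) == canonical_code(5, path_b))    # isomorphic graphs
print(canonical_code(5, path_a) == canonical_code(5, tri_edge))  # non-isomorphic graphs
print(degree_code(5, path_a) == degree_code(5, tri_edge))        # cheap code cannot tell them apart
```

In this sketch the exact code separates the path from the triangle-plus-edge graph, while the cheap code does not. A polynomial encoding of this kind can therefore only approximate the exact set of distinct patterns, which is the kind of gap that bounding the exact answer from both sides is meant to control.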

Keywords: graph mining; interval approach; frequent subgraph discovery; graph encoding; subgraph isomorphism; graph isomorphism.

DOI: 10.1504/IJDMMM.2018.089629

International Journal of Data Mining, Modelling and Management, 2018 Vol.10 No.1, pp.1-22

Accepted: 09 Jul 2017
Published online: 02 Feb 2018
