Authors: Pedro Martins; Pedro Furtado
Addresses: Faculdade de Ciências e Tecnologias, Universidade de Coimbra, Polo II, Rua Silvio Lima, 3030-790 Coimbra, Portugal; Faculdade de Ciências e Tecnologias, Universidade de Coimbra, Polo II, Rua Silvio Lima, 3030-790 Coimbra, Portugal
Abstract: In the past, data management research has concentrated on two separate data processing issues: heavy, database-like query processing, and high-throughput processing of high-rate stream data (complex event processing, CEP). However, in many practical contexts, high-rate stream processing and heavy data processing work together, for correlation, lookup, aggregation, merging or comparison with large amounts of previous data. We refer to these as stream-DB workloads. One way to provide scalability with any off-the-shelf engine is to use multiple machines and/or processor cores and to parallelise the load with an external scheduler, but nodes can still overload. We propose automated control for balancing and scalability over stream-DB workloads. The approach, called DynLW, offers scalability with an integrated mechanism that manages overload (re)scheduling, automated elasticity, shedding, admission control and overload alerts when resources are insufficient. As a result, the approach provides continuous and totally balanced operation, and avoids overload-related problems.
Keywords: complex event processing; CEP; parallel processing; distributed systems; scalability; load balancing; algorithms; data management; overload scheduling; admission control; heavy database workloads; query processing; stream data processing; high-rate data.
International Journal of Business Intelligence and Data Mining, 2014 Vol.9 No.1, pp.15 - 30
Available online: 24 Jun 2014