International Journal of High Performance Computing and Networking
Editor in Chief: Prof. Kuan-Ching Li
ISSN online: 1740-0570
ISSN print: 1740-0562
8 issues per year
IJHPCN addresses the most innovative developments in high-performance computing and networking such as information and system architectures, grid- and web-based information management and infrastructures, data storage, management, analysis and visualisation, advanced networking with applications, scalable parallel computing, cluster and grid computing, distributed systems, and high performance scientific and engineering computing with applications.
A few essentials for publishing in this journal
All articles for this journal must be submitted using our online submissions system.
- Forthcoming paper
Authors: M. Graham Lopez, Wayne Joubert, Veronica Vergara Larrea, Oscar Hernandez, Azzam Haidar, Stanimire Tomov, Jack Dongarra
Abstract: We present an extended exploration of the performance portability of directives provided by OpenMP 4 and OpenACC to program various types of node architecture with attached accelerators, both self-hosted multicore and offload multicore/GPU. Our goal is to examine how successful OpenACC and the newer offload features of OpenMP 4.5 are for moving codes between architectures, and we document how much tuning might be required and what lessons we can learn from these experiences. To do this, we use examples of algorithms with varying computational intensities for our evaluation, as both compute and data access efficiency are important considerations for overall application performance. To better understand fundamental compute vs. bandwidth bound characteristics, we add the compute-bound Level 3 BLAS GEMM kernel to our linear algebra evaluation. We implement the kernels of interest using various methods provided by newer OpenACC and OpenMP implementations, and we evaluate their performance on various platforms including both x86_64 and Power8 with attached NVIDIA GPUs, x86_64 multicores, self-hosted Intel Xeon Phi KNL, as well as an x86_64 host system with Intel Xeon Phi coprocessors. We update these evaluations with the newest version of the NVIDIA Pascal architecture (P100), Intel KNL 7230, Power8+, and the newest supporting compiler implementations. Furthermore, we present in detail what factors affected the performance portability, including how to pick the right programming model, its programming style, its availability on different platforms, and how well compilers can optimise and target multiple platforms.
- Forthcoming paper
Authors: Robert Searles, Stephen Herbein, Travis Johnston, Michela Taufer, Sunita Chandrasekaran
Abstract: High performance computing (HPC) offers tremendous potential to process large amounts of data, commonly referred to as big data. Owing to the immense computational requirements of big data applications, the HPC and big data communities are converging. As a result, heterogeneous and distributed systems are becoming commonplace. In order to take advantage of the immense computing power of these systems, distributing data efficiently and leveraging specialised hardware (e.g. accelerators) is critical. In this paper, we develop a portable, high-level paradigm that can be used to run big data applications on existing and future HPC systems. More specifically, we will target graph analytics applications, since these types of application are becoming increasingly popular in the big data and machine learning communities. Using our paradigm, we accelerate three real-world, compute- and data-intensive, graph analytics applications: a function call graph similarity application, a triangle enumeration subroutine, and a graph assaying application. Our paradigm uses the popular MapReduce framework, Apache Spark, in conjunction with CUDA, in order to simultaneously take advantage of automatic data distribution and specialised hardware present on each node of our HPC systems. We demonstrate scalability with regard to compute-intensive portions of the code that are parallelisable, as well as an exploration of the parameter space for each application. We show that our method yields a portable solution that can be used to leverage almost any legacy, current, or next-generation HPC or cloud-based system.
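As a rough illustration of one of the workloads named above, the following is a minimal single-process sketch of triangle enumeration written in a map/reduce style. It is not the authors' Spark and CUDA implementation; the function name `enumerate_triangles` and the edge-list input format are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def enumerate_triangles(edges):
    """Return a sorted list of triangles (u, v, w) with u < v < w.

    `edges` is an iterable of undirected (u, v) pairs; this input
    format is an assumption made for the sketch.
    """
    # "Map" step: orient every edge from its lower to its higher endpoint,
    # so that each triangle is discovered exactly once.
    higher_neighbours = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = (u, v) if u < v else (v, u)
        higher_neighbours[lo].add(hi)

    # "Reduce" step: for each vertex, every pair of higher neighbours is a
    # wedge; the wedge is a triangle if the closing edge exists.
    triangles = []
    for u, neighbours in list(higher_neighbours.items()):
        for v, w in combinations(sorted(neighbours), 2):
            if w in higher_neighbours.get(v, ()):
                triangles.append((u, v, w))
    return sorted(triangles)


if __name__ == "__main__":
    sample = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4)]
    print(enumerate_triangles(sample))
```

In a distributed setting the per-vertex wedge check is the part that would be spread across workers, since each vertex's wedges can be tested independently.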
- Forthcoming paper
Authors: Athanasios Kiourtis, Argyro Mavrogiorgou, Dimosthenis Kyriazis, Ilias Maglogiannis, Marinos Themistocleous
Abstract: The amount of digital information increases tenfold every year, owing to the exponential increase of Cyber-Physical Systems (CPS), real and virtual internet-connected sources. Most research focuses on data processing and interconnection, which raises a question about the interoperable use of data: if data is processed efficiently, how can unknown data be used in an application of a different nature? This paper presents a three-step approach to this question that follows the data lifecycle: a known CPS dataset is first stored in a domain-specific language, then translated into a domain-agnostic language, and finally, using the fitting function of an ANN, compared with an unknown dataset, resulting in the translation of the unknown dataset into the first dataset's domain. A scenario of the approach is provided, analysing the data interoperability challenges and needs emerging from today's Internet of Everything evolution, and studying the fields of data annotation, semantics, modelling, and characterisation.
- Forthcoming paper
Authors: Mydhili Palagummi, Ricardo Lent
Abstract: We examine the virtual network embedding problem with QoS constraints and formulate an approach that exploits the betweenness centrality of VNE requests to improve performance. A pay-per-use revenue model is introduced to evaluate the algorithm. An evaluation study using datacentre-like substrates and a wide area topology compares the approach with four embedding methods from the literature and reports on the average revenue rate, embedding success probability, average number of VNE deployments, cost, and impact of substrate failures on the operation of the VNEs, confirming the efficacy of the proposed approach.
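For readers unfamiliar with the metric mentioned above, the sketch below computes node betweenness centrality on a small unweighted, undirected graph using Brandes' algorithm. How the authors apply the metric to VNE requests is not described in the abstract; the adjacency-dictionary format and the function name are assumptions made for illustration.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalised node betweenness for an undirected, unweighted graph.

    `graph` maps each node to a list of its neighbours (assumed format).
    """
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search from `source`, counting shortest paths.
        order = []
        predecessors = {v: [] for v in graph}
        paths = {v: 0 for v in graph}
        paths[source] = 1
        distance = {v: -1 for v in graph}
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    paths[w] += paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies in reverse BFS order.
        dependency = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in centrality.items()}


if __name__ == "__main__":
    star = {"h": ["x", "y", "z"], "x": ["h"], "y": ["h"], "z": ["h"]}
    print(betweenness_centrality(star))
```

Nodes with high betweenness lie on many shortest paths, which is one plausible reason to treat them differently when placing a request on a substrate.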
- Forthcoming paper
Authors: Rocco Aversa, Luca Tasquier
Abstract: Cloud federation is an emerging computing model where multiple resources from independent cloud providers are leveraged to create large-scale distributed virtual computing clusters, operating as if within a single cloud organisation. This concept of service aggregation is characterised by interoperability features, which can address different problems of inter-cloud collaboration, such as vendor lock-in. Furthermore, it approaches challenges like performance and disaster recovery through methods such as co-location and geographic distribution. One of the main issues within a cloud federation is related to the monitoring of the application deployed on resources coming from different vendors belonging to the federation. In this work we present an agent-based architecture and its prototypal implementation that aims at monitoring the user's cloud environment provided by the federation: the elasticity of the proposed architecture allows the configuration and customisation of the monitoring infrastructure to adapt it to the specific cloud application. A multi-layer architecture is proposed, where each part monitors different aspects of the multi-cloud infrastructure, starting from the detection of critical conditions on low-level parameters for the computational units and composing different monitoring levels in order to check the federated SLA. The agent-based approach will introduce fault tolerance and scalability to the monitoring architecture, while the agents' reactivity and proactivity capabilities will allow a deep and intelligent monitoring, where each agent can focus on different aspects of the monitoring activity, from low-level performance indexes to the checking of the federated SLA compliance. Agents will be strengthened by algorithms and rules used to monitor QoS parameters that are critical for the specific application; the configuration of the adaptive monitoring environment will be made easier by an interface that will help the user in describing his/her application's deployment. The prototypal implementation of the proposed framework will be applied on a testbed application to validate the monitoring architecture.
- Forthcoming paper
Authors: Takuma Yamaguchi, Kohei Fujita, Tsuyoshi Ichimura, Muneo Hori, Lalith Maddegedara
Abstract: We accelerate CPU-based unstructured implicit low-order finite-element simulations by porting them to a GPU-CPU heterogeneous compute environment using OpenACC. We modify the algorithms of performance-sensitive parts, such as sparse matrix-vector multiplication and MPI communication, so that the computations are suitable for GPUs. Other parts of the earthquake simulation code are ported by directly inserting OpenACC directives into the CPU code. This porting approach enables high performance with relatively low development costs. When comparing eight K computer nodes and eight NVIDIA Pascal P100 GPUs, we achieve a 20.8-fold speedup for the 3-by-3 block Jacobi preconditioned conjugate gradient finite-element solver. We show the effectiveness of the proposed method through many-case crust-deformation simulations and a large-scale computation using a finite element model with billion degrees-of-freedom on a GPU cluster.
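To make the solver named above more concrete, here is a minimal dense, pure-Python sketch of a conjugate gradient iteration preconditioned with 3-by-3 diagonal blocks. It is only a toy under stated assumptions: the matrix is a small dense symmetric positive definite list of lists whose size is a multiple of three, and the helper names (`solve3`, `block_jacobi_pcg`) are hypothetical. It does not reflect the authors' sparse, OpenACC-based GPU code.

```python
def solve3(m, b):
    """Solve a 3x3 linear system m x = b by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    x = []
    for col in range(3):
        mc = [row[:] for row in m]
        for i in range(3):
            mc[i][col] = b[i]  # replace one column with the right-hand side
        x.append(det(mc) / d)
    return x


def block_jacobi_pcg(A, b, tol=1e-10, max_iter=200):
    """Preconditioned CG for a dense SPD matrix A (size a multiple of 3)."""
    n = len(b)

    def matvec(x):
        return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def precondition(r):
        # Apply the inverse of each 3x3 diagonal block independently.
        z = []
        for k in range(0, n, 3):
            block = [A[i][k:k + 3] for i in range(k, k + 3)]
            z.extend(solve3(block, r[k:k + 3]))
        return z

    x = [0.0] * n
    r = b[:]
    z = precondition(r)
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break  # residual norm is small enough
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = precondition(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


if __name__ == "__main__":
    n = 6
    A = [[4.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
          for j in range(n)] for i in range(n)]
    b = [sum(row) for row in A]  # exact solution is a vector of ones
    print(block_jacobi_pcg(A, b))
```

The block solves in the preconditioner are independent of one another, which is the property that makes this kind of preconditioner a natural fit for parallel hardware.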
- 29 - 31 October 2018
Selected authors will be invited to elaborate on their research topic and submit the results to the journal for review and potential publication.