Forthcoming articles


International Journal of High Performance Systems Architecture


These articles have been peer-reviewed and accepted for publication in IJHPSA, but are pending final changes, are not yet published and may not appear here in their final order of publication until they are assigned to issues. Therefore, the content conforms to our standards but the presentation (e.g. typesetting and proof-reading) is not necessarily up to the Inderscience standard. Additionally, titles, authors, abstracts and keywords may change before publication. Articles will not be published until the final proofs are validated by their authors.


Forthcoming articles must be purchased for the purposes of research, teaching and private study only. These articles can be cited using the expression "in press". For example: Smith, J. (in press). Article Title. Journal Title.


International Journal of High Performance Systems Architecture (8 papers in press)


Regular Issues


  • Soft Skills Requirements in Mobile Applications Development Employment Market
    by Jingdong Jia, Zupeng Chen, Xi Liu 
    Abstract: The soft skills of developers have a major influence on the quality of software products and projects. However, which soft skills are important for mobile applications development remains unknown. Additionally, it is necessary to examine the differences in soft skills requirements between traditional software development and mobile applications development. In this article, based on text mining including word segmentation, similarity calculation and clustering analysis, we analyse a large number of job advertisements and extract 13 categories of soft skills requirements for mobile applications development. We also compare the categories with those for traditional software development. We find that communication and teamwork are still the two most important soft skills. However, fast learning is more important for mobile developers, and we identify four soft skills that have not been proposed before. Additionally, the season has a minor impact on the soft skills requirements of mobile applications development.
    Keywords: soft skill; mobile application development; job advertisement; text mining; cluster analysis.

  • Energy Optimized Cryptography (EOC) for Low Power Devices in Internet of Things
    by RAJESH G, Vamsi Krishna C, Christopher Selvaraj B, Roshan Karthik S, Arun Kumar Sangaiah 
    Abstract: The Internet of Things (IoT) has a plethora of devices, ranging from high-capacity servers to low-power devices that work with Bluetooth, ZigBee, GPRS, RFID, WiFi, etc. These low-power devices are constrained by security, power management, reliability and privacy limitations. Existing traditional security algorithms cannot be applied to these low-power devices because of their high processing and battery power requirements. This paper proposes Energy Optimized Cryptography (EOC) for low-power devices in IoT. The security of the low-power devices is provided by two lightweight security techniques: R2CV, a sub-key generation method, and the Optimized Message Authentication Code Generation Function (OMGF), which maintain security without compromising energy and processing power consumption. The proposed security algorithms reduce the computational requirements for sub-key generation and MAC generation in low-power devices. The experimental results are compared with existing security algorithms such as RC5 and SHA, and indicate that R2CV and OMGF reduce the time consumed and increase battery life, which in turn extends the network lifetime.
    Keywords: IoT Security; low-power devices; Message authentication code; Energy efficiency; Internet of Things.

  • Real-Time Physical Register File Allocation with Neural Networks for Simultaneous Multi-Threading Processors
    by Wenjun Wang, Wei-Ming Lin 
    Abstract: Simultaneous Multi-Threading (SMT) processors improve system performance by allowing concurrent execution of multiple independent threads with shared key resources. The physical register file, shared among the threads in real time, is one of the most critical resources in determining overall system performance. Disproportionate distribution of registers among the threads may easily hamper normal processing of some threads. In this paper, we develop a machine learning algorithm to efficiently allocate registers among concurrently executing threads based on current resource utilization. An off-line training process is first employed to establish a well-trained neural network, which is then applied to dynamically adjust the resource distribution in real time. Our experimental results on M-sim, a multi-threaded micro-architectural simulation environment, show that our proposed technique improves the average system throughput by up to 42% without sacrificing execution fairness among the threads.
    Keywords: Simultaneous Multi-Threading; Register Renaming; Physical Register File; Neural Networks; Machine Learning.

  • Multiprocessing Scalable String Matching Algorithm for Network Intrusion Detection System
    by Adnan Hnaif, Ali Aldahoud, Mohammad Alia, Issa Al’otoum, Duaa Hani 
    Abstract: The rapidly increasing speed of today's computer networks affects the detection speed of security tools, and traditional security tools such as firewalls are insufficient to protect networks from external threats. Intrusion Detection Systems (IDS) are among the most reliable tools for monitoring all network traffic to identify unauthorized usage of computer system networks. In this paper, we propose a scalable string matching algorithm for Network IDS (NIDS) that enhances the speed of the NIDS detection engine, called the Multiprocessing Scalable String Matching Algorithm for Network Intrusion Detection System (MSNIDS). MSNIDS is implemented using the enhanced weighted exact matching algorithm (EWEMA) in both sequential and parallel processing. MSNIDS based on EWEMA can achieve more than 89% in sequential processing time compared with WEMA, and 86% in parallel processing time compared with sequential matching processing.
    Keywords: String Matching Algorithms; Distributed Architecture; Parallel Processing; Network Intrusion Detection System.

  • An Efficient VLSI Architecture For Two-Dimensional Discrete Wavelet Transform
    by Rohan Pinto 
    Abstract: In this paper, a memory efficient 2-D discrete wavelet transform (DWT) structure is presented for high-speed application. The architecture is based on the modified lifting scheme to reduce the critical path to one multiplier delay. In order to increase the speed of processing, four pipeline stages are introduced in the structure. The computation time for an N x N image is N²/4, as the throughput rate of the structure is four. The results after comparison reveal that the proposed architecture has a temporal memory lower than the other DWT architectures. The Z-scan method is employed to fetch the input data which suits the transpose unit design. Five registers and a multiplexer constitute a transpose unit, which is required to transpose the data between the row and the column processor. The proposed 2-D dual-scan DWT architecture has the merits of low latency, low control complexity and regular signal flow, making it suitable for a very large-scale integration (VLSI) implementation. The architecture is modeled in VHDL and synthesized with the CMOS 180nm technology.
    Keywords: Discrete wavelet transform (DWT); lifting scheme; pipeline; VLSI; architecture.

  • Heterogeneous Computing on Mobile GPU-FPGA Cooperation Platform
    by Nan Hu, Xuehai Zhou, Xi Li 
    Abstract: In recent years, mobile GPUs have been widely adopted in System-on-Chip (SoC) platforms, especially in the graphics area. Meanwhile, reconfigurable processors and emerging FPGA computing devices are also widely used. However, research on mobile GPUs cooperating with FPGAs for general-purpose computing is still scarce. Such heterogeneous systems pose a great challenge to parallel programming. In this paper, we propose a Flow-Lead-In Architecture (FLIA) as a unified data-flow-driven development model based on coupled GPU-FPGA. The servant represents an intermediate language module that is compiled from the high-level programming language and is compiled to different types of processors at runtime. Execution-flow abstracts the communication task between the servants and controls the pipeline execution for spatial parallelism. By scheduling multiple servants to heterogeneous processors, the cooperation system uses fewer resources to achieve performance and power close to those of a pure FPGA system.
    Keywords: heterogeneous computing; GPU-FPGA cooperation; mobile GPU; ARM GPU FPGA partitioning; reconfigurable computing.

  • A framework for evaluating branch predictors using multiple performance parameters
    by Moumita Das, Ansuman Banerjee, Bhaskar Sardar 
    Abstract: Selecting a branch predictor for a program is a challenging task. The performance of a branch predictor is measured not only by its prediction accuracy: parameters such as predictor size, energy expenditure and latency of execution also play a key role in predictor selection. For a specific program, a predictor that provides the best results on one of these parameters may not be the best when some other parameter is considered. Selecting the best predictor considering all the different parameters is therefore a non-trivial task, and is considered one of the foremost challenges. In this paper, we propose a framework to systematically address this challenge using the concept of aggregation and unification. For a given program, our framework considers the performance of the different predictors with respect to the different parameters, and makes a predictor selection based on all of them. On one side, our framework can be an important aid for deciding on the best predictor to use at runtime. On the other side, a newly proposed predictor can be systematically evaluated and placed in the context of existing ones, considering the parameters of choice. We present experimental results of our framework on the Siemens, SPEC 2006 and SPEC 2017 benchmarks.
    Keywords: Branch prediction; prediction accuracy; execution latency; rank aggregation.

Special Issue on: On-Chip Communication Theory and Applications

  • Parallel Video Processing on FPGA Architecture
    by Lamjed Touil, Abdessalem Bn Abdelali, Lilia Kechiche, Bouraoui Ouni, Abdelatif MTIBAA 
    Abstract: Real-time video applications are becoming widely used in many domains, with growing demand for high performance. Video processing is computationally intensive and usually has accompanying real-time or super-real-time requirements. For example, multiple cameras are used in monitoring and surveillance systems that automatically analyse video in real time to detect unusual events. Due to the heavy computational load imposed by video algorithms, real-time video processing is notably amenable to concurrent processing. Classical implementation solutions, whether based on general-purpose processors or dedicated ones such as DSPs, cannot deliver the required performance. In this article, we focus on the applicability of reconfigurable computing architectures to parallel video processing applications. The experimental results show that the proposed hardware-oriented multi-treatment architecture can provide an average frame rate of 45 frames/s at high-definition resolution. Resource statistics show consumption of about 18% of logic resources and 27% of on-chip memory, which leaves room to integrate additional treatments.
    Keywords: FPGA; MPMC; Video processing; Cut Detection; Picture in Picture.