International Journal of High Performance Systems Architecture (6 papers in press)
Towards Designing Quantum Reversible 32-bit MIPS Register File
by Mohammad Samadi Gharajeh, Majid Haghparast
Abstract: Reversible circuit design can be applied in various emerging technologies such as quantum computing. Since researchers have already proposed many building blocks and designed small circuits (e.g., a reversible full adder), it is time to design large-scale reversible circuits. This paper proposes a novel quantum reversible 32-bit MIPS register file for quantum computer processors. It presents a reversible 5-to-32 decoder, thirty-two reversible buffer registers, and two reversible 32-to-1 multiplexers. The proposed reversible decoder block, namely GH-DEC, and the proposed reversible multiplexer block, namely GH-MUX, use the Feynman, Toffoli, and Fredkin gates. They have been designed with a minimum number of constant inputs, number of garbage outputs, and quantum cost. In addition, the output expressions of all the circuits are simplified to considerably enhance the performance of the proposed quantum design. Comparison results show that the proposed reversible design surpasses existing works in terms of the number of constant inputs, number of garbage outputs, and quantum cost.
Keywords: Reversible Circuit Design; Quantum Computing; Reversible Register File; Reversible Decoder; Reversible Multiplexer.
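The three gates named in the abstract above can be illustrated with a minimal classical sketch. It models each gate as a bit-level function and checks that each is a bijection, which is what makes it reversible. The helper names (`feynman`, `toffoli`, `fredkin`, `mux2`, `is_reversible`) are assumptions made for illustration. The 2-to-1 multiplexer is the standard single-Fredkin construction, not the paper's GH-MUX block.

```python
from itertools import product


def feynman(a, b):
    """Feynman (CNOT) gate: (a, b) -> (a, a XOR b)."""
    return a, a ^ b


def toffoli(a, b, c):
    """Toffoli gate: flips c only when both controls a and b are 1."""
    return a, b, c ^ (a & b)


def fredkin(c, a, b):
    """Fredkin (controlled-swap) gate: swaps a and b when control c is 1."""
    return (c, b, a) if c else (c, a, b)


def mux2(sel, x, y):
    """2-to-1 multiplexer from one Fredkin gate (illustrative only).

    Returns x when sel == 0 and y when sel == 1. The other two gate
    outputs are not used here; in reversible-logic terms they would be
    counted as garbage outputs.
    """
    _, out, _ = fredkin(sel, x, y)
    return out


def is_reversible(gate, n_inputs):
    """A gate is reversible if distinct inputs always give distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n_inputs)}
    return len(outputs) == 2 ** n_inputs
```

Calling `is_reversible(toffoli, 3)` enumerates all eight input patterns and confirms that no two map to the same output. A larger multiplexer, such as a 32-to-1 selector, could in principle be composed from a tree of such 2-to-1 stages, although the paper's own construction may differ.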
A Review of Shared Resource Contention in Multicores and its Mitigating
by Preeti Jain, Sunil Surve
Abstract: Chip Multiprocessor (CMP) systems have become inevitable for meeting high computing demands. Their high potential and the reduced latency of inter-processor communication among the CMP cores make them a viable solution for parallel execution, in contrast to conventional single-core processors. In such systems, sharing of resources is imperative for better resource utilization. The challenge arises when application programs running on neighbouring cores compete for these resources concurrently and introduce contention. Furthermore, the urgency to mitigate contention increases as process-level parallelism grows rapidly. Extensive studies have been carried out on contention due to resource sharing, and various techniques have been proposed to mitigate it. We present, in a simple and lucid manner, a summary of previous work on contention in multicores due to shared resources such as shared caches, main memory, memory bus bandwidth, and prefetchers. The work aims to briefly discuss, under a single umbrella, the key ideas proposed by the research community to alleviate contention for these resources. The paper provides a better understanding of the contention problem in multicores by presenting a cumulative overview of previous challenges due to all shared resources. The work highlights that no single shared component is the dominant reason for performance degradation in CMPs; rather, all elements in the memory hierarchy introduce resource contention and thereby affect performance cumulatively. The work would assist novice readers, researchers, and academicians in proposing optimal policies to address contention when designing multicore applications, considering the overall impact of these resources on the performance of multicore systems.
Keywords: Multicore; shared resources; contention; LLC; Main memory; bus bandwidth; prefetching; mitigating techniques.
Efficient Hardware Implementations of QTL Cipher for RFID Applications
by Nivedita Shrivastava, Pulkit Singh, Bibhudendra Acharya
Abstract: The extensive deployment of ubiquitous computing devices brings a wide new range of privacy and security issues in the low-resource domain. Various lightweight algorithms have been proposed to solve this security problem in resource-constrained environments. In this work, optimized hardware implementations of the lightweight block cipher QTL are proposed in order to provide security with optimum resource utilization. The proposed reduced-datapath architecture lowers resource utilization and gives a good trade-off between area footprint and performance. In the proposed pipelined architecture, the encryption round is divided into two sub-stages, and registers are introduced between these stages. This design methodology significantly improves the operating frequency, making the design apt for high-speed applications. The proposed unified architecture combines three different architectures of the QTL encryption process into a single design. It is suitable for resource-constrained applications such as Radio Frequency Identification (RFID) and is also able to provide flexible security according to the requirement. This design controls the security level by introducing some extra hardware for selecting the desired key scheduling mechanism. All three architectures are extensively evaluated and compared on the basis of performance, area utilization, energy requirement, and power consumption for their implementation on different FPGA platforms.
Keywords: Block Random Access Memory; Field Programmable Gate Array; Lookup Table; Lightweight; Ciphers; Security.
Performance Optimized Architectures of Piccolo Block Cipher for Low Resource IoT Applications
by Gandu Ramu, Zeesha Mishra, Pulkit Singh, Bibhudendra Acharya
Abstract: Radio Frequency Identification (RFID) devices and Wireless Sensor Networks (WSNs), which operate in constrained environments, have expanded with the current trend of network capabilities and ubiquitous computing, which has spread the wings of the Internet of Things (IoT). Piccolo is an ultra-lightweight block cipher that uses 64-bit plaintext and two key versions, 80-bit and 128-bit, which makes it suitable for low-computing devices. Different hardware architectures are proposed to make it suitable for low-area, low-power, and high-speed applications. Strategies such as loop-rolled, parallel round-based, and pipelined architectures are employed to optimize the hardware design for low-resource applications. The proposed architectures have been implemented on FPGA, achieving throughputs of 691.54, 613.26, and 1195.54 Mbps, as well as a slice count of 47, which results in low area and low power.
Keywords: IoT; RFID; lightweight cryptography; encryption; Piccolo algorithm; Feistel structure; S-box; FPGA; throughput.
Special Issue on: Recent Advances in the Security of Multimedia Big Data in Semantic Web-based Social Networks
A new correntropy-based level set algorithm using local robust statistics information
by Sheng Wang, Xiaoliang Jiang
Abstract: Intensity inhomogeneity and noise commonly appear in many kinds of images, which makes image segmentation a challenging task. To address these issues, a novel correntropy-based level set algorithm utilizing local robust statistics information is introduced. In the proposed method, the modified local image fitting (MLIF) equation is built by describing the difference between the fitted image and the local robust statistics image. Then, by using the correntropy criterion, the MLIF model can automatically emphasize the weight coefficients of the samples that are close to the gray means. As a result, the new guided energy term can accurately process images with weak edges and is more adaptive to noise. Finally, we introduce a level set regularization term to remove the re-initialization process. Experiments on many images demonstrate that our method achieves good segmentation in terms of visual perception and robustness compared with traditional algorithms.
Keywords: Correntropy-based; level set; local robust statistics.
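The weighting effect described in the abstract above, where samples close to the gray mean receive more emphasis, can be illustrated with a minimal sketch. It assumes the Gaussian kernel that is commonly used with correntropy. The function name `correntropy_weights` and the parameter `sigma` are hypothetical, and the sketch does not reproduce the paper's MLIF energy.

```python
import math


def correntropy_weights(samples, center, sigma=1.0):
    """Gaussian-kernel weights (an assumed kernel choice, for illustration).

    Each sample s gets weight exp(-(s - center)^2 / (2 * sigma^2)), so
    samples near `center` get weights near 1 and distant samples
    (for example noisy pixels) get weights near 0.
    """
    return [
        math.exp(-((s - center) ** 2) / (2.0 * sigma ** 2))
        for s in samples
    ]


# Hypothetical gray values: two near the local mean of 10, one outlier.
weights = correntropy_weights([10.0, 11.0, 50.0], center=10.0, sigma=5.0)
```

With these made-up values, the outlier at 50 receives a far smaller weight than the two samples near the mean. Kernel-based weighting of this kind is what tends to make such criteria less sensitive to noise than a plain squared error.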
A Video Image Detection Approach Based on Cooperative Positioning
by Cai Aiping
Abstract: In machine vision positioning systems with multiple viewing angles and cameras, target detection tasks usually need to be performed with the help of cooperative positioning. For traditional target positioning tasks, classifiers are used to distinguish foregrounds from backgrounds, or new visual dictionaries are created by learning features of samples. Models built by classifier algorithms are in most cases complicated, compromising algorithm efficiency while ensuring model accuracy. This paper proposes a detection approach based on cooperative positioning of locations and mutex constraints, with which the efficiency of classification can be improved. First, a mutex matrix of consecutive frame images in video clips is computed, and a target function model is built by performing an exclusive-OR operation on this matrix and the original mutex matrix. Then the L-K optical flow approach is used to obtain an optical flow map and the motion-related prior information of the candidate objects. Next, a mutex matrix based on overlapping degree and aspect ratio is used to keep candidate objects that overlap and differ greatly in appearance from being selected from single images. When either mutex is satisfied, the two kinds of candidate targets cannot be included in the maximum weight clique simultaneously, and thereby the problem of cooperative positioning modeling is solved. According to the test results, better detection can be achieved, since the location and scale of an object within one video clip do not undergo any sudden and drastic change.
Keywords: image classification; co-localization; video detection; mutex constrained.