Authors: Francisco J. Cazorla, Alex Ramirez, Mateo Valero, Enrique Fernandez
Addresses: Department of Computer Architecture, UPC, Jordi Girona 1-3, D6, 08034 Barcelona, Spain; Department of Computer Architecture, UPC, Jordi Girona 1-3, D6, 08034 Barcelona, Spain; Department of Computer Architecture, UPC, Jordi Girona 1-3, D6, 08034 Barcelona, Spain; University of Las Palmas de Gran Canaria, Departamento de Informatica y Sistemas, Campus Universidad de Tafira, Las Palmas de Gran Canaria 35017, Spain
Abstract: Simultaneous multithreading (SMT) processors fetch instructions from several threads, increasing the instruction-level parallelism exposed to the processor. In an SMT processor, the fetch engine decides which threads enter the processor and which have priority in using its resources. The fetch engine therefore determines how shared resources are allocated and plays a key role in the final performance of the machine. When a thread experiences an L2 cache miss, it can monopolise critical resources for a long time, throttling the execution of the remaining threads. Several approaches have been proposed to cope with this problem. The first contribution of this paper is an evaluation and comparison of the three best published policies addressing the long-latency load problem. The second and main contribution is a set of improved versions of these three policies. Our results show that the improved versions significantly outperform the original ones in both throughput and fairness.
Keywords: SMT processors; simultaneous multithreading; fetch policy; long latency loads; load miss predictors; multiple threading; high performance computing; fetch engine; memory latencies.
International Journal of High Performance Computing and Networking, 2004 Vol.2 No.1, pp.45 - 54
Available online: 14 Mar 2006