


ACM Transactions on Architecture and Code Optimization, Volume 17
Volume 17, Number 1, March 2020
- Yuhao Li, Dan Sun, Benjamin C. Lee:
  Dynamic Colocation Policies with Reinforcement Learning. 1:1-1:25
- Nikolaos Tampouratzis, Ioannis Papaefstathiou, Antonios Nikitakis, Andreas Brokalakis, Stamatis Andrianakis, Apostolos Dollas, Marco Marcon, Emanuele Plebani:
  A Novel, Highly Integrated Simulator for Parallel and Distributed Systems. 2:1-2:28
- Lijuan Jiang, Chao Yang, Wenjing Ma:
  Enabling Highly Efficient Batched Matrix Multiplications on SW26010 Many-core Processor. 3:1-3:23
- Mustafa Cavus, Resit Sendag, Joshua J. Yi:
  Informed Prefetching for Indirect Memory Accesses. 4:1-4:29
- Yohann Uguen, Florent de Dinechin, Victor Lezaud, Steven Derrien:
  Application-Specific Arithmetic in High-Level Synthesis Tools. 5:1-5:23
- Yang Song, Bill Lin:
  Improving Memory Efficiency in Heterogeneous MPSoCs through Row-Buffer Locality-aware Forwarding. 6:1-6:26
- Hao Wu, Weizhi Liu, Huanxin Lin, Cho-Li Wang:
  A Model-Based Software Solution for Simultaneous Multiple Kernels on GPUs. 7:1-7:26
- Xuanhua Shi, Wei Liu, Ligang He, Hai Jin, Ming Li, Yong Chen:
  Optimizing the SSD Burst Buffer by Traffic Detection. 8:1-8:26
Volume 17, Number 2, June 2020
- Charu Kalra, Fritz Previlon, Norm Rubin, David R. Kaeli:
  ArmorAll: Compiler-based Resilience Targeting GPU Applications. 9:1-9:24
- Stefano Cherubin, Daniele Cattaneo, Michele Chiari, Giovanni Agosta:
  Dynamic Precision Autotuning with TAFFO. 10:1-10:26
- Ahmet Erdem, Cristina Silvano, Thomas Boesch, Andrea C. Ornstein, Surinder Pal Singh, Giuseppe Desoli:
  Runtime Design Space Exploration and Mapping of DCNNs for the Ultra-Low-Power Orlando SoC. 11:1-11:25
- Amir Hossein Nodehi Sabet, Junqiao Qiu, Zhijia Zhao, Sriram Krishnamoorthy:
  Reliability Analysis for Unreliable FSM Computations. 12:1-12:23
- Jiachen Xue, T. N. Vijaykumar, Mithuna Thottethodi:
  Network Interface Architecture for Remote Indirect Memory Access (RIMA) in Datacenters. 13:1-13:22
- Qinggang Wang, Long Zheng, Jieshan Zhao, Xiaofei Liao, Hai Jin, Jingling Xue:
  A Conflict-free Scheduler for High-performance Graph Processing on Multi-pipeline FPGAs. 14:1-14:26
- Anita Tino, Caroline Collange, André Seznec:
  SIMT-X: Extending Single-Instruction Multi-Threading to Out-of-Order Cores. 15:1-15:23
Volume 17, Number 3, August 2020
- David R. Kaeli:
  Editorial: A Message from the Editor-in-Chief. 16:1-16:2
- Ram Rangan, Mark W. Stephenson, Aditya Ukarande, Shyam Murthy, Virat Agarwal, Marc Blackstein:
  Zeroploit: Exploiting Zero Valued Operands in Interactive Gaming Applications. 17:1-17:26
- Karel Adámek, Sofia Dimoudi, Mike B. Giles, Wesley Armour:
  GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory. 18:1-18:20
- Arnab Das, Sriram Krishnamoorthy, Ian Briggs, Ganesh Gopalakrishnan, Ramakrishna Tipireddy:
  FPDetect: Efficient Reasoning About Stencil Programs Using Selective Direct Evaluation. 19:1-19:27
- Tarek S. Abdelrahman:
  Cooperative Software-hardware Acceleration of K-means on a Tightly Coupled CPU-FPGA System. 20:1-20:24
- Jaekyu Lee, Yasuo Ishii, Dam Sunwoo:
  Securing Branch Predictors with Two-Level Encryption. 21:1-21:25
- Luca Cerina, Marco D. Santambrogio, Giuseppe Franco, Claudio Gallicchio, Alessio Micheli:
  EchoBay: Design and Optimization of Echo State Networks under Memory and Time Constraints. 22:1-22:24
- Savvas Sioutas, Sander Stuijk, Twan Basten, Henk Corporaal, Lou J. Somers:
  Schedule Synthesis for Halide Pipelines on GPUs. 23:1-23:25
- Muhammad Huzaifa, Johnathan Alsop, Abdulrahman Mahmoud, Giordano Salvador, Matthew D. Sinclair, Sarita V. Adve:
  Inter-kernel Reuse-aware Thread Block Scheduling. 24:1-24:27
Volume 17, Number 4, November 2020
- Gokul Subramanian Ravi, Joshua San Miguel, Mikko H. Lipasti:
  SHASTA: Synergic HW-SW Architecture for Spatio-temporal Approximation. 25:1-25:26
- Aravind Acharya, Uday Bondhugula, Albert Cohen:
  Effective Loop Fusion in Polyhedral Compilation Using Fusion Conflict Graphs. 26:1-26:26
- Steffen Maass, Mohan Kumar Kumar, Taesoo Kim, Tushar Krishna, Abhishek Bhattacharjee:
  ECOTLB: Eventually Consistent TLBs. 27:1-27:24
- Anchu Rajendran, V. Krishna Nandivada:
  DisGCo: A Compiler for Distributed Graph Analytics. 28:1-28:26
- Yu Zhang, Xiaofei Liao, Lin Gu, Hai Jin, Kan Hu, Haikun Liu, Bingsheng He:
  AsynGraph: Maximizing Data Parallelism for Efficient Iterative Graph Processing on GPUs. 29:1-29:21
- Yemao Xu, Dezun Dong, Yawei Zhao, Weixia Xu, Xiangke Liao:
  OD-SGD: One-Step Delay Stochastic Gradient Descent for Distributed Training. 30:1-30:26
- Xinfeng Xie, Xing Hu, Peng Gu, Shuangchen Li, Yu Ji, Yuan Xie:
  NNBench-X: A Benchmarking Methodology for Neural Network Accelerator Designs. 31:1-31:25
- S. VenkataKeerthy, Rohit Aggarwal, Shalini Jain, Maunendra Sankar Desarkar, Ramakrishna Upadrasta, Y. N. Srikant:
  IR2VEC: LLVM IR Based Scalable Program Embeddings. 32:1-32:27
- Jhe-Yu Liou, Xiaodong Wang, Stephanie Forrest, Carole-Jean Wu:
  GEVO: GPU Code Optimization Using Evolutionary Computation. 33:1-33:28
- Rolando Brondolin, Marco D. Santambrogio:
  A Black-box Monitoring Approach to Measure Microservices Runtime Performance. 34:1-34:26
- Utpal Bora, Santanu Das, Pankaj Kukreja, Saurabh Joshi, Ramakrishna Upadrasta, Sanjay V. Rajopadhye:
  LLOV: A Fast Static Data-Race Checker for OpenMP Programs. 35:1-35:26
- George Christou, Giorgos Vasiliadis, Vassilis Papaefstathiou, Antonis Papadogiannakis, Sotiris Ioannidis:
  On Architectural Support for Instruction Set Randomization. 36:1-36:26
- Athanasios Stratikopoulos, Christos Kotselidis, John Goodacre, Mikel Luján:
  FastPath_MP: Low Overhead & Energy-efficient FPGA-based Storage Multi-paths. 37:1-37:23
- Cristóbal Ramírez, César-Alejandro Hernández-Calderón, Oscar Palomar, Osman S. Unsal, Marco Antonio Ramírez, Adrián Cristal:
  A RISC-V Simulator and Benchmark Suite for Designing and Evaluating Vector Architectures. 38:1-38:30
- Sam Likun Xi, Yuan Yao, Kshitij Bhardwaj, Paul N. Whatmough, Gu-Yeon Wei, David Brooks:
  SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads. 39:1-39:26
- Albin Eldstål-Ahrens, Ioannis Sourdis:
  MemSZ: Squeezing Memory Traffic with Lossy Compression. 40:1-40:25
- Dennis Pinto, José-María Arnau, Antonio González:
  Design and Evaluation of an Ultra Low-power Human-quality Speech Recognition System. 41:1-41:19
