Publications

All Papers

48

11

TLP Balancer: Predictive Thread Allocation for Multi-Tenant Inference in Embedded GPUs
IEEE Embedded Systems Letters

12

VitBit: Enhancing Embedded GPU Performance for AI Workloads through Register Operand Packing
ICPP 2024

13

SAVector: Vectored Systolic Arrays
IEEE Access

14

Conflict-Aware Compiler for Hierarchical Register File on GPUs
Journal of Systems Architecture

15

Adaptive Kernel Merge and Fusion for Multi-Tenant Inference in Embedded GPUs
IEEE Embedded Systems Letters

16

Warped-MC: An Efficient Memory Controller Scheme for Massively Parallel Processors
ICPP 2023

17

Imprecise Store Exceptions
ISCA 2023

18

R2D2: Removing ReDunDancy Utilizing Linearity of Address Generation in GPUs
ISCA 2023

19

A Low-latency On-chip Cache Hierarchy for Load-to-use Stall Reduction in GPUs
Negin Mahani, Hajar Falahati, Sina Darabi, Ahmad Javadi-Nezhad, Yunho Oh, Mohammad Sadrosadati, Hamid Sarbazi-Azad, Babak Falsafi
ACM Transactions on Architecture and Code Optimization

20

An Entropy Model for GPU Register Compression
Minsik Kim, Yunho Oh, Won Woo Ro
Journal of Semiconductor Technology and Science