Recent advances in diffusion models (DMs)—such as few-step denoising and multi-modal conditioning—have significantly improved computational efficiency and functional flexibility, but they also introduce new hardware challenges. In particular, the elimination of inter-timestep redundancy, increased encoder/decoder workload, and heightened sensitivity to quantization demand a new class of accelerator. We present EdgeDiff, the first processor to support end-to-end, few-step, and multi-modal DM inference. EdgeDiff introduces a unified solution named condition-aware reordered group mixed precision (CRMP) with several novel microarchitectures: compress-and-add (CAA) processing elements (PEs) with bit-shuffle trees (BSTs) for efficient low-bit multiply-accumulate (MAC), a tiered accumulation unit (TAU) to reduce floating-point (FP) accumulation energy, and a grid-based quantization unit (GQU) to eliminate expensive FP division. Fabricated in 28-nm CMOS, EdgeDiff achieves up to 34.4-TOPS/W energy efficiency and reduces generation energy to 418.4 mJ/image for one-step text-to-image (T2I) generation—lower than prior state of the art. Despite aggressive quantization, EdgeDiff maintains output quality comparable to FP inference across Fréchet Inception Distance (FID), contrastive language–image pretraining (CLIP), and peak signal-to-noise ratio (PSNR) metrics, establishing it as a compelling solution for energy-efficient, real-time generative artificial intelligence (AI) on edge platforms.