Publications | Prof. Eunji Kwon's Lab | School of Artificial Intelligence, Kookmin University

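Publications

Research Output Trends

The figures shown are computed from the collected data and may differ slightly from actual records.

Papers published per year (last 5 years)

Total: 12

Citations per year (last 5 years)

Total: 54

Featured Publications

*As of 2026, the Impact Factor is shown only for papers published within the last six years.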

article

Citations: 0

2025

Autonomous Model Quantization Framework for Hybrid Vision Transformers based on Reinforcement Learning

Eunji Kwon, Tajana Rosing

IF 2.9 (2025)

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Existing quantization approaches often suffer significant accuracy degradation when compressing hybrid convolution and transformer models to low bit-widths. This paper presents RL-PTQv2, an extension of the earlier RL-PTQ framework [1] that introduces a new reinforcement learning (RL)-based post-training quantization (PTQ) method. RL-PTQv2 includes two key advances: (i) optional hardware (HW)-aware PTQ, in which the RL agent is guided by real latency and energy feedback from an in-loop PIM simulator, enabling deployable designs that jointly optimize accuracy, latency, and energy; and (ii) an improved quantization scheme that supports symmetric/asymmetric quantization and mixed adaptive rounding to better balance precision and efficiency. Across a range of hybrid vision transformer families, including MobileViTv1 and v2 [2], [3], EfficientFormerv1 and v2 [4], [5], and MobileFormer [6], RL-PTQv2 achieves state-of-the-art quantized accuracy compared with previous PTQ methods [7], [8], [9], [10]. In addition, compared with the baseline model, the quantized models showed energy-efficiency improvements of 10.1× on TransPIM [11] and 22.6× on a Titan RTX GPU, a gain that was most pronounced when they were deployed on HViT-PIM, a dedicated processing framework for running MobileViT models efficiently. HViT-PIM was developed primarily to explore the potential of HW-aware PTQ. RL-PTQv2 is not limited to processing-in-memory (PIM), however: it can be integrated seamlessly with a variety of bit-serial accelerators, enabling automated quantization tailored to the underlying HW.
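To make the symmetric/asymmetric distinction in item (ii) concrete, here is a minimal Python sketch of plain uniform quantization with round-to-nearest. It is not the RL-PTQv2 implementation: the function names, the 4-bit setting, and the sample values are illustrative assumptions, and the paper's mixed adaptive rounding and RL-driven configuration search are not modeled.

```python
def quantize_symmetric(values, bits):
    """Uniform symmetric quantization: signed codes, zero-point fixed at 0."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def quantize_asymmetric(values, bits):
    """Uniform asymmetric (affine) quantization: [min, max] maps to [0, 2^bits - 1]."""
    qmax = 2 ** bits - 1                            # e.g. 15 for 4 bits
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    codes = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point=0):
    """Map integer codes back to real values."""
    return [(c - zero_point) * scale for c in codes]


def max_abs_error(reference, approx):
    """Largest element-wise reconstruction error."""
    return max(abs(r - a) for r, a in zip(reference, approx))


# Hypothetical one-sided (non-negative) activations, quantized to 4 bits.
values = [0.0, 0.1, 0.5, 1.0, 2.0]
sym_codes, sym_scale = quantize_symmetric(values, bits=4)
asym_codes, asym_scale, asym_zp = quantize_asymmetric(values, bits=4)
sym_error = max_abs_error(values, dequantize(sym_codes, sym_scale))
asym_error = max_abs_error(values, dequantize(asym_codes, asym_scale, asym_zp))
print(f"symmetric  step={sym_scale:.4f}  max error={sym_error:.4f}")
print(f"asymmetric step={asym_scale:.4f}  max error={asym_error:.4f}")
```

For this one-sided input the asymmetric grid spends all 16 levels on [0, 2], so its step size is roughly half that of the symmetric grid, whose negative codes go unused. Which scheme is preferable for a given layer depends on its value distribution and on the hardware cost of handling a zero-point.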

https://doi.org/10.1109/tcad.2025.3641538

Quantization (signal processing)

Reinforcement learning

Transformer

Rounding

Adder

Efficient energy use

article

Citations: 0

2025

DeltaTrack: Flow-Driven Multiple Object Tracking Accelerator With Variable LSB Approximation for Real-Time and Energy-Efficient Video Analytics

Seunghyun Moon, Eunji Kwon

IF 4.9 (2025)

IEEE Transactions on Circuits & Systems II Express Briefs

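Multiple object tracking (MOT) is a key task in real-time video analytics, but conventional detector–tracker pipelines must run heavy detection and tracking models on every frame, incurring high computational cost and power consumption. This work proposes DeltaTrack, a hardware-efficient MOT accelerator that greatly reduces redundant computation by dynamically skipping detection on non-key frames. The method uses a lightweight optical flow estimation module to predict object trajectories on non-key frames and invokes the full detector–tracker pipeline only when a new object is detected. For hardware efficiency, it also introduces a variable LSB approximation scheme that performs multiplication at reduced bit-width in selected layers. Specifically, accuracy-tolerant layers use 4-bit weights × 8-bit activations to improve performance, while accuracy-sensitive layers use 6-bit weights × 8-bit activations, which reduces dynamic power rather than improving latency. Based on post-layout estimates in 28-nm CMOS, DeltaTrack sustains 18.5 frame/s at 640×640 (7.59 Mpixel/s) while consuming 5.29 mJ of energy per frame (12.9 nJ per pixel). On a normalized basis, this corresponds to 2.26–4.66× higher throughput and 1.29–8.7× lower energy than prior accelerators.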
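The per-pixel figures quoted in the abstract follow from the frame-level numbers. The short Python check below reproduces that arithmetic; the variable names are illustrative, and the only inputs are the resolution, frame rate, and per-frame energy stated above.

```python
# Figures quoted in the abstract.
width, height = 640, 640
frames_per_second = 18.5
energy_per_frame_mj = 5.29

pixels_per_frame = width * height                                   # 409,600 pixels
mpixels_per_second = pixels_per_frame * frames_per_second / 1e6     # pixel throughput
energy_per_pixel_nj = energy_per_frame_mj * 1e-3 / pixels_per_frame * 1e9

print(f"throughput: {mpixels_per_second:.2f} Mpixel/s")
print(f"energy per pixel: {energy_per_pixel_nj:.1f} nJ")
```

The throughput comes out at about 7.58 Mpixel/s, which agrees with the quoted 7.59 Mpixel/s to within the rounding of the frame rate, and the energy per pixel comes out at about 12.9 nJ.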

https://doi.org/10.1109/tcsii.2025.3624840

Pipeline (software)

Object detection

Video tracking

Computation

Throughput

Variable (mathematics)

Tracking (education)

Object (grammar)

Pipeline transport

article

Citations: 0

2025

QSLR: Post-Training Compression via Quantized Sparse and Low-Rank Factorization

Eunji Kwon

IF 3.6 (2025)

IEEE Access

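As transformer-based foundation models grow in scale and complexity, deploying them efficiently becomes considerably harder, especially in resource-constrained environments. Recent post-training methods such as pruning or sparse-plus-low-rank decomposition reduce model size without retraining, but they still rely on 32-bit full-precision weights, which limits the gains in memory bandwidth and latency. This paper proposes QSLR, a unified post-training quantization (PTQ) framework that combines outlier-aware pruning via sparse and low-rank decomposition with component-wise Hessian-aware quantization. Each decomposed component (the sparse matrix, the left low-rank factor, and the right low-rank factor) is quantized independently using a projected Hessian, with an efficient approximation that eliminates redundant Hessian computations. The quantization parameters are further optimized by a Hessian-weighted grid search to minimize the second-order quantization loss. Experiments on LLaMA2-7B and ViT-Base show that QSLR achieves up to 5× model compression with minimal accuracy degradation, consistently outperforming existing state-of-the-art pruning or quantization methods.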

https://doi.org/10.1109/access.2025.3615473

Quantization (signal processing)

Pruning

Factorization

Vector quantization

Limiting

Data compression

Sparse matrix

Compression ratio

Grid

article

Citations: 3

2024

RL-PTQ: RL-based Mixed Precision Quantization for Hybrid Vision Transformers

Eunji Kwon, Minxuan Zhou, Weihong Xu, Tajana Rosing, Seokhyeong Kang

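Existing quantization approaches incur significant accuracy loss when compressing hybrid convolution and transformer models to low bit-widths. This paper proposes RL-PTQ, a novel post-training quantization (PTQ) framework that uses reinforcement learning (RL). We focus on determining the optimal bit-width and observer for mixed-precision quantization configurations by grouping layers and addressing the challenges that arise when quantizing hybrid transformers. RL-PTQ achieves the highest quantized accuracy on MobileViTs compared with previous PTQ methods [5–7]. In addition, our quantized model on a processing-in-memory (PIM) architecture improves energy efficiency by 10.1× and 22.6× over the baseline model on a state-of-the-art PIM accelerator [15] and a GPU, respectively.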

https://doi.org/10.1145/3649329.3656231

Computer science

Quantization (signal processing)

Transformer

Computer vision

Artificial intelligence

Engineering

Electrical engineering

article

Citations: 10

2023

Mobile Transformer Accelerator Exploiting Various Line Sparsity and Tile-Based Dynamic Quantization

Eunji Kwon, Jongho Yoon, Seokhyeong Kang

IF 2.7 (2023)

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Transformer models are difficult to use on mobile devices because they are memory- and compute-intensive. Accordingly, transformer compression methods such as pruning and quantization continue to be studied. However, general-purpose computing platforms such as central processing units (CPUs) and graphics processing units (GPUs) are not energy-efficient at accelerating pruned models, because the unstructured sparsity of those models degrades parallelism. This paper proposes a low-power transformer accelerator that can handle the varying levels of structured sparsity induced by line pruning at different granularities. The proposed design accelerates transformers pruned in a head-wise and line-wise manner. It presents a head reorganization and shuffling method that supports head-wise skip operations while resolving the load imbalance among processing engines (PEs) caused by the differing number of operations per head. It further implements a sparse quantized general matrix-to-matrix multiplication (SQ-GEMM) module that supports line-wise skipping and performs on-the-fly tile-based dynamic quantization of activations. As a result, the proposed accelerator improves energy efficiency over a mobile GPU and CPU by 2.9× and 12.3×, respectively, on the detection transformer (DETR), and by 3.0× and 12.4×, respectively, on vision transformer (ViT) models. The proposed mobile accelerator also achieves the highest energy efficiency among current state-of-the-art FPGA-based transformer accelerators.
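The idea behind tile-based dynamic quantization of activations can be sketched in a few lines of Python: each tile gets its own scale, computed at run time from the values it actually contains. This is a generic illustration under stated assumptions, not the paper's SQ-GEMM hardware; the function names, the 1-D tile layout, the 8-bit symmetric format, and the sample activations are all hypothetical.

```python
def quantize_tiles(activations, tile_size, bits=8):
    """Quantize a flat activation list tile by tile, one run-time scale per tile."""
    qmax = 2 ** (bits - 1) - 1
    tiles = []
    for start in range(0, len(activations), tile_size):
        tile = activations[start:start + tile_size]
        max_abs = max(abs(v) for v in tile)
        scale = max_abs / qmax if max_abs > 0 else 1.0   # derived from this tile only
        codes = [round(v / scale) for v in tile]
        tiles.append((codes, scale))
    return tiles


def dequantize_tiles(tiles):
    """Reconstruct real values from (codes, scale) pairs."""
    return [code * scale for codes, scale in tiles for code in codes]


# Hypothetical activations: a small-magnitude tile followed by a large-magnitude tile.
activations = [0.01, -0.02, 0.03, 0.015, 5.0, -4.0, 3.0, 2.5]

per_tile = dequantize_tiles(quantize_tiles(activations, tile_size=4))
per_tensor = dequantize_tiles(quantize_tiles(activations, tile_size=len(activations)))

# Compare reconstruction error on the small-magnitude half only.
small = activations[:4]
tile_error = max(abs(a - b) for a, b in zip(small, per_tile[:4]))
tensor_error = max(abs(a - b) for a, b in zip(small, per_tensor[:4]))
print(f"per-tile error={tile_error:.6f}  per-tensor error={tensor_error:.6f}")
```

With a single scale for the whole tensor, the large values set the step size and the small values collapse to one or two codes. Per-tile scales avoid this at the cost of storing and applying one scale per tile.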

https://doi.org/10.1109/tcad.2023.3347291

Computer science

Transformer

Efficient energy use

Computation

Parallel computing

Quantization (signal processing)

Matrix multiplication

Mobile device

Granularity

Computer hardware

All Publications

article

Citations: 0

2025

Autonomous Model Quantization Framework for Hybrid Vision Transformers based on Reinforcement Learning

Eunji Kwon, Tajana Rosing

IF 2.9 (2025)

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

https://doi.org/10.1109/tcad.2025.3641538

Quantization (signal processing)

Reinforcement learning

Transformer

Rounding

Adder

Efficient energy use

article

Citations: 0

2025

DeltaTrack: Flow-Driven Multiple Object Tracking Accelerator With Variable LSB Approximation for Real-Time and Energy-Efficient Video Analytics

Seunghyun Moon, Eunji Kwon

IF 4.9 (2025)

IEEE Transactions on Circuits & Systems II Express Briefs

https://doi.org/10.1109/tcsii.2025.3624840

Pipeline (software)

Object detection

Video tracking

Computation

Throughput

Variable (mathematics)

Tracking (education)

Object (grammar)

Pipeline transport

article

Citations: 0

2025

QSLR: Post-Training Compression via Quantized Sparse and Low-Rank Factorization

Eunji Kwon

IF 3.6 (2025)

IEEE Access

https://doi.org/10.1109/access.2025.3615473

Quantization (signal processing)

Pruning

Factorization

Vector quantization

Limiting

Data compression

Sparse matrix

Compression ratio

Grid

article

Citations: 3

2024

RL-PTQ: RL-based Mixed Precision Quantization for Hybrid Vision Transformers

Eunji Kwon, Minxuan Zhou, Weihong Xu, Tajana Rosing, Seokhyeong Kang

https://doi.org/10.1145/3649329.3656231

Computer science

Quantization (signal processing)

Transformer

Computer vision

Artificial intelligence

Engineering

Electrical engineering

article

Citations: 10

2023

Mobile Transformer Accelerator Exploiting Various Line Sparsity and Tile-Based Dynamic Quantization

Eunji Kwon, Jongho Yoon, Seokhyeong Kang

IF 2.7 (2023)

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

https://doi.org/10.1109/tcad.2023.3347291

Computer science

Transformer

Efficient energy use

Computation

Parallel computing

Quantization (signal processing)

Matrix multiplication

Mobile device

Granularity

Computer hardware

article

Citations: 4

2024

ViT-ToGo: Vision Transformer Accelerator with Grouped Token Pruning

Seungju Lee, Kyu-Min Cho, Eunji Kwon, Sejin Park, Seojeong Kim, Seokhyeong Kang

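Vision transformers (ViTs) have attracted attention for their performance across vision tasks, but their substantial compute and memory demands make them hard to deploy on resource-constrained edge devices. To address this limitation, various token pruning methods have been proposed to reduce computation. Most token pruning techniques, however, do not consider use on real embedded devices, which require a significant reduction in computational load. This paper proposes ViT-ToGo, a ViT accelerator with grouped token pruning, which allows the ViT model and the token pruning process to run in parallel. Grouped token pruning is implemented with a head-wise importance estimator that simplifies the sorting and reordering required by the token pruning process. The proposed method reduces the number of tokens by up to 66%, which lowers GFLOPs by up to 36% with only a minimal accuracy drop of about 1%. The hardware implementation incurs a negligible resource overhead of 1.13% on average.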

https://doi.org/10.23919/date58400.2024.10546804

Security token

Computer science

Transformer

Artificial intelligence

Electrical engineering

Engineering

Operating system

Voltage

book-chapter

Citations: 0

2023

Adaptive FSP: Adaptive Architecture Search with Filter Shape Pruning

Aeri Kim, Seungju Lee, Eunji Kwon, Seokhyeong Kang

Lecture Notes in Computer Science

https://doi.org/10.1007/978-3-031-26319-4_32

Computer science

Pruning

Architecture

Filter (signal processing)

Adaptive filter

Artificial intelligence

Computer architecture

Algorithm

Computer vision

article

Citations: 8

2023

Mobile Accelerator Exploiting Sparsity of Multi-Heads, Lines, and Blocks in Transformers in Computer Vision

Eunji Kwon, Haena Song, Jihye Park, Seokhyeong Kang

Transformer models are difficult to use for computer vision on mobile devices because they are memory- and compute-intensive. Accordingly, methods for compressing transformer models, such as pruning, continue to be studied. However, general computing platforms such as central processing units (CPUs) and graphics processing units (GPUs) are not energy-efficient at accelerating pruned models because of their structured sparsity. This paper proposes a low-power accelerator for transformers with structured sparsity of various sizes, induced by pruning at different granularities. The design can accelerate transformers pruned in a head-wise, line-wise, or block-wise manner. To this end, we developed a head scheduling algorithm that supports head-wise skip operations and resolves the processing engine (PE) load imbalance caused by the differing number of operations per head. We also implemented a sparse general matrix-to-matrix multiplication (sparse GEMM) module that supports line-wise and block-wise skipping. As a result, the proposed accelerator improves energy efficiency over a mobile GPU and a mobile CPU by 6.1× and 13.6×, respectively, on the detection transformer (DETR) model, and by about 2.6× and 7.9×, respectively, on average on vision transformer (ViT) models.

https://doi.org/10.23919/date56975.2023.10137099

Computer science

Transformer

Computation

Mobile device

Modulo

Parallel computing

Computer hardware

Computational science

Algorithm

Operating system

article

Citations: 1

2023

FPGA-Based Accelerator for Rank-Enhanced and Highly-Pruned Block-Circulant Neural Networks

Haena Song, Jongho Yoon, Dohun Kim, Eunji Kwon, Tae-Hyun Oh, Seokhyeong Kang

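Numerous network compression methods have been proposed for deploying deep neural networks on resource-constrained embedded systems. Among them, block-circulant matrix (BCM) compression is a promising hardware-friendly method for both acceleration and compression. BCM compression nevertheless has several limitations: (i) limited representational power due to the structural properties of circulant matrices, (ii) restrictions on the compression parameters, and (iii) the need for a specialized dataflow in accelerators for BCM-compressed networks. To overcome these limitations, this paper proposes a rank-enhanced and highly-pruned block-circulant matrices compression (RP-BCM) framework. RP-BCM consists of two stages: Hadamard-BCM and BCM-wise pruning. It also introduces a dedicated skip scheme for the processing element design that exploits BCM-wise sparsity to achieve high parallelism. Furthermore, it proposes a specialized dataflow for BCM-compressed networks on resource-constrained FPGAs. As a result, the proposed method achieves a 92.4% reduction in parameters and a 77.3% reduction in FLOPs for ResNet-50 on ImageNet. The proposed hardware design also improves energy efficiency by 3.1× over a GPU for ResNet-18 on ImageNet, running on a Xilinx PYNQ-Z2 FPGA board.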
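For readers unfamiliar with block-circulant compression, the Python sketch below shows the basic structure: each k×k block is defined by a single length-k vector, and a pruned block can be skipped outright. It is a plain reference computation under stated assumptions, not the RP-BCM framework. The function names, the use of `None` to mark a pruned block, and the sample values are hypothetical, and the Hadamard-BCM stage and the FPGA dataflow are not modeled.

```python
def circulant_matvec(first_column, x):
    """Compute y = C x for the circulant matrix with C[i][j] = first_column[(i - j) % k]."""
    k = len(first_column)
    return [sum(first_column[(i - j) % k] * x[j] for j in range(k)) for i in range(k)]


def block_circulant_matvec(blocks, x, k):
    """Multiply a block-circulant matrix by x.

    blocks[r][c] is the defining vector of block (r, c), or None if that
    block has been pruned (an assumption used here to illustrate skipping).
    """
    out = []
    for block_row in blocks:
        acc = [0.0] * k
        for c, vec in enumerate(block_row):
            if vec is None:          # pruned block: no work at all
                continue
            part = circulant_matvec(vec, x[c * k:(c + 1) * k])
            acc = [a + p for a, p in zip(acc, part)]
        out.extend(acc)
    return out


# A 3x6 weight matrix stored as two 3x3 circulant blocks, the second one pruned.
k = 3
blocks = [[[1, 2, 3], None]]
x = [1, 0, 0, 5, 5, 5]
y = block_circulant_matvec(blocks, x, k)

dense_params = k * k * 2     # 18 values for the dense 3x6 matrix
stored_params = k * 1        # 3 values: one defining vector, one block pruned
print(y, dense_params, stored_params)
```

Each surviving block stores k values instead of k², which is where the compression comes from. Practical implementations typically evaluate the circulant product with an FFT rather than the direct sum used here.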

https://doi.org/10.23919/date56975.2023.10137111

Computer science

Field-programmable gate array

Circulant matrix

Dataflow

Pruning

Block (permutation group theory)

Kernel (algebra)

Parallel computing

Hardware acceleration

Artificial neural network

article

Citations: 4

2021

MDARTS: Multi-objective Differentiable Neural Architecture Search

Sung‐Hoon Kim, Hyunjeong Kwon, Eunji Kwon, Youngchang Choi, Tae-Hyun Oh, Seokhyeong Kang

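This work presents a differentiable neural architecture search (NAS) method that jointly accounts for two competing objectives under hardware design constraints: quality of result (QoR) and quality of service (QoS). NAS research has recently attracted considerable attention because it can automatically find architecture candidates that outperform hand-designed models. NAS approaches that satisfy real HW design constraints, however, remain relatively unexplored. A naive NAS scheme for this purpose would optimize a combination of the two criteria, QoR and QoS, but directly extending prior work in this way often produces degenerate architectures and requires sensitive hyperparameter tuning. This work proposes MDARTS, a multi-objective differentiable neural architecture search. MDARTS has an affordable search time and can find the Pareto frontier of QoR versus QoS. We also identify a problematic gap, present in all existing differentiable NAS results, between the searched architecture with soft connections and the final post-processed architecture in which those connections are binarized. This gap causes performance degradation when the model is deployed. To mitigate it, we propose a separation loss that discourages indefinite connections between components by implicitly minimizing entropy.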
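As a reminder of what a Pareto frontier over two competing objectives means, the Python sketch below filters a set of candidates down to the non-dominated ones. It is a post-hoc filter for illustration only and does not reproduce the MDARTS differentiable search; the candidate names and the (error, latency) values are made up, with error standing in for QoR and latency for QoS.

```python
def pareto_front(candidates):
    """Return names of candidates not dominated by any other.

    Each candidate is (name, error, latency); lower is better on both axes.
    A candidate is dominated if another one is no worse on both axes and
    strictly better on at least one.
    """
    front = []
    for name, err, lat in candidates:
        dominated = any(
            (e2 <= err and l2 <= lat) and (e2 < err or l2 < lat)
            for _, e2, l2 in candidates
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical architectures: (name, error rate, latency in ms).
candidates = [
    ("A", 0.10, 5.0),
    ("B", 0.08, 9.0),
    ("C", 0.12, 6.0),   # worse than A on both axes
    ("D", 0.08, 12.0),  # same error as B but slower
]
front = pareto_front(candidates)
print(front)
```

Here A and B form the frontier: neither can be improved on one objective without giving ground on the other, whereas C and D are each beaten outright by another candidate.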

https://doi.org/10.23919/date51398.2021.9474068

Computer science

Differentiable function

Architecture

Artificial neural network

Artificial intelligence

Mathematics

Pure mathematics
