Prof. Jaehyuk Huh's Lab | School of Computing, KAIST

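Jaehyuk Huh Lab

Prof. Jaehyuk Huh, School of Computing, KAIST

Building on computer architecture and parallel and distributed systems, the Jaehyuk Huh Lab conducts research spanning multicore processors, memory and cache systems, virtualization and cloud software, GPU and NPU security, confidential computing, and PIM-based AI semiconductors and LLM inference acceleration. The lab focuses on designing next-generation computing systems and data center architectures that deliver high performance, high efficiency, and high reliability at the same time.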

Key Research Areas

Computer architecture and parallel and distributed systems

Virtualization and cloud system software

GPU and accelerator security and confidential computing

Selected Publications

article

Citations: 5

2020

Reconciling Time Slice Conflicts of Virtual Machines With Dual Time Slice for Clouds

Taeklim Kim, Chang Hyun Park, Jaehyuk Huh, Jeongseob Ahn

IF 6

IEEE Transactions on Parallel and Distributed Systems

The proliferation of system virtualization poses a new challenge for the coarse-grained time sharing techniques used for consolidation, since operating systems are running on virtual CPUs. The current system stack was designed under the assumption that operating systems can seize CPU resources at any moment. However, for the guest operating system on a virtual machine (VM), such an assumption cannot be guaranteed, since the virtual CPUs of VMs share a limited number of physical cores. Due to the time-sharing of physical cores, the execution of a virtual CPU is not contiguous, with a gap between the virtual and real time spaces. Such a virtual time discontinuity problem leads to significant inefficiency in lock and interrupt handling, which rely on the immediate availability of CPUs whenever the operating system requires computation. To reduce the scheduling latencies of virtual CPUs, shortening time slices can be a straightforward strategy, but for some workloads it may increase the context switching overhead across virtual machines. It is challenging to determine a single time slice that satisfies all the VMs. In this article, we propose a dual time slice to resolve the time slice conflicts that arise among different types of virtual machines.

https://doi.org/10.1109/tpds.2020.2993252

Computer science

Virtual machine

Context switch

Interrupt

Operating system

Virtualization

Distributed computing

Temporal isolation among virtual machines

Hypervisor

Computer multitasking
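
The trade-off described in the abstract above (shorter slices cut vCPU scheduling latency but raise context-switch overhead) can be illustrated with a small back-of-the-envelope model. This is a minimal sketch under stated assumptions, not the paper's mechanism: the `Pool` class, the round-robin wait bound, the one-switch-per-slice overhead formula, the `pick_pool` policy, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Cores sharing one time slice (hypothetical model, not the paper's design)."""
    name: str
    time_slice_ms: float   # how long a vCPU runs before preemption
    vcpus_per_core: int    # consolidation ratio on each core


def worst_case_wait_ms(pool: Pool) -> float:
    """Round-robin bound: every other vCPU on the core runs one full slice first."""
    return (pool.vcpus_per_core - 1) * pool.time_slice_ms


def switch_overhead(pool: Pool, switch_cost_ms: float) -> float:
    """Fraction of core time spent switching, assuming one switch per slice."""
    return switch_cost_ms / (pool.time_slice_ms + switch_cost_ms)


def pick_pool(latency_sensitive: bool, short: Pool, long: Pool) -> Pool:
    """Assumed policy: latency-sensitive VMs go to the short-slice pool."""
    return short if latency_sensitive else long


# Illustrative values only; real slice lengths and switch costs vary by system.
SHORT = Pool("short", time_slice_ms=1.0, vcpus_per_core=4)
LONG = Pool("long", time_slice_ms=30.0, vcpus_per_core=4)

if __name__ == "__main__":
    for pool in (SHORT, LONG):
        wait = worst_case_wait_ms(pool)
        overhead = switch_overhead(pool, switch_cost_ms=0.1)
        print(f"{pool.name}: wait <= {wait} ms, switch overhead {overhead:.2%}")
```

In this toy model neither slice length wins on both metrics, which is the conflict a single global time slice cannot resolve and the motivation for keeping two.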

article

Citations: 7

2019

ZeroKernel: Secure Context-isolated Execution on Commodity GPUs

Ohmin Kwon, Yonggon Kim, Jaehyuk Huh, Hyunsoo Yoon

IF 7.5

IEEE Transactions on Dependable and Secure Computing

In the last decade, the dedicated graphics processing unit (GPU) has emerged as an architecture for high-performance computing workloads. Recently, researchers have also focused on the isolation property of a dedicated GPU and suggested GPU-based secure computing environments with several promising applications. However, despite the security analysis conducted by the prior studies, it has been unclear whether a dedicated GPU can be leveraged as a secure processor in the presence of a kernel-privileged attacker. In this paper, we first demonstrate the security of dedicated GPUs through comprehensive studies on context information for GPU execution. The paper shows that a kernel-privileged attacker can manipulate the GPU contexts to redirect memory accesses or execute arbitrary GPU codes on the running GPU kernel. Based on the security analysis, this paper proposes a new on-chip execution model for the dedicated GPU and a novel defense mechanism supporting the security of the on-chip execution. With comprehensive evaluation, the paper assures that the proposed solutions effectively isolate sensitive data in on-chip storages and defend against known attack vectors from a privileged attacker, supporting that the commodity GPUs can be leveraged as a secure processor.

https://doi.org/10.1109/tdsc.2019.2946250

Computer science

Kernel (algebra)

Context (archaeology)

CUDA

Graphics processing unit

General-purpose computing on graphics processing units

Coprocessor

Graphics

Parallel computing

Operating system

article

bronze

Citations: 3

2018

GVTS: Global Virtual Time Fair Scheduling to Support Strict Fairness on Many Cores

Changdae Kim, Seungbeom Choi, Jaehyuk Huh

IF 6

IEEE Transactions on Parallel and Distributed Systems

Proportional fairness in CPU scheduling has been widely adopted to fairly distribute CPU shares corresponding to their weights. With the emergence of cloud environments, the proportionally fair scheduling has been extended to groups of threads or nested groups to support virtual machines or containers. Such proportional fairness has been supported by popular schedulers, such as Linux Completely Fair Scheduler (CFS) through virtual time scheduling. However, CFS, with a distributed runqueue per CPU, implements the virtual time scheduling locally. Across different queues, the virtual times of threads are not strictly maintained to avoid potential scalability bottlenecks. The uneven fluctuation of CPU shares caused by the limitations of CFS not only violates the fairness support for CPU assignments, but also significantly increases the tail latencies of latency-sensitive applications. To mitigate the limitations of CFS, this paper proposes a global virtual-time fair scheduler (GVTS), which enforces global virtual time fairness for threads and thread groups, even if they run across many physical cores. The new scheduler employs the hierarchical enforcement of target virtual time to enhance the scalability of schedulers, which is aware of the topology of CPU organization. We implemented GVTS in Linux kernel 4.6.4 with several optimizations to provide global virtual time efficiently. Our experimental results show that GVTS can almost eliminate the fairness violation of CFS for both non-grouped and grouped executions. Furthermore, GVTS can curtail the tail latency when latency-sensitive applications are co-running with batch tasks.

https://doi.org/10.1109/tpds.2018.2851515

Computer science

Scalability

Linux kernel

Virtual machine

Scheduling (production processes)

Cloud computing

Distributed computing

Latency (audio)

Thread (computing)

Parallel computing
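
The difference between per-queue and global virtual time described in the GVTS abstract above can be seen in a toy simulation. This is a minimal sketch under stated assumptions, not the GVTS implementation: the `Thread` class and both scheduler functions are hypothetical, the local variant omits the load balancing that a real scheduler performs, and the global variant uses a single centralized sort instead of the hierarchical, topology-aware enforcement the paper describes.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    weight: float = 1.0
    vruntime: float = 0.0   # weighted virtual time
    cpu_time: float = 0.0   # real CPU time received

    def run(self, dt: float) -> None:
        """Charge dt of real time; virtual time advances inversely to weight."""
        self.cpu_time += dt
        self.vruntime += dt / self.weight


def run_local(queues, ticks):
    """Each CPU runs its own lowest-vruntime thread; queues are never compared."""
    for _ in range(ticks):
        for queue in queues:
            if queue:
                min(queue, key=lambda t: t.vruntime).run(1.0)


def run_global(threads, n_cpus, ticks):
    """Each tick, the n_cpus threads with the lowest vruntime overall run."""
    for _ in range(ticks):
        for t in sorted(threads, key=lambda t: t.vruntime)[:n_cpus]:
            t.run(1.0)


def demo(ticks=300):
    """Three equal-weight threads on two CPUs, placed unevenly as [A] and [B, C]."""
    a, b, c = Thread("A"), Thread("B"), Thread("C")
    run_local([[a], [b, c]], ticks)
    local = [t.cpu_time for t in (a, b, c)]

    x, y, z = Thread("A"), Thread("B"), Thread("C")
    run_global([x, y, z], 2, ticks)
    global_ = [t.cpu_time for t in (x, y, z)]
    return local, global_


if __name__ == "__main__":
    local, global_ = demo()
    print("local :", local)
    print("global:", global_)
```

With purely local virtual time, thread A receives twice the CPU time of B and C despite equal weights; comparing virtual times across all queues gives each thread the same share. The centralized sort used here would not scale to many cores, which is the bottleneck the abstract says GVTS avoids through hierarchical enforcement.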

Government-Funded Projects

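April 2024 - April 2029

287,149,000 KRW

Research on a Memory-Centric Heterogeneous HW-SW System for Privacy Preservation in Generative AI

This project aims to secure HW-SW system technology based on memory-centric heterogeneous PIM accelerators, to present PIM/PNM confidential computing technology for secure generative AI that provides user privacy, model protection, and result integrity, and to minimize the resulting performance degradation. It has the following three sub-goals. (1) For heterogeneous PIM-accelerator systems, confidential ...

Generative AI

Privacy preservation

Memory architecture

Accelerator

Confidential computing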

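March 2024 - December 2027

2,083,334,000 KRW

Development of a Simulation-Based High-Speed, High-Accuracy Data Center Workload/System Analysis Platform

● Development of a high-speed, high-precision simulation/profiling platform for deriving single-server hardware configurations equipped with AI semiconductors such as NPUs and PIM, so that large-scale AI workloads run optimally in data centers
● Through the high-speed, high-precision simulation/profiling platform, minimizing idle hardware resources with the optimal hardware configuration for each large-scale AI workload, so that build cost and ...

Large-scale AI workload analysis

Server system modeling

Server system profiling

AI semiconductor data center

High-speed, high-precision simulator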

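March 2024 - December 2028

898,000,000 KRW

Development of a DRAM PIM Hardware Architecture for Inference with Efficient Memory Management and Parallelization Techniques for LLM Implementation

This project develops PIM hardware, a compiler, and system software that reduce memory usage, improve power efficiency, and raise throughput in large language model (LLM) inference systems, with the following deliverables.
● HW architecture design: a PIM architecture that supports acceleration of sparsified and quantized LLMs, and HW-SW techniques for improving power efficiency
● Compiler: PIM multi-heterogeneous ...

Processing-in-memory

Large language model

Near-memory processing

AI accelerator compiler

AI inference system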

Recent Patents

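Status	Filing year	Title	Application number
Published	2023	Hardware-based security and acceleration method and system for trustworthy serverless computing	1020230077529
Published	2023	Method and system for improving NPU on-chip memory utilization through reordering of deep neural network training operations	1020230055347
Published	2023	Dynamic encryption metadata management method and system for secure communication in multi-GPU systems	1020230055346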
