Compressed textures are indispensable in most 3D graphics applications because they reduce memory traffic and increase performance. For higher-quality graphics, the number and size of textures in an application have continuously increased. Additionally, the ETC2 texture format, which is mandatory in OpenGL ES 3.0, OpenGL 4.3, and Android 4.3 (and later versions), requires more complex texture compression than the traditional ETC1 format. As a result, texture compression has become increasingly time-consuming. To accelerate ETC2 compression, we introduce two new compression techniques, named QuickETC2. The first technique is an early compression-mode decision scheme. Instead of testing all ETC1/2 modes to compress a texel block, we select appropriate modes for each block by exploiting the block's luma difference, reducing unnecessary compression overhead. The second technique is a fast luma-based T- and H-mode compression method. When clustering the texels of a block into two groups, we replace the 3D RGB space with the 1D luma space and quickly find the two groups that have the minimum luma differences. We also selectively perform the T- or H-mode and reduce its distance candidates according to the luma differences of each group. We have implemented both techniques with AVX2 intrinsics to exploit SIMD parallelism. According to our experiments, QuickETC2 can compress more than 2000 1K×1K-sized images per second on an octa-core CPU.
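As a hedged illustration of the early mode-decision idea, the sketch below scores a 4×4 block by its luma range and prunes the candidate ETC1/2 modes accordingly. The BT.601 luma weights and both thresholds are illustrative assumptions, not the constants used in QuickETC2 itself:

```python
# Sketch of a luma-based early compression-mode decision for a 4x4 texel block.
# The BT.601 luma weights and the two thresholds are illustrative assumptions,
# not the constants used in QuickETC2 itself.

def luma(r, g, b):
    """Approximate perceptual luma of one texel (BT.601 weights)."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def select_modes(block):
    """block: list of 16 (r, g, b) tuples for a 4x4 texel block.
    Returns the subset of ETC1/2 modes worth testing for this block."""
    lumas = [luma(*texel) for texel in block]
    luma_range = max(lumas) - min(lumas)
    if luma_range < 8:        # nearly flat block: ETC1-style modes suffice
        return ["individual", "differential"]
    elif luma_range < 64:     # moderate contrast: add the planar mode
        return ["individual", "differential", "planar"]
    else:                     # high contrast: T/H modes become worthwhile
        return ["individual", "differential", "T", "H"]

flat_block = [(100, 100, 100)] * 16
sharp_block = [(0, 0, 0)] * 8 + [(255, 255, 255)] * 8
print(select_modes(flat_block))   # few candidate modes for a flat block
print(select_modes(sharp_block))  # high-contrast block also tries T and H
```

A flat block skips the expensive T/H search entirely, which is where most of the speedup of an early mode decision comes from.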
Z<sup>2</sup> traversal order: An interleaving approach for VR stereo rendering on tile-based GPUs
Jae‐Ho Nah, Yeongkyu Lim, Sunho Ki, Chulho Shin
IF 18.3
Computational Visual Media
With the increasing demands of virtual reality (VR) applications, efficient VR rendering techniques are becoming essential. VR stereo rendering incurs extra computational cost because the views for the left and right eyes must be rendered separately. To reduce this cost, we present a novel traversal order for tile-based mobile GPU architectures: Z<sup>2</sup> traversal order. In tile-based mobile GPU architectures, a tile traversal order that maximizes spatial locality can increase GPU cache efficiency. For VR applications, our approach improves upon the traditional Z-order curve: we render corresponding screen tiles in the left and right views in turn, or simultaneously, and can thereby exploit the spatial adjacency of the two tiles. To evaluate our approach, we conducted a trace-driven hardware simulation using Mesa and a hardware simulator. Our experimental results show that Z<sup>2</sup> traversal order can reduce external memory bandwidth requirements and increase rendering performance.
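The interleaved schedule can be sketched as follows. The Morton-code helper and the simple left-then-right pairing per tile are illustrative assumptions about one possible software realization, not the paper's hardware implementation:

```python
# Sketch of a Z^2-style tile schedule: tiles are visited in Morton (Z-curve)
# order, and the corresponding left- and right-eye tiles are rendered back to
# back so their shared texture and geometry data stay cache-resident.
# The grid size and the per-tile L/R pairing are illustrative assumptions.

def morton_index(x, y, bits=16):
    """Interleave the bits of x and y to get the Morton (Z-order) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

def z2_schedule(tiles_x, tiles_y):
    """Yield (eye, x, y) work items: each Z-ordered tile is processed for the
    left eye, then immediately for the right eye."""
    tiles = sorted(((x, y) for x in range(tiles_x) for y in range(tiles_y)),
                   key=lambda t: morton_index(*t))
    for x, y in tiles:
        yield ("L", x, y)
        yield ("R", x, y)

schedule = list(z2_schedule(2, 2))
# Tile (0, 0) is rendered for both eyes before the schedule moves on.
```

Because the left- and right-eye views of the same tile reference nearly identical scene data, rendering them adjacently maximizes reuse in the GPU caches.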
HART: A Hybrid Architecture for Ray Tracing Animated Scenes
Jae‐Ho Nah, Jinwoo Kim, Junho Park, Won‐Jong Lee, Jeong‐Soo Park, Seokyoon Jung, Woo-Chan Park, Dinesh Manocha, Tack‐Don Han
IF 6.5
IEEE Transactions on Visualization and Computer Graphics
We present a hybrid architecture, inspired by asynchronous BVH construction [1], for ray tracing animated scenes. Our hybrid architecture utilizes heterogeneous hardware resources: dedicated ray-tracing hardware for BVH updates and ray traversal, and a CPU for BVH reconstruction. We also present a traversal scheme using a primitive's axis-aligned bounding box (PrimAABB). This scheme reduces ray-primitive intersection tests by reusing existing BVH traversal units and the PrimAABB data for tree updates; it enables the use of shallow trees, which reduces tree build times, tree sizes, and bus bandwidth requirements. Furthermore, we present a cache scheme that exploits consecutive memory accesses by reusing data in an L1 cache block. We performed cycle-accurate simulations to verify our architecture, and the simulation results indicate that the proposed architecture can achieve real-time Whitted ray tracing of animated scenes at 1,920 × 1,200 resolution. This result comes from our high-performance hardware architecture and the minimized resource requirements for tree updates.
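The culling step that a PrimAABB-style scheme reuses is the standard slab-test ray/AABB intersection, the same logic a BVH traversal unit already applies to interior nodes. A minimal sketch (the ray/box representation here is an illustrative assumption):

```python
# Slab-test ray/AABB intersection: before a full ray-triangle test, the ray is
# tested against the primitive's axis-aligned bounding box with the same logic
# used for BVH-node traversal, so a miss skips the expensive triangle test.

def hit_aabb(origin, inv_dir, box_min, box_max):
    """Returns True if the ray (origin, precomputed 1/direction) hits the AABB."""
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        t0 = (box_min[axis] - origin[axis]) * inv_dir[axis]
        t1 = (box_max[axis] - origin[axis]) * inv_dir[axis]
        if t0 > t1:
            t0, t1 = t1, t0                      # order slab entry/exit
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False                         # slabs do not overlap: miss
    return True

inf = float("inf")
unit_box = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
# A ray along +x through the unit box hits; the same ray reversed misses.
print(hit_aabb((-2.0, 0.5, 0.5), (1.0, inf, inf), *unit_box))
print(hit_aabb((-2.0, 0.5, 0.5), (-1.0, inf, inf), *unit_box))
```

Precomputing the reciprocal direction once per ray, as above, is the usual trick that keeps the per-box cost to a few multiplies and compares.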
Jae‐Ho Nah, Hyuck-Joo Kwon, Dongseok Kim, Cheol-Ho Jeong, Jin‐Hong Park, Tack‐Don Han, Dinesh Manocha, Woo-Chan Park
IF 9.5
ACM Transactions on Graphics
We present RayCore, a mobile ray-tracing hardware architecture. RayCore facilitates high-quality rendering effects, such as reflection, refraction, and shadows, on mobile devices by performing real-time Whitted ray tracing. RayCore consists of two major components: ray-tracing units (RTUs) based on a unified traversal and intersection pipeline and a tree-building unit (TBU) for dynamic scenes. The overall RayCore architecture offers considerable benefits in terms of die area, memory access, and power consumption. We have evaluated our architecture based on FPGA and ASIC evaluations and demonstrate its performance on different benchmarks. According to the results, our architecture demonstrates high performance per unit area and unit energy, making it highly suitable for use in mobile devices.
Jae‐Ho Nah, Jeong‐Soo Park, Chanmin Park, Jin‐Woo Kim, Yun-Hye Jung, Woo-Chan Park, Tack‐Don Han
IF 9.5
ACM Transactions on Graphics
Ray tracing naturally supports high-quality global illumination effects, but it is computationally costly, and traversal and intersection operations dominate its computation. To accelerate these two operations, we propose a hardware architecture that integrates three novel approaches. First, we present an ordered depth-first layout and a traversal architecture using this layout to reduce the required memory bandwidth. Second, we propose a three-phase ray-triangle intersection architecture that takes advantage of early exit. Third, we propose a latency-hiding architecture defined as the ray accumulation unit. Cycle-accurate simulation results indicate that our architecture can achieve interactive distributed ray tracing.
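The idea behind a depth-first layout can be illustrated with a small sketch: flattening the tree in depth-first order places each node's first child at the very next array index, so downward traversal becomes near-sequential memory access. The node representation and the "skip" field below are illustrative assumptions, not the paper's exact node format:

```python
# Sketch of a depth-first acceleration-structure layout: nodes are flattened in
# depth-first order so each interior node's first child sits at the next array
# index; a 'skip' index records where traversal resumes when a subtree is missed.

def flatten_depth_first(node, out):
    """node: (left, right) for interiors or ('leaf', data) for leaves.
    Appends flat entries {'leaf': data_or_None, 'skip': index_after_subtree}."""
    entry = {"leaf": None, "skip": None}
    out.append(entry)
    if node[0] == "leaf":
        entry["leaf"] = node[1]
    else:
        left, right = node
        flatten_depth_first(left, out)   # left child lands at the next index
        flatten_depth_first(right, out)
    entry["skip"] = len(out)             # where traversal resumes on a miss
    return out

tree = ((("leaf", "A"), ("leaf", "B")), ("leaf", "C"))
flat = flatten_depth_first(tree, [])
# flat[1] is the left subtree's root, stored directly after the root at flat[0].
```

With this layout, a hit continues at `index + 1` and a miss jumps to `skip`, so no per-node child pointers need to be fetched on the common downward path.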
Considerations for the Acceleration Structure of Sound Propagation on Mobile Devices: Kd-Trees Versus Multi-Bounding Volume Hierarchies
Hyeon-ki Lee, Hyeju Kim, Dong-Yun Kim, Woo-Chan Park, Jae‐Ho Nah
IF 3.6
IEEE Access
Sound propagation algorithms can provide immersive auditory experiences for users in various domains, such as games and virtual/augmented reality. Recently, there has been a growing trend to apply spatial sound effects not only on desktops but also on mobile devices with limited computational resources. In geometric-acoustic (GA) sound propagation, which uses ray tracing for spatial audio effects, the choice of acceleration structure is an important factor that can either degrade or enhance performance. From this perspective, we seek to provide insights into the essential considerations for selecting acceleration structures for sound rendering on mobile devices. In this paper, we propose guidelines for mobile devices that address both how to select acceleration structures for sound rendering depending on scene characteristics and how to optimize the selected structures. We used kd-trees and multi-bounding volume hierarchies (MBVHs), both of which are widely used acceleration structures for ray tracing. According to our experiments on a Google Pixel 8a, when compared against baseline kd-trees, our optimization approach achieved performance improvements of up to 1.33× for optimized kd-trees and 1.44× for MBVHs with minimal increases in power consumption, and it also enabled an analysis of the advantages and disadvantages of each acceleration structure in various test scenes. We expect that our research will serve as a valuable reference for future studies on sound propagation and the broader multimedia community.
A Practical Encoding Approach for Texture Compression: Combining Multi-Processing and Multi-Threading
Hyeon-ki Lee, Jae-Ho Nah
High-resolution textures are critical for delivering immersive graphics. In game development, these textures are typically stored in compressed formats and encoded offline. However, when encoding a large number of textures in parallel, the performance benefits of multi-threading can be limited by bottlenecks, including image loading and decoding (e.g., PNG).
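The bottleneck the abstract describes can be sketched as a two-stage pipeline, so that slow image loading does not serialize the encoders. The file names and the fake load/encode stand-ins below are illustrative assumptions; in a real encoder the CPU-bound encode stage would typically run in a `ProcessPoolExecutor` to sidestep the GIL, while the I/O-bound load stage stays on threads:

```python
# Sketch of a two-stage batch-encoding pipeline: one pool loads/decodes source
# images while another compresses them. The file names and the stand-in
# load/encode functions are illustrative assumptions, not a real codec.
from concurrent.futures import ThreadPoolExecutor

def load_texture(name):
    """Stand-in for reading and decoding a PNG from disk (I/O-bound)."""
    return (name, [0] * 16)           # pretend 4x4 texel payload

def encode_texture(loaded):
    """Stand-in for the CPU-bound block-compression pass."""
    name, texels = loaded
    return (name, len(texels) // 16)  # pretend: number of compressed blocks

def encode_batch(names, load_workers=4, encode_workers=4):
    with ThreadPoolExecutor(load_workers) as loaders, \
         ThreadPoolExecutor(encode_workers) as encoders:
        loaded = loaders.map(load_texture, names)          # stage 1: I/O
        return list(encoders.map(encode_texture, loaded))  # stage 2: compute

results = encode_batch([f"tex_{i}.png" for i in range(8)])
```

Separating the stages lets the loader pool keep the encoder pool fed, which is the essence of combining multi-processing with multi-threading for batch texture encoding.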
Efficient Haze Removal from a Single Image Using a DCP-Based Lightweight U-Net Neural Network Model
Yunho Han, Jiyoung Kim, Jinyoung Lee, Jae‐Ho Nah, Yo‐Sung Ho, Woo-Chan Park
IF 3.5
Sensors
In this paper, we propose a lightweight U-Net neural network model based on the Dark Channel Prior (DCP) for efficient haze (fog) removal from a single input image. The conventional DCP involves operations of high computational complexity that are difficult to accelerate; the problem is exacerbated for high-resolution images and videos, making the method very difficult to apply in general-purpose applications. Our proposed model addresses this issue by employing a two-stage neural network structure, replacing the computationally complex operations of the conventional DCP with easily accelerated convolution operations to achieve high-quality fog removal. Furthermore, our proposed model is designed with an intuitive structure and a relatively small number of parameters (2M), utilizing resources efficiently. These features demonstrate the effectiveness and efficiency of the proposed model for fog removal. The experimental results show that the proposed model achieves an average Peak Signal-to-Noise Ratio (PSNR) of 26.65 dB and a Structural Similarity Index Measure (SSIM) of 0.88, an improvement of 11.5 dB in average PSNR and 0.22 in SSIM over the conventional DCP. This shows that the proposed network achieves results comparable to CNN-based networks with SOTA-class performance, despite its intuitive structure and relatively small parameter count.
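For context, the computation that DCP-based dehazing starts from is the dark channel itself: for every pixel, the minimum over the RGB channels within a local patch. A minimal pure-Python sketch (the patch size is an illustrative choice, and nested lists stand in for a real image array):

```python
# Sketch of the dark-channel computation underlying DCP-based dehazing:
# for every pixel, take the minimum over R, G, B within a local patch.
# Pure Python over nested lists for clarity; patch size is an illustrative choice.

def dark_channel(image, patch=3):
    """image: H x W list of (r, g, b) tuples with values in [0, 255].
    Returns an H x W list of dark-channel values."""
    h, w = len(image), len(image[0])
    half = patch // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = []
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        vals.append(min(image[ny][nx]))  # min over R, G, B
            out[y][x] = min(vals)                        # min over the patch
    return out

hazy = [[(200, 210, 220)] * 4 for _ in range(4)]   # bright, hazy region
clear = [[(10, 120, 200)] * 4 for _ in range(4)]   # region with a dark channel
# Haze-free regions tend toward low dark-channel values; hazy ones stay high,
# which is the prior the estimation of the transmission map builds on.
```

The per-pixel patch minimum is exactly the kind of irregular, data-dependent operation that is hard to accelerate, which motivates replacing it with convolutions in the proposed network.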
An Architecture and Implementation of Real-Time Sound Propagation Hardware for Mobile Devices
Eunjae Kim, Sukwon Choi, J.K. Kim, Jae‐Ho Nah, Woonam Jung, Tae-Hyeong Lee, Yeon-Kug Moon, Woo-Chan Park
This paper presents a high-performance, low-power hardware architecture for real-time sound rendering on mobile devices. Traditional sound rendering algorithms require high-performance CPUs or GPUs because of the high computational complexity of realizing ultra-realistic 3D audio; thus, it has been hard to achieve real-time rates on low-power mobile devices. To overcome this limitation, we propose a hardware architecture that adopts hardware-friendly sound-propagation-path calculation algorithms. We verified the function and performance of our architecture through its implementation on an FPGA board. According to an ASIC evaluation with 8-nm process technology, it achieves high performance (120 FPS), low power consumption (50 mW), and a small silicon area (0.31 mm<sup>2</sup>), enabling real-time sound rendering on mobile devices.