MC Complexity Reduction for Generalized P and B Pictures in HEVC
Kyung‐Yong Kim, Hui Yong Kim, Jin Soo Choi, Gwang Hoon Park
IF 11.1
IEEE Transactions on Circuits and Systems for Video Technology
Motion compensation (MC) is a critical component of video codecs in terms of both computational complexity and memory bandwidth. For UHD content, the MC complexity of High Efficiency Video Coding (HEVC) is significantly higher than that of H.264/AVC. This paper reveals and analyzes a feature of generalized P and B pictures in HEVC and introduces a simple yet effective MC complexity reduction method that can be exploited at both the encoder and decoder without affecting compression performance. The proposed method bypasses the \(L1\) interpolation process when the \(L0\) and \(L1\) motion information of a bi-predicted block are identical. Simulation results show that time reductions of 14.5% for the encoder and 6.4% for the decoder were achieved under the low-delay B (LD-B) configuration without any change in coding results. The proposed method was adopted into the HEVC test model as a non-normative complexity reduction tool.
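The bypass condition described in the abstract can be sketched as follows. This is a minimal illustration, not the HM implementation; the `MotionInfo` fields and the `interpolate` callback are hypothetical simplifications of the actual motion data and interpolation filter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MotionInfo:
    # Hypothetical simplification: one reference picture plus a motion vector.
    ref_poc: int          # picture order count of the reference picture
    mv: tuple             # motion vector (x, y) in quarter-pel units

def bi_prediction(l0: MotionInfo, l1: MotionInfo, interpolate):
    """Average the L0/L1 predictions, skipping the L1 interpolation when
    the two motion fields point at the same reference block.
    Returns (prediction samples, whether L1 was bypassed)."""
    pred0 = interpolate(l0)
    if l0 == l1:
        # Identical motion information: the L1 prediction would equal pred0,
        # so reuse it instead of running the interpolation filter again.
        return pred0, True
    pred1 = interpolate(l1)
    # Rounded average of the two predictions, as in standard bi-prediction.
    return [(a + b + 1) // 2 for a, b in zip(pred0, pred1)], False
```

Because the check happens before the second interpolation call, the saving applies identically at encoder and decoder and leaves the reconstructed samples bit-exact, which is why the tool could be adopted as non-normative.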
Deblocking Filtering for Illumination Compensation in Multiview Video Coding
Gwang Hoon Park, Min Woo Park, Sung-Chang Lim, Woo Sung Shim, Yung-Lyul Lee
IF 11.1
IEEE Transactions on Circuits and Systems for Video Technology
This paper introduces a deblocking filtering method that reduces the blocking artifacts caused by the illumination change-adaptive motion compensation method already adopted in the multiview video coding (MVC) standard. When macroblock (MB)-based illumination compensation is used to compensate for local illumination changes in multiview video sequences, horizontally and vertically neighboring MBs can carry different illumination values, each representing the average luminance of its MB. These differences produce horizontal and vertical blocking artifacts along the MB boundaries. The proposed method reduces these artifacts by appropriately modifying the boundary strength derivation process of the existing H.264/AVC deblocking filter. It improves subjective video quality and provides slightly better objective video quality for multiview video sequences.
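The kind of boundary-strength modification the abstract describes can be sketched as below. The exact rule is not given in the abstract, so this is an assumed, illustrative policy: when two neighboring macroblocks use illumination compensation with different offsets, the boundary is forced to be filtered even if the standard H.264/AVC derivation would leave it unfiltered.

```python
def boundary_strength(bs_h264, ic_left, ic_right, offset_left, offset_right):
    """Hypothetical sketch of a modified boundary-strength (BS) rule:
    start from the H.264/AVC BS value and raise it when the two
    neighboring macroblocks carry different illumination-compensation
    offsets, since the offset step itself creates a visible edge."""
    if (ic_left or ic_right) and offset_left != offset_right:
        # Differing illumination offsets create a luminance step at the
        # MB edge; ensure at least the weakest filtering is applied.
        return max(bs_h264, 1)
    return bs_h264
```

The design point is that the illumination step is a coding artifact rather than picture content, so it is safe to strengthen filtering across exactly those boundaries without blurring genuine edges elsewhere.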
Multi-Scale Self-Attention-Based Convolutional-Neural-Network Post-Filtering for AV1 Codec: Towards Enhanced Visual Quality and Overall Coding Performance
Woowoen Gwun, Kiho Choi, Gwang Hoon Park
IF 2.2
Mathematics
This paper presents MS-MTSA, a multi-scale multi-type self-attention network designed to enhance AV1-compressed video through targeted post-filtering. The objective is to address two persistent artifact issues observed in our previous MTSA model: visible seams at patch boundaries and grid-like distortions from upsampling. To this end, MS-MTSA introduces two key architectural enhancements. First, multi-scale block-wise self-attention applies sequential attention over 16 × 16 and 12 × 12 blocks to better capture local context and improve spatial continuity. Second, refined patch-wise self-attention includes a lightweight convolutional refinement layer after upsampling to suppress structured artifacts in flat regions. These targeted modifications significantly improve both perceptual and quantitative quality. The proposed network achieves BD-rate reductions of 12.44% for Y, 21.70% for Cb, and 19.90% for Cr compared to the AV1 anchor. Visual evaluations confirm improved texture fidelity and reduced seam artifacts, demonstrating the effectiveness of combining multi-scale attention and structural refinement for artifact suppression in compressed video.
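The block-wise attention idea above can be illustrated with a minimal NumPy sketch: plain dot-product self-attention applied independently inside each non-overlapping 16 × 16 window of a feature map. The learned Q/K/V projections of the actual network are replaced by identity maps here for brevity; the 12 × 12 pass follows the same pattern with a different `block` value.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def blockwise_self_attention(feat, block=16):
    """Apply dot-product self-attention independently inside each
    non-overlapping (block x block) window of an (H, W, C) feature map.
    Learned Q/K/V projections are omitted (identity) for brevity."""
    h, w, c = feat.shape
    assert h % block == 0 and w % block == 0, "pad the feature map first"
    out = np.empty_like(feat)
    for y in range(0, h, block):
        for x in range(0, w, block):
            # Flatten the window into (block*block, C) tokens.
            tokens = feat[y:y + block, x:x + block].reshape(-1, c)
            attn = softmax(tokens @ tokens.T / np.sqrt(c))  # (N, N) weights
            out[y:y + block, x:x + block] = (attn @ tokens).reshape(block, block, c)
    return out
```

Restricting attention to local blocks keeps the cost quadratic only in the window size, not the frame size, which is what makes this practical as a per-frame post-filter.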
Multi-Type Self-Attention-Based Convolutional-Neural-Network Post-Filtering for AV1 Codec
Woowoen Gwun, Kiho Choi, Gwang Hoon Park
IF 2.2
Mathematics
Over the past few years, there has been substantial interest and research activity surrounding the application of Convolutional Neural Networks (CNNs) for post-filtering in video coding. Most current research has focused on CNNs with various kernel sizes, primarily targeting High-Efficiency Video Coding/H.265 (HEVC) and Versatile Video Coding/H.266 (VVC). This narrow focus has limited the exploration of such techniques for other video coding standards such as AV1, developed by the Alliance for Open Media. AV1 offers excellent compression efficiency, reducing bandwidth usage while improving video quality, which makes it highly attractive for modern streaming and media applications. This paper introduces a novel approach that extends beyond traditional CNN methods by integrating three different self-attention layers into the CNN framework. Applied to the AV1 codec, the proposed method significantly improves video quality through these distinct self-attention layers, demonstrating the potential of self-attention mechanisms to advance post-filtering beyond the limitations of convolution-based methods. The experimental results show that the proposed network achieves an average BD-rate reduction of 10.40% for the Luma component and 19.22% and 16.52% for the Chroma components compared to the AV1 anchor. Visual quality assessments further validated the effectiveness of the approach, showing substantial artifact reduction and detail enhancement.
Low‐complexity patch projection method for efficient and lightweight point‐cloud compression
Sungryeul Rhyu, Junsik Kim, Gwang Hoon Park, Kyuheon Kim
IF 1.6
ETRI Journal
The point cloud provides viewers with intuitive geometric understanding but requires a huge amount of data. The Moving Picture Experts Group (MPEG) has developed video‐based point‐cloud compression achieving compression ratios in the range of 300–700. As the compression ratio increases, so does the complexity, to the extent that compressing one frame takes 101.36 s in an experimental environment on a personal computer. To enable real‐time point‐cloud compression, the direct patch projection (DPP) method proposed herein simplifies the complex patch segmentation process by classifying and projecting points according to their geometric positions. DPP reduces the complexity of patch segmentation from 25.75 s to 0.10 s per frame, making the entire process 8.76 times faster than the conventional one. Consequently, the proposed DPP method yields peak signal‐to‐noise ratio (PSNR) outcomes similar to those of the conventional method while running 4.7–5.5 times faster, at the cost of bitrate overhead. The objective and subjective results show that the proposed DPP method is a viable option when low complexity is required in lightweight device environments.
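One plausible reading of "classifying and projecting points according to their geometric positions" can be sketched as follows. This is an assumed illustration, not the paper's exact rule: each point is assigned to a projection axis by the dominant component of its offset from the cloud centroid, then projected onto the plane orthogonal to that axis, keeping the dropped coordinate as depth.

```python
import numpy as np

def direct_patch_projection(points):
    """Illustrative position-based projection (assumed rule): classify
    each point by the dominant axis of its offset from the centroid,
    then drop that axis, keeping its value as the depth sample.
    Returns, per axis (0=x, 1=y, 2=z), an array of (u, v, depth) rows."""
    pts = np.asarray(points, dtype=float)
    offsets = pts - pts.mean(axis=0)
    axis = np.abs(offsets).argmax(axis=1)  # dominant axis per point
    patches = {}
    for a in range(3):
        sel = pts[axis == a]
        keep = [i for i in range(3) if i != a]  # the 2-D patch coordinates
        patches[a] = np.column_stack([sel[:, keep], sel[:, a]])
    return patches
```

A purely positional rule like this needs no normal estimation or iterative refinement, which is where the bulk of the conventional patch-segmentation time goes; the trade-off, as the abstract notes, is bitrate overhead from less coherent patches.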
Seam Generation Matrix Based on a Guided Energy-Depth Map for Image and Video Stitching
Seongbae Rhee, Gwang Hoon Park, Kyuheon Kim
IF 3.6
IEEE Access
An image captured by a single camera has a smaller viewing angle than that of the human eye. One method to expand this viewing angle is image stitching, which generates a wider view from images captured by multiple cameras. Although this technique is used across multiple industries, it is vulnerable to parallax distortion, wherein objects disappear from or appear repeatedly in stitched images when the parallax between cameras differs significantly. To minimize parallax distortion, seam-based and multi-homography-based methods have been proposed. In particular, the seam-based method enables faster image stitching owing to its intuitive procedure; however, it may still incur parallax distortion under certain restrictive circumstances, and a longer stitching time is required when it is applied to video sequences. This motivated us to develop the Guided Energy–Depth Map, which combines the energy function, depth information, and a guidance map to minimize parallax distortion from a human visual perspective and to reduce the time required to stitch video sequences. Based on Average Seam Error (ASE) evaluation, the proposed method produces better seams than energy functions alone in 25 of 32 experimental datasets, with an ASE improvement rate of 15.58%. Also, the Frame Selection module for video stitching proposed in this paper requires only 7.27% of the time needed by the instance segmentation-based frame selection method to find a frame for seam regeneration.
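The three cues the abstract names (energy function, depth, guidance map) can be combined in a simple cost map, sketched below. The weighting scheme and the assumption that larger depth values mean closer objects are illustrative choices, not the paper's formulation.

```python
import numpy as np

def guided_energy_depth_map(gray, depth, guide, alpha=0.5, beta=0.25):
    """Illustrative combination of the three cues (assumed weights):
    - gradient-magnitude energy, as in classic seam carving;
    - a depth term that penalizes cutting through near objects
      (assumes larger depth values mean closer to the camera);
    - a guidance map marking regions the seam should avoid.
    The seam is then routed through low-cost pixels of the result."""
    gy, gx = np.gradient(gray.astype(float))
    energy = np.hypot(gx, gy)        # classic image-gradient energy
    energy += alpha * depth          # near (salient) objects cost more
    energy += beta * guide           # guided regions cost more
    return energy
```

Raising the cost of near and guided regions pushes the seam into distant, uniform background, which is where a misplaced cut is least visible to a human observer.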