Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation

Neural implicit video representations such as NeRV have emerged as a powerful alternative to traditional video codecs. However, the high computational cost and full-precision storage of NeRV limit its practicality for resource-constrained and embedded platforms. In this work, we propose Binary-NeRV,...

Bibliographic details
Main author: Shanableh, Tamer (author)
Format: article
Published: 2026
Subjects: Neural video representation; Binary neural networks; Deep video coding; Model binarization; Deep learning
Online access: https://hdl.handle.net/11073/33172
_version_ 1864513434901544960
author Shanableh, Tamer
author_facet Shanableh, Tamer
author_role author
dc.creator.none.fl_str_mv Shanableh, Tamer
dc.date.none.fl_str_mv 2026-02-23T07:36:25Z
2026-02-23T07:36:25Z
2026
dc.format.none.fl_str_mv application/pdf
dc.identifier.none.fl_str_mv T. Shanableh, "Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation," in IEEE Access, vol. 14, pp. 16143-16157, 2026, doi: 10.1109/ACCESS.2026.3658604.
2169-3536
https://hdl.handle.net/11073/33172
10.1109/ACCESS.2026.3658604
dc.language.none.fl_str_mv en_US
dc.publisher.none.fl_str_mv IEEE
dc.relation.none.fl_str_mv https://doi.org./10.1109/ACCESS.2026.3658604
dc.subject.none.fl_str_mv Neural video representation
Binary neural networks
Deep video coding
Model binarization
Deep learning
dc.title.none.fl_str_mv Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation
dc.type.none.fl_str_mv Peer-Reviewed
Published version
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
description Neural implicit video representations such as NeRV have emerged as a powerful alternative to traditional video codecs. However, the high computational cost and full-precision storage of NeRV limit its practicality for resource-constrained and embedded platforms. In this work, we propose Binary-NeRV, a hybrid-precision extension of NeRV that integrates XNOR-based binary convolutions into the decoding pipeline to significantly reduce model size, bitrate, and computational complexity while reasonably preserving reconstruction fidelity. Inspired by XNOR-Net, convolutional weights are binarized to ±1 with learned scaling factors, while critical components such as the stem MLP, normalization layers, activations, skip connections, and upsampling operators are selectively retained in FP32 to ensure stable training and high visual quality. We introduce two directional progressive binarization strategies, Left-to-Right (L2R) and Right-to-Left (R2L), to analyze the impact of binarizing layers at different spatial resolutions. A detailed complexity analysis shows that the final high-resolution convolutional layer accounts for over 75% of NeRV’s computational cost, enabling highly effective targeted binarization. Extensive experiments on standard video sequences demonstrate that selectively binarizing only the deepest layer achieves reductions of up to 68–89% in equivalent GFLOPs with reasonable degradation in PSNR and MS-SSIM. Temporal consistency analysis shows that selective binarization preserves temporal stability, while aggressive full-component binarization introduces noticeable flickering artifacts. We additionally present an ablation study in which all convolutional layers are binarized, both before and after pruning and quantization. These results systematically validate the hybrid-precision design, showing that full binarization yields substantial bitrate reductions but at a notable quality cost, thereby justifying the selective binarization strategy adopted in Binary-NeRV. Compared with state-of-the-art NeRV variants, Binary-NeRV delivers substantial efficiency gains, establishing hybrid binarization as a practical and scalable approach for efficient neural video representation.
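The weight binarization mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It assumes the closed-form scaling from XNOR-Net (one scale per output filter, set to the mean absolute weight), whereas the abstract describes learned scaling factors. The function names `binarize_weights` and `reconstruct` and the tensor shape are hypothetical.

```python
import numpy as np

def binarize_weights(w):
    """Split conv weights into per-filter scales and a +/-1 sign tensor.

    Assumption: alpha is the mean absolute weight of each output filter
    (the XNOR-Net closed form), not a learned parameter.
    """
    w = np.asarray(w, dtype=np.float32)
    # One scale per output channel (axis 0), averaged over the remaining axes.
    alpha = np.abs(w).reshape(w.shape[0], -1).mean(axis=1)
    # Map zero to +1 so every entry is strictly +1 or -1.
    signs = np.where(w >= 0, 1.0, -1.0).astype(np.float32)
    return alpha, signs

def reconstruct(alpha, signs):
    """Approximate the original weights as alpha * signs, broadcast per filter."""
    shape = (-1,) + (1,) * (signs.ndim - 1)
    return alpha.reshape(shape) * signs

# Hypothetical layer: 8 output filters, 4 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 4, 3, 3)).astype(np.float32)
alpha, signs = binarize_weights(w)
w_hat = reconstruct(alpha, signs)
```

For this toy tensor, FP32 storage needs 288 × 32 = 9216 bits, while the binarized form needs 288 sign bits plus 8 FP32 scales, or 544 bits. The actual savings reported for Binary-NeRV depend on which layers are binarized and on the pruning and quantization applied.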
format article
id aus_d15ff037170c5e04b67416d298497a2e
identifier_str_mv T. Shanableh, "Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation," in IEEE Access, vol. 14, pp. 16143-16157, 2026, doi: 10.1109/ACCESS.2026.3658604.
2169-3536
10.1109/ACCESS.2026.3658604
language_invalid_str_mv en_US
network_acronym_str aus
network_name_str aus
oai_identifier_str oai:repository.aus.edu:11073/33172
publishDate 2026
publisher.none.fl_str_mv IEEE
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
spelling American University of Sharjah; oai:repository.aus.edu:11073/33172; 2026-02-23T08:17:33Z
status_str publishedVersion
title Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation
topic Neural video representation
Binary neural networks
Deep video coding
Model binarization
Deep learning
url https://hdl.handle.net/11073/33172