Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation
Neural implicit video representations such as NeRV have emerged as a powerful alternative to traditional video codecs. However, the high computational cost and full-precision storage of NeRV limit its practicality for resource-constrained and embedded platforms. In this work, we propose Binary-NeRV,...
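| Main Author: | Shanableh, Tamer |
|---|---|
| Format: | article |
| Published: | 2026 |
| Subjects: | Neural video representation; Binary neural networks; Deep video coding; Model binarization; Deep learning |
| Online Access: | https://hdl.handle.net/11073/33172 |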
| _version_ | 1864513434901544960 |
|---|---|
| author | Shanableh, Tamer |
| author_facet | Shanableh, Tamer |
| author_role | author |
| dc.creator.none.fl_str_mv | Shanableh, Tamer |
| dc.date.none.fl_str_mv | 2026-02-23T07:36:25Z; 2026-02-23T07:36:25Z; 2026 |
| dc.format.none.fl_str_mv | application/pdf |
| dc.identifier.none.fl_str_mv | T. Shanableh, "Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation," in IEEE Access, vol. 14, pp. 16143-16157, 2026, doi: 10.1109/ACCESS.2026.3658604; 2169-3536; https://hdl.handle.net/11073/33172; 10.1109/ACCESS.2026.3658604 |
| dc.language.none.fl_str_mv | en_US |
| dc.publisher.none.fl_str_mv | IEEE |
| dc.relation.none.fl_str_mv | https://doi.org./10.1109/ACCESS.2026.3658604 |
| dc.subject.none.fl_str_mv | Neural video representation; Binary neural networks; Deep video coding; Model binarization; Deep learning |
| dc.title.none.fl_str_mv | Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation |
| dc.type.none.fl_str_mv | Peer-Reviewed; Published version; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article |
| description | Neural implicit video representations such as NeRV have emerged as a powerful alternative to traditional video codecs. However, the high computational cost and full-precision storage of NeRV limit its practicality for resource-constrained and embedded platforms. In this work, we propose Binary-NeRV, a hybrid-precision extension of NeRV that integrates XNOR-based binary convolutions into the decoding pipeline to significantly reduce model size, bitrate, and computational complexity while reasonably preserving reconstruction fidelity. Inspired by XNOR-Net, convolutional weights are binarized to ±1 with learned scaling factors, while critical components such as the stem MLP, normalization layers, activations, skip connections, and upsampling operators are selectively retained in FP32 to ensure stable training and high visual quality. We introduce two directional progressive binarization strategies, Left-to-Right (L2R) and Right-to-Left (R2L), to analyze the impact of binarizing layers at different spatial resolutions. A detailed complexity analysis shows that the final high-resolution convolutional layer accounts for over 75% of NeRV’s computational cost, which makes targeted binarization highly effective. Extensive experiments on standard video sequences demonstrate that selectively binarizing only the deepest layer achieves up to 68–89% reduction in equivalent GFLOPs with reasonable degradation in PSNR and MS-SSIM. Temporal consistency analysis shows that selective binarization preserves temporal stability, while aggressive full-component binarization introduces noticeable flickering artifacts. We additionally present an ablation study where all convolutional layers are binarized, both before and after pruning and quantization. These results systematically validate the hybrid-precision design, showing that full binarization yields substantial bitrate reductions but at a notable quality cost, thereby justifying the selective binarization strategy adopted in Binary-NeRV. Compared with state-of-the-art NeRV variants, Binary-NeRV delivers substantial efficiency gains, establishing hybrid binarization as a practical and scalable approach for efficient neural video representation. |
| format | article |
| id | aus_d15ff037170c5e04b67416d298497a2e |
| identifier_str_mv | T. Shanableh, "Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation," in IEEE Access, vol. 14, pp. 16143-16157, 2026, doi: 10.1109/ACCESS.2026.3658604; 2169-3536; 10.1109/ACCESS.2026.3658604 |
| language_invalid_str_mv | en_US |
| network_acronym_str | aus |
| network_name_str | aus |
| oai_identifier_str | oai:repository.aus.edu:11073/33172 |
| publishDate | 2026 |
| publisher.none.fl_str_mv | IEEE |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| spelling | Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation; Shanableh, Tamer; Neural video representation; Binary neural networks; Deep video coding; Model binarization; Deep learning; Neural implicit video representations such as NeRV have emerged as a powerful alternative to traditional video codecs. However, the high computational cost and full-precision storage of NeRV limit its practicality for resource-constrained and embedded platforms. In this work, we propose Binary-NeRV, a hybrid-precision extension of NeRV that integrates XNOR-based binary convolutions into the decoding pipeline to significantly reduce model size, bitrate, and computational complexity while reasonably preserving reconstruction fidelity. Inspired by XNOR-Net, convolutional weights are binarized to ±1 with learned scaling factors, while critical components such as the stem MLP, normalization layers, activations, skip connections, and upsampling operators are selectively retained in FP32 to ensure stable training and high visual quality. We introduce two directional progressive binarization strategies, Left-to-Right (L2R) and Right-to-Left (R2L), to analyze the impact of binarizing layers at different spatial resolutions. A detailed complexity analysis shows that the final high-resolution convolutional layer accounts for over 75% of NeRV’s computational cost, which makes targeted binarization highly effective. Extensive experiments on standard video sequences demonstrate that selectively binarizing only the deepest layer achieves up to 68–89% reduction in equivalent GFLOPs with reasonable degradation in PSNR and MS-SSIM. Temporal consistency analysis shows that selective binarization preserves temporal stability, while aggressive full-component binarization introduces noticeable flickering artifacts. We additionally present an ablation study where all convolutional layers are binarized, both before and after pruning and quantization. These results systematically validate the hybrid-precision design, showing that full binarization yields substantial bitrate reductions but at a notable quality cost, thereby justifying the selective binarization strategy adopted in Binary-NeRV. Compared with state-of-the-art NeRV variants, Binary-NeRV delivers substantial efficiency gains, establishing hybrid binarization as a practical and scalable approach for efficient neural video representation.; American University of Sharjah; IEEE; 2026-02-23T07:36:25Z; 2026-02-23T07:36:25Z; 2026; Peer-Reviewed; Published version; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; application/pdf; T. Shanableh, "Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation," in IEEE Access, vol. 14, pp. 16143-16157, 2026, doi: 10.1109/ACCESS.2026.3658604; 2169-3536; https://hdl.handle.net/11073/33172; 10.1109/ACCESS.2026.3658604; en_US; https://doi.org./10.1109/ACCESS.2026.3658604; oai:repository.aus.edu:11073/33172; 2026-02-23T08:17:33Z |
| spellingShingle | Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation; Shanableh, Tamer; Neural video representation; Binary neural networks; Deep video coding; Model binarization; Deep learning |
| status_str | publishedVersion |
| title | Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation |
| title_full | Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation |
| title_fullStr | Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation |
| title_full_unstemmed | Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation |
| title_short | Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation |
| title_sort | Binary-NeRV: Hybrid-Precision Weights Binarization for Efficient Neural Video Representation |
| topic | Neural video representation; Binary neural networks; Deep video coding; Model binarization; Deep learning |
| url | https://hdl.handle.net/11073/33172 |
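
The description above states that convolutional weights are binarized to ±1 with scaling factors, in the style of XNOR-Net. The following is a minimal sketch of that idea for a single flattened filter, not the paper's implementation. It assumes the closed-form XNOR-Net scale (the mean absolute weight), whereas the abstract describes learned scaling factors. The function names `binarize_weights`, `binary_dot`, and `storage_bits` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def binarize_weights(weights: List[float]) -> Tuple[List[int], float]:
    """Approximate a filter W by alpha * B, where every entry of B is +1 or -1.

    Assumption: alpha is the mean absolute weight (XNOR-Net closed form).
    The abstract describes learned scales, which this sketch does not model.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    alpha = sum(abs(w) for w in weights) / len(weights)
    # sign(w), with zero mapped to +1 so every entry is exactly +1 or -1.
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, alpha


def binary_dot(signs: List[int], alpha: float, inputs: List[float]) -> float:
    """Dot product with binarized weights and full-precision inputs.

    Each term is an addition or subtraction; one multiply by alpha follows.
    """
    acc = 0.0
    for s, x in zip(signs, inputs):
        acc += x if s > 0 else -x
    return alpha * acc


def storage_bits(num_weights: int, binarized: bool) -> int:
    """Bits needed for one filter: 1 bit per weight plus one FP32 scale,
    or 32 bits per weight when the filter is kept in FP32."""
    return (num_weights + 32) if binarized else (32 * num_weights)


# Toy filter with four weights (illustrative values, not from the paper).
filter_w = [0.5, -0.25, 0.75, -1.0]
signs, alpha = binarize_weights(filter_w)
result = binary_dot(signs, alpha, [1.0, 2.0, 3.0, 4.0])
print(signs, alpha, result)
print(storage_bits(len(filter_w), True), storage_bits(len(filter_w), False))
```

For realistic filter sizes the single FP32 scale is negligible, so a binarized layer needs roughly 1/32 of the weight storage of its FP32 counterpart. This is why binarizing only the most expensive layer can already yield a large saving.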