Sievenet: An Efficient Model Utilizing H.265 Codec Structure for Video Object Detection

Onur Can Koyun*, Behcet Ugur Toreyin

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Object detection is a crucial task in video content analysis. The coding structures of the High Efficiency Video Coding (H.265, HEVC) standard are strongly correlated with the video content, which creates an opportunity to use these structures for computationally efficient video object detection. To exploit this, we present a video object detection method that partitions frames into macroblocks based on the H.265 structure. Blocks with spatially high-frequency content pass through a dynamic-layer approach that analyzes them with more layers, while blocks with spatially low-frequency content pass through fewer layers to reduce the computational load. Results on the ImageNet VID dataset indicate that our approach has the potential to save significant computational resources while maintaining accurate object detection performance.
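The abstract does not say how blocks are classified or how many layers each one receives. The following is only a minimal sketch of the general idea, under stated assumptions: the encoder's quadtree split depth stands in for spatial detail, and a linear rule maps that depth onto a layer budget. The names Block, split_depth, layers_for_block and relative_cost are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    """One frame block with a codec-derived detail score (hypothetical fields)."""
    x: int
    y: int
    size: int         # block edge length in pixels
    split_depth: int  # assumed proxy for detail: quadtree depth picked by the encoder


def layers_for_block(block: Block, max_depth: int,
                     min_layers: int, max_layers: int) -> int:
    """Map split depth linearly onto a layer budget (illustrative rule only)."""
    if max_depth <= 0:
        return max_layers
    depth = min(max(block.split_depth, 0), max_depth)  # clamp to the valid range
    return min_layers + round((max_layers - min_layers) * depth / max_depth)


def relative_cost(blocks: List[Block], max_depth: int,
                  min_layers: int, max_layers: int) -> float:
    """Area-weighted layer evaluations, as a fraction of running
    max_layers on every block."""
    total_area = sum(b.size * b.size for b in blocks)
    if total_area == 0:
        return 0.0
    used = sum(
        layers_for_block(b, max_depth, min_layers, max_layers) * b.size * b.size
        for b in blocks
    )
    return used / (max_layers * total_area)


# Toy frame: four 64x64 blocks with made-up split depths.
blocks = [
    Block(0, 0, 64, split_depth=0),    # flat region, shallow budget
    Block(64, 0, 64, split_depth=1),
    Block(0, 64, 64, split_depth=3),   # finely split region, full budget
    Block(64, 64, 64, split_depth=0),
]
budget = [layers_for_block(b, max_depth=3, min_layers=2, max_layers=8) for b in blocks]
cost = relative_cost(blocks, max_depth=3, min_layers=2, max_layers=8)
```

With these made-up inputs, the per-block budgets are 2, 4, 8 and 2 layers, and the frame uses half of the layer evaluations that a uniform 8-layer pass would need. The figure only illustrates the arithmetic of the trade-off and says nothing about the savings reported in the paper.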

Original language: English
Title of host publication: ICASSPW 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350302615
Publication status: Published - 2023
Event: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, ICASSPW 2023 - Rhodes Island, Greece
Duration: 4 Jun 2023 → 10 Jun 2023

Publication series

Name: ICASSPW 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, Proceedings

Conference

Conference: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, ICASSPW 2023
Country/Territory: Greece
City: Rhodes Island
Period: 4/06/23 → 10/06/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Keywords

  • Compressed Domain Video Analysis
  • Deep learning
  • H.265
  • HEVC
  • Video Object Detection
