Focus-and-Detect: A small object detection framework for aerial images

Onur Can Koyun*, Reyhan Kevser Keser, İbrahim Batuhan Akkaya, Behçet Uğur Töreyin

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

36 Citations (Scopus)

Abstract

Despite recent advances, object detection in aerial images remains a challenging task. Problems specific to aerial images, such as small objects, densely packed objects, and objects of different sizes and orientations, make detection harder. To address the small object detection problem, we propose a two-stage object detection framework called “Focus-and-Detect”. The first stage, which consists of an object detector network supervised by a Gaussian Mixture Model, generates clusters of objects constituting the focused regions. The second stage, which is also an object detector network, predicts objects within the focal regions. An Incomplete Box Suppression (IBS) method is also proposed to overcome the truncation effect of the region search approach. Results indicate that the proposed two-stage framework achieves an AP score of 42.06 on the VisDrone validation dataset, surpassing all other state-of-the-art small object detection methods reported in the literature, to the best of the authors’ knowledge.
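
The abstract does not spell out how detections from the focal regions are merged or how IBS decides that a box is incomplete. As a rough illustration only, the sketch below shows one plausible way to map second-stage detections back to full-image coordinates and discard boxes cut off by a region boundary. The border-touch rule, the `margin` parameter, and all function names are assumptions made for this example; they are not the authors' IBS method.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def to_image_coords(box: Box, region: Box) -> Box:
    """Shift a box predicted inside a cropped region to full-image coordinates."""
    rx1, ry1, _, _ = region
    x1, y1, x2, y2 = box
    return (x1 + rx1, y1 + ry1, x2 + rx1, y2 + ry1)


def is_truncated(box: Box, region: Box, image_size: Tuple[float, float],
                 margin: float = 1.0) -> bool:
    """Assumed heuristic (hypothetical, not the paper's IBS): a box is treated
    as truncated if it lies within `margin` pixels of a region edge that is
    not also an edge of the full image."""
    width, height = image_size
    x1, y1, x2, y2 = box
    rx1, ry1, rx2, ry2 = region
    return (
        (x1 - rx1 <= margin and rx1 > 0)
        or (y1 - ry1 <= margin and ry1 > 0)
        or (rx2 - x2 <= margin and rx2 < width)
        or (ry2 - y2 <= margin and ry2 < height)
    )


def merge_region_detections(regions: List[Box],
                            detections_per_region: List[List[Box]],
                            image_size: Tuple[float, float],
                            margin: float = 1.0) -> List[Box]:
    """Collect per-region detections in image coordinates, dropping boxes
    that the heuristic above flags as truncated by a region boundary."""
    kept: List[Box] = []
    for region, detections in zip(regions, detections_per_region):
        for box in detections:
            full = to_image_coords(box, region)
            if not is_truncated(full, region, image_size, margin):
                kept.append(full)
    return kept


# Example: one focal region inside a 640x480 image, two boxes in crop coordinates.
regions = [(100.0, 100.0, 300.0, 300.0)]
detections = [[(50.0, 50.0, 80.0, 80.0), (0.0, 20.0, 30.0, 60.0)]]
merged = merge_region_detections(regions, detections, (640.0, 480.0))
```

In this example the second box starts exactly on the left edge of the region, so the heuristic treats it as cut off and leaves it to be recovered by a neighbouring region or a full-image pass. A real system would still need a final non-maximum suppression step across regions.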

Original language: English
Article number: 116675
Journal: Signal Processing: Image Communication
Volume: 104
Publication status: Published - May 2022

Bibliographical note

Publisher Copyright:
© 2022 Elsevier B.V.

Funding

This work was supported in part by ASELSAN Inc., Turkey, with grant number 65834 STB, and in part by the Scientific and Technical Research Council of Turkey, TÜBİTAK, with grant number 121E378.

Funders (funder number):
TÜBİTAK
Türkiye Bilimsel ve Teknolojik Araştırma Kurumu: 121E378
Aselsan: 65834 STB

    Keywords

    • Aerial images
    • Object detection
    • Region search
    • Small object detection
