HaLViT: Half of the Weights are Enough

Onur Can Koyun*, Behçet Uğur Töreyin

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Deep learning architectures such as Transformers and Convolutional Neural Networks (CNNs) have led to ground-breaking advances across numerous fields. However, their large parameter counts pose challenges for deployment in resource-constrained environments. In our research, we propose a strategy that exploits the column and row spaces of weight matrices, significantly reducing the number of required model parameters without substantially affecting performance. Applied to both Bottleneck and Attention layers, this technique achieves a notable reduction in parameters with minimal impact on model efficacy. Our proposed model, HaLViT, exemplifies a parameter-efficient Vision Transformer. Through rigorous experiments on the ImageNet and COCO datasets, HaLViT's performance validates the effectiveness of our method, offering results comparable to those of conventional models.
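The abstract does not spell out the mechanism, but one plausible reading of "exploiting the column and row spaces" is that a single stored matrix serves two linear layers: the up-projection uses W directly, and the down-projection reuses its transpose. The sketch below illustrates this weight-sharing idea for a bottleneck (two-layer MLP); all names and structural choices here are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch: a two-layer bottleneck normally stores two matrices,
# W1 (d x h) and W2 (h x d). If the down-projection reuses W^T of the
# up-projection (sharing its column and row spaces), one matrix serves both
# layers, halving the parameter count. Pure-Python helpers keep it
# self-contained; a real implementation would use a tensor library.

def matmul(A, B):
    """Plain-Python matrix multiply for small demo matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def relu(M):
    return [[max(0.0, x) for x in row] for row in M]

class SharedBottleneck:
    """MLP block whose second projection reuses the transpose of the first."""
    def __init__(self, W):            # W: d x h -- the only stored weights
        self.W = W

    def forward(self, X):             # X: n x d
        H = relu(matmul(X, self.W))          # project into the column space of W
        return matmul(H, transpose(self.W))  # project back via the row space

    def n_params(self):
        return len(self.W) * len(self.W[0])
```

A conventional bottleneck with independent W1 and W2 would store 2·d·h weights; the shared variant stores d·h, i.e. half of them, which matches the spirit of the title.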

Original language: English
Title of host publication: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Publisher: IEEE Computer Society
Pages: 3669-3678
Number of pages: 10
ISBN (Electronic): 9798350365474
DOIs
Publication status: Published - 2024
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 - Seattle, United States
Duration: 16 Jun 2024 - 22 Jun 2024

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024
Country/Territory: United States
City: Seattle
Period: 16/06/24 - 22/06/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Keywords

  • CNN
  • Efficient
  • Parameter
  • Transformer
  • ViT
  • Vision Transformer
  • computer vision
  • deep learning
