Abstract
In e-commerce, users often seek complementary products that appear together in the same scene, but existing visual search systems face scalability challenges. Traditional approaches rely on class-specific object detectors and supervised metric-learning models that require re-training and re-annotation for each new product category. Consequently, this reliance on manually labeled datasets limits adaptability in dynamic marketplace environments. We propose an end-to-end pipeline that addresses these constraints through open-vocabulary object detection combined with a novel freeze-weighted reciprocal rank fusion (FWRRF) retrieval strategy. Our system employs YOLO-World for prompt-driven product detection, enabling zero-shot recognition of unseen categories without additional training or annotations. The retrieval module extracts dual-view embeddings from both detected object crops and full scene images, then combines them using FWRRF, a lightweight fusion method that preserves high-precision object-centric results while incorporating contextual information through weighted reciprocal ranking. Evaluation on a curated living-room dataset demonstrates strong performance: 0.851 mAP@0.5 for detection and Recall@3 = 0.580 for retrieval, surpassing single-view baselines and conventional fusion methods. This framework enables scalable, prompt-driven product discovery with continuous category expansion capabilities, eliminating costly retraining cycles in dynamic e-commerce environments.
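The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the FWRRF formula, so everything below is an assumption made for illustration: the standard weighted reciprocal rank fusion score `w / (k + rank)`, the interpretation of "freeze" as pinning the top `freeze_n` object-centric results in place, the weights, the constant `k = 60`, and the hypothetical item identifiers. It should not be read as the authors' implementation.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Weighted reciprocal rank fusion: score(d) = sum_i w_i / (k + rank_i(d))."""
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, item in enumerate(ranking, start=1):
            scores[item] += weight / (k + rank)
    # Sort by descending fused score; ties are broken by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


def freeze_weighted_rrf(object_ranking, scene_ranking, freeze_n=1,
                        w_object=0.7, w_scene=0.3, k=60):
    """Assumed reading of FWRRF: keep the top `freeze_n` object-crop results
    fixed, then fill the remaining positions from the weighted RRF ordering."""
    frozen = object_ranking[:freeze_n]
    fused = weighted_rrf([object_ranking, scene_ranking], [w_object, w_scene], k)
    return frozen + [item for item in fused if item not in frozen]


# Hypothetical catalogue ids returned by the two embedding views.
object_view = ["sofa_12", "sofa_07", "chair_03", "lamp_21"]
scene_view = ["sofa_07", "lamp_21", "sofa_12", "rug_05"]
result = freeze_weighted_rrf(object_view, scene_view, freeze_n=1)
```

In this sketch the object-crop ranking carries the larger weight, so the scene view mainly reorders and extends the tail of the list while the frozen head stays untouched.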
| Original language | English |
|---|---|
| Pages (from-to) | 975-980 |
| Number of pages | 6 |
| Journal | International Conference on Computer Science and Engineering, UBMK |
| Issue number | 2025 |
| Publication status | Published - 2025 |
| Event | 10th International Conference on Computer Science and Engineering, UBMK 2025 - Istanbul, Turkey. Duration: 17 Sept 2025 → 21 Sept 2025 |
Bibliographical note
Publisher Copyright: © 2025 IEEE.
Keywords
- e-commerce
- open-vocabulary object detection
- rrf
- shop the look
- visual search