Detecting Questions in Online Communities: A Machine Learning Approach

Dilnaz Omarova, Fares A. Dael, Ibraheem Shayea, Gulnara Abitova, Eldos Sailaukhanov

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The proliferation of online forums and communities has greatly facilitated knowledge sharing and user support but has also introduced the significant challenge of managing redundant and semantically similar questions. Traditional keyword-based methods have proven inadequate in addressing this issue due to the inherent complexities of natural language, where the same idea can be expressed in numerous ways. This study investigates the use of advanced machine learning algorithms - Logistic Regression, Random Forest, and Gradient Boosting (XGBoost) - to detect semantically similar questions. By employing the Quora Question Pairs dataset, the performance of these models is evaluated using metrics such as accuracy, precision, recall, and F1-score. This research not only provides a comparative analysis of these machine learning models but also suggests a framework for improving information retrieval and user experience in online forums. The study highlights the potential for future integration of deep learning models and advanced semantic understanding techniques to further enhance the detection of semantically similar questions.
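As a rough illustration of the two ideas in the abstract (why keyword overlap alone falls short, and how accuracy, precision, recall, and F1-score are computed for duplicate detection), the following sketch scores question pairs with a Jaccard word-overlap baseline. It is not the authors' pipeline: the helper names, the 0.5 threshold, and the four toy question pairs are all assumptions made up for this example, and none of the data comes from the Quora Question Pairs dataset.

```python
import re


def tokens(text):
    """Lowercase a question and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(q1, q2):
    """Word-overlap similarity: |A & B| / |A | B| (0.0 if both are empty)."""
    a, b = tokens(q1), tokens(q2)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels where 1 = duplicate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy pairs (question 1, question 2, is_duplicate); illustrative only.
TOY_PAIRS = [
    ("How do I learn Python quickly?",
     "What is the fastest way to pick up Python?", 1),   # paraphrase, little overlap
    ("How do I reset my password?",
     "How do I reset my password?", 1),                  # exact duplicate
    ("How do I cook rice?",
     "How do I cook pasta?", 0),                         # high overlap, different intent
    ("What is the capital of France?",
     "Why is the sky blue?", 0),                         # unrelated
]

THRESHOLD = 0.5  # assumed cut-off for calling a pair "duplicate"

y_true = [label for _, _, label in TOY_PAIRS]
y_pred = [1 if jaccard(q1, q2) >= THRESHOLD else 0 for q1, q2, _ in TOY_PAIRS]

print(binary_metrics(y_true, y_pred))
# → {'accuracy': 0.5, 'precision': 0.5, 'recall': 0.5, 'f1': 0.5}
```

On these toy pairs the overlap baseline misses the paraphrase and wrongly flags the rice/pasta pair, which is the kind of failure that motivates learned classifiers. In a comparison like the one the abstract describes, y_pred would instead come from a trained model, and the same four metrics would be reported for each model.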

Original language: English
Title of host publication: Proceedings - 2024 IEEE 16th International Conference on Communication Systems and Network Technologies, CICN 2024
Editors: Geetam Singh Tomar
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 290-297
Number of pages: 8
ISBN (Electronic): 9798331505264
Publication status: Published - 2024
Event: 16th IEEE International Conference on Computational Intelligence and Communication Networks, CICN 2024 - Indore, India
Duration: 22 Dec 2024 to 23 Dec 2024

Publication series

Name: Proceedings - 2024 IEEE 16th International Conference on Communication Systems and Network Technologies, CICN 2024

Conference

Conference: 16th IEEE International Conference on Computational Intelligence and Communication Networks, CICN 2024
Country/Territory: India
City: Indore
Period: 22/12/24 to 23/12/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Keywords

  • Machine Learning
  • Natural Language Processing
  • Sentiment Analysis
  • Word Embeddings
