A Binary Classification Model for PM 10 Levels

Kiymet Kaya, Sule Gunduz Oguducu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Citations (Scopus)

Abstract

Particulate matter in the air negatively affects human health. Yet no regression model has estimated the density of PM 10 in Istanbul using datasets with an imbalanced class distribution. To fill this gap, we designed a new regression model that, at its initial stage, transforms the regression problem into an imbalanced binary classification problem. In this paper, the PM 10 classification problem is treated as an imbalanced binary classification problem whose classes are coded as harmless (1) and dangerous (0). On the sampling side of the solution, balancing the data with under-sampling methods yielded unsatisfactory results. On the algorithmic side, the performances of the RFC (Random Forest Classifier), ETC (Extra Trees Classifier) and GBC (Gradient Boosting Classifier) models, which stand out for their positive effects on imbalanced learning problems, are compared in terms of AUROC. The proposed model uses all training set samples and predicts with RFC. The experimental results on a real-world dataset seem quite promising for our further research.
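The three building blocks named in the abstract (binarizing the PM 10 target, under-sampling the majority class, and scoring with AUROC) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation. The 50 µg/m³ cutoff, the function names and the toy readings are assumptions introduced here; the abstract does not state which threshold or sampling variant was used.

```python
import random

# Assumed cutoff in µg/m³; the abstract does not state the threshold used.
PM10_THRESHOLD = 50.0


def to_binary_labels(pm10_values, threshold=PM10_THRESHOLD):
    """Code each reading as harmless (1) or dangerous (0), as in the abstract."""
    return [1 if v <= threshold else 0 for v in pm10_values]


def random_undersample(rows, labels, seed=0):
    """Randomly drop majority-class samples until both classes are equal in size."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for i, y in enumerate(labels):
        by_class[y].append(i)
    n_min = min(len(by_class[0]), len(by_class[1]))
    kept = sorted(rng.sample(by_class[0], n_min) + rng.sample(by_class[1], n_min))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def auroc(labels, scores):
    """AUROC as the chance that a random class-1 sample outscores a random
    class-0 sample, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy readings (made up for illustration, not data from the paper).
pm10 = [12.0, 30.5, 48.0, 75.2, 20.1, 110.0, 35.0, 41.3]
labels = to_binary_labels(pm10)
balanced_rows, balanced_labels = random_undersample([[v] for v in pm10], labels)
toy_scores = [-v for v in pm10]  # stand-in for a classifier's class-1 score
toy_auc = auroc(labels, toy_scores)
```

In a real comparison, the scores would come from each fitted classifier's predicted probability for class 1, and the same AUROC function would be applied to each model's scores on held-out data.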

Original language: English
Title of host publication: UBMK 2018 - 3rd International Conference on Computer Science and Engineering
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 361-366
Number of pages: 6
ISBN (Electronic): 9781538678930
Publication status: Published - 6 Dec 2018
Event: 3rd International Conference on Computer Science and Engineering, UBMK 2018 - Sarajevo, Bosnia and Herzegovina
Duration: 20 Sept 2018 to 23 Sept 2018

Publication series

Name: UBMK 2018 - 3rd International Conference on Computer Science and Engineering

Conference

Conference: 3rd International Conference on Computer Science and Engineering, UBMK 2018
Country/Territory: Bosnia and Herzegovina
City: Sarajevo
Period: 20/09/18 to 23/09/18

Bibliographical note

Publisher Copyright:
© 2018 IEEE.

Funding

We are thankful to the Turkish State Meteorological Service for providing the meteorological data used in this study. The first author was partially supported by the ITU project of standardization, integration and modernization of measuring systems.

Funder: International Technological University

    Keywords

    • air pollution
    • binary classification
    • Data mining
    • ensemble methods
    • imbalanced learning
