Learning with Type-2 Fuzzy activation functions to improve the performance of Deep Neural Networks

A. Beke, T. Kumbasar*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

40 Citations (Scopus)

Abstract

In this study, we propose a novel Interval Type-2 (IT2) Fuzzy activation layer composed of Single input IT2 (SIT2) Fuzzy Rectifying Units (FRUs) to improve the learning performance of Deep Neural Networks (DNNs). The novel SIT2-FRU has tunable parameters that define not only the slopes of the positive and negative quadrants but also the characteristics of the input–output mapping of the activation function. The SIT2-FRU also alleviates the vanishing gradient problem and has a fast convergence rate, since it can push the mean activation toward zero by processing inputs in the negative quadrant. Thus, the SIT2-FRU allows the DNN to learn better, as it can express a linear or a sophisticated input–output mapping simply by tuning the footprint of uncertainty of its IT2 fuzzy sets. To examine the performance of the SIT2-FRU, comparative experimental studies are performed on the MNIST, Quickdraw Pictionary and CIFAR-10 benchmark datasets. The proposed SIT2-FRU is compared with the state-of-the-art activation functions, namely the Rectified Linear Unit (ReLU), Parametric ReLU (PReLU) and Exponential Linear Unit (ELU). The comparative experimental results and analyses clearly show the improvement in the learning performance of DNNs that include activation layer(s) composed of SIT2-FRUs. The learning performance of the SIT2-FRU is shown to be robust to different settings of the learning rate and mini-batch size. Furthermore, the experimental results show that the SIT2-FRU achieves high performance with or without batch normalization layers, unlike the other employed activation units. It is concluded that, compared to the ReLU, PReLU and ELU activation functions, DNNs with SIT2-FRUs have a satisfactory generalization capability and a robust, high learning performance.
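The abstract only states the qualitative properties of the SIT2-FRU (tunable slopes in the positive and negative quadrants, and a footprint-of-uncertainty parameter that morphs the input–output mapping); the exact formulation is given in the paper itself. The snippet below is therefore only an illustrative sketch of a parametric rectifying activation with those properties, not the authors' SIT2-FRU formula; the function name sit2_fru_like and the parameters alpha_pos, alpha_neg and fou are hypothetical.

import numpy as np

def sit2_fru_like(x, alpha_pos=1.0, alpha_neg=0.25, fou=0.5):
    """Illustrative parametric rectifying unit (NOT the paper's exact SIT2-FRU).

    alpha_pos / alpha_neg : slopes of the positive / negative quadrants
    fou : blend in [0, 1] between a purely linear mapping and a saturating
          (tanh-shaped) mapping, loosely mimicking the role the abstract
          attributes to the footprint of uncertainty.
    """
    linear = np.where(x >= 0, alpha_pos * x, alpha_neg * x)              # piecewise-linear part
    saturating = np.where(x >= 0, alpha_pos * np.tanh(x), alpha_neg * np.tanh(x))
    return (1.0 - fou) * linear + fou * saturating                       # FOU-like interpolation

# Quick check: negative inputs yield non-zero (negative) outputs, which is what
# lets such units push the mean activation toward zero, unlike ReLU.
x = np.linspace(-3, 3, 7)
print(sit2_fru_like(x))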

Original language: English
Pages (from-to): 372-384
Number of pages: 13
Journal: Engineering Applications of Artificial Intelligence
Volume: 85
DOIs
Publication status: Published - Oct 2019

Bibliographical note

Publisher Copyright:
© 2019 Elsevier Ltd

Funding

This research is supported by project 118E807 of the Scientific and Technological Research Council of Turkey (TUBITAK). This support is appreciated.

Funders and funder numbers:
• TUBITAK
• Türkiye Bilimsel ve Teknolojik Araştırma Kurumu: 118E807

Keywords

• Activation units
• Deep learning
• Deep Neural Networks
• Footprint of uncertainty
• Interval Type-2 Fuzzy systems
