Machine learning tree trimming for faster Markov reward game solutions

  • Burhaneddin İzgi*
  • Murat Özkaya
  • Nazım Kemal Üre
  • Matjaž Perc
  • *Corresponding author for this work
  • Canakkale Onsekiz Mart University
  • University of Maribor
  • Community Healthcare Center Dr. Adolf Drolc Maribor
  • Kyung Hee University
  • Complexity Science Hub Vienna
  • Korea University

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Existing methods for solving Markov reward games mostly rely on state–action frameworks and iterative algorithms. These approaches often impose significant computational burdens, particularly on large-scale games, because of their inherent complexity and the extensive iterative calculations they require. In this paper, we propose a new neural network architecture that solves Markov reward games in the form of a decision tree with relatively large state and action sets, such as 2-actions-3-stages, 3-actions-3-stages, and 4-actions-3-stages, by trimming the decision tree. We generate datasets of Markov reward games with sizes ranging from (Formula presented) to (Formula presented) using the holistic matrix norm-based solution method, which yields the components needed for training the neural network, such as the payoff matrices and the corresponding solutions of the games. We then propose a vectorization process that adapts the outcomes of the matrix norm-based solution method for training the proposed neural network. The neural network takes the vectorized payoff and transition matrices as input, and the prediction system generates the optimal strategy set as output. We treat the problem as a classification task, labeling the optimal and non-optimal branches of the decision tree with ones and zeros, respectively, to identify the most rewarding paths of each game. The resulting architecture solves Markov reward games in real time, which enhances its practicality for real-world applications. The results reveal that the system efficiently predicts the optimal paths for each decision tree, with F1-scores slightly greater than 0.99, 0.99, and 0.97 for Markov reward games with 2-actions-3-stages, 3-actions-3-stages, and 4-actions-3-stages, respectively.
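
The data preparation described in the abstract (vectorized matrices as input, 0/1 branch labels as target, F1-score as metric) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `vectorize_game`, `label_branches`, and `f1_score` are hypothetical, the row-major "payoffs first, then transitions" layout is a guess, and each root-to-leaf action sequence is treated as one branch. The matrix norm-based solver is not reproduced; the optimal paths are taken as given.

```python
from itertools import product


def flatten(matrix):
    """Flatten a 2-D list row by row into a 1-D list."""
    return [value for row in matrix for value in row]


def vectorize_game(payoff_matrices, transition_matrices):
    """Concatenate flattened payoff and transition matrices into one input vector.

    Assumed layout (not taken from the paper): payoffs first, then transitions.
    """
    vector = []
    for matrix in payoff_matrices:
        vector.extend(flatten(matrix))
    for matrix in transition_matrices:
        vector.extend(flatten(matrix))
    return vector


def label_branches(n_actions, n_stages, optimal_paths):
    """Return one 0/1 label per root-to-leaf branch, in lexicographic order.

    optimal_paths is assumed to come from an external solver.
    """
    optimal = {tuple(path) for path in optimal_paths}
    return [1 if path in optimal else 0
            for path in product(range(n_actions), repeat=n_stages)]


def f1_score(y_true, y_pred):
    """Binary F1-score for the positive (optimal-branch) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# A 2-actions-3-stages tree has 2**3 = 8 branches; mark path (0, 1, 1) as optimal.
labels = label_branches(2, 3, [(0, 1, 1)])
```

Under these assumptions, a 4-actions-3-stages game yields 64 labels per sample, so the classifier's output width grows as the number of actions raised to the number of stages.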

Original language: English
Article number: 102726
Journal: Journal of Computational Science
Volume: 92
Publication status: Published - Dec 2025

Bibliographical note

Publisher Copyright:
© 2025 Elsevier B.V. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
