IJPAM: Volume 51, No. 2 (2009)

ANALYSIS OF MARKOV CHAIN WITH REWARDS BY
Z-TRANSFORM AND THE DECISION OF
ITS OPTIMAL POLICY

Yoshinori Uchimura$^1$, Yuko Hara-Mimachi$^2$
$^1$Graduate School of Science and Technology
Meijo University
1-501, Shiogamaguchi, Tempaku-ku, Nagoya, 468-5802, JAPAN
e-mail: m0732008@ccmailg.meijo-u.ac.jp
$^2$Department of Information Sciences
Meijo University
1-501, Shiogamaguchi, Tempaku-ku, Nagoya, 468-5802, JAPAN
e-mail: yuko@ccmfs.meijo-u.ac.jp


Abstract. This paper describes the analysis of Markov chains with rewards by the $z$-transform. The method for determining the optimal policy of such a chain was proposed by Ronald A. Howard. We have implemented Howard's policy-improvement algorithm in Scilab, a high-level programming language, and added an iteration counter to it. The optimal policy is usually found in a small number of iterations. With this program, however, we obtain an example that requires 6 iterations.
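
The policy-improvement procedure with an iteration counter mentioned in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' Scilab program. It assumes an ergodic chain and the average-reward (gain) criterion, and fixes the relative value of the last state at zero. The function names (`solve_linear`, `evaluate`, `policy_iteration`) and the two-state data `P` and `q` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def evaluate(policy, P, q):
    """Value determination: solve g + v_i = q_i + sum_j p_ij v_j, last v fixed at 0."""
    n = len(policy)
    a, b = [], []
    for i, k in enumerate(policy):
        # Unknowns are v_0 .. v_{n-2}, followed by the gain g in the last column.
        row = [(1.0 if i == j else 0.0) - P[i][k][j] for j in range(n - 1)]
        row.append(1.0)
        a.append(row)
        b.append(q[i][k])
    x = solve_linear(a, b)
    return x[-1], x[:-1] + [0.0]


def policy_iteration(P, q, policy=None, tol=1e-9):
    """Alternate value determination and policy improvement, counting the rounds."""
    n = len(P)
    policy = list(policy) if policy is not None else [0] * n
    iterations = 0
    while True:
        iterations += 1
        g, v = evaluate(policy, P, q)
        new_policy = []
        for i in range(n):
            def test_quantity(k):
                return q[i][k] + sum(p * vj for p, vj in zip(P[i][k], v))
            # Keep the current alternative unless another is strictly better.
            best = policy[i]
            for k in range(len(P[i])):
                if test_quantity(k) > test_quantity(best) + tol:
                    best = k
            new_policy.append(best)
        if new_policy == policy:
            return policy, g, v, iterations
        policy = new_policy


# Hypothetical data: P[i][k] is the transition row and q[i][k] the expected
# immediate reward when alternative k is used in state i.
P = [[[0.5, 0.5], [0.8, 0.2]],
     [[0.4, 0.6], [0.7, 0.3]]]
q = [[6.0, 4.0],
     [-3.0, -5.0]]

policy, gain, values, iterations = policy_iteration(P, q)
print(policy, gain, values, iterations)
```

In this sketch the counter is incremented once per value-determination step, so the final round that merely confirms the policy is also counted. For the illustrative data above, starting from the first alternative in each state, the procedure should settle on the second alternative in both states with a gain of about 2 after two rounds. How the counter in the authors' program is defined is not stated in the abstract.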

Received: February 5, 2009

AMS Subject Classification: 60J22, 65C40, 65K05

Key Words and Phrases: Markov chain, ergodic Markov chain, $z$-transform, dynamic programming, optimal policy

Source: International Journal of Pure and Applied Mathematics
ISSN: 1311-8080
Year: 2009
Volume: 51
Issue: 2