COMPARING DIFFERENT METHODS FOR ESTIMATING PARAMETERS OF MIXTURE OF TWO CHI-SQUARE DISTRIBUTIONS

Mixture models in general, and parametric mixture models in particular, are very useful methods for modelling a population and have many applications in medicine, industry and economics. Redner and Walker (1984)1 introduced the use of the EM algorithm to find maximum likelihood estimates for mixture densities. Mixture models were treated comprehensively by McLachlan et al. (2000)2. Wang, Tan and Louis (2014)3 used mixture models for modelling time-to-event data to evaluate treatment effects in randomised clinical trials. Teel, Park and Sampson (2015)4 considered the use of an EM algorithm for fitting finite mixture models when the mixture component size is known.


Introduction
Mixture models in general, and parametric mixture models in particular, are very useful methods for modelling a population and have many applications in medicine, industry and economics. Redner and Walker (1984)1 introduced the use of the EM algorithm to find maximum likelihood estimates for mixture densities. Mixture models were treated comprehensively by McLachlan et al. (2000)2. Wang, Tan and Louis (2014)3 used mixture models for modelling time-to-event data to evaluate treatment effects in randomised clinical trials. Teel, Park and Sampson (2015)4 considered the use of an EM algorithm for fitting finite mixture models when the mixture component size is known.
Baudry and Celeux (2015)5 showed, however, that maximum likelihood through the EM algorithm is widely used; [6] introduced additive and multiplicative mixed normal distributions. Zaman et al. (2006)7 introduced a mixture of chi-square distributions using Poisson elements. Chen, Ponomareva and Tamer (2014)8 studied likelihood inference in some finite mixture models and discussed different situations for mixture models. The chi-square distribution is one of the most widely applied statistical distributions, used in different branches of industry and technology; for example, it can be fitted to the amount of petroleum products extracted from crude oil in refineries. The density function of this distribution is

f(x; θ) = x^(θ/2 − 1) e^(−x/2) / (2^(θ/2) Γ(θ/2)),  x > 0.  (1)
Here Γ(.) is the gamma function and θ is the distribution parameter; we write X ∼ χ2(θ). The non-central moments of this distribution are

E(X^r) = 2^r Γ(θ/2 + r) / Γ(θ/2) = θ(θ + 2)⋯(θ + 2(r − 1)),  r = 1, 2, …

There are two common methods for estimating parameters: the method of moments and maximum likelihood; we discuss both for the chi-square distribution. In the method of moments, parameters are estimated by setting sample moments equal to distribution moments. Because the chi-square distribution has only one parameter, we need only the first-order moment. Since E(X) = θ is the first-order moment of the distribution, setting it equal to the first-order sample moment x̄ gives the estimator θ̂ = x̄.
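As a concrete check of the density and the moment estimator above, the following Python sketch (ours, not from the paper; all function names are hypothetical) uses the fact that χ2(θ) is a Gamma distribution with shape θ/2 and scale 2 to simulate data and recover θ as the sample mean:

```python
# Hypothetical illustration: chi-square density via math.lgamma, and the
# method-of-moments estimate theta_hat = sample mean (since E[X] = theta).
import math
import random

def chi2_pdf(x, theta):
    """Density of the chi-square distribution with theta degrees of freedom."""
    if x <= 0:
        return 0.0
    log_f = ((theta / 2 - 1) * math.log(x) - x / 2
             - (theta / 2) * math.log(2) - math.lgamma(theta / 2))
    return math.exp(log_f)

def sample_chi2(theta, n, rng):
    # chi-square(theta) is Gamma(shape = theta/2, scale = 2)
    return [rng.gammavariate(theta / 2, 2.0) for _ in range(n)]

rng = random.Random(0)
data = sample_chi2(5.0, 20000, rng)
theta_mom = sum(data) / len(data)  # method of moments: theta_hat = x-bar
```

With 20,000 draws from χ2(5), theta_mom lands close to the true value 5; the sampling standard deviation of the mean is about sqrt(2·5/20000) ≈ 0.02.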
To estimate θ by maximum likelihood, we need the likelihood function. For the chi-square distribution it is

L(x; θ) = ∏_(i=1)^n f(x_i; θ) = (2^(θ/2) Γ(θ/2))^(−n) (∏_i x_i)^(θ/2 − 1) e^(−Σ_i x_i / 2),

with logarithm

l(x; θ) = −(nθ/2) log 2 − n log Γ(θ/2) + (θ/2 − 1) Σ_i log x_i − Σ_i x_i / 2.

The maximum likelihood estimator of θ is the value which maximizes l(x; θ). It is found by differentiating l(x; θ) with respect to θ and setting the result equal to zero:

∂l/∂θ = −(n/2) log 2 − (n/2) ψ(θ/2) + (1/2) Σ_i log x_i = 0,

where ψ(.) is the digamma function. So θ̂ solves ψ(θ/2) = (1/n) Σ_i log x_i − log 2, which has no closed-form solution and must be solved numerically.
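The score equation ψ(θ/2) = (1/n) Σ log x_i − log 2, with ψ the digamma function, is easy to solve by bisection because ψ is increasing. A minimal Python sketch (ours, not from the paper; digamma is approximated by a central difference of math.lgamma, and the function names are our own):

```python
# Our sketch of the numerical MLE for the chi-square parameter theta.
import math
import random

def digamma(x, h=1e-5):
    # numerical digamma via central difference of log-gamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def chi2_mle(data, lo=1e-3, hi=200.0):
    """Solve digamma(theta/2) = mean(log x) - log 2 by bisection."""
    target = sum(math.log(x) for x in data) / len(data) - math.log(2)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # digamma(theta/2) is increasing in theta, so plain bisection works
        if digamma(mid / 2) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(1)
data = [rng.gammavariate(2.5, 2.0) for _ in range(20000)]  # chi-square(5)
theta_mle = chi2_mle(data)
```

For a large sample from χ2(5) the solver recovers a value close to 5, matching the numerical-solution remark above.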
2. Mixture Distributions, see [9]

Mixture models are developed for studying populations which are made up of several sub-populations. In general terms, if a population is made up of k sub-populations and the jth sub-population has density f_j(x; θ_j), j = 1, 2, …, k, then the population has the mixture distribution

f(x; Θ) = Σ_(j=1)^k τ_j f_j(x; θ_j),  (5)

where the mixing parameters τ_j, j = 1, 2, …, k satisfy Σ_(j=1)^k τ_j = 1. If the sub-populations are independent of one another, the non-central moments of the mixture are

E(X^r) = Σ_(j=1)^k τ_j E_j(X^r),

and the likelihood function is

L(x; Θ) = ∏_(i=1)^n Σ_(j=1)^k τ_j f_j(x_i; θ_j).

Now suppose the random variable X is a mixture of two random variables X1 and X2, with mixing parameter τ for the first sub-population, and X_i ∼ χ2(θ_i), i = 1, 2. Using equations (1) and (5), the density of the mixture of two chi-square distributions is

f(x; Θ) = τ f(x; θ1) + (1 − τ) f(x; θ2),  (8)

where Θ = (τ, θ1, θ2) and f(x; θ_i) is the chi-square density of the ith sub-population. In other words, X has a Mixture of Two Chi-square Distributions (MTChD). The central problem is estimating the parameters (τ, θ1, θ2) of the MTChD. We proceed with the two common methods: the method of moments and maximum likelihood.
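The two-component chi-square mixture density can be evaluated and sampled directly. A small self-contained Python sketch (ours, not from the paper; function names are hypothetical):

```python
# Our illustration of the MTChD density tau*f(x;theta1) + (1-tau)*f(x;theta2)
# and of sampling from it by first drawing the component label.
import math
import random

def chi2_logpdf(x, theta):
    return ((theta / 2 - 1) * math.log(x) - x / 2
            - (theta / 2) * math.log(2) - math.lgamma(theta / 2))

def mtchd_pdf(x, tau, theta1, theta2):
    """Density of a mixture of two chi-square distributions (MTChD)."""
    return (tau * math.exp(chi2_logpdf(x, theta1))
            + (1 - tau) * math.exp(chi2_logpdf(x, theta2)))

def sample_mtchd(n, tau, theta1, theta2, rng):
    out = []
    for _ in range(n):
        # pick the sub-population with probability tau, then draw from it
        theta = theta1 if rng.random() < tau else theta2
        out.append(rng.gammavariate(theta / 2, 2.0))
    return out
```

Setting τ = 1 recovers the single chi-square density, and a Riemann sum over a fine grid confirms the mixture integrates to 1.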

Method of Moments for Estimating Parameters of MTChD
For this method we need moments up to the order of the number of unknown parameters. Because the MTChD has 3 unknown parameters, we use the first three non-central moments. Setting each sample moment m_r = (1/n) Σ_(i=1)^n x_i^r equal to the corresponding mixture moment gives

m1 = τ θ1 + (1 − τ) θ2,
m2 = τ θ1(θ1 + 2) + (1 − τ) θ2(θ2 + 2),
m3 = τ θ1(θ1 + 2)(θ1 + 4) + (1 − τ) θ2(θ2 + 2)(θ2 + 4).

The parameters are estimated by solving these three equations for (τ, θ1, θ2), which in general must be done numerically.
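The three moment equations follow mechanically from the chi-square moments θ, θ(θ+2), θ(θ+2)(θ+4). The following Python sketch (ours, not from the paper) builds the residual system a numerical root-finder would drive to zero:

```python
# Our sketch of the method-of-moments system for (tau, theta1, theta2).
def chi2_moment(theta, r):
    """Non-central moment E[X^r] of chi-square(theta):
    theta, theta(theta+2), theta(theta+2)(theta+4), ..."""
    m = 1.0
    for k in range(r):
        m *= theta + 2 * k
    return m

def moment_equations(params, m1, m2, m3):
    """Residuals of the three moment equations; all zero at a solution."""
    tau, t1, t2 = params
    return (tau * chi2_moment(t1, 1) + (1 - tau) * chi2_moment(t2, 1) - m1,
            tau * chi2_moment(t1, 2) + (1 - tau) * chi2_moment(t2, 2) - m2,
            tau * chi2_moment(t1, 3) + (1 - tau) * chi2_moment(t2, 3) - m3)
```

As a sanity check, plugging the population moments of a known mixture back in makes all three residuals vanish at the true parameters.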

Maximum Likelihood Estimation of Parameters of MTChD
To use the maximum likelihood method, we need the likelihood function. We define it using the indicator variable I(z_i = j), where z_i = j if observation x_i comes from the jth sub-population, so that I(z_i = j) = 1 when x_i comes from sub-population j and 0 otherwise. The likelihood function is

L(Θ; x, z) = ∏_(i=1)^n ∏_(j=1)^2 [τ_j f(x_i; θ_j)]^(I(z_i = j)),

where τ2 = 1 − τ1. The logarithm of this function is

l(Θ; x, z) = Σ_(i=1)^n Σ_(j=1)^2 I(z_i = j)[log τ_j + log f(x_i; θ_j)].  (22)

As before, estimators can be found by differentiating l(Θ; x, z) with respect to each unknown parameter, setting the results equal to zero and solving the equations. But this is very difficult for the MTChD, so we use an alternative, the EM algorithm, an iterative method for finding the maximum likelihood estimates.
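Given the latent labels z, the complete-data log-likelihood is a straightforward double sum. A Python sketch (ours, not from the paper; labels are 0-indexed, so component 1 is index 0):

```python
# Our sketch of the complete-data log-likelihood
# l(Theta; x, z) = sum_i sum_j I(z_i = j)[log tau_j + log f(x_i; theta_j)].
import math

def chi2_logpdf(x, theta):
    return ((theta / 2 - 1) * math.log(x) - x / 2
            - (theta / 2) * math.log(2) - math.lgamma(theta / 2))

def complete_loglik(x, z, tau1, theta1, theta2):
    """x: observations; z: component labels (0 or 1); tau2 = 1 - tau1."""
    tau = (tau1, 1.0 - tau1)
    theta = (theta1, theta2)
    # the indicator collapses the inner sum to the term for z_i
    return sum(math.log(tau[zi]) + chi2_logpdf(xi, theta[zi])
               for xi, zi in zip(x, z))
```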

Estimating Parameters of MTChD Using EM-Algorithm
Suppose Y is a p-dimensional random vector with probability density function g(y; Θ), where Θ = (θ1, θ2, …, θd)^T is the vector of unknown parameters. The likelihood function of Θ, calculated on the observed values y, is L(Θ) = g(y; Θ). In some cases the vector Y may consist of incomplete data arising from missing or censored observations; in other situations it may be complete data from a mixture of two or more distributions in which the proportions allocated to the different distributions are unknown and must be estimated. In these cases direct use of maximum likelihood is very difficult, but the EM algorithm is a useful method. The EM algorithm is an iterative method for maximum likelihood estimation with incomplete data. Since log L(Θ) involves incomplete data, let log L_c(Θ) denote the logarithm of the likelihood function of the complete data. We want to maximise the expectation of log L_c(Θ) conditional on the observed data. Suppose Θ(k) is the value of Θ after the kth iteration. At step k + 1, the E and M steps are as below.

Step E (Expectation): compute Q(Θ; Θ(k)) = E_Θ(k)[log L_c(Θ) | y].

Step M (Maximization): choose Θ(k+1) ∈ Ω to maximize Q(Θ; Θ(k)) with respect to Θ.

The two steps are repeated until the sequence L(Θ(k)) converges.
Dempster, Laird and Rubin (1977) showed that the likelihood function of the incomplete data, L(Θ), does not decrease after an iteration. So the sequence L(Θ(k)) converges to some L*, where L* is a stationary value attained at Θ* satisfying ∂L(Θ)/∂Θ = 0, or equivalently ∂log L(Θ)/∂Θ = 0. In some cases Θ* may be only a local maximum, and in rare cases a saddle point rather than a local maximum or minimum; the value of Θ* depends on the initial value Θ(0). Each iteration of the EM algorithm does not decrease the likelihood, and under fairly general conditions the algorithm converges. Now suppose we have a population formed as a mixture of two independent chi-square distributions: one part of the population has a chi-square distribution with parameter θ1, the other part has an independent chi-square distribution with parameter θ2, τ1 is the proportion of the population that comes from the first part, and τ2 = 1 − τ1 is the proportion that comes from the second part; the proportions are unknown. In this situation we have a mixture of two independent chi-square distributions, and since τ1 is unknown the EM algorithm can be used to find the parameters. To formulate this, we have

L(Θ; x, z) = ∏_(i=1)^n ∏_(j=1)^2 [τ_j f(x_i; θ_j)]^(I(z_i = j)),

where z_i is an indicator variable identifying the distribution of x_i and n is the number of observations; L is the likelihood function for a mixture distribution with mixing proportion τ1. Writing the logarithm of the likelihood function as l(Θ; x, z),

l(Θ; x, z) = Σ_(i=1)^n Σ_(j=1)^2 I(z_i = j)[log τ_j + log f(x_i; θ_j)].  (24)

In this case we have a mixture of two independent chi-square distributions (MTChD) with parameters (τ, θ1, θ2). The E and M steps of the EM algorithm are as follows.

E step: Suppose θ(t) is the current (known) parameter value. The distribution of z_i is obtained from Bayes' rule and is proportional to the chi-square densities and mixing proportions in θ(t). In other words,

γ_i^(t) = P(z_i = 1 | x_i; θ(t)) = τ1^(t) f(x_i; θ1^(t)) / [τ1^(t) f(x_i; θ1^(t)) + τ2^(t) f(x_i; θ2^(t))].

M step: We maximize the Q(θ | θ(t)) obtained in the E step with respect to τ1, θ1 and θ2; the parameters can be maximized independently. Since τ1 + τ2 = 1, we have

τ1^(t+1) = (1/n) Σ_(i=1)^n γ_i^(t),

and θ1^(t+1), θ2^(t+1) solve the weighted digamma equations

ψ(θ_j/2) = Σ_i γ_ij^(t) log x_i / Σ_i γ_ij^(t) − log 2,  j = 1, 2,

where γ_i1^(t) = γ_i^(t) and γ_i2^(t) = 1 − γ_i^(t). For values of τ greater than 0.5, the method of moments gives reasonable estimates. We use numerical methods to estimate the parameters in the maximum likelihood method.
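The E and M steps above can be sketched end to end in Python. This is our illustration, not the paper's code (all names hypothetical): responsibilities in the E step, the closed-form τ update, and a weighted digamma equation for each θ_j solved by bisection, with digamma approximated from math.lgamma:

```python
# Our sketch of EM for a mixture of two chi-square distributions (MTChD).
import math
import random

def chi2_logpdf(x, theta):
    return ((theta / 2 - 1) * math.log(x) - x / 2
            - (theta / 2) * math.log(2) - math.lgamma(theta / 2))

def digamma(x, h=1e-6):
    # numerical digamma via central difference of log-gamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def solve_theta(target, lo=1e-3, hi=200.0):
    # invert digamma(theta/2) = target by bisection (digamma is increasing)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if digamma(mid / 2) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def em_mtchd(x, tau, t1, t2, iters=100):
    """EM for (tau, theta1, theta2); also returns the observed-data
    log-likelihood trace, which EM never decreases."""
    n = len(x)
    trace = []
    for _ in range(iters):
        # E step: responsibilities gamma_i = P(z_i = 1 | x_i, current params)
        g, ll = [], 0.0
        for xi in x:
            a = tau * math.exp(chi2_logpdf(xi, t1))
            b = (1 - tau) * math.exp(chi2_logpdf(xi, t2))
            g.append(a / (a + b))
            ll += math.log(a + b)
        trace.append(ll)
        # M step: closed-form tau; weighted digamma equation for each theta_j
        w1 = sum(g)
        tau = w1 / n
        s1 = sum(gi * math.log(xi) for gi, xi in zip(g, x))
        s2 = sum((1 - gi) * math.log(xi) for gi, xi in zip(g, x))
        t1 = solve_theta(s1 / w1 - math.log(2))
        t2 = solve_theta(s2 / (n - w1) - math.log(2))
    return tau, t1, t2, trace
```

Run on data simulated from a well-separated mixture, the log-likelihood trace is monotonically non-decreasing, which is the defining property of EM noted above.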
The maximum likelihood estimates are more reasonable than those from the EM algorithm and the method of moments. Although the EM algorithm estimators have smaller MSEs than the maximum likelihood estimators, their biases are greater; in other words, the EM parameter estimates converge towards the mean of the two population parameters. So the maximum likelihood method gives the best estimates of the MTChD parameters when τ is known.

Figure 1: Comparing EM-Algorithm and Method of Moments estimators of MTChD; τ is unknown, θ1 and θ2 are known