# Newton method implementation of maximum likelihood estimation based on R and Python

Contents

0. Preface

1. Theoretical basis

2. Maximum likelihood estimation of Cauchy distribution

2.1 theoretical basis

2.2 algorithm

2.2.1 R language implementation

2.2.2 Python language implementation

3. Maximum likelihood estimation of gamma distribution

3.1 theoretical basis

3.2 algorithm

3.2.1 R language implementation

3.2.2 Python language implementation

# 0. Preface

Recently, I have been studying the theory and methods of statistics. The textbook is *Computer Age Statistical Inference: Algorithms, Evidence, and Data Science*, jointly written by Bradley Efron and Trevor Hastie. Chapter 4, "Fisherian Inference and Maximum Likelihood Estimation", mentions that maximum likelihood estimates are not so easy to compute in practical applications, for example for the Cauchy distribution and the Gamma distribution.

If the likelihood equations have no explicit solution, they can be solved numerically (for example, by Newton's method). If the second derivative does not exist, or the Hessian matrix is not positive definite, a quasi-Newton method can be used instead; in more complex cases, the MM algorithm is available (EM is a special case of MM). Taking Newton's method as the example, this article presents the theory behind maximum likelihood estimation of the parameters of the Cauchy and Gamma distributions, and implements it in both R and Python.

# 1. Theoretical basis

This section gives the general theory of maximum likelihood parameter estimation by Newton's method.

Suppose the random variables $X_1, X_2, \dots, X_n$ are independent and identically distributed with density $f(x;\theta)$, and a set of samples $x_1, x_2, \dots, x_n$ is known. To estimate the parameter $\theta$ of the distribution, the maximum likelihood method can be used.

First, write the likelihood function of the sample

$$L(\theta)=\prod_{i=1}^{n} f(x_i;\theta).$$

Taking logarithms gives the log-likelihood function

$$\ell(\theta)=\sum_{i=1}^{n}\log f(x_i;\theta).$$

Solving for the unknown parameter $\theta$ is then equivalent to solving the score equation

$$\ell'(\theta)=0.$$

Suppose the convergent solution is $\theta^{*}$, and expand $\ell'(\theta)$ as a Taylor series in a neighborhood of the current iterate $\theta^{(k)}$:

$$\ell'(\theta)\approx \ell'(\theta^{(k)})+\ell''(\theta^{(k)})\,(\theta-\theta^{(k)}).$$

Setting this to zero yields the iterative relationship

$$\theta^{(k+1)}=\theta^{(k)}-\frac{\ell'(\theta^{(k)})}{\ell''(\theta^{(k)})}$$

(for a vector parameter, $\ell''$ is the Hessian matrix and the division becomes multiplication by its inverse). If $\ell''$ is continuous and the zero being sought is isolated, then there is a neighborhood around the zero such that, as long as the initial value lies in that neighborhood, Newton's method is guaranteed to converge. Moreover, if $\ell''(\theta^{*})$ is not 0, Newton's method converges quadratically; roughly speaking, the number of significant digits in the result doubles with each iteration.
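The iteration above can be sketched in a few lines of generic code (an illustration added here, not part of the original derivation; the function names are arbitrary). Applied to the simple equation $f(x)=x^2-2=0$, it exhibits the quadratic convergence just described:

```python
def newton(f, fprime, x0, tol=1e-12, it_max=50):
    """Generic scalar Newton iteration: x <- x - f(x)/f'(x)."""
    x = x0
    for _ in range(it_max):
        step = f(x) / fprime(x)
        x = x - step
        if abs(step) < tol:  # stop when the update is small enough
            break
    return x

# Solve x^2 - 2 = 0 starting from x0 = 1; the root is sqrt(2)
root = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0)
```

In the MLE setting, `f` plays the role of the score $\ell'$ and `fprime` the role of $\ell''$.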

# 2. Maximum likelihood estimation of Cauchy distribution

If a random variable $X$ follows the Cauchy distribution, written $X\sim \mathrm{Cauchy}(\gamma,\mu)$, its density is

$$f(x;\gamma,\mu)=\frac{1}{\pi}\cdot\frac{\gamma}{\gamma^{2}+(x-\mu)^{2}},$$

where $\gamma$ is the scale parameter, equal to half the width of the density at half its maximum value, and $\mu$ is the location parameter, which fixes the position of the distribution's peak. When $\gamma=1$ and $\mu=0$, the distribution is called the standard Cauchy distribution. The most striking property of the Cauchy distribution is that its expectation and higher-order moments do not exist, so moment estimation of the parameters is naturally impossible. However, the CDF of the Cauchy distribution has good properties, and the parameters can be estimated from the quantiles of a set of samples: since $F(\mu)=1/2$ and $F(\mu+\gamma)=3/4$, the sample median estimates $\mu$, and the difference between the 0.75 quantile and the median estimates $\gamma$.

## 2.1 theoretical basis

To estimate $(\gamma,\mu)$ by maximum likelihood from the known samples $x_1,\dots,x_n$, write the log-likelihood

$$\ell(\gamma,\mu)=n\log\gamma-n\log\pi-\sum_{i=1}^{n}\log\big(\gamma^{2}+(x_i-\mu)^{2}\big).$$

Writing $u_i=x_i-\mu$ and $d_i=\gamma^{2}+u_i^{2}$, the score vector is

$$\frac{\partial\ell}{\partial\gamma}=\frac{n}{\gamma}-\sum_{i=1}^{n}\frac{2\gamma}{d_i},\qquad \frac{\partial\ell}{\partial\mu}=\sum_{i=1}^{n}\frac{2u_i}{d_i},$$

and the Hessian has entries

$$\frac{\partial^{2}\ell}{\partial\gamma^{2}}=-\frac{n}{\gamma^{2}}-\sum_{i=1}^{n}\frac{2(u_i^{2}-\gamma^{2})}{d_i^{2}},\qquad \frac{\partial^{2}\ell}{\partial\gamma\,\partial\mu}=-\sum_{i=1}^{n}\frac{4\gamma u_i}{d_i^{2}},\qquad \frac{\partial^{2}\ell}{\partial\mu^{2}}=\sum_{i=1}^{n}\frac{2(u_i^{2}-\gamma^{2})}{d_i^{2}}.$$

Therefore, with $\theta=(\gamma,\mu)^{\top}$, score $g(\theta)$, and Hessian $H(\theta)$, the iterative formula is

$$\theta^{(k+1)}=\theta^{(k)}-H\big(\theta^{(k)}\big)^{-1}g\big(\theta^{(k)}\big).$$

## 2.2 algorithm

Step 0: given an initial value $\theta^{(0)}$ and a stopping accuracy $\varepsilon$, set $k=0$.

Step 1: compute the score $g(\theta^{(k)})$ and the Hessian $H(\theta^{(k)})$.

Step 2: update $\theta^{(k+1)}=\theta^{(k)}-H(\theta^{(k)})^{-1}g(\theta^{(k)})$.

Step 3: if $\|\theta^{(k+1)}-\theta^{(k)}\|<\varepsilon$, terminate; otherwise set $k=k+1$ and jump to Step 1.
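Step 0's initial value comes from the quantile estimates described in the theoretical basis above. A quick check of that idea (an added sketch; the seed and sample size are arbitrary choices):

```python
import numpy as np
import scipy.stats as st

# Simulate from Cauchy(gamma = 2, mu = 2); scipy calls these scale and loc
sample = st.cauchy.rvs(loc=2, scale=2, size=100000, random_state=0)

# F(mu) = 1/2 and F(mu + gamma) = 3/4 for the Cauchy CDF, so:
mu_hat = np.median(sample)                      # location estimate
gamma_hat = np.quantile(sample, 0.75) - mu_hat  # scale estimate
```

Both quantile estimates land close to the true values, which makes them good starting points for the Newton iteration.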

### 2.2.1 R language implementation

```r
# Set up the Newton algorithm framework
Newtons = function(fun, x, theta, ep = 1e-5, it_max = 100){
  index = 0; k = 1
  while (k <= it_max){
    theta1 = theta
    obj = fun(x, theta)
    theta = theta - solve(obj$J, obj$f)   # Newton step: solve H * step = g
    norm = sqrt(sum((theta - theta1)^2))
    if (norm < ep){
      index = 1; break
    }
    k = k + 1
  }
  obj = fun(x, theta)
  list(root = theta, it = k, index = index, FunVal = obj$f)
}

# Score vector and Hessian matrix of the Cauchy log-likelihood
# theta[1] = gamma (scale), theta[2] = mu (location)
funs = function(x, theta){
  n = length(x)
  gamma = theta[1]; mu = theta[2]
  temp_1 = (x - mu)^2 + gamma^2
  temp_2 = (x - mu)^2 - gamma^2
  f = c(n/gamma - sum(2*gamma/temp_1),
        sum(2*(x - mu)/temp_1))
  J = matrix(c(-n/gamma^2 - sum(2*temp_2/temp_1^2), -sum(4*gamma*(x - mu)/temp_1^2),
               -sum(4*gamma*(x - mu)/temp_1^2),      sum(2*temp_2/temp_1^2)),
             nrow = 2, byrow = TRUE)
  list(f = f, J = J)
}

# Draw 1000 samples
set.seed(80)
sample = rcauchy(1000, location = 2, scale = 2)

# Quantile estimates of the Cauchy parameters serve as initial values
gamma_ = quantile(sample, 0.75) - median(sample)
mu_ = median(sample)
theta = c(gamma_, mu_)

Newtons(funs, sample, theta)
```

### 2.2.2 Python language implementation

```python
import numpy as np
import scipy.stats as st

np.random.seed(12)
# scipy.stats.cauchy.rvs(loc=0, scale=1, size=1, random_state=None)
sample = st.cauchy.rvs(loc=1, scale=1, size=100)

# Quantile estimates of the Cauchy parameters serve as initial values
gamma_ = np.quantile(sample, 0.75) - np.median(sample)
mu_ = np.median(sample)
theta = np.array([gamma_, mu_])

def Newtons(fun, x, theta, ep=1e-5, it_max=100):
    index = 0
    k = 1
    while k <= it_max:
        theta1 = theta
        f, J = fun(x, theta)
        theta = theta - np.linalg.solve(J, f)  # Newton step: solve H * step = g
        norm = np.sqrt(np.sum((theta - theta1)**2))
        if norm < ep:
            index = 1
            break
        k = k + 1
    print('The estimates of gamma, mu are %.3f, %.3f' % (theta[0], theta[1]))
    return theta, k, index

def funs(x, theta):
    # Score vector and Hessian of the Cauchy log-likelihood
    # theta[0] = gamma (scale), theta[1] = mu (location)
    n = len(x)
    gamma, mu = theta
    temp_1 = (x - mu)**2 + gamma**2
    temp_2 = (x - mu)**2 - gamma**2
    f = np.array([n/gamma - np.sum(2*gamma/temp_1),
                  np.sum(2*(x - mu)/temp_1)])
    J = np.array([[-n/gamma**2 - np.sum(2*temp_2/temp_1**2), -np.sum(4*gamma*(x - mu)/temp_1**2)],
                  [-np.sum(4*gamma*(x - mu)/temp_1**2),       np.sum(2*temp_2/temp_1**2)]])
    return f, J

Newtons(funs, sample, theta)
```
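As an added sanity check (not in the original post), scipy's built-in maximum likelihood fit can be run on the same data; its `(loc, scale)` estimates should be close to the Newton solution for $(\mu, \gamma)$:

```python
import numpy as np
import scipy.stats as st

# Regenerate the same sample as above
np.random.seed(12)
sample = st.cauchy.rvs(loc=1, scale=1, size=100)

# cauchy.fit maximizes the likelihood numerically and returns (loc, scale)
loc_hat, scale_hat = st.cauchy.fit(sample)
```

With the true parameters loc = 1 and scale = 1, both estimates should land near 1.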

# 3. Maximum likelihood estimation of gamma distribution

If a random variable $X$ follows the three-parameter Gamma distribution, written $X\sim\Gamma(\alpha,\beta,\lambda)$, its density is

$$f(x;\alpha,\beta,\lambda)=\frac{1}{\beta^{\alpha}\Gamma(\alpha)}(x-\lambda)^{\alpha-1}e^{-(x-\lambda)/\beta},\qquad x>\lambda,$$

where $\alpha$ is the shape parameter, $\beta$ is the scale parameter, and $\lambda$ is the location parameter. To derive moment estimators for the three parameters, consider the first-, second-, and third-order origin moments of the Gamma distribution (the trick in each integral is to assemble a complete Gamma function).

Taking the proof of the first-order origin moment as an example: substituting $t=(x-\lambda)/\beta$, the integrand splits into a complete Gamma integral, so

$$E[X]=\lambda+\alpha\beta. \tag{①}$$

For the second- and third-order origin moments, split $x^{2}$ and $x^{3}$ into powers of $(x-\lambda)$ in the same way, which easily gives

$$E[X^{2}]=\lambda^{2}+2\lambda\alpha\beta+\alpha(\alpha+1)\beta^{2} \tag{②}$$

and

$$E[X^{3}]=\lambda^{3}+3\lambda^{2}\alpha\beta+3\lambda\alpha(\alpha+1)\beta^{2}+\alpha(\alpha+1)(\alpha+2)\beta^{3}. \tag{③}$$

From ①, ②, and ③ it follows that $\mathrm{Var}(X)=\alpha\beta^{2}$ and $E\big[(X-EX)^{3}\big]=2\alpha\beta^{3}$, so with sample mean $\bar{x}$, sample variance $s^{2}$, and third central sample moment $m_3$, the moment estimators are

$$\hat{\alpha}=\frac{4\,(s^{2})^{3}}{m_3^{2}},\qquad \hat{\beta}=\frac{m_3}{2s^{2}},\qquad \hat{\lambda}=\bar{x}-\frac{2(s^{2})^{2}}{m_3}.$$

## 3.1 theoretical basis

Given the samples $x_1,\dots,x_n$ and parameters $\theta=(\alpha,\beta,\lambda)^{\top}$, the log-likelihood is

$$\ell(\alpha,\beta,\lambda)=-n\alpha\log\beta-n\log\Gamma(\alpha)+(\alpha-1)\sum_{i=1}^{n}\log(x_i-\lambda)-\frac{1}{\beta}\sum_{i=1}^{n}(x_i-\lambda).$$

The score vector is

$$\frac{\partial\ell}{\partial\alpha}=-n\log\beta-n\psi(\alpha)+\sum_{i}\log(x_i-\lambda),\qquad \frac{\partial\ell}{\partial\beta}=-\frac{n\alpha}{\beta}+\frac{1}{\beta^{2}}\sum_{i}(x_i-\lambda),\qquad \frac{\partial\ell}{\partial\lambda}=-(\alpha-1)\sum_{i}\frac{1}{x_i-\lambda}+\frac{n}{\beta},$$

where $\psi$ is the digamma function, and the Hessian has entries

$$\frac{\partial^{2}\ell}{\partial\alpha^{2}}=-n\psi'(\alpha),\qquad \frac{\partial^{2}\ell}{\partial\alpha\,\partial\beta}=-\frac{n}{\beta},\qquad \frac{\partial^{2}\ell}{\partial\alpha\,\partial\lambda}=-\sum_{i}\frac{1}{x_i-\lambda},$$

$$\frac{\partial^{2}\ell}{\partial\beta^{2}}=\frac{n\alpha}{\beta^{2}}-\frac{2}{\beta^{3}}\sum_{i}(x_i-\lambda),\qquad \frac{\partial^{2}\ell}{\partial\beta\,\partial\lambda}=-\frac{n}{\beta^{2}},\qquad \frac{\partial^{2}\ell}{\partial\lambda^{2}}=-(\alpha-1)\sum_{i}\frac{1}{(x_i-\lambda)^{2}}.$$

Therefore, with score $g(\theta)$ and Hessian $H(\theta)$, the iterative formula is again

$$\theta^{(k+1)}=\theta^{(k)}-H\big(\theta^{(k)}\big)^{-1}g\big(\theta^{(k)}\big).$$

## 3.2 algorithm

Step 0: given an initial value $\theta^{(0)}$ and a stopping accuracy $\varepsilon$, set $k=0$.

Step 1: compute the score $g(\theta^{(k)})$ and the Hessian $H(\theta^{(k)})$.

Step 2: update $\theta^{(k+1)}=\theta^{(k)}-H(\theta^{(k)})^{-1}g(\theta^{(k)})$.

Step 3: if $\|\theta^{(k+1)}-\theta^{(k)}\|<\varepsilon$, terminate; otherwise set $k=k+1$ and jump to Step 1.
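Step 0's initial values are the moment estimates $\hat\alpha$, $\hat\beta$, $\hat\lambda$ derived above. A quick check that they recover the true parameters on a large simulated sample (an added sketch; the parameter values and seed are arbitrary):

```python
import numpy as np
import scipy.stats as st

# True parameters: alpha = 2 (shape), beta = 1.5 (scale), lambda = 3 (location)
sample = st.gamma.rvs(2, loc=3, scale=1.5, size=200000, random_state=0)

s_m = sample.mean()                 # sample mean
s_v = sample.var()                  # sample variance
m_3 = np.mean((sample - s_m)**3)    # third central sample moment

alpha_hat = 4*s_v**3/m_3**2         # shape estimate
beta_hat = m_3/(2*s_v)              # scale estimate
lambda_hat = s_m - 2*s_v**2/m_3     # location estimate
```

All three estimates come out close to (2, 1.5, 3), so they make a reasonable starting point for Newton's method.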

### 3.2.1 R language implementation

```r
# Set up the Newton algorithm framework
Newtons = function(fun, x, theta, ep = 1e-5, it_max = 100){
  index = 0; k = 1
  while (k <= it_max){
    theta1 = theta
    obj = fun(x, theta)
    theta = theta - solve(obj$J, obj$f)   # Newton step: solve H * step = g
    norm = sqrt(sum((theta - theta1)^2))
    if (norm < ep){
      index = 1; break
    }
    k = k + 1
  }
  obj = fun(x, theta)
  list(root = theta, it = k, index = index, FunVal = obj$f)
}

# Score vector and Hessian of the three-parameter Gamma log-likelihood
# theta[1] = alpha (shape), theta[2] = beta (scale), theta[3] = lambda (location)
funs = function(x, theta){
  n = length(x)
  alpha = theta[1]; beta = theta[2]; lambda = theta[3]
  f = c(-n*log(beta) - n*digamma(alpha) + sum(log(x - lambda)),
        -n*alpha/beta + sum(x - lambda)/beta^2,
        (alpha - 1)*sum(-1/(x - lambda)) + n/beta)
  J = matrix(c(-n*trigamma(alpha),  -n/beta,                                  sum(-1/(x - lambda)),
               -n/beta,              n*alpha/beta^2 - 2*sum(x - lambda)/beta^3, -n/beta^2,
               sum(-1/(x - lambda)), -n/beta^2,                               (alpha - 1)*sum(-1/(x - lambda)^2)),
             nrow = 3, byrow = TRUE)
  list(f = f, J = J)
}

# Draw random Gamma samples
set.seed(80)
sample = rgamma(100, shape = 2)

# Moment estimates of the Gamma parameters serve as initial values
s_m = mean(sample)
s_v = var(sample)
m_3 = sum((sample - s_m)^3)/length(sample)
alpha = 4*s_v^3/m_3^2
beta = m_3/(2*s_v)
lambda = s_m - 2*s_v^2/m_3
theta = c(alpha, beta, lambda)

Newtons(funs, sample, theta)
```

### 3.2.2 Python language implementation

```python
import numpy as np
import scipy.stats as st
import scipy.special as sp

np.random.seed(12)
# scipy.stats.gamma.rvs(a, loc=0, scale=1, size=1, random_state=None)
sample = st.gamma.rvs(2, scale=1, size=50)

# Moment estimates of the Gamma parameters serve as initial values
s_m = sample.mean()
s_v = sample.var()
m_3 = np.mean((sample - s_m)**3)
alpha = 4*s_v**3/m_3**2
beta = m_3/(2*s_v)
lambda_ = s_m - 2*s_v**2/m_3
theta = np.array([alpha, beta, lambda_])

def Newtons(fun, x, theta, ep=1e-5, it_max=100):
    index = 0
    k = 1
    while k <= it_max:
        theta1 = theta
        f, J = fun(x, theta)
        theta = theta - np.linalg.solve(J, f)  # Newton step: solve H * step = g
        norm = np.sqrt(np.sum((theta - theta1)**2))
        if norm < ep:
            index = 1
            break
        k = k + 1
    print('The estimates of alpha, beta, lambda are %.3f, %.3f, %.3f'
          % (theta[0], theta[1], theta[2]))
    return theta, k, index

def funs(x, theta):
    # Score vector and Hessian of the three-parameter Gamma log-likelihood
    # theta[0] = alpha (shape), theta[1] = beta (scale), theta[2] = lambda (location)
    n = len(x)
    alpha, beta, lam = theta
    f = np.array([-n*np.log(beta) - n*sp.digamma(alpha) + np.sum(np.log(x - lam)),
                  -n*alpha/beta + np.sum(x - lam)/beta**2,
                  (alpha - 1)*np.sum(-1.0/(x - lam)) + n/beta])
    J = np.array([[-n*sp.polygamma(1, alpha), -n/beta, np.sum(-1/(x - lam))],
                  [-n/beta, n*alpha/beta**2 - 2*np.sum(x - lam)/beta**3, -n/beta**2],
                  [np.sum(-1/(x - lam)), -n/beta**2, (alpha - 1)*np.sum(-1/(x - lam)**2)]])
    return f, J

Newtons(funs, sample, theta)
```
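Again as an added cross-check, scipy's built-in fit for the gamma distribution (which estimates shape `a`, `loc`, and `scale`, i.e. $\alpha$, $\lambda$, $\beta$) can be run on the same data:

```python
import numpy as np
import scipy.stats as st

# Regenerate the same sample as above
np.random.seed(12)
sample = st.gamma.rvs(2, scale=1, size=50)

# gamma.fit maximizes the likelihood numerically; returns (a, loc, scale)
a_hat, loc_hat, scale_hat = st.gamma.fit(sample)
```

Note that the fitted location must lie below the sample minimum, since the density is zero for $x \le \lambda$.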

Note: when Newton's method is used for maximum likelihood estimation of all three parameters of the Gamma distribution, the Hessian matrix easily fails to have the required definiteness, so the iteration may not reach the correct solution. In that case, a quasi-Newton method or a modified Newton method can be used.
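A minimal sketch of that alternative (an addition to the post; the starting point and bounds are illustrative choices): minimize the negative log-likelihood with L-BFGS-B, a quasi-Newton method that maintains a positive-definite approximation to the Hessian and therefore sidesteps the definiteness problem.

```python
import numpy as np
import scipy.stats as st
from scipy.optimize import minimize

np.random.seed(12)
sample = st.gamma.rvs(2, scale=1, size=50)

def neg_loglik(params, x):
    alpha, beta, lam = params
    # scipy's gamma takes shape a = alpha, loc = lambda, scale = beta
    return -np.sum(st.gamma.logpdf(x, alpha, loc=lam, scale=beta))

# Rough starting values; the bounds keep the optimizer inside the
# parameter space (alpha > 0, beta > 0, lambda < min(x))
x0 = np.array([2.0, 1.0, sample.min() - 1.0])
res = minimize(neg_loglik, x0, args=(sample,), method='L-BFGS-B',
               bounds=[(1e-6, None), (1e-6, None), (None, sample.min() - 1e-4)])
alpha_hat, beta_hat, lambda_hat = res.x
```

Unlike the raw Newton iteration, the quasi-Newton update is always a descent direction for the negative log-likelihood, which is exactly what the note above calls for.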

Tags: Python Algorithm

Posted on Tue, 21 Sep 2021 18:47:52 -0400 by longtone