Numerical analysis case: forecasting 2019 urban (Asian) temperatures with Newton interpolation, and solving for urban isothermality factor coefficients with Crout decomposition

1, Experiment purpose and data source

1. Overview of research issues:

This article studies a practical application of interpolation to predicting urban temperature change, and then solves a system of linear equations for factor coefficients in order to explore which factors correlate with urban isothermality.

2. Data source:

  1. World Average Temperature(https://www.kaggle.com/efradgamer/world-average-temperature)

  2. global environmental factors(https://www.kaggle.com/sadeka007/global-environmental-factors)

2, Experiment content

Part I: "forecasting 2019 urban temperature by Newton interpolation"

Step 1:

Randomly select temperature data for a number of Asian cities, making sure the cities are spread out in longitude and latitude so that the sample is representative of the whole region. The selected cities are: Beijing, Chongqing, Taipei, Tokyo, Sapporo, Seoul, Dikson, Vladivostok, Chiang Mai, Hat Yai, Mumbai, Da Nang, Hanoi, Erzurum.

The code is as follows:

# -*- coding:utf-8 -*-
import numpy as np
import pandas as pd
import csv
import matplotlib.pyplot as plt
%matplotlib inline

path = 'DATA.csv'
data = pd.read_csv(path, index_col=0)
data.head(15)

Step 2:

Visualizing the data gives an intuitive view of how temperatures differ between cities. The monthly temperatures of each city are drawn as a line chart:

The code is as follows:

# Convert the first 12 columns (monthly temperatures) to an np.array of floats
Data = np.array(data.iloc[:,0:12], dtype=float)
print(Data)

# Take 12 months' temperature statistics of each city
Beijing = Data[0,:]
Chongqing = Data[1,:]
Taipei = Data[2,:]
Tokyo = Data[3,:]
Sapporo = Data[4,:]
Seoul = Data[5,:]
Dikson = Data[6,:]
Vladivostok = Data[7,:]
ChiangMai = Data[8,:]
HatYai = Data[9,:]
DaNang = Data[10,:]
Hanoi = Data[11,:]
Mumbai = Data[12,:]
Erzurum = Data[13,:]

# Visualization of annual temperature change data of a city in the form of line chart
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

plt.figure(figsize=(13, 13))
plt.rcParams['font.sans-serif']=['SimHei'] #Used to display Chinese labels normally
plt.rcParams['axes.unicode_minus']=False #Used to display negative sign normally
plt.plot(x, Beijing,  ms=2, label="Beijing")
plt.plot(x, Chongqing,ms=2, label="Chongqing")
plt.plot(x, Taipei,  ms=2, label="Taipei")
plt.plot(x, Tokyo, ms=2, label="Tokyo")

plt.plot(x, Sapporo, ms=2, label="Sapporo")
plt.plot(x, Seoul, ms=2, label="Seoul")
plt.plot(x, Dikson, ms=2, label="Dikson")

plt.plot(x, Vladivostok, ms=2, label="Vladivostok")
plt.plot(x, ChiangMai,  ms=2, label="Chiang Mai")
plt.plot(x, HatYai,  ms=2, label="Hat Yai")
plt.plot(x, DaNang, ms=2, label="Da Nang")

plt.plot(x, Hanoi,  ms=2, label="Hanoi")
plt.plot(x, Mumbai,  ms=2, label="Mumbai")
plt.plot(x, Erzurum, ms=2, label="Erzurum")
plt.xticks(rotation=45)
plt.xlabel("Month")
plt.ylabel("Temperature/degree")
plt.title("2019 temperature change in Asian cities")
plt.legend(loc="upper left")


# Display the specific value on the line chart, ha parameter controls the horizontal alignment, va controls the vertical alignment
for y in [Beijing, Chongqing, Taipei, Tokyo,Sapporo,Seoul,Dikson,Vladivostok,
          ChiangMai,HatYai,DaNang,Hanoi,Mumbai,Erzurum]:
    for x1, yy in zip(x, y):
        plt.text(x1, yy + 1, str(yy), ha='center', va='bottom', fontsize=12, rotation=0)
plt.savefig("a.jpg")
plt.show()

Step 3:

Data division: the city temperature data is split into a training set and a test set. To ensure stratified sampling across the year, the training set takes the temperatures of January, March, May, July, September and November, and the test set takes those of February, April, June, August, October and December.

The code is as follows:

# train_data
Beijing_train = Data[0,[0,2,4,6,8,10]]
Chongqing_train = Data[1,[0,2,4,6,8,10]]
Taipei_train = Data[2,[0,2,4,6,8,10]]
Tokyo_train = Data[3,[0,2,4,6,8,10]]
Sapporo_train = Data[4,[0,2,4,6,8,10]]
Seoul_train = Data[5,[0,2,4,6,8,10]]
Dikson_train = Data[6,[0,2,4,6,8,10]]
Vladivostok_train = Data[7,[0,2,4,6,8,10]]
ChiangMai_train = Data[8,[0,2,4,6,8,10]]
HatYai_train = Data[9,[0,2,4,6,8,10]]
DaNang_train = Data[10,[0,2,4,6,8,10]]
Hanoi_train = Data[11,[0,2,4,6,8,10]]
Mumbai_train = Data[12,[0,2,4,6,8,10]]
Erzurum_train = Data[13,[0,2,4,6,8,10]]
# Test_data
Beijing_test = Data[0,[1,3,5,7,9,11]]
Chongqing_test = Data[1,[1,3,5,7,9,11]]
Taipei_test = Data[2,[1,3,5,7,9,11]]
Tokyo_test = Data[3,[1,3,5,7,9,11]]
Sapporo_test = Data[4,[1,3,5,7,9,11]]
Seoul_test = Data[5,[1,3,5,7,9,11]]
Dikson_test = Data[6,[1,3,5,7,9,11]]
Vladivostok_test = Data[7,[1,3,5,7,9,11]]
ChiangMai_test = Data[8,[1,3,5,7,9,11]]
HatYai_test = Data[9,[1,3,5,7,9,11]]
DaNang_test = Data[10,[1,3,5,7,9,11]]
Hanoi_test= Data[11,[1,3,5,7,9,11]]
Mumbai_test = Data[12,[1,3,5,7,9,11]]
Erzurum_test = Data[13,[1,3,5,7,9,11]]
# Array indices start at 0: indices 0,2,4,6,8,10 correspond to January, March, May, July, September and November,
# and indices 1,3,5,7,9,11 correspond to February, April, June, August, October and December
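The same odd/even month split can be written once for every city with stride slicing instead of one line per city. The sketch below is a minimal illustration, assuming each row holds 12 monthly values; the `monthly` dictionary, the `split_by_month` helper and all numbers are made up for illustration and are not taken from DATA.csv.

```python
# Hypothetical monthly temperatures (January..December); NOT values from DATA.csv
monthly = {
    "CityA": [-3.0, 0.5, 7.0, 14.5, 20.0, 24.5, 26.5, 25.5, 20.5, 13.0, 5.0, -1.0],
    "CityB": [8.0, 10.0, 14.5, 19.0, 22.5, 25.5, 28.5, 28.5, 24.0, 18.5, 14.0, 9.5],
}


def split_by_month(row):
    """Return (train, test): odd months (Jan, Mar, ...) and even months (Feb, Apr, ...)."""
    train = row[0::2]  # indices 0, 2, 4, 6, 8, 10
    test = row[1::2]   # indices 1, 3, 5, 7, 9, 11
    return train, test


splits = {city: split_by_month(row) for city, row in monthly.items()}
train_a, test_a = splits["CityA"]
print(train_a)
print(test_a)
```

With the NumPy array used above, `Data[:, 0::2]` and `Data[:, 1::2]` would select the same columns for all cities at once.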

Step 4:

The choice of interpolation method takes the following into account: Lagrange interpolation is prone to the Runge phenomenon, Hermite interpolation requires derivative values, and piecewise interpolation fits weakly. On balance, Newton interpolation is selected to make the predictions.
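For reference, the Newton form can be written with the standard divided-difference notation. The symbols $x_i$, $y_i$, $f[\cdot]$ and $N(t)$ are notation introduced here, not the author's; $n$ is the number of nodes (here $n = 6$, with nodes $1, 3, \dots, 11$).

```latex
f[x_i] = y_i, \qquad
f[x_i, \dots, x_{i+k}] =
  \frac{f[x_{i+1}, \dots, x_{i+k}] - f[x_i, \dots, x_{i+k-1}]}{x_{i+k} - x_i}

N(t) = f[x_0] + f[x_0, x_1]\,(t - x_0) + \dots
       + f[x_0, \dots, x_{n-1}] \prod_{i=0}^{n-2} (t - x_i)
```

Two points are worth keeping in mind when reading the results. Through the same nodes, $N(t)$ is the same polynomial that the Lagrange form would produce, so the practical advantage of the Newton form is its incremental, table-based computation. Also, evaluating at month 12 lies outside the node range $[1, 11]$, so that value is an extrapolation rather than an interpolation.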

Taking Beijing as an example, the temperatures in February, April, June, August, October and December are predicted as follows:

The code is as follows:

# Newton interpolation (divided-difference table)
import numpy as np


def Qda_Table():
    for i in range(n+1):
        for j in range(n):
            if initial[i][j] is not None or i >= j + 2:
                continue
            else:
                initial[i][j] = (initial[i - 1][j] - initial[i - 1][j - 1]) / (
                        initial[0][j] - initial[0][j - i + 1])
            # print("initial[%d][%d] = %s" % (i, j, initial[i][j]))


def calculate_factors(no, vari):  # Product (vari - x[0]) * ... * (vari - x[no-2]) that multiplies the divided difference of order no-1
    factor = 1
    if no == 1:
        return factor
    else:
        for i in range(no-1):
            factor = factor * (vari-x[i])
    return factor


def final_calculate(k):
    result = 0
    for time in range(n):
        result = result + initial[time+1][time] * calculate_factors(time+1, v[k])
    return result



x = [1, 3, 5, 7, 9, 11]  # Months used as interpolation nodes (training months)
y = Beijing_train.copy()
n = len(x)
v = [2, 4, 6, 8, 10, 12]  # Months to forecast
y_predict = []  # Stores the predicted temperatures for the forecast months

initial = []  # Initialize Table
for p in range(n+1):
    initial.append([])
    for q in range(n):
        initial[p].append(None)  # Build an (n+1) x n two-dimensional list with every element initialized to None
for k in range(n):
    initial[0][k] = x[k]
    initial[1][k] = y[k]
# print("the initialized table is as follows:")
# print(initial)  # Fill the values of x and y into the Table to complete the initialization of Table.

Qda_Table()  # Compute the divided differences and fill in the table
QDA_table = np.array(initial).T  # Transpose the table for display
# print("the final difference table is as follows:")
# print(QDA_table)

for i in range(n):
    y_predict.append(np.around(final_calculate(i),decimals=4))
print("The temperature information for the forecast month is as follows:")
print(y_predict)
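As a cross-check on the table-based script above, the same computation can be packaged as small functions with no global state, together with one possible error measure for the held-out months. This is a sketch rather than the author's code: the function names are hypothetical, the training values follow a made-up quadratic, the "observed" test values are invented, and none of the numbers come from DATA.csv. Mean absolute error is an assumed choice of metric; the article does not say how the error was measured.

```python
def divided_differences(xs, ys):
    """Return the Newton coefficients f[x0], f[x0,x1], ..., f[x0..x_{n-1}]."""
    coef = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Sweep from the bottom so the lower-order value coef[i - 1] is still intact
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - order])
    return coef


def newton_eval(xs, coef, t):
    """Evaluate the Newton polynomial at t using nested (Horner-style) multiplication."""
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (t - xs[i]) + coef[i]
    return result


def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up training values that follow y = 25 - 0.5 * (m - 7)**2; NOT from DATA.csv
train_months = [1, 3, 5, 7, 9, 11]
train_temps = [7.0, 17.0, 23.0, 25.0, 23.0, 17.0]
test_months = [2, 4, 6, 8, 10, 12]
test_temps = [12.0, 21.0, 24.5, 25.0, 20.0, 11.0]  # invented "observed" values

coef = divided_differences(train_months, train_temps)
predicted = [newton_eval(train_months, coef, m) for m in test_months]
mae = mean_absolute_error(predicted, test_temps)
print(predicted)
print(round(mae, 4))
```

With the variables from the script above, `mean_absolute_error(y_predict, Beijing_test)` would give the corresponding figure for Beijing.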

Part II: "solving for and analyzing the coefficients of the factors influencing urban isothermality with Crout decomposition"

Step 1:

Select environmental factor data for the countries in which the cities from Part I are located (Korea, South; Japan; Turkey; India; Vietnam; Thailand; China; Russia). Because the environmental data set contains many items, and because the later system of linear equations needs a fixed set of unknowns in order to solve for the factor coefficients and analyze their correlation, the following 8 environmental factors were chosen, after consulting references and screening, as the indicators for the analysis of urban isothermality:

elevation

cropland_cover (farmland coverage)

tree_canopy_cover (vegetation coverage)

rain_mean_annual (average annual precipitation)

rain_seasonailty (precipitation seasonality)

temp_annual_range (annual temperature fluctuation range)

cloudiness (cloud volume)

temp_mean_annual (annual average temperature)

The detailed data are as follows:

The code is as follows:

# Load and preview the environmental factor data
path = 'ENVI.csv'
source = pd.read_csv(path, index_col=0)
source.head(15)

Step 2:

A system of linear equations is constructed, assuming a linear relationship between the candidate factors and isothermality. The correlation between each factor and isothermality is then assessed in a preliminary way from its solved coefficient: if |coefficient| is approximately 0, the correlation is considered low; if |coefficient| is greater than or equal to 0.5, it is considered high. The system is solved by Crout decomposition.

Independent variables: elevation, cropland_cover, tree_canopy_cover, rain_mean_annual, rain_seasonailty, temp_annual_range, cloudiness, temp_mean_annual

Dependent variable: isothermality
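Written out, the assumed linear model gives one equation per country. The notation is introduced here for illustration only: $a_{ij}$ is the value of factor $j$ in country $i$, $b_i$ is that country's isothermality, and $c_j$ are the unknown coefficients. In the Crout convention, $L$ is lower triangular and $U$ is upper triangular with ones on its diagonal.

```latex
\sum_{j=1}^{8} a_{ij}\, c_j = b_i, \quad i = 1, \dots, 8
\qquad\Longleftrightarrow\qquad A c = b

A = L U, \qquad L y = b, \qquad U c = y
```

Because there are as many equations as unknowns, the solution reproduces the eight data points exactly, provided $A$ is nonsingular.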

The results are as follows:

The code is as follows:

# Solution by Crout decomposition (A = L U, with U unit upper triangular)
import numpy as np
import copy

def UAndL_Figure(k):
    # Row k of U: divide the entries to the right of the diagonal by the pivot A[k][k]
    for j in range(k + 1, mu):
        U[k][j] = A[k][j] / A[k][k]
    # Column elimination: subtract U[k][j] times column k from every column j > k,
    # so that A is reduced in place to the lower triangular factor L
    for j in range(k + 1, mu):
        for i in range(nu):
            A[i][j] = A[i][j] - U[k][j] * A[i][k]


mu, nu = 8, 8 # The number of equations and independent variables is 8
source = np.array(source.iloc[:,0:9], dtype=float) # Convert data to float
A, L, U = [], [], []  # Initialize the A, L and U matrices
for p in range(nu):
    A.append([]), L.append([]), U.append([])
    for q in range(mu):
        x_in = source[p][q]
        A[p].append(x_in)
        if p == q:
            L[p].append(None), U[p].append(1.0)
        else:
            L[p].append(None), U[p].append(0.0)
# dtype=float keeps the fractional multipliers stored in U from being truncated to integers
A, L, U = np.array(A), np.array(L), np.array(U, dtype=float)


b_Const = source[:,[8]]  # Dependent y matrix


for i in range(nu):
    UAndL_Figure(i)
L = copy.deepcopy(A)

y = np.dot(np.linalg.inv(L), b_Const)
x = np.dot(np.linalg.inv(U), y)
x = np.around(x,decimals=3)
# print("x is calculated as follows:")
# print(x)


t = [i for item in x for i in item]  # Flatten the 2-D column vector x into a 1-D list
print("The results of the coefficients of the possible factors are as follows:")
print(t)
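For comparison with the script above, the sketch below is a compact, dependency-free version of Crout decomposition that solves the triangular systems by forward and backward substitution instead of inverting L and U. The function names `crout_decompose` and `crout_solve` are hypothetical, and the 3 x 3 matrix is a made-up test system, not the environmental data. Like the script above, it does no pivoting, so it fails on a zero pivot.

```python
def crout_decompose(A):
    """Factor A = L U with L lower triangular and U unit upper triangular (no pivoting)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for j in range(n):
        # Column j of L (diagonal and below)
        for i in range(j, n):
            L[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        if L[j][j] == 0.0:
            raise ZeroDivisionError("zero pivot: Crout without pivoting cannot continue")
        # Row j of U (right of the diagonal)
        for i in range(j + 1, n):
            U[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(j))) / L[j][j]
    return L, U


def crout_solve(A, b):
    """Solve A x = b via L y = b (forward substitution) and U x = y (back substitution)."""
    L, U = crout_decompose(A)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))
    return x


# Made-up test system with a known solution; NOT the ENVI.csv data
A = [[4.0, 3.0, 2.0],
     [2.0, 1.0, 3.0],
     [3.0, 4.0, 1.0]]
x_true = [1.0, -2.0, 3.0]
b = [sum(a * v for a, v in zip(row, x_true)) for row in A]

x = crout_solve(A, b)
print([round(v, 6) for v in x])
```

When a pivot is zero or very small, a pivoted solver such as `numpy.linalg.solve` is the safer choice.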

Step 3:

After solving for the linear coefficients in Step 2, the factors found to be highly correlated with isothermality are cropland_cover, tree_canopy_cover and temp_annual_range. Visualize the data of all countries on these indicators:
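The selection rule from Step 2 can be applied mechanically to a list of solved coefficients. The sketch below assumes a tolerance of 0.05 for "approximately 0", which the article does not specify, and uses placeholder factor names and coefficient values rather than the actual solution.

```python
HIGH = 0.5        # threshold stated in Step 2: |coefficient| >= 0.5 means high correlation
NEAR_ZERO = 0.05  # assumed tolerance for "approximately 0"; not specified in the article


def classify(coefficient):
    """Label a coefficient as 'high', 'low' or 'undetermined' by its magnitude."""
    magnitude = abs(coefficient)
    if magnitude >= HIGH:
        return "high"
    if magnitude <= NEAR_ZERO:
        return "low"
    return "undetermined"


# Placeholder coefficients for illustration only; NOT the values solved above
example = {"factor_a": -0.72, "factor_b": 0.01, "factor_c": 0.3}
labels = {name: classify(c) for name, c in example.items()}
print(labels)
```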


Taking the cropland_cover factor as an example, the code is as follows:

# Reload the CSV and convert it to an np.array of floats
# (source was already converted to an array in Step 2, so it no longer has .iloc)
source = np.array(pd.read_csv('ENVI.csv', index_col=0).iloc[:,0:9], dtype=float)
print(source)

# Set matplotlib to display Chinese and minus sign normally
plt.rcParams['font.sans-serif']=['SimHei']
plt.rcParams['axes.unicode_minus']=False     

# Generate canvas
plt.figure(figsize=(20, 8), dpi=70)
# x axis: country names
country_name = ['Korea, South','Japan','Turkey','India','Vietnam','Thailand','China','Russia']
# y axis: cropland_cover values (column 1)
y = source[:,1]
x=range(len(country_name))

plt.bar(x,y,color=['b','r','g','y','c','m','y','k','c','g','g'])
plt.xticks(x, country_name)
plt.xlabel("Country")
plt.ylabel("Cropland_cover")
plt.title("2019 cropland_cover of Asian countries")

# Display the value above each bar; ha controls the horizontal alignment, va the vertical alignment
for x1, yy in zip(x, y):
    plt.text(x1, yy + 1, str(yy), ha='center', va='bottom', fontsize=12, rotation=0)

plt.show()

Step 4:

Because only a limited number of countries and environmental data items were selected, the relationship between urban isothermality and the other environmental factors cannot be established directly, so the potential factors are analyzed with a heat map. The analysis is guided by the Pearson correlation coefficient, a statistic that reflects the degree of linear correlation between two variables.

The heat map is read as follows:

The horizontal axis shows the indicators elevation, rain_mean_annual, rain_seasonailty, cloudiness, temp_mean_annual and isothermality; the vertical axis shows the countries Korea, South, Japan, Turkey, India, Vietnam, Thailand, China and Russia.

The heat map is used to explore whether different countries correlate on the same environmental factors. It suggests that elevation and rain_mean_annual have potential relevance in the statistics of the different countries.

This heat map does not include the cropland_cover, tree_canopy_cover and temp_annual_range factors.

The code is as follows:

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
f, ax = plt.subplots(figsize = (6,6))

#Display heat map
sns.heatmap(source[:,[0,3,4,6,7,8]], annot=True,fmt='.2f',cbar=True)
ax.set_title('Country Statistics')
ax.set_xlabel('Index Of Environmental Attributes')
ax.set_ylabel('Country')
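The heat map above displays the raw factor values. To put a number on the linear relationship between two factor columns, the Pearson correlation coefficient can be computed directly. The sketch below is a plain-Python illustration; the `pearson` helper is hypothetical and the two columns are made-up numbers, not the ENVI.csv values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up columns for eight hypothetical countries; NOT values from ENVI.csv
elevation = [120.0, 400.0, 1100.0, 600.0, 300.0, 250.0, 1800.0, 500.0]
rain_mean_annual = [1300.0, 1700.0, 600.0, 1100.0, 1800.0, 1600.0, 650.0, 450.0]

r = pearson(elevation, rain_mean_annual)
print(round(r, 3))
```

With the `source` array, `np.corrcoef(source[:, 0], source[:, 3])[0, 1]` would give the same statistic, assuming columns 0 and 3 hold elevation and rain_mean_annual as the heat map's column selection suggests.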

3, Experimental results and analysis

Part I, "forecasting 2019 urban (Asian) temperature by Newton interpolation": in this part, Newton interpolation is used to predict 2019 temperatures for Asian cities. In testing, the interpolated values were relatively good, with small errors against the recorded statistics. Practical significance: when temperature data for a monitored month are missing, Newton interpolation can be used to fill in the gap.

Part II is an exploratory experiment. Its conclusions can serve as a direction for further research, but they should not be relied on without substantial further verification. The factors found to be related to isothermality are cropland_cover, tree_canopy_cover and temp_annual_range (annual temperature fluctuation range), which is consistent with geographical knowledge.

Finally, the analysis based on the Pearson correlation coefficient and the heat map suggests elevation and rain_mean_annual (average annual precipitation) as potential factors, which is a direction in which the experiment could continue. In addition, a nonlinear relationship could be modeled, and the analysis could be strengthened further by exploiting the fitting capacity of neural networks.

Posted on Wed, 17 Jun 2020 02:59:31 -0400 by emceej