Numerical analysis case: forecasting the 2019 temperatures of Asian cities with Newton interpolation, and solving for the coefficients of urban isothermality factors with Crout decomposition
1, Experiment purpose and data source
1. Overview of research issues:
This article studies the practical application of interpolation to predicting urban temperature change, and solves a system of linear equations for factor coefficients in order to explore which factors are correlated with urban isothermality.
2. Data source:
- World Average Temperature (https://www.kaggle.com/efradgamer/world-average-temperature)
- Global environmental factors (https://www.kaggle.com/sadeka007/global-environmental-factors)
2, Experiment content
Part I: "Forecasting 2019 urban temperatures by Newton interpolation"
Step 1:
Randomly select temperature data for a number of Asian cities, making sure that their locations are spread out in longitude and latitude so that the sample covers the whole region. The selected cities are: Beijing, Chongqing, Taipei, Tokyo, Sapporo, Seoul, Dikson, Vladivostok, Chiang Mai, Hat Yai, Mumbai, Da Nang, Hanoi, Erzurum.
The code is as follows:
# -*- coding:utf-8 -*-
import numpy as np
import pandas as pd
import csv
import matplotlib.pyplot as plt
%matplotlib inline

path = 'DATA.csv'
data = pd.read_csv(path, index_col=0)
data.head(15)
Step 2:
Visualizing the data makes the temperature contrast between cities easy to see. The monthly temperatures of each city are drawn as a line chart as follows:
The code is as follows:
# Convert the DataFrame to an np.array of floats
Data = np.array(data.iloc[:,0:12], dtype=float)
print(Data)

# Take the 12 monthly temperature values of each city
Beijing = Data[0,:]
Chongqing = Data[1,:]
Taipei = Data[2,:]
Tokyo = Data[3,:]
Sapporo = Data[4,:]
Seoul = Data[5,:]
Dikson = Data[6,:]
Vladivostok = Data[7,:]
ChiangMai = Data[8,:]
HatYai = Data[9,:]
DaNang = Data[10,:]
Hanoi = Data[11,:]
Mumbai = Data[12,:]
Erzurum = Data[13,:]

# Plot the annual temperature change of every city as a line chart
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
plt.figure(figsize=(13, 13))
plt.rcParams['font.sans-serif']=['SimHei']  # Used to display Chinese labels normally
plt.rcParams['axes.unicode_minus']=False  # Used to display the minus sign normally
plt.plot(x, Beijing, ms=2, label="Beijing")
plt.plot(x, Chongqing, ms=2, label="Chongqing")
plt.plot(x, Taipei, ms=2, label="Taipei")
plt.plot(x, Tokyo, ms=2, label="Tokyo")
plt.plot(x, Sapporo, ms=2, label="Sapporo")
plt.plot(x, Seoul, ms=2, label="Seoul")
plt.plot(x, Dikson, ms=2, label="Dikson")
plt.plot(x, Vladivostok, ms=2, label="Vladivostok")
plt.plot(x, ChiangMai, ms=2, label="Chiang Mai")
plt.plot(x, HatYai, ms=2, label="Hat Yai")
plt.plot(x, DaNang, ms=2, label="Da Nang")
plt.plot(x, Hanoi, ms=2, label="Hanoi")
plt.plot(x, Mumbai, ms=2, label="Mumbai")
plt.plot(x, Erzurum, ms=2, label="Erzurum")
plt.xticks(rotation=45)
plt.xlabel("Month")
plt.ylabel("Temperature/degree")
plt.title("2019 temperature change in Asian cities")
plt.legend(loc="upper left")

# Display each value on the line chart; ha controls the horizontal alignment, va the vertical alignment
for y in [Beijing, Chongqing, Taipei, Tokyo, Sapporo, Seoul, Dikson, Vladivostok,
          ChiangMai, HatYai, DaNang, Hanoi, Mumbai, Erzurum]:
    for x1, yy in zip(x, y):
        plt.text(x1, yy + 1, str(yy), ha='center', va='bottom', fontsize=12, rotation=0)
plt.savefig("a.jpg")
plt.show()
Step 3:
Data division: the city temperature data are divided into a training set and a test set. To keep the sampling stratified across the year, the training set takes the temperatures of January, March, May, July, September and November, and the test set takes those of February, April, June, August, October and December.
The code is as follows:
# train_data
Beijing_train = Data[0,[0,2,4,6,8,10]]
Chongqing_train = Data[1,[0,2,4,6,8,10]]
Taipei_train = Data[2,[0,2,4,6,8,10]]
Tokyo_train = Data[3,[0,2,4,6,8,10]]
Sapporo_train = Data[4,[0,2,4,6,8,10]]
Seoul_train = Data[5,[0,2,4,6,8,10]]
Dikson_train = Data[6,[0,2,4,6,8,10]]
Vladivostok_train = Data[7,[0,2,4,6,8,10]]
ChiangMai_train = Data[8,[0,2,4,6,8,10]]
HatYai_train = Data[9,[0,2,4,6,8,10]]
DaNang_train = Data[10,[0,2,4,6,8,10]]
Hanoi_train = Data[11,[0,2,4,6,8,10]]
Mumbai_train = Data[12,[0,2,4,6,8,10]]
Erzurum_train = Data[13,[0,2,4,6,8,10]]

# test_data
Beijing_test = Data[0,[1,3,5,7,9,11]]
Chongqing_test = Data[1,[1,3,5,7,9,11]]
Taipei_test = Data[2,[1,3,5,7,9,11]]
Tokyo_test = Data[3,[1,3,5,7,9,11]]
Sapporo_test = Data[4,[1,3,5,7,9,11]]
Seoul_test = Data[5,[1,3,5,7,9,11]]
Dikson_test = Data[6,[1,3,5,7,9,11]]
Vladivostok_test = Data[7,[1,3,5,7,9,11]]
ChiangMai_test = Data[8,[1,3,5,7,9,11]]
HatYai_test = Data[9,[1,3,5,7,9,11]]
DaNang_test = Data[10,[1,3,5,7,9,11]]
Hanoi_test = Data[11,[1,3,5,7,9,11]]
Mumbai_test = Data[12,[1,3,5,7,9,11]]
Erzurum_test = Data[13,[1,3,5,7,9,11]]

# Column subscripts start from 0: subscripts 0,2,4,6,8,10 correspond to January, March, May,
# July, September and November, while subscripts 1,3,5,7,9,11 correspond to February, April,
# June, August, October and December
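The per-city index lists above can also be written once for all cities with strided slicing. The following is only a minimal sketch: `demo` is a hypothetical stand-in array with made-up values (one row per city, one column per month), not the contents of DATA.csv.

```python
import numpy as np

# Hypothetical stand-in for Data: 2 cities x 12 months (values are made up)
demo = np.arange(24, dtype=float).reshape(2, 12)

# Columns 0, 2, ..., 10 -> January, March, ..., November (training months)
train = demo[:, 0::2]
# Columns 1, 3, ..., 11 -> February, April, ..., December (test months)
test = demo[:, 1::2]

print(train.shape, test.shape)
```

With this layout, row i of `train` and `test` plays the role of the i-th city's `_train` and `_test` arrays.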
Step 4:
The interpolation method was chosen with the following in mind: Lagrange interpolation is prone to the Runge phenomenon, Hermite interpolation requires derivative values, and piecewise interpolation fits weakly. On balance, Newton's divided-difference interpolation was selected for the prediction.
Taking Beijing as an example, the temperatures in February, April, June, August, October and December are predicted as follows:
The code is as follows:
# Newton divided-difference interpolation
import numpy as np

def Qda_Table():
    # Fill rows 2..n of the table with divided differences of increasing order
    for i in range(n+1):
        for j in range(n):
            if initial[i][j] is not None or i >= j + 2:
                continue
            else:
                initial[i][j] = (initial[i - 1][j] - initial[i - 1][j - 1]) / (
                    initial[0][j] - initial[0][j - i + 1])
                # print("the current value of (%d, %d) is: " % (i, j) + str(initial[i][j]))

def calculate_factors(no, vari):
    # Product (vari - x[0]) * ... * (vari - x[no-2]) that multiplies the no-th coefficient
    factor = 1
    if no == 1:
        return factor
    else:
        for i in range(no-1):
            factor = factor * (vari-x[i])
        return factor

def final_calculate(k):
    # Sum the terms of the Newton polynomial at the forecast month v[k]
    result = 0
    for time in range(n):
        result = result + initial[time+1][time] * calculate_factors(time+1, v[k])
    return result

x = [1, 3, 5, 7, 9, 11]  # Months used as interpolation nodes
y = Beijing_train.copy()
n = len(x)
v = [2, 4, 6, 8, 10, 12]  # Months to forecast
y_predict = []  # Stores the predicted temperatures
initial = []  # Divided-difference table
for p in range(n+1):
    initial.append([])
    for q in range(n):
        initial[p].append(None)
# The table is an (n+1) x n two-dimensional list with every element initialized to None
for k in range(n):
    initial[0][k] = x[k]
    initial[1][k] = y[k]
# print("the initialized table is as follows:")
# print(initial)
# Rows 0 and 1 now hold x and y, which completes the initialization of the table
Qda_Table()  # Compute the divided differences and fill in the table
QDA_table = np.array(initial).T  # Transposed table for display
# print("the final difference table is as follows:")
# print(QDA_table)
for i in range(n):
    y_predict.append(np.around(final_calculate(i), decimals=4))
print("The temperature information for the forecast month is as follows:")
print(y_predict)
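The same technique can be written more compactly with NumPy. The sketch below is an illustration rather than the article's own implementation: the function names `divided_differences` and `newton_eval` are hypothetical, and the check data come from a known quadratic instead of the temperature table, so the held-out values can be compared exactly.

```python
import numpy as np

def divided_differences(xs, ys):
    """Newton coefficients f[x0], f[x0,x1], ..., f[x0,...,xn]."""
    xs = np.asarray(xs, dtype=float)
    coef = np.array(ys, dtype=float)  # copy; starts as the order-0 differences
    n = len(xs)
    for order in range(1, n):
        # Entries order..n-1 become the divided differences of the next order
        coef[order:] = (coef[order:] - coef[order - 1:-1]) / (xs[order:] - xs[:n - order])
    return coef

def newton_eval(xs, coef, t):
    """Evaluate the Newton form at t with a Horner-style nested product."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (t - xs[k]) + coef[k]
    return result

# Made-up check data sampled from p(m) = m**2 - 3*m + 2 (not temperatures)
def p(m):
    return m**2 - 3*m + 2

nodes = [1, 3, 5, 7, 9, 11]      # odd months, used as interpolation nodes
targets = [2, 4, 6, 8, 10, 12]   # even months, held out
coef = divided_differences(nodes, [p(m) for m in nodes])
predicted = [newton_eval(nodes, coef, t) for t in targets]
max_abs_error = max(abs(a - p(t)) for a, t in zip(predicted, targets))
print(predicted, max_abs_error)
```

Replacing the made-up samples with a city's training values and comparing `predicted` against its test values gives the kind of error check described in the results section.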
Part II: "Solving for and analyzing the coefficients of the factors influencing urban isothermality by Crout decomposition"
Step 1:
Select the environmental factor data for the countries in which the Part I cities are located (Korea, South; Japan; Turkey; India; Vietnam; Thailand; China; Russia). The environmental data contain many items, and a system of linear equations will later be built to solve for the factor coefficients and analyze their correlation, so after consulting references and screening, the following 8 environmental factors were selected as indicators for the analysis of urban isothermality:
elevation
cropland_cover (farmland coverage)
tree_canopy_cover (vegetation coverage)
rain_mean_annual (average annual precipitation)
rain_seasonality (precipitation seasonality)
temp_annual_range (annual temperature fluctuation range)
cloudiness (cloud volume)
temp_mean_annual (annual average temperature)
The detailed data are as follows:
The code is as follows:
# Data visualization
path = 'ENVI.csv'
source = pd.read_csv(path, index_col=0)
source.head(15)
Step 2:
A system of linear equations is constructed under the assumption that the candidate factors are linearly related to isothermality. The correlation between each factor and isothermality is then analyzed preliminarily from the computed coefficient: if |coefficient| is approximately 0, the correlation is low; if |coefficient| is greater than or equal to 0.5, the correlation is judged to be high. The system is solved by Crout decomposition.
Independent variables: elevation, cropland_cover, tree_canopy_cover, rain_mean_annual, rain_seasonailty, temp_annual_range, cloudiness, temp_mean_annual
Dependent variable: isothermality
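The decision rule on the coefficients can be sketched as a small helper. This is an illustration under stated assumptions: the text only says "approximately 0", so the tolerance `low_tol=0.05` is a hypothetical choice, as are the function name and the sample coefficients.

```python
def classify_coefficient(c, low_tol=0.05, high=0.5):
    """Label a coefficient by its absolute value.

    low_tol is an assumed cut-off for "approximately 0"; high=0.5 is the
    threshold stated in the text. Values in between are left undetermined.
    """
    a = abs(c)
    if a >= high:
        return "high"
    if a <= low_tol:
        return "low"
    return "undetermined"

# Made-up coefficients, for illustration only
labels = [classify_coefficient(c) for c in (-0.8, 0.01, 0.3, 0.5)]
print(labels)
```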
The results are as follows:
The code is as follows:
# Solution by Crout decomposition (A = L U, with U unit upper triangular)
import numpy as np

def UAndL_Figure(k):
    # Row k of U: divide the entries to the right of the pivot by the pivot A[k][k]
    for j in range(k + 1, mu):
        U[k][j] = A[k][j] / A[k][k]
    # Column elimination: subtract U[k][j] times column k from every later column j,
    # which zeroes row k to the right of the diagonal
    for j in range(k + 1, mu):
        for i in range(nu):
            A[i][j] = A[i][j] - U[k][j] * A[i][k]

mu, nu = 8, 8  # The number of equations and independent variables is 8
source = np.array(source.iloc[:,0:9], dtype=float)  # Convert data to float
A = source[:, 0:mu].copy()   # Coefficient matrix: the 8 factor columns
U = np.eye(nu, dtype=float)  # Float identity, so the multipliers are not truncated to integers
b_Const = source[:,[8]]      # Dependent variable (isothermality) as a column vector
for i in range(nu):
    UAndL_Figure(i)
L = A.copy()  # After elimination, A has been reduced to the lower triangular factor L
y = np.linalg.solve(L, b_Const)  # Solve L y = b
x = np.linalg.solve(U, y)        # Solve U x = y
x = np.around(x, decimals=3)
# print("x is calculated as follows:")
# print(x)
t = [i for item in x for i in item]  # Flatten the two-dimensional x into one dimension
print("The results of the coefficients of the possible factors are as follows:")
print(t)
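For comparison, a self-contained sketch of Crout decomposition with explicit forward and back substitution follows. It assumes a square matrix whose pivots are nonzero (no pivoting is done); the names `crout` and `crout_solve` and the 3x3 test system are hypothetical and are not taken from the ENVI.csv data.

```python
import numpy as np

def crout(A):
    """Crout factorisation A = L @ U with U unit upper triangular (no pivoting)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.zeros((n, n))
    U = np.eye(n)
    for k in range(n):
        # Column k of L, on and below the diagonal
        for i in range(k, n):
            L[i, k] = A[i, k] - L[i, :k] @ U[:k, k]
        # Row k of U, to the right of the diagonal
        for j in range(k + 1, n):
            U[k, j] = (A[k, j] - L[k, :k] @ U[:k, j]) / L[k, k]
    return L, U

def crout_solve(A, b):
    """Solve A x = b for a 1-D right-hand side b using the Crout factors."""
    L, U = crout(A)
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                  # forward substitution: L y = b
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):      # back substitution: U x = y (unit diagonal)
        x[i] = y[i] - U[i, i + 1:] @ x[i + 1:]
    return x

# Made-up 3x3 system with known solution [1, 2, 3]
A_demo = np.array([[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]])
b_demo = A_demo @ np.array([1.0, 2.0, 3.0])
L_demo, U_demo = crout(A_demo)
x_demo = crout_solve(A_demo, b_demo)
print(x_demo)
```

Checking that `L_demo @ U_demo` reproduces `A_demo` is a quick way to validate a factorisation before trusting the solved coefficients.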
Step 3:
Solving for the linear coefficients in Step 2 gives the factors that are highly correlated with isothermality: cropland_cover, tree_canopy_cover and temp_annual_range. Visualize the data of all selected countries on these indicators:
Taking the cropland_cover factor as an example, the code is as follows:
# Convert the DataFrame to an np.array of floats (skipped if source was already converted above)
if isinstance(source, pd.DataFrame):
    source = np.array(source.iloc[:,0:9], dtype=float)
print(source)

# Set matplotlib to display Chinese and the minus sign normally
plt.rcParams['font.sans-serif']=['SimHei']
plt.rcParams['axes.unicode_minus']=False

# Generate canvas
plt.figure(figsize=(20, 8), dpi=70)

# Abscissa: country names
country_name = ['Korea, South','Japan','Turkey','India','Vietnam','Thailand','China','Russia']
# Ordinate: cropland_cover (column 1)
y = source[:,1]
x = range(len(country_name))
plt.bar(x, y, color=['b','r','g','y','c','m','y','k'])
plt.xticks(x, country_name)
plt.xlabel("Country")
plt.ylabel("Cropland_cover")
plt.title("2019 cropland_cover of the Asian countries")

# Display each value on the bar chart; ha controls the horizontal alignment, va the vertical alignment
for x1, yy in zip(x, y):
    plt.text(x1, yy + 1, str(yy), ha='center', va='bottom', fontsize=12, rotation=0)
plt.show()
Step 4:
Because the number of selected countries and of selected environmental data items is limited, the relationship between urban isothermality and the other environmental factors cannot be established directly, so the potential factors are analyzed with a heat map. The analysis is based on the Pearson correlation coefficient, a statistic that reflects the degree of linear correlation between two variables.
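As a reminder of what the Pearson correlation coefficient measures, here is a minimal sketch that computes it from its definition and cross-checks it against `np.corrcoef`. The helper name `pearson` and the sample vectors are hypothetical and are not taken from the dataset.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da = a - a.mean()
    db = b - b.mean()
    return float((da @ db) / np.sqrt((da @ da) * (db @ db)))

# Made-up vectors: one perfectly increasing with u, one perfectly decreasing
u = [1.0, 2.0, 3.0, 4.0]
r_up = pearson(u, [2.0, 4.0, 6.0, 8.0])
r_down = pearson(u, [8.0, 6.0, 4.0, 2.0])
r_numpy = float(np.corrcoef(u, [2.0, 4.0, 6.0, 8.0])[0, 1])
print(r_up, r_down, r_numpy)
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates little linear relationship.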
The heat map is explained as follows:
The horizontal axis indexes are elevation, rain_mean_annual, rain_seasonailty, cloudiness, temp_mean_annual and isothermality; the vertical axis entries are Korea, South, Japan, Turkey, India, Vietnam, Thailand, China and Russia.
The heat map is used to explore whether different countries show correlation on the same environmental factors. It suggests that elevation and rain_mean_annual are potentially relevant across the statistics of the different countries.
This heat map does not cover the cropland_cover, tree_canopy_cover and temp_annual_range factors.
The code is as follows:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

f, ax = plt.subplots(figsize = (6,6))
# Display heat map (rows are countries, columns are the selected environmental attributes)
sns.heatmap(source[:,[0,3,4,6,7,8]], annot=True, fmt='.2f', cbar=True)
ax.set_title('Country Statistics')
ax.set_xlabel('Index Of Environmental Attributes')
ax.set_ylabel('Country')
plt.show()
3, Experimental results and analysis
"Forecasting 2019 urban temperatures by Newton interpolation": this part uses Newton interpolation to predict the 2019 temperatures of Asian cities. In testing, the interpolation results were relatively good, with small errors against the recorded statistics. Practical significance: when the temperature data for a monitored month are missing, Newton interpolation can be used to fill in the gap.
The second part is an exploratory experiment. Its conclusions can serve as a direction for further research, but they should not be generalized without extensive verification. The factors computed as likely related to isothermality are cropland_cover, tree_canopy_cover and temp_annual_range (annual temperature range), which is consistent with geographical knowledge.
Finally, the analysis based on the Pearson correlation coefficient and the heat map points to elevation and rain_mean_annual (mean annual precipitation) as potential factors, which is a direction in which the experiment can continue. In addition, nonlinear relationships could be modeled, for example by using the fitting capacity of neural networks for further analysis.