Real-world data visualization with Python's matplotlib module (hands-on chapter)

Supporting resources

I know all too well how hard it is to find the supporting files for Python programming on the Internet, so here is a download link (your support is appreciated (● '◡' ●)):
Baidu cloud link: https://pan.baidu.com/s/1-XE0pBS8IaDLoUBdO8hDOw Password: n39g

CSV file format

CSV stands for comma-separated values (sometimes called character-separated values, because the separator does not have to be a comma). A CSV file stores tabular data (numbers and text) in plain text, meaning the file is a sequence of characters and contains no data that must be interpreted as binary numbers. A CSV file consists of any number of records separated by line breaks. Each record consists of fields, and the fields are separated by some other character or string, most commonly a comma or a tab. Generally, all records have the same sequence of fields.
For example:

Country,Indicator,Year,Value
AFG,NGDP_R,2002,183.26
AFG,NGDP_R,2003,198.736
AFG,NGDP_R,2004,200.069
AFG,NGDP_R,2005,223.737
AFG,NGDP_R,2006,235.731
AFG,NGDP_R,2007,267.177
AFG,NGDP_R,2008,277.498
AFG,NGDP_R,2009,334.621
AFG,NGDP_R,2010,362.857
AFG,NGDP_R,2011,386.368
AFG,NGDP_R,2012,440.336
AFG,NGDP_R,2013,456.453
.......................
.......................
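
As a minimal sketch of how such records map to Python values, the snippet below parses the first few sample rows above with the standard library's csv module. The `sample` string and the use of `io.StringIO` in place of a real file are assumptions made for illustration.

```python
import csv
import io

# A few records copied from the example above; io.StringIO stands in for a file on disk.
sample = """Country,Indicator,Year,Value
AFG,NGDP_R,2002,183.26
AFG,NGDP_R,2003,198.736
"""

reader = csv.reader(io.StringIO(sample))
header = next(reader)   # the first record holds the field names
records = list(reader)  # every remaining record is a list of strings

print(header)
print(records[0])

# Fields are always read as text, so numbers must be converted explicitly.
first_value = float(records[0][3])
print(first_value)
```

Every field comes back as a string, which is why the plotting code later in this article converts temperatures with int() before using them.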

Using the csv module to get weather data

First, put the sitka_weather_07-2014.csv file in the project directory, then use the csv module from the Python standard library to parse the data lines (the records mentioned above) in the CSV file, so that we can quickly extract the values of interest. The code is as follows:

import csv

# A comma-separated CSV file containing weather data for Sitka in July 2014
filename = 'sitka_weather_07-2014.csv'

# Open the CSV file and create a csv.reader object. The reader is an iterator, so you can traverse it with a for loop
# or call the built-in next() function to get the next line. Each record in the reader can be traversed only once. The reader's line_num attribute gives the number of source lines read so far.
with open(filename) as f:
    reader = csv.reader(f)  # Returns a csv reader object; printing it does not show the file contents
    print(reader)  # Prints something like <_csv.reader object at 0x000002d1b24912b8>
    head_row = next(reader)  # Returns the first record (the header) as a list of strings
    print(head_row)  # Prints ['AKDT', 'Max TemperatureF', ...]
    for row in reader:
        # next() has already consumed the header, so line_num starts at 2 here
        print(reader.line_num, row)  # Prints 2 ['2014-7-1', '64', '56', '50', ...]

The output is as follows:

We now have the contents of the CSV file, but note that each record in the reader can be traversed only once. Adding one line of code makes this clearer:

import csv

filename = 'sitka_weather_07-2014.csv'

with open(filename) as f:
    reader = csv.reader(f)
    print(reader)
    print(list(reader))  # list(reader) consumes every remaining record, so the next(reader) call below raises StopIteration
    head_row = next(reader)
    print(head_row)
    for row in reader:
        print(reader.line_num, row)

The output is as follows:

Analyzing the result: the built-in list function presumably traverses the records in the reader object, adds each one to a list, and then returns the whole list. In other words, list has already traversed the reader once and exhausted it. Because each record in the reader can be traversed only once, the following next(reader) call has nothing left to return and raises StopIteration, and traversing the reader again would yield no records.
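
The one-pass behavior described above can be checked with a short sketch. The in-memory `data` string is a made-up stand-in for the weather file, and storing the records in a list is just one possible way to reuse them, not something the original code does.

```python
import csv
import io

# Hypothetical in-memory data standing in for the weather file.
data = "AKDT,Max TemperatureF\n2014-7-1,64\n2014-7-2,71\n"

reader = csv.reader(io.StringIO(data))
rows = list(reader)      # consumes every record in the reader
leftover = list(reader)  # the reader is now exhausted, so this list is empty

try:
    next(reader)
    exhausted = False
except StopIteration:
    exhausted = True

# Keeping the records in a list lets us reuse them as often as needed.
header, body = rows[0], rows[1:]
print(header, body, leftover, exhausted)
```

If the records are needed more than once, converting the reader to a list first and then working with that list avoids the problem.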

Drawing a line chart of daily maximum temperatures in Sitka, Alaska, in July 2014

Now that we know how to get the weather data, we will use the pyplot module to draw a line chart of the daily maximum temperatures in Sitka, Alaska, in July 2014. The code is as follows (the comments help explain it):

# Plot weather data for Sitka in July
# Import the csv module from the standard library to parse the data lines in the CSV file and extract the values we are interested in
import csv
# Import the pyplot module from the matplotlib package to visualize the daily maximum temperatures
from matplotlib import pyplot as plt

from datetime import datetime

# A CSV file containing city weather information, separated by commas
filename = 'sitka_weather_07-2014.csv'

# Open the csv file and instantiate a reader object of the csv module
with open(filename) as f:
    reader = csv.reader(f)
    head_row = next(reader)  # Consume the header row, returned as a list of strings
    highs = []  # Stores the daily maximum temperatures
    dates = []  # Stores the dates
    for row in reader:  # The header was consumed by next(), so the loop starts at the second line; each row is a list of strings
        current_date = datetime.strptime(row[0], "%Y-%m-%d")  # Convert the date string to a datetime object
        dates.append(current_date)
        high = int(row[1])  # Convert the maximum temperature from a string to a number
        highs.append(high)  # highs now holds numbers rather than strings, so it can be used for plotting
fig = plt.figure(dpi=128, figsize=(10, 6))
plt.plot(dates, highs, c='red')
plt.title("Daily high temperatures, July 2014", fontsize=24)
plt.xlabel('', fontsize=16)
fig.autofmt_xdate()  # Draw the date labels diagonally so they do not overlap
plt.ylabel("Temperature (F)", fontsize=16)
plt.tick_params(axis='both', which='major', labelsize=14)
plt.show()

The output is as follows:

Looking closely at the result, the x-axis differs slightly from the figure in the book. This is not a problem: the plotted data still starts in early July, and the x-axis simply extends back into June.
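
The date conversion used in the code above can be tried on its own. This is a small sketch with made-up date strings in the same style as the weather file. It only illustrates datetime.strptime and is not part of the original program.

```python
from datetime import datetime

# Date strings in the style used by the weather file (months and days are not zero-padded).
raw_dates = ["2014-7-1", "2014-7-15", "2014-07-31"]

parsed = [datetime.strptime(text, "%Y-%m-%d") for text in raw_dates]
for text, value in zip(raw_dates, parsed):
    print(text, "->", value.date())

# datetime objects support comparison and arithmetic, which plain strings do not.
span_days = (parsed[-1] - parsed[0]).days
print(span_days)
```

Because the x values are real datetime objects rather than strings, matplotlib can space them correctly along a time axis.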

Drawing a line chart of 2014 daily maximum and minimum temperatures in Sitka, Alaska

Previously, we drew a line chart of the daily maximum temperatures in Sitka, Alaska, in July. Now we will read the weather data for the whole of 2014 and draw a line chart of both the daily maximum and minimum temperatures. First put the sitka_weather_2014.csv file in the project directory. The code is as follows:

# Plot weather data for Sitka for the whole year, without error checking
# Import the csv module from the standard library to parse the data lines in the CSV file and extract the values we are interested in
import csv
# Import the pyplot module from the matplotlib package to visualize the daily maximum and minimum temperatures
from matplotlib import pyplot as plt

from datetime import datetime

# A CSV file containing city weather information, separated by commas
filename = 'sitka_weather_2014.csv'

# Open the csv file and instantiate a reader object of the csv module
with open(filename) as f:
    reader = csv.reader(f)
    head_row = next(reader)  # Consume the header row, returned as a list of strings
    highs = []  # Stores the maximum temperature for each day of the year
    dates = []  # Stores the dates
    lows = []  # Stores the minimum temperature for each day of the year
    for row in reader:  # The header was consumed by next(), so the loop starts at the second line; each row is a list of strings
        current_date = datetime.strptime(row[0], "%Y-%m-%d")
        dates.append(current_date)
        high = int(row[1])  # Convert the maximum temperature from a string to a number
        highs.append(high)  # highs now holds numbers rather than strings, so it can be used for plotting
        low = int(row[3])
        lows.append(low)
fig = plt.figure(dpi=128, figsize=(10, 6))
plt.plot(dates, highs, c='red', alpha=0.5)  # Draw the daily maximum temperatures for the whole year
plt.plot(dates, lows, c='blue', alpha=0.5)  # Draw the daily minimum temperatures for the whole year
plt.fill_between(dates, highs, lows, facecolor='blue', alpha=0.1)  # Shade the area between the daily maximum and minimum temperatures
plt.title("Daily high and low temperatures - 2014", fontsize=24)
plt.xlabel('', fontsize=16)
fig.autofmt_xdate()
plt.ylabel("Temperature (F)", fontsize=16)
plt.tick_params(axis='both', which='major', labelsize=14)
plt.show()

The output is as follows:

Weather stations occasionally fail to collect all the data they should. If we use the previous code to read the weather data for such a city, a ValueError is raised, because the CSV file may be missing a day's data and int() cannot convert an empty field. As shown in the figure below, the data for one day is missing:

Therefore, in the next section we will use exception handling to guard against this when reading the full-year weather data for Death Valley.
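
To see why a missing value leads to a ValueError, consider the following minimal sketch. The `row` list is a made-up record with empty temperature fields. It is not taken from the actual data file.

```python
# Hypothetical record in which the temperature fields were not recorded.
row = ["2014-2-16", "", "", ""]

try:
    high = int(row[1])  # int('') cannot be converted and raises ValueError
except ValueError as error:
    high = None
    message = str(error)

print(high, message)
```

Without the try/except block, the same int() call would stop the whole program at the first incomplete record.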

Drawing a line chart of 2014 daily maximum and minimum temperatures in Death Valley

As mentioned above, problems such as missing data can occur in practice. As developers, we must anticipate the problems that may arise and handle them in a practical way. We will use exception handling while drawing the line chart of daily maximum and minimum temperatures in Death Valley in 2014. The code is as follows:

# Import the csv module from the standard library to parse the data lines in the CSV file and extract the values we are interested in
import csv
# Import the pyplot module from the matplotlib package to visualize the daily maximum and minimum temperatures
from matplotlib import pyplot as plt

from datetime import datetime

# A CSV file containing the weather data for one location
filename = 'death_valley_2014.csv'

# Open the csv file and instantiate a reader object of the csv module
with open(filename) as f:
    reader = csv.reader(f)
    head_row = next(reader)  # Consume the header row, returned as a list of strings
    highs = []  # Stores the daily maximum temperatures
    dates = []  # Stores the dates
    lows = []  # Stores the daily minimum temperatures
    for row in reader:  # The header was consumed by next(), so the loop starts at the second line; each row is a list of strings
        try:
            current_date = datetime.strptime(row[0], "%Y-%m-%d")
            high = int(row[1])  # Raises ValueError if the field is empty
            low = int(row[3])
        except ValueError:
            print(current_date, 'missing data')  # Report the incomplete record and skip it
        else:
            dates.append(current_date)
            highs.append(high)  # Only complete records are stored, as numbers rather than strings, ready for plotting
            lows.append(low)
fig = plt.figure(dpi=128, figsize=(10, 6))
plt.plot(dates, highs, c='red', alpha=0.5)
plt.plot(dates, lows, c='blue', alpha=0.5)
plt.fill_between(dates, highs, lows, facecolor='blue', alpha=0.1)
plt.title("Daily high and low temperatures - 2014\nDeath Valley, CA", fontsize=20)
plt.xlabel('', fontsize=16)
fig.autofmt_xdate()
plt.ylabel("Temperature (F)", fontsize=16)
plt.tick_params(axis='both', which='major', labelsize=14)
plt.show()

The results are as follows (two figures):
(1) Line chart

(2) Terminal output

The two figures show that the temperature data for 2014-02-16 is missing, so the maximum and minimum temperatures for that day are not drawn in the line chart (the gap cannot be seen because the points are too dense).
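
The same try/except/else pattern can be wrapped in a function so that it is easy to test without a plot. The sketch below is one possible arrangement, not the original author's code: the function name `read_temperatures`, the `skipped` list, and the in-memory sample data are all assumptions made for illustration. The column positions (0 for the date, 1 for the maximum, 3 for the minimum) follow the code above.

```python
import csv
import io
from datetime import datetime


def read_temperatures(lines):
    """Return (dates, highs, lows, skipped) from an iterable of CSV lines."""
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    dates, highs, lows, skipped = [], [], [], []
    for row in reader:
        try:
            current_date = datetime.strptime(row[0], "%Y-%m-%d")
            high = int(row[1])
            low = int(row[3])
        except ValueError:
            skipped.append(row[0])  # remember which record was incomplete
        else:
            dates.append(current_date)
            highs.append(high)
            lows.append(low)
    return dates, highs, lows, skipped


# Made-up sample data; the second record has empty temperature fields.
sample = io.StringIO(
    "PST,Max TemperatureF,Mean TemperatureF,Min TemperatureF\n"
    "2014-2-15,70,58,46\n"
    "2014-2-16,,,\n"
    "2014-2-17,72,60,48\n"
)
dates, highs, lows, skipped = read_temperatures(sample)
print(highs, lows, skipped)
```

Because the function accepts any iterable of lines, an open file object could be passed in place of the StringIO sample, and the returned lists could then be handed to plt.plot and plt.fill_between as in the code above.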

Tags: Programming Python Attribute

Posted on Mon, 04 May 2020 09:24:50 -0400 by kamy99