How can we track novel coronavirus pneumonia data around the world with Python?

During a large-scale epidemic, information of all kinds spreads faster than the coronavirus (COVID-19) itself, which makes it hard to tell which information is actually useful. What is clear is that we need to understand the actual statistics for the areas we live in.

Today we'll use Python to show you an interesting way to get coronavirus numbers from around the world.

I'll show you how to receive an email every day with the number of people affected by coronavirus and similar statistics.

I'm going to use a technique called web scraping, with Selenium and Python.

Let's get to the point.

Preparation

First of all, we have to find a data source: where does the data come from? I decided to use Worldometers, because its data is relatively accurate and the page is intuitive and concise.

The site presents a table that shows the data for each affected country, with a different figure in each of its many columns.

What we want to do is read the data for the country you choose from the table and then have it emailed to you automatically.

Setting up the environment

First, you need to install ChromeDriver (https://chromedriver.chromium.org/), which lets us control the browser and send commands to it for testing and automation.

Open the link, download the file for your operating system, and extract it. I suggest opening the download folder, right-clicking the archive, and choosing "Extract here".

In this folder, there is a file called "chromedriver". We must move it to a specific folder on your computer.

Open the terminal and enter the following commands:

sudo su  # enter root mode
cd       # go to the home directory from the current location
mv /home/*your_pc_name*/Downloads/chromedriver /usr/local/bin  # move the file to the right location

Replace *your_pc_name* with the actual name of your home folder (your user name), without the asterisks.
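To check that the move worked, you can ask Python whether the driver is visible on your PATH. This is only a small sketch using the standard library; the helper name find_driver is my own and not part of Selenium.

```python
import shutil

def find_driver(name="chromedriver"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)

path = find_driver()
if path:
    print(f"chromedriver found at {path}")
else:
    print("chromedriver is not on PATH; check the mv step above")
```

If the second message is printed, /usr/local/bin is probably missing from your PATH or the file was not moved.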

When you are finished, open your editor. My personal choice is Visual Studio Code: it's easy to use, customizable, and lightweight.

Open a new project and create two new files wherever you like. The script used in the rest of this article is called coronavirus.py.

In VS Code, there is a "Terminal" tab that opens an integrated terminal, which is very useful for keeping everything in one place.

When you open it, there are only a few things to install: virtualenv for the virtual environment and Selenium to drive the browser.

Enter the following commands into the terminal:

pip3 install virtualenv
virtualenv venv            # create the environment in ./venv
source venv/bin/activate   # activate it
pip3 install selenium

After activating the virtual environment, we are ready.
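If you are not sure whether the environment is really active, a quick check from Python can help. This is a sketch that assumes a reasonably recent virtualenv or venv; the function name in_virtualenv is made up for this example.

```python
import sys

def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Recent virtualenv/venv versions set sys.base_prefix to the system Python;
    # very old virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base

print("virtual environment active:", in_virtualenv())
```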

Code

Now that we have determined what we want and where we will get it, we have to do the "how" part.

Create a class for our bot and launch the Chrome driver.

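from selenium import webdriver
from selenium.webdriver.common.by import By  # locator strategies for element lookups

class Coronavirus():
    def __init__(self):
        self.driver = webdriver.Chrome()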

That's all we need to start developing. Now go to your terminal and type:

python -i coronavirus.py

This command lets us use the file as an interactive playground. Once a Coronavirus object is created, a new browser window opens and we can start issuing commands to it.

If you want to experiment, you can type commands in the interactive terminal instead of in the source file; just use bot instead of self.

Terminal:

bot = Coronavirus()
bot.driver.get('https://www.worldometers.info/coronavirus/')

Source code:

self.driver.get('https://www.worldometers.info/coronavirus/')

Once the page has loaded, we take the table as a web element and save it in table. To find this element on the page, we look it up by XPath and filter on the id it defines.

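# Needs: from selenium.webdriver.common.by import By (find_element_by_xpath() was removed in Selenium 4.3)
table = self.driver.find_element(By.XPATH, '//*[@id="main_table_countries"]/tbody[1]')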

In this table, we need to find the country cell, to make sure we read the data for the country we originally wanted.

# The leading "." keeps the search inside the table (By is selenium.webdriver.common.by.By)
country_element = table.find_element(By.XPATH, ".//td[contains(text(), 'China')]")

Again we use XPath, with "China" as an example.

Because we need the data next to "China", we have to make sure it belongs to that row, which is why we get the parent element of the country element.

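# "./.." selects the parent of the cell, i.e. its table row (By is selenium.webdriver.common.by.By)
row = country_element.find_element(By.XPATH, "./..")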
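The three lookups above (table by id, cell by text, parent row) can be tried without a browser. The sketch below uses the standard library's xml.etree.ElementTree on a tiny hand-written table; the HTML and its numbers are made up, and the helper find_row is hypothetical. ElementTree only supports a subset of XPath, so it matches the cell text exactly instead of using contains().

```python
import xml.etree.ElementTree as ET

# Made-up miniature of the page; the numbers are placeholders, not real data.
SAMPLE = """
<html><body>
  <table id="main_table_countries">
    <tbody>
      <tr><td>China</td><td>100</td><td>+5</td></tr>
      <tr><td>Italy</td><td>200</td><td>+7</td></tr>
    </tbody>
  </table>
</body></html>
"""

def find_row(tbody, country):
    """Return the <tr> that has a cell whose text equals `country`, or None."""
    # ".//td[.='X']" finds the matching cell; "/.." steps up to its parent row.
    return tbody.find(f".//td[.='{country}']/..")

page = ET.fromstring(SAMPLE.strip())
# Filter on the id attribute, then take the first <tbody> under that table.
tbody = page.find(".//*[@id='main_table_countries']/tbody[1]")
row = find_row(tbody, "China")
cells = [td.text for td in row]
print(cells)
```

A country that is not in the table gives None, so real code should check the result before reading the cells.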

In this row, we get all the data we need. We split the row's text into columns and save them to variables.

# Assumes a one-word country name and no empty cells; otherwise the indexes may shift
data = row.text.split(" ")
total_cases = data[1]
new_cases = data[2]
total_deaths = data[3]
new_deaths = data[4]
active_cases = data[5]
total_recovered = data[6]
serious_critical = data[7]

Basically, data is a list that comes from splitting the string; we then assign its items to separate variables for later use.
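If you would rather look values up by name than by index, the same split can feed a dictionary. This is a sketch only: the sample row text is invented, the names COLUMNS, parse_row and to_int are my own, and it keeps the same assumption as above (a one-word country name and no empty cells).

```python
# Column names in the order used in this article.
COLUMNS = ["country", "total_cases", "new_cases", "total_deaths",
           "new_deaths", "active_cases", "total_recovered", "serious_critical"]

def parse_row(text):
    """Split the row text on spaces and pair each piece with a column name."""
    return dict(zip(COLUMNS, text.split(" ")))

def to_int(value):
    """Turn strings like '1,000' or '+10' into an int; return None otherwise."""
    cleaned = value.replace(",", "").replace("+", "").strip()
    return int(cleaned) if cleaned.isdigit() else None

# Invented example row, shaped like row.text for a one-word country.
stats = parse_row("China 1,000 +10 50 +2 700 250 30")
print(stats["country"], to_int(stats["total_cases"]), to_int(stats["new_cases"]))
```

The values stay strings in the dictionary, which is what the email template needs; to_int is only there in case you want to do arithmetic on them.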

Send mail

We have to set up the account that sends the email: go to your Google account settings, open "App passwords", generate a new password, and use it in this little script.
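Since an app password is a secret, you may prefer not to type it into the script itself. One possible approach, sketched here, is to read it from environment variables; the variable names MAIL_USER and MAIL_APP_PASSWORD and the helper load_credentials are assumptions for this example, not something Gmail or Selenium defines.

```python
import os

def load_credentials(env=None):
    """Read the sender address and app password from environment variables."""
    env = os.environ if env is None else env
    user = env.get("MAIL_USER")              # hypothetical variable name
    password = env.get("MAIL_APP_PASSWORD")  # hypothetical variable name
    if not user or not password:
        raise RuntimeError("set MAIL_USER and MAIL_APP_PASSWORD first")
    return user, password
```

You would then call user, password = load_credentials() and pass both values to server.login().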

We also write a template for the email we will receive:

import smtplib

def send_mail(country_element, total_cases, new_cases, total_deaths, new_deaths, active_cases, total_recovered, serious_critical):
    # All arguments must be strings; pass country_element.text for the country name
    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.ehlo()
    server.starttls()  # switch to an encrypted connection before logging in
    server.ehlo()
    server.login('email', 'password')  # your Gmail address and the app password
    subject = 'Coronavirus stats in your country today!'
    body = ('Today in ' + country_element +
            '\nThere is new data on coronavirus:'
            '\nTotal cases: ' + total_cases +
            '\nNew cases: ' + new_cases +
            '\nTotal deaths: ' + total_deaths +
            '\nNew deaths: ' + new_deaths +
            '\nActive cases: ' + active_cases +
            '\nTotal recovered: ' + total_recovered +
            '\nSerious, critical cases: ' + serious_critical +
            '\nCheck the link: https://www.worldometers.info/coronavirus/')
    msg = f"Subject: {subject}\n\n{body}"
    server.sendmail(
        'Coronavirus',  # sender
        'email',        # recipient address
        msg
    )
    print('Hey Email has been sent!')
    server.quit()
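As an alternative to building the message string by hand, the standard library's email.message.EmailMessage can set the headers for you. This is a different technique from the script above, shown only as a sketch; build_message, the example.com addresses and the sample numbers are all made up.

```python
from email.message import EmailMessage

def build_message(sender, recipient, country, stats):
    """Build the daily report; `stats` maps a label to its value as a string."""
    msg = EmailMessage()
    msg["Subject"] = "Coronavirus stats in your country today!"
    msg["From"] = sender
    msg["To"] = recipient
    lines = [f"Today in {country}", "There is new data on coronavirus:"]
    lines += [f"{label}: {value}" for label, value in stats.items()]
    lines.append("Check the link: https://www.worldometers.info/coronavirus/")
    msg.set_content("\n".join(lines))
    return msg

# Placeholder addresses and numbers, for illustration only.
example = build_message("me@example.com", "me@example.com", "China",
                        {"Total cases": "1,000", "New cases": "+10"})
print(example)
```

With a logged-in smtplib.SMTP connection, server.send_message(example) would send it in place of server.sendmail(...).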

If you want this script to repeat every day, look at this link: https://stackoverflow.com/questions/15088037/python-script-to-do-something-at-the-same-time-every-day
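If you would rather keep everything in Python, one simple option is to sleep until the next run time. The sketch below only computes how long to wait; seconds_until is a made-up helper, and a system scheduler such as cron is probably more robust for real use.

```python
from datetime import datetime, timedelta

def seconds_until(hour, minute=0, now=None):
    """Seconds from `now` until the next occurrence of hour:minute."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # The time has already passed today, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()

# Possible daily loop (not executed here):
#   import time
#   while True:
#       time.sleep(seconds_until(8))
#       ...scrape the table and call send_mail(...)...
print(seconds_until(8, 0, now=datetime(2020, 3, 22, 7, 0)))
```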

Original link: https://towardsdatascience.com/how-to-track-coronavirus-with-python-a5320b778c8e


The above content comes from the Internet, was compiled by the Jingdong Cloud developer official account, and does not represent the position of Jingdong Cloud.

Tags: Programming Python Selenium Google

Posted on Sun, 22 Mar 2020 08:54:36 -0400 by ehask