Fetch COVID-19 Data Using COVID-19 API in Python
- Make sure Python is installed and the `requests` library is available for making HTTP requests. If it is not, install it with pip:
pip install requests
- Identify a suitable COVID-19 API provider. Several services, such as COVID19API, have offered free access to COVID-19 statistics. Check that the provider you choose is still operating and that it meets your requirements for coverage, data update frequency, and rate limits.
import requests

# Base URL of the COVID-19 API used in this example
BASE_URL = 'https://api.covid19api.com/'

def get_covid19_data(country=None):
    # Default to the global summary; use the per-country history when a country is given
    endpoint = f"{BASE_URL}summary"
    if country:
        endpoint = f"{BASE_URL}dayone/country/{country}"
    # A timeout keeps the call from hanging indefinitely on a stalled connection
    response = requests.get(endpoint, timeout=10)
    if response.status_code == 200:
        return response.json()
    raise Exception(f"Failed to fetch data: {response.status_code}")

# Fetch global COVID-19 summary
global_data = get_covid19_data()
print(global_data)

# Fetch country-specific COVID-19 data
country_data = get_covid19_data(country="us")
print(country_data)
Handle the JSON Response
- On a successful request, the API returns data in JSON format. Examine a sample of the returned JSON to learn its structure before extracting the fields you need.
- Use Python's built-in data structures or a library such as Pandas to manipulate the JSON data for analysis or display. For example, a list of per-country records converts directly into a DataFrame.
import pandas as pd
# Suppose response from API contains 'Countries' key with statistical data for all listed countries
countries_data = global_data['Countries']
# Convert data to a Pandas DataFrame for easy manipulation
countries_df = pd.DataFrame(countries_data)
# Display a summary for each country
print(countries_df[['Country', 'TotalConfirmed', 'TotalDeaths', 'TotalRecovered']])
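When the response layout is unfamiliar, a compact outline of its keys and value types can be easier to read than the raw JSON. The helper below is a minimal sketch of that idea; `summarize_structure` is a hypothetical name, and the `sample` dictionary is made-up data shaped like the summary response this guide assumes, not real API output.

```python
import json

def summarize_structure(obj, max_depth=2, depth=0):
    """Return a compact description of the keys and value types in parsed JSON."""
    if isinstance(obj, dict):
        if depth >= max_depth:
            return "dict"
        return {key: summarize_structure(value, max_depth, depth + 1)
                for key, value in obj.items()}
    if isinstance(obj, list):
        if not obj or depth >= max_depth:
            return f"list[{len(obj)}]"
        # Describe only the first element, assuming the rest share its shape
        return [summarize_structure(obj[0], max_depth, depth + 1)]
    return type(obj).__name__

# Hypothetical sample shaped like the summary response assumed in this guide
sample = {
    "Global": {"TotalConfirmed": 100, "TotalDeaths": 5},
    "Countries": [
        {"Country": "Exampleland", "TotalConfirmed": 60, "TotalDeaths": 3},
        {"Country": "Samplestan", "TotalConfirmed": 40, "TotalDeaths": 2},
    ],
}

print(json.dumps(summarize_structure(sample, max_depth=3), indent=2))
```

Calling `summarize_structure(global_data)` on the real response would show whether keys such as `Countries` are present before the DataFrame code relies on them.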
Handle Errors and Edge Cases
- In practice, network requests can fail because of timeouts, connectivity problems, server errors, or rate limiting. Implement error handling to manage HTTP errors and connectivity issues gracefully. You can retry the request or log the error for later diagnosis.
- Pay attention to API limits and guidelines. Implement exponential backoff or other rate limiting strategies to avoid exceeding API quotas. Properly handle cases where data is unavailable or inconsistent.
import time
import requests

def robust_request(endpoint):
    max_retries = 3
    retry_count = 0
    while retry_count < max_retries:
        try:
            response = requests.get(endpoint, timeout=10)
            if response.status_code == 200:
                return response.json()
            retry_count += 1
            print(f"Error: {response.status_code}. Retrying {retry_count}/{max_retries}")
        except requests.RequestException as e:
            retry_count += 1
            print(f"Request failed: {e}")
        # Exponential backoff: wait 2, 4, then 8 seconds between attempts
        time.sleep(2 ** retry_count)
    raise Exception("Max retries exceeded")

# Use the robust request function
data = robust_request(f"{BASE_URL}summary")
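Retries cover failed requests, but a successful response can still contain missing or inconsistent values. One possible way to handle that is to validate each record before analysis. The sketch below assumes per-country records carry the `Country`, `TotalConfirmed`, `TotalDeaths`, and `TotalRecovered` fields used earlier in this guide; `clean_country_records` and the sample records are hypothetical.

```python
REQUIRED_TOTALS = ("TotalConfirmed", "TotalDeaths", "TotalRecovered")

def is_valid_total(value):
    """A usable total is a non-negative integer (booleans are excluded)."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0

def clean_country_records(records):
    """Split records into (valid, rejected) lists.

    A record is valid when it is a dict with a non-empty 'Country'
    and every required total is a non-negative integer.
    """
    valid, rejected = [], []
    for record in records or []:
        ok = (
            isinstance(record, dict)
            and bool(record.get("Country"))
            and all(is_valid_total(record.get(field)) for field in REQUIRED_TOTALS)
        )
        (valid if ok else rejected).append(record)
    return valid, rejected

# Made-up records illustrating the cases the check distinguishes
records = [
    {"Country": "Exampleland", "TotalConfirmed": 60, "TotalDeaths": 3, "TotalRecovered": 50},
    {"Country": "Samplestan", "TotalConfirmed": None, "TotalDeaths": 2, "TotalRecovered": 1},
    {"Country": "Testonia", "TotalConfirmed": 10, "TotalDeaths": -1, "TotalRecovered": 4},
    {"TotalConfirmed": 1, "TotalDeaths": 0, "TotalRecovered": 0},
]

valid, rejected = clean_country_records(records)
print(f"{len(valid)} valid, {len(rejected)} rejected")
```

Keeping the rejected records, rather than silently dropping them, makes it possible to log them for later diagnosis.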
Automate and Extend
- Consider setting up scheduled tasks or using cron jobs to run this data fetching script at regular intervals. This ensures you always have the latest data for analysis or reporting.
- Extend the script by adding functionalities such as data visualization using libraries like Matplotlib or Plotly, or integrating the API data with a dashboard application such as Dash or Streamlit for more interactive analysis.
import matplotlib.pyplot as plt

def plot_covid_data(df, country):
    # Select the rows for the requested country
    country_data = df[df['Country'] == country]
    plt.figure(figsize=(10, 5))
    plt.plot(country_data['Date'], country_data['TotalConfirmed'], label="Confirmed")
    plt.plot(country_data['Date'], country_data['TotalDeaths'], label="Deaths")
    plt.plot(country_data['Date'], country_data['TotalRecovered'], label="Recovered")
    plt.xlabel("Date")
    plt.ylabel("Cases")
    plt.title(f"COVID-19 Data for {country}")
    plt.legend()
    plt.show()

# `countries_df` is the DataFrame built in the previous steps. A line plot only shows a trend
# when the DataFrame has one row per date for the country; a single summary row yields one point.
plot_covid_data(countries_df, "United States")
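For the scheduled runs suggested above, each run can write its result to a timestamped file so that earlier snapshots are not overwritten. This is a minimal sketch using only the standard library; `save_snapshot`, the `snapshots` directory, the file name pattern, and the crontab paths in the comment are all assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def save_snapshot(data, directory="snapshots", now=None):
    """Write `data` to a UTC-timestamped JSON file and return its path."""
    now = now or datetime.now(timezone.utc)
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"covid_summary_{now:%Y%m%dT%H%M%SZ}.json"
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return path

# Example crontab entry (hypothetical paths) that runs the script daily at 06:00:
# 0 6 * * * /usr/bin/python3 /path/to/fetch_covid.py
```

In the fetching script, a call such as `save_snapshot(robust_request(f"{BASE_URL}summary"))` would store one file per run, which later analysis can load in date order.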