Coronavirus disease 2019 (COVID-19) Data Analysis using Python

By Pradip Shrestha


26/04/2020

I like digging into data to find interesting insights, so I put together a simple COVID-19 data analysis using Python and Plotly. Below is the Jupyter notebook code I wrote for the analysis.
Note: The data come from the source listed below and are used for data analysis purposes only.

Data sources used

Source: Johns Hopkins University (JHU)
Data from the repo: https://github.com/CSSEGISandData/COVID-19
Three time series tables cover the global confirmed cases, deaths, and recovered cases:
- time_series_covid19_confirmed_global.csv
- time_series_covid19_deaths_global.csv
- time_series_covid19_recovered_global.csv

Import needed libraries

import pandas as pd   
import chart_studio.plotly as py  
import plotly.graph_objects as go  

Data source

dfConfirmedWorldwide = pd.read_csv(r'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')
dfDeathsWorldwide = pd.read_csv(r'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')
dfRecoveredWorldwide = pd.read_csv(r'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv')
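To get a feel for the layout that pd.read_csv returns for these files, here is a minimal sketch that parses a tiny inline sample instead of downloading anything. The column names follow the JHU wide format (four descriptive columns, then one column per date); the two rows and their counts are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical two-row sample in the same wide layout as the JHU files;
# the counts are invented for illustration only.
sample_csv = io.StringIO(
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20\n"
    ",Denmark,56.2639,9.5018,0,1,3\n"
    "Greenland,Denmark,71.7069,-42.6043,0,0,1\n"
)
df_sample = pd.read_csv(sample_csv)

# Four descriptive columns, then one cumulative-count column per date.
print(df_sample.columns.tolist())
print(df_sample.shape)
```

Note that an empty Province/State field is read as NaN, which is how the mainland row of a country with territories shows up.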

Drop columns

dfConfirmedWorldwide.drop(columns=['Lat','Long'], inplace = True)  
dfDeathsWorldwide.drop(columns=['Lat','Long'], inplace = True)  
dfRecoveredWorldwide.drop(columns=['Lat','Long'], inplace = True)  

Set the Dataframe index

dfConfirmedWorldwide = dfConfirmedWorldwide.set_index("Country/Region")  
dfDeathsWorldwide = dfDeathsWorldwide.set_index("Country/Region")  
dfRecoveredWorldwide = dfRecoveredWorldwide.set_index("Country/Region")  

Total Confirmed Cases Top 10 Countries

dfConfirmedTop10Countries= dfConfirmedWorldwide.nlargest(10,[dfConfirmedWorldwide.columns[-1]])  
dfConfirmedTop10Countries = dfConfirmedTop10Countries[dfConfirmedTop10Countries.columns[-30:]]  
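One thing to keep in mind: nlargest ranks rows, and the JHU files report some countries (China, for example) as one row per province. A possible way to rank whole countries is to sum the rows per country first. The sketch below assumes a toy table with made-up counts; df_toy, df_by_country and df_top2 are illustrative names, not part of the notebook.

```python
import pandas as pd

# Hypothetical counts; the index mimics the non-unique "Country/Region" index.
df_toy = pd.DataFrame(
    {
        "Province/State": ["Hubei", "Guangdong", None, None],
        "4/24/20": [100, 50, 120, 30],
        "4/25/20": [101, 52, 130, 31],
    },
    index=pd.Index(["China", "China", "Italy", "Spain"], name="Country/Region"),
)

# Sum the date columns per country, then rank by the most recent date.
df_by_country = df_toy.drop(columns="Province/State").groupby(level=0).sum()
df_top2 = df_by_country.nlargest(2, df_by_country.columns[-1])
print(df_top2)
```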

Denmark’s total confirmed cases and deaths

dfConfirmedDenmark = dfConfirmedWorldwide.loc['Denmark']  
dfConfirmedDenmark = dfConfirmedDenmark.nlargest(1,[dfConfirmedDenmark.columns[-1]])  
dfDeathsDenmark = dfDeathsWorldwide.loc['Denmark']  
dfDeathsDenmark = dfDeathsDenmark.nlargest(1,[dfDeathsDenmark.columns[-1]])  
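Because Denmark appears in the JHU tables as several rows (the mainland plus the Faroe Islands and Greenland), .loc['Denmark'] returns a small DataFrame, and nlargest(1, ...) then keeps only the row with the highest latest count, which in practice is the mainland. A minimal sketch with made-up numbers:

```python
import pandas as pd

# Hypothetical counts for the three rows that share the "Denmark" index label.
df_toy = pd.DataFrame(
    {
        "Province/State": ["Faroe Islands", "Greenland", None],
        "4/24/20": [180, 11, 8000],
        "4/25/20": [185, 11, 8200],
    },
    index=pd.Index(["Denmark"] * 3, name="Country/Region"),
)

# Selecting a repeated index label returns every matching row.
df_dk = df_toy.loc["Denmark"]

# Keep the single row with the largest value in the most recent date column.
df_mainland = df_dk.nlargest(1, df_dk.columns[-1])
print(df_mainland)
```

If the goal were the total for the whole Kingdom of Denmark rather than the mainland only, summing the three rows would be the alternative.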

Worldwide Total Confirmed Cases and Deaths

# numeric_only=True skips the text column 'Province/State';
# .iloc[-1] and .iloc[-2] pick the latest and previous date by position.
fig1 = go.Indicator(
    mode = "number+delta",
    value = dfConfirmedWorldwide.sum(numeric_only=True).iloc[-1],
    delta = {'position': "bottom", 'reference': dfConfirmedWorldwide.sum(numeric_only=True).iloc[-2],
            'increasing':{'color':'red'},
            'decreasing':{'color':'green'}},
    title = {'text': "Total Confirmed Cases"},
    domain = {'x': [0, 0.5], 'y': [0.5,1]})
fig2 = go.Indicator(
    mode = "number+delta",
    value = dfDeathsWorldwide.sum(numeric_only=True).iloc[-1],
    delta = {'position': "bottom", 'reference': dfDeathsWorldwide.sum(numeric_only=True).iloc[-2],
            'increasing':{'color':'red'},
            'decreasing':{'color':'green'}},
    title = {'text': "Total Deaths"},
    domain = {'x': [0.5, 1], 'y': [0.5,1]})
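The indicator shows the latest worldwide total as its value and, as the delta, the difference from the previous day's total. Here is a small sketch of the two numbers that feed it, using a toy table with made-up counts (the names df_toy, latest, previous and daily_change are only for illustration):

```python
import pandas as pd

# Hypothetical cumulative counts for two countries over two days.
df_toy = pd.DataFrame(
    {"4/24/20": [100, 40], "4/25/20": [110, 45]},
    index=pd.Index(["A", "B"], name="Country/Region"),
)

# Column-wise sum gives one worldwide total per date.
totals = df_toy.sum(numeric_only=True)

latest = totals.iloc[-1]          # value shown by the indicator
previous = totals.iloc[-2]        # reference used for the delta
daily_change = latest - previous  # what the delta displays
print(latest, previous, daily_change)
```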

Denmark’s Total Confirmed cases and Deaths

# numeric_only=True skips the text column 'Province/State'.
fig3 = go.Indicator(
    mode = "number+delta",
    value = dfConfirmedDenmark.sum(numeric_only=True).iloc[-1],
    delta = {'position': "bottom", 'reference': dfConfirmedDenmark.sum(numeric_only=True).iloc[-2],
            'increasing':{'color':'red'},
            'decreasing':{'color':'green'}},
    title = {'text': "Denmark Confirmed Cases"},
    domain = {'x': [0, 0.5], 'y': [0,0.4]})
fig4 = go.Indicator(
    mode = "number+delta",
    value = dfDeathsDenmark.sum(numeric_only=True).iloc[-1],
    delta = {'position': "bottom", 'reference': dfDeathsDenmark.sum(numeric_only=True).iloc[-2],
            'increasing':{'color':'red'},
            'decreasing':{'color':'green'}},
    title = {'text': "Denmark Deaths"},
    domain = {'x': [0.5, 1], 'y': [0,0.4]})

data = [fig1, fig2, fig3, fig4]
layout = dict(title='Worldwide and Denmark',paper_bgcolor='lightgray')
 
To display the plot inline when working in a Jupyter Notebook:

py.iplot(dict(data=data, layout=layout))

To return the plot's unique URL and optionally open it in the browser:

#py.plot(data, filename='Indicator', auto_open=True)

Total Confirmed Cases – Top 10 Countries

dfConfirmedTop10Countries

Denmark’s total confirmed cases and deaths

dfDKconfirmed_deathsData= pd.concat([dfConfirmedDenmark, dfDeathsDenmark])
dfDKconfirmed_deathsData = dfDKconfirmed_deathsData[dfDKconfirmed_deathsData.columns[-30:]]
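Since both rows in the combined table carry the same index label ('Denmark'), it can be hard to tell confirmed cases from deaths when the table is displayed. One option, sketched here with made-up numbers and illustrative names, is to pass keys to pd.concat so that each row is labelled by its source:

```python
import pandas as pd

# Hypothetical one-row tables standing in for the Denmark frames.
cols = ["4/23/20", "4/24/20", "4/25/20"]
df_confirmed = pd.DataFrame([[7900, 8000, 8200]], columns=cols, index=["Denmark"])
df_deaths = pd.DataFrame([[390, 400, 410]], columns=cols, index=["Denmark"])

# keys= adds an outer index level naming the source of each row.
df_combined = pd.concat([df_confirmed, df_deaths], keys=["Confirmed", "Deaths"])

# Keep only the most recent columns (the notebook keeps the last 30).
df_recent = df_combined[df_combined.columns[-2:]]
print(df_recent)
```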

Denmark – Cumulative confirmed cases and deaths

trace01 = go.Scatter(
    x = dfConfirmedDenmark.columns[-30:],
    y = dfConfirmedDenmark.iloc[0][-30:], name = 'Confirmed Cases')

trace02 = go.Scatter(
    x = dfDeathsDenmark.columns[-30:],
    y = dfDeathsDenmark.iloc[0][-30:], name = 'Deaths')

layout = dict(title='Denmark - Cumulative confirmed cases and deaths',paper_bgcolor='lightgray')
data = [trace01, trace02]

py.iplot(dict(data=data, layout=layout))

Denmark – Cumulative confirmed cases and deaths

dfDKconfirmed_deathsData
