Covid19 Data Analysis

The dataset is from 2020, but it still shows how the analysis works step by step.


Let's import the modules

import pandas as pd

import numpy as np

import seaborn as sns

import matplotlib.pyplot as plt

print('Modules are imported.')

Importing the Covid19 dataset

corona_dataset_csv = pd.read_csv("Datasets/Covid19_Confirmed_dataset.csv")

corona_dataset_csv.head(10)

Delete the columns we don't need and show the first 10 rows again

corona_dataset_csv.drop(["Lat","Long"],axis=1,inplace=True)

corona_dataset_csv.head(10)


Aggregating the rows by country

corona_dataset_aggregated = corona_dataset_csv.groupby("Country/Region").sum()

corona_dataset_aggregated.head()

corona_dataset_aggregated.shape

(187, 101)
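
To make the aggregation step concrete, here is a small sketch on made-up numbers. The toy frame and its values are assumptions for illustration, not rows from the real dataset. It passes numeric_only=True so that a text column such as Province/State is left out of the sum; whether you need that may depend on your pandas version.

```python
import pandas as pd

# Toy data (hypothetical values), shaped like the confirmed-cases file
toy = pd.DataFrame({
    "Province/State": ["Hubei", "Beijing", None],
    "Country/Region": ["China", "China", "Italy"],
    "1/22/20": [444, 14, 0],
    "1/23/20": [444, 22, 0],
})

# One row per country; numeric_only=True skips the text column
toy_aggregated = toy.groupby("Country/Region").sum(numeric_only=True)
print(toy_aggregated)
```

The two China rows collapse into a single row whose daily values are the sums of the provinces.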


Visualizing data for a country, for example China: a plot usually makes the data easier to understand.

corona_dataset_aggregated.loc['China'].plot()

corona_dataset_aggregated.loc['Italy'].plot()

corona_dataset_aggregated.loc['Spain'].plot()

plt.legend()


Calculating a good measure: we need a single number that describes the spread of the virus in a country.

corona_dataset_aggregated.loc['China'].plot()

plt.title("Spread of Virus in China")
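
One candidate for such a number is the largest day-to-day increase in confirmed cases. A minimal sketch on made-up cumulative counts (the values and the names cumulative, daily_new and max_rate are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical cumulative confirmed cases over five days
cumulative = pd.Series([10, 15, 30, 70, 80],
                       index=["d1", "d2", "d3", "d4", "d5"])

# diff() gives the change from the previous day (the first value is NaN)
daily_new = cumulative.diff()

# max() skips the NaN and returns the largest single-day increase
max_rate = daily_new.max()
print(max_rate)
```

Because the counts are cumulative, the differences are the new cases per day, and their maximum is the steepest point of the curve.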


Find the maximum infection rate for each country

countries = list(corona_dataset_aggregated.index)

max_infection_rates = []

for c in countries:

    # largest day-to-day increase in confirmed cases for country c
    max_infection_rates.append(corona_dataset_aggregated.loc[c].diff().max())

corona_dataset_aggregated["max_infection_rate"] = max_infection_rates
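
The same calculation can probably be done without an explicit loop. A sketch on a toy frame (the values are made up), assuming every column holds a numeric daily count:

```python
import pandas as pd

# Toy aggregated frame (hypothetical values): one row per country, one column per day
toy = pd.DataFrame(
    {"1/22/20": [1, 0], "1/23/20": [4, 2], "1/24/20": [6, 9]},
    index=["A", "B"],
)

# Same computation as the loop, written as a list comprehension
loop_rates = [toy.loc[c].diff().max() for c in toy.index]

# Vectorized version: difference along the columns, then the row-wise maximum
vector_rates = toy.diff(axis=1).max(axis=1)

print(loop_rates)
print(vector_rates.tolist())
```

Both versions give the same numbers; the vectorized one returns a Series indexed by country, which can be assigned to a new column directly.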

Create a new dataframe with only the needed column

corona_data = pd.DataFrame(corona_dataset_aggregated["max_infection_rate"])

corona_data.head()


