Retrieving and Visualizing COVID-19 Statistics from a Local Health Authority Website

Motivation of this Work:

The Ohio Department of Health provides daily updates on COVID-19 data such as total confirmed cases, total hospitalizations, and total deaths, but the daily new counts (newly confirmed, hospitalized, and deceased) are harder to get at. So I built my own daily case count tracker.

Also, the department sometimes revises the daily case counts a few days after the fact. This made me unsure how much to trust the “latest” update. To track the discrepancies, I added a step that saves the data each time I run the code, so I can compare the saved files later and see whether the “late updates” follow any pattern.

Primary Outcome: Visualized Daily New Case Count

Comparison of Daily Confirmed Cases among Four Major Cities in Ohio, USA

A detailed breakdown of what my code can do:

  • Retrieving the daily update data from the Ohio Department of Health website
  • Producing daily confirmed case counts from the raw data
  • Charting the daily confirmed case counts
  • Visualizing the trend (lowess smoothing)
  • Saving the daily confirmed case counts to a .csv file for later comparison
  • Running at any time to get the latest results

Code Block:

clear all

//retrieving the data
import delimited "https://coronavirus.ohio.gov/static/COVIDSummaryData.csv"

//capturing the date and time at which the data were retrieved
local c_date = c(current_date)
local c_time = c(current_time)

//dropping the first row, which holds the column headers rather than data
drop if _n == 1

//renaming and converting the columns needed to obtain the daily count
rename v1 county
label variable county "county"

rename v2 sex
label variable sex "sex"
encode sex, gen(gender)

rename v3 ageRange
label variable ageRange "Age range"

gen onSetDate = date(v4, "MDY")
format onSetDate %tdnn/dd
label variable onSetDate "Onset Date"

gen dateDeath = date(v5, "MDY")
format dateDeath %tdnn/dd
label variable dateDeath "Date of Death"

gen admissionDate = date(v6, "MDY")
format admissionDate %tdnn/dd
label variable admissionDate "Admission Date"

rename v7 caseCount
destring caseCount, replace force
label variable caseCount "Case Count"

rename v8 deathCount
destring deathCount, replace force
label variable deathCount "Death Count"

rename v9 hospitalizeCount
destring hospitalizeCount, replace force
label variable hospitalizeCount "Hospitalize Count"


//Generating daily counts by summing within each onset date and county
collapse (sum) caseCount deathCount hospitalizeCount, by(onSetDate county)

//Charting four major counties 

twoway (scatter caseCount onSetDate if county == "Cuyahoga", msize(0.5)) (lowess caseCount onSetDate if county == "Cuyahoga"), name(Cuyahoga, replace) legend(off) xtitle("") ytitle("") yline(5, lcolor(green)) title("Cuyahoga County (Cleveland)") note("Population: 1.24 million")

twoway (scatter caseCount onSetDate if county == "Franklin" & caseCount <= 100, msize(0.5)) (lowess caseCount onSetDate if county == "Franklin"), name(Franklin, replace) legend(off) xtitle("") ytitle("") yline(5, lcolor(green)) title("Franklin County (Columbus)") note("Population: 0.89 million; dates in late April to early May with case counts " "above 100 are omitted to keep the graph scale consistent.")

twoway (scatter caseCount onSetDate if county == "Hamilton", msize(0.5)) (lowess caseCount onSetDate if county == "Hamilton"), name(Hamilton, replace) legend(off) xtitle("") ytitle("") yline(5, lcolor(green)) title("Hamilton County (Cincinnati)") note("Population: 0.82 million")

twoway (scatter caseCount onSetDate if county == "Lucas", msize(0.5)) (lowess caseCount onSetDate if county == "Lucas"), name(Lucas, replace) legend(off) xtitle("") ytitle("") yline(5, lcolor(green)) title("Lucas County (Toledo)") note("Population: 0.43 million")

graph combine Cuyahoga Franklin Hamilton Lucas, col(1) xcommon ycommon ysize(2) xsize(1) note("Data Source: Ohio Department of Health; Green Line = 5 Cases") title("COVID-19 Daily New Case Count in OH") subtitle("Updated on $S_DATE" )

//Saving the data using the date and time of the data
local c_date = c(current_date)
local c_time = c(current_time)

local c_time_date = "`c_date'"+"_" +"`c_time'"

local time_string = subinstr("`c_time_date'", ":", "_", .)
local time_string = subinstr("`time_string'", " ", "_", .)
display "`time_string'"

local folderPath = "D:/OHIO_COVID_DAILY_CONFIRMED_"
local fileName = "`folderPath'" + "`time_string'"
display "`fileName'"

export delimited using "`fileName'.csv", replace
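
//Optional sketch (not part of the original script): comparing two saved
//files to spot late revisions. The two file names below are hypothetical
//placeholders; substitute two .csv files that this script saved on different days.
local olderFile "D:/OHIO_COVID_DAILY_CONFIRMED_older.csv"
local newerFile "D:/OHIO_COVID_DAILY_CONFIRMED_newer.csv"

//loading the newer file; case(preserve) keeps names such as caseCount as saved
import delimited "`newerFile'", case(preserve) clear
keep county onSetDate caseCount
rename caseCount caseCountNew
tempfile newerData
save `newerData'

//loading the older file the same way
import delimited "`olderFile'", case(preserve) clear
keep county onSetDate caseCount
rename caseCount caseCountOld

//matching on county and onset date; this assumes onSetDate is read in with
//the same type in both files and that each county-date pair appears only once
merge 1:1 county onSetDate using `newerData'

//a nonzero difference means the count for that date changed between the two runs
gen revision = caseCountNew - caseCountOld if _merge == 3
list county onSetDate caseCountOld caseCountNew revision if !missing(revision) & revision != 0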
