Capacity Numbers
I got a chance to take a peek at those hospitalization data that the government has put out to see what is in there. The link to the site is https://healthdata.gov/agencies/department-health-human-services. This is the generic page and the screen comes up as:
Scroll down the page until you see something about patient impact and hospital capacity by facility:
Clicking on that title should lead you to a page looking like:
I had to explore and play around a little to see what web address would enable the power query in Excel and in Power BI to directly link to the data and it looks like this address does the job: https://healthdata.gov/node/3651441/download.
How did I get that address? This is where playing and experimenting, using Excel’s Power Query, comes in. While exploring the links and download buttons, I would hover the cursor over the link or button and then look at the bottom of my screen, in my case the lower left, to see what web address would come up.
Then I would test that web address in Power Query to see if data would appear. I had to click on various links and buttons and found the main one that got me somewhere was the link referencing the “Covid-19 reported patient impact and hospital capacity by facility” – found by scrolling down the page:
Clicking on that link led me finally to the page with download button with the web address that worked (which is the image with the highlighted web address).
Excel Power Query will pull in the data as follows (the image was enlarged so you could see the fields better, but it’s still blurry):
Here is where Excel has the advantage over Power BI: when one needs to look at the data, it’s easier to do calculations in Excel to do some testing. I had to first go through the field headers to see what is being captured and to try to understand what they are. That kind of scrutiny just works better in Excel for me.
The field headers in the data are:
Yep, there’s a lot of juicy data. I’ve been scrutinizing and arranging those field headers, trying to make sense of them until I figured out a few things. This report is being done on a weekly basis – on Fridays – so in the data there are some field headers that represent averages over 7 days. The sum looks like the totals summed up over 7 days – most of the time. I think the coverage is supposed to mean how many days of data that number represents – most of the field headers have a coverage of 7 days. To get the average of a particular field header, say total_beds_7_day_average, one would have to divide the sum by the coverage: total_beds_7_day_sum divided by total_beds_7_day_coverage.
My final list of unique headers ended up looking like:
Now, I’m in the process of figuring out how I want to handle those field headers. Right now I have the following table of possibilities:
I have tentatively looked at the percentage of icu beds used, percentage of inpatient beds used, percentage of overall beds used. I’m missing context so I can’t tell who is in dire straits. What percentage determines that a hospital is in trouble? I still have to look at data pertaining to staffed beds since staffing available is a huge constraint: you can more easily get more beds but getting more medical doctors and nurses when the whole nation is under duress is not as easy.
So, there is still some work to do.
You must be logged in to post a comment.