Playing with Data Query and Power BI
A few weeks ago, I did a post “Playing with Data Query” on pulling in data from websites such as Wikipedia or government sites. I believe most sites give data in the form of tables. Wikipedia or the government site showing you the names of House of Representatives or Senators are good examples where you go to the web and you see the data. Behind the scene, they exist as tables so when you use data query, you will see one or more tables in Excel’s “Navigator”. (You will see “Navigator” once you enter the URL.) The House of Representative site is a good example of where you will see tables for each state. When you have multiple tables you need to pull in, you have the option of choosing multiple tables at once but in Excel, those tables will reside in separate tabs, so you will have to use the data query to combine those tables if you want a single database to play with.
Different kind of data
Sometimes data will appear in a different fashion such as those for historical state temperatures. You don’t get tables but you receive a web page.
I didn’t use Power BI for the House of Representatives because I didn’t need “charts” or any visuals for the list of names. But I did want some kind of charts for the historical trends in state temperatures. That was a post from a few months ago (“Using Power BI to Delve into Climate Change”). I pulled the data from a different government site and the methodology was different at the time. At this site that I found, you had to input your criteria, so I pulled data state by state, using timescale of 12 months, going as far back in time as I could. I was doing this in September so I pulled in a range of monthly historical data, some going as far back as 1895, on through August 2019. I hit plot and a graph would appear below for the chosen state.
At that point, I would hit the download button, using the Excel/CSV version. A surprise occurred at that point: instead of actually downloading an actual Excel file, the data appeared as a website. So back in September, I copied that data and pasted into Excel or a text xml file. I did that for all 50 states. I did them as separate files mainly because I wanted to test the data query capability at the time. But I copied and pasted 50 times.
Hmmm, a bit more work than I wanted to do.
Thanksgiving Weekend Surprise
Then, during Thanksgiving weekend, I couldn’t even access the site using the URL.
So that whole project was starting to crumble.
But this week, I was able to find the particular page again but it had to be done without using the URL. I had to get there the long way. AND, I figured out another way of getting the data for all states at once. It’s way easier than one would think. Also, I later learned I could refresh the data and update the data with more up-to-date timestamps. The first time, you have to go to the site to get the URL but once that URL is embedded in the data query or Power BI, you don’t have to have that site or the page of data open to refresh the data.
How To Pull This Type of Data
So, if you want to follow along, here’s what I did to pull in data from that website where you enter criteria for what data you want and the data pulls up as another web window rather than a download.
The site in question is the government site: https://www.ncdc.noaa.gov/
When you get there, you want to pick “Climate Information”.
Then pick “Climate of the United States”.
Next “Climate at a Glance”.
When you get to the “Climate at a Glance”, you will be on the “Global” and “mapping” page. You want the “Statewide” and “mapping” page if you want state data.
Within the “Statewide” page, you will see some options offered. Play around with that. Once you’ve made your decisions, make sure you hit the “Plot” button. The map and data will appear below.
At this point, you can click on the “Download All Months/Years” Excel icon. Or you can pick “Download your Table Data”. Again, play around.
When you click that download icon, you will not get an actual downloaded file; you will get a web page. This was the part that threw me off last time and I started to copy and paste the data from that web page to my files.
Instead of copying and pasting the data, copy the URL for that web page and use it in your Get Data in Excel’s data query or Power BI. After you enter the URL, your Navigator page will look slightly different but don’t let that throw you off. You are getting the data; it’s just a different approach. You will have to use some transformation such as deleting the first 3 or 4 rows, making the 4th or 5th row as a row header and adding a column for dates.
One you’ve loaded the data, you can refresh that data in the future, even if you do not have that page open. That’s the mindboggling thing about this. That page does not need to be open in order for the data to update with new months of data, so long as the webpage does not move or change.
You must be logged in to post a comment.