Creative Excel Idea
There’s a lot of talk about the need to be creative in business in this new world that we are entering. One of the skills you got to have is to be “creative”. I would imagine that a lot of people, when they think of creativity, they think of artists, musicians, actors – that kind of creativity. Others, if they think further, they think entrepreneurs.
But there are other ways you can be creative: such as a creative way of using Excel.
Every once in a while, I come up with a creative idea on how I can use Excel to do the work for me. In this post I will try to explain one tiny creative idea I had with Excel while I’ve been tracking the coronavirus.
Generalized process for gathering daily coronavirus data
Every night, I update connections to coronavirus data via a macro and then I go through each of the newly updated connections to make sure nothing weird happened or even to see if the data updated. My state puts out a lot of data and oftentimes I don’t really know what to do with the data when the data first comes out, so I just keep copies of the data until I get an idea of what to do with them. It could be many months later that I think, “you know what, that data would be useful” but if I hadn’t been saving them, I would have lost those data especially since my state had to change their methodology a couple of times as we all learn what needed to be tracked. As time progressed, my state would make changes in their data gathering methodology and I would like to see what exactly changed – so I can make an assessment on what they’ve done.
So, every night I gather data that I may not be using but I still review the updates to make sure new data came in. And that review got old over a period of months; consequently, I started to dream of ways to make my life easier. Basically, I wanted a table listing all of the sources of data and how they may or may not have updated.
The structure of my state’s data
Okay…before I write an explication on how I developed this creative idea of making Excel help me review the updates, a little background on the structure of my state’s data: due to how the header of my state’s Excel data would include date and time, I have to download their Excel files and then copy and paste their data into my Excel file rather than pull in data by direct connect using Power Query. Every day the header would change because of that date and time and if I used Power Query, or Power BI, I would get an error message saying Power Query, or Power BI, could not find the specific date and time. Until I figure out how to get around that date and time in their file, I have to download the files and then use a macro to copy over the data. That’s my solution for the moment.
To give you an idea of what I’m talking about, here’s an example of a sheet of data with a header at the top with the offending date and time highlighted in yellow.
And here’s what the Advanced Editor looks like. The item highlighted in yellow is what will cause me problems the next time I update the data connection. I need to figure out how to get around that date and time because it changes daily.
So, for the time being, I’m downloading their Excel files.
My state puts out multiple files of data and some files have multiple tabs although most of the files have just one tab. My own file has multiple tabs to kind of represent the separate file/sheets combinations.
My creative idea
For my table listing the sources of data and their updates, the key thing was to identify what one thing would tell me that the file was updated and that was simple: the date and time on the header of the Excel file – the very thing that caused me pain in Power Query/Power BI connection.
Here’s a screenshot of my table listing just a few of the tabs in my file holding the data. The first column contains the tab name in my file, the second column is cell reference for the last column of data, the third column is the date found in that last column’s field header, and the fourth column represents the prior night’s date.
Most of the state’s files had a header with date and time, then the next row some additional description and then rows of counties listed under column A and a row of dates on row 3 or 4, going off to the right. Every night an additional column of data would be filled in, representing a new day’s worth of data.
I used conditional formatting to tell me whether the date changed from the prior night’s run – the third and fourth columns from my table should show different dates if the files were updated. Every night I would copy the dates in the third column over to the fourth column, then run my macro to do the data collection, and then I would check to see if column 3 was different from column 4. If the dates had not changed, then I would go back to the downloaded file and check out the data there.
That’s the basic idea.
Unfortunately, the table is a little bit more complicated by the fact that every night, a new column would appear and that table needed to reflect a new column under column 2. The first column is just typed in with the tab name – that’s no problem.
The set up to identify the latest column of data
The second column is where it gets complicated: identifying the latest column of data pulled in. In the second column I used a combination of the ADDRESS function, COUNTA function, and the INDIRECT function:
=ADDRESS(3,COUNTA(INDIRECT(“‘”&AE16&”‘!3:3”)),4)
In general terms for the ADDRESS function (the AE16 is a cell reference and will be explained later):
=ADDRESS(row, column, absolute/relative reference using 1, 2, 3, or 4)
The ADDRESS function is basically telling you what row and column you are looking at and whether you want absolute reference (with $ signs 1) or relative (without $ signs 4) or some combination. When you enter in the row, column and type of reference, the ADDRESS function spits back at you the cell reference.
So, for the tab “Cases by County”, the result of the function is MU3. The dates are listed on the 3rd row and the last column with the latest date is column MU. In order to get the column number, I had to COUNT the number of columns that are filled with something, which in this example the count is 359. For the current night in the example, the last column was under MU or column number 359. The next night, if the state updates their data, a new column MV or 360 will be filled.
Now I could have done:
=ADDRESS(3,COUNTA(‘Cases by County’!3:3),4)
But there are multiple tabs where I needed to do a quick review, and I wanted to be able to easily and painlessly copy the formula. I needed to generalize the formula rather than make the formula specific to a certain tab; hence, the INDIRECT function came into play.
=INDIRECT(“‘”&AE16&”‘!3:3”)
AE16 is referencing the cell with the tab name or it is referencing the first column of my table. That first column is under AE. The INDIRECT function builds up to ‘Cases by County’!3:3. You can use the INDIRECT function to build up references that could change. It’s tricky if you are unfamiliar the function but very handy once you got it set.
The third column is simply pointing to the latest cell with the dates and it also uses INDIRECT function.
=INDIRECT(“‘”&AE16&”‘!”&AF16)
Again AE16 is really the first column in the table which holds the tab name and AF16 is the cell with the ADDRESS function so that cell holds the cell reference.
Now I can very much envision that formula getting more complicated if the tabs were not just holding dates, but in my case, the cells are just holding the dates.
In closing
You can see there are innumerable possibilities of making Excel help you out. The example I provided here was to simply tell me if the state’s files had updated and that my file contained the latest data. Instead of going to each of the tabs, I just look at the table and check on the dates. It is just so much quicker this way rather than pacing through the tabs.
You must be logged in to post a comment.