When Strange Results Pop Up
It doesn’t happen often, but every once in a great while, I see something strange in the numbers or graphs. Generally, I’m not so good at catching these weird results myself; other people are better at spotting possible errors, which is why I rely on Excel or a program to do that job.
But sometime last week, I saw something that made me go “Whoa! What’s this?”
That bright blue line from Wikipedia represents Asia, which is an admittedly broad category that appears to include Russia, China, Indonesia, the Middle East, India, etc. Around 12/10, the moving average just surged upward, kind of like what the purple line at the bottom did. So, I decided to wait and see whether that jump would come back down as a correction to this “erroneous” upsurge, possibly due to a keypunch error. Well, it hasn’t so far.
That upsurge was an interesting one-day surge, so I had to go looking for which country was behind it. Normally, when I see something like that, I think, “Keypunch error,” but this time it might not be a keypunch error but a serious update of backlogged data. After playing around with some graphs trying to discern the source, I found this:
Culled from the same Wikipedia data (as far as I can tell, the COVID Tracking Project does not track other countries), the upper left chart shows just Asia, while the line graph on the right shows the individual lines for the countries in Asia; however, I think this line graph in Power BI limits the number of countries it will show. I’m not 100% sure, though. The bar graph beneath the line graph gave me the answer: that huge red bar slightly off center to the right is Turkey. If you hover your cursor over that red bar, the country’s name will appear as well as its absolute value.
The moral of this story is to try various types of graphs to see if something jumps out. I couldn’t find the answer using a line graph but the bar graph depicted the outlier very clearly.
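For anyone who would rather check this without a chart, here is a minimal sketch of the same idea in Python: take each country's largest single-day count and sort, which is roughly what the bar graph does visually. The rows, the country labels, and the variable names are all made-up placeholders, not figures from Wikipedia or Power BI.

```python
from collections import defaultdict

# Made-up placeholder rows: (country, date, new_cases). Not real figures.
rows = [
    ("Country A", "day 1", 1_200), ("Country A", "day 2", 1_350),
    ("Country B", "day 1", 900), ("Country B", "day 2", 48_000),
    ("Country C", "day 1", 2_100), ("Country C", "day 2", 2_050),
]

# Keep each country's largest single-day count, like the tallest bar per country.
peak = defaultdict(int)
for country, _date, new_cases in rows:
    peak[country] = max(peak[country], new_cases)

# Sort countries by their peak so an outlier lands at the top of the list.
ranked = sorted(peak.items(), key=lambda item: item[1], reverse=True)
for country, value in ranked:
    print(f"{country}: {value:,}")
```

With many overlapping lines, one country's spike can hide in the clutter; ranking the peaks puts it on the first row.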
Then I wondered, is this a real upsurge or is this an error in Wikipedia? Since I don’t think the COVID Tracking Project tracks other countries but Johns Hopkins does, I went playing with the JHU international data and found the following:
In each of these graphs, I did not specify Turkey; instead, I let Power BI graph all of the daily cases for the entire world, and you can see that outlier sticking out like a sore thumb. The third chart, the colorful bar chart, basically told me that it was Turkey that had that jump.
By the way, when I looked at the numbers for Turkey, the country was adding roughly 30K new cases a day in December, both before and after that jump. That jump represented 823K (823,000!) new cases for that single day.
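To put that in perspective, a jump like this could also be caught automatically by comparing each day against the median of the days just before it. The sketch below is only an illustration: the function name, the 7-day window, and the 5x threshold are my own assumptions, and the series is a simplified stand-in built from the two rough figures above (about 30K a day, with one 823K day), not the actual JHU data.

```python
from statistics import median

def flag_spikes(daily_cases, window=7, ratio=5.0):
    """Return indexes of days whose count exceeds `ratio` times the
    median of the preceding `window` days. Window and ratio are
    arbitrary illustrative defaults, not a standard."""
    flagged = []
    for i in range(window, len(daily_cases)):
        baseline = median(daily_cases[i - window:i])
        if baseline > 0 and daily_cases[i] > ratio * baseline:
            flagged.append(i)
    return flagged

# Simplified stand-in series: roughly 30K a day with a single 823K day.
cases = [30_000] * 10 + [823_000] + [30_000] * 10
spikes = flag_spikes(cases)
print(spikes)  # index of the 823K day
```

The median is used instead of the average so that the one huge day does not inflate the baseline for the days that follow it.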
Unfortunately, I still haven’t found out why Turkey suffered that increase in December. I’m thinking Turkey is reporting a serious backlog of cases that had not been reported before. Just shy of a million. That’s a h*** of a miss.
My state also had a recent change in its numbers, but this time it came through the COVID Tracking Project and nowhere else. This change, interestingly enough, occurred just when my state changed a lot of stuff on its site: added new files, added new types of data, changed file and tab names, changed layouts, and on and on and on. There were a lot of changes made last week, so I’m wondering if the COVID Tracking Project incorrectly pulled in a set of numbers. Typically, I look at a set of statistics such as top 5 case counts, top 5 death counts, etc. I look at them every night, and last Saturday, my state, all of a sudden, shot up to the top of the top 5 list. The COVID Tracking Project picked up an update for November, and it just changed my state’s statistics.
I’m waiting to see if this statistic gets corrected in the coming days.
Sometimes, these data are just crazy.