Mindware: Commonsense Statistics

Are you trying to convince someone of your viewpoint and they just won’t buy it? You try providing facts and logic but it seems to go nowhere? It sometimes feels like logic alone won’t move them? Well, this article provides some possible lessons on how to go about convincing people. The main subject of the story provides a good miniature case study on what needs to be done, but there is no easy way to do it; it will take time and patience. In this day and age of fake news, we’re going to need all the patience we can get.

But…with that in mind, I will proceed on to the next section of Mindware by Richard Nisbett. The next set of chapters deals with statistics! But he approaches it from an everyday, commonsense stance and doesn’t delve into the algebra of it. We need a better understanding of statistics so we can evaluate anecdotes such as “I heard so-and-so did ‘y’ and ‘x’ happened to him.” Politicians will give you simple stories to convey the point, but those simple stories may be too simple.
One of the things we need to do is to frame the situation in such a way that we can think about the issue in statistical terms, but in a commonsense way. Now, the author does provide a lot of juicy examples, but this is one area where he falls down: there are not enough examples to help us frame the issues. Still, once you have framed the problem properly, you should be able to assess the situation more accurately.

Consider Sample Size and Randomness

One of the things you should be asking yourself is: what is the sample size, and is it large enough relative to the population? If the population is large, then testing the entire population may not be feasible, so you have to sample it. The larger the sample, the closer you will get to the population’s characteristics. This concept is what the author calls the Law of Large Numbers. So if a person tries to prove a point with a single story, you have to ask yourself: is a sample of one incident enough to prove the point? The answer should be no. The other point about samples is that the sample needs to be random, or unbiased. If the sample data is biased, then your estimated results for the population may be way off.
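To make the Law of Large Numbers concrete, here is a minimal Python sketch (my own illustration, not from the book). It draws samples of increasing size from a made-up population and shows the sample mean settling toward the population mean:

```python
import random

random.seed(42)

# Hypothetical population: 100,000 values with a true mean near 50.
population = [random.gauss(50, 10) for _ in range(100_000)]
true_mean = sum(population) / len(population)

# As the sample grows, its mean drifts toward the population mean.
for n in (1, 10, 100, 1_000, 10_000):
    sample = random.sample(population, n)
    sample_mean = sum(sample) / n
    print(f"n={n:>6}: sample mean = {sample_mean:6.2f} "
          f"(population mean = {true_mean:.2f})")
```

A sample of one can land anywhere in the distribution, which is exactly why a single anecdote proves so little.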
Other thinking tools:

  1. Be aware that we don’t always know what is going on in our thinking process.
  2. Take into account the fact that the situation may be driving other people’s, or even your own, behavior.
  3. Rely on your unconscious to solve some of your problems, because there are certain kinds of problems it is best at solving.
  4. Use economic tools such as cost-benefit analysis.
  5. Don’t fall into the sunk cost fallacy. Think: the rest of your life starts now when doing the cost-benefit analysis.
  6. Also consider your opportunity cost when doing cost-benefit thinking.


Polling is a great example of using sampling to forecast election results, and the author talks a bit about polling. Polling in the last two elections has led us astray, and from my readings, it is not the sample size that is in question but how the sampling is done. The pollsters may be having trouble reaching a representative set of voters. The author notes that today’s polling actually does not sample randomly. Instead, the pollster has to apply some kind of intuition to the weighting factors for voters’ characteristics. I do remember Nate Silver saying that he takes all of the state polls and applies some kind of adjusting/weighting factor to reduce the non-randomness of the polling and to get closer to the real characteristics of the population. It sounds more like voodoo than science.
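Here is a toy sketch of that weighting idea (the group shares and support numbers are made up for illustration, not taken from any real poll or from Silver’s actual method). If a group is under-represented in the sample, its respondents count for proportionally more:

```python
# Made-up example: two demographic groups, with the sample skewed
# toward group_a relative to the actual population.
population_share  = {"group_a": 0.60, "group_b": 0.40}
sample_share      = {"group_a": 0.75, "group_b": 0.25}  # biased sample
support_in_sample = {"group_a": 0.40, "group_b": 0.70}  # share backing a candidate

# Unweighted estimate just mirrors the biased sample composition.
raw = sum(sample_share[g] * support_in_sample[g] for g in sample_share)

# Weighted estimate re-scales each group to its population share.
weighted = sum(population_share[g] * support_in_sample[g] for g in population_share)

print(f"raw (unweighted) estimate: {raw:.1%}")       # 47.5%
print(f"weighted estimate:         {weighted:.1%}")  # 52.0%
```

The weighting only helps, of course, if the pollster’s assumed population shares are right, which is where the intuition (or voodoo) comes in.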

Two factors may have played a role in the poor polling: 1) a lot of people were still undecided late into the election cycle, and 2) the Comey effect came so late in the election that there was no time to poll and evaluate its impact. There may be other factors, but those are the two that come immediately to mind.

One last thing on polling.  The author made a very curious statement:

“…the accuracy of a sample statistic (mean, median, standard deviation, etc.) is essentially independent of the size of the population the sample is drawn from. Most national polls in elections sample about a thousand people, for which they claim an accuracy of about + or – 3 percent. A sample of a thousand people gives about as good an estimate of the exact percentage supporting a given candidate when the population size is 100 million as when the population size is ten thousand.” Mindware, p. 113, hardback.
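That statement is less curious once you look at the margin-of-error formula for a proportion: it depends on the sample size n, not on the population size (as long as the population is much larger than the sample). A quick sketch (my own check, not the author’s):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion estimated from
    a simple random sample of size n. The population size appears
    nowhere in the formula; the finite-population correction is
    negligible once the population dwarfs the sample."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"n=1,000: +/- {margin_of_error(1_000):.1%}")  # about +/- 3.1%
print(f"n=  100: +/- {margin_of_error(100):.1%}")    # about +/- 9.8%
```

So a thousand respondents buy you roughly the famous ±3 percent, whether the electorate is ten thousand people or 100 million.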

Regression to the Mean

The next thing to consider, besides sample size and random selection, is the concept of regression to the mean. As you sample the data, keep in mind that the characteristics of the data will vary around the mean, or average. The author uses restaurant meals to illustrate regression to the mean. The quality of the food could range from very poor to just fabulous, depending on the day you go to that restaurant. There will be variability in the quality. Some restaurants will have a tighter range of variation, so the dispersion of quality is narrow; others will have a wider range. Standard deviation is the concept often used to describe the variation from the mean. Each time you go to the restaurant, the quality will differ. So if you go once, you shouldn’t think that experience is representative of the quality of the food, because you may have happened on a day when all the stars lined up (or did not). If you ate on a day when all of the stars lined up and the food was just fabulous, the next time you go, it is very likely that the quality will suffer. The experience, or data, will regress to the mean.
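A small simulation makes the point (mine, with made-up numbers): if meal quality is just a draw from a distribution, then the visit after an exceptional one is simply another ordinary draw, so on average it falls back toward the mean.

```python
import random

random.seed(7)

# Hypothetical restaurant: quality scores drawn from a normal
# distribution with mean 70 and standard deviation 10.
MEAN, SD = 70, 10
visits = [random.gauss(MEAN, SD) for _ in range(100_000)]

# Condition on a "stars aligned" visit (quality above 85) and look at
# the very next visit.
followups = [visits[i + 1] for i in range(len(visits) - 1) if visits[i] > 85]

avg_followup = sum(followups) / len(followups)
print(f"average quality right after a great visit: {avg_followup:.1f}")
# Prints roughly 70, not 85+: the next meal regresses to the mean.
```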

What are the repercussions of this concept? For one, you can’t rely on a single evaluation. You will have to sample many times before you make an assessment. This means a single interview is likely not enough. You will most likely have to rely on other sources of data that track over a period of time. It could be another person who has had multiple contacts with the interviewee. It could be some kind of test scores. It could be a trial period. You need a range of data before you can make a full assessment.

Some Metrics Regarding Standard Deviation

There are some simple metrics involving standard deviation that work well with normally distributed data, which, it sounds like, describes much of life.

  • 68% of the values fall within 1 standard deviation of the mean.
  • About 95% of the values fall within 2 standard deviations of the mean.
  • 84% of the values fall below 1 standard deviation above the mean.
  • Just about 98% fall below 2 standard deviations above the mean.

So if you know the mean and the standard deviation, you can make a rough estimate of the range of values.
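You can verify these figures directly; Python’s standard library ships a normal distribution (this is just my own check, not from the book):

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal: mean 0, standard deviation 1

print(f"within 1 SD of the mean: {nd.cdf(1) - nd.cdf(-1):.1%}")  # ~68.3%
print(f"within 2 SD of the mean: {nd.cdf(2) - nd.cdf(-2):.1%}")  # ~95.4%
print(f"below +1 SD:             {nd.cdf(1):.1%}")               # ~84.1%
print(f"below +2 SD:             {nd.cdf(2):.1%}")               # ~97.7%
```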

The author used IQ as an example. IQ happens to exhibit the characteristics of a normal distribution. The scores are set such that 100 is the average, or mean, and 15 points is 1 standard deviation. 68% of the population has an IQ between 85 and 115. An IQ of 115 is the level where someone could do college work and maybe even some postgraduate work; the occupation would be professional, managerial, or technical. Someone with an IQ of 100 would more likely stop at community college, junior college, or maybe even just high school. So about 84% of the population falls below the level regarded as sufficient for college work. Or, to put it another way, only about 16% may have sufficient IQ to do college work. There may be some flaws in this logic, because I also recall that roughly 30% of the population attains at least a bachelor’s degree. But as the world moves toward automation and jobs require moving up the skill scale, think about what this means for those whose IQ may not be where it should be. Or does IQ really impact your ability to attain a college degree?
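The same NormalDist check works for the IQ numbers (again my own check, using the book’s mean of 100 and standard deviation of 15):

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # IQ scale: mean 100, SD 15

print(f"between 85 and 115:  {iq.cdf(115) - iq.cdf(85):.1%}")  # ~68.3%
print(f"below 115:           {iq.cdf(115):.1%}")               # ~84.1%
print(f"at or above 115:     {1 - iq.cdf(115):.1%}")           # ~15.9%
```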

Correlations

[Okay, time to wrap this up. This post is getting long and the hour is late. I need to get ready for work tomorrow. So to shorten this…]

The main takeaway from the chapter on correlations is that we are very bad at assessing whether two characteristics are tied together or not. Sometimes two incidents will appear correlated because they happen close together in time, but they are really not correlated. Other times, two incidents really are tied together, but we don’t expect them to be, so we ignore the connection. That’s pretty much the gist of it.

So, as a practical matter, we have to gather data and do some calculations to see if two variables are related or have an impact on each other. The author used the example of the presence or absence of a symptom versus the presence or absence of a disease. You will actually have four combinations of data: no symptom and no disease, no symptom but disease present, symptom present but no disease, and both symptom and disease present. The author says most people tend to look at only two of these combinations, but we need to look at all four in order to properly assess correlation, as the sketch below shows.
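Here is what looking at all four cells buys you, with made-up counts (my illustration, not the author’s numbers):

```python
# Hypothetical 2x2 table of counts: symptom vs. disease.
#                 disease   no disease
# symptom            a=30        b=70
# no symptom         c=10        d=90
a, b, c, d = 30, 70, 10, 90

# Cell a alone looks unimpressive (only 30% of symptomatic people are
# sick), but comparing rates across rows uses all four cells:
rate_with_symptom    = a / (a + b)  # 30%
rate_without_symptom = c / (c + d)  # 10%
print(f"disease rate with symptom:    {rate_with_symptom:.0%}")
print(f"disease rate without symptom: {rate_without_symptom:.0%}")

# A standard correlation measure for a 2x2 table: the phi coefficient.
phi = (a * d - b * c) / ((a + b) * (c + d) * (a + c) * (b + d)) ** 0.5
print(f"phi coefficient: {phi:.2f}")  # 0.25: a real, if modest, association
```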

One last thing to note: correlation does not imply causation.

Okay, I’m done for the night.
