The other day I was talking with my friend Matt, and he told me he thought I knew a lot of people with names starting with “J.” I thought that was a bit strange, but I started looking through my Facebook friends list out of curiosity, and sure enough, Matt was right. The distribution of my Facebook friends’ first initials is below:
Matt then went further and claimed that most of my friends’ names begin with a “J.” As you can see, “J” is the most common initial with 43 names, followed by “S” (35), “M” (32), “D” (31), and “R” (30). And if your name starts with a “U” or an “X,” you are unlikely to be on my friends list.
So, I wanted to test Matt’s hypothesis that most of my friends’ names start with a “J.”
H0: the proportion of my friends’ names that start with a “J” is 0.5
H1: the proportion of my friends’ names that start with a “J” is greater than 0.5
My calculator gave me the following result:
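Therefore, at the α = .05 level, I fail to reject H0: the data give no evidence that more than half of my friends’ names start with a “J,” so Matt’s claim does not hold up. That is hardly surprising, since “J” accounts for only 43 names while the next four initials alone add up to 128.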
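For anyone who wants to try a test like this without a calculator, here is a minimal Python sketch of a one-sided, one-proportion z-test, which is one standard way to test these hypotheses (not necessarily what the calculator ran). The function name `one_proportion_z_test` is made up for illustration, and `total_friends = 400` is a hypothetical placeholder because the total number of friends is not stated above; only the count of 43 comes from the post.

```python
import math


def one_proportion_z_test(successes, n, p0=0.5):
    """One-sided (greater-than) z-test for a single proportion.

    Returns the sample proportion, the z statistic, and the
    upper-tail p-value P(Z >= z) under the null hypothesis p = p0.
    """
    p_hat = successes / n
    # Standard error uses the null value p0, not the sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Upper tail of the standard normal, written with the
    # complementary error function from the standard library
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return p_hat, z, p_value


j_count = 43          # names starting with "J" (from the post)
total_friends = 400   # HYPOTHETICAL total; the post does not give it

p_hat, z, p_value = one_proportion_z_test(j_count, total_friends)
print(f"sample proportion = {p_hat:.4f}")
print(f"z = {z:.2f}, one-sided p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

Whatever the true total is, it is at least 171 (the five most common initials alone), so the sample proportion sits well below 0.5 and the one-sided p-value will be far above .05.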
But then I started wondering whether my friends’ names follow the same distribution as other people’s friends’ names. So, I looked up the distribution of first initials from the 2010 Census and used a Chi-Square Goodness-of-Fit test, which gave me the following result:
Now, as a statistician, I know that these results are unreliable, since some of my cell counts were less than 5 (the usual rule of thumb asks for an expected count of at least 5 in every cell). However, this made me wonder how the calculator I used was able to give me a result anyway. I should have my statistics students next semester investigate the alternatives to the Chi-Square Goodness-of-Fit test when some of the cell counts are small or even zero, which is why I am leaving the answer to this question out here.
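To see why a calculator will happily return a number anyway, it helps to look at how the statistic is computed. The sketch below calculates the chi-square goodness-of-fit statistic and separately lists any categories whose expected count falls below 5. The function name `chi_square_gof` and the three-category data are hypothetical stand-ins, not my friends list or the Census figures.

```python
def chi_square_gof(observed, expected_props, min_expected=5):
    """Chi-square goodness-of-fit statistic with a small-cell check.

    observed:       dict mapping category -> observed count
    expected_props: dict mapping category -> hypothesized proportion
    Returns (statistic, degrees of freedom, categories whose
    expected count is below min_expected).
    """
    n = sum(observed.values())
    expected = {k: n * expected_props[k] for k in observed}
    # Sum of (O - E)^2 / E; this fails only if some expected count is 0
    stat = sum((observed[k] - expected[k]) ** 2 / expected[k]
               for k in observed)
    df = len(observed) - 1
    small_cells = sorted(k for k, e in expected.items() if e < min_expected)
    return stat, df, small_cells


# HYPOTHETICAL toy data, for illustration only
observed = {"A": 30, "B": 15, "C": 5}
expected_props = {"A": 0.60, "B": 0.32, "C": 0.08}

stat, df, small_cells = chi_square_gof(observed, expected_props)
print(f"chi-square = {stat:.4f} on {df} degrees of freedom")
print("cells with expected count below 5:", small_cells)
```

Nothing in the arithmetic stops when a cell is sparse, so the formula still produces a statistic; the small-cell check is a separate step that a calculator may simply not perform.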
I also started wondering if there is a way to flag a list of names as possibly fraudulent, much as Benford’s Law can be used to flag a suspicious list of numbers. Of course there is, but the statistical analysis of text is still a relatively new area. For example, if a company analyzes the text of a stack of purchase orders and finds that the majority of the orders came from the same person, it may wish to investigate.
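For readers who have not met Benford’s Law, it says that in many naturally occurring sets of numbers the leading digit d appears with probability log10(1 + 1/d), so about 30% of values start with a 1. The short sketch below compares observed leading-digit shares with those expected shares. The helper names and the `amounts` list are hypothetical, and a real fraud screen would need far more data than this.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit_shares(values):
    """Observed share of each leading digit among positive integers."""
    digits = [int(str(v)[0]) for v in values if v > 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# HYPOTHETICAL invoice amounts, for illustration only
amounts = [1200, 1875, 230, 3100, 145, 990, 17, 2640, 118, 5600]

expected = benford_expected()
observed = leading_digit_shares(amounts)
for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.2f}, Benford {expected[d]:.2f}")
```

A large gap between the two columns is only a prompt to look closer, not proof of fraud.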
Anyway, what interested me most is that all of these questions were prompted by a simple statement from someone who is not a mathematician. Matt was shocked by how excited I was to start playing with these numbers and told me that he hadn’t meant to make a statement so mathematical. But I reminded Matt that everyone makes mathematical statements on an everyday basis, whether they realize it or not. And sometimes, it is fun to dig a little deeper and play with the numbers. That’s part of the joy of mathematics.