The Chocolate Cake Hypothesis

How Null Hypotheses Add Flavor and Color to Scientific Inquiry

Aug 30, 2023

Hello, Simplify readers! This is my blog’s first guest post: a statistics explainer from an experienced psychological researcher. In other words, this is the blog’s first article not written by Klaus.

My Great Grandma Betty’s chocolate cake recipe is over 100 years old, and it yields the best chocolate cake in the world. My family knows that the ‘secret’ ingredient is marshmallow fluff blended with chocolate frosting. After all, the cake wouldn’t be Great Grandma Betty’s chocolate cake at all if it didn’t use marshmallow fluff!

Let's use this chocolate cake to build our intuition about hypotheses. My professor once told me that, if I think I understand something, then that’s a hypothesis. For example, I know that marshmallow fluff makes my Great Grandma Betty’s chocolate cake the best in the world. Therefore, if I change the amount of marshmallow fluff, that will affect the cake’s status as the best in the world!

A hypothesis is a statement about cause and effect. More formally, a hypothesis explicitly states the way in which two variables relate to one another or affect each other. These two variables are the independent variable, the one researchers deliberately change (the cause), and the dependent variable, the one researchers measure to detect the result of that change (the effect). In our example, whether we use marshmallow fluff is the independent variable (the cause), and whether the cake tastes the “best” is the dependent variable (the effect).

We can be a little more specific and define what we mean by the independent and dependent variables. By being more specific, we are more objective and can therefore measure what we intend. For the independent variable, we can either use or not use marshmallow fluff. For the dependent variable, we can define what we mean by “best”. There are several candidates, such as aesthetics, quality, and smell, but the one that best reflects our theory is how the cake tastes. For this, we can ask, “On a scale from 1 (Not Very Tasty) to 5 (Very Tasty), how tasty is the cake?”
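To make this concrete, here is a minimal sketch of how the two variables might look once they are written down as data. The ratings are invented purely for illustration, and names such as `responses` and `ratings_by_condition` are hypothetical, not taken from any real study.

```python
from collections import defaultdict

# Each record pairs the independent variable (did the frosting contain fluff?)
# with the dependent variable (taste rating on the 1-5 scale).
# These numbers are made up for illustration only.
responses = [
    (True, 5), (True, 4), (True, 5), (True, 3),
    (False, 3), (False, 4), (False, 2), (False, 3),
]

# Group the ratings by condition, rejecting anything outside the defined scale.
ratings_by_condition = defaultdict(list)
for used_fluff, rating in responses:
    if not 1 <= rating <= 5:
        raise ValueError(f"rating {rating} is outside the 1-5 scale")
    ratings_by_condition[used_fluff].append(rating)

mean_with_fluff = sum(ratings_by_condition[True]) / len(ratings_by_condition[True])
mean_without_fluff = sum(ratings_by_condition[False]) / len(ratings_by_condition[False])
print(f"with fluff: {mean_with_fluff:.2f}, without fluff: {mean_without_fluff:.2f}")
```

Once “best” has been pinned down as a number, the vague claim becomes something we can actually compare across the two cakes.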

What is the null hypothesis?

A world without my Great Grandma Betty’s chocolate cake is a colorless world. Yet this colorlessness is what the null hypothesis is all about. The word “null” means “nothing” as in there is “no effect”. As we saw earlier, a hypothesis predicts AN EFFECT. On the other hand, a null hypothesis predicts NO EFFECT. Therefore, the null hypothesis predicts that two variables are unrelated to or do not affect each other.

In shorthand, the null hypothesis is abbreviated as H₀. The hypothesis that predicts an effect is complementary to the null hypothesis yet different from it, so it is called the alternative hypothesis, abbreviated as H₁.

As is often done in research courses, let's compare the null hypothesis and the alternative hypothesis to see the difference between the two types of hypotheses. Stated explicitly, they are:

H₀: Using or not using marshmallow fluff in the chocolate frosting DOES NOT AFFECT the taste ratings.
H₁: Using or not using marshmallow fluff in the chocolate frosting DOES AFFECT the taste ratings.
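One common way to write the same pair in symbols, assuming we compare the average taste rating of cakes made with and without fluff (the symbols \mu_{\text{fluff}} and \mu_{\text{no fluff}} are my own notation for those two averages), is:

```latex
H_0:\ \mu_{\text{fluff}} = \mu_{\text{no fluff}}
\qquad
H_1:\ \mu_{\text{fluff}} \neq \mu_{\text{no fluff}}
```

The “not equal” sign reflects that H₁ as stated only says the fluff affects the ratings, without committing to a direction.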

In my opinion, the null hypothesis is counterintuitive. After all, don’t we want to PROVE that there is an effect to PROVE our theory? Don’t we want to live in a more colorful world where marshmallow fluff intensifies our chocolate cakes, fruit sweetens our yogurt, and butter flavors our toast?

Why science takes the color away (or does it?)

Scientists are skeptics by nature. They don’t just believe a claim or hypothesis to be true; they carefully observe and measure variables to test it. That skepticism also pervades how scientists test hypotheses, and it brings us into the realm of falsifiability.

Falsifiability, proposed by philosopher Karl Popper, is the idea that a statement or hypothesis is scientific if and only if it can be proven false. Think back to our earlier example about the impact of marshmallow fluff on making the cake the "best." This was actually an unfalsifiable alternative hypothesis because "best" wasn't clearly defined. I could easily move the goalposts by saying, "That's not what I meant by 'best.'" Science leans on falsifiability to weed out such ambiguous or untestable claims.

Consider the alternative hypothesis again (H₁): Using or not using marshmallow fluff in the chocolate frosting DOES AFFECT the taste ratings. If we falsified this, what would the outcome be? It would be our null hypothesis! That is, falsifying the alternative hypothesis lends support to our null hypothesis, leaving us in a colorless world until we can provide more data.

On the other hand, if we falsified the null hypothesis, then we would essentially be supporting the alternative hypothesis. So, while it appears to take away the magic, the null hypothesis refines our understanding and helps us see the world more clearly. Counterintuitively, it brings color back to our world.

Therefore, science aims to falsify, or reject, the null hypothesis.

Rejecting and failing to reject the null hypothesis

Let’s advance the falsifiability notion a little bit further. In the previous section, we considered whether the null hypothesis can be falsified. Now we will consider whether, after running our experiment and observing our results, we should reject or fail to reject the null hypothesis.

Consider two realities. In one reality, the null hypothesis is really true. That means, there really is no effect. In this case, we should fail to reject the null hypothesis, but sometimes we can make a mistake and reject it. In another reality, the null hypothesis is really false. That means there really is an effect. In this case, we should reject the null hypothesis, but sometimes we can make a mistake and fail to reject it.

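These outcomes can be seen more clearly in this table. On the left side, we have our decision: “reject” or “fail to reject”. On the top, we have our reality: the null hypothesis is true or false.

                  Null hypothesis is true    Null hypothesis is false
Reject            Mistake                    Correct decision
Fail to reject    Correct decision           Mistake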

As Klaus explained well in his article Statistical Significance Isn’t Magic, we calculate the p-value to provide evidence for our decision to reject or fail to reject the null hypothesis.

The p-value is calculated under the assumption that the null hypothesis is true (left column). That is, if we assume the null hypothesis is true, what is the probability that we would observe an effect at least as large as the one in our data anyway? If the p-value is quite small (scientists conventionally use a cut-off of .05), then our observed effect would be quite rare if the null hypothesis were true! Therefore, we should reject the null hypothesis. On the other hand, if the p-value is larger than .05, then our observed effect is more commonplace under the null hypothesis, so we shouldn’t be so surprised that we observed it. In this case, we would fail to reject the null hypothesis. By testing the null hypothesis, we effectively position ourselves to be pleasantly surprised by the data rather than forcing it to fit our preconceived notions.
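For readers who like to see the logic run, here is a minimal sketch of one way to estimate a p-value: a permutation test. This is my choice of technique for illustration, not the only way to compute a p-value. The ratings are invented, and the function name `permutation_p_value` and its parameters are hypothetical. The idea is that if the null hypothesis is true, the labels “fluff” and “no fluff” are interchangeable, so we can shuffle them many times and see how often chance alone produces a difference at least as large as the one we observed.

```python
import random

# Hypothetical 1-5 taste ratings (invented for illustration only).
fluff = [5, 4, 5, 4, 5, 3, 4, 5]
no_fluff = [3, 4, 2, 4, 3, 3, 5, 2]

def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in mean ratings.

    Assumes the null hypothesis is true, so group labels are interchangeable:
    shuffle the pooled ratings, split them again, and count how often the
    shuffled difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance guards against floating-point rounding on ties.
        if shuffled >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles

alpha = 0.05  # the conventional cut-off mentioned above
p = permutation_p_value(fluff, no_fluff)
decision = "reject H0" if p < alpha else "fail to reject H0"
print(f"p = {p:.3f} -> {decision}")
```

Notice that the code never asks whether the fluff “really” works. It only asks how surprising our data would be in a world where the fluff does nothing.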

The Final Layer

In conclusion, using the analogy of my Great Grandma Betty's chocolate cake, we've delved into the details of alternative hypotheses and null hypotheses in scientific research. The cake's 'secret' ingredient, marshmallow fluff, serves as our practical example of an independent variable, helping us understand skepticism in science, or falsifiability, and data interpretation. While it may seem counterintuitive, the null hypothesis plays an essential role in painting a clearer, more nuanced picture of reality. Far from making the world 'colorless,' the rigors of scientific testing allow us to appreciate the true 'colors' or effects we might otherwise take for granted. By striving to falsify the null hypothesis and relying on measurable outcomes, like p-values, science ensures that our conclusions are as grounded in evidence as possible. So the next time you bite into a slice of chocolate cake, think of it as a delicious reminder of how science, skepticism, and a dash of marshmallow fluff can help us better understand the world around us.

A guest post by

Jie Wen

Stats, science, and culture.