# The Mathematical Probability of Accuracy for Ecological Validity in a Given Experiment for Behavioral Psychology

In behavioral psychology, there are two ways to observe animal and human behavior: in a controlled laboratory environment, and in a non-controlled real-life environment.

Knowing how many independent variables^{1} exist in such an experiment is crucial to diagnosing the experiment’s accuracy. Ideally, one should have a single IV. When this IV changes and the DV consistently changes with it, one can accurately say that a direct correlation exists. Predictable outcomes, after all, are one of the primary goals of psychology.

In a controlled environment, it is quite easy to boil the IVs down to a single one. The time of day, the light intensity, the amount of food present—all of these things can be more or less kept constant from experiment to experiment. This is why laboratory testing was so popular in the early 1900s. Things were easy to control, and experiments like Little Albert and Pavlov’s dogs were in the books while they were still in the laboratories.

The problem with controlled experiments is that they often do not reflect real life at all. Dogs usually don’t hear bells before given food. Babies usually don’t hear loud noises after touching rats. Real life is much less consistent. Laboratories can give key insights into some aspects of existence, but they often don’t have a helpful takeaway for everyday life. They lack ecological validity.

To remedy this, scientists observe humans and animals in real life. The difficulty with this approach is that real life is much harder to control than a laboratory. There are many moving parts. No two mornings are alike in a non-controlled environment.

All of this is standard psychology-textbook material; it’s also intuitive even if you have never studied psychology. But as a person interested in mathematics, I wanted to quantify it: as you introduce new IVs into an experiment, what is the statistical likelihood that the experiment’s results accurately reflect a correlation between what you suspect is the primary IV and the resulting DV? A formula seemed the right way to answer such a question, so I sat down at my desk, pen in hand, and began to derive one.

First, I started with the obvious: a single IV leads to 100% accuracy. Let’s say you knew there were three ways to scare a rabbit: playing a loud noise, grabbing it suddenly, or showing it a predator. In a laboratory environment, you could throw away the last two (it’s a controlled environment, remember) and experiment with just the loud noise. That becomes your single IV. You observe that when you play a loud noise, the rabbit is scared, and when you don’t, it isn’t. Direct correlation, 100% predictability.

Second, I needed to find a pattern before I could begin writing my formula, so I asked: what happens when there are two IVs present? To use our example, what if you played a loud noise and grabbed the rabbit at the same time? Assuming you hadn’t performed the earlier experiment, you would conclude the following: something scared the rabbit, and it was either (1) the loud noise, (2) the grabbing, or (3) both. You know it is one of those three options, but you’re not sure which. If you had to choose among them, your accuracy would be 1 in 3, or about 33%.

Just by introducing a second IV, the accuracy dropped drastically from 100% to 33%! This was quite a jump. I needed one more data point before I could really write my formula. So, what happened if you introduced a third IV by also showing the rabbit a predator while performing experiment #2? Then you would have seven possible explanations for what scared it: (1) the loud noise, (2) the grabbing, (3) the predator, (4) the loud noise and the grabbing, (5) the loud noise and the predator, (6) the grabbing and the predator, or (7) all three. Seven combinations meant that the likelihood of picking the right one was 1 in 7, or about 14%.
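The counting above can be sketched in Swift (the language the original graph was written in). This is my own illustration, not the author’s gist; the `explanations` helper and the rabbit-themed names are mine. It enumerates every non-empty subset of the IVs, each of which is one candidate explanation:

```swift
// Every non-empty subset of the IVs is one possible explanation
// for what scared the rabbit. With n IVs there are 2^n - 1 of them.
func explanations(for ivs: [String]) -> [[String]] {
    let n = ivs.count
    var subsets: [[String]] = []
    for mask in 1..<(1 << n) {                      // 1 through 2^n - 1: skip the empty set
        var subset: [String] = []
        for i in 0..<n where mask & (1 << i) != 0 { // bit i set means IV i is included
            subset.append(ivs[i])
        }
        subsets.append(subset)
    }
    return subsets
}

let ivs = ["loud noise", "grabbing", "predator"]
let combos = explanations(for: ivs)
print(combos.count)                 // 7 candidate explanations
print(1.0 / Double(combos.count))   // ≈ 0.14 accuracy if you must pick one
```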

It was around this time that I realized we were working with binary math:

- 1 in binary is 1 in decimal
- 11 in binary is 3 in decimal
- 111 in binary is 7 in decimal

There was our formula! With each additional IV, you simply append a “1” to the binary number like a tally mark. A binary number made of n ones equals 2^n − 1 in decimal, and that decimal value becomes the denominator: with n IVs, the accuracy is 1/(2^n − 1).^{2}
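As a minimal Swift sketch of the formula (the `accuracy` function name is my own, not from the original gist):

```swift
// Accuracy of attributing the DV to the right IV combination:
// with n IVs there are 2^n - 1 non-empty combinations to choose from.
func accuracy(ivCount n: Int) -> Double {
    precondition(n >= 1, "need at least one IV")
    return 1.0 / Double((1 << n) - 1)
}

print(accuracy(ivCount: 1))   // 1.0 — direct correlation
print(accuracy(ivCount: 2))   // ≈ 0.333
print(accuracy(ivCount: 3))   // ≈ 0.143
```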

With my formula in hand, it was time to turn it into a visual.^{3} Thanks to Xcode 6 and Swift, I was able to code it up fairly quickly and post a gist of it for you to scrutinize. The resulting graph is the nexus of psychology, mathematics, and computer programming; it was a fun project to see to completion.

As you can see from this graphic, ecological validity is very difficult to attain with certainty. Real life contains many IVs, and after fewer than half a dozen of them, the accuracy of pinning the DV to the right IV plummets to near zero.
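The falloff is easy to see even without the graph. This little loop is my own illustration of the same 1/(2^n − 1) formula, not the original graphing code:

```swift
import Foundation  // for String(format:)

// Tabulate accuracy as a percentage for the first several IV counts.
for n in 1...8 {
    let percent = 100.0 / Double((1 << n) - 1)
    print(String(format: "%d IVs -> %6.2f%% accuracy", n, percent))
}
// By six IVs, the accuracy is already under 2%.
```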

This is the reason that, despite their limits, laboratories are still in use in psychology. It is also the reason Facebook tinkered with users’ feeds for a massive psychology experiment: if you insist on doing experiments in real life, you have to do them at such a scale that you can offset the huge unlikelihood that the IV you suspect of causing the DV outcome really is the right one.

- I assume you already know what independent and dependent variables are. For the remainder of this article, I use the acronyms IV and DV to denote them. ↩︎
- I studied mathematical proofs in discrete mathematics. You could get a lot more formal than I have here with a comprehensive proof, ending in *quod erat demonstrandum* (QED), or “that which was to be demonstrated.” But I’m really not interested in that, and I didn’t think you were either. ↩︎
- Unless you want to geek out over the code, you can safely ignore the values on the X axis. It’s the Y axis (percentage of accuracy) and the addition of new data points (each new dot is an additional IV) that you should find particularly interesting and, if you’re a behavioral psychologist, disturbing. ↩︎