I filed 20 postdoc applications this weekend.
Estimated number of offers: 0.1. That's right: with 22 applications in, I figure I have about a one-in-ten chance of getting an offer from any of them.
Real life is no fun.
Anyway, now that my big batch of postdoc applications is filed, I'll get back to the posts I meant to write last month, still hoping to complete the 30 posts I wanted to write in November by the end of this year. There is still lots of interesting physics to cover!
Good news, everyone! Well, good news for me at least: I've been granted a spot at the 2014 Science Online Together conference!
Science Online is an organization that, well, like the name suggests, supports people who promote and develop scientific content on the internet. They manage the Science Seeker blog aggregator and hold several annual conferences to bring together people involved in science online in all capacities. The flagship conference is always held near the beginning of the year in Raleigh, NC, and this time I get to go!
This is good news for you too, though. When I'm not busy conferencing I'll be uploading blog posts and tweeting, so that everyone else can share the experience as much as possible. Stay tuned for that as the conference is running, February 27 to March 1. For now, if you're interested in such things, conference news (and griping about the cost) is flowing under the #scio14 tag on Twitter.
At the beginning of this month, you may remember, I set out to write 30 blog posts in 30 days. Well, there are four days left, and I'm barely a third of the way to that target. It turns out that applying for postdoc positions takes up all your time, and then some, leaving precious little for blogging. Which is kind of a shame, because I had some good sciencey posts lined up.
Most of my postdoc application deadlines are coming up this week or early next week, so I have to prioritize those for now. To make it up to you, my reader(s), once I'm done with applications I'll keep going with all the posts I had wanted to write this month. With any luck, I can crank out all 30 by the end of December.
When I arrived in Princeton last Friday, I was greeted with this headline:
Emergency meningitis vaccine will be imported to halt Ivy League outbreak
Emergency doses of a meningitis vaccine not approved for use in the U.S. may soon be on the way to Princeton University to halt an outbreak of the potentially deadly infection that has sickened seven students since March.
Well then. Perfect timing. But seriously, it is actually a perfect time to reflect on why vaccines are necessary in the first place. And it's not (just) for the reason you might think.
If you're vaccinated against a disease, not only does it mean that you won't get sick, it also means that you won't pass that disease on to other people. Vaccinations protect the people around you too. And conversely, even if you're not vaccinated yourself, the more people around you who are, the lower your chances of catching the disease from someone else.
Let me illustrate this with a simple model of how a disease spreads. Imagine a world where people live in apartments on a perfect grid and only ever talk to their four neighbors, once a day.
Suppose one of these people gets sick.
Each time that person talks to their neighbor, they have some chance to pass on the disease. Maybe each day that they're sick, the person has a 20% chance to infect each neighbor. (That's a pretty high chance, but then again in reality most of us talk to a lot more than four people each day.) If the illness lasts three days, the odds are pretty good that one or two neighbors are going to get infected.
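A quick check on that claim: the chance that a given neighbor escapes infection on all three sick days is $0.8^3 \approx 0.51$, so each neighbor has nearly a coin-flip chance of catching the disease, and with four neighbors you'd expect roughly two new infections. A two-line sketch, using the same numbers as the example above:

```python
# Chance that a sick person infects a given neighbor at least once,
# with a 20% chance per daily conversation over a 3-day illness.
p_daily, sick_days, neighbors = 0.2, 3, 4
p_infect = 1 - (1 - p_daily) ** sick_days
print(p_infect, neighbors * p_infect)  # ≈ 0.488 per neighbor, ≈ 1.95 expected infections
```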
(light red squares represent people who have just gotten sick, dark red represent those who have been sick for a while)
And then those neighbors infect their neighbors. The four neighbors of Patient Zero (the first person to get sick) have eight new neighbors of their own, so there are even more opportunities for the infection to spread.
(blue squares represent people who are recovering)
And then those eight neighbors have 12 new neighbors, and those 12 have 16, and so on. Eventually you wind up with a wave of infection spreading back and forth through the population.
What if some of these people are vaccinated?
Vaccines don't work on everyone in real life, but to keep my example simple, I'll pretend vaccinated people are totally immune to the disease. They'll be represented by green squares. Look what happens if even 10% of the people in this imaginary grid city are vaccinated:
See the difference? The infection finds it a lot harder to spread. Instead of disease running rampant throughout the population, it's limited to one little cluster in a corner of the grid. The few people who are vaccinated act as a protective barrier of sorts — maybe one with a few holes in it, but still, enough to significantly hold back the spread of the disease and keep the unvaccinated 90% a lot safer.
Say we increase that to 20% of people getting vaccinated. Here's what that looks like:
Now how about that! The disease just sputters out without ever reaching most of the population!
Maybe that was just a fluke. Let's try again and let the randomness play out differently:
Same thing. One more:
Yet again, the infection disappears after just a few steps of the simulation.
Wait — I lied. Watching these little colored squares is too entertaining. This is the last one I swear:
It takes a little longer this time, but still, the infection disappears.
Evidently, you only need some critical fraction of the population to be vaccinated to stop a communicable disease in its tracks! This phenomenon is called herd immunity.
The critical fraction of vaccinated people you need to produce herd immunity depends on how people in the population interact with each other. In my toy example, there's very little interaction — each person only interacts with four neighbors, once a day. That makes the critical fraction fairly low; just 20% vaccinated is enough to stop the disease after a fairly short time.
In real life, people interact a lot more, which means the critical fraction is higher: 80%, 90%, 95%, or more. A community can only afford to leave a few people unvaccinated without spoiling its herd immunity. It's important to leave those "slots" for the few people who have legitimate medical reasons not to take a vaccine, so that they can be protected by the rest of us.
For the curious, here is the (completely unpolished) Python code I used to create the pictures.
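That code isn't reproduced here, but for a flavor of it, here is my own minimal sketch of the model described above: a square grid, four neighbors, a 20% daily infection chance, a 3-day illness, and vaccinated people treated as fully immune. None of the names or parameters come from the original code.

```python
import random

def simulate(size=30, vacc_frac=0.2, p_infect=0.2, sick_days=3, seed=1):
    """One run of the toy epidemic: a size x size grid of people, each of
    whom talks to their four neighbors once a day."""
    rng = random.Random(seed)
    # Cell states: 0 = susceptible, -1 = vaccinated, -2 = recovered,
    # and a positive number = days of illness remaining.
    grid = [[-1 if rng.random() < vacc_frac else 0 for _ in range(size)]
            for _ in range(size)]
    grid[size // 2][size // 2] = sick_days      # Patient Zero
    infected = 1
    while any(cell > 0 for row in grid for cell in row):
        new = [row[:] for row in grid]
        for i in range(size):
            for j in range(size):
                if grid[i][j] > 0:              # sick: try to infect each neighbor
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < size and 0 <= nj < size
                                and grid[ni][nj] == 0 and new[ni][nj] == 0
                                and rng.random() < p_infect):
                            new[ni][nj] = sick_days
                            infected += 1
                    new[i][j] -= 1              # one day closer to recovery
                    if new[i][j] == 0:
                        new[i][j] = -2          # recovered, now immune
        grid = new
    return infected

print(simulate(vacc_frac=0.0), "infected with no vaccination")
print(simulate(vacc_frac=0.2), "infected with 20% vaccinated")
```

Running it with different seeds reproduces the qualitative behavior in the animations: with no vaccination the infection sweeps through much of the grid, while at 20% coverage it often dies out in a small cluster.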
Evidently my post from a week ago on the rate of fires in Tesla electric cars compared to gas cars couldn't have come at a more appropriate time. People are still harping on the recent string of Tesla Model S fires, despite the fact that — as I showed in my last post — there's no evidence to suggest that the fire risk in a Tesla is any greater than that of a regular car. In fact, if anything it seems to be slightly less.
In my last post I kind of hinted at the fact that the rate of fires isn't the whole story. Even if a fire does happen, your risk of getting injured or killed is different in a Tesla than a normal car. Something similar goes for other types of accidents. So if you want to tell whether Teslas are safe, what you probably should be looking at is the overall rate of injuries and fatalities for Tesla drivers and passengers, compared to the equivalent for gas cars. And that number tells a very interesting story: Tesla CEO Elon Musk has written a new blog post which emphasizes that not one person has ever been killed or seriously injured while driving or riding in a Tesla!
It's an impressive record, but as Musk admits,
Of course, at some point, the law of large numbers dictates that this, too, will change, but the record is long enough already for us to be extremely proud of this achievement.
Well, I happen to know a thing or two about the law of large numbers (and so do you, if you read my earlier post). Let's see what we can tell from the fact that Teslas have been driven as much as they have without a serious accident. Just how proud should Tesla Motors be about this?
Here I'm just going to apply the same statistical methods from my other post, except now focusing on the number of deaths and serious injuries instead of the number of fires. So now $p$ designates the true, underlying probability per mile of a driver or passenger of a Tesla being seriously injured or killed for any reason related to the car.
As before, we don't know $p$, but we do know that the corresponding event (death or serious injury) has happened zero times over a hundred million miles driven: $n = 0$. If you followed my last post, then, you'll understand why the maximum likelihood estimate (the "best guess") of $p$ is just zero — that Teslas protect you perfectly from traffic-related injuries.
As much as Elon Musk (and a lot of other people) would like it to be the case, I think we all know that's not realistic.
Time to move on to a more sophisticated analysis. Remember that in my earlier blog post, I started with the formula for a Poisson distribution, which for a given value of $p$ gives the probabilities of experiencing various numbers of events (fires, in that case) per hundred million miles driven:

$$P(n) = \frac{(pN)^n e^{-pN}}{n!},\qquad N = 10^8\text{ miles}$$
But remember, we don't know $p$. What we do know is the actual number of events that happened, $n$. So as I showed in my previous post, we fix $n$ to be that specific value, rename this quantity a "likelihood" instead of a probability, and compare it for different guesses at $p$.
In the earlier post, an "event" was a fire, and there were $n = 3$ of them in a hundred million miles. That gave us the yellow curve in this graph:
This time, an "event" is a death or serious injury. There are $n = 0$ of them. That gives us the green curve in this graph:
My next step in the earlier post was to divide by the maximum likelihood and take the logarithm, giving

$$-2\ln\Lambda = -2\ln\frac{L(p)}{L_{\max}}$$
For fires, this gives a plot you might recognize from before,
whereas for deaths and serious injuries, the maximum likelihood estimate occurs at $p = 0$ and we get this:
Interesting — it's linear! This happens whenever the number of events observed is zero, as you can work out from the equations above:

$$-2\ln\Lambda = -2\ln\frac{(pN)^0 e^{-pN}/0!}{1} = 2pN$$

(in case it bothers you, $0^0 = 1$).
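You can verify the linearity numerically. Below is my own check, not code from the original post: with zero events the Poisson likelihood is $e^{-pN}$, the maximum likelihood is 1 at $p = 0$, and the log likelihood ratio comes out as exactly $2pN$.

```python
import math

N = 1e8   # miles driven

def likelihood(p, n=0):
    """Poisson likelihood of n events given probability p per mile."""
    lam = p * N
    return lam ** n * math.exp(-lam) / math.factorial(n)

def neg2_log_lambda(p):
    # Note 0.0 ** 0 == 1.0 in Python, so likelihood(0.0) is exactly 1
    return -2 * math.log(likelihood(p) / likelihood(0.0))

for p in (0.5e-8, 1.14e-8, 2e-8):
    print(p, neg2_log_lambda(p), 2 * N * p)  # the last two columns agree
```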
On that last graph, you'll see that I marked two dotted vertical lines, indicating the two hypotheses I want to test. A little explanation of these hypotheses is in order, because there was a bit of work involved in coming up with them.
See, Tesla's blog post says there has never been a death or a serious injury in a Tesla. So in order to figure out whether Teslas are statistically safer than normal cars, I need to look at the rate of fatalities and serious injuries in normal cars — basically that's just the equivalent of $p$ for a gas car. The specific quantitative hypothesis we should try to reject can be stated like this:
The rate of fatalities and serious injuries in a Tesla is at least as great as the rate of fatalities and serious injuries in a normal car.
I can find from an NHTSA report (PDF) that in 2012, there were 1.14 deaths and 80 injuries per hundred million miles driven on US roads overall. Most injuries in traffic accidents are minor, though. So all that tells us is that the total rate of deaths and serious injuries was somewhere between 1.14 and 81 per hundred million miles. That's a pretty wide range.
We can start by checking the boundaries of this range. First, consider the lower boundary, corresponding to $p = 1.14\times10^{-8}$ per mile. In words it would be stated like this:
The rate of fatalities and serious injuries in a Tesla is at least as great as the rate of fatalities in a normal car.
This hypothesis is represented by the leftmost dotted line, which crosses the log likelihood curve (the green line) at $-2\ln\Lambda = 2.28$, which is not really that unlikely. Remember from my earlier post that $2\sigma$ (a.k.a. $-2\ln\Lambda = 4$) is probably the most common threshold for statistical significance, albeit a rather weak one (you get it wrong 5% of the time), and this hypothesis doesn't even meet that threshold. So if gas-powered cars had 1.14 deaths and serious injuries per hundred million miles, we would not be able to confidently say that Teslas are safer.
On the other hand, think about the upper boundary, corresponding to $p = 81\times10^{-8}$ per mile. In words it would be
The rate of fatalities and serious injuries in a Tesla is at least as great as the rate of fatalities and all injuries in a normal car.
In this case, the relevant value of $p$ is way off the right edge of the chart! A vertical line at that value of $p$ would cross the (green) log likelihood line at $-2\ln\Lambda = 162$, which is huge. That's very unlikely. If that were the rate of deaths and serious injuries, we could conclude that Teslas are safer and effectively never be wrong. (The chance of being wrong would be about $4\times10^{-37}$. Check your understanding, if you like, by working this out from the cumulative distribution function of the chi-square probability distribution.)
Evidently, depending on just how many of those 80 injuries per hundred million miles are "serious," we may or may not be able to reject the original hypothesis — remember, that means we may or may not be able to conclude that Teslas are safer than normal cars. So I did some digging into data provided by the National Highway Traffic Safety Administration to try to pin this number down. It turns out that every year, NHTSA investigators go out and collect a fairly large sample (around 60 thousand) of police reports about car accidents, and incorporate them all into a database called NASS, the National Automotive Sampling System. From the NASS/GES data for 2012, I was able to determine that out of the car accidents that year in which someone was injured (but not killed), 22.6% involved at least one serious injury (for what seems like a reasonable definition of serious). So it stands to reason that $0.226 \times 80 \approx 18.1$ is a reasonable estimate of the number of traffic accidents per hundred million miles that involved a serious injury.
Adding that figure of 18.1 to the 1.14 fatalities per hundred million miles, I get the overall estimate of deaths and serious injuries per hundred million miles to be 19.3. The corresponding hypothesis is that $p = 19.3\times10^{-8}$ per mile, and that is marked with the vertical dotted line at the right of the last graph. That line intersects the green log likelihood line at $-2\ln\Lambda = 38.6$, which exceeds any reasonable threshold for statistical significance — not only the common $2\sigma$ threshold ($-2\ln\Lambda = 4$), but also the $5\sigma$ threshold ($-2\ln\Lambda = 25$) used in particle physics, and even the $6\sigma$ threshold ($-2\ln\Lambda = 36$) used in high-precision manufacturing. At $-2\ln\Lambda = 38.6$, we have only a one-in-two-billion chance of rejecting the hypothesis if it is in fact true.
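These tail probabilities are easy to check. For a chi-square distribution with one degree of freedom, the chance of wrongly rejecting a hypothesis excluded at a given value of the log likelihood ratio is $\operatorname{erfc}(\sqrt{x/2})$. Here is my own sketch of that check, using the fact that with zero observed events the log likelihood ratio is just twice the expected event count:

```python
import math

def chi2_tail_1dof(x):
    """P(chi-square with 1 d.o.f. > x): the chance of wrongly rejecting
    a true hypothesis that is excluded at -2 ln Lambda = x."""
    return math.erfc(math.sqrt(x / 2))

# With zero Tesla events observed, -2 ln Lambda = 2 * (expected events per 1e8 miles)
for rate in (1.14, 19.3, 81):
    x = 2 * rate
    print(f"{rate:5.2f} per 1e8 miles: -2 ln Lambda = {x:6.1f}, "
          f"tail probability = {chi2_tail_1dof(x):.1g}")
```

The 19.3 row lands right around the one-in-two-billion figure quoted above, and the 81 row gives the astronomically small chance from the upper-boundary case.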
So as far as we can tell from the statistics, we can pretty confidently state that Teslas are safer than gas-powered cars.
When I teach classes, sometimes I like to show a comic at the beginning to lighten the mood a bit, or perhaps keep people from falling asleep quite so easily. So I've been accumulating this list of science-themed comics for a few years now.
Of course there are always more, so I welcome suggested additions in the comments....
It's time for the post where I make excuses for why I haven't been keeping up with my target of one post per day this month.
Don't worry, I have a good one this time: I'm applying for postdoc positions. A lot of the application deadlines are coming up at the end of this month (well, actually a lot of them passed already, but the ones that haven't are coming up soon), so in addition to cranking out blog posts, I have to prepare my application materials.
I've found that in the world of physics, there's a certain set of documents you tend to be asked for when applying for a job, a postdoc, or an award (which is another unfortunate necessity of success in academia). I'm going to save all you young'uns who haven't gone through the process yet the trouble, and list the common documents you should have prepared ahead of time and keep up to date as needed:
It's worth noting that applications for postdocs and faculty jobs typically should be submitted almost a year ahead of when the position starts. For example, I'm looking at positions starting in fall 2014, and the application deadlines range from mid-September 2013 to the beginning of January 2014. So start early!
About a month ago, this happened: a Tesla Model S (electric car) ran over a large piece of metal which punctured its battery compartment, and the car caught on fire. It was a big deal because, according to CEO Elon Musk's blog post (first link above), that was the first time a Tesla had ever caught fire.
(image credit: Tesla Motors blog, from the previously linked post)
Since then there have been two more similar incidents in which a Tesla was involved in an accident and caught fire. Naturally, people are getting concerned: three high-profile fires in one month is a lot! But these incidents get more than their share of attention because electric cars are new technology without a proven safety record. So the question we all should be asking is, how does the fire risk in a Tesla compare to that of a regular, gas-powered car?
Most of Elon's blog post about the first incident discusses how well the safety features of the car performed after it did catch on fire, and how this would have been a catastrophic event if the car were gas-powered like a normal car, and now we should all be driving electric cars and so on. My interest here is purely in the statistics, though. Towards the end, he points out
The nationwide driving statistics make this very clear: there are 150,000 car fires per year according to the National Fire Protection Association, and Americans drive about 3 trillion miles per year according to the Department of Transportation. That equates to 1 vehicle fire for every 20 million miles driven, compared to 1 fire in over 100 million miles for Tesla. This means you are 5 times more likely to experience a fire in a conventional gasoline car than a Tesla!
Is that true? Could you really tell, based on that one occurrence of a Tesla fire, that you're five times more likely to experience a fire in a gas-powered car? And what about now that there have been two more — is that enough to reach a statistically significant conclusion?
One of the things to know about statistical significance is that it only tells you about when you can reject a hypothesis. And for that, you need a hypothesis. Statistics won't invent the hypothesis for you — that is, it won't tell you what your data means. You have to first come up with your own possible conclusion, some model that tells you the probabilities of various results, and then a statistical test will help you tell whether it's reasonable or not.
The hypothesis we want to test in this case is the statement "you are 5 times more likely to experience a fire in a conventional gasoline car than a Tesla." But to get a proper model that we can use statistical analysis on, we have to be a bit more precise about what that statement means. And in order to do that, there is one very important thing you have to understand:
The probability of something happening is not the same as the fraction of times it actually happens
This can be a tricky distinction, so bear with me. Here's an example: Teslas have covered a hundred million miles on the road. They have experienced three fires. So you might want to conclude, based on that data, that the probability per mile of having a Tesla fire is three hundred-millionths.
But if you had done the same thing a month ago, that same reasoning would lead you to conclude that the probability per mile is one hundred-millionth! That doesn't make any sense. The true probability per mile of experiencing a Tesla fire, whatever it is, can't have tripled in a month! Clearly, figuring out that true probability is not as simple as just taking the fraction of times it happened.
The true probability of an event is something we don't know, and in fact can never really know, because there's always some variation in how many times a random event actually happens. We can only estimate the probability based on the results we see, and hope that with enough data, our estimate will be close to whatever the actual value is. And it does work that way: as you collect more data, your estimate of the probability tends to get closer to the true probability. This statement goes by the name of the Law of Large Numbers among mathematicians.
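Here's a toy illustration of that convergence, with all numbers invented: pretend we secretly know the true fire probability per mile, simulate ever longer stretches of driving by drawing Poisson-distributed fire counts, and watch the estimated probability settle toward the true one. The sampler is Knuth's classic multiplication method.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw a Poisson-distributed count (Knuth's multiplication method)."""
    threshold, count, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= threshold:
            return count
        count += 1

rng = random.Random(7)
p_true = 3e-8    # an invented "true" probability of a fire per mile
for miles in (1e8, 1e9, 1e10):
    fires = poisson_sample(p_true * miles, rng)
    print(f"{miles:.0e} miles: {fires} fires, estimated p = {fires / miles:.2e} per mile")
```

With only a hundred million miles the estimate can easily be off by a factor of two or more; by ten billion miles it's typically within ten percent or so of the true value.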
A quick side note: you might notice that there are actually two unknown probabilities here, the true probability per mile of experiencing a fire in a gas car, and the true probability per mile of experiencing a fire in a Tesla. You can do all the right math with two unknown probabilities, but it gets complicated. To keep things from getting too crazy, I'm going to pretend that we know one of these: that the true probability for a gas car to catch on fire is five hundred-millionths ($5\times10^{-8}$) per mile. There's a lot of data on gas cars, after all, and that should be a pretty precise estimate.
With that in mind, take another look at the hypothesis.
you are 5 times more likely to experience a fire in a conventional gasoline car than a Tesla!
To rephrase, Elon is saying that the probability per mile of experiencing a fire in a Tesla is one-fifth of the probability per mile of experiencing a fire in a gasoline car. But we don't really know that, do we? After all, true probabilities can't be measured! What we can do is make some guesses at the true probabilities, and see how well each one corresponds to the results we've seen. Given this, and our assumption that we know the true probability of a gas car fire, Elon's statement says that the probability per mile of a Tesla catching on fire is one hundred-millionth ($10^{-8}$). That will be our hypothesis.
Moving forward, I'm going to use $p$ to represent the probability per mile of a Tesla fire. We don't know what its numerical value is, but our hypothesis is that $p = 10^{-8}$. Statisticians like to call this kind of quantity a parameter, because it's something unknown that we have to choose when constructing our hypothesis (as opposed to the probabilities of specific results, $P(n)$, which we'll be calculating later).
According to this hypothesis, the average number of Tesla fires per hundred million miles should be one. But of course, even if the probability per mile is actually $10^{-8}$, over a hundred million miles there's some chance of having two fires. Or three. Or none. It is a random occurrence, after all. We'll need to calculate the probabilities of having some particular number of fires every hundred million miles, starting from $p$.
To do this, we need to use something called a probability mass function (PMF). When you have a discrete set of possible outcomes — here, various numbers of fires per hundred million miles driven — the probability mass function tells you how likely each one is relative to the others. There are many different PMFs that apply in different situations. In our case, we have something really unlikely (a fire, $p \ll 1$), but we're giving it a large number of chances ($N = 10^8$ miles) to happen, so the relevant PMF is that of the Poisson distribution,

$$P(n) = \frac{\lambda^n e^{-\lambda}}{n!}$$
Here $n$ is some number of times the event (a car fire) could actually happen, $P(n)$ is the probability it will happen that number of times, and $\lambda = pN$ is the true average number of fires in a hundred million miles — the expectation value.
This plot shows the relative probabilities for an expectation value of one:
It's important to understand just what this graph is telling you. Let's say you decide to run an experiment, which consists of driving a car for a hundred million miles (and if it catches on fire, you swap to an identical car and keep going). And since you want to be a good scientist (you do, right?), you repeat the experiment many times to get reliable results. And suppose the underlying true probability of a car fire in this car is $p = 10^{-8}$ per mile. (You wouldn't know this value as you run the experiment, but I'm inventing this whole situation out of thin air so I can make it whatever I want.) The Poisson distribution tells you that in 36.8% of these hundred-million-mile trips, the car will not catch on fire. In another 36.8% of them, it will catch on fire once. In 18.4% of the trips, the car will catch on fire and the replacement car will catch on fire, for a total of two fires. And so on.
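Those percentages are just the Poisson PMF evaluated at an expectation value of 1; a quick check (my own snippet, not from the original post):

```python
import math

def poisson_pmf(n, lam=1.0):
    """Probability of exactly n events when lam are expected on average."""
    return lam ** n * math.exp(-lam) / math.factorial(n)

for n in range(4):
    print(n, f"{poisson_pmf(n):.1%}")
# n = 0 and n = 1 both come out to 36.8%, and n = 2 to 18.4%, matching the figures above
```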
But even more important is what this graph does not tell you. It doesn't tell you anything about what happens if you drive a car with a different probability per mile of catching on fire.
That's worth thinking about. It leads you to another very important thing you have to understand:
Probability tells you about the results you can get given a hypothesis, not about the hypotheses you can assume given a result
One of the most common ways people screw this up is to make an argument like this:
This hypothesis says that the probability of getting one car fire in a hundred million miles is 36.8%, so if Teslas have driven a hundred million miles and had one fire, there's a 36.8% chance that the hypothesis is right.
But that's just not true! The two different parts of this statement are saying very different things:
The probability of getting one car fire in a hundred million miles is 36.8%
This is the probability that the result happens, given the hypothesis. In other words, out of all the times you drive this particular type of car a hundred million miles, in 36.8% of them you will have exactly one fire.
there's a 36.8% chance that the hypothesis is right
This, on the other hand, is talking about the probability that the hypothesis is true, given the result. In other words, out of all the times you drive the car and get one fire, it's saying the hypothesis will be true in 36.8% of them. But that's not the case at all! For example, think back to that same experiment I described a few paragraphs back. You drive the same kind of car a hundred million miles, many times. The hypothesis, that $p = 10^{-8}$, is always true in that case. So obviously, out of all the times you get one fire, the hypothesis will be true in 100% of them. That's not 36.8%.
On the other hand, if you drive a different kind of car (with a different risk of fire) a hundred million miles, many times, you might get one fire in some of those trials, but the hypothesis will never be true. Still, that's not 36.8%.
The point is that there's no such thing as "the probability that the hypothesis is true." It might always be true, it might never be true, it might be true some of the time, but in a real situation, you don't know which of these is the case, because you don't know the underlying parameter $p$. So if we're going to evaluate whether the hypothesis is sensible, we need to do it using something other than probability.
Naturally, statisticians have a tool for evaluating hypotheses. It's called likelihood ratio testing, which is a fancy-sounding name for something that's actually quite simple. The idea is that instead of picking a fixed value for the parameter (in our case, the true probability of a fire per mile) when you make your hypothesis, you instead try a bunch of different values. For each one, you can calculate the probability of seeing the results you actually got, given that value of the parameter, and call it the likelihood of that parameter. Likelihood is just the same number as probability, in a different context. Then comparing those different likelihoods tells you something about how likely one value of the parameter is relative to another.
Take a look at these graphs.
The ones on top, like the earlier graph, show the probabilities $P(n)$ of various numbers of fires. Each graph corresponds to a different value of the parameter $p$. In each of the top graphs individually, all the bars add up to 1, as they should, because out of all the possible outcomes, something has to happen.
The bottom graph gives you the same calculation, but this time it shows how the probability — now called likelihood — of getting a fixed number of fires changes as you adjust the parameter. In other words, each curve on the bottom graph is comparing the ability of different hypotheses to produce the same outcome. The likelihood values for a given outcome don't have to add up to 1, because these are not mutually exclusive outcomes of the same experiment; they're the same possible outcome in entirely different experiments.
So what can we tell from this likelihood graph? Well, for starters, hopefully it makes sense that, assuming a given result, the most likely value of the parameter ($p$, in this case) should be the one that has the highest likelihood — the maximum of the graph. This seemingly self-evident statement has a name, maximum likelihood estimation, and it's actually possible to mathematically prove that, in a particular technical sense, it is the best guess you can make at the unknown value of the parameter.
The maximum likelihood method is telling us, in this case, that when there was only one Tesla fire in a hundred million miles driven, our best guess at $p$ was $1\times10^{-8}$ per mile, based on the red curve. Now that there have been three fires in a hundred million miles driven, our best guess is revised to $3\times10^{-8}$ per mile, using the yellow curve. So far, so good.
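That revision is easy to reproduce numerically: scan a grid of guesses at the parameter and keep the one that gives the observed count the highest Poisson likelihood. A sketch (my own code and variable names, not the original):

```python
import math

MILES = 1e8   # total distance driven

def likelihood(p, n):
    """Poisson likelihood of seeing n fires, given fire probability p per mile."""
    lam = p * MILES
    return lam ** n * math.exp(-lam) / math.factorial(n)

# Scan a grid of guesses at p and keep the one that best explains 3 fires
guesses = [i * 1e-9 for i in range(1, 101)]
best = max(guesses, key=lambda p: likelihood(p, n=3))
print(best)   # should land at p = 3e-8, i.e. three expected fires per 1e8 miles
```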
We can do more than just picking out the most likely value, though. I'll demonstrate with an example. Have a look at these other two likelihood graphs, which I just made up out of thin air. They don't have anything to do with the cars.
They both have the same maximum likelihood; that is, both the blue and the red curve are highest when the parameter is 3, so that's clearly the maximum likelihood estimate — the "best guess".
But the blue curve is very broad, so that other, widely separated values of the parameter also have fairly high likelihoods. For example, it could reasonably be 1 or 5. Thus we can't be very confident that the true value really is close to 3.
On the other hand, the red curve is very narrow. Only values very close to the maximum likelihood estimate have a high likelihood; values like 1 and 5 are very unlikely indeed.
Statisticians like to display this by graphing the "log likelihood ratio," defined as the logarithm of the ratio of the likelihood to the maximum likelihood, or in math notation:

$$-2\ln\Lambda = -2\ln\frac{L(p)}{L_{\max}}$$

Don't ask me why it's called $\Lambda$. Anyway, the plot looks like this:
The graph kind of evokes the image of a hole that you're dropping a ball into. If the hole is narrow, the ball is stuck pretty close to the bottom, but if the hole is wide, it has some freedom to roll around. Similarly, a narrow log likelihood ratio curve means that any reasonable estimate you can make for the value of the parameter is "stuck" near the most likely value, but a wide curve means there's a wide range of reasonable estimates, for some definition of "reasonable."
As you might imagine, there are various possible definitions of "reasonable," depending on how precise you want your statistical test to be. In practice, what you actually do is pick a threshold value of $-2\ln\Lambda$. Call it $t$. It splits all the possible values of the parameter, the horizontal axis, into three regions: a central one, where $-2\ln\Lambda$ is smaller than your threshold $t$, and two outer ones, where $-2\ln\Lambda$ is larger than $t$.
Then you go out and run the experiment again, find some result, and calculate the corresponding value of the parameter. Assuming the maximum likelihood estimate (the bottom of the curve) is right, the value you calculated has a certain probability of falling within the center region. If $t = 1$, the probability is about 68%. If $t = 4$, it's about 95%, because the center region is bigger. And so on.
If you pick a really high threshold, it's really really likely that in any future experiments, the value of the parameter you calculate will be in the center region. And in turn, because those values should cluster around the true value, that means the true value should also fall in the center region! You've basically excluded values of the parameter outside that center — in other words, you can reject any hypothesis that puts the value of the parameter outside your center region! The higher your threshold, the more confident you are in that exclusion. The probability associated with $t$ is basically the fraction of times it will turn out that you were right to reject the hypothesis.
By the way, you might recognize the probabilities I mentioned above: 68% in the center for $c = 1$, 95% in the center for $c = 4$, and so on. If you have some normally distributed numbers, then 68% of the results are within $\sigma$ (one standard deviation) of the mean, 95% are within $2\sigma$ of the mean, and so on. In this case, the numbers come from the chi-squared distribution, with one degree of freedom because there is one parameter, but the formula happens to work out the same, so you can make a correspondence between the threshold and the number of sigmas: $c = n^2$ for $n$ sigmas.
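If you want to check that correspondence yourself, it only takes the error function, since the chi-squared probability with one degree of freedom reduces to the familiar normal-distribution one. A quick sketch:

```python
import math

def central_probability(n_sigma):
    # P(-2 ln(lambda) < c) for c = n_sigma**2, under a chi-squared
    # distribution with one degree of freedom; equivalently, the chance
    # that a normal draw lands within n_sigma standard deviations
    return math.erf(n_sigma / math.sqrt(2))

for n in (1, 2, 3, 5):
    # c = 1 gives about 68%, c = 4 about 95%, c = 25 about 99.99994%
    print(n, n ** 2, central_probability(n))
```

The table this loop prints is exactly the threshold-to-sigma dictionary used in the text.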
If you've ever heard of the famous $5\sigma$ threshold in particle physics, this is where it comes from. (At last!) A "discovery" in particle physics corresponds to choosing $c = 25$, and doing more and more experiments to narrow down that center region until it excludes whatever value of the parameter would mean "the particle does not exist." For example, the blue curve below might represent a preliminary result of your experiment, where you exclude the value 5.5 only at a low significance (which is really saying nothing). The red curve might represent a final result, where you can now exclude 5.5 at $5\sigma$ — that means the probability of the true value differing from the "best guess" (3) by 2.5 or more is only about 1 in 1.7 million!
At this point, hopefully you understand why it's so hard to get a straight explanation out of a physicist as to what they mean by "discovered!"
Now that I've explained how likelihood ratio testing works, and what it means to exclude a hypothesis at a certain level (it's okay if you don't fully follow it yet; this is complicated stuff), let's see how this works for the Tesla fires.
If I go back to the graph of likelihood for the Tesla fires, and compute the corresponding log likelihood ratio $-2\ln\lambda$, I get this:
Remember, the red curve represents how the probability of having one fire per hundred million miles varies as we adjust our guess at the true probability of a fire. The yellow curve is the same except it's for three fires per hundred million miles. We would have used the red curve a month ago, after the first fire; as of today, we should use the yellow one.
I've also marked three possible hypotheses — that is, three possible true values of the parameter — with vertical lines (or in one case a shaded area). Well, actually one of them isn't a possible hypothesis, physically speaking. Clearly Teslas are not invulnerable; we know that because otherwise there would never have been any fires at all! But even if common sense didn't tell you that, the graph would. Notice how both the red curve and the yellow curve shoot up to infinity as the hypothesized fire rate approaches zero. That means, no matter what threshold you choose, a fire rate of exactly zero is always excluded ($-2\ln\lambda \to \infty$).
Let's now look at Elon's original hypothesis, that Teslas are five times safer than gas cars. At that point in the graph, the red curve is zero. So Elon was perfectly justified in saying that Teslas were five times less likely to experience a fire, based on the data he had after the first fire a month ago. These days, however, it wouldn't be quite so easy to make that claim. The yellow curve, the one based on the three fires we've seen so far, has a value of $-2\ln\lambda \approx 2.6$ there. That means the hypothesis is excluded at the $1.6\sigma$ level (because $\sqrt{2.6} \approx 1.6$), or that there is an 11% probability of seeing something at least as extreme as the observed result (three fires) given that hypothesis. That's not a small enough percentage to rule out the hypothesis by any common standard, so it's still potentially viable — just not as likely as it used to be.
If you ask me, though, the really important question is this: do Teslas have a lower fire risk than gas cars, period? To that end, I included the third hypothesis, that Teslas are just as safe as normal cars. That's the dotted line on the right. Based on the data from a month ago, after the first car fire, that hypothesis is excluded at the $2\sigma$ level (actually $2.2\sigma$, but it's common to round down). That corresponds to a 3% probability of seeing only one fire (or some even more extreme result) if the hypothesis is correct; in a sense, this also means that if you decided to go ahead and declare the Tesla safer than a gas car, you'd have a 3% chance of being wrong. For a lot of people outside particle physics, that's good enough; 5% (or $2\sigma$, remember) is a common cutoff. But 5% is still not that small. It means you're wrong one out of every 20 times on average.
Fast forward to today, and we now have to use the yellow curve. That intersects the hypothesis line quite a bit lower; in fact, it's less than one sigma! So based on current data, if you decide to reject the hypothesis that Teslas are as safe as gas cars, you have a one in three chance of being wrong. I don't think anyone would consider it safe to reject a hypothesis on those odds. For now, we have to live with the possibility that Teslas may not have any lower of a fire risk than regular gas-powered cars. Sorry, Elon.
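For readers who want to reproduce numbers like these, here is a sketch under a simple Poisson model. The exposure is my assumption, not an official figure: call it roughly a hundred million Tesla miles, so "as safe as a gas car" (about one fire per twenty million miles) predicts about 5 fires, and "five times safer" predicts about 1.

```python
import math

def neg2_log_lr(expected, observed):
    # -2 ln(lambda) for a Poisson count: `observed` fires seen when the
    # hypothesis predicts `expected`; the factorials cancel in the ratio,
    # and the maximum likelihood estimate is expected = observed
    if observed == 0:
        return 2.0 * expected
    return 2.0 * (expected - observed + observed * math.log(observed / expected))

def sigmas(c):
    # convert a -2 ln(lambda) value into an equivalent number of sigmas
    return math.sqrt(c)

# "As safe as a gas car" (about 5 expected fires over the assumed mileage):
print(sigmas(neg2_log_lr(5, 1)))  # about 2.2 sigma after one fire
print(sigmas(neg2_log_lr(5, 3)))  # just under 1 sigma after three fires
# "Five times safer" (about 1 expected fire):
print(sigmas(neg2_log_lr(1, 3)))  # about 1.6 sigma after three fires
```

Under these assumed exposures, the sketch lands on the same significances quoted above; with different mileage figures the exact sigmas would shift.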
Here's another post for my more technically-minded readers: in the course of writing the software for my latest research project (which I am still going to post about later this month), I needed algorithms for two-dimensional interpolation and quasi-Monte Carlo integration. Neither of these exists in GSL — the GNU Scientific Library, kind of a standard set of libraries for scientific software. So I wrote my own.
These might be useful for anyone else doing scientific computation.
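For a sense of what the interpolation half involves, the simplest two-dimensional scheme, bilinear interpolation on one grid cell, is just the textbook formula below. This is a Python sketch for illustration, not the C code from the library itself:

```python
def bilinear(x, y, x0, x1, y0, y1, f00, f01, f10, f11):
    # Textbook bilinear interpolation on the grid cell [x0,x1] x [y0,y1];
    # fij is the known function value at the corner (x_i, y_j)
    tx = (x - x0) / (x1 - x0)  # fractional position within the cell
    ty = (y - y0) / (y1 - y0)
    return (f00 * (1 - tx) * (1 - ty) + f10 * tx * (1 - ty)
            + f01 * (1 - tx) * ty + f11 * tx * ty)

print(bilinear(0.5, 0.5, 0, 1, 0, 1, 0.0, 0.0, 0.0, 1.0))  # prints 0.25
```

A full implementation mostly adds bookkeeping on top of this: locating which cell a point falls in, and handling higher-order (e.g. bicubic) variants.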