ellipsix informatics

About saturation

Time to kick off a new year of blog posts! For my first post of 2015, I'm continuing a series I've had on hold since nearly the same time last year, about the research I work on for my job. This is based on a paper my group published in Physical Review Letters and an answer I posted at Physics Stack Exchange.

In the first post of the series, I wrote about how particle physicists characterize collisions between protons. A quark or gluon from one proton (the "probe"), carrying a fraction x_p of that proton's momentum, smacks into a quark or gluon from the other proton (the "target"), carrying a fraction x_t of that proton's momentum, and they bounce off each other with transverse momentum Q. The target proton acts as if it has different compositions depending on the values of x_t and Q: in collisions with smaller values of x_t, the target appears to contain more partons.

kinematic diagram of proton composition

At the end of the last post, I pointed out that something funny happens at the top left of this diagram. Maybe you can already see it: in these collisions with small x_t and small Q, the proton acts like a collection of many partons, each of which is relatively large. Smaller x_t means more partons, and smaller Q means larger partons. What happens when there are so many, and so large, that they can't fit?

Admittedly, that may not seem like a problem at first. In the model I've been using so far, a proton is a collection of particles. And it seems totally reasonable that when you're looking at a proton from one side, some of the particles will look like they're in front of other particles. But this is one of those situations where the particle model falls short. Remember, protons are really made of quantum fields. Analyzing the proton's behavior using quantum field theory is not an easy task, but it's been done, and it turns out an analogous, but very serious, problem shows up in the field model: if you extrapolate the behavior of these quantum fields to smaller and smaller values of x_t, you reach a point where the results don't make physical sense. Essentially it corresponds to certain probabilities becoming greater than 1. So clearly, something unexpected and interesting has to happen at small x_t to keep the fields under control.

Parton branching and the BFKL equation

To explain how we know this, I have to go all the way back to 1977. Quantum chromodynamics (QCD), the model we use to describe the behavior of quarks and gluons, was only about 10 years old, and physicists at the time were playing around with it, poking and prodding, trying to figure out just how well it explained the known behavior of protons in collisions.

Most of this tinkering with QCD centered around the parton distributions f_i(x, Q^2), which I mentioned in my last post. Parton distributions themselves actually predate QCD. They first emerged out of something called the "parton model," invented in 1969, which is exactly what it sounds like: a mathematical version of the statement "protons are made of partons." So by the time QCD arrived on the scene, the parton distributions had already been measured, and the task that fell to the physicists of the 1970s was to try to reproduce the measurements of f_i(x, Q^2) using QCD.

When you're testing a model of particle behavior, like QCD, you do it by calculating something called a scattering cross section, which is like the effective cross-sectional area of the target particle. If the target were a sphere of radius r, for example, its cross section would be \pi r^2. But unlike a plain old solid sphere, the scattering cross section for a subatomic particle depends on things like how much energy is involved in the collision (which you may remember as \sqrt{s} from the last post) and what kinds of particles are colliding. The information about what kinds of particles are colliding is represented mathematically by the parton distributions f_i(x, Q^2). So naturally, in order to make a prediction using the theory, you need to know the parton distributions.

The thing is, we actually can't calculate them! Believe me, people are trying, but there's a fairly fundamental problem: parton distributions are nonperturbative, meaning they are inextricably linked to the behavior of the strong interaction when it is too strong for standard methods to handle. Physicists already knew this in the 1970s, but that didn't stop them from trying to calculate something about the PDFs which could be linked to experimental results.

perturbative and nonperturbative parton distributions

It turns out that even though the exact forms of the parton distributions can't be calculated from quantum field theory, you can calculate their behavior at small values of x, the green part on the left of the preceding diagram. In 1977, four Russian physicists — Ian Balitsky, Victor Fadin, Eduard Kuraev and Lev Lipatov — derived from QCD an equation for the rate of change of parton distributions with respect to x, in collisions with energy \sqrt{s} much larger than either the masses of the particles involved or the amount of energy transferred between them (Q, roughly). In modern notation, the equation (which I will explain later) is written

\pd{N(x, Q^2, \vec{r}_{01})}{\ln\frac{1}{x}} = \frac{\alpha_s}{2\pi}\int\uddc\vec{r}_2\frac{r_{01}^2}{r_{02}^2r_{12}^2} [N(x, Q^2, \vec{r}_{02}) + N(x, Q^2, \vec{r}_{12}) - N(x, Q^2, \vec{r}_{01})]

N is something called the color dipole cross section, which is related to f from before via an equation roughly like this:

f(x, Q^2) = \int^{Q^2}\iint N(x, k^2, \vec{r})\uddc\vec{r}\udc k^2

That's why f is often called an integrated parton distribution and N an unintegrated parton distribution. I won't go into the details of the difference between N and f, since both of them show the behavior I'm going to talk about in the rest of this post.

Anyway, the behavior that Balitsky, Fadin, Kuraev, and Lipatov analyzed comes from processes like these:

parton branching diagrams

At each vertex, one parton with a certain fraction x of the proton's momentum splits into other partons with smaller values of x. You can see this reflected in the equation: the term -N(x, Q^2, \vec{r}_{01}) represents the disappearance of the original parton, and N(x, Q^2, \vec{r}_{02}) + N(x, Q^2, \vec{r}_{12}) represents the creation of two new partons with smaller momentum fractions x. When this happens repeatedly, it leads to a cascade of lower-and-lower momentum particles as the branching process goes on. This explains why the number of partons, and thus the parton distribution functions, increase as you go to smaller and smaller values of x.
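As a rough illustration of why repeated branching piles up partons at small x, here is a toy cascade simulation. To be clear, this is my own sketch, not the actual BFKL kernel: each parton just splits its momentum uniformly between two daughters until the daughters fall below a resolution threshold in x.

```python
import random

def count_partons(x_min, seed=1):
    """Toy 1 -> 2 branching cascade (not the real BFKL kernel).
    Each parton with momentum fraction x splits into partons with
    fractions z*x and (1-z)*x, z uniform; branching stops once a
    parton's x drops below x_min.  Returns the number of partons
    resolved at that threshold."""
    rng = random.Random(seed)
    stack = [1.0]   # one parton carrying all of the proton's momentum
    partons = 0
    while stack:
        x = stack.pop()
        if x < x_min:
            partons += 1       # too soft to branch further at this resolution
        else:
            z = rng.random()   # momentum is shared between the daughters
            stack.extend([z * x, (1 - z) * x])
    return partons
```

Because the final partons share the original momentum, each one carries less than x_min of it, so lowering the threshold by a factor of ten forces at least ten times as many partons, which mimics the growth of the parton distributions at small x.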

This BFKL model has been tested in experiment after experiment for many years, and it works quite well. For example, in the plot below, from this paper by Anatoly Kotikov, you can see that the predictions from the BFKL equation (solid lines) generally match the experimental data (dots with error bars) quite closely.

comparison of F2 experimental data and BFKL predictions

The plot shows the structure function F_2, which is a quantity related to the integrated parton distribution.

Parton recombination

However, there is one big problem with the BFKL prediction: it never stops growing! After all, if the partons keep splitting over and over again, you keep getting more and more of them as you go to lower momentum fractions x. Mathematically, this corresponds to exponential growth in the parton distributions:

\mathcal{F}(x, Q^2) = \ud{f}{Q^2} \sim \frac{x^{-4\bar{\alpha}_s\ln 2}}{\sqrt{\ln\frac{1}{x}}}

which is roughly the solution to the BFKL equation.
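Plugging in numbers shows how steep this growth is. In the sketch below I take \bar{\alpha}_s = 0.2 as an assumed, purely illustrative value of the coupling, and drop the overall normalization:

```python
import math

def bfkl_solution(x, alpha_bar=0.2):
    """Leading small-x behaviour of the BFKL solution,
    x**(-4*alpha_bar*ln 2) / sqrt(ln(1/x)), with the overall
    normalization dropped.  alpha_bar = 0.2 is an assumed
    illustrative value, valid only for x < 1."""
    return x ** (-4 * alpha_bar * math.log(2)) / math.sqrt(math.log(1 / x))
```

With these numbers the exponent is 4 x 0.2 x ln 2, about 0.55, so each factor of ten down in x multiplies the distribution by roughly 10^0.55, about 3.6, with only a slow square-root correction. Nothing in the formula ever turns the growth off.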

If the parton distributions get too large, when you try to calculate the scattering cross section, the result "breaks unitarity," which effectively means the probability of two partons interacting becomes greater than 1. Obviously, that doesn't make sense! So this exponential growth that we see as we look at collisions with smaller and smaller x_t can't continue unchecked. Some new kind of physics has to kick in and slow it down. That new kind of physics is called saturation.

The physical motivation for saturation was proposed by two physicists, Balitsky (the same one from BFKL) and Yuri Kovchegov, in a series of papers starting in 1995. Their idea is that, when there are many partons, they actually interact with each other — in addition to the branching described above, you also have the reverse process, recombination, where two partons with smaller momentum fractions combine to create one parton with a larger momentum fraction.

parton recombination

At large values of x, when the number of partons is small, it makes sense that not many of them will merge, so this recombination doesn't make much of a difference in the proton's structure. But as you move to smaller and smaller x and the number of partons grows, more and more of them recombine, making the parton distribution deviate more and more from the exponential growth predicted by the BFKL equation. Mathematically, this recombination adds a negative term, proportional to the square of the parton distribution, to the equation.

\pd{N(x, Q^2, \vec{r}_{01})}{\ln\frac{1}{x}} = \frac{\alpha_s}{2\pi}\int\uddc\vec{r}_2\frac{r_{01}^2}{r_{02}^2r_{12}^2} [N(x, Q^2, \vec{r}_{02}) + N(x, Q^2, \vec{r}_{12}) - N(x, Q^2, \vec{r}_{01}) - N(x, Q^2, \vec{r}_{02})N(x, Q^2, \vec{r}_{12})]

When the parton density is low, N is small and this nonlinear term is pretty small. But at high parton densities, the nonlinear term has a value close to 1, which cancels out the other terms in the equation. That makes the rate of change \pd{N}{\ln\frac{1}{x}} approach zero as you go to smaller and smaller values of x, which keeps N from blowing up and ruining physics.
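If you strip out the integral over transverse positions entirely, the structure left over is that of a logistic equation, dN/dY = \lambda N(1 - N) with Y = \ln\frac{1}{x}. That's a drastic simplification of the real BK equation, and the values of \lambda and the starting N below are just assumed toy numbers, but it captures exactly the behavior described above: exponential growth while N is small, and a rate of change that shuts off as N approaches 1.

```python
def bk_toy(Y, lam=0.5, N0=1e-3, steps=100_000):
    """Logistic-equation caricature of the BK equation:
    dN/dY = lam * N * (1 - N), with Y = ln(1/x).
    The linear term plays the role of BFKL-like growth; the -N**2
    term plays the role of recombination.  lam and N0 are assumed
    toy values, not fitted physics."""
    N = N0
    dY = Y / steps
    for _ in range(steps):
        N += dY * lam * N * (1 - N)   # simple Euler step
    return N
```

At small Y the solution grows like e^(lam*Y), which is a power of 1/x, but instead of blowing up it saturates just below N = 1.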

By way of example, the following plot, from this paper (my PhD advisor is an author), shows how the integrated gluon distribution grows more slowly when you include the nonlinear term (solid lines) than when you don't (dashed lines):

plot of BFKL and BK solutions

So where does that leave us? Well, we have a great model that works when the parton density is low, but we don't know if it works when the density is high. That's right: saturation has never really been experimentally confirmed, although it's getting very close. In the third and final post in this series (not counting any unplanned sequels), I'll explain how physicists are now trying to do just that, and how my group's research fits into the effort.


Hooray, I have a postdoc!

I figured a quick update is in order to announce that starting this fall, I'll be a postdoc at Central China Normal University!

If you've been following this blog for a while, you may remember that I visited CCNU back in 2012 for a conference and a week of research collaboration. It's definitely different — you know, because China is not the US — but there's a lot to like about the place. CCNU is rapidly developing a strong international reputation for their theoretical physics research. They're well placed to take advantage of the Chinese drive to promote basic science; in particular, unlike in the US and, to some extent, even Europe, basic research in China still gets substantial amounts of financial support. The living costs are low, so even a small salary goes a long way, and I'll definitely be looking forward to all the delicious food of Hubei province.

There's a lot to do between now and the fall, when I move, so I'm sure I'll have more to say about this as it happens. It'll be nice to be a scientist for a little while longer.


What's in a proton?

Hooray, it's time for science! For my long-overdue first science post of 2014, I'm starting a three-part series explaining the research paper my group recently published in Physical Review Letters. Our research concerns the structure of protons and atomic nuclei, so this post is going to be all about the framework physicists use to describe that structure. It's partially based on an answer of mine at Physics Stack Exchange.

What's in a proton?

Fundamentally, a proton is really made of quantum fields. Remember that. Any time you hear any other description of the composition of a proton, it's just some approximation of the behavior of quantum fields in terms of something people are likely to be more familiar with. We need to do this because quantum fields behave in very nonintuitive ways, so if you're not working with the full mathematical machinery of QCD (which is hard), you have to make some kind of simplified model to use as an analogy.

If you're not familiar with the term, fields in physics are things which can be represented by a value associated with every point in space and time. In the simplest kind of field, a scalar field, the value is just a number. Think of it like this:

representation of a scalar field

More complicated kinds of fields exist as well, where the value is something else. You could, in principle, have a fruit-valued field, that associates a fruit with every point in spacetime. In physics, you'd be more likely to encounter a vector-, spinor-, or tensor-valued field, but the details aren't important. Just keep in mind that the value associated with a field at a certain point can be "strong," meaning that the value differs from the "background" value by a lot, or "weak," meaning that the value is close to the "background" value. When you have multiple fields, they can interact with each other, so that the different kinds of fields tend to be strong in the same place.

The tricky thing about quantum fields specifically (as opposed to non-quantum, or classical, fields) is that we can't directly measure them, the way you would directly measure something like air temperature. You can't stick a field-o-meter at some point inside a proton and see what the values of the fields are there. The only way to get any information about a quantum field is to expose it to some sort of external "influence" and see how it reacts — this is what physicists call "observing" the particle. For a proton, this means hitting it with another high-energy particle, called a probe, in a particle accelerator and seeing what comes out. Each collision acts something like an X-ray, exposing a cross-sectional view of the innards of the proton.

Because these are quantum fields, though, the outcome you get from each collision is actually random. Sometimes nothing happens. Sometimes you get a low-energy electron coming out, sometimes you get a high-energy pion, sometimes you get several things together, and so on. In order to make a coherent picture of the structure of a proton, you have to subject a large number of them to these collisions, find some way of organizing collisions according to how the proton behaves in each one, and accumulate a distribution of the results.

Classification of collisions

Imagine a slow collision between two protons, each of which has relatively little energy. They just deflect each other due to mutual electrical repulsion. (This is called elastic scattering.)

animation of elastic proton scattering

If we give the protons more energy, though, we can force them to actually hit each other, and then the individual particles within them, called partons, start interacting.

animation of low-energy inelastic scattering

At higher energies, a proton-proton collision entails one of the partons in one proton interacting with one of the partons in the other proton. We characterize the collision by two variables — well, really three — which can be calculated from measurements made on the stuff that comes out:

  • x_p is the fraction of the probe proton's forward momentum that is carried by the probe parton
  • x_t is the same, but for the target proton
  • Q^2 is roughly the square of the amount of transverse (sideways) momentum transferred between the two partons.

With only a small amount of total energy available, x_p and x_t can't be particularly small. If they were, the interacting partons would have a small fraction of a small amount of energy, and the interaction products just wouldn't be able to go anywhere after they hit. Also, Q^2 tends to be small, because there's not enough energy to give the interacting particles much of a transverse "kick." You can actually write a mathematical relationship for this:

(\text{total energy})^2 = s = \frac{Q^2}{x_p x_t}
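This relation is handy for estimating what's reachable: at a given collision energy \sqrt{s}, the product x_p x_t is pinned to Q^2/s. A quick numerical check (the energies below are just illustrative choices of mine, in GeV):

```python
def x_product(sqrt_s, Q):
    """x_p * x_t implied by s * x_p * x_t = Q^2 (energies in GeV)."""
    return (Q / sqrt_s) ** 2

# a low-energy collision: only moderate momentum fractions participate
low = x_product(30.0, 3.0)       # x_p * x_t = 0.01
# an LHC-scale collision at the same Q: far smaller x becomes accessible
high = x_product(7000.0, 3.0)
```

Raising \sqrt{s} from 30 GeV to 7 TeV at fixed Q lowers the accessible x_p x_t by a factor of more than fifty thousand, which is why modern accelerators can probe the small-x corner of the diagram.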

Collisions that occur in modern particle accelerators involve much more energy. There's enough to allow partons with very small values of x_p (in the probe) or x_t (in the target) to participate in the collision and easily make it out to be detected. Or alternatively, there's enough energy to allow the interacting partons to produce something with a large amount of transverse momentum. Accordingly, in these high-energy collisions we get a random distribution of all the combinations of x_p, x_t, and Q^2 that satisfy the relationship above.

Proton structure

Over many years of operating particle accelerators, physicists have found that the behavior of the target proton depends only on x_t and Q^2. In other words, targets in different collisions with the same values of x_t and Q^2 behave pretty much the same way. While there are some subtle details, the results of these decades of experiments can be summarized like this: at smaller values of x_t, the proton behaves like it has more constituent partons, and at larger values of Q^2, it behaves like it has smaller constituents.

This diagram shows how a proton might appear in different kinds of collisions. The contents of each circle represent, roughly, a "snapshot" of how the proton might behave in a collision at the corresponding values of x and Q^2.

kinematic diagram of proton composition

Physicists describe this apparently-changing composition using parton distribution functions, denoted f_i(x, Q^2), where i is the type of parton: up quark, antidown quark, gluon, etc. Mathematically inclined readers can roughly interpret the value of a parton distribution for a particular type of parton as the probability per unit x and per unit Q^2 that the probe interacts with that type of parton with that amount of momentum.
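To make that probabilistic reading concrete: if you fix one (x, Q^2) point and take some values of f_i there, the relative odds of the probe hitting each parton type are just the normalized distribution values. The numbers below are invented purely for illustration, not real PDF values:

```python
def hit_probabilities(f):
    """Normalize toy parton-distribution values f_i at a single
    (x, Q^2) point into relative interaction probabilities.
    The input numbers are invented, for illustration only."""
    total = sum(f.values())
    return {parton: value / total for parton, value in f.items()}

toy_pdf = {"gluon": 10.0, "up": 2.0, "down": 1.0}   # hypothetical values
probs = hit_probabilities(toy_pdf)
```

The made-up numbers are chosen gluon-heavy in the spirit of small-x collisions, where the gluon distribution dominates.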

This diagram shows how the parton distributions relate to the "snapshots" in my last picture:

kinematic diagram of proton composition with PDFs

The general field I work in is dedicated to determining these parton distribution functions as accurately as possible, over as wide of a range of x and Q^2 as possible.

As particle accelerators get more and more powerful, we get to use them to explore more and more of this diagram. In particular, proton-ion collisions (which work more or less the same way) at the LHC cover a pretty large region of the diagram, as shown in this figure from a paper by Carlos Salgado:

diagram showing LHC reach in x-Q^2 space

I rotated it 90 degrees so the orientation would match that of my earlier diagrams. The small-x, low-Q^2 region at the upper left is particularly interesting, because we expect the parton distributions to start behaving considerably differently in those kinds of collisions. New effects, called saturation effects, come into play that don't occur anywhere else on the diagram. In my next post, I'll explain what saturation is and why we expect it to happen. Stay tuned!


Obligatory musings on the Nobel Prize

You've probably heard that the Nobel Prize in Physics was awarded yesterday to François Englert and Peter Higgs, for the theoretical prediction of the Higgs boson. You've probably also heard all the commotion leading up to the announcement, about how silly it is that Nobel Prizes are awarded only to three people. And you may have noticed that I didn't weigh in.

Frankly, that's because I didn't really care. I'm sure it's a big deal to the recipients and non-recipients of the prize, but to the rest of us, the work done by all six authors stands on its own merits. The community of physicists doesn't need a prize to tell them whose research leads to a better understanding of the universe — and in the end, even if you ask most Nobel Prize winners, understanding the universe is what makes doing science worthwhile, not getting recognition.

If this year's debate gets people to look more closely at the actual science being done, and put less emphasis on who gets labeled a Nobel Prize winner, that can only be a good thing.

I'll leave you with the links to the Nobel-winning papers, all of which are free to download from the Physical Review Letters website: the paper by Englert and Brout, the paper by Higgs, and the paper by Guralnik, Hagen, and Kibble. And also, my own more accessible explanation of the Higgs mechanism from this very blog.


B meson decay confirmed!

Time for a blog post that has been far too long coming! Remember the Quest for B Meson Decay? I wrote about this several months ago: the LHCb experiment had seen one of the rarest interactions in particle physics, the decay of the \mathrm{B}^0_s meson into a muon and antimuon, for the first time after 25 years of searching.

Lots of physicists were interested in this particular decay because it's unusually good at distinguishing between different theories. The standard model (which incorporates only known particles) predicts that a muon and antimuon should be produced in about 3.56 out of every billion \mathrm{B}^0_s decays — a number known as the branching ratio. But many other theories that involve additional, currently unknown particles, predict drastically different values. A precise measurement of the branching ratio thus has the ability to either rule out lots of theoretical predictions, or provide the first confirmation on Earth of the existence of unknown particles!

Naturally, most physicists were hoping for the latter possibility — having an unknown particle to look for makes things exciting. But so far, the outlook doesn't look good. Last November, LHCb announced their measurement of the branching ratio as

\mathcal{B}(\mathrm{B}^0_s\to\ulp\ualp) = (3.2^{+1.5}_{-1.2})\times 10^{-9}

In the months since then, the LHCb people have done a more thorough analysis that incorporates more data. They presented it a couple of weeks ago at the European Physical Society Conference on High-Energy Physics in Sweden, EPS-HEP 2013, along with a similar result from the CMS experiment. Here's the exciting thing: the two groups were able to partially combine their data into one overall result, and when they do so they pass the arbitrary 5\sigma threshold that counts as a discovery!

Yes, I am about to explain what that means. :-)

Without Further Ado

In this plot from LHCb,

Plot of selected LHCb events

and this one from CMS,

Plot of selected CMS events

you can see the number of events detected in each energy range, the black dots, along with the theoretical prediction for what should be detected assuming the standard model's prediction of the decay rate is correct, represented by the blue line. The data and the prediction are clearly consistent with each other, but that by itself doesn't mean we can be sure the decay is actually happening. Maybe the prediction that would be made without \mathrm{B}^0_s\to\ulp\ualp decay would also be consistent with the data, in which case these results wouldn't tell you anything useful.

A more useful plot is this one from the CMS experiment, which shows the likelihood ratio statistic a.k.a. log-likelihood (on the vertical axis) as a function of the branching ratio (or branching fraction BF):

Plot of log likelihood of strange B meson decay

Basically, the plot tells you which values of the branching ratio are more or less likely to be the true branching ratio, using the measured value as a reference point. For example, CMS measured the branching ratio to be \num{3e-9}, so in the absence of other information, that's the most likely true value of the branching ratio. The most likely value is no less likely than itself (just meditate on that for a moment), so the curve touches zero — the bottom of the graph — at \num{3e-9}.

On the other hand, consider \num{2.1e-9}. That's not the number CMS measured, so it's less likely that the true branching ratio is \num{2.1e-9} than \num{3e-9}. How much less likely? Well, if you look at the plot, the value at \num{2.1e-9} is pretty much on the 1\sigma line. But you have to be careful how you interpret that. It does not tell you the probability that the true value is less than \num{2.1e-9}, but it does tell you that, if the true value is \num{2.1e-9}, the probability of having measured the result CMS did (or something higher and further away from the true value) is the "one-sided" probability which corresponds to 1\sigma: 16%. (That number comes from the normal distribution, by the way: 68% of the probability is within one standard deviation of the mean, leaving 16% for each side.) Yes, this is kind of a confusing concept. But the important point is simple: the higher the curve, the less likely that value is to be the true value of the branching ratio.

The most interesting thing to take away from this graph is the value for a branching ratio of zero, where the curve intersects the vertical axis. That tells you how likely it is that the branching ratio is zero, given the experimental data. (Technically zero or less, except that we know it can't be less than zero because a negative number of decays doesn't make any sense!) It's labeled on the plot as 4.3\sigma. That corresponds to a one-sided probability of \num{8.54e-6}, or less than a thousandth of a percent! In other words, if \mathrm{B}^0_s\to\ulp\ualp doesn't happen at all, the chances that CMS would have seen as many \ulp\ualp pairs as they did are about 0.001%. That's a pretty small probability, so it seems quite likely that \mathrm{B}^0_s\to\ulp\ualp is real.

Let's be careful, though! Because if you do run a hundred thousand particle physics experiments, you can expect one of them to produce an outcome with a probability of 0.001%. How do we know this isn't the one? Well, we don't. But here's what we can do: estimate the number of particle physics experiments and sub-experiments that get done over a period of, say, several years, and choose a probability that's low enough that you don't expect to get a fluke result in all those experiments. For example, let's say there are 10,000 experiments done each year. If you decide to call it a "discovery" when you see something that happens with a probability of 0.001%, you'll "discover" something that doesn't really exist every ten years or so. But if you hold off on calling it a "discovery" until you see something that has a probability of 0.00003% — that's about one in 3.5 million — you'll only make a fake discovery on average once every few centuries. Or in other words, once in the entire history of physics, since Newton! Physicists have chosen this threshold of a 1-in-3.5-million probability to constitute a "discovery" for exactly that reason, and also because it corresponds to a nice round number of sigmas, namely five.
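The sigma-to-probability conversions in the last few paragraphs all come from the tail of the normal distribution, and they're easy to check numerically:

```python
import math

def one_sided_p(n_sigma):
    """One-sided tail probability of a standard normal distribution:
    the chance of a fluctuation at least n_sigma above the mean."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))

# 1 sigma   -> about 16%
# 4.3 sigma -> about 8.5e-6
# 5 sigma   -> about 2.9e-7, i.e. roughly 1 in 3.5 million
```

The higher the curve climbs in sigmas, the faster these tail probabilities shrink, which is why a jump from 4.3\sigma to 5\sigma matters so much.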

So in order to say the decay \mathrm{B}^0_s\to\ulp\ualp has officially been discovered, we would need that curve on the plot to shoot up to 5\sigma by the time it hits the vertical axis. CMS isn't there yet.

The Combination

LHCb also has their own, separate result showing evidence for the decay. It's natural to wonder, could we combine them? After all, if it's unlikely that one experiment sees a decay that doesn't exist, it must be way less likely that two separate experiments independently see it!

That's true to some extent, but it's not an easy thing to properly combine results from different experiments. You can't just multiply a couple of probability distributions, because the experiments will often be made of different components, work in different ways, and have different algorithms for filtering and processing their data, and all of that has to be taken into account to produce a combined result. But in this case, the LHCb and CMS collaborations have decided their detectors are similar enough that they can do an approximate combination, which they shared during the EPS-HEP conference.

Plot of combined results

This plot shows the individual measurements and uncertainties from LHCb,

\mathcal{B}_\text{LHCb}(\mathrm{B}^0_s\to\ulp\ualp) = 2.9^{+1.1}_{-1.0}\times 10^{-9}

from CMS,

\mathcal{B}_\text{CMS}(\mathrm{B}^0_s\to\ulp\ualp) = 3.0^{+1.0}_{-0.9}\times 10^{-9}

as well as the combined value,

\mathcal{B}_\text{combined}(\mathrm{B}^0_s\to\ulp\ualp) = (2.9\pm 0.7)\times 10^{-9}

and the standard model prediction in the green band,

\mathcal{B}_\text{SM}(\mathrm{B}^0_s\to\ulp\ualp) = (3.56\pm 0.30)\times 10^{-9}

The uncertainties are 1\sigma, which means the error bars indicate how far it is from the measured value to the point at which the log-likelihood curve passes the 1\sigma mark. You can see that they match up in this composite of the two plots:

Cross-referenced plots of combination with CMS log-likelihood curve

You'll notice that the error bars on the combined result are smaller than those for either the CMS or LHCb results individually. That means the log-likelihood curve shoots up faster for the combined result, apparently fast enough that it's above the 5\sigma level by the time the branching ratio reaches zero — which is exactly the criterion to say the decay \mathrm{B}^0_s\to\ulp\ualp is officially discovered.
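To see why combining shrinks the error bars, here's a naive inverse-variance-weighted average, with the two measurements symmetrized by hand (values in units of 10^-9). This is emphatically not the careful combination the collaborations performed, just a back-of-the-envelope check:

```python
def combine(measurements):
    """Naive inverse-variance-weighted average of (value, sigma)
    pairs.  Real experimental combinations account for correlations
    and detector differences; this only illustrates why the combined
    uncertainty is smaller than either input."""
    weights = [1.0 / sigma ** 2 for _, sigma in measurements]
    mean = sum(w * v for (v, _), w in zip(measurements, weights)) / sum(weights)
    sigma = (1.0 / sum(weights)) ** 0.5
    return mean, sigma

# roughly symmetrized versions of the two results (my approximation)
mean, sigma = combine([(2.9, 1.05), (3.0, 0.95)])
```

This gives about 2.95 with an uncertainty of about 0.70, in the same ballpark as the quoted combined value, and the combined uncertainty comes out smaller than either input, as it must.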

Of course, the standard model predicted that this decay would occur. So the thing that LHCb and CMS have discovered, namely that the log-likelihood is above the 5\sigma level for a branching ratio of zero, isn't really unexpected. It would be much more exciting if they found that the log-likelihood is above the 5\sigma level at the value predicted by the standard model — that would indicate that the real branching ratio is not what the standard model predicts, so some kind of new physics is definitely happening! But we'll be waiting a long time to see whether that turns out to be the case.


Have we really found a tetraquark?

Hooray, it's time for another blog post! What I'm writing about this time is kind of old news — and don't worry, there's more to come on just why I haven't been able to write about it for so long — but very interesting nonetheless.

A few weeks ago, two separate physics experiments announced that they had discovered a tetraquark, a composite particle made of four quarks. Or rather, that's what all the popular news coverage said. But what really happened? The discovery of a real tetraquark would be huge news, so I'm sure not going to trust the media reports on this one. As always, I'm going straight to the source: the original papers by the BES III and Belle collaborations.

Two or Three is Company, Four's a Crowd

Before delving into the discovery itself, I'm going to tackle the burning question on everybody's mind: what's so special about a particle made of four quarks?

To understand that, we have to look to quantum chromodynamics (QCD), the theory of how "color-charged" particles interact. In some ways, QCD is superficially similar to quantum electrodynamics, the theory of how electrically charged particles interact. You can rely on your familiar intuition that positive and negative charges attract, for instance. Just as positively charged protons and negatively charged electrons attract each other, and can bind together to form atoms, particles with matching positive and negative color charges can bind together to form particles with two quarks (actually one quark and one antiquark) which are called mesons.

But unlike electromagnetism, where there is only one kind of charge (electric), QCD includes three kinds of charge, each of which can be independently positive or negative. The three charges are called red, green, and blue, even though they have nothing to do with visible colors. They're often shown in a diagram like this one:

color diagram

If there are three independent charges, why don't I show them using three dimensions? Well, it turns out that a combination of a red, a green, and a blue charge is precisely equivalent to no color charge at all. Accordingly, if you add together the (positive) red, green, and blue vectors from that image, it gets you right back to the center. An object like this — whether uncharged, or positive and negative of the same color, or red+green+blue, or some combination of those — is called a color singlet.
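You can check the "red + green + blue = colorless" statement directly: draw the three color charges as unit vectors 120 degrees apart, as in the diagram (the specific angles below are just the conventional way of drawing it), and add them up.

```python
import math

def color_vector(angle_deg):
    """Unit vector for one color charge in the standard 2D diagram."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

# red, green, blue drawn 120 degrees apart (conventional choice)
red, green, blue = (color_vector(a) for a in (90.0, 210.0, 330.0))
total = tuple(r + g + b for r, g, b in zip(red, green, blue))
# total is (0, 0) up to rounding: red + green + blue is a color singlet
```

The same cancellation works for a positive and negative charge of one color, which is why both mesons and baryons qualify as singlets.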

Now, one of the consequences of QCD, and a rule drilled into all young particle physicists at an early age, is that all naturally occurring particles are color singlets. The reason is simple: remember that the force which acts on color-charged objects (analogous to how the electromagnetic force acts on electrically charged objects) is called the strong force, because, well, it's strong. If there were any loose color-charged particles floating around, they would be immediately attracted to other particles with different color charges, forming color-neutral groups. In fact this process, hadronization, actually took place early in the universe's formation, shortly after the big bang. It also happens in heavy-ion collisions in modern particle accelerators, anywhere a quark-gluon plasma is produced.

There are plenty of known particles that constitute the simplest kinds of color singlets: a quark and antiquark of the same color or three quarks (or antiquarks) of different colors. But what about more complicated combinations? You can imagine a particle composed of two quarks and their corresponding antiquarks (a tetraquark) or four quarks and an antiquark, or vice-versa (a pentaquark). They're valid color singlets because the net color charge is zero, but they can break apart into smaller color singlets with two or three quarks, and that's presumably why there aren't a bunch of them floating around the universe.

decay of exotic color particles

Like other unstable particles, though, tetraquarks and pentaquarks can be produced in the collisions made in a particle accelerator. At least, that's what the theory says, but in decades of colliding particles (until about a year ago, if this discovery is real), we had never seen clear evidence that a tetraquark or pentaquark was produced.

Finding a Particle in a Haystack

If you have a particle that decays into two other particles, the total momentum of the two products, or final-state particles, is equal to the momentum of the original, initial-state particle. Same goes for energy.

\begin{align}E_i &= E_{f1} + E_{f2} \\ \vec{p}_i &= \vec{p}_{f1} + \vec{p}_{f2}\end{align}

This is useful because special relativity tells us the energy and momentum of the original particle satisfy E_i^2 - p_i^2 c^2 = m_i^2 c^4, where m_i, the mass of the particle, is a constant. Plugging in, you get

(E_{f1} + E_{f2})^2 - (\vec{p}_{f1} + \vec{p}_{f2})^2 c^2 = m_i^2 c^4

All these variables, the energies and momenta of the products, are measured by the detector. You can plug the energies and momenta of pairs of particles into this formula, and if they came from the same decay, you'll get the mass of the intermediate particle they came from. If they didn't come from the same decay, you'll get a more or less random value.

Somewhat confusingly, m_i, also denoted m_{12}, is called the invariant mass of the pair 1,2 even if it doesn't match the mass of the thing the particles decayed from (or even if those two particles didn't come from the same decay at all).
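In natural units where c = 1, this pairing computation takes only a few lines of code. Here's a minimal sketch (the function name and example values are mine, not from either experiment):

```python
import math

def invariant_mass(E1, p1, E2, p2):
    """Invariant mass of a two-particle system, in natural units (c = 1).

    E1, E2 are energies; p1, p2 are momentum 3-vectors as (px, py, pz).
    """
    E = E1 + E2
    px, py, pz = (p1[i] + p2[i] for i in range(3))
    m2 = E**2 - (px**2 + py**2 + pz**2)
    return math.sqrt(max(m2, 0.0))

# A particle of mass 10 at rest decays into two back-to-back products,
# each with E = 5 and |p| = 3; the pair's invariant mass recovers 10.
m = invariant_mass(5.0, (3.0, 0.0, 0.0), 5.0, (-3.0, 0.0, 0.0))
```

Feeding in a pair that did not come from one decay gives no such special value, which is exactly the point of the histograms below.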

Virtual particles

Hold on — I lied. Yes, just like everything else in physics, it's not that simple. In quantum field theory, a particle — which is really a certain kind of disturbance in a quantum field — doesn't satisfy E_i^2 - p_i^2 c^2 = m_i^2 c^4 exactly. If you produce a bunch of identical particles of a type that has mass m_i, the quantity E_i^2 - p_i^2 c^2 won't always be equal to m_i^2 c^4; instead, it follows a probability distribution which roughly resembles

P(E_i) \sim \frac{1}{\bigl(E_i^2 - p_i^2 c^2 - m_i^2 c^4\bigr)^2}

When you take pairs of particles and calculate (E_1 + E_2)^2 - (\vec{p}_1 + \vec{p}_2)^2 c^2 = m_{12}^2 c^4 (and note that m_{12} is still called the invariant mass of the pair despite being almost completely unrelated to the mass of any particle involved), even if they did come from the same decay, you don't get a single value. You get a peaked distribution, like this:

1D histogram of virtual particle decay

Particles which didn't come from a single decay will still give you a mostly flat distribution:

1D histogram of nothing
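As a toy illustration of the difference (entirely made-up numbers, with a Cauchy curve standing in for the peaked distribution above), you can simulate both cases and histogram them:

```python
import math, random

random.seed(1)
m0, width = 3.9, 0.05            # toy particle mass and width in GeV (made up)
lo, hi, nbins = 3.0, 4.8, 60     # histogram range and binning
binw = (hi - lo) / nbins

def histogram(samples):
    counts = [0] * nbins
    for x in samples:
        if lo <= x < hi:
            counts[int((x - lo) / binw)] += 1
    return counts

# "Signal" pairs: invariant masses scattered around m0 with the long-tailed
# peaked shape described above (a Cauchy/Breit-Wigner-like curve).
signal = [m0 + 0.5 * width * math.tan(math.pi * (random.random() - 0.5))
          for _ in range(100_000)]
# "Background" pairs that share no parent: a roughly flat distribution.
background = [random.uniform(lo, hi) for _ in range(100_000)]

sig_counts = histogram(signal)
bg_counts = histogram(background)
peak_mass = lo + (sig_counts.index(max(sig_counts)) + 0.5) * binw  # near 3.9
```

The signal histogram spikes at the bin containing the true mass, while no background bin stands far above the others.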

Dalitz plots

In electron-positron collisions, the kind that take place in BES III and Belle, the only way you get exactly two final-state particles out of the collision is if the electron and positron turn into a single particle which then decays into two products. There are not a whole lot of options for what that single particle can be, because of all the conservation laws that apply; it's basically limited to photons, Z bosons, and Higgs bosons. So to have any hope of finding other interesting particles, we have to look at other collisions with more complicated decay chains. For example, the tetraquark candidate that BES III and Belle have discovered, called the Z_c(3900), is produced like this:

Zc(3900) decay chain

A multi-stage decay means that more than two particles are produced, and that means you can't just pick the two decay products of the particle you want to discover. Think about it — if everything in the yellow circle below is hidden from you, how would you know that it's the J/\psi and \pi^+ that came from the Z_c(3900), rather than the J/\psi and \pi^-?

Zc(3900) decay schematic

In these cases, the way to go is to make histograms of (E_1 + E_2)^2 - (\vec{p}_1 + \vec{p}_2)^2 c^2, like those in the last section, for each pair of particles that comes out. If any two of those particles came from one decay, you'll get a spike in the corresponding plot, but the other plots will be basically flat.

However, if there are only three final-state particles, as in \elp\ealp\to J/\psi\pipm\pimm, we don't actually need to make all three histograms. Check this out:

(E_{J/\psi} + E_{\pipm} + E_{\pimm})^2 = (E_{J/\psi} + E_{\pipm})^2 + (E_{J/\psi} + E_{\pimm})^2 + (E_{\pipm} + E_{\pimm})^2 - E_{J/\psi}^2 - E_{\pipm}^2 - E_{\pimm}^2

The analogous equation works for momentum as well:

(\vec{p}_{J/\psi} + \vec{p}_{\pipm} + \vec{p}_{\pimm})^2 = (\vec{p}_{J/\psi} + \vec{p}_{\pipm})^2 + (\vec{p}_{J/\psi} + \vec{p}_{\pimm})^2 + (\vec{p}_{\pipm} + \vec{p}_{\pimm})^2 - p_{J/\psi}^2 - p_{\pipm}^2 - p_{\pimm}^2

Subtracting c^2 times the latter equation from the former, since E^2 - p^2 c^2 = m^2 c^4, gives

m_{J/\psi\pipm\pimm}^2 = m_{J/\psi\pipm}^2 + m_{J/\psi\pimm}^2 + m_{\pipm\pimm}^2 - m_{J/\psi}^2 - m_{\pipm}^2 - m_{\pimm}^2

(with an overall factor of c^4 that I left out). Now, m_{J/\psi\pipm\pimm} on the left is the same as the center-of-mass energy of the colliding electron and positron. And the masses of the individual particles are known, so the three quantities we'd be making histograms of, m_{J/\psi\pipm}^2, m_{J/\psi\pimm}^2, and m_{\pipm\pimm}^2, always have to add up to the same value! They're not independent. That means any peak from an intermediate decay has to show up on at least two of the plots.
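That constraint is an algebraic identity, so it's easy to check numerically. Here's a quick sketch with invented momenta (natural units, c = 1; the J/\psi and pion masses are the standard values in GeV):

```python
import random

random.seed(0)

def m2(E, p):
    # Squared invariant mass (c = 1) of total energy E and momentum 3-vector p
    return E**2 - sum(q * q for q in p)

# Three made-up final-state particles with the J/psi and pion masses (GeV)
masses = [3.097, 0.140, 0.140]
ps = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in masses]
Es = [(m * m + sum(q * q for q in p)) ** 0.5 for m, p in zip(masses, ps)]

def pair_m2(i, j):
    return m2(Es[i] + Es[j], [ps[i][k] + ps[j][k] for k in range(3)])

total = m2(sum(Es), [sum(p[k] for p in ps) for k in range(3)])
constraint = (pair_m2(0, 1) + pair_m2(0, 2) + pair_m2(1, 2)
              - sum(m * m for m in masses))
# total == constraint for *any* choice of momenta, which is why the three
# pair masses always add up to the same number.
```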

With that in mind, you can put two of these histograms — any two — on the two axes of a scatter plot. This is called a Dalitz plot. Each \elp\ealp\to J/\psi\pipm\pimm collision appears as one point on the graph, with x coordinate m_{J/\psi\pipm}^2 c^4 = (E_{J/\psi} + E_{\pipm})^2 - (\vec{p}_{J/\psi} + \vec{p}_{\pipm})^2 c^2 and y coordinate m_{J/\psi\pimm}^2 c^4 = (E_{J/\psi} + E_{\pimm})^2 - (\vec{p}_{J/\psi} + \vec{p}_{\pimm})^2 c^2. An intermediate particle will show up as a clumping together of points along either the horizontal or vertical axis, corresponding to a peak in a histogram.

Dalitz plot with sample data

This example uses the same data from the histograms in the previous section. It's much cleaner than you'd get in reality, but you see the point: the peak of the histogram corresponds to the dense line of dots running up and down the middle of the plot.

The Big Deal... Maybe

For the real data, we turn to the papers from BES III and Belle. Both experiments have published Dalitz plots showing M_{J/\psi\pipm}^2 against M_{\pipm\pimm}^2. This is from BES III:

BES III Dalitz plot

and this from Belle:

Belle Dalitz plot

The inset plots in the top right show an estimate of the background, based on the "J/\psi sideband": essentially, collisions in which no J/\psi was produced.

If you look around M_{J/\psi\pi^{\pm}}^2 = \SI{15.2}{GeV^2}/c^4 = (\SI{3.9}{GeV}/c^2)^2, you can see a slight clump of points running up and down the graph. Here it is highlighted for clarity, first from BES III:

BES III Dalitz plot with ridge highlighted

and Belle

Belle Dalitz plot with ridge highlighted

It's not all that clear in these plots, but if you look at the corresponding 1D histograms for M_{J/\psi\pipm} (formed by adding up dots down the columns), there's a clear difference between the theoretical prediction without the particle (red lines) and the data (black dots):

BES III histogram


Belle histogram

You can see the spike at \SI{3.9}{GeV}/c^2, and another bump at \SI{3.5}{GeV}/c^2 which the paper assures us is an "echo" of the same particle. That structure indicates an intermediate particle, the Z_c(3900), with a mass of around \SI{3.9}{GeV}/c^2 (the peak of the graph), which decays into a J/\psi and a \pipm.

That's the funny thing, though. Since the J/\psi is electrically neutral and the \pi^+ has an electric charge of +1, we know the Z_c(3900) also has +1 electric charge. On the other hand, the J/\psi is made of a charm quark and an anticharm quark, which means that most likely the Z_c(3900) also contains charm and anticharm quarks. That combination is electrically neutral. So if the Z_c(3900) has charm and anticharm quarks, it must also have some other combination of quarks with an electric charge of +1. That takes at least two additional quarks, for a total of four or more.

When a Tetraquark Isn't Really a Tetraquark

The mere fact that the Z_c(3900) seems to contain at least four quarks doesn't necessarily make it a "true" tetraquark, though. In other words, we don't know that it's actually a single particle. It could be a bound state of two mesons made of two quarks each, a meson molecule, which would be interesting, but not as exotic and exciting as a single particle made of four quarks.

The difference between these two options is kind of subtle. Try this analogy on for size: think of it like two balls of slightly dry clay. If you take the two balls and stick them to each other, you'll get what could reasonably be called one lump of clay, but if you then pull this new lump apart, it'll split back into the two original balls, more or less. That's because the chemical bonds between the clay from one lump and the clay from the other lump aren't quite as strong as the bonds holding each lump of clay together. On the other hand, if you push the balls together hard, and knead them around a bit, now you've got a bona fide single lump of clay. You can pull it apart into two pieces, sure, but the clay from both original balls is all mixed up, so you're not going to split it the same way it was originally split. There isn't one easy division for the clay like there was when you had two separate balls stuck together.

schematic of binding in tetraquark and meson molecule

Mesons and tetraquarks work kind of like that. If you have a molecule of two mesons, in each meson, the quark and antiquark are bound together tightly, like the clay in one lump. The binding between one meson and the other will be a little weaker. But with a "true" tetraquark, all four quarks are equally well bound to each other, forming one object, like when you knead all the clay together. Right now, we can't tell these two options apart, so the Z_c(3900) could be either (or something else entirely). It should be possible in principle to distinguish between them based on how quickly and how often the Z_c(3900) decays into various other types of particles, but we're going to need much more precise measurements to make that happen. This is definitely something to keep an eye out for in future results from both BES III and Belle!


Putting the JATO rocket car to rest

It's that time again: Mythbusters is back! And they sure know how to kick things off with a bang — or better yet, a prolonged burn!

For the 10th anniversary of the show, the Mythbusters revisited the very first myth they ever tested, the JATO rocket car. Wikipedia has the story in what appears to be its most common form:

The Arizona Highway Patrol came upon a pile of smoldering metal embedded into the side of a cliff rising above the road at the apex of a curve. The wreckage resembled the site of an airplane crash, but it was a car. The type of car was unidentifiable at the scene. The lab finally figured out what it was and what had happened.

It seems that a guy had somehow gotten hold of a JATO unit (Jet Assisted Take Off - actually a solid fuel rocket) that is used to give heavy military transport planes an extra 'push' for taking off from short airfields. He had driven his Chevy Impala out into the desert and found a long, straight stretch of road. Then he attached the JATO unit to his car, jumped in, got up some speed and fired off the JATO!

The facts, as best could be determined, are that the operator of the 1967 Impala hit JATO ignition at a distance of approximately 3.0 miles from the crash site. This was established by the prominent scorched and melted asphalt at that location. The JATO, if operating properly, would have reached maximum thrust within five seconds, causing the Chevy to reach speeds well in excess of 350 MPH, continuing at full power for an additional 20-25 seconds. The driver, soon to be pilot, most likely would have experienced G-forces usually reserved for dog-fighting F-14 jocks under full afterburners, basically causing him to become insignificant for the remainder of the event. However, the automobile remained on the straight highway for about 2.5 miles (15-20 seconds) before the driver applied and completely melted the brakes, blowing the tires and leaving thick rubber marks on the road surface, then becoming airborne for an additional 1.4 miles and impacting the cliff face at a height of 125 feet, leaving a blackened crater 3 feet deep in the rock.

Most of the driver's remains were not recoverable; however, small fragments of bone, teeth and hair were extracted from the crater, and fingernail and bone shards were removed from a piece of debris believed to be a portion of the steering wheel.

cartoon of JATO car going out of control

It's a fascinating story and all, but there's plenty of evidence to suggest that this didn't happen, and in fact that it can't happen as described. Not only does the Arizona Highway Patrol have no record of ever investigating a case like this, but it's been tested no less than three times on Mythbusters.

  • The first time, on the pilot episode in 2003, Adam and Jamie found that a Chevy Impala with three hobby rockets on top, providing equivalent thrust to a JATO unit, wouldn't get anywhere close to \SI{350}{mph}, though it did exceed the \SI{130}{mph} top speed of the chase helicopter.
  • The second time, in 2007, they found that... rockets explode sometimes. Hey, it's Mythbusters, you can expect nothing less. :-P
  • The third time was the 10th anniversary special that aired last week. With more power than the equivalent of one JATO unit, the car still didn't get much faster than about \SI{200}{mph}. In this iteration, the Mythbusters also tested the part of the myth in which the car supposedly took off and flew through the air — which also failed spectacularly (in every sense of the word), with their test car running off a ramp and nosediving into the desert.

There's a lot of juicy physics in this myth. But it breaks down into a couple of key parts: first, can a rocket-powered Impala even make it up to \SI{350}{mph}? And secondly, if it did, could it fly a mile and a half through the air? I'm going to handle the speed issue here, and get into the fable of the flying car for a later post.

force diagram for JATO car

Before it becomes airborne, a JATO car is just an object subject to four forces: the thrust of the rocket and the engine force pointing forward, and air resistance and tire friction pointing backward (the last being a lesser influence). In the simplified model I'm going to use, three of these forces are constant — the thrust and engine force forward, and tire friction backward — and air resistance is the one velocity-dependent force. For convenience I'll group all the constant forces under the name drive force, F_\text{drive}.

With the forces as described, a car is pretty similar to a falling object, which is also subject to a constant force in one direction and air resistance in the other. Like, say, an airplane pilot who fell out of his plane at \SI{22000}{ft.}, which I've already worked through the math for in an earlier blog post. Here's how that same math applies to the JATO car: first write Newton's second law \sum F = ma, including the drive force F_\text{drive} and the air resistance F_\text{drag},

F_\text{drive} - F_\text{drag} = F_\text{drive} - \frac{1}{2}CA\rho v^2 = m\ud{v}{t}

The car's terminal speed is the speed at which its acceleration, \ud{v}{t}, is zero. Plugging that in, we get

v_T = \sqrt{\frac{2F_\text{drive}}{CA\rho}}

Now we can rewrite Newton's second law like so:

F_\text{drive}\biggl(1 - \frac{v^2}{v_T^2}\biggr) = m\ud{v}{t}

This makes it easy to see that if an object is moving faster than its terminal speed at any point, that is v > v_T, it will tend to slow down (because \ud{v}{t} < 0), and if it's moving slower than its terminal speed, v < v_T, it will speed up. It'll only stay at a steady speed if \ud{v}{t}\approx 0, and that requires v \approx v_T, i.e. that the object is traveling roughly at its terminal speed.

The story from Wikipedia has the car barreling down the road at more than \SI{350}{mph} for several seconds, which suggests that for a JATO Impala, v_T \gtrsim \SI{350}{mph}. Is that realistic? Well, we do have some of the information necessary to figure it out. As reported in the Mythbusters pilot, the thrust of a JATO is about \SI{1000}{lbf.}, or \SI{4400}{N}, and the density of air, \rho = \SI{1.21}{kg/m^3}. But we don't know the cross-sectional area A and drag coefficient C of the car.

Hmm. That could be a problem.

Fortunately, there's a way around that. Ever heard of a drag race? That's where you set a car (or two) at the beginning of a track, typically a quarter mile long, and just floor it to see how fast it makes it to the end. Besides being good for a movie or two... or six (come on, seriously?), the quarter-mile run is a pretty common way to test a car's performance, and the results of some of these drag tests are available online. For the '67 Impala, the site gives

1967 Chevrolet Impala SS427 (CL)
427ci/385hp, 3spd auto, 3.07, 0-60 - 8.4, 1/4 mile - 15.75 @ 86.5mph

This means the car's acceleration is sufficient to take it from rest to \SI{60}{mph} in \SI{8.4}{s}, and that it completed the quarter mile in \SI{15.75}{s}, traveling \SI{86.5}{mph} over the final 66 feet.

force diagram for normal car driving

Let's now go back to the rewritten version of Newton's second law for a normal car,

F_\text{drive}\biggl(1 - \frac{v^2}{v_T^2}\biggr) = m\ud{v}{t}

You can solve this by rearranging and integrating it, but I'm lazy: I just plugged it into Mathematica. The solution for speed as a function of time is

v(t) = v_T\tanh\biggl(\frac{F_\text{drive}}{m v_T}t + \atanh\frac{v_0}{v_T}\biggr)

where v_0 is the initial speed. Now, a '67 Impala has a mass of \SI{1969}{kg} (it varies from car to car, of course, but that's a representative value), and in a drag test, the initial velocity v_0 = 0. That leaves two variables still unknown: F_\text{drive}, the net forward force which moves the car (engine minus friction), and v_T, its terminal velocity. Luckily, we have two data points we can use to solve for them: the 0-60 benchmark v(\SI{8.4}{s}) = \SI{60}{mph}, and the quarter mile time v(\SI{15.75}{s}) = \SI{86.5}{mph}. Those two velocity-time coordinates should be enough to determine F_\text{drive} and v_T. Probably the easiest way to do it is to make a contour plot showing the combinations of v_T and F_\text{drive} that satisfy each condition, like this:

plot of points satisfying condition

The blue curve shows the points at which

v_T\tanh\frac{F_\text{drive}(\SI{8.4}{s})}{(\SI{1969}{kg})v_T} = \SI{60}{mph}

and the red curve shows points where

v_T\tanh\frac{F_\text{drive}(\SI{15.75}{s})}{(\SI{1969}{kg})v_T} = \SI{86.5}{mph}

Their intersection is the one set of parameters that satisfies both conditions, namely v_T = \SI{100.8}{mph} and F_\text{drive} = \SI{7250}{N} = \SI{1628}{lbf.}. Bingo! Now we've got everything we need to calculate the behavior of the rocket car.
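If you'd rather not read the answer off a contour plot, the same two conditions can be solved numerically. Here's one way to sketch it (my own variable names, and simple bisection in place of Mathematica): for each trial terminal speed, the 0-60 condition fixes F_\text{drive}, and we then bisect until the quarter-mile condition is also satisfied.

```python
import math

MPH = 0.44704                     # m/s per mph
m_car = 1969.0                    # kg, representative '67 Impala mass
t1, v1 = 8.4, 60 * MPH            # 0-60 mph benchmark
t2, v2 = 15.75, 86.5 * MPH        # quarter-mile time and trap speed

def drive_force(vT):
    # The force that satisfies the 0-60 condition for a trial terminal speed vT
    return m_car * vT * math.atanh(v1 / vT) / t1

def mismatch(vT):
    # How far the quarter-mile condition is from being met at this trial vT
    F = drive_force(vT)
    return vT * math.tanh(F * t2 / (m_car * vT)) - v2

# Bisect on vT: mismatch is negative just above v2 and positive by 60 m/s
lo_b, hi_b = v2 + 0.1, 60.0
for _ in range(60):
    mid = 0.5 * (lo_b + hi_b)
    if mismatch(mid) < 0:
        lo_b = mid
    else:
        hi_b = mid

vT = 0.5 * (lo_b + hi_b)
F = drive_force(vT)
vT_mph = vT / MPH                 # close to 100.8 mph
```

The root lands at the same parameters as the graphical intersection, roughly \SI{100.8}{mph} and \SI{7250}{N}.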

Well, wait a minute. We said — actually, Mythbusters said (and what I can find online seems to confirm) that a JATO provides a thousand pounds of thrust. But the value we found for F_\text{drive} is even larger. Surely a standard car engine can't be more powerful than a rocket, can it?

I think we have to conclude that it is. Though this isn't quite what you'd call a standard car engine. The results of the drag test we used to calculate F_\text{drive} were for the '67 Impala SS (Super Sport) 427, a special high-performance version of the car whose engine could crank out \SI{385}{hp}. A standard version of the car would have a less powerful engine, ranging down to \SI{155}{hp}, and thus could have as little as half the engine force — roughly, F \propto P^{2/3}, because F \sim v^2 and P = Fv, and (155/385)^{2/3} \approx 0.5.

Note that if you use F_\text{drive} as obtained from the drag test to calculate the power at the car's top speed of \SI{86.5}{mph}, you get \SI{376}{hp}, which is quite close to the engine's reported horsepower. But I think that's just a coincidence. The same method applied to the car's eventual top speed of \SI{100.8}{mph} gives \SI{437}{hp}, which is more power than the engine is even able to generate! This reflects the fact that the way we derived v(t) makes a lot of simplifying assumptions. For example, it assumes that the air resistance is proportional to v^2. In reality, a car is a complicated shape that induces some amount of turbulence, making the drag force difficult to characterize. More importantly, we've assumed the drive force is constant, which is not at all true for a real car. The engine force changes as the car shifts gears and as parts warm up, and there are other assorted forces at work, like rolling friction.
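The arithmetic behind those power figures is just P = Fv with unit conversions; a quick sketch (conversion factors rounded):

```python
MPH = 0.44704     # m/s per mph
HP = 745.7        # watts per (mechanical) horsepower
F_drive = 7250.0  # N, the drive force fitted from the drag test

# Power needed to sustain F_drive at the quarter-mile trap speed
# and at the fitted terminal speed
p_trap = F_drive * (86.5 * MPH) / HP    # close to the engine's rated 385 hp
p_top = F_drive * (100.8 * MPH) / HP    # more than the engine can produce
```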

As a check of sorts on how close this model comes, I'm going to integrate the formula again, to get a formula for distance traveled over time:

x(t) = \frac{m v_T^2}{F_\text{drive}}\ln\cosh\frac{F_\text{drive} t}{m v_T}

In theory, we should be able to plug in the values we found — m = \SI{1969}{kg}, v_T = \SI{100.8}{mph}, and F_\text{drive} = \SI{7250}{N} — along with the quarter mile time, t = \SI{15.75}{s}, and the distance \SI{0.25}{mile} will pop out. When I actually plug the values in, I get \SI{0.229}{mile}, which is within 10%, so not bad. That suggests that the different inaccuracies cancel out to some extent.
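Here's that check as a few lines of code, with the fitted values restated as givens:

```python
import math

MPH, MILE = 0.44704, 1609.344     # unit conversions to meters per second/meters
m_car = 1969.0                    # kg
vT = 100.8 * MPH                  # fitted terminal speed, m/s
F = 7250.0                        # fitted drive force, N

def x(t):
    # Distance traveled from rest: the integral of v(t) = vT*tanh(F*t/(m*vT))
    return (m_car * vT**2 / F) * math.log(math.cosh(F * t / (m_car * vT)))

quarter = x(15.75) / MILE         # about 0.229 mile, vs. the actual 0.25
```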

OK, so where does that leave us? We have a formula,

v(t) = v_T\tanh\frac{F_\text{drive} t}{m v_T}

which seems to somewhat accurately describe the motion of a car under its own power in a drag test. We also have best-fit values for the parameters of this formula: v_T = \SI{100.8}{mph}, and F_\text{drive} = \SI{7250}{N} for the SS427, and F_\text{drive} \approx \SI{3600}{N} for a standard Impala. Time to ramp it up to the JATO rocket car!

You may remember from earlier in this post that the terminal velocity is proportional to the square root of the net constant force F_\text{drive} applied to move the car. If a terminal speed of \SI{100.8}{mph} corresponds to a drive force of \SI{7250}{N}, then adding on a JATO's thrust of an additional \SI{1000}{lbf.} (\SI{4448}{N}) would naively give a terminal speed of

\SI{100.8}{mph}\times \sqrt{\frac{\SI{7250}{N} + \SI{4448}{N}}{\SI{7250}{N}}} = \SI{128}{mph}

Doing the same calculation for the \SI{1500}{lbf.} (\SI{6672}{N}) of hobby rocket thrust the Mythbusters used gives

\SI{100.8}{mph}\times \sqrt{\frac{\SI{7250}{N} + \SI{6672}{N}}{\SI{7250}{N}}} = \SI{140}{mph}

which is considerably less than the estimate of a top speed around \SI{190}{mph} for the car in the pilot episode. Huh.
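The square-root scaling is easy to reproduce; a short sketch (the conversion factor and function name are mine):

```python
LBF = 4.44822          # newtons per pound-force
F_drive = 7250.0       # N, the drive force fitted for the SS427
vT0 = 100.8            # mph, the fitted terminal speed

def boosted_terminal_speed(thrust_lbf):
    # vT scales as the square root of the total constant forward force
    return vT0 * ((F_drive + thrust_lbf * LBF) / F_drive) ** 0.5

v_jato = boosted_terminal_speed(1000)    # one JATO's worth of thrust
v_hobby = boosted_terminal_speed(1500)   # the hobby rockets from the pilot
```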

OK, so what could be wrong with the model?

  • Could the drag coefficient — actually the product CA\rho — of the car (with attached rocket) be much smaller than we thought, making the car's terminal speed higher? It's hard to see how. After all, the terminal speed we calculated was for a stock Impala, a shape which is reasonably aerodynamic, but the Mythbusters went and put a rocket pack on it, breaking the airflow over the roof. If anything, it should go slower with the rockets on top.
  • Could the car's engine be more powerful than we think? Again, probably not. I'm already using a value which corresponds to one of the most powerful engines available at the time — it'd still be a pretty strong engine in the modern market. To increase the car's terminal speed up to \SI{190}{mph} by increasing engine force would require an engine more than three times as strong as what I'm using in the model.
  • What about power losses in the drivetrain? Well, of course those make the car's terminal speed slower, not faster. There are plenty of reasons to imagine that the terminal speed should be less than what we calculated, but not higher.

I'm actually doubtful that whatever the error is has anything to do with the car itself. Here's why: suppose we take that equation for v(t) from earlier and plot it, starting at \SI{80}{mph}, for several different configurations.

plot of speed of JATO car over time

The lower edge of each band represents a car with a standard \SI{155}{hp} motor, and the higher edge represents a car with an enhanced performance \SI{385}{hp} motor. And the shaded areas represent the results we actually saw on Mythbusters (or the circumstances described in the myth, for the gray box). Notice that the orange band, representing the car with 5 rockets simultaneously firing as in the anniversary special, goes right through the pink box representing the speed we saw on that show. So the model works in that case!

On the other hand, the light green band, the one for the hobby rockets used in the pilot, only barely breaks the \SI{130}{mph} speed of the chase helicopter (the dotted line) and doesn't ever get up to the speed estimate of \SI{190}{mph}. The only way I can come up with to reconcile this math with the results we saw on the show is that the rockets they used might have been more powerful than claimed. If you assume that the rockets give off a little more than twice as much thrust as was said on the show, i.e. around \SI{3500}{lbf.}, you get the dark green curve which seems to come close to the observed results. If anyone knows a better explanation for that difference, I'd be interested to hear it, but for now I just have to conclude that something was off about those reported rocket numbers.

Of course, let's not forget the real point of the graph. None of these curves come anywhere close to the circumstances claimed in the myth! So despite the discrepancies in the details, the conclusion remains solidly the same: physics says this myth is absolutely busted.


EXTRA EXTRA (positrons)! Read all about it!

Last week, I wrote about the announcement of the first results from the Alpha Magnetic Spectrometer: a measurement of the positron fraction in cosmic rays. Although AMS-02 wasn't the first to make this measurement, it was nevertheless a fairly exciting announcement because it confirms a drastic deviation from the theoretical prediction based on known astrophysical sources.

Unfortunately, most of what you can read about it is pretty light on details. News articles and blog posts alike tend to go (1) Here's what AMS measured, (2) DARK MATTER!!!1!1!! All the attention has been focused on the experimental results and the vague possibility that it could have come from dark matter, but there's precious little real discussion of the underlying theories. What's a poor theoretical physics enthusiast to do?

Well, we're in luck, because on Friday I attended a very detailed presentation on the AMS results by Stephane Coutu, author of the APS Viewpoint about the announcement. He was kind enough to point me to some references on the topic, and even to share his plots comparing the theoretical models to AMS (and other) data, several of which appear below. I never would have been able to put this together without his help, so thanks Stephane!

Time to talk positrons.

The Cosmic Background

When people talk about "known astrophysical sources" of positrons, they're mostly talking about cosmic rays. Not primary cosmic rays, though, which are the particles that come directly from pulsars, accretion discs, or whatever other sources are out there. Primary cosmic rays are generally protons or atomic nuclei. As they travel through space, they collide with interstellar matter and produce other particles, secondary cosmic rays, through chains of reactions and decays like this:

\begin{align}\prbr + \text{particle} &\to \pipm + X \\ \pipm &\to \ualp\unu \\ \ualp &\to \ealp\enu\uanu\end{align}

Positrons in the energy range AMS can detect, below \SI{1}{TeV} or so, mostly come from galactic primary cosmic rays (protons). We can determine the production spectrum of these cosmic ray protons (how quickly they are produced at various energies) using astronomical measurements like the ratio of boron to carbon nuclei and the detected flux of electrons — but that's a whole other project that I won't get into here.

Once the proton spectrum is set, we can combine it with the density of the interstellar medium to determine how often reactions like the one above will occur, again as a function of energy. That gives us a spectrum for positron production. But to actually match this model to what we detect in Earth orbit, we need to account for various energy loss mechanisms that affect cosmic rays as they travel. Both primary (protons) and secondary (positrons) cosmic rays lose energy to processes like synchrotron radiation (energy losses as charged particles change direction in a magnetic field), bremsstrahlung (energy losses from charged particles slowing down in other particles' electric field), and inverse Compton scattering (charged particles "bouncing" off photons). These dissipative mechanisms tend to reduce the positron spectrum at high energies.

Galactic disk and halo

Doing all this accurately involves accounting for the distribution of matter in the galactic disk, and accordingly it takes a rather sophisticated computer program to get it right. The "industry standard" is a program called GALPROP, which breaks down the galaxy and its halo (a slightly larger region surrounding the disk, which contains globular clusters and dark matter) into small regions, tracks the spectra of various kinds of particles in each region, and models how the spectra change over time as cosmic rays move from one region to another. There are various models with different levels of detail, most of which are described in this paper and improved in e.g. this one and this one:

  • The class of theories known as leaky box models (or homogeneous models) assume that cosmic rays are partially confined within the galaxy — a few leak out into intergalactic space, but mostly they stay within the galactic disk and halo. Both the distribution of where secondary cosmic rays are produced and the interstellar medium they travel through are effectively uniform. Accordingly, the times (or distances) they travel before running into something follow an exponential distribution with an energy-dependent average value \expect{t} (or \lambda_e = \rho v\expect{t}).
  • The diffusive halo model assumes that the galaxy consists of two regions, a disk and a halo. Within these two regions, cosmic rays diffuse outward from their sources, and those that reach the edge of the halo escape from the galaxy, never to return. The diffusion coefficient is taken to be twice as large in the disk as in the halo due to the increased density of matter.
  • The dynamical halo model is exactly like the diffusive halo model with the addition of a "galactic wind" that pushes all cosmic rays in the halo outward at some fixed velocity V.
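To get a feel for what the leaky box model's exponential path-length distribution implies, here's a quick back-of-the-envelope sketch in Python. All the numbers are made up for illustration, not taken from any fit:

```python
import math

def mean_grammage(rho, v, t_mean):
    """Leaky-box mean path length: lambda_e = rho * v * <t>, the average
    amount of matter a cosmic ray traverses before interacting."""
    return rho * v * t_mean

def surviving_fraction(grammage, lambda_e):
    """With an exponential path-length distribution, the fraction of
    cosmic rays that traverse at least `grammage` before interacting."""
    return math.exp(-grammage / lambda_e)

lam_e = 7.0  # g/cm^2, an illustrative order of magnitude
print(surviving_fraction(7.0, lam_e))   # 1/e ~ 0.37 exceed the mean
print(surviving_fraction(21.0, lam_e))  # only e^-3 ~ 0.05 go 3x as far
```

The energy dependence of \expect{t} is what shapes the predicted secondary spectrum; here it's frozen at one value just to show the bookkeeping.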

There are others, less commonly used, but all these models share one significant thing in common: they give a positron fraction that decreases with increasing energy. And the first really precise measurements of cosmic ray positrons, performed by the HEAT and CAPRICE experiments, confirmed that conclusion, as shown in this plot.

Early measurements of positron fraction from HEAT and CAPRICE

But new data from PAMELA, Fermi-LAT, and now AMS-02 show something entirely different! Above \SI{10}{GeV}, the positron fraction actually increases with energy, showing that something must be producing additional positrons at those higher energies.

Enhanced measurements from PAMELA, Fermi-LAT, and AMS-02

The spectrum of the positron fraction excess, i.e. the difference between secondary emission predictions and the data, suggests that this unknown source produces roughly equal numbers of positrons and electrons at the energies AMS has been able to measure, with a power-law spectrum for each:

\phi_{\mathrm{e}^\pm} \propto E^{-\gamma_s},\quad E \lesssim \SI{300}{GeV}

As an example model, the AMS-02 paper postulated

\begin{align}\Phi_{\mathrm{e}^+} &= C_{\mathrm{e}^+}E^{-\gamma_{\mathrm{e}^+}} + C_s E^{-\gamma_s} e^{-E/E_s} \\ \Phi_{\mathrm{e}^-} &= C_{\mathrm{e}^-}E^{-\gamma_{\mathrm{e}^-}} + C_s E^{-\gamma_s} e^{-E/E_s}\end{align}

with E_s = \SI{760}{GeV} based on a fit to their data. But regardless of whether this specific formula works, the point is that secondary emission tends to produce more positrons than electrons: most primary cosmic rays are protons, and by charge conservation their collisions preferentially produce positively charged secondaries (like \pi^+) whose decay chains end in positrons. A source of roughly equal numbers of positrons and electrons doesn't fit that profile, so this unexplained excess is probably something else.
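If you want to play with this two-component model yourself, here's a minimal Python sketch. Every coefficient below is an illustrative placeholder rather than an actual AMS-02 fit value; only E_s = 760 GeV comes from the text above:

```python
import math

def flux(E, C, gamma, C_s=0.1, gamma_s=2.5, E_s=760.0):
    """Power law plus a shared source term with an exponential cutoff.
    C_s, gamma_s are placeholder values; E_s = 760 GeV is from the text."""
    return C * E**(-gamma) + C_s * E**(-gamma_s) * math.exp(-E / E_s)

def positron_fraction(E, pos_params, el_params):
    phi_pos = flux(E, *pos_params)
    phi_el = flux(E, *el_params)
    return phi_pos / (phi_pos + phi_el)

# A softer "ordinary" positron spectrum plus the shared source term
# makes the fraction rise with energy, as in the data:
pos = (1.0, 3.6)   # (C_e+, gamma_e+), made up
el = (10.0, 3.2)   # (C_e-, gamma_e-), made up
print(positron_fraction(10.0, pos, el))   # smaller
print(positron_fraction(100.0, pos, el))  # larger
```

The rise happens because the common source term falls off more slowly than either "ordinary" power law, so at high energy it dominates both fluxes and pushes the fraction toward 1/2.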


Dark matter

Naturally, physicists are going to be most excited if the positron excess turns out to come from some previously unknown particle. The most likely candidate is the neutralino, denoted \tilde{\chi}^0, a type of particle predicted by most supersymmetric theories. Neutralinos are the superpartners of the W and Z gauge bosons, and of the Higgs boson(s).

According to the theories, reactions involving supersymmetric particles tend to produce other supersymmetric particles. The neutralino, as the lightest of these particles, is at the end of the supersymmetric decay chain, which makes it a good candidate to constitute the mysterious dark matter. But occasionally, neutralinos will annihilate to produce normal particles like positrons and electrons. If dark matter is actually made of large clouds of neutralinos, it's natural to wonder whether the positrons produced from their annihilation could make up the difference between the prediction from secondary cosmic rays and the AMS observations.

Here's how the calculation goes. Using the mass of dark matter we know to exist from galaxy rotation curves and gravitational lensing, and assuming some particular mass m_{\chi} for the neutralino, we can calculate how many neutralinos are in our galaxy's dark matter halo. Multiplying that by the annihilation rate predicted by the supersymmetric theory gives the rate of positron production from neutralino annihilation. That rate gets plugged into cosmic ray propagation models like those described in the last section, leading to predictions for the positron flux measured on Earth.

Several teams have run through the calculations and found that... well, it kind of works, but only if you fudge the numbers a bit. Neutralino annihilation predicts a roughly power-law contribution to the positron fraction up to the mass of the neutralino; that is,

\phi_{\tilde{\chi}^0\to \mathrm{e}^{\pm}} \sim \begin{cases}C E^{\gamma_\chi},& E \lesssim m_\chi c^2 \\ \text{small},& E \gtrsim m_\chi c^2\end{cases}

As long as m_\chi \gtrsim \SI{500}{GeV} or so, this is exactly the kind of spectrum needed to explain the discrepancy between the PAMELA/Fermi/AMS results and the secondary emission spectrum. The problem lies in the overall constant C, which you would calculate from the dark matter density and the theoretical annihilation rate. It's orders of magnitude too small. So the papers multiply this by an additional "boost" factor, B, and examine how large B needs to be to match the experimental results. Depending on the model, B ranges from about 30 (Baltz et al., m_\chi = \SI{160}{GeV}) to over 7000 (Cholis et al., m_\chi = \SI{4000}{GeV}).
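To make the boost-factor logic concrete, here's a tiny Python sketch of the piecewise spectrum above. Every number in it is an illustrative placeholder, not a value from any of the papers mentioned:

```python
def neutralino_positron_flux(E, C, gamma_chi, m_chi, B=1.0):
    """Sketch of the spectrum above: a power law C * E**gamma_chi below
    the neutralino rest energy m_chi (in GeV), negligible above it,
    scaled by an overall boost factor B. All parameters illustrative."""
    if E >= m_chi:
        return 0.0
    return B * C * E**gamma_chi

# The boost factor is just the ratio between the flux needed to explain
# the excess and the unboosted prediction. With made-up numbers:
needed_flux = 1.0e-3      # hypothetical excess flux at some energy
predicted_flux = 2.0e-5   # hypothetical unboosted prediction
B = needed_flux / predicted_flux
print(B)  # about 50, the kind of factor these papers have to invoke
```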

Prediction from neutralino annihilation with boost factor

Alternatively, you can assume that something is wrong with the propagation models, and that positrons lose more energy than expected on their way through the interstellar medium. This is the approach taken in this paper, which finds that increasing the energy loss rate by a factor of 5 can kind of match the positron fraction data.

Prediction from neutralino annihilation with unrealistic energy loss

But that much of an adjustment to the energy loss leads to conflicts with other measurements. It winds up being an even more unrealistic model.

Even if the parameters of some supersymmetric theory can be tweaked to match the data without a boost factor, there's one more problem: neutralino annihilation produces antiprotons and photons too. If the positron excess comes from neutralino annihilation, there should be corresponding excesses of antiprotons and gamma rays, but we don't see those. It's going to be quite tricky to tune a dark matter model so that it gives us the needed flux of positrons without overshooting the measurements of other particles. There is only a small range of values of mass and interaction strength that would be consistent with all the measurements. So as much as dark matter looks like an interesting direction for future research, it's not a realistic model for the positron excess just yet.

Astrophysical sources

With the dark matter explanation looking only moderately plausible at best, let's turn to other (less exotic) astrophysical sources. There's a fair amount of uncertainty about just how many cosmic rays are produced even by known sources. They could be emitting enough electrons and positrons to make the difference between the new data and the theories.

Pulsars in particular, in addition to being sources of primary cosmic rays (protons), are often surrounded by nebulae that emit electrons and positrons from their outer regions. The pulsar's wind of charged particles interacts with the nebula to accelerate light particles to high energies, giving these systems the name of pulsar wind nebulae (PWNs). Simply by virtue of being a PWN, it's expected to emit a certain "baseline" positron and electron flux, which is included in secondary emission models, but the pulsar could have been much more active in the past, emitting a lot more positrons and electrons. These would have become "trapped" in the surrounding nebula and continued to leak out over time, which means we would be seeing more positrons and electrons than we'd expect based on the pulsar's current activity. There are a few nearby PWNs which seem like excellent candidates for this effect, going by the (rather snazzy, if you ask me) names of Geminga and Monogem. A number of papers (Yüksel et al., and recently Linden and Profumo) have crunched the numbers on these pulsars, and they find that the positron/electron flux from enhanced pulsar activity can match up quite well with the positron fraction excess detected by PAMELA, Fermi-LAT, and AMS-02.

Positron fraction from high-activity Geminga

The "smoking gun" that would definitely (well, almost definitely) identify a pulsar as the source of the excess would be an anisotropy in the flux: we'd see more positrons coming from the direction of the pulsar than from other directions in the sky. Now, AMS-02 (and Fermi-LAT before it) looked for an effect of this sort, and they didn't find it — but according to Linden and Profumo, it's entirely possible that the anisotropy could be very slight, less than what either experiment was able to detect. We'll have to wait for future experimental results to check that hypothesis.

Modified secondary emission

Of course, it's important to remember (again) that all these analyses are based on the propagation models that tell us how cosmic rays are produced and move through the galaxy. It's entirely possible that adjusting the propagation models alone, without involving any extra source of positrons, would bring the predictions from secondary emission in line with the experimental data.

A paper by Burch and Cowsik looked at this possibility, and it turns out that something called the nested leaky-box model can fix the positron fraction discrepancy fairly well. As I wrote back in the first section, the leaky box model gets its name because cosmic rays are considered to be partially confined within the galaxy. Well, the nested leaky box model adds the assumption that cosmic rays are also partially confined in small regions around the sources that produce them. That means that, rather than being produced uniformly throughout the galaxy, secondary cosmic rays come preferentially from certain regions of space. This is actually similar to the hypothesis from the last section, of extra positrons coming from PWNs, so it shouldn't be too surprising that using the nested leaky box model can account for the data about as well as the pulsars can.

Modified secondary production model

Looking to the future

All the media outlets reporting on the AMS results have been talking about the dark matter hypothesis, even going so far as to say AMS found evidence of dark matter — but clearly, that's not the case. There's no reason to say we have evidence of dark matter when there are perfectly valid, simpler, maybe even better explanations for the positron fraction excess at high energies! There's just not enough data yet to tell which explanation is right.

As AMS-02 continues to make measurements over the next decade or so, there are two main things to look for that will help distinguish between these models. First, does the positron fraction stop rising? And if so, where on the energy spectrum does it peak? As we've seen, this can happen in any model, but if neutralino annihilation is the right explanation, that peak will have to occur at an energy compatible with other constraints on the neutralino mass. Perhaps more importantly, is there any anisotropy in the direction from which these positrons are coming? If there is, it would pretty strongly disfavor the dark matter hypothesis. The anisotropy itself could actually point us toward the source of the extra positrons. So even if we don't wind up discovering a new particle from this series of experiments, there's probably something pretty interesting to be found.


Positrons in space!!

A fair amount of what I write about here is about accelerator physics, done at facilities like the Large Hadron Collider. But you can also do particle physics in space, which is filled with fast-moving particles from natural sources. "All" you need to do is build a particle detector, analogous to ATLAS or CMS, and put it in Earth orbit. That's exactly what the Alpha Magnetic Spectrometer (AMS) is. Since 2011, when it was installed on the International Space Station, AMS has been detecting cosmic electrons and positrons, looking for anomalous patterns, and today they presented their first data release.

Let's jump straight to the results:

AMS data with other experiments

This plot shows the number of positrons with a given energy as a fraction of the total number of electrons and positrons with that energy, \frac{N_\text{positrons}}{N_\text{electrons} + N_\text{positrons}}. The key interesting feature, which confirms a result from the previous experiments PAMELA and Fermi-LAT, is that the plot rises at energies higher than about \SI{10}{GeV}. That's not what you'd normally expect, because most physical processes produce fewer particles at higher energies. (Think about it: it's less likely that you'll accumulate a lot of energy in one particle.) So there must be some process, not completely understood, which is producing positrons.
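In other words, the plotted quantity is just a ratio of counts in each energy bin; as a trivial Python sketch:

```python
def positron_fraction(n_positrons, n_electrons):
    """The quantity on the vertical axis: N+ / (N- + N+)."""
    return n_positrons / (n_electrons + n_positrons)

# e.g. a (made-up) energy bin containing 1000 electrons and 100 positrons:
print(positron_fraction(100, 1000))  # -> 0.0909..., just over 9%
```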

As part of their data analysis, AMS has tested a model which describes the flux (total number) of positrons as the sum of two contributions:

  • A power law (the first term in the equations below), representing known, "typical" sources, and
  • A "source spectrum" with an exponential cutoff, representing something new

\begin{align}\Phi_{\mathrm{e}^+} &= C_{\mathrm{e}^+}E^{-\gamma_{\mathrm{e}^+}} + C_s E^{-\gamma_s} e^{-E/E_s} \\ \Phi_{\mathrm{e}^-} &= C_{\mathrm{e}^-}E^{-\gamma_{\mathrm{e}^-}} + C_s E^{-\gamma_s} e^{-E/E_s}\end{align}

The model fits pretty well to the data so far:

Fit of model to AMS data

This means that the AMS data are consistent with the existence of a new massive particle, one that might make up the universe's dark matter. But a new particle is not the only explanation. You'll see a lot of news articles, blog posts, comments, etc. saying that AMS has detected evidence of dark matter, but that's just not true. For example, there are known astrophysical sources, such as pulsars, which could conceivably be making these high-energy positrons. The results found by AMS so far are not precise enough, and don't go up to high enough energies, to allow us to tell the difference with any confidence.

There are a couple of signs we'll be looking for that could help identify this unknown source of positrons:

  • The main one is that, if the positrons are coming from the decays of some as-yet-undiscovered particle, they can't be produced at energies higher than that particle's mass. Now yes, the new particle could be moving fast when it decays, and that would produce fast-moving positrons with high energies — but for a variety of reasons, we don't expect that to be the case. In the standard model of cosmology, the dark matter is "cold," which means that it's not moving very fast relative to other things in the universe.
  • The other main sign to look for is any anisotropy in the positrons' direction — that would mean that there are more positrons coming from some directions in the sky than others. If an anisotropy is detected, that indicates that these positrons are coming from specific, localized sources, and not from something that exists pretty much everywhere, as dark matter would. AMS has checked for this effect in the data they've collected so far, and they see a pretty isotropic distribution; positrons are coming from every direction in the sky with roughly the same frequency. That is consistent with the idea that they come from dark matter particle decays, but again, it is not actual evidence. Even if the positrons are coming from something like pulsars, they could be bent around to approach in different directions by the turbulent magnetic fields of the Sun, Earth, and galaxy.

AMS will continue to collect data for a long time to come, so we can look forward to ever more precise data releases in the future, data which will hopefully put to rest the mystery of the not-missing positrons. In the meantime, you may want to check out the PRL viewpoint — a not-too-technical explanation — of this research, or even read the original paper, which can be downloaded for free from the gray citation on the viewpoint page.


An April Fool's Planck, for science

Oh, I kid. Despite the name, nothing about this post is a prank (except perhaps for the title).

It's been a week and a half since the Planck collaboration released their measurements of the cosmic microwave background. At the time, I wrote about some of the many other places you can read about what those measurements mean for cosmology and particle physics. But it's a little harder to find information on how we come to those conclusions. So I thought I'd dig into the science behind the cosmic microwave background: how we measure it and how we manipulate those measurements to come up with numbers.

Measuring the CMB

With that in mind, what did Planck actually measure? Well, the satellite is basically a spectrometer attached to a telescope. It has 74 individual detectors, each of which detects photons in one of 9 separate frequency ranges. As the telescope points in a certain direction, each detector records how much energy carried by photons in its frequency range hit it from that direction. The data collected would look something like the points in this plot:

sample temperature measurements

From any one of these data points, given the frequency and the measured power, you can calculate the temperature of the blackbody that produced it by starting with Planck's law,

B(\nu, T) = \frac{2h\nu^3}{c^2}\frac{1}{\exp(h\nu/kT) - 1}

(\nu is the frequency), combining it with the definition of spectral radiance,

B = \frac{\mathrm{d}^4 P}{\mathrm{d}^2\Omega \,\mathrm{d}^2 A\cos\theta}

(A is the area of the satellite's mirror, \Omega is the solid angle it sees at one time), and solving for temperature to get

T = \frac{h\nu}{k\ln\bigl(\frac{2h \Omega A \nu^3}{Pc^2} + 1\bigr)}
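To see how the inversion works in practice, here's a small Python round trip: compute the power a blackbody at the known CMB temperature would deliver to a hypothetical detector, then recover the temperature from it. The instrument numbers are made up, and I've used the simplified relation P = B\Omega A (i.e. \cos\theta \approx 1):

```python
import math

h = 6.62607015e-34   # Planck constant, J s
k = 1.380649e-23     # Boltzmann constant, J/K
c = 2.99792458e8     # speed of light, m/s

def planck_radiance(nu, T):
    """Planck's law B(nu, T)."""
    return (2 * h * nu**3 / c**2) / math.expm1(h * nu / (k * T))

def temperature(nu, P, Omega, A):
    """The inverted formula from the text, assuming P = B * Omega * A."""
    return h * nu / (k * math.log1p(2 * h * Omega * A * nu**3 / (P * c**2)))

# Round trip with made-up instrument numbers (Omega in sr, A in m^2):
nu, Omega, A = 150e9, 1e-4, 1e-2
P = planck_radiance(nu, 2.725) * Omega * A
print(temperature(nu, P, Omega, A))  # recovers ~2.725 K
```

Using expm1 and log1p instead of exp and log keeps the round trip numerically stable at low frequencies, where h\nu/kT is small.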

With 74 separate simultaneous measurements, you can imagine that Planck is able to constrain the temperature of the CMB very precisely!

We've known for quite some time, since the COBE data presentation in 1990, that the CMB has an essentially perfect blackbody spectrum, with a temperature of \SI{2.72548+-0.00057}{K}.

COBE spectrum

But we've also known for some time that the CMB isn't exactly the same temperature in every direction. It varies by tiny fractions (thousandths) of a degree from one spot in the sky to another, so depending on which way you point the telescope, you'll find a slightly different result. The objective of the Planck mission, like WMAP before it, is to measure these slight variations in temperature as precisely as possible.

The CMB power spectrum

One way to represent the temperature variations, or anisotropies, measured by Planck is a straightforward visualization, like this:

Planck CMB

Every direction in space maps to a point in this image. Red areas indicate the directions in which Planck found the CMB to be slightly warmer than average (after accounting for radiation received from our galaxy and other known astronomical sources), and blue areas indicate the directions in which it was slightly cooler than average.

But for the scientific analysis of the CMB, it's not actually that important to know exactly where in the sky the hot spots and cold spots are. These little anisotropies were generated by quantum fluctuations in the structure of spacetime in the very early universe, and quantum fluctuations are random. No theory can actually predict that you'll have a hot spot in this particular direction and a cold spot in that particular direction. What you can predict is how the energy in the CMB should be distributed among various "modes." Each mode is basically a pattern of hot and cold spots of a characteristic size.

Interlude: modes of a one-dimensional function

Modes are a lot easier to understand once you can visualize them, so let me explain the concept with a simple example. Here's a plot of a function:

normal plot of function

Actually, that graph is kind of boring. Here's a more colorful way of visualizing the same function: a density map, which is red in the regions where the function is large and blue where it's small:

density map of function

If you're familiar with Fourier analysis, you know that any function (more precisely: any reasonably well-behaved function on a finite interval of length L that vanishes at the endpoints) can be expressed as a sum of sinusoidal functions with different wavelengths.

f(x) = \sum_{n = 1}^{\infty} f_n \sin\frac{n\pi x}{L}

For this function, those are the 12 sine waves shown here:

sine waves

Each of these sine waves corresponds to a mode. We typically label them by the number of peaks and valleys in the wave, shown as n on each plot. If you look at the density maps, you can see that the modes with higher numbers have smaller hot (red) and cold (blue) spots, as I mentioned earlier.

Having broken down the original function into these sine waves, you can talk about how much energy is in each mode. The energy is related to the square of the sine wave's amplitude. For example, the n = 3 mode of this function has the most energy; n = 6 and n = 8 have relatively little. You can tell because the graph of the n = 3 sine wave has the largest amplitude, and the n = 6 and n = 8 sine waves have small amplitudes. (The density maps don't show the amplitudes.)

What a cosmological theory predicts is the amount of energy (per unit time) in each mode: the numbers C_n = \abs{f_n}^2. This is called the power spectrum.
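Here's a short Python sketch of the whole procedure: numerically project a sample function onto each sine mode, then square the amplitudes to get the power spectrum. The sample function and the integration resolution are arbitrary choices:

```python
import math

L = 1.0  # length of the interval

def mode_amplitude(f, n, samples=10000):
    """Amplitude f_n of the n-th sine mode of f on [0, L], via the usual
    Fourier-sine projection f_n = (2/L) * integral of f(x) sin(n pi x / L),
    computed with the midpoint rule."""
    dx = L / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx
        total += f(x) * math.sin(n * math.pi * x / L) * dx
    return 2.0 / L * total

def power_spectrum(f, n_max):
    """The numbers C_n = |f_n|^2 from the text, for n = 1 .. n_max."""
    return [abs(mode_amplitude(f, n))**2 for n in range(1, n_max + 1)]

# A function that is a pure n = 3 mode: all the power lands in C_3.
f = lambda x: 2.0 * math.sin(3 * math.pi * x / L)
C = power_spectrum(f, 5)
print(C)  # approximately [0, 0, 4, 0, 0]
```

Because the sine modes are orthogonal, the projection picks out each amplitude independently; a function built from several modes would just give several nonzero entries in C.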

Modes on a sphere

We can do the same thing with the CMB that we did with the one-dimensional function in that example: break it down into individual modes, each with some amplitude, and determine how much energy is in each mode. Of course, the sky isn't a line; it's a sphere, which means the modes of the CMB are more complicated than just sine waves. Modes on a sphere are called spherical harmonics, and their density maps look like this:

spherical harmonics

Because a sphere is a two-dimensional surface, it takes two numbers to index these modes, \ell and m.

Any real-valued function on a sphere — like, say, the function which gives the temperature of the CMB in a given direction — can be expressed as a sum of real spherical harmonics, each scaled by some amplitude.

T(\theta, \phi) = \sum_{\ell = 0}^{\infty}\sum_{m = -\ell}^{\ell} T_{\ell m} Y_{\ell m}(\theta, \phi)

We can then go on to compute the power spectrum, just as in the 1D case. But it's conventional to combine the power in all the modes with the same value of \ell:

C_{\ell} = \sum_{m = -\ell}^{\ell} \abs{T_{\ell m}}^2

The numbers C_{\ell} constitute the power spectrum, analogous to C_n for the 1D function.
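As a sketch of the bookkeeping, here's how you'd compute the C_\ell from a set of hypothetical real coefficients T_{\ell m} in Python (for real spherical harmonics, m runs from -\ell to \ell):

```python
def angular_power_spectrum(T_lm, l_max):
    """C_l = sum over m of |T_lm|^2, with the coefficients T_lm given as
    a dict keyed by (l, m). Coefficients absent from the dict are zero."""
    return [sum(T_lm.get((l, m), 0.0)**2 for m in range(-l, l + 1))
            for l in range(l_max + 1)]

# Toy example: power only in two l = 2 modes (made-up amplitudes).
T_lm = {(2, 1): 3.0, (2, -1): 4.0}
print(angular_power_spectrum(T_lm, 2))  # -> [0.0, 0.0, 25.0]
```

Real CMB analyses extract the T_{\ell m} from a pixelized sky map first, but the final reduction to C_\ell is exactly this sum.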

To the data!

Here's the actual power spectrum measured by Planck, shown as red dots:

Planck power spectrum

The quantity on the vertical axis is \frac{1}{2\pi}\ell(\ell + 1) C_\ell. Beyond \ell = 10 or so, each point represents an average over a few different values of \ell. You don't see points for \ell = 0 or \ell = 1 on the plot because the amplitude of the \ell = 0 mode is just the average CMB temperature over the entire sky, which is probably skewed by sources of radiation local to our little corner of the universe, and the amplitude of the \ell = 1 mode is primarily due to our motion relative to the CMB. So the meaningful physics starts at \ell = 2.

At this point, I would love to dig into the models, and explain where some of those features in the power spectrum come from. But that will have to wait for another day. Stay tuned for the sequel to this post, coming soon, where I'll talk about the physics that makes the CMB power spectrum what it is!