ellipsix informatics

About saturation

Time to kick off a new year of blog posts! For my first post of 2015, I'm continuing a series I've had on hold since nearly the same time last year, about the research I work on for my job. This is based on a paper my group published in Physical Review Letters and an answer I posted at Physics Stack Exchange.

In the first post of the series, I wrote about how particle physicists characterize collisions between protons. A quark or gluon from one proton (the "probe"), carrying a fraction x_p of that proton's momentum, smacks into a quark or gluon from the other proton (the "target"), carrying a fraction x_t of that proton's momentum, and they bounce off each other with transverse momentum Q. The target proton acts as if it has different compositions depending on the values of x_t and Q: in collisions with smaller values of x_t, the target appears to contain more partons.

kinematic diagram of proton composition

At the end of the last post, I pointed out that something funny happens at the top left of this diagram. Maybe you can already see it: in these collisions with small x_t and small Q, the proton acts like a collection of many partons, each of which is relatively large. Smaller x_t means more partons, and smaller Q means larger partons. What happens when the partons become so numerous, and so large, that they can't all fit?

Admittedly, that may not seem like a problem at first. In the model I've been using so far, a proton is a collection of particles. And it seems totally reasonable that when you're looking at a proton from one side, some of the particles will look like they're in front of other particles. But this is one of those situations where the particle model falls short. Remember, protons are really made of quantum fields. Analyzing the proton's behavior using quantum field theory is not an easy task, but it's been done, and it turns out an analogous, but very serious, problem shows up in the field model: if you extrapolate the behavior of these quantum fields to smaller and smaller values of x_t, you reach a point where the results don't make physical sense. Essentially it corresponds to certain probabilities becoming greater than 1. So clearly, something unexpected and interesting has to happen at small x_t to keep the fields under control.

Parton branching and the BFKL equation

To explain how we know this, I have to go all the way back to 1977. Quantum chromodynamics (QCD), the model we use to describe the behavior of quarks and gluons, was only about 10 years old, and physicists at the time were playing around with it, poking and prodding, trying to figure out just how well it explained the known behavior of protons in collisions.

Most of this tinkering with QCD centered around the parton distributions f_i(x, Q^2), which I mentioned in my last post. Parton distributions themselves actually predate QCD. They first emerged out of something called the "parton model," invented in 1969, which is exactly what it sounds like: a mathematical version of the statement "protons are made of partons." So by the time QCD arrived on the scene, the parton distributions had already been measured, and the task that fell to the physicists of the 1970s was to try to reproduce the measurements of f_i(x, Q^2) using QCD.

When you're testing a model of particle behavior, like QCD, you do it by calculating something called a scattering cross section, which is like the effective cross-sectional area of the target particle. If the target were a sphere of radius r, for example, its cross section would be \pi r^2. But unlike a plain old solid sphere, the scattering cross section for a subatomic particle depends on things like how much energy is involved in the collision (which you may remember as \sqrt{s} from the last post) and what kinds of particles are colliding. The information about what kinds of particles are colliding is represented mathematically by the parton distributions f_i(x, Q^2). So naturally, in order to make a prediction using the theory, you need to know the parton distributions.

The thing is, we actually can't calculate them! Believe me, people are trying, but there's a fairly fundamental problem: parton distributions are nonperturbative, meaning they are inextricably linked to the behavior of the strong interaction when it is too strong for standard methods to handle. Physicists already knew this in the 1970s. However, that didn't stop them from trying to calculate something about the parton distributions which could be linked to experimental results.

perturbative and nonperturbative parton distributions

It turns out that even though the exact forms of the parton distributions can't be calculated from quantum field theory, you can calculate their behavior at small values of x, the green part on the left of the preceding diagram. In 1977, four Russian physicists — Ian Balitsky, Victor Fadin, Eduard Kuraev and Lev Lipatov — derived from QCD an equation for the rate of change of parton distributions with respect to x, in collisions with energy \sqrt{s} much larger than either the masses of the particles involved or the amount of energy transferred between them (Q, roughly). In modern notation, the equation (which I will explain later) is written

\pd{N(x, Q^2, \vec{r}_{01})}{\ln\frac{1}{x}} = \frac{\alpha_s}{2\pi}\int\uddc\vec{r}_2\frac{r_{01}^2}{r_{02}^2r_{12}^2} [N(x, Q^2, \vec{r}_{02}) + N(x, Q^2, \vec{r}_{12}) - N(x, Q^2, \vec{r}_{01})]

N is something called the color dipole cross section, which is related to f from before via an equation roughly like this:

f(x, Q^2) = \int^{Q^2}\iint N(x, k^2, \vec{r})\uddc\vec{r}\udc k^2

That's why f is often called an integrated parton distribution and N an unintegrated parton distribution. I won't go into the details of the difference between N and f, since both of them show the behavior I'm going to talk about in the rest of this post.

Anyway, the behavior that Balitsky, Fadin, Kuraev, and Lipatov analyzed comes from processes like these:

parton branching diagrams

At each vertex, one parton with a certain fraction x of the proton's momentum splits into other partons with smaller values of x. You can see this reflected in the equation: the term -N(x, Q^2, \vec{r}_{01}) represents the disappearance of the original parton, and N(x, Q^2, \vec{r}_{02}) + N(x, Q^2, \vec{r}_{12}) represents the creation of two new partons with smaller momentum fractions x. When this happens repeatedly, it leads to a cascade of particles with lower and lower momentum as the branching process goes on. This explains why the number of partons, and thus the parton distributions, increases as you go to smaller and smaller values of x.

This BFKL model has been tested in experiment after experiment for many years, and it works quite well. For example, in the plot below, from this paper by Anatoly Kotikov, you can see that the predictions from the BFKL equation (solid lines) generally match the experimental data (dots with error bars) quite closely.

comparison of F2 experimental data and BFKL predictions

The plot shows the structure function F_2, which is a quantity related to the integrated parton distribution.

Parton recombination

However, there is one big problem with the BFKL prediction: it never stops growing! After all, if the partons keep splitting over and over again, you keep getting more and more of them as you go to lower momentum fractions x. Mathematically, this corresponds to exponential growth in the parton distributions:

\mathcal{F}(x, Q^2) = \ud{f}{Q^2} \sim \frac{x^{-4\bar{\alpha}_s\ln 2}}{\sqrt{\ln\frac{1}{x}}}

which is roughly the solution to the BFKL equation.
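To see just how fast that grows, it helps to rewrite the power of x as an exponential in \ln\frac{1}{x} (a rough numerical illustration of my own; the coupling value is just a ballpark choice):

\mathcal{F} \sim \frac{1}{\sqrt{\ln\frac{1}{x}}} \exp\left(\lambda\ln\frac{1}{x}\right), \qquad \lambda = 4\bar{\alpha}_s\ln 2

For \bar{\alpha}_s \approx 0.2, \lambda \approx 0.55, so every factor-of-10 decrease in x multiplies the distribution by roughly 10^{0.55} \approx 3.5 — exponential growth in \ln\frac{1}{x}, only mildly tamed by the square root in the denominator.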

If the parton distributions get too large, when you try to calculate the scattering cross section, the result "breaks unitarity," which effectively means the probability of two partons interacting becomes greater than 1. Obviously, that doesn't make sense! So this exponential growth that we see as we look at collisions with smaller and smaller x_t can't continue unchecked. Some new kind of physics has to kick in and slow it down. That new kind of physics is called saturation.

The physical motivation for saturation was proposed by two physicists, Balitsky (the same one from BFKL) and Yuri Kovchegov, in a series of papers starting in 1995. Their idea is that, when there are many partons, they actually interact with each other — in addition to the branching described above, you also have the reverse process, recombination, where two partons with smaller momentum fractions combine to create one parton with a larger momentum fraction.

parton recombination

At large values of x, when the number of partons is small, it makes sense that not many of them will merge, so this recombination doesn't make much of a difference in the proton's structure. But as you move to smaller and smaller x and the number of partons grows, more and more of them recombine, making the parton distribution deviate more and more from the exponential growth predicted by the BFKL equation. Mathematically, this recombination adds a negative term, proportional to the square of the parton distribution, to the equation.

\pd{N(x, Q^2, \vec{r}_{01})}{\ln\frac{1}{x}} = \frac{\alpha_s}{2\pi}\int\uddc\vec{r}_2\frac{r_{01}^2}{r_{02}^2r_{12}^2} [N(x, Q^2, \vec{r}_{02}) + N(x, Q^2, \vec{r}_{12}) - N(x, Q^2, \vec{r}_{01}) - N(x, Q^2, \vec{r}_{02})N(x, Q^2, \vec{r}_{12})]

When the parton density is low, N is small and this nonlinear term is pretty small. But at high parton densities, the nonlinear term has a value close to 1, which cancels out the other terms in the equation. That makes the rate of change \pd{N}{\ln\frac{1}{x}} approach zero as you go to smaller and smaller values of x, which keeps N from blowing up and ruining physics.

By way of example, the following plot, from this paper (my PhD advisor is an author), shows how the integrated gluon distribution grows more slowly when you include the nonlinear term (solid lines) than when you don't (dashed lines):

plot of BFKL and BK solutions

So where does that leave us? Well, we have a great model that works when the parton density is low, but we don't know if it works when the density is high. That's right: saturation has never really been experimentally confirmed, although it's getting very close. In the third and final post in this series (not counting any unplanned sequels), I'll explain how physicists are now trying to do just that, and how my group's research fits into the effort.


A look back at 2014 on the blog

Every New Year's Eve I do a review of my favorite blog posts from the past year. And normally I have too many good physics posts to make a top 10 list like so many other sites seem to do. But not this year. It's been a pretty quiet year for blogging, especially for physics blogging (unless you count that one really big blog post they call a dissertation).

Therefore, New Year's resolution #1: write more blog posts about interesting physics. This is one I actually think I can keep.

For now, here is a short list of my favorites out of the 32 blog posts I wrote this year.


Adventures in China: The Christmas

Guess where this is?

pile of presents in restaurant lobby

This is the restaurant where I went to dinner last night. A fancy, yet very definitely Chinese restaurant. In China.

News flash: Americans aren't the only ones obsessed with Christmas.

Okay, to be fair, nobody turns Christmas into an obsession quite like the United States. I think the frantic rush to start making preparations in September is a uniquely American tradition. But the celebration is catching on among the Chinese, especially young people, in a big way. From what I hear, a lot of Chinese are taking Christmas as an occasion to spend more time with their families. And businesses are capitalizing on the spirit by putting up holiday-themed decorations — lights, presents, and even decorated trees are everywhere.

ornamented stairs at Best Western

tree at Best Western

As I write this, I've been sitting in the Beijing airport for five hours listening to a loop of "Santa Baby," "There's No Place Like Home For The Holidays," "Silver Bells," "Jingle Bells," and a rather Hawaiian-sounding rendition of "Let It Snow" (notable for the contrast with the complete lack of snow outside).

I guess the lesson is, if you're tired of the Christmas frenzy, you might be able to hide, but you can't run. It's everywhere.


Website back up

Hooray, it works again! It took about 3 days of frantic hacking in the free time I had left over from research, but my website is back up and working properly (so it seems) on the new server. More blog posts to come, when I have time. Soon, I promise.

That is all.


Switching servers

The (virtual) computer this site runs on is showing its age, so I'm switching over to a newer one within the next day or so. Just so you know, in case you have any trouble accessing the site.

While I'm on the subject, kudos to Linode for having a very solid migration plan and for making continual upgrades to their hardware while lowering prices. The new server costs about a third as much as the one I'm using now.


Introducing pwait

Today I'm announcing pwait, a little program I wrote to wait for another program to finish and return its exit code. Download it from GitHub and read on...

Why pwait?

Because I was procrastinating one day and felt like doing some systems programming.

Seriously though. Sometimes you need to run one program after another one finishes. For example, you might be running a calculation and then you need to run the data analysis script afterwards, but only if the calculation finished successfully. The easy way to do this is

run_calculation && analyze_data

(sorry to readers who don't know UNIX shell syntax, but then again the program I'm introducing is useless to you anyway).

Which is fine if you plan for this before you run the calculation in the first place, but sometimes you already started the calculation, and it's been running for 3 hours and you don't want to stop it and lose all that progress. The easy way to do this is to hit Ctrl+Z (or some equivalent; it depends on your terminal) to suspend the calculation, and then run

fg && analyze_data

which will resume it and run the analysis script afterwards.

Which is fine if the program is actually running in a terminal where you can get to it to suspend it, and doesn't already have something else set to run after it. But what if it's not?

Or what if doing this doesn't give you enough hacker street cred?

This is where pwait comes in. You run it as

pwait <pid>

and it will wait for the process with ID <pid> to finish, intercept the exit code, and return it as pwait's own exit code. You can use this to passively observe whether a program finishes successfully or not. Or, at least, semi-passively.

Blurry animation of pwait at work

How it works

pwait uses the ptrace system call to attach itself as a tracer to the process you want to wait for. A tracer process can do all sorts of things to its tracee, including stopping and starting it, examining its memory, changing values in its memory, filtering signals that are going to be sent to it, and so on. ptrace is mainly used by debuggers. But pwait ignores most of its tracer superpowers, only watching out for one thing: the notification the tracer gets when the tracee is about to exit. The exit code comes bundled with that notification, so pwait copies it and exits with the same code itself.

Using ptrace has some drawbacks. For instance, you can't have multiple tracers tracing the same process. This means you can't wait for a program you're debugging. (I can't imagine this ever really being a problem, but you never know.) You also can't wait for a single program with multiple instances of pwait. (I could imagine this being an inconvenience.)

To get around some of these issues, I added a netlink mode. netlink is a way for the Linux kernel to pass messages to and from normal (userspace) programs. Of course, there are many ways messages get passed back and forth between the kernel and userspace, but netlink is rather generic and you can get a broad spectrum of information out of it. Of particular interest to pwait is that netlink can be configured to emit a message every time a process exits. pwait can then register itself to listen for those messages. It gets notified about every process that exits on the whole system, but it just discards all those notifications until it finds one that matches the process ID it's looking for.

netlink is definitely on the "new and fancy" end of Linux kernel tools; in fact, I could only find one website that demonstrates the functionality I needed for pwait.

How to get it

I'm not particularly confident in this program yet, so there's no formal release. Just head to GitHub and click on "Download ZIP" in the lower right, or just clone the repository if you prefer. Bug reports and feedback are very welcome!


First steps toward new scicomm conferences

Join the Google group mailing list to stay informed or to help with planning!

My post last week considering options for a new science communication conference series got a pretty strong response, at least relative to most things on this blog. As it turns out, there already are some people in various stages of planning new (un)conferences in the style of Science Online, much like what I was thinking about. I won't say anything about them here because those people haven't revealed their plans yet, but I hope they will go public soon!

I also completely forgot that Science Online was not monolithic; it had regional branches around the US and around the world, which were largely separate from the main organization. At least two of them are still holding events: Science Online Leiden and Science Online DC. (There are also branches in Boston, Denver, and Vancouver, maybe others that I don't know about, but they seem to be inactive.) These smaller groups could play a big role in the future of the science communication community, since as several people have pointed out, it's a lot easier to organize events that involve fewer people. Perhaps a big international conference is too much to plan for in the next couple of years, but bringing together communicators from a couple of neighboring states? Not so hard. If there's no science communication group in your area, why not start one? If you do, it'd be an excellent thing to announce on the mailing list!

In fact, the same goes for topical conferences, like the massively successful Science Online Oceans. Again, a smaller conference is easier to organize, and it could build up over time to become something for the whole community.

Of course, there's no reason for me to only be writing about Science Online affiliates. I just do it because those are the events and groups I know, or can easily find out about. Actually, a lot of people I've heard from think that we should see any new conference, not as replacing Science Online, but as an opportunity to construct an event that the science communication community wants, from the ground up. I agree. After all, Science Online had its share of problems; the brand is somewhat tarnished, and any new events would probably do well to set themselves apart from that history.

Toward a new conference

While other people pursue their plans for new conferences, I've been musing on the seven-step "plan" (if you can call it that) I laid out in my earlier blog post. Here are some thoughts on the early steps, in light of what people have told me in the past week:

  1. Put together a group with organizational experience: the Science Online "regulars" were no strangers to organizing events. After all, if you want to communicate with people, bringing the people to you is step 1. So the talent and the experience are out there. I've actually been in touch with several people who would be very capable of planning a new conference, once they decide it's time to go ahead and do it.
  2. Figure out what went wrong with Science Online: a lot of things. Here's a (partial) list, in fact. Here's another one. But this step is ongoing.
  3. Gauge interest: Yes, people are interested. Maybe not all the same people who used to regularly attend Science Online events, but a lot of them are interested enough that — as I mentioned above — they were talking about plans for some kind of new event even before my first blog post on the matter. The trick seems to be putting the interested attendees in touch with the interested organizers, which is what I'm trying to do right now.

The rest of the details — time, location, content, name, sponsors — is stuff for the future. For now, I think it's all about communication. So, whether you're interested in planning a conference or just want to be kept up to date on what everyone else is doing, please, join the mailing list!


Adventures in China: the toys of the trade

My boss got me a new toy today.

it's a Mac!

This is one of the perks of working for a well-funded research group, I guess. And a new research group. It's not often that you get your foot in the door right when they're buying equipment.

It's also a perk of being a phenomenologist (which is like being a theorist but sometimes we measure things we can't calculate). Unlike experimental physicists, who have to spend their budgets on all sorts of exotic lab equipment (which I'm given to understand means obscene amounts of duct tape and aluminum foil), all you need for phenomenology is a computer, pencil and paper, and a place to sit. So there's really no reason not to blow as much money as possible on nice equipment. And this is nice equipment. It's literally the best Apple computer you can buy over here, featuring a 27 inch display (oooooh) and OS X Yosemite, the newest update to the operating system.

Not that I don't have reason to complain. The system stalled twice before I even managed to finish the setup procedure.

Eternal Flame

I guess I have to start making offerings to the Apple Gods now? Or the spirit of Steve Jobs?


On compiler warnings (and off them, too)

Quick, what's wrong with this C++ program?

#include <iostream>

using namespace std;

int test(int arg) {
    cout << arg << endl;
}

int main(int argc, char** argv) {
    test(argc);
    return 0;
}

Did you guess nothing at all? Because that's what GCC says:

$ g++ -o funnyprogram funnyprogram.cpp


Pretty much every other programming language that makes you explicitly identify a function's return type will also make you actually return something from that function. C and C++ don't, and furthermore GCC doesn't even warn you that anything is wrong. This can occasionally lead to serious bugs, as I discovered today in this real-world example. I had a function that checks the name of an object and returns an enum value based on that name.

virtual const HardFactorOrder get_order() const {
    // relies on a particular convention for get_name()
    // but can be overridden for hard factors where that convention doesn't apply
    std::string name = get_name();
    if (name.compare(0, 3, "H01") == 0) {
        return MIXED;
    }
    else if (name.compare(0, 2, "H0") == 0) {
        return LO;
    }
    else if (name.compare(0, 2, "H1") == 0) {
        return NLO;
    }
}
That was all good when all the objects involved had names conforming to the convention, but my latest batch of updates to the code involves objects with totally different names, and I forgot to override get_order(). So the default implementation above was getting used. Instead of failing with an error when none of the patterns matched, it was just not returning anything, and the variable that I set the return value to was getting assigned some random binary nonsense. Something like 2692389, where the legal values were 0, 1, and 2.

Needless to say, if GCC had complained about this from the start, I wouldn't have spent at least an hour staring at tiny text in a debugger.

tiny text in a debugger

There are over a hundred warnings that GCC can be configured to emit. Some of them are relatively useless, but most of them probably should be enabled if you want to save yourself a lot of debugging time. I pored through the manual and came up with the following set of warnings for myself:

-Wall -Wextra -Wformat-security -Wmissing-include-dirs -Wuninitialized
 -Wtrampolines -Wconversion -Wuseless-cast -Wjump-misses-init -Wlogical-op
 -Wstrict-prototypes -Wctor-dtor-privacy -Wold-style-cast -Wno-reorder
 -Wno-unused-parameter -Werror=delete-non-virtual-dtor -Werror=return-type

That's quite a mouthful, though a build system like CMake can handle it easily; just copy and paste into the appropriate spot in the configuration file. For times when you invoke GCC manually, you can put those options in a file, perhaps ~/Wreally-all, and then run

g++ @/home/user/Wreally-all -o program program.cpp ...

which will include the contents of the file as if you had specified it on the command line.

Feel free to use this as a starting point for figuring out what set of warnings is most useful for your own environment.



I'm having a little too much fun with my newly-discovered ability to embed Twitter widgets. Enjoy these scientific twists on popular movie titles.