ellipsix informatics

Ellipsix Informatics: the personal website and blog of David Zaslavsky.

I'm a graduate student studying theoretical particle physics, and I also do a lot of computer programming.

2014 Nov 22

#scienceamovietitle

I'm having a little too much fun with my newly-discovered ability to embed Twitter widgets. Enjoy these scientific twists on popular movie titles.

2014 Nov 20

Another Mathematica bug

Math is hard.

Not for Barbie, but for Mathematica.

I ran into a weird Mathematica bug while trying to evaluate the sum

\sum_{k=1}^{\infty} \biggl[-\frac{\pi^2}{6} + \psi'(k + 1) + H_k^2\biggr]\frac{z^k}{k!}

Split this into three parts. The first one is the well-known expansion of the exponential function

-\frac{\pi^2}{6}\sum_{k=1}^{\infty} \frac{z^k}{k!} = -\frac{\pi^2}{6}(e^z - 1)

The second is not the well-known expansion of the exponential function.

\sum_{k=1}^{\infty} \psi'(k + 1)\frac{z^k}{k!} \neq \frac{\pi^2}{6}(e^z - 1)

Obviously not, in fact, since if two power series are equal, \sum_n a_n z^n = \sum_n b_n z^n, for an infinite number of points, their coefficients have to be equal term by term: \forall n,\ a_n = b_n. (You can show this by taking the difference of the two sides and plugging in a bunch of different values of z.)
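To make the "obviously not" concrete, here is a short sketch of why the coefficients differ. It assumes only the standard recurrence \psi'(x + 1) = \psi'(x) - 1/x^2 together with \psi'(1) = \pi^2/6; the symbol H_k^{(2)} (the second-order harmonic number) is shorthand introduced here and is not part of the original sum.

```latex
\psi'(k + 1) = \frac{\pi^2}{6} - \sum_{j=1}^{k} \frac{1}{j^2}
             = \frac{\pi^2}{6} - H_k^{(2)}
             < \frac{\pi^2}{6} \qquad (k \ge 1)
```

Every coefficient of the second series is therefore strictly smaller than the matching coefficient of \frac{\pi^2}{6}(e^z - 1), so the two sides cannot agree for any z > 0.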

I guess Mathematica doesn't know that.

In[1]:= Sum[PolyGamma[1, k + 1] z^k/k!, {k, 1, Infinity}]
Out[1]= 1/6 (-1 + E^z) Pi^2

I had my hopes up for about two days that these two terms would cancel out, but I should have gone with my instinct that something was fishy about that result.
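A quick numerical check makes the discrepancy visible. The following is a minimal Python sketch, not the calculation from the post: it evaluates a partial sum of the series at a sample point and compares it with the closed form Mathematica returned. The function names, the choice z = 1, and the 60-term cutoff are all illustrative assumptions, and the trigamma values come from the identity \psi'(k + 1) = \pi^2/6 - \sum_{j=1}^{k} 1/j^2 rather than from a special-function library.

```python
import math

def trigamma_at_integer(k):
    """psi'(k + 1) for integer k >= 0, via pi^2/6 - sum_{j=1}^{k} 1/j^2."""
    return math.pi**2 / 6 - sum(1.0 / j**2 for j in range(1, k + 1))

def partial_sum(z, terms=60):
    """Partial sum of sum_{k>=1} psi'(k + 1) z^k / k! (hypothetical helper)."""
    total = 0.0
    term = 1.0  # holds z^k / k!, updated one factor at a time
    for k in range(1, terms + 1):
        term *= z / k
        total += trigamma_at_integer(k) * term
    return total

def claimed_closed_form(z):
    """The result Mathematica reported: (pi^2/6)(e^z - 1)."""
    return math.pi**2 / 6 * (math.exp(z) - 1)

z = 1.0
numeric = partial_sum(z)
claimed = claimed_closed_form(z)
print(f"partial sum: {numeric:.6f}, claimed closed form: {claimed:.6f}")
```

If the closed form were right, the two printed numbers would agree to many digits; a check like this takes a minute and would have flagged the result immediately.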

Apparently we have to go all the way back to version 7 to get a correct answer.

I'm still working on the rest of it.

Can you trust your calculator?

This is actually emblematic of a big problem that is on many people's minds right now because of a recent article in the Notices of the American Mathematical Society, "The Misfortunes of a Trio of Mathematicians Using Computer Algebra Systems. Can We Trust in Them?" In this paper, the authors construct some fairly simple mathematical expressions for which Mathematica gives egregiously wrong results. (The expressions may not look simple to us, but they're polynomials with integer coefficients, pretty much the easiest mathematical objects for computers to handle.)

Forget math, it's programming that's hard!

What all this shows is that relying on the results of a computer program is dangerous without some kind of independent verification that it does what you think it does. Anyone who's ever written a program should understand that. But it's all too easy to forget; we get lulled into a false sense of security by the 90% of the time that programs do work, plus the 9% of the time that they seem to work because whatever errors they produce are buried in a giant pile of output.

I think sometimes we could all do with a reminder like this not to get too trusting.

2014 Nov 19

The future of science online without Science Online

Join the Google group mailing list to stay informed or help with planning!

It's been a little more than a month since Science Online, the organization famous for its annual series of conferences (and infamous for the sexual harassment perpetrated by one of its founders), announced that it was disappearing for good.

As a result of our state of insolvency, the ScienceOnline board of directors voted on Oct. 6 to proceed with a plan for dissolution, which we will implement over the coming weeks.

One unfortunate but necessary consequence of this decision is that we have to cancel the ScienceOnline Together 2015 conference scheduled for Atlanta in February. We have notified those who have already registered for ScienceOnline Together 2015 and will be fully refunding registration fees.

The decision took a lot of people by surprise, since the organization had been in the middle of planning their 2015 conference in Atlanta. But all of a sudden, with no warning, it was all over.

When the news broke, a lot of former attendees took to Twitter to share their favorite memories of past conferences.

My time at last year's conference, as a first-time participant, was pretty close to being the best three days of my life.

I'm not ready to give up on that.

I'm not the only one, either.

Science Online had some great aspects that are worth preserving. The "unconference" format of moderated discussions, in particular, did an amazing job of drawing new people into the community. And the half-hour coffee breaks between sessions, organized lunches, and evening activities really emphasized the idea that networking is the most important part of an event like this. I want future generations of science communicators to be able to experience that inclusiveness too.

A new conference?

At the moment, I don't see anyone else making serious plans for a successor to Science Online Together, so I'm going to push for this myself.

There are a few things that need to be done before a new conference series can take Science Online's place.

  1. Put together a steering committee with people who have expertise in organizing, and who are also legitimately committed to seeing the legacy of Science Online continue. Goodness knows I don't have any idea how to organize a conference, especially not from halfway around the world, but I do know it takes a lot of work from many, many people.

  2. Figure out why Science Online failed in the first place, and how to avoid making the same mistakes. Everyone seems to have their own ideas, ranging from the cost of the conference and travel, to the change in location, to the fact that many "regulars" were taking a break this year. Most people probably suspect that Bora's sexual harassment scandal and the organization's inadequate handling of it at Science Online 2014 played a role.

  3. Decide whether there is enough interest to hold a conference at all, and if so, how large it should be. Right now the dominant attitude in the community seems to be that Science Online was nice while it lasted, but now it's dead and gone and that's all there is to it. If that's really the case, then it's pointless trying to start a new conference series. But I don't think that's the case. You don't sell out registration spots in an hour year after year unless you have a really committed audience.

    • This probably involves getting a message out to all those who would have attended Science Online Together 2015, and others in the science communication community. I intend to ask the board whether they would be willing to pass along a message, or share the contact information that people made public when they registered.

  4. Decide on a time. Moving the conference from the traditional February-March time slot to another time (summer?) might improve people's ability to attend. Besides that, though, does an annual conference make sense? If the cost is prohibitive, moving to every 18 months or every two years might help keep the series strong.

    • This would also be something to ask potential Science Online attendees about.

  5. Decide on a location. The Science Online people already had Atlanta selected, but I heard some complaints about it, so is it really the best choice? Would it be better to rotate between different locations, as a lot of other conference series do?

  6. Check for interested sponsors. Some of the existing Science Online sponsors might be willing to continue funding a new conference series, but I bet that gets less likely the longer we wait to ask them. This is also a prime opportunity to bring new sponsors into the fold. I'm sure there are companies that are involved in online science communication that would love a chance to advertise themselves to an interested and influential audience.

  7. Come up with an awesome name. The obvious choice is to try to get the rights to use the Science Online trademark (it is a trademark, right?). But I think, given the organization's history, it may make more sense to come up with something new to distance the new conference series from the shortcomings of the old one.

    Actually, I've got this one covered. The Science Communication Initiative Online eXchange. It's descriptive and sounds modern and dynamic, and was definitely not just the first thing I could think of that would let us keep the #sciox hashtag.

Originally I had considered trying to take over the conference center booking, the existing event planning, and other preparation that had already been done by Science Online for the 2015 Atlanta conference. But that would have required acting very quickly, and also knowing that the full complement of 400 people would show up to the new conference. That's probably not realistic.

At this point, I'm thinking of planning a small, informal meeting in 2015, perhaps still in March, as a planning session for a full conference in 2016. It could be held on the same days as the original Science Online conference was scheduled, or it could be done in conjunction with the Atlanta Science Festival March 21-28, as discussed on the planning Facebook page.

What do you think?

I've created a Google group (just an email list) for planning the future of this conference series. If you want to help or just want to see how it's going, head to Google and sign up! You do not need a Google account; any email address will do. (Any problems, email me or Tweet at me.)

2014 Nov 17

How one line in one file made me reinstall Gentoo

Hey, internet. Long time no see.

(It's often claimed "long time no see" is a literal translation from Chinese "好久不见", "hǎo jiǔ bú jiàn", but actually nobody knows for sure.)

On Thursday, my computer crashed. It didn't just crash; it somehow corrupted itself so badly that I couldn't even boot it. It survived for two seconds after being turned on, before bailing out with this error:

Linux crashes after two seconds

init[1]: segfault at 0 ip 00007ff10ea3fe05 sp 0007ffff7cb49148 error 4 in libc-2.19.so[7ff10e919000+19e000]
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

This is so early in the startup process that nothing is running. It's pretty much just the kernel and init (which, if you don't know, is the very first program to run when a Linux system starts, the one that runs all other programs).

So of course I recompiled (it's Gentoo) and reinstalled both the kernel and init, as well as libc, several times, as well as several other programs that I'm pretty sure didn't even get a chance to start, so how could they have anything to do with this crash? Still, better safe than sorry I guess.

None of it made any difference. After replacing every program that could possibly be running at the time of the error, init still crashed. This is probably the most frustrating error I've seen in 15 years as a Linux user.

So I spent my weekend reinstalling everything on my computer.

Verifying downloads before GPG

One of the easy ways a nefarious group could hack into my computer is by intercepting the source code I download when I'm installing a program and modifying it to do, well, whatever they want. China's government tries to maintain pretty strict control over the internet in their country, so I wouldn't put it past them to do this. (To be fair, I wouldn't put it past the US government either.) One way I could get around that is to download everything from a trusted server in the US or some other country, but the internet access here is kind of slow, especially going in or out of the country. It'd be a lot quicker to download all my source code within China, and I need all the time I can get.

The alternative is to use cryptography. Gentoo already does this, to some extent: whenever I install a program, the installer first checks the SHA256 hash of the source code against an internal database to make sure the source code hasn't been tampered with. But that internal database is also something I need to download, and how do I know that hasn't been tampered with?
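The hash check described here can be sketched in a few lines of Python. This only illustrates comparing a file's SHA256 digest against a known value; it is not how Gentoo's installer implements the check, and the function names and the idea of passing the expected digest in as a string are assumptions made for the example.

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected_hex):
    """True if the file's digest equals the expected digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A call like `matches_expected("some-source.tar.gz", digest_from_database)` returns False as soon as a single byte of the download has been altered, which is exactly why the database of expected digests is the piece that needs protecting.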

Of course, the Gentoo Release Engineering Team (the people who maintain this internal database) have thought about this, and they cryptographically sign the database files. I'll skip the details, but through the "magic" of public-key cryptography, I can check that the SHA256 hashes I download are the same ones they upload by checking one single 40-letter key fingerprint, just once.

Here's the catch: to verify the hashes, I need to have GPG installed, and to install GPG, I need to verify the hashes of its source code and all the other code it depends on. My bright idea to get around this was to write this little script, a fake version of GPG:

#!/bin/bash
# Fake gpg: print the requested invocation and wait for manual confirmation.
echo "gpg invoked: gpg $*"
echo "please do this manually and type ok if successful"
read -n 2   # reads two characters into $REPLY
echo
if [[ "$REPLY" == "ok" ]]; then
    exit 0  # signature checked by hand: report success
else
    exit 1  # anything else counts as a failed check
fi

That goes into /usr/local/bin/gpg in the system I'm trying to install. This way, when I run emerge-webrsync to download the hashes, and it gets to the point where it's going to check the file's signature, it will pause so I can manually check the signature.

I spent way too much time coming up with that.

About that crash

I did eventually figure out the problem. It turned out to be a line in /etc/ld.so.preload, which is a file specifying libraries of code that the system should preload for every new program it starts up. Essentially you're patching extra computer code into the program when it runs. Extra code that it probably wasn't written to deal with. That makes an environment ripe for conflicts between the preloaded code and the original code. (The fact that this feature exists is shocking, most of the time, but it does have a few legitimate uses.)

I had tried to install the Astrill VPN client, which as part of its installation adds this line to /etc/ld.so.preload:

/lib64/$LIB/liblsp.so

I'm guessing liblsp.so is Astrill's way of intercepting everything that a program tries to send or receive over the internet. It might work for most programs (might; I'm not going to keep it around to find out), but clearly, it has a major conflict with init (which doesn't even need to access the internet anyway).
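Since a single entry in this file can take down the whole system, it's worth being able to see at a glance what is in it. Below is a minimal Python sketch that lists the entries of a preload file. It assumes entries are separated by whitespace or colons and that `#` starts a comment; the function name and the default path argument are choices made for this example.

```python
import re

def preload_entries(path="/etc/ld.so.preload"):
    """Return the library paths listed in a preload file, or [] if it is missing."""
    try:
        with open(path, "r") as handle:
            lines = handle.readlines()
    except FileNotFoundError:
        return []
    entries = []
    for line in lines:
        line = line.split("#", 1)[0]  # drop anything after a comment marker
        # split on colons and whitespace, discarding empty pieces
        entries.extend(piece for piece in re.split(r"[:\s]+", line) if piece)
    return entries

if __name__ == "__main__":
    for entry in preload_entries():
        print(entry)
```

On a rescue system, the same function can be pointed at the mounted root instead, for example `preload_entries("/mnt/gentoo/etc/ld.so.preload")`. On most machines the list is empty, so anything that shows up deserves a second look.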

It wasn't easy to find the culprit, actually; I had to boot my computer using the System Rescue CD (on a USB drive, despite the name) and find all files that were modified at the exact time I installed Astrill:

find /mnt/gentoo -mmin +536 -mmin -540

This finds all files which were last modified 538±1 minutes ago. There were about 50 of them, and it wasn't hard to pick out the one that could affect how the very first program on the computer starts up.
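The same search can be sketched in Python for anyone who would rather not remember how find's +n/-n arguments work. This is an illustrative equivalent, not a drop-in replacement: it compares fractional ages in minutes, so files right at the edge of the window may be classified slightly differently than find would. The function name and parameters are hypothetical.

```python
import os
import time

def files_modified_between(root, min_minutes, max_minutes, now=None):
    """Yield paths under root last modified between min_minutes and max_minutes ago."""
    now = time.time() if now is None else now
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are examined themselves, not their targets
                age_minutes = (now - os.lstat(path).st_mtime) / 60
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if min_minutes <= age_minutes <= max_minutes:
                yield path
```

For the situation in the post, the call would look like `list(files_modified_between("/mnt/gentoo", 537, 539))`.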

Still, I can't believe I reinstalled my entire operating system because of one line in one file... life of a Gentoo user, I guess. :-P

2014 Nov 11

There are two kinds of textbooks...

...those written in binary and — wait, that's not right!

Actually, the two kinds of textbooks, at least in physics, are educational books and reference books.

  • Educational books are good for learning a subject for the first time. They're written in a way that makes it easy to read a whole section or chapter without getting overwhelmed. They have clear summaries, walk you through simple examples, and put their topic in context to keep you interested. These books are almost telling a coherent story as much as they are imparting information.

    Examples include David Griffiths' books on quantum mechanics, electromagnetism, and particle physics, Daniel Schroeder's thermodynamics book, James Hartle's book on general relativity, and Barton Zwiebach's book on string theory.

  • Reference books are good for looking up the details of a specific procedure or situation. They're densely packed with information, but that makes it very hard to read a large section at a time, the way you would read other books. Instead, a reference book is most useful when you need to know one particular fact and can look up just the part of the book that deals with that fact. In these books, the individual sections should be more self-contained because you don't want to read through the whole book to get the information you need.

    Examples include Cohen-Tannoudji, Diu, and Laloë on quantum mechanics; Jackson on electromagnetism; Misner, Thorne, and Wheeler on general relativity; and Green, Schwarz, and Witten on superstring theory.

One of the biggest mistakes a textbook author can make is failing to decide which kind of book they want to write. While most textbooks have some features of each type, no single book can adequately fill both roles. Trying to do it just makes your book bad at both.

Of course, one of the biggest mistakes a textbook reader can make is expecting an educational book to be good as a reference, or vice versa. A lot of trained physicists do this when they complain about Griffiths' books being insufficiently rigorous, for example. Those books aren't meant for trained physicists; they're meant for undergraduates, who would just be confused by "sufficient" rigor.