More git for newbies: merge vs. rebase

Posted by David Zaslavsky on February 8, 2010 9:32 PM


One of the things everybody points out about git is that it’s a fairly complex system. Of course, other version control systems like Subversion are complex as well, but git doesn’t seem to do as much as the others to insulate you (the user) from what’s going on “under the hood.” Case in point: the difference between merging and rebasing.

Merging was a simple enough idea (not) to get used to when I was using Subversion. Basically the idea is this: you have your copy of the versioned material (in Subversion: the working copy), and somewhere else there’s a remote copy of the material (in Subversion: the repository). When both your local copy and the remote copy have been modified since they were last synced up, you have two sets of changes to the same data: your local changes and the remote changes. If you’re going to sync your copy to the remote copy again, you need to combine those two sets of changes. The way Subversion does it for you is a two-step process (which in Subversion terminology is called “updating”):

1. Compute what (remote) changes were made to the remote copy since you last synced with it.
2. Apply those changes to your local copy and hope they don’t conflict with any of the local changes.

This is basically what git calls a “merge.” (Except: the repository kept in the .git subdirectory of the working copy is the “remote copy” for purposes of this discussion. Typically, you would do this just after syncing your own repository to some other repository that actually is “remote,” so the changes are still coming from a remote source, just indirectly.)

But it turns out that there’s another way to put your local changes together with the remote changes. You can temporarily undo your local changes, apply the remote changes, and then reapply your local changes. This is what git calls a “rebase.” Subversion doesn’t let you do this automatically, although I’ve done it manually a few times, so this should be a welcome feature in git.
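To make the three steps of a rebase concrete, here is a toy sketch in Python. It assumes a "history" is nothing more than a list of change labels, oldest first; the function name rebase and the labels A, B, L1, L2, R1 are made up for illustration. This is not how git actually stores or replays commits, just the idea described above.

```python
# Toy model only: a history is a plain list of change labels, oldest first.
# The names and labels here are hypothetical, not git internals.

def rebase(base, local_changes, remote_changes):
    """Replay local changes on top of the remote changes."""
    # Where you are now: the shared base plus your local changes.
    history = list(base) + list(local_changes)
    # 1. Temporarily undo (set aside) the local changes.
    history = history[:len(base)]
    # 2. Apply the remote changes.
    history += list(remote_changes)
    # 3. Reapply the local changes on top.
    history += list(local_changes)
    return history


base = ["A", "B"]      # changes both copies already share
local = ["L1", "L2"]   # changes made only in the local copy
remote = ["R1"]        # changes made only in the remote copy

rebased = rebase(base, local, remote)
print(rebased)  # ['A', 'B', 'R1', 'L1', 'L2']
```

The thing to notice is that the local changes end up after the remote ones, as if they had been made starting from the up-to-date remote copy.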

In a workflow like mine (one person using git/SVN to keep work in sync on multiple computers), there’s usually no difference in the resulting files between a merge and a rebase. When the local changes and the remote changes aren’t directly in conflict, it doesn’t matter whether I apply the local changes and then the remote changes (merge) or the remote changes followed by the local changes (rebase). When there is a conflict, though, I’m leaning toward rebase, because it ensures that the remote changes always get applied to my work. It basically ensures that my local copy is brought up to date to match the point at which I left off on another computer. If my local changes conflict with work I did (and committed) somewhere else, I’d want to adjust the local changes so they fit with the work I’ve already committed, rather than having to fix up things that are already in the repository.
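The claim that order only matters when changes conflict can be checked with another toy sketch. It assumes each change simply sets the whole content of one file; the helper apply_changes and the file names are hypothetical. Real git would stop and ask you to resolve a conflict rather than silently letting the later change win, so treat this purely as an illustration of why ordering becomes a question at all.

```python
# Toy model only: a change is a (file name, new content) pair, and applying
# it overwrites that file. Real git reports a conflict instead of overwriting.

def apply_changes(files, changes):
    """Return a copy of files with each change applied in order."""
    result = dict(files)
    for name, content in changes:
        result[name] = content
    return result


start = {"notes.txt": "v1", "todo.txt": "v1"}
local = [("notes.txt", "local edit")]

# No conflict: the remote change touches a different file.
remote = [("todo.txt", "remote edit")]
local_then_remote = apply_changes(apply_changes(start, local), remote)
remote_then_local = apply_changes(apply_changes(start, remote), local)
same_without_conflict = local_then_remote == remote_then_local

# Conflict: the remote change touches the same file as the local one.
remote_conflict = [("notes.txt", "remote edit")]
local_first = apply_changes(apply_changes(start, local), remote_conflict)
remote_first = apply_changes(apply_changes(start, remote_conflict), local)
same_with_conflict = local_first == remote_first

print(same_without_conflict)  # True
print(same_with_conflict)     # False
```

In the conflicting case, applying the remote change first leaves the local edit on top, which is the situation where I get to adjust my local work to fit what is already committed.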