Category Archives: Git

A Hacker’s Guide to Git

Introduction

Git is currently the most widely used version control system in the world, mostly thanks to GitHub. By that measure, I’d argue that it’s also the most misunderstood version control system in the world.

This statement probably doesn’t ring true straight away because on the surface, Git is pretty simple. It’s really easy to pick up if you’ve come from another VCS like Subversion or Mercurial. It’s even relatively easy to pick up if you’ve never used a VCS before. Everybody understands adding, committing, pushing and pulling; but this is about as far as Git’s simplicity goes. Past this point, Git is shrouded by fear, uncertainty and doubt.

Once you start talking about branching, merging, rebasing, multiple remotes, remote-tracking branches, detached HEAD states… Git becomes less of an easily-understood tool and more of a feared deity. Anybody who talks about no-fast-forward merges is regarded with quiet superstition, and even veteran hackers would rather stay away from rebasing “just to be safe”.

I think a big part of this is due to many people coming to Git from a conceptually simpler VCS — probably Subversion — and trying to apply their past knowledge to Git. It’s easy to understand why people want to do this. Subversion is simple, right? It’s just files and folders. Commits are numbered sequentially. Even branching and tagging is simple — it’s just like taking a backup of a folder.

Basically, Subversion fits in nicely with our existing computing paradigms. Everybody understands files and folders. Everybody knows that revision #10 was the one after #9 and before #11. But these paradigms break down when you try to apply them to Git’s advanced features.

That’s why trying to understand Git in this way is wrong. Git doesn’t work like Subversion at all. Which is pretty confusing, right? You can add and remove files. You can commit your changes. You can generate diffs and patches which look just like Subversion’s. How can something which appears so similar really be so different?

Complex systems like Git become much easier to understand once you figure out how they really work. The goal of this guide is to shed some light on how Git works under the hood. We’re going to take a look at some of Git’s core concepts including its basic object storage, how commits work, how branches and tags work, and we’ll look at the different kinds of merging in Git including the much-feared rebase. Hopefully at the end of it all, you’ll have a solid understanding of these concepts and will be able to use some of Git’s more advanced features with confidence.

It’s worth noting at this point that this guide is not intended to be a beginner’s introduction to Git. This guide was written for people who already use Git, but would like to better understand it by taking a peek under the hood, and learn a few neat tricks along the way. With that said, let’s begin.

(more…)

Force Bower to clone from https:// instead of git://

Most Bower packages will be fetched using a git:// URL, which connects on port 9418. This can be problematic if you’re behind a firewall which blocks this port.

You can get around this quite easily by telling Git to always use https:// instead of git://:

git config --global url.https://.insteadOf git://

Don’t use Git’s autocorrect feature

Quite often I’ve accidentally typed “git” twice. Usually this is fine, and Git just does something like this:

$ git git diff
git: 'git' is not a git command. See 'git --help'.

Did you mean this?
    init

But I recently turned on Git’s autocorrect feature, to see what it was like (git config --global help.autocorrect 1). The results were… interesting:

$ git git diff
WARNING: You called a Git command named 'git', which does not exist.
Continuing under the assumption that you meant 'init' in 0.1 seconds automatically...
fatal: internal error: work tree has already been set
Current worktree: /nfs/personaldev/vagrant/mobileweb-v2
New worktree: /nfs/personaldev/vagrant/mobileweb-v2/diff

This is really bizarre behaviour. The fact that it wants to autocorrect it to git init is sort-of okay. But rather than giving me the option to confirm that this is what I want, Git gives me a whole 0.1 seconds to hit Ctrl+C before it automatically runs the command for me.

Thankfully, git init isn’t a very destructive command. I was lucky that the only side effect of this was that Git created a new directory called diff. I can’t help but wonder what would’ve happened if Git decided to autocorrect to a more destructive command like reset, checkout, or gc.

The lesson here? Don’t use Git’s autocorrect. It really sucks.

Update: m_bright pointed out that the value of help.autocorrect is actually how many tenths of a second Git will wait before automatically executing the command. So something like git config --global help.autocorrect 10 would give you 1 second before the command is executed, which is probably slow enough to let you cancel any mistakes, and quick enough to still be useful.

Git: Ignore changes to already-tracked files

There are often times when you want to modify a file but not commit the changes, for example changing the database configuration to run on your local machine.

Adding the file to .gitignore doesn’t work, because the file is already tracked. Luckily, Git will allow you to manually “ignore” changes to a file or directory:

git update-index --assume-unchanged <file>

And if you want to start tracking changes again, you can undo the previous command using:

git update-index --no-assume-unchanged <file>

Easy!

Converting an SVN repository to Git

Today I found out just how easy it is to convert an SVN repository to Git without losing any commit history. Note that you will need git-svn (apt-get install git-svn on Debian/Ubuntu).

git svn clone http://mysvnrepo.com/my-project my-project
cd my-project
git remote add origin git@mygitrepo.com:/my-project.git
git push origin master

Et voilà, my-project.git has the full commit history of the my-project SVN repository.

If anybody knows whether SVN branches can be converted to Git branches, please get in touch!

Useful Git Configuration Items

Name and email address

Each commit you make has your name and email address attached to it. Git will automatically configure these based on your username and hostname, but this information is usually not a good identifier. It is a good idea to set your real name and email address so that your commits can be identified easily.

git config --global user.name "Your Name"
git config --global user.email you@yourdomain.com

Global ignore file

Often there are files or directories that you want Git to ignore globally. These are probably created automatically by your IDE or operating system. Git’s core.excludesfile config allows you to write a global .gitignore so that you don’t have to fill local .gitignore files with clutter.

git config --global core.excludesfile /path/to/.gitignore_global

(more…)

Deploying a Git repository to a remote server

Git’s archive command is basically the equivalent of SVN’s export – it dumps a copy of the entire repository without any of the version control files, making it perfect for deploying to a testing or production server. (more…)