2007-12-02

Backing up Gmail

My Gmail account is pretty important to me: I've been using it since 2004, and it currently holds around 4000 emails (grouped into about 2500 threads, taking up 425 MB). I'd been thinking about backing it up for a while, and I finally got around to it, succeeding after only a little frustration.

The first thing I did was configure Gmail to accept POP3 connections. I went into the Gmail settings page and, in the "Forwarding and POP/IMAP" tab, chose "Enable POP for all mail."

At first I tried fetchmail to get the emails over POP3 from Gmail. I had it all set up before I discovered that, by default, it hands each retrieved message to a mail server running on the local machine for delivery. I didn't want to go through the hassle of installing and securing sendmail or postfix, so I went looking for alternatives that would just dump the mails to a file or directory.

I found getmail (available through the getmail4 package on Ubuntu) and installed it. Here's how I configured it:

First, I set up the getmail configuration directory and the Maildir structure where the mails would go:
cd ~
mkdir .getmail/
mkdir gmail-inbox/
mkdir gmail-inbox/cur/
mkdir gmail-inbox/new/
mkdir gmail-inbox/tmp/


Then I created the file ~/.getmail/.getmailrc-gmail with the following contents:
[retriever]
type = SimplePOP3SSLRetriever
server = pop.gmail.com
port = 995
username = ankurdave@gmail.com
password = my password

[destination]
type = Maildir
path = /home/ankur/gmail-inbox/

[options]
delete = false
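
Since this file holds the account password in plain text, it seems sensible to keep it readable only by its owner. Here is a minimal sketch of that, assuming the file layout above. The restrict_rcfile helper is a made-up name for illustration, not part of getmail.

```shell
# Hypothetical helper (not part of getmail): make an rc file and the
# directory that contains it accessible only to the owner.
restrict_rcfile() {
    rcfile=$1
    # Directory: read/write/search for the owner only.
    chmod 700 "$(dirname "$rcfile")" || return 1
    # File: read/write for the owner only.
    chmod 600 "$rcfile"
}

# Example call, using the path from this post:
# restrict_rcfile ~/.getmail/.getmailrc-gmail
```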


Finally, I launched getmail to fetch the emails:
getmail --rcfile='.getmailrc-gmail'


I had to run this command several times because Gmail serves the emails in batches of 400–600 per run instead of delivering all 4000 in one go. (This took a while to figure out.)
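
The repeated runs could be automated with a small loop. This is only a sketch: fetch_until_done and count_messages are names made up for this example, and it assumes that a run which adds no new files to the Maildir means Gmail has nothing left to hand over.

```shell
# Hypothetical helpers, not part of getmail.

# Count the message files in a Maildir (one file per message).
count_messages() {
    find "$1/new" "$1/cur" -type f | wc -l | tr -d ' '
}

# Run a fetch command repeatedly until a run adds no new messages.
# Usage: fetch_until_done MAILDIR COMMAND [ARGS...]
# Prints the final message count.
fetch_until_done() {
    maildir=$1
    shift
    before=$(count_messages "$maildir")
    while :; do
        # Stop with an error if the fetch command itself fails.
        "$@" || return 1
        after=$(count_messages "$maildir")
        # No growth since the previous run: assume we are done.
        [ "$after" -eq "$before" ] && break
        before=$after
    done
    echo "$after"
}

# Example call, using the paths from this post:
# fetch_until_done ~/gmail-inbox getmail --rcfile='.getmailrc-gmail'
```

Passing the fetch command as arguments keeps the loop independent of getmail itself, so the same function works with any command that drops messages into the Maildir.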

Since getmail stores the emails in the Maildir format, with each message in its own file, I could read them while they were still being downloaded by pointing mutt at the mail directory with mutt -f ~/gmail-inbox/.

Finally, if you have to start downloading over again for some reason, you have to tell Gmail you're starting over. The web interface records how far into your mail you have downloaded, in the following line (under Settings -> Forwarding and POP/IMAP): "Status: POP is enabled for all mail that has arrived since 12/10/06", where the date is that of the last message retrieved through POP. The next time you download emails over POP, Gmail delivers only mail from after that date, to avoid duplicates. You can reset the date by choosing "Enable POP for all mail (even mail that's already been downloaded)."

So overall, backing up my Gmail account was fairly easy and clean, and now I can be sure that all that information isn't at the mercy of Google.

Update 2007-12-02 11:04:55 PM: I just finished downloading all the email. There are 4235 messages, taking up 434 MB of space. When compressed with zip, they take up 282 MB—still pretty big, but a little more manageable.
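
A sketch of the compression step, not the exact command used here: it swaps zip for tar with gzip compression, and archive_maildir and the output file name are made up for illustration.

```shell
# Hypothetical helper: pack a Maildir into a gzip-compressed tarball.
# Usage: archive_maildir MAILDIR OUTPUT.tar.gz
archive_maildir() {
    maildir=${1%/}   # strip a trailing slash, if any
    output=$2
    # -C switches to the parent directory so the archive stores
    # relative paths such as gmail-inbox/new/...
    tar -czf "$output" -C "$(dirname "$maildir")" "$(basename "$maildir")"
}

# Example call, using the directory from this post:
# archive_maildir ~/gmail-inbox ~/gmail-inbox.tar.gz
```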

4 comments:

Anonymous said...

If you use Microsoft Outlook, you can simply export your email to an Outlook PST file to back it up. I have 11 years' worth of email, and it doesn't require writing code to do a simple task.

Ankur said...

There are a few problems with that, though:

1. Outlook doesn't run on Linux. You could make the argument that there's no need for me to use Linux at all when I can use Windows, but I find Windows to be much less manageable than Linux. (It's harder to get "under the hood" when you need to.)
2. There are lots of cases where Outlook PST files have gotten corrupted. This has happened to me once, too, even though mine was only 40 MB or so.
3. I've used Outlook at work and I much prefer the Gmail interface. Heck, I even prefer the Mutt interface to Outlook's. Why? Outlook is seriously slow, and a little too mouse-heavy for my taste.
4. I like writing code. If I didn't have to write code for anything at all, I would feel like my skills are useless :)

So while it might seem like Windows/Outlook is the easiest solution, once you've gotten a more in-depth solution in place, it works much more smoothly.

Anonymous said...

Excellent tutorial Ankur! Very well explained! You helped me a lot!

Alexis said...
This comment has been removed by a blog administrator.