My Gmail account is pretty important to me: I've been using it since 2004, and it currently holds around 4000 emails (grouped into about 2500 threads, taking up 425 MB). I'd been meaning to back it up for a while, and I finally got around to it, succeeding after only a little frustration.
The first thing I did was configure Gmail to accept POP3 connections. I went into the Gmail settings page and, in the "Forwarding and POP/IMAP" tab, chose "Enable POP for all mail."
At first I tried fetchmail to get the emails over POP3 from Gmail. I had it all set up before I discovered that, by default, it hands every message it retrieves to a mail server running on the local machine for delivery. I didn't want to go through the hassle of installing and securing sendmail or postfix, so I went looking for alternatives that would just dump the mails to a file or directory.
I found getmail (available through the getmail4 package on Ubuntu) and installed it. Here's how I configured it:
First, I set up the getmail configuration directory and the Maildir structure where the mails would go:
cd ~
mkdir .getmail/
mkdir gmail-inbox/
mkdir gmail-inbox/cur/
mkdir gmail-inbox/new/
mkdir gmail-inbox/tmp/
Then I created the file ~/.getmail/.getmailrc-gmail with the following contents:
[retriever]
type = SimplePOP3SSLRetriever
server = pop.gmail.com
port = 995
username = ankurdave@gmail.com
password = my password
[destination]
type = Maildir
path = /home/ankur/gmail-inbox/
[options]
delete = false
Finally, I launched getmail to fetch the emails:
getmail --rcfile='.getmailrc-gmail'
I had to run this command several times because Gmail serves up the emails in chunks of 400–600 mails at a time instead of giving all 4000 emails in one go. (This took a while to figure out.)
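Rerunning the command by hand can be automated. The following is a minimal sketch, not something from my original setup: it assumes that a getmail run which retrieves nothing leaves the Maildir's file count unchanged, and the helper names count_messages and fetch_until_done are made up for illustration.

```shell
#!/bin/sh
# Count the message files in the new/ and cur/ subdirectories of a Maildir.
count_messages() {
    find "$1/new" "$1/cur" -type f | wc -l | tr -d ' '
}

# Run a fetch command over and over until one pass adds no new messages.
# $1 = path to the Maildir; the remaining arguments are the command to run.
# Prints the final message count.
fetch_until_done() {
    maildir=$1
    shift
    before=$(count_messages "$maildir")
    while :; do
        "$@" || return 1                      # stop if the fetch command fails
        after=$(count_messages "$maildir")
        [ "$after" -eq "$before" ] && break   # nothing new arrived: done
        before=$after
    done
    echo "$after"
}

# Example call (assumption: same rcfile and Maildir as in this post):
# fetch_until_done "$HOME/gmail-inbox" getmail --rcfile=.getmailrc-gmail
```

Comparing file counts avoids having to parse getmail's output, at the cost of one extra run at the end that fetches nothing.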
Since getmail uses the Maildir format to store its emails, I could read the emails as they were being downloaded. I ran mutt with mutt -f ~/gmail-inbox/ to point it to the mail directory.
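Because a Maildir keeps each message as its own file under new/ or cur/, the download can also be inspected with ordinary shell tools. Here is a rough sketch, assuming the message files use plain LF line endings and unfolded Subject headers; list_subjects is a hypothetical helper name, not part of getmail or mutt.

```shell
#!/bin/sh
# Print the first Subject header of every message in a Maildir.
# sed stops at the first blank line, which ends the header block,
# so a "Subject:" line inside a message body is never printed.
list_subjects() {
    find "$1/new" "$1/cur" -type f \
        -exec sed -n -e '/^$/q' -e '/^[Ss]ubject:/{p;q;}' {} \;
}

# Example call (assumption: the Maildir path used in this post):
# list_subjects "$HOME/gmail-inbox" | sort | less
```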
One more thing: if you have to start the download over for some reason, you have to tell Gmail you're starting over. Its web interface shows how far into your inbox you have downloaded with the following line (under Settings->Forwarding and POP/IMAP): "Status: POP is enabled for all mail that has arrived since 12/10/06", where the date is that of the last message retrieved through POP. The next time you download emails over POP, Gmail only delivers mails that arrived after that date, to avoid duplicates. You can reset the date by choosing "Enable POP for all mail (even mail that's already been downloaded)."
So overall, backing up my Gmail account was fairly easy and clean, and now I can be sure that all that information isn't at the mercy of Google.
Update 2007-12-02 11:04:55 PM: I just finished downloading all the email. There are 4235 messages, taking up 434 MB of space. When compressed with zip, they take up 282 MB, which is still pretty big but a little more manageable.
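Figures like the ones above can be checked with standard tools. This is a small sketch, not the commands I actually ran: maildir_count and maildir_size_kb are hypothetical helper names, and the size comes from du, so it reports disk usage in kilobytes rather than the exact sum of message sizes.

```shell
#!/bin/sh
# Number of message files stored in a Maildir (new/ plus cur/).
maildir_count() {
    find "$1/new" "$1/cur" -type f | wc -l | tr -d ' '
}

# Disk space used by the whole Maildir, in kilobytes.
maildir_size_kb() {
    du -sk "$1" | cut -f1
}

# Example calls (assumption: the Maildir path used in this post):
# maildir_count "$HOME/gmail-inbox"
# maildir_size_kb "$HOME/gmail-inbox"
```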