Tuesday, November 10, 2009

Using Live Mesh to Back Up SVN

I really like cloud storage. Not having to worry about a hard drive crash or about running backups is very appealing, and being able to access your files from anywhere isn't too bad either.
For notes, I use Evernote with a very deliberate system of tagging (which I'll probably share in later posts). For most of my files, I have Live Mesh installed on both my PC and Mac so my files are available on both. The only problem I noticed was that my development files (all stored in SVN repositories) were sitting locally on my hard drive without anything getting backed up. If my hard drive dies, all my hard work is gone. Sure, I have revisions checked out on other computers, so I would probably still have a relatively recent snapshot of my repository, but all the SVN history, one of its most important features, would be gone.
I didn't want to deal with backing up to an external hard drive and having to create some script that runs every so often. I wanted a simple solution that involved the cloud. So I was thinking... why not store the repository in Live Mesh?
At first I thought through the ways Live Mesh and SVN could be combined. The first is to put a checkout folder (a working copy) on Live Mesh. The advantage is that multiple computers would see the same working copy, so you wouldn't need to commit and update every time you want to carry a change from one computer to the other. However, the benefits end there, and this approach has major flaws. The first is that you aren't storing the entire repository in Live Mesh, only a checked-out snapshot, so it can't be used for backup purposes. The second is that (potentially) large binary files will take up lots of space on your Live Mesh. That space is wasted because you could recreate the binaries by building the project, and there is no way to tell Live Mesh which files to exclude based on filters. I find no use in syncing the exact same checkout folder between my two computers, and that's why this method wouldn't work for me.
The other option is to add the repositories themselves to Live Mesh. After thinking about it a bit, it seemed very intriguing. I would have my entire repository backed up in a small footprint. It doesn't take up much space on Live Mesh since it stores deltas between file revisions, and it doesn't contain all those huge binaries, which you shouldn't be including in your repository anyway.
I went over to my repositories folder and saw that it was less than 5MB, nowhere near the 5GB limit available on Live Mesh. I looked through the files, they all seemed small enough, and this started to look feasible. Before I take the plunge, however, I usually try to think of all the things that could possibly go wrong.
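If you'd rather keep an eye on this number over time than check it by hand, here is a minimal sketch of the same size check in Python. The 5GB figure is the limit mentioned above; the function names and the example path in the comment are made up for illustration.

```python
import os

# The 5GB Live Mesh limit mentioned in the post.
QUOTA_BYTES = 5 * 1024 ** 3


def directory_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so nothing is counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def quota_fraction(root, quota=QUOTA_BYTES):
    """Return the fraction of the quota used by the folder at root."""
    return directory_size_bytes(root) / quota


# Example (hypothetical path):
#   print(quota_fraction(r"C:\Repositories"))
```

A 5MB repositories folder comes out at roughly 0.001 of the quota, which is why the size didn't worry me.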
The first was: what if the repository gets too big? For example, if I accidentally commit a huge file to my repository, it will remain in Live Mesh forever, taking up a whole lot of space, even if I later delete the file from the repository. I started looking online for a way to delete all traces of a file using SVN. Apparently, some version control systems (Perforce, for example) have a command called obliterate that is meant for exactly this issue. However, SVN doesn't seem to have a function like that. I kept reading and found a workaround that involves dumping the SVN repository, filtering the dump, and then loading it into a new repository (using svnadmin dump, svndumpfilter, and svnadmin load). This usage is even documented in the SVN Book. So I figure the chance of me making this mistake is small, and even though exporting and importing the data isn't trivial, it's possible, so there's no need to worry about it now.
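For reference, here is a rough sketch of that dump, filter, and load workaround, driven from Python. The svnadmin and svndumpfilter commands are the standard Subversion tools; the function names, the dump file names, and every path are placeholders I made up, and I haven't had to run this against a real mistake, so treat it as an outline and not a tested recipe.

```python
import subprocess


def filter_commands(old_repo, new_repo, unwanted_path):
    """Build the four command lines for the dump/filter/load workaround."""
    return [
        ["svnadmin", "dump", old_repo],               # full history to stdout
        ["svndumpfilter", "exclude", unwanted_path],  # drop one path from the dump
        ["svnadmin", "create", new_repo],             # fresh, empty repository
        ["svnadmin", "load", new_repo],               # replay the filtered history
    ]


def rebuild_without(old_repo, new_repo, unwanted_path,
                    dump_file="full.dump", filtered_file="filtered.dump"):
    """Create new_repo as a copy of old_repo with unwanted_path removed."""
    dump, filt, create, load = filter_commands(old_repo, new_repo, unwanted_path)
    with open(dump_file, "wb") as out:
        subprocess.run(dump, stdout=out, check=True)
    with open(dump_file, "rb") as src, open(filtered_file, "wb") as out:
        subprocess.run(filt, stdin=src, stdout=out, check=True)
    subprocess.run(create, check=True)
    with open(filtered_file, "rb") as src:
        subprocess.run(load, stdin=src, check=True)


# Example (hypothetical paths):
#   rebuild_without(r"C:\Repositories\old", r"C:\Repositories\new",
#                   "trunk/huge-file.iso")
```

The result is a new repository, so you would still need to swap it in for the old one and check out fresh working copies afterwards.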
The next issue I thought of was making sure my data doesn't accidentally get erased by Live Mesh. Live Mesh automatically updates my local PC every time the contents are changed. That means if I accidentally go online and press delete on my repository folder, all my data will be gone forever from both the cloud and my local PC. Or will it? I set up a little experiment to see what happens when I delete a file on my Live Mesh Live Desktop. After deleting the file, I verified that, as the documentation says, it simply gets sent to the recycle bin and isn't deleted right away. This seems good enough for me. The odds that the folder will get deleted are slim. The odds that it will get deleted and my recycle bin will be emptied are even slimmer, so I figure this is a non-issue.
The only other issue I could think of is the security risk, and I'll leave that call to you. It depends on how sensitive your code is, how much you trust Microsoft to secure your data, whether you can avoid accidentally revealing your password, whether your virus protection is up to date... the list could go on and on, so I'll leave it at that. I'm sure there are people much better suited to discussing the security risk than I am.
Okay, with those issues resolved, and me not being able to think of any others, I decided to give it a go. I'm using VisualSVN Server, which lets me easily create repositories with a Windows GUI. I'll include the steps I followed in case someone has the same Live Mesh/SVN/VisualSVN Server combination I have. I first created a top-level folder in Live Mesh to store my repositories. The name Repositories seemed fitting. I like creating this as a top-level folder because it should only be synced with the Live Desktop and the PC that is hosting the SVN server. My Macintosh doesn't need to know about it, so it's better off in its own top-level folder. Next, I "checked out the folder" on my PC, where it started out empty. I then copied my repositories from where VisualSVN Server stored them to my newly synced folder. To find out where VisualSVN Server stores your repositories, right-click the root node VisualSVN Server (Local) and select Properties. You'll see the Repositories Root setting with a path on your local computer. I simply pointed this to the checked-out repository folder that is synced with Live Mesh, and it all works flawlessly.
I then made sure that I'd be able to use this data to recover if my hard drive ever crashed. I synced a fresh copy of my repository folder (taking data only from the cloud) and pointed VisualSVN Server to it, and it worked perfectly. While Live Mesh does exclude some types of files from syncing, such as hidden files and those with the .tmp extension, all the files stored in an SVN repository get synced by the service, making the two compatible.
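If you want to double-check your own repositories folder for files that might be skipped, something like the following sketch would list them. It only looks for the two exclusions mentioned above (hidden files and the .tmp extension). Treating dot-prefixed names as hidden is my own assumption, and I don't know Live Mesh's full exclusion rules, so an empty result is a good sign and not a guarantee.

```python
import os
import stat


def is_hidden(path):
    """Best-effort hidden check: Windows hidden attribute or a leading dot."""
    # st_file_attributes only exists on Windows; fall back to 0 elsewhere.
    attrs = getattr(os.stat(path), "st_file_attributes", 0)
    name = os.path.basename(path)
    # Assumption: dot-prefixed names are also treated as hidden.
    return bool(attrs & stat.FILE_ATTRIBUTE_HIDDEN) or name.startswith(".")


def files_sync_might_skip(root):
    """Return sorted relative paths under root that are hidden or end in .tmp."""
    skipped = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.lower().endswith(".tmp") or is_hidden(path):
                skipped.append(os.path.relpath(path, root))
    return sorted(skipped)


# Example (hypothetical path):
#   print(files_sync_might_skip(r"C:\Repositories"))
```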
With that, it seems like I solved the final piece of the puzzle, and now all of my data is stored in cloud services and is backed up in case my local hard drives fail. I hope you found this post useful. And if you're new to this, welcome to the cloud. This seems like the future of data storage.
Disclaimer: While I don't foresee any problems with this method, I may just be overlooking them. Keeping your whole repository backed up only on Live Mesh is done at your own risk. You might want to combine this method with another form of backup just to be on the safe side.
