Version 91 (modified by zooko, at 2010-03-16T17:44:10Z)


Tahoe-LAFS Summer-of-Code Projects

This page contains specific suggestions for projects we would like to see in the Summer of Code. Note that they vary a lot in required skills and difficulty. We hope to get applications with a broad spectrum.

If you are interested in working on any of these projects, please contact the Mentors listed at the bottom of the page.

In addition, you may wish to discuss your proposal on IRC: join us in the #tahoe-lafs channel.

We encourage you to come up with your own suggestions if you cannot find a suitable project here. You can find more project ideas by exploring the issue tracker. See especially the tickets labelled 'gsoc' (developers: please add this label to any tickets that might make a good GSoC project). You may also want to read this mailing list thread about GSoC ideas.

Deadlines and directions for students' applications to the Google Summer-of-Code can be found on the Google pages.

Project                                Difficulty  Mentor
Redundant Array of Independent Clouds  Medium      Zooko Wilcox-O'Hearn or any mentor
Share Migration                        Medium      Brian Warner or any mentor
Secure Decentralized Wiki              Medium      Zooko Wilcox-O'Hearn or any mentor
Cloud Apps                             Easy–Hard   Jack Lloyd or any mentor
WebDAV                                 Medium      David-Sarah Hopwood or any mentor
Distributed Introduction               Easy        Zooko Wilcox-O'Hearn or any mentor

Redundant Array of Independent Clouds

Add backends to the storage servers so that they store their shares on a cloud storage system instead of on their local filesystem. This means that you can get all of the availability and scalability of services such as Amazon S3 or Rackspace CloudFiles combined with the security properties of Tahoe-LAFS. See the RAIC diagram. For details, read ticket #999, which includes pointers to the relevant source code and instructions on how to begin writing the code.
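As a rough illustration of the idea, the sketch below puts share storage behind a small backend interface, so that a storage server could keep shares either on local disk or in a cloud object store. Every name here (ShareBackend, LocalDiskBackend, ObjectStoreBackend, and the dict standing in for a cloud bucket) is hypothetical and is not taken from the Tahoe-LAFS source; ticket #999 is the place to find the real design.

```python
import os
from abc import ABC, abstractmethod


class ShareBackend(ABC):
    """Hypothetical interface: where a storage server keeps share bytes."""

    @abstractmethod
    def put_share(self, storage_index: str, shnum: int, data: bytes) -> None:
        ...

    @abstractmethod
    def get_share(self, storage_index: str, shnum: int) -> bytes:
        ...


class LocalDiskBackend(ShareBackend):
    """Stores each share as a file at <root>/<storage_index>/<shnum>."""

    def __init__(self, root: str) -> None:
        self.root = root

    def _path(self, storage_index: str, shnum: int) -> str:
        return os.path.join(self.root, storage_index, str(shnum))

    def put_share(self, storage_index: str, shnum: int, data: bytes) -> None:
        path = self._path(storage_index, shnum)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)

    def get_share(self, storage_index: str, shnum: int) -> bytes:
        with open(self._path(storage_index, shnum), "rb") as f:
            return f.read()


class ObjectStoreBackend(ShareBackend):
    """Stores each share as an object keyed "<storage_index>/<shnum>".

    `bucket` is any dict-like object. It is a stand-in (an assumption of
    this sketch) for a client of a real service such as S3 or CloudFiles.
    """

    def __init__(self, bucket) -> None:
        self.bucket = bucket

    def _key(self, storage_index: str, shnum: int) -> str:
        return f"{storage_index}/{shnum}"

    def put_share(self, storage_index: str, shnum: int, data: bytes) -> None:
        self.bucket[self._key(storage_index, shnum)] = data

    def get_share(self, storage_index: str, shnum: int) -> bytes:
        return self.bucket[self._key(storage_index, shnum)]


def roundtrip(backend: ShareBackend) -> bytes:
    """Write one share and read it back through whichever backend is given."""
    backend.put_share("si-abc", 0, b"ciphertext share")
    return backend.get_share("si-abc", 0)
```

Because the server code only talks to the interface, the same roundtrip call works against either backend; shares are already encrypted before they reach a storage server, so the cloud provider only ever sees ciphertext.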

Share Migration

When uploading a file to a grid, Tahoe-LAFS will make sure that the file is healthy (a good discussion of what healthy means is found in #778) before reporting that the file is uploaded successfully. Tools to effectively maintain file health (or to adapt to new definitions of health) aren't quite complete, however: our users have several use cases that aren't easily addressed with what we have. Students taking this project would build tools to address those use cases.

A good starting point would be to become familiar with how files are placed on a grid. architecture.txt, file-encoding.txt, mutable.txt, the immutable file upload code, and the mutable file upload code are good places to do that. Also, you might want to look at the storage server code to understand that better. Some good tickets to start looking at are #699, #543, and #232; you'll find that those link to other tickets.

There are many ways to help address these issues. Some ideas:

  • Alter the CLI and the WUI to give users the ability to rebalance files that they've uploaded already. (#699)
  • Build tools that allow node administrators to move shares around a grid. (#543, #864)
  • Alter Tahoe-LAFS to rebalance mutable files when uploading a new version of them. (#232)

Any one of these projects is probably too small to fill a summer, but combining a few of them would be a big usability improvement for Tahoe-LAFS.

Depending on your approach, this project is closely tied to file health and accounting, so prospective students would do well to explore those open issues too. A good jumping-off point for accounting is #666; for health, see #778.
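To make "rebalancing" concrete, here is a minimal sketch of a planner that, given which servers hold which shares of one file, proposes moves from servers holding several shares to servers holding none. The function name plan_rebalance, the data layout, and the greedy strategy are all assumptions made for illustration; this is not how the Tahoe-LAFS uploader or repairer selects servers, and it ignores the health definition discussed in #778.

```python
def plan_rebalance(placement, servers):
    """Return a list of (shnum, src, dst) moves that spread shares out.

    placement: dict mapping server id -> set of share numbers it holds.
    servers:   list of all server ids that may hold shares.

    Greedy strategy (an assumption of this sketch): while some server
    holds more than one share and another server holds none, move one
    share from the crowded server to the empty one. Servers that appear
    in `placement` but not in `servers` are ignored, and duplicate copies
    of the same share on several servers are not handled.
    """
    holdings = {s: set(placement.get(s, ())) for s in servers}
    empty = sorted(s for s in servers if not holdings[s])
    # Most crowded servers first; ties broken by server id for determinism.
    crowded = sorted(
        (s for s in servers if len(holdings[s]) > 1),
        key=lambda s: (-len(holdings[s]), s),
    )
    moves = []
    for src in crowded:
        while len(holdings[src]) > 1 and empty:
            dst = empty.pop(0)
            shnum = max(holdings[src])
            holdings[src].remove(shnum)
            holdings[dst].add(shnum)
            moves.append((shnum, src, dst))
    return moves


# Example: server A holds three shares, B holds one, C and D hold none.
example_moves = plan_rebalance(
    {"A": {0, 1, 2}, "B": {3}},
    ["A", "B", "C", "D"],
)
```

A tool built on a planner like this would still need to carry out each move against the storage servers and verify the result, which is where most of the work in #699, #543, and #232 lies.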

Secure Decentralized Wiki

Write a wiki in Google's caja dialect of JavaScript. This wiki will load and store data directly on a Tahoe-LAFS storage grid so that it is a full "Cloud App"—there is no server. All computation is done in the user's web browser in caja and all of the storage is done by the decentralized Tahoe-LAFS storage grid. This wiki should leverage Tahoe-LAFS's secure sharing features to offer fine-grained, dynamic, and easy transclusion or client-side mashups. This project is intended to be the successor to the TiddlyWiki-on-Tahoe-LAFS project, which is a wiki written in JavaScript and hosted on Tahoe-LAFS, but one that has been "bolted on" to Tahoe-LAFS instead of designed for Tahoe-LAFS, and is currently incapable of good transclusions or mashups.

To get started, play with the TiddlyWiki-on-Tahoe-LAFS quick start, read the source code of the HTTPSavingPlugin and the TahoePlugin for TiddlyWiki, and experiment with writing live caja applets.

Cloud Apps

Difficulty: easy to hard, depending on project choice and how far you want to push it

Invent your own Summer-of-Code project by building a new web app on top of Tahoe-LAFS. The Secure Decentralized Wiki is one example of a Cloud App. See GSoCIdeas/CloudApps for other ideas.


WebDAV

Implement a WebDAV front-end for Tahoe-LAFS so that files and directories stored in a distributed grid can be accessed by operating systems (including Windows, Mac, and Linux) and applications that speak the WebDAV protocol. For details, see #451, which describes what the Tahoe-LAFS web server does now, how this differs from what a WebDAV web server does, and how to get started experimenting with the relevant source code.
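One concrete piece of such a front-end is answering PROPFIND, the WebDAV method that clients use to list a directory. The sketch below builds the body of a 207 Multi-Status response from a simple directory listing. The function name propfind_body and the (name, is_dir, size) tuple format are assumptions made for this example; they do not reflect the format the Tahoe-LAFS web server actually returns, which is described in #451.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

DAV = "DAV:"  # the XML namespace used by WebDAV
ET.register_namespace("D", DAV)


def _el(parent, name, text=None):
    """Append a child element in the DAV: namespace and return it."""
    el = ET.SubElement(parent, f"{{{DAV}}}{name}")
    if text is not None:
        el.text = text
    return el


def propfind_body(base_href, entries):
    """Build a minimal Multi-Status XML body listing `entries`.

    entries: iterable of (name, is_dir, size_in_bytes) tuples. This input
    format is hypothetical. Only the children are listed; a complete
    response to a Depth: 1 request would also describe the collection
    itself and would usually report more properties.
    """
    root = ET.Element(f"{{{DAV}}}multistatus")
    for name, is_dir, size in entries:
        response = _el(root, "response")
        href = base_href.rstrip("/") + "/" + quote(name)
        if is_dir:
            href += "/"
        _el(response, "href", href)
        propstat = _el(response, "propstat")
        prop = _el(propstat, "prop")
        resourcetype = _el(prop, "resourcetype")
        if is_dir:
            # Directories are marked as WebDAV collections.
            _el(resourcetype, "collection")
        else:
            _el(prop, "getcontentlength", str(size))
        _el(propstat, "status", "HTTP/1.1 200 OK")
    return ET.tostring(root, encoding="unicode")


example_body = propfind_body("/dav", [("docs", True, 0), ("a b.txt", False, 12)])
```

The rest of a front-end would map the other WebDAV methods (such as MKCOL, MOVE, and COPY) onto grid operations in a similar way.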

Distributed Introduction

Implement a protocol for distributed introduction, thus removing the only remaining Single Point of Failure (SPoF) in the Tahoe-LAFS system. For details, see ticket #68, which describes the distributed notification algorithm and points to the relevant source code.


Mentors

Who is willing to spend about five hours a week (estimated) helping a student do it right?

  • Zooko Wilcox-O'Hearn (Python/C/C++/JavaScript, security+cryptography)
  • Jack Lloyd (C/C++/Python, security+cryptography)
  • David-Sarah Hopwood (Python/C/JavaScript, SFTP frontend, security+cryptography)
  • Brian Warner (Python/C/JavaScript, security+cryptography)