Changes between Version 97 and Version 98 of GSoCIdeas2010

2010-03-29T01:21:16Z

add Kevan's MDMF write-up


  • GSoCIdeas2010

    v97 v98  
    1 Tahoe-LAFS Summer-of-Code Projects
     1= Tahoe-LAFS Summer-of-Code Projects =
    33This page contains specific suggestions for projects we would like to see in the Summer of Code. Note that they vary a lot in required skills and difficulty. We hope to get applications with a broad spectrum.
    1414||[#RedundantArrayofIndependentClouds Redundant Array of Independent Clouds]||Medium||[ Zooko Wilcox-O'Hearn] or any mentor||
     15||[#RedundantArrayofIndependentClouds Redundant Array of Independent Clouds]||Medium||[ Zooko Wilcox-O'Hearn] or any mentor||
    1516||[#ShareMigration Share Migration]||Medium||[ Brian Warner] or any mentor||
    1617||[#SecureDecentralizedWiki Secure Decentralized Wiki]||Medium||[ Zooko Wilcox-O'Hearn] or any mentor||
    26 = Redundant Array of Independent Clouds =
     27== Medium-Sized Distributed Mutable Files (MDMF) ==
     29Mutable files in Tahoe-LAFS have some significant limitations and
     30performance issues, as discussed in
     31[ docs/performance.txt]. Users who aren't aware of these limitations are
     32surprised when they find out that mutable files can't scale to large
     33sizes without using unacceptable levels of memory, and that reading one
     34byte of the file costs as much as reading the entire file.
     36Fixing this issue essentially means fixing #393. That is,
     38  * Developing mutable files that are segmented on upload, as with immutable files. Part of this would involve making sure that the way we currently ensure the integrity of the parts of mutable files stored on servers is adequate for your new design, and altering it if it isn't.
     39  * Implementing efficient reading and writing of arbitrary spans of those mutable files.
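To illustrate why segmentation makes reads of arbitrary spans cheap, here is a minimal sketch. It is not the Tahoe-LAFS implementation: it works on plaintext segments only and ignores encryption, erasure coding, and integrity checks, and the names `segments_for_span` and `read_span` are hypothetical. It shows only the arithmetic that maps a byte range to the segments that must be fetched.

```python
def segments_for_span(offset, length, segment_size):
    """Return (first, last) indexes of the segments covering
    the byte range [offset, offset + length)."""
    if offset < 0 or length <= 0 or segment_size <= 0:
        raise ValueError("offset must be >= 0; length and segment_size must be > 0")
    first = offset // segment_size
    last = (offset + length - 1) // segment_size
    return first, last


def read_span(segments, offset, length, segment_size):
    """Read `length` bytes starting at `offset`, touching only the
    segments that overlap the requested range.

    `segments` is a hypothetical list of plaintext byte strings; in a
    real design each one would be fetched and verified separately.
    """
    first, last = segments_for_span(offset, length, segment_size)
    data = b"".join(segments[first:last + 1])
    start = offset - first * segment_size
    return data[start:start + length]


# Example: a 10-byte file split into 4-byte segments.
segment_size = 4
contents = b"abcdefghij"
segments = [contents[i:i + segment_size]
            for i in range(0, len(contents), segment_size)]
```

With this layout, reading one byte touches exactly one segment, whereas an unsegmented mutable file has to be retrieved in full.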
     41This would make Tahoe-LAFS less surprising to users, and allow mutable
     42files to be used in more ways than they currently are. If successful enough, this might allow Tahoe-LAFS to support range queries or "graph database"-style access, in the style of the "NoSQL" projects.
     44To learn more about this issue, you should first read
     45[ docs/performance.txt], so you're familiar with the performance problems
     46with mutable files as currently implemented. You should also look at the
     47[ file encoding specification], to understand how immutable files are
     48segmented (since you'll be doing something similar with this project). [ The mutable file specification] may be informative as well.
     49The mutable file upload and download code is in
     50[ mutable],
     51and, for comparison, the immutable file upload and download code is in
     52[ immutable].
     54== Redundant Array of Independent Clouds ==
    2856Add backends to the storage servers so that they store their shares on a cloud storage system instead of on their local filesystem. This means that you can get all of the availability and scalability of services such as Amazon S3 or Rackspace !CloudFiles combined with the security properties of Tahoe-LAFS. See [ the RAIC diagram]. For details, read ticket #999, which includes pointers to the relevant source code and instructions on how to begin writing the code.
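The general shape of a pluggable backend might look like the sketch below. Everything in it is an assumption made for illustration: the class names `LocalDiskBackend` and `InMemoryCloudBackend`, the `put_share`/`get_share` methods, and the key layout are hypothetical and are not the Tahoe-LAFS storage server API (ticket #999 points to the real code). The in-memory class stands in for a cloud object store; a real backend would call the provider's API there.

```python
import os
import tempfile


class LocalDiskBackend:
    """Hypothetical backend that keeps shares on the local filesystem."""

    def __init__(self, root):
        self.root = root

    def _path(self, storage_index, shnum):
        return os.path.join(self.root, storage_index, str(shnum))

    def put_share(self, storage_index, shnum, data):
        os.makedirs(os.path.join(self.root, storage_index), exist_ok=True)
        with open(self._path(storage_index, shnum), "wb") as f:
            f.write(data)

    def get_share(self, storage_index, shnum):
        with open(self._path(storage_index, shnum), "rb") as f:
            return f.read()


class InMemoryCloudBackend:
    """Hypothetical stand-in for a cloud object store (a dict of key -> bytes)."""

    def __init__(self):
        self.objects = {}

    def put_share(self, storage_index, shnum, data):
        self.objects[f"{storage_index}/{shnum}"] = data

    def get_share(self, storage_index, shnum):
        return self.objects[f"{storage_index}/{shnum}"]


def roundtrip(backend):
    """Store one share and read it back through the common interface."""
    backend.put_share("si-example", 0, b"share bytes")
    return backend.get_share("si-example", 0)


# Both backends satisfy the same interface, so the caller does not change.
with tempfile.TemporaryDirectory() as root:
    for backend in (LocalDiskBackend(root), InMemoryCloudBackend()):
        assert roundtrip(backend) == b"share bytes"
```

Because the shares are already encrypted and erasure-coded before they reach the storage server, such a backend only needs to move opaque bytes.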
    30 = Share Migration =
     58== Share Migration ==
    3260When uploading a file to a grid, Tahoe-LAFS will make sure that the file is
    6593jumping-off point for health is #778.
    67 = Secure Decentralized Wiki =
     95== Secure Decentralized Wiki ==
    6997Write a wiki in Google's [ "caja"] dialect of !JavaScript. This wiki will load and store data directly on a Tahoe-LAFS storage grid so that it is a full "Cloud App"—there is no server. All computation is done in the user's web browser in caja and all of the storage is done by the decentralized Tahoe-LAFS storage grid. This wiki should leverage Tahoe-LAFS's secure sharing features to offer fine-grained, dynamic, and easy transclusion or client-side mashups. This project is intended to be the successor to [ the TiddlyWiki-on-Tahoe-LAFS project], which is a wiki written in !JavaScript and hosted on Tahoe-LAFS, but one that has been "bolted on" to Tahoe-LAFS instead of designed for Tahoe-LAFS, and is currently incapable of good transclusions or mashups.
    7199To get started, play with [ the TiddlyWiki-on-Tahoe-LAFS quick start], read the source code of [ the HTTPSavingPlugin] and [ the TahoePlugin] for !TiddlyWiki, and experiment with [ writing live caja applets].
    73 = Cloud Apps =
     101== Cloud Apps ==
    75103Difficulty: easy to hard, depending on project choice and how far you want to push it
    77105Invent your own Summer-of-Code project by building a new web app on top of Tahoe-LAFS. The [#SecureDecentralizedWiki Secure Decentralized Wiki] is one example of a Cloud App. See [wiki:GSoCIdeas/CloudApps] for other ideas.
    79 = WebDAV Support =
     107== WebDAV Support ==
    81109Difficulty: medium to hard, depending on how much of an existing WebDAV implementation you are able to reuse
    128156[!closed&order=priority&keywords=~webdav Tickets labelled 'webdav']
    131 = Distributed Introduction =
     158== Distributed Introduction ==
    133160Implement a protocol for distributed introduction, thus removing the only remaining Single Point of Failure (SPoF) in the Tahoe-LAFS system. For details see [comment:11:ticket:68 ticket #68] which describes the distributed notification algorithm and points to the relevant source code.
    135 = DVCS Integration =
     162== DVCS Integration ==
    137164Write patches for the [ git] or [ darcs] distributed revision control tool so that it reads and writes directly to a Tahoe-LAFS storage grid instead of its local filesystem. This creates a "revision control repository in the sky"—a repository that is distributed, fault-tolerant, and highly available. It also lends Tahoe-LAFS's unique security and access-control properties to your revision control system—you can share read-only access or read-write access with specific people through Tahoe-LAFS's capability access control system, and you can rely on the integrated digital signatures to verify that you are reading an authorized version of the repository.