[tahoe-dev] GSoC Share Rebalancing and Repair Proposal

Wed Apr 24 18:28:20 UTC 2013

I should be able to make it.

Mark Berger

On Tue, Apr 23, 2013 at 1:18 PM, Zooko O'Whielacronx <zookog at gmail.com> wrote:

> Dear Mark:
>
> Yay! Good proposal! I'm excited about the prospect of getting some
> focused work on improving repair and rebalancing. There are a lot of
> different ways that the functionality could be improved.
>
> I'm not sure, but I have the _feeling_ that the #1382 that Kevan
> Carstensen started may be sort of a critical basis for successful work
> on related tickets. See his github branch for the code he's written:
> https://github.com/isnotajoke/tahoe-lafs/commits/ticket1382
>
> Kevan: do you think Mark could profitably work on other repair and
> rebalancing tickets while leaving your #1382 branch alone? Or do the
> two of you, Kevan and Mark, think it might be a good idea to have Mark
> take over Kevan's branch and finish it?
>
> https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1382# immutable peer
> selection refactoring and enhancements
>
> Mark, would you be able to attend the coming Tahoe-LAFS Weekly Dev
> Chat on Thursday at 15:30Z (8:30am Pacific)?
>
> https://tahoe-lafs.org/trac/tahoe-lafs/wiki/WeeklyMeeting
>
> Regards,
>
> Zooko
>
> On Sun, Apr 21, 2013 at 6:17 PM, Mark Berger <mjberger at stanford.edu>
> wrote:
> > Hi everyone, over the last few days I have been working on a proposal for
> > GSoC to address share rebalancing and repair. I've copied the proposal
> > below (with some of my personal contact information redacted :] ). If you
> > see something wrong in my proposal, have any questions, or have any
> > suggestions, please let me know.
> >
> > Thanks!
> > Mark Berger
> >
> >
> >
> > Organization: Tahoe-LAFS
> > =============
> >
> > Student Info:
> > =============
> > Mark J. Berger
> > Time Zone: Pacific
> > Time Zone during GSoC: Eastern
> > IRC Handle: Mark_B at irc.freenode.net
> > Github: markberger
> > Email: mjberger [at] stanford.edu
> >
> > University Info:
> > ================
> > University: Stanford University
> > Major: Computer Science
> > Current Year: Freshman
> > Expected Graduation: June 2016
> > Degree: BS
> >
> > About Me:
> > =========
> >
> > I'm a freshman at Stanford University studying computer science. Right now
> > I am finishing up my core requirements and will be pursuing the artificial
> > intelligence track or the systems track within the major. My interests lie
> > in machine learning, large distributed systems, and web applications.
> >
> > I began programming during an internship at Four Directions Productions in
> > 2011, where I learned how to use Python in conjunction with Maya. The
> > majority of my college coursework has been in C or C++ on Linux with a
> > little Java. This has made me familiar with tools such as GCC, GDB, and
> > Valgrind.
> >
> > While I have never contributed to an open source project before, I am
> > making an effort to learn about Tahoe-LAFS and become familiar with its
> > code base and community. Using a virtual machine, I've successfully
> > installed Tahoe on an Ubuntu server and connected to the Public Test Grid.
> > I've also subscribed to the mailing list, connected to the IRC channel, and
> > successfully pulled the code off of GitHub. While I know my lack of
> > experience in open source is a shortcoming, I am completely dedicated to
> > using GSoC's Community Bonding Period to overcome any obstacles before the
> > official coding period begins.
> >
> >
> > Project Title: Share Rebalancing and Repair in Tahoe-LAFS
> > =========================================================
> >
> > Abstract:
> > =========
> >
> > The "servers of happiness" algorithm has improved Tahoe's ability to
> > maximize redundancy by ensuring a given subset of all shares is placed on
> > distinct nodes. However, mutable file uploads do not use this process;
> > they still rely on the old "shares of happiness" algorithm, which has
> > well-documented downsides. Additionally, file repair does not necessarily
> > redistribute files to new servers when nodes have been added. This creates
> > issues in terms of redundancy and long-term server health. Implementing
> > proper file rebalancing for all file types during file upload,
> > modification, and repair will enhance the reliability of the Tahoe system
> > and take full advantage of erasure encoding.
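
To make the distinction in the abstract concrete, here is a minimal Python
sketch. It assumes, as I understand the Tahoe-LAFS documentation, that
"servers of happiness" is measured as the size of a maximum matching between
servers and the shares they hold, and that the older criterion only looks at
how many shares were placed. The function names and the placement dictionary
are hypothetical illustrations, not code from the Tahoe-LAFS tree.

```python
def shares_placed(placements):
    """Older-style metric (assumed): count distinct shares stored anywhere."""
    return len(set().union(*placements.values())) if placements else 0


def servers_of_happiness(placements):
    """Size of a maximum matching between servers and shares.

    placements maps a server id to the set of share numbers it holds.
    Each server may be matched to at most one share and vice versa, so the
    result counts how many distinct servers hold distinct shares.
    """
    matched_server = {}  # share number -> server currently matched to it

    def try_assign(server, seen):
        # Augmenting-path search (Kuhn's algorithm) starting from `server`.
        for share in sorted(placements[server]):
            if share in seen:
                continue
            seen.add(share)
            holder = matched_server.get(share)
            if holder is None or try_assign(holder, seen):
                matched_server[share] = server
                return True
        return False

    return sum(1 for server in sorted(placements) if try_assign(server, set()))


# Ten shares piled onto a single server look fine by share count,
# but only one server is actually contributing redundancy.
lopsided = {"s1": set(range(10))}
spread = {"s1": {0, 1}, "s2": {0}, "s3": {2}}

print(shares_placed(lopsided), servers_of_happiness(lopsided))
print(shares_placed(spread), servers_of_happiness(spread))
```

The lopsided layout scores 10 by share count but only 1 by the matching
metric, which is the kind of downside the abstract refers to.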
> >
> >
> > Deliverables:
> > =============
> >
> > 1. Mutable files automatically distribute over nodes according to the
> > "servers of happiness" algorithm whenever uploaded, modified, or repaired
> > (ticket #232).
> >
> > 2. Repair will redistribute files according to the "servers of happiness"
> > algorithm and only renew the appropriate leases (ticket #699).
> >
> > 3. Documentation changed to correctly reflect the new feature set.
> >
> > 4. Create a test suite to be used on a network of virtual machines in order
> > to test file rebalancing.
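
As a rough illustration of what rebalancing during repair (deliverable 2)
could mean, the following Python sketch plans share moves from the most
heavily loaded servers to servers that hold nothing. It is a simple greedy
heuristic chosen for illustration, not the algorithm the proposal commits to,
and it assumes each share number currently lives on exactly one server. The
name plan_rebalance and its arguments are hypothetical.

```python
def plan_rebalance(placements, all_servers):
    """Greedy sketch: give every empty server one share from a crowded one.

    placements maps server id -> set of share numbers it holds now.
    all_servers lists every reachable server, including newly added ones.
    Returns (moves, holdings), where moves is a list of
    (share, source, destination) tuples and holdings is the layout that
    results from applying them.
    """
    holdings = {s: set(placements.get(s, ())) for s in all_servers}
    moves = []
    while True:
        empty = sorted(s for s, shares in holdings.items() if not shares)
        # Donors are servers holding more than one share, most loaded first.
        donors = sorted(
            (s for s, shares in holdings.items() if len(shares) > 1),
            key=lambda s: (-len(holdings[s]), s),
        )
        if not empty or not donors:
            break
        src, dst = donors[0], empty[0]
        share = min(holdings[src])
        holdings[src].remove(share)
        holdings[dst].add(share)
        moves.append((share, src, dst))
    return moves, holdings


# Two servers joined the grid after the file was uploaded to "A" alone.
moves, layout = plan_rebalance({"A": {0, 1, 2, 3}}, ["A", "B", "C"])
print(moves)
print(layout)
```

In a real repair, only the shares that actually move (or are regenerated)
would need new leases, which is the spirit of "only renew the appropriate
leases" above.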
> >
> >
> > Time Line:
> > ==========
> >
> > Note: I would like to have a code review session with my mentor on a weekly
> > basis at minimum, especially at the beginning of the program. Those
> > sessions are left off the time line to avoid redundancy.
> >
> > May 27th - June 17th (Community Bonding):
> > -----------------------------------------
> >
> > - Remain available via IRC and email
> > - Closely follow the development email list
> > - Isolate and understand the classes which pertain to the current
> >  implementations of the servers of happiness algorithm to determine which
> >  parts can be reused.
> > - Discuss with my mentor(s) and the community to determine whether code
> >  should be refactored to apply to both immutable and mutable files or if
> >  the two need to remain distinct for design reasons
> > - Discuss with my mentor(s) and the community the best way to go about
> >  testing file rebalancing.
> >
> > Note: June 3rd through the 14th is my final exams period and I will be
> > packing so that I can go home to Upstate NY. Since I will be very busy
> > during this time, not all of the above may be accomplished in time to start
> > coding. My classes do not resume until September 23rd, so I can push my
> > time line back a week or two if need be.
> >
> >
> > Jun 17th - 28th
> > ---------------
> > - Implement "servers of happiness" for mutable files during the initial
> >  file upload and file modification
> >
> > Jul 1st - 12th
> > --------------
> > - Thoroughly document code
> > - Write test scripts for larger networks
> > - Test code using virtual machines or predetermined test scheme from CBP
> >
> > Jul 15th - 19th
> > ---------------
> > - Clean up test scripts
> > - Thoroughly document test scripts
> > - Fix minor bugs
> > - Continue to consider and test edge cases
> >
> > Note: "Servers of happiness" for mutable files should be in a mergeable
> >       state with tests before the midway point on July 29th.
> >
> > Jul 22nd - Aug 2nd
> > ------------------
> >
> > - Modify repair code to use the "servers of happiness" algorithm for both
> >  immutable and mutable files. This should be accomplished by utilizing the
> >  existing code from the initial upload process
> >
> > - Edit mechanism for lease renewal to ensure minimal amount of lease
> >  renewal is done during rebalancing
> >
> > Aug 5th - 16th
> > --------------
> >
> > - Thoroughly document code
> > - Extend tests for mutable files to encompass rebalancing during file
> >  repair
> >
> > Aug 19th - 23rd
> > ---------------
> >
> > - Clean up test scripts
> > - Thoroughly document test scripts
> > - Fix minor bugs
> > - Continue to consider and test edge cases
> >
> > Aug 26th - 30th
> > ---------------
> >
> > - Change documentation to reflect additional features
> >
> >
> > The weeks of September 1st and 8th are left blank for flexibility.
> >
> >
> > Possible projects if the above are accomplished ahead of schedule:
> > ==================================================================
> >
> >  - Detect if disk(s) on a server are in a near-failure state. If the
> >    disk(s) are close to failing, notify the administrator, and slowly begin
> >    redistributing shares to the other storage nodes (tickets #481 and
> >    #864).
> >
> >  - Let the user specify a maximum storage capacity for a given storage
> >    node based on folder size instead of free space left on the machine.
> >
> >  - Tahoe backend for Google Drive (ticket #1831).
> >
> >
> > _______________________________________________
> > tahoe-dev mailing list
> > tahoe-dev at tahoe-lafs.org
> > https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
> >