[tahoe-dev] LAFS Weekly Dev Hangout notes, 2012-12-06
Shawn Willden
shawn at willden.org
Sat Dec 8 00:24:30 UTC 2012
Are you guys doing these as Hangouts-on-Air? If so, where's the YouTube
channel? If not, please do!
On Fri, Dec 7, 2012 at 3:45 PM, Paul Rabahy <prabahy at gmail.com> wrote:
> First off, let me say that I enjoyed listening in to the hangout even
> though I couldn't participate. For the last year or two I have been thinking
> about building a distributed, untrusted storage system. When I found
> Tahoe-LAFS, I was ecstatic that it had already implemented about 90% of the
> ideas that I had thought of and that it sounds like the remaining 10% are
> being worked on.
>
> I have some comments on Zooko's Proof-of-Retrievability paper.
> 1. Great job writing this. It was very easy to read and get up to speed
> without having to read 10 other whitepapers first to understand the basics.
> (I have some background in cryptography/secure computing from college)
>
> 2. I completely agree with the 3 levels of bad behavior (Greed, Malice,
> and Adaptive Malice). In addition, I believe there should be a fourth level
> which I will call "Accidental Greed". In this case, the server stores the
> data, responds to all requests properly, but one day fails (either a POR or
> GetData request) for some unknown reason. This server will acknowledge
> their mistake and attempt to reverse it once they are notified (restore
> backups, patch bug, etc.).
> 2a. For POR, Zooko nailed this. We don't have to care about "Accidental
> Greed" at the protocol level because if we cover ourselves for Malice or
> Adaptive Malice we already have a solution.
>
> 3. I am convinced that to prevent Malice or Adaptive Malice, there cannot
> be a difference between running a POR or GetData. If there is, either type
> of attacker could respond correctly to the POR but incorrectly to the
> GetData.
> 3a. For my use cases I feel that POR and GetData can have different
> traffic patterns without affecting my experience as a customer. I realize that
> will introduce a gap in the protocol so that Adaptive Malice could defeat
> the POR. I would like for POR to cover me 90% of the time, but occasionally
> I will actually download a file and will be able to catch the Adaptive
> Malice server at that point. (This might seem like a contradiction, but
> Tahoe already has the enormously powerful feature of erasure coding to
> protect me from an occasional malicious server even if POR fails.)
>
> 4. Unfortunately Zooko lost me in Part 2b and 3. I understand the
> trade-off between "(a) reduced performance for downloads, and/or (b)
> increased bandwidth usage for verification", but I was never able to
> understand how Tahoe is supposed to be convinced that a share is
> retrievable without even contacting the server containing the share.
> 4a. Several times during the hangout, it was mentioned that increasing N
> and K would help POR to work better. I don't follow that argument. I agree
> that setting N higher increases the retrievability of a file (because it
> can withstand more malicious servers), but I don't see how increasing
> either of these will help me single out the malicious server.
>
> 5. I need to do more reading on the current Tahoe verify system, but I
> don't understand how Tahoe can verify a file using B Bandwidth where B is
> less than F Filesize.
> 5a. Using the Tahoe defaults (K = 3, N = 10) and assuming that F = 1(MB)
> it will take 3.33(MB) to store all ten shares. Each share will take
> .333(MB) to store. To verify the file, wouldn't you have to retrieve at
> least K shares? Therefore B would equal .333(MB) * 3(K) = 1MB(F). To me, it
> seems we didn't save any bandwidth.
> 5b. (Ah, just thought of this as I was writing). Does Tahoe maintain some
> sort of tree/share based hash so that it can verify individual shares or
> parts of a share without verifying the entire file? If so, I can see the
> bandwidth savings.
>
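On 5b: yes, Tahoe-LAFS does keep Merkle hash trees over the blocks of each share, which is what lets a verifier check one block without pulling the whole share. Below is a minimal sketch of the general technique, not Tahoe's actual code: it assumes plain SHA-256 with no tagging or domain separation, and the names `build_tree`, `proof_for`, and `verify_block` are made up for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Plain SHA-256 (an assumption; a real design would use tagged hashes)."""
    return hashlib.sha256(data).digest()


def build_tree(blocks):
    """Return the tree as a list of levels, leaf hashes first, root last.

    A level with an odd number of nodes is padded by repeating its last node.
    """
    levels = []
    level = [h(b) for b in blocks]
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def proof_for(levels, index):
    """Collect the sibling hash at each level on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify_block(block, index, proof, root):
    """Recompute the root from one block plus its sibling hashes."""
    node = h(block)
    for sibling in proof:
        if index % 2 == 0:
            node = h(node + sibling)
        else:
            node = h(sibling + node)
        index //= 2
    return node == root


# Hypothetical share made of eight 1 KiB blocks.
blocks = [bytes([i]) * 1024 for i in range(8)]
levels = build_tree(blocks)
root = levels[-1][0]
proof = proof_for(levels, 5)
print(len(proof), "sibling hashes needed;", verify_block(blocks[5], 5, proof, root))
```

With eight blocks, checking block 5 costs that one block plus three 32-byte hashes rather than all eight blocks, provided the verifier already holds a trusted copy of the root. That is where the B < F saving in a spot check comes from.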
> 6. I agree that TOR/distributed verification could help in the case of an
> Adaptive Malicious server, but until I have a clearer understanding of my
> points 5 and 6, I'm not sure if this description of POR will have a
> benefit for my use case.
>
> Hopefully these points make sense. Let me know if I made anything
> confusing.
> PRabahy
>
>
> On Thu, Dec 6, 2012 at 3:42 PM, Zooko Wilcox-O'Hearn <zooko at zooko.com> wrote:
>
>> In attendance: Brian, David-Sarah, Zooko (scribe), Andrew, PRabahy
>> (silent)
>>
>> The meeting started about 10 minutes late and ran more than 30 minutes
>> past its scheduled stop-time. (Because we were too engaged to stop at
>> the stop-time since we were sorting out the question of whether
>> Zooko's "Strong Proof-of-Retrievability" concept was inherently as
>> inefficient as simply downloading the whole file.)
>>
>> Caveat Lector! I might have forgotten some stuff. I haven't taken the
>> time to add explanations for most of what follows. My own biases shine
>> through willy nilly.
>>
>>
>> * The LAFS-PoR.rst text file was cleverly hidden behind an obstacle
>> course.
>>
>> * 'Ephemeral Elliptic Curve Diffie-Hellman‽ My friend Zooko excels at
>> redefining "What 'everyone' or what 'no-one' uses."'
>>
>> * leasedb+cloud-backend
>> * LeastAuthority.com has at long last delivered Milestone 3 to
>> DARPA. Milestone 1 was a design document, Milestone 2 was Cloud/S3
>> backend, and Milestone 3 was leasedb.
>> * https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1818 /
>> https://github.com/davidsarah/tahoe-lafs/tree/1818-leasedb is the
>> implementation of leasedb against trunk (disk backend)
>> * https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1819 /
>> https://github.com/davidsarah/tahoe-lafs/tree/1819-cloud-merge is the
>> merge of that with the cloud backend
>> * The 1819-cloud-merge branch passes all unit tests, and passes
>> manual testing by David-Sarah. It is currently being evaluated on
>> behalf of DARPA by their contractors, BITSYS.
>> * next steps:
>> * Keep 1818-leasedb and 1819-cloud-merge out of Tahoe-LAFS v1.10.
>> * Let Brian review them.
>> * David-Sarah is still re-recording the patch series for
>> 1819-cloud-merge.
>> * Zooko is still code-reviewing the patches.
>> * Check for the transition experience — what happens the first
>> time you upgrade, for example.
>> * There is at least one incomplete detail about transition:
>> starter leases don't get added (there isn't a ticket for this — we
>> should open one).
>> * Zooko and David-Sarah want to implement #1834 and related
>> tickets — not necessarily before we land it on trunk, but before we
>> release 1.11. Or we could do it on the branch before we land it on
>> trunk.
>>
>> * Tahoe-LAFS v1.10
>> * Let's package up what we have currently on trunk (plus, Zooko
>> added to these notes after the meeting, possibly a few other good
>> patches that are basically already done, are very non-disruptive —
>> such as documentation-only patches — and/or have forward-compatibility
>> implications, such as #1240, #1802, #1789, #1477, #901, #1539, #1643,
>> #1842, and #1679).
>> * Everyone review pending tickets!
>> https://tahoe-lafs.org/trac/tahoe-lafs/milestone/1.10.0
>> * The next Weekly Dev Hangout will be about Tahoe-LAFS v1.10
>> * goal: get trunk to meet our desires for Tahoe-LAFS v1.10, release
>> from trunk
>> * Brian wants to fix #1767, which has forward-compatibility
>> implications.
>>
>> * tarcieri's new HTML
>> * not for 1.10
>> * It changes only the front page and so the other pages are
>> inconsistent with the new front page.
>> * But commit it to a branch ASAP and demonstrate to tarcieri that
>> we're serious about merging it to trunk as soon as it is complete.
>>
>> * Proof-of-Retrievability
>> * Zooko has written a rough draft of a tahoe-dev post/science
>> paper, arguing that real "Strong" Proof-of-Retrievability is possible,
>> that the current exemplars in the crypto literature fail to provide
>> Strong Proofs-of-Retrievability, and that Tahoe-LAFS combined with Tor
>> would make a nice basis on which to build a Strong
>> Proof-of-Retrievability, and that if it did, it would be a practical
>> censorship-resistance tool.
>> * Brian posed some good challenges in practical terms about the
>> performance and bandwidth costs.
>> * The key difference that makes this new concept of
>> Proof-of-Retrievability different from, and better than, previous attempts is
>> that it uses multiple storage servers (which are hopefully not
>> colluding with one another), and erasure-coding in order to keep total
>> upload and storage costs fixed even while scaling a single file,
>> horizontally, to a large number of storage servers.
>> * That's also the key to answering Brian's challenge — that sort of
>> spreading across storage servers allows one to gain verification
>> assurance — *even* against Adaptive Malicious Storage Servers — at a
>> fraction of the aggregate bandwidth cost of a full download. If there
>> were only a single storage server then Juels-2009 and
>> Brian-in-this-meeting would be right that no efficient Strong PoR is
>> possible.
>> * Next steps: Zooko needs to rewrite the second half of the current
>> document to emphasize these insights gained from this meeting and to
>> streamline it. Several experts have volunteered to review it already.
>> Then: post it to tahoe-dev?
>> * David-Sarah has some idea that Brian and Zooko don't quite get
>> about improving the quantitative advantage to the defender by
>> increasing erasure coding parameters and storing multiple shares per
>> server.
>> * Let's get drunk and argue about whether God can see into the future.
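To make the "fixed cost while scaling horizontally" claim in the Proof-of-Retrievability notes above concrete for myself, I worked through the arithmetic. This is only a back-of-the-envelope sketch: it assumes each share is exactly F/K bytes and ignores hash-tree and header overhead, and `share_costs` is a name I made up.

```python
from fractions import Fraction


def share_costs(file_size, k, n):
    """Idealized K-of-N erasure-coding costs, ignoring per-share overhead."""
    share = Fraction(file_size, k)
    return {
        "share_size": share,              # bytes held by each of the n servers
        "total_stored": share * n,        # bytes stored across all servers
        "full_verify_download": share * k,  # bytes to fetch k whole shares
    }


F = 1_000_000  # a 1 MB file, as in Paul's point 5a
for k, n in [(3, 10), (30, 100)]:
    costs = share_costs(F, k, n)
    print(k, n, {name: float(value) for name, value in costs.items()})
```

With K=3, N=10 each share is about 0.333 MB and fetching K whole shares costs the full 1 MB, which matches Paul's 5a. Going to K=30, N=100 leaves the total stored at about 3.33 MB but shrinks each share tenfold, so each individual server holds (and can be asked about) a much smaller piece. Whether that alone buys the assurance Zooko describes is a separate question that this arithmetic does not settle.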
>> _______________________________________________
>> tahoe-dev mailing list
>> tahoe-dev at tahoe-lafs.org
>> https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
>>
>
--
Shawn