Changes between Version 111 and Version 112 of FAQ


Timestamp:
2015-08-24T17:54:15Z (9 years ago)
Author:
AzureCerulean
Comment:

--

  • FAQ

    v111 v112  
    1515'''[=#Q2_what_is_erasure_coding Q2:] "Erasure-coding"?  What's that?'''
    1616
    17 A: You know how with RAID-5 you can lose any one drive and still recover?  And there is also something called RAID-6 where you can lose any two drives and still recover.  Erasure coding is the generalization of this pattern: you get to configure how many drives you could lose and still recover.  You can choose how many drives (actually storage servers) will be used in total, from 1 to 256, and how many storage servers are required to recover all the data, from 1 to however many storage servers there are.  We call the number of total servers {{{N}}} and the number required {{{K}}}, and we write the parameters as "{{{K-of-N}}}".
    18 
    19 This uses an amount of space on each server equal to the total size of your data divided by {{{K}}}.
    20 
    21 The default Tahoe-LAFS parameters are {{{3-of-10}}}, so the data is spread over 10 different drives, and you can lose any 7 of them and still recover the entire data.  This gives much better reliability than comparable RAID setups, at a cost of only 3.3 times the storage space that a single copy takes.  It takes about 3.3 times the storage space, because it uses space on each server equal to 1/3 of the size of the data, and there are 10 servers.
    22 
    23 "Forward error correction" is another term for erasure coding.
     17A: With RAID-5 you can lose any one drive and still recover, and with RAID-6 you can lose any two. Erasure coding generalizes this pattern: data is broken into fragments, expanded and encoded with redundancy, and stored across a chosen set of storage servers, and you configure how many of those servers you can lose and still recover. You can choose how many storage servers are used in total, from 1 to 256, and how many of them are required to recover all the data, from 1 up to that total. We call the total number of storage servers {{{N}}} and the number required {{{K}}}, and we write the parameters as "{{{K-of-N}}}".
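
The RAID-5 case mentioned above can be illustrated with a toy {{{2-of-3}}} code built from XOR parity. This is only a sketch to show the idea of "any K of N shares recover the data"; it is not the code Tahoe-LAFS uses, and the function names `encode_2_of_3` and `decode_2_of_3` are hypothetical. A general {{{K-of-N}}} scheme needs a more general code than a single parity share.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_2_of_3(data):
    """Split data into two halves and add one XOR parity share (toy 2-of-3)."""
    if len(data) % 2:
        data += b"\x00"  # pad so both halves have the same length
    half = len(data) // 2
    s0, s1 = data[:half], data[half:]
    return [s0, s1, xor_bytes(s0, s1)]


def decode_2_of_3(shares, original_len):
    """Recover the data from three share slots, at most one of which is None."""
    s0, s1, parity = shares
    if [s0, s1, parity].count(None) > 1:
        raise ValueError("need at least 2 of the 3 shares")
    if s0 is None:
        s0 = xor_bytes(s1, parity)  # s0 = s1 XOR (s0 XOR s1)
    elif s1 is None:
        s1 = xor_bytes(s0, parity)  # s1 = s0 XOR (s0 XOR s1)
    return (s0 + s1)[:original_len]  # drop any padding byte
```

Losing any single share, whether a data half or the parity, still allows full recovery, which is the {{{K-of-N}}} property with {{{K}}} = 2 and {{{N}}} = 3.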
     18
     19This uses an amount of space on each storage server equal to the total size of your data divided by {{{K}}}.
     20
     21The default Tahoe-LAFS parameters are {{{3-of-10}}}, so the data is spread over 10 different storage servers, and you can lose any 7 of them and still recover all the data. This is more reliable than comparable RAID setups, at a cost of about 3.3 times the storage space that a single copy takes: each server uses space equal to 1/3 of the size of the data, and there are 10 servers, so the total is 10/3 of the original size.
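
The arithmetic behind these figures can be sketched as follows. This is an idealized calculation that assumes each share is exactly the data size divided by {{{K}}} and ignores any per-share bookkeeping overhead; the helper name `erasure_params` is hypothetical and not part of Tahoe-LAFS.

```python
from fractions import Fraction


def erasure_params(k, n, data_size):
    """Idealized storage figures for K-of-N erasure coding.

    k: number of shares required to recover the data
    n: total number of shares (one per storage server)
    data_size: size of the original data, in any unit
    """
    if not (1 <= k <= n <= 256):
        raise ValueError("require 1 <= k <= n <= 256")
    share_size = Fraction(data_size, k)      # space used on each server
    return {
        "share_size": share_size,
        "total_stored": share_size * n,      # space used across all servers
        "expansion": Fraction(n, k),         # total space relative to one copy
        "tolerated_losses": n - k,           # servers that may be lost
    }


defaults = erasure_params(3, 10, 300)
print(float(defaults["expansion"]), defaults["tolerated_losses"])
```

With {{{K}}} = 3 and {{{N}}} = 10 the expansion factor is 10/3, which is where the "about 3.3 times" figure comes from, and up to 7 servers can be lost.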
     22
     23"Forward error correction" (FEC) is another term for erasure coding.
    2424
    2525Erasure coding should not be confused with "secret sharing", which has the additional security property that fewer than {{{K}}} servers cannot recover any information about the data. Tahoe-LAFS' erasure coding does not have this property, and does not need to have it because we rely on secret-key encryption (using a key in the read cap) for confidentiality.
    2626
    27 "Information Dispersal Algorithm" (IDA) can refer either to an erasure code or a secret sharing algorithm depending on context, so we prefer not to use that term.
     27"Information Dispersal Algorithm" (IDA) can refer to either an erasure code or a secret sharing algorithm, depending on context, so we prefer not to use that term.
    2828
    2929'''[=#Q3_disable_encryption Q3:] Is there a way to disable the encryption for content which isn't secret? Won't that save a lot of CPU cycles?'''