[tahoe-dev] Grid Design Feedback
Nathan Eisenberg
nathan at atlasnetworks.us
Mon Jun 27 00:37:30 PDT 2011
All,
I am in the process of designing a production datacenter storage grid. It will start at 4x4TB storage nodes, and hopefully grow to 10x4TB before too long. It will store a variety of content, but mostly images under 4MB. There are apt to be tens of millions of files almost immediately, as the day-one user will be moving existing data off of an NFS solution. 'Put' access will be via the FTP frontend, and 'get' access will be via a caching reverse proxy sitting in front of the Twisted WUI. Directories will probably not go more than 3 deep.
The grid will never grow to more than 10 nodes, as we'll just create additional grids after that (this is primarily to prevent an A-M-D type failure where dircaps are spread over many servers). If more space is required, we'll expand the 10 nodes, rather than add more nodes.
At some point, we plan to use rsync to create a 'replica grid' off-site. Expiration will be turned on, and renewal scripts will crawl the user directories to keep leases current.
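A minimal sketch of what such a renewal script might look like, assuming the stock `tahoe deep-check --add-lease` CLI command is used to walk each user's directory tree. The alias names and the helper functions `renew_command` and `renew_all` are hypothetical, not part of Tahoe itself.

```python
import subprocess


def renew_command(alias, tahoe_bin="tahoe"):
    """Build the argv that renews leases under one alias (hypothetical helper)."""
    # Tahoe aliases are written with a trailing colon, e.g. "alice:".
    root = alias if alias.endswith(":") else alias + ":"
    return [tahoe_bin, "deep-check", "--add-lease", root]


def renew_all(aliases, run=subprocess.call):
    """Run the renewal for every alias; return the aliases whose run failed."""
    failed = []
    for alias in aliases:
        # A non-zero exit status is treated as a failed crawl for that user.
        if run(renew_command(alias)) != 0:
            failed.append(alias)
    return failed
```

Run from cron at an interval comfortably shorter than the lease duration, something like `renew_all(["alice", "bob"])` would report which user trees need another pass.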
Since nodes will never leave the grid permanently (only brief windows for reboots and such), I was thinking that simple replication (k=1, n=2, happy=2) would be sufficient. The backing disks are in RAID-1, to prevent a disk failure from requiring a file repair.
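As a rough back-of-the-envelope check on those parameters, the sketch below works out usable capacity for a k-of-n encoding. It assumes decimal units (1 TB = 10^12 bytes), ignores per-share and filesystem overhead, and uses 10 million files as an illustrative stand-in for "tens of millions"; the function names are made up for this example.

```python
TB = 10**12


def usable_bytes(nodes, bytes_per_node, k, n):
    """Usable capacity when every file is expanded by a factor of n/k."""
    raw = nodes * bytes_per_node
    return raw * k // n


def max_average_file_size(usable, file_count):
    """Largest average file size (bytes) that still fits file_count files."""
    return usable // file_count


start = usable_bytes(4, 4 * TB, k=1, n=2)    # 4x4TB nodes: 8 TB usable
grown = usable_bytes(10, 4 * TB, k=1, n=2)   # 10x4TB nodes: 20 TB usable
avg = max_average_file_size(start, 10_000_000)  # 800,000 bytes per file
```

Under these assumptions, 10 million files only fit on the initial four nodes if the average file is no larger than about 800 KB, well under the 4MB ceiling.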
Are there any initial concerns with this design? Will Tahoe scale to this many files? Is there any problem with replication-only on a server grid? Is there a better way to replicate the grid than rsyncing the master nodes to a slave node in a remote facility (for example, is it perhaps possible to make the FTP frontend 'put' to two grids instead of just one)?
Lastly, is there any method for accounting/logging on the FTP server? For example, when a user uploads a file, is it possible to have a log that says what time, what user, what file, and how much data?
Sorry for all the questions!
Nathan Eisenberg