[tahoe-dev] down with filesystems! up with the web! -- Re: [tahoe-lafs] #776: users are confused by "tahoe rm"

David-Sarah Hopwood david-sarah at jacaranda.org
Mon Dec 28 13:13:20 PST 2009


Zooko Wilcox-O'Hearn wrote:
> On Sunday, 2009-12-27, at 19:19 , Shawn Willden wrote:
> 
>> Indeed, the files are just as much deleted as they are in any Unix  
>> file system.  The only difference is that in a Tahoe grid garbage  
>> collection is much slower (really slow if the storage nodes have GC  
>> turned off).
> 
> It's true that this same issue is present in any unix file system,  
> but the speed of garbage collection is not the only difference.  An  
> important difference is that every unix filesystem disallows hard  
> links to directories.  (An exception that proves this rule is that  
> Apple recently extended HFS to allow hardlinks to directories, but  
> only with some specific limitations intended to prevent cycles, and  
> only to support Time Machine backups.)  Also non-unix filesystems  
> such as Windows and pre-unix Mac disallow hardlinks to directories,  
> and even hardlinks to files.  This makes me suspicious that the  
> designers of those systems had good reasons for this, and the fact  
> that Tahoe-LAFS gaily allows hardlinks to any object is probably an  
> example of fools rushing in where angels fear to tread.

I think this is overestimating OS designers' concern for avoiding
user confusion. All common operating systems [*] do allow symlinks to
directories, which introduce all the same potential usability problems
as hardlinks to directories, and more.

I'm more inclined to believe that disallowing hardlinks to directories
was almost entirely a matter of implementation expedience: if hardlinks
can form cycles, link counts alone can no longer reclaim storage, which
implies either implementing GC or accepting leaks, and it is easier to
disallow hardlinks to directories than to implement cycle prevention.
Note that this theory is consistent with the HFS exception: Apple found
a reason to support acyclic hardlinks to directories that was good
enough to justify the effort of implementing cycle prevention.
It also explains why symlinks to directories are permitted: since they
are allowed to be dangling links (i.e. a symlink does not keep its
referent alive), they don't require implementing GC.
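
To make the GC-or-leak point concrete, here is a minimal sketch of a toy
in-memory model, not any real filesystem's code. The names Node, link,
unlink and reachable are hypothetical and exist only for illustration.
It assumes storage is reclaimed purely by link count, and shows that a
hardlink cycle survives deletion of its last external link, while a
trace from the root correctly sees it as garbage.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, as inodes would be
class Node:
    name: str
    links: dict = field(default_factory=dict)  # entry name -> Node (hard link)
    nlink: int = 0  # number of hard links pointing at this node


def link(parent, entry, child):
    """Add a hard link parent/entry -> child and bump the link count."""
    parent.links[entry] = child
    child.nlink += 1


def unlink(parent, entry, freed):
    """Link-count-only deletion: free a node only when nlink reaches zero."""
    child = parent.links.pop(entry)
    child.nlink -= 1
    if child.nlink == 0:
        freed.append(child.name)
        for name in list(child.links):
            unlink(child, name, freed)


def reachable(root):
    """Mark phase of a tracing GC: names of all nodes reachable from root."""
    seen, stack = {}, [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen[id(node)] = node.name
        stack.extend(node.links.values())
    return set(seen.values())


def demo_cycle():
    root, a, b = Node("root"), Node("a"), Node("b")
    link(root, "a", a)
    link(a, "b", b)
    link(b, "back", a)  # hard link that closes the cycle a -> b -> a
    freed = []
    unlink(root, "a", freed)
    return freed, a.nlink, b.nlink, reachable(root)


def demo_tree():
    root, a, b = Node("root"), Node("a"), Node("b")
    link(root, "a", a)
    link(a, "b", b)
    freed = []
    unlink(root, "a", freed)
    return freed
```

In this model demo_tree() frees both "a" and "b", whereas demo_cycle()
frees nothing: "a" and "b" each keep a link count of 1 even though
reachable(root) contains only "root". A symlink would correspond to
storing a path string without touching nlink, which is why it can dangle
but never causes this kind of leak.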


[*] including Windows NTFS. For many versions that ability was well-hidden,
    but in Vista it is used and visible.

[...]
> Suppose instead of thinking of their Tahoe-LAFS-hosted files and  
> their Tahoe-LAFS directories as being part of a
> "folders-and-documents" abstraction, and instead of them being
> part of a unixy
> path-based "filesystem", they thought of them as a collection of web  
> pages which could have hyperlinks to one another.  Then there is no  
> more "impedance mismatch" between the abstraction in the user's head  
> and the underlying graph structure.  No user is ever surprised that  
> multiple web pages can point to the same web page, or that following  
> a series of hyperlinks can take you in a circle.  No software  
> intended for the Web assumes that the set of web pages that it will  
> visit forms a perfectly hierarchical tree structure without cycles or  
> converging links.

And no software for traversing filesystems *should* do so either,
because of cyclic symlinks. But the Unix "worse is better" philosophy
says to ignore that :-/
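
As a rough sketch of what cycle-tolerant traversal could look like, the
following follows symlinks but visits each directory only once. It
assumes a POSIX-like system where the (st_dev, st_ino) pair identifies a
directory; walk_once is a hypothetical helper name, not part of any
existing tool.

```python
import os
import stat


def walk_once(root):
    """Yield each directory reachable from root once, following symlinks."""
    seen = set()  # (device, inode) pairs of directories already visited
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            st = os.stat(path)  # follows symlinks
        except OSError:
            continue  # dangling or looping symlink, or no permission
        if not stat.S_ISDIR(st.st_mode):
            continue
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue  # reached again via a cycle or a converging link
        seen.add(key)
        yield path
        try:
            with os.scandir(path) as entries:
                stack.extend(entry.path for entry in entries)
        except OSError:
            continue
```

The same visited-set idea is what a web crawler does with URLs, so it
handles converging links as well as cycles.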

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com
