[tahoe-dev] Modifying the robots.txt file on allmydata.org
Zooko Wilcox-O'Hearn
zooko at zooko.com
Wed Feb 24 10:38:18 PST 2010
On Wednesday, 2010-02-24, at 1:50, David-Sarah Hopwood wrote:
> Allowing crawlers to index some of the dynamically generated pages
> under /trac could cause horrible breakage, given darcs+trac's
> performance problems. You'd have to look at what subsets of that
> are sufficiently static.
The main thing to avoid is URLs that have "rev=XYZ" in them, like these:
http://allmydata.org/trac/tahoe-lafs/browser/setup.cfg?rev=3996
http://allmydata.org/trac/tahoe-lafs/browser/setup.cfg?annotate=blame&rev=3996
Those URLs ask darcs to reconstruct what a particular file or
directory looked like at some point in the past, which is relatively
expensive.
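If we want to keep crawlers off just those URLs rather than all of
/trac, something like the following in robots.txt might do it. This is
only a sketch: wildcard and query-string matching in Disallow lines is
a de-facto extension honored by the major crawlers (Googlebot, Bingbot)
rather than part of the original robots.txt standard, and the exact
path prefixes would need checking against what trac actually serves.

  User-agent: *
  # "file at revision XYZ" views force darcs to reconstruct old state
  Disallow: /trac/tahoe-lafs/browser/*?rev=
  Disallow: /trac/tahoe-lafs/browser/*&rev=
  # blame/annotate views are just as expensive
  Disallow: /trac/tahoe-lafs/browser/*annotate=blame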
On the other hand, the trac-darcs plugin caches the results of those
queries in its SQLite db, so perhaps letting a spider laboriously crawl
the whole thing is a way to fix the performance problems. :-)
Regards,
Zooko