[tahoe-dev] Perforce backend served by tahoe.
Marc Tooley
tahoe-devPOST at quake.ca
Mon Jun 29 12:20:01 PDT 2009
One of the upcoming features of Perforce which is available in the beta
version (2009.1 beta) right now is the ability to override the backend
storage of the versioned files with.. well pretty much anything you
want, really. All you have to do is write a script which takes three
arguments in the form of:
%op% %file% %rev%
... and then either reads the actual file data from stdin (when
storing) or prints it out on stdout (when retrieving).
Then use Perforce's trigger mechanism to tell it that the files you
store should be fed through this script.
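To make the contract concrete, here is a minimal Python sketch of such a script's core. It is not the Perl script linked below. The operation names "read", "write" and "delete" are an assumption on my part (check the Perforce trigger documentation), and the dict-backed store is a hypothetical stand-in for a real backend.

```python
import io

def archive(op, filename, revision, stdin, stdout, store):
    """Handle one archive request and return a process exit status.

    `store` is a hypothetical dict standing in for the real backend.
    The op names below are assumed, not taken from the original script.
    """
    key = (filename, revision)
    if op == "write":
        # On a store request the file content arrives on stdin.
        store[key] = stdin.read()
    elif op == "read":
        # On a fetch request the file content must go to stdout.
        if key not in store:
            return 1
        stdout.write(store[key])
    elif op == "delete":
        store.pop(key, None)
    else:
        return 1
    return 0

# Round trip using in-memory streams in place of real stdin/stdout.
store = {}
archive("write", "depot/a.txt", "1", io.BytesIO(b"hello\n"), io.BytesIO(), store)
out = io.BytesIO()
status = archive("read", "depot/a.txt", "1", io.BytesIO(), out, store)
```

A real script would pass sys.argv[1:4], sys.stdin.buffer and sys.stdout.buffer, and would need a persistent store, since each request runs as a separate process.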
I've written such a script, available here:
http://public.perforce.com:8080/guest/marc_tooley/trig/archive/tahoe_backend.pl
It seems to work very nicely, primarily because tahoe allows you to
store arbitrary filenames with arbitrary paths without creating the
directories first to store them. (It does that itself.) I can sync
files, submit them.. everything I tested.
Note that this is not for the database/metadata files. Those are
read/written in 8K pages and tahoe is not a suitable backend for that.
But that doesn't really matter, because you can store the flat-text
checkpoint files (metadata backup files) in tahoe and use it as the
backup mechanism for those, too, so you'll never be behind more than
your last checkpoint/backup cycle.
The script itself has no instructions, so here they are:
1. Set up your tahoe grid. Make sure the user you will be running
Perforce as can run "tahoe ls" without difficulty and you have a root
node/alias with a full-permissions directory cap.
2. Create a Perforce root directory. This is basically as simple as:
i) Download p4d and p4 for your platform.
ii) Run: "p4d -r. -d"
3. Set up your archive trigger with:
i) p4 triggers
ii) Underneath the word "Triggers:" add the following trigger:
happyday archive //... "/path/to/tahoe_backend.pl %op% %file% %rev%"
(Note there needs to be a tab at the beginning.)
iii) Save the form and exit your editor.
4. Set up Perforce typemap:
i) p4 typemap
ii) Under the line "TypeMap:" put:
+X //...
(Note again the leading tab.)
iii) Save your form, and exit.
5. (OPTIONAL) If you want the script to store files somewhere other
than under your default "tahoe:" alias, change $tahoetop inside the
script to your other alias. Don't forget the trailing colon.
Voila! Virtually an instant Perforce server running with a tahoe
backend.
The script may be called on to store files that contain multiple
revisions. Thus the tahoe backend store holds not the files
themselves, but directories named after the filenames, each containing
files named after the revisions. For example, when Perforce asks us
to store:
//depot/path/to/filename#10
... we are actually dumping it into:
tahoe:/depot/path/to/filename/10
... and it turns out this works nicely.
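A Python sketch of that mapping, for illustration only. The function names are hypothetical, the tahoetop default mirrors the script's $tahoetop variable, and the tahoe subcommands chosen (put, get, rm, with "-" for stdin/stdout) and the op names are my assumptions about how one might wire it up, not a transcription of the Perl script.

```python
def tahoe_path(depot_file, revision, tahoetop="tahoe:"):
    """Map a depot file plus revision to a path in the tahoe store."""
    # "//depot/path/to/filename" -> "depot/path/to/filename"
    relative = depot_file.lstrip("/")
    # The filename becomes a directory; the revision is the file in it.
    return f"{tahoetop}/{relative}/{revision}"

def tahoe_command(op, depot_file, revision, tahoetop="tahoe:"):
    """Build an argv list for the tahoe CLI (assumed subcommands)."""
    remote = tahoe_path(depot_file, revision, tahoetop)
    if op == "write":
        return ["tahoe", "put", "-", remote]   # content read from stdin
    if op == "read":
        return ["tahoe", "get", remote, "-"]   # content written to stdout
    if op == "delete":
        return ["tahoe", "rm", remote]
    raise ValueError(f"unknown operation: {op}")

example = tahoe_path("//depot/path/to/filename", "10")
```

The resulting list could be handed to subprocess.run() with the script's own stdin and stdout inherited.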
Also nice (but likely unnecessary) is the additional double-check
Perforce can do when verifying files. Perforce stores an MD5 sum
per-file which can then be double-checked via the "p4 verify" command
to ensure that the files are retrievable and untouched.
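The idea behind that check, as a small Python sketch. This is not how "p4 verify" is implemented; it only shows comparing a stored MD5 digest against freshly retrieved content, and the function names are hypothetical.

```python
import hashlib

def md5_digest(data: bytes) -> str:
    """Return the MD5 digest of the content as uppercase hex."""
    return hashlib.md5(data).hexdigest().upper()

def is_untouched(data: bytes, recorded_digest: str) -> bool:
    """Compare retrieved content against the digest recorded at submit."""
    return md5_digest(data) == recorded_digest.upper()

recorded = md5_digest(b"file content\n")          # stored at submit time
ok = is_untouched(b"file content\n", recorded)    # same bytes came back
bad = is_untouched(b"file c0ntent\n", recorded)   # content was altered
```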
Further niceties include the fact that not only does Perforce make use
of lazy copies, but tahoe does so, too, automatically on a per-file
basis!
That is, when you branch in Perforce:
p4 integ a b
p4 submit -d "new file b"
... b doesn't exist as a separate file on the backend. Perforce knows
that it's the same file as "a" and automatically fetches "b" from the
real backend file "a".
Similarly, tahoe clients use convergent encryption to encrypt
identical files identically, which means if ten developers all check
in the same file, only one copy actually exists on the backend!
Normally in Perforce they'd all exist as separate files. (You don't
even have to share around tahoe's convergence secret for this, because
the same client is doing all the encryption.)
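A toy Python sketch of why this deduplicates. It is emphatically not tahoe's real key derivation or storage format. It only shows the principle: the key is derived from the content plus a per-client secret, so identical content yields an identical key and storage identifier. All names here are hypothetical.

```python
import hashlib
import hmac

def convergent_key(content: bytes, secret: bytes) -> bytes:
    """Derive a key from the content itself plus a convergence secret."""
    content_hash = hashlib.sha256(content).digest()
    return hmac.new(secret, content_hash, hashlib.sha256).digest()

def storage_id(key: bytes) -> str:
    """Identifier under which the encrypted file would be stored."""
    return hashlib.sha256(key).hexdigest()

secret = b"one client, one secret"
backend = {}
# Ten developers check in the same content through the same client.
for _ in range(10):
    content = b"int main(void) { return 0; }\n"
    sid = storage_id(convergent_key(content, secret))
    backend[sid] = "ciphertext placeholder"

copies = len(backend)
```

With a different secret the same content maps to a different identifier, which is why the secret would otherwise need sharing.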
WARNINGS: +X files are NOT proxied by the Perforce Proxy (P4P)! Also,
this backend is quite slow even without the proxy, just as tahoe is,
for many small files. But really, who cares?! :-) Tahoe is awesome.