[tahoe-dev] wanted: a permanent copy of everything I've ever looked at through my web browser

anders pearson anders at columbia.edu
Wed Feb 17 11:14:41 PST 2010


> I want to be able to recall, and to search through, everything that
> I've ever looked at through my web browser.

I had a very similar desire a couple years ago and wrote this as an
experiment: 

	    http://github.com/thraxil/foxy/blob/master/foxy.py

It's a very simple Twisted HTTP proxy that extracts the text content
of every page passing through it and submits that text to a web-based
fulltext indexing engine that I was running (but no longer use).
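
To give a rough idea of the text-extraction step, here is a minimal
sketch using only the Python standard library. It is not the code from
foxy.py, and it leaves out the Twisted proxy plumbing and the submission
to the indexing engine entirely. The names TextExtractor, extract_text
and should_index are made up for illustration, and skipping script and
style bodies is my assumption about what "text content" should mean.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text outside skipped elements, dropping pure whitespace.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def should_index(content_type):
    """Hypothetical filter: only index responses served as text/html."""
    return content_type.split(";")[0].strip().lower() == "text/html"
```

A proxy would call should_index() on each response's Content-Type
header and, for matching responses, pass the decoded body to
extract_text() before handing the result to whatever indexer is in use.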

It mostly worked, although it was the first time I'd really touched
Twisted, so I didn't quite know what I was doing there (I still don't
know what I'm doing with Twisted), and it had some issues (I remember
most pages working OK, but Gmail and some trickier ones like that
failing miserably).

I think this is basically the right way to approach the problem; I was
just never able to get comfortable enough with Twisted to get the bugs
worked out. 

-- 
anders pearson : http://www.columbia.edu/~anders/
   C C N M T L : http://www.ccnmtl.columbia.edu/
        weblog : http://thraxil.org/