[tahoe-dev] Where does allmydata.client start listening?

Callme Whatiwant nejucomo at gmail.com
Tue Jul 23 07:00:05 UTC 2013


On Sun, Jul 21, 2013 at 2:56 AM, Steven Lee <elderry at outlook.com> wrote:
> First of all thanks for your help to my understanding of the source these
> days, now I am struggling to comprehend what is going to happen when a
> client was launched(Here I mean the client node's HTTP server). I learnt how
> twistd works then checked tahoe-client.tac. In my opinion, these lines:
>> c = client.Client()
>> application = service.Application("allmydata_client")
>> c.setServiceParent(application)
> showed that "c"(a Client class) appear as a service and was then registered
> to a twistd application. Then, according to the documents I found(for
> example: http://twistedmatrix.com/trac/):
>> class Echo(protocol.Protocol):
>>     def dataReceived(self, data):
>>         self.transport.write(data)
> I can often find a function like "dataReceived" to tell me that the server
> starts to listen here.

This sounds slightly incorrect to me.  The dataReceived method of a
Protocol subclass is called by the twisted reactor *after* a TCP
connection has been made and data has arrived over the network for
that connection.  Different subclasses handle the bytes passed to
dataReceived in different ways.
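
To make the ordering concrete, here is a minimal sketch.  It uses
asyncio from the Python standard library rather than Twisted (that is
my substitution, just to keep the example self-contained); asyncio's
Protocol has a similar shape, with data_received playing the role of
dataReceived.  The events list is only there to record what happens
in which order.

```python
import asyncio

events = []  # records the order in which things happen


class Echo(asyncio.Protocol):
    def connection_made(self, transport):
        # Called once per accepted TCP connection.
        events.append("connection_made")
        self.transport = transport

    def data_received(self, data):
        # Called only after a connection exists and bytes have arrived.
        events.append("data_received")
        self.transport.write(data)


async def demo():
    loop = asyncio.get_running_loop()
    # Binding and listening happen here, before any Echo instance exists.
    # Port 0 asks the OS for any free port.
    server = await loop.create_server(Echo, "127.0.0.1", 0)
    events.append("listening")
    port = server.sockets[0].getsockname()[1]

    # Act as a client: connect, send five bytes, read the echo back.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello")
    await writer.drain()
    reply = await reader.readexactly(5)
    writer.close()
    await writer.wait_closed()
    server.close()
    return reply


reply = asyncio.run(demo())
```

The point is that the "start listening" step lives in the call that
creates the server, not in the protocol class.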

I'm not sure if you are interested in where the Tahoe webapi TCP port
is bound and starts listening, or if you are interested in where it
handles incoming TCP data (which in this case would be HTTP requests).


> Now my question is: where does allmydata.Client start
> listening?

I knew from previous experience that the webapi processing code is
related to the allmydata.webish python module, so I searched the
client class for that to find this:

https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/client.py#L452

That's passing webport to the WebishServer constructor.  That
constructor calls the buildServer method, and inside there it calls
strports.service() here:

https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/webish.py#L154

Now, that API is a bit hard to understand, based on the documentation
for that function:

https://twistedmatrix.com/documents/current/api/twisted.application.strports.html#service

Based on the description string, different implementations of IService
are returned.  In this case, the string should begin with 'tcp:', or I
think if it is just a number there's an assumption somewhere that it's
meant to be a TCP port.  I'm not 100% certain, but I think that call
should return a StreamServerEndpointService and assign it to s in the
buildServer method of WebishServer.
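
As a rough illustration of what a description string carries, here is
a toy parser.  This is not Twisted's code: parse_description is a
name I made up, the port number is just an example value, and the
real parser handles more than this sketch does.

```python
def parse_description(description, default="tcp"):
    """Split e.g. 'tcp:3456:interface=127.0.0.1' into (kind, args, kwargs).

    Hypothetical helper for illustration only.
    """
    parts = description.split(":")
    if parts[0].isdigit():
        # A bare port number: fall back to the default kind.
        parts.insert(0, default)
    kind, rest = parts[0], parts[1:]
    # Positional pieces have no '=', keyword pieces look like key=value.
    args = [p for p in rest if "=" not in p]
    kwargs = dict(p.split("=", 1) for p in rest if "=" in p)
    return kind, args, kwargs


with_prefix = parse_description("tcp:3456:interface=127.0.0.1")
bare_number = parse_description("3456")
```

Both forms end up describing a TCP port; the prefix just makes the
kind explicit.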

I think the StreamServerEndpointService will bind the TCP port and
start listening on it when its startService method gets called.
Here's the doc page for it:

https://twistedmatrix.com/documents/current/api/twisted.application.internet.StreamServerEndpointService.html

(I'm a little uncertain because that page doesn't mention TCP anywhere...)
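
Here is a toy model of that idea, using only the socket module from
the standard library.  ToyTCPService and its method names are
hypothetical stand-ins, not Twisted classes; the only thing it is
meant to show is that constructing the object does not open a port,
and starting it does.

```python
import socket


class ToyTCPService:
    """Hypothetical stand-in for a service that owns a listening port."""

    def __init__(self, port, interface="127.0.0.1"):
        # Nothing is bound yet; we only remember the configuration.
        self.port = port
        self.interface = interface
        self.sock = None

    def start_service(self):
        # This is the moment the port is bound and listening begins.
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.bind((self.interface, self.port))
        self.sock.listen(5)
        return self.sock.getsockname()[1]

    def stop_service(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None


svc = ToyTCPService(0)  # port 0 asks the OS for any free port
bound_before = svc.sock is not None
port = svc.start_service()
bound_after = svc.sock is not None
svc.stop_service()
```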

Ok, so that is my broad overview of "where" the code will actually
start listening on the TCP port.  There are several layers here: a
twisted "service" abstraction for starting and stopping something; a
twisted "endpoint" abstraction, which represents one or the other end
of a TCP connection; and the Tahoe code itself.


> Since I think this class "Client" implements a HTTP server, it
> should listen to a port and do something with the data it received, but I
> didn't find the location. If you find any point of my understanding is
> absurd, which I think is likely, please help me correct it.
>

Ah, when you talk about the data received, it makes me wonder if you
are more interested in how the HTTP requests are processed rather than
how the TCP port is bound and listened to.

Tahoe uses a library called Nevow to process HTTP requests.  If you go
back to the buildServer method where strports.service() is called,
notice that the second argument is called "site".

I'm guessing, based on my knowledge of the standard twisted.web.server
library, that Nevow has a similar design: there is a single "site"
which can have a tree of resources or pages.  Scanning back to see
how site is prepared, I see for example that the static.File handler
is installed under the "static" path.  So if an HTTP request comes in
for a URL that begins with "/static", that static.File code will
process it.  (I haven't read that code but I know it simply reads the
files from disk and sends them back in the response.)  That line is
here:

https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/webish.py#L150

Scanning back more I see that site is constructed by passing self.root:

https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/webish.py#L147

-and self.root is constructed in the __init__ of WebishServer:

https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/webish.py#L139

-glancing back at the import:

https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/webish.py#L8

-and then finding the other module for the class definition of Root:

https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/web/root.py#L130

It looks like all of the different URL paths are handled by subclasses
of rend.Page, which are spread throughout that web directory.  I think
a rend.Page subclass can do a combination of up to three things:

a. Generate the response directly in a render_<HTTP METHOD> method, like this:

https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/web/root.py#L120

b. Generate a response from a template file.  (I'm not sure of the details.)

c. Pass the responsibility to "child resources" further down the URL path.
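
To show how (a) and (c) fit together, here is a toy resource tree in
plain Python.  It is not Nevow's API: ToyPage, put_child, dispatch,
and the "tahoe.css" child name are all made up for the sketch, and I
left out (b) because I don't know the template details.

```python
class ToyPage:
    """Hypothetical page: an optional body plus named children."""

    def __init__(self, body=None):
        self.body = body
        self.children = {}

    def put_child(self, name, page):
        self.children[name] = page
        return page

    def render_GET(self):
        # (a) generate the response directly
        return self.body


def dispatch(root, path):
    """Walk the URL path one segment at a time, then render."""
    page = root
    for segment in path.strip("/").split("/"):
        if not segment:
            continue
        # (c) hand the request to a child resource further down the path
        page = page.children.get(segment)
        if page is None:
            return 404, "not found"
    return 200, page.render_GET()


root = ToyPage("welcome")
static = root.put_child("static", ToyPage("static index"))
static.put_child("tahoe.css", ToyPage("body { }"))

root_response = dispatch(root, "/")
css_response = dispatch(root, "/static/tahoe.css")
missing_response = dispatch(root, "/missing")
```

A request for "/" is rendered by the root itself, while a request
under "/static" is passed down to the child installed at that path.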


Does this give you a better picture of what you wanted to learn?


regards,
nejucomo

ps: These kinds of questions may be more likely to be answered if you
ask on the IRC channel.



