#113 assigned enhancement

command-line: do things in an incremental fashion and accept stdin as input

Reported by: zooko
Owned by: zooko
Priority: major
Milestone: eventually
Component: code-frontend-cli
Version: 0.7.0
Keywords: tahoe-put http streaming memory
Cc: zooko
Launchpad Bug:

Description

The "put" command currently can't take stdin as its input, because it needs to know the file size (Content-Length) before it starts sending. Fix this! Details: maybe use chunked transfer encoding? Maybe the twisted.web2 client already does this? See if tahoe_put-web2ish.py already does the right thing.
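For illustration, here is a minimal sketch of the chunked transfer encoding idea: the body is framed block by block, so the sender never needs the total size up front. The names chunked_encode and block_size are hypothetical and not part of the Tahoe code base, and this ticket does not establish whether the gateway's web server accepts chunked request bodies.

```python
import io


def chunked_encode(stream, block_size=64 * 1024):
    """Yield an HTTP/1.1 chunked body, one chunk per block read from stream.

    `stream` is any binary file-like object (for example sys.stdin.buffer).
    The function name and block size are illustrative assumptions.
    """
    while True:
        block = stream.read(block_size)
        if not block:
            break
        # Each chunk is: size in hexadecimal, CRLF, the data, CRLF.
        yield b"%x\r\n" % len(block) + block + b"\r\n"
    # A zero-length chunk with no trailers marks the end of the body.
    yield b"0\r\n\r\n"


if __name__ == "__main__":
    demo = io.BytesIO(b"hello world")
    print(b"".join(chunked_encode(demo, block_size=5)))
```

A client would send these pieces after a request header containing "Transfer-Encoding: chunked" and no Content-Length, holding only one block in memory at a time.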

Alternately, maybe our web server could be trained to recognize everything between the header and the (half-)close of the connection as being body?
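As a rough sketch of what the half-close alternative means at the socket level (using plain Python sockets rather than the Twisted code the gateway actually uses; both helper names are hypothetical): the sender shuts down only its write side, and the receiver treats end-of-stream as end of body.

```python
import socket


def send_body_then_half_close(sock, body):
    """Send the body, then close only the write direction of the socket."""
    sock.sendall(body)
    # SHUT_WR signals end-of-stream to the peer; we can still read a reply.
    sock.shutdown(socket.SHUT_WR)


def read_until_eof(sock, block_size=4096):
    """Collect everything the peer sends until it closes its write side."""
    chunks = []
    while True:
        data = sock.recv(block_size)
        if not data:  # empty read: the peer has finished sending
            break
        chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    client, server = socket.socketpair()
    send_body_then_half_close(client, b"file contents")
    print(read_until_eof(server))
    server.sendall(b"ok")  # the reply direction is still open
    print(client.recv(2))
    client.close()
    server.close()
```

In this scheme the receiver only sees "no more bytes", with no length or terminator to check the body against.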

Change History (16)

comment:1 Changed at 2007-08-20T18:01:55Z by zooko

This is part of the "improved command-line" task. I would like to see it done for v0.6.

comment:2 Changed at 2007-08-20T18:55:18Z by warner

  • Component changed from unknown to code-frontend

comment:3 Changed at 2007-09-19T22:59:02Z by zooko

  • Milestone changed from 0.6.0 to 0.7.0

comment:4 Changed at 2007-10-01T18:16:42Z by zooko

  • Status changed from new to assigned

comment:5 Changed at 2007-10-19T23:15:15Z by zooko

  • Milestone changed from 0.7.0 to 0.6.2
  • Version changed from 0.4.0 to 0.6.1

I'm interested in working on a few tickets which all have to do with improving the cmdline, for v0.6.2. This is one of them.

comment:6 Changed at 2007-11-01T18:09:08Z by zooko

  • Milestone changed from 0.6.2 to 0.7.1

We're focussing on an imminent v0.7.0 (see the roadmap) which hopefully has #197 -- Small Distributed Mutable Files and also a fix for #199 -- bad SHA-256. So I'm bumping less urgent tickets to v0.7.1.

comment:7 Changed at 2007-11-13T18:29:32Z by zooko

  • Milestone changed from 0.7.1 to 0.7.2
  • Version changed from 0.6.1 to 0.7.0

We need to choose a manageable subset of desired improvements for v0.7.1, scheduled for two weeks hence, so I'm bumping this one into v0.7.2, scheduled for mid-December.

comment:8 Changed at 2008-01-15T21:37:44Z by zooko

  • Component changed from code-frontend to code-frontend-cli

comment:9 Changed at 2008-01-23T04:21:35Z by zooko

  • Milestone changed from 0.7.2 to undecided

comment:10 Changed at 2009-12-13T05:02:02Z by davidsarah

Accepting a half-close as end of file would be quite error-prone.

comment:11 Changed at 2011-05-21T15:54:37Z by davidsarah

Related to #320 (add streaming (on-line) upload to HTTP interface).

comment:12 Changed at 2011-08-26T23:53:09Z by davidsarah

  • Keywords tahoe-put http streaming added
  • Priority changed from minor to major

comment:13 follow-up: Changed at 2011-09-02T03:22:36Z by davidsarah

  • Keywords memory added
  • Type changed from enhancement to defect

Note that tahoe put never uses streaming, even when its input is from a file rather than stdin. This results in memory usage proportional to the file size (which would be expected for SDMF files, but not for immutable or MDMF files).

Note that the increase in memory usage of the gateway process seems to be at least double the file size; for example, when uploading a 191 MiB MDMF file in 1.9alpha using tahoe put --mutable --mutable-type=mdmf, the peak RSS of the gateway (which was also a storage server) was about 510 MiB greater than when updating the same file using SFTP. I think that counts as a defect.

comment:14 Changed at 2011-09-02T03:26:55Z by davidsarah

BTW, I'm much less concerned about whether tahoe put accepts input from stdin, than about whether uploads are memory-efficient when the file size is known in advance. The latter case happens much more frequently (also for other commands like tahoe cp).
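To make the known-size case concrete, here is a minimal sketch of reading an input file in bounded blocks, so the sender's memory use stays independent of file size while the size is still available for Content-Length. The helpers known_size and iter_blocks are hypothetical names, not functions from the tahoe CLI, and this only covers the client side; it says nothing about how the gateway buffers the upload.

```python
import os
import stat


def known_size(f):
    """Return the byte size of a regular file, or None for pipes and the like."""
    try:
        st = os.fstat(f.fileno())
    except (AttributeError, OSError, ValueError):
        return None  # no usable file descriptor (e.g. an in-memory stream)
    return st.st_size if stat.S_ISREG(st.st_mode) else None


def iter_blocks(f, block_size=64 * 1024):
    """Yield the file's contents in blocks of at most block_size bytes."""
    while True:
        block = f.read(block_size)
        if not block:
            return
        yield block


if __name__ == "__main__":
    with open(__file__, "rb") as f:
        print("size:", known_size(f))
        print("bytes read:", sum(len(b) for b in iter_blocks(f, 1024)))
```

With the size known, an uploader could send a Content-Length header and then write each block as it is read, never holding more than one block at once.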

comment:15 in reply to: ↑ 13 ; follow-up: Changed at 2011-09-02T06:16:58Z by zooko

  • Cc zooko added

Replying to davidsarah:

Note that the increase in memory usage of the gateway process seems to be at least double the file size; for example, when uploading a 191 MiB MDMF file in 1.9alpha using tahoe put --mutable --mutable-type=mdmf, the peak RSS of the gateway (which was also a storage server) was about 510 MiB greater than when updating the same file using SFTP. I think that counts as a defect.

Agreed. And it should occupy its own new ticket.

comment:16 in reply to: ↑ 15 Changed at 2011-09-02T15:55:16Z by davidsarah

  • Type changed from defect to enhancement

Replying to zooko:

Replying to davidsarah:

Note that the increase in memory usage of the gateway process seems to be at least double the file size... I think that counts as a defect.

Agreed. And it should occupy its own new ticket.

Filed as #1523.
