Opened at 2010-08-28T01:13:53Z
Last modified at 2014-12-02T19:44:04Z
#1189 new defect
investigate best FUSE+sshfs options to use for performance and correctness of SFTP via sshfs
Reported by: | davidsarah | Owned by: | bj0
---|---|---|---
Priority: | major | Milestone: | undecided
Component: | code-frontend-ftp-sftp | Version: | 1.8β
Keywords: | sftp sshfs performance docs | Cc: |
Launchpad Bug: | | |
Description
It looks as though at least direct_io and big_writes may be beneficial, so that writes are not limited to 4 KiB blocks.
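For reference, these options would be passed to sshfs at mount time; a hypothetical invocation (PORT, server, and mnt/ are placeholders, not values from a real setup) might look like:

```shell
# big_writes raises the maximum FUSE write request size above the
# 4 KiB default; direct_io bypasses the kernel page cache.
# PORT, server, and mnt/ are placeholders.
sshfs -p PORT -o direct_io,big_writes server:/ mnt/
```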
Change History (6)
comment:1 Changed at 2010-08-28T08:26:44Z by bj0
comment:2 Changed at 2010-08-28T14:40:15Z by zooko
Dear bj0: thanks for the report!
What version(s) of Tahoe-LAFS were you using? If you have just been tracking the official trunk repo at http://tahoe-lafs.org/source/tahoe-lafs/trunk and haven't applied any other patches, then you can find out by running make make-version.
comment:3 Changed at 2010-08-28T22:49:18Z by davidsarah
Thanks bj0.
big_writes should only affect writes, and I can't immediately see why it would have anything but a beneficial effect. direct_io might affect both reads and writes, and could cause some loss of performance for applications whose performance depends on kernel caching. Can you try the same tests with -o big_writes only?
[I initially suggested both because http://xtreemfs.blogspot.com/2008/08/fuse-performance.html said that direct_io was needed (at least for the versions of sshfs, FUSE, and Linux tested in 2008) to support writing in blocks larger than 4 KiB. However, point 2 in http://article.gmane.org/gmane.comp.file-systems.fuse.devel/5292 suggests that this restriction might have been lifted.]
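To illustrate why the maximum write size matters (this arithmetic is mine, not from the ticket): pushing the 728 MiB test file mentioned below through the mount takes far fewer FUSE write requests at a larger block size, and each request carries fixed per-request overhead.

```shell
# Rough back-of-the-envelope calculation; 128 KiB is a typical
# maximum write size with -o big_writes, not a measured value.
FILE_BYTES=$((728 * 1024 * 1024))
SMALL=$((4 * 1024))     # default FUSE write size
LARGE=$((128 * 1024))   # assumed size with big_writes enabled
echo "4 KiB writes:   $((FILE_BYTES / SMALL)) requests"
echo "128 KiB writes: $((FILE_BYTES / LARGE)) requests"
```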
What Linux kernel version and sshfs version did you use?
comment:4 Changed at 2010-08-29T07:18:10Z by bj0
make make-version returned: setup.py darcsver: wrote '1.8.0c2-r4702' into src/allmydata/_version.py
sshfs Version: 2.2-1build1 (from ubuntu repo)
client uname -a: Linux nazgul 2.6.32-020632-generic #020632 SMP Thu Dec 3 10:09:58 UTC 2009 x86_64 GNU/Linux
server uname -a: Linux testbuntu 2.6.32-24-generic #41-Ubuntu SMP Thu Aug 19 01:12:52 UTC 2010 i686 GNU/Linux
I was going to try the tests without direct_io, but I seem to be having trouble with my VM...
comment:5 Changed at 2010-09-11T23:57:32Z by davidsarah
- Owner set to bj0
comment:6 Changed at 2014-12-02T19:44:04Z by warner
- Component changed from code-frontend to code-frontend-ftp-sftp
I started doing a couple of quick tests I could think of. I didn't repeat many of them, since they took a while and were all manual, but the numbers give a rough idea:
Most of these tests copied a large file (a 728 MB .iso) to and from a tahoe introducer/storage client running inside a VirtualBox VM on the same host (both guest and host running Ubuntu). Most of the copying was done with "time rsync -rhPa". When copying a large file to tahoe, the command hangs for another minute or so after the transfer finishes. I checked the flog during this time and there was activity (reads, I think), so it may be that rsync checksums the file after the transfer; I'm not sure.
To verify the file was being transferred correctly, I also ran "time md5sum mnt/iso". I would expect this to perform similarly to simply reading the file, but for some reason it behaved differently...
with it mounted as: "sshfs -p PORT server:/ mnt/"
with it mounted as: "sshfs -p PORT -o direct_io,big_writes server:/ mnt/"
Obviously I expect the values to fluctuate a bit, but it seems like direct_io,big_writes bumps up the write speed somewhat without really affecting the read speed. I'm not sure why it hurt the md5sum time so badly...
I also tried to rsync a large directory of source files (4 MB, 981 files) to tahoe, but it behaved oddly and stalled a lot, resulting in a very long transfer time (120-140 minutes). This happened both with and without the options.
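The test procedure described above can be sketched as a single script; PORT, server, mnt/, and the file paths are placeholders, not values from a real setup, and sshfs and rsync must be installed:

```shell
#!/bin/sh
# Rough sketch of the benchmark procedure from this ticket.
# All hostnames and paths are hypothetical placeholders.
set -e

sshfs -p PORT -o direct_io,big_writes server:/ mnt/

# Write test: copy the large file into the mount.
time rsync -rhPa large.iso mnt/iso

# Read test: copy it back out.
time rsync -rhPa mnt/iso /tmp/iso-copy

# Read/verify test: checksum through the mount.
time md5sum mnt/iso

fusermount -u mnt/
```

Repeating each timed step a few times and averaging would reduce the fluctuation noted above.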