wiki:SftpFrontend

Version 97 (modified by exarkun, at 2021-01-08T16:42:39Z)

Getting rid of FTP - https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3583

The SFTP frontend is a server that optionally runs as part of a gateway node, and provides read/write access to the Tahoe grid via the SSH File Transfer Protocol.

See [http://tahoe-lafs.org/trac/tahoe-lafs/browser/docs/frontends/FTP-and-SFTP.rst docs/frontends/FTP-and-SFTP.rst] for how to enable and set up the SFTP frontend on a gateway. This page covers compatibility issues with particular SFTP clients, and assumes that you are using Tahoe-LAFS v1.10.0 or later. (We do not recommend that you use SFTP with earlier versions of Tahoe.) Please add any further issues that you discover.

Security

(Note: the following issue no longer applies with recent versions of Tahoe and Twisted; need version details.)

The security of the connection between the SFTP client and gateway is dependent on the PyCrypto library, which has not been reviewed to the same extent as the pycryptopp library that we use elsewhere in Tahoe-LAFS. In particular, the AES implementation in PyCrypto might be vulnerable to timing attacks and the RSA implementation in PyCrypto up to and including at least PyCrypto v2.4.1 is vulnerable to timing attacks. Either of these could potentially, depending on the situation, allow a remote attacker to break the encryption protecting the SFTP connection between your SFTP client and the Tahoe-LAFS gateway process that is acting as SFTP server. Therefore we do not recommend that you rely on the confidentiality or authentication provided by this SSH connection in the current release.

In practice, that means you can run the Tahoe-LAFS gateway locally on the same machine as your SFTP client (which is a good, efficient, and secure solution), or tunnel your SFTP connection over another secure connection such as an SSH tunnel or a VPN, or else just accept the risk that someone could snoop on the data that you are sending and receiving over the SFTP connection.
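The tunnelling option can be sketched as follows. This is only an illustration that builds an OpenSSH local port-forwarding command; the helper name tunnel_command, the gateway host name and the user name are hypothetical, and port 8022 is assumed to be the port on which the SFTP frontend listens on the gateway's loopback interface.

```python
import shlex


def tunnel_command(gateway_host, user, sftp_port=8022, local_port=8022):
    """Build an OpenSSH command that forwards local_port on this machine
    to the SFTP frontend on the gateway's loopback interface.

    All argument values are assumptions chosen for illustration.
    """
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:127.0.0.1:{sftp_port}",
        f"{user}@{gateway_host}",
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before being run by hand.
    print(shlex.join(tunnel_command("gateway.example.org", "alice")))
```

With such a tunnel running, the SFTP client would connect to 127.0.0.1 on the local port rather than to the gateway directly.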

Server keys with passphrases are not supported (#1039).

General compatibility issues

Before uploading a file to a Tahoe filesystem, the whole file has to be available. This means that the upload can only start when the file has been closed in the SFTP session. Particularly when writing large files, the client may time out between sending the close request and receiving the response (ticket #1041). This is known to be a problem for at least the WinSCP client, which has a default close timeout of 15 seconds. In the case of WinSCP this can be worked around by setting WinSCP -> Connection -> Timeouts to 6000 seconds (the maximum allowed); other clients with this problem may have similar settings.
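To judge whether a client's close timeout is long enough, a rough estimate of the upload time can help. The sketch below is plain arithmetic under stated assumptions: the function name, the expansion parameter (a stand-in for any encoding overhead on the data sent to storage servers) and the safety margin are all hypothetical and should be replaced with values measured on your own grid.

```python
def close_timeout_needed(size_bytes, upload_bytes_per_sec,
                         expansion=1.0, margin=2.0):
    """Return a rough lower bound, in seconds, for the client's close timeout.

    expansion and margin are illustrative assumptions, not values taken
    from Tahoe-LAFS.
    """
    if upload_bytes_per_sec <= 0:
        raise ValueError("upload rate must be positive")
    return margin * expansion * size_bytes / upload_bytes_per_sec


if __name__ == "__main__":
    # A 100 MB file over a 1 MB/s uplink, with the default margin.
    print(close_timeout_needed(100 * 10**6, 10**6))
```

Under these assumptions, a 100 MB file on a 1 MB/s uplink already needs far more than a 15-second timeout.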

In the period after the close but before the upload has finished, the closed file may not appear in directory listings, or may appear with an incorrect modification time.

Since Tahoe uses capability access control rather than Unix-style permissions, the permission bits seen by SFTP clients are only an approximation chosen to avoid confusing client programs. In particular the 'user', 'group' and 'world' permissions on a Tahoe file will always be the same. It is possible to clear all of the 'w' bits on a file, which will prevent that file from being opened for writing, but note that its directory entry can still be replaced via a write cap to the directory.
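The shape of these permission bits can be illustrated with Python's stat module. This is a sketch only: the mode 0o666 used here is an assumed example value, not necessarily what the frontend reports, and both helper names are hypothetical.

```python
import stat


def permission_classes(mode):
    """Return the (user, group, world) permission triplets of a mode."""
    perms = stat.S_IMODE(mode)
    return (perms >> 6) & 0o7, (perms >> 3) & 0o7, perms & 0o7


def clear_write_bits(mode):
    """Clear the 'w' bit for user, group and world, keeping all other bits."""
    return mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)


if __name__ == "__main__":
    example = stat.S_IFREG | 0o666  # assumed example mode for a writable file
    print(permission_classes(example))
    print(stat.filemode(clear_write_bits(example)))
```

A client-side check could use permission_classes to confirm that all three classes match, as described above.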

See the last section of docs/frontends/FTP-and-SFTP.rst for information on how the SFTP frontend treats immutable and mutable files.

Deleting a directory via the SFTP frontend will not check that it is empty. The directory will be unlinked from its parent, but its contents will remain accessible via any other capabilities to it.

The 'ctime' and 'mtime' attributes will always be the same, and are set from the Tahoe linkmotime timestamp, which is changed only when the link from the parent directory is modified (see the 'About the metadata' section of webapi.rst). These fields are not updated when the contents of a mutable file are changed. The SFTP protocol and the server are able to represent dates up to the year 2106, but some clients may print dates after 2037 incorrectly.
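The two years mentioned above are consistent with timestamps stored as 32-bit counts of seconds since 1970. The sketch below assumes an unsigned 32-bit field on the protocol side and a signed 32-bit field in an affected client; the second assumption may not hold for every client that misprints dates.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(bits, signed):
    """Return the latest UTC time that fits in a seconds-since-epoch
    integer of the given width."""
    max_seconds = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return EPOCH + timedelta(seconds=max_seconds)


if __name__ == "__main__":
    print(last_representable(32, signed=False))  # unsigned limit (assumed protocol field)
    print(last_representable(32, signed=True))   # signed limit (assumed client field)
```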

Unicode filenames

The SFTP frontend encodes all filenames as UTF-8 when communicating with the client. Support for displaying and copying non-ASCII filenames is likely to vary between clients. If you are using a filesystem that represents names as UTF-8 (including via sshfs), then it should just work, but please report your experience with this.

Some clients fail to convert filenames to UTF-8, or require a configuration option to do so; see ticket #1089. In this case they will usually fail to create non-ASCII filenames (although there is a small chance that the name in another encoding will accidentally be decodable as UTF-8), and directory listings will show mojibake for non-ASCII names.
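Both failure modes can be reproduced in a few lines of Python. This sketch assumes a client that uses Latin-1 instead of UTF-8; the file name is a made-up example.

```python
name = "caf\u00e9.txt"  # hypothetical non-ASCII file name

utf8_bytes = name.encode("utf-8")      # what a conforming client sends
latin1_bytes = name.encode("latin-1")  # what a misconfigured client might send


def decodes_as_utf8(raw):
    """Return True if raw is valid UTF-8, as the frontend expects."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A client that reads UTF-8 names as Latin-1 shows mojibake in listings.
mojibake = utf8_bytes.decode("latin-1")

if __name__ == "__main__":
    print(decodes_as_utf8(utf8_bytes), decodes_as_utf8(latin1_bytes))
    print(mojibake)
```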

Filenames are normalized to NFC, which means that it is not possible to have two files/subdirectories with canonically equivalent names in the same directory. (This does not cause any incompatibility with filesystems that use a different normalization, such as NFD in Mac OS X.)
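The effect of NFC normalization can be checked with the standard unicodedata module. In this sketch the two example names are made up; they differ as strings but are canonically equivalent, so they would refer to the same directory entry.

```python
import unicodedata


def nfc(name):
    """Normalize a file name to NFC."""
    return unicodedata.normalize("NFC", name)


composed = "caf\u00e9"     # 'e' with acute accent as one code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent (NFD form)

if __name__ == "__main__":
    print(composed == decomposed)            # the raw strings differ
    print(nfc(composed) == nfc(decomposed))  # the normalized names collide
```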

Performance

The SFTP frontend currently performs no caching (sshfs does cache, but only for 20 seconds with the default settings). Some applications assume that file operations have relatively low latency, and may have very poor performance when working directly with a Tahoe filesystem. In this case it may be better to copy files to a local filesystem and work on them there, then copy back any changes. Note that just browsing a directory may cause some apps to perform many unnecessary reads or attribute checks of files in that directory.
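The copy, edit locally, copy back pattern can be sketched as a small context manager. It assumes the Tahoe filesystem is reachable as an ordinary path (for example through an sshfs mount); the helper name local_working_copy is hypothetical, and the sketch handles a single file only.

```python
import contextlib
import hashlib
import os
import shutil
import tempfile


def _digest(path):
    """Hash a local file so that changes can be detected cheaply."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


@contextlib.contextmanager
def local_working_copy(remote_path):
    """Yield the path of a local copy of remote_path; copy it back on
    exit only if the local copy was modified."""
    workdir = tempfile.mkdtemp()
    local_path = os.path.join(workdir, os.path.basename(remote_path))
    try:
        shutil.copyfile(remote_path, local_path)  # one read from the mount
        before = _digest(local_path)
        yield local_path
        if _digest(local_path) != before:
            shutil.copyfile(local_path, remote_path)  # one write back
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

An application would then do all of its reads and writes on the yielded local path, so the slow filesystem sees at most one download and one upload.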

The -o big_writes option to sshfs may improve write performance.

Specific clients

sshfs on Linux

sshfs is an SFTP client that allows filesystem access via FUSE (a user-space filesystem layer). It works on Linux and other Unix systems that provide FUSE. (See below for Mac OS X.)

Tahoe's SFTP frontend includes several workarounds and extensions to make it function correctly with sshfs.

Mutable parts of a filesystem should only be accessed via a single sshfs mount (this is a stronger restriction than the write coordination directive against writing mutable parts of a filesystem via more than one gateway). Data loss may result for concurrently accessed files if this restriction is not followed.

When writing a file to the Tahoe filesystem, sshfs does not wait for the 'close' request to complete before reporting to the application that the file has been successfully closed (#1059). Therefore, you should not shut down your gateway node immediately after writing files via sshfs, otherwise those files may be lost. It is possible that an upload could fail (due to a network error, lack of storage space, etc.); such failures will not be reported to applications using sshfs. This also implies that during the upload, a file could be visible via SFTP but not via the Tahoe WUI, CLI, or other frontends.

(This patch makes sshfs wait for close requests to complete, but may cause its own compatibility problems; the patch is provided only for testing purposes.)

Some applications may make assumptions that are incompatible with Tahoe. For example, 'flushing' a file does not guarantee that written data is reflected in the Tahoe filesystem, so opening the same file via another handle and attempting to read that data before the original handle is closed will not work.
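A pattern that avoids this problem is to close the writing handle before opening the file again for reading. The sketch below uses a hypothetical helper name and works on any path; on an sshfs mount the caveat about close requests noted above (#1059) still applies.

```python
def write_then_read(path, data):
    """Write data, close the handle, and only then reopen the file to read.

    Flushing alone is not relied upon; the 'with' block closes the
    writing handle before the second open.
    """
    with open(path, "wb") as writer:
        writer.write(data)
    # The writing handle is closed here.
    with open(path, "rb") as reader:
        return reader.read()
```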

If a file is written via two handles concurrently, the contents visible at any point in time will be the data written via one handle or the other (or the previous contents), or the read will fail. The result will not be an interleaving as would be the case for a POSIX filesystem. Also, the file contents obtained by a successful read via any handle will be a snapshot at about the time of the open. These differences from the POSIX semantics are arguably improvements (at least when the read succeeds), but in principle they could confuse some applications.

If a file in a mutable directory is closed concurrently with an operation that needs to read the directory, then the latter operation may fail (#1105).

A POSIX application might assume that deleting a non-empty directory will fail, when it does not on a Tahoe filesystem (#1362).
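An application that relies on the POSIX behaviour could add its own check before removing a directory. This is a minimal sketch with a hypothetical helper name; the check and the removal are not atomic, so it does not protect against concurrent writers.

```python
import errno
import os


def remove_if_empty(path):
    """Remove a directory only if it has no entries, raising the same
    kind of error a POSIX rmdir would raise otherwise."""
    if os.listdir(path):
        raise OSError(errno.ENOTEMPTY, "Directory not empty", path)
    os.rmdir(path)
```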

To unmount an sshfs filesystem, make sure you are in the fuse group (if necessary use "sudo adduser `whoami` fuse"), and then do "fusermount -u mountpoint".

If you encounter problems, please use the debugging options -o debug,sshfs_debug,loglevel=debug and send the resulting log to the tahoe-dev list. The log output from the gateway, which can be captured as described in docs/logging.rst, may also be helpful.
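For reference, a mount command with the debugging options could be assembled as below. This sketch only builds the argument list; the helper name and the mount point are hypothetical, and the user, host and port are assumed to match an account and listening port configured on your gateway.

```python
import shlex


def sshfs_debug_command(user, host, mountpoint, port=8022):
    """Build an sshfs command line that enables the debugging options.

    The user, host, port and mount point are assumptions for illustration.
    """
    return [
        "sshfs",
        f"{user}@{host}:",  # an empty remote path after the colon
        mountpoint,
        "-p", str(port),
        "-o", "debug,sshfs_debug,loglevel=debug",
    ]


if __name__ == "__main__":
    print(shlex.join(sshfs_debug_command("peter", "127.0.0.1", "/mnt/tahoe")))
```

Redirecting the output of the resulting command to a file gives a log that can be sent to the list.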

sshfs on Mac OS X

In principle, sshfs should work with OSXFUSE on Lion (Mac OS X 10.7) or later. However, this has not been tested for some time, at least since the merger with Fuse4X. Experience reports of using sshfs with Tahoe on OS X would be appreciated.

All of the caveats noted for Linux above apply, and the following additional ones:

OS X versions of FUSE store "extended attributes" in files with names starting with "._". For example the attributes for "foo.txt" would be stored in a file called "._foo.txt". Since some Mac OS X applications may depend on these attributes (especially for their own file formats), if you need to copy or move the original file then you should copy or move the attribute file along with it. The OS X cp and mv commands will do this by default; operations using the Tahoe WUI or CLI will not (unless you are moving all files in a directory). Note that filenames beginning with "." are not listed by default by ls.
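A script that copies files outside of cp and mv could carry the attribute file along explicitly. The following is a sketch with hypothetical helper names; it assumes the "._" naming convention described above and copies file contents only.

```python
import os
import shutil


def attribute_file_for(path):
    """Return the name of the '._' attribute file that accompanies path."""
    directory, name = os.path.split(path)
    return os.path.join(directory, "._" + name)


def copy_with_attributes(src, dst):
    """Copy src to dst, and copy its attribute file too if one exists."""
    shutil.copyfile(src, dst)
    sidecar = attribute_file_for(src)
    if os.path.exists(sidecar):
        shutil.copyfile(sidecar, attribute_file_for(dst))
```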

TextEdit and vi are known to have problems editing files on a Tahoe-via-sshfs filesystem on Mac OS X.

Gnome virtual filesystem (gvfs)

gvfs is a set of filesystem adapters provided with the Gnome window system. It can be used in two ways: either via the GIO API, or via a FUSE layer called gvfs-FUSE (not to be confused with sshfs).

Apps that use the GIO API, such as the Nautilus file browser, seem to work correctly with Tahoe.

gvfs-FUSE, on the other hand, is not recommended for use with Tahoe. This is because it has to map POSIX filesystem requests onto GIO requests, and this mapping loses information -- some combinations of 'open' flags cannot be expressed in the GIO API, for example. Therefore it is impossible for gvfs-FUSE to provide a fully correct FUSE filesystem (or even one that is "good enough" for many applications).

It may not be entirely clear to users whether a particular Gnome app is using GIO or gvfs-FUSE. Recent versions of OpenOffice use gvfs-FUSE when opening a file directly from an SFTP filesystem, and this may cause problems (although OpenOffice does appear to work when editing files on an sshfs filesystem).

WinSCP

In the WinSCP Login dialog, the following options need to be set (some require 'Advanced options' to be checked):

  • In the Environment section, set 'UTF-8 encoding for filenames' to 'On'.
  • In the Connection section, set 'Server response timeout' to the maximum 6000 seconds.
  • Use '127.0.0.1' instead of 'localhost' if WinSCP says the connection has been refused. It sometimes tries connecting to [::1] (IPv6 localhost), where tahoe does not listen.

Note that these options are not persistent unless you save them as a 'Stored session', together with the host name, username, etc.
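To see why 'localhost' can behave differently from '127.0.0.1', it may help to list the addresses a host name resolves to. This sketch uses the standard socket module; the helper name is hypothetical and port 8022 is an assumed SFTP port. The result for 'localhost' depends on the system configuration.

```python
import socket


def resolved_addresses(host, port=8022):
    """Return the distinct (address family, address) pairs for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM,
                               proto=socket.IPPROTO_TCP)
    seen = []
    for family, _type, _proto, _canonname, sockaddr in infos:
        entry = (family, sockaddr[0])
        if entry not in seen:
            seen.append(entry)
    return seen


if __name__ == "__main__":
    # 'localhost' may include an AF_INET6 entry for ::1 on some systems.
    for family, address in resolved_addresses("localhost"):
        print(socket.AddressFamily(family).name, address)
```

If the list for 'localhost' contains an IPv6 entry, a client may try that address first and be refused.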

FileZilla

The following options are in the Settings dialog accessed from the Edit menu:

  • In the Connection section, set 'Timeout in seconds' to 0 (disabled).
  • In the Transfers section, you may want to increase the maximum number of simultaneous transfers.
  • In the File Types subsection, you may want to set the Default transfer type to Binary, delete all of the filetypes in the list, and uncheck 'Treat files without extension as ASCII file' and 'Treat dotfiles as ASCII files'. (This isn't Tahoe-specific, but attempting to automatically detect and convert line endings of text files is usually the wrong thing.)

Vim

Vim can edit remote files over SFTP using netrw.vim. Vim 7.3 with netrw.vim 142 was tested and works. I couldn't find out how to specify a port when opening a remote file (perhaps it's not possible), but you can set up a host with the correct port in your ssh config (~/.ssh/config for me):

Host tahoe
    HostName 127.0.0.1
    User peter
    Port 8022

Now to open a file: vim sftp://peter@tahoe/secrets.txt, where secrets.txt is a file at the root of the dircap associated with the SFTP user peter in accounts.file. You'll get a password prompt whenever you open or save. Warning: netrw.vim stores the remote file in a temporary file on local non-volatile memory, so this technique does not prevent the plaintext contents of your file from being stored on your disk.

Emacs

Emacs can in theory edit remote files using TRAMP. Emacs 23.4.1 was tested and does not work since TRAMP expects a "shell" rather than the "SFTP subsystem" (see RFC 4254 section 6.5) interface Tahoe-LAFS supports. It's very unlikely that Tahoe will be changed to support a shell interface over SSH, since the set of commands that should be implemented to allow file transfer is not standardized.

However, TRAMP also supports GVFS as an external backend. It may be possible to edit remote files with Emacs using GVFS, but this has not been tested. See GVFS-based external methods in the TRAMP documentation for requirements and configuration details. Since this method uses gvfs-FUSE, also note the caveats about that above.
