Wednesday, July 7, 2010

sftp(1) file transfer pipelining patch under development

A new patch for OpenSSH's sftp(1) is under development. It will dramatically improve the performance of transferring large numbers of small files over high-bandwidth, high-latency links, such as the now pervasive wireless networks.
The algorithm pipelines readdir/open/read/write calls with a small window for file handles and avoids unnecessary round-trip delays.
This improved file transfer mechanism, which is possible thanks to the flexibility of the SSH+SFTPv3 protocols, will allow administrators to securely transfer large quantities of small files in a much shorter time frame using sftp(1).
New regression tests are also being written to make sure this functionality stays reliable in the current release and in all future OpenSSH releases!
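To get a feel for why pipelining helps, here is a rough back-of-the-envelope model in Python. It is only a sketch and not the patch itself: the function names, the figure of three round trips per file and the window of 8 file handles are assumptions made for illustration, and the model ignores bandwidth and server processing time.

```python
import math


def sequential_time_ms(num_files, round_trips_per_file, rtt_ms):
    """Every request waits for the previous reply, so each file pays all of
    its round trips one after another."""
    return num_files * round_trips_per_file * rtt_ms


def pipelined_time_ms(num_files, round_trips_per_file, rtt_ms, window):
    """With up to `window` files in flight at once, their round trips overlap,
    so the latency cost is paid once per batch of files (simplified model)."""
    batches = math.ceil(num_files / window)
    return batches * round_trips_per_file * rtt_ms


# Illustrative numbers only: 1000 small files, 3 round trips each, 100 ms RTT.
print(sequential_time_ms(1000, 3, 100))    # 300000 ms
print(pipelined_time_ms(1000, 3, 100, 8))  # 37500 ms
```

In this simplified model the time spent waiting on the network shrinks roughly in proportion to the window size, which is why even a small window pays off on high-latency links.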
Thanks.

Monday, June 14, 2010

New sftp(1) regression tests

I've developed some new regression tests for sftp(1), covering the following functionality:
  • get/put -r (recursive transfers)
  • get/put -p (preserving files' atime and mtime)
  • chown
  • chgrp
  • chmod
If you'd like to integrate them into your OpenBSD -CURRENT tree, you can download the diff here.

If you have any suggestions, just let me know in the comments!

Saturday, June 5, 2010

OpenSSH 5.5 released!

OpenSSH 5.5 has been released and features the Google Summer of Code 2009 work that went into sftp(1). This is a bugfix release and the changes since 5.4 are:

  • Unbreak sshd_config's AuthorizedKeysFile option for $HOME-relative paths
  • Fix compilation failures on platforms that lack dlopen()
  • Include a language tag when sending a protocol 2 disconnection message.
  • Make logging of certificates used for user authentication more clear and consistent between CAs specified using TrustedUserCAKeys and authorized_keys
Portable OpenSSH:
  • Allow contrib/ssh-copy-id to fail gracefully when there are no keys in the ssh-agent. bz#1723
  • Explicitly link libX11 into contrib/gnome-ssh-askpass2. bz#1725
  • Allow ChrootDirectory to work in SELinux platforms. bz#1726
  • Add configure.ac stanza for Haiku OS. bz#1741
  • Enable utmpx support on FreeBSD where possible. bz#1732
  • Use pkg-config to determine libedit linker flags where possible. bz#1744

The complete list of changes can be viewed in the OpenBSD 4.7 release notes.

sftp(1) specific changes:
  • Implement tab-completion of commands, local and remote filenames (requires libedit)
  • Support most of scp(1)'s commandline arguments in sftp(1), as a first step towards making sftp(1) a drop-in replacement for scp(1). Note that the rarely-used "-P sftp_server_path" option has been moved to "-D sftp_server_path" to make way for "-P port" to match scp(1). Implements -2 -4 -6 -c -q -i -p -r switches
  • Add recursive transfer support for get/put and on the commandline

The work started in Google Summer of Code 2009 is still going on, so stay tuned for updates.
Congratulations to the OpenSSH team for another great release!

    Wednesday, May 13, 2009

    Project Roadmap

    The project roadmap is:

    Phase 1:


    Add a different switch for destination port in sftp (-d)

    If in the future we eliminate scp and instead create a softlink to sftp named "scp", we can implement old/new behaviour for -P switch according to the launched program name.

    Implement -2 -4 -6 -c -q -i switches by passing them directly to ssh(1).
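The idea above of choosing the meaning of -P from the launched program name can be sketched as follows. This is a hypothetical Python illustration, not OpenSSH code: the function name `interpret_dash_p` and the returned tuples are assumptions made up for this example.

```python
import os


def interpret_dash_p(argv0, value):
    """Decide what -P means from the name the program was launched under.

    Launched as "scp" (for example through a link to sftp), -P is a port.
    Launched under any other name, -P keeps its old sftp meaning: the path
    to the sftp server.
    """
    program = os.path.basename(argv0)
    if program == "scp":
        return ("port", int(value))
    return ("sftp_server_path", value)


print(interpret_dash_p("/usr/bin/scp", "2222"))
print(interpret_dash_p("/usr/bin/sftp", "/usr/libexec/sftp-server"))
```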


    Phase 2:


    Implement the -r switch for recursive transfers. This operation will be optimized by pipelining in a later phase.


    Phase 3:


    Implement -p switch, to preserve original file times. This switch will cause sftp to set/get the appropriate attributes in SSH_FXP_OPEN messages, or if needed, with a separate message.
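As a local analogue of what -p is meant to achieve, the sketch below copies a file and then applies the source's access and modification times to the copy. It is written in Python purely for illustration and does not speak the SFTP protocol; the helper name `copy_preserving_times` is an assumption.

```python
import os
import shutil


def copy_preserving_times(src, dst):
    """Copy src to dst, then give dst the same atime and mtime as src."""
    # Read the times first, since reading src may itself update its atime.
    st = os.stat(src)
    shutil.copyfile(src, dst)
    # os.utime with ns= takes (atime_ns, mtime_ns) and keeps full precision.
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
```

In sftp the same information would travel as file attributes in protocol messages rather than through local system calls, but the ordering is the same: transfer the data first, then set the times, since writing the data would otherwise overwrite the modification time.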


    Phase 4:


    Implement -l switch, to limit bandwidth. Research the possibility of using scp's bwlimit().
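One common way to limit bandwidth is to compare how long the transfer should have taken at the configured rate with how long it has actually taken, and sleep for the difference. The Python sketch below shows that generic technique only. It is not scp's bwlimit() implementation, and the function name and parameters are assumptions.

```python
def throttle_delay(bytes_sent, elapsed_seconds, limit_bytes_per_second):
    """Return how many seconds to sleep so the average rate stays at or
    below the limit. Returns 0.0 when the transfer is already slow enough."""
    # Time the transfer should have taken at exactly the limit rate.
    expected_seconds = bytes_sent / limit_bytes_per_second
    return max(0.0, expected_seconds - elapsed_seconds)


# 1000 bytes sent in 0.5 s with a 1000 bytes/s limit: wait another 0.5 s.
print(throttle_delay(1000, 0.5, 1000))
```

A transfer loop would call this after each write and pass the result to a sleep call before sending the next chunk.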


    Phase 5:


    Review the tab completion patch, improve it where needed to close the remaining bugs or add any missing functionality, and merge it with support for command completion, local file completion and, optionally, remote file completion, if the OpenSSH dev team agrees to include the remote functionality.


    Phase 6:


    Improve the user experience in the interactive client, for example by allowing put/get with multiple files specified as parameters. Research improvements to other commands, taking suggestions from the community and drawing inspiration from the lftp client's functionality.


    Phase 7:


    Improve sftp-server and/or the sftp client so they can work on paths where some directories are traverse-only, i.e. not listable, and design a solution which ideally avoids extra round trips.


    Phase 8:


    Implement the pipelining of readdir() calls, and the pipelining of open/read/write calls with a small window for file handles in sftp multi-file transfers, to avoid unnecessary delays when transferring many small files.


    Test thoroughly for any regressions, and add new regression tests if necessary.


    Phase 9:


    Implement auto-tuning of the best per-connection settings for the sftp client's -B buffer_size and -R num_requests options, in order to get the best speed out of the network link. I will experiment with different values to understand how sftp benefits the most in different network environments, and design an algorithm to manage these settings automatically.

    I can make sftp use auto-tuning by default, or make it a command line parameter for the user to enable on demand, and document it in the man page.

    Test this functionality extensively, ensure we don't introduce any security bugs, and add new regression tests if necessary.
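A natural starting point for this kind of tuning is the bandwidth-delay product: the number of bytes that must be in flight to keep the link busy. The Python sketch below estimates how many outstanding requests of a given buffer size that takes. It is only an assumption about how such an algorithm might begin, not the design described above, and the function name and example figures are illustrative.

```python
def requests_to_fill_link(bandwidth_bytes_per_s, rtt_ms, buffer_size):
    """Estimate how many outstanding requests of buffer_size bytes are
    needed to keep a link of the given bandwidth and round-trip time full."""
    # Bandwidth-delay product in bytes, rounded up (integer arithmetic).
    bdp_bytes = (bandwidth_bytes_per_s * rtt_ms + 999) // 1000
    # Ceiling division by the buffer size, with at least one request.
    return max(1, -(-bdp_bytes // buffer_size))


# Illustrative link: 100 Mbit/s (12,500,000 bytes/s), 80 ms RTT, 32 KiB buffers.
print(requests_to_fill_link(12_500_000, 80, 32768))  # 31
```

In practice the bandwidth and round-trip time would have to be measured during the connection, which is where the experimentation mentioned above comes in.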


    Phase 10:


    Research and implement support for resuming file transfers, using djm's existing patch for downloads.


    Phase 11:


    Write the documentation for all the previous work, create regression tests for sftp's new functionality, and test it for compatibility with scp.


    If there's time left, close as many relevant Bugzilla items as possible, such as:

    Bug #430 - Could add option to sftp-server to disable write access

    Bug #831 - Allow agent forwarding in sftp & scp

    Tuesday, May 12, 2009

    Phase 1 completed

    I'm happy to report that phase 1 has been completed and the patch submitted by e-mail. My Powerbook is now working really well with the full GNOME environment from ports-stable and a -current kernel, userland and xenocara. Everything seems to work, even the right button click, though there's no DRI on PowerPC. I had to apply the kernel workaround to avoid the Ultra-DMA mode downgrades on my disk, and use a US keyboard layout to be able to access the programming characters under GNOME, so if anyone out there has a Portuguese Powerbook layout for X.org/Xenocara, I'd be happy to hear about it.

    Cheers

    Tuesday, May 5, 2009

    Update on all the activity that happened so far

    This first status update reports on everything I have gone through so far in my project.

    I started working with the OpenSSH-portable CVS tree under Mac OS X. But since my work will be on the OpenBSD-current OpenSSH tree, I began the effort of installing OpenBSD-current on my only computer, an Apple Powerbook G4 17" 1.67 GHz.
    I'm using the OpenBSD macppc port and, due to the lack of PowerPC virtualization software, I decided to install the latest OpenBSD snapshot natively on my Mac.
    After all my data was safely backed up, I managed to live-resize my Mac OS X partition to make room for OpenBSD and installed it without any significant problems. It turned out later that the disk space allocated (3 GB for / and swap) wasn't enough for recompiling the whole system and xenocara and installing some ports, so I proceeded to downsize the Mac OS X partition again and reinstall OpenBSD.
    I have had one recurring problem since the beginning of running OpenBSD on my Powerbook: constant kernel messages about wd0 disk timeouts and mode downgrades, from Ultra-DMA mode 4 all the way down to PIO mode 4, exactly as described in this March 2008 bug report on the OpenBSD-bugs mailing list. In PIO mode 4, any heavy disk access, like a CVS update, causes an interrupt storm, and the CPU spends 75% of its time handling interrupts, as shown by top.
    This makes the whole system run very slowly, and after I updated the kernel, userland and xenocara to -current, the bug remained, with either the GENERIC kernel or a slimmed-down version. I also tried disabling the bwi driver and resetting the NVRAM at the Open Firmware boot prompt, but nothing helped.
    In the meantime I also installed some packages available in the snapshots/ directory of the mirrors, and compiled the remaining ones from ports -current, to run vim, cscope and the GNOME desktop and be more productive.
    Unfortunately, under GNOME the keyboard goes crazy and some keys fail to show up in the terminal. Every time I use GNOME (2.24) the keys seem to get 'stuck': after pressing Enter/Return once, GNOME continuously receives Enter/Returns, as if I were still pressing the key. Any attempt at using the keyboard under GNOME is useless, even though xev shows key press/release events normally.
    This behavior rarely happens in FVWM, so that's what I've mostly been using to work on OpenSSH's code.
    I had to fight with Xmodmap for a few more days to get my Powerbook's Portuguese keyboard recognized and be able to type basic programming characters like {} and []. I also spent some more time getting a useful xorg.conf.
    So right now I am going to try a hackish workaround for the Powerbook hard disk timeouts, until the issue is sorted out in -current, and I will soon get a virtual machine on a computer running VMware and install OpenBSD i386, to do most of my development there remotely over ssh.
    I expect to submit the first patch really soon now.

    Welcome!

    Welcome to the official blog documenting my work on OpenSSH's sftp(1) improvement project, sponsored by Google under the Summer of Code 2009 program.

    I hope to write status updates on a weekly basis, so everyone interested can closely follow my work developing this exciting project.

    Cheers!