New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
read larger chunks from sftp in stream wrapper #1452
base: master
Are you sure you want to change the base?
Conversation
@icewind1991 Hi Robin. I've not been active in this project for quite some time, so please bear with me. My understanding is that, essentially, you're introducing a buffer between (external) SFTP requests and (internal) PHP stream requests, correct? That seems to make sense to me. |
phpseclib/Net/SFTP/Stream.php
Outdated
if (strlen($this->buffer) < $count) { | ||
// read 80k chunks from sftp even though php will only request 8k chunk | ||
// this allows us to minimize the overhead of making sftp requests | ||
$chunk = $this->sftp->get($this->path, false, $this->pos, 81920); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make the 81920 a parameter? I think there is a mechanism for that on stream construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what bantu is talking about is this:
<?php
require __DIR__ . '/vendor/autoload.php';
use phpseclib3\Net\SFTP\Stream;
use phpseclib3\Net\SFTP;
Stream::register();
$sftp = new SFTP('website.com');
$sftp->login('username', 'password');
$context = [
'sftp' => ['sftp' => $sftp]
];
$context = stream_context_create($context);
$fp = fopen('sftp://dummy.com/home/vagrant/1mb', 'r', false, $context);
echo fread($fp, 10);
Other options are discussed at https://www.php.net/manual/en/wrappers.ssh2.php
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, "stream context" is what I was referring to.
For the most common use case this does indeed appear to speed things up a good amount. A much less common use case, however, would be slowed down by this. eg. randomly seeking / reading 1kb chunks of a file. That said, I noticed you targeted the master branch with this PR. Maybe 3.0 might not be a bad idea? I say that because long term I do want to do something like #655 (comment) , which could very well eliminate any speedups your PR provides. |
phpseclib/Net/SFTP/Stream.php
Outdated
} | ||
|
||
$result = substr($this->buffer, 0, $count); | ||
$this->buffer = substr($this->buffer, $count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nitpicky but... you could do $result = \phpseclib3\Common\Functions\Strings::shift($this->buffer, $count)
instead. Also, I'd do $this->buffer.= $chunk
instead of $this->buffer .= $chunk
.
What if
Does the stream still behave in accordance to its contracts? Pre-patch it would probably return the newly written data at chunk 2 and post-patch it would return stale data from the buffer. |
@bantu - good points. Maybe it'd just be easier if I implemented the ideas discussed in #655 (comment) . I could do that in the 1.0 / 2.0 / 3.0 / master branches. That's not a BC breaking change, after all. |
@terrafrost Maybe call the parameter something like "readahead length" and make it optional. This way Robin can use it for his use case. If the parameter is not supplied, it should behave as before. @icewind1991 What do you think? |
made the readahead length configurable and added cache invalidation logic to prevent reading stale chunks. |
In the buffering problem I described in #1452 (comment), I was (implicitly) considering two independent instances accessing the file. Your additional changes do not work in this case. Hence my suggestion for not changing the behavior in case the read ahead buffer length is not supplied. |
* @access public | ||
*/ | ||
private $buffer = ''; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this say @access private
?
Here's my idea for an alternate way to deal with this: https://github.com/terrafrost/phpseclib/tree/1.0-sftp-file-handling I'm thinking that this - and making What I've done can be used as a starting point. If no one else wants to take this task on I can do so in the fullness of time. |
This speeds up reading from the stream wrapper about 5x in my local testing, increasing the chunk size further doesn't seem to make a difference