Large POST requests result in SSL3_WRITE_PENDING from urllib3.contrib.pyopenssl#sendall #855

Closed
linusthe3rd opened this issue May 4, 2016 · 27 comments

Comments

@linusthe3rd

linusthe3rd commented May 4, 2016

I currently have a use case where I am sending a large set of data over HTTPS via a POST request. Specifically, I am trying to send a POST request with a body of ~2MB to Elasticsearch over an HTTPS URL.

When sending these types of requests, I get the following stack trace:

  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 188, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 160, in streaming_bulk
    for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 85, in _process_bulk_chunk
    resp = client.bulk('\n'.join(bulk_actions) + '\n', **kwargs)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 69, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 782, in bulk
    doc_type, '_bulk'), params=params, body=self._bulk_body(body))
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/elasticsearch/transport.py", line 307, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/elasticsearch/connection/http_requests.py", line 62, in perform_request
    response = self.session.request(method, url, data=body, timeout=timeout or self.timeout)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/requests/sessions.py", line 585, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/requests/adapters.py", line 403, in send
    timeout=timeout
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 578, in urlopen
    chunked=chunked)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 362, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1053, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1093, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 869, in send
    self.sock.sendall(data)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 256, in sendall
    sent = self._send_until_done(data[total_sent:total_sent + SSL_WRITE_BLOCKSIZE])
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 245, in _send_until_done
    return self.connection.send(data)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1274, in send
    self._raise_ssl_error(self._ssl, result)
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1187, in _raise_ssl_error
    _raise_current_error()
  File "/usr/local/.virtualenvs/sqs_to_es/lib/python2.7/site-packages/OpenSSL/_util.py", line 48, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'ssl3_write_pending', 'bad write retry')]

At first, I thought this issue was a result of #717, but that issue has been resolved in the version of requests I am using. After further research, I landed on this comment in an SO answer:

OpenSSL has very strict requirements about how writes can be retried -- specifically the buffer's address and contents must not be changed. When you retry a write, you must retry with the exact same buffer (the same contents are not sufficient and, of course, different contents is absolutely prohibited).

Using this clue, I added some print statements to urllib3.contrib.pyopenssl#_send_until_done and OpenSSL.SSL#send:

urllib3.contrib.pyopenssl#sendall

    def _send_until_done(self, data):
        print('_send_until_done')
        print('no loop')
        while True:
            print('loop')
            try:
                print(hex(id(data)))
                return self.connection.send(data)
            except OpenSSL.SSL.WantWriteError:
                _, wlist, _ = select.select([], [self.socket], [],
                                            self.socket.gettimeout())
                if not wlist:
                    raise timeout()
                continue

    def sendall(self, data):
        total_sent = 0
        while total_sent < len(data):
            sent = self._send_until_done(data[total_sent:total_sent + SSL_WRITE_BLOCKSIZE])
            total_sent += sent

OpenSSL.SSL#send

    def send(self, buf, flags=0):
        """
        Send data on the connection. NOTE: If you get one of the WantRead,
        WantWrite or WantX509Lookup exceptions on this, you have to call the
        method again with the SAME buffer.

        :param buf: The string, buffer or memoryview to send
        :param flags: (optional) Included for compatibility with the socket
                      API, the value is ignored
        :return: The number of bytes written
        """
        # Backward compatibility
        buf = _text_to_bytes_and_warn("buf", buf)
        print(hex(id(buf)))
        print("")
        print("")

        if isinstance(buf, _memoryview):
            buf = buf.tobytes()
        if isinstance(buf, _buffer):
            buf = str(buf)
        if not isinstance(buf, bytes):
            raise TypeError("data must be a memoryview, buffer or byte string")

        result = _lib.SSL_write(self._ssl, buf, len(buf))
        self._raise_ssl_error(self._ssl, result)
        return result
    write = send

After running this code, I realized that a specific buffered subset of my data was being retried, but the memory address of the buffered data was changing because of the call to buf = _text_to_bytes_and_warn("buf", buf) in OpenSSL.SSL#send. Here is the output I received:

_send_until_done
no loop
0x10a2bba20
loop
0x10a2bba20
0x7fdd6585f400


loop
0x10a2bba20
0x7fdd638f3000

As you can see in that output, the buffer of data in urllib3.contrib.pyopenssl#_send_until_done is correctly retried when a failure occurs, and it keeps the same memory address (0x10a2bba20). However, the memory address of the buffer in OpenSSL.SSL#send differs between the two attempts (0x7fdd6585f400 and 0x7fdd638f3000).

Considering this, it looks like OpenSSL.SSL#send is the wrong API to use in urllib3.contrib.pyopenssl#sendall. A potential solution is to use OpenSSL.SSL#sendall instead, which seems to account for large requests by chunking the data after _text_to_bytes_and_warn has been executed.

Relevant versions in use:

Python 2.7.10
requests 2.10.0
pyOpenSSL 0.15.1
urllib3 1.15.1

@linusthe3rd
Author

I'm currently working on a PR to potentially fix this, but I am not sure where the best place to add a new test case is: should it be added to with_dummyserver.test_https or with_dummyserver.test_socketlevel?

I ask because it looks like these specific API calls in pyopenssl.py are not being exercised directly, but I could definitely be missing something.

@Lukasa
Sponsor Contributor

Lukasa commented May 4, 2016

@strife25 _text_to_bytes_and_warn should only change your buffer if it's a unicode buffer, which you really shouldn't be sending. I'm open to having a change that throws a more descriptive error (e.g. TypeError), but really your code should not be sending unicode data.
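
To illustrate the point about the conversion step, here is a minimal sketch of why a text buffer breaks the "same buffer on retry" rule while a byte string does not. `to_bytes` is a hypothetical stand-in for the conversion, not pyOpenSSL's actual code, and the example assumes Python 3, where `str` plays the role of Python 2's `unicode`.

```python
def to_bytes(buf):
    """Hypothetical stand-in for a text-to-bytes compatibility shim."""
    if isinstance(buf, str):
        # Encoding allocates a brand-new bytes object on every call.
        return buf.encode("utf-8")
    # Bytes pass through untouched, so the caller's object is reused.
    return buf


payload_bytes = b"x" * 16384
payload_text = "x" * 16384

# Retrying with bytes hands the very same object to the lower layer.
same_on_retry = to_bytes(payload_bytes) is to_bytes(payload_bytes)

# Retrying with text produces a different object on each attempt,
# even though the contents are equal.
first, second = to_bytes(payload_text), to_bytes(payload_text)
differs_on_retry = first is not second

print(same_on_retry, differs_on_retry, first == second)
```

Equal contents at a different address are exactly what the quoted OpenSSL requirement rules out, which would explain why only text bodies trigger the error.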

@linusthe3rd
Author

That's what I just discovered as well!

A more descriptive error would help tremendously here, as I'm sure other people using elasticsearch-py may hit this issue (especially as people start using it on AWS over HTTPS), and it is not clear what is happening because of the many layers of libraries involved.

Thank you for the response and clarification.

@Lukasa
Sponsor Contributor

Lukasa commented May 4, 2016

@strife25 I'll happily review/guide on a PR to add that more descriptive error if you're interested in doing the work!

@haikuginger
Contributor

haikuginger commented May 4, 2016

In this case, the bytes are being passed directly, but do we know whether Requests is encoding JSON it produces to handle this case? If it's not known already, I can dig into that later and open a downstream issue or PR to handle it if necessary.

@Lukasa
Sponsor Contributor

Lukasa commented May 4, 2016

I think requests does not handle that appropriately @haikuginger.

@haikuginger
Contributor

Good to know @Lukasa; I'll have a look at that tonight.

@haikuginger
Contributor

@Lukasa, @shazow, what would you think of a PR that raises a TypeError (or derivative) when non-bytes objects are passed as request bodies? It would formalize the requirement and avoid arcane error messages, but it would also be a breaking change with an unclear level of impact on usage.
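
A rough sketch of what such a check might look like, purely for discussion. `ensure_bytes_body` is a hypothetical name, not an existing urllib3 function, and the example assumes Python 3 semantics (`str` is text, `bytes` is binary).

```python
def ensure_bytes_body(body):
    """Hypothetical check: reject text bodies before they reach the socket."""
    if body is None or isinstance(body, (bytes, bytearray, memoryview)):
        return body
    if isinstance(body, str):
        raise TypeError(
            "request body must be bytes, got text; "
            "encode it first, e.g. body.encode('utf-8')"
        )
    # Other objects (file-like bodies, iterables) are left to the caller.
    return body


try:
    ensure_bytes_body("some text payload")
except TypeError as exc:
    message = str(exc)
    print(message)
```

The trade-off raised above still applies: callers that currently rely on implicit coercion would start failing.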

@shazow
Member

shazow commented May 22, 2016

I worry that trying to enforce a specific type will hurt more than help, in case people are depending on scenarios where they pass in the wrong type that happens to coerce into the right format?

We could intercept TypeErrors and annotate them with more info but I'm not sure that's worth the effort? There are a lot of places where the wrong type yields explosions and it would be hard to decide where to draw the line or where it's most useful.

@Lukasa
Sponsor Contributor

Lukasa commented May 22, 2016

It's hard to say whether there is much value in being clearer about what the exception is here. We could definitely log, though, which would be helpful.

@deter3

deter3 commented Jan 9, 2017

In my case, the trigger is that I have over 2000 tags for 90 tweets being uploaded to Elasticsearch with the default chunk_size of 500, which produces the SSL3_WRITE_PENDING error. If only 1000 tags for 90 tweets are uploaded with the default chunk_size of 500, there is no problem.

I changed chunk_size to 30 when the total tag count is more than 1000, like below 👍

if tag_total_count > 1000:
    chunk_size = 30
else:
    chunk_size = 120
try:
    helpers.bulk(es, actions, stats_only=True, request_timeout=30, chunk_size=chunk_size)
except Exception as er:
    print er

The above solved the ('SSL routines', 'SSL3_WRITE_PENDING', 'bad write retry') errors. I hope it will be helpful for other people.
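
For readers wondering why a smaller chunk_size helps, here is a small sketch of the effect: fewer actions per chunk means a smaller request body per bulk call. `chunked` is a hypothetical helper written for this illustration, not the elasticsearch-py implementation, and the newline join mirrors the `'\n'.join(bulk_actions) + '\n'` line visible in the traceback.

```python
def chunked(actions, chunk_size):
    """Yield successive lists of at most chunk_size items from actions."""
    chunk = []
    for action in actions:
        chunk.append(action)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


# Made-up sample actions, serialized as one JSON line each.
actions = ['{"tag": %d}' % i for i in range(100)]

for size in (30, 100):
    body_sizes = [
        len(("\n".join(chunk) + "\n").encode("utf-8"))
        for chunk in chunked(actions, size)
    ]
    print(size, len(body_sizes), max(body_sizes))
```

This only shrinks each write, so it presumably lowers the chance of hitting a retry rather than removing the underlying cause.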

@jminuscula

Hey there,

I'm still experiencing this, although it looks like related issues were solved long ago. I'm also using elasticsearch to upload a bunch of documents; decreasing the number of documents in the batch definitely improves the error rate.

Is there anything I can do to help debug this?
Thanks.

requests: '2.14.2'
urllib3: '1.22'
OpenSSL: '17.2.0'
Linux ip-172-31-51-58 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux (AWS)

@Lukasa
Sponsor Contributor

Lukasa commented Jul 25, 2017

Are you experiencing it using Requests or urllib3?

@jminuscula

@Lukasa I'm using Requests through the elasticsearch library. Here's the full stacktrace:

Traceback (most recent call last):
  File "manage.py", line 9, in <module>   
    execute_from_command_line(sys.argv)   
  File "/home/project/.venv/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 351, in execute_from_command_line
    utility.execute()
  File "/home/project/.venv/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 343, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/project/.venv/local/lib/python2.7/site-packages/django/core/management/base.py", line 394, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/project/.venv/local/lib/python2.7/site-packages/django/core/management/base.py", line 445, in execute
    output = self.handle(*args, **options)
  File "/home/project/app/stats/management/commands/stats_push.py", line 99, in handle
    self.push_data()
  File "/home/project/app/stats/management/commands/stats_push.py", line 108, in push_data
    getattr(self, self.PUSH_FUNCTIONS[doc_type])()
  File "/home/project/app/stats/management/commands/stats_push.py", line 145, in push_messages
    self.push_messages_queue(stats_messages)
  File "/home/project/app/stats/management/commands/stats_push.py", line 191, in push_messages_queue
    self.service.push_all(StatsTypes.MESSAGE.value, messages)
  File "/home/project/app/stats/services.py", line 183, in push_all
    self.push_all_actions(actions)
  File "/home/project/app/stats/services.py", line 187, in push_all_actions
    helpers.bulk(self.service, actions)
  File "/home/project/.venv/src/elasticsearch/elasticsearch/helpers/__init__.py", line 190, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/home/project/.venv/src/elasticsearch/elasticsearch/helpers/__init__.py", line 162, in streaming_bulk
    for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
  File "/home/project/.venv/src/elasticsearch/elasticsearch/helpers/__init__.py", line 87, in _process_bulk_chunk
    resp = client.bulk('\n'.join(bulk_actions) + '\n', **kwargs)
  File "/home/project/.venv/src/elasticsearch/elasticsearch/client/utils.py", line 71, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/project/.venv/src/elasticsearch/elasticsearch/client/__init__.py", line 1096, in bulk
    doc_type, '_bulk'), params=params, body=self._bulk_body(body))
  File "/home/project/.venv/src/elasticsearch/elasticsearch/transport.py", line 327, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/home/project/.venv/src/elasticsearch/elasticsearch/connection/http_requests.py", line 74, in perform_request
    timeout=timeout or self.timeout)
  File "/home/project/.venv/local/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
    r = adapter.send(request, **kwargs)
  File "/home/project/.venv/local/lib/python2.7/site-packages/requests/adapters.py", line 438, in send
    timeout=timeout
  File "/home/project/.venv/local/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/home/project/.venv/local/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 356, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 840, in send
    self.sock.sendall(data)
  File "/home/project/.venv/local/lib/python2.7/site-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 313, in sendall
    sent = self._send_until_done(data[total_sent:total_sent + SSL_WRITE_BLOCKSIZE])
  File "/home/project/.venv/local/lib/python2.7/site-packages/requests/packages/urllib3/contrib/pyopenssl.py", line 301, in _send_until_done
    return self.connection.send(data)
  File "/home/project/.venv/local/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1540, in send
    self._raise_ssl_error(self._ssl, result)
  File "/home/project/.venv/local/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1456, in _raise_ssl_error
    _raise_current_error()
  File "/home/project/.venv/local/lib/python2.7/site-packages/OpenSSL/_util.py", line 54, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'SSL3_WRITE_PENDING', 'bad write retry')]

@Lukasa
Sponsor Contributor

Lukasa commented Jul 26, 2017

Your Requests version is probably too old to have this fix. Can you try updating it?

@jminuscula

@Lukasa thanks for looking into this. I've also tried with the latest Requests, but the stacktrace only changes the line numbers slightly:

$ python -c "import requests; print(requests.__version__)"
2.18.2
Traceback (most recent call last):
  File "manage.py", line 9, in <module>
    execute_from_command_line(sys.argv)
  File "/home/project/.venv/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 351, in execute_from_command_line
    utility.execute()
  File "/home/project/.venv/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 343, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/project/.venv/local/lib/python2.7/site-packages/django/core/management/base.py", line 394, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/project/.venv/local/lib/python2.7/site-packages/django/core/management/base.py", line 445, in execute
    output = self.handle(*args, **options)
  File "/home/project/app/stats/management/commands/stats_push.py", line 99, in handle
    self.push_data()
  File "/home/project/app/stats/management/commands/stats_push.py", line 108, in push_data
    getattr(self, self.PUSH_FUNCTIONS[doc_type])()
  File "/home/project/app/stats/management/commands/stats_push.py", line 145, in push_messages
    self.push_messages_queue(stats_messages)
  File "/home/project/app/stats/management/commands/stats_push.py", line 191, in push_messages_queue
    self.service.push_all(StatsTypes.MESSAGE.value, messages)
  File "/home/project/app/stats/services.py", line 183, in push_all
    self.push_all_actions(actions)
  File "/home/project/app/stats/services.py", line 187, in push_all_actions
    helpers.bulk(self.service, actions)
  File "/home/project/.venv/src/elasticsearch/elasticsearch/helpers/__init__.py", line 190, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "/home/project/.venv/src/elasticsearch/elasticsearch/helpers/__init__.py", line 162, in streaming_bulk
    for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
  File "/home/project/.venv/src/elasticsearch/elasticsearch/helpers/__init__.py", line 87, in _process_bulk_chunk
    resp = client.bulk('\n'.join(bulk_actions) + '\n', **kwargs)
  File "/home/project/.venv/src/elasticsearch/elasticsearch/client/utils.py", line 71, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/home/project/.venv/src/elasticsearch/elasticsearch/client/__init__.py", line 1096, in bulk
    doc_type, '_bulk'), params=params, body=self._bulk_body(body))
  File "/home/project/.venv/src/elasticsearch/elasticsearch/transport.py", line 327, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/home/project/.venv/src/elasticsearch/elasticsearch/connection/http_requests.py", line 74, in perform_request
    timeout=timeout or self.timeout)
  File "/home/project/.venv/local/lib/python2.7/site-packages/requests/sessions.py", line 612, in send
    r = adapter.send(request, **kwargs)
  File "/home/project/.venv/local/lib/python2.7/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/home/project/.venv/local/lib/python2.7/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/home/project/.venv/local/lib/python2.7/site-packages/urllib3/connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python2.7/httplib.py", line 1017, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 1051, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 1013, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 864, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 840, in send
    self.sock.sendall(data)
  File "/home/project/.venv/local/lib/python2.7/site-packages/urllib3/contrib/pyopenssl.py", line 316, in sendall
    sent = self._send_until_done(data[total_sent:total_sent + SSL_WRITE_BLOCKSIZE])
  File "/home/project/.venv/local/lib/python2.7/site-packages/urllib3/contrib/pyopenssl.py", line 304, in _send_until_done
    return self.connection.send(data)
  File "/home/project/.venv/local/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1540, in send
    self._raise_ssl_error(self._ssl, result)
  File "/home/project/.venv/local/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1456, in _raise_ssl_error
    _raise_current_error()
  File "/home/project/.venv/local/lib/python2.7/site-packages/OpenSSL/_util.py", line 54, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'SSL3_WRITE_PENDING', 'bad write retry')]

@Lukasa
Sponsor Contributor

Lukasa commented Jul 26, 2017

Hrm, this should be fine. Can you use pdb to catch this and check what the type of data is in _send_until_done?

@haikuginger
Contributor

If this is the same issue as before, I believe the fix we did in Requests was only for JSON generated by Requests; if the ES lib is generating its own Unicode body, then we weren't doing anything about it. Let's start throwing an appropriate warning when receiving a Unicode body and think about killing this footgun altogether in v2.
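
As a sketch of the softer option (warn rather than raise), something along these lines could work. `warn_on_text_body` is a hypothetical name for illustration, not an existing Requests or urllib3 API, and the example assumes Python 3, where `str` is the text type.

```python
import warnings


def warn_on_text_body(body):
    """Hypothetical hook: warn when a request body is text, not bytes."""
    if isinstance(body, str):
        warnings.warn(
            "request body is text and will be encoded implicitly; "
            "pass bytes instead",
            UserWarning,
            stacklevel=2,
        )
    return body


# Record warnings so the behavior can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_on_text_body("text payload")   # warns
    warn_on_text_body(b"bytes payload")  # silent

print(len(caught), caught[0].category.__name__)
```

A warning keeps existing callers working while still surfacing the problem, at the cost of being easy to overlook.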

@Lukasa
Sponsor Contributor

Lukasa commented Jul 26, 2017

pyopenssl should be emitting this warning.

@jminuscula

Here's what I got from the debugger

313     def sendall(self, data):
--> 314         total_sent = 0
315         while total_sent < len(data):

ipdb> type(data)
<type 'unicode'>
ipdb> len(data)
109819

ipdb> n
> /home/project/.venv/local/lib/python2.7/site-packages/urllib3/contrib/pyopenssl.py(315)sendall()
314         total_sent = 0
--> 315         while total_sent < len(data):
316             sent = self._send_until_done(data[total_sent:total_sent + SSL_WRITE_BLOCKSIZE])

ipdb> n
> /home/project/.venv/local/lib/python2.7/site-packages/urllib3/contrib/pyopenssl.py(316)sendall()
315         while total_sent < len(data):
--> 316             sent = self._send_until_done(data[total_sent:total_sent + SSL_WRITE_BLOCKSIZE])
317             total_sent += sent

ipdb> s
--Call--
> /home/project/.venv/local/lib/python2.7/site-packages/urllib3/contrib/pyopenssl.py(301)_send_until_done()
300 
--> 301     def _send_until_done(self, data):
302         while True:

ipdb> n

> /home/project/.venv/local/lib/python2.7/site-packages/urllib3/contrib/pyopenssl.py(302)_send_until_done()
301     def _send_until_done(self, data):
--> 302         while True:
303             try:

ipdb> type(data)
<type 'unicode'>
ipdb> len(data)
16384

@Lukasa
Sponsor Contributor

Lukasa commented Jul 26, 2017

Ok, yeah, this seems like an elasticsearch problem. You should be getting warnings from PyOpenSSL, why aren't you?

@jminuscula

jminuscula commented Jul 26, 2017

edit: nevermind, found it above in the thread 👍

hmm… I'm unsure about what's wrong with the unicode data. What warning should I be getting? @haikuginger could you please point me to the fix? I can try to fix this from within the ES library

@Lukasa
Sponsor Contributor

Lukasa commented Jul 26, 2017

Sending data of type unicode behaves badly: it'll get automatically encoded to some random encoding you don't get to choose. All data should be encoded manually by the user before it's passed to Requests.
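
A minimal sketch of encoding manually, assuming a JSON payload and UTF-8 as the chosen encoding. The document contents and header values are made up for illustration.

```python
import json

# Made-up document containing a non-ASCII character.
document = {"user": "jos\u00e9", "tags": ["a", "b"]}

# Serialize to text first, then choose the encoding explicitly.
body = json.dumps(document, ensure_ascii=False).encode("utf-8")

headers = {
    "Content-Type": "application/json; charset=utf-8",
    # The length is counted in bytes, not characters.
    "Content-Length": str(len(body)),
}

print(type(body).__name__, headers["Content-Length"])
```

The resulting `body` is a byte string, so it could be passed as `data=body` to Requests without any implicit encoding taking place.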

@jminuscula

There's something weird going on here. After inspecting the ES library, I see they are correctly encoding the body before performing the request:

https://github.com/elastic/elasticsearch-py/blob/5.0/elasticsearch/transport.py#L310

try:
    body = body.encode('utf-8')
except (UnicodeDecodeError, AttributeError):
    # bytes/str - no need to re-encode
    pass

At this point in the stacktrace, body is indeed <type 'str'>:

> /usr/lib/python2.7/httplib.py(1017)request()
   1016         """Send a complete request to the server."""
-> 1017         self._send_request(method, url, body, headers)
   1018 

ipdb> type(body)
<type 'str'>

Following the same trace, I see where the document is being converted to unicode again:

> /usr/lib/python2.7/httplib.py(849)_send_output()
    848 
--> 849     def _send_output(self, message_body=None):
    850         """Send the currently buffered request and clear the buffer.

ipdb> n
> /usr/lib/python2.7/httplib.py(855)_send_output()
    854         """
--> 855         self._buffer.extend(("", ""))
    856         msg = "\r\n".join(self._buffer)

ipdb> n
> /usr/lib/python2.7/httplib.py(856)_send_output()
    855         self._buffer.extend(("", ""))
--> 856         msg = "\r\n".join(self._buffer)
    857         del self._buffer[:]

ipdb> n
> /usr/lib/python2.7/httplib.py(857)_send_output()
    856         msg = "\r\n".join(self._buffer)
--> 857         del self._buffer[:]
    858         # If msg and message_body are sent in a single send() call,

ipdb> type(msg)
<type 'unicode'>

This msg is what is later passed to send and _send_until_done. So maybe the way urllib3 interacts with httplib is what's at fault?

@Lukasa
Sponsor Contributor

Lukasa commented Jul 27, 2017

You are probably providing a Unicode header somewhere, which is causing Python to automatically promote all the strings to Unicode. Want to check if you or ES are doing that?
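
One way to check for this is to scan the headers for anything that is not the interpreter's native `str` type. The helper below is a hypothetical debugging aid, not part of Requests or urllib3, and the sample headers are made up. On Python 2 it flags `unicode` names or values (the case described here); on Python 3 it flags `bytes` instead.

```python
def find_non_native_headers(headers):
    """Return names of headers whose name or value is not a native str."""
    suspicious = []
    for name, value in headers.items():
        if not isinstance(name, str) or not isinstance(value, str):
            suspicious.append(name)
    return suspicious


# Made-up headers; the second one has a value of the "wrong" type
# for Python 3 (bytes). On Python 2 a u"..." value would be flagged.
headers = {
    "Content-Type": "application/json",
    "X-Example-Auth": b"token",
}
offenders = find_non_native_headers(headers)
print(offenders)
```

On Python 2, a single flagged header is enough for the `"\r\n".join(self._buffer)` call shown in the debugger session to return `unicode`.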

@jminuscula

you're definitely right @Lukasa; the auth headers set by requests-aws4auth are provided as unicode, which causes the buffer to be joined as such. I've found some related issues:

psf/requests#3573
tedder/requests-aws4auth#24

I'll take the issue there. Thanks so much for your support!

@IvanLauLinTiong
Contributor

pyOpenSSL support is deprecated and will be removed in a future release, version 2.x (#2691).

@sethmlarson sethmlarson closed this as not planned Won't fix, can't repro, duplicate, stale Aug 11, 2022

8 participants