Bulk Helper Error([('SSL routines', 'ssl3_write_pending', 'bad write retry')]) #620

Closed
hadoopjax opened this issue Jul 14, 2017 · 5 comments

@hadoopjax

Hi,

I have a text document and am attempting to load it into an AWS Elasticsearch (v 5.3) index using Python 2.7. My workflow is pulling the document from S3, cleaning it up a bit (see code below) and pushing it to Elasticsearch. I receive the following error:

elasticsearch.exceptions.ConnectionError: ConnectionError([('SSL routines', 'ssl3_write_pending', 'bad write retry')]) caused by: Error([('SSL routines', 'ssl3_write_pending', 'bad write retry')])

My code is:

import re
import boto3  # needed for boto3.resource() below
from elasticsearch import Elasticsearch, helpers

# unicode mgmt
import sys
reload(sys)
sys.setdefaultencoding('utf8')

s3 = boto3.resource('s3')
bucket = s3.Bucket('somebucket')

# go get elasticsearch connection
from esconn import esconn
es = esconn()

def filing_text():
    for obj in bucket.objects.all():
        key = obj.key
        body = obj.get()['Body'].read()
        clean = body.strip()
        # collapse runs of whitespace into single spaces
        data_load = re.sub(r'\s+', ' ', clean)
        yield {'filing_type': 'afiletype', 'filing_text': data_load}

# bulk insert into index
helpers.bulk(es, filing_text(), index='myindex')

I haven't been able to track down the cause. I've tried separating the document I'm uploading into several pieces and it works fine as long as I don't do it all at once. I've played with varying settings for chunk_size and nothing seems to work if I try to do the whole document all at once.
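For reference, the "several pieces" workaround mentioned above could look roughly like the sketch below. It is only an illustration, not the code actually used: `split_text`, `filing_actions`, the `part` field and the `max_chars` value are all hypothetical names, and only `filing_type` / `filing_text` come from the snippet in this post.

```python
def split_text(text, max_chars=100000):
    """Yield consecutive slices of text, each at most max_chars long."""
    for start in range(0, len(text), max_chars):
        yield text[start:start + max_chars]


def filing_actions(text, filing_type="afiletype", max_chars=100000):
    """Yield one bulk action per piece instead of one action for the whole text.

    The 'part' field is a hypothetical addition so the pieces can be
    put back in order later; it is not in the original code.
    """
    for part, piece in enumerate(split_text(text, max_chars)):
        yield {"filing_type": filing_type, "filing_text": piece, "part": part}
```

The generator could then be handed to helpers.bulk(es, filing_actions(data_load), index='myindex') in place of filing_text(). Note that this indexes several smaller documents rather than one large one, so it avoids the error rather than explaining it.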

I pasted the document text here

@mihneadb

I'm seeing something similar. Regarding chunk_size: it makes sense that it has an effect, since that is what triggers the SSL write retry, so the size itself is not the main issue. Another issue suggested sending str instead of unicode, but that doesn't appear to be the cause here either.

+1 to this.

@jminuscula

jminuscula commented Jul 26, 2017

The folks over at urllib3 have been helping debug this. It looks like the library is sending the wrong type of data. Please have a look at this thread:

urllib3/urllib3#855

@jminuscula

After much debugging, it turned out my issue was caused by unicode headers generated in requests_aws4auth (tedder/requests-aws4auth#24).
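In case it helps anyone checking for the same thing: a minimal sketch of coercing header names and values to the interpreter's native str type before the request is sent. My understanding (an assumption, not something confirmed in this thread) is that on Python 2 a unicode header can turn the whole outgoing request into unicode before it reaches the SSL layer. `to_native_str` and `native_headers` are hypothetical helper names, and the utf-8 default is also an assumption; neither is part of requests_aws4auth or elasticsearch-py.

```python
import sys

PY2 = sys.version_info[0] == 2


def to_native_str(value, encoding="utf-8"):
    """Return value as the native str type; non-string values pass through."""
    if PY2:
        # Python 2: native str is bytes, so encode unicode objects.
        if isinstance(value, unicode):  # noqa: F821 (name only exists on Python 2)
            return value.encode(encoding)
        return value
    # Python 3: native str is text, so decode bytes objects.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def native_headers(headers):
    """Return a new dict with every header name and value coerced to native str."""
    return {to_native_str(k): to_native_str(v) for k, v in headers.items()}
```

A quick check such as any(type(v) is not str for v in headers.values()) on the headers your auth layer produces would show whether you are hitting the same case.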

@hadoopjax could that be your issue too?

Anyway, sorry for the noise!

@fxdgear
Contributor

fxdgear commented Oct 12, 2017

@hadoopjax Curious if you're still seeing issues.

@fxdgear
Contributor

fxdgear commented Oct 25, 2017

@hadoopjax are you still having issues? Please see the comment from @jminuscula #620 (comment)

If you are still having issues please feel free to open a ticket.
