Splunk SDK "output_mode : json" - decode('utf-8', 'xmlcharrefreplace'), match) #285

Closed
sidsinhad opened this issue Oct 10, 2019 · 5 comments

Comments

@sidsinhad

I am trying to export Splunk search results in JSON format using the Splunk SDK.
Below is the code I am using. It works when output_mode is csv, but when I use json it fails with the error shown below.

        job = service.jobs.create(searchquery, **{"exec_mode": "blocking",
                                                  "earliest_time": default_timeline,
                                                  "latest_time": "now",
                                                  "output_mode": "json",
                                                  "maxEvents": 10000000})
        offset = 0
        count = 10000
        thru_counter = 0
        result_count = int(job["resultCount"])

        if result_count == 0:
            print("No results found for the above search query")
            return False

        while offset < result_count:
            kwargs_paginate = {"count": count, "offset": offset, "output_mode": "json"}
            rs = job.results(**kwargs_paginate)
            output = rs.read()  # read once; the stream is consumed after this
            print(output)
            offset += count  # advance to the next page

Below error:

    "maxEvents": 10000000})
  File "/Library/Python/2.7/site-packages/splunklib/client.py", line 2944, in create
    sid = _load_sid(response)
  File "/Library/Python/2.7/site-packages/splunklib/client.py", line 228, in _load_sid
    return _load_atom(response).response.sid
  File "/Library/Python/2.7/site-packages/splunklib/client.py", line 203, in _load_atom
    .decode('utf-8', 'xmlcharrefreplace'), match)
  File "/Library/Python/2.7/site-packages/splunklib/data.py", line 85, in load
    root = XML(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1311, in XML
    parser.feed(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1659, in feed
    self._raiseerror(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1523, in _raiseerror
    raise err
ParseError: not well-formed (invalid token): line 1, column 0

I updated the SDK to version 1.6.11, but the error is still the same.

@kashi333

ok.

@ghost

ghost commented Nov 22, 2019

_load_sid(response) does a simple _load_atom(response), assuming that everything is XML.
However, job creation follows the output_mode, so the response is actually JSON in this case, e.g.:

    b'{"sid":"1574422392.8448_AAB56AC3-E7E0-4CF6-A072-3CB90850813D"}'

I've hacked around it for now myself by doing:

    import json

    def _load_sid(response, output_mode="xml"):
        # Job creation follows output_mode, so the body may be JSON rather than XML.
        if output_mode.lower().startswith('json'):
            response_json = response.body.read().decode("utf-8")
            return json.loads(response_json)["sid"]
        return _load_atom(response).response.sid

and changing the two call sites to:

    sid = _load_sid(response, output_mode=kwargs.get('output_mode', 'xml'))
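
For anyone who wants to see the mismatch outside the SDK, here is a minimal standalone sketch. It is not the SDK's code and not the patch above: `parse_sid` is a hypothetical helper that tries XML first and falls back to JSON when the XML parser rejects the body. It assumes the XML response has the shape `<response><sid>...</sid></response>`, which is what `_load_atom(response).response.sid` suggests.

```python
import json
import xml.etree.ElementTree as ET


def parse_sid(body):
    """Return the sid from a job-creation response body (bytes), XML or JSON.

    Hypothetical helper for illustration only; not part of splunklib.
    """
    try:
        # Assumed XML shape: <response><sid>...</sid></response>
        root = ET.XML(body)
    except ET.ParseError:
        # A JSON body starts with "{", which the XML parser rejects,
        # so fall back to reading the sid from the JSON object.
        return json.loads(body.decode("utf-8"))["sid"]
    return root.findtext("sid")
```

Unlike the `output_mode` argument used in the hack above, this version sniffs the format from the body itself, so the call sites would not need to change. The trade-off is that a genuinely malformed XML response would surface as a JSON error instead.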

@ghost

ghost commented Nov 22, 2019

And then I realized that the results() call comes back with XML by default.
So I'd recommend: leave the code for the search (job creation) alone, and specify the output mode only when fetching results, i.e. job.results(output_mode='json').
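
A rough sketch of that recommendation, assuming the job was created without output_mode (for example `service.jobs.create(searchquery, exec_mode="blocking")`). `fetch_all_results` is a hypothetical helper name, and the sketch assumes the JSON results payload carries its rows under a top-level "results" key:

```python
import json


def fetch_all_results(job, page_size=10000):
    """Collect every result row from a finished job, one JSON page at a time.

    Hypothetical helper; `job` is expected to behave like a splunklib Job.
    """
    total = int(job["resultCount"])
    rows = []
    offset = 0
    while offset < total:
        # output_mode is passed to results(), not to jobs.create().
        reader = job.results(count=page_size, offset=offset, output_mode="json")
        payload = reader.read()
        if isinstance(payload, bytes):
            payload = payload.decode("utf-8")
        # Assumption: rows live under the "results" key of the payload.
        page = json.loads(payload).get("results", [])
        if not page:
            break  # avoid looping forever if the server returns an empty page
        rows.extend(page)
        offset += len(page)
    return rows
```

With a real connection this would be called as `rows = fetch_all_results(job)`. Advancing the offset by the number of rows actually returned, rather than by `page_size`, keeps the loop correct if the server caps a page below the requested count.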

@ashah-splunk
Contributor

@sidsinhad we have addressed this issue, and the fix will be available in the next release.
PR for reference: #447
Please let us know if you are still facing the issue.

@ashah-splunk
Contributor

@sidsinhad please use the latest SDK release. We have implemented the fix, and it is available in that release. Please let us know if you still face the issue.
