Splunk SDK "output_mode : json" - decode('utf-8', 'xmlcharrefreplace'), match) #285

sidsinhad · 2019-10-10T07:58:17Z

I am trying to export Splunk search results in JSON format using the Splunk SDK.
Below is the code I am using. It works when output_mode is csv, but with json it fails with the error shown below.

        job = service.jobs.create(searchquery, **{"exec_mode": "blocking",
                                                  "earliest_time": default_timeline,
                                                  "latest_time": "now",
                                                  "output_mode": "json",
                                                  "maxEvents": 10000000})
        offset = 0
        count = 10000
        thru_counter = 0
        resultCount = int(job["resultCount"])

        if resultCount == 0:
            print("No Results Found for the above searchquery")
            return False

        # Fetch the results one page at a time.
        while offset < resultCount:
            kwargs_paginate = {"count": count, "offset": offset, "output_mode": "json"}
            rs = job.results(**kwargs_paginate)
            output = rs.read()  # read once; a second read() would return nothing
            print(output)
            offset += count

Below error:

    "maxEvents": 10000000})
  File "/Library/Python/2.7/site-packages/splunklib/client.py", line 2944, in create
    sid = _load_sid(response)
  File "/Library/Python/2.7/site-packages/splunklib/client.py", line 228, in _load_sid
    return _load_atom(response).response.sid
  File "/Library/Python/2.7/site-packages/splunklib/client.py", line 203, in _load_atom
    .decode('utf-8', 'xmlcharrefreplace'), match)
  File "/Library/Python/2.7/site-packages/splunklib/data.py", line 85, in load
    root = XML(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1311, in XML
    parser.feed(text)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1659, in feed
    self._raiseerror(v)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/xml/etree/ElementTree.py", line 1523, in _raiseerror
    raise err
ParseError: not well-formed (invalid token): line 1, column 0

I updated the SDK to version 1.6.11, but the error is still the same.


kashi333 · 2019-10-19T18:50:43Z

ok.

ghost · 2019-11-22T11:55:57Z

_load_sid(response) simply calls _load_atom(response), assuming that every response is XML.
However, job creation honors output_mode, so the response is actually JSON in this case, e.g.:

    b'{"sid":"1574422392.8448_AAB56AC3-E7E0-4CF6-A072-3CB90850813D"}'
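The mismatch described above can be reproduced without a Splunk server. The following is a minimal sketch in Python 3 using only the standard library, not the SDK's own code path: it feeds the JSON body quoted above to an XML parser and then to a JSON parser. The helper name `try_parse_xml` is made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# JSON body of the kind returned by job creation when output_mode is json
# (sid value copied from the comment above).
body = b'{"sid":"1574422392.8448_AAB56AC3-E7E0-4CF6-A072-3CB90850813D"}'


def try_parse_xml(raw):
    """Hypothetical helper: return None if raw parses as XML, else the error text."""
    try:
        ET.XML(raw)
        return None
    except ET.ParseError as err:
        return str(err)


# The XML parser rejects the body at its first character ...
xml_error = try_parse_xml(body)
# ... while a JSON parser reads the sid without trouble.
sid = json.loads(body.decode("utf-8"))["sid"]

print(xml_error)
print(sid)
```

The XML parser gives up on the very first byte, which matches the "line 1, column 0" position in the traceback.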

I've hacked around it for now myself by doing:

    # Requires `import json` at the top of client.py.
    def _load_sid(response, output_mode="xml"):
        if output_mode.lower().startswith('json'):
            response_json = response.body.read().decode("utf-8")
            return json.loads(response_json)["sid"]
        return _load_atom(response).response.sid

and changing the two call sites to:

    sid = _load_sid(response, output_mode=kwargs.get('output_mode', 'xml'))
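An alternative to threading output_mode through the call sites is to inspect the body itself. The sketch below is a standalone illustration in Python 3, not the SDK's code and not necessarily what any later fix does. `extract_sid` is a hypothetical name, and the XML shape `<response><sid>...</sid></response>` is an assumption about what job creation returns in the default mode.

```python
import json
import xml.etree.ElementTree as ET


def extract_sid(body):
    """Hypothetical helper: return the sid from a job-creation response body (bytes).

    Assumes the body is either JSON of the form {"sid": "..."} or XML of the
    form <response><sid>...</sid></response>.
    """
    # A JSON object starts with "{"; anything else is treated as XML.
    if body.lstrip()[:1] == b"{":
        return json.loads(body.decode("utf-8"))["sid"]
    root = ET.fromstring(body)
    sid = root.findtext("sid")
    if sid is None:
        raise ValueError("no <sid> element in response")
    return sid
```

Sniffing the first byte keeps the function signature unchanged, at the cost of guessing the format instead of being told it.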

ghost · 2019-11-22T13:52:13Z

I then realized that the results() call returns XML by default.
So I'd recommend leaving the search code unchanged (no output_mode when creating the job) and requesting JSON only when fetching the results: job.results(output_mode='json')
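This recommendation can be combined with the pagination from the original post, roughly as follows. It is a sketch in Python 3 under stated assumptions: `iter_json_results` is a hypothetical helper, `job` is any object that behaves like the SDK's job (supports `job["resultCount"]` and `job.results(...)` returning a readable stream), and each JSON page is assumed to carry its rows under a top-level "results" key.

```python
import json


def iter_json_results(job, page_size=10000):
    """Hypothetical helper: yield result rows from a finished job, page by page."""
    total = int(job["resultCount"])
    offset = 0
    while offset < total:
        # Ask for JSON here, on the results call, not when creating the job.
        stream = job.results(count=page_size, offset=offset, output_mode="json")
        page = json.loads(stream.read())
        rows = page.get("results", [])  # assumed key for the rows of a page
        if not rows:
            break  # stop instead of looping forever on an empty page
        for row in rows:
            yield row
        offset += len(rows)
```

With a real connection, the job would be created without output_mode, for example `service.jobs.create(searchquery, exec_mode="blocking")`, and then passed to this function.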

ashah-splunk · 2022-04-29T05:22:06Z

@sidsinhad We have addressed this issue, and the fix will be available in the next release.
PR for reference: #447
Please let us know if you are still facing the issue.

ashah-splunk · 2022-07-12T12:19:19Z

@sidsinhad Please use the latest SDK release; the fix has been implemented and is available there. Please let us know if you still face the issue.

ncanumalla-splunk added the enhancement label Oct 6, 2021

vmalaviya-splunk mentioned this issue Jun 6, 2022

Release/1.6.20 #461

Merged

ashah-splunk closed this as completed Jul 12, 2022
