Add json "strict" parameter to CoreNLP #2993

james-huang · 2022-05-04T06:45:38Z

This allows the (optional) processing of text control characters without raising errors.

Attempting to process text with control characters like the vertical tab \x0b/\v causes the following error:

from nltk.parse.corenlp import CoreNLPParser
CORENLP_PARSER = CoreNLPParser(url='http://localhost:9000/')

CORENLP_PARSER.api_call(
    'Hello\x0bWorld!',
    properties={
        'annotators': 'ssplit',
        'tokenize.language': 'en',
    }
)

# JSONDecodeError
# Invalid control character at: line x column y (char z)

A simple fix is to allow processing of text that does not follow the strict JSON spec.
This is done by passing strict=False.
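To illustrate the effect outside of CoreNLP, here is a minimal sketch using only the standard library's json.loads. The response body and the decode helper are hypothetical and are not part of this PR; they only show how the strict flag changes the handling of a raw vertical tab inside a JSON string.

```python
import json

# Hypothetical response body: a JSON object whose string value contains
# a raw (unescaped) vertical tab, which strict JSON does not allow.
raw_body = '{"text": "Hello\x0bWorld!"}'


def decode(body, strict=True):
    """Decode a JSON document.

    With strict=False, raw control characters inside strings are accepted.
    This helper is illustrative only; it is not the NLTK implementation.
    """
    return json.loads(body, strict=strict)


# Strict decoding (the default) rejects the control character.
try:
    decode(raw_body)
    strict_failed = False
except json.JSONDecodeError:
    strict_failed = True

# Non-strict decoding keeps the character as-is.
lenient = decode(raw_body, strict=False)
```

Since requests.Response.json forwards keyword arguments to the underlying JSON decoder [1], the same flag should apply to the response returned by the CoreNLP server.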

CoreNLP deals in strings and doesn't really care about the JSON spec.
The commit maintains backwards compatibility by keeping strict decoding as the default.
There may even be an argument for making the default non-strict.

[1] https://docs.python-requests.org/en/latest/api/#requests.Response.json
[2] https://docs.python.org/3/library/json.html#json.loads
[3] https://docs.python.org/3/library/json.html#json.JSONDecoder


stevenbird · 2022-05-13T20:05:37Z

Thanks @james-huang


Add json "strict" parameter to CoreNLP

6e50b43


stevenbird merged commit 11d36fb into nltk:develop May 13, 2022

tomaarsen added a commit to tomaarsen/nltk that referenced this pull request Sep 1, 2022

nltk#2993 introduced strict_json in one class, and tries to use it in…

1003e5f


tomaarsen mentioned this pull request Sep 1, 2022

Resolve CoreNLP Regression #3043

Merged
