Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MySQL] [playhouse] [JSONField] contains() on the JSONField causes error: Incorrect arguments to ESCAPE #2609

Closed
recipe opened this issue Aug 25, 2022 · 5 comments

Comments

@recipe
Copy link

recipe commented Aug 25, 2022

Having the following script that replicates the issue:

#!/usr/bin/env python

from peewee import *
from playhouse import mysql_ext

db = MySQLDatabase(
    "db", host="localhost", port=3306, user="user", password="password"
)

class t1(Model):
    id = IntegerField(primary_key=True)
    name = CharField(255) 
    message = mysql_ext.JSONField() 

    class Meta:
        database = db
        db_table = "t1"

db.connect()
db.create_tables([t1])
t1.truncate_table()
t1.insert_many(
    [
        {"id": 1, "name": "bar", "message": ["a \\ b"]},
        {"id": 2, "name": "bar", "message": ["a \\\\ b"]},
    ]
).execute()

key = "\\"

query = t1.select(t1.name).where(t1.name.contains(key) | t1.message.contains(key))

print(query) # SELECT `t1`.`name` FROM `t1` AS `t1` WHERE ((`t1`.`name` LIKE '%\\%' ESCAPE '\') OR (`t1`.`message` LIKE '"%\\\\%"' ESCAPE '"\\"'))

for r in query: # peewee.OperationalError: (1210, 'Incorrect arguments to ESCAPE')
    print(r)
    

It produces invalid query:

SELECT `t1`.`name` FROM `t1` AS `t1` WHERE ((`t1`.`name` LIKE '%\\\\%' ESCAPE '\\') OR (`t1`.`message` LIKE '\"%\\\\%\"' ESCAPE '\"\\\\\"'))

We are currently using the version 3.14.4, however the same issue is in the latest.

pip freeze | grep peewee
peewee==3.15.1
@coleifer
Copy link
Owner

The query seems to work fine for the name column but it does not appear to be working with the JSON type.

Reg.insert_many(
    [
        {"id": 1, "name": "bar \\ 1", "message": ["a \\ b"]},
        {"id": 2, "name": "bar", "message": ["a \\\\ b"]},
    ]
).execute()

key = "\\"

# This query against the name column works fine.
query = Reg.select(Reg.name).where(Reg.name.contains(key))
print('name like')
for r in query:
    print(r.name)

# This query against the message (json) column does not.
query = Reg.select(Reg.name).where(Reg.message.contains(key))
print('message like')
for r in query: # peewee.OperationalError: (1210, 'Incorrect arguments to ESCAPE')
    print(r.name)

Furthermore, I think using LIKE against a JSON data-type is probably not something advisable. Are you using MySQL from Oracle, or MariaDB? It might be better in any case to use the built-in JSON functions or extract the text value from the JSON object and query it that way.

# This works just fine:
query = Reg.select(Reg.name).where(fn.json_extract(Reg.message, '$[0]').contains(key))

# SELECT `t1`.`name` FROM `reg` AS `t1`
# WHERE (json_extract(`t1`.`message`, '$[0]') LIKE '%\\%' ESCAPE '\')
for r in query: # works fine.
    print(r.name)

Basically, I think you're misusing the JSON data-type and should instead either:

  1. Use a plain TEXT / LONGTEXT column to store the json data, or
  2. Use the JSON functions to extract the text value first, then do the LIKE

@recipe
Copy link
Author

recipe commented Aug 26, 2022

@coleifer Using the LIKE with JSON field in the MySQL is still a valid solution when we want to include keywords in some generic search functionality, especially when we have to find bad characters in heterogeneous JSON documents by specific signature to replace in the database migrations. My example above was simplified just for the test.

@recipe
Copy link
Author

recipe commented Aug 26, 2022

In addition to this if we even used the TEXT field to keep JSON data we would write the following:

#!/usr/bin/env python

import json
from peewee import *


db = MySQLDatabase(
    "db", host="localhost", port=3306, user="user", password="password"
)


class JsonTextField(TextField):
    def db_value(self, value):
        if value is not None:
            return json.dumps(value)

    def python_value(self, value):
        if value is not None:
            return json.loads(value)


class t1(Model):
    id = IntegerField(primary_key=True)
    name = CharField(255)
    message = JsonTextField()

    class Meta:
        database = db
        db_table = "t1"


db.connect()
db.create_tables([t1])
t1.truncate_table()
t1.insert_many(
    [
        {"id": 1, "name": "bar", "message": ["a \\ b"]},
        {"id": 2, "name": "bar", "message": ["a \\\\ b"]},
    ]
).execute()

key = "\\"

query = t1.select(t1.name).where(t1.name.contains(key) | t1.message.contains(key))

print(query)
for r in query:
    print(r)

It produces the same error:

SELECT `t1`.`name` FROM `t1` AS `t1` WHERE ((`t1`.`name` LIKE '%\\\\%' ESCAPE '\\') OR (`t1`.`message` LIKE '\"%\\\\\\\\%\"' ESCAPE '\"\\\\\"'))

@coleifer coleifer reopened this Aug 26, 2022
@coleifer
Copy link
Owner

Aha, thank you for your persistence -- I spotted the bug. The escape sequence is being handled by the Json converter, which dumps these strings as json and creates invalid escapes.

This should be fixed in master now.

@coleifer
Copy link
Owner

3.15.2 released now, which contains this fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants