(panic) Preloading does not work with many values due to using IN #283
Comments
hi, thank you for submitting this issue
I'm not an expert as well, but I think
As for other ORMs, the Preload function in REL is inspired by ActiveRecord (which also produces two queries), and other ORMs such as GORM do the same. I think loading using
Another solution, I think, is to split it into multiple queries when the number of ids is large 🤔 |
I think it differs a lot between DBMSs, depending on how much they cache, whether the select is transformed into ORs anyway, and so on. I also don't know whether multiple selects are better than a single IN() OR IN(). Another thing that bothers me is that it does a hard panic instead of returning an error explaining that the preload did not work. It was a bit hard to find the cause. I think I will investigate other ORMs a bit in this regard. Maybe I will find some interesting techniques. |
Maybe we need to check other ORMs first to see whether they split it into multiple SELECT IN queries or use a single SELECT with IN OR IN.
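Until the library returns an error itself, a caller can turn such a panic into an error with `recover`. The following is only a minimal caller-side sketch: `safely` is a hypothetical helper, not part of REL, and the panic message is made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// safely runs fn and converts a panic into an ordinary error.
// Hypothetical helper for illustration; it is not part of REL.
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			// Wrap whatever was passed to panic() in a descriptive error.
			err = fmt.Errorf("preload failed: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	// The panic below stands in for a preload call that panics.
	err := safely(func() {
		panic(errors.New("too many values in IN clause"))
	})
	fmt.Println(err) // prints "preload failed: too many values in IN clause"
}
```

This only hides the symptom on the caller side; the preload still fails, it just fails with an error that can be logged and handled.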
Thank you, let me know if you need any help |
I found this: The oracle-enhanced adapter for Rails uses separate selects: I think for now I will just implement the multiple SELECT approach |
Note: In GORM I could not find any place where that case is handled. I think I will just have to build a test application for GORM as well to understand what it does... Edit: now that's interesting: it seems that MariaDB just silently returns no rows.
What do you think about this? |
Agreed, let's go with multiple selects 👍
Somehow, making the unexpected edge case panic paid off LOL |
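The multiple-SELECT approach agreed on above could be sketched roughly as follows. This is an illustration under assumptions, not REL's actual implementation: `chunkIDs`, `loadInChunks` and the `fetch` callback are hypothetical names, and `fetch` stands in for one `SELECT ... WHERE id IN (...)` round trip. The chunk size of 999 is taken from the limit reported in this issue.

```go
package main

import "fmt"

// chunkIDs splits ids into consecutive slices of at most size elements.
// A non-positive size puts all ids into a single chunk.
func chunkIDs(ids []int, size int) [][]int {
	if size <= 0 {
		size = len(ids)
	}
	var chunks [][]int
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		chunks = append(chunks, ids[start:end])
	}
	return chunks
}

// loadInChunks calls fetch once per chunk and concatenates the results.
// It stops at the first error instead of panicking.
func loadInChunks(ids []int, size int, fetch func([]int) ([]string, error)) ([]string, error) {
	var rows []string
	for _, chunk := range chunkIDs(ids, size) {
		part, err := fetch(chunk)
		if err != nil {
			return nil, err
		}
		rows = append(rows, part...)
	}
	return rows, nil
}

func main() {
	ids := make([]int, 2500)
	for i := range ids {
		ids[i] = i + 1
	}

	queries := 0
	// Fake fetch: a real one would run a SELECT with an IN list of the chunk.
	rows, err := loadInChunks(ids, 999, func(chunk []int) ([]string, error) {
		queries++
		out := make([]string, len(chunk))
		for i, id := range chunk {
			out[i] = fmt.Sprintf("row-%d", id)
		}
		return out, nil
	})
	fmt.Println(len(rows), queries, err) // prints "2500 3 <nil>"
}
```

With 2500 ids and a limit of 999 this issues three queries (999, 999 and 502 ids) instead of one oversized one.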
hi,
I noticed a problem with preloading due to utilizing IN-statements.
(maybe related: #104)
The problem
If you preload an entity and this results in more than 999 ids (on MariaDB) in the preload select, it fails with a rather unhelpful panic:
To help you to understand and debug the problem, I created a demo-repo which triggers exactly this problem:
https://github.com/aligator/rel-in
background knowledge
Many (if not most?) databases do not allow IN-Statements with many values.
e.g. https://stackoverflow.com/questions/8650324/how-many-values-in-an-in-clause-is-too-many-in-a-sql-query
There are some different ways to use IN which do not have these limitations. (may be different in some dbms)
IN statements (also a somewhat dirty workaround)
After all, an IN-statement is never the best solution if you have many values. It is also not always more performant than just doing a join. (dbms may differ, as they may optimize the queries in different ways)
--> The way preloading is currently implemented is usually not efficient, and it does not work in all cases.
Possible solutions
fast and easy, dirty fix
Just split the ids and concatenate multiple IN-statements with OR. The chunk size would be based on a maximum number of values that each adapter should provide, as this is dbms specific.
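As a rough sketch of this dirty fix: the helper below builds one WHERE fragment out of several IN lists joined by OR. `buildInClause` and its `maxValues` parameter are hypothetical names for illustration, the `?` placeholder style is an assumption (some databases use `$1`, `$2`, ...), and the column name is not escaped here.

```go
package main

import (
	"fmt"
	"strings"
)

// buildInClause returns a WHERE fragment such as
// "(id IN (?,?) OR id IN (?))" together with the flattened arguments.
// maxValues is the per-IN limit an adapter would report (assumption).
func buildInClause(column string, ids []int, maxValues int) (string, []interface{}) {
	if len(ids) == 0 {
		return "1=0", nil // no ids: a condition that matches nothing
	}
	if maxValues <= 0 {
		maxValues = len(ids)
	}
	var groups []string
	args := make([]interface{}, 0, len(ids))
	for start := 0; start < len(ids); start += maxValues {
		end := start + maxValues
		if end > len(ids) {
			end = len(ids)
		}
		// One "?" per id in this chunk, comma separated.
		placeholders := strings.TrimSuffix(strings.Repeat("?,", end-start), ",")
		groups = append(groups, fmt.Sprintf("%s IN (%s)", column, placeholders))
		for _, id := range ids[start:end] {
			args = append(args, id)
		}
	}
	return "(" + strings.Join(groups, " OR ") + ")", args
}

func main() {
	clause, args := buildInClause("user_id", []int{1, 2, 3, 4, 5}, 2)
	fmt.Println(clause)    // prints "(user_id IN (?,?) OR user_id IN (?,?) OR user_id IN (?))"
	fmt.Println(len(args)) // prints "5"
}
```

Note that this still sends every id in a single statement, so a database or driver that also limits the total number of bound parameters may reject it anyway.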
clean solution
There are some problems with the dirty-fix:
IN statement won't be very performant at all if you have many ids (e.g. 10000, 1000000, ...).
I am not an expert on dbms performance, so I cannot say whether a normal JOIN or an IN with a subselect is better. (That may also differ between dbms.)
I think an IN statement with the same select as the main select is easier to implement, however I also think that databases are much more optimized for normal JOINs, as they are built with them in mind.
Actually, for auto-preloading, JOINs should be much better as you save an extra db call. (if not stacked too deep)
Maybe it is a good idea to look at how well-known ORMs do joins and preloading before implementing a clean fix.
I hope I could explain the issue well enough.
What are your thoughts on this?
some links about IN statements (note: some of them are quite old and some dbms may optimize it, but the basics should still apply)
https://stackoverflow.com/questions/6219501/is-a-long-in-clause-a-code-smell
https://www.postgresql.org/message-id/1178821226.6034.63.camel@goldbach
https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:778625947169