EnforceDistribution fails, seems to turn all the types of the schema to UInt64 #10421
Comments
Thanks for the report @fabianmurariu. Is there any way we can get a self-contained reproducer? I ran the query in the description, but the tables it references are not all defined:
> WITH e1 AS (SELECT * FROM _default), e2 AS (SELECT * FROM _default), a AS (SELECT * FROM nodes), b AS (SELECT * FROM nodes), c AS (SELECT * FROM nodes) SELECT a.name, b.name, c.name FROM e1 JOIN a ON e1.src = a.id JOIN b ON e1.dst = b.id JOIN e2 ON b.id = e2.src JOIN c ON e2.dst = c.id WHERE e1.id <> e2.id;
Error during planning: table 'datafusion.public._default' not found
Next week I'll try to open source the code where this is happening.
Thanks @fabianmurariu. cc @mustafasrepo in case you have any thoughts.
The physical plans before and after the EnforceDistribution rule might help in locating the problem. You can use the get_plan_string helper to print this information. Putting prints at the start and at the end of
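One way to see the plans around the rule without adding prints is EXPLAIN VERBOSE, which in DataFusion lists the plan after each optimizer rule, including rows labelled like "physical_plan after EnforceDistribution". The following is a minimal sketch, assuming the _default and nodes tables (or the custom TableProviders) are already registered in the session. If the rule itself fails, the output may report the error instead of a later plan.

```sql
-- Assumption: _default and nodes are already registered in this session.
-- Use more than one partition so that EnforceDistribution has
-- repartitioning to consider (8 is only an illustrative value).
SET datafusion.execution.target_partitions = 8;

-- EXPLAIN VERBOSE emits one row per optimizer rule, so the plan just
-- before and just after EnforceDistribution can be compared directly.
EXPLAIN VERBOSE
WITH e1 AS (SELECT * FROM _default),
     e2 AS (SELECT * FROM _default),
     a  AS (SELECT * FROM nodes),
     b  AS (SELECT * FROM nodes),
     c  AS (SELECT * FROM nodes)
SELECT a.name, b.name, c.name
FROM e1
JOIN a  ON e1.src = a.id
JOIN b  ON e1.dst = b.id
JOIN e2 ON b.id = e2.src
JOIN c  ON e2.dst = c.id
WHERE e1.id <> e2.id;
```

Comparing the column types in the row just before the EnforceDistribution row with those in the row after it should show where the schema starts to differ.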
I have tried to reproduce the problem by defining only the absolutely necessary fields, using the statements below:
statement ok
CREATE TABLE IF NOT EXISTS _default (name VARCHAR, src BIGINT, dst BIGINT, id BIGINT) AS VALUES('mustafa', 1, 2, 0),('test', 2, 3, 1);
statement ok
CREATE TABLE IF NOT EXISTS nodes (name VARCHAR,id BIGINT) AS VALUES('TR', 1),('GR', 2);
statement ok
set datafusion.execution.target_partitions = 8;
query TTT
WITH e1 AS (SELECT * FROM _default),
e2 AS (SELECT * FROM _default),
a AS (SELECT * FROM nodes),
b AS (SELECT * FROM nodes),
c AS (SELECT * FROM nodes)
SELECT a.name, b.name, c.name
FROM e1 JOIN a ON e1.src = a.id JOIN b ON e1.dst = b.id JOIN e2 ON b.id = e2.src JOIN c ON e2.dst = c.id WHERE e1.id <> e2.id;
----
However, this test seems to pass. Unfortunately, I cannot debug further. After seeing the plans, or a full reproducer, I will take another look.
Strange, I'm encountering this with custom TableProviders. I'll be able to share more next week, though.
Describe the bug
This happens in 37; it works in 36.
EnforceDistribution fails with
"PhysicalOptimizer rule 'EnforceDistribution' failed, due to generate a different schema, original schema:
To Reproduce
Error