Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Min/Max on Datetimes result in interface{} as type #1574

Closed
Neokil opened this issue Apr 27, 2022 · 12 comments
Closed

Min/Max on Datetimes result in interface{} as type #1574

Neokil opened this issue Apr 27, 2022 · 12 comments
Labels
analyzer 📚 postgresql bug Something isn't working

Comments

@Neokil
Copy link

Neokil commented Apr 27, 2022

Version

Other

What happened?

Hint: I am using v1.13.0 (installed via brew) but that is not available in the form.

I currently have a query that looks like this:

SELECT COUNT(*) as NumOfActivities,
        MIN(event_time) as MinDate, 
        MAX(event_time) as MaxDate 
FROM activities 
WHERE account_id = $1

which creates a struct like this:

type GetActivityMetaDataForAccountRow struct {
	Numofactivities int64
	Mindate         interface{}
	Maxdate         interface{}
}

but I would expect

type GetActivityMetaDataForAccountRow struct {
	Numofactivities int64
	Mindate         time.Time
	Maxdate         time.Time
}

so it looks like after putting a date through an aggregate-function sqlc can no longer determine its type? but it seems only to be the case for datetimes as far as I can see it.

Relevant log output

No response

Database schema

CREATE TABLE activities (
    account_id BIGINT NOT NULL,
    event_time TIMESTAMP WITH TIME ZONE NOT NULL, 
);

SQL queries

SELECT COUNT(*) as NumOfActivities,
        MIN(event_time) as MinDate, 
        MAX(event_time) as MaxDate 
FROM activities 
WHERE account_id = $1

Configuration

No response

Playground URL

No response

What operating system are you using?

macOS

What database engines are you using?

PostgreSQL

What type of code are you generating?

Go

@Neokil Neokil added bug Something isn't working triage New issues that hasn't been reviewed labels Apr 27, 2022
@jose-zenledger
Copy link

Yeah I am experiencing this too. So then I changed it to cast the min or max so like:
MIN(event_time)::timestamp as MinDate. However it fails when it returns null.

I then thought maybe I could use nulltime as a custom type but and I can't figure out how to use a custom type for custom queries like this.

@Codelax
Copy link

Codelax commented May 16, 2022

I have this problem too. I would like to get *time.Time but I cannot get it when overriding type on aggregate.

@kdubovikov
Copy link

kdubovikov commented May 19, 2022

This problem reproduces not only for timestamp fields, but for int fields as well.

with t1 as (
	select
		id,
		min(col) as min_col
	from
		table1
	group by
		id
)
select
	t1.min_col
from
	t2
left outer join t1 on t2.id = t1.id

May be related to #1334

UPDATE I just built sqlc from the git to include all latest commits and this query still results into incorrect struct. min_col ends up being interface{}

UPDATE 2
Checked that this simple query also does not work

select
		id,
		min(col) as min_col -- col is an int column
from
		table1
group by
		id

@kdubovikov
Copy link

kdubovikov commented May 19, 2022

I think the problem lies in that definitions of min function in pg_catalog do not use Function.ReturnTypeIsNull field in any way. Here is the example: https://github.com/kyleconroy/sqlc/blob/996a73a27144a625080326f7340a11f837bd94fe/internal/engine/postgresql/pg_catalog.go#L14286

But here this field is the only thing considered when we are inferring if the argument of the function is nullable or not https://github.com/kyleconroy/sqlc/blob/996a73a27144a625080326f7340a11f837bd94fe/internal/compiler/output_columns.go#L219

To solve this, I think we need to make so that for some built-in functions the output type should be inferred from their argument.

UPDATE
Ok, so I found how we can fix this.
https://github.com/kyleconroy/sqlc/blob/996a73a27144a625080326f7340a11f837bd94fe/internal/compiler/output_columns.go#L208

Here, we can parse function parameters using outputColumnRefs and check if the argument type is nullable. Then, we can pass this info to Column structure. Minimalistic implementation that will work only for min looks like this (pardon for horrible code):

		case *ast.FuncCall:
			rel := n.Func
			name := rel.Name
			fmt.Printf("FUNC CALL %s\n", name)
			if res.Name != nil {
				name = *res.Name
			}
			fun, err := qc.catalog.ResolveFuncCall(n)

			if err == nil {
				// CHANGE HERE
				if len(fun.Args) == 1 && fun.Name == "min" {
					ref := n.Args.Items[0].(*ast.ColumnRef)
					columns, err := outputColumnRefs(res, tables, ref)
					if err != nil {
						panic(err)
					}
					fmt.Println(columns[0].Name)
					fmt.Println(columns[0].NotNull)
					fmt.Println(columns[0].Table.Name)

					cols = append(cols, &Column{
						Name:       name,
						DataType:   columns[0].DataType,
						NotNull:    columns[0].NotNull,
						IsFuncCall: true,
					})
				// END CHANGE
				} else {
					cols = append(cols, &Column{
						Name:       name,
						DataType:   dataType(fun.ReturnType),
						NotNull:    !fun.ReturnTypeNullable,
						IsFuncCall: true,
					})
				}
			} else {
				cols = append(cols, &Column{
					Name:       name,
					DataType:   "any",
					IsFuncCall: true,
				})
			}

The code above solves issue for group by queries using min function only.

The problem is I am unsure for which functions we should apply such type inference and for which not. That will be dependent on the database and likely should be implemented somewhere at the catalog level. Also, aggregation functions may have array parameters, so those cases should be handled correctly too: https://www.postgresql.org/docs/9.5/functions-aggregate.html

@kyleconroy , do you have an idea on how this issue should be handled? From the postgres doc it looks like all aggregation functions can return null, so maybe just mark all aggregation functions in catalogs for postgres, dolphin and sqlite? By that I mean to fill ReturnTypeNullable = true for every aggregation function. That would mean modifying sql-pc-gen somehow. And making manual edits for dolphin and sqlite catalogs.
For sql-pc-gen, I think this can be done by setting ReturnTypeNullable = true for all functions with prokind = 'a'. That is, for all aggregation functions.

@ryan-berger
Copy link
Contributor

ryan-berger commented May 24, 2022

I'm not the maintainer, but I can take a look at this soon because I hit this issue in our production project. I believe a refactor of this portion of the codebase just happened, so the preferred solution may change.

I was able to get around this for the time being by using:

COALESCE(extract(epoc from ....)), 0)

And then just using time.Unix(myIntColumn, 0) and using t.IsZero() to check for null times. Not ideal, but it will work until this issue gets closed.

@kdubovikov
Copy link

@ryan-berger I think I have solved the issue in the linked pull request, but I need a bit more help from maintainers to fix tests

@ryan-berger
Copy link
Contributor

ryan-berger commented May 24, 2022

@kdubovikov You are correct, it would just be sqlc-pg-gen that needs to change, and that the refactor didn't change much if Kyle goes with your other solution.

I was thinking about the sqlc-pg-gen a bit more:

For sql-pc-gen, I think this can be done by setting ReturnTypeNullable = true for all functions with prokind = 'a'. That is, for all aggregation functions.

And I realized that COUNT has a prokind of a, but cannot return a null value it will return 0 instead.

The Postgres docs say:

It should be noted that except for count, these functions return a null value when no rows are selected.

Looking at pg_catalog, I can't find any difference between MIN and COUNT that we could use to determine this, so we likely just need to make it a special case in the sql-pg-gen. I don't really like the idea of this though because it would be nice if we could determine nullability of the return type without manually annotating pg_catalog functions.

@kyleconroy any thoughts on this?

@kyleconroy kyleconroy removed the triage New issues that hasn't been reviewed label Jun 4, 2022
@kyleconroy
Copy link
Collaborator

A few thoughts. We should manually tag the aggregate functions that can return null. I honestly do do not know that min and max would return null.

As for type inference, it's a very large project to take on.

@jose-zenledger
Copy link

A few thoughts. We should manually tag the aggregate functions that can return null. I honestly do do not know that min and max would return null.

As for type inference, it's a very large project to take on.

I think it returns null if there are no rows in the set so:

SELECT MAX(amount) FROM sales WHERE product = 'something_that_does_not_exist';

@skabbes
Copy link
Contributor

skabbes commented Nov 27, 2022

I have summarized what has been discussed in this playground link:

https://play.sqlc.dev/p/cbe160df51a07333f3b91b2e13d9b81a0293c3c84f451e2f312e553cdc4ff81d

Bug 1
sqlc does not infer type correct for aggregated timestamp column

Workaround 1
Add cast to aggregate MAX(event_Time)::timestamptz

Bug 2
sqlc does not correctly infer the "nullability" of MIN / MAX

Workaround 2
Unfortunately, there is no workaround.

@kyleconroy I might use this opportunity to bring back up the "output nullability" idea I had a while ago, discussed here:
#1525.

I think this would allow developers to help the sqlc type inferrence / null-inference in complex cases. Here is an example of how it would look. Of course I agree that the type-inferrence, null-inferrence should be better, but bugs like this are showstoppers for many teams.

CREATE TABLE activities (
    account_id BIGINT NOT NULL,
    event_time_nullable TIMESTAMP WITH TIME ZONE
);

-- name: Summary :one
SELECT
  COUNT(*) as "NumOfActivities",
  sqlc.nonnull(MIN(event_time_nullable)) as "MinDateNullable"
FROM activities 
  WHERE account_id = $1 
  AND event_time_nullable is not null
HAVING COUNT(*) > 0;

@jose-zenledger @Neokil what do you think?

@jose-zenledger
Copy link

I have summarized what has been discussed in this playground link:

https://play.sqlc.dev/p/cbe160df51a07333f3b91b2e13d9b81a0293c3c84f451e2f312e553cdc4ff81d

Bug 1 sqlc does not infer type correct for aggregated timestamp column

Workaround 1 Add cast to aggregate MAX(event_Time)::timestamptz

Bug 2 sqlc does not correctly infer the "nullability" of MIN / MAX

Workaround 2 Unfortunately, there is no workaround.

@kyleconroy I might use this opportunity to bring back up the "output nullability" idea I had a while ago, discussed here: #1525.

I think this would allow developers to help the sqlc type inferrence / null-inference in complex cases. Here is an example of how it would look. Of course I agree that the type-inferrence, null-inferrence should be better, but bugs like this are showstoppers for many teams.

CREATE TABLE activities (
    account_id BIGINT NOT NULL,
    event_time_nullable TIMESTAMP WITH TIME ZONE
);

-- name: Summary :one
SELECT
  COUNT(*) as "NumOfActivities",
  sqlc.nonnull(MIN(event_time_nullable)) as "MinDateNullable"
FROM activities 
  WHERE account_id = $1 
  AND event_time_nullable is not null
HAVING COUNT(*) > 0;

@jose-zenledger @Neokil what do you think?

I like your solution, I did figure out a workaround for bug 2 though. You have to use with to assign a name to the table then you can override it in the sqlc.yaml

Query:

with agg as (select max(timestamp) as tstamp from prices where currency_id = $1 and source = $2)
select tstamp from agg;

Then in sqlc.yaml:

overrides:
  - column: "agg.tstamp"
    go_type: "github.com/zenledger-io/zazen/nullable.Time"
    nullable: true

kyleconroy pushed a commit that referenced this issue Jul 30, 2023
What is this

As the title said, this PR wants to add support for CAST function in MySQL.

This PR is based from PR by @ryanpbrewster here (which unfortunately he didn't send here, and only exist in his repository).
Why is this PR created

Currently sqlc unable to infer the correct type from SQL function like MAX, MIN, SUM, etc. For those function, sqlc will return its value as interface{}. This behavior can be seen in this playground.

As workaround, it advised to use CAST function to explicitly tell what is the type for that column, as mentioned in #1574.

Unfortunately, currently sqlc only support CAST function in PostgreSQL and not in MySQL. Thanks to this, right now MySQL users have to parse the interface{} manually, which is not really desirable.
What does this PR do?

    Implement convertFuncCast function for MySQL.
    Add better nil pointer check in some functions that related to convertFuncCast.

I haven't write any test because I'm not sure how and where to put it. However, as far as I know the code that handle ast.TypeCast for PostgreSQL also don't have any test, so I guess it's fine 🤷‍♂️
Related issues

Support CAST ... AS #687, which currently is the oldest MySQL issue that still opened.
Using MYSQL functions ( CONVERT and CAST) result in removing column from struct #1622
Unable to Type Alias #1866
sum in select result in model field type interface{} #1901
MIN() returns an interface{} #1965
kyleconroy added a commit that referenced this issue Oct 18, 2023
* fix(compiler): Pull in array information from analyzer
Fixes #1532
* test(analyzer): Add testcase for #1574
* test: Added test for #1634
* test: Add test case for #1646
* test: Add test for #1714
* Fixes #1912
* test: Add case for #1916
* test: Add two test cases
#1917
#1545
* test: Add case for #1979
* test: Add case for #1990
@kyleconroy
Copy link
Collaborator

This is fixed in v1.23.0 by enabling the database-backed query analyzer. We added a test case for this issue so it won’t break in the future.

You can play around with the working example on the playground

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
analyzer 📚 postgresql bug Something isn't working
Projects
None yet
Development

No branches or pull requests

7 participants