Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support COUNT of relevant rows #94

Closed
soedirgo opened this issue Sep 7, 2020 · 33 comments
Closed

feat: Support COUNT of relevant rows #94

soedirgo opened this issue Sep 7, 2020 · 33 comments
Assignees
Labels
enhancement New feature or request

Comments

@soedirgo
Copy link
Member

soedirgo commented Sep 7, 2020

Feature request

Is your feature request related to a problem? Please describe.

I'd like to perform COUNT(*) on relevant rows.

Describe the solution you'd like

A PostgrestTransformBuilder method. PostgREST supports this.

@stupidawesome
Copy link
Contributor

stupidawesome commented Oct 7, 2020

Postgrest reports counts in the Content-Range header. This has to be requested by setting Prefer: count={strategy} where {strategy} is one of exact, planned or estimated.

range() could be modified to accept a count option as this is useful for displaying search results with a paginator.

range(
        from: number,
        to: number,
        {
            count,
            foreignTable,
        }: {
            foreignTable?: string
            count?: "exact" | "estimated" | "planned"
        } = {},
    ): PostgrestTransformBuilder<T> {
        const keyOffset =
            typeof foreignTable === "undefined"
                ? "offset"
                : `"${foreignTable}".offset`
        const keyLimit =
            typeof foreignTable === "undefined"
                ? "limit"
                : `"${foreignTable}".limit`
        this.url.searchParams.set(keyOffset, `${from}`)
        // Range is inclusive, so add 1
        this.url.searchParams.set(keyLimit, `${to - from + 1}`)
        if (count) {
            this.headers["Prefer"] = `count=${count}`
        }
        return this
    }

Then in the fetch promise could return this as part of the response:

const contentRange = res.headers.get("Content-Range")
if (contentRange) {
    const parts = contentRange.split(/[-\/]/)
    // Note: this doesn't not handle responses like `0/*`
    range = {
        from: parseInt(parts[0], 10) + 1, // zero-based index
        to: parseInt(parts[1], 10) + 1, // zero-based index
        total: parseInt(parts[2], 10),
    }
}

return {
    range,
    data,
    status,
    ...etc
}

This would involve modifying the response format

@soedirgo
Copy link
Member Author

soedirgo commented Oct 9, 2020

Hey @stupidawesome! This issue is for getting just the COUNT using a HEAD request, as laid out here.

I'd like to keep the library simple, so we don't plan to expose the HTTP response headers.

@dshukertjr
Copy link
Member

Hi! First of all, thank you for creating such an amazing product. Supabase is what exactly I was looking for.

Is this issue moving forward? I would like to have a count feature in Supabase. If it's pending right now, I was thinking I could tackle this issue.

@soedirgo
Copy link
Member Author

soedirgo commented Jan 6, 2021

Thanks for hopping in @dshukertjr, this feature is long overdue...

I'm starting to think that it might be worth modifying the response format, adding a count field. I can't see any way of supporting this otherwise.

There are two problems here:

  1. getting a COUNT(*) with a HEAD request (no data), and
  2. getting a count of the selected/modified set of rows.

The solution I'm thinking of is similar to how we now handle select: (1) is done in PostgrestQueryBuilder (i.e. like select(), insert(), ...), while (2) is done in PostgrestTransformBuilder (i.e. like order(), range(), ...). So to do PostgREST's HEAD request we do:

const { count } = await postgrest
    .from('table')
    .count('exact') // or 'planned', or 'estimated'. What should be the default here?
    .gt('id', 10)

And to get a count of the selected/modified rows (this is useful for our table editor pagination) we do:

const { error, data, count } = await postgrest
    .from('table')
    .select()
    .gt('id', 10)
    .count('exact')

I'm not sure if we need from and to, I can't think of a case when these can't be easily computed with count and from (that the user specifies in range()).

Would love to hear your thoughts @kiwicopple @steve-chavez @thorwebdev

@kiwicopple
Copy link
Member

Just checking - the count() function is simply going to pluck the value out of the header right?

@soedirgo
Copy link
Member Author

soedirgo commented Jan 6, 2021

Yup, it just uses Content-Range, no additional query done.

@kiwicopple
Copy link
Member

kiwicopple commented Jan 6, 2021

OK cool. IMO it seems redundant to add a whole function when it's still fetching the results from the database just for the headers. This one seems a natural extension of select()

const { data, error, range, count } = supabase
  .from('table')
  .select('*', { count: 'exact' })

console.log(range) // [0, 24] (?)
console.log(count) // 1000

Does that work?

// or 'planned', or 'estimated'. What should be the default here?

It's a good question. We're seeing a lot of people confused by our "row counts" in the dashboard becuase they are estimates on the tuples. Even though it's slower, I think the principle of least surprise applies here - we should use exact as a default.

Once again this is breaking the PostgREST defaults, but we are seeing that people are coming in with small databases, then growing. In the time that it takes for them to grow their database, they can learn about performance optimizations like using planned. We could also add it as a param in the constructor which can be overridden.

wdyt?

@steve-chavez
Copy link
Member

const { error, data, count } = await postgrest
.from('table')
.select()
.gt('id', 10)
.count('exact')

I like the extra count field in the response since it would work for every operation.

Exact count is supported for insert/delete/update/rpc. And planned plus estimated might come to RPC. So the interface seems fit.

const { count } = await postgrest
.from('table')
.count('exact') // or 'planned', or 'estimated'. What should be the default here?
.gt('id', 10)

The problem with that interface is that it doesn't specify the operation. It assumes count only works for GET.

For HEAD, how about having an extra modifier that nulls the data and error fields?

const { error, data, count } = await postgrest
    .from('table')
    .select()
    .gt('id', 10)
    .count('exact')
    .head()

console.log(error, data, count) // null null 100

@steve-chavez
Copy link
Member

For HEAD, how about having an extra modifier that nulls the data and error fields?

Actually scratch that, HEAD only makes sense for select and rpc.

How about having head as parameters of both:

// for select
const { data, error, count } = postgrest
  .from('table')
  .select('*', { head: true })

// for rpc
const {data, error, count} = postgrest
  .rpc('get_status', { name_param: 'supabot' }, {head: true})

Both would null the data and error fields.

we should use exact as a default.

Agree with exact as default.

@soedirgo
Copy link
Member Author

soedirgo commented Jan 6, 2021

This one seems a natural extension of select()

Makes sense, since we already have a bunch of Prefer-related stuff in each respective CRUD method. Though with that approach, if we want exact to be the default, maybe we should also allow null for if we don't need the count? If the count is not very expensive this might be OK.

HEAD only makes sense for select and rpc.

Didn't know you can do HEAD on rpc. Sounds good to me.

@kiwicopple
Copy link
Member

yeah head is cool! Agree with this approach 👍

@dshukertjr
Copy link
Member

dshukertjr commented Jan 7, 2021

Correct me if I am talking gibberish, but with this method, would you be able to query count value of joined table?

For example, let's say you are developing a twitter clone, and you want to retrieve a users profile as well as the tweet count of the user to display them on a profile page. Something along the line of

const { data, error } = await postgrest
    .from('users')
    .select(`
        name,
        tweets (
          count(*) as count
        )
  `)

I guess I would have to send two request to achieve this huh? One to get the user profile, and one to get the tweet count.

@dshukertjr
Copy link
Member

dshukertjr commented Jan 7, 2021

Ignoring what I am saying about counts from foreign table, to summarize the discussion in this thread, it will look like this.

Getting count along with table data:

const { data, error, count } = await postgrest
    .from('table')
    .select('*')  
    .gt('id', 10)
    .count() // 'exact' by default. 'estimate', 'planned' and null are also accepted

Only getting the count:

const { data, error, count } = await postgrest
    .from('table')
    .select('*', {head: true})  
    .gt('id', 10)
    .count() // 'exact' by default. 'estimate', 'planned' and null are also accepted

console.log(data, error, count) // data and error are null. count contains row count. 

Do I understand it correctly?

@steve-chavez
Copy link
Member

Correct me if I am talking gibberish, but with this method, would you be able to query count value of joined table?

@dshukertjr No, this interface is just for counting the root table.

I guess I would have to send two request to achieve this huh?

Not necessarily, you can always create a VIEW or FUNCTION that has the aggregate to do it in one request.

There's also another option, that is currently undocumented in postgrest since it's not complete. Basically:

const { data, error } = await postgrest
    .from('users')
    .select(`
        name,
        tweets (
          count
        )
  `)

That will get the tweets count as you'd expect. However if you add more attributes(like tweets(content,date,count)) it will fail.

Calling other aggregate functions with group by is a future enhancement.

@dshukertjr
Copy link
Member

Thanks @steve-chavez for pointing me to the right direction.

Regarding this post, does this look like what everyone had in mind? I could try to write a pull request for this.

@soedirgo
Copy link
Member Author

soedirgo commented Jan 8, 2021

That'd be awesome @dshukertjr! 👍

The only change is to make count an option on select(), insert(), update(), delete(), and rpc() instead of a function, mostly because other similar stuff are also done as options (returning, onConflict).

@steve-chavez
Copy link
Member

I could try to write a pull request for this.

@dshukertjr Great! Go ahead!

In case it helps, here's an example of joining both the head and count ideas:

const { data, error, count } = postgrest
  .from('table')
  .select('*', { head: true, count: 'exact' })

Also agree with soedirgo's idea about count being null by default. There's a perf loss(16% loss in throughput with this setup) when adding count= exact. It would be wasteful to have it by default.

@dshukertjr
Copy link
Member

Thanks @soedirgo and @steve-chavez. I will add this feature and create a pull request.

@dshukertjr
Copy link
Member

@steve-chavez I guess I am a bit confused with what the head option does.

I do understand that it will turn data and error null, but how would it look like behind the scenes? Would postgrest-js still fetch the data and just not return the response body as data, or is there a flag or something to not get the response body to begin with?

@steve-chavez
Copy link
Member

@dshukertjr No, it would execute an HTTP HEAD request.

See how a GET method is specified here. You'd have to change that to HEAD conditionally.

@dshukertjr
Copy link
Member

@steve-chavez Got it! Thank you!

@dshukertjr
Copy link
Member

I may have a question regarding this.

Having head and count as an option for select makes sense to me, but having those options for all the other methods listed do not make sense to me. Could you provide a simple example situation of when you would want to use head or count options for method other than select()?

@steve-chavez
Copy link
Member

@dshukertjr For RPC, there are set returning functions(return multiple rows) that can be called. A simple one:

create function getallprojects() returns setof projects as $$
select * from test.projects;
$$ language sql;

Using count there makes the same sense as using it for a select. Also head serves the same need(only interested in the count and not the body).

For POST(insert), you'd be interested in the count when bulk inserting many rows.

For UPDATE/DELETE, the count is useful for knowing how many rows you updated/deleted in one request.

@dshukertjr
Copy link
Member

@steve-chavez Thanks for the clarification! I will start implementing count and head to other methods once I implement the tests for select() in the pull request!

@soedirgo
Copy link
Member Author

@steve-chavez head is only for select() and rpc() though right? For the others you'd use returning: minimal instead.

@steve-chavez
Copy link
Member

@soedirgo Yes. Now that you mention it, returning:minimal is like HEAD for insert/update/delete.

I will start implementing count and head to other methods

@dshukertjr head only for select and rpc :-)

@dshukertjr
Copy link
Member

Okay, to summarize, it sounds like head for select and rpc, count for everything. I will get to it!

@soedirgo
Copy link
Member Author

Closed by #147 🚀

@kiwicopple
Copy link
Member

Amazing job @dshukertjr 🚀

@dshukertjr
Copy link
Member

dshukertjr commented Jan 18, 2021

@kiwicopple
Thank you so much! I'm glad that I was able to contribute!

@johndpope
Copy link

https://supabase.com/docs/reference/javascript/select

const { data, error, count } = await supabase
  .from('cities')
  .select('name', { count: 'exact' }) 

@lauridskern
Copy link

is there a way to use this with filtering? eg. only counting rows that have a specific user id

@johndpope
Copy link

should be fine. just tack on .eq("user_id",1) at the end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

7 participants