New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Facets or similar but on all values in a subset? #16144
Comments
Hi, Can you post a sample document and the query result you are trying to achieve? |
Here's a document:
I want to programmatically return all unique Lesson.Name as a group with each value and then a set of aggregates for that value. I.e.:
Note that the group values are the unique values in the fields, not all unique words in a field which is the only difference between what I'm asking for and facets. (I believe) So even a parameter on the facet option to not break apart by word would solve the problem and I could just use facets. |
References:
https://www.telerik.com/kendo-angular-ui/components/grid/data-operations/
https://www.telerik.com/kendo-angular-ui/components/grid/data-binding/basics/
https://www.telerik.com/kendo-angular-ui/components/grid/data-binding/remote-data/
https://www.nuget.org/packages/ClientFiltering/
Result:
(After the first request, Groups and Summaries are null as you already have them and don't need to calculate them for subsequent page requests, just return the secondary page)
Problem:
When you use a grid control such as Telerik's Kendo etc. and you want to do everything server side so that the entire dataset isn't spewed over the wire, this is straight forward until you start grouping results. And it gets worse when you want to also have a chart based on the group.
All data has to be first security filtered so that they only see what they're supposed to, and then filtered by the filters defined in the grid columns.
I.e. if I have a list of users and I group the grid by an attribute of those users (i.e. let's say an organizational group) I have 4 problems that I have to solve:
The first one is relatively straight forward: If you order your resultset by the groups in the direction that the groups were requested to be displayed BEFORE you order by whatever other requested fields, you can then use skip/take to get the page of data that is needed for display. It's important that this occurs in this way because you want to return strongly typed data over the wire that isn't of arbitrary shape, otherwise you end up with subtle bugs over time if the server starts returning slightly different data. Your client won't know it and you won't get compile time errors (we strongly generate our client from the server using swagger for this reason.)
I have solved the Filtering problem and the ordering problem by using an AsyncDocumentQuery because of the excellent extension methods that let me use the WhereEquals etc. with a field name.
Problem 2 is much more difficult. Especially if the grid is grouped by one field and then a second etc. Basically you need to take the group, find all unique values in that group, and then aggregate by the requested fields and aggregate types on the security and grid filtered greater dataset.
The problem comes taking the pre-filtered dataset and then getting the groups with aggregates.
Then I discovered facets and I thought "Woohooo! This solves the whole problem!" But I can't figure out how to use facets to operate on the entire value on the field instead of breaking the values apart into words too. i.e. I want to get the same as facets but just the same as GroupBy(fieldname) instead of based on the individual words of the group by.
So, question No 1 is simple: Is it possible to create facets that don't break apart the values in the field and just operate on the unique values in the dataset by the field? (i.e. Select(fieldname) => aggregate on these values). This would allow me to programmatically build up the aggregations against the group.
If question No 1 is not possible, do you have a suggestion on how to go about getting the aggregates programmatically on a group by?
Then, on problem No 4. What you ideally want to do is GroupBy(true).Select({Aggregates}) or something similar, but I can't find a way to programmatically build up aggregates. It appears that I might be able to use facets but that breaks down by basically groups, whereas this problem is on the entire set, give me the aggregates of the given fields with the given aggregate type.
Finally, is there a way, or have you considered, creating an index that takes an entire document collection and enables this type of interaction? I.e. take the functionality of the Kendo Grid + a chart at the top that has the X axis based on the grouping of the grid and then defines the aggregate, pass that information into a rest api and have RavenDb be able to handle the entire thing automatically? Right now I have to create an index that includes all of the fields in the dataset that is flattened which is non-optimal for many reasons. Obviously I could put an OLAP server in front of this, but that just seems like overkill for creating interactive reporting when it appears that RavenDb is really close to just being able to do this.
I created a library that has both client side and server-side code that does all of the functionality against linq. (hence my other bug. See the nuget reference above) that the Telerik structure can be converted into, and then C# can read, convert to expression trees and execute. But it doesn't work against Raven because the DTO structure can be different than the source structure and Automapper AsDataSource.For doesn't work against RavenDb correctly (if you're using async) thus meaning that fields have to be converted to their original from the DTO fields that the client knows about, and grouping and building these aggregates also are not reliable. It does work against EF Core, but I see why Raven doesn't like it because EF does a select on the projection and does a where on the projection, which is slow. So I'm trying to adapt this to the methodology of RavenDb and get the same functionality so that I can use interactive Grids and Pivots to generating summary charts that the executives LOVE in companies. They love their power BI but they love it more when they can just go in and drag stuff around themselves without having to know anything about Power BI.
If I'm indeed not blind it would be ideal to get the following functionality on DocumentQuery:
These 2 would address No 2 and No 4 above.
And a way to create a "Report Index" that would take an entire collection as is and create some sort of index on it that allowed all fields to be filtered, grouped, ordered by and aggregated would be amazing. I was thinking that the way this would work is that it would take 3 generics: SourceType, OperatingType, OutputType (DTO) and then have a method on it called CreateReportIndex() or something similar. The SourceType is obviously the type of the document collection, the OperatingType is what the map generates and the outputType is what the reduce should create that is the same as the client sees as the DTO. And the reduce could be a straight projection and OperatingType could in many cases be the same as SourceType unless you were doing grouping etc on reduce. This solves No 5 and without CreateReportIndex ultimately would be a great way to create Projections and you could even use Mapperly/Automapper or similar in the second phase here to do the mapping. It would create a structured way of mapping that would be reversable back to the source so that filtering etc. could come in as the DTO values and be converted to the Source expressions necessary. And all of this could use a source generator in C# to precompile all of the mapping data from OutputType to SourceType I would think.
Is there a blog article for you here?? 😉
This image gives you a visual of what I'm trying to achieve.
The text was updated successfully, but these errors were encountered: