Some additional data type #141

Open
mitar opened this issue Oct 28, 2019 · 3 comments

Comments

@mitar

mitar commented Oct 28, 2019

I would like to request some additional data types:

  • Timezone aware timestamps.
  • Binary blobs.
  • Compound data types allowing nested sub-objects: lists of other objects, or objects themselves. This would allow Noria to work with more document-oriented data, where fields hold sub-documents, arrays of values, or arrays of other sub-documents. One could represent this with highly normalized tables, but it would still be useful to be able to create views that aggregate values into lists, for example like PostgreSQL's array_agg function. What I would like is for my app's state, represented in materialized views, to contain such arrays. I have found that this is often much more efficient for many common cases like 1:N relations. In regular SQL, the joined result has N rows for each row on the "1" side, so all of that row's values are duplicated, both in memory and on the wire between server and client. Being able to represent that as an array is both more efficient and more natural.
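
To make the 1:N duplication point concrete, here is a minimal Python sketch. It is not Noria code: the `posts` and `comments` tables and both helper functions are made up for illustration. It compares a flat join result with an array_agg-style result in which the N side is collected into a list.

```python
from collections import defaultdict

# Hypothetical data: one post with three comments (a 1:N relation).
posts = [{"id": 1, "title": "Hello"}]
comments = [
    {"post_id": 1, "text": "first"},
    {"post_id": 1, "text": "second"},
    {"post_id": 1, "text": "third"},
]


def flat_join(posts, comments):
    """Regular SQL-style join: the post columns repeat once per comment."""
    return [
        {"id": p["id"], "title": p["title"], "comment": c["text"]}
        for p in posts
        for c in comments
        if c["post_id"] == p["id"]
    ]


def nested_join(posts, comments):
    """array_agg-style result: one row per post, comments collected in a list."""
    by_post = defaultdict(list)
    for c in comments:
        by_post[c["post_id"]].append(c["text"])
    return [
        {"id": p["id"], "title": p["title"], "comments": by_post[p["id"]]}
        for p in posts
    ]


flat = flat_join(posts, comments)      # the title is repeated on every row
nested = nested_join(posts, comments)  # the title is stored once
```

In the flat result the post's title travels with every comment row; in the nested result it appears once, next to a list of comments.
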
@jonhoo
Contributor

jonhoo commented Oct 28, 2019

Hi! Given that Noria is a research prototype, we're not really focused on adding additional data types at the moment. I'd be happy to take a look at a PR, but our main attention is on the research aspects of the system, such as sharding, security policies, and fault-tolerance. I'll leave this open for later work though :)

@mitar
Author

mitar commented Oct 28, 2019

I also prefer that the core team works on those hard problems. :-)

Thank you for being open to PRs, though. To make it clearer what you would like to see in a PR: do you have any preference about compound types? Given Noria's particular architecture, I am not completely sure which would perform better or be the better design choice:

  • Have compound types of varying complexity, plus operators to extract sub-values from them.
  • Do not have compound types, but keep the data highly normalized: every such compound value is split out into its own table, which is then referenced. Functions then combine those values into compound types at query time, for example by collecting all referenced rows into a list/array.

In some ways the default design would be the first. But with Noria the second might perform better, especially given its internal cache/materialization, where some of those compound values would be constructed and cached as needed, and freed when no longer needed.
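
A rough Python sketch of the two options, to show what each would ask of the system. Everything here is hypothetical (the `extract` operator, the `assemble` function, and the example tables); it only illustrates that both designs can expose the same compound value.

```python
# Option 1: the row stores a compound value; an operator extracts sub-values.
def extract(value, path):
    """Hypothetical extraction operator: follow keys/indices in `path`."""
    for step in path:
        value = value[step]
    return value


article = {"id": 7, "tags": ["db", "rust"], "author": {"name": "ada"}}

# Option 2: the same data, fully normalized into separate tables.
articles = [{"id": 7, "author_id": 1}]
authors = [{"id": 1, "name": "ada"}]
article_tags = [
    {"article_id": 7, "pos": 0, "tag": "db"},
    {"article_id": 7, "pos": 1, "tag": "rust"},
]


def assemble(article_id):
    """Hypothetical query-time function: rebuild the compound value
    from the normalized tables."""
    row = next(a for a in articles if a["id"] == article_id)
    author = next(u for u in authors if u["id"] == row["author_id"])
    tag_rows = sorted(
        (t for t in article_tags if t["article_id"] == article_id),
        key=lambda t: t["pos"],
    )
    return {
        "id": row["id"],
        "tags": [t["tag"] for t in tag_rows],
        "author": {"name": author["name"]},
    }
```

With option 1 the application reads `extract(article, ["author", "name"])`; with option 2 it reads `assemble(7)["author"]["name"]`, and the system is free to cache or drop the assembled value.
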

So do you have any insight here? Or should we support both and then see? In a way, supporting both would give the app developer more flexibility to structure data as best suits the app, while leaving it to Noria to optimize (I think this is one of the goals of Noria: that the app developer does not have to care about optimizing queries, data structures, and indices, because Noria figures all that out on its own).

@jonhoo
Contributor

jonhoo commented Oct 29, 2019

My first instinct here is also to take the first approach. You're right that Noria might be able to automatically delegate compound types to an operator, but I think that would necessarily have to come after support for "basic" compound types. Basically, once we have compound types, I could then imagine that we'd add operators (like fancy versions of GROUP_CONCAT) which construct and incrementally maintain these compound types.
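
As a sketch of what "construct and incrementally maintain" could mean for a list-valued aggregate, here is a small Python model. It is not Noria's operator interface: the `ListAgg` class and its `on_delta` method are hypothetical, and it assumes updates arrive one at a time as insertions or deletions for a group key.

```python
from collections import defaultdict


class ListAgg:
    """Hypothetical operator that keeps one list per group key up to date."""

    def __init__(self):
        self.state = defaultdict(list)

    def on_delta(self, key, value, positive):
        """Apply one insertion (positive=True) or deletion (positive=False)
        and return the group's new list, without rescanning the base table."""
        group = self.state[key]
        if positive:
            group.append(value)
        else:
            # Drops the first matching occurrence; raises ValueError if the
            # value was never inserted for this key.
            group.remove(value)
        return list(group)


agg = ListAgg()
agg.on_delta(1, "first", True)
agg.on_delta(1, "second", True)
after_delete = agg.on_delta(1, "first", False)
```

Each update touches only the affected group's list, which is the property an incrementally maintained operator would need; a real implementation would also have to decide on ordering and on how to evict state.
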
