Named query parameters implemented #71
Conversation
By caching the index of a field we make querying values by field name faster. However, when we only query one field the optimization won't help performance, as seen in the _2-suffixed benchmarks (the _3-suffixed benchmarks query 3 fields). The reason why we don't want to use
Performance is no longer a concern
Hello! Thank you for your effort and for drafting this PR. I've gone through the code, and it seems to me that you are inserting data inside the SQL query text directly, so you are escaping the data yourself. It might be dangerous (SQL injection) and error-prone. We shouldn't take this job from the underlying database library. Instead, the new named-query logic should compile the SQL query in a different way:
So it replaces our named placeholders with the default placeholders for the database library and passes data from the struct argument as corresponding positional arguments to the database library function call. Here is an example of how it's implemented for the
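The placeholder-compilation idea described above can be sketched roughly as follows. This is only an illustration: the function name `compileNamed`, the `$N` positional-placeholder style, and the pass-through of PostgreSQL `::` casts are my assumptions, not scany's actual API.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// compileNamed rewrites ":name" placeholders into positional placeholders
// ("$1", "$2", ...) and returns the parameter names in order, so the caller
// can look up the matching struct fields and pass them as positional
// arguments to the database library.
func compileNamed(query string) (string, []string) {
	var sb strings.Builder
	var names []string
	for i := 0; i < len(query); {
		if query[i] == ':' {
			// Pass PostgreSQL casts like "x::int" through unchanged.
			if i+1 < len(query) && query[i+1] == ':' {
				sb.WriteString("::")
				i += 2
				continue
			}
			// Consume the placeholder name after ':'.
			j := i + 1
			for j < len(query) && isNameChar(rune(query[j])) {
				j++
			}
			if j > i+1 {
				names = append(names, query[i+1:j])
				fmt.Fprintf(&sb, "$%d", len(names))
				i = j
				continue
			}
		}
		sb.WriteByte(query[i])
		i++
	}
	return sb.String(), names
}

// isNameChar reports whether r may appear in a placeholder name,
// including '.' for nested fields like "post.title".
func isNameChar(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_' || r == '.'
}

func main() {
	q, names := compileNamed(`SELECT * FROM users WHERE full_name = :user.full_name AND age > :min_age`)
	fmt.Println(q)     // SELECT * FROM users WHERE full_name = $1 AND age > $2
	fmt.Println(names) // [user.full_name min_age]
}
```

The returned name list gives the order in which struct fields must be passed to the underlying database call, so escaping is left entirely to the driver.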
Sure. When I wrote the lexer 6 months ago I just wanted to learn how they work. It is not only error-prone, but it also doesn't support every data type.
Having looked at the way sqlx has done it, I noticed that it is not done in the most optimal way, as it converts the entire struct into a
Instead of doing exactly what sqlx has done, I would probably take the approach of making a map which stores the indices of already-mapped fields, where the field name is the key and the value is the index the field is mapped to.
Allocating that much memory for the strings still sounds kind of bad in my opinion. Maybe I could use the "ordinal" value of the field as the key, which would reduce memory usage but might at the same time increase the processing time a bit.
@georgysavva should NamedGet and NamedSelect actually be GetNamed and SelectNamed, since that would probably be easier when autocompleting in IDEs?
Yes, we don't need to stick to the sqlx implementation completely; I was talking more about the overall design.
type User struct {
FirstName string
Post struct {
Title string
}
}

For such a struct, valid SQL queries using scany would be:

SELECT first_name AS "first_name", post AS "post.title" FROM users
----
INSERT INTO users(first_name, post) VALUES(:first_name, :post.title)

The default columns scany would look for are:
If we do not give the sqlscan interface a default value for the compile delimiter, would we then need to get rid of the package-level API that it has?
Sorry, I don't quite understand your message. Could you please elaborate? The way I see it:
sqlscan/sqlscan.go
Outdated
// DefaultAPI is the default instance of API that is wrapped around the dbscan.DefaultAPI instance.
var DefaultAPI = NewAPI(dbscan.DefaultAPI)
// DefaultAPI is the default instance of API that is wrapped around the dbscan.API instance.
var DefaultAPI = NewAPI(dbscan.NewAPI(dbscan.WithLexer(lexer.NewLexer(':', '$')))) // this or no default value at all
I mean that if we don't define the DefaultAPI with default parameters...
(1/2)
You define DefaultAPI as before, you just don't pass any delimiter sign '$' (or the whole lexer object) to dbscan.
Then we need to remove the package-level SelectNamed, GetNamed, ExecNamed, because if we didn't, it would lead to unexpected behaviour, and in my opinion it does not look very good when there is a partly exposed DefaultAPI.
So it would be better to just get rid of the default API and the package-level functions, because that would reduce confusion between the two interfaces (pgxscan and sqlscan).
i.e. when the user calls sqlscan.SelectNamed, it would probably run into a nil pointer dereference because the lexer is not defined.
I see. Yes, you are right. Since there is no default delimiter for sqlscan, we can't offer package-level functions like SelectNamed(), etc. And yeah, let's not add them for the pgxscan package either, for consistency reasons.
But I would like to keep the package-level exported DefaultAPI object, since some users might already depend on it and it's convenient to have. We just need to document that one cannot call *Named() methods on DefaultAPI in the sqlscan package.
A few more observations:
I don't think we should use getColumnToFieldIndexMap() with the same casing as the database table names etc., because if you have the named params camel-cased it increases readability in my opinion:

INSERT INTO users(first_name, post) VALUES(:first_name, :post.title)
INSERT INTO users(first_name, post) VALUES(:FirstName, :Post.Title)

But I can for sure use getColumnToFieldIndexMap if I just add the camel-cased option to it, if it doesn't have it.
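To make the casing discussion concrete, here is a naive sketch of mapping a Go field name to a snake_cased column/parameter name. The helper name `toSnakeCase` is hypothetical, and this version deliberately shows a known limitation: initialisms are split letter by letter.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// toSnakeCase naively converts a Go field name such as "FirstName" to the
// snake_cased name "first_name". Limitation: initialisms are split letter
// by letter, so "UserID" becomes "user_i_d"; a production mapper needs
// special handling for them.
func toSnakeCase(name string) string {
	var sb strings.Builder
	for i, r := range name {
		if unicode.IsUpper(r) {
			if i > 0 {
				sb.WriteByte('_')
			}
			sb.WriteRune(unicode.ToLower(r))
		} else {
			sb.WriteRune(r)
		}
	}
	return sb.String()
}

func main() {
	fmt.Println(toSnakeCase("FirstName")) // first_name
	fmt.Println(toSnakeCase("Post"))      // post
}
```

A camel-cased option would simply skip this conversion and use the field name verbatim, so `:FirstName` or `:first_name` would address the same field depending on configuration.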
The point here is that we should have the same name mapping in both directions: SQL -> Go and Go -> SQL. It is just simpler for users to reason about names that way. The name you use for the column alias to scan data from the database is the same one you use to pass data into the database. Consider this:

type User struct {
ID string
FullName string
...
}
type Post struct {
ID string
UserID string
Title string
...
}
type UserPost struct {
User *User
Post *Post
}
up:= &UserPost{User: &User{FullName: "foo"}, Post: &Post{Title: "bar"}}
GetNamed(ctx, db, up /* dst */, `SELECT u.full_name AS "user.full_name", p.title AS "post.title", ... FROM users u JOIN posts p ON u.id = p.user_id WHERE u.full_name = :user.full_name AND p.title = :post.title`, up /* arg */) The struct always translates to the same names: "user.full_name", "post.title", ... . And you use them for both column aliases and named parameters. |
I understand your point, but in this case simple may not be the best solution if we want scany to be a high-productivity tool. We also do not want to overcomplicate the library, as that brings its own new set of problems. So should we then add an option to change the named-query casing, which would default to the way that you want, and then users can configure it however they like? We could also ask the users which way they like best.
Also, it seems like
As I have separated lexer.Compile() and the function that binds the field names to args, it currently would be pretty simple to add an option to prepare queries beforehand. But if we don't want to add prepared queries, we should combine them to reduce unnecessary allocations.
var typeMap cmap.Cmap = cmap.Cmap{}

// getStructIndex tries to find the struct in the typeMap; if it is not found, it creates it.
func getStructIndex(e reflect.Value) structIndex {
	t := e.Type()
	val, found := typeMap.Load(t)
	if found {
		return val.(structIndex)
	}
	return indexStruct(t)
}

// indexStruct creates an index of the struct.
func indexStruct(t reflect.Type) structIndex {
	indexMap := structIndex{
		mappings: map[string]int{},
	}
	for i := 0; i < t.NumField(); i++ {
		indexMap.mappings[t.Field(i).Name] = i
	}
	typeMap.Store(t, indexMap)
	return indexMap
}

This is basically how the caching of struct reflection works in the
Why does
type User struct {
Name string
Post *Post
}
type Post struct {
Title string
Date time.Time
}
getColumnToFieldIndexMap(&User{})
// returns map:
// "name" -> [0]
// "post" -> [1]
// "post.title" -> [1, 0]
// "post.date" -> [1, 1]
Prepared queries still need tests and documentation; other than that, it should function as intended.
Yeah, it is safe to say that we should go with sync.Map from the standard library, as its docs pretty clearly state that it is intended for this sort of scenario.
any updates? |
Hi @fuadarradhi. I am waiting for the final notice from @anton7r that PR is done and ready for review. It seems to be still WIP. |
This is ready feature-wise, but some of my changes may still need better documentation and tests.
@anton7r understood. I will review it soon. Thank you for your work! |
Similar PR for sqlx, which uses https://github.com/muir/sqltoken: jmoiron/sqlx#775
Personally I'm not a huge fan of having a separate
More importantly, will your current implementation work with things like:
Will it replace the
I also spent quite some time trying to find failures in sqltoken and couldn't find any, so I think that's probably the better way to go(?). It would also allow things like rebinding
Hello! @anton7r Thank you for your work and such a huge contribution. I am sorry, I need to put this feature on hold for now. I don't have enough time to dig deeper for review and etc. |
Indeed, it will replace the
The sqltoken library could definitely be used, but I would have to dive deeper into it in order to know more about its pros and cons. I would prefer that the sqltoken library also have a callback function, in order to avoid (in this case) unnecessary array creation and modification operations.
Managed to fix the git history on this one.
Any updates here? |
This feature requires a thorough review and design decisions. Unfortunately, I don't have time at this moment to do that. I will get back to it as soon as I can. I hope you understand! |
After all, it may or may not be a good idea to add named query parameters to scany, as scany has focused on scanning and mapping the query results, and in that regard on being lightweight and easily extendable. If we were to add named query parameters, it could hinder the extensibility of scany and likely its maintainability, which wouldn't be desirable, especially since the current implementation of named query parameters is pretty opinionated and may not be able to support some specific database drivers even though it is customisable. I'd also like to hear other opinions on the matter. But right now it looks like the best option would be to create another library that uses scany for the data-mapping part and a custom implementation for the named queries, so that it could be more specific towards supporting PostgreSQL/MySQL etc.
@anton7r, thank you so much for such an excellent summary. I am aligned with everything you've written. I think it's the best course for now. Thank you for your effort and for trying to tackle that. |
I've integrated my pgxe library into scany.
Note: What this PR still lacks is the possibility to do named queries that do not return anything (NamedExec()). Also, test cases for the new API methods need to be added and the documentation needs to be improved. The query parser as of right now only supports : as the delimiter; if we want to support more delimiters, I need to update the code.