Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stable column order for generated SQL statements #280

Open
lafriks opened this issue May 10, 2022 · 25 comments
Open

Stable column order for generated SQL statements #280

lafriks opened this issue May 10, 2022 · 25 comments
Labels
enhancement New feature or request

Comments

@lafriks
Copy link
Contributor

lafriks commented May 10, 2022

Currently looks like because of map usage for storing column data each time insert/update/select SQL statements are generated column order is different, that makes grouping of SQL statements for perf stats in logs impossible and also this makes it unusable for prepared statement caching being polluted and unusable that hurts performance.

@Fs02
Copy link
Member

Fs02 commented May 11, 2022

sounds important and need to be fixed

do you have suggestion what to use other than map? maybe sorted slice? 🤔

@lafriks
Copy link
Contributor Author

lafriks commented May 11, 2022

Something like https://github.com/elliotchance/orderedmap can be used instead of map

@lafriks lafriks added the enhancement New feature or request label May 11, 2022
@lafriks
Copy link
Contributor Author

lafriks commented May 11, 2022

At least with this library elements can be looped in order elements were added to map with code:

for el := m.Front(); el != nil; el = el.Next() {
    fmt.Println(el.Key, el.Value)
}

@Fs02
Copy link
Member

Fs02 commented May 11, 2022

although this may increase allocation due to internal linked list, this seems better and easier solution than using sorted slice 🤔

(feel free to work on this if you have time 🙏 )

@lafriks
Copy link
Contributor Author

lafriks commented May 11, 2022

Looked a bit into this but problem is that it affects a lot of public API where maps are used to pass around mutations that are of type map[string]Mutate so these would need to be changed to custom new type with needed methods that could internally use either ordered map or maybe even array ([]Mutate) would be sufficient. Anyway it would be breaking change... At least for manual mutation usage and for adapter public API

@lafriks
Copy link
Contributor Author

lafriks commented May 11, 2022

I could create new type Mutates struct { .. } that could be used in place of map[string]Mutate

@Fs02
Copy link
Member

Fs02 commented May 13, 2022

I think a better way is to make existing Mutation and Mutate struct as linked list container as well, and we can keep the same field as is for now to keep compatibility:

// add new field to store first mutation
type Mutation struct {
        // don't want to export this, this should not be modified from outside
        // should be accessed using First() method
	first  *Mutate
        last *Mutate

	//  ... other existing definition
}

func (m *Mutation) First() *Mutate {
  return m.first
}

// Mutate stores mutation instruction.
type Mutate struct {
        // don't want to export this, this should not be modified from outside
        // should be accessed using Next() method
        next *Mutate

	Type  ChangeOp
	Field string
	Value interface{}
}

And then we can gradually update existing implementation:

for mut := mutation.First(); mut != nil; mut = mut.Next() {
        // do something ...
}

@lafriks
Copy link
Contributor Author

lafriks commented May 14, 2022

problem is how would you initialize them them when currently everywhere map[string]Mutate is used in API?

@Fs02
Copy link
Member

Fs02 commented May 14, 2022

hmmm true, looks like this mostly used in unit testing in this repo, not very sure about use case outside this repo 🤔

how about, for now we can do lazy initialization during first First() calls, like this?

func (m *Mutation) First() *Mutate {
  // check if list is not yet initialized, or the content of map is updated
  if (m.size != len(m.Mutates)) {
   // initialize the list
  }

  return m.first
}

in the future, I think we can deprecate direct access to the underlying map, and remove it completely in future version
(actually I always wanted to revisit what API should be exported and what's not, I think this is one of case that should be un-exported)

@Fs02
Copy link
Member

Fs02 commented May 14, 2022

ah forgot that we pass map of Mutates instead of Mutation to adapter, seems breaking change is un-avoidable 🤦

rel/adapter.go

Lines 15 to 17 in 1cac74f

Insert(ctx context.Context, query Query, primaryField string, mutates map[string]Mutate, onConflict OnConflict) (interface{}, error)
InsertAll(ctx context.Context, query Query, primaryField string, fields []string, bulkMutates []map[string]Mutate, onConflict OnConflict) ([]interface{}, error)
Update(ctx context.Context, query Query, primaryField string, mutates map[string]Mutate) (int, error)

@lafriks
Copy link
Contributor Author

lafriks commented May 16, 2022

yes that was my proposal to change map[string]Mutate to Mutates in interfaces and expose only needed functionality with type functions and hide underlying storage structure

@lafriks
Copy link
Contributor Author

lafriks commented May 16, 2022

Also why do you avoid using pointers in interfaces? (Query vs *Query). Passing it by value does impact allocation count imho

@Fs02
Copy link
Member

Fs02 commented May 17, 2022

yes that was my proposal to change map[string]Mutate to Mutates in interfaces and expose only needed functionality with type functions and hide underlying storage structure

How will Mutates looks like?
I was thinking to replace it by rel.Mutation actually, seems more flexible in the long run 🤔

Also why do you avoid using pointers in interfaces? (Query vs *Query). Passing it by value does impact allocation count imho

as far as I know, unlike C++, in golang it's not about whether passing reference vs passing by value to avoid copying. it's about whether it'll cause stack allocation vs heap allocation (stack is generally the fastest because heap allocation increase load to GC)

and as far as I understand, compiler decide where to allocate using escape analysis, and one of the cause is related to the use of pointer (indirection). so my rule of thumb is basically use plain object as much as possible, use pointer where necessary or it's proven to be better by benchmark.

also created a quick benchmark here (shows calling a function wrapped on interface with ptr cause allocation): https://github.com/Fs02/go-pattern-benchmark/tree/master/pass_value_vs_ptr

@lafriks
Copy link
Contributor Author

lafriks commented May 18, 2022

I was thinking something like:

func NewMutates(mutates ...Mutate) Mutates

type Mutates struct {
  ...
}

func (m *Mutates) Add(mutate Mutate)

func (m *Mutates) List() []Mutate

probably other functions would be needed (Get(name), Remove(name) etc) but that still needs to be decided in process of refactoring

@Fs02
Copy link
Member

Fs02 commented May 18, 2022

any reason why not reuse existing Mutation? is it to avoid passing the association?

@lafriks
Copy link
Contributor Author

lafriks commented May 24, 2022

Yes Mutation can also be reused for this. Just field Mutates needs to be made private and different type not map

@lafriks
Copy link
Contributor Author

lafriks commented May 24, 2022

Only it's that it does duplicate values passed around as fields from Mutation are first preprocessed and split to multiple fields that are passed to adapter functions, that would make it a bit hard to use imho

@Fs02
Copy link
Member

Fs02 commented May 24, 2022

Yes Mutation can also be reused for this. Just field Mutates needs to be made private and different type not map

if not using map, how will you handle access using field name?

Only it's that it does duplicate values passed around as fields from Mutation are first preprocessed and split to multiple fields that are passed to adapter functions, that would make it a bit hard to use imho

sorry, I don't quite understand the case

@lafriks
Copy link
Contributor Author

lafriks commented May 24, 2022

if not using map, how will you handle access using field name?

I don't seem to be finding anywhere where currently there would be need to access Mutate by field name (only use case currently seems to be only to not allow duplicate mutates for same field).

If that's really needed we could have two struct fields:

mutates []Mutate
fields map[string]int

fields would contain field name as map key and position in mutates array as value.

If really needed new function could be added to return Mutate by field name:

func (m *Mutatation) Get(field string) Mutate

sorry, I don't quite understand the case

I mean OnConflict is passed as different argument etc but in such case it could be removed from arguments also so that's fine.

@Fs02
Copy link
Member

Fs02 commented May 24, 2022

I don't seem to be finding anywhere where currently there would be need to access Mutate by field name (only use case currently seems to be only to not allow duplicate mutates for same field).

Insert all query use it for example:
https://sourcegraph.com/search?q=context:global+repo:%5Egithub%5C.com/go-rel+Mutates%5B&patternType=literal

If that's really needed we could have two struct fields:

mutates []Mutate
fields map[string]int

that won't be scalable in the future if we need to have function to delete Mutate (in worst case we have to update all fields map value)

I prefer to use solution like orderedmap (map that refer to linkedlist items), which I described here: #280 (comment)

@lafriks
Copy link
Contributor Author

lafriks commented May 24, 2022

Even if delete is needed I don't think it's much of a problem:

i, ok := fields[fieldName]
if !ok {
    return
}
delete(fields, fieldName)
mutates = append(mutates[:i], mutates[i+1:]...)

@Fs02
Copy link
Member

Fs02 commented May 24, 2022

basically need to shift the whole array?
I don't see why that's better than map of linked list, which just changing the pointer 🤔

@lafriks
Copy link
Contributor Author

lafriks commented May 24, 2022

Because it can have errors later on. As mostly in library uses non pointer values to pass around you could have copied Mutate value somewhere that would point to different next that in the original map where it could have been changed (like same deleted element).

@Fs02
Copy link
Member

Fs02 commented May 25, 2022

copied Mutate value somewhere that would point to different next that in the original map where it could have been changed (like same deleted element).

hmm, even if we delete the Next value, copied Mutate will still point to a valid pointer, because we can only delete element entry from map (not deallocating the actual element)

Even if delete is needed I don't think it's much of a problem:

if !ok {
    return
}
delete(fields, fieldName)
mutates = append(mutates[:i], mutates[i+1:]...)

this still have in worst case we have to update all fields map value, because after shifting the array, now fields map will have wrong index

but I guess this shouldn't be a problem, better way to delete aMutate is just by moving last element to deleted element

@Fs02
Copy link
Member

Fs02 commented May 25, 2022

let's go with the slice + map[string]int solution 🚀

(seems map of linked list item is not a very clean solution, a lot of pointers involved https://go.dev/play/p/tIgI5M2fWq0)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants