Upsert functionality needs some love to work properly #325

bloopletech · 2015-10-08T06:25:09Z

I apologise if the following is a bit disorganised, this is a bit of a brain dump. But I think there's a glimmer of gold in the middle! I'm loving the stripe-ruby gem so please excuse me if I get a bit blunt further in.

I'm using account here as an example, but you can substitute any of the resources you can update through the API.

I am importing a bunch of managed accounts from a CSV file. I'm storing the Stripe account ids somewhere else. I want to create accounts as needed, but if an account in stripe already exists (using the account id I have stored), then update the existing stripe account. So, obviously, upsert functionality would be great for this.

Here's an example of how you might imagine this working:

# id_or_nil will contain ether the id of an account, or will be nil.
account = Stripe::Account.retrieve(id_or_nil)
account.update_attributes({
  email: 'test@example.com',
  business_name: 'Potato Jackets'
})
account.save

I know there has been work recently (#300 for example) to add upsert-style functionality to the API. I want to use this functionality to make my client code simpler.

Unfortunately, there are multiple issues with the API code that prevent upsert from working.

The example code above obviously doesn't work as-is, as Stripe::Account.retrieve raises an error if you pass nil to it. And if you don't pass any parameters at all, it retrieves your standalone account information. (Which I don't want; I'm updating managed accounts, not the overall account).

So problem (1) is that there's no single API method that finds an existing object if possible, otherwise creating a new object. i.e. like Blah.find_or_initialize_by(api_id: '123456789') in ActiveRecord in Rails.

There is StripeObject.construct_from(id: id_or_nil) and StripeObject#initialize(id_or_nil), but they both have the same flaw: they only ever initialize the object and assign the balues you've passed in, they never call APIResource#refresh to actually request the values from an existing record and assign them to the object.

To be honest, the design of StripeObject here confuses me a bit.

StripeObject#initialize takes an id as it's first parameter but doesn't really do anything with it. Unless the user explicitly calls APIResource#refresh, setting the id doesn't do anything anyway.
As a user of the class, if you do have an id, you'd want to call APIResource.retrieve because it initialises the StripeObject and also calls APIResource#refresh to get the data and stuff it into the object.

Then we have StripeObject.construct_from, which is really just initializing the object and then calls #refresh_from. Why not just change StripeObject#initialize to accept a hash, and call #refresh_from in the initializer with the hash, if supplied? Then you could just delete .construct_from.

I'm having trouble understanding the design and difference between #refresh_from and #update_attributes.

update_attributes is a straightforward mass-assignment call. Although as a rails developer, it's surprising to me that an update call doesn't call save on the object; cf. ActiveRecord::Base#update_attributes.

refresh_from seems to be used internally when you'd want a mass-assignment call, and indeed calls #update_attributes inside the method, but it also does a bunch of munging of values lists, and also removed and adds accessors.

It seems that if the munging of values lists and accessor stuff is important, then you would want #update_attributes to also implement the same functionality.

Given that the #method_missing implementation on StripeObject already defines #write

That leads me to StripeObject#refresh_from. This method takes a hash of attributes and munges some lists of values. But it also calls #remove_accessors and #add_accessors which does a bunch of metaprogramming magic. And then it calls #update_attributes_with_options, which does mass-assignment of the attributes.

Now StripeObject has a #method_missing implementation to allow you to get/set all the attributes on the object, and also defines write accessors (but not read accessors for some reason). I feel like there is a confusion of responsibilities between #refresh_from, #update_attributes, and #method_missing here about who should be defining the accessors.

I'm not %100 sure where I'm going with this except to say: the whole implementation here feels a bit more complex than it should be, and I worry that by mixing responsibilities like it does, there's a lot of potential for subtle bugs.

The text was updated successfully, but these errors were encountered:

kyleconroy · 2015-10-08T06:29:19Z

@bloopletech Could you please provide more context here? I'm tempted to close this issue without an example or two.

bloopletech · 2015-10-08T06:32:35Z

Sorry! I clicked create early, I'm actually writing up a screed right this second.

kyleconroy · 2015-10-08T06:34:14Z

Screed away!

bloopletech · 2015-10-08T06:37:57Z

Screed updated. The specific problem I was having that led me to fiddle around in the code is #324

bloopletech · 2015-10-08T06:42:23Z

An example of a subtle bug: Because of issue #324 , you can't call #update_attributes on a new StripeObject (or even one created via StripeObject.construct_from({})). But because when you call APIResource.retrieve, it calls APIResource#refresh, which calls StripeObject#refresh_from, which will stuff the @values hash with all the keys you'd want to update. Then #update_attributes does work, because the keys you're passing in already exist in the @values hash.

bloopletech · 2015-10-08T06:45:13Z

Alright I have to head off but I'll check back ~9:30AM AEDT if anyone has any feedback or comments. Thanks for reading!

brandur · 2015-10-08T15:23:15Z

Hey @bloopletech, thanks a lot for taking such a detailed look at all of this stuff. I'll try to answer most of your comments inline, but in general there's basically two things going on here:

This gem's been around for a long time. In some places like the object initializer, it seems that great pains have been taken to maintain API compatibility, even if introduced quite a bit of complexity into the gem. e.g. The maybe hash / maybe not hash / maybe nil of id:
```
    def initialize(id=nil, opts={})
      id, @retrieve_params = Util.normalize_id(id)
```
The gem doesn't know what its own schema looks like and relies on getting an object back from the server to tell it. This leads to problems like you've detailed in StripeObject#method_missing doesn't agree with StripeObject#respond_to_missing? #324. We've got our eye on fixing this problem in particular.

So problem (1) is that there's no single API method that finds an existing object if possible, otherwise creating a new object. i.e. like Blah.find_or_initialize_by(api_id: '123456789') in ActiveRecord in Rails.

Right on. I think rather than getting too deep into metaprogramming, we might want to just recommend a pattern like this one instead:

blah = begin
  Blah.retrieve("acct_123")
rescue Stripe::InvalidRequestError
  raise unless $!.http_status == 404
  Blah.new
end

We might want to consider adding a helper for it, but we wouldn't be able to allow initialization on non-ID fields like Rails does.

There is StripeObject.construct_from(id: id_or_nil) and StripeObject#initialize(id_or_nil), but they both have the same flaw: they only ever initialize the object and assign the balues you've passed in, they never call APIResource#refresh to actually request the values from an existing record and assign them to the object.

Yeah :/ See my general no. 2 theme above, but these awkwardly would have to go to the server, introspect another object of the same type, and then initialize your new object right now. I think we need to punt a little more on a proper solution to this problem.

StripeObject#initialize takes an id as it's first parameter but doesn't really do anything with it. Unless the user explicitly calls APIResource#refresh, setting the id doesn't do anything anyway.
As a user of the class, if you do have an id, you'd want to call APIResource.retrieve because it initialises the StripeObject and also calls APIResource#refresh to get the data and stuff it into the object.

Yeah ... this doesn't make sense. I suspect that it might be legacy API compatibility, but we should see if we can fix this.

Then we have StripeObject.construct_from, which is really just initializing the object and then calls #refresh_from. Why not just change StripeObject#initialize to accept a hash, and call #refresh_from in the initializer with the hash, if supplied? Then you could just delete .construct_from.

I think that this would actually be pretty reasonable.

I'm having trouble understanding the design and difference between #refresh_from and #update_attributes.

The basic idea here is that #update_attributes is a safe-to-use mass assignment call for a model that already exists. Think of #refresh_from as an internal initializer that helps build a new object. It should have been protected and we should consider deprecating its public visibility.

#update_attributes is a straightforward mass-assignment call. Although as a rails developer, it's surprising to me that an update call doesn't call save on the object; cf. ActiveRecord::Base#update_attributes.

Yeah, this is a fair point. I tried to name it as a homage to ActiveRecord, but didn't consider that the fact that it doesn't save to be an oddity.

And then it calls #update_attributes_with_options, which does mass-assignment of the attributes.

Right. #update_attributes_with_options is the mass assignment heavy lifter. It is protected and it would be better to consider its existence purely an implementation detail.

Now StripeObject has a #method_missing implementation to allow you to get/set all the attributes on the object, and also defines write accessors (but not read accessors for some reason). I feel like there is a confusion of responsibilities between #refresh_from, #update_attributes, and #method_missing here about who should be defining the accessors.

Yeah ... I'll leave another comment in #324 about this one, but generally speaking this brings us back to general theme no. 2. I believe the universal accessors are there so that new objects can be initialized (again, the gem doesn't know anything about its own schema). This is a bad idea and we should try to deprecate it as soon as possible. It might be worth allow arbitrary setting of attributes in #update_attributes to mirror this behavior.

brandur · 2015-10-08T15:28:58Z

So in summary, I think that the actionable things here are:

~~See if we can merge #initialize and .construct_from while maintain reasonable API compatibility.~~ See comment below
~~Mark #refresh_from as deprecated (or at least its public visibility).~~ See Deprecate StripeObject#refresh_from #328.
Possibly loosen safeguards on #update_attributes so that it can be used for new object initialization (will elaborate on this one in StripeObject#method_missing doesn't agree with StripeObject#respond_to_missing? #324). Marking this off in favor of more granular issue in StripeObject#method_missing doesn't agree with StripeObject#respond_to_missing? #324.
~~Log a future task to the effect of "Make attribute assignment safer with schema information" so that we can unwind some of the worst practices here when we have the capability to do so.~~ See Testing stubs #243.

Let me know if you have any thoughts on this as well @kyleconroy.

As discussed in #325, this deprecates the public visibility of `#refresh_from` (by renaming it). It also adds some deprecation infrastructure to produce warnings when it's used.

brandur · 2015-10-08T17:08:33Z

So it turns out that merging #initialize and .construct_from is going to be difficult. The way that it's set up right now is that if a hash is sent into the constructor, then whatever was in there gets used as custom request parameters when a refresh is run. It's not clear what this could be for or if anybody uses it, but we should probably wait for a major version bump to drop it. Its presence unfortunately makes any symbols sent into the constructor ambiguous (is it a parameter or a property?), so we will not be able to pull the .construct_from functionality into it until we can deprecate the old path.

bloopletech · 2015-10-08T23:29:41Z

Whoa! I was not expecting such a fast and detailed response to what I wrote. Thank you for replying and considering my unorganised points :).

I think you've got a more solid handle than I do on the details here so I'll only respond on a couple points:

I fully support your efforts to maintain backwards compatibility, and I can see that some of the complexity I was seeing is due to that. That's life of course, but it's great you're working to deprecate going forward.
I agree that the gem not knowing its schema is the main underlying issue here. It's a challenge, because you (presumably) don't want to codify the stripe API schema in the gem and then have to keep it up to date; and on the other hand you want some safety checks so the user doesn't happily set keys that don't exist and then only find out on save that nothing worked.

As a user I personally would be happy to sacrifice some safety here to allow me to be lazier as a developer and also to make the gem code simpler (who woulda expected that, a lazy developer!). But I understand totally if you want to put more safety in. But I think at the end of the day, if you're going to allow people to do

account = Stripe::Account.new
account.email = 'blah'
account.save

then there's no safety check anyway, until the user calls #save. Which isn't the end of the world, but just flagging that currently, you can use StripeObjects somewhat unsafely at the moment.

Anyway I feel much better now I've had a complain on the internet, and I'll follow this issue to see what else comes out of it :).

brandur · 2015-10-09T00:52:39Z

As a user I personally would be happy to sacrifice some safety here to allow me to be lazier as a developer and also to make the gem code simpler (who woulda expected that, a lazy developer!). But I understand totally if you want to put more safety in. But I think at the end of the day, if you're going to allow people to do
account = Stripe::Account.new
account.email = 'blah'
account.save
then there's no safety check anyway, until the user calls #save. Which isn't the end of the world, but just flagging that currently, you can use StripeObjects somewhat unsafely at the moment.

Yeah, +1. Let us know what you think of the idea of relaxed constraints as seen in #329.

In the future I want to tighten these back up again, but we'll try to prioritize making sure that we have a sane experience with today's technology.

Anyway I feel much better now I've had a complain on the internet, and I'll follow this issue to see what else comes out of it :).

Hah, cool :) We're trying to do a better job in general in maintaining our open-source projects these days, so let us know if anything else comes up.

See this comment for details, but I think we've addressed all the actionable points now (and they're in master). I'm going to close this issue out for now.

brandur added a commit that referenced this issue Oct 8, 2015

Deprecate StripeObject#refresh_from

8b255c7

As discussed in #325, this deprecates the public visibility of `#refresh_from` (by renaming it). It also adds some deprecation infrastructure to produce warnings when it's used.

brandur mentioned this issue Oct 8, 2015

Deprecate StripeObject#refresh_from #328

Merged

brandur closed this as completed Oct 9, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Upsert functionality needs some love to work properly #325

Upsert functionality needs some love to work properly #325

bloopletech commented Oct 8, 2015

kyleconroy commented Oct 8, 2015

bloopletech commented Oct 8, 2015

kyleconroy commented Oct 8, 2015

bloopletech commented Oct 8, 2015

bloopletech commented Oct 8, 2015

bloopletech commented Oct 8, 2015

brandur commented Oct 8, 2015

brandur commented Oct 8, 2015

brandur commented Oct 8, 2015

bloopletech commented Oct 8, 2015

brandur commented Oct 9, 2015

Upsert functionality needs some love to work properly #325

Upsert functionality needs some love to work properly #325

Comments

bloopletech commented Oct 8, 2015

update_attributes is a straightforward mass-assignment call. Although as a rails developer, it's surprising to me that an update call doesn't call save on the object; cf. ActiveRecord::Base#update_attributes.

refresh_from seems to be used internally when you'd want a mass-assignment call, and indeed calls #update_attributes inside the method, but it also does a bunch of munging of values lists, and also removed and adds accessors.

kyleconroy commented Oct 8, 2015

bloopletech commented Oct 8, 2015

kyleconroy commented Oct 8, 2015

bloopletech commented Oct 8, 2015

bloopletech commented Oct 8, 2015

bloopletech commented Oct 8, 2015

brandur commented Oct 8, 2015

brandur commented Oct 8, 2015

brandur commented Oct 8, 2015

bloopletech commented Oct 8, 2015

brandur commented Oct 9, 2015