Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resource: Initial MoveResourceState RPC support #917

Merged
merged 6 commits into from
Feb 16, 2024
Merged

Conversation

bflad
Copy link
Member

@bflad bflad commented Feb 1, 2024

This change adds initial support for the MoveResourceState RPC to the framework, including:

  • Adding the framework shared server implementation for the new RPC
  • Adding the protocol version 5 and 6 server implementations for the new RPC
  • Adding the type conversion logic for the framework types to/from the protocol types
  • Exposing a new resource.ResourceWithMoveState interface that providers can implement to support the new RPC
  • Adding a website documentation page for the new functionality

A state move using the new RPC occurs when a Terraform 1.8 and later configuration includes a moved configuration block such as the following:

moved {
    from = "examplecloud_source.XXX"
    to   = "examplecloud_target.XXX"
}

There are no restrictions on the source resource types, but target resource types must opt into support to prevent data loss. Target resources can support moves from multiple, differing source resources, so the framework implementation is exposed as an ordered list to developers.

The framework implementation for the new RPC is as follows:

  • If no state move support is defined for the resource, the framework returns an error diagnostic.
  • If state move support is defined for the resource, each provider-defined state move implementation is called until one responds with error diagnostics or state data.
  • If all provider-defined state move implementations return without error diagnostics and state data, the framework returns an error diagnostic.

The protocol server unit testing shows the end-to-end handling of the new RPC, including the provider-defined resource implementations, the type conversion logic, and the framework shared server implementation. The exposed resource.ResourceWithMoveState implementation is intentionally similar to the existing resource.ResourceWithUpgradeState handling, both in internal details and the exposed API. This is to provide a smoother experience for provider developers familiar with the other functionality and maintainers of the framework.

This initial implementation exposes a lot of the request/response handling as-is, however there is likely potential for introducing native helper functionality for developers, such as implementing one to simplify "aliasing"/"renaming" an existing resource.Resource within the same provider codebase.

This change adds initial support for the `MoveResourceState` RPC to the framework, including:

* Adding the framework shared server implementation for the new RPC
* Adding the protocol version 5 and 6 server implementations for the new RPC
* Adding the type conversion logic for the framework types to/from the protocol types
* Exposing a new `resource.ResourceWithMoveState` interface that providers can implement to support the new RPC
* Adding a website documentation page for the new functionality

A state move using the new RPC occurs when a Terraform 1.8 and later configuration includes a `moved` configuration block such as the following:

```terraform
moved {
    from = "examplecloud_source.XXX"
    to   = "examplecloud_target.XXX"
}
```

There are no restrictions on the source resource types, but target resource types must opt into support to prevent data loss. Target resources can support moves from multiple, differing source resources, so the framework implementation is exposed as an ordered list to developers.

The framework implementation for the new RPC is as follows:

* If no state move support is defined for the resource, the framework returns an error diagnostic.
* If state move support is defined for the resource, each provider-defined state move implementation is called until one responds with error diagnostics or state data.
* If all provider-defined state move implementations return without error diagnostics and state data, the framework returns an error diagnostic.

The protocol server unit testing shows the end-to-end handling of the new RPC, including the provider-defined resource implementations, the type conversion logic, and the framework shared server implementation. The exposed `resource.ResourceWithMoveState` implementation is intentionally similar to the existing `resource.ResourceWithUpgradeState` handling, both in internal details and the exposed API. This is to provide a smoother experience for provider developers familiar with the other functionality and maintainers of the framework.
@bflad bflad added the enhancement New feature or request label Feb 1, 2024
@bflad bflad added this to the v1.6.0 milestone Feb 1, 2024
@bflad bflad marked this pull request as ready for review February 1, 2024 22:37
@bflad bflad requested a review from a team as a code owner February 1, 2024 22:37
Copy link
Member

@austinvalle austinvalle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I had a couple notes and questions but nothing major. :shipit:

SourcePrivate *privatestate.Data
SourceProviderAddress string
SourceSchemaVersion int64
SourceRawState *tfprotov6.RawState
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was initially confusing to see in the PR, but I looked through the upgrade resource request code and found this comment:

// TODO: Create framework defined type that is not protocol specific.
// Reference: https://github.com/hashicorp/terraform-plugin-framework/issues/340
RawState *tfprotov6.RawState

Do we want to add a comment here describing why it's okay to keep it protocol version specific for future maintainers? Seems like it's unlikely we'll come back and refactor this since the issue is closed and glancing through it, seems like a non-zero effort.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a comment here is an excellent suggestion, will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a swing at this in the latest changes -- please reach out if you have other considerations or suggestions!

return nil
}

proto5 := &tfprotov6.MoveResourceStateResponse{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: proto6 :=

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦅 👀 : Fixed in 46a1487.

Comment on lines +131 to +143
if err != nil {
logging.FrameworkDebug(
ctx,
"Error unmarshalling SourceRawState using the provided SourceSchema for source "+
req.SourceProviderAddress+" resource type "+
req.SourceTypeName+" with schema version "+
strconv.FormatInt(req.SourceSchemaVersion, 10)+". "+
"This is not a fatal error since resources can support multiple source resources which cause this type of error to be unavoidable, "+
"but due to this error the SourceState will not be populated for the implementation.",
map[string]any{
logging.KeyError: err,
},
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious q: is there a reason to not just continue here after logging? Are there scenarios where a source schema would be present, it doesn't match, but the implementation of move state would still work and would return a target state?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may have found out later in reading the implementation but yeah, the comment at the beginning of this error message conditional here could probably use a little more verbiage. This logic essentially will always run if SourceSchema is set but before its ever passed off to StateMover provider-defined logic which might "skip" based on the other request fields (source resource type, source provider address, etc.).

We could technically support some upfront "filtering" via framework logic before it reaches here by adding SourceTypeName, etc. fields next to SourceSchema in StateMover that have some opinionated behaviors, but I was hesitant to immediately introduce that. 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may have found out later in reading the implementation

Ah yes! My main issue with reviewing PRs is not returning to previous in-progress comments lol.. 😅

This all makes sense to me and the additional docs on SourceState I think closes the loop nicely with this log message 👍🏻

Copy link
Member Author

@bflad bflad Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a swing at this in the latest changes -- please reach out if you have other considerations or suggestions!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want/need to have tests for some of the error "fail-forward" conditions when there are multiple implementations? Like:

  • StateMover 1 schema doesn't match source, StateMover 2 schema matches + returns target state == success
  • StateMover 1 schema doesn't match source, StateMover 2 doesn't define schema + returns target state == success
  • StateMover 1 schema matches + returns target state, StaveMover 2 schema matches + would return target state == first is called, second is not

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aye -- yes! Thank you. I had meant to do that (and even added a code comment alluding to that in the protocol server unit testing), but neglected to actually do this. Will add these scenarios to the shared framework server unit testing to give us some more confidence that it does what it should while catching regressions in the future for these behaviors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually found an interesting quirk and thank you for talking with me about it! Due to our usage of IgnoreUndefinedAttributes it is possible to declare a schema that will not error on conversion (generally caused by invalid types), but instead will have null values. Due to this we are thinking to adjust our wording around checking SourceState for nil, since its not always guaranteed to be that with a different schema. The good news is that a subsequent plan against the target resource should catch data oddities, if for example null values were unexpectedly being moved.

Comment on lines 47 to 49
// TODO: Create framework defined type that is not protocol specific.
// Reference: https://github.com/hashicorp/terraform-plugin-framework/issues/340
SourceRawState *tfprotov6.RawState
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this in another comment, but this issue is closed now. Do we need to re-open it or remove this comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing our best action right now is removing it from the public API docs and keeping some form of it on internal/fwserver.MoveResourceStateRequest.SourceRawState as you mentioned in your other comment. 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a swing at this in the latest changes -- please reach out if you have other considerations or suggestions!

// If this request matches the intended implementation, the implementation
// logic must set [MoveStateResponse.TargetState] as it is intentionally not
// copied automatically.
SourceState *tfsdk.State
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a comment above about this field. If it's possible that the StateMover implementation may be called with this field as nil when StateMover.SourceSchema doesn't match, we may want to document that here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I see what you are getting at. Great suggestion. Will update to mention that, along with the fact that the framework will log those errors instead of returning them, similar to what is mentioned in the SourceSchema field documentation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a swing at this in the latest changes -- please reach out if you have other considerations or suggestions!

Comment on lines 32 to 41
// SourceSchema is an optional schema for the intended source resource state
// and schema version. While not required, setting this will populate
// [MoveStateRequest.SourceState] when possible similar to other [Resource]
// types. State conversion errors based on this schema are only logged at
// DEBUG level as there may be multiple [StateMover] implementations on the
// same target resource for differing source resources.
//
// If not set, source state data is only available in
// [MoveStateRequest.SourceRawState].
SourceSchema *schema.Schema
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention explicitly that state conversion errors will not cause the StateMover function to be skipped? It's kind of implied but was something that came to my mind when reading the interface.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely appreciate being more explicit 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a swing at this in the latest changes -- please reach out if you have other considerations or suggestions!


Note these caveats when implementing the `MoveState` method:

* An error is returned if the response state contains unknown values. Set all attributes to either null or known values in the response.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious q: Is this validation implemented on the Terraform core side?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had asked about it early on since this was being implemented at the same time as that new checking logic, but if it isn't already, we can certainly make the quick change on that side. 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs look great ❤️

Copy link
Member Author

@bflad bflad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take care of these things shortly (likely tomorrow 😄 ).

return nil
}

proto5 := &tfprotov6.MoveResourceStateResponse{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦅 👀 : Fixed in 46a1487.

// If this request matches the intended implementation, the implementation
// logic must set [MoveStateResponse.TargetState] as it is intentionally not
// copied automatically.
SourceState *tfsdk.State
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I see what you are getting at. Great suggestion. Will update to mention that, along with the fact that the framework will log those errors instead of returning them, similar to what is mentioned in the SourceSchema field documentation.

Comment on lines 32 to 41
// SourceSchema is an optional schema for the intended source resource state
// and schema version. While not required, setting this will populate
// [MoveStateRequest.SourceState] when possible similar to other [Resource]
// types. State conversion errors based on this schema are only logged at
// DEBUG level as there may be multiple [StateMover] implementations on the
// same target resource for differing source resources.
//
// If not set, source state data is only available in
// [MoveStateRequest.SourceRawState].
SourceSchema *schema.Schema
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely appreciate being more explicit 👍


Note these caveats when implementing the `MoveState` method:

* An error is returned if the response state contains unknown values. Set all attributes to either null or known values in the response.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had asked about it early on since this was being implemented at the same time as that new checking logic, but if it isn't already, we can certainly make the quick change on that side. 👍

Comment on lines 47 to 49
// TODO: Create framework defined type that is not protocol specific.
// Reference: https://github.com/hashicorp/terraform-plugin-framework/issues/340
SourceRawState *tfprotov6.RawState
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing our best action right now is removing it from the public API docs and keeping some form of it on internal/fwserver.MoveResourceStateRequest.SourceRawState as you mentioned in your other comment. 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aye -- yes! Thank you. I had meant to do that (and even added a code comment alluding to that in the protocol server unit testing), but neglected to actually do this. Will add these scenarios to the shared framework server unit testing to give us some more confidence that it does what it should while catching regressions in the future for these behaviors.

Comment on lines +131 to +143
if err != nil {
logging.FrameworkDebug(
ctx,
"Error unmarshalling SourceRawState using the provided SourceSchema for source "+
req.SourceProviderAddress+" resource type "+
req.SourceTypeName+" with schema version "+
strconv.FormatInt(req.SourceSchemaVersion, 10)+". "+
"This is not a fatal error since resources can support multiple source resources which cause this type of error to be unavoidable, "+
"but due to this error the SourceState will not be populated for the implementation.",
map[string]any{
logging.KeyError: err,
},
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may have found out later in reading the implementation but yeah, the comment at the beginning of this error message conditional here could probably use a little more verbiage. This logic essentially will always run if SourceSchema is set but before its ever passed off to StateMover provider-defined logic which might "skip" based on the other request fields (source resource type, source provider address, etc.).

We could technically support some upfront "filtering" via framework logic before it reaches here by adding SourceTypeName, etc. fields next to SourceSchema in StateMover that have some opinionated behaviors, but I was hesitant to immediately introduce that. 😄

SourcePrivate *privatestate.Data
SourceProviderAddress string
SourceSchemaVersion int64
SourceRawState *tfprotov6.RawState
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a comment here is an excellent suggestion, will do.

@bflad bflad merged commit 86b2acb into main Feb 16, 2024
28 checks passed
@bflad bflad deleted the bflad/MoveResourceState branch February 16, 2024 22:05
Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 18, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants