Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: flatten allOf models #1524

Draft
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

krishanjmistry
Copy link
Contributor

@krishanjmistry krishanjmistry commented Dec 22, 2023

Primarily, this is an attempt to get further on #599.

What the solution has ended up looking like is [flattening/merging/composing] the allOf schema properties into the child models - there is no longer a use of #[serde(flatten)].
For starters, I'd say this provides a neater API to the end user - more resemblant of what we see in Bicep and easier to reason about.

It also removes some ambiguities in the models, where a property could be defined multiple times in the created models. For an example of this, see #1524 (comment)

@krishanjmistry krishanjmistry changed the title WIP - comments welcome: flatten allOf models WIP: flatten allOf models Dec 22, 2023
@krishanjmistry
Copy link
Contributor Author

krishanjmistry commented Dec 22, 2023

Example where generated model is previously ambiguous

Using azure-rest-api-specs/specification/vmware/resource-manager/Microsoft.AVS/stable/2020-03-20/vmware.json - note this is an older tag than the latest, but sufficient as an example

  • ClusterProperties
    • has a property of provisioningState
    • takes an allOf over #/definitions/ManagementCluster
  • ManagementCluster
    • has a property of provisioningState
    • takes an allOf over #/definitions/ClusterUpdateProperties

Deserializing:
Whilst deserializing this response into a Rust model is successful, it's ambiguous where the provisioningState would end up:

#[cfg(feature = "package-2020-03-20")]
#[test]
fn test_cluster_get() -> anyhow::Result<()> {
    use anyhow::{bail, ensure};
    use azure_mgmt_vmware::package_2020_03_20::models::*;
    // copied from specification\vmware\resource-manager\Microsoft.AVS\stable\2023-03-01\examples\Clusters_Get.json
    let json = br#"
    {
        "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
        "name": "cluster1",
        "sku": {
          "name": "AV20"
        },
        "properties": {
          "clusterSize": 4,
          "hosts": [
            "fakehost22.nyc1.kubernetes.center",
            "fakehost23.nyc1.kubernetes.center",
            "fakehost24.nyc1.kubernetes.center",
            "fakehost25.nyc1.kubernetes.center"
          ],
          "provisioningState": "Succeeded"
        },
        "type": "Microsoft.AVS/privateClouds/clusters"
      }
      "#;

    let cluster: Cluster = serde_json::from_slice(json)?;
    println!("{:#?}", cluster);
    Ok(())
}
Cluster {
    resource: Resource {
        id: Some(
            "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
        ),
        name: Some(
            "cluster1",
        ),
        type_: Some(
            "Microsoft.AVS/privateClouds/clusters",
        ),
    },
    sku: Sku {
        name: "AV20",
    },
    properties: ClusterProperties {
        management_cluster: ManagementCluster {
            cluster_update_properties: ClusterUpdateProperties {
                cluster_size: Some(
                    4,
                ),
            },
            provisioning_state: None,
            cluster_id: None,
            hosts: [
                "fakehost22.nyc1.kubernetes.center",
                "fakehost23.nyc1.kubernetes.center",
                "fakehost24.nyc1.kubernetes.center",
                "fakehost25.nyc1.kubernetes.center",
            ],
        },
        provisioning_state: Some(
            Succeeded,
        ),
    },
}

Serializing:

#[cfg(feature = "package-2020-03-20")]
#[test]
fn serialize_cluster() -> anyhow::Result<()> {
    use anyhow::{bail, ensure};
    use azure_mgmt_vmware::package_2020_03_20::models::*;

    let cluster = Cluster {
        resource: Resource {
            id: Some(
                "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1".to_string(),
            ),
            name: Some(
                "cluster1".to_string(),
            ),
            type_: Some(
                "Microsoft.AVS/privateClouds/clusters".to_string(),
            ),
        },
        sku: Sku {
            name: "AV20".to_string(),
        },
        properties: ClusterProperties {
            management_cluster: ManagementCluster {
                cluster_update_properties: ClusterUpdateProperties {
                    cluster_size: Some(
                        4,
                    ),
                },
                provisioning_state: Some(ClusterProvisioningState::Updating),
                cluster_id: None,
                hosts: vec![
                    "fakehost22.nyc1.kubernetes.center".to_string(),
                    "fakehost23.nyc1.kubernetes.center".to_string(),
                    "fakehost24.nyc1.kubernetes.center".to_string(),
                    "fakehost25.nyc1.kubernetes.center".to_string(),
                ],
            },
            provisioning_state: Some(
                ClusterProvisioningState::Succeeded,
            ),
        },
    };

    let json = serde_json::to_string_pretty(&cluster)?;
    println!("{}", json);
    Ok(())
}

This means we end up with two provisioningState properties within the serialized result.

{
  "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
  "name": "cluster1",
  "type": "Microsoft.AVS/privateClouds/clusters",
  "sku": {
    "name": "AV20"
  },
  "properties": {
    "clusterSize": 4,
    "provisioningState": "Updating",
    "hosts": [
      "fakehost22.nyc1.kubernetes.center",
      "fakehost23.nyc1.kubernetes.center",
      "fakehost24.nyc1.kubernetes.center",
      "fakehost25.nyc1.kubernetes.center"
    ],
    "provisioningState": "Succeeded"
  }
}

@krishanjmistry
Copy link
Contributor Author

Now, the example I've used in #1524 (comment)

serializes as:

#[cfg(feature = "package-2020-03-20")]
#[test]
fn serialize_cluster() -> anyhow::Result<()> {
    use azure_mgmt_vmware::package_2020_03_20::models::*;

    let cluster = Cluster {
        id: Some(
            "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1".to_string(),
        ),
        name: Some(
            "cluster1".to_string(),
        ),
        type_: Some(
            "Microsoft.AVS/privateClouds/clusters".to_string(),
        ),
        sku: Sku {
            name: "AV20".to_string(),
        },
        properties: ClusterProperties {
            cluster_size: 4,
            provisioning_state: Some(ClusterProvisioningState::Updating),
            cluster_id: None,
            hosts: vec![
                "fakehost22.nyc1.kubernetes.center".to_string(),
                "fakehost23.nyc1.kubernetes.center".to_string(),
                "fakehost24.nyc1.kubernetes.center".to_string(),
                "fakehost25.nyc1.kubernetes.center".to_string(),
            ],
        },
    };

    let json = serde_json::to_string_pretty(&cluster)?;
    println!("{}", json);
    Ok(())
}

into

{
  "sku": {
    "name": "AV20"
  },
  "properties": {
    "provisioningState": "Updating",
    "hosts": [
      "fakehost22.nyc1.kubernetes.center",
      "fakehost23.nyc1.kubernetes.center",
      "fakehost24.nyc1.kubernetes.center",
      "fakehost25.nyc1.kubernetes.center"
    ],
    "clusterSize": 4
  },
  "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
  "name": "cluster1",
  "type": "Microsoft.AVS/privateClouds/clusters"
}

deserializes as:

#[cfg(feature = "package-2020-03-20")]
#[test]
fn test_cluster_get() -> anyhow::Result<()> {
    use anyhow::ensure;
    use azure_mgmt_vmware::package_2020_03_20::models::*;
    // copied from specification\vmware\resource-manager\Microsoft.AVS\stable\2023-03-01\examples\Clusters_Get.json
    let json = br#"
    {
        "id": "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/group1/providers/Microsoft.AVS/privateClouds/cloud1/clusters/cluster1",
        "name": "cluster1",
        "sku": {
          "name": "AV20"
        },
        "properties": {
          "clusterSize": 4,
          "hosts": [
            "fakehost22.nyc1.kubernetes.center",
            "fakehost23.nyc1.kubernetes.center",
            "fakehost24.nyc1.kubernetes.center",
            "fakehost25.nyc1.kubernetes.center"
          ],
          "provisioningState": "Succeeded"
        },
        "type": "Microsoft.AVS/privateClouds/clusters"
      }
      "#;

    let cluster: Cluster = serde_json::from_slice(json)?;
    println!("{:#?}", cluster);
    ensure!(cluster.name == Some("cluster1".to_string()));
    ensure!(cluster.sku.name == "AV20".to_string());
    ensure!(cluster.properties.cluster_size == 4);
    ensure!(cluster.properties.hosts.len() == 4);
    ensure!(cluster.properties.provisioning_state == Some(ClusterProvisioningState::Succeeded));
    Ok(())
}

@cataggar
Copy link
Member

Example where generated model is previously ambiguous

Using azure-rest-api-specs/specification/vmware/resource-manager/Microsoft.AVS/stable/2020-03-20/vmware.json - note this is an older tag than the latest, but sufficient as an example

Hi @krishanjmistry, I appreciate using the vmware spec as the example, since that I the one I manage. Having duplicate entries in the spec like this is an error. I am not sure it is something that we should support. In this particular case, I fixed it in the next version 2021-06-01.

@cataggar
Copy link
Member

cataggar commented Dec 22, 2023

For starters, I'd say this provides a neater API to the end user - more resemblant of what we see in Bicep and easier to reason about.

I think there is room for improvement here. I think there are some advantages of keeping the structs to match the definitions. There are ways to improve the API to the end user without changing the structs. For #601, I brought up using the builder pattern. @demoray, then mentioned:

I think having a middle ground of providing something akin to the setters! macro would be sufficient for many (most?) uses.

@krishanjmistry
Copy link
Contributor Author

@cataggar, thanks for taking a look.

Hi @krishanjmistry, I appreciate using the vmware spec as the example, since that I the one I manage. Having duplicate entries in the spec like this is an error. I am not sure it is something that we should support. In this particular case, I fixed it in the next version 2021-06-01.

I did notice that I was picking on an old tag! More generally, the main thing that I believe would influence whether we support this case is how prevalent this is in the existing specs. I think it's worth noting that I don't think such a thing could be described in TypeSpec (which I believe is where the specs are heading).

Likewise, I also don't think the scenario in #599 could be described in TypeSpec. In TypeSpec, it seems a property can either be defined required or optional in its base model. In other words, it should be possible to inherit the required properties, but not overwrite them.

I think there is room for improvement here. I think there are some advantages of keeping the structs to match the definitions. There are ways to improve the API to the end user without changing the structs.

My main concern here would be, in the case of if we do want to support a scenario like #599 - the models won't reflect what gets [de]serialized. I do like the suggestion of a builder pattern/setters! macro.

It seems there's two questions, on both I'll try and get some stats:

  1. how prevalent are duplicated properties in models that include allOf?
  2. how prevalent is the issue in required parameters in allOf model  #599, where a required property is not set in the base class?

@krishanjmistry
Copy link
Contributor Author

Answering the questions I left at the end of the last comment.

  1. how prevalent are duplicated properties in models that include allOf?

    • 39 resource manager crates have schemas where the property is defined in both the schema, and the allOf it takes.
    • Among those 39 resource manager crates, there's 435 cases in total.
    • Here's a CSV containing all the places: all_of_issues.csv
  2. how prevalent is it that a property is mentioned to be required outside of the base class?

    • 46 resource manager crates
    • 4 data plane crates
    • 1193 cases across both sets of crates
    • Here's a CSV containing all the places: required_issues.csv

@heaths
Copy link
Member

heaths commented Jan 18, 2024

FWIW - and I don't know if/how this factors in here - but for context, all the other language code generators have traditionally treated the first allOf as derivation (if that even applies in this case) and any subsequent allOfs as flattened members. Also worth keeping in mind when we start supporting a TypeSpec-compiled object model.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants