Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default polymorphic type handler for deserialization not working for Protobuf #2598

Closed
ShreckYe opened this issue Mar 8, 2024 · 9 comments
Closed

Comments

@ShreckYe
Copy link
Contributor

ShreckYe commented Mar 8, 2024

Describe the bug

Default polymorphic type handler for deserialization is not working for Protobuf. Originally found while working on #2587 here.

To Reproduce
Attach a code snippet or test data if possible.

Code (adpated from example-poly-19.kt):

@Serializable
abstract class Project {
    abstract val name: String
}

@Serializable
data class NamedProject(@ProtoNumber(1) override val name: String) : Project()

@Serializable
data class BasicProject(@ProtoNumber(1) val type: String, @ProtoNumber(1) override val name: String) : Project()

val module = SerializersModule {
    polymorphic(Project::class) {
        subclass(NamedProject::class)
        defaultDeserializer { BasicProject.serializer() }
    }
}
val json = Json { serializersModule = module }
val protoBuf = ProtoBuf { serializersModule = module }

@OptIn(ExperimentalStdlibApi::class)
fun main() {
    fun decodeProjectTest(serialName: String) {
        println(json.decodeFromString<Project>("""{"type":"$serialName","name":"test"}"""))

        val serializedNameMessageString = "12" + "06" + "0a" + "04" + "test".encodeToByteArray().toHexString()
        println(
            protoBuf.decodeFromHexString<Project>(
                "0a" +
                    serialName.length.toByte().toHexString() +
                    serialName.encodeToByteArray().toHexString() +
                    serializedNameMessageString
            )
        )
    }
    decodeProjectTest(NamedProject.serializer().descriptor.serialName)
    decodeProjectTest("NotRegisteredProject")
}

Output:

NamedProject(name=test)
NamedProject(name=test)
BasicProject(type=NotRegisteredProject, name=test)
Exception in thread "main" kotlinx.serialization.MissingFieldException: Field 'type' is required for type with serial name 'example.examplePoly19.BasicProject', but it was missing

Expected behavior

Environment

Commit 1f7372a of this project.

  • Kotlin version: [e.g. 1.9.22]
  • Library version: [e.g. 1.6.4-SNAPSHOT]
  • Kotlin platforms: [e.g. JVM]
  • Gradle version: [e.g. 7.6.1]
  • IDE version (if bug is related to the IDE) [e.g. IntelliJ IDEA 2023.3.4 (Community Edition)]
  • Other relevant context [OS version Ubuntu 22.04.4 LTS, JRE version 17]
@pdvrieze
Copy link
Contributor

pdvrieze commented Mar 8, 2024

There is an error in your class: BasicProject(@ProtoNumber(1) val type: String, @ProtoNumber(1) override val name: String) You reuse the ProtoNumber. Also the type member for polymorphic serialization is not part of the actual serialized type.

The polymorphic serializer works by injecting a synthetics structure with 2 members: type and value. Formats can then hide this by having a special type member/attribute (as such the json deserialization works by accident - it can't distinguish between the generated (and inlined) type attribute and the explicit type attribute that is part of BasicProject).

@sandwwraith
Copy link
Member

Yes, I think that the problem is that type and name have the same proto numbers. As we build Int -> class property map on this data, only name got deserialized.

Also, pay attention that only Json supports polymorphism with an embedded 'type' key. Other formats use default polymorphism, which is array-based (see PolymorphicSerializer.descriptor structure)

@ShreckYe
Copy link
Contributor Author

There is an error in your class: BasicProject(@ProtoNumber(1) val type: String, @ProtoNumber(1) override val name: String) You reuse the ProtoNumber.

There are 2 things I'd like to explain for this. Firstly, after either I remove all the @ProtoNumber annotations, or I give all names a different @ProtoNumber such as @ProtoNumber(2), it's still the same exception with a slightly different message in that now it's the field name missing instead of type:

Exception in thread "main" kotlinx.serialization.MissingFieldException: Field 'name' is required for type with serial name 'example.examplePoly19.BasicProject', but it was missing

And secondly, I was actually doing this intentionally. As you both explained:

The polymorphic serializer works by injecting a synthetics structure with 2 members: type and value.

Also, pay attention that only Json supports polymorphism with an embedded 'type' key. Other formats use default polymorphism, which is array-based (see PolymorphicSerializer.descriptor structure)

So type would be a member in the top-level structure, and name would be in the nested value structure, and then they shouldn't conflict with each other while using the same proto number. I was trying to test this.

@sandwwraith
Copy link
Member

sandwwraith commented Mar 11, 2024

I see. So, the input is:

0a144e6f745265676973746572656450726f6a65637412060a0474657374
Field #1: 0A String Length = 20, Hex = 14, UTF8 = "NotRegisteredProject"
Field #2: 12 String Length = 6, Hex = 06
    As sub-object :
    Field #1: 0A String Length = 4, Hex = 04, UTF8 = "test"

(you can use https://protogen.marcgravell.com/decode to get this kind of info)

And you're trying protoBuf.decodeFromHexString<Project>, expecting it to call defaultDeserializer { BasicProject.serializer() }. Did I get it right?

In that case, I do not see how name can be deserialized because it is missing from the input. The error Field 'name' is required for type with serial name '...', but it was missing is correct — provided that name has proto number 2. If it has proto number 1, then it is a clash of numbers, as I explained earlier. In case name is 1 and type is 2, you get Field 'type' is required for type with serial name 'BasicProject', but it was missing, which is still true.

@ShreckYe

This comment was marked as outdated.

@ShreckYe
Copy link
Contributor Author

ShreckYe commented Mar 12, 2024

Oh I think I get what went wrong (for details see the hidden comment above). After I remove type from BasicProject, the code works:

@Serializable
data class BasicProject(@ProtoNumber(2) override val name: String) : Project()

So does this mean that with Protobuf I can't deserialize type to a string property with a default deserializer in the way we do with JSON as instructed in the guide (I know there is a className parameter in the lambda parameter of defaultDeserializer that I can use, but it's a different approach then)? And decoding the type in default serialization is currently somewhat format-dependent?

@sandwwraith
Copy link
Member

Yes, in the case of default (array) polymorphic serialization you cannot access type afterward, as the deserializer gets the inner object directly, which doesn't exist there unless you specifically duplicate it in input.

And decoding the type in default serialization is currently somewhat format-dependent?

It is for Json. If you enable deprecated useArrayPolymorphism in Json, which essentially disables all specific handling of type inside the object, you get similar behavior to protobuf and other formats. For array-polymorphic Json, if input would look like ["NotRegisteredProject", {"name": "foo"}], the error would be the same — there's no type key inside curly braces of the inner object.

@ShreckYe
Copy link
Contributor Author

@sandwwraith Thank you very much for clarifying. I shall close this issue then. I think it might be helpful to point this out in the guide here, however.

@pdvrieze
Copy link
Contributor

@sandwwraith Thank you very much for clarifying. I shall close this issue then. I think it might be helpful to point this out in the guide here, however.

I had a look at the guide. The use of type there in BasicProject is not great:

  • It is misleading in that it appears needed for polymorphism (but actually is only valid because of the use as a default serializer
  • It is format specific (it captures the polymorphic type attribute by accident) - it only works for Json. Instead a serializer with the type should be created to be format agnostic.

ShreckYe added a commit to huanshankeji/kotlinx.serialization that referenced this issue Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants