Enable visiting descriptors-tree #2632

Closed
Chuckame opened this issue Apr 15, 2024 · 6 comments

Comments

@Chuckame
Contributor

What is your use-case and why do you need this feature?
There is no official way of reaching the descriptor tree.

In schema-based formats such as Protobuf or Avro, we need to read the full descriptor tree from the root serializer to generate the corresponding schemas. The kotlinx.serialization library could provide this logic so that the descriptors are easy to reach.

Describe the solution you'd like
Here is my current implementation. It has been made following the same concepts, and is customisable.

All we need is to implement the interfaces related to each descriptor kind or key concept:

  • SerialDescriptorValueVisitor to visit a generic value. It's also the entrypoint for all the other visitors
  • SerialDescriptorMapVisitor when kind is StructureKind.MAP to visit its key and value descriptors
  • SerialDescriptorListVisitor when kind is StructureKind.LIST to visit its item descriptor
  • SerialDescriptorPolymorphicVisitor when kind is PolymorphicKind to visit its implementation(s) descriptors
  • SerialDescriptorClassVisitor when kind is StructureKind.CLASS to visit its fields descriptors
  • SerialDescriptorInlineClassVisitor when descriptor.isInline is true (same workflow as Encoder.encodeInline)

Note that all the interfaces can be implemented by the same class, as each method has a different name.

All the methods follow the same logic:

  • When a value is a scalar (primitive, enum and object kinds), then the visit method returns Unit as we do not need to visit deeper.
  • When a value is something else (contextual, structure and polymorphic kinds), then the visit method returns the related interface or null if we want to stop the visit.

Here is an image showing all the interfaces and their methods:
[image: diagram of the visitor interfaces and their methods]

And the code:

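@file:OptIn(ExperimentalSerializationApi::class, InternalSerializationApi::class)

import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.InternalSerializationApi
import kotlinx.serialization.descriptors.PolymorphicKind
import kotlinx.serialization.descriptors.PrimitiveKind
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.descriptors.SerialKind
import kotlinx.serialization.descriptors.StructureKind
import kotlinx.serialization.descriptors.capturedKClass
import kotlinx.serialization.descriptors.elementDescriptors
import kotlinx.serialization.descriptors.getContextualDescriptor
import kotlinx.serialization.descriptors.getPolymorphicDescriptors
import kotlinx.serialization.modules.SerializersModule
import kotlinx.serialization.serializerOrNull

interface SerialDescriptorValueVisitor {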
    val serializersModule: SerializersModule

    /**
     * Called when the [descriptor]'s kind is a [PrimitiveKind].
     */
    fun visitPrimitive(
        descriptor: SerialDescriptor,
        kind: PrimitiveKind,
    )

    /**
     * Called when the [descriptor]'s kind is an [SerialKind.ENUM].
     */
    fun visitEnum(descriptor: SerialDescriptor)

    /**
     * Called when the [descriptor]'s kind is a [StructureKind.OBJECT].
     */
    fun visitObject(descriptor: SerialDescriptor)

    /**
     * Called when the [descriptor]'s kind is a [PolymorphicKind].
     * @return null if we don't want to visit the polymorphic type
     */
    fun visitPolymorphic(
        descriptor: SerialDescriptor,
        kind: PolymorphicKind,
    ): SerialDescriptorPolymorphicVisitor?

    /**
     * Called when the [descriptor]'s kind is a [StructureKind.CLASS].
     * Note that when the [descriptor] is an inline class, [visitInlineClass] is called instead.
     * @return null if we don't want to visit the class
     */
    fun visitClass(descriptor: SerialDescriptor): SerialDescriptorClassVisitor?

    /**
     * Called when the [descriptor]'s kind is a [StructureKind.LIST].
     * @return null if we don't want to visit the list
     */
    fun visitList(descriptor: SerialDescriptor): SerialDescriptorListVisitor?

    /**
     * Called when the [descriptor]'s kind is a [StructureKind.MAP].
     * @return null if we don't want to visit the map
     */
    fun visitMap(descriptor: SerialDescriptor): SerialDescriptorMapVisitor?

    /**
     * Called when the [descriptor] describes a value class (i.e. its kind is [StructureKind.CLASS] and [SerialDescriptor.isInline] is true).
     * @return null if we don't want to visit the inline class
     */
    fun visitInlineClass(descriptor: SerialDescriptor): SerialDescriptorInlineClassVisitor?

    fun visitValue(descriptor: SerialDescriptor) {
        if (descriptor.isInline) {
            visitInlineClass(descriptor)?.apply {
                visitInlineClassElement(descriptor, 0)?.visitValue(descriptor.getElementDescriptor(0))
            }
        } else {
            when (descriptor.kind) {
                is PrimitiveKind -> visitPrimitive(descriptor, descriptor.kind as PrimitiveKind)
                SerialKind.ENUM -> visitEnum(descriptor)
                SerialKind.CONTEXTUAL -> visitValue(descriptor.getNonNullContextualDescriptor(serializersModule))
                StructureKind.CLASS ->
                    visitClass(descriptor)?.apply {
                        for (elementIndex in (0 until descriptor.elementsCount)) {
                            visitClassElement(descriptor, elementIndex)?.visitValue(descriptor.getElementDescriptor(elementIndex))
                        }
                    }?.endClassVisit(descriptor)

                StructureKind.LIST ->
                    visitList(descriptor)?.apply {
                        visitListItem(descriptor, 0)?.visitValue(descriptor.getElementDescriptor(0))
                    }?.endListVisit(descriptor)

                StructureKind.MAP ->
                    visitMap(descriptor)?.apply {
                        visitMapKey(descriptor, 0)?.visitValue(descriptor.getElementDescriptor(0))
                        visitMapValue(descriptor, 1)?.visitValue(descriptor.getElementDescriptor(1))
                    }?.endMapVisit(descriptor)

                is PolymorphicKind ->
                    visitPolymorphic(descriptor, descriptor.kind as PolymorphicKind)?.apply {
                        descriptor.possibleSerializationSubclasses(serializersModule).sortedBy { it.serialName }.forEach { implementationDescriptor ->
                            visitPolymorphicFoundDescriptor(implementationDescriptor)?.visitValue(implementationDescriptor)
                        }
                    }?.endPolymorphicVisit(descriptor)

                StructureKind.OBJECT -> visitObject(descriptor)
            }
        }
    }
}

interface SerialDescriptorMapVisitor {
    /**
     * @return null if we don't want to visit the map key
     */
    fun visitMapKey(
        mapDescriptor: SerialDescriptor,
        keyElementIndex: Int,
    ): SerialDescriptorValueVisitor?

    /**
     * @return null if we don't want to visit the map value
     */
    fun visitMapValue(
        mapDescriptor: SerialDescriptor,
        valueElementIndex: Int,
    ): SerialDescriptorValueVisitor?

    fun endMapVisit(descriptor: SerialDescriptor)
}

interface SerialDescriptorListVisitor {
    /**
     * @return null if we don't want to visit the list item
     */
    fun visitListItem(
        listDescriptor: SerialDescriptor,
        itemElementIndex: Int,
    ): SerialDescriptorValueVisitor?

    fun endListVisit(descriptor: SerialDescriptor)
}

interface SerialDescriptorPolymorphicVisitor {
    /**
     * @return null if we don't want to visit the found polymorphic descriptor
     */
    fun visitPolymorphicFoundDescriptor(descriptor: SerialDescriptor): SerialDescriptorValueVisitor?

    fun endPolymorphicVisit(descriptor: SerialDescriptor)
}

interface SerialDescriptorClassVisitor {
    /**
     * @return null if we don't want to visit the class element
     */
    fun visitClassElement(
        descriptor: SerialDescriptor,
        elementIndex: Int,
    ): SerialDescriptorValueVisitor?

    fun endClassVisit(descriptor: SerialDescriptor)
}

interface SerialDescriptorInlineClassVisitor {
    /**
     * @return null if we don't want to visit the inline class element
     */
    fun visitInlineClassElement(
        inlineClassDescriptor: SerialDescriptor,
        inlineElementIndex: Int,
    ): SerialDescriptorValueVisitor?
}

private fun SerialDescriptor.getNonNullContextualDescriptor(serializersModule: SerializersModule) =
    requireNotNull(serializersModule.getContextualDescriptor(this) ?: this.capturedKClass?.serializerOrNull()?.descriptor) {
        "No descriptor found in serialization context for $this"
    }

private fun SerialDescriptor.possibleSerializationSubclasses(serializersModule: SerializersModule): Sequence<SerialDescriptor> {
    return when (this.kind) {
        PolymorphicKind.SEALED ->
            elementDescriptors.asSequence()
                .filter { it.kind == SerialKind.CONTEXTUAL }
                .flatMap { it.elementDescriptors }
                .flatMap { it.possibleSerializationSubclasses(serializersModule) }

        PolymorphicKind.OPEN ->
            serializersModule.getPolymorphicDescriptors(this@possibleSerializationSubclasses).asSequence()
                .flatMap { it.possibleSerializationSubclasses(serializersModule) }

        SerialKind.CONTEXTUAL -> sequenceOf(getNonNullContextualDescriptor(serializersModule))

        else -> sequenceOf(this)
    }
}

What do you think? I can open a PR if needed.

@Chuckame changed the title from "Enable descriptors-tree visit" to "Enable visiting descriptors-tree" on Apr 15, 2024
@sandwwraith
Member

Do you have any particular reason to use the Visitor pattern specifically? There are existing APIs that let you simply iterate over sub-descriptors (e.g., public val SerialDescriptor.elementDescriptors: Iterable<SerialDescriptor>). In my [personal] opinion, the Visitor pattern is outdated now and should be replaced with FP operations on collections and iterables, such as map or filter, plus when over subtypes or kinds where necessary. That results in more concise and compact code with the same meaning: no need to override a bunch of different functions, the code can be read top-down without additional navigation, etc. Your own SerialDescriptor.possibleSerializationSubclasses is a good example of that. See also a similar ticket in kotlin-metadata-jvm: https://youtrack.jetbrains.com/issue/KT-59442
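
As a rough illustration of the `when`-over-kinds style described above, here is a minimal sketch that renders a descriptor as a compact type expression. The function name `describe` and the output syntax are assumptions made up for this example, and the descriptor is built by hand so the snippet does not depend on the compiler plugin. It does not guard against recursive types, which a real traversal would need to do.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.descriptors.PolymorphicKind
import kotlinx.serialization.descriptors.PrimitiveKind
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.descriptors.SerialKind
import kotlinx.serialization.descriptors.StructureKind
import kotlinx.serialization.descriptors.buildClassSerialDescriptor
import kotlinx.serialization.descriptors.element
import kotlinx.serialization.descriptors.elementDescriptors

// Hypothetical helper: one top-down function that branches on the kind
// instead of overriding one visitor method per kind.
// Caveat: a self-referencing type would recurse forever here.
@OptIn(ExperimentalSerializationApi::class)
fun describe(descriptor: SerialDescriptor): String = when (descriptor.kind) {
    is PrimitiveKind -> descriptor.serialName
    SerialKind.ENUM -> "enum ${descriptor.serialName}"
    SerialKind.CONTEXTUAL -> "contextual ${descriptor.serialName}"
    StructureKind.LIST -> "list<${describe(descriptor.getElementDescriptor(0))}>"
    StructureKind.MAP -> {
        val key = describe(descriptor.getElementDescriptor(0))
        val value = describe(descriptor.getElementDescriptor(1))
        "map<$key, $value>"
    }
    StructureKind.CLASS, StructureKind.OBJECT ->
        descriptor.elementDescriptors.joinToString(
            prefix = "${descriptor.serialName}(",
            postfix = ")",
        ) { describe(it) }
    is PolymorphicKind -> "polymorphic ${descriptor.serialName}"
}

@OptIn(ExperimentalSerializationApi::class)
fun main() {
    // Hand-built descriptor standing in for a plugin-generated one.
    val user = buildClassSerialDescriptor("User") {
        element<String>("name")
        element<List<Int>>("scores")
    }
    println(describe(user))
}
```

The whole traversal lives in a single expression, which is the readability argument being made here; whether that scales to a full schema generator is what the rest of the thread debates.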

@sandwwraith
Member

In any case, this seems like something that can be implemented on top of the kotlinx-serialization-core and even published as an additional utility library when necessary. So it is unlikely that such functionality will be added to the core, but it can be maintained by the community if there's a demand for that.

@sandwwraith closed this as not planned on Apr 17, 2024
@Chuckame
Contributor Author

I don't have a strong reason for using the visitor pattern. I just wanted something similar to encoders and decoders, to be sure not to forget anything.

The main idea is not the visitor pattern itself, but to have a standard way of going through the descriptors without a value (to make schemas, debug, generate reports, ...).

The visitor pattern just came naturally (I'm not that old! 😄). To be honest, at the beginning, I just copy/pasted the Decoder interfaces and mainly removed the return types. After that, I reverse-engineered the plugin-generated serializers to properly understand their workflow.

Btw, the Jackson library does this visitor stuff internally, and it allows users to easily implement new formats without having to think about how a map, a list, a class, etc. is described.

@sandwwraith
Member

> to have a standard way of going through the descriptors without a value (to make schemas, debug, generate reports, ...).

You can take a look at the protobuf schema generator, it uses SerialDescriptor.elementDescriptors/getElementDescriptor: https://github.com/Kotlin/kotlinx.serialization/blob/master/formats/protobuf/commonMain/src/kotlinx/serialization/protobuf/schema/ProtoBufSchemaGenerator.kt#L142

Although this API is indeed not showcased anywhere. We likely should add a section to https://github.com/Kotlin/kotlinx.serialization/blob/master/docs/formats.md#custom-formats-experimental with an explanation of how one can write a schema generator for a custom format.
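
To give an idea of what such a section might cover, here is a minimal sketch of a schema generator built only on index-based descriptor access (`elementsCount`, `getElementName`, `getElementDescriptor`, `isElementOptional`). The `toySchema` function and its `record` output syntax are invented for this example; they are not the ProtoBuf generator's implementation or output format.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.descriptors.StructureKind
import kotlinx.serialization.descriptors.buildClassSerialDescriptor
import kotlinx.serialization.descriptors.element

// Hypothetical toy schema writer for a flat class descriptor.
// Field numbers are simply the element index plus one (an assumption).
@OptIn(ExperimentalSerializationApi::class)
fun toySchema(descriptor: SerialDescriptor): String {
    require(descriptor.kind == StructureKind.CLASS) {
        "Expected a class descriptor, got ${descriptor.kind}"
    }
    val fields = (0 until descriptor.elementsCount).joinToString("\n") { index ->
        val name = descriptor.getElementName(index)
        val element = descriptor.getElementDescriptor(index)
        // Treat both default-valued and nullable elements as optional.
        val optional =
            if (descriptor.isElementOptional(index) || element.isNullable) "optional " else ""
        "  $optional${element.serialName} $name = ${index + 1};"
    }
    return "record ${descriptor.serialName} {\n$fields\n}"
}

@OptIn(ExperimentalSerializationApi::class)
fun main() {
    // Hand-built descriptor so the sketch runs without the compiler plugin.
    val user = buildClassSerialDescriptor("User") {
        element<String>("name")
        element<Int>("age", isOptional = true)
    }
    println(toySchema(user))
}
```

A real generator would also recurse into nested classes, lists, maps and polymorphic types, which is where the size and readability concerns raised in this thread come in.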

@sandwwraith
Member

#2643

@Chuckame
Contributor Author

We are currently using nearly all the same APIs for generating schemas. In my opinion, a 500-line class that generates a schema is less readable than a well-structured visitor pattern, and it's difficult to quickly check what is generated where, depending on the kind or descriptor.
