Implementation of the RecordNameStrategy for Protobuf, JSON and Avro(Generic and Specific) #1063

Open
wants to merge 2 commits into
base: master


Contributor

@djedjethai djedjethai commented Sep 23, 2023

Hi, here is a full implementation of the RecordNameStrategy for Protobuf, JSON and Avro (Generic and Specific).

It follows this interface:

// The optional 'subject' asserts the fullyQualifiedName of the message.
SerializeRecordName(msg interface{}, subject ...string) ([]byte, error)

DeserializeRecordName(payload []byte) (interface{}, error)
DeserializeIntoRecordName(subjects map[string]interface{}, payload []byte) error

  • For Protobuf, the solution was obvious since the protobuf schema contains the fullyQualifiedName of the message (which also appears within the SchemaRegistry API record)

  • When serializing JSON messages, I use the Go fullyQualifiedName (rather than leaving the name empty) and add it to the schemaBytes. For example, if it is 'main.Person', 'main' goes into the namespace (of the schemaInfo) and 'Person' remains the name. On the consumer side, the consumer extracts these two fields, rebuilds the fullyQualifiedName, and can then identify the matching instance for deserializing the bytes. Since the schemaInfo is stored in the cache, lookups from the cache work the same way. The default MessageFactory simply returns a *map[string]interface{}.

  • For AvroGeneric, the process is similar to JSON since it doesn't have a specific fullyQualifiedName. In the default MessageFactory, I use 'github.com/linkedin/goavro' to deserialize the bytes without specifying an Avro schema.

  • For AvroSpecific, we use the schema's defined namespace if available; otherwise, we use the Go fullyQualifiedName. The default MessageFactory also uses 'github.com/linkedin/goavro'.
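The namespace/name handling described in the bullets above can be sketched roughly as follows. This is only an illustration, not the code from the pull request: the helpers `fullyQualifiedName`, `splitName` and `joinName` and the `Person` type are hypothetical names, and it assumes the Go name is obtained through reflection.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Person is a hypothetical message type used only for this illustration.
type Person struct {
	Name string
}

// fullyQualifiedName returns the package-qualified Go type name of msg,
// for example "main.Person". Pointers are dereferenced first.
func fullyQualifiedName(msg interface{}) string {
	t := reflect.TypeOf(msg)
	if t == nil {
		return ""
	}
	for t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	return t.String()
}

// splitName separates a fully qualified name into the part stored as the
// namespace and the part stored as the name (producer side).
func splitName(fqn string) (namespace, name string) {
	i := strings.LastIndex(fqn, ".")
	if i < 0 {
		return "", fqn
	}
	return fqn[:i], fqn[i+1:]
}

// joinName rebuilds the fully qualified name from the two fields
// (consumer side).
func joinName(namespace, name string) string {
	if namespace == "" {
		return name
	}
	return namespace + "." + name
}

func main() {
	ns, name := splitName(fullyQualifiedName(&Person{}))
	fmt.Println(ns, name, joinName(ns, name))
}
```

Splitting on the last dot keeps any dotted namespace intact, which matters for Avro or Protobuf names such as 'com.example.Person'.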

In all four cases, the fullyQualifiedName is present in the schemaInfo, making it conveniently consultable within the SchemaRegistry API record. Additionally, since these schemas are stored in the cache, the fullyQualifiedName is always readily available.
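To make the consumer side concrete, here is a minimal sketch of how a map of fullyQualifiedName to target instance, in the spirit of DeserializeIntoRecordName, could be used to pick where the bytes are decoded. It assumes JSON data and that the fullyQualifiedName has already been recovered from the cached schemaInfo; `deserializeInto`, `Person` and `Address` are hypothetical names, not the functions added by this pull request.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Person and Address are hypothetical message types for this illustration.
type Person struct {
	Name string `json:"name"`
}

type Address struct {
	City string `json:"city"`
}

// deserializeInto decodes data into the target registered under fqn.
// In the pull request the name would come from the schema looked up for
// the message; here it is passed in directly to keep the sketch small.
func deserializeInto(targets map[string]interface{}, fqn string, data []byte) error {
	target, ok := targets[fqn]
	if !ok {
		return fmt.Errorf("no target registered for %q", fqn)
	}
	return json.Unmarshal(data, target)
}

func main() {
	person := &Person{}
	address := &Address{}
	targets := map[string]interface{}{
		"main.Person":  person,
		"main.Address": address,
	}

	if err := deserializeInto(targets, "main.Person", []byte(`{"name":"Ada"}`)); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(person.Name)
}
```

The targets must be pointers so that the decoded fields are visible to the caller after the call returns.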

For the cache, I had to add a new keyType called subjectOnlyID, which refers only to the 'id' of the SchemaRegistry API record. This is because it's the only piece of information we have available when receiving a message. Additionally, I added a new function to handle this case.

In the func (s *BaseDeserializer) GetSchema(subject string, payload []byte) method, if the schema is empty, we fall back to s.Client.GetByID(int(id)) (the case where only the id is present).
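As a rough sketch of the id-only lookup described above: in the Confluent wire format a payload starts with a zero magic byte followed by a 4-byte big-endian schema id, so the id can be read without knowing the subject. The `idCache` type, its `fetch` field (standing in for a registry call such as the GetByID mentioned above) and `schemaInfo` are hypothetical and simplified; they are not the cache code from this pull request.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const magicByte = 0x0

// schemaInfo is a simplified stand-in for the library's schema record.
type schemaInfo struct {
	Schema string
}

// schemaIDFromPayload reads the schema id from the wire-format header:
// one magic byte followed by a 4-byte big-endian id.
func schemaIDFromPayload(payload []byte) (int, error) {
	if len(payload) < 5 {
		return 0, errors.New("payload too short for wire-format header")
	}
	if payload[0] != magicByte {
		return 0, errors.New("unknown magic byte")
	}
	return int(binary.BigEndian.Uint32(payload[1:5])), nil
}

// idCache is keyed by the schema id alone (no subject), because the id is
// the only information available when a message arrives.
type idCache struct {
	byID  map[int]schemaInfo
	fetch func(id int) (schemaInfo, error) // hypothetical registry lookup
}

func (c *idCache) get(id int) (schemaInfo, error) {
	if info, ok := c.byID[id]; ok {
		return info, nil
	}
	info, err := c.fetch(id)
	if err != nil {
		return schemaInfo{}, err
	}
	c.byID[id] = info
	return info, nil
}

func main() {
	cache := &idCache{
		byID: map[int]schemaInfo{},
		fetch: func(id int) (schemaInfo, error) {
			return schemaInfo{Schema: fmt.Sprintf("schema-%d", id)}, nil
		},
	}
	id, err := schemaIDFromPayload([]byte{0, 0, 0, 0, 7, '{', '}'})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	info, _ := cache.get(id)
	fmt.Println(id, info.Schema)
}
```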

I made some minor modifications to the mock_schemaregistry_client without affecting the previous implementation. Additionally, I introduced 2-3 Protobuf and Avro messages with different namespaces, using the schemas (.avsc, .proto) already available, in order to minimize the modifications required for the mock.

All the tests for the full implementation are present.

I've added a complete examples folder in 'examples/schemaregistry_example', and it's quite comprehensive. I'm excited to showcase the functionality.

I also added a few other things, including improvements to the topicNameStrategy (for Avro, for example) and more; I'll let you find out when going through the code.

No breaking change.

Based on my testing, it appears to be working well.

I look forward to your feedback and hope you find this implementation valuable.

@owetterau

Is there a chance that this pull request will be merged anytime soon? I also need RecordNameStrategy for Protobuf and Avro in this library...

@djedjethai
Contributor Author

djedjethai commented Dec 15, 2023

@owetterau, as there was no response to the pull request, I forked the project and implemented the RecordNameStrategy and TopicRecordNameStrategy for Protobuf, JSON, and Avro (both generic and specific), and adjusted the TopicNameStrategy. The release is available here: https://github.com/djedjethai/gokfk-regent. Feel free to check out the examples; they are quite explicit. I haven't written any papers yet.

Please note a breaking change compared to the original repository: If you register MessageFactory(), the function signature is now func([]string, string) (interface{}, error) instead of func(string, string) (interface{}, error), making it consistent across all strategies.
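For anyone migrating an existing factory, a minimal adapter sketch is shown below. Only the two function signatures come from the note above; the type names, the parameter names, and the choice to forward the first element of the slice are assumptions made for illustration and may not match how the fork actually uses the slice.

```go
package main

import (
	"errors"
	"fmt"
)

// legacyFactory mirrors the original signature:
// func(string, string) (interface{}, error).
type legacyFactory func(subject string, name string) (interface{}, error)

// sliceFactory mirrors the new signature:
// func([]string, string) (interface{}, error).
type sliceFactory func(subjects []string, name string) (interface{}, error)

// adapt wraps a legacy factory so it can be registered where the new
// signature is expected. ASSUMPTION: the first element of the slice is
// the value the legacy factory expects.
func adapt(old legacyFactory) sliceFactory {
	return func(subjects []string, name string) (interface{}, error) {
		if len(subjects) == 0 {
			return nil, errors.New("no subject provided")
		}
		return old(subjects[0], name)
	}
}

func main() {
	old := func(subject string, name string) (interface{}, error) {
		// A real factory would return a fresh message instance for name.
		return &map[string]interface{}{}, nil
	}
	factory := adapt(old)
	msg, err := factory([]string{"main.Person"}, "main.Person")
	fmt.Println(msg != nil, err)
}
```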

@rayokota, feel free to take a look as well. If you find it promising, I can create a new pull request with these changes. The MessageFactory() code can easily be adapted to avoid any breaking change, but with my approach the TopicRecordNameStrategy requires the new signature.
