Representing and Describing Vector Embeddings and Models #4318
Formal ontologies of embeddings might be tricky: unless the embeddings come from the same model trained over time, or from the same model architecture trained on different datasets, there isn't much of a relationship between them. You might want to look at Model Cards, which are an effort to produce standardised descriptions for models. Intel gave a talk about adding model cards to the ONNX format at the ONNX community day a few weeks ago, and we're working on similar model card support for models we export to ONNX from Java (though we don't export embedding models at the moment).
As for data structures and databases, this is dealt with in the OntoLex-FrAC module developed by the W3C CG Ontology-Lexica, cf. ontolex/frequency-attestation-corpus-information#14. It does not, however, aim to provide formal ontologies of models; instead it relies on human-readable metadata (dc:description) and on future machine-readable vocabularies to fill that gap. The full description (modulo some minor renaming) is at https://github.com/ontolex/frequency-attestation-corpus-information/blob/master/index.md#embeddings. OntoLex-FrAC is a working draft, with publication expected in late 2022/early 2023.
Hello. I am interested in and would like to ask about:
Towards representing and describing vector embeddings:
Are any others here interested in these topics? Is there any ONNX documentation which might be useful in these regards? Thank you.
P.S.: Perhaps a new MIME type could be devised for vector embeddings, embedding. This would be useful for scenarios including HTTP-based content negotiation, e.g., Accept: embedding/gpt3.
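To make the content-negotiation idea a bit more concrete, here is a minimal sketch in Python of how a server might choose a representation from an Accept header. Note that embedding/gpt3 is only the hypothetical media type proposed above and is not a registered MIME type; the function names parse_accept and negotiate are likewise made up for illustration. The matching is deliberately simplified: it ranks by q-value only and does not implement the full specificity rules of HTTP.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda e: e[1], reverse=True)


def negotiate(header: str, available: List[str]) -> Optional[str]:
    """Return the first available media type acceptable to the client, or None."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        for candidate in available:
            exact = media_type in ("*/*", candidate)
            # "embedding/*" matches any subtype under "embedding/"
            wildcard = media_type.endswith("/*") and candidate.startswith(media_type[:-1])
            if exact or wildcard:
                return candidate
    return None


# Hypothetical representations a server could offer for one resource.
offered = ["application/json", "embedding/gpt3"]
print(negotiate("embedding/gpt3", offered))
print(negotiate("embedding/*;q=0.8, application/json;q=0.5", offered))
```

A client that only understands the hypothetical embedding type would send Accept: embedding/gpt3, and a server that cannot supply it would get None back from negotiate and could answer with 406 Not Acceptable.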