Port Haystack v1 DocumentClassifier node to Haystack v2 #7669

Open
ms130 opened this issue May 8, 2024 · 3 comments
Labels
Contributions wanted! Looking for external contributions type:feature New feature or request

Comments

@ms130

ms130 commented May 8, 2024

Is your feature request related to a problem? Please describe.

I've been using the DocumentClassifier node in Haystack v1 with a zero-shot classification model to label documents with categories, which are attached to their metadata. We have recently migrated our code to Haystack v2 but have discovered that this component does not yet exist in v2, so I'm currently unable to classify documents.

Describe the solution you'd like

It would be great if someone were able to port this very useful v1 node into a v2 component please! It would also be tremendously useful to add the multi_label argument (see here) to the new component so that the model can be run assuming multiple labels can be true. The existing v1 node doesn't provide this flexibility, so I created a custom node by subclassing it and modifying its behaviour.
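
To make the multi_label request concrete, here is a minimal pure-Python sketch of how the two scoring modes differ, as far as I understand the Hugging Face zero-shot pipeline: without multi_label, the entailment logits are normalised across all candidate labels so the scores sum to 1; with multi_label, each label is scored independently against its own contradiction logit. The function names and the logits below are hypothetical and made up for illustration; no real model is called.

```python
import math
from typing import Dict, List, Tuple


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def score_labels(
    logits_per_label: Dict[str, Tuple[float, float]],
    multi_label: bool = False,
) -> Dict[str, float]:
    """Turn per-label (contradiction, entailment) logits into scores.

    Hypothetical helper: it only mimics the scoring step, not the model.
    """
    labels = list(logits_per_label)
    if multi_label:
        # Each label is judged on its own: entailment vs. contradiction.
        scores = [softmax(list(logits_per_label[label]))[1] for label in labels]
    else:
        # Labels compete: softmax over the entailment logits of all labels.
        scores = softmax([logits_per_label[label][1] for label in labels])
    return dict(zip(labels, scores))


# Made-up logits for a document that is about both sports and health.
example_logits = {
    "sports": (-1.0, 3.0),
    "health": (-0.5, 2.5),
    "politics": (2.0, -2.0),
}

single = score_labels(example_logits)
multi = score_labels(example_logits, multi_label=True)
```

In the single-label case "sports" and "health" have to share the probability mass, while in the multi-label case both can score close to 1 at the same time, which is the behaviour asked for above.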

Describe alternatives you've considered

I considered creating my own custom DocumentClassifier component in v2, but have not started this yet, and am unsure about how difficult it would be.

@anakin87
Member

anakin87 commented May 9, 2024

This is a legitimate request!

I would start with implementing a TransformersZeroShotDocumentClassifier, only focusing on zero-shot classification.

The code should not be difficult to migrate, starting from the 1.x version.

I will tag this issue as "contributions wanted" and see if any community members would like to address it.
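
For anyone picking this up, a rough sketch of the shape such a component could take is shown below. It is not the actual implementation: the class name, the `scorer` callable, the `threshold` parameter and the `Document` stand-in are all hypothetical, and a toy keyword scorer replaces the zero-shot model so that the example runs without any dependencies. In a real Haystack 2 component the class would be decorated with `@component`, `run` with `@component.output_types(...)`, and `Document` would come from `haystack`. Writing the result under `meta["classification"]` is meant to mirror the v1 node and is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Document:
    """Stand-in for haystack.Document (only content and meta are modelled)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


# Hypothetical scorer signature: (text, labels, multi_label) -> {label: score}
Scorer = Callable[[str, List[str], bool], Dict[str, float]]


class ZeroShotDocumentClassifierSketch:
    """Attach zero-shot labels to each document's metadata (sketch only)."""

    def __init__(self, labels: List[str], scorer: Scorer,
                 multi_label: bool = False, threshold: float = 0.5):
        self.labels = labels
        self.scorer = scorer
        self.multi_label = multi_label
        self.threshold = threshold  # hypothetical cut-off for multi-label mode

    def run(self, documents: List[Document]) -> Dict[str, List[Document]]:
        for doc in documents:
            scores = self.scorer(doc.content, self.labels, self.multi_label)
            if self.multi_label:
                # Keep every label whose independent score passes the cut-off.
                chosen: Any = [l for l in self.labels if scores[l] >= self.threshold]
            else:
                # Keep only the single best-scoring label.
                chosen = max(scores, key=scores.get)
            doc.meta["classification"] = {"label": chosen, "details": scores}
        return {"documents": documents}


def keyword_scorer(text: str, labels: List[str], multi_label: bool) -> Dict[str, float]:
    """Toy replacement for the model: 1.0 if the label word occurs in the text."""
    hits = [1.0 if label.lower() in text.lower() else 0.0 for label in labels]
    if multi_label:
        return dict(zip(labels, hits))
    total = sum(hits) or 1.0
    return {label: hit / total for label, hit in zip(labels, hits)}


labels = ["sports", "health", "politics"]
multi_clf = ZeroShotDocumentClassifierSketch(labels, keyword_scorer, multi_label=True)
single_clf = ZeroShotDocumentClassifierSketch(labels, keyword_scorer)

multi_out = multi_clf.run([Document("Sports and health news")])
single_out = single_clf.run([Document("A politics story")])
```

Passing the scorer in as a callable keeps the sketch testable; in an actual port it would presumably be replaced by a call to the zero-shot classification pipeline loaded during warm-up.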

@anakin87 anakin87 added type:feature New feature or request Contributions wanted! Looking for external contributions labels May 9, 2024
@srini047

Hi @anakin87,
I would like to work on this. If I am not mistaken, this zero-shot document classifier should be ported here, in line with the Haystack 2.0 nomenclature?

@anakin87
Member

Good to hear... Yes, I think it should be placed in classifiers.
