Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TIKA-4252: add request metadata #1753

Merged
merged 1 commit into from
May 10, 2024

Conversation

nddipiazza
Copy link
Contributor

add request metadata

@nddipiazza nddipiazza marked this pull request as ready for review May 10, 2024 05:51
@nddipiazza nddipiazza merged commit b068e42 into main May 10, 2024
1 check failed
@nddipiazza nddipiazza deleted the TIKA-4252-fetch-tuple-missing-metadata-2 branch May 10, 2024 05:51
FetchKey fetchKey = t.getFetchKey();
protected MetadataListAndEmbeddedBytes parseFromTuple(FetchEmitTuple fetchEmitTuple, Fetcher fetcher) {
FetchKey fetchKey = fetchEmitTuple.getFetchKey();
Metadata fetchResponseMetadata = new Metadata();
Copy link
Contributor

@tballison tballison May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metadata that goes in the fetchemittuple was envisioned to be user-injected metadata that was injected after the parse and then emitted (e.g. provenance metadata).

I think we need to put both metadatas on the fetchemittuple.

This is what I'm thinking...let me know what you think.

So, there will be three metadatas in play. The fetchemit tuple will have a fetchRequestMetadata (???) and a userMetadata (???). At parse time, we'll create a fresh metadata object, which we'll call "responseMetadata" in the following call: fetcher.fetch(requestMetadata, responseMetadata).

The parse will then use the responseMetadata and, after the parse, inject the userMetadata from the fetchEmitTuple.

The fetcher may use the fetchRequestMetadata to carry out its request, but info from that one should not make it into the "responseMetadata" nor make it into the emit data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nddipiazza any chance you can revert this in main so that we have a working build? Thank you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shoot i didn't realize i was deplying broken builds! reverted. i'll make this change and make a new pr

nddipiazza pushed a commit that referenced this pull request May 11, 2024
nddipiazza added a commit that referenced this pull request May 11, 2024
* Revert "TIKA-4252: add request metadata (#1753)"

This reverts commit b068e42.

* Revert "TIKA-4252: fix metadata issue (#1752)"

This reverts commit 2f8dbdf.

---------

Co-authored-by: Nicholas DiPiazza <ndipiazza@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants