ReadError on microservice page-count request processing recap.email #3944

Open
sentry-io bot opened this issue Apr 4, 2024 · 2 comments

Comments


sentry-io bot commented Apr 4, 2024

This relates to #3693. We can retry the ReadError during the recap.email processing.

However, retrying on ReadError is not equally straightforward in all cases, and it will require some refactoring of process_recap_email. Currently, the task can be retried without problems if the failure happens while merging the RD data of the main document: the downloaded file is stored in a PQ, so it persists across retries. However, if the error occurs during the page count of an attachment document, the file is not persisted, because attachments are currently downloaded within a transaction; if there is an error, the related instances are not stored in the database. The required refactor should therefore involve steps like:

  • First, downloading all the free documents for the main document and all attachments.
  • Storing each downloaded file in its own PQ.
  • Within the transaction, creating or updating the related Docket, DocketEntry, and RECAPDocument instances.

That way, if there is an error during the data merging process, the task can be retried safely.
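
The ordering above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual process_recap_email code: it uses sqlite3 instead of the Django ORM, the staged table stands in for the PQ rows, the documents table stands in for the RECAPDocument instances, and make_db, stage_downloads, merge_staged, process, download, and count_pages are all hypothetical names.

```python
import sqlite3


class ReadError(Exception):
    """Hypothetical stand-in for the transient read error raised by the HTTP client."""


def make_db():
    """Create an in-memory database with a staging table and a documents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staged (doc_id INTEGER PRIMARY KEY, content BLOB)")
    conn.execute("CREATE TABLE documents (doc_id INTEGER PRIMARY KEY, page_count INTEGER)")
    return conn


def stage_downloads(conn, doc_ids, download):
    """Steps 1 and 2: download every file and persist it outside the merge transaction."""
    for doc_id in doc_ids:
        row = conn.execute("SELECT 1 FROM staged WHERE doc_id = ?", (doc_id,)).fetchone()
        if row:
            continue  # already persisted by an earlier attempt, so skip the download
        content = download(doc_id)
        with conn:  # each staged file is committed on its own
            conn.execute(
                "INSERT INTO staged (doc_id, content) VALUES (?, ?)", (doc_id, content)
            )


def merge_staged(conn, doc_ids, count_pages):
    """Step 3: create the document rows inside a single transaction."""
    with conn:  # rolls back every insert below if count_pages raises
        for doc_id in doc_ids:
            (content,) = conn.execute(
                "SELECT content FROM staged WHERE doc_id = ?", (doc_id,)
            ).fetchone()
            conn.execute(
                "INSERT INTO documents (doc_id, page_count) VALUES (?, ?)",
                (doc_id, count_pages(content)),
            )


def process(conn, doc_ids, download, count_pages):
    """Run both phases; safe to call again after a failure in the merge phase."""
    stage_downloads(conn, doc_ids, download)
    merge_staged(conn, doc_ids, count_pages)
```

If count_pages raises in the middle of merge_staged, the documents inserts are rolled back while the staged rows remain, so calling process again reuses the files that were already downloaded.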

Sentry Issue: COURTLISTENER-5W2

BrokenResourceError: 
  File "httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "httpcore/_backends/anyio.py", line 34, in read
    return await self._stream.receive(max_bytes=max_bytes)
  File "anyio/_backends/_asyncio.py", line 1132, in receive
    raise self._protocol.exception from None

ReadError: 
(7 additional frame(s) were not displayed)
...
  File "httpcore/_async/http11.py", line 176, in _receive_response_headers
    event = await self._receive_event(timeout=timeout)
  File "httpcore/_async/http11.py", line 212, in _receive_event
    data = await self._network_stream.read(
  File "httpcore/_backends/anyio.py", line 31, in read
    with map_exceptions(exc_map):
  File "contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc

ReadError: 
(11 additional frame(s) were not displayed)
...
  File "cl/recap/tasks.py", line 2363, in process_recap_email
    get_and_copy_recap_attachment_docs(
  File "cl/recap/tasks.py", line 2071, in get_and_copy_recap_attachment_docs
    save_pacer_doc_from_pq(self, rd_att, fq, pq, magic_number)
  File "cl/recap/tasks.py", line 1927, in save_pacer_doc_from_pq
    success, msg = update_rd_metadata(
  File "cl/corpus_importer/tasks.py", line 1837, in update_rd_metadata
    response = async_to_sync(microservice)(
  File "cl/lib/microservice_utils.py", line 93, in microservice
    return await client.send(req)

mlissner commented Apr 4, 2024

Thanks Alberto. Very helpful, and I've placed this on Eduardo's queue to work on after ACMS.


sentry-io bot commented Apr 5, 2024

This one is related; in this case, the ReadError occurred while requesting the appellate document number from Doctor. Retrying this error seems easier since this request happens outside the transaction block.
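
For a request that runs outside the transaction block, a retry could look something like the sketch below. It is a minimal illustration with assumed names: ReadError here is a local stand-in for the client's exception, and call_with_retries, max_attempts, and base_delay are hypothetical. The real task may be better served by the task queue's own retry mechanism.

```python
import time


class ReadError(Exception):
    """Hypothetical stand-in for the transient read error raised by the HTTP client."""


def call_with_retries(request, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call request(); on ReadError wait and try again, up to max_attempts calls in total."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ReadError:
            if attempt == max_attempts:
                raise  # out of attempts, let the caller see the error
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Because nothing has been written to the database at this point, repeating the request has no side effects to undo, which is why this case is simpler than the attachment page count.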

Sentry Issue: COURTLISTENER-6ZX

Filed by: @albertisfu

Projects
Status: Main Backlog