Error 403 in development environment when uploading files to S3 #4018

Open
quevon24 opened this issue May 3, 2024 · 7 comments

@quevon24
Member

quevon24 commented May 3, 2024

@grossir told me that he was getting a 403 error when uploading files:

An error occurred (403) when calling the HeadObject operation: Forbidden

I was reviewing the problem with @blancoramiro and we discovered that temporary credentials are only accepted when the session token is passed along with them.

As a temporary workaround, add an extra variable to the .env file:

AWS_DEV_SESSION_TOKEN="XXXXXXXX"

and add a line to cl/settings/third_party/aws.py:

if DEVELOPMENT:
    AWS_ACCESS_KEY_ID = env("AWS_DEV_ACCESS_KEY_ID", default="")
    AWS_SECRET_ACCESS_KEY = env("AWS_DEV_SECRET_ACCESS_KEY", default="")
    AWS_SESSION_TOKEN = env("AWS_DEV_SESSION_TOKEN", default="") # <-------- add this line

Then reload the .env file:
docker-compose -f docker/courtlistener/docker-compose.yml up -d
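A quick way to spot this situation before hitting S3 is to check the variables themselves. Access key IDs issued with temporary credentials start with ASIA (long-term keys start with AKIA), so an ASIA key without a session token is a likely cause of the 403. The sketch below is only an illustration: check_dev_credentials is a hypothetical helper that does not exist in the repo, and it only reuses the AWS_DEV_* variable names shown above.

```python
import os


def check_dev_credentials(environ=os.environ):
    """Return a list of human-readable problems with the AWS_DEV_* variables.

    Hypothetical helper for illustration; not part of the codebase.
    """
    access_key = environ.get("AWS_DEV_ACCESS_KEY_ID", "")
    secret_key = environ.get("AWS_DEV_SECRET_ACCESS_KEY", "")
    session_token = environ.get("AWS_DEV_SESSION_TOKEN", "")

    problems = []
    if not access_key:
        problems.append("AWS_DEV_ACCESS_KEY_ID is not set")
    if not secret_key:
        problems.append("AWS_DEV_SECRET_ACCESS_KEY is not set")
    # Temporary credentials use access key IDs beginning with "ASIA" and are
    # rejected unless the matching session token is sent with them.
    if access_key.startswith("ASIA") and not session_token:
        problems.append(
            "AWS_DEV_ACCESS_KEY_ID looks temporary but "
            "AWS_DEV_SESSION_TOKEN is not set"
        )
    return problems
```

Calling check_dev_credentials() with no arguments reads the current process environment; an empty list means none of these checks failed.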

@blancoramiro told me that he is going to look for an alternative so that we don't have to constantly modify the environment variables in the development environment.

@blancoramiro
Contributor

I created PR 4019 to show some changes that could help with this issue.

The file cl/settings/third_party/aws.py defines the variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, but I couldn't find any reference to them anywhere in the repo. Boto does not use those variables; it uses the ones in the environment (see the boto3 authentication docs). I tested this locally.

Since docker-compose is only used locally, I suggest adding the lines in the PR to the compose file so that you can set the variables in the following way:

  • Obtain the AWS environment variables from the SSO screen:

  • Paste them into the console.

  • Run docker compose up.

The AWS environment variables will then be available inside the cl-django and cl-celery containers and boto3 will use them.

@quevon24 We could test this to see if it works properly locally.
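To confirm that the variables actually reached a container, something like the following could be run inside it. This is a minimal sketch: describe_aws_env is a hypothetical helper, and the three names are the standard environment variables boto3 looks for. Values are masked so nothing secret ends up in a terminal or log.

```python
import os

# Standard variable names that boto3 reads from the environment.
BOTO3_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
)


def describe_aws_env(environ=os.environ):
    """Map each variable name to a masked summary, or "<missing>" if unset."""
    report = {}
    for name in BOTO3_ENV_VARS:
        value = environ.get(name, "")
        # Report only the length so the secret itself is never printed.
        report[name] = f"<set, {len(value)} chars>" if value else "<missing>"
    return report
```

For example, from a Python shell opened in the cl-django container, printing describe_aws_env() shows at a glance whether all three variables were passed through.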

Thank you!

@grossir
Contributor

grossir commented May 8, 2024

Not strictly an upload problem, but I think this is related to these tokens/keys. I was trying to review #3915 and couldn't copy the test data:

docker exec -it cl-django python /opt/courtlistener/manage.py clone_from_cl --type search.OpinionCluster --id 1904175 7903720 7903715 7903719 7903933

The traceback:

Traceback (most recent call last):
  File "/opt/courtlistener/manage.py", line 15, in <module>
    main()
  File "/opt/courtlistener/manage.py", line 11, in main
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.12/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.12/site-packages/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.12/site-packages/django/core/management/base.py", line 413, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.12/site-packages/django/core/management/base.py", line 459, in execute
    output = self.handle(*args, **options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/courtlistener/cl/scrapers/management/commands/clone_from_cl.py", line 1021, in handle
    clone_opinion_cluster(
  File "/opt/courtlistener/cl/scrapers/management/commands/clone_from_cl.py", line 135, in clone_opinion_cluster
    docket_id = get_id_from_url(cluster_datum["docket"])
                                ~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'docket'

So, I dropped in a few print statements.
The cluster_datum variable contains {'detail': 'Invalid token header. No credentials provided.'}

print(AWS_SESSION_TOKEN, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) actually shows the fresh values I put in the .env file following the comments on this ticket.

I am not sure if it is a problem with my permissions or my environment, but perhaps you could try to reproduce it?
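The KeyError above hides the real cause, because the API returned an error payload instead of a cluster. A small guard like the one below would surface the 'detail' message directly. This is a sketch under assumptions: require_fields is a hypothetical helper, not something in clone_from_cl.py, and it assumes error payloads carry their reason under the 'detail' key as in the output above.

```python
def require_fields(datum, fields, url=""):
    """Raise a descriptive error if an API payload lacks the expected fields.

    The error payload seen above carries its reason under "detail"; include
    it so the failure points at the real cause instead of a bare KeyError.
    """
    missing = [field for field in fields if field not in datum]
    if missing:
        detail = datum.get("detail", "no detail provided")
        source = url or "the API"
        raise ValueError(
            f"Response from {source} is missing {missing}: {detail}"
        )
    return datum
```

With this in place, require_fields(cluster_datum, ["docket"]) would fail with the "Invalid token header" text in the message rather than KeyError: 'docket'.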

@quevon24
Member Author

quevon24 commented May 8, 2024

To run the clone command you need an API token from CourtListener.

Create an account on https://www.courtlistener.com/ and then get your API token here: https://www.courtlistener.com/profile/api-token/

Then set the token in this env variable: CL_API_TOKEN

@grossir
Contributor

grossir commented May 9, 2024

Now I see that CL_API_TOKEN is documented in the clone_from_cl command. Sorry for bringing that up in this issue.

@mlissner
Member

mlissner commented May 9, 2024

Would a better error message in the command have helped you?

@grossir
Contributor

grossir commented May 9, 2024

Would a better error message in the command have helped you?

I think so. The line where the token is collected from the environment should complain if the variable does not exist or is an empty string, since the command won't work without that token:

"Authorization": f"Token {os.environ.get('CL_API_TOKEN', '')}"
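One possible shape for that check, as a rough sketch rather than the actual fix: build_auth_headers and MissingTokenError are hypothetical names, and inside a Django management command it would probably make more sense to raise CommandError so the message is printed without a traceback.

```python
import os


class MissingTokenError(RuntimeError):
    """Raised when CL_API_TOKEN is unset or blank (hypothetical name)."""


def build_auth_headers(environ=os.environ):
    """Return the Authorization header, failing early if the token is absent."""
    # Treat a whitespace-only value the same as a missing one.
    token = environ.get("CL_API_TOKEN", "").strip()
    if not token:
        raise MissingTokenError(
            "CL_API_TOKEN is not set. Create a token at "
            "https://www.courtlistener.com/profile/api-token/ "
            "and set it in the environment."
        )
    return {"Authorization": f"Token {token}"}
```

Calling this once at the start of the command would stop it before any request is made, instead of failing later on a response that lacks the expected fields.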

@quevon24
Member Author

quevon24 commented May 9, 2024

I'll add an early abort and take the opportunity to check whether any new fields need to be cloned.
