Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE REQUEST] Support creating verifier without knowing upfront the list of user pool IDs and client IDs #82

Open
dax-hurley opened this issue Aug 20, 2022 · 15 comments
Labels
enhancement New feature or request

Comments

@dax-hurley
Copy link

dax-hurley commented Aug 20, 2022

Question
Many other JWT libraries have a concept of a 'decoder' which lets you preview the claims being made in a JWT before verifying them. In my case, I am building a multi-tenant application with many different user pools for the different tenants. I want to extract the user pool id from the issuer claim in the token and use that to look up information about the tenant the user is making the request for before verifying. As a result of issue #24 it is possible to extract the raw JWT token by running a verification that intentionally throws and then extracting the value from the error, but this doesn't make for very efficient, clean or readable code. I am currently using a secondary JWT library for this purpose while taking advantage of this one for the Cognito niceties, but I would like to reduce my bundle size by eliminating my other library.

@dax-hurley dax-hurley added the question Further information is requested label Aug 20, 2022
@ottokruse
Copy link
Contributor

Thanks for opening the discussion!

Our thinking has been to not add this, because we want to prevent users from accidentally using that decode function where they should actually intend to call verify. Not all users are JWT experts, or even that great in the English language, it's easy to become confused. (This infamously happened with a very well known JWT verification lib out there)

We could maybe call it unsafe_decode then or so, to make it really obvious. But still a counter point is, that it is simple enough to just do it in plain JS:

// NodeJS
const decodedPayload = JSON.parse(Buffer.from(jwt.split('.')[1], 'base64url').toString());

// Web
const decodedPayload = JSON.parse(atob(jwt.split('.')[1].replace('-','+').replace('_','/')));

That's why we're on the fence on this one, leaning at this point to not add this to the library.
Appreciate to hear your counter arguments though.

@ottokruse
Copy link
Contributor

ottokruse commented Aug 22, 2022

Also, would be good to understand your user case. Why do you want this, what is your scenario that requires decoding before verifying? (Especially, since unverified JWTs cannot be trusted).

@dax-hurley
Copy link
Author

dax-hurley commented Aug 24, 2022

Thank you for the code snippets!

The application I am building is multi-tenanted with a 1 to 1 relationship between tenants and Cognito user pools (so each tenant has their own unique and isolated user pool). Basically, I want to extract the user pool id and client id from the token to use to lookup which tenant the user is trying to make a request for and then, after verifying that the user pool id and client id do in fact correspond to a tenant in the application, verify the token against that user pool. This will allow the tenancy claim to be directly tied to the user's authentication token as opposed to making it a separate part of the request.

@dax-hurley
Copy link
Author

dax-hurley commented Aug 24, 2022

The broader use case would be situations where you need to refer to values in the token for application logic before verification takes place, most commonly to decide how the token will be verified. In my architecture, I don't know the user pool id or client id ahead of time, so I need use the information in the token to figure that out before I'm even able to verify, I think there are other similar scenarios where this sort of functionality may be needed. I definitely agree that if you decide to implement a decoder that you should go out of your way to tell the user that the information within the unverified token cannot be trusted.

@ottokruse
Copy link
Contributor

Thank you for the code snippets!

You're very welcome.

Thanks for the info, have a couple of follow up questions:

  • Is it an option for you to create the JWT verifier with multiple user pools: https://github.com/awslabs/aws-jwt-verify#trusting-multiple-user-pools Would that solve your issue too?
  • What is the ballpark figure for the nr of tenants, and growth rate, you want to support?
  • Do you even know the list of User Pool IDs you want to check against when the JWT comes by, or do you need to look that up dynamically (from e.g. a database table somewhere)? The CognitoJwtVerifier class does assume you know the (list of) User Pool ID, otherwise you can't instantiate it.

If you want to pursue this discussion, it would be great to see more of your business logic, code snippets if possible, so we can really think along with you.

@dax-hurley
Copy link
Author

dax-hurley commented Aug 25, 2022

Is it an option for you to create the JWT verifier with multiple user pools: https://github.com/awslabs/aws-jwt-verify#trusting-multiple-user-pools Would that solve your issue too?

Unfortunately not as it would require knowing the user pool ids ahead of time in the code and would not scale to many tenants.

What is the ballpark figure for the nr of tenants, and growth rate, you want to support?

I would like to eventually be able to scale up to hundreds of tenants, although I expect only a dozen or so initially. I am also trying to avoid a situation where my tenants are defined by information in the repo as I prefer them to be defined at a persistence layer level so that managing tenants does not require a developer. I realize that there is a soft limit of 1,000 user pools per AWS account and a hard limit of 10,000. If I get to the point where I am nearly exhausting that hard limit that's what's known as a good problem to have, it is unlikely to happen for quiet a long time, if ever.

Do you even know the list of User Pool IDs you want to check against when the JWT comes by, or do you need to look that up dynamically (from e.g. a database table somewhere)? The CognitoJwtVerifier class does assume you know the (list of) User Pool ID, otherwise you can't instantiate it.

I need to look the user pool ids up dynamically, hence I instantiate a new CognitoJwtVerifier on every request, which is probably not ideal in terms of efficiency, but for what its worth I am caching the JWKs in the database and using those cached JWKs for verification, so when I look up the tenant information I also grab the JWK from my DB instead of grabbing it from the public endpoint.

At the moment my authentication logic looks like this (somewhat simplified and still WIP):

async function authenticate(authHeader: string) {
  try {
   const token = authHeader.split(" ")[1];
   
    // At the moment the decode function uses your code snippet
   // It could use something native to the library
    const decodedPayload = decode(token);
    const issSplit = decodedPayload["iss"].split("/") as string[];
    const userPoolId = issSplit[issSplit.length];

    const organization = () => {/* lookup organization from DB using userPoolId */}
    const matchedRegion = organization.userPoolRegion;
    const userPoolClientId = organization.userPoolClientId;
    
    let jwk: any = {};
    if (organization.jwk) {
      // use the stored JWK
    } else {
      // Grab the JWK from the public endpoint and save it.
    }
    const verifier = CognitoJwtVerifier.create({
      userPoolId: userPoolId,
      tokenUse: "access",
      clientId: userPoolClientId,
    });
    verifier.cacheJwks(jwk);
    const verifyResult = await verifier.verify(token);
    return {
      username: verifyResult.username,
      organizationName: organization.organizationName
    };
  } catch (e) {
    // throw unauthorized
  }
}

Thank you for your help!

@ottokruse
Copy link
Contributor

ottokruse commented Aug 25, 2022

Thanks that's very helpful!

I believe your feature request could then be translated in:

  • Support creating verifier without knowing upfront the list of user pool IDs and client IDs
  • So, somehow support dynamic lookup of these (with intelligent caching so you only do such a lookup if you see a new iss come by that you didn't see before)

I we support that (which by the way seems fair and will help many multi tenant cases) that would save you a bunch of custom code and you would not need to decode the unverified JWT yourself. Is that right?

Unrelated point about your code: the lookup of the JWKS from your DB could be coded like this too, which would cache the various JWKS in memory:

import { CognitoJwtVerifier } from "aws-jwt-verify";
import { SimpleJwksCache } from "aws-jwt-verify/jwk";
import { JwksNotAvailableInCacheError } from "aws-jwt-verify/error";

class DbCache extends SimpleJwksCache {
  async getJwk(jwksUri, decomposedJwt) {
    try {
      return super.getCachedJwk(jwksUri, decomposedJwt);
    } catch (err) {
      if (!(err instanceof JwksNotAvailableInCacheError)) {
        throw err;
      }
      const jwksFromDb = await loadJwksFromDatabase(decomposedJwt.payload.iss); // throw an error if you don't recognize the iss
      if (jwksFromDb) {
        super.addJwks(jwksUri, jwksFromDb);
        return super.getCachedJwk(jwksUri, decomposedJwt);
      }
      // This will now fetch from the public endpoint:
      return super.getJwk(jwksUri, decomposedJwt);
    }
  }
}
// Create at this scope to reuse across calls to authenticate:
const jwksCache = new DbCache();

async function authenticate(authHeader: string) {
  try {
    
    // ...

    const verifier = CognitoJwtVerifier.create(
      {
        userPoolId: userPoolId,
        tokenUse: "access",
        clientId: userPoolClientId,
      },
      {
        jwksCache,
      }
    );
  } catch (e) {
    // throw unauthorized
  }
}

@dax-hurley
Copy link
Author

dax-hurley commented Aug 25, 2022

I we support that (which by the way seems fair and will help many multi tenant cases) that would save you a bunch of custom code and you would not need to decode the unverified JWT yourself. Is that right?

That is correct!

One thing I do want to point out though is that you of course don't want to allow the user to simply trust whatever user pool id is passed in the JWT without also having some mechanism to verify that the user pool itself can be trusted. Otherwise an attacker could simply create a new user pool in their own AWS account and pass a valid token from that. One option would be to specify that user pool belongs to an account accessible from the AWS SDK, but this would require the AWS SDK as an additional dependency, which isn't ideal. Another option might be be to pass a callback to create() which would be responsible for determining if the user pool can be trusted. For example, it could look like this:

const verifier = CognitoJwtVerifier.create(
  {
    tokenUse: "access",
    userPoolVerifyCallback: (userPoolId: string, clientId: string) => {
      // logic to check against database or other mechanism to check if user pool and client id are valid
      if(/* valid */) {
        return true
      }
      else {
        return false
      }
    }
  },
);
`

@ottokruse
Copy link
Contributor

One thing I do want to point out though is that you of course don't want to allow the user to simply trust whatever user pool id is passed in the JWT without also having some mechanism to verify that the user pool itself can be trusted.

Yes, totally agree.

Thanks for all the information. I think it's pretty clear what is needed, we'll sync with the team and see when we'd get to this.

@ottokruse ottokruse changed the title [FEATURE REQUEST] Ability to decode unverified token? [FEATURE REQUEST] Support creating verifier without knowing upfront the list of user pool IDs and client IDs Sep 12, 2022
@hakanson
Copy link
Contributor

I've been looking at this while thinking about the Amazon Cognito Developer Guide Multi-tenant application best practices

You have four ways to secure multi-tenant applications: user pools, application clients, groups, or custom attributes.

I look at other options in CognitoVerifyProperties like clientId and groups and wonder how to best dynamically pass in these values in addition to userPoolId since each might require a similar lookup based on some tenant id.

Does it make more sense to have additional constructor properties and some callback on CognitoJwtVerifier, or to create a new MultiTenantCognitoJwtVerifier and either require that customJwtCheck function or have a similar function which returns the important CognitoVerifyProperties and errors if userPoolId is not part of that return value set?

@ottokruse
Copy link
Contributor

ottokruse commented Oct 18, 2022

or have a similar function which returns the important CognitoVerifyProperties and errors if userPoolId is not part of that return value set?

I like that idea.

So instead of providing the config beforehand, one could provide an async function that returns the verify properties (userPoolId, clientId, etc). We'd call that function then, and await it's result, on every verify call. (Won't work for verifySync)

In our implementation we store these properties in issuersConfig that we access synchronously in various places. Would have to change that then to async calls (not hard, but is a pervasive change). CTRL-F for this.issuersConfig here: https://github.com/awslabs/aws-jwt-verify/blob/main/src/jwt-rsa.ts

@ottokruse
Copy link
Contributor

ottokruse commented Nov 15, 2022

FYI turns our we do have a decode-like function, be it a little hidden:

import { decomposeJwt } from "aws-jwt-verify/jwt";

// danger! you're looking at an unverified JWT, trust nothing in it!
const { payload } = decomposeJwt("ey....");

Ideally you would not need to call this yourself though (danger! you're looking at an unverified JWT, trust nothing in it!), so the above discussion and proposed solution direction still stands.

@ottokruse ottokruse added enhancement New feature or request and removed question Further information is requested labels Jan 3, 2023
@ottokruse
Copy link
Contributor

Now properly documented how you can decompose (/decode) an unverified JWT: https://github.com/awslabs/aws-jwt-verify#peeking-inside-unverified-jwts

@Jafectnez
Copy link

I was reading this issue/request and this use case is exactly the same that I want to resolve on my app, do you have an update about this? depending on that I can think about waiting for your solution or implement a temporary solution in the meantime

Thanks in advance!

@ottokruse
Copy link
Contributor

ottokruse commented Apr 24, 2023

The current best way to do it is:

0 Create JWKS cache (once)

Create the JWKS cache at a higher scope so that you can reuse it across requests:

import { SimpleJwksCache } from "aws-jwt-verify/jwk";

const jwksCache = new SimpleJwksCache();

1 Decompose Unverified JWT

Your JWT probably has a custom attribute to identify the tenant, e.g. tenant_id:

import { decomposeJwt } from "aws-jwt-verify/jwt";

// danger! you're looking at an unverified JWT, trust nothing in it!
// only use it to establish e.g. the tenant id to use for further verification
const { payload: { tenant_id } } = decomposeJwt("ey....");

2 Dynamic lookup of UserPool ID, Client ID

Let's say you have a DynamoDB table where you store User Pool ID and Client ID for each tenant, and that the JWT payload has a custom attribute tenant_id. Then lookup the User Pool ID and Client ID for that tenant ID.

const { userPoolId, clientId } = await myLookUpFunction({ tenantId: payload.tenant_id });

Ideally, you would add some in-memory caching here, much like SimpleJwksCache does, so that when subsequent JWTs for the same tenant_id come in, you can use the cached userPoolId and clientId instead of having to call the (more expensive) myLookUpFunction

4 Create verifier, use jwksCache from upper scope, and verify JWT

CognitoJwtVerifier.create({
  userPoolId: "<user_pool_id>",
  tokenUse: "id", // or "access"
  clientId: "<client_id>",
}, {
  jwksCache // use cache from upper scope, instead of creating a new one
}).verify("ey....")

Because you create a new verifier for each verification, it makes sense to explicitly create the JWKS cache at the upper scope (step 0), so that the cache for JWKS downloads is kept and reused for multiple verifications. Creating a verifier is very quick (synchronous), it's the downloading of JWKS that you want to make sure is cached.

--

Hope that helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants