Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating photos of all users in Entra ID via Graph API throttles and throws exception #2374

Open
Balkoth opened this issue Mar 8, 2024 · 16 comments

Comments

@Balkoth
Copy link

Balkoth commented Mar 8, 2024

My process is as follows:

  • Get all users
  • Loop through users
    • Get user photo
    • Check if hash of photo matches current photo
    • If hash does not match upload current photo
    • Get uploaded photo and save hash

This runs into some arbitrary limit set by Microsoft for Graph calls and the code throws System.Threading.Tasks.TaskCanceledException: The request was canceled due to the configured HttpClient.Timeout of 10 seconds elapsing.
It is not always on the same calls, sometime its on the get photo, sometimes on the set.

I can not find any documentation on how to do this correctly, so it would be really nice for a team member to jump in and help.

@andrueastman
Copy link
Member

Thanks for raising this @Balkoth

This looks like the client is timing out due to the server taking a while to respond. However, the default client timeout is 100 seconds and not 10 seconds.
Any chance you can share how you are initializing the client? Are you able to increase the timeout and confirm if you get different results?

@Balkoth
Copy link
Author

Balkoth commented Mar 18, 2024

The GraphServiceClient is initialized like this. There is no custom creation of the HttpClient.

TokenCredentialOptions options = new TokenCredentialOptions { AuthorityHost = AzureAuthorityHosts.AzurePublicCloud };
ClientCertificateCredential credentials = new ClientCertificateCredential(tenantId, clientId, certificate, options);
_graphServiceClient = new GraphServiceClient(credentials);

@andrueastman
Copy link
Member

Any chance you can confirm the version of the SDK you are using?
Are you able to determine if the exception occurs when the request is made to graph or when the token is being acquired?

@Balkoth
Copy link
Author

Balkoth commented Mar 19, 2024

SDK is <PackageReference Include="Microsoft.Graph" Version="5.44.0" />
The token is successfully acquired as it runs for a while until the exception happens.

@andrueastman
Copy link
Member

Could you share the value of the http timeout of the client if you initialize the client as below(before updating)? Do you get different results if you increase it?

            var httpClient = GraphClientFactory.Create();
            Console.WriteLine(httpClient.Timeout);
            httpClient.Timeout = TimeSpan.FromMinutes(5);
            var authprovider = new AzureIdentityAuthenticationProvider(credentials);
            var graphClient = new GraphServiceClient(httpClient, authprovider);

@Balkoth
Copy link
Author

Balkoth commented Mar 19, 2024

image

@andrueastman
Copy link
Member

Just to confirm here @Balkoth

Does the error of 10 second timeout still occur in this scenario on testing? As the default is confirmed to be 100 seconds, it will be helpful to know if the timeout is coming from another place.

@Balkoth
Copy link
Author

Balkoth commented Mar 27, 2024

Yes, but i think we steer away from the root problem. As the timeout can and will happen and defining an indefinite timeout is out of the question for obvious reasons, what is the Microsoft-Way to deal with this situation.

There maybe a valid reason to throttle calls to the api, so please advise how to handle this gracefully. I did not find docs for MSGraph on how to deal with this.

@andrueastman
Copy link
Member

In the event of throttling, the API should respond with a 429 or 502 error so that the SDK may handle this gracefully using the various mechanisms built in the retry handler.
However, the client timing out in 10 seconds would ideally be too short as it may give up before giving the Api reasonable time to retry the request. So fixing that would be a reasonable way to go first.

Alternatively, you can look in batching the requests into a payload so that they may be executed from the server end in parallel. Information on how to do this is available at https://learn.microsoft.com/en-us/graph/sdks/batch-requests?tabs=csharp

@Balkoth
Copy link
Author

Balkoth commented Apr 4, 2024

Here is what i'm actually using to do this process:

Versions:

<ItemGroup>
  <PackageReference Include="Azure.Identity" Version="1.10.4" />
  <PackageReference Include="Microsoft.Graph" Version="5.44.0" />
</ItemGroup>

GraphClient:

using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.IO;
using System.Security.Cryptography.X509Certificates;
using System.Threading.Tasks;
using Azure.Identity;
using Microsoft.Graph;
using Microsoft.Graph.Models;

public sealed class GraphClient : IDisposable
{
  private readonly GraphServiceClient _graphServiceClient;

  public GraphClient(string tenantId, string clientId, string thumbprint)
  {
    X509Certificate2 certificate = CertificateLoader.FromStoreWithThumbprint(thumbprint, StoreLocation.LocalMachine);
    TokenCredentialOptions options = new TokenCredentialOptions { AuthorityHost = AzureAuthorityHosts.AzurePublicCloud };
    ClientCertificateCredential credentials = new ClientCertificateCredential(tenantId, clientId, certificate, options);
    _graphServiceClient = new GraphServiceClient(credentials);
  }

  public void Dispose() => _graphServiceClient.Dispose();

  public async Task<ImmutableList<User>> GetUsersAsync()
  {
    List<User> userList = new List<User>();
    UserCollectionResponse? users = await _graphServiceClient.Users.GetAsync(rc =>
    {
      rc.QueryParameters.Select = ["Id", "employeeId"];
      rc.QueryParameters.Top = 999;
    });

    if (users == null) { return ImmutableList<User>.Empty; }

    await PageIterator<User, UserCollectionResponse>.CreatePageIterator
      (_graphServiceClient, users, user => { userList.Add(user); return true; })
      .IterateAsync();

    return userList.ToImmutableList();
  }

  public async Task<byte[]> GetUserPhotoAsync(string userId)
  {
    int attempt = 0;
    const int maxRetryAttempts = 3;

    while (true)
    {
      try
      {
        using Stream? stream = await _graphServiceClient.Users[userId].Photo.Content.GetAsync();
        return stream.ToArray();
      }
      catch (TaskCanceledException)
      {
        // If max retry attempts reached, rethrow.
        if (attempt >= maxRetryAttempts) { throw; }

        // Retry after reaching pause duration.
        await Task.Delay(TimeSpan.FromMinutes(5));
        attempt++;
      }
    }
  }
}

Photo management:

private static async Task<bool> UploadEmployeesPhotos(GraphClient graphClient)
{
  try
  {
    ImmutableList<User> userList = await graphClient.GetUsersAsync();
    foreach (User user in userList)
    {
      if (user.Id == null || user.EmployeeId == null)
      {
        continue;
      }

      byte[] graphPhotoBytes = await graphClient.GetUserPhotoAsync(user.Id);

      // Do something with graphPhotoBytes...
    }
  }
  catch (Exception exception)
  {
    return false;
  }

  return true;
}

There is no custom Graph or HttpClient stuff from my side which would influence the 10 second timeout i would receive sometimes.

@Balkoth
Copy link
Author

Balkoth commented May 3, 2024

Still getting this weird error sometimes:

ODataError: Status 500 - The request was canceled due to the configured HttpClient.Timeout of 10 seconds elapsing.

@petrhollayms
Copy link

Hi @Balkoth ,

Could you please try upgrading to the latest SDK version 5.54 and share the complete stack trace?

@Balkoth
Copy link
Author

Balkoth commented May 28, 2024

Sorry i can not mess with our tenant and do something like this. I have given all information on what and how we do it to run into this error. This is a Microsoft API we are calling from a Microsoft product on a Microsoft service. I hope you can fix this problem.

@petrhollayms
Copy link

hi @Balkoth ,

It would be helpful if you could run it for some time on your end with the latest version, as the non-destructive read operation for getting users and photos shall not make any harm to your environment, right.

As @andrueastman was saying above- we do not understand where the 10 seconds timeout is coming from, unless you have some explicit configuration for it in your code, as the default timeout (we do set in our code, I checked yesterday) is 100 secs, not 10.

Anyway, we will try to reproduce on our end, but we do not see it happening anywhere else.

@petrhollayms petrhollayms added status:waiting-for-author-feedback Issue that we've responded but needs author feedback to close and removed Needs: Attention 👋 labels May 28, 2024
@Balkoth
Copy link
Author

Balkoth commented May 29, 2024

As posted in the code actually used to do this above there is no config of the timeout on my side, right.

@Balkoth
Copy link
Author

Balkoth commented Jun 4, 2024

Hello?

@microsoft-github-policy-service microsoft-github-policy-service bot added Needs: Attention 👋 and removed status:waiting-for-author-feedback Issue that we've responded but needs author feedback to close Status: No recent activity labels Jun 4, 2024
@petrhollayms petrhollayms self-assigned this Jun 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants