Akka Service Discovery heartbeat monitor is failing to find Azure Table resource #7019

Open
ChrisWestermann opened this issue Dec 6, 2023 · 1 comment

Comments

@ChrisWestermann

Version Information
Akka.Net 1.5.5
Akka.Bootstrap.Docker 0.5.3
Akka.Cluster.Hosting 1.5.5
Akka.Cluster.Sharding 1.5.5
Akka.Cluster.Tools 1.5.5
Akka.DependencyInjection 1.5.5
Akka.Discovery.Azure 1.5.0
Akka.Distributed.Data 1.5.5
Akka.HealthCheck.Hosting 1.5.2
Akka.HealthCheck.Hosting.Web 1.5.2
Akka.Hosting 1.5.5
Akka.Logger.Serilog 1.5.0.1
Akka.Management 1.5.0
Akka.Persistence 1.5.5
Akka.Persistence.Hosting 1.5.5
Akka.Persistence.Redis 1.5.0
Akka.Serialization.Hyperion 1.5.5

Describe the bug
We started getting warnings in our logs that the TTL heartbeat monitor from Akka service discovery fails to update because Azure returns a 404 (ResourceNotFound). However, the service nodes themselves are finding and updating the Azure table properly, as can be seen when inspecting the Azure table. The configured storage account connection strings and keys are correct and have not changed since deployment.

Note this is not affecting our actual service health.

To Reproduce
Unsure. The service had been running for several months before this issue started recently. It could be unrelated to Akka.Net; however, we have ruled out configuration and/or Azure service health issues at this time.

Actual behavior
The warning message

[WARNING][12/06/2023 06:36:03.382Z][Thread 0003][akka.tcp://MedInsightsSys@79d774f39c39:4055/user/$b] Failed to update TTL heartbeat, retrying
Cause: Azure.RequestFailedException: The specified resource does not exist.
RequestId:12f159ec-a002-0069-6d73-289d43000000
Time:2023-12-06T18:36:03.3645943Z
Status: 404 (Not Found)
ErrorCode: ResourceNotFound

Content:
{"odata.error":{"code":"ResourceNotFound","message":{"lang":"en-US","value":"The specified resource does not exist.\nRequestId:12f159ec-a002-0069-6d73-289d43000000\nTime:2023-12-06T18:36:03.3645943Z"}}}

Headers:
Cache-Control: no-cache
Transfer-Encoding: chunked
Server: Windows-Azure-Table/1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: 12f159ec-a002-0069-6d73-289d43000000
x-ms-client-request-id: 244858d2-f7a4-4e2d-a2df-f38b373090a5
x-ms-version: REDACTED
X-Content-Type-Options: REDACTED
Date: Wed, 06 Dec 2023 18:36:02 GMT
Content-Type: application/json;odata=minimalmetadata;streaming=true;charset=utf-8

   at Azure.Data.Tables.TableRestClient.UpdateEntityAsync(String table, String partitionKey, String rowKey, Nullable`1 timeout, String ifMatch, IDictionary`2 tableEntityProperties, QueryOptions queryOptions, CancellationToken cancellationToken)
   at Azure.Data.Tables.TableClient.UpdateEntityAsync[T](T entity, ETag ifMatch, TableUpdateMode mode, CancellationToken cancellationToken)
   at Akka.Discovery.Azure.ClusterMemberTableClient.UpdateAsync(CancellationToken token)
   at Akka.Discovery.Azure.Actors.HeartbeatActor.ExecuteUpdateOpWithRetry()
   at Akka.Actor.PipeToSupport.PipeTo[T](Task`1 taskToPipe, ICanTell recipient, Boolean useConfigureAwait, IActorRef sender, Func`2 success, Func`2 failure)
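
To correlate this warning with Azure-side diagnostics, the error code and request ID can be pulled out of the `Content:` payload above. The following is a minimal sketch in Python, used here only for illustration (the affected service is .NET). The helper name `parse_table_error` is hypothetical, and the sketch assumes the message value always has the layout seen in this one log entry: a text line followed by `Key:Value` lines.

```python
import json

# Error body copied verbatim from the "Content:" section of the warning above.
ERROR_BODY = r'''{"odata.error":{"code":"ResourceNotFound","message":{"lang":"en-US","value":"The specified resource does not exist.\nRequestId:12f159ec-a002-0069-6d73-289d43000000\nTime:2023-12-06T18:36:03.3645943Z"}}}'''


def parse_table_error(body: str) -> dict:
    """Extract error code, text, request ID, and time from an OData error body.

    Hypothetical helper: the field layout is assumed from the single log
    entry in this report, not from a documented contract.
    """
    error = json.loads(body)["odata.error"]
    # The message value carries the human-readable text on its first line,
    # followed by "Key:Value" lines (RequestId, Time).
    first_line, *rest = error["message"]["value"].split("\n")
    # Split on the first colon only, so the colons inside the timestamp survive.
    fields = dict(line.split(":", 1) for line in rest if ":" in line)
    return {
        "code": error["code"],
        "text": first_line,
        "request_id": fields.get("RequestId"),
        "time": fields.get("Time"),
    }


if __name__ == "__main__":
    print(parse_table_error(ERROR_BODY))
```

The extracted request ID matches the `x-ms-request-id` header in the same log entry, which is the value to search for in any storage-side logs.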

Environment
Service nodes are running on a Linux docker swarm.

@Arkatufus
Contributor

@ChrisWestermann We really need more data on what is happening on the Azure server side. Is it possible for you to turn on resource monitoring and give us the logs of what happens around the incident?
