Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dapr Invoke Method has very low throughput compared to calling API directly #709

Closed
yschneider-bosch opened this issue Feb 23, 2022 · 16 comments
Assignees
Labels
kind/bug Something isn't working P1 size/M triaged/resolved Items triaged and ready
Milestone

Comments

@yschneider-bosch
Copy link

Expected Behavior

invokeMethod sends requests asynchronously and parallel. The number of connections, threads etc can be configured. The performance should match calling the api endpoint directly.

Actual Behavior

We have started investigating as we noticed that under load, requests are processed almost sequentially, leading to very long processing time if the target method takes long time to complete.

There is no possibility to control the underlying http client or configure anything in order to mitigate this problem, which was only solved by increasing the number of instances.

Comparing to calling the rest endpoint directly, the throughput of invokeMethod is about 16 times slower.

Steps to Reproduce the Problem

Please find a modified version of the invoke example from the sdk. Request processing at the server side is 1 second to demonstrate the effect. HTTP Client call is commented out, please uncomment it and comment the invoke method for a comparison.

Also find modified logs of both executions, with the httpclient and with dapr invokeMethod.

As you can see, execution with the httpclient takes around 1 second for all 50 requests while with dapr invoke it takes around 16 seconds.

with-dapr-invoke.txt
with-http-client.txt

DemoServiceController.java

@PostMapping(path = "/say")
  public Mono<String> handleMethod(@RequestBody(required = false) byte[] body,
                                   @RequestHeader Map<String, String> headers) {
    return Mono.fromSupplier(() -> {
      try {
        String message = body == null ? "" : new String(body, StandardCharsets.UTF_8);

        Calendar utcNow = Calendar.getInstance(TimeZone.getTimeZone("GMT"));

        // Handles the request by printing message.
        System.out.println(String.format("%s server request received", Instant.now().toEpochMilli()));
        Thread.sleep(1000);
        System.out.println(String.format("%s server request done", Instant.now().toEpochMilli()));

        return message + " received";
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    });
  }

DemoServiceController.java

public static void main(String[] args) throws Exception {
    var executor = Executors.newFixedThreadPool(50);
    try (DaprClient client = (new DaprClientBuilder()).build()) {
      for (var i = 0; i < 50; i++) {
        final String msg = "" + i;
        final Integer key = i;
        executor.execute(() -> {
          var timestamp = Instant.now().toEpochMilli();

          // use invokeMethod
          System.out.println(String.format("%d client invoke started", timestamp));
          client.invokeMethod(SERVICE_APP_ID, "say", "" + msg, HttpExtension.POST, null,
                  byte[].class).block();


          // use http client directly
          //          System.out.println(String.format("%d http client request started", timestamp));
          //          HttpClient httpClient = HttpClient.newHttpClient();
          //          HttpRequest request = HttpRequest.newBuilder()
          //                  .POST(HttpRequest.BodyPublishers.ofString("message"))
          //                  .uri(URI.create("http://localhost:3000/say"))
          //                  .build();
          //          httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofString())
          //                  .thenApply(HttpResponse::body)
          //                  .thenAccept(System.out::println)
          //                  .join();
          //          System.out.println(String.format("%d client http request done", timestamp));

          timestamp = Instant.now().toEpochMilli();
          System.out.println(String.format("%d client invoke done", timestamp));
        });

      }

    }

  }

add this to the application properties:
server.tomcat.threads.max=400

Release Note

RELEASE NOTE:

@yschneider-bosch yschneider-bosch added the kind/bug Something isn't working label Feb 23, 2022
@artursouza
Copy link
Member

We need to revisit the HTTP call solution to remove dependency from OKHTTP in addition to fix this issue. Thanks for reporting this with such detail. In the meantime, you can still use Dapr for service invocation via one of the URL solutions documented here: https://docs.dapr.io/developing-applications/building-blocks/service-invocation/howto-invoke-discover-services/#additional-url-formats

@artursouza artursouza added P1 size/M triaged/resolved Items triaged and ready labels Feb 24, 2022
@artursouza
Copy link
Member

This problem is related to runtime and not the SDK. The code sample above compares calling a service with Dapr Java SDK + Dapr Runtime vs calling with native HTTP Client without runtime. If I use the HTTP Client calling into Dapr runtime, the behavior is the same as Dapr Java SDK + Runtime.

            HttpRequest request = HttpRequest.newBuilder()
                .POST(HttpRequest.BodyPublishers.ofString("message"))
                .uri(URI.create("http://localhost:3500/say"))
                .header("dapr-app-id", "invokedemo")
                .build();

I am moving this item to dapr runtime.

@artursouza artursouza transferred this issue from dapr/java-sdk Mar 3, 2022
@artursouza
Copy link
Member

artursouza commented Mar 3, 2022

I talked to @yaron2 about this and could still be an SDK or JVM config issue. Still needs further investigation.

@skyao skyao self-assigned this Mar 14, 2022
@skyao
Copy link
Member

skyao commented Mar 16, 2022

This is a very tricky bug...... I sent so many times to debug it.

Selection_001

As shown in above picture, by default, if no specified configuration, the service invoke flow will like this.

  • In the java SDK, it will use HTTP to connect to daprd via dapr HTTP API
  • Between two daprd, the protocol will be gRPC (in fact only gRPC support), and mDNS will be used as Name Resolver
  • Daprd in server side will use HTTP to send request to the app (which is invokeDemo in the picture)

Now I have found three issues:

mDNS

mDNS Name Resolver will be very slow (response in more than 10 seconds) in multi-thread.

In daprd, we will call the name resolver for each request without cache or lock:

https://github.com/dapr/dapr/blob/3fbb9326503a64f998c5b2e65afcd12cf89d7925/pkg/messaging/direct_messaging.go#L246

func (d *directMessaging) getRemoteApp(appID string) (remoteApp, error) {
	id, namespace, err := d.requestAppIDAndNamespace(appID)
	if err != nil {
		return remoteApp{}, err
	}
	
	request := nr.ResolveRequest{ID: id, Namespace: namespace, Port: d.grpcPort}
	address, err := d.resolver.ResolveID(request)
	if err != nil {
		return remoteApp{}, err
	}
	
	return remoteApp{
		namespace: namespace,
		id:        id,
		address:   address,
	}, nil
}

I tried to use a sync.once (only for verification since in the test case I only have one app to look up), it work well.

fasthttp (Update: fasthttp is OK)

I added some log to see where the request are blocked.

From the log I saw that:

  1. all the 50 requests will be sent by java sdk to daprd
  2. but in daprd log of HTTP API, it seems that by default there are only 5 requests accepted and served. Others are in the queue and wait.

I will continue to find what happened in fasthttp tomorrow.

Update: from the tcp package captured I saw that the HTTP requests is not sent from SDK to daprd in the client side.

Selection_003

  • all the work thread in the client had been started
  • all the HTTP request had enqueue (okhttp send queue)
  • but only 5 http requests had been send to daprd
  • when one of the requests had been completed (http response received), the next request will be sent to daprd

So, it seems that in the java SDK there is a concurrency limitation of 5 thread ? Let me check it.

Update again: there is a config named maxRequestsPerHost, and it default to 5! default to 5! default to 5!

slow response

Even for one single request without sleep in server, the total response time will be more than 300 ms in my test env.

== server request received: id=1, start=1647454876799, received=1647454877161, interval=362
== APP == client invoke done: id=1, start=1647454876799, used=383 ms

In above log, the request from client to server spent 362 ms, and the response from server to client spent 19 ms.

This is unbelievable.

@jjcollinge
Copy link

jjcollinge commented Mar 16, 2022

@skyao thanks for the detailed breakdown - can I just query your point on the mDNS name resolution - the mDNS name resolution does use a cache internally (see here) so it should only browse the network on the first occurrence of an appID until the cached address expires. You said the responses from the name resolver took 10 seconds - is that the first resolution or consistently? The timeout on the first browse is 1 second - so you should see errors if it cannot find the app within 1 second - are you seeing retries?

@skyao
Copy link
Member

skyao commented Mar 17, 2022

@skyao thanks for the detailed breakdown - can I just query your point on the mDNS name resolution - the mDNS name resolution does use a cache internally (see here) so it should only browse the network on the first occurrence of an appID until the cached address expires. You said the responses from the name resolver took 10 seconds - is that the first resolution or consistently? The timeout on the first browse is 1 second - so you should see errors if it cannot find the app within 1 second - are you seeing retries?

Yes, I agree with you that the mDNS should work like this.

But in my test, I just use the testcase @yschneider-bosch discribed above and with a log arround the ResolveID() method in file pkg/messaging/direct_messaging.go:

func (d *directMessaging) getRemoteApp(appID string) (remoteApp, error) {
	id, namespace, err := d.requestAppIDAndNamespace(appID)
	if err != nil {
		return remoteApp{}, err
	}

	request := nr.ResolveRequest{ID: id, Namespace: namespace, Port: d.grpcPort}
	var time1 = time.Now().UnixMilli()
	address, err := d.resolver.ResolveID(request)
	if err != nil {
		return remoteApp{}, err
	}
	var time2 = time.Now().UnixMilli()
	log.Errorf("getRemoteApp time used: ResolveID()=%d, appid=%s, address=%s \n", time2-time1, appID, address)

	return remoteApp{
		namespace: namespace,
		id:        id,
		address:   address,
	}, nil
}

And rerun the testcase with/without sleep(1000) in the DemoServiceController.

the log without sleep(1000) :

time="2022-03-17T09:02:31.09530459+08:00" level=error msg="getRemoteApp time used: ResolveID()=3, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.126692686+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.133417994+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.139250079+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.14517409+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.150757463+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.156388955+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.161789377+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.167185329+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.172697468+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.178137323+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.184497682+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.190338719+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.196246867+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.201289344+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.206539466+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.211961854+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.21711449+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.222542151+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.227831481+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.233502396+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.238918673+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.243943206+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.249086688+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.254408958+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.259525945+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.264541257+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.269902464+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.275284137+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.28054464+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.285758299+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.290788464+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.296365219+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.301429552+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.307055786+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.312315134+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.317650233+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.322533837+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.327782227+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.332554766+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.337336047+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.342125485+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.346902838+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.351753838+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.356840013+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:31.361987206+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:34.095965968+08:00" level=error msg="getRemoteApp time used: ResolveID()=3003, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:37.097115901+08:00" level=error msg="getRemoteApp time used: ResolveID()=6005, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:40.097288839+08:00" level=error msg="getRemoteApp time used: ResolveID()=9005, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:02:43.097999659+08:00" level=error msg="getRemoteApp time used: ResolveID()=12005, appid=invokedemo, address=192.168.0.10:39663 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge

the log with sleep(1000) :

time="2022-03-17T09:11:13.758644029+08:00" level=error msg="getRemoteApp time used: ResolveID()=3, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:14.789710697+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:15.797856915+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:16.759198207+08:00" level=error msg="getRemoteApp time used: ResolveID()=3004, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:16.805333242+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:17.767535979+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:17.812058078+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:18.774751926+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:18.818534289+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:19.760270691+08:00" level=error msg="getRemoteApp time used: ResolveID()=6005, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:19.78112466+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:19.824999131+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:20.768550289+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:20.787606748+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:20.83145024+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:21.775761829+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:21.794048779+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:21.837575032+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:22.760444011+08:00" level=error msg="getRemoteApp time used: ResolveID()=9005, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:22.781968126+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:22.800094101+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:22.843328963+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:23.767743702+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:23.788258109+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:23.805763259+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:23.849091698+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:24.77439113+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:24.794294373+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:24.811664327+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:24.854826557+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:25.761532326+08:00" level=error msg="getRemoteApp time used: ResolveID()=12006, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:25.780819371+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:25.800367547+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:25.817603227+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:25.860563862+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:26.768847435+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:26.787090688+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:26.806202655+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:26.823437317+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:26.866574246+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:27.775054344+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:27.792781401+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:27.812324057+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:27.828743762+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:27.872333693+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:28.780839358+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:28.798177795+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:28.81776716+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:28.834015648+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:11:28.877804923+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:35721 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge

@skyao
Copy link
Member

skyao commented Mar 17, 2022

From the log we can see that almost of time mDNS works well and the response time of ResolveID() is 0 or near 0, except that three are some strange records with response time 3 / 6 / 9 / 12 seconds.

3 / 6 / 9 / 12 seconds, they are so regular.

I added a test for 100 request to see if the serial will increase or not.

the log with sleep(1000) and run 100 requests:

time="2022-03-17T09:34:43.866035124+08:00" level=error msg="getRemoteApp time used: ResolveID()=5, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:44.961259015+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:45.971792535+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:46.867087994+08:00" level=error msg="getRemoteApp time used: ResolveID()=3005, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:46.981020314+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:47.878043866+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:47.989598999+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:48.886877901+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:48.998072761+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:49.867918659+08:00" level=error msg="getRemoteApp time used: ResolveID()=6006, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:49.894819256+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:50.006292009+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:50.877510565+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:50.902289623+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:51.013979831+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:51.885459535+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:51.90906856+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:52.020909586+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:52.86852472+08:00" level=error msg="getRemoteApp time used: ResolveID()=9007, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:52.893395202+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:52.916000463+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:53.02778722+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:53.877320015+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:53.900098711+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:53.922287359+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:54.03444127+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:54.88474739+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:54.90681886+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:54.928614828+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:55.041079031+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:55.869195522+08:00" level=error msg="getRemoteApp time used: ResolveID()=12008, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:55.891601281+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:55.913389239+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:55.935007207+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:56.047756313+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:56.877375701+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:56.897855516+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:56.919573327+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:56.940894384+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:57.054291982+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:57.884269406+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:57.904430847+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:57.926195884+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:57.94700987+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:58.060760185+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:58.891155463+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:58.910462313+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:58.932254217+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:58.952792987+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:59.067041114+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:59.897754552+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:59.916475005+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:59.938362824+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:34:59.958795972+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:00.073264291+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:00.904428063+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:00.923053585+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:00.94411871+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:00.964412174+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:01.079226788+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:01.910825281+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:01.928818707+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:01.949924901+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:01.970566144+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:02.085121262+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:02.917571641+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:02.935269391+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:02.955812655+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:02.976848791+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:03.09119021+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:03.923933834+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:03.940855601+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:03.961277278+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:03.982236256+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:04.09669716+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:04.929770255+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:04.946834541+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:04.966644836+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:04.98771275+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:05.102265363+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:05.936046026+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:05.95220518+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:05.971954169+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:05.992995284+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:06.10777764+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:06.942199141+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:06.958221108+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:06.977373245+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:06.998618724+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:07.113243421+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:07.947970142+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:07.964082154+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:07.982736175+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:08.003872281+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:08.118589884+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:08.953758068+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:08.969690438+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:08.988050869+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:09.009028858+08:00" level=error msg="getRemoteApp time used: ResolveID()=1, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge
time="2022-03-17T09:35:09.124060683+08:00" level=error msg="getRemoteApp time used: ResolveID()=0, appid=invokedemo, address=192.168.0.10:39853 \n" app_id=invokeclient instance=skywork scope=dapr.runtime.direct_messaging type=log ver=edge

Only 3 / 6/ 9 / 12 seconds. So, it means that after the 4 exceptions, mDNS will work well ?

@jjcollinge
Copy link

Thanks @skyao I will see if I can repro this behaviour in a test of the mDNS name resolution on its own. Will report back as soon as I can.

@skyao
Copy link
Member

skyao commented Mar 17, 2022

Thanks @skyao I will see if I can repro this behaviour in a test of the mDNS name resolution on its own. Will report back as soon as I can.

OK, you can just add the log as I shown above , rebuild the daprd and replace the one in ~/.dapr/bin, and rerun your test case.

I tested it on ubuntu 20.04 with intel x86 cpu. The branch of dapr repo and java-sdk repo are master.

@skyao
Copy link
Member

skyao commented Mar 17, 2022

Repeat one update above to avoid someone doesn't see it.

fasthttp (Update: fasthttp is OK)

I added some log to see where the request are blocked.

From the log I saw that:

  1. all the 50 requests will be sent by java sdk to daprd
  2. but in daprd log of HTTP API, it seems that by default there are only 5 requests accepted and served. Others are in the queue and wait.

I will continue to find what happened in fasthttp tomorrow.

Update: from the tcp package captured I saw that the HTTP requests is not sent from SDK to daprd in the client side.

Selection_003

  • all the work thread in the client had been started
  • all the HTTP request had enqueue (okhttp send queue)
  • but only 5 http requests had been send to daprd
  • when one of the requests had been completed (http response received), the next request will be sent to daprd

So, it seems that in the java SDK there is a concurrency limitation of 5 thread ? Let me check it.

@jjcollinge
Copy link

I suspect there is also a bug in the mDNS resolver too - I'll dig into it.

@skyao
Copy link
Member

skyao commented Mar 17, 2022

update for okhttp:

I checked the dapr java sdk code and found that there are some default settings in okhttp which we don't set in dapr java sdk and use the default value:

  • maxRequests: default to 64
  • maxRequestsPerHost: default to 5! default to 5! default to 5!
  • maxIdleConnections

After change maxRequestsPerHost from default 5 to a big one, the problem resolved.

@artursouza This is why our performance test is so fast but in the test case of this issue the throughput is so low:

  • By default OKHTTP limits the max request from the same host to 5!
  • In the test case, there is a sleep with 1 second added.

So, the TPS is limited to 5.

In performance test, the response from server is fast enough (less then 1 ms), for the TPS is still high with 5 connections.

@skyao
Copy link
Member

skyao commented Mar 17, 2022

Re run the test case after some update:

  1. set maxRequestsPerHost / maxRequests / maxIdleConnections in okhttp
  2. use sync.once as work around for mDNS bug
  3. update the test case to add warm up before test: warm up for mDNS, HTTP connection, Java JIT ...

Code of InvokeClient class:

public static void main(String[] args) throws Exception {
        var executor = Executors.newFixedThreadPool(20);
        try (DaprClient client = (new DaprClientBuilder()).build()) {

            // warn up for mDNS
            System.out.println("warm up for mDNS");
            doInvoke(client);
            Thread.sleep(2 * 1000);

            // warn up for HTTP connections
            System.out.println("warm up for HTTP connections");
            for (var i = 0; i < 20; i++) {
                executor.execute(() -> {
                    doInvoke(client);
                });
            }
            Thread.sleep(2 * 1000);

            // do real test
            System.out.println("warm up OK! Begin to do test!");
            for (var i = 0; i < 10; i++) {
                executor.execute(() -> {
                    for (var j = 0; j < 100; j++) {
                        doInvoke(client);
                    }
                });
            }
        }

    }

private static void doInvoke(DaprClient client) {
        var timestamp = System.currentTimeMillis();
        var time1 = System.nanoTime();

        // use invokeMethod
        int id = counter.getAndIncrement();
        String request = "id=" + id + ",start=" + timestamp;
        //System.out.println(String.format("begin to send request: %s", request));
        client.invokeMethod(SERVICE_APP_ID, "say", request,
                HttpExtension.POST, null,
                byte[].class).block();

        var used = (double)(System.nanoTime() -  time1)/ 1000 / 1000;
        System.out.println(String.format("client invoke done: id=%d, start=%d, used=%.2f ms", id, timestamp, used));
    }

After these changes, the total end-to-end latency is about 1 - 2 ms. And if we sleep 1 second in the server app, the total end-to-end latency is about 1004 - 1005 ms.

latency-without-sleep.txt

latency-with-sleep-1-second.txt

@jjcollinge
Copy link

@skyao I believe I have got to the bottom of the mDNS behaviour you were seeing - please refer to dapr/components-contrib#1591. Thanks for flagging it.

@artursouza
Copy link
Member

@skyao Thanks, I am moving this issue back to Java SDK since that is where the fix is in. The mDNS finding was also great, thanks.

@artursouza artursouza transferred this issue from dapr/dapr Mar 18, 2022
@artursouza artursouza added this to the v1.5 milestone Mar 18, 2022
@artursouza
Copy link
Member

Closing issue. It is planned for next release.

@mukundansundar mukundansundar moved this from Backlog to In Progress (Development) in Dapr Roadmap Mar 27, 2022
@artursouza artursouza moved this from In Progress (Development) to Done in Dapr Roadmap Apr 18, 2022
@artursouza artursouza moved this from Done to Released in Dapr Roadmap Apr 18, 2022
@yaron2 yaron2 modified the milestones: v1.5, v1.6 Jul 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working P1 size/M triaged/resolved Items triaged and ready
Projects
Development

No branches or pull requests

5 participants