[Bug]: Unable to use UploadCsvAsync #1175

Open
jeroen-corsius-choreograph opened this issue May 10, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@jeroen-corsius-choreograph

Testcontainers version

3.8.0

Using the latest Testcontainers version?

Yes

Host OS

Windows

Host arch

x64

.NET version

8.0.100

Docker version

26.0.0, build 2ae903e

Docker info

Client: Docker Engine - Community
 Version:    26.0.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.13.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.4.1
    Path:     /home/jeroencorsius/.docker/cli-plugins/docker-compose
  scan: Docker Scan (Docker Inc.)
    Version:  v0.23.0
    Path:     /usr/libexec/docker/cli-plugins/docker-scan

Server:
 Containers: 92
  Running: 9
  Paused: 0
  Stopped: 83
 Images: 1577
 Server Version: 26.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.10.16.3-microsoft-standard-WSL2
 Operating System: Ubuntu 20.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 24.84GiB
 Name: EINGRMW2791-D
 ID: dfbe5910-3e30-4a9d-8d1c-d81f3ffe03fd
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

[DEPRECATION NOTICE]: API is accessible on http://0.0.0.0:2375 without encryption.
         Access to the remote API is equivalent to root access on the host. Refer
         to the 'Docker daemon attack surface' section in the documentation for
         more information: https://docs.docker.com/go/attack-surface/
In future versions this will be a hard failure preventing the daemon from starting! Learn more at: https://docs.docker.com/go/api-security/
WARNING: No blkio throttle.read_bps_device support
WARNING: No blkio throttle.write_bps_device support
WARNING: No blkio throttle.read_iops_device support
WARNING: No blkio throttle.write_iops_device support

What happened?

Uploading a CSV file with UploadCsvAsync against the BigQuery emulator container fails with an HttpRequestException.
Example code:

using System.Net;
using System.Text;
using Google.Cloud.BigQuery.V2;
using Testcontainers.BigQuery;
using Xunit;

namespace Plugin.Google.Test.Integration.Export;

public class ExampleTest {
  [Fact]
  public async Task Example() {
    var projectName = "some_project";
    var datasetName = "some_dataset";
    var tableName = "some_table";

    // Create/start container
    var bigQueryContainer = new BigQueryBuilder().WithProject(projectName).WithReuse(true).Build();
    await bigQueryContainer.StartAsync();
    
    // Create client
    var bigQueryClient = await new BigQueryClientBuilder {
      BaseUri = bigQueryContainer.GetEmulatorEndpoint(),
      ProjectId = projectName,
    }.BuildAsync();

    // Create dataset
    await bigQueryClient.CreateDatasetAsync(datasetName);
    
    // Create table
    var createTableQuery = $"CREATE TABLE IF NOT EXISTS {datasetName}.{tableName} (id INT64, name STRING);";
    await bigQueryClient.ExecuteQueryAsync(createTableQuery, null);
    
    // Create CSV memory stream
    var csvData = "1,John\n2,Jane\n3,Michael";
    var csvMemoryStream = new MemoryStream(Encoding.UTF8.GetBytes(csvData));
    
    // Upload CSV
    var uploadCsvJob = await bigQueryClient.UploadCsvAsync(datasetName, tableName, null, csvMemoryStream);
    await uploadCsvJob.PollUntilCompletedAsync();
  }
}

Relevant log output

System.Net.Http.HttpRequestException: IPv4 address 0.0.0.0 and IPv6 address ::0 are unspecified addresses that cannot be used as a target address. (Pa...

System.Net.Http.HttpRequestException
IPv4 address 0.0.0.0 and IPv6 address ::0 are unspecified addresses that cannot be used as a target address. (Parameter 'hostName') (0.0.0.0:9050)
   at Google.Cloud.BigQuery.V2.BigQueryClientImpl.UploadDataAsync(JobConfigurationLoad loadConfiguration, Stream input, String contentType, JobCreationOptions options, CancellationToken cancellationToken)
   at Google.Cloud.BigQuery.V2.BigQueryClientImpl.UploadCsvAsync(TableReference tableReference, TableSchema schema, Stream input, UploadCsvOptions options, CancellationToken cancellationToken)
   at Plugin.Google.Test.Integration.Export.ExampleTest.Example() in C:\Projects\badger\services\Plugin.Google.Test\Integration\Export\TemptTest.cs:line 37
   at Xunit.Sdk.TestInvoker`1.<>c__DisplayClass46_0.<<InvokeTestMethodAsync>b__1>d.MoveNext() in /_/src/xunit.execution/Sdk/Frameworks/Runners/TestInvoker.cs:line 253
--- End of stack trace from previous location ---
   at Xunit.Sdk.ExecutionTimer.AggregateAsync(Func`1 asyncAction) in /_/src/xunit.execution/Sdk/Frameworks/ExecutionTimer.cs:line 48
   at Xunit.Sdk.ExceptionAggregator.RunAsync(Func`1 code) in /_/src/xunit.core/Sdk/ExceptionAggregator.cs:line 90

System.ArgumentException
IPv4 address 0.0.0.0 and IPv6 address ::0 are unspecified addresses that cannot be used as a target address. (Parameter 'hostName')
   at System.Net.Dns.GetHostEntryOrAddressesCoreAsync(String hostName, Boolean justReturnParsedIp, Boolean throwOnIIPAny, Boolean justAddresses, AddressFamily family, CancellationToken cancellationToken)
   at System.Net.Dns.GetHostAddressesAsync(String hostNameOrAddress, AddressFamily family, CancellationToken cancellationToken)
   at System.Net.Sockets.SocketAsyncEventArgs.DnsConnectAsync(DnsEndPoint endPoint, SocketType socketType, ProtocolType protocolType)
   at System.Net.Sockets.Socket.ConnectAsync(SocketAsyncEventArgs e, Boolean userSocket, Boolean saeaCancelable)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ConnectAsync(Socket socket)
   at System.Net.Sockets.Socket.ConnectAsync(EndPoint remoteEP, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectToTcpHostAsync(String host, Int32 port, HttpRequestMessage initialRequest, Boolean async, CancellationToken cancellationToken)

Additional information

Workaround
If I replace the "Create client" part of the provided example with the code below, I'm able to upload CSV files:

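    // Create client
    // HttpClientFactory here is Google.Apis.Http.HttpClientFactory (add `using Google.Apis.Http;`).
    // Sending all requests through the mapped host port as a proxy presumably works because
    // the client then never connects to the 0.0.0.0:9050 upload URI directly.
    var mappedPublicPort = bigQueryContainer.GetMappedPublicPort(BigQueryBuilder.BigQueryPort);
    var exposedContainerAddress = $"localhost:{mappedPublicPort}";
    var bigQueryClient = await new BigQueryClientBuilder {
      BaseUri = bigQueryContainer.GetEmulatorEndpoint(),
      ProjectId = projectName,
      HttpClientFactory = HttpClientFactory.ForProxy(new WebProxy(exposedContainerAddress)),
    }.BuildAsync();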
@HofmeisterAn
Collaborator

HofmeisterAn commented May 13, 2024

I am not very familiar with BigQuery, but while debugging the issue I noticed that the client initiates the resumable upload session and ends up with the following session URI (exposed as the UploadUri property):

http://0.0.0.0:9050/upload/bigquery/v2/projects/some_project/jobs?uploadType=resumable&upload_id=job_b5c859ce_3301_4d2c_aa91_f81c43a355b8

I believe this is incorrect. I assume the host should be the Docker host (127.0.0.1) and the port should be the randomly assigned host port. I expect the same problem when running the container from the CLI (without Testcontainers). However, I am unsure whether a configuration is missing (or whether any such configuration exists at all), or whether your workaround is the correct approach and should be documented.

Perhaps the best approach is to create an upstream issue (the initial issue goccy/bigquery-emulator#311 was probably right) to get more details about how the URI is created and whether any configuration is available or necessary. I have not looked at the documentation yet, so that is probably something we should do too.
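
To illustrate the rewrite described above (0.0.0.0:9050 replaced by the Docker host and the mapped port), here is a minimal sketch of a message handler that does this on the client side. The class name UnspecifiedHostRewritingHandler and its Rewrite helper are hypothetical; they are not part of Testcontainers or the BigQuery client. The sketch is untested against the emulator, and wiring it into BigQueryClientBuilder (for example through a custom HttpClientFactory) is deliberately left out.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: redirects requests aimed at an unspecified address
// (0.0.0.0 or ::) to a reachable host and port, e.g. the mapped container port.
public sealed class UnspecifiedHostRewritingHandler : DelegatingHandler {
  private readonly string _host;
  private readonly int _port;

  public UnspecifiedHostRewritingHandler(string host, int port, HttpMessageHandler inner)
    : base(inner) {
    _host = host;
    _port = port;
  }

  // Returns the URI unchanged unless its host is 0.0.0.0 or ::.
  // Scheme, path and query (including upload_id) are preserved.
  public static Uri Rewrite(Uri uri, string host, int port) {
    // DnsSafeHost strips the brackets from IPv6 literals so they can be parsed.
    if (!IPAddress.TryParse(uri.DnsSafeHost, out var address)) {
      return uri;
    }

    if (!address.Equals(IPAddress.Any) && !address.Equals(IPAddress.IPv6Any)) {
      return uri;
    }

    return new UriBuilder(uri) { Host = host, Port = port }.Uri;
  }

  protected override Task<HttpResponseMessage> SendAsync(
    HttpRequestMessage request, CancellationToken cancellationToken) {
    if (request.RequestUri is not null) {
      request.RequestUri = Rewrite(request.RequestUri, _host, _port);
    }

    return base.SendAsync(request, cancellationToken);
  }
}
```

In a test, the host and port would come from the container (the workaround above already reads the port via GetMappedPublicPort). Compared to the proxy workaround, this only touches requests whose target is an unspecified address, but it still only hides the problem; the upload URI produced by the emulator remains wrong.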
