Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RavenDB-21956 Add azure queue storage etl #18293

Open
wants to merge 1 commit into
base: v6.0
Choose a base branch
from

Conversation

djordjedjukic
Copy link
Contributor

@djordjedjukic djordjedjukic commented Mar 26, 2024

Issue link

https://issues.hibernatingrhinos.com/issue/RavenDB-21956/ETL-to-Azure-Queue-Storage

Additional description

Expanding Queue ETL with Azure Queue Storage provider.

Type of change

  • New feature

How risky is the change?

  • Not relevant

Backward compatibility

  • Non breaking change

Is it platform specific issue?

  • No

Documentation update

  • This change requires a documentation update. Please mark the issue on YouTrack using Documentation Required tag.

Testing by Contributor

  • Tests have been added that prove the fix is effective or that the feature works

Testing by RavenDB QA team

  • This change requires a special QA testing due to possible performance or resources usage implications (CPU, memory, IO). Please mark the issue on YouTrack using QA Required tag.

Is there any existing behavior change of other features due to this change?

  • No

UI work

  • It requires further work in the Studio. Please mark the issue on YouTrack using Studio Required tag.

{
public EntraId EntraId { get; set; }
public string ConnectionString { get; set; }
public bool Passwordless { get; set; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why those aren't directly part of AzureQueueStorageConnectionSettings? Do we need AzureQueueStorageConnectionSettings.Authentication field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done
(I followed how we did for Elasticsearch, but maybe there also should be directly part of connection settings).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of ElasticSearchConnectionString it was a bit different because we provide node's URLs separately and then the authentication. Here we don't have such separation

@@ -31,6 +33,12 @@ protected override void ValidateImpl(ref List<string> errors)
errors.Add($"{nameof(RabbitMqConnectionSettings)} has no valid setting.");
}
break;
case QueueBrokerType.AzureQueueStorage:
if (AzureQueueStorageConnectionSettings?.Authentication == null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking if it's not null is enough? I think we should rather verify that what was provided there is correct. For example that there is just one auth option provided and that is has all fields / properties filled correctly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -216,7 +216,9 @@ public bool CanAutoRenewLetsEncryptCertificate

public bool HasElasticSearchEtl => Enabled(LicenseAttribute.ElasticSearchEtl);

public bool HasQueueEtl => Enabled(LicenseAttribute.QueueEtl);
//public bool HasQueueEtl => Enabled(LicenseAttribute.QueueEtl);
public bool HasQueueEtl => true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure it will be reverted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

try
{
string authenticationJson = await new StreamReader(HttpContext.Request.Body).ReadToEndAsync();
Authentication authentication = JsonConvert.DeserializeObject<Authentication>(authenticationJson);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why just Authentication instead of entire connection string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
else
{
throw new NotSupportedException("Provided authentication method is not supported");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure it will be validate much earlier - on attempt to put a connection string. We should never get here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just removed this part. You can not create auth option that doesn't exist in AzureQueueStorageConnectionSettings

@@ -155,7 +155,12 @@
<PackageReference Include="AWSSDK.Core" Version="3.7.302.16" />
<PackageReference Include="AWSSDK.Glacier" Version="3.7.300.55" />
<PackageReference Include="AWSSDK.S3" Version="3.7.305.31" />
<PackageReference Include="AWSSDK.Core" Version="3.7.302.9" />
<PackageReference Include="AWSSDK.Glacier" Version="3.7.300.49" />
<PackageReference Include="AWSSDK.S3" Version="3.7.305.25" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a merge conflict - see duplicated AWSSDK references

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


namespace SlowTests.Server.Documents.ETL.Queue;

public class AzureQueueStorageEtlTests : AzureQueueStorageEtlTestBase
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to do any cleanup of queues at the end of each test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was rather thinking about deleting queues during the dispose of tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you shut down azurite everything is deleted.

Do you think we need to delete queues explicitly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case we'll rather have Azurite emulator always running (in docker) so the cleanup after tests is necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


Assert.Empty(record.QueueEtls);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have tests which intentionally try to push too big messages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

[RavenFact(RavenTestCategory.BackupExportImport | RavenTestCategory.Sharding | RavenTestCategory.Etl)]
public async Task ShouldSkipUnsupportedFeaturesInShardingOnImport_RabbitMqEtl()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public async Task ShouldSkipUnsupportedFeaturesInShardingOnImport_RabbitMqEtl()
public async Task ShouldSkipUnsupportedFeaturesInShardingOnImport_AzureQueueStorage()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

public TimeSetting AzureQueueStorageTimeToLive{ get; set; }

[Description("How long a message is hidden after being retrieved but not deleted")]
[DefaultValue(0)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also the default value in Azure right?

src/Raven.Server/Documents/ETL/EtlProcess.cs Show resolved Hide resolved
@arekpalinski
Copy link
Member

Please do the rebase and solve the conflicts

@arekpalinski
Copy link
Member

There is a studio compilation issue here. @kalczur or @ml054 can you please check it?

@djordjedjukic
Copy link
Contributor Author

There is a studio compilation issue here. @kalczur or @ml054 can you please check it?

Probably I made a mistake during resolving conflicts.

@ml054
Copy link
Member

ml054 commented May 8, 2024

@djordjedjukic Was there any c# conflicts? If no maybe @kalczur can rebase again?

@kalczur It looks like conflict with pinning you were adding last time. I assume we need to apply same changes in this PR.

@kalczur
Copy link
Contributor

kalczur commented May 9, 2024

@djordjedjukic Was there any c# conflicts? If no maybe @kalczur can rebase again?

@kalczur It looks like conflict with pinning you were adding last time. I assume we need to apply same changes in this PR.

sure, I'll fix it

RavenDB-21956 Allow to pin responsible node in azure queue storage etl
<PackageReference Include="AWSSDK.Core" Version="3.7.302.16" />
<PackageReference Include="AWSSDK.Glacier" Version="3.7.300.55" />
<PackageReference Include="AWSSDK.S3" Version="3.7.305.31" />
<PackageReference Include="Azure.Identity" Version="1.10.4" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant AWSSDK.* - because it's duplicated

{
var parts = connectionString.Split(';')
.Select(p => p.Split('='))
.ToDictionary(p => p[0], p => p.Length > 1 ? p[1] : "");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could use SqlConnectionStringParser.GetConnectionStringValue() here

return storageAccountName;
}

private void HandleConnectionStringError(string message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Handle" doesn't suggest this method will throw. Rename do ThrowConnectionStringError() maybe?

return $"https://{storageAccountName}.queue.core.windows.net/";
}

private string GetUrlFromConnectionString(string connectionString)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have unit tests for method? So we'll know that we properly deal with connection strings (it can be handled with the usage of a public GetStorageUrl() method)

args.Length == 2 ? args[1].AsObject() : null);
}

private JsValue LoadToFunctionTranslatorWithAttributesInternal(string name, ObjectInstance obj,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do something about that?

CreateQueueIfNotExists(queueClient);

foreach (AzureQueueStorageItem queueItem in queue.Items)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have CancellationToken.ThrowIfCancellationRequested(); here in order to stop processing if the cancellation was requested


public bool IsValid()
{
return !string.IsNullOrWhiteSpace(StorageAccountName) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return !string.IsNullOrWhiteSpace(StorageAccountName) &&
return string.IsNullOrWhiteSpace(StorageAccountName) == false &&

improves readability, same goes for lines below


public bool IsValid()
{
return !string.IsNullOrWhiteSpace(StorageAccountName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return !string.IsNullOrWhiteSpace(StorageAccountName);
return string.IsNullOrWhiteSpace(StorageAccountName) == false;

Copy link
Contributor

@poissoncorp poissoncorp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newest Arek's comments mostly reflect also my thoughts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants