PUT /_snapshot/{repository} with invalid repository configuration returns an error response but still successfully changes the config #107840

DiannaHohensee · 2024-04-24T13:24:53Z

Description

In a support case, it was possible to set an invalid repository config (missing required S3 settings fields) even though the API request returned a 500 error. We should at least validate the expected request fields, and perhaps also verify that a connection to the remote repository can be made, before writing the new repository config to the cluster state.

PUT _snapshot/found-snapshots
{
  "type": "s3",
  "settings": {
    "compress": false
  }
}

response

{
  "error": {
    "root_cause": [
      {
        "type": "repository_verification_exception",
        "reason": "[found-snapshots] path  is not accessible on master node"
      }
    ],
    "type": "repository_verification_exception",
    "reason": "[found-snapshots] path  is not accessible on master node",
    "caused_by": {
      "type": "i_o_exception",
      "reason": "Unable to upload object [tests-kdwUK-MWR-mwL1Bhev2agA/master.dat] using a single upload",
      "caused_by": {
        "type": "sdk_client_exception",
        "reason": "Failed to connect to service endpoint: ",
        "caused_by": {
          "type": "connect_exception",
          "reason": "Connection refused"
        }
      }
    }
  },
  "status": 500
}

Despite the error, GET _snapshot/found-snapshots returns the new config:

{
  "found-snapshots": {
    "type": "s3",
    "settings": {
      "compress": "false"
    }
  }
}

elasticsearchmachine · 2024-04-24T13:27:55Z

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner · 2024-04-24T13:54:10Z

I think the only required setting for an S3 repo is bucket. The code appears to have the validation you suggest, but there's a bug: the value is never null, even when the setting is not set:

diff --git a/modules/repository-s3/src/main/java/org/elasticsearch/repositories/s3/S3Repository.java b/modules/repository-s3/src/main/java/org/elasticsearch/repositories/s3/S3Repository.java
index 26b1b1158de..a777f846ca7 100644
--- a/modules/repository-s3/src/main/java/org/elasticsearch/repositories/s3/S3Repository.java
+++ b/modules/repository-s3/src/main/java/org/elasticsearch/repositories/s3/S3Repository.java
@@ -223,7 +223,7 @@ class S3Repository extends MeteredBlobStoreRepository {

         // Parse and validate the user's S3 Storage Class setting
         this.bucket = BUCKET_SETTING.get(metadata.settings());
-        if (bucket == null) {
+        if (Strings.hasText(bucket) == false) {
             throw new RepositoryException(metadata.name(), "No bucket defined for s3 repository");
         }

The other settings, including base_path and client, have reasonable defaults and do not need to be present in the request.
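
To illustrate why the null check never fires, here is a minimal sketch. It assumes the setting falls back to an empty string when absent, which is what the comment above implies. `getBucketSetting` and `hasText` are hypothetical stand-ins written for this example, not the Elasticsearch `Setting` or `Strings` classes.

```java
import java.util.Map;

class BucketCheckSketch {
    // Hypothetical stand-in for a string setting whose default is "" rather than null.
    static String getBucketSetting(Map<String, String> settings) {
        return settings.getOrDefault("bucket", "");
    }

    // Hypothetical stand-in for a has-text check:
    // true only for a non-null string with at least one non-whitespace character.
    static boolean hasText(String s) {
        return s != null && !s.isBlank();
    }

    // Shape of the old check: only rejects a null value.
    static boolean oldCheckRejects(Map<String, String> settings) {
        return getBucketSetting(settings) == null;
    }

    // Shape of the proposed check: rejects null, empty, and whitespace-only values.
    static boolean newCheckRejects(Map<String, String> settings) {
        return hasText(getBucketSetting(settings)) == false;
    }

    public static void main(String[] args) {
        // Same settings as the request in the description: no bucket given.
        Map<String, String> missingBucket = Map.of("compress", "false");
        System.out.println("old check rejects: " + oldCheckRejects(missingBucket)); // false
        System.out.println("new check rejects: " + newCheckRejects(missingBucket)); // true
    }
}
```

Under that assumption the old check lets the request through, and the missing bucket only surfaces later as a connection failure.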

DiannaHohensee · 2024-04-24T14:34:59Z

The request also returned a 500 error, so it seems like that wasn't sufficient to stop the config from being set. Would we want to confirm the config works before setting it in the cluster state, or are there reasons, like performance or something, not to do so?

DaveCTurner · 2024-04-24T15:21:57Z

There's no great reason IMO, it's just not how it was originally implemented and would be a little awkward to achieve. At the moment, first we add the repository metadata to the cluster state, which triggers the creation of the repository instance on all the nodes, and then we ask all the nodes to try using the repository we just added. It's this last step which fails with a 500-code exception, by which point the repo is already in the cluster state. Ideally we'd do the verification step first and only if that succeeds would we add the repo to the cluster state, but that'd mean creating a temporary repository instance on each node just for the verification (and worrying about overlapping requests etc etc).
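
The difference between the two orderings can be sketched with a toy model. Everything here is an assumption for illustration: the `clusterState` map stands in for the repository metadata in the cluster state, and the `verifier` predicate stands in for the real verification step (which writes to the blob store on each node). None of this is the actual `RepositoriesService` code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

class RegistrationOrderSketch {
    // Hypothetical stand-in for the repositories section of the cluster state.
    final Map<String, Map<String, String>> clusterState = new HashMap<>();

    // Current order as described above: publish first, verify afterwards.
    void registerThenVerify(String name, Map<String, String> settings, Predicate<Map<String, String>> verifier) {
        clusterState.put(name, settings);
        if (verifier.test(settings) == false) {
            throw new IllegalStateException("[" + name + "] verification failed");
        }
    }

    // Alternative order: verify first, publish only on success.
    void verifyThenRegister(String name, Map<String, String> settings, Predicate<Map<String, String>> verifier) {
        if (verifier.test(settings) == false) {
            throw new IllegalStateException("[" + name + "] verification failed");
        }
        clusterState.put(name, settings);
    }

    public static void main(String[] args) {
        // Toy verifier: a repository is only usable if a bucket is configured.
        Predicate<Map<String, String>> needsBucket = s -> s.containsKey("bucket");
        Map<String, String> bad = Map.of("compress", "false");

        RegistrationOrderSketch current = new RegistrationOrderSketch();
        try {
            current.registerThenVerify("found-snapshots", bad, needsBucket);
        } catch (IllegalStateException e) {
            System.out.println("request failed: " + e.getMessage());
        }
        System.out.println("register-then-verify left config behind: "
            + current.clusterState.containsKey("found-snapshots")); // true

        RegistrationOrderSketch alternative = new RegistrationOrderSketch();
        try {
            alternative.verifyThenRegister("found-snapshots", bad, needsBucket);
        } catch (IllegalStateException e) {
            System.out.println("request failed: " + e.getMessage());
        }
        System.out.println("verify-then-register left config behind: "
            + alternative.clusterState.containsKey("found-snapshots")); // false
    }
}
```

Both calls fail the request, but only the first leaves the invalid config in place, which matches the behaviour reported in the description.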

DiannaHohensee · 2024-04-24T15:39:49Z

> but that'd mean creating a temporary repository instance on each node just for the verification

Is there anything stopping us from only doing an initial verification on the node running the request? A best-effort check to verify the config, before proceeding to send it out to all nodes via cluster state update (and then all nodes verify as per usual, too).

DaveCTurner · 2024-04-24T15:50:54Z

Yeah that'd work too I reckon. Indeed we do already create a temporary repository instance on the master first, we just don't try and use it at that point:

elasticsearch/server/src/main/java/org/elasticsearch/repositories/RepositoriesService.java

Lines 351 to 352 in 22d015b

    
        // Trying to create the new repository on master to make sure it works
        closeRepository(createRepository(newRepositoryMetadata));

DaveCTurner · 2024-04-24T15:56:33Z

so maybe something like this would be enough:

diff --git a/server/src/main/java/org/elasticsearch/repositories/RepositoriesService.java b/server/src/main/java/org/elasticsearch/repositories/RepositoriesService.java
index f7a2a605a18..0fa7a0abead 100644
--- a/server/src/main/java/org/elasticsearch/repositories/RepositoriesService.java
+++ b/server/src/main/java/org/elasticsearch/repositories/RepositoriesService.java
@@ -349,7 +349,14 @@ public class RepositoriesService extends AbstractLifecycleComponent implements C
         final RepositoryMetadata newRepositoryMetadata = new RepositoryMetadata(request.name(), request.type(), request.settings());

         // Trying to create the new repository on master to make sure it works
-        closeRepository(createRepository(newRepositoryMetadata));
+        final var repository = createRepository(newRepositoryMetadata);
+        try {
+            if (request.verify()) {
+                repository.endVerification(repository.startVerification());
+            }
+        } finally {
+            closeRepository(repository);
+        }
     }

     private void submitUnbatchedTask(@SuppressWarnings("SameParameterValue") String source, ClusterStateUpdateTask task) {
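
The try/finally shape of that change can be exercised in isolation. In the sketch below only the method names `startVerification` and `endVerification` come from the diff above. The `Repo` interface, its `String` seed, and the `RecordingRepo` test double are simplified assumptions made for this example and do not mirror the real `Repository` API.

```java
import java.util.ArrayList;
import java.util.List;

class VerifyBeforePublishSketch {
    // Hypothetical, simplified repository interface for illustration only.
    interface Repo {
        String startVerification();
        void endVerification(String seed);
        void close();
    }

    // Verifies (when requested) on a temporary instance and always closes it,
    // even if verification throws.
    static void validate(Repo repository, boolean verify) {
        try {
            if (verify) {
                repository.endVerification(repository.startVerification());
            }
        } finally {
            repository.close();
        }
    }

    // Test double that records calls and can simulate an unreachable store.
    static class RecordingRepo implements Repo {
        final List<String> calls = new ArrayList<>();
        final boolean reachable;

        RecordingRepo(boolean reachable) {
            this.reachable = reachable;
        }

        @Override
        public String startVerification() {
            calls.add("start");
            if (reachable == false) {
                throw new IllegalStateException("path is not accessible");
            }
            return "seed";
        }

        @Override
        public void endVerification(String seed) {
            calls.add("end");
        }

        @Override
        public void close() {
            calls.add("close");
        }
    }

    public static void main(String[] args) {
        RecordingRepo broken = new RecordingRepo(false);
        try {
            validate(broken, true);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(broken.calls); // [start, close]
    }
}
```

The point of the finally block is that the temporary instance is closed whether or not verification succeeds, so a failed request leaves nothing behind on the master.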

Adds basic validation of the S3 bucket name: it must be neither null nor an empty string. This aligns with the public [docs](https://www.elastic.co/guide/en/elasticsearch/reference/8.13/repository-s3.html#repository-s3-repository), which list "bucket" as a required field. We might want to add more validation based on S3 naming rules. This PR should not be a breaking change, because a missing bucket already causes an exception later in the code, just with an obscure error. I've added a YAML test to the module's [repository_s3/10_basic.yml](https://github.com/elastic/elasticsearch/compare/main...mhl-b:elasticsearch:s3-bucket-validation?expand=1#diff-08cf26742fe939f5575961254c4d3b4bff6915141cdd6abe4cd28a743d1b70ba); I'm not sure if that's the right place. Addresses #107840

DiannaHohensee added the good first issue low hanging fruit label Apr 24, 2024

DiannaHohensee assigned mhl-b Apr 24, 2024

mhl-b mentioned this issue Apr 25, 2024

Add null and empty string validation to S3 bucket #107883

Merged

pxsalehi mentioned this issue May 3, 2024

Snapshot repository creation errors out, but is returned by _snapshot/_all #108248

Closed
