Support Elasticsearch 8 #92

Closed
jertel opened this issue Apr 26, 2021 · 44 comments · Fixed by #744
Assignees
Labels
enhancement New feature or request

Comments

@jertel
Owner

jertel commented Apr 26, 2021

Elasticsearch v8 no longer supports doc_type. There is likely going to be some effort needed to update this project to deal with this.

@nsano-rururu
Collaborator

nsano-rururu commented Apr 27, 2021

ES8 support may need to cover more than doc_type.

In ES 7.x the default of include_type_name was changed to false. elastalert explicitly specifies include_type_name = true in order to use the type, but that parameter will be removed in ES8.

Also, the format name dateOptionalTime is being removed and replaced by date_optional_time.

elasticsearch>=7.0.0,<8.0.0

    ProcessController:  /home/node/.local/lib/python3.8/site-packages/elasticsearch/connection/base.py:193: ElasticsearchDeprecationWarning: Camel case format name dateOptionalTime is deprecated and will be removed in a future version. Use snake case name date_optional_time instead.
      warnings.warn(message, category=ElasticsearchDeprecationWarning)
    /home/node/.local/lib/python3.8/site-packages/elasticsearch/connection/base.py:193: ElasticsearchDeprecationWarning: [types removal] Using include_type_name in put mapping requests is deprecated. The parameter will be removed in the next major version.
      warnings.warn(message, category=ElasticsearchDeprecationWarning)

The following are expected changes that need to be addressed.

__init__.py

add

    def is_atleasteight(self):
        """
        Returns True when the Elasticsearch server version >= 8
        """
        return int(self.es_version.split(".")[0]) >= 8
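As a rough standalone illustration of the check above (a sketch only; the free function `major_version_at_least` is a hypothetical name, not part of ElastAlert), the same parsing also copes with pre-release strings such as 8.0.0-alpha2, because only the text before the first dot is converted to an integer:

```python
def major_version_at_least(es_version, minimum):
    """Return True when the major part of es_version is >= minimum.

    Hypothetical helper for illustration; it mirrors the
    int(es_version.split(".")[0]) logic of is_atleasteight.
    """
    return int(es_version.split(".")[0]) >= minimum


if __name__ == "__main__":
    for version in ("6.8.0", "7.17.0", "8.0.0-alpha2"):
        print(version, major_version_at_least(version, 8))
```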

elastalert/create_index.py

before

   if is_atleastseven(esversion):
        # TODO remove doc_type completely when elasicsearch client allows doc_type=None
        # doc_type is a deprecated feature and will be completely removed in Elasicsearch 8
        es_client.indices.put_mapping(index=ea_index, doc_type='_doc',
                                      body=es_index_mappings['elastalert'], include_type_name=True)
        es_client.indices.put_mapping(index=ea_index + '_status', doc_type='_doc',
                                      body=es_index_mappings['elastalert_status'], include_type_name=True)
        es_client.indices.put_mapping(index=ea_index + '_silence', doc_type='_doc',
                                      body=es_index_mappings['silence'], include_type_name=True)
        es_client.indices.put_mapping(index=ea_index + '_error', doc_type='_doc',
                                      body=es_index_mappings['elastalert_error'], include_type_name=True)
        es_client.indices.put_mapping(index=ea_index + '_past', doc_type='_doc',
                                      body=es_index_mappings['past_elastalert'], include_type_name=True)

after

   if is_atleasteight(esversion):
        es_client.indices.put_mapping(index=ea_index,
                                      body=es_index_mappings['elastalert'])
        es_client.indices.put_mapping(index=ea_index + '_status',
                                      body=es_index_mappings['elastalert_status'])
        es_client.indices.put_mapping(index=ea_index + '_silence',
                                      body=es_index_mappings['silence'])
        es_client.indices.put_mapping(index=ea_index + '_error',
                                      body=es_index_mappings['elastalert_error'])
        es_client.indices.put_mapping(index=ea_index + '_past',
                                      body=es_index_mappings['past_elastalert'])
   elif is_atleastseven(esversion):
        es_client.indices.put_mapping(index=ea_index, doc_type='_doc',
                                      body=es_index_mappings['elastalert'])
        es_client.indices.put_mapping(index=ea_index + '_status', doc_type='_doc',
                                      body=es_index_mappings['elastalert_status'])
        es_client.indices.put_mapping(index=ea_index + '_silence', doc_type='_doc',
                                      body=es_index_mappings['silence'])
        es_client.indices.put_mapping(index=ea_index + '_error', doc_type='_doc',
                                      body=es_index_mappings['elastalert_error'])
        es_client.indices.put_mapping(index=ea_index + '_past', doc_type='_doc',
                                      body=es_index_mappings['past_elastalert'])

elastalert/loaders.py

It is necessary to prevent the following checks from being performed on elasticsearch 8 or later versions

        if rule.get('use_count_query') or rule.get('use_terms_query'):
            if 'doc_type' not in rule:
                raise EAException('doc_type must be specified.')
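One way the check could be skipped on Elasticsearch 8 is sketched below. This is only an illustration under assumptions: `check_doc_type` and its `es_major_version` argument are hypothetical, and `EAException` is redefined here as a stand-in so the snippet runs on its own.

```python
class EAException(Exception):
    """Stand-in for ElastAlert's exception type, defined here for the sketch."""


def check_doc_type(rule, es_major_version):
    """Require doc_type for count/terms queries, except on ES 8 or later.

    Hypothetical helper: how the loader would learn the server version
    is left open here.
    """
    if es_major_version >= 8:
        # Mapping types no longer exist in ES 8, so nothing to validate.
        return
    if rule.get('use_count_query') or rule.get('use_terms_query'):
        if 'doc_type' not in rule:
            raise EAException('doc_type must be specified.')
```

With this shape, a rule that sets use_count_query but has no doc_type passes on ES 8 and still raises on ES 7 and earlier.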

elastalert/elastalert.py

before

        # Record doc_type for use in get_top_counts
        if 'doc_type' not in rule and len(hits):
          rule['doc_type'] = hits[0]['_type']

after

        # Record doc_type for use in get_top_counts
        if not self.thread_data.current_es.is_atleasteight():
            if 'doc_type' not in rule and len(hits):
                rule['doc_type'] = hits[0]['_type']

before

            if not rule['five']:
                res = self.thread_data.current_es.deprecated_search(
                    index=index,
                    doc_type=rule['doc_type'],
                    body=query,
                    search_type='count',
                    ignore_unavailable=True
                )
            else:
                res = self.thread_data.current_es.deprecated_search(index=index, doc_type=rule['doc_type'],
                                                                    body=query, size=0, ignore_unavailable=True)

after

            if not rule['five']:
                if self.thread_data.current_es.is_atleasteight():
                    res = self.thread_data.current_es.deprecated_search(
                        index=index,
                        body=query,
                        search_type='count',
                        ignore_unavailable=True
                    )
                else:
                    res = self.thread_data.current_es.deprecated_search(
                        index=index,
                        doc_type=rule['doc_type'],
                        body=query,
                        search_type='count',
                        ignore_unavailable=True
                    )
            else:
                res = self.thread_data.current_es.deprecated_search(index=index, doc_type=rule['doc_type'],
                                                                    body=query, size=0, ignore_unavailable=True)

before

            if not rule['five']:
                res = self.thread_data.current_es.deprecated_search(
                    index=index,
                    doc_type=rule.get('doc_type'),
                    body=query,
                    search_type='count',
                    ignore_unavailable=True
                )
            else:
                res = self.thread_data.current_es.deprecated_search(index=index, doc_type=rule.get('doc_type'),
                                                                    body=query, size=0, ignore_unavailable=True)

after

            if not rule['five']:
                if self.thread_data.current_es.is_atleasteight():
                    res = self.thread_data.current_es.deprecated_search(
                        index=index,
                        body=query,
                        search_type='count',
                        ignore_unavailable=True
                    )
                else:
                    res = self.thread_data.current_es.deprecated_search(
                        index=index,
                        doc_type=rule.get('doc_type'),
                        body=query,
                        search_type='count',
                        ignore_unavailable=True
                    )
            else:
                res = self.thread_data.current_es.deprecated_search(index=index, doc_type=rule.get('doc_type'),
                                                                    body=query, size=0, ignore_unavailable=True)
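The before/after pairs above repeat the whole call just to drop one argument. A possible alternative, shown here purely as a sketch, is to build the keyword arguments once and add doc_type only for pre-8 servers. The helper name `build_search_kwargs` and its parameters are hypothetical, and whether deprecated_search accepts exactly these keywords in every branch is an assumption taken from the snippets above.

```python
def build_search_kwargs(index, query, rule, at_least_eight):
    """Assemble keyword arguments for a search call (hypothetical helper)."""
    kwargs = {'index': index, 'body': query, 'ignore_unavailable': True}
    # Mapping types were removed in Elasticsearch 8, so doc_type is only
    # passed along for older servers, and only when the rule defines one.
    if not at_least_eight and rule.get('doc_type'):
        kwargs['doc_type'] = rule['doc_type']
    return kwargs


if __name__ == "__main__":
    rule = {'doc_type': 'logs'}
    print(build_search_kwargs('logstash-*', {'query': {}}, rule, at_least_eight=False))
    print(build_search_kwargs('logstash-*', {'query': {}}, rule, at_least_eight=True))
```

The caller would then add search_type or size as needed and unpack the dictionary into the search call.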

elastalert/test_rule.py

before

res = es_client.count(index=index, doc_type=doc_type, body=count_query, ignore_unavailable=True)

after

if es_client.is_atleasteight():
    res = es_client.count(index=index, body=count_query, ignore_unavailable=True)
else:
    res = es_client.count(index=index, doc_type=doc_type, body=count_query, ignore_unavailable=True)

elastalert/tests/conftest.py

mock_es_client

add

self.is_atleasteight = mock.Mock(return_value=False)

mock_es_sixsix_client

add

self.is_atleasteight = mock.Mock(return_value=False)

elastalert/es_mappings/8/elastalert.json

add file

dateOptionalTime→date_optional_time

{
  "numeric_detection": true,
  "date_detection": false,
  "dynamic_templates": [
    {
      "strings_as_keyword": {
        "mapping": {
          "ignore_above": 1024,
          "type": "keyword"
        },
        "match_mapping_type": "string"
      }
    }
  ],
  "properties": {
    "rule_name": {
      "type": "keyword"
    },
    "@timestamp": {
      "type": "date",
      "format": "date_optional_time"
    },
    "alert_time": {
      "type": "date",
      "format": "date_optional_time"
    },
    "match_time": {
      "type": "date",
      "format": "date_optional_time"
    },
    "match_body": {
      "enabled": "false",
      "type": "object"
    },
    "aggregate_id": {
      "type": "keyword"
    }
  }
}

add Directory

elastalert/es_mappings/8

elastalert/es_mappings/8/elastalert_error.json ・・・add file

dateOptionalTime→date_optional_time

{
  "properties": {
    "data": {
      "type": "object",
      "enabled": "false"
    },
    "@timestamp": {
      "type": "date",
      "format": "date_optional_time"
    }
  }
}

elastalert/es_mappings/8/elastalert_status.json・・・add file

dateOptionalTime→date_optional_time

{
  "properties": {
    "rule_name": {
      "type": "keyword"
    },
    "@timestamp": {
      "type": "date",
      "format": "date_optional_time"
    }
  }
}

elastalert/es_mappings/8/past_elastalert.json・・・add file

dateOptionalTime→date_optional_time

{
  "properties": {
    "rule_name": {
      "type": "keyword"
    },
    "match_body": {
      "type": "object",
      "enabled": "false"
    },
    "@timestamp": {
      "type": "date",
      "format": "date_optional_time"
    },
    "aggregate_id": {
      "type": "keyword"
    }
  }
}

elastalert/es_mappings/8/silence.json・・・add file

dateOptionalTime→date_optional_time

{
  "properties": {
    "rule_name": {
      "type": "keyword"
    },
    "until": {
      "type": "date",
      "format": "date_optional_time"
    },
    "@timestamp": {
      "type": "date",
      "format": "date_optional_time"
    }
  }
}
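Since the five mapping files above differ from their predecessors only in the date format name, the rename could also be expressed as a small transformation. The following is a sketch, not part of the proposal: `to_snake_case_date_format` is a hypothetical helper that walks a mapping dictionary and rewrites the format value.

```python
import copy


def to_snake_case_date_format(mapping):
    """Return a copy of mapping with dateOptionalTime renamed.

    Hypothetical helper for illustration; the input is left untouched.
    """
    result = copy.deepcopy(mapping)

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == 'format' and value == 'dateOptionalTime':
                    node[key] = 'date_optional_time'
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(result)
    return result


if __name__ == "__main__":
    old = {'properties': {'until': {'type': 'date', 'format': 'dateOptionalTime'}}}
    print(to_snake_case_date_format(old))
```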

elastalert/create_index.py

before

es_index_mappings = read_es_index_mappings() if is_atleastsix(esversion) else read_es_index_mappings(5)

after

if is_atleasteight(esversion):
    es_index_mappings = read_es_index_mappings(8)
elif is_atleastsix(esversion):
    es_index_mappings = read_es_index_mappings()
else:
    es_index_mappings = read_es_index_mappings(5)

tests/create_index_test.py

add test_read_es_8_index_mappings

def test_read_es_8_index_mappings():
    mappings = elastalert.create_index.read_es_index_mappings(8)
    assert len(mappings) == len(es_mappings)
    print((json.dumps(mappings, indent=2)))

@nsano-rururu
Collaborator

If elasticsearch 8 is released and supported, the docker image will likely need to provide both pip install elasticsearch==7.0.0 for elasticsearch 7 and pip install elasticsearch==8.0.0 for elasticsearch 8.

@nsano-rururu
Collaborator

nsano-rururu commented May 15, 2021

This is the exact code block that needs to be changed:
loaders.py
Yelp/elastalert#2424 (comment)

        # Check that doc_type is provided if use_count/terms_query
        if rule.get('use_count_query') or rule.get('use_terms_query'):
            if 'doc_type' not in rule:
                raise EAException('doc_type must be specified.')

@ferozsalam ferozsalam added the enhancement New feature or request label Jun 9, 2021
@ferozsalam
Collaborator

I noticed the alpha ES 8 image is now available from the Elasticsearch Docker registry. I'll start working on ES 8 compatibility this weekend, assuming no one else is already looking at this.

@jertel
Owner Author

jertel commented Sep 3, 2021

Thanks @ferozsalam.

@nsano-rururu
Collaborator

@ferozsalam

Since elasticsearch-py 8.0.0, Elasticsearch's Python client, has not been released yet, would it be right to first make the code changes against elasticsearch-py 7.0.0 and check that everything works?
https://github.com/elastic/elasticsearch-py

@ferozsalam
Collaborator

That's a good point.

I think my plan will be to see if I can get things working with the ES8 alpha and elasticsearch-py 7.0.0 over the weekend. Elastic has already started work on elasticsearch-py 8.0.0, so if 7.0.0 doesn't work I might try using the latest version of the library direct from the repository.

If it all doesn't work and we need to wait for elasticsearch-py 8.0.0, I'll post here and pause for a while.

@nsano-rururu
Collaborator

nsano-rururu commented Sep 5, 2021

Since the latest elasticsearch-py currently contains an implementation that refuses to connect to Amazon Elasticsearch Service, opensearch-py needs to be supported at the same time as elasticsearch-py 8.
https://aws.amazon.com/jp/blogs/opensource/keeping-clients-of-opensearch-and-elasticsearch-compatible-with-open-source/
https://github.com/opensearch-project/opensearch-py

opensearch-py seems to be able to connect to both Elasticsearch and Amazon Elasticsearch Service, but the last client version before the connection restrictions were built in was elasticsearch-py 7.13.4. Please note that elastalert2 uses elasticsearch-py 7.0.0, so if you change all connections to opensearch-py, some rules will not work.
It is possible to detect Amazon Elasticsearch Service, but we also need to consider what to do when OpenSearch is used on its own.

https://opensearch.org/docs/clients/index/
OpenSearch client compatibility

Python Elasticsearch client 7.13.4

@nsano-rururu
Collaborator

@ferozsalam

The index kibana-int referenced in the code of elastalert/rule_from_kibana.py and elastalert/elastalert.py does not exist in Elasticsearch 7 (and possibly not in Elasticsearch 6 either), so there are some errors in the analysis I posted earlier.
#92 (comment)
Please refer to the discussion in jertel's comment.
#442 (reply in thread)

@nsano-rururu
Collaborator

@ferozsalam

Sharing some information about ES 8.

Deprecation warnings in 7.15.0 pre-releases
elastic/elasticsearch-py#1698

The body parameter for APIs are deprecated

I didn't expect the API body parameter to be deprecated. I think it has a big impact.

@ferozsalam ferozsalam self-assigned this Sep 26, 2021
@ferozsalam ferozsalam changed the title from "doc_type is obsolete in ES8" to "Support Elasticsearch 8" Sep 26, 2021
@ferozsalam
Collaborator

I've spent some time this weekend (much delayed!) getting a development environment set up, and have managed to get ElastAlert running and communicating with Elasticsearch 8.0.0-alpha2 using elasticsearch-py 7.0.0 with a single debug rule running. The good news is that it looks like ES 8-alpha2 still works (at least for ElastAlert) with elasticsearch-py 7.0.0.

There are several places where I need to make further tweaks to handle the removal of doc types, but I don't foresee that proving a major hurdle.

My goal for this week is to get the unit test suite working to see what else needs fixing/changing. I will probably start doing the development work on a separate branch.

Thanks very much for your code samples above @nsano-rururu - they saved me a lot of time.

A question - does anyone know why we have pinned the current elasticsearch-py version at 7.0.0? We're probably going to run into whatever issue is causing that again with the conversion to ES 8, so I wondered if it might be good to also fix that if possible.

@nsano-rururu
Collaborator

nsano-rururu commented Sep 27, 2021

Yelp/elastalert#2593 (comment)

Also note: I also added a pin for elasticsearch==7.0.0, because apparently 7.1.0 will NOT work with ES < 6.6 due to it not supported _source_include(s?). 7.0.0 does. Tests won't pass otherwise.

Fix issue caused by 7.x breaking change (_source_include/_source_exclude)
elastic/elasticsearch-py#1019

https://www.elastic.co/guide/en/elasticsearch/reference/7.0/breaking-changes-7.0.html#source-include-exclude-params-removed

Source filtering url parameters _source_include and _source_exclude have been removed
The deprecated in 6.x url parameters are now removed. Use _source_includes and _source_excludes instead.
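To make the pinning problem concrete: a client that always sends the plural parameter breaks on servers older than 6.6, while the singular form is gone in 7.0. A version-dependent choice could look like the sketch below. The function name `source_filter_param` is hypothetical, and the 6.6 cut-off is taken from the comment quoted above rather than verified independently.

```python
def source_filter_param(es_version):
    """Pick the source-filtering URL parameter name for a server version.

    Hypothetical helper; assumes a version string with at least
    major and minor parts, such as "6.5.4".
    """
    major, minor = (int(part) for part in es_version.split(".")[:2])
    if (major, minor) >= (6, 6):
        return "_source_includes"
    return "_source_include"


if __name__ == "__main__":
    for version in ("6.5.4", "6.6.0", "7.0.0"):
        print(version, source_filter_param(version))
```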

ES Version revert to 7.0.0
#90

@nsano-rururu
Collaborator

The advantage of staying pinned to elasticsearch-py 7.0.0 is that a single docker image can serve ES 6/7/8 and OpenSearch.
The disadvantage is that newer features cannot be used. For example, Elastic Cloud's Cloud ID cannot be supported, and later bug fixes and performance improvements are not available.

elasticsearch-py [7.x] » Release notes

https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/release-notes.html

@nsano-rururu
Collaborator

In addition, any query syntax that is only supported in versions released after elasticsearch-py 7.0.0 will not be usable. And since using elasticsearch-py 7.x with elasticsearch 8 is not an officially supported combination, the chances of something going wrong are not zero.

The officially announced pairings are as follows:
elasticsearch-py 5.x for elasticsearch 5.x
elasticsearch-py 6.x for elasticsearch 6.x
elasticsearch-py 7.x for elasticsearch 7.x
elasticsearch-py 8.x for elasticsearch 8.x

@nsano-rururu
Collaborator

Continuing to use older versions of middleware carries a non-zero risk of security issues. If you understand that and still choose to use the older version, I have nothing more to add.

@ferozsalam
Collaborator

Yes, I completely understand and agree that the aim should be to move over to the latest library version as quickly as possible.

However if migrating to ES 8 does not have a hard dependency on elasticsearch-py 8.0.0, then I think it might make more sense to work on the two tasks separately, especially if there are significant other changes required to support elasticsearch-py 8.0.0. @jertel do you have an opinion here?

Thanks for the explanation on the Docker image - it might be an idea to offer multiple Docker images so that we can move forward with elasticsearch-py version, perhaps based on different branches of the repo? Again something that I think @jertel would have to set up, so would be interested in knowing his thoughts.

@jertel
Owner Author

jertel commented Sep 27, 2021

Ideally we would continue with:

  • 1 branch
  • 1 docker image

My preference is to have a single branch to avoid maintaining multiple copies of the source code. I think having a temporary branch to work through the ES 8 compatibility is fine though, provided the development and testing doesn't take months to complete before we can merge it back to master.

Below is my understanding of the library compatibility matrix (If there are corrections please tell me so I can update this):

elasticsearch-py8

  • ES 8: Supported
  • ES 7: Unknown
  • ES 6 or lower: NOT supported
  • ES Cloud: Supported
  • OpenSearch: NOT supported

elasticsearch-py7 (specifically 7.0.0)

  • ES 8: Mostly Supported (unknown at this time what is missing)
  • ES 7: Supported
  • ES 6 or lower: Supported
  • ES Cloud: Supported (but without cloud_id param support)
  • OpenSearch: Mostly Supported (unknown at this time what is missing)

opensearch-py

  • ES 8: Mostly Supported (unknown at this time what is missing, and over time this will be less reliable)
  • ES 7: Supported (but unsure of whether future releases will remove support, similar to what was done for ES Cloud)
  • ES 6 or lower: NOT supported
  • ES Cloud: NOT supported
  • OpenSearch: Supported

Abstracting the calls to the Elastic API away from the general ElastAlert 2 source code and into a new search.py class would give us the ability to put all the logic in that new class for choosing whether to use the opensearch-py library or the elasticsearch-py library. This might be easier said than done, but it would help isolate all of this complexity into one place.

If @ferozsalam can prove that ES8 compatibility can be had with the removal of doc_type and without switching to the new library then let's proceed with getting ES8 support into master without changing the Python library and without breaking ES7 or OpenSearch compatibility.

Then, separately, we can discuss deprecating support for old versions of Elasticsearch, based on Elastic's End-of-Life (EOL) dates. This will allow us to begin upgrading the elasticsearch-py library to newer versions.
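The library-selection logic suggested above for a new search.py could start from something as small as the following sketch. Everything here is an assumption for illustration: the function name `choose_client_library` is hypothetical, and it relies on the server's root endpoint response carrying a version.distribution value of "opensearch" for OpenSearch clusters, which would not identify an older Amazon Elasticsearch Service domain.

```python
def choose_client_library(server_info):
    """Return the name of the client library to use for a cluster.

    Hypothetical helper: server_info is the parsed JSON of the cluster's
    root endpoint, obtained by whatever means the caller prefers.
    """
    version = server_info.get("version", {})
    # Assumption: OpenSearch reports its distribution in the version block.
    if version.get("distribution") == "opensearch":
        return "opensearch-py"
    return "elasticsearch-py"


if __name__ == "__main__":
    print(choose_client_library({"version": {"number": "7.17.0"}}))
    print(choose_client_library({"version": {"distribution": "opensearch", "number": "1.0.0"}}))
```

Keeping the decision in one function means the rest of the code would not need to know which library is in use.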

@nsano-rururu
Collaborator

Regarding Elastic Cloud: according to the following documentation, it seems that you need to add the cloud_id parameter when connecting.
https://www.elastic.co/guide/en/elasticsearch/client/python-api/master/connecting.html#auth-ec

from elasticsearch import Elasticsearch

es = Elasticsearch(
    cloud_id="cluster-1:dXMa5Fx..."
)

elasticsearch-py [7.x] » Release notes
https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/release-notes.html

7.0.2 (2019-05-29)
Add connection parameter for Elastic Cloud cloud_id.

@nsano-rururu
Collaborator

Regarding opensearch-py: it seems the project is trying to remove the code related to Elastic Cloud connections, namely cloud_id and api_key. There is a pull request for this. Note that api_key is supposed to be supported by elastalert2.
https://github.com/opensearch-project/opensearch-py

@jertel
Owner Author

jertel commented Sep 27, 2021

Thanks, I've updated the post above to mention that ES Cloud is "NOT supported" by opensearch-py.

@nsano-rururu
Collaborator

Since opensearch-py was forked from elasticsearch-py 7.13.4 as a new project, I think ES7 and ES8 will be as follows.

ES 7: Unknown → Supported
ES 8: Unknown → Mostly Supported (unknown at this time what is missing)

@jertel
Owner Author

jertel commented Sep 27, 2021

Ok, then it probably also means opensearch-py does not support ES6 or lower, based on #90. I'll update that now.

@sethmlarson

Hello folks 👋 Saw this thread and was wondering if there's anything I can do to help or clarify. One thing I saw that I wasn't sure of was:

elasticsearch-py7 (specifically 7.0.0)
* ES Cloud: NOT supported

Elastic Cloud is definitely supported by all 7.x versions of the client, the cloud_id is only a convenient way of specifying the Elasticsearch cluster you're connecting to. Specifying via a URL works just as well for Elastic Cloud.
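To illustrate why cloud_id is only a convenience, the sketch below decodes one into the hostname it stands for. It assumes the commonly described layout of a cluster name, a colon, and a base64 payload of the form domain$elasticsearch-id$kibana-id; the helper name `cloud_id_to_host` and the example values are hypothetical, and any port embedded in the domain part is not handled.

```python
import base64


def cloud_id_to_host(cloud_id):
    """Decode a Cloud ID into the Elasticsearch hostname it encodes.

    Hypothetical helper based on an assumed payload layout of
    "<domain>$<elasticsearch id>$<kibana id>".
    """
    _, _, encoded = cloud_id.partition(":")
    decoded = base64.b64decode(encoded).decode("utf-8")
    parent_domain, es_id = decoded.split("$")[:2]
    return f"{es_id}.{parent_domain}"


if __name__ == "__main__":
    # Made-up payload, built here so the example is self-contained.
    payload = base64.b64encode(b"example.invalid$abc123$def456").decode("ascii")
    print(cloud_id_to_host("cluster-1:" + payload))
```

A client pinned to a version without cloud_id support could be given the resulting host as an ordinary HTTPS URL instead.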

@jertel
Owner Author

jertel commented Sep 27, 2021

Thanks for chiming in @sethmlarson. I've updated the post above to reflect that clarification.

While you're here, could you comment on the following:

  1. Do you know if elasticsearch-py8 works with ES 7 clusters?
  2. Is there anything significant you can think of that would be a problem with using elasticsearch-py7 against an ES 8 cluster? I'm thinking primarily along the lines of search filters.

@sethmlarson

sethmlarson commented Sep 27, 2021

I've updated the post above to reflect that clarification.

Thank you!

1. Do you know if elasticsearch-py8 works with ES 7 clusters?

Our compatibility policy is forwards compatibility, so there's no guarantee that v8.0 clients will work with v7.x servers. However if you're not relying on removed features (mapping types, filters) then it'd maybe work? Wanted to highlight the difference between "supported" and "happens to work".

2. Is there anything significant you can think of that would be a problem with using elasticsearch-py7 against an ES 8 cluster? I'm thinking primarily along the lines of search filters.

Mapping types and anything removed in 8.0 are still removed even when using "compatibility mode". However, with client versions before 7.16 you will likely need to be more hands-on with the compatibility mode by setting HTTP headers yourself. In 7.16 I'm working towards getting the mode to be much easier to use.
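For reference, the compatibility mode is requested through a versioned media type on the Accept and Content-Type headers. The sketch below only builds those headers; the helper name `compatibility_headers` is hypothetical, and how the headers would be attached to requests in ElastAlert 2 is left open.

```python
def compatibility_headers(major_version=7):
    """Build HTTP headers asking the server to behave like major_version.

    Hypothetical helper; it returns a plain dict and sends nothing.
    """
    media_type = f"application/vnd.elasticsearch+json; compatible-with={major_version}"
    return {"Accept": media_type, "Content-Type": media_type}


if __name__ == "__main__":
    for name, value in compatibility_headers().items():
        print(f"{name}: {value}")
```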

@nsano-rururu
Collaborator

@jertel

One additional point: the HTTP header cannot be set without modifying the current elastalert2 program.
The following pull request for yelp/elastalert seems to be the corresponding code.

allow custom http_headers in config.yaml
Yelp/elastalert#2952

@ferozsalam
Collaborator

ferozsalam commented Sep 28, 2021

Thanks all for your feedback and suggestions here, there’s a lot to think about.

I think we have two options.

Option 1

We’re currently unable to upgrade our elasticsearch-py beyond 7.0.0 because we’re maintaining support for ES 6, which has been EOL for around a year.

If we were to formally drop support for ES 6 and below, we could then move to elasticsearch-py 7.15.0 (and eventually 7.16.0), which will give us some nice fixes while also making compatibility with ES 8 neater, judging by @sethmlarson's comment above.

With support for ES 8 done, we could then work on the changes necessary to support elasticsearch-py 8.0.0 without any (significant) time pressure.

Option 2

Otherwise we hardcode the compatibility header into our HTTP requests and continue with elasticsearch-py 7.0.0 until a point where we are happier to drop support for ES 6.


My preference is for Option 1, as:

  • ES 6 is EOL and has been for a while (over a year for most versions)
  • It would be an opportunity to tidy up the codebase slightly, removing old compatibility hacks.
  • We’re missing some bug fixes and nice features that could be handy (async requests!) for improvements in the future.
  • People on ES 6 can continue to use the older packages/image.
  • It's less work on our side and will probably be neater in the codebase.

What does everyone else think?

@nsano-rururu
Collaborator

I think option1 is fine.

Since elasticsearch-py 8.0.0 will end support for Python 3.5 and earlier, is it possible to check the Python version in setup.py? I think we should add such a check if it is easy to do.
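In setup.py itself, setuptools offers the python_requires keyword (for example ">=3.6") for this purpose. A runtime check is sketched below as a complement; the minimum of 3.6 is inferred from the statement above about Python 3.5, and the names `MINIMUM_PYTHON` and `python_is_supported` are hypothetical.

```python
import sys

# Assumed minimum, derived from "3.5 and earlier are no longer supported".
MINIMUM_PYTHON = (3, 6)


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True when the interpreter version meets the minimum.

    Hypothetical helper; version_info defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not python_is_supported():
        sys.exit("Python %d.%d or later is required." % MINIMUM_PYTHON)
    print("Python version is supported.")
```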

@jertel
Owner Author

jertel commented Sep 28, 2021

I agree with Option 1.

@nsano-rururu
Collaborator

@ferozsalam

Regarding the removal of the body parameter, it seems that it has been moved to elasticsearch-py 9.0.0. elasticsearch-py 8.0.0 seems to support both the old and the new style.
elastic/elasticsearch-py#1698 (comment)

@ferozsalam
Collaborator

Looks like Elasticsearch 8.0 is already out, and somewhat predictably, I haven't found the time to enable support! 😄

I'll take a look later this weekend, although with the compatibility mode I suspect the changes will be minimal.

@LaZyDK
Contributor

LaZyDK commented Feb 14, 2022

While trying to update our Elastic Cloud 7.17.0 to 8.0 the cluster upgrade failed.
Elastic support gave me this message:

..there are mappings defined on the indices it creates that are not compatible with 8.0.  
The date fields are being mapped as "dateOptionalTime", but these now need to be defined as "date_optional_time" instead.

To fix this, you would need to do the following:
1. Reindex each elastalert index into a temporary index:  https://www.elastic.co/guide/en/elasticsearch/reference/7.17/docs-reindex.html
2. Delete the original elastalert indices
3. Re-create the elastalert indices, defining the date mappings as "date_optional_time"
4. Reindex the data from your temp indices back into the fixed elastalert indices
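The four steps from Elastic support can be written down as an ordered list of REST requests. The sketch below only builds that plan and sends nothing; the function name `reindex_plan`, the "_tmp" suffix, and the example index name are hypothetical.

```python
def reindex_plan(index, new_mappings, temp_suffix="_tmp"):
    """Return (method, path, body) tuples for the four migration steps.

    Hypothetical helper for illustration; new_mappings is the corrected
    mapping that uses date_optional_time.
    """
    temp = index + temp_suffix
    return [
        # 1. Copy the data into a temporary index.
        ("POST", "/_reindex", {"source": {"index": index}, "dest": {"index": temp}}),
        # 2. Delete the original index.
        ("DELETE", "/" + index, None),
        # 3. Re-create it with the fixed date mappings.
        ("PUT", "/" + index, {"mappings": new_mappings}),
        # 4. Copy the data back from the temporary index.
        ("POST", "/_reindex", {"source": {"index": temp}, "dest": {"index": index}}),
    ]


if __name__ == "__main__":
    mappings = {"properties": {"@timestamp": {"type": "date", "format": "date_optional_time"}}}
    for step in reindex_plan("elastalert_status", mappings):
        print(step)
```

The temporary index would still need to be removed once the data has been verified.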

@nsano-rururu
Collaborator

@LaZyDK

That is the expected result, because elastalert2 does not support elasticsearch 8 yet.

@nsano-rururu
Collaborator

nsano-rururu commented Feb 20, 2022

@ferozsalam

Does Elasticsearch 8 support here mean implementing and verifying the following?

Create Index

elastalert-test-rule

Will it be supported to not specify doc_type on elasticsearch 8?

Rule Type

  • Frequency, Flatline

  Skip the doc_type check for use_count_query and use_terms_query on Elasticsearch 8.
  Update the documentation.

  • Metric Aggregation, Percentage Match

  Verify operation when doc_type is not specified for Metric Aggregation and Percentage Match on Elasticsearch 8.
  Update the documentation.

Loading Filters Directly From Kibana 3

This feature currently does not work properly, so it will not be covered.

others

  • doc_type

elastalert.py (there may be other modifications needed in elastalert.py)

get_hits

The following apparently needs to be prevented from running on ES 8:

        # Record doc_type for use in get_top_counts
        if 'doc_type' not in rule and len(hits):
            rule['doc_type'] = hits[0]['_type']

@nsano-rururu
Collaborator

@ferozsalam

Please let us know if you need anything investigated.

@ferozsalam
Collaborator

While trying to update our Elastic Cloud 7.17.0 to 8.0 the cluster upgrade failed. Elastic support gave me this message:

..there are mappings defined on the indices it creates that are not compatible with 8.0.  
The date fields are being mapped as "dateOptionalTime", but these now need to be defined as "date_optional_time" instead.

To fix this, you would need to do the following:
1. Reindex each elastalert index into a temporary index:  https://www.elastic.co/guide/en/elasticsearch/reference/7.17/docs-reindex.html
2. Delete the original elastalert indices
3. Re-create the elastalert indices, defining the date mappings as "date_optional_time"
4. Reindex the data from your temp indices back into the fixed elastalert indices

Referring to this issue, is there any (relatively simple) automated workaround to this? Or should we be creating an instruction page for people looking to migrate from ES 7 -> ES 8 with instructions on the manual steps they need to take?

@nsano-rururu
Collaborator

Referring to this issue, is there any (relatively simple) automated workaround to this? Or should we be creating an instruction page for people looking to migrate from ES 7 -> ES 8 with instructions on the manual steps they need to take?

Do you mean to add to the FAQ how to check the index of elastalert, how to delete it, and have it run createindex again?

@ferozsalam
Collaborator

Do you mean to add to the FAQ how to check the index of elastalert, how to delete it, and have it run createindex again?

Yes, exactly. I am not sure if the process is easily automatable, but given that people only have to do it once per cluster, perhaps manual instructions will be enough?

@nsano-rururu
Collaborator

Yes, exactly. I am not sure if the process is easily automatable, but given that people only have to do it once per cluster, perhaps manual instructions will be enough?

I agree to add it to the FAQ.

@jertel
Owner Author

jertel commented Feb 21, 2022

It would make life easier for users if it was automated. If it's automated then the existing indices should be renamed with a .old suffix, instead of deleting them outright. If it is automated then the documentation would need to explain that the ES_USER must have permissions to delete elastalert-* indices, and the code would need to be able to gracefully fail in that scenario where it doesn't have access. Eg., provide a helpful message explaining why it cannot auto upgrade, and refer them to the manual upgrade steps.

Either way, the manual upgrade steps should be documented. Perhaps something like the following:

To upgrade an existing ElastAlert 2 installation to Elasticsearch 8 the following manual steps are required:

  1. Shut down ElastAlert 2.
  2. Delete or rename the old elastalert* indices. See the Elasticsearch documentation for instructions on how to delete them via the API.
  3. If NOT running ElastAlert 2 via Docker or Kubernetes, run elastalert-create-index to create the new indices. This is not needed when running via a container, since the container always attempts to create the indices at startup if they do not yet exist.
  4. Restart ElastAlert 2.
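The automated path with a graceful fallback could look roughly like the following. This is a sketch of the idea only, not ElastAlert 2 code: `IndexPermissionError`, `try_auto_upgrade`, and the two callables are hypothetical names. Elasticsearch has no direct rename API, so the `archive_index` callable is assumed to reindex or clone the index to the `.old` name and then delete the original.

```python
class IndexPermissionError(Exception):
    """Hypothetical error the caller's client raises on a 403 response."""


MANUAL_STEPS_HINT = (
    "Cannot auto-upgrade the elastalert indices: the configured user lacks "
    "permission to modify them. Please follow the manual upgrade steps."
)


def try_auto_upgrade(indices, archive_index, create_indices):
    """Archive each old index under a .old suffix, then create fresh ones.

    archive_index(name, new_name) and create_indices() are supplied by the
    caller (assumptions made for this sketch).
    Returns an (ok, message) tuple instead of raising on permission errors.
    """
    try:
        for name in indices:
            archive_index(name, name + ".old")
        create_indices()
    except IndexPermissionError:
        # Fail gracefully and point the user at the documented manual path.
        return False, MANUAL_STEPS_HINT
    return True, "Upgraded %d indices" % len(indices)
```

Note that this sketch does not roll back indices that were archived before a failure, so a real implementation would need to decide how to handle a partial upgrade.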

@konstantin-921

konstantin-921 commented Feb 28, 2022

It would make life easier for users if this were automated. If it is automated, the existing indices should be renamed with a .old suffix instead of being deleted outright. The documentation would also need to explain that the ES_USER must have permission to delete elastalert-* indices, and the code would need to fail gracefully when it lacks that access, e.g., by providing a helpful message that explains why it cannot auto-upgrade and refers users to the manual upgrade steps.

Either way, the manual upgrade steps should be documented. Perhaps something like the following:

To upgrade an existing ElastAlert 2 installation to Elasticsearch 8 the following manual steps are required:

  1. Shut down ElastAlert 2.
  2. Delete or rename the old elastalert* indices. See the Elasticsearch documentation for instructions on how to delete them via the API.
  3. If NOT running ElastAlert 2 via Docker or Kubernetes, run elastalert-create-index to create the new indices. This is not needed when running via a container, since the container always attempts to create the indices at startup if they do not yet exist.
  4. Restart ElastAlert 2.

Hi
I recently upgraded from Elasticsearch 7.17.0 to 8.0.0. I am using the ElastAlert 2 Helm chart (v2.3.0). After the upgrade, ElastAlert was broken. I went through these steps - https://elastalert2.readthedocs.io/en/latest/recipes/faq.html?highlight=elasticsearch%208#does-elastalert-2-support-elasticsearch-8, but I keep getting the following errors:

Reading Elastic 6 index mappings:
--
Mon, Feb 28 2022 5:56:11 pm | Reading index mapping 'es_mappings/6/silence.json'
Mon, Feb 28 2022 5:56:11 pm | Reading index mapping 'es_mappings/6/elastalert_status.json'
Mon, Feb 28 2022 5:56:11 pm | Reading index mapping 'es_mappings/6/elastalert.json'
Mon, Feb 28 2022 5:56:11 pm | Reading index mapping 'es_mappings/6/past_elastalert.json'
Mon, Feb 28 2022 5:56:11 pm | Reading index mapping 'es_mappings/6/elastalert_error.json'
Mon, Feb 28 2022 5:56:11 pm | Index elastalert already exists. Skipping index creation.
Mon, Feb 28 2022 5:56:14 pm | WARNING:elasticsearch:GET https://elk1-fl.domain.com:9200/elastalert_status/_search?_source_includes=endtime%2Crule_name&size=1 [status:400 request:0.006s]
Mon, Feb 28 2022 5:56:14 pm | ERROR:elastalert:Error querying for last run: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [@timestamp] in order to sort on')
Mon, Feb 28 2022 5:57:13 pm | WARNING:elasticsearch:GET https://elk1-fl.domain.com:9200/elastalert/_search?size=1000 [status:400 request:0.009s]
Mon, Feb 28 2022 5:57:13 pm | ERROR:elastalert:Error finding recent pending alerts: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on') {'query': {'bool': {'must': {'query_string': {'query': '!_exists_:aggregate_id AND alert_sent:false'}}, 'filter': {'range': {'alert_time': {'from': '2022-02-26T14:57:13.392783Z', 'to': '2022-02-28T14:57:13.392873Z'}}}}}, 'sort': {'alert_time': {'order': 'asc'}}}
Mon, Feb 28 2022 5:57:13 pm | Traceback (most recent call last):
Mon, Feb 28 2022 5:57:13 pm | File "/usr/local/lib/python3.10/site-packages/elastalert/elastalert.py", line 1740, in find_recent_pending_alerts
Mon, Feb 28 2022 5:57:13 pm | res = self.writeback_es.search(index=self.writeback_index, body=query, size=1000)
Mon, Feb 28 2022 5:57:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
Mon, Feb 28 2022 5:57:13 pm | return func(*args, params=params, **kwargs)
Mon, Feb 28 2022 5:57:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/__init__.py", line 810, in search
Mon, Feb 28 2022 5:57:13 pm | return self.transport.perform_request(
Mon, Feb 28 2022 5:57:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/transport.py", line 318, in perform_request
Mon, Feb 28 2022 5:57:13 pm | status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
Mon, Feb 28 2022 5:57:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/http_requests.py", line 91, in perform_request
Mon, Feb 28 2022 5:57:13 pm | self._raise_error(response.status_code, raw_data)
Mon, Feb 28 2022 5:57:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/base.py", line 131, in _raise_error
Mon, Feb 28 2022 5:57:13 pm | raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
Mon, Feb 28 2022 5:57:13 pm | elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on')
Mon, Feb 28 2022 5:58:13 pm | WARNING:elasticsearch:GET https://elk1-fl.domain.com:9200/elastalert/_search?size=1000 [status:400 request:0.008s]
Mon, Feb 28 2022 5:58:13 pm | ERROR:elastalert:Error finding recent pending alerts: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on') {'query': {'bool': {'must': {'query_string': {'query': '!_exists_:aggregate_id AND alert_sent:false'}}, 'filter': {'range': {'alert_time': {'from': '2022-02-26T14:58:13.392331Z', 'to': '2022-02-28T14:58:13.392427Z'}}}}}, 'sort': {'alert_time': {'order': 'asc'}}}
Mon, Feb 28 2022 5:58:13 pm | Traceback (most recent call last):
Mon, Feb 28 2022 5:58:13 pm | File "/usr/local/lib/python3.10/site-packages/elastalert/elastalert.py", line 1740, in find_recent_pending_alerts
Mon, Feb 28 2022 5:58:13 pm | res = self.writeback_es.search(index=self.writeback_index, body=query, size=1000)
Mon, Feb 28 2022 5:58:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
Mon, Feb 28 2022 5:58:13 pm | return func(*args, params=params, **kwargs)
Mon, Feb 28 2022 5:58:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/__init__.py", line 810, in search
Mon, Feb 28 2022 5:58:13 pm | return self.transport.perform_request(
Mon, Feb 28 2022 5:58:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/transport.py", line 318, in perform_request
Mon, Feb 28 2022 5:58:13 pm | status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
Mon, Feb 28 2022 5:58:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/http_requests.py", line 91, in perform_request
Mon, Feb 28 2022 5:58:13 pm | self._raise_error(response.status_code, raw_data)
Mon, Feb 28 2022 5:58:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/base.py", line 131, in _raise_error
Mon, Feb 28 2022 5:58:13 pm | raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
Mon, Feb 28 2022 5:58:13 pm | elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on')
Mon, Feb 28 2022 5:59:13 pm | WARNING:elasticsearch:GET https://elk1-fl.domain.com:9200/elastalert/_search?size=1000 [status:400 request:0.008s]
Mon, Feb 28 2022 5:59:13 pm | ERROR:elastalert:Error finding recent pending alerts: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on') {'query': {'bool': {'must': {'query_string': {'query': '!_exists_:aggregate_id AND alert_sent:false'}}, 'filter': {'range': {'alert_time': {'from': '2022-02-26T14:59:13.391917Z', 'to': '2022-02-28T14:59:13.391993Z'}}}}}, 'sort': {'alert_time': {'order': 'asc'}}}
Mon, Feb 28 2022 5:59:13 pm | Traceback (most recent call last):
Mon, Feb 28 2022 5:59:13 pm | File "/usr/local/lib/python3.10/site-packages/elastalert/elastalert.py", line 1740, in find_recent_pending_alerts
Mon, Feb 28 2022 5:59:13 pm | res = self.writeback_es.search(index=self.writeback_index, body=query, size=1000)
Mon, Feb 28 2022 5:59:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
Mon, Feb 28 2022 5:59:13 pm | return func(*args, params=params, **kwargs)
Mon, Feb 28 2022 5:59:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/__init__.py", line 810, in search
Mon, Feb 28 2022 5:59:13 pm | return self.transport.perform_request(
Mon, Feb 28 2022 5:59:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/transport.py", line 318, in perform_request
Mon, Feb 28 2022 5:59:13 pm | status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
Mon, Feb 28 2022 5:59:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/http_requests.py", line 91, in perform_request
Mon, Feb 28 2022 5:59:13 pm | self._raise_error(response.status_code, raw_data)
Mon, Feb 28 2022 5:59:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/base.py", line 131, in _raise_error
Mon, Feb 28 2022 5:59:13 pm | raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
Mon, Feb 28 2022 5:59:13 pm | elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on')
Mon, Feb 28 2022 6:00:13 pm | WARNING:elasticsearch:GET https://elk1-fl.domain.com:9200/elastalert/_search?size=1000 [status:400 request:0.009s]
Mon, Feb 28 2022 6:00:13 pm | ERROR:elastalert:Error finding recent pending alerts: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on') {'query': {'bool': {'must': {'query_string': {'query': '!_exists_:aggregate_id AND alert_sent:false'}}, 'filter': {'range': {'alert_time': {'from': '2022-02-26T15:00:13.392418Z', 'to': '2022-02-28T15:00:13.392516Z'}}}}}, 'sort': {'alert_time': {'order': 'asc'}}}
Mon, Feb 28 2022 6:00:13 pm | Traceback (most recent call last):
Mon, Feb 28 2022 6:00:13 pm | File "/usr/local/lib/python3.10/site-packages/elastalert/elastalert.py", line 1740, in find_recent_pending_alerts
Mon, Feb 28 2022 6:00:13 pm | res = self.writeback_es.search(index=self.writeback_index, body=query, size=1000)
Mon, Feb 28 2022 6:00:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
Mon, Feb 28 2022 6:00:13 pm | return func(*args, params=params, **kwargs)
Mon, Feb 28 2022 6:00:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/__init__.py", line 810, in search
Mon, Feb 28 2022 6:00:13 pm | return self.transport.perform_request(
Mon, Feb 28 2022 6:00:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/transport.py", line 318, in perform_request
Mon, Feb 28 2022 6:00:13 pm | status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
Mon, Feb 28 2022 6:00:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/http_requests.py", line 91, in perform_request
Mon, Feb 28 2022 6:00:13 pm | self._raise_error(response.status_code, raw_data)
Mon, Feb 28 2022 6:00:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/base.py", line 131, in _raise_error
Mon, Feb 28 2022 6:00:13 pm | raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
Mon, Feb 28 2022 6:00:13 pm | elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on')
Mon, Feb 28 2022 6:01:13 pm | WARNING:elasticsearch:GET https://elk1-fl.domain.com:9200/elastalert/_search?size=1000 [status:400 request:0.010s]
Mon, Feb 28 2022 6:01:13 pm | ERROR:elastalert:Error finding recent pending alerts: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on') {'query': {'bool': {'must': {'query_string': {'query': '!_exists_:aggregate_id AND alert_sent:false'}}, 'filter': {'range': {'alert_time': {'from': '2022-02-26T15:01:13.392276Z', 'to': '2022-02-28T15:01:13.392346Z'}}}}}, 'sort': {'alert_time': {'order': 'asc'}}}
Mon, Feb 28 2022 6:01:13 pm | Traceback (most recent call last):
Mon, Feb 28 2022 6:01:13 pm | File "/usr/local/lib/python3.10/site-packages/elastalert/elastalert.py", line 1740, in find_recent_pending_alerts
Mon, Feb 28 2022 6:01:13 pm | res = self.writeback_es.search(index=self.writeback_index, body=query, size=1000)
Mon, Feb 28 2022 6:01:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
Mon, Feb 28 2022 6:01:13 pm | return func(*args, params=params, **kwargs)
Mon, Feb 28 2022 6:01:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/__init__.py", line 810, in search
Mon, Feb 28 2022 6:01:13 pm | return self.transport.perform_request(
Mon, Feb 28 2022 6:01:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/transport.py", line 318, in perform_request
Mon, Feb 28 2022 6:01:13 pm | status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
Mon, Feb 28 2022 6:01:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/http_requests.py", line 91, in perform_request
Mon, Feb 28 2022 6:01:13 pm | self._raise_error(response.status_code, raw_data)
Mon, Feb 28 2022 6:01:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/base.py", line 131, in _raise_error
Mon, Feb 28 2022 6:01:13 pm | raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
Mon, Feb 28 2022 6:01:13 pm | elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on')
Mon, Feb 28 2022 6:02:13 pm | WARNING:elasticsearch:GET https://elk1-fl.domain.com:9200/elastalert/_search?size=1000 [status:400 request:0.008s]
Mon, Feb 28 2022 6:02:13 pm | ERROR:elastalert:Error finding recent pending alerts: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on') {'query': {'bool': {'must': {'query_string': {'query': '!_exists_:aggregate_id AND alert_sent:false'}}, 'filter': {'range': {'alert_time': {'from': '2022-02-26T15:02:13.392588Z', 'to': '2022-02-28T15:02:13.392690Z'}}}}}, 'sort': {'alert_time': {'order': 'asc'}}}
Mon, Feb 28 2022 6:02:13 pm | Traceback (most recent call last):
Mon, Feb 28 2022 6:02:13 pm | File "/usr/local/lib/python3.10/site-packages/elastalert/elastalert.py", line 1740, in find_recent_pending_alerts
Mon, Feb 28 2022 6:02:13 pm | res = self.writeback_es.search(index=self.writeback_index, body=query, size=1000)
Mon, Feb 28 2022 6:02:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/utils.py", line 84, in _wrapped
Mon, Feb 28 2022 6:02:13 pm | return func(*args, params=params, **kwargs)
Mon, Feb 28 2022 6:02:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/client/__init__.py", line 810, in search
Mon, Feb 28 2022 6:02:13 pm | return self.transport.perform_request(
Mon, Feb 28 2022 6:02:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/transport.py", line 318, in perform_request
Mon, Feb 28 2022 6:02:13 pm | status, headers_response, data = connection.perform_request(method, url, params, body, headers=headers, ignore=ignore, timeout=timeout)
Mon, Feb 28 2022 6:02:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/http_requests.py", line 91, in perform_request
Mon, Feb 28 2022 6:02:13 pm | self._raise_error(response.status_code, raw_data)
Mon, Feb 28 2022 6:02:13 pm | File "/usr/local/lib/python3.10/site-packages/elasticsearch/connection/base.py", line 131, in _raise_error
Mon, Feb 28 2022 6:02:13 pm | raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
Mon, Feb 28 2022 6:02:13 pm | elasticsearch.exceptions.RequestError: RequestError(400, 'search_phase_execution_exception', 'No mapping found for [alert_time] in order to sort on')

Any help?

@ferozsalam
Collaborator

Hey @konstantin-921, I suspect part of the reason is that a release with alpha ES 8 support hasn't been cut yet. If you're using 2.3.0, you won't have the latest changes.

Please note that ElastAlert isn't guaranteed to work with ES 8, even with the latest (unreleased) changes. This is still a work in progress.
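For anyone hitting the same "No mapping found for [...] in order to sort on" errors, one way to confirm that the index exists without the expected field mappings is to inspect the response of `GET /<index>/_mapping`. The helper below is a hypothetical sketch (the name `unmapped_sort_fields` is not part of ElastAlert 2) and assumes the typeless mapping response shape used by Elasticsearch 7 and later.

```python
def unmapped_sort_fields(mapping_response, index, fields):
    """Return the subset of `fields` with no mapping in `index`.

    mapping_response is the parsed JSON body of GET /<index>/_mapping,
    assumed to look like {index: {"mappings": {"properties": {...}}}}.
    """
    properties = (
        mapping_response.get(index, {}).get("mappings", {}).get("properties", {})
    )
    # Only top-level fields are checked; nested fields are out of scope here.
    return [field for field in fields if field not in properties]
```

If this returns `alert_time` or `@timestamp` for the writeback indices, the indices were most likely created without the ElastAlert mappings and need to be re-created.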

@konstantin-921

Thank you for your response, @ferozsalam. I will wait for the new releases then.

@nsano-rururu
Collaborator

In Elasticsearch 8, _type should no longer be present in search hits, so a KeyError is possible in the following places.

elastalert/test_rule.py

        doc_type = res['hits']['hits'][0]['_type']

elastalert/elastalert.py

        if 'doc_type' not in rule and len(hits):
            rule['doc_type'] = hits[0]['_type']
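One possible way to guard the lookup is to read `_type` with `dict.get`, so a hit without that key is simply skipped. This is a sketch of the idea and not necessarily the fix that was merged; the function name `infer_doc_type` is hypothetical.

```python
def infer_doc_type(rule, hits):
    """Copy '_type' from the first hit into rule['doc_type'] when available."""
    if 'doc_type' not in rule and len(hits):
        # dict.get returns None instead of raising KeyError when '_type' is absent.
        doc_type = hits[0].get('_type')
        if doc_type is not None:
            rule['doc_type'] = doc_type
    return rule
```

With hits that still carry `_type` the behaviour matches the original snippet, while hits without it leave the rule unchanged.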
