Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support restricted index patterns in Elasticsearch log handler #23888

Merged
merged 3 commits into from Dec 8, 2022

Conversation

kouk
Copy link
Contributor

@kouk kouk commented May 24, 2022

Sometimes Airflow doesn't have the ability to search across all indices
in an Elasticsearch server. This might be due to security settings in
the server. In these cases fetching the remote logs fails. To fix this
we create a index_patterns configuration setting that can be set to a
more restrictive pattern.

closes: #16828

@boring-cyborg
Copy link

boring-cyborg bot commented May 24, 2022

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@kouk
Copy link
Contributor Author

kouk commented May 24, 2022

@jedcunningham hi! I am have run airflow with elasticsearch logging and it is solving the issue for us. We configured a role with restricted access to a specific index pattern, and where before Airflow could not read the logs, with the patch it could. Appreciate it if you could take a look.

@kouk kouk force-pushed the support-es-index-patterns branch from 8d648fd to 44caca6 Compare May 29, 2022 11:25
@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label May 30, 2022
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@kouk kouk force-pushed the support-es-index-patterns branch from 8aef878 to 3d42e66 Compare May 30, 2022 08:22
@eladkal eladkal added this to the Airflow 2.3.3 milestone May 30, 2022
Copy link
Member

@jedcunningham jedcunningham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs some changes to support backwards compatibility. We can't set index_patterns in airflow_local_settings, because we might have an older provider. We also should try and get the config value when on older core versions. Look at how es_kwargs is handled - that is a good example.

Otherwise, this looks good.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Aug 15, 2022
@eladkal
Copy link
Contributor

eladkal commented Aug 15, 2022

@kouk are you still working on this PR?

@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Aug 16, 2022
@kouk
Copy link
Contributor Author

kouk commented Aug 16, 2022 via email

@kouk kouk force-pushed the support-es-index-patterns branch 2 times, most recently from 91a33bd to 4eb1ce2 Compare August 22, 2022 20:19
@potiuk
Copy link
Member

potiuk commented Sep 5, 2022

@jedcunningham - I think that one waits for you :)

@ashb ashb modified the milestones: Airflow 2.4.0, Airflow 2.4.1 Sep 8, 2022
@potiuk
Copy link
Member

potiuk commented Sep 19, 2022

You will need to rebase @kouk to account for recent changes in main.

@potiuk
Copy link
Member

potiuk commented Sep 20, 2022

@jedcunningham @potiuk I bumped the version to 2.4.1. Also I pushed acommit that updates the tests to emulate the case where we have an older version of airflow core, where no index pattern is passed by default to the constructor. Let me know if you prefer the previous version.

I think it will have to be 2.5.0 - > we are following semver and we only add bugfixes to "patchlevel" versions. This one will have to wait for 2.5 to be released.

@jedcunningham
Copy link
Member

Marking this for 2.5.0, but after the bulk of these changes land in a provider release, it'll work if you set the config.

@kouk kouk force-pushed the support-es-index-patterns branch from 1c832ed to eedfa36 Compare October 6, 2022 02:58
@kouk
Copy link
Contributor Author

kouk commented Oct 6, 2022

@jedcunningham thanks for pointing that out. I wasn't familiar enough with the config code to realize that it's ConfigParser under the hood. I've updated both the version and the config handling code like you suggested.

@kouk
Copy link
Contributor Author

kouk commented Oct 22, 2022

@eladkal @jedcunningham anything else we need here?

@kouk kouk force-pushed the support-es-index-patterns branch 2 times, most recently from 011d935 to 310ca97 Compare November 9, 2022 17:23
@kouk
Copy link
Contributor Author

kouk commented Nov 9, 2022

@jedcunningham appreciate if you could check if anything else is needed here.

Sometimes Airflow doesn't have the ability to search across all indices
in an Elasticsearch server. This might be due to security settings in
the server. In these cases fetching the remote logs fails. To fix this
we create a index_patterns configuration setting that can be set to a
more restrictive pattern.
@potiuk
Copy link
Member

potiuk commented Dec 3, 2022

Rebased to be sure. @jedcunningham - want to still take a look?

airflow/config_templates/config.yml Outdated Show resolved Hide resolved
@jedcunningham jedcunningham merged commit 99bbcd3 into apache:main Dec 8, 2022
1 check passed
@boring-cyborg
Copy link

boring-cyborg bot commented Dec 8, 2022

Awesome work, congrats on your first merged pull request!

@jedcunningham
Copy link
Member

Thanks @kouk! Congrats on your first commit 🎉. (I apologize for the delay)

As I mentioned earlier, since almost all of the changes are in the provider, this feature will be available once it's released in the provider (even before 2.6 with the config option is released).

@kouk
Copy link
Contributor Author

kouk commented Dec 8, 2022

thank you @jedcunningham and everyone else.

@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:logging area:providers changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) full tests needed We need to run full set of tests for this PR to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cannot Set Index Pattern on Elasticsearch as a Log Handler
7 participants