Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend ProxyHandler to support CIDR ranges for no_proxy #83085

Open
wants to merge 7 commits into
base: devel
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 2 additions & 0 deletions changelogs/fragments/urls-cidr-no-proxy.yml
@@ -0,0 +1,2 @@
minor_changes:
- urls.py - Support CIDR ranges for no_proxy
52 changes: 50 additions & 2 deletions lib/ansible/module_utils/urls.py
Expand Up @@ -37,6 +37,7 @@
import email.policy
import email.utils
import http.client
import ipaddress
import mimetypes
import netrc
import os
Expand Down Expand Up @@ -67,6 +68,7 @@
from ansible.module_utils.basic import missing_required_lib
from ansible.module_utils.common.collections import Mapping, is_sequence
from ansible.module_utils.common.text.converters import to_bytes, to_native, to_text
from ansible.module_utils.common import warnings

try:
import ssl
Expand Down Expand Up @@ -300,6 +302,51 @@ def http_open(self, req):
return self.do_open(UnixHTTPConnection(self._unix_socket), req)


class ProxyHandler(urllib.request.ProxyHandler):
_SPLITPORT_RE = re.compile('(.*):([0-9]*)', re.DOTALL)
sivel marked this conversation as resolved.
Show resolved Hide resolved

@classmethod
def _splitport(cls, host):
# Copied from cpython urllib.parse
match = cls._SPLITPORT_RE.fullmatch(host)
if match:
host, port = match.groups()
if port:
return host, port
return host, None
sivel marked this conversation as resolved.
Show resolved Hide resolved

@staticmethod
def _matches(host, port, bypass_network, bypass_port, scheme):
if not port:
port = '443' if scheme == 'https' else '80'
if bypass_port and port != bypass_port:
return False
return host in bypass_network

def proxy_open(self, req, *args):
Copy link
Member Author

@sivel sivel Apr 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The biggest concern I have about this implementation, is in overriding this method. proxy_open is not a documented method at https://docs.python.org/3/library/urllib.request.html

At closest it the <protocol>_open methods are described as part of https://docs.python.org/3/library/urllib.request.html#protocol-open but proxy isn't really a protocol like http, https, file, etc are.

And further more, it's highly dependent on urllib.request.ProxyHandler.__init__: https://github.com/python/cpython/blob/8f25cc992021d6ffc62bb110545b97a92f7cb295/Lib/urllib/request.py#L770-L774

However, the use of proxy_open has been around, unchanged for 25+ years, so it seems relatively safe and stable: python/cpython@6d7e47b

I guess at a minimum, this is why we have tests, which this PR includes.

"""Implements proxy bypassing for cidr ranges"""
hostonly, port = self._splitport(req.host)
try:
req_ip = ipaddress.ip_address(hostonly)
except ValueError:
return super().proxy_open(req, *args)

no_proxy = self.proxies.get('no') or ''
for bypass in map(str.strip, no_proxy.split(',')):
if '/' not in bypass:
continue
bypass_host, bypass_port = self._splitport(bypass)
try:
bypass_network = ipaddress.ip_network(bypass_host)
except ValueError as e:
warnings.warn(f'no_proxy entry appears to be a CIDR range, but could not be parsed: {e}')
sivel marked this conversation as resolved.
Show resolved Hide resolved
continue
if self._matches(req_ip, port, bypass_network, bypass_port, req.type):
return None

return super().proxy_open(req, *args)


class ParseResultDottedDict(dict):
'''
A dict that acts similarly to the ParseResult named tuple from urllib
Expand Down Expand Up @@ -844,9 +891,10 @@ def open(self, method, url, data=None, headers=None, use_proxy=None,
headers.update(auth_headers)
handlers.extend(auth_handlers)

proxies = None
if not use_proxy:
proxyhandler = urllib.request.ProxyHandler({})
handlers.append(proxyhandler)
proxies = {}
handlers.append(ProxyHandler(proxies))

if not context:
context = make_context(
Expand Down
40 changes: 40 additions & 0 deletions test/integration/targets/uri/tasks/install-proxy-and-test.yml
@@ -0,0 +1,40 @@
# tinyproxy not available in RHEL repos, and no easy install method for us on macOS
sivel marked this conversation as resolved.
Show resolved Hide resolved
- when: ansible_facts.distribution not in ['MacOSX', 'RedHat']
vars:
packages:
- tinyproxy
block:
- name: install packages for proxy testing
package:
name: '{{ item }}'
state: present
register: proxy_install_package
when: ansible_facts.distribution not in ['Alpine']
loop: '{{ packages }}'

- name: install packages for proxy testing for alpine
command: apk add --virtual .{{ item }} {{ item }}
register: proxy_install_alpine
when: ansible_facts.distribution == 'Alpine'
loop: '{{ packages }}'

- include_tasks: proxy.yml
always:
- name: uninstall packages for proxy testing
package:
name: '{{ item }}'
state: absent
when: proxy_install_package.results[idx]|default({}) is changed
loop: '{{ packages }}'
loop_control:
index_var: idx

- name: uninstall packages for proxy testing for alpine
command: apk del .{{ item }}
when:
- proxy_install_alpine|default({}) is changed
- >-
'Installing ' ~ item in proxy_install_alpine.results[idx].stdout
loop: '{{ packages }}'
loop_control:
index_var: idx
3 changes: 3 additions & 0 deletions test/integration/targets/uri/tasks/main.yml
Expand Up @@ -727,6 +727,9 @@
- name: Test unix socket
import_tasks: install-socat-and-test-unix-socket.yml

- name: Test proxy
import_tasks: install-proxy-and-test.yml

- name: ensure skip action
uri:
url: http://example.com
Expand Down
159 changes: 159 additions & 0 deletions test/integration/targets/uri/tasks/proxy.yml
@@ -0,0 +1,159 @@
- name: Get IP address for ansible.http.tests
command: >-
{{ ansible_python_interpreter }} -c 'import socket; print(socket.gethostbyname("{{ httpbin_host }}"))'
register: httpbin_ip

- name: Get groups
ansible.builtin.getent:
database: group

- name: Set var for nobody/nogroup group
set_fact:
nobody_group: '{{ ansible_facts.getent_group|list|select("match", "no(group|body)")|first }}'

- name: Allow nobody to traverse remote_tmp_dir
file:
path: '{{ remote_tmp_dir }}'
mode: '0755'

- name: Create log dir
file:
path: '{{ remote_tmp_dir }}/proxy-logs'
state: directory
owner: nobody
group: '{{ nobody_group }}'

- name: Install tinyproxy config
copy:
dest: '{{ remote_tmp_dir }}/tinyproxy.conf'
content: |
User nobody
Group {{ nobody_group }}
Port 8080
Listen 127.0.0.1
Timeout 10
LogLevel Info
MaxClients 1
StartServers 1
Allow 127.0.0.1
Allow ::1
Allow ::
ViaProxyName "tinyproxy"
LogFile "{{ remote_tmp_dir }}/proxy-logs/tinyproxy.log"

- name: Start tinyproxy
command: tinyproxy -d -c "{{ remote_tmp_dir }}/tinyproxy.conf"
async: 30
poll: 0
register: tinyproxy

- name: Ensure tinyproxy started
async_status:
jid: '{{ tinyproxy.ansible_job_id }}'

- name: Test http over http proxy
uri:
url: http://{{ httpbin_host }}/get
environment:
http_proxy: http://127.0.0.1:8080
register: http_over_http
failed_when: http_over_http.via is undefined

- name: Test https over http proxy
uri:
url: https://{{ httpbin_host }}/get
environment:
https_proxy: http://127.0.0.1:8080
register: https_over_http
# failed_when:
# failure checking is handled by the assert at the bottom comparing logs
# because we aren't running a proxy that can inspect the https stream
# there won't be added headers

- name: Test request without a proxy
uri:
url: http://{{ httpbin_host }}/get
register: request_without_proxy
failed_when: request_without_proxy.via is defined

- name: Test request with proxy and no_proxy=hostname
uri:
url: http://{{ httpbin_host }}/get
environment:
http_proxy: http://127.0.0.1:8080
no_proxy: '{{ httpbin_host }}'
register: no_proxy_hostname
failed_when: no_proxy_hostname.via is defined

- name: Test request with proxy and no_proxy=ip
uri:
url: http://{{ httpbin_ip.stdout }}/get
environment:
http_proxy: http://127.0.0.1:8080
no_proxy: '{{ httpbin_ip.stdout }}'
register: no_proxy_ip
failed_when: no_proxy_ip.via is defined

- name: Test request with proxy and no_proxy=cidr/32
uri:
url: http://{{ httpbin_ip.stdout }}/get
environment:
http_proxy: http://127.0.0.1:8080
no_proxy: '{{ httpbin_ip.stdout }}/32'
register: no_proxy_cidr_32
failed_when: no_proxy_cidr_32.via is defined

- name: Test request with proxy and no_proxy=cidr/24
uri:
url: http://{{ httpbin_ip.stdout }}/get
environment:
http_proxy: http://127.0.0.1:8080
no_proxy: '{{ httpbin_cidr }}'
register: no_proxy_cidr_24
vars:
httpbin_cidr: "{{ httpbin_ip.stdout.split('.')[:3]|join('.') }}.0/24"
failed_when: no_proxy_cidr_24.via is defined

- name: Test request with proxy and non-matching no_proxy=cidr
uri:
url: http://{{ httpbin_ip.stdout }}/get
environment:
http_proxy: http://127.0.0.1:8080
no_proxy: 1.2.3.0/24
register: no_proxy_non_matching_cidr
failed_when: no_proxy_non_matching_cidr.via is undefined

- name: Test request with proxy and no_proxy=cidr:port
uri:
url: http://{{ httpbin_ip.stdout }}/get
environment:
http_proxy: http://127.0.0.1:8080
no_proxy: '{{ httpbin_ip.stdout }}/32:80'
register: no_proxy_cidr_port
failed_when: no_proxy_cidr_port.via is defined

- name: Test request with proxy and non-matching no_proxy=cidr:port
uri:
url: http://{{ httpbin_ip.stdout }}/get
environment:
http_proxy: http://127.0.0.1:8080
no_proxy: '{{ httpbin_ip.stdout }}/32:8080'
register: no_proxy_non_matching_cidr_port
failed_when: no_proxy_non_matching_cidr_port.via is undefined

- slurp:
path: "{{ remote_tmp_dir }}/proxy-logs/tinyproxy.log"
register: tinyproxy_logs

- assert:
that:
- >-
log_content is contains "CONNECT " ~ httpbin_host ~ ":443"
# https over http
- >-
log_content|regex_findall(": CONNECT ")|length == 1
# 3 http over http
- >-
log_content|regex_findall(': GET')|length == 3
vars:
log_content: '{{ tinyproxy_logs.content|b64decode }}'