Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Attempt to make Blockhole blocking on L7 #17891

Closed

Conversation

henrybear327
Copy link
Contributor

@henrybear327 henrybear327 commented Apr 28, 2024

Based on the idea from @fuweid, this PR attempts to employ blocking on L7 but without using external tools.

Main ideas

The main idea is

  1. utilize the X-PeerURLs field from the header, as we know all traffic to peer will contain this header field.
  2. As we also know all nodes will create direct connections with their peers, so the traffic blocking will have to happen at all nodes' proxies, contrary to the current design, where only the proxy of the peer is being blackholed.

Based on the main ideas, we introduce an SSL termination proxy so we can obtain the X-PeerURLs field from the header.

PoC Implementation Issues

There are 2 known issues with this approach

  1. still leaking some traffic. But the leaked traffic (as discussed later, it won't affect the blackhole idea that we would like to achieve) as stream and pipeline traffic between raft nodes are now properly terminated
  2. we would need to employ an SSL termination proxy, which might lead to a small performance hit

For 1), this way of blocking (by utilizing X-PeerURLs from the header) will miss certain types of traffic to certain endpoints, as the traffic to some endpoints doesn't have the X-PeerURLs field. Currently, the known ones are: /members, /version, and /raft/probing.
As you can see from the log, its header doesn't contain the X-PeerURLs field, but only the following fields:

  • map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1]] /members
  • map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1]] /version
  • map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1]] /raft/probing

For 2), in order to read out X-PeerURLs from the header, we need to terminate the SSL connection, as we can't drop cleartext traffic (ref [1]). Thus, a new option e2e.WithSSLTerminationProxy(true) is introduced, which will change the network flow into

A -- B's SSL termination proxy - B's transparent proxy - B
       ^ newly introduced                ^ in the original codebase

Known improvements required before turning RFC into PR

The prototype needs to be further improved for code review after fixing the following issues:

  • blocking only RX or TX traffic (as currently a call to blackholeTX or blackholeRX will black both TX and RX traffic instead of just the specified one.
  • slowness when performing test cleanup (I think this is related to the SSL timeout setting, but I haven't looked into it yet)
  • coding style improvements

References:
[1] #15595

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@k8s-ci-robot
Copy link

Hi @henrybear327. Thanks for your PR.

I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@henrybear327
Copy link
Contributor Author

henrybear327 commented Apr 28, 2024

Thanks to @fuweid for suggesting to look into #15271

I applied the PR to the robustness test test-robustness-issue15271. The robustness test can pass on my M1 machine.

We can see that the revision log is the following:

network.go:126: clusterRevision: 109, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 119, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 135, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 146, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 160, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 175, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 193, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 202, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 216, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 228, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 244, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 256, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 271, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 283, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 295, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 307, blackholedMemberRevision: 108
    network.go:126: clusterRevision: 323, blackholedMemberRevision: 108

Compared to before patching this PR, and the robustness test failed on my machine, but not with the Broke watch guarantee error though (?)

network.go:129: clusterRevision: 126, blackholedMemberRevision: 125
    network.go:129: clusterRevision: 142, blackholedMemberRevision: 141
    network.go:129: clusterRevision: 158, blackholedMemberRevision: 157
    network.go:129: clusterRevision: 166, blackholedMemberRevision: 166
    network.go:129: clusterRevision: 181, blackholedMemberRevision: 181
    network.go:129: clusterRevision: 194, blackholedMemberRevision: 194
    network.go:129: clusterRevision: 208, blackholedMemberRevision: 208
    network.go:129: clusterRevision: 222, blackholedMemberRevision: 220
    network.go:129: clusterRevision: 236, blackholedMemberRevision: 236
    network.go:129: clusterRevision: 253, blackholedMemberRevision: 253
    network.go:129: clusterRevision: 266, blackholedMemberRevision: 266
    network.go:129: clusterRevision: 282, blackholedMemberRevision: 282
    network.go:129: clusterRevision: 296, blackholedMemberRevision: 296
    network.go:129: clusterRevision: 306, blackholedMemberRevision: 306
    network.go:129: clusterRevision: 326, blackholedMemberRevision: 326
    network.go:129: clusterRevision: 341, blackholedMemberRevision: 341
    network.go:129: clusterRevision: 355, blackholedMemberRevision: 355
    network.go:129: clusterRevision: 368, blackholedMemberRevision: 368
    network.go:129: clusterRevision: 381, blackholedMemberRevision: 381
    network.go:129: clusterRevision: 395, blackholedMemberRevision: 394
    network.go:129: clusterRevision: 407, blackholedMemberRevision: 407
    network.go:129: clusterRevision: 420, blackholedMemberRevision: 420
    network.go:129: clusterRevision: 439, blackholedMemberRevision: 439
    network.go:129: clusterRevision: 454, blackholedMemberRevision: 452
    network.go:129: clusterRevision: 467, blackholedMemberRevision: 466
    network.go:129: clusterRevision: 485, blackholedMemberRevision: 485
    network.go:129: clusterRevision: 500, blackholedMemberRevision: 498
    network.go:129: clusterRevision: 510, blackholedMemberRevision: 510
    network.go:129: clusterRevision: 523, blackholedMemberRevision: 523
    network.go:129: clusterRevision: 539, blackholedMemberRevision: 539
    network.go:129: clusterRevision: 555, blackholedMemberRevision: 555
    network.go:129: clusterRevision: 570, blackholedMemberRevision: 570
    network.go:129: clusterRevision: 583, blackholedMemberRevision: 583
    network.go:129: clusterRevision: 601, blackholedMemberRevision: 601
    network.go:129: clusterRevision: 612, blackholedMemberRevision: 611
    network.go:129: clusterRevision: 622, blackholedMemberRevision: 622
    network.go:129: clusterRevision: 641, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 659, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 675, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 690, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 704, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 715, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 736, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 752, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 765, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 781, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 794, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 807, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 824, blackholedMemberRevision: 636
    network.go:129: clusterRevision: 838, blackholedMemberRevision: 636

So I think the PR is working in a sense, as the blackholedMemberRevision is fixed at a version.

@henrybear327 henrybear327 force-pushed the fix/e2e_blackhole_proxy_review branch from 236d14b to 60c5240 Compare April 28, 2024 13:32
@henrybear327
Copy link
Contributor Author

henrybear327 commented Apr 28, 2024

/cc @fuweid @siyuanfoundation

Can you please take a look when you have time, and see if this PR idea is worth refining into a reviewable PR :)

I have tested it on the robustness test, and attached the log and findings!

@siyuanfoundation
Copy link
Contributor

I think using a separate proxy type makes sense, and the remaining 3 leaking traffic should not really affect p2p communication much.
@henrybear327 have you figured out why transport package does not work?

siyuanfoundation and others added 3 commits May 3, 2024 23:14
Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
Shorten the wait time for the open connection to expire to 5s

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
Based on Fu Wei's idea, we employ blocking on L7 but without using
external tools.

[Main ideas]
The main idea is
1) utilize the `X-PeerURLs` field from the header, as we know all
traffic to peer will contain this header field.
2) As we also know that all nodes will create direct connections with their
peers, so the traffic blocking will have to happen at all nodes'
proxies, contrary to the current design, where only the proxy of the
peer is being blackholed.

Based on the main ideas, we introduce a SSL termination proxy so we can
obtain the `X-PeerURLs` field from the header.

[Issues]

There are 2 known issues with this approach
1) still leaking some traffic. But the leaked traffic (as discussed
later won't affect the blackhole idea that we would like to achieve (as
stream and pipeline traffic between raft nodes are now properly terminated)
2) we would need to employ SSL termination proxy, which might lead to a
small performance hit

For 1), as this way of blocking (by utilizing `X-PeerURLs` from the
header) will miss certain types of traffic to certain endpoints, as
the traffic to some endpoints doesn't have the `X-PeerURLs` field.
Currently, here are the known ones: /members, /version, and /raft/probing.
As you can see from the log, its header doesn't contain the `X-PeerURLs`
field, but only the following fields:
- map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1]] /members
- map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1]] /version
- map[Accept-Encoding:[gzip] User-Agent:[Go-http-client/1.1]] /raft/probing

For 2) in order to read out `X-PeerURLs` from the header, we need to
terminate the SSL connection, as we can't drop cleartext traffic
(ref [1]). Thus, a new option `e2e.WithSSLTerminationProxy(true)`
is introduced, which will change the network flow into
```
A -- B's SSL termination proxy - B's transparent proxy - B
     ^ newly introduced          ^ in the original codebase
```

[Known improvements required before turning RFC into PR]

The prototype needs to be further improved for code review after
fixing the following issues:
- blocking only RX or TX traffic (as currently a call to `blackholeTX`
or `blackholeRX` will black both TX and RX traffic instead of just
the specified one.
- slowness when performing test cleanup (I think this is related to the
SSL timeout setting, but I haven't looked into it yet)
- coding style improvements

References:
[1] etcd-io#15595

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
@henrybear327 henrybear327 force-pushed the fix/e2e_blackhole_proxy_review branch from 60c5240 to 930dde5 Compare May 3, 2024 15:17
pkg/proxy/server.go Show resolved Hide resolved
tests/framework/e2e/etcd_process.go Show resolved Hide resolved
pkg/proxy/server.go Show resolved Hide resolved
@henrybear327
Copy link
Contributor Author

A different approach is used to supersede this PR, thus closing this one.

henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will receive traffic from its peers and initiate connections to
its peers (via stream and pipeline), thus, the current mechanism of
only blocking traffic via proxy is incomplete, since the traffic
initiated from the peers to others will be "leaking", since `blackholeTX`
and `blackholeRX` only drop traffic coming in and out of the peer's
proxy.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy".

We will introduce a "forward proxy", which will be proxying all the
connection initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- B's SSL termination proxy - B's transparent proxy - B
     ^ newly introduced          ^ in the original codebase
```

By adding this forward proxy, we can block all traffic coming in and out
of a peer, without having to resort to external tools, such as iptables,
and the blocking of traffic is complete.

[Implementation]

The main subtasks are:
- Set up forward proxy
- Enable/disable forward proxy

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Reference]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's transparent proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's transparent proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's transparent proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's reverse proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's reverse proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's reverse proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's reverse proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's reverse proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's reverse proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 5, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's reverse proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 6, 2024
Based on Fu Wei's idea discussed in the issue [1], we employ the
blocking on L7 but without using external tools.

[Problem]

A peer will (a) receive traffic from its peers and (b) initiate
connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Solution - main idea]

Let's first agree on the naming of the existing proxy as a
"reverse proxy", since it sits in front of every peer and ingest the
traffic into each of them.

We introduce a "forward proxy" for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy - B's reverse proxy - B
     ^ newly introduced  ^ in the original codebase
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables. It's verified that the blocking of traffic is
complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

The end result is shown as the following diagram:
```
A -- A's forward proxy (connections initiated from A will be forward from this proxy)
 |   ^ covers case (b)
 |
 --- A's reverse proxy (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```

[Test]
make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1

[Known issue]

I run into `context deadline exceeded` sometimes.

[References]
[1] Issue etcd-io#17737
[2] Supersedes PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] Superseded PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 6, 2024
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737), we employ the blocking on L7 but without using external tools.

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

We introduce an "HTTP proxy" for each peer, which will be proxying all the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's HTTP proxy ----- B's (currently existing) proxy - B
     ^ newly introduced   ^ in the original codebase
```

By adding this HTTP proxy, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables. It's verified that the blocking of traffic is complete, compared to previous solutions [2][3].

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement HTTP proxy by extending the existing proxy server code
- implement enable/disable of the HTTP proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's HTTP proxy (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) proxy (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```

- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 6, 2024
Thanks to Fu Wei for the input regarding etcd-io#17938.

[Problem]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

[Main idea]

We introduce 1 shared "HTTP proxy" for all peers. All peers will be proxying all the connections through it.

The modified architecture will look something like this:
```
A -- shared HTTP proxy ----- B
     ^ newly introduced
```

By adding this HTTP proxy, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables. It's verified that the blocking of traffic is complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`, because go will not parse HTTP_PROXY and HTTPS_PROXY that is using localhost or 127.0.0.1, regardless if the port is present or not
- implement the shared HTTP proxy by extending the existing proxy server code
- remove existing proxy setup (the per-peer proxy)
- implement enable/disable of the HTTP proxy in the e2e test

[Testing]

- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[References]
[1] Tracking issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891
[4] PR (V3) etcd-io#17938

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 6, 2024
Thanks to Fu Wei for the input regarding etcd-io#17938.

[Problem]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

[Main idea]

We introduce 1 shared HTTP proxy for all peers. All peers will be proxying all the connections through it.

The modified architecture will look something like this:
```
A -- shared HTTP proxy ----- B
     ^ newly introduced
```

By adding this HTTP proxy, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables. It's verified that the blocking of traffic is complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`, because go will not parse HTTP_PROXY and HTTPS_PROXY that is using localhost or 127.0.0.1, regardless if the port is present or not
- implement the shared HTTP proxy by extending the existing proxy server code (we need to be able to identify the source sender)
- remove existing proxy setup (the per-peer proxy)
- implement enable/disable of the HTTP proxy in the e2e test

[Testing]

- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[References]
[1] Tracking issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891
[4] PR (V3) etcd-io#17938

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 11, 2024
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737), we employ the blocking on L7 but without using external tools.

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

We introduce an "HTTP proxy" for each peer, which will be proxying all the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's HTTP proxy ----- B's (currently existing) proxy - B
     ^ newly introduced   ^ in the original codebase
```

By adding this HTTP proxy, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables. It's verified that the blocking of traffic is complete, compared to previous solutions [2][3].

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement HTTP proxy by extending the existing proxy server code
- implement enable/disable of the HTTP proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's HTTP proxy (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) proxy (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```

- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 11, 2024
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737), we employ the blocking on L7 but without using external tools.

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

We introduce an "HTTP proxy" for each peer, which will be proxying all the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's HTTP proxy ----- B's (currently existing) proxy - B
     ^ newly introduced   ^ in the original codebase
```

By adding this HTTP proxy, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables. It's verified that the blocking of traffic is complete, compared to previous solutions [2][3].

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement HTTP proxy by extending the existing proxy server code
- implement enable/disable of the HTTP proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's HTTP proxy (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) proxy (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```

- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 11, 2024
Thanks to Fu Wei for the input regarding etcd-io#17938.

[Problem]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

[Main idea]

We introduce 1 shared HTTP proxy for all peers. All peers will be proxying all the connections through it.

The modified architecture will look something like this:
```
A -- shared HTTP proxy ----- B
     ^ newly introduced
```

By adding this HTTP proxy, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables. It's verified that the blocking of traffic is complete, compared to previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `FORWARD_PROXY`, because go will not parse HTTP_PROXY and HTTPS_PROXY that is using localhost or 127.0.0.1, regardless if the port is present or not
- implement the shared HTTP proxy by extending the existing proxy server code (we need to be able to identify the source sender)
- remove existing proxy setup (the per-peer proxy)
- implement enable/disable of the HTTP proxy in the e2e test

[Testing]

- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[References]
[1] Tracking issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891
[4] PR (V3) etcd-io#17938

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 11, 2024
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737), we employ the blocking on L7 but without using external tools.

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's existing proxy is insufficient, since only scenario (a) is handled, and scenario (b) is not blocked at all.

We introduce an "HTTP proxy" for each peer, which will be proxying all the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's HTTP proxy ----- B's (currently existing) proxy - B
     ^ newly introduced   ^ in the original codebase
```

By adding this HTTP proxy, we can block all in and out traffic that is initiated from a peer to others, without having to resort to external tools, such as iptables. It's verified that the blocking of traffic is complete, compared to previous solutions [2][3].

The main subtasks are
- set up an environment variable `FORWARD_PROXY`
- implement HTTP proxy by extending the existing proxy server code
- implement enable/disable of the HTTP proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's HTTP proxy (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) proxy (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```

- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 12, 2024
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737),
we employ the blocking on L7 but without using external tools.

[Background]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Proposed solution]

We introduce an forward proxy for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy ----- B's reverse proxy - B
     ^ newly introduced      ^ in the original codebase (renamed)
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables.

It's verified that the blocking of traffic is complete, compared to
previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `E2E_TEST_FORWARD_PROXY_IP`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's forward proxy
     (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) reverse proxy
     (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```

[Testing]
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[Issues]
- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 15, 2024
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737),
we employ the blocking on L7 but without using external tools.

[Background]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Proposed solution]

We introduce an forward proxy for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy ----- B's reverse proxy - B
     ^ newly introduced      ^ in the original codebase (renamed)
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables.

It's verified that the blocking of traffic is complete, compared to
previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `E2E_TEST_FORWARD_PROXY_IP`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's forward proxy
     (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) reverse proxy
     (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```

[Testing]
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[Issues]
- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 16, 2024
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737),
we employ the blocking on L7 but without using external tools.

[Background]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Proposed solution]

We introduce an forward proxy for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy ----- B's reverse proxy - B
     ^ newly introduced      ^ in the original codebase (renamed)
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables.

[Testing]
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[Issues]
- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>

It's verified that the blocking of traffic is complete, compared to
previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `E2E_TEST_FORWARD_PROXY_IP`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's forward proxy
     (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) reverse proxy
     (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 18, 2024
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737),
we employ the blocking on L7 but without using external tools.

[Background]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Proposed solution]

We introduce an forward proxy for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy ----- B's reverse proxy - B
     ^ newly introduced      ^ in the original codebase (renamed)
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables.

[Testing]
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[Issues]
- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>

It's verified that the blocking of traffic is complete, compared to
previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `E2E_TEST_FORWARD_PROXY_IP`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's forward proxy
     (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) reverse proxy
     (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```
henrybear327 added a commit to henrybear327/etcd that referenced this pull request May 21, 2024
Based on Fu Wei's idea discussed in the [issue](etcd-io#17737),
we employ the blocking on L7 but without using external tools.

[Background]

A peer will
(a) receive traffic from its peers
(b) initiate connections to its peers (via stream and pipeline).

Thus, the current mechanism of only blocking peer traffic via the peer's
existing proxy is insufficient, since only scenario (a) is handled, and
scenario (b) is not blocked at all.

[Proposed solution]

We introduce an forward proxy for each peer, which will be proxying all
the connections initiated from a peer to its peers.

The modified architecture will look something like this:
```
A -- A's forward proxy ----- B's reverse proxy - B
     ^ newly introduced      ^ in the original codebase (renamed)
```

By adding this forward proxy, we can block all in and out traffic that
is initiated from a peer to others, without having to resort to external
tools, such as iptables.

[Testing]
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionLeader$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`
- `make gofail-enable && make build && make gofail-disable && \
go test -timeout 60s -run ^TestBlackholeByMockingPartitionFollower$ go.etcd.io/etcd/tests/v3/e2e -v -count=1`

[Issues]
- I run into `context deadline exceeded` sometimes
```
    etcd_mix_versions_test.go:175:
                Error Trace:    /Users/henrybear327/go/src/etcd/tests/e2e/etcd_mix_versions_test.go:175
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:75
                                                        /Users/henrybear327/go/src/etcd/tests/e2e/blackhole_test.go:31
                Error:          Received unexpected error:
                                [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
                Test:           TestBlackholeByMockingPartitionLeader
                Messages:       failed to put "key-0", error: [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0] match not found.  Set EXPECT_DEBUG for more info Errs: [unexpected exit code [1] after running [/Users/henrybear327/go/src/etcd/bin/etcdctl --endpoints=http://localhost:20006 put key-0 value-0]], last lines:
                                {"level":"warn","ts":"2024-05-05T23:02:36.809726+0800","logger":"etcd-client","caller":"v3@v3.6.0-alpha.0/retry_interceptor.go:65","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x140001ee960/localhost:20006","method":"/etcdserverpb.KV/Put","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
                                Error: context deadline exceeded
                                 (expected "OK", got []). Try EXPECT_DEBUG=TRUE
```

[References]
[1] issue etcd-io#17737
[2] PR (V1) https://github.com/henrybear327/etcd/tree/fix/e2e_blackhole
[3] PR (V2) etcd-io#17891

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>

It's verified that the blocking of traffic is complete, compared to
previous solutions [2][3].

[Implementation]

The main subtasks are
- set up an environment variable `E2E_TEST_FORWARD_PROXY_IP`
- implement forward proxy by extending the existing proxy server code
- implement enable/disable of the forward proxy in the e2e test

The result is that for every peer, we will have the arch like this
```
A -- A's forward proxy
     (connections initiated from A will be forwarded from this proxy)
 |   ^ covers case (b)
 |
 --- A's (currently existing) reverse proxy
     (advertised to other peers where the connection should come in from)
     ^ covers case (a)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants