[release/1.1] Update to latest compatible tokio and rayon releases #6134

Closed
wants to merge 349 commits into from

onalante-msft
Contributor

crossbeam-utils recently patched an unsoundness bug in AtomicCell arithmetic: crossbeam-rs/crossbeam#781. However, this patch was not backported to the older versions required by tokio 0.1 and rayon. This PR updates tokio and rayon to the latest available versions to unify our crossbeam dependencies, in particular unifying crossbeam-utils at version 0.7.2. In the process, we can verify that we are not affected by the unsoundness bug by attempting a build with crossbeam-utils patched to remove AtomicCell arithmetic.

Azure IoT Edge PR checklist:

This checklist is used to make sure that common guidelines for a pull request are followed.

General Guidelines and Best Practices

  • I have read the contribution guidelines.
  • Title of the pull request is clear and informative.
  • Description of the pull request includes a concise summary of the enhancement or bug fix.

Testing Guidelines

  • Pull request includes test coverage for the included changes.
  • Description of the pull request includes
    • concise summary of tests added/modified
    • local testing done.

Draft PRs

  • Open the PR in Draft mode if it is:
    • Work in progress or not intended to be merged.
    • Encountering multiple pipeline failures and working on fixes.

Note: We use the kodiakhq bot to merge PRs once the necessary checks and approvals are in place. When it merges a PR, kodiakhq converts the PR title to the commit title, PR description to the commit description, and squashes all the commits in the PR to a single commit. The net effect is that the entire PR becomes a single commit. Please follow the best practices mentioned here for the PR title and description.

yophilav and others added 30 commits November 17, 2020 18:15
Update ARM base image to 3.1.10-bionic
# 1.0.10.3 (2020-11-18)
## Edge Agent
### Bug Fixes
* Fix vulnerability issues in ARM-based docker images [6603778](Azure@6603778)

## Edge Hub
### Bug Fixes
* Fix vulnerability issues in ARM-based docker images [6603778](Azure@6603778)
* Remove await from support bundle autofac construction (Azure#3453)

Autofac inconsistently breaks if you await before resolving something.

* adding bool.trueString to cleanup processor metrics label to fix it

Co-authored-by: Lee Fitchett <lefitche@microsoft.com>
This was a simple fix. 

~~The new e2e tests currently do not test the http configuration for mgmt and workload sockets, so there may be a follow up to add tests.~~
Since http is not officially supported for mgmt and workload sockets, no tests will be added.
Currently, `ValidateMetrics` parses the [metric readme page](https://github.com/Azure/iotedge/blob/master/doc/BuiltInMetrics.md) to obtain all the metric keys. The test then deploys the system runtime modules and populates their metrics, which are then compared against the set parsed from the document.
A handful of not-so-positive metrics are never populated when the system modules stay on the happy path during the E2E test. Because they are not populated, these metrics currently cause `ValidateMetrics` to fail.
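
The comparison described above can be sketched as a set difference with an ignore list. This is only an illustration in Rust, not the actual C# test code: the function name `missing_metrics`, the `ignored` list, and the metric name `hypothetical_failed_total` are assumptions made for the example.

```rust
use std::collections::BTreeSet;

/// Returns the documented metric names that were never observed,
/// skipping any name listed in `ignored` (a hypothetical list of
/// failure-only metrics that a happy-path run does not populate).
fn missing_metrics<'a>(
    documented: &'a [&'a str],
    observed: &[&str],
    ignored: &[&str],
) -> Vec<&'a str> {
    let observed: BTreeSet<&str> = observed.iter().copied().collect();
    let ignored: BTreeSet<&str> = ignored.iter().copied().collect();
    documented
        .iter()
        .copied()
        // Keep only names that are neither observed nor explicitly ignored.
        .filter(|name| !observed.contains(name) && !ignored.contains(name))
        .collect()
}
```

With an empty ignore list, a documented but unpopulated metric is reported as missing; adding it to `ignored` makes the validation pass.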

Cherry-picked: Azure@f443dde
Cherry-picking from main branch

This PR fixes a bug in product code where the cleanup task in MessageStore.cs can potentially start before all of its settings are set, resulting in it taking default values for its settings.
Also re-enables the MultipleTTL test in MessageStoreTest.cs and makes it run with polling vs a delay, which will speed up the test.
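
As a rough illustration of polling instead of a fixed delay, here is a minimal sketch in Rust (the real tests are C#). The helper name `poll_until` and its parameters are hypothetical.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Calls `condition` every `interval` until it returns true or `timeout`
/// elapses. Returns true if the condition was met in time.
fn poll_until<F: FnMut() -> bool>(
    mut condition: F,
    timeout: Duration,
    interval: Duration,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if condition() {
            return true; // finish as soon as the condition holds
        }
        if Instant::now() >= deadline {
            return false; // give up once the timeout has passed
        }
        thread::sleep(interval);
    }
}
```

A test built this way returns as soon as the expected state appears, so the timeout can be generous without slowing down the passing case.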
…e/1.0.10 (Azure#4042)

Cherry pick Fix message priorities test (Azure#4015) from main to release/1.0.10

Fixing message priorities test. It's flaky. There's a delay in it, which is likely the issue. Changing it to be a poll and making it a bit longer should improve the test.
Cherry-pick from main branch to release/1.0.10
InMemoryDbStoreLimitsTest is flaky in the CI pipeline. I believe it is because we overuse the "sender1" name. The test keeps failing on "Amqp resource is disconnected," which I think we get because we're trying to access an AMQP resource that has previously closed. This would make sense if we are using the "sender1" name somewhere else around the same time.

So this pr just changes the name to be more specific to the test. I've run it ~8 times in the pipeline and haven't seen the failure, so it looks like it solves the problem.
Cherry-picking part of this commit from main branch to 1.0.10: 43e8151

This PR cleans up workarounds that were put in when the PlugAndPlay tests were first made.

Previously, we needed to use a preview version of IoTHub that supported PnP. Since then, they've gone GA, and we can use our normal IoTHub and normal EventHub endpoint.
Cherry-pick from main branch to 1.0.10
Previously, we implemented a workaround for end-to-end (e2e) tests because config.yaml + iotedged pulled version 1.0.9 by default: we use the 1.0 tag, and 1.0.9 was tagged as 1.0. This was an issue because validation logic in the e2e tests checks that EA has been deployed correctly, and that logic didn't work if the EA versions were incompatible. EA 1.0.10 had a change that 1.0.9 lacked, which made the two versions incompatible. So we added a custom bootstrap image to the e2e tests that would start the config.yaml with a 1.0.10 version of EA that was compatible with the newer version of EA.

Now that 1.0.10 is tagged as 1.0, all of these changes can go away, since we no longer need a bootstrap image.
This PR removes the bootstrap workaround.
* added metrics for current number of connected clients, individual device connect/disconnected to/from iot edge
* updated doc/BuiltInMetrics.md
* modified failed connection metrics description to align with doc and added ignore disconnect in e2e test
1. Update ARM base image usage to version 1.0.5.6 (3.1.10-bionic-arm*)
2. Update Windows AMD64 base image to 3.1.10-nanoserver-1809
This rebases all changes from Edge on K8s public preview branch into latest release/1.0.10 branch.
Cherry-picked all commits from merge base to latest edge-on-k8s-public-preview branch into this PR, then cleaned it up.
Most C# code is in the "Kubernetes" project, but some changes affect the "Service" and "Core" projects.

Tests:

TEST:
Bring up k8s based on: https://microsoft.github.io/iotedge-k8s-doc/
only using the images from this PR's build.

[k8s] Update Helm charts to allow for correct proxy settings. (Azure#3641)
TEST:
Set ".Values.iotedged.data.no_proxy" on installation.
Confirm "no_proxy" environment var is set for iotedged and agent pods.

[k8s] Add all root CA certs to TLS connections (Azure#3616)
TEST:
Set up Edge runtime with self-signed certificate. Ensure it starts and all modules connect.

[k8s] Allow Edge on K8s modules to join HostNetwork. (Azure#3618)
TEST:
Set "createOptions.NetworkMode=Host" in a module, ensure it starts and connects to EdgeHub.

[k8s] Set resource limits and requests for Edge runtime components Azure#3666
TEST:
Start runtime with these Helm charts. Once pods are running, run
kubectl get pod -n -o yaml
For each pod.
Confirm iotedged has a resource limit and request set, and qosClass: Guaranteed.
Confirm edgeAgent has a resource limit for all containers and qosClass: Guaranteed.
Each module will have a resource limit for its proxy container, and may or may not have resources set for the module itself.

[k8s public preview] Have EdgeDeploymentOperator report status to Agent Azure#3232
TEST:

Set up a module with bad bind mount:
{"image":"mcr.microsoft.com/media/live-video-analytics:1","createOptions":{"hostConfig":{"Binds":["/home/lvaadmin/samples/output:/var/media/","//:/var/lib/azuremediaservices/"],"LogConfig":{"Config":{"max-size":"10m","max-file":"10"},"Type":""}}},"auth":null}

Add space at the end of the ACR user name in the deployment. Use a module that attempts to access the ACR.

Both setups should not crash agent. Runtime status of device on portal should show a "500" error.

[k8s] Discovered some problems with createOptions parsing. (Azure#3743)
TEST:
Set a module with Cmd, EntryPoint, WorkingDir with different capitalization and ensure these are parsed and assigned in pod.

[k8s] Environment Variables can be null or empty. (Azure#3780)
TEST:
In Agent section of Helm charts, set an empty environment variable, ensure iotedged starts and launches Agent with empty variable.

In any module's environment variable section, make an unset environment variable, and one that is all "=", ensure Pod is created with these environment variables present.
* added metrics for current number of connected clients, individual device connect/disconnected to/from iot edge

* fixed metrics description

* fixed UT ConnectionManagerTest by adding missing mock which impacted by metrics change

* Update edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/DeviceConnectionMetrics.cs

Co-authored-by: Venkat Yalla <veyalla@microsoft.com>

* Update edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/DeviceConnectionMetrics.cs

Co-authored-by: Venkat Yalla <veyalla@microsoft.com>

* Update edge-hub/src/Microsoft.Azure.Devices.Edge.Hub.Core/DeviceConnectionMetrics.cs

Co-authored-by: Venkat Yalla <veyalla@microsoft.com>

* Update doc/BuiltInMetrics.md

Co-authored-by: Venkat Yalla <veyalla@microsoft.com>

* Update doc/BuiltInMetrics.md

Co-authored-by: Venkat Yalla <veyalla@microsoft.com>

* Update doc/BuiltInMetrics.md

Co-authored-by: Venkat Yalla <veyalla@microsoft.com>

* Update doc/BuiltInMetrics.md

Co-authored-by: Venkat Yalla <veyalla@microsoft.com>

* modified to use Array.Empty

* modified failed connection metrics description to align with doc and added ignore disconnect in e2e test

* fixed typo in md file

Co-authored-by: Venkat Yalla <veyalla@microsoft.com>
# 1.0.10.4 (2020-12-18)
## Edge Agent
### Bug Fixes
* Fix vulnerability issues in ARM-based docker images [6603778](Azure@6603778)
* GetModuleLogs works with http endpoints [cf176be](Azure@cf176be)

## Edge Hub
### Bug Fixes
* Fix `edgehub_queue_length` metric [2861dbc](Azure@2861dbc)
* Introduce new metrics [40b2de9](Azure@40b2de9)
* Fix vulnerability issues in Linux ARM and Windows AMD docker images [73fa197](Azure@73fa197)
* Remove metrics v0 (Azure#4109)

* remove metrics v0 doc (Azure#4162)
iiot usually cleans up its packages on completion, but won't be able to do this if it crashes, times out, or is canceled. Update tests to remove these packages if encountered.
First take on automating ARM Base Image update.

Use case scenario: 
1. tools/BaseImageUpdate/baseImage.ps1 is a helper script that a developer can run in PowerShell to update the ARM base image version, e.g. `Update-ARM-BaseImages -NewASPNetCoreVersion "2.1.23"`.
2. The developer sends out a pull request with the base image changes.
3. The developer uses the pipeline `builds/misc/base-images-publish.yaml` to kick off the build and base image publication.
4. (Optional) As a good measure, once (3) is completed, the developer can run the build image pipeline to verify a successful publication.

Cherry-pick: Azure@931778a
nimanch and others added 29 commits January 14, 2022 23:56
This PR brings changes from
1. Azure#5963 
2. Azure#5983 

to release/1.1. Manual changes were needed because of significant differences in how packages are built in 1.2 vs 1.1.

This PR is a follow up of
1. Azure#5995 

Testing
1. Hardcoded the version and branch and verified that packages are uploaded to GitHub: https://dev.azure.com/msazure/One/_build/results?buildId=50827929&view=results
2. Verified that the repo client tool can run and query status: https://dev.azure.com/msazure/One/_build/results?buildId=50830155&view=results


Same Changes as Azure#6015 

* [1.1] Build rocksdb and arm images in amd64 hosts (Azure#5954)

cherry pick into 1.1 for Azure#5947 and Azure#5950. (Azure#5954)
When this project was started, we used base images to run some commands in the native execution environment (for example: create a user in the container). Now with cross-platform docker builds, we can use runtime emulation to do this work, and not maintain the base images.

We also had an inconsistent way to house the RocksDB library files. This PR builds the libraries in parallel with the project executables, and collects the libraries with the other artifacts for the image build to use. I also updated the container images build to ubuntu 20.04 to remove all references to qemu-static container.

This PR reduces (but does not completely eliminate) dependency on docker hub for the project.

Tested by running E2E tests on images built on the PR branch.


1.1 is again slightly different than 1.2, with no consolidated artifacts, so slightly different handling of RocksDb libraries.

Place rocksdb lib in a directory.

Make sure destination and source are the same.

no consolidated artifacts in 1.1

Processing buildx args got lost in the merge.

Update to 20.04 for build image stage.
Per Azure#5987, the build pipelines should be updated to use Ubuntu 20.04 to keep OS dependencies consistent.

- Update underlying .NET images for a security update (.Net Core 3.1.22)
- Use "stable" tags for each of the .NET flavors 

Cherry Pick (Azure#6022)

* Prepare for Release 1.1.9

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
… updates for release/1.1 (Azure#6054)

This is a standalone Release pipeline, intended to simplify the existing internal release process.

To wit:

- It eliminates the need to run a separate pipeline to build test images by removing external dependencies,
- It only builds images that are needed, either for the release or for testing purposes, and
- It obviates the need for running manual checks/smoke tests by running E2E tests on the Release bits (which isn't something we were doing before).

This change also adds Ubuntu 20.04 support (to account for a change in directory structure).

**Testing Notes:**

![image](https://user-images.githubusercontent.com/90283547/151478992-3b39fb6b-43de-43ce-a611-9bf1d9e4b958.png)


…6023)

* Fix underflow

* Use compare exchange and make separate test

* Use Max for count instead of trying to make atomic check
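
The two approaches named in these commit notes can be sketched as follows. This is a hedged Rust illustration using standard-library atomics, not the project's actual code; the function names `saturating_decrement` and `clamped_count` are hypothetical.

```rust
use std::sync::atomic::{AtomicI64, AtomicUsize, Ordering};

/// Approach 1: decrement with a compare-exchange loop so the unsigned
/// counter never wraps below zero. Returns the value after the call.
fn saturating_decrement(counter: &AtomicUsize) -> usize {
    let mut current = counter.load(Ordering::Relaxed);
    loop {
        if current == 0 {
            return 0; // already at zero: do not underflow
        }
        match counter.compare_exchange_weak(
            current,
            current - 1,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            Ok(_) => return current - 1,
            // Another thread changed the value (or the exchange failed
            // spuriously); retry with the value that was actually seen.
            Err(actual) => current = actual,
        }
    }
}

/// Approach 2: let a signed counter dip below zero and clamp with `max`
/// when reporting the count.
fn clamped_count(counter: &AtomicI64) -> i64 {
    counter.load(Ordering::Relaxed).max(0)
}
```

The second form is simpler because it needs no retry loop; the trade-off is that the stored value can be transiently negative, so every reader has to go through the clamp.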
Some cleanup from previous PR: Azure#6054


This change updates the Dockerfiles for our Windows images to use stable nanoserver tags. It also removes all Windows arm32v7 Dockerfiles.
Prepare for release 1.1.10
The K8s client is currently requiring a library (BouncyCastle) we are
attempting to remove from the project. Since the existing Edge on K8s
feature is only on 1.1-k8s-preview branch, we can remove it from 1.1 branch.


…6069)

This change:

- integrates with Dave's rocksDb changes.
- adds E2E tests (for all supported architectures) to the Release bits of the Metrics Collector (just Metrics collector related tests, not the full suite of E2E tests).
- uses the same workflow as the Base Image update pipelines (for building test images, running E2E tests), but is NOT a separate workflow. _This changes the existing Metrics Collector pipeline!_


**Testing Notes:**

![image](https://user-images.githubusercontent.com/90283547/152253646-372cf6ee-0976-4b43-a5c6-6abfe9692e26.png)



**Other Notes:**
The E2E test architectures added in the metrics-collector-images-release.yaml file have been added with intent. That file contains the entire workflow and it's solely for metrics collector. There is no plan to create a template to extract common portions, because a thing should be made as simple as possible, but no simpler.

…rors (Azure#6083)

Component governance identified a vulnerability in thread_local, which
was a transitive dependency brought in by older regex versions.

This has the additional benefit of unifying our regex dependency
versions.

RUSTSEC advisory: https://rustsec.org/advisories/RUSTSEC-2022-0006.html

It can sometimes be that builds succeed but the resulting artifacts lead
to test timeouts, for example if edgeHub crashes upon execution.
Unconditionally publishing logs would make it easier to debug these
cases.

* fix shutdown and E2E test

Based on two recent pipeline failures on ARM32 Raspberry Pis, the ValidateMetrics test needs more time to complete. Kusto logs were examined and they indicated that the ValidateMetrics direct method call succeeded within 30 seconds of the test timing out and failing.  Therefore, extending the timeout by 2 minutes should be more than enough time to fix this issue.

The TestGetModuleLogs test occasionally fails, not because it does something wrong but because it doesn't have enough time to finish everything it's trying to do. This change increases the end-to-end test timeout to 8 minutes on Windows. It also increases the overall test time allotment to 120 minutes (the default is 60). We don't really need to double the allotment, but arm32v7 will already take this long, so it won't hurt to double (as opposed to picking some arbitrarily smaller value).

TestGetModuleLogs has only failed twice for this reason in the last couple of weeks, so it's hard to know whether this corrects the problem. But as long as this doesn't break anything, we'll make this change and see if flaky runs disappear over time.

## Azure IoT Edge PR checklist:

This checklist is used to make sure that common guidelines for a pull request are followed.

### General Guidelines and Best Practices
- [X] I have read the [contribution guidelines](https://github.com/azure/iotedge#contributing).
- [X] Title of the pull request is clear and informative.
- [X] Description of the pull request includes a concise summary of the enhancement or bug fix.

### Testing Guidelines
- [X] Pull request includes test coverage for the included changes.
- Description of the pull request includes 
	- [X] concise summary of tests added/modified
	- [X] local testing done.
…e key (Azure#6105)

This is a workaround for Azure#5087

When EdgeHub starts up and imports the certificates with their private keys, sometimes the private key does not get associated with the certificate. As a result, using the certificate object and accessing the private key throws an exception.
Because this certificate is handed over to Kestrel to provide a TLS connection, Kestrel tries to access the private key and throws an exception every time a client tries to connect. As a result, no client can connect to EdgeHub and it never recovers from this error.

We could not find the root cause of why the import sometimes fails, and it is not clear why this happens only on Windows. However, we found that retrying the import (with the same certificate and key files) usually works (it always worked in the tests, but we cannot claim that it always will).

As a workaround, during startup EdgeHub tries to access the private key to check that the association between the certificate and the key works. If an error occurs, it re-imports the certificate, trying at most 10 times before stopping. The limit is needed because the certificate files themselves may be incorrect, and we want to avoid endless retries.
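
The retry logic described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual EdgeHub code: `import_certificate` and `check_private_key` are hypothetical stand-ins for the real import and key-access steps, and the sketch assumes the limit of 10 counts total import attempts.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Cap taken from the description above; assumed to count total import attempts.
MAX_IMPORT_ATTEMPTS = 10


def import_with_key_check(
    import_certificate: Callable[[], T],
    check_private_key: Callable[[T], None],
    max_attempts: int = MAX_IMPORT_ATTEMPTS,
) -> T:
    """Import a certificate, re-importing until its private key is usable.

    Both callables are hypothetical placeholders: `import_certificate` loads
    the certificate from the same files each time, and `check_private_key`
    raises an exception if the key is not associated with the certificate.
    """
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        certificate = import_certificate()
        try:
            # Touch the private key now, so a broken association shows up at
            # startup instead of on every client connection.
            check_private_key(certificate)
            return certificate
        except Exception as error:
            last_error = error
    # Stop after the cap: the certificate files themselves may be invalid.
    raise RuntimeError(
        f"private key still unusable after {max_attempts} import attempts"
    ) from last_error
```

Bounding the loop keeps a genuinely bad certificate from blocking startup forever, while still giving the intermittent import failure several chances to succeed.
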

Reference dotnetty.common 0.7.1 to force the use of this version, because it has a fix for a bug introduced in 0.7.0, which the client SDK uses. The bug causes the edgeHub process not to be terminated for certificate renewal when UpstreamProtocol is Mqtt.

Tested manually that edgeHub shutdown works when UpstreamProtocol is Mqtt and added e2e test for Mqtt.

## Azure IoT Edge PR checklist:

This checklist is used to make sure that common guidelines for a pull request are followed.

### General Guidelines and Best Practices
- [x] I have read the [contribution guidelines](https://github.com/azure/iotedge#contributing).
- [x] Title of the pull request is clear and informative.
- [x] Description of the pull request includes a concise summary of the enhancement or bug fix.

### Testing Guidelines
- [x] Pull request includes test coverage for the included changes.
- Description of the pull request includes 
	- [x] concise summary of tests added/modified
	- [x] local testing done.