Merge pull request #639 from crim-ca/add-cwl-secrets
fmigneault committed May 13, 2024
2 parents 34cd8f1 + 33df41e commit e06f75d
Showing 20 changed files with 522 additions and 52 deletions.
39 changes: 26 additions & 13 deletions .github/.gitleaks.toml
@@ -54,10 +54,22 @@ title = "gitleaks config"
regex = '''(?i)(api_key|apikey|secret)(.{0,20})?['|"][0-9a-zA-Z]{16,45}['|"]'''
tags = ["key", "API", "generic"]
[rules.allowlist]
description = "ignore old commit secret (v0.1.0)"
commits = ["11cdaf9bb4ffa9eb060ae58dd81268012fd60c28"]
paths = ['''magpie/security.py''']
regexes = ['''randomsecretstring''']
description = "ignore process ID containing secret as plain word"
commits = [
'6927799b84fac00ec582dbd946031d6547c5a898',
'ca1e0b0ac5ee9e0a676f7f29a14648688fcce9de',
'b49611bca182a952b7a91c0f56f73433ce444a24',
'6927799b84fac00ec582dbd946031d6547c5a898',
'ca1e0b0ac5ee9e0a676f7f29a14648688fcce9de',
'b49611bca182a952b7a91c0f56f73433ce444a24',
]
paths = [
'''tests/functional/test_workflow.py''',
]
regexes = [
'''EchoSecrets''',
'''WorkflowEchoSecrets''',
]
[[rules]]
description = "Google API key"
regex = '''AIza[0-9A-Za-z\\-_]{35}'''
@@ -107,12 +119,13 @@ title = "gitleaks config"
regex = '''(?i)twilio(.{0,20})?['\"][0-9a-f]{32}['\"]'''
tags = ["key", "twilio"]
[allowlist]
description = "Allowlisted files"
files = [
# original contents
'''^\.?gitleaks.toml$''',
'''(.*?)(jpg|gif|doc|pdf|bin)$''',
'''(go.mod|go.sum)$''',
# extra ignores
'''weaver/wps_restapi/examples/vault_file_uploaded.json''',
'''.+(.js.map)$''']
description = "Allowlisted files"
files = [
# original contents
'''^\.?gitleaks.toml$''',
'''(.*?)(jpg|gif|doc|pdf|bin)$''',
'''(go.mod|go.sum)$''',
# extra ignores
'''weaver/wps_restapi/examples/vault_file_uploaded.json''',
'''.+(.js.map)$'''
]
1 change: 1 addition & 0 deletions CHANGES.rst
@@ -12,6 +12,7 @@ Changes

Changes:
--------
- Add `CWL` ``cwltool:Secrets`` support (fixes `#511 <https://github.com/crim-ca/weaver/issues/511>`_).
- Add `CWL` ``StepInputExpressionRequirement`` support.

Fixes:
95 changes: 95 additions & 0 deletions docs/source/package.rst
@@ -1199,3 +1199,98 @@ Below is a list of compatible elements.
:trim:

.. |<=>| unicode:: 0x21D4

.. _app_pkg_secret_parameters:

Using Secret Parameters
=======================

Under some circumstances, input parameters to a :term:`Job` must be hidden, whether to avoid leaking credentials
required by the underlying application, or to protect sensitive information that should not be easily accessible.
In such cases, typical :term:`CWL` ``string`` inputs should not be employed directly.

There are two strategies available to employ *secrets* when working with `Weaver`:

1. Using the :term:`Vault` feature
2. Using the ``cwltool:Secrets`` hint

.. _app_pkg_secret_vault:

Secrets using the File Vault
----------------------------

Using :ref:`file_vault_inputs` essentially consists of wrapping any sensitive data within an input of type ``File``,
which will be :ref:`Uploaded to the Vault <vault_upload>` for :term:`Job` execution. Once the file is accessed and
staged by the relevant :term:`Job`, its contents are automatically deleted from the :term:`Vault`. This offers a
secured, single-access endpoint that is only available, for a short period of time, to the client that uploaded
the file. That client decides which :term:`Process` the file should be submitted to, using the corresponding
authentication token and :term:`Vault` ID. Since the sensitive data is contained within a file, its contents are
only available to the targeted :term:`Job` for the selected :term:`Process`, while logs will only display a
temporary path.

However, the :term:`Vault` approach has potential drawbacks.

1. It is a feature specific to `Weaver`, which will not be available or easily interoperable when involving
   other :term:`OGC API - Processes` servers.

2. It forces the :term:`CWL` to be implemented using a ``File`` input. While this is not necessarily an issue
   in some cases, it becomes the responsibility of the :term:`Application Package` developer to figure out how
   to propagate the contained data to the relevant piece of code if a plain string is needed. To do so, the
   developer must also avoid writing the data to ``stdout``. Otherwise, the data would be captured
   in :term:`Job` logs, defeating the purpose of using the :term:`Vault`.
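
As a minimal sketch of the second drawback, assuming the application receives the path of the staged file as its
first command-line argument, a script could load the secret and report only non-sensitive status information.
The ``read_secret`` and ``main`` names are hypothetical and are not part of `Weaver`.

```python
from pathlib import Path
from typing import Sequence


def read_secret(path: str) -> str:
    """Return the secret stored in the staged file, without surrounding whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()


def main(argv: Sequence[str]) -> int:
    # Assumption: argv[1] is the local path of the File input staged for the job.
    secret = read_secret(argv[1])
    # Use 'secret' here (e.g. to authenticate), but never print or log its value.
    # Only a non-sensitive status message is written to stdout.
    print("Secret loaded." if secret else "Secret file is empty.")
    return 0
```

Such a script would typically be invoked as ``sys.exit(main(sys.argv))``. Anything it prints ends up in the
:term:`Job` logs, which is why only the status message is written.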

.. note::
    For more details about the :term:`Vault`, refer to sections :ref:`file_vault_inputs`, :ref:`vault_upload`,
    and the corresponding capabilities in :ref:`cli_example_upload`.

.. _app_pkg_secret_cwltool:

Secrets using the CWL Hints
---------------------------

An alternative approach is to use the :term:`CWL` hints as follows:

.. code-block:: json
    :caption: CWL Secrets Definition

    {
      "cwlVersion": "v1.2",
      "inputs": {
        "<input-name>": {
          "type": "string"
        }
      },
      "hints": {
        "cwltool:Secrets": {
          "secrets": [
            "<input-name>"
          ]
        }
      },
      "$namespaces": {
        "cwltool": "http://commonwl.org/cwltool#"
      }
    }

Using this definition either in a ``class: CommandLineTool`` (see :ref:`app_pkg_cmd`)
or a ``class: Workflow`` (see :ref:`app_pkg_workflow`) will instruct the underlying :term:`Job` execution
to mask all specified inputs (i.e.: ``<input-name>`` in the above example) in commands and logs.
Looking at :term:`Job` logs, all sensitive inputs will be replaced by a representation similar to ``(secret-<UUID>)``.
The original data identified by this masked definition will be substituted back only at the last possible moment,
when the underlying operation accesses it to perform its processing.
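
The masking behaviour described above can be illustrated with a small sketch. This is not the actual
implementation used by the :term:`CWL` runner. The ``MaskingStore`` class and its methods are hypothetical
names that only show the idea of logging a ``(secret-<UUID>)`` placeholder and restoring the real value
just before execution.

```python
import uuid
from typing import Dict


class MaskingStore:
    """Hypothetical helper that swaps secret strings for opaque placeholders."""

    def __init__(self) -> None:
        self._secrets: Dict[str, str] = {}

    def mask(self, value: str) -> str:
        """Store a secret and return its '(secret-<UUID>)' placeholder."""
        placeholder = f"(secret-{uuid.uuid4()})"
        self._secrets[placeholder] = value
        return placeholder

    def unmask(self, text: str) -> str:
        """Substitute every known placeholder found in 'text' by its original value."""
        for placeholder, value in self._secrets.items():
            text = text.replace(placeholder, value)
        return text


store = MaskingStore()
masked = store.mask("secret message")
command = ["python", "echo.py", masked]
logged = " ".join(command)  # what a log line would contain: the placeholder only
real_command = [store.unmask(arg) for arg in command]  # resolved right before execution
```

In this sketch, ``logged`` never contains the original string, while ``real_command`` carries it to the
process that needs it. Whatever that process then writes to ``stdout`` is outside the control of the store.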

A few notable considerations must be taken into account when using the ``cwltool:Secrets`` definition.

1. It is limited to ``string`` inputs. Any other literal data type and intermediate conversions would need
   to be handled explicitly by the :term:`Application Package` maintainer.

2. The secrets definition can only be provided in the ``hints`` section of the :term:`CWL` document, meaning
   that a remote server supporting :term:`CWL` is not required to support this feature.
   If the :term:`Application Package` is expected to be deployed remotely, it is up to the client to
   determine whether the remote server will perform the necessary actions to mask sensitive data.
   If unsupported, secrets could become visible in the :term:`Job` logs as if they were submitted using
   typical ``string`` inputs.

3. The feature does not prevent misuse of underlying commands that could expose the sensitive data due
   to manipulation errors or the use of operations that are redirected to ``stdout``. For example, if the
   shell ``echo`` command is used within the :term:`CWL` with an input listed in ``cwltool:Secrets``, its
   value will still be displayed in plain text in the :term:`Job` logs.
8 changes: 8 additions & 0 deletions tests/functional/application-packages/EchoSecrets/deploy.yml
@@ -0,0 +1,8 @@
# YAML representation supported by WeaverClient
processDescription:
  process:
    id: EchoSecrets
executionUnit:
  # note: This does not work by itself! The test suite injects the file dynamically.
  - href: "tests/functional/application-packages/EchoSecrets/package.cwl"
deploymentProfileName: "http://www.opengis.net/profiles/eoc/dockerizedApplication"
2 changes: 2 additions & 0 deletions tests/functional/application-packages/EchoSecrets/execute.yml
@@ -0,0 +1,2 @@
inputs:
  message: "secret message"
37 changes: 37 additions & 0 deletions tests/functional/application-packages/EchoSecrets/package.cwl
@@ -0,0 +1,37 @@
cwlVersion: "v1.2"
class: CommandLineTool
# WARNING:
#   Use a script instead of the 'echo' command, which would write the secret to stdout and be displayed!
#   A script using secrets could have other standard output messages that are relevant to log.
#   It is up to the application developer to make sure they do not echo their own secrets...
# baseCommand: echo
baseCommand: python
arguments: ["echo.py"]
requirements:
  DockerRequirement:
    dockerPull: "docker.io/python:3-slim"
  InitialWorkDirRequirement:
    listing:
      - entryname: echo.py
        entry: |
          import sys
          with open("out.txt", mode="w", encoding="utf-8") as f:
              f.write(sys.argv[1])
          print("OK!")  # print on purpose to test stdout includes only this, and not the secret input
hints:
  cwltool:Secrets:
    secrets:
      - message
$namespaces:
  cwltool: http://commonwl.org/cwltool#
inputs:
  message:
    type: string
    inputBinding:
      position: 1
outputs:
  output:
    type: File
    outputBinding:
      glob: "out.txt"
stdout: "stdout.log"
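
A check in the spirit of this test package could scan captured log lines for plain-text secrets. The sketch
below is an illustration only and is not taken from the test suite of this commit. The ``find_leaks`` helper
and the sample log lines (including the ``(secret-1234)`` placeholder) are hypothetical.

```python
from typing import Iterable, List, Tuple


def find_leaks(log_lines: Iterable[str], secrets: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs for every log line that contains a secret value."""
    secret_values = [value for value in secrets if value]  # ignore empty strings
    return [
        (number, line)
        for number, line in enumerate(log_lines, start=1)
        if any(value in line for value in secret_values)
    ]


# Hypothetical log excerpts: one with the masked value only, one where the tool echoed its input.
safe_logs = ["python echo.py (secret-1234)", "OK!"]
leaky_logs = ["echo (secret-1234)", "secret message"]
safe_result = find_leaks(safe_logs, ["secret message"])
leaky_result = find_leaks(leaky_logs, ["secret message"])
```

With these sample inputs, ``safe_result`` is empty, while ``leaky_result`` points at the second line, which is
the situation the warning in ``package.cwl`` describes for a plain ``echo`` command.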
@@ -0,0 +1,7 @@
processDescription:
  process:
    id: WorkflowEchoSecrets
executionUnit:
  # note: This does not work by itself! The test suite injects the file dynamically.
  - test: "tests/functional/application-packages/WorkflowEchoSecrets/package.cwl"
deploymentProfileName: "http://www.opengis.net/profiles/eoc/workflow"
@@ -0,0 +1,2 @@
inputs:
  message: "secret message"
@@ -0,0 +1,23 @@
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: Workflow
doc: Workflow that calls the echo process with secrets feature applied on the input.
hints:
  cwltool:Secrets:
    secrets:
      - message
$namespaces:
  cwltool: http://commonwl.org/cwltool#
inputs:
  message: string
outputs:
  output:
    type: File
    outputSource: echo/output
steps:
  echo:
    run: EchoSecrets.cwl
    in:
      message: message
    out:
      - output
