An Activity to generate DMR++ files from netCDF4 and HDF files

sliu008/dmrpp-generator

 
 

 ____  __  __ ____  ____  ____
|  _ \|  \/  |  _ \|  _ \|  _ \
| | | | |\/| | |_) | |_) | |_) |
| |_| | |  | |  _ <|  __/|  __/
|____/|_|  |_|_| \_\_|   |_|

Overview

The DMR++ file generator is a cloud-based activity that generates DMR++ files from netCDF4 and HDF files.

📖 Documentation

Versioning

We follow the v<major>.<minor>.<patch> versioning convention, where:

  • <major>+1 means we changed the infrastructure and/or the major components that make this software run. This will definitely lead to breaking changes.
  • <minor>+1 means we upgraded or patched the dependencies this software relies on. This can lead to breaking changes.
  • <patch>+1 means we fixed a bug and/or added a feature. Breaking changes are not expected.
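
As a rough illustration of the convention above, the following Python sketch compares two release tags and reports which level of risk the convention implies. The function names and the tag values used with it are hypothetical and are not part of this repository; the sketch only assumes that tags follow the v<major>.<minor>.<patch> pattern exactly.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


# Assumption: tags look exactly like "v1.2.3" (no suffixes such as "-rc1").
_TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> Version:
    """Split a v<major>.<minor>.<patch> tag into its integer components."""
    match = _TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a v<major>.<minor>.<patch> tag: {tag!r}")
    return Version(*(int(part) for part in match.groups()))


def upgrade_risk(current: str, target: str) -> str:
    """Describe the risk of moving between two tags, per the convention above."""
    old, new = parse_tag(current), parse_tag(target)
    if new.major != old.major:
        return "breaking changes expected"
    if new.minor != old.minor:
        return "breaking changes possible"
    if new.patch != old.patch:
        return "breaking changes not expected"
    return "same version"
```

For example, calling upgrade_risk with a hypothetical current tag and a candidate tag before changing <tag_num> in a deployment gives a quick hint about how carefully the upgrade should be tested.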

🔨 Prerequisites

This module is meant to run within a Cumulus stack. If you don't have a Cumulus stack deployed yet, please consult this repo and follow the documentation to provision it.

Deploying with Cumulus Stack

In your main.tf file (where you defined the cumulus module), add:

module "dmrpp-generator" {
  // Required parameters
  source      = "https://github.com/ghrcdaac/dmrpp-generator/releases/download/<tag_num>/dmrpp-generator.zip"
  cluster_arn = module.cumulus.ecs_cluster_arn
  region      = var.region
  prefix      = var.prefix // Cumulus stack prefix (an argument may only be set once per module block)

  // Optional parameters
  docker_image        = "ghrcdaac/dmrpp-generator:<tag_num>" // defaults to the correct release
  cpu                 = 800                                  // defaults to 800
  enable_cw_logging   = false                                // defaults to false (HCL booleans are lowercase)
  memory_reservation  = 900                                  // defaults to 900
  desired_count       = 1                                    // defaults to 1
  log_destination_arn = var.aws_log_mechanism                // defaults to null
}


In your variables.tf file, define:

variable "dmrpp-generator-docker-image" {
  default = "ghrcdaac/dmrpp-generator:<tag_num>"
}

This assumes you have already defined the region and prefix variables.

Add the activity to your workflow

In your workflow.tf, add:

   "HyraxProcessing": {
      "Parameters": {
        "cma": {
          "event.$": "$",
          "task_config": {
            "buckets": "{$.meta.buckets}",
            "distribution_endpoint": "{$.meta.distribution_endpoint}",
            "files_config": "{$.meta.collection.files}",
            "fileStagingDir": "{$.meta.collection.url_path}",
            "granuleIdExtraction": "{$.meta.collection.granuleIdExtraction}",
            "collection": "{$.meta.collection}"
          }
        }
      },
      "Type": "Task",
      "Resource": "${module.dmrpp-generator.dmrpp_task_id}",
      "Catch": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "ResultPath": "$.exception",
          "Next": "WorkflowFailed"
        }
      ],
      "Retry": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "IntervalSeconds": 2,
          "MaxAttempts": 3
        }
      ],
      "Next": "<Your next Step>"
    }

Where <Your next Step> is the next step in your workflow.

Cumulus Collection Configuration

Add the desired options to the collection definition as follows:

{
  "config": {
    "meta": {
      "dmrpp": {
        "options": [
          {
            "flag": "-M"
          },
          {
            "flag": "-s",
            "opt": "s3://ghrcsbxw-public/dmrpp_config/file.config",
            "download": "true"
          },
          {
            "flag": "-c",
            "opt": "s3://ghrcsbxw-public/aces1cont__1/aces1cont_2002.212_v2.50.tar.cmr.json",
            "download": "false"
          }
        ]
      }
    }
  }
}

For a list of all configuration options see: https://docs.opendap.org/index.php?title=DMR%2B%2B#:~:text=4.2%20Command%20line%20options

Cumulus Workflow Configuration

If your workflow is used by multiple collections that share a common dmrpp config, the config can be set at the workflow's ${StepName}.Parameters.cma.task_config.dmrpp instead of in each collection. Note: if the workflow and the collection both have a dmrpp key, the configurations are merged, with the collection's config overriding any keys found in both the workflow and the collection:

# terraform

dmrpp_config = {
  options = [
    {
      flag = "-M"
    },
    {
      flag = "-s"
      opt = "s3://ghrcsbxw-public/dmrpp_config/file.config"
      download = "true"
    },
    {
      flag = "-c"
      opt = "s3://ghrcsbxw-public/aces1cont__1/aces1cont_2002.212_v2.50.tar.cmr.json"
      download = "false"
    }
  ]
}

# workflow JSON
   "HyraxProcessing": {
      "Parameters": {
        "cma": {
          "event.$": "$",
          "task_config": {
            ...
            "dmrpp": ${jsonencode(dmrpp_config)}
          }
        }
      },

    ...
    }
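
The merge rule for workflow-level and collection-level dmrpp configuration can be illustrated with a small Python sketch. It assumes a shallow, key-by-key merge in which a key present in both places takes the collection's value as a whole; whether nested values such as the options list are combined more finely is not stated here. The function name merge_dmrpp_config and the key workflow_only_key are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, Optional


def merge_dmrpp_config(
    workflow_cfg: Optional[Dict[str, Any]],
    collection_cfg: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Shallow-merge two dmrpp configs; the collection wins on shared keys."""
    merged = dict(workflow_cfg or {})    # start from the workflow-level config
    merged.update(collection_cfg or {})  # collection keys override shared keys
    return merged


# Workflow-level config, as it might appear under task_config.dmrpp.
workflow_cfg = {
    "options": [{"flag": "-M"}],
    "workflow_only_key": "kept",  # hypothetical key, not a documented setting
}

# Collection-level config, as it might appear under config.meta.dmrpp.
collection_cfg = {
    "options": [
        {
            "flag": "-s",
            "opt": "s3://ghrcsbxw-public/dmrpp_config/file.config",
            "download": "true",
        }
    ],
}

merged = merge_dmrpp_config(workflow_cfg, collection_cfg)
print(merged)
```

Under this assumption, the collection's options list replaces the workflow's list entirely, while keys defined only at the workflow level are carried through unchanged.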
