Reduce & simplify API
radeksimko committed Jun 10, 2022
1 parent b563dc8 commit 284fcc7
Showing 6 changed files with 255 additions and 489 deletions.
186 changes: 127 additions & 59 deletions README.md
@@ -1,48 +1,147 @@
# terraform-registry-address

This package helps with representation, comparison and parsing of
Terraform Registry addresses, such as
`registry.terraform.io/grafana/grafana` or `hashicorp/aws`.
This module enables parsing, comparison and canonical representation of
[Terraform Registry](https://registry.terraform.io/) **provider** addresses
(such as `registry.terraform.io/grafana/grafana` or `hashicorp/aws`)
and **module** addresses (such as `hashicorp/subnets/cidr`).

The most common source of these addresses outside of Terraform Core
is JSON representation of state, plan, or schemas as obtained
via [`hashicorp/terraform-exec`](https://github.com/hashicorp/terraform-exec).
**Provider** addresses can be found in

## Parsing Provider Addresses
- [`terraform show -json <FILE>`](https://www.terraform.io/internals/json-format#configuration-representation) (`full_name`)
- [`terraform version -json`](https://www.terraform.io/cli/commands/version#example) (`provider_selections`)
- [`terraform providers schema -json`](https://www.terraform.io/cli/commands/providers/schema#providers-schema-representation) (keys of `provider_schemas`)
- within the `required_providers` block in Terraform configuration (`*.tf`)

### Example
**Module** addresses can be found within the `source` argument
of a `module` block in Terraform configuration (`*.tf`);
parts of the address (namespace and name) also appear in the Registry API.

## Compatibility

The module assumes compatibility with Terraform v0.12 and later,
which have the mentioned JSON output produced by corresponding CLI flags.

If you work with versions `0.12` or `0.13`, we recommend carefully reading the
[ambiguous provider addresses](#Ambiguous-Provider-Addresses) section below.

## Related Libraries

Other libraries which may help with consuming most of the above Terraform
outputs in automation:

- [`hashicorp/terraform-exec`](https://github.com/hashicorp/terraform-exec)
- [`hashicorp/terraform-json`](https://github.com/hashicorp/terraform-json)

## Usage

### Provider

```go
p, err := ParseRawProviderSourceString("hashicorp/aws")
pAddr, err := ParseProviderSource("hashicorp/aws")
if err != nil {
// deal with error
}

// p == Provider{
// pAddr == Provider{
// Type: "aws",
// Namespace: "hashicorp",
// Hostname: svchost.Hostname("registry.terraform.io"),
// Hostname: DefaultProviderRegistryHost,
// }
```

### Legacy address
### Module

```go
mAddr, err := ParseModuleSource("hashicorp/consul/aws//modules/consul-cluster")
if err != nil {
// deal with error
}

// mAddr == Module{
// Package: ModulePackage{
// Host: DefaultModuleRegistryHost,
// Namespace: "hashicorp",
// Name: "consul",
// TargetSystem: "aws",
// },
// Subdir: "modules/consul-cluster",
// }
```

## Other Module Address Formats

Modules can also be sourced from [other sources](https://www.terraform.io/language/modules/sources)
and these other sources (outside of Terraform Registry)
have different address formats, such as `./local` or
`github.com/hashicorp/example`.

This library does _not_ recognize such other address formats
and returns an error when asked to parse them.
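
As a rough illustration of why such addresses are rejected, here is a simplified pre-check built from the two patterns in `module.go`. The helper `looksLikeRegistryModule` is hypothetical and not part of this library; it skips hostname validation and normalization, so treat it as a sketch rather than a substitute for `ParseModuleSource`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Patterns copied from module.go in this commit.
var namePattern = regexp.MustCompile("^[0-9A-Za-z](?:[0-9A-Za-z-_]{0,62}[0-9A-Za-z])?$")
var targetSystemPattern = regexp.MustCompile("^[0-9a-z]{1,64}$")

// looksLikeRegistryModule is a hypothetical, simplified pre-check.
// It is NOT the library's parser: hostnames are only checked for a dot.
func looksLikeRegistryModule(raw string) bool {
	// Drop an optional "//subdir" suffix.
	if i := strings.Index(raw, "//"); i >= 0 {
		raw = raw[:i]
	}
	parts := strings.Split(raw, "/")
	if len(parts) == 4 {
		// The optional leading hostname must contain at least one dot.
		if !strings.Contains(parts[0], ".") {
			return false
		}
		parts = parts[1:]
	}
	// What remains must be namespace/name/target-system.
	if len(parts) != 3 {
		return false
	}
	return namePattern.MatchString(parts[0]) &&
		namePattern.MatchString(parts[1]) &&
		targetSystemPattern.MatchString(parts[2])
}

func main() {
	for _, src := range []string{
		"hashicorp/subnets/cidr",
		"./local",
		"github.com/hashicorp/example",
	} {
		fmt.Printf("%-32s %v\n", src, looksLikeRegistryModule(src))
	}
}
```

`./local` fails because it has only two slash-separated components, and `github.com/hashicorp/example` fails because `github.com` is not a valid namespace (it contains a dot).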

## Ambiguous Provider Addresses

Qualified addresses with namespace (such as `hashicorp/aws`)
are used exclusively in all recent versions (`0.14+`) of Terraform.
If you only work with Terraform `v0.14.0+` configuration/output, you may
safely ignore the rest of this section and related part of the API.

There are a few types of ambiguous addresses you may come across:

- Terraform `v0.12` uses "namespace-less address", such as `aws`.
- Terraform `v0.13` may use `-` as a placeholder for the unknown namespace,
resulting in an address such as `-/aws`.
- Terraform `v0.14+` _configuration_ still allows ambiguous providers
through `provider "<NAME>" {}` block _without_ corresponding
entry inside `required_providers`, but these providers are always
resolved as `hashicorp/<NAME>` and all JSON outputs only use that
resolved address.

Both ambiguous address formats are accepted by `ParseProviderSource()`:

```go
pAddr, err := ParseProviderSource("aws")
if err != nil {
// deal with error
}

// pAddr == Provider{
// Type: "aws",
// Namespace: UnknownProviderNamespace, // "?"
// Hostname: DefaultProviderRegistryHost, // "registry.terraform.io"
// }
pAddr.HasKnownNamespace() // == false
pAddr.IsLegacy() // == false
```
```go
pAddr, err := ParseProviderSource("-/aws")
if err != nil {
// deal with error
}

A legacy address is by itself (without more context) ambiguous.
For example `aws` may represent either the official `hashicorp/aws`
or just any custom-built provider called `aws`.
// pAddr == Provider{
// Type: "aws",
// Namespace: LegacyProviderNamespace, // "-"
// Hostname: DefaultProviderRegistryHost, // "registry.terraform.io"
// }
pAddr.HasKnownNamespace() // == true
pAddr.IsLegacy() // == true
```

Such ambiguous address can be produced by Terraform `<=0.12`. You can
just use `ImpliedProviderForUnqualifiedType` if you know for sure
the address was produced by an affected version.
However `NewProvider()` will panic if you pass an empty namespace
or any placeholder indicating unknown namespace.

If you do not have that context you should parse the string via
`ParseRawProviderSourceString` and then check `addr.IsLegacy()`.
```go
NewProvider(DefaultProviderRegistryHost, "", "aws") // panic
NewProvider(DefaultProviderRegistryHost, "-", "aws") // panic
NewProvider(DefaultProviderRegistryHost, "?", "aws") // panic
```

#### What to do with a legacy address?
If you come across an ambiguous address, you should resolve
it to a fully qualified one and use that one instead.

Ask the Registry API whether and where the provider was moved to
### Resolving Ambiguous Address

(`-` represents the legacy, basically unknown namespace)
The Registry API provides the safest way of resolving an ambiguous address.

```sh
# grafana (redirected to its own namespace)
@@ -56,28 +155,15 @@ $ curl -s https://registry.terraform.io/v1/providers/-/aws/versions | jq '(.id, 
null
```

Then:

- Reparse the _new_ address (`moved_to`) of any _moved_ provider (e.g. `grafana/grafana`) via `ParseRawProviderSourceString`
- Reparse the full address (`id`) of any other provider (e.g. `hashicorp/aws`)
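
The choice between `moved_to` and `id` can be sketched in Go as follows. This is a minimal sketch under assumptions: `registryVersions` and `qualifiedAddress` are hypothetical names (not part of this library), only the two fields named above are decoded, and the sample payloads are illustrative rather than captured Registry responses.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// registryVersions holds the two response fields that matter here.
// The type name is hypothetical; the JSON keys follow the `id` and
// `moved_to` fields mentioned in the text.
type registryVersions struct {
	ID      string `json:"id"`
	MovedTo string `json:"moved_to"`
}

// qualifiedAddress is a hypothetical helper: it prefers `moved_to`
// (provider migrated to a new namespace) and falls back to `id`.
func qualifiedAddress(body []byte) (string, error) {
	var v registryVersions
	if err := json.Unmarshal(body, &v); err != nil {
		return "", err
	}
	if v.MovedTo != "" {
		return v.MovedTo, nil
	}
	if v.ID == "" {
		return "", errors.New("response contains neither id nor moved_to")
	}
	return v.ID, nil
}

func main() {
	// Illustrative payloads only; a JSON null leaves MovedTo empty.
	moved := []byte(`{"id":"old-namespace/grafana","moved_to":"grafana/grafana"}`)
	kept := []byte(`{"id":"hashicorp/aws","moved_to":null}`)

	for _, body := range [][]byte{moved, kept} {
		addr, err := qualifiedAddress(body)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(addr)
	}
}
```

The returned string is the one you would then hand to the provider address parser.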

Depending on context (legacy) `terraform` may need to be parsed separately.
Read more about this provider below.

If for some reason you cannot ask the Registry API you may also use
`ParseAndInferProviderSourceString` which assumes that any legacy address
(including `terraform`) belongs to the `hashicorp` namespace.

If you cache results (which you should), ensure you have invalidation
mechanism in place because target (migrated) namespace may change.
Hard-coding migrations anywhere in code is strongly discouraged.
When you cache results, ensure you have an invalidation
mechanism in place, as the target (migrated) namespace may change.
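
One simple form of invalidation is time-based expiry. The sketch below is an assumption about how a caller might do this, not something this library provides: `resolutionCache` and its methods are hypothetical names, and the TTL value is arbitrary.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry pairs a resolved address with the time it stops being trusted.
type entry struct {
	addr    string
	expires time.Time
}

// resolutionCache is a hypothetical helper, not part of this library.
// It maps an ambiguous address (e.g. "-/grafana") to a resolved one.
type resolutionCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]entry
}

func newResolutionCache(ttl time.Duration) *resolutionCache {
	return &resolutionCache{ttl: ttl, entries: map[string]entry{}}
}

// Put stores a resolution that stays valid for the cache's TTL.
func (c *resolutionCache) Put(ambiguous, resolved string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[ambiguous] = entry{addr: resolved, expires: time.Now().Add(c.ttl)}
}

// Get returns the cached resolution, or false if it is missing or expired.
// Expired entries are dropped so the caller asks the Registry again.
func (c *resolutionCache) Get(ambiguous string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[ambiguous]
	if !ok || !time.Now().Before(e.expires) {
		delete(c.entries, ambiguous)
		return "", false
	}
	return e.addr, true
}

func main() {
	cache := newResolutionCache(24 * time.Hour)
	cache.Put("-/grafana", "grafana/grafana")
	if addr, ok := cache.Get("-/grafana"); ok {
		fmt.Println(addr)
	}
}
```

A short TTL trades extra Registry requests for fresher results; which value is appropriate depends on how your tool is used.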

#### `terraform` provider

Like any other legacy address `terraform` is also ambiguous. Such address may
(most unlikely) represent a custom-built provider called `terraform`,
or the now archived [`hashicorp/terraform` provider in the registry](https://registry.terraform.io/providers/hashicorp/terraform/latest),
or (most likely) the `terraform` provider built into 0.12+, which is
or (most likely) the `terraform` provider built into 0.11+, which is
represented via a dedicated FQN of `terraform.io/builtin/terraform` in 0.13+.

You may be able to differentiate between these different providers if you
@@ -87,25 +173,7 @@ Alternatively you may just treat the address as the builtin provider,
i.e. assume all of its logic including schema is contained within
Terraform Core.

In such case you should just use `NewBuiltInProvider("terraform")`.

## Parsing Module Addresses

### Example

In such a case you should construct the address in the following way:
```go
registry, err := ParseRawModuleSourceRegistry("hashicorp/subnets/cidr")
if err != nil {
// deal with error
}

// registry == ModuleSourceRegistry{
// PackageAddr: ModuleRegistryPackage{
// Host: svchost.Hostname("registry.terraform.io"),
// Namespace: "hashicorp",
// Name: "subnets",
// TargetSystem: "cidr",
// },
// Subdir: "",
// },
pAddr := NewProvider(BuiltInProviderHost, BuiltInProviderNamespace, "terraform")
```
56 changes: 33 additions & 23 deletions module.go
@@ -9,14 +9,14 @@ import (
svchost "github.com/hashicorp/terraform-svchost"
)

// ModuleSourceRegistry is representing a module listed in a Terraform module
// Module represents a module listed in a Terraform module
// registry.
type ModuleSourceRegistry struct {
// PackageAddr is the registry package that the target module belongs to.
type Module struct {
// Package is the registry package that the target module belongs to.
// The module installer must translate this into a ModuleSourceRemote
// using the registry API and then take that underlying address's
// PackageAddr in order to find the actual package location.
PackageAddr ModuleRegistryPackage
// Package in order to find the actual package location.
Package ModulePackage

// If Subdir is non-empty then it represents a sub-directory within the
// remote package that the registry address eventually resolves to.
@@ -36,22 +36,22 @@ const DefaultModuleRegistryHost = svchost.Hostname("registry.terraform.io")
var moduleRegistryNamePattern = regexp.MustCompile("^[0-9A-Za-z](?:[0-9A-Za-z-_]{0,62}[0-9A-Za-z])?$")
var moduleRegistryTargetSystemPattern = regexp.MustCompile("^[0-9a-z]{1,64}$")

// ParseRawModuleSourceRegistry only accepts module registry addresses, and
// ParseModuleSource only accepts module registry addresses, and
// will reject any other address type.
func ParseRawModuleSourceRegistry(raw string) (ModuleSourceRegistry, error) {
func ParseModuleSource(raw string) (Module, error) {
var err error

var subDir string
raw, subDir = splitPackageSubdir(raw)
if strings.HasPrefix(subDir, "../") {
return ModuleSourceRegistry{}, fmt.Errorf("subdirectory path %q leads outside of the module package", subDir)
return Module{}, fmt.Errorf("subdirectory path %q leads outside of the module package", subDir)
}

parts := strings.Split(raw, "/")
// A valid registry address has either three or four parts, because the
// leading hostname part is optional.
if len(parts) != 3 && len(parts) != 4 {
return ModuleSourceRegistry{}, fmt.Errorf("a module registry source address must have either three or four slash-separated components")
return Module{}, fmt.Errorf("a module registry source address must have either three or four slash-separated components")
}

host := DefaultModuleRegistryHost
@@ -64,20 +64,20 @@ func ParseRawModuleSourceRegistry(raw string) (ModuleSourceRegistry, error) {
case strings.Contains(parts[0], "--"):
// Looks like possibly punycode, which we don't allow here
// to ensure that source addresses are written readably.
return ModuleSourceRegistry{}, fmt.Errorf("invalid module registry hostname %q; internationalized domain names must be given as direct unicode characters, not in punycode", parts[0])
return Module{}, fmt.Errorf("invalid module registry hostname %q; internationalized domain names must be given as direct unicode characters, not in punycode", parts[0])
default:
return ModuleSourceRegistry{}, fmt.Errorf("invalid module registry hostname %q", parts[0])
return Module{}, fmt.Errorf("invalid module registry hostname %q", parts[0])
}
}
if !strings.Contains(host.String(), ".") {
return ModuleSourceRegistry{}, fmt.Errorf("invalid module registry hostname: must contain at least one dot")
return Module{}, fmt.Errorf("invalid module registry hostname: must contain at least one dot")
}
// Discard the hostname prefix now that we've processed it
parts = parts[1:]
}

ret := ModuleSourceRegistry{
PackageAddr: ModuleRegistryPackage{
ret := Module{
Package: ModulePackage{
Host: host,
},

@@ -88,18 +88,18 @@ func ParseRawModuleSourceRegistry(raw string) (ModuleSourceRegistry, error) {
return ret, fmt.Errorf("can't use %q as a module registry host, because it's reserved for installing directly from version control repositories", host)
}

if ret.PackageAddr.Namespace, err = parseModuleRegistryName(parts[0]); err != nil {
if ret.Package.Namespace, err = parseModuleRegistryName(parts[0]); err != nil {
if strings.Contains(parts[0], ".") {
// Seems like the user omitted one of the latter components in
// an address with an explicit hostname.
return ret, fmt.Errorf("source address must have three more components after the hostname: the namespace, the name, and the target system")
}
return ret, fmt.Errorf("invalid namespace %q: %s", parts[0], err)
}
if ret.PackageAddr.Name, err = parseModuleRegistryName(parts[1]); err != nil {
if ret.Package.Name, err = parseModuleRegistryName(parts[1]); err != nil {
return ret, fmt.Errorf("invalid module name %q: %s", parts[1], err)
}
if ret.PackageAddr.TargetSystem, err = parseModuleRegistryTargetSystem(parts[2]); err != nil {
if ret.Package.TargetSystem, err = parseModuleRegistryTargetSystem(parts[2]); err != nil {
if strings.Contains(parts[2], "?") {
// The user was trying to include a query string, probably?
return ret, fmt.Errorf("module registry addresses may not include a query string portion")
@@ -110,6 +110,16 @@ func ParseRawModuleSourceRegistry(raw string) (ModuleSourceRegistry, error) {
return ret, nil
}

// MustParseModuleSource is a wrapper around ParseModuleSource that panics if
// it returns an error.
func MustParseModuleSource(raw string) Module {
mod, err := ParseModuleSource(raw)
if err != nil {
panic(err)
}
return mod
}

// parseModuleRegistryName validates and normalizes a string in either the
// "namespace" or "name" position of a module registry source address.
func parseModuleRegistryName(given string) (string, error) {
@@ -163,11 +173,11 @@ func parseModuleRegistryTargetSystem(given string) (string, error) {
// We typically use this longer representation in error message, in case
// the inclusion of normally-omitted components is helpful in debugging
// unexpected behavior.
func (s ModuleSourceRegistry) String() string {
func (s Module) String() string {
if s.Subdir != "" {
return s.PackageAddr.String() + "//" + s.Subdir
return s.Package.String() + "//" + s.Subdir
}
return s.PackageAddr.String()
return s.Package.String()
}

// ForDisplay is similar to String but instead returns a representation of
@@ -177,11 +187,11 @@ func (s ModuleSourceRegistry) String() string {
//
// We typically use this shorter representation in informational messages,
// such as the note that we're about to start downloading a package.
func (s ModuleSourceRegistry) ForDisplay() string {
func (s Module) ForDisplay() string {
if s.Subdir != "" {
return s.PackageAddr.ForDisplay() + "//" + s.Subdir
return s.Package.ForDisplay() + "//" + s.Subdir
}
return s.PackageAddr.ForDisplay()
return s.Package.ForDisplay()
}

// splitPackageSubdir detects whether the given address string has a