Estuary Connectors

This repository hosts the development of connectors for use with Flow.

The source-* connectors all implement the Flow capture protocol and are used with Flow.

The materialize-* connectors all implement the Flow Materialize gRPC protocol, and only work with Flow.

All connectors in this repository are dual-licensed under MIT or Apache 2.0, at your discretion.

Developing a New Connector

Below are some considerations and tips for developing a new connector.

Writing integration tests is highly recommended, as it helps us avoid regressions as we develop connectors. See the tests directory for more information and examples.
- Try to make the tests comprehensive and include edge cases, such as different data types, different types of tables, and the limits of your connector (e.g. maximum character lengths).
You can use the provided base-image as the base for your connector's Docker images.

Capture Connectors

Check out Flow's protocol definitions, which include many comments explaining the interfaces and messages used to communicate between your connector and the Flow runtime: capture.proto
For connectors that work on files or file-like objects, the filesource library provides abstractions to reduce boilerplate work. For an example of a connector implemented using this library, see source-http-file.
For SQL captures, you can use the sqlcapture library. For an example, see source-postgres.
When emitting date-time values: if the connector's discovered schema declares format: date-time for a field, the emitted value must be RFC3339-compliant; otherwise parsing of the value will fail.
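As an illustration of the date-time requirement above, here is a minimal Go sketch that converts a timestamp in a common database-style layout into an RFC3339 string before it is emitted. The helper names (normalizeTimestamp, isRFC3339), the input layout, and the assumption that the source stores timestamps in UTC are all hypothetical and not part of any connector library in this repository.

```go
package main

import (
	"fmt"
	"time"
)

// sourceLayout is an assumed layout for values returned by a source system:
// no "T" separator and no zone offset, so it is not valid RFC3339.
const sourceLayout = "2006-01-02 15:04:05"

// isRFC3339 reports whether s parses as an RFC3339 timestamp.
func isRFC3339(s string) bool {
	_, err := time.Parse(time.RFC3339, s)
	return err == nil
}

// normalizeTimestamp parses raw using sourceLayout, interprets it as UTC
// (an assumption about the source system), and renders it as RFC3339.
func normalizeTimestamp(raw string) (string, error) {
	t, err := time.Parse(sourceLayout, raw)
	if err != nil {
		return "", fmt.Errorf("parsing %q: %w", raw, err)
	}
	// RFC3339Nano keeps fractional seconds when present and omits them otherwise.
	return t.UTC().Format(time.RFC3339Nano), nil
}

func main() {
	raw := "2024-03-05 14:07:09"
	fmt.Println(isRFC3339(raw)) // false

	out, err := normalizeTimestamp(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(out, isRFC3339(out)) // 2024-03-05T14:07:09Z true
}
```

If the source reports a zone offset, it is usually better to preserve it than to assume UTC as this sketch does.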

Materialization Connectors

Check out Flow's protocol definitions, which include many comments explaining the interfaces and messages used to communicate between your connector and the Flow runtime: materialize.proto
You need to choose the right pattern for materialization based on the transactional guarantees of your destination technology. This choice is important for upholding the exactly-once semantics expected from most connectors. See the comments in materialize.proto for more technical details.
- If your technology supports committing all data during the store phase as part of a transaction, then you can have your destination be authoritative about the checkpoint by also updating the checkpoint as part of the store phase transaction. For an example of this see materialize-postgres.
- If your technology does not support transactions but does support a retriable, idempotent store operation, then you can have the Flow Recovery Log be authoritative and rely on the idempotency of the operation to ensure exactly-once semantics even in cases of failure. The general idea is to record the operations run in the store phase in the checkpoint sent to the runtime. If the connector fails before it can successfully commit the checkpoint to the recovery log, then on its next start it can read the checkpoint and run the idempotent operations again as part of the Open phase. For an example of this see materialize-databricks.
- Some technologies support neither of these. In those cases we cannot guarantee exactly-once semantics, only at-least-once. For an example of this see materialize-google-pubsub.
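The idempotent-retry pattern from the second case can be sketched as a small in-memory simulation in Go. Everything here is hypothetical (the Operation, Checkpoint, and Destination types and the Open and runPending functions); it does not use the real materialize.proto messages, and it simplifies the ordering of checkpoint commits. It only illustrates why replaying recorded operations is safe when each operation is idempotent.

```go
package main

import "fmt"

// Operation is a hypothetical idempotent store operation: set Key to Value.
type Operation struct {
	Key   string
	Value string
}

// Checkpoint is a hypothetical connector checkpoint that records the
// operations staged during the store phase.
type Checkpoint struct {
	Pending []Operation
}

// Destination stands in for a non-transactional store.
type Destination struct {
	rows   map[string]string
	writes int // counts every Apply call, including replays
}

func NewDestination() *Destination {
	return &Destination{rows: map[string]string{}}
}

// Apply is idempotent: applying the same operation twice leaves the same state.
func (d *Destination) Apply(op Operation) {
	d.rows[op.Key] = op.Value
	d.writes++
}

// runPending applies at most limit operations from the checkpoint.
// A limit below len(cp.Pending) simulates a crash partway through.
func runPending(d *Destination, cp Checkpoint, limit int) {
	for i, op := range cp.Pending {
		if i >= limit {
			return
		}
		d.Apply(op)
	}
}

// Open replays every operation recorded in the recovered checkpoint.
func Open(d *Destination, recovered Checkpoint) {
	runPending(d, recovered, len(recovered.Pending))
}

func main() {
	dest := NewDestination()
	cp := Checkpoint{Pending: []Operation{{"a", "1"}, {"b", "2"}, {"c", "3"}}}

	// Simulated crash after the first operation was applied.
	runPending(dest, cp, 1)

	// On restart the full list is replayed; "a" is written twice but the
	// final state is identical to a single clean run.
	Open(dest, cp)
	fmt.Println(len(dest.rows), dest.rows["a"], dest.rows["b"], dest.rows["c"], dest.writes) // 3 1 2 3 4
}
```

The replay performs one redundant write, but the destination ends up with exactly three rows. An append-style operation, by contrast, would duplicate data on replay and could only offer at-least-once delivery.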
If your technology supports different modes of authentication, the recommended approach is to have a oneOf JSON schema under the key credentials. See materialize-databricks for an example.
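To make the shape of such a schema concrete, here is a minimal Go sketch that builds a config schema with a oneOf under credentials using only the standard library. The two authentication options, the auth_type discriminator field, and the helper names are hypothetical; they are not taken from materialize-databricks, so check the real connectors for the conventions actually used.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// authOption builds one branch of the oneOf. The constant auth_type field
// (a hypothetical name) lets a validator tell the branches apart.
func authOption(title, authType, secretField string) map[string]any {
	return map[string]any{
		"title": title,
		"type":  "object",
		"properties": map[string]any{
			"auth_type": map[string]any{"type": "string", "const": authType},
			secretField: map[string]any{"type": "string"},
		},
		"required": []any{"auth_type", secretField},
	}
}

// configSchema returns an illustrative endpoint config schema whose
// credentials property accepts exactly one of two authentication modes.
func configSchema() map[string]any {
	return map[string]any{
		"$schema": "http://json-schema.org/draft-07/schema#",
		"type":    "object",
		"properties": map[string]any{
			"credentials": map[string]any{
				"title": "Authentication",
				"type":  "object",
				"oneOf": []any{
					authOption("API Key", "api_key", "api_key"),
					authOption("Access Token", "access_token", "access_token"),
				},
			},
		},
		"required": []any{"credentials"},
	}
}

func main() {
	out, err := json.MarshalIndent(configSchema(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A constant discriminator in each branch keeps the branches mutually exclusive, which oneOf requires: a config that matched both branches would fail validation.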
We recommend using our schema-gen if you are using the Go invopop/jsonschema module to generate the JSON schema for your connector, as it has some extra quality-of-life improvements.
If you want to support network tunnelling access to your technology (at the moment that means connecting through an SSH bastion), you can use the network-tunnel library. See materialize-postgres for an example of its usage.
If you want to support Google OAuth and Service Account authentication methods, you can use the auth/google library.

SQL Materializations

The materialize-sql library abstracts away a lot of the logic shared among our SQL materializations. Connector developers need to implement the library's various interfaces. Standard implementations are also available, but they may not work with your destination technology. Many SQL materializations in this repository use this library, and you can check them to get an idea of how it works.
