Changelog

v0.3.12 (2022-07-28)

  • Bump PostGIS deps in Engine

v0.3.11 (2022-07-28)

  • Fix bug for Python 3.10 compatibility (#704)
  • sgr engine add password prompt clarification (#709)

v0.3.10 (2022-06-08)

  • Fix surrogate PK object ordering edge case, impacting some LQ mounts (#681).
  • Add new postgres_fdw params: use_remote_estimate and fetch_size (#678)
  • Add support for writable LQ to Splitfiles (#668)

v0.3.9 (2022-04-26)

Add support for writes when using layered querying (sgr checkout --layered) (#662). Note that schema changes to an image checked out in layered querying mode are currently unsupported.

Also add support for using writable layered querying in Splitfiles (#668).

v0.3.8 (2022-03-18)

  • Fix slow startup of sgr binary on Mac OS / Darwin (#656)
    • Add new release artifact sgr-osx-x86_64.tgz including executable sgr, shared libraries, and resources
    • Change install.sh on Darwin to default to the .tgz artifact (for previous behavior, set FORCE_ONEFILE=1)
    • Extract the archive to ~/.splitgraph/pkg/sgr and symlink ~/.splitgraph/sgr -> ~/.splitgraph/pkg/sgr/sgr
  • Add support for incremental loads to FDW plugins (#647)
    • Add a cursor_columns field to the table parameters of FDW data sources
  • Fix bug on Windows where sgr failed to locate .sgconfig in non-existent $HOME directory (#651) Thanks @harrybiddle!
    • Switch to cross-platform path expansion when adding home directory to config search paths
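
The Windows fix above comes down to resolving the home directory without reading $HOME directly. A minimal sketch of that idea, assuming Python's standard os.path.expanduser; the helper name config_search_paths and the search order are hypothetical and are not sgr's actual lookup code:

```python
import os


def config_search_paths(filename=".sgconfig"):
    """Return candidate config locations: current directory first, then home.

    Hypothetical helper for illustration only; not sgr's real lookup code.
    """
    # On Windows, expanduser can resolve the home directory from USERPROFILE,
    # so it does not depend on HOME being set.
    home = os.path.expanduser("~")
    return [
        os.path.join(os.getcwd(), filename),
        os.path.join(home, filename),
    ]
```

A caller would try each returned path in order and use the first one that exists.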

Starting from this version, all future releases will include sgr-osx-x86_64.tgz, which we recommend installing on Mac OS. The install.sh script will default to it.

Note: if you download an executable directly from this release page using a web browser, macOS will quarantine the file and refuse to execute it. Command-line HTTP clients like curl do not have this limitation, and the recommended installation method is to run the install.sh script included as part of every release. See the pull request adding sgr-osx-x86_64.tgz for more details.

v0.3.7 (2022-02-28)

  • Add Google BigQuery data plugin (#638)
  • Add Amazon Athena data plugin (#634)
  • Skip pushdown of aggregations with WHERE clause downcasting (splitgraph/Multicorn#6)

v0.3.6 (2022-02-02)

  • Fix libffi crashes when using the Snowflake FDW (#623)

v0.3.5 (2022-01-31)

  • Support for aggregation and GROUP BY pushdown for the Snowflake FDW
  • Fix FDW previews when column names have percentage signs (#619)

v0.3.4 (2022-01-21)

v0.3.3 (2022-01-05)

  • Minor fixes to sgr cloud seed (#604)
  • Add a flag to restore old sgr csv import behaviour (#605)

v0.3.2 (2021-12-30)

v0.3.1 (2021-12-20)

Fix sgr cloud sync invocation issue (#589)

v0.3.0 (2021-12-17)

This release fleshes out the splitgraph.yml (aka repositories.yml) format that defines a Splitgraph Cloud "project" (datasets, their sources and metadata).

Existing users of repositories.yml don't need to change anything, though note that sgr cloud commands using the YAML format will now default to splitgraph.yml unless explicitly set to repositories.yml.

New sgr cloud commands

See #582 and #587

These let users manipulate Splitgraph Cloud and ingestion jobs from the CLI:

  • sgr cloud status: view the status of ingestion jobs in the current project
  • sgr cloud logs: view job logs
  • sgr cloud upload: upload a CSV file to Splitgraph Cloud (without using the engine)
  • sgr cloud sync: trigger a one-off load of a dataset
  • sgr cloud stub: generate a splitgraph.yml file
  • sgr cloud seed: generate a Splitgraph Cloud project with a splitgraph.yml, GitHub Actions, dbt etc
  • sgr cloud validate: merge multiple project files and output the result (like docker-compose config)
  • sgr cloud download: download a query result from Splitgraph Cloud as a CSV file, bypassing time/query size limits.

splitgraph.yml

Change various commands that use repositories.yml to default to splitgraph.yml instead. Allow "mixing in" multiple .yml files Docker Compose-style, which is useful for keeping credentials (which should not be checked in) separate from data settings.
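
To illustrate what Docker Compose-style "mixing in" can look like, here is a minimal sketch of a recursive merge over already-parsed YAML documents (plain dicts). The function name merge_project_files and every key in the sample data are invented for this example and do not reflect the real splitgraph.yml schema; the merge rules sgr actually applies may differ.

```python
def merge_project_files(base, override):
    """Recursively merge two parsed documents; values from override win on conflict."""
    merged = dict(base)  # shallow copy so the inputs are not mutated
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_project_files(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical contents: data settings are checked in, credentials are kept local.
settings = {"repositories": {"my_repo": {"tables": ["orders"]}}}
credentials = {
    "credentials": {"my_db": {"password": "secret"}},
    "repositories": {"my_repo": {"schedule": "daily"}},
}

project = merge_project_files(settings, credentials)
```

Keeping the credentials in their own file means the checked-in file never contains secrets, while the merged result still describes the whole project.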

Temporary location for the new full documentation on splitgraph.yml: https://github.com/splitgraph/sgr.com/blob/f7ac524cb5023091832e8bf51b277991c435f241/content/docs/0900_splitgraph-cloud/0500_splitgraph-yml.mdx

Miscellaneous

  • Initial backend support for "transforming" Splitgraph plugins, including dbt (#574)
  • Dump scheduled ingestion/transformation jobs with sgr cloud dump (#577)

Full set of changes: v0.2.18...v0.3.0

v0.2.18 (2021-11-17)

  • Splitfile speedups (#567)
  • Various query speedups (#563, #561)
  • More robust CSV querying (#562)

Full set of changes: v0.2.17...v0.2.18

v0.2.17 (2021-10-14)

  • Code refactor / optimizations (#531)
  • Support for pluggable authorization logic (#542, #549)
  • FDW JSONSchema fixes (#545)
  • Upgrade pglast to 3.4 to fix issues with Splitfile validation (#534)
  • Speed up Splitfile builds (#550)

Full set of changes: v0.2.16...v0.2.17

v0.2.16 (2021-08-18)

  • Various Airbyte ingestion improvements and support for different normalization modes, including a custom dbt model (#510, #513, #514)
  • Fix mount for data source with empty credentials schema (#515)
  • Fix sgr cloud load/dump (#520)

Full set of changes: v0.2.15...v0.2.16

v0.2.15 (2021-07-26)

  • API functionality to get the raw URL for a data source (#457)
  • LQ scan / filtering simplification to speed up writes / Singer loads (#464, #489)
  • API functionality for Airbyte support (AirbyteDataSource class, #493)
  • Speed up sgr cloud load by bulk API calls (#500)

Full set of changes: v0.2.14...v0.2.15

v0.2.14 (2021-05-05)

  • Functionality to dump and load a Splitgraph catalog to/from a special repositories.yml format (#445)

Full set of changes: v0.2.13...v0.2.14

v0.2.13 (2021-04-14)

  • Various fixes to CSV inference and querying (#433)
  • Add customizable fetch size to the Snowflake data source (#434)
  • Fix issue with changing the engine password (#437)
  • Data source refactor (#438):
    • MySQL: parameter remote_schema has been renamed to dbname
    • Mongo: parameter coll has been renamed to collection; db to database
    • Table options are now a separate parameter that is passed to the
    • Introspection now returns a dictionary of tables and proposed table options OR error classes for tables that we weren't able to introspect (allowing for partial failures)
    • Mounting can now return a list of mount errors (caller can choose to ignore).
    • CSV data source: allow passing a partially initialized list of table options without a schema, making it introspect just those S3 keys and fill out the missing table options.
  • Postgres-level notices are now available in the PsycopgEngine.notices list after a run_sql invocation.
  • Multicorn: fix bug where server-level FDW options would override table-level FDW options.
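
The Multicorn fix above implies that table-level options should take precedence over server-level ones. A minimal sketch of that precedence rule, assuming options are held as plain dicts; effective_fdw_options is a hypothetical name and this is not Multicorn's actual code:

```python
def effective_fdw_options(server_options, table_options):
    """Combine FDW options so that table-level values override server-level ones."""
    # In a dict merge, later entries win, so table_options goes last.
    return {**server_options, **table_options}


# Hypothetical option values for illustration.
options = effective_fdw_options(
    {"host": "db.example.com", "fetch_size": "1000"},
    {"fetch_size": "50"},
)
```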

Full set of changes: v0.2.12...v0.2.13

v0.2.12 (2021-04-07)

  • Fixes to the Snowflake data source (#421)
  • Add automatic encoding, newline and dialect inference to the CSV data source (#432)

Full set of changes: v0.2.11...v0.2.12

v0.2.11 (2021-03-29)

  • Snowflake data source improvements:
    • Allow passing envvars to set HTTP proxy parameters, fix incorrect query string generation when passing a warehouse (#414, #413)
    • Support for authentication using a private key (#418)
  • Splitfiles: relax AST restrictions to support all SELECT/INSERT/UPDATE/DELETE statements (#411)
  • Change the default installation port to 6432 and handle port conflicts during install (#375)
  • Add retry logic to fix registry closing the SSL connection after 30 seconds, close remote connections in some places (#417)
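
The retry behaviour mentioned above can be sketched generically as follows. This is an assumption about the general technique, not the code from #417: the name call_with_retry, its parameters, and the choice of ConnectionError are all hypothetical.

```python
import time


def call_with_retry(operation, retries=3, delay=0.0, retry_on=(ConnectionError,)):
    """Call operation(), retrying up to `retries` attempts on the given exceptions."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            # Wait before the next attempt, but not after the final one.
            if attempt < retries - 1:
                time.sleep(delay)
    raise last_error
```

In a setting like the one described, operation would reopen the connection before re-running the query, since the server has already closed the old one.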

Full set of changes: v0.2.10...v0.2.11

v0.2.10 (2021-03-17)

  • Fix CSV schema inference not supporting BIGINT data types (#407)
  • Fix Splitfiles only expecting tags to contain alphanumeric characters (#407)
  • Speedups for the Snowflake / SQLAlchemy data source (#405)

Full set of changes: v0.2.9...v0.2.10

v0.2.9 (2021-03-12)

  • Add a Snowflake data source, backed by a SQLAlchemy connector (#404)

Full set of changes: v0.2.8...v0.2.9

v0.2.8 (2021-03-09)

  • Allow deleting tags on remote registries (#403)

Full set of changes: v0.2.7...v0.2.8

v0.2.7 (2021-03-09)

  • Fix MySQL plugin crashes on binary data types.

Full set of changes: v0.2.6...v0.2.7

v0.2.6 (2021-03-04)

  • Fix querying when there are NULLs in primary keys (#373)
  • Data source and foreign data wrapper for querying CSV files in S3 buckets and HTTP (#397)
  • Ctrl+C can now interrupt long-running PostgreSQL queries and stop sgr (#398)
  • Support for updating miscellaneous repository metadata from the sgr cloud metadata CLI (#399)

Full set of changes: v0.2.5...v0.2.6

v0.2.5 (2021-01-25)

  • Fix piping CSV files from stdin (#350)
  • Truncate commit comments if they're above the max field size (currently 4096) (#353)
  • Add support for updating repository topics from the CLI (sgr cloud metadata) (#371)

Full set of changes: v0.2.4...v0.2.5

v0.2.4 (2020-12-08)

  • Mount handlers are now called "data sources", a generalization that will make them more pluggable and support sources beyond FDWs. See #324 for more documentation and necessary steps to migrate.
  • Added sgr singer target, a Singer-compatible target that can read Singer tap output from stdin and build Splitgraph images. It's based on a fork of https://github.com/transferwise/pipelinewise-singer-python with additions that let us produce deltas and ingest them directly as Splitgraph objects.
  • Support for dynamically loading plugins without specifying them in .sgconfig, by looking up plugins in a certain directory (see #329)

Full set of changes: v0.2.3...v0.2.4

v0.2.3 (2020-09-16)

  • Socrata FDW now correctly emits IS NULL / IS NOT NULL, same with ES (using ES query syntax).
  • Fix array handling (a IN (1,2,3) queries get rewritten and pushed down correctly).
  • Output more query information in EXPLAIN for Socrata/LQ.

Full set of changes: v0.2.2...v0.2.3

v0.2.2 (2020-09-16)

  • Add ability to pass extra server args to postgres_fdw (extra_server_args)
  • Add ability to rename object files in-engine (utility function for some ingestion).
  • Allow disabling IMPORT FOREIGN SCHEMA and passing a table schema in Postgres/MySQL FDWs.
  • Add a fork (https://github.com/splitgraph/postgres-elasticsearch-fdw) of https://github.com/matthewfranglen/postgres-elasticsearch-fdw to sgr mount, letting others mount ES indexes. Fork changes:
    • Pass qualifiers as ElasticSearch queries using the query DSL (was using the query=... qual as a Lucene query string, which is useless in JOINs. Now we combine both the query implied from the quals and the Lucene query string, if passed)
    • Close the search context on end_scan (otherwise many ES queries to the FDW in a 10 minute span would cause it to error with a "too many scroll contexts" exception)
    • Add EXPLAIN support (outputs the used ES query)

Full set of changes: v0.2.1...v0.2.2

v0.2.1 (2020-09-02)

  • Add ability to skip config injection at the end of config-manipulating functions (pass -s) and don't fail if the Docker socket isn't reachable

Full set of changes: v0.2.0...v0.2.1

v0.2.0 (2020-08-18)

  • Introducing the Splitgraph Data Delivery Network: a single SQL endpoint to query all datasets hosted on or proxied by Splitgraph Cloud with any PostgreSQL client.
  • Extra sgr cloud commands:
  • Add daily update check to sgr.

Full set of changes: v0.1.4...v0.2.0

v0.1.4 (2020-07-19)

  • Various dependency bumps (including PostGIS)
  • Fix Splitfiles and sgr import not respecting the SG_COMMIT_CHUNK_SIZE envvar/config variable

Full set of changes: v0.1.3...v0.1.4

v0.1.3 (2020-06-27)

  • Fix Socrata querying for datasets with long column names (#268)

Full set of changes: v0.1.2...v0.1.3

v0.1.2 (2020-06-23)

  • Example for writing a custom FDW and integrating it with Splitgraph
  • Add dbt adapter that uses Splitgraph data and a sample dbt project
  • Socrata UX improvements
  • Command line parameters that require JSON now also accept @filename.json or @- for stdin
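
The @filename.json / @- convention above can be sketched as follows. This is a minimal illustration of how such a parameter might be parsed, not sgr's implementation; load_json_param and its stdin argument are hypothetical names.

```python
import io
import json
import sys


def load_json_param(value, stdin=None):
    """Parse a JSON CLI argument given inline, as @filename.json, or as @- for stdin."""
    stdin = stdin if stdin is not None else sys.stdin
    if value == "@-":
        return json.load(stdin)
    if value.startswith("@"):
        # Everything after "@" is treated as a path to a JSON file.
        with open(value[1:], encoding="utf-8") as f:
            return json.load(f)
    return json.loads(value)


# Usage: inline JSON, and an in-memory stream standing in for stdin.
inline = load_json_param('{"a": 1}')
piped = load_json_param("@-", stdin=io.StringIO("[1, 2]"))
```

Checking for "@-" before the generic "@" prefix matters, since otherwise "-" would be treated as a file name.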

Full set of changes: v0.1.1...v0.1.2

v0.1.1 (2020-06-12)

  • Fixed Socrata querying for datasets with columns that match keywords (e.g. first/last)

Full set of changes: v0.1.0...v0.1.1

v0.1.0 (2020-06-05)

  • Initial release.