7.0.0

ibis-project-bot released this 02 Oct 17:04

7.0.0 (2023-10-02)

⚠ BREAKING CHANGES

  • api: the interpolation argument has been removed; it was only supported in the dask and pandas backends. For interpolated quantiles, use dask or pandas directly.
  • ir: Dask and Pandas only: cumulative operations that relied on implicit ordering from prior operations, such as calls to table.order_by, may no longer work. Pass order_by=... to the appropriate cumulative method to achieve the same behavior.
  • api: UUID, MACADDR and INET are no longer subclasses of strings. Cast those values to string to enable use of the string APIs.
  • impala: ImpalaTable.rename is removed, use Backend.rename_table instead.
  • pyspark: PySparkTable.rename is removed, use Backend.rename_table instead.
  • clickhouse: ClickhouseTable is removed. This class only provided a single insert method. Use the Clickhouse backend's insert method instead.
  • datatypes: The minimum version of sqlglot is now 17.2.0, to support much faster and more robust backend type parsing.
  • ir: ibis.expr.selectors module is removed, use ibis.selectors instead
  • api: passing a tuple or a sequence of tuples to table.order_by() calls is not allowed anymore; use ibis.asc(key) or ibis.desc(key) instead
  • ir: the ibis.common.validators module has been removed,
    along with all validation rules from ibis.expr.rules; use type hints
    or patterns from ibis.common.patterns instead

Features

  • api: add .delta method for computing difference in units between two temporal values (18617bf)
  • api: add ArrayIntersect operation and corresponding ArrayValue.intersect API (76c95b2)
  • api: add Backend.rename_table (0047143)
  • api: add levenshtein edit distance API (ab211a8)
  • api: add relocate table expression API for moving columns around based on selectors (ee8a86f)
  • api: add Table.rename, with support for renaming via keyword arguments (917d7ec)
  • api: add to_pandas_batches (740778f)
  • api: add support for referencing backend-builtin functions (76f5f4b)
  • api: implement negative slice indexing (caee5c1)
  • api: improve repr for deferred expressions containing Column/Scalar values (6b1218a)
  • api: improve repr of deferred functions (f2b3744)
  • api: support deferred and literal values in ibis.ifelse (685dbc1)
  • api: support deferred arguments in ibis.case() (6f9f7c5)
  • api: support deferred arguments to ibis.array (b1b83f9)
  • api: support deferred arguments to ibis.map (86c8669)
  • api: support deferred arguments to ibis.struct (7ef870d)
  • api: support deferred arguments to udfs (a49d259)
  • api: support deferred expressions in ibis.date (f454a71)
  • api: support deferred expressions in ibis.time (be1fd65)
  • api: support deferred expressions in ibis.timestamp (0e71505)
  • api: support deferred values in ibis.coalesce/ibis.greatest/ibis.least (e423480)
  • bigquery: implement array functions (04f5a11)
  • bigquery: use sqlglot to implement functional unnest to relational unnest (167c3bd)
  • clickhouse: add read_parquet and read_csv (dc2ea25)
  • clickhouse: add support for .sql methods (f1d004b)
  • clickhouse: implement builtin agg functions (eea679a)
  • clickhouse: support caching tables with the .cache() method (621bdac)
  • clickhouse: support reading parquet and csv globs (4ea1834)
  • common: match and replace graph nodes (78865c0)
  • datafusion: add coalesce, nullif, ifnull, zeroifnull (1cc67c9)
  • datafusion: add ExtractWeekOfYear, ExtractMicrosecond, ExtractEpochSeconds (5612d48)
  • datafusion: add join support (e2c143a)
  • datafusion: add temporal functions (6be6c2b)
  • datafusion: implement builtin agg functions (0367069)
  • duckdb: expose loading extensions (2feecf7)
  • examples: name examples tables according to example name (169d889)
  • flink: add batch and streaming mode test fixtures for Flink backend (49485f6)
  • flink: allow translation of decimal literals (52f7032)
  • flink: fine-tune numeric literal translation (2f2d0d9)
  • flink: implement ops.FloorDivide operation (95474e6)
  • flink: implement a minimal PyFlink Backend (46d0e33)
  • flink: implement insert dml (6bdec79)
  • flink: implement table-related ddl in Flink backend to support streaming connectors (8dabefd)
  • flink: implement translation of NULLIFZERO (6ad1e96)
  • flink: implement translation of ZEROIFNULL (31560eb)
  • flink: support translating typed null values (83beb7e)
  • impala: implement Backend.rename_table (309c999)
  • introduce watermarks in ibis api (eaaebb8)
  • just chat to open Zulip in terminal (95e164e)
  • patterns: support building sequences in replacement patterns (f320c2e)
  • patterns: support building sequences in replacement patterns (beab068)
  • patterns: support calling methods on builders like a variable (58b2d0e)
  • polars: implement new UDF API (becbf41)
  • polars: implement support for builtin aggregate udfs (c383f62)
  • polars: support reading ndjson (1bda3bd)
  • postgres: implement array functions (fe41d57)
  • postgres: implement array sort (4791cb4)
  • postgres: implement array union (6d3d518)
  • pyspark: enable reading csv and parquet globs and implement read_json (d487e10)
  • pyspark: enable the new scalar UDF API (f29a8e7)
  • pyspark: implement Backend.rename_table (0a8b201)
  • selectors: support column references in column selector (d4fae08)
  • snowflake: add ArrayRemove implementation (4f9d9f9)
  • snowflake: allow disabling creation of object UDFs (569aa12)
  • snowflake: handle glob patterns in read_csv, read_parquet and read_json (adb8f4c)
  • snowflake: implement ops.ArrayRepeat (a93cbd6)
  • snowflake: implement read_csv (3323156)
  • snowflake: implement read_json (ec870a2)
  • snowflake: implement read_parquet (e02888b)
  • snowflake: implement array sort (465fae1)
  • snowflake: support literal map key contains check (dbe7d4e)
  • sql: add database argument to list_schemas (22ceba7)
  • sqlalchemy: support builtin aggregate functions (3b27e23)
  • sqlite: implement caching support (0677f8d)
  • tests: support defining datatype nullability for hypothesis strategies (ff26fb8)
  • trino: cross-schema table support (9c7c65f)
  • udf: add support for builtin aggregate UDFs (8ee12bf)
  • udf: support inputs without type annotations (99e531d)
  • ux: promote lists of strings to any_of selectors (5e11529)

Bug Fixes

  • api: ensure that deferred objects cannot be converted into literals (b37804a)
  • api: ensure that normalization of booleans, ints and floats fails with a readable error message (556f7cc)
  • api: ensure the order of duplicate non-renamed columns in relocate is preserved (19a59aa)
  • api: fail on trying to construct an iterable of a deferred object (89bf919)
  • api: improve error message for bad arguments to Table.select (258a289)
  • api: support passing functools.partial objects to array .map/.filter methods (28f45d0)
  • bigquery: generate the correct temporal literal type based on the presence of timezone information (98a6ae0)
  • bigquery: quote struct field names in memtable when necessary (b1fcde8)
  • clickhouse: do not always prefix the table name with database, because temp tables cannot be assigned a database (5f88102)
  • clickhouse: list temporary tables with list_tables (758a875)
  • clickhouse: make sure that array1.union(array2) null handling matches across backends (8d42794)
  • clickhouse: workaround clickhouse_connect usage of removed APIs in pandas 2.1.0 (577599a)
  • clip: preserve nulls when clipping (c12dfa4)
  • common: pattern() factory should construct a CoercedTo(type) pattern from coercible types (09be2cd)
  • common: disallow plain string inputs for SequenceOf patterns (578980d)
  • common: disallow type coercion when checking for generic type fields (df63e8b)
  • common: support optional keyword-only parameters when validating callables (519a9e0)
  • datafusion: cast division inputs to float64 before dividing (197342d)
  • datatypes: decimal normalization failed for integers (5213958)
  • deps: update dependency datafusion to v28 (1a8b223)
  • deps: update dependency datafusion to v31 (fa0a8bd)
  • deps: update dependency pyarrow to v13 (43dc1e1)
  • deps: update dependency sqlglot to v18 (5fa0083)
  • drop: support deferred objects in calls to drop (d27374b)
  • druid: avoid double escaping percent-signs in strings (1d1f7bd)
  • druid: convert type strings to lowercase before looking up (4a838f7)
  • druid: ensure that string types are translated to VARCHAR (56e6ffc)
  • dtypes: switch scale and timestamp parameter order when formatting a timestamp datatype (302b122)
  • duckdb: load httpfs with read_csv from s3 (da1b95f)
  • duckdb: make sure that array1.union(array2) null handling matches across backends (849dea4)
  • duckdb: remove hack to workaround bug that was fixed upstream (310c521)
  • duckdb: workaround aggressive importing on the duckdb side (105e2d6)
  • flink: correct ops.RegexSearch translations (a3427a1)
  • flink: correct the translation of ops.Power (42d2236)
  • flink: correct translation of ops.IfNull op (85de81c)
  • flink: fix the pandas conversion in execute (2f6564f)
  • flink: fix translation of ops.TimestampDiff (580eff7)
  • flink: implement an in-memory table formatter (217a14b)
  • flink: remove broken, untested epochseconds (f18c760)
  • flink: rewrite ops.Clip using if statements (b7153ea)
  • flink: rewrite ops.Date as a cast operation (2470e81)
  • flink: translate ops.RandomScalar to rand (c485a92)
  • format: support rendering empty schemas (f8faada)
  • histogram: ensure that the bin width calculation matches numpy (e6a0037)
  • impala: allow arbitrary connection params (f251289)
  • mysql: handle null literals (79788c7)
  • oracle: clarify sid vs service_name handling and allow dsn (d4ea3bf)
  • oracle: ensure that metadata queries use SQL and not sqlplus-specific syntax (2c1bf93)
  • pandas: compatibility with 2.1 groupby behavior (ab3fc9e)
  • patterns: fix pattern mismatch error for default Pattern (f68079a)
  • patterns: support optional keyword arguments in CallableWith (a78aa60)
  • patterns: support passing mappings to Getitem builder (25864cf)
  • patterns: support string inputs for builder() (3610e52)
  • polars: polars no longer panics on a value_counts-ed expression (e14185a)
  • pyspark: default to inferring the schema of CSV files and assuming they have a header with header=True (0ffda75)
  • pyspark: gate datediff op to restore pyspark 3.2 support (4a8d611)
  • pyspark: gate other usage of DayTimeIntervalType for PySpark 3.2 (ab01de0)
  • remove pandas license (476a659)
  • repr: ensure that column expressions are not promoted to table when repring non-interactively (d57a162)
  • selectors: error when trying to select a non-existent column with s.c (ae3e76e)
  • snowflake: allow backend to choose how to prefix table names during compilation (933fb32)
  • snowflake: disable filter and map (53bc22e)
  • snowflake: ensure that lateral joins with newlines are also rewritten (dfd3c9b)
  • snowflake: ensure the correct compilation of tables from other databases and schemas (0ee68e2)
  • snowflake: fix timestamp scale inference (083bdae)
  • snowflake: use ibis-defined array_sort until upstream lands (6f7e13d)
  • sqlalchemy: ignore database when specified with temp=True (04461d5)
  • sql: avoid reselecting relations that do not need it to prevent dropping order by clauses (8ae2f03)
  • sqlglot: ensure back compat for DataTypeParam import (65851fc)
  • struct-column: make ops.StructColumn dshape depend on its input (7086d58)
  • trino: differentiate between a single column struct and a non-struct column (b1f1939)
  • type hints: improvements to type hints in ibis.expr (297b449)
  • type hints: remove NotImplemented from type hints, since it is not a valid type (57ea7a1)
  • type hints: various improvements to type hints in common (ff00347)

Documentation

  • add functools.partial and lambda closures to ArrayValue.map and ArrayValue.filter (e245e83)
  • add ibis.connect to top_level API docs (0d197e8)
  • add 404 page (8b2de41)
  • add 6.2.0 release notes (f5a2aed)
  • add back goatcounter to website (a0095bf)
  • add docs issue to navbar (ea41e53)
  • add extending how-to guides (0bad961)
  • add how-to for working with raw sql strings (3e08556)
  • add interactivity to altair example (0037fd8)
  • add interactivity to altair example on homepage (217f080)
  • add more redirects based on Google search console findings (fe890a4)
  • add proper zulip icon (f854327)
  • add some prose and move operation support matrix (48b7e34)
  • add UDF API documentation (5354689)
  • add v6.1.0 release blog (a66f7b7)
  • arrays: update blog post to include unnest examples (e765712)
  • backends: add more supported IO types (124f085)
  • backends: explain how to release a cursor and suggest using .sql instead (1e1a574)
  • basic starburst galaxy tutorial (a7a49ca)
  • blog: add bigquery arrays 7.0.0 blog post (8f2a40f)
  • blog: add tags to blog posts (1527655)
  • blog: embed the torch youtube video directly (70515ed)
  • blog: snowflake io (ee8c512)
  • bring back backend API documentation (df981d5)
  • bring back versioning policy doc (9dc8966)
  • clean up extending tutorials (8da58d4)
  • community to contribute (f02c2fb)
  • dark mode for life (2fa181a)
  • default to short signature but show full path for top level functions (9793862)
  • draft posts to draft and very minor edits (43499e3)
  • ensure that all parameters elements overflow with a scrollbar (5002d9f)
  • expose API under ibis when possible (bc05ced)
  • fix edit this page button (284b48a)
  • fix links from install to connect (9bf27fb)
  • fix numerics and move connection apis elsewhere (95bb2e0)
  • fix setuptools extra install style (d9ab537)
  • fix zulip link for new members (3732f46)
  • gitter -> zulip (ef79e64)
  • give reference docs a more organized layout (583af94)
  • hand roll datatypes.core APIs to avoid documenting private types (6ad8069)
  • import ibis in doctests (7f340de)
  • include full name of signatures (66a58e2), closes /github.com/ibis-project/ibis/pull/7159#issuecomment-1735845163
  • install: point to connect anchor instead of do_connect (d680b40)
  • IO: add missing word, add line breaks (f4fdfd3)
  • language: de-simple-fy prose in docs (0617271)
  • link to zulip in README.md (15112bb)
  • major home page refactor (a4e4569)
  • minor blog fixes/prose update (304edd1)
  • minor blog update (df14997)
  • minor consistency on capitalization of versioning concept (b838760)
  • minor fix in starburst tutorial (54473ba)
  • more redirects and add ibis 3.0.2 (8d900ee)
  • move column selectors closer to relevant expression page (ea3a090)
  • nix: add configuration notes to nix environment setup (1c60318)
  • only render file support methods once (98b348c)
  • port bigquery ci-analysis blog post to use the delta API (e543b1d)
  • quarto: add quartodoc interlinks filter (9f7a1ef)
  • quarto: make blog post titles visible in light mode (be2e95f)
  • quarto: override api code blocks with custom renderer (b504fee)
  • quarto: shorten method signature names (e37bfea)
  • quarto: remove temporary eval false setting (0b9e3a3)
  • refactor and move to quarto (487a5e5)
  • reference: fix ibis.ifelse() docstring (a80bb75)
  • reference: improve descriptions of sections (6a4924a)
  • reference: move collections API from global (4780536)
  • reference: move generic API from global (ba1f72e)
  • reference: move numeric/bool API from global (f3f23ac)
  • reference: move Table API from global (efcc2fb)
  • reference: move temporal API from global (c452fbb)
  • reference: move types API from global (662c509)
  • reference: rename Complex to Collection (194afa7)
  • reference: rename top-level to connection (9b9cd03)
  • reference: simplify title of Generic section (ef86165)
  • reference: soft-deprecate ibis.where (3c94f7b)
  • reference: sort numeric before strings (0df8bba)
  • remove duckdb code annotations (b0bcdde)
  • remove extra bits in zulip links (29680a3)
  • remove final instances of gitter (69f941a)
  • remove keywords (53a9f9f)
  • remove old S3 comment in impala docs (57c0596)
  • remove old-style schema construction from examples and docstrings (1b1c33a)
  • remove stray bracket (72c9039)
  • remove streamlit app on front page (d6d498e)
  • remove underscores that are not deferreds in doctest (5d300a9)
  • remove warning on front page (e68ec90)
  • set expectations for the impala backend (09c7678)
  • some how-to updates (6627016)
  • swap release notes and contribute (a242e31)
  • update poetry version (15e77f7)
  • update link to 'example repository' instead of 'tutorial' (c20d3ee)
  • update link to sqlalchemy tutorial (047aef7)
  • why ibis and other edits (a3c1c3f)

Refactors

  • add deferrable decorator (b09d978)
  • add type annotations to set operation functions (13f593b)
  • add types to Case and Window Builders (b85b424)
  • analysis: remove find_memtables function in favor of node.find() (c4658e7)
  • analysis: remove find_physical_tables() function in favor of node.find() (4daf2df)
  • analysis: remove is_analytic function in favor of node.find() (0452810)
  • analysis: remove ScalarAggregate, reduction_to_aggregation and has_multiple_bases (ed75866)
  • analysis: rewrite substitute_unbound to use the new pattern system (885d2ff)
  • api: remove deprecated tuple syntax for order_by() (57733e0)
  • api: remove interpolation argument (7c242af)
  • api: remove string as a parent type from expression API (2db98fb)
  • array-apply: adjust array map and array filter representation for easier non-recursive compilation (b91ecf0)
  • backends: adjust backends to work with new array representation (90befb2)
  • bigquery: make literals less messy (8d8ad87)
  • clickhouse: move ClickhouseTable.insert method to clickhouse backend and remove ClickhouseTable class (c9c72ae)
  • clickhouse: remove recursion from the compiler (ccbcdc0)
  • clickhouse: use more sqlglot constructs (c7ca7cd)
  • common: disallow None for Annotation.pattern in favor of using Any() (7434068)
  • common: factor out base classes to ibis.common.bases from ibis.common.grounds (01671d2)
  • common: ibis.common.patterns.match() should return with the matched value rather than the context (cbb9b2f)
  • common: improve error messages raised during validation (f95613a)
  • common: remove ibis.collections.DotDict (fedd4b1)
  • common: remove Validator mixin for better clarity (4697e7d)
  • common: remove ibis.common.parse since it is only used by the datatype parser (557414f)
  • common: restrict implicit traversals to common builtin collections (8531347)
  • common: turn annotations into slotted classes (0770e92)
  • datatypes: use sqlglot for parsing backend specific types (fe7ba24)
  • delete unexposed ibis.api.category_label function (24ac5e7)
  • examples: replace pooch with lighter weight pins (521669c)
  • flink: reorder registry to match SQL one (93dad5b)
  • flink: use built-in DEGREES, RADIANS (33518e9)
  • formats: turn TypeParser into a TypeMapper implementation for sqlglot (468bed1)
  • ir: construct ArrayContains instead of Contains for value.isin(array_value) (e826037)
  • ir: decompose Contains into InValues and InColumn (fe9a289)
  • ir: glue patterns and rules together (c20ba7f)
  • ir: remove deprecated ibis.expr.selectors module (d4161d7)
  • ir: rename .output_dtype and .output_shape to .dtype and .shape respectively (f9d5403)
  • ir: replace Cumulative operations by adding where, group_by and order_by kwargs to cumulative APIs (26ffc68)
  • ir: rewrite ibis.expr.format using node.map() (94ee679)
  • ir: use @annotated decorator to coerce Selection.order_by and Aggregation.order_by arguments (8b841c1)
  • mysql: use describe temporary table to retrieve ibis schema from query (a723637)
  • rename ops.Where to ops.IfElse (a64b7ad)
  • replace deprecated classes of type hints (25946f9)
  • snowflake: get query schema using describe of last query id (890d54a)
  • snowflake: remove unnecessary schema setting (9b0e6c8)
  • snowflake: replace custom temp table ddl for memtables with read_parquet (41df410)
  • snowflake: sort column names in the database instead of on the client (fb52814)
  • tests: move test_visualize.py to ibis/expr/tests (46d74ee)
  • tests: reorganize ibis.expr.decompile and ibis.expr.sql test files to be under the ibis.expr subpackage (d0d006e)
  • tests: reorganize operation related tests from ibis.tests.exprs to ibis.expr.operations.tests (3cbe2f3)
  • tests: simplify pattern matching tests on Value operations (d87e65a)
  • traverse builtin collections in deferrable (b5ee8f4)
  • use deferrable to implement deferred case statements (5577d51)

Performance

  • common: improve Concrete construction performance (2cb1a55)
  • duckdb: improve to_pyarrow performance (5970cfe)
  • duckdb: speed up metadata access to support the many-columns use case (2854143)
  • duckdb: use information_schema instead of describe select (ef7f69f)
  • introduce quicker abstract base classes (47822c6)
  • ops: early return if two nodes do not hash to the same value (b0b62cc)
  • ops: store schema on relation ops to avoid large traversals (0b49c96)
  • snowflake: speed up metadata accesses from the existing schema and database (f2ef129)

Deprecations

  • api: deprecate ibis.negate in favor of negate method (47cdbe8)
  • api: deprecate ibis.where in favor of ibis.ifelse (995c1bc)
  • api: deprecate Table.relabel in favor of Table.rename (dcd9772)
  • api: deprecate top-level ibis.geo_* functions in favor of their corresponding methods (71b7106)
  • api: replace nullifzero with ifnull and zeroifnull with fillna (ac85d11)