diff --git a/arrow/src/compute/README.md b/arrow/src/compute/README.md
new file mode 100644
index 00000000000..761713a531b
--- /dev/null
+++ b/arrow/src/compute/README.md
@@ -0,0 +1,48 @@
+
+
+## Apache Arrow Rust Compute Kernels
+
+This module contains analytical kernels that primarily process Arrow
+columnar data; some kernels can process scalar or Arrow-based array
+inputs. These are intended for use inside query engines, data frame libraries,
+etc.
+
+Many kernels have SQL-like semantics in that they perform elementwise or
+scalar operations on whole arrays at a time. Other kernels are not SQL-like
+and compute results that may be a different length or whose results depend on
+the order of the values.
+
+We use the term "kernel" to refer to a particular general operation that
+comprises many different functions corresponding to different combinations of
+types or function behavior options.
+
+Types of functions
+
+* Scalar functions: elementwise functions that perform scalar operations in a
+  vectorized manner. These functions are generally valid in a SQL-like
+  context. They are called "scalar" because they consider each value in an
+  array independently, and the output array or arrays have the same length as
+  the input arrays. The result for each array cell is generally independent of
+  its position in the array.
+* Vector functions: functions whose result generally depends on the entire
+  contents of the input arrays. These functions **are generally not valid**
+  for SQL-like processing because the output size may differ from the input
+  size, and the result may change based on the order of the values in the
+  array. This includes things like array subselection, sorting, hashing, and
+  more.
+* Scalar aggregate functions, which can be used in a SQL-like context
\ No newline at end of file
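
To make the three categories in the README concrete, here is a minimal sketch in plain Rust. It deliberately does not use the Arrow API: the function names (`add_scalar`, `sort_indices`, `filter`, `sum`) are hypothetical, and a slice of `Option<i64>` stands in for a nullable Arrow array, with `None` playing the role of a null slot.

```rust
// Illustrative only: plain std Rust, not the arrow crate.
// `None` stands in for a null slot in a nullable array.

/// Scalar function: each value is handled independently, and the
/// output has the same length as the input.
fn add_scalar(values: &[Option<i64>], rhs: i64) -> Vec<Option<i64>> {
    values.iter().map(|v| v.map(|x| x + rhs)).collect()
}

/// Vector function: the result depends on the whole input.
/// Returns the indices that would sort the values (nulls first,
/// because `None` orders before `Some` for `Option`).
fn sort_indices(values: &[Option<i64>]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..values.len()).collect();
    idx.sort_by_key(|&i| values[i]);
    idx
}

/// Vector function: subselection by a boolean mask. The output may be
/// shorter than the input.
fn filter(values: &[Option<i64>], mask: &[bool]) -> Vec<Option<i64>> {
    values
        .iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(v, _)| *v)
        .collect()
}

/// Scalar aggregate: reduces the whole array to a single value,
/// skipping nulls. Returns `None` when there are no non-null values.
fn sum(values: &[Option<i64>]) -> Option<i64> {
    values.iter().flatten().copied().reduce(|a, b| a + b)
}
```

In this sketch, `add_scalar` would remain valid if the rows were reordered or split into batches, while `sort_indices` and `filter` would not, which is the distinction the README draws between scalar and vector functions.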