[WEBSITE] Blog post about DataFusion 13.0.0 #254
Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? Then could you also rename the pull request title in the following format?
249bf06 to 20bb42f
@andygrove mentions there is an unpublished draft blog post for DataFusion 11 that we can use for additional content: https://docs.google.com/document/d/1tPCgeB6iQPVvbRyaXft7nKqorrhv-XDqVXuMp4pG9ns/edit?usp=sharing
While Velox and Acero focus on execution engines, DataFusion provides the entire suite of components needed to build most analytic systems, including a SQL frontend, a dataframe API, and extension points for just about everything. Some [DataFusion users](https://github.com/apache/arrow-datafusion#known-uses) use a subset of the features, such as the frontend (e.g. [dask-sql](https://dask-sql.readthedocs.io/en/latest/)) or the execution engine (e.g. [Blaze](https://github.com/blaze-init/blaze)), while others use many different components to build both SQL-based and customized DSL-based systems, such as [InfluxDB IOx](https://github.com/influxdata/influxdb_iox/pulls) and [VegaFusion](https://github.com/vegafusion/vegafusion).
One of DataFusion’s advantages is its implementation in [Rust](https://www.rust-lang.org/) and thus its easy integration with the broader Rust ecosystem. Rust continues to be a major source of benefit, from the [ease of parallelization with the high-quality and standardized `async` ecosystem](https://www.influxdata.com/blog/using-rustlangs-async-tokio-runtime-for-cpu-bound-tasks/) to its modern dependency management system and wonderful performance. <!-- I wonder if we should link to clickbench?? -->
I wonder if we should link to clickbench??
Did the clickbench results get updated with 13.0? AFAIK we should be much faster than we were at the time of the initial integration (there was a lot of slowness coming from SelectK queries, plus a few other optimizations like regex_replace, which we should handle much better now). CC: @waitingkuo
It was the result from 12.0. I'll update it soon with the latest version.
Co-authored-by: Remzi Yang <59198230+HaoYang670@users.noreply.github.com>
Co-authored-by: Andy Grove <andygrove73@gmail.com>
Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
Co-authored-by: Andy Grove <andygrove73@gmail.com>
…ite into alamb/datafusion_update_13
I plan to update the dates on this PR and publish it tomorrow unless anyone needs more time to review. Please just let me know if you do.
LGTM!
re apache/datafusion#3671
This blog post describes what has been going on in DataFusion for the last 5 months.
Edit: URL location https://arrow.apache.org/blog/2022/10/25/datafusion-13.0.0/