ClickHouse® is a real-time analytics DBMS
Updated May 24, 2024 - C++
YTsaurus is a scalable and fault-tolerant open-source big data platform.
A collection of my data science journey - projects, code, and notes.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Postgres for Search and Analytics
Data-Centric Pipelines and Data Versioning
Open-source BI for engineers
Server for the ListenBrainz project, including the front-end (JavaScript/React) code that it serves and all of the data processing components that ListenBrainz uses.
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Stroom is a highly scalable data storage, processing and analysis platform.
Open-source tools for interacting with IBM PAIRS.
SageWorks: An easy to use Python API for creating and deploying AWS SageMaker Models
An open source time-series database for fast ingest and SQL queries
Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
CovsirPhy: Python library for COVID-19 analysis with phase-dependent SIR-derived ODE models.