Hazelcast Jet Code Samples

A repository of code samples for Hazelcast Jet. The samples show you how to use the Pipeline API to solve a range of use cases, how to integrate Jet with other systems, and how to connect to various data sources (both Hazelcast IMDG and third-party systems). There is also a folder with samples that use the Core API.

Stream Aggregation

  • apply a sliding window
  • perform basic aggregation (counting)
  • print the results on the console (a sketch of this pipeline follows the list)
  • like the above, plus:
  • add a second-level aggregation stage to find top/bottom N results
  • apply a session window
  • use a custom Core API processor as the event source
  • perform a composite aggregate operation (apply two aggregate functions in parallel)
  • print the results on the console
  • use the SourceBuilder to create a mock source of trade events from a stock market
  • apply a tumbling window, configured to emit early results
  • aggregate by summing a derived value
  • present the results in a live GUI chart
  • use stateful mapping on an event stream to track the state of many concurrent transactions, detect when a transaction is done, and compute its duration
  • open a GUI window that shows the transaction status
  • use SourceBuilder to create a mock source of trade events from a stock market
  • simple rolling aggregation (summing the price)
  • keep updating the target map with the current values of aggregation
  • present the results in a live GUI chart
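
The shape of the simplest of these pipelines, sketched with Jet's bundled TestSources mock stream and a console logger sink (Jet 4.x Pipeline API assumed; the event rate and window sizes are illustrative):

```java
import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.WindowDefinition;
import com.hazelcast.jet.pipeline.test.TestSources;

public class SlidingWindowCount {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create();
        p.readFrom(TestSources.itemStream(100))         // mock source: 100 events per second
         .withNativeTimestamps(0)                       // events carry their own timestamps
         .window(WindowDefinition.sliding(1_000, 500))  // 1 s window sliding by 500 ms
         .aggregate(AggregateOperations.counting())     // basic aggregation: counting
         .writeTo(Sinks.logger());                      // print the results on the console

        JetInstance jet = Jet.bootstrappedInstance();
        jet.newJob(p).join();
    }
}
```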

Batch Aggregation

  • use an IMap as the data source
  • stateless transforms to clean up the input (flatMap + filter)
  • perform basic aggregation (counting); a word-count sketch follows this list
  • print a table of the most frequent words on the console using an Observable
  • serialize a small dataset to use as side input
  • fork a pipeline stage into two downstream stages
  • stateless transformations to clean up input
  • count distinct items
  • group by key, then group by secondary key
  • aggregate to a map of (secondary key -> result)
  • hash-join the forked stages
  • open an interactive GUI to try out the results
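
A word-count sketch along the lines of the first sample above, assuming Jet 4.x and an IMap named "lines" holding the input text (both map names are illustrative); for brevity it writes the counts to an IMap instead of printing a table through an Observable:

```java
import static com.hazelcast.jet.Traversers.traverseArray;

import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;

import java.util.Map.Entry;

public class WordCount {
    public static Pipeline build() {
        Pipeline p = Pipeline.create();
        p.readFrom(Sources.<Long, String>map("lines"))               // IMap: line number -> line text
         .flatMap((Entry<Long, String> e) ->
                 traverseArray(e.getValue().toLowerCase().split("\\W+"))) // split lines into words
         .filter(word -> !word.isEmpty())                            // drop empty tokens
         .groupingKey(word -> word)
         .aggregate(AggregateOperations.counting())                  // count occurrences per word
         .writeTo(Sinks.map("counts"));                              // IMap: word -> count
        return p;
    }
}
```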

Joins

  • co-group three bounded data streams on a common key
  • for each distinct key, emit the co-grouped items in a 3-tuple of lists (see the sketch after this list)
  • store the results in an IMap and check they are as expected
  • use the Event Journal of an IMap as a streaming source
  • apply a sliding window
  • co-group three unbounded data streams on a common key
  • print the results on the console
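
A two-way co-group sketch of the same idea (the three-way variant in the sample adds one more keyed stage); the map names and the Integer keys are illustrative:

```java
import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.pipeline.BatchStageWithKey;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;

import java.util.Map.Entry;

public class CoGroup {
    public static Pipeline build() {
        Pipeline p = Pipeline.create();

        // Two bounded sources keyed by the same user id
        BatchStageWithKey<Entry<Integer, String>, Integer> pageVisits =
                p.readFrom(Sources.<Integer, String>map("pageVisits")).groupingKey(Entry::getKey);
        BatchStageWithKey<Entry<Integer, String>, Integer> payments =
                p.readFrom(Sources.<Integer, String>map("payments")).groupingKey(Entry::getKey);

        // For each distinct key, emit a tuple of (all page visits, all payments)
        pageVisits.aggregate2(AggregateOperations.toList(),
                              payments,
                              AggregateOperations.toList())
                  .writeTo(Sinks.map("coGrouped"));
        return p;
    }
}
```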

Hash Join

Data Enrichment

  • the sample is in the enrichUsingIMap() method
  • use the Event Journal of an IMap as a streaming data source
  • apply the mapUsingIMap transform to fetch the enriching data from another IMap (see the sketch after this list)
  • enrich from two IMaps in two mapUsingIMap steps
  • print the results on the console
  • the sample is in the enrichUsingReplicatedMap() method
  • use the Event Journal of an IMap as a streaming data source
  • apply the mapUsingReplicatedMap transform to fetch the enriching data from a ReplicatedMap
  • enrich from two ReplicatedMaps in two mapUsingReplicatedMap steps
  • print the results on the console
  • prepare a data service: a gRPC-based network service
  • use the Event Journal of an IMap as a streaming data source
  • enrich the unbounded data stream by making async gRPC calls to the service
  • print the results on the console
  • the sample is in the enrichUsingHashJoin() method
  • use the Event Journal of an IMap as a streaming data source
  • use a directory of files as a batch data source
  • hash-join an unbounded stream with two batch streams in one step
  • print the results on the console
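
A sketch of the mapUsingIMap variant, assuming Jet 4.x, a "trades" IMap with the Event Journal enabled (tradeId -> ticker) and a "companies" IMap (ticker -> company name); both map names are illustrative:

```java
import static com.hazelcast.jet.pipeline.JournalInitialPosition.START_FROM_OLDEST;

import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;

import java.util.Map.Entry;

public class EnrichUsingIMap {
    public static Pipeline build() {
        Pipeline p = Pipeline.create();
        // Unbounded stream of (tradeId -> ticker) entries from the "trades" map's Event Journal
        p.readFrom(Sources.<Long, String>mapJournal("trades", START_FROM_OLDEST))
         .withoutTimestamps()
         // Look up the company name for each ticker in the "companies" IMap
         .mapUsingIMap("companies",
                 Entry::getValue,
                 (trade, companyName) -> trade.getValue() + " -> " + companyName)
         .writeTo(Sinks.logger());
        return p;
    }
}
```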

Return Results to the Caller

  • obtain an Observable
  • incorporate it in a streaming pipeline by wrapping it in a Sink
  • register an Observer on it
  • execute the pipeline (streaming job)
  • observe the results as they show up in the Observer (see the sketch after this list)
  • obtain an Observable
  • use it as Sink in a batch job
  • get an Iterator over the Observable's results
  • execute the batch job
  • observe the results by iterating once execution has finished
  • obtain an Observable
  • use it as Sink in a batch job
  • get the CompletableFuture form of the Observable
  • specify actions to be executed once the results are complete
  • execute the batch job
  • observe the results when they become available
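
The streaming variant in a nutshell, assuming Jet 4.x and the bundled TestSources mock stream in place of the samples' own sources:

```java
import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.Observable;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.test.SimpleEvent;
import com.hazelcast.jet.pipeline.test.TestSources;

public class ObserveResults {
    public static void main(String[] args) {
        JetInstance jet = Jet.bootstrappedInstance();

        // Client-side handle through which the job publishes its results
        Observable<Long> observable = jet.newObservable();
        observable.addObserver(v -> System.out.println("Got: " + v));

        Pipeline p = Pipeline.create();
        p.readFrom(TestSources.itemStream(10))       // mock source: 10 events per second
         .withoutTimestamps()
         .map(SimpleEvent::sequence)                 // extract the event's sequence number
         .writeTo(Sinks.observable(observable));     // wrap the Observable in a Sink

        jet.newJob(p);                               // results keep arriving at the Observer
    }
}
```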

Job Management

Integration with Hazelcast IMDG

Integration with Other Systems

Custom Sources and Sinks

  • Custom Source:
    • start an Undertow HTTP server that collects basic JVM stats
    • construct a custom Jet source based on Java 11 HTTP client (a SourceBuilder sketch follows this list)
    • apply a sliding window
    • compute linear trend of the JVM metric provided by the HTTP server
    • present the results in a live GUI chart
  • Custom Sink:
    • construct a custom Hazelcast ITopic sink
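
A SourceBuilder sketch in the same spirit, sampling the local JVM's used heap instead of polling an Undertow server over HTTP (the source name, 100 ms sampling interval and window sizes are illustrative); a custom ITopic sink would be built analogously with SinkBuilder:

```java
import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.SourceBuilder;
import com.hazelcast.jet.pipeline.StreamSource;
import com.hazelcast.jet.pipeline.WindowDefinition;

public class CustomSource {
    public static Pipeline build() {
        // Custom source emitting the JVM's used heap roughly every 100 ms;
        // the single-element array holds the time of the last emission.
        StreamSource<Long> usedMemory = SourceBuilder
                .stream("used-memory", ctx -> new long[1])
                .<Long>fillBufferFn((lastEmit, buf) -> {
                    long now = System.currentTimeMillis();
                    if (now - lastEmit[0] >= 100) {
                        lastEmit[0] = now;
                        Runtime rt = Runtime.getRuntime();
                        buf.add(rt.totalMemory() - rt.freeMemory());
                    }
                })
                .build();

        Pipeline p = Pipeline.create();
        p.readFrom(usedMemory)
         .withIngestionTimestamps()
         .window(WindowDefinition.sliding(10_000, 1_000))      // 10 s window, 1 s slide
         .aggregate(AggregateOperations.averagingLong(v -> v)) // average used memory per window
         .writeTo(Sinks.logger());
        return p;
    }
}
```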

Integration with Frameworks

Spring Framework