Koober

An uber data pipeline sample app. Play Framework, Akka Streams, Kafka, Flink, Spark Streaming, and Cassandra.

Start Kafka:

./sbt kafkaServer/run

Web App:

Obtain an API key from mapbox.com
Start the Play web app: MAPBOX_ACCESS_TOKEN=YOUR-MAPBOX-API-KEY ./sbt webapp/run

Try it out:

Start Flink:

./sbt flinkClient/run
Initiate a few pickups and see the average pickup wait time change (in the stdout console for the Flink process)

Start Cassandra:

./sbt cassandraServer/run

Start the Spark Streaming process:

Setup PredictionIO Pipeline:

Set the PIO Access Key:

 export PIO_ACCESS_KEY=<YOUR PIO ACCESS KEY>

Copy demo data into Kafka or PIO:

For fake data, run:

./sbt "demoData/run <kafka|pio> fake <number of records> <number of months> <number of clusters>"

For New York data, run:

./sbt "demoData/run <kafka|pio> ny <number of months> <sample rate>"

Start the Demand Dashboard

PREDICTIONIO_URL=http://asdf.com MAPBOX_ACCESS_TOKEN=YOUR_MAPBOX_TOKEN ./sbt demandDashboard/run

Provide feedback