title says it all 😉
Tag Archives: tutorial
eliminate rollup’s null confusion (hint: grouping keyword)
rollup functions, such as cube, identify the rolled-up totals by putting null in the column they represent a total for – this gets rather confusing when the column itself has null values in the individual rows – this post will show you how to definitively differentiate between these two cases
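a minimal sketch of the idea in plain python (not the sql from the post itself): it mimics a one-column rollup and adds a flag that says whether a null means "rolled-up total" or "the data really was null". in sql engines such as trino, the `grouping` function plays this role. the sample rows and the `rollup_by_region` helper are hypothetical, made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (region, sales). One row has a genuinely null region.
rows = [("east", 10), ("west", 20), (None, 5), ("east", 7)]


def rollup_by_region(rows):
    """Return (region, grouping_flag, total) tuples, mimicking a rollup on region."""
    totals = defaultdict(int)
    for region, sales in rows:
        totals[region] += sales

    # Detail rows: flag 0 means region is a real grouped value, even when it is None.
    result = [(region, 0, total) for region, total in totals.items()]

    # Grand-total row: flag 1 means region is None only because it was rolled up.
    result.append((None, 1, sum(sales for _, sales in rows)))
    return result


result = rollup_by_region(rows)
for region, flag, total in result:
    label = "ALL REGIONS" if flag == 1 else region
    print(label, flag, total)
```

both the null-region row and the grand-total row carry `None` in the region position; only the flag tells them apart.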
querying aviation data in the cloud (leveraging starburst galaxy)
come along on a quick tutorial of loading some airline flight data into a cloud object store and analyzing it with the starburst galaxy sql engine in the sky
federated queries on starburst galaxy (long and short videos)
a long (and short) video of performing a federated join across s3, redshift, and mysql using trino-based starburst galaxy
querying starburst galaxy from tableau (super easy)
short post pointing to the youtube video i created showing how to use starburst galaxy to query data from tableau desktop
batch as a “special case” of flink streaming (yes, now we’re mv’ing streaming back to batch)
the third part of a loosely coupled trilogy on flink batch and streaming that takes us full circle with the collapse of the DataSet API into the DataStream API – i’m not sure Run-D.M.C. could make this less tricky
mv’ing batch flink to streaming (easy breezy)
building on a prior post, this tutorial ports a simple flink batch program to become a streaming solution – put lakeside on the turntable and let’s finish up the fantastic voyage
hello world with flink (from scratch)
come along and ride on a fantastic voyage where we will set up an apache flink environment, code up a very simple job, and execute it & verify our results – we’ll just slide, glide, slippity-slide
big data api’s look a lot alike (code comparison with flink, kafka, spark, trident and pig)
exploring the similarity of the APIs from flink, kafka streams, spark (RDDs & DFs), storm’s trident and yes, even good old pig by implementing the canonical word count solution with each framework
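for reference, here is a rough sketch of the canonical word count in plain python, laid out in the split / pair / sum-by-key shape that these frameworks tend to share. it is not code from the post and uses none of those frameworks; the `word_count` helper and the sample lines are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def word_count(lines):
    """Count words across lines using a flatMap -> map -> reduce-by-key shape."""
    # "flatMap" step: split every line into lowercase words.
    words = chain.from_iterable(line.lower().split() for line in lines)

    # "map" step: pair each word with a count of 1.
    pairs = ((word, 1) for word in words)

    # "reduce by key" step: sum the counts for each distinct word.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


counts = word_count(["the quick brown fox", "the lazy dog and the fox"])
print(counts)
```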
functional programming and big data (what a pair)
a high-level overview of how functional programming with immutable datasets is a great partner with big data processing frameworks — code examples with spark rdds using scala