starburst galaxy’s materialized views (using apache iceberg)

join me on a quick test drive of materialized views in starburst galaxy (a saas offering powered by trino), which use apache iceberg for persistence and offer some pretty cool capabilities around snapshots and awareness of stale data

eliminate rollup’s null confusion (hint: grouping keyword)

rollup functions, such as cube, identify the rolled-up totals by putting null in the column they represent a total for – this gets rather confusing when the column itself has null values in individual rows – this post will show you how to definitively differentiate between the two cases
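
to make the ambiguity concrete, here is a minimal sketch in plain python (not sql, and not the code from the post itself) that mimics a single-column rollup and adds a flag in the spirit of the grouping keyword. the sample rows and the `rollup_by_region` helper are made up for illustration.

```python
from collections import defaultdict

# hypothetical sample rows: (region, amount); one real row has a null region
rows = [("east", 10), ("west", 20), (None, 5), ("east", 7)]

def rollup_by_region(rows):
    """Mimic a rollup on region, plus a grouping-style flag column."""
    subtotals = defaultdict(int)
    for region, amount in rows:
        subtotals[region] += amount
    # flag 0: region is a genuine group value (even when it is None)
    result = [(region, 0, total) for region, total in subtotals.items()]
    # flag 1: region was rolled up, so its None stands for "all regions"
    result.append((None, 1, sum(amount for _, amount in rows)))
    return result

for region, grouping_flag, total in rollup_by_region(rows):
    print(region, grouping_flag, total)
```

two of the output rows carry None in the region column; only the flag tells you which one is the grand total and which one is the group of rows whose region really was null.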

batch as a “special case” of flink streaming (yes, now we’re mv’ing streaming back to batch)

the third part of a loosely coupled trilogy on flink batch and streaming that takes us full circle with the collapse of the DataSet API into the DataStream API; i’m not sure Run-D.M.C. could make this less tricky

big data apis look a lot alike (code comparison with flink, kafka, spark, trident and pig)

exploring the similarity of the APIs from flink, kafka streams, spark (RDDs & DFs), storm’s trident and yes, even good old pig by implementing the canonical word count solution with each framework
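
for reference, this is roughly the shape of the problem all of those frameworks are solving, sketched in plain python rather than in any of the frameworks named above. the `word_count` helper, the tokenizing regex and the sample lines are assumptions made for illustration.

```python
from collections import Counter
import re

def word_count(lines):
    """Count how often each word appears across all lines."""
    counts = Counter()
    for line in lines:
        # split each line into lowercase words (the "flat map" step),
        # then tally them per word (the "group by and sum" step)
        counts.update(re.findall(r"[a-z']+", line.lower()))
    return counts

sample = ["to be or not to be", "that is the question"]
print(word_count(sample).most_common(3))
```

the split, group and sum steps are the same ones each framework expresses with its own operators, which is why the implementations end up looking so alike.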