eliminate rollup’s null confusion (hint: grouping keyword)

rollup functions, such as cube, mark their rolled-up totals by putting null in the column being totaled, which gets rather confusing when that column also has null values in the individual rows; this post shows you how to definitively differentiate between the two cases
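
as a quick taste of the idea, here is a minimal sketch and not the post's actual example: the `sales` table and its `region` and `amount` columns are hypothetical, and the `group by rollup (...)` form is the one trino and spark accept (hive traditionally writes `group by region with rollup`)

```sql
-- hypothetical table: sales(region, amount), where region can itself be null
select
  region,
  grouping(region) as is_total,  -- 1 on the rolled-up total row, 0 on ordinary group rows
  sum(amount)      as total_amount
from sales
group by rollup (region);
```

a row where `region` is null and `is_total` is 0 is a genuine null group from the data, while a null `region` with `is_total` of 1 is the rolled-up total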

hive, trino & spark features (their journeys to sql, performance & durability)

each big data sql engine is created to address something the existing ones did not focus on, but sooner or later they all start to look alike in their feature lists and observable behaviors

wrapping up my 8 year hortonworks – cloudera adventure (best job ever)

what an amazing eight years at hortonworks/cloudera — the technology, the focus, the use cases, the domains, the FUN and most importantly, the PEOPLE, made this the best job of my entire career and make it super hard to say goodbye to this role

batch as a “special case” of flink streaming (yes, now we’re mv’ing streaming back to batch)

the third part of a loosely coupled trilogy on flink batch and streaming that takes us full-circle with the collapse of the DataSet API into the DataStream API; i’m not sure Run-D.M.C. could make this less tricky