java – Lester Martin (l11n)

becoming a data engineer (yet another top 10 list)

after a recent class i was asked what skills someone needs to become a data engineer – there are plenty of these lists all over the internet, yet here i go assuming i know enough to jot down yet another; at least i put mine all in a single picture 😉

batch as a “special case” of flink streaming (yes, now we’re mv’ing streaming back to batch)

the third part of a loosely coupled trilogy on flink batch and streaming that takes us full circle with the collapse of the DataSet API into the DataStream API – i’m not sure Run-D.M.C. could make this less tricky

mv’ing batch flink to streaming (easy breezy)

building on a prior post, this tutorial ports a simple flink batch program to become a streaming solution – put lakeside on the turntable and let’s finish up the fantastic voyage

hello world with flink (from scratch)

come along and ride on a fantastic voyage where we will set up an apache flink environment, code up a very simple job, then execute it & verify our results – we’ll just slide, glide, slippity-slide

big data api’s look a lot alike (code comparison with flink, kafka, spark, trident and pig)

exploring the similarity of the APIs from flink, kafka streams, spark (RDDs & DFs), storm’s trident and yes, even good old pig by implementing the canonical word count solution with each framework
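
for anyone who hasn’t met the canonical word count, here is a minimal sketch of its shape in plain java streams (no framework involved; the `WordCount` class and `count` method names are made up for illustration and are not taken from the post) – split lines into words, group identical words, count each group:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCount {

    // hypothetical helper: tokenize each line, then group and count the words
    public static Map<String, Long> count(String... lines) {
        return Arrays.stream(lines)
                // one line in, many lowercase words out
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                // split() can yield an empty leading token; drop it
                .filter(word -> !word.isEmpty())
                // group identical words and count each group (TreeMap keeps keys sorted)
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        count("to be or not to be", "that is the question")
                .forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

the frameworks named above should express roughly the same split / group / count steps with their own operators, which is the similarity the post is about.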

viewing the content of ORC files (using the Java ORC tool jar)

a quick tutorial on finding and using the orc java tool jar to peer into the contents of the otherwise non-human-readable orc file format