building on a prior post, this tutorial ports a simple flink batch program to a streaming solution – put lakeside on the turntable and let’s finish up the fantastic voyage
Tag Archives: software_development
hello world with flink (from scratch)
come along and ride on a fantastic voyage where we will set up an apache flink environment, code up a very simple job, then execute it & verify our results – we’ll just slide, glide, slippity-slide
big data apis look a lot alike (code comparison with flink, kafka, spark, trident and pig)
exploring the similarity of the APIs from flink, kafka streams, spark (RDDs & DFs), storm’s trident, and yes, even good old pig, by implementing the canonical word count solution with each framework
functional programming and big data (what a pair)
a high-level overview of how functional programming with immutable datasets is a great partner for big data processing frameworks – code examples with spark rdds using scala
building a spark sql udf with scala (using multiple arguments)
a short & sweet code-focused tutorial declaring a scala function as a spark sql udf that can be leveraged via the api approach or in a formal sql statement
joining spark dataframes with identical column names (not just in the join condition)
a quick walkthru of spark sql dataframe code showing joining scenarios when both tables have columns with the same name; this includes when they are used in the join condition as well as when they are not
how do i load a fixed-width formatted file into hive? (with a little help from pig)
presents a couple of options for converting a fixed-width formatted file to a delimited one to prepare it to be exposed as a hive table
visiting the computer history museum (yes, i’m a geek)
pictures and observations from my visit to the computer history museum in mountain view, ca
are you a mort, elvis or einstein (or are these labels nonsense)?
a few simple personas represent the vast majority of software developers — which one are you?