a smackdown of sorts pitting kafka streams, spark streaming, and storm against each other, not for the features they give developers, but for the features they offer the operations side of the devops formula
Tag Archives: data_engineering
presenting at hadoop summit (archiving evolving databases in hive)
overview of, and links to related artifacts for, my presentation at hadoop summit about strategies to handle changing data in hive’s immutable architecture
how do i load a fixed-width formatted file into hive? (with a little help from pig)
presents a couple of options for converting a fixed-width formatted file into a delimited one to prepare it to be exposed as a hive table
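
as a rough illustration of the conversion idea only (not the pig or hive approach the post itself walks through), here is a minimal python sketch. the function name, the column widths, the sample record, and the pipe delimiter are all assumptions made up for this example.

```python
def fixed_width_to_delimited(line, widths, delimiter="|"):
    """Slice one fixed-width record into fields and join them with a delimiter.

    `widths` is a hypothetical list of column widths, in file order.
    """
    fields = []
    pos = 0
    for width in widths:
        # Take the next `width` characters and drop the padding spaces.
        fields.append(line[pos:pos + width].strip())
        pos += width
    return delimiter.join(fields)


# Hypothetical record: 5-char id, 10-char name, 10-char date.
sample = "00042jane doe  2014-06-03"
print(fixed_width_to_delimited(sample, [5, 10, 10]))
```

running this over each line of a file would produce delimited text that a hive table with a matching field delimiter could read, assuming the delimiter never appears inside the data itself.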