exploring the similarity of the APIs from flink, kafka streams, spark (RDDs & DFs), storm’s trident and, yes, even good old pig by implementing the canonical word count solution with each framework
Tag Archives: pig
viewing the content of ORC files (using the Java ORC tool jar)
a quick tutorial on finding and using the orc java tool jar to peer into the contents of the otherwise non-human-readable orc file format
presenting at hadoop summit (archiving evolving databases in hive)
overview of, and links to related artifacts for, my presentation at hadoop summit about strategies for handling changing data in hive’s immutable architecture