delta lake is a popular data lake table format, and both the trino engine and starburst galaxy integrate with it easily, all while using your favorite cloud provider’s object store thanks to galaxy’s great lakes connectivity
Tag Archives: acid
starburst galaxy’s materialized views (using apache iceberg)
join me on a quick test drive of materialized views in starburst galaxy (a saas offering powered by trino), which use apache iceberg for persistence and offer some pretty cool capabilities around snapshots and awareness of stale data
hive’s merge statement (it drops a lot of acid)
hive’s merge command provides another option for acid transactions beyond insert, update and delete; this post walks you through a simple example and looks at all the base, delta and delete_delta files created on the underlying filesystem to support this standard sql command
hive delta file compaction (minor and major)
a quick walk-thru of how minor and major compactions occur for hive transactional tables, ensuring all the delta files eventually roll into base ones
hive acid transactions with partitions (a behind the scenes perspective)
let’s take a deeper look at what happens under the hood of hive during “acid” activities such as insert, update and delete, including a look at the actual directories and orc files created