wrapper post for two starburst deliverables (webinar & blog post) discussing why you should, or maybe shouldn’t, move from apache hive to apache iceberg for your data lake table format
Monthly Archives: May 2024
iceberg materialized views in galaxy (no más storage_schema)
starburst galaxy, as a saas offering, just keeps slipping in nice bits of features & functionality — this one tackles hiding the underlying storage table of an iceberg materialized view
recap of the inaugural iceberg summit (my top 5 observations)
tl;dr – iceberg is pervasive, the real fight is for the catalog, concurrent transactional writes are a bitch, append-only tables still rule, and trino is widely adopted
joining spark dataframes with identical column names (an easier way)
presenting an easier solution to colliding column names when joining spark dataframes than the one i offered in my most popular post, which just happens to be four years old (some things do age well)
pystarburst in 90 seconds (try it)
still thinking about trying to get a pystarburst code stub up and running? starburst galaxy makes it pain-free, and you can even create your first dataframe via python in under 90 seconds. why not give it a try?