Big Data – Page 2 – Lester Martin (l11n)

building scalar udf’s w/sql for trino (aka sql routines)

check out this quick set of simple examples showing how easily you can create sql-based user-defined functions (udf), formally referred to as trino sql routines, to allow more succinct queries and offer reusability

apache iceberg table maintenance (is_current_ancestor part deux)

as a follow-on to my earlier post about iceberg versioning (and the is_current_ancestor flag), i thought it would be useful to show working examples of the maintenance activities that are needed to manage the sprawl of data lake files that come with more and more versions

becoming a data engineer (yet another top 10 list)

after a recent class i was asked what skills someone needs to become a data engineer – there are plenty of these lists all over the internet, yet here i go assuming i know enough to jot down yet another; at least i put mine all in a single picture 😉

iceberg snapshot is_current_ancestor flag (what does it tell us)

i’ve noticed the is_current_ancestor column of the apache iceberg $history metadata table for a while now – it wasn’t until i got a direct question about it that i realized it was time to find out for sure

dbt cloud & starburst galaxy workshop (beta testers welcome)

interested in building a data pipeline with dbt cloud and starburst galaxy? if so, then this post presents recorded videos of 7 lab exercises plus the lab guide itself so you can work through them on your own & at your own pace

z-order (visualized)

when asked to compare sort-by with z-order for data lake tables, i realized i finally needed a better understanding of what z-order is all about; my goal with this blog post is to present a simplified visualization of what’s going on and how it can help
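
as a rough taste of the idea (a minimal sketch, not the code from the post itself), z-order is commonly computed by interleaving the bits of two column values into a single sort key, so rows that are close in both columns tend to land near each other; the function name interleave_bits, the 4-bit width, and the tiny 4x4 grid below are all assumptions made up for illustration

```python
def interleave_bits(x: int, y: int, bits: int = 4) -> int:
    """Return a z-order (Morton) key for two small non-negative integers.

    Hypothetical helper for illustration; `bits` is how many low-order
    bits of each value are used.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return code


# a made-up 4x4 grid of (x, y) pairs, listed row by row
points = [(x, y) for y in range(4) for x in range(4)]

# a plain sort-by orders on x first, then y
sort_by = sorted(points)

# z-order sorts on the interleaved key instead, tracing a "Z" through the grid
z_sorted = sorted(points, key=lambda p: interleave_bits(*p))

print(sort_by[:4])
print(z_sorted[:4])
```

the first four z-ordered points form a 2x2 block, while the first four sort-by points all share a single x value; that clustering on both columns at once is what lets file-level min/max stats skip data for filters on either column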

Category Archives: Big Data

ibis & trino (dataframe api part deux)

viewing astronauts thru windows (more pystarburst examples)

sql window functions explained (transparently as possible)

pystarburst analytics examples (querying aviation data part deux)