hive acid transactions work on trino (can even update a partitioned column)

Ever since I joined Starburst, I’ve had to push back on my fellow All Stars when they told me Apache Hive does NOT allow for INSERT/UPDATE/DELETE/MERGE operations. I let them know that I was using Hive ACID for years at Hortonworks/Cloudera. This blog post is here to set the record straight on two important points.

  • Hive ACID does allow for INSERT/UPDATE/DELETE/MERGE operations
  • Probably cooler to me personally, Trino (and Starburst Galaxy/Enterprise) works very well with Hive ACID thanks to the base Hive Connector’s functionality

That said…

Just because you can, doesn’t mean you should.

Sherrilyn Kenyon, William C. Taylor, Scott Bedbury, and just about everyone else…

I tossed out that age-old quote to make the point that I’m NOT actually recommending that new efforts should use Hive ACID. Modern table formats, my favorite Apache Iceberg for example, do all the cool things Hive ACID does and much more — including versioning and its benefits of time-travel querying and table rollbacks.

CRUD operations

This section shows the output of walking through the same use cases as my previous hive acid transactions with partitions post. This time, of course, I’m using Starburst instead of Hive. I usually test with Starburst Galaxy, but this time I’m using Starburst Enterprise.

Why? To make this work you do have to use an actual Hive MetaStore (HMS), not AWS Glue or the internal Starburst metastore implementation, and well… I didn’t have one set up for my Galaxy tenant and I was already in Starburst Enterprise with a Hive catalog using HMS.
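
For anyone wiring this up themselves, here’s a minimal sketch of what a Trino catalog file pointed at an HMS looks like (the thrift URI is a placeholder for your environment):

# etc/catalog/hive.properties -- minimal sketch; the thrift URI is a placeholder
connector.name=hive
hive.metastore.uri=thrift://your-hms-host:9083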

And, of course, this functionality exists in the base Hive connector code in the Trino project for those running just Trino (or Presto, for that matter).

Transactional table DDL

Here is the Trino version of the same DDL in the original post.

CREATE TABLE try_it (
    id int, a_val varchar, b_val varchar,
    prt varchar
)
WITH (
    format = 'ORC',
    transactional = true,
    partitioned_by = ARRAY['prt']
);
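
If you want to confirm the table really was registered as transactional, SHOW CREATE TABLE echoes the table properties back; the output’s WITH clause should include transactional = true.

SHOW CREATE TABLE try_it;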

I highly encourage you to at least skim my previous hive acid transactions with partitions post as there is a lot of “behind the scenes” information that I’ll assume you know. I’m talking about HOW Hive ACID works down at the data lake directory & file level.

Even if you don’t take my recommendation, this blog post will clearly show that INSERT/UPDATE/DELETE/MERGE operations do exist in Apache Hive AND can be leveraged from Trino.

DML use cases

Let’s explore some CRUD (Create, Retrieve, Update, Delete) use cases as expressed in Data Manipulation Language (DML).

Txn 1: INSERT single row

INSERT INTO try_it 
VALUES (1, 'noise', 'bogus', 'p1');

Like in the other post, verify that the p1 partition has a delta file and that it only includes changes belonging to transaction #1 (see the delta_0000001_... indicator).
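
As a rough sketch of what that looks like at the directory & file level (assuming an S3 bucket named my-bucket and default warehouse pathing; your paths will differ):

s3://my-bucket/try_it/prt=p1/delta_0000001_0000001_0000/bucket_00000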

NOTE: I’m not going to be exploring all the actual ORC files like I did in the posts I am reproducing now, but rest assured, the values are the same.

Txn 2: INSERT multiple rows across multiple partitions

INSERT statements allow multiple rows to be added at once, with all of them belonging to a single ACID transaction. This use case exercises that and, to make it a bit more fun, spans more than one partition.

INSERT INTO try_it VALUES
(2, 'noise', 'bogus', 'p2'),
(3, 'noise', 'bogus', 'p3');

Both the p2 and p3 partitions are present on the object store and they have delta directories & files each containing changes belonging to transaction #2.

NOTE: Again, I’m not going to be exploring all the actual delta directories and ORC files like I did in the posts I am reproducing now. Again, rest assured that the values are the same. I also promise to stop making this point. 😉
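
One verification trick that is worth knowing: the Hive connector exposes a hidden $path column, so you can see which delta file each row landed in without leaving SQL.

SELECT id, prt, "$path"
  FROM try_it
 ORDER BY id;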

Txn 3: UPDATE multiple rows across multiple partitions

UPDATE try_it 
  SET b_val = 'bogus2' 
WHERE a_val = 'noise';

All three partitions are modified; each now has delete_delta_ and delta_ directories.
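
In directory terms (same bucket & path assumptions as before), each partition should now contain something along these lines:

s3://my-bucket/try_it/prt=p1/delete_delta_0000003_0000003_0000/
s3://my-bucket/try_it/prt=p1/delta_0000003_0000003_0000/

with matching directories under prt=p2 and prt=p3.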

Txn 4: UPDATE single row (leveraging partitioning)

This use case is just calling out that we should include the partition column in the UPDATE statement whenever possible to make Trino’s job a bit easier; it then only has to look in the folders that could possibly be affected instead of walking the full table’s contents.

UPDATE try_it 
   SET b_val = 'bogus3'
 WHERE b_val = 'bogus2' 
   AND prt = 'p2';

In this example, without the partition condition we would have updated all three partitions again. Only the p2 partition has delete_delta_0000004_0000004_ and delta_0000004_0000004_ folders.

Txn 5 (6 is not needed): UPDATE single row to change partition

UPDATE try_it 
   SET prt = 'p3' 
 WHERE a_val = 'noise' 
   AND prt = 'p1';

I discovered a VERY cool thing when trying to see what the equivalent Trino error message was going to be for this one. As a reminder, Hive rejects this statement with the following:

Error: Error while compiling statement: FAILED:
   Updating values of partition columns is not supported

What error message surfaced in Trino? NONE! IT WORKED!!

Check out the original blog post to see that with Hive I had to run two statements: the first deleted the record and the second added it back with the change to the partition column’s value. If anyone knows that Hive is NOW able to do this in a single statement as well, please leave a comment below, as it has been a couple of years since I wrote that blog post.
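
Either way, a quick sanity check confirms the row really did move partitions (presumably via a delete_delta_ in p1 and a new delta_ in p3 under the covers):

SELECT id, a_val, b_val, prt
  FROM try_it
 WHERE id = 1;

The returned row should show prt = 'p3'.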

What about the delta file compactions?

My prior hive delta file compaction post walked through the minor & major file compaction processes. More importantly, it explained WHY they are needed. Since Trino is writing the very same directories and files while performing CRUD operations, this is STILL needed.

The bad news is that Trino cannot trigger either of these specialized processes, nor does it have a compaction process of its own for these Hive ACID tables.

What does this mean? It means that you still need to have Hive around to run the minor & major compactions, even if all other CRUD operations and queries run solely on Trino. This is just one reason I would NOT suggest you create new tables with this advanced Hive feature.

That said, if you are still running a Hadoop/Hive cluster (likely where these tables were created and initially populated) then you can easily run compactions as needed. If you are going to move away from Hadoop, I’d stop accessing the table, perform a final minor and then major compaction, and then migrate your Hive tables to Apache Iceberg.
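
For reference, kicking those compactions off from the Hive side (this is Hive DDL, not something Trino can run) looks like the following; note that partitioned tables are compacted partition by partition:

ALTER TABLE try_it PARTITION (prt='p1') COMPACT 'minor';
ALTER TABLE try_it PARTITION (prt='p1') COMPACT 'major';
SHOW COMPACTIONS;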

MERGE works, too?

Of course it does! I’m also a huge fan of the MERGE statement and its ability to bundle all the changes it makes into a single transaction. This section just replays the scenario and solution I documented in hive’s merge statement so that you can see it works with Hive ACID operations running in Trino.

Create and populate a base table

CREATE TABLE bogus_info (
    bogus_id int,
    field_one varchar,
    field_two varchar,
    field_three varchar,
    date_created varchar
)
WITH (
    format = 'ORC',
    transactional = true,
    partitioned_by = ARRAY['date_created']
);

INSERT INTO bogus_info 
(bogus_id, date_created, field_one, field_two, field_three)
    VALUES
        (11,'2019-09-17','base','base','base'),
        (12,'2019-09-17','base','base','base'),
        (13,'2019-09-17','base','base','base'),
        (14,'2019-09-18','base','base','base'),
        (15,'2019-09-18','base','base','base'),
        (16,'2019-09-18','base','base','base'),
        (17,'2019-09-19','base','base','base'),
        (18,'2019-09-19','base','base','base'),
        (19,'2019-09-19','base','base','base');
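
A quick row count by partition confirms the nine base records landed where expected (three per date_created value):

SELECT date_created, count(*) AS row_count
  FROM bogus_info
 GROUP BY date_created
 ORDER BY date_created;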

Create and populate a table of changes

CREATE TABLE deltas_recd (
    bogus_id int,
    date_created varchar,
    field_one varchar,
    field_two varchar,
    field_three varchar
)
WITH (
    format = 'ORC',
    transactional = true
);

INSERT INTO deltas_recd 
(bogus_id, date_created, field_one, field_two, field_three)
    VALUES
        (20,'2019-09-20','NEW','base','base'),
        (21,'2019-09-20','NEW','base','base'),
        (22,'2019-09-20','NEW','base','base'),
        (12,'2019-09-17','EXISTS','CHANGED','base'),
        (14,'2019-09-18','EXISTS','CHANGED','base'),
        (16,'2019-09-18','EXISTS','CHANGED','base');

This shows us that the new changes include 3 totally new records (20, 21, and 22). We can also see that records 12, 14, and 16 need to be modified.
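
If you want to preview which branch of the merge each change row will take, a simple LEFT JOIN back to the base table does the trick (purely optional, just for illustration):

SELECT D.bogus_id, D.date_created,
       CASE WHEN B.bogus_id IS NULL
            THEN 'NOT MATCHED (will insert)'
            ELSE 'MATCHED (will update)'
        END AS merge_action
  FROM deltas_recd D
  LEFT JOIN bogus_info B
    ON  B.bogus_id     = D.bogus_id
   AND  B.date_created = D.date_created
 ORDER BY D.bogus_id;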

Create and execute the MERGE statement

The MERGE statement below lines up the matching records based on their ID and creation date. When there is a match it is treated as an UPDATE and when there is not it is handled as an INSERT. In this simple case, I am not addressing DELETE operations.

MERGE INTO bogus_info  AS B 
     USING deltas_recd AS D
      
  ON B.bogus_id     = D.bogus_id 
 AND B.date_created = D.date_created
  
WHEN MATCHED THEN
    UPDATE SET field_one   = D.field_one, 
               field_two   = D.field_two, 
               field_three = D.field_three
                
WHEN NOT MATCHED THEN
    INSERT VALUES (D.bogus_id,  D.field_one, 
                   D.field_two, D.field_three, 
                   D.date_created);
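
Verifying the outcome is just one more query:

SELECT *
  FROM bogus_info
 ORDER BY bogus_id;

You should see twelve rows: the nine originals with 12, 14, and 16 updated (field_one = 'EXISTS', field_two = 'CHANGED') plus the three new records (20, 21, and 22).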

Gosh darn it, I <3 MERGE!!

Final thoughts

As I said in the beginning, Hive ACID works solidly on Trino clusters. I also said I would NOT start there today, but if you have some of these tables already in production, don’t feel you can’t work with them. You can!

In all fairness, people don’t give Hive the love and accolades it deserves. Hive ACID has been out there for years and, IMHO, it was the archetype of the modern table formats. Should we move on from it and embrace these new approaches? Absolutely, but that doesn’t mean we can’t appreciate the past as well!

I’d even suggest you read my comparison of hive, trino & spark features post for some more thoughts on why Hive deserves our respect, even if not our go-forward approach. Makes me think of Teachers by Daft Punk. Enjoy!

