How To Update Recursively With Common Table Expression

Wojciech Maciejak

CTE

How it’s begun

Some days ago I coded a new feature for my project in work(ROM-rb with a sequel). At the beginning, the task seemed easy. After all, it was just another standard update of the entity without any significant corner cases.

I have been lucky until I watched the structure of this entity. In a nutshell, every record from the table had a column parent which is… kind of another record from the same table. For example, there's a table locations with following records:

    { id: 1, name: "Europe" }
    { id: 2, name: "Poland", parent_id: 1}

Not bad - I thought. But it wasn't the end of the story. There was another column which inherits and concat value from the parent record:

    { id: 1, name: "Europe", code: "EU", path: "ES" }
    { id: 2, name: "Poland", code: "PL", path: "ES.PL", parent_id: 1}
    { id: 3, name: "Germany", code: "DE", path: "ES.DE", parent_id: 1}
    { id: 4, name: "Berlin", code: "BER", path: "ES.DE.BER", parent_id: 3}

Wait… We made a mistake… Path of Europe should be equal to EU…

What happens now? Update on the Europe record should propagate update on all descendants, so a naive update each one record in DB give to us N queries to DB.

In the above-mentioned example, it’s not a big problem, but what would happen if there's a parent in the database which consists of 10 000 records for example? Briefly, it’s not the best solution...

Okay, so how does it look? Root, parents, child, especially descendants, and summary? Tree and update. I assume that for most people it is a simple issue, but personally, I've never had a task like this before. I saw some tree implementation during my studies where I had to implement my own tree, but I've never implemented update action using parent attributes.

Idea

Very first thought? Recursive, recursive ALL!

I did some research, talked with my friends, and I read about CTE and RCTEE tree in SEQUEL. I also built a prototype and thought about updating some objects. The outcome: it's a simple problem. It must not be such a simple function in PG. With RECURSIVE statement there were more issues.

    db[:temp_table].with_recursive(
      :temp_table,
      db[:source_table].select(:id, :code, :path).where(id: source_id),
      db[:source_table].from_self.join(:temp_table, id: :parent_id).select(
       :t1__id, :t1__code, Sequel.join([:temp_table__path, ".", :t1__code])
       )
    )

At first, temp_table is like the name said, a temporary table for doing some operations on it (I will write later about it). source_table is a table with source records to update. All we need here is to filter this table for records which should be updated. with_recursive statement says something like - okay, get all the records from this query method non-recursive.

    db[:source_table]
      .select(:id, :code, :path)
      .where(id: source_id)

Next, join recursively records from this query. For the best performance possible, concatenation of parent and child value should be done in this part of the code.

     db[:source_table]
       .from_self
       .join(:temp_table, id: :parent_id)
       .select(:t1__id, :t1__code, Sequel.join([:temp_table__path, ".", :t1__code]))

Store results to temp_table and does it until algorithm meets the leafs.

Solution

Okay, so now we can update records directly in DB. How have I done it? Firstly, you must remember about an update root object before run recursive update on the table. The reason is simple - this function base on updated root.

    def descendants(location_id)
      db[:temp_table].with_recursive(
      :temp_table,
      db[:source_table].select(:id, :code, :path).where(id: source_id),
      db[:source_table].from_self.join(:temp_table, id: :parent_id).select(
       :t1__id, :t1__code, Sequel.join([:temp_table__path, ".", :t1__code])
       )
    )
    end

    def update_descendants(location_id)
          db[:source_table]
            .from(:source_table, descendants(location_id))
            .where(location_id: :t1__id)
            .update(path: :t1__path)
    end

That's all! Probably it can be done quicker with an RCTE tree(SEQUEL plugin), but frankly speaking, I am not convinced that using external functions is a good way for doing everything... If something can be done without a plugin, let's do this without a plugin? I'm looking forward to your feedback and opinions. If you have any valuable experience on this, please write us in the comments section below!

 
Wojciech Maciejak avatar
Wojciech Maciejak