PostgreSQL 8.0.1 Documentation | ||||
---|---|---|---|---|
Prev | Fast Backward | Chapter 33. The Rule System | Fast Forward | Next |
33.3. Rules on INSERT, UPDATE, and DELETE
Rules that are defined on INSERT, UPDATE, and DELETE are significantly different from the view rules described in the previous section. First, their CREATE RULE command allows more:
They are allowed to have no action.
They can have multiple actions.
They can be INSTEAD or ALSO (default).
The pseudorelations NEW and OLD become useful.
They can have rule qualifications.
Second, they don't modify the query tree in place. Instead they create zero or more new query trees and can throw away the original one.
33.3.1. How Update Rules Work
Keep the syntax
in mind. In the following, update rules means rules that are defined on INSERT, UPDATE, or DELETE.
Update rules get applied by the rule system when the result relation and the command type of a query tree are equal to the object and event given in the CREATE RULE command. For update rules, the rule system creates a list of query trees. Initially the query-tree list is empty. There can be zero (NOTHING key word), one, or multiple actions. To simplify, we will look at a rule with one action. This rule can have a qualification or not and it can be INSTEAD or ALSO (default).
What is a rule qualification? It is a restriction that tells when the actions of the rule should be done and when not. This qualification can only reference the pseudorelations NEW and/or OLD, which basically represent the relation that was given as object (but with a special meaning).
So we have four cases that produce the following query trees for a one-action rule.
- No qualification and ALSO
the query tree from the rule action with the original query tree's qualification added
- No qualification but INSTEAD
the query tree from the rule action with the original query tree's qualification added
- Qualification given and ALSO
the query tree from the rule action with the rule qualification and the original query tree's qualification added
- Qualification given and INSTEAD
the query tree from the rule action with the rule qualification and the original query tree's qualification; and the original query tree with the negated rule qualification added
Finally, if the rule is ALSO, the unchanged original query tree is added to the list. Since only qualified INSTEAD rules already add the original query tree, we end up with either one or two output query trees for a rule with one action.
For ON INSERT rules, the original query (if not suppressed by INSTEAD) is done before any actions added by rules. This allows the actions to see the inserted row(s). But for ON UPDATE and ON DELETE rules, the original query is done after the actions added by rules. This ensures that the actions can see the to-be-updated or to-be-deleted rows; otherwise, the actions might do nothing because they find no rows matching their qualifications.
The query trees generated from rule actions are thrown into the rewrite system again, and maybe more rules get applied resulting in more or less query trees. So a rule's actions must have either a different command type or a different result relation than the rule itself is on, otherwise this recursive process will end up in an infinite loop. (Recursive expansion of a rule will be detected and reported as an error.)
The query trees found in the actions of the pg_rewrite system catalog are only templates. Since they can reference the range-table entries for NEW and OLD, some substitutions have to be made before they can be used. For any reference to NEW, the target list of the original query is searched for a corresponding entry. If found, that entry's expression replaces the reference. Otherwise, NEW means the same as OLD (for an UPDATE) or is replaced by a null value (for an INSERT). Any reference to OLD is replaced by a reference to the range-table entry that is the result relation.
After the system is done applying update rules, it applies view rules to the produced query tree(s). Views cannot insert new update actions so there is no need to apply update rules to the output of view rewriting.
33.3.1.1. A First Rule Step by Step
Say we want to trace changes to the sl_avail column in the shoelace_data relation. So we set up a log table and a rule that conditionally writes a log entry when an UPDATE is performed on shoelace_data.
Now someone does:
and we look at the log table:
That's what we expected. What happened in the background is the following. The parser created the query tree
There is a rule log_shoelace that is ON UPDATE with the rule qualification expression
and the action
(This looks a little strange since you can't normally write INSERT ... VALUES ... FROM. The FROM clause here is just to indicate that there are range-table entries in the query tree for *NEW* and *OLD*. These are needed so that they can be referenced by variables in the INSERT command's query tree.)
The rule is a qualified ALSO rule, so the rule system has to return two query trees: the modified rule action and the original query tree. In step 1, the range table of the original query is incorporated into the rule's action query tree. This results in:
In step 2, the rule qualification is added to it, so the result set is restricted to rows where sl_avail changes:
(This looks even stranger, since INSERT ... VALUES doesn't have a WHERE clause either, but the planner and executor will have no difficulty with it. They need to support this same functionality anyway for INSERT ... SELECT.)
In step 3, the original query tree's qualification is added, restricting the result set further to only the rows that would have been touched by the original query:
Step 4 replaces references to NEW by the target list entries from the original query tree or by the matching variable references from the result relation:
Step 5 changes OLD references into result relation references:
That's it. Since the rule is ALSO, we also output the original query tree. In short, the output from the rule system is a list of two query trees that correspond to these statements:
These are executed in this order, and that is exactly what the rule was meant to do.
The substitutions and the added qualifications ensure that, if the original query would be, say,
no log entry would get written. In that case, the original query tree does not contain a target list entry for sl_avail, so NEW.sl_avail will get replaced by shoelace_data.sl_avail. Thus, the extra command generated by the rule is
and that qualification will never be true.
It will also work if the original query modifies multiple rows. So if someone issued the command
four rows in fact get updated (sl1, sl2, sl3, and sl4). But sl3 already has sl_avail = 0. In this case, the original query trees qualification is different and that results in the extra query tree
being generated by the rule. This query tree will surely insert three new log entries. And that's absolutely correct.
Here we can see why it is important that the original query tree is executed last. If the UPDATE had been executed first, all the rows would have already been set to zero, so the logging INSERT would not find any row where 0 <> shoelace_data.sl_avail.
33.3.2. Cooperation with Views
A simple way to protect view relations from the mentioned possibility that someone can try to run INSERT, UPDATE, or DELETE on them is to let those query trees get thrown away. So we create the rules
If someone now tries to do any of these operations on the view relation shoe, the rule system will apply these rules. Since the rules have no actions and are INSTEAD, the resulting list of query trees will be empty and the whole query will become nothing because there is nothing left to be optimized or executed after the rule system is done with it.
A more sophisticated way to use the rule system is to create rules that rewrite the query tree into one that does the right operation on the real tables. To do that on the shoelace view, we create the following rules:
Now assume that once in a while, a pack of shoelaces arrives at the shop and a big parts list along with it. But you don't want to manually update the shoelace view every time. Instead we setup two little tables: one where you can insert the items from the part list, and one with a special trick. The creation commands for these are:
Now you can fill the table shoelace_arrive with the data from the parts list:
Take a quick look at the current data:
Now move the arrived shoelaces in:
and check the results:
It's a long way from the one INSERT ... SELECT to these results. And the description of the query-tree transformation will be the last in this chapter. First, there is the parser's output
Now the first rule shoelace_ok_ins is applied and turns this into
and throws away the original INSERT on shoelace_ok. This rewritten query is passed to the rule system again, and the second applied rule shoelace_upd produces
Again it's an INSTEAD rule and the previous query tree is trashed. Note that this query still uses the view shoelace. But the rule system isn't finished with this step, so it continues and applies the _RETURN rule on it, and we get
Finally, the rule log_shoelace gets applied, producing the extra query tree
After that the rule system runs out of rules and returns the generated query trees.
So we end up with two final query trees that are equivalent to the SQL statements
The result is that data coming from one relation inserted into another, changed into updates on a third, changed into updating a fourth plus logging that final update in a fifth gets reduced into two queries.
There is a little detail that's a bit ugly. Looking at the two queries, it turns out that the shoelace_data relation appears twice in the range table where it could definitely be reduced to one. The planner does not handle it and so the execution plan for the rule systems output of the INSERT will be
while omitting the extra range table entry would result in a
which produces exactly the same entries in the log table. Thus, the rule system caused one extra scan on the table shoelace_data that is absolutely not necessary. And the same redundant scan is done once more in the UPDATE. But it was a really hard job to make that all possible at all.
Now we make a final demonstration of the PostgreSQL rule system and its power. Say you add some shoelaces with extraordinary colors to your database:
We would like to make a view to check which shoelace entries do not fit any shoe in color. The view for this is
Its output is
Now we want to set it up so that mismatching shoelaces that are not in stock are deleted from the database. To make it a little harder for PostgreSQL, we don't delete it directly. Instead we create one more view
and do it this way:
Voil:
A DELETE on a view, with a subquery qualification that in total uses 4 nesting/joined views, where one of them itself has a subquery qualification containing a view and where calculated view columns are used, gets rewritten into one single query tree that deletes the requested data from a real table.
There are probably only a few situations out in the real world where such a construct is necessary. But it makes you feel comfortable that it works.