Everything from 1.4.1 through 1.4.9 happened inside a single function, standard_planner(). Building paths, costing them, searching for a join order, estimating cardinality from statistics: all of it runs inside that one function. Yet PostgreSQL does not call standard_planner() directly. It puts another function, planner(), one step ahead of it, and has planner() call standard_planner(). And planner() can be made to call some other function instead of standard_planner().
That replacement is what the planner hook enables. When pg_stat_statements measures per-query planning time, or pg_hint_plan rewrites a plan according to hints, it all goes through this hook. Let’s look at how PostgreSQL provides a way to observe or change planning behavior without touching a single line of the core, and how external code plugs into it.
All planner() does is check the hook
The body of planner() is essentially this.
    if (planner_hook)
        result = (*planner_hook) (parse, query_string, cursorOptions, boundParams);
    else
        result = standard_planner(parse, query_string, cursorOptions, boundParams);
planner_hook is a global function pointer whose default value is NULL. With no extension loaded, a stock PostgreSQL server always takes the else branch: planner_hook is empty, so the incoming query goes straight to standard_planner().
The key here is the type of planner_hook.
    typedef PlannedStmt *(*planner_hook_type) (Query *parse,
                                               const char *query_string,
                                               int cursorOptions,
                                               ParamListInfo boundParams);
The parameter list and return type are identical to those of planner() and standard_planner(). It takes the same Query and returns the same PlannedStmt (the execution plan). So external code only has to write a planner function matching this type and store its address in planner_hook. Let’s call this function, written by external code to register in planner_hook, a custom planner function. The moment its address is stored, every planning request enters this custom planner function first, instead of standard_planner().
Leaving a single function pointer empty is a simple device, but it is the core of PostgreSQL’s extension model. Rather than editing the core source to insert new behavior, you connect an external function to a pointer the core left empty in advance. This is why you can change behavior while leaving the compiled PostgreSQL binary untouched.
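The mechanism can be modeled in a few lines of standalone C. This is a sketch, not PostgreSQL source: MockQuery, MockPlan, mock_planner_hook and my_custom_planner are made-up stand-ins, and the signature is cut down to a single argument.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for PostgreSQL's Query and PlannedStmt; NOT the real structs. */
typedef enum { PLAN_FROM_STANDARD, PLAN_FROM_CUSTOM } PlanOrigin;
typedef struct MockQuery { const char *sql; } MockQuery;
typedef struct MockPlan { PlanOrigin origin; } MockPlan;

/* Same idea as planner_hook_type, reduced to one argument. */
typedef MockPlan (*mock_planner_hook_type) (const MockQuery *parse);

/* The core leaves this pointer empty (NULL) by default. */
static mock_planner_hook_type mock_planner_hook = NULL;

/* Model of standard_planner(): the built-in planning logic. */
static MockPlan mock_standard_planner(const MockQuery *parse)
{
    MockPlan plan;

    assert(parse != NULL);
    plan.origin = PLAN_FROM_STANDARD;
    return plan;
}

/* Model of planner(): all it does is check the hook. */
static MockPlan mock_planner(const MockQuery *parse)
{
    if (mock_planner_hook)
        return (*mock_planner_hook) (parse);
    return mock_standard_planner(parse);
}

/* Hypothetical custom planner function with a matching signature. */
static MockPlan my_custom_planner(const MockQuery *parse)
{
    MockPlan plan;

    assert(parse != NULL);
    plan.origin = PLAN_FROM_CUSTOM;
    return plan;
}
```

Calling mock_planner() with the pointer still NULL reaches mock_standard_planner(); after the assignment mock_planner_hook = my_custom_planner, the same call reaches the custom function, and mock_planner() itself has not changed.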
A custom planner function delegates to standard_planner
So when and how does external code fill in planner_hook? When an extension is loaded, PostgreSQL calls its library’s _PG_init() function once. Hook installation almost always happens here. The _PG_init() of pg_stat_statements installs several hooks; for the planner hook it contains these two lines.
    prev_planner_hook = planner_hook;
    planner_hook = pgss_planner;
The second line stores the address of the custom planner function pgss_planner into the global pointer. At a glance this looks like it fully replaces standard_planner(). But what pg_stat_statements wants to do is not to plan in place of PostgreSQL; it wants to measure the time PostgreSQL spends planning. The plan itself must still be built by PostgreSQL.
So pgss_planner is called first, but it hands the actual planning back to standard_planner(). The order is:
1. Pre-work: start a timer and record buffer usage.
2. Delegate: call standard_planner() to get the actual plan.
3. Post-work: stop the timer, compute the elapsed time, and store that figure in its own statistics table.
4. Return: hand the plan received in step 2 back to the caller unchanged.
It does not touch the plan itself; it only measures time around it. Having the custom planner function run first and then hand the real work to standard_planner() is the most common shape. If instead a function does not call standard_planner() and returns a plan it built itself, that amounts to replacing planning altogether. An extension that changes plans, like pg_hint_plan, is closer to this. That said, even hooks that change plans usually adjust the plan standard_planner() produced rather than building one from scratch; doing the latter is rare.
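The measuring shape (pre-work, delegate, post-work, return) might look like the following standalone model. It is a sketch under assumptions: the Mock types, the counters, and timing_planner are hypothetical, and clock() (CPU time from the C library) stands in for whatever timer a real extension uses.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Stand-ins for Query and PlannedStmt; NOT the real PostgreSQL structs. */
typedef struct MockQuery { int n_tables; } MockQuery;
typedef struct MockPlan { int n_nodes; } MockPlan;

/* Hypothetical counters standing in for the extension's statistics. */
static long   plan_calls = 0;
static double total_plan_seconds = 0.0;

/* Toy planner: n scans plus n-1 joins for a left-deep join of n tables. */
static MockPlan mock_standard_planner(const MockQuery *parse)
{
    MockPlan plan;

    assert(parse != NULL);
    plan.n_nodes = parse->n_tables * 2 - 1;
    return plan;
}

/* Custom planner function that measures but does not alter the plan. */
static MockPlan timing_planner(const MockQuery *parse)
{
    clock_t  start;
    clock_t  end;
    MockPlan plan;

    start = clock();                        /* 1. pre-work: start timer */
    plan = mock_standard_planner(parse);    /* 2. delegate */
    end = clock();                          /* 3. post-work: stop timer */
    plan_calls++;
    total_plan_seconds += (double) (end - start) / CLOCKS_PER_SEC;
    return plan;                            /* 4. return plan unchanged */
}
```

The plan passes through untouched; only the two counters change. A pre-work-only or post-work-only function would keep the same skeleton and drop the code on one side of the delegation.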
I once built dynamic row-level security (RLS) on top of planner_hook, going beyond what PostgreSQL’s built-in RLS can do. Built-in RLS only accepts a fixed condition expression in a policy, so it cannot carry a condition that changes per query, like one a policy function produces at execution time. So I intercepted the Query with planner_hook right before planning and injected the dynamic condition there. For a SELECT, I adjusted the Query before delegating to gate the read path; for an INSERT/UPDATE, I reworked the plan returned after delegating to check the value on the write path. A single hook thus passed through both before and after delegation.
A custom planner function wrapping standard_planner() with pre-work and post-work, as above, is the common case, but not every function has all three parts. Some only touch the Query before delegating and return the resulting plan as is (pre-work only); some only adjust the plan after delegating (post-work only). What it does depends on the extension’s purpose; only the skeleton, with delegation sitting in the middle, stays the same.
Why save prev_planner_hook
The first of those two lines has not been explained yet: prev_planner_hook = planner_hook. Before storing its own function, it saves the value already there. If only one extension is in use, this line has no visible effect, since the saved value is NULL anyway. But an extension cannot know whether it is the only one loaded, so it always saves. The line matters when several extensions are loaded.
Suppose shared_preload_libraries lists two extensions. Extension A, listed first, loads and stores its function in planner_hook. The prev A saved is NULL. Then extension B loads, and planner_hook currently holds A’s function. When B runs prev_planner_hook = planner_hook, that captures A’s function address, and then planner_hook gets B’s function. If B had skipped this save and just stored its own function, the function A installed would be referenced by nothing and vanish. A’s feature would die silently.
So each custom planner function, when delegating, does not call standard_planner() directly; it checks the saved prev first.
    if (prev_planner_hook)
        result = prev_planner_hook(parse, query_string, cursorOptions, boundParams);
    else
        result = standard_planner(parse, query_string, cursorOptions, boundParams);
If prev exists, it calls that; only when there is none does it call standard_planner(). This delegation happens at step 2 from the previous section, after the pre-work is done. It does not come after the extension has finished all its work; it sits between pre-work and post-work.
The result is that the extensions’ functions form a chain.
planner() → extension B's function → extension A's function → standard_planner()
The call order is the reverse of the install order. Extension B, loaded last, is called first; B calls A through prev; A, whose prev is NULL, calls standard_planner().
Here, B calling A happens in the middle of B’s function code (the delegation step). Likewise, A calls standard_planner() from the middle of its own code. So when standard_planner() finishes building the plan and returns its value, control comes back to A’s function and the post-work code remaining in A runs; when A returns, control goes back to B’s function and B’s post-work runs. Calls go in the order B → A → standard_planner(), and returns come out in the reverse order, standard_planner() → A → B. When a function calls another function from the middle of its code, the rest of the outer function continues after the inner one finishes; this is just how ordinary function calls work. That structure is why the two extensions can each slot in their own pre-work and post-work without colliding.
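The call and return order can be checked with a small standalone model. Everything here is illustrative: the query and plan are plain ints, and a_planner, b_planner, a_init, b_init and the trace buffer are hypothetical names, with a_init/b_init playing the role of each extension's _PG_init().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*mock_planner_hook_type) (int query);

static mock_planner_hook_type mock_planner_hook = NULL;
static char trace[128];         /* space-separated log of events */

/* Append one event to the trace. */
static void note(const char *event)
{
    size_t used = strlen(trace);

    assert(used + strlen(event) + 2 <= sizeof(trace));
    if (used > 0)
        strcat(trace, " ");
    strcat(trace, event);
}

static int mock_standard_planner(int query)
{
    note("std");
    return query;               /* the "plan" is just the query id */
}

/* Extension A: saves the previous hook, then installs itself. */
static mock_planner_hook_type a_prev_hook = NULL;

static int a_planner(int query)
{
    int plan;

    note("A:pre");
    plan = a_prev_hook ? a_prev_hook(query) : mock_standard_planner(query);
    note("A:post");
    return plan;
}

static void a_init(void)
{
    a_prev_hook = mock_planner_hook;
    mock_planner_hook = a_planner;
}

/* Extension B: same pattern, loaded after A. */
static mock_planner_hook_type b_prev_hook = NULL;

static int b_planner(int query)
{
    int plan;

    note("B:pre");
    plan = b_prev_hook ? b_prev_hook(query) : mock_standard_planner(query);
    note("B:post");
    return plan;
}

static void b_init(void)
{
    b_prev_hook = mock_planner_hook;
    mock_planner_hook = b_planner;
}

/* Model of planner(). */
static int mock_planner(int query)
{
    return mock_planner_hook ? mock_planner_hook(query)
                             : mock_standard_planner(query);
}
```

After a_init() and then b_init(), one call to mock_planner() leaves "B:pre A:pre std A:post B:post" in trace: calls go in B, A, standard order and returns unwind in reverse.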
When it fires
Looking at where planner() is called tells you when the hook fires. planner() is called by a thin wrapper, pg_plan_query(), and that wrapper runs whenever an optimizable query needs a plan. So the hook fires every time a query like SELECT/INSERT/UPDATE/DELETE enters planning, one query at a time.
EXPLAIN is included here too. EXPLAIN simply does not execute the query; it still builds the plan the same way, so it passes through the planner hook. This is also why pg_stat_statements captures the planning time of EXPLAIN-ed queries.
Prepared statements are an exception, though. While a plan that was built once stays in the plan cache and gets reused, no planning happens, so the hook does not fire either. The hook fires again only when the cache has no plan and one must be built anew, for instance when a generic plan (a plan built once and reused regardless of the parameter values) is first created or a custom plan (a plan rebuilt for each value) is built each time. When a tool that measures planning time via the hook shows “some executions have zero planning time” for a prepared statement, that is not a bug but the cache doing its job.
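A toy model makes the counting concrete. It is a deliberate simplification: MockPrepared and execute_prepared are hypothetical, and the choice between generic and custom plans is reduced to a fixed flag, whereas the real plan cache decides this itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct MockPlan { int plan_id; } MockPlan;

/* Counts how often planning (and therefore the hook) runs. */
static int hook_calls = 0;

/* Model of a planner entry point that passes through the hook. */
static MockPlan counting_planner(void)
{
    MockPlan plan;

    hook_calls++;
    plan.plan_id = hook_calls;
    return plan;
}

/* Hypothetical prepared statement with a one-slot plan cache. */
typedef struct MockPrepared
{
    bool     use_generic;   /* true: reuse one plan; false: replan each time */
    bool     has_cached;    /* is a generic plan already stored? */
    MockPlan cached;
} MockPrepared;

static MockPlan execute_prepared(MockPrepared *stmt)
{
    assert(stmt != NULL);

    if (!stmt->use_generic)
        return counting_planner();          /* custom plan: planned every time */

    if (!stmt->has_cached)
    {
        stmt->cached = counting_planner();  /* generic plan: built once */
        stmt->has_cached = true;
    }
    return stmt->cached;                    /* cache hit: no planning, no hook */
}
```

Executing a generic-plan statement three times raises hook_calls by one; executing a custom-plan statement three times raises it by three. The executions that hit the cache are the ones a timing tool would report with zero planning time.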
You can choose how deep to intervene
The planner hook intervenes at the very outside of the whole planning process. So it does not see what happens inside; it only receives the incoming Query and the outgoing PlannedStmt. But sometimes you want to step into just one intermediate stage of planning. For example, if you only want to add one more path to the scan path candidates of a particular table, taking over all of planning is excessive.
PostgreSQL therefore provides separate hooks at deeper stages: a hook that fires right after a base table’s scan paths are all collected (set_rel_pathlist_hook), a hook at the point where join paths are gathered (set_join_pathlist_hook), a hook that replaces the join-order search itself with your own algorithm, at the stage where the DP and GEQO from 1.4.5 run (join_search_hook), a hook that fires right after the planner reads table information from the catalog so you can add or remove index information (get_relation_info_hook), and so on.
The criterion for choosing is the depth of intervention. Work that only needs the input and output of the whole planning process (timing, plan post-processing) uses planner_hook; work that has to touch an intermediate result of a specific stage uses that stage’s hook. The deeper the hook, the more of the planner’s internal data structures you need to know, but you do not have to take responsibility for the whole of planning.
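To illustrate a deeper hook, here is a standalone model of a scan-path hook that adds a single candidate and leaves the rest of planning to the core. All names (MockPath, MockRel, mock_rel_pathlist_hook, my_pathlist_hook) and the cost figures are invented for the sketch; the real planner's path bookkeeping is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PATHS 8

/* Stand-ins for the planner's path and relation structs. */
typedef struct MockPath { const char *kind; double cost; } MockPath;
typedef struct MockRel { MockPath paths[MAX_PATHS]; int n_paths; } MockRel;

/* Hook that fires after the core has collected its own scan paths. */
typedef void (*mock_rel_pathlist_hook_type) (MockRel *rel);
static mock_rel_pathlist_hook_type mock_rel_pathlist_hook = NULL;

/* Append a candidate path (no pruning, unlike the real planner). */
static void mock_add_path(MockRel *rel, const char *kind, double cost)
{
    assert(rel != NULL && rel->n_paths < MAX_PATHS);
    rel->paths[rel->n_paths].kind = kind;
    rel->paths[rel->n_paths].cost = cost;
    rel->n_paths++;
}

/* Core: collect built-in paths, let the hook add more, pick the cheapest. */
static MockPath mock_plan_base_rel(void)
{
    MockRel rel;
    int     best = 0;
    int     i;

    rel.n_paths = 0;
    mock_add_path(&rel, "seqscan", 100.0);
    mock_add_path(&rel, "indexscan", 40.0);

    if (mock_rel_pathlist_hook)
        mock_rel_pathlist_hook(&rel);

    for (i = 1; i < rel.n_paths; i++)
        if (rel.paths[i].cost < rel.paths[best].cost)
            best = i;
    return rel.paths[best];
}

/* Hypothetical extension: contributes one extra candidate path. */
static void my_pathlist_hook(MockRel *rel)
{
    mock_add_path(rel, "customscan", 25.0);
}
```

The extension touches one intermediate result (the candidate list) and nothing else; cost comparison and everything downstream remain the core's responsibility, which is the trade-off described above.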
standard_planner mutates its input
There is a trap that those who write a hook directly can easily fall into. The source comment above planner() warns about it directly.
    /* standard_planner() scribbles on its Query input, so you'd
     * better copy that data structure if you want to plan more than once. */
standard_planner() does not just read the Query tree it receives; it rewrites it in place. Once a plan has been built, the input Query is already mutated. If a hook tries to call standard_planner() twice on the same Query (say, to plan two ways and compare them), the second call receives already-mutated input. So a hook that needs to plan twice must copy the Query before handing it to standard_planner(). If you hit a symptom while writing a hook where “the plan is fine the first time but goes wrong from the second,” it is usually a case of reusing the same input without knowing about this mutation.
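The trap and the fix can be reproduced in miniature. This is a model only: MockQuery holds a single counter that the toy planner zeroes out, and a struct assignment plays the role of the deep copy (in a real extension, copyObject() is the usual way to duplicate a node tree).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for Query and PlannedStmt; NOT the real PostgreSQL structs. */
typedef struct MockQuery { int pending_sublinks; } MockQuery;
typedef struct MockPlan { int sublinks_processed; } MockPlan;

/* Toy planner that consumes part of its input, like the real one. */
static MockPlan mock_standard_planner(MockQuery *parse)
{
    MockPlan plan;

    assert(parse != NULL);
    plan.sublinks_processed = parse->pending_sublinks;
    parse->pending_sublinks = 0;        /* scribbles on the input */
    return plan;
}

/* Hypothetical hook body: plans the same Query twice with no copy. */
static int plans_agree_without_copy(MockQuery *parse)
{
    MockPlan first = mock_standard_planner(parse);
    MockPlan second = mock_standard_planner(parse);   /* sees mutated input */

    return first.sublinks_processed == second.sublinks_processed;
}

/* Same idea, but the first planning pass works on a private copy. */
static int plans_agree_with_copy(MockQuery *parse)
{
    MockQuery copy = *parse;            /* stand-in for a deep copy */
    MockPlan  first = mock_standard_planner(&copy);
    MockPlan  second = mock_standard_planner(parse);

    return first.sublinks_processed == second.sublinks_processed;
}
```

Without the copy the two plans differ even though nothing about the query changed between the calls; with the copy they match.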
What this means in practice
First, the more extensions use the planner hook, the more overhead is added per query. Turning on pg_stat_statements.track_planning (it is off by default) measures planning time for each query, and that measurement is not free. Every planning operation now pays for an extra function call plus reading the timer and buffer counters before and after. Usually this is negligible, but when you process tens of thousands of short queries per second, this overhead can show. Stacking several extensions that use the planner hook means their functions are called in a chain on every query, so keep this in mind on systems that plan frequently.
Second, the load order of extensions can affect behavior. Hooks are chained in reverse install order, so the order you list them in shared_preload_libraries determines the order they fire. Among extensions that only measure plans, the order is meaningless; but if you use an extension that modifies plans (like pg_hint_plan) together with one that measures, the order decides who touches the plan first and who measures the result. One thing to check when extensions do not mesh as expected is the listing order in shared_preload_libraries.
Third, the planner hook is an entry point you can use for debugging and diagnosis. Without touching the core, you can look right there at what Query a given query comes in as and what PlannedStmt it goes out as. With SQL you can only see the execution plan via EXPLAIN, but to inspect the structure of the Query tree itself, before and after it goes to the planner, the hook is just about the only avenue. A small extension that installs a planner hook and logs the incoming Query and outgoing PlannedStmt lets you observe the input and output of planning without recompiling the core.
When several extensions interleave, or rewrite stages stack up, the Query tree often ends up changed in ways I did not intend. For those cases I built a monitoring tool that dumps the Query tree before and after the planner and used it for debugging. Wiring that tool to a single GUC meant I could toggle monitoring by changing the GUC value through SQL alone, with no rebuild and no touching the server. Combining a hook with a GUC lets you observe internal behavior on a live server using nothing but SQL.
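The hook-plus-switch combination reduces to a flag checked on both sides of the delegation. The sketch below is a standalone model, not the tool described above: mock_log_planner_io is a plain bool standing in for a custom GUC, and the "dump" is one fprintf line. A real extension would presumably register the setting with DefineCustomBoolVariable() and serialize the trees with something like nodeToString().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for Query and PlannedStmt; NOT the real PostgreSQL structs. */
typedef struct MockQuery { int query_id; } MockQuery;
typedef struct MockPlan { int plan_id; } MockPlan;

/* Stand-in for a boolean GUC that could be flipped with SET at run time. */
static bool mock_log_planner_io = false;

/* Number of dump lines written, so the effect is observable. */
static int dumps_written = 0;

static MockPlan mock_standard_planner(const MockQuery *parse)
{
    MockPlan plan;

    assert(parse != NULL);
    plan.plan_id = parse->query_id * 10;
    return plan;
}

/* Custom planner function: logs input and output only when enabled. */
static MockPlan logging_planner(const MockQuery *parse)
{
    MockPlan plan;

    assert(parse != NULL);

    if (mock_log_planner_io)
    {
        fprintf(stderr, "planner input: query %d\n", parse->query_id);
        dumps_written++;
    }

    plan = mock_standard_planner(parse);    /* delegate */

    if (mock_log_planner_io)
    {
        fprintf(stderr, "planner output: plan %d\n", plan.plan_id);
        dumps_written++;
    }
    return plan;
}
```

With the flag off the function is a pass-through that costs two branch checks; with it on, each planning call emits one line before and one after. The plan returned is the same either way.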