PostgreSQL subquery alias transformation (Oracle compatible)

The sub query in PostgreSQL must be added with an alias, even if we will not refer to this alias elsewhere.
Otherwise, the following error will be reported:

postgres=# select * from (select * from t1 limit 5);
ERROR:  subquery in FROM must have an alias
LINE 1: select * from (select * from t1 limit 5);
                      ^
HINT:  For example, FROM (SELECT ...) [AS] foo.

In Oracle, aliases can be omitted, for example:

SQL> select * from (select * from t1);

        ID
----------
         1

Of course, this is not bad, because the syntax in PG is SQL standard syntax, but for some users migrating from Oracle, there may be some problems. Let's see how to achieve compatibility from the kernel level?

First of all, we need to know that the SQL statement we enter is passed to the query analyzer as a string, and then the database performs lexical analysis and syntax analysis to generate an analysis tree.
In PG, the files that lexical analysis and syntax analysis depend on are scan.l and gram.y. Lexical analysis is carried out through scan.l to identify the keywords in sql and pass them to the parser as tokens, while gram.y defines and parses the incoming tokens to generate parsetree.

We find the corresponding part in gram.y according to the previous error reports:

| select_with_parens opt_alias_clause
	{
		RangeSubselect *n = makeNode(RangeSubselect);
		n->lateral = false;
		n->subquery = $1;
		n->alias = $2;
		/*
		 * The SQL spec does not permit a subselect
		 * (<derived_table>) without an alias clause,
		 * so we don't either.  This avoids the problem
		 * of needing to invent a unique refname for it.
		 * That could be surmounted if there's sufficient
		 * popular demand, but for now let's just implement
		 * the spec and see if anyone complains.
		 * However, it does seem like a good idea to emit
		 * an error message that's better than "syntax error".
		 */
		
		if ($2 == NULL)
		{
			if (IsA($1, SelectStmt) &&
				((SelectStmt *) $1)->valuesLists)
				ereport(ERROR,
						(errcode(ERRCODE_SYNTAX_ERROR),
						 errmsg("VALUES in FROM must have an alias"),
						 errhint("For example, FROM (VALUES ...) [AS] foo."),
						 parser_errposition(@1)));
			else
				ereport(ERROR,
						(errcode(ERRCODE_SYNTAX_ERROR),
						 errmsg("subquery in FROM must have an alias"),
						 errhint("For example, FROM (SELECT ...) [AS] foo."),
						 parser_errposition(@1)));
		}

You can see select_with_parens is a bracketed select statement, followed by opt_ alias_ Claim, and when $2 == NULL, it is opt_ alias_ When the claim is empty, the error report we encountered earlier will be thrown.
From here, we can see why PG does not allow sub queries to have aliases.

Next, let's start our transformation:

At first I wanted to add a separate select_ with_ The parens option is not enough, as follows:

| select_with_parens
	{
		...
    }

But obviously, there is a reduction conflict here. Sure enough, an error is reported when compiling..

Here is a brief description:

  • Move in – reduction conflict: the right part of one production is the prefix of another production
  • Reduction – reduction conflict: different productions have the same right part, or the right part of a production is the suffix of another production

Since this path doesn't work, let's change our thinking. When $2 == NULL, we don't throw an error, but define an alias for ourselves, which is just invisible to the user, so we can modify it as follows:

if ($2 == NULL)
					{
						Alias *a = makeNode(Alias);
						a->aliasname = "Alias";
						n->alias = a;
					}

Although this can achieve the functions we want, there will be a problem. What if the alias constant we define conflicts with other object names in the database?

Reference This article , with the following modifications:

if ($2 == NULL)
					{
						Alias *a = makeNode(Alias);
						StringInfoData newAliasBuf;

						initStringInfo(&newAliasBuf);
						appendStringInfo(&newAliasBuf, "__new_alias__%d", ++emptyAliasCounts);
						a->aliasname = newAliasBuf.data;
						n->alias = a;
 					}

So recompile to realize the function of sub query without alias!

bill=# select * from (select * from t1 limit 5);
 id |               info
----+----------------------------------
  1 | 9240016a94250b03b5a5c39d01946e3c
  2 | dd0ea69b5d9c4fa385c5918d832627c5
  3 | 1509dc3c2e147d574cb5cbc64687a132
  4 | cb635ead5172046f68e517ba894ae6de
  5 | a61265e5b65a243b59f1f920aff300ae
(5 rows)

The complete patch is as follows:

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5fa322d8d4..bec1826865 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -66,6 +66,7 @@
 #include "utils/numeric.h"
 #include "utils/xml.h"
 
+int emptyAliasCounts = 0;
 
 /*
  * Location tracking support --- simpler than bison's default, since we only
@@ -12097,6 +12098,8 @@ table_ref:	relation_expr opt_alias_clause
 					 * However, it does seem like a good idea to emit
 					 * an error message that's better than "syntax error".
 					 */
+
+					 /*
 					if ($2 == NULL)
 					{
 						if (IsA($1, SelectStmt) &&
@@ -12112,7 +12115,19 @@ table_ref:	relation_expr opt_alias_clause
 									 errmsg("subquery in FROM must have an alias"),
 									 errhint("For example, FROM (SELECT ...) [AS] foo."),
 									 parser_errposition(@1)));
+					} */
+
+					if ($2 == NULL)
+					{
+						Alias *a = makeNode(Alias);
+						StringInfoData newAliasBuf;
+
+						initStringInfo(&newAliasBuf);
+						appendStringInfo(&newAliasBuf, "__new_alias__%d", ++emptyAliasCounts);
+						a->aliasname = newAliasBuf.data;
+						n->alias = a;
 					}
+
 					$$ = (Node *) n;
 				}
 			| LATERAL_P select_with_parens opt_alias_clause
@@ -12122,6 +12137,8 @@ table_ref:	relation_expr opt_alias_clause
 					n->subquery = $2;
 					n->alias = $3;
 					/* same comment as above */
+
+					/*
 					if ($3 == NULL)
 					{
 						if (IsA($2, SelectStmt) &&
@@ -12137,7 +12154,19 @@ table_ref:	relation_expr opt_alias_clause
 									 errmsg("subquery in FROM must have an alias"),
 									 errhint("For example, FROM (SELECT ...) [AS] foo."),
 									 parser_errposition(@2)));
+					} */
+
+					if ($3 == NULL)
+					{
+						Alias *a = makeNode(Alias);
+						StringInfoData newAliasBuf;
+
+						initStringInfo(&newAliasBuf);
+						appendStringInfo(&newAliasBuf, "__new_alias__%d", ++emptyAliasCounts);
+						a->aliasname = newAliasBuf.data;
+						n->alias = a;
 					}
+
 					$$ = (Node *) n;
 				}
 			| joined_table
diff --git a/src/include/parser/parser.h b/src/include/parser/parser.h
index 3bdeeb8b0b..0e1b418b99 100644
--- a/src/include/parser/parser.h
+++ b/src/include/parser/parser.h
@@ -38,4 +38,6 @@ extern List *raw_parser(const char *str);
 extern List *SystemFuncName(char *name);
 extern TypeName *SystemTypeName(char *name);
 
+extern int emptyAliasCounts;
+
 #endif							/* PARSER_H */

Reference link:
https://rng-songbaobao.blog.csdn.net/article/details/111152337
https://www.modb.pro/db/11343

Tags: Database Oracle PostgreSQL

Posted on Wed, 27 Oct 2021 08:59:46 -0400 by curtm