Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: SQL in SQL prettification #680

Open
innermatrix opened this issue Dec 1, 2023 · 5 comments
Open

Feature Request: SQL in SQL prettification #680

innermatrix opened this issue Dec 1, 2023 · 5 comments
Labels

Comments

@innermatrix
Copy link

Describe the Feature

I want formatting of SQL inside function bodies written in quoted stings

Why do you want this feature?

If you are writing SQL functions, it's pretty common to have SQL embedded inside SQL; for example (straight out of Postgres documentation):

CREATE FUNCTION dup(int) RETURNS dup_result
    AS $$ SELECT $1, CAST($1 AS text) || ' is text' $$
    LANGUAGE SQL;

In this case, $$ is a string quoting delimiter, and SELECT $1, CAST($1 AS text) || ' is text' is a quoted string.

The issue is that most SQL formatters do not recurse into these quoted string function bodies, so the function declaration is prettified but the function implementation remains ugly. Here's what I get if I run the above through prettier with prettier-plugin-sql:

create function dup (int) returns dup_result as $$ SELECT $1, CAST($1 AS text) || ' is text' $$ language sql;

What I wish I got would be more like

create function dup (int) returns dup_result as $$ 
    select
        $1,
        cast($1 as text)||' is text'
$$ language sql;

(Which is what I get if I manually run the function body through prettier-plugin-sql and then paste it inside $$ delimiters.)

@nene
Copy link
Collaborator

nene commented Dec 2, 2023

It's a feature that would be definitely nice to have. However, there are several obstacles to implementing it.

  • $$-delimited strings can be used for anything in Postgres. And even in CREATE FUNCTION statement it's possible to have other languages besides pl/pgsql. So the formatter needs to understand quite a bit about the context of the string before formatting it. That said, likely 99% of cases can be easily detected with some simple heuristics.
  • In general, formatting of procedural SQL is not really supported in SQL Formatter. Properly formatting that is really not achievable with the current architecture. While formatting a simple select (as in the above example) would easily work, it's imho quite common for these functions to contain much more complex syntax.

See readme for information about another formatting library, which does support that, though not yet for Postgres. It implements a similar feature for BigQuery, and makes use of Prettier API to e.g. format Python code inside strings.

@innermatrix
Copy link
Author

When it comes to the issue of "what is in this quoted string", I think that can be addressed in a well-defined way, since a function definition is going to have a language foobar either before or after the quoted string. (And when language declaration is missing from the function definition, then the default is also well-defined, either because it's specified the database engine — for example, Postgres specifies the default language to be sql — or because it can be provided as a config option). This extends to other custom languages, such as Python or JS, which are also identified by specific language foobar declarations.

In my particular case, I would be happy to change every $$ to $sql$ or $plpgsql$ if that would get me consistent function body formatting, but I honestly don't think that would even be necessary.

You are right that if you implemented this, my next request would be formatting plpgsql. I am happy to hear you are working on another another library that supports this better (though Postgres is my main use case)! However, it sounds like that corner of the ecosystem has not yet matured to the point where it is useful to me — I am using prettier v3 (which is incompatible with prettier-plugin-sql-cst) and prettier-plugin-embed which (currently) only knows how to call through to prettier-plugin-sql.

So I think I will continue ignoring this problem for now, because I think that what needs to happen is for prettier-plugin-sql-cst to gain prettier v3 compat, then for prettier-plugin-embed to gain support for prettier-plugin-sql-cst, and on top of that prettier-plugin-sql-cst would need to gain support for plpgsql, none of which are imminent 🙂

Thanks for explaining the lay of the land!

@nene
Copy link
Collaborator

nene commented Dec 3, 2023

The default language case actually complicates this problem even more, as then we'd need to detect whether the string occurs inside a CREATE FUNCTION statement, which is not a trivial thing, because the formatter doesn't really understand the context.

The good news is that Prettier 3 support in prettier-plugin-sql-cst is just around the corner, and when that happens, integration with the embed plugin should similarly be a in short order. But yeah... Postgres support, especially full pl/pgsql is quite a bit further.

@innermatrix
Copy link
Author

I guess I assumed that the formatter has access to enough of the AST to be able to tell when it's inside a create function

But even so, I'd be happy to change my $$ tags to indicate the language; that's already what I am used to from JS tagged templates anyway. 🙂

@nene
Copy link
Collaborator

nene commented Dec 4, 2023

Asking users to modify their SQL to suit the formatter should really be the last resort. I think there are better approaches to make it work for most cases. Like detecting whether a string is followed by LANGUAGE SQL is not that hard. Not sure though how large portion of use-cases that will cover.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants