parsing plpgsql #43

Open
tomuxmon opened this issue Jan 11, 2021 · 5 comments

Comments

@tomuxmon

Any way to parse plpgsql?

It parses CreateFunctionStmt correctly, but since the function body itself is a string literal, it is "correctly" parsed as a plain string.
It would be great if it could detect the plpgsql language option and use the appropriate parser for the body.
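To make the request concrete, here is a minimal sketch of the dispatch step in Rust. The types `ParsedFunction` and `BodyKind` and the function `classify_body` are hypothetical stand-ins invented for illustration; they are not postgres-parser's API, and the real CreateFunctionStmt exposes its options differently.

```rust
/// Hypothetical, simplified view of an already-parsed CREATE FUNCTION statement.
/// Illustration only: these are not postgres-parser's real structs.
#[derive(Debug)]
struct ParsedFunction {
    name: String,
    language: String,
    /// The string literal that holds the function source.
    body: String,
}

/// Which parser the body should be handed to next.
#[derive(Debug, PartialEq)]
enum BodyKind<'a> {
    PlPgSql(&'a str),
    Sql(&'a str),
    Opaque(&'a str),
}

/// Decide how to treat the body based on the LANGUAGE option.
fn classify_body(func: &ParsedFunction) -> BodyKind<'_> {
    // Compare case-insensitively so "PLpgSQL" and "plpgsql" are treated alike.
    let lang = func.language.to_ascii_lowercase();
    match lang.as_str() {
        "plpgsql" => BodyKind::PlPgSql(&func.body),
        "sql" => BodyKind::Sql(&func.body),
        _ => BodyKind::Opaque(&func.body),
    }
}

fn main() {
    let func = ParsedFunction {
        name: "add_one".to_string(),
        language: "PLpgSQL".to_string(),
        body: "begin return a + 1; end;".to_string(),
    };
    match classify_body(&func) {
        BodyKind::PlPgSql(src) => println!("{}: hand to a plpgsql parser: {}", func.name, src),
        BodyKind::Sql(src) => println!("{}: hand to the SQL parser: {}", func.name, src),
        BodyKind::Opaque(_) => println!("{}: leave the body as a string", func.name),
    }
}
```

Only the `PlPgSql` arm would need the new parser; every other language keeps today's behaviour of returning the body as a string.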

@eeeebbbbrrrr
Collaborator

No support for that right now. Maybe in the future. It's a thing I've thought about, but I haven't investigated how to also pull that bit of code out of the LLVM IR postgres-parser currently generates from the postgres sources.

@tomuxmon
Author

There are 2 interesting places in pl_comp.c

  • plpgsql_compile_inline. With that you can compile anonymous blocks with just a char * (the source text) as input.
  • plpgsql_compile, which allows compiling a full function, but I am not sure how FunctionCallInfo could be constructed properly in Rust (sorry, I am a noob in Rust so far :) ).

So I wonder: since we can already parse the function definition (function name, parameters, return type...), would it be too crazy to try parsing the function body with plpgsql_compile_inline and stitch the result together with the current output?

@eeeebbbbrrrr
Collaborator

hmm. If it's that simple to just specify one function (_compile_inline), then all the llvm extraction stuff should do the rest for us.

Are there additional headers we'll have to process? I assume there's additional structs to represent the plpgsql parse tree...

@tomuxmon
Author

plpgsql.h seems to have all the structs needed. PLpgSQL_execstate is most likely not interesting in our case, since it should hold runtime data.
To clarify regarding compile inline: it actually accepts an anonymous code block like this:

do
$$
declare
    _a int = 1;
begin
    raise notice 'a = %;', _a;
end;
$$;

So we would need to tweak the actual string literal code block by adding do in front. I am also not sure how it would treat a missing variable if the function had a parameter.
But I guess trying it out is the only way to find out :)
What should change in build.rs (and other places) to try that out?
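The string tweak itself could look like the sketch below. It covers only the text manipulation: whether plpgsql_compile_inline needs the `do` wrapper or just the block text is not established in this thread, and function parameters would still be undeclared inside the wrapped block. The helper names `pick_tag` and `wrap_in_do` are hypothetical.

```rust
/// Pick a dollar-quote tag that does not occur inside `body`,
/// so the wrapped text cannot be terminated early by the body itself.
fn pick_tag(body: &str) -> String {
    let mut n = 0usize;
    loop {
        let tag = if n == 0 {
            "$body$".to_string()
        } else {
            format!("$body{}$", n)
        };
        if !body.contains(tag.as_str()) {
            return tag;
        }
        n += 1;
    }
}

/// Wrap a function body (the contents of the string literal) in a DO statement.
fn wrap_in_do(body: &str) -> String {
    let tag = pick_tag(body);
    format!("do {0}{1}{0};", tag, body)
}

fn main() {
    let body = "\ndeclare\n    _a int = 1;\nbegin\n    raise notice 'a = %;', _a;\nend;\n";
    println!("{}", wrap_in_do(body));
}
```

A non-empty tag is used instead of a bare `$$` because a function body may itself contain `$$`-quoted strings.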

@eeeebbbbrrrr
Collaborator

eeeebbbbrrrr commented Jan 20, 2021

I was actually poking at this a bit earlier today.

The first thing is that the patch file postgres-parser applies against some of the Postgres Makefiles (in order to compile as llvm ir) purposely excludes the pl/ directory tree.

Additionally, the plpgsql code is not compiled into postgres proper, but as a postgres extension. So we're gonna have to do a lot of additional work in build.sh to be able to extract the plpgsql symbols we need from the plpgsql object file that we don't even generate yet.

Then we'll be able to work on build.rs. There are a number of structs in plpgsql.h that we'll need: mostly the ones that have PLpgSQL_stmt_type as their first member, along with PLpgSQL_function and whatever types it uses.
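For readers unfamiliar with the "type tag as first member" layout, here is a rough Rust sketch of how such structs can be told apart through a pointer to their common prefix. Everything in it is a hypothetical, heavily simplified stand-in: the struct names, fields, and tag values are illustrative, and the real definitions in plpgsql.h have more members.

```rust
use std::os::raw::c_int;

// Illustrative tag values only; the real ones come from the
// PLpgSQL_stmt_type enum in plpgsql.h.
const STMT_BLOCK: c_int = 0;
const STMT_ASSIGN: c_int = 1;

/// Common prefix shared by every statement struct (hypothetical stand-in).
#[repr(C)]
struct StmtHeader {
    cmd_type: c_int,
    lineno: c_int,
}

/// Hypothetical stand-in for an assignment statement.
#[repr(C)]
struct StmtAssign {
    cmd_type: c_int,
    lineno: c_int,
    varno: c_int,
}

/// Hypothetical stand-in for a block statement.
#[repr(C)]
struct StmtBlock {
    cmd_type: c_int,
    lineno: c_int,
    n_body_stmts: c_int,
}

/// Safety: `stmt` must point to a live struct that starts with the
/// `StmtHeader` fields and whose tag matches its concrete type.
unsafe fn describe(stmt: *const StmtHeader) -> String {
    unsafe {
        // Read the shared prefix first, then downcast based on the tag.
        let line = (*stmt).lineno;
        match (*stmt).cmd_type {
            STMT_ASSIGN => {
                let a = &*(stmt as *const StmtAssign);
                format!("assign to var #{} at line {}", a.varno, line)
            }
            STMT_BLOCK => {
                let b = &*(stmt as *const StmtBlock);
                format!("block with {} statements at line {}", b.n_body_stmts, line)
            }
            other => format!("unhandled statement tag {} at line {}", other, line),
        }
    }
}

fn main() {
    let assign = StmtAssign { cmd_type: STMT_ASSIGN, lineno: 4, varno: 2 };
    let text = unsafe { describe(&assign as *const StmtAssign as *const StmtHeader) };
    println!("{}", text);
}
```

Each struct that build.rs would have to expose needs a matching `#[repr(C)]` definition so that the prefix fields line up with the C layout.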

This is looking like a lot of work. It's probably good work to do, but it's not an afternoon job.


2 participants