Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Haskell bindings + Nix build #117

Open
dmjio opened this issue Jul 1, 2021 · 3 comments
Open

Haskell bindings + Nix build #117

dmjio opened this issue Jul 1, 2021 · 3 comments

Comments

@dmjio
Copy link

dmjio commented Jul 1, 2021

Hello 馃憢馃徏 thanks for making this library, its great work.

In the course of making Haskell bindings (pg-query), I've also created a nix build.

{ pkgs ? import <nixpkgs> {} }:
with pkgs.haskell.lib;
let
  pg_query = pkgs.stdenv.mkDerivation {
    name = "pg_query";
    src = pkgs.fetchFromGitHub {
      owner = "pganalyze";
      repo = "libpg_query";
      rev = "a2482061d8ad8731c485b3cb7e8c9109b7b3529b";
      sha256 = "1zdq3lq0ym329gjak8xj2a9i0psp2qvr7imlm1cy4in3rwx9vdf6";
    };
    buildInputs = with pkgs; [ clang which protobufc ] ;
    patchPhase = with pkgs; ''
      ${gnused}/bin/sed -i.bak 's/prefix = .*/prefix = $(out)/g' Makefile
    '' + pkgs.lib.optionalString pkgs.stdenv.isDarwin ''
      ${gnused}/bin/sed -i.bak 's/-Wl,-soname,$(SONAME)/-Wl/g' Makefile
    '';
  };
in
  pg_query

If you'd like I can submit a PR of this derivation to nixpkgs, this repo, or both (or neither).

Also, is there a formal specification of the JSON structure of the SQL AST? I'd like to begin making a Haskell AST now that its accessible via C, but unsure if there's a spec somewhere (i.e. like a JSON schema file, etc.)

@lfittl
Copy link
Member

lfittl commented Jul 1, 2021

Hello 馃憢馃徏 thanks for making this library, its great work.

Thanks! Happy to see you created Haskell bindings!

...
If you'd like I can submit a PR of this derivation to nixpkgs, this repo, or both (or neither).

Hmm, good question - since the library is usually not used as a package (but rather embedded in the various language-specific wrappers), we don't have a place in this repo for packaging information. If there is no downside to it, I think for now I'd prefer not to have packaging-related files in this repo (but open to it, if it makes a difference for maintainability).

Also, is there a formal specification of the JSON structure of the SQL AST? I'd like to begin making a Haskell AST now that its accessible via C, but unsure if there's a spec somewhere (i.e. like a JSON schema file, etc.)

There isn't really - the source of truth is the Postgres source code, and the structure gets generated from that. One could potentially envision creating a JSON schema file for the AST if you auto-generated it, similar to how the Protobuf definitions get auto-generated (see https://github.com/pganalyze/libpg_query/blob/13-latest/scripts/generate_protobuf_and_funcs.rb - not the easiest to read file, but that does all the work based on the files located in the srcdata directory).

@dmjio
Copy link
Author

dmjio commented Jul 3, 2021

Hmm, my approach was to copy the AST from the PostgreSQL gram.y file, and attempt to match up the JSON with the AST, sound reasonable?

@lfittl
Copy link
Member

lfittl commented Jul 3, 2021

Hmm, my approach was to copy the AST from the PostgreSQL gram.y file, and attempt to match up the JSON with the AST, sound reasonable?

Not sure how you mean - the gram.y file / the raw_parser method, ends up creating C structs (thats how it works in Postgres itself), and what pg_query does is it outputs them as JSON (see https://github.com/pganalyze/libpg_query/blob/13-latest/src/pg_query_outfuncs_json.c).

This JSON output logic is based on the automatic analysis of the Postgres source code, with the information saved to the srcdata/ directory. e.g. here you can find the description of a SelectStmt AST node: https://github.com/pganalyze/libpg_query/blob/13-latest/srcdata/struct_defs.json#L2280 (describing this C struct in the original source: https://github.com/postgres/postgres/blob/master/src/include/nodes/parsenodes.h#L1639)

If you were to translate them back to a language-specific structure, my recommendation would be to either:

  1. Auto-generate the language-specific structs based on the srcdata/ information (or your own reading of the Postgres source code), and then parse the JSON into these structs

  2. Use a Protobuf library to parse the Protobuf output (instead of the JSON output), which automatically gives you language-specific structs (this is e.g. how the Go library works)

Hope that helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants