Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for virtual tables #422

Open
graf0 opened this issue Nov 8, 2023 · 5 comments
Open

Add support for virtual tables #422

graf0 opened this issue Nov 8, 2023 · 5 comments

Comments

@graf0
Copy link

graf0 commented Nov 8, 2023

Hi,
Could we implement virtual table support in sqlite3 gem?
Api is describe here: https://www.sqlite.org/vtab.html
There is even pull request from 2013 - #115

@flavorjones
Copy link
Member

@graf0 If you'd like to help contribute to the project, I would welcome your help. What do you think the API should look like for this feature?

@graf0
Copy link
Author

graf0 commented Nov 20, 2023

Hi,

I think good example of how it could be done is apsw package from python - https://github.com/rogerbinns/apsw/

Trying to translate that into ruby world & terminology:

First you should allow to define low-level access to sqlite vtab internals, that is: we should have 3 classes - VTModule, VTTable, VTCursor - that when you want to create virtual table you should inherit from.

Instance of SQLite3::Database has method create_module(name, object) - given object inhertis from VTModule and should have few methods, that is: connect, create, shadowname - they correspond with proper names from sqlite3_module struct.

Methods create & connect returns array of 2 elements - create sql statement and object that inherits from VTTable and defines methods: bestindex, destroy, disconnect, open. Those methods are needed to create readonly virtual table - and that is what I suggest for version 1 of this feature.

From what I saw in other languages - it is good to implement bestindex method as close to sqlite interface as possible. So I suggest to pass it whole IndexInfo structure, maybe with just a little bit of more ruby-like syntax. Apsw did that in recent version in by adding BestIndexObject method that gets full structure.

VTTable instance open method returns instance of VTCursor object, that implements methods: close, column, eof, filter, next and rowid - it’s again mimum number of methods to implement read-only virtual table.

I suggest also to implement in version 1 of the feature SQlite3::Database instance method make_virtual_module. It should take name of virtual table and instance of Enumerable object. This method (also implemented by apsw) is just sugar on top of api described earlier - but allows any one, that would like to experiment with ruby-based virtual tables to start working without studying (quite complicated - especially bestindex method) vtab api.

Next versions could add write funcionality.

Now, I’m by no means expert in writing ruby c extensions, but I see few issues that implementation should address. We are in situation, that first we call c extension (ie.: sqlite3_prepare or sqlite3_step etc.), that will call sqlite3 gem, that will call ruby, that will return value to c extension.

So first - imho it’s important to make sure all exceptions in ruby code that is called by sqlite3 are handled. Ideally - to return info about those exceptions as sqlite3 errors, with possible stack trace. It would ease debugging virtual tables a lot!

Second issue I see - because of switching in and out of c and ruby, we need to address GC issue. That is - we need to keep references to ruby objects involved as long as there is sqlite module, table or cursor used. Sqlite make sure that there are open/close (or simillar) interface function called for each element of api - but between them we need to keep ruby object from GC.

@graf0
Copy link
Author

graf0 commented Dec 23, 2023

Hi,
what do you think? Is this design ok?
Bye

@tenderlove
Copy link
Member

The patch to introduce vtables is pretty complicated (which is why it hasn't been merged). Just for my own edification, what are the use cases for exposing virtual tables to the Ruby API?

@graf0
Copy link
Author

graf0 commented Jan 4, 2024

Virtual tables could be used to access data outside of sqlite just like other tables. Ie.: you can query running processes, list currently running EC instances on AWS, or content of table from remote postgresql db or parquet file. So ie.: you can query list of processes and join it my process name with normal sqlite table that contains names of malicious software. Or - with small api - you can query list of used ip addresses on fleet of machines and group them by ip network. Good example of such software is https://www.osquery.io

It seems to be pretty complicated - but from rubygem point of view - it's just bunch of callbacks that needs to be passed to ruby.

But I understand you point of view - that really a lot of c code, that needs to be maintained, will be used by small percent of users. And that is not only extension of sqlite with similar properties - ie.: there is session extension, used to sync data between sqlite databaseses.

So maybe better solution will be to provide user with sqlite db pointer (sqlite3 *) which then could be used to write ffi exention? Workflow would be like this:

  1. you create db connection
  2. you get sqlite db pointer and pass to you ffi exention - ie.: virtual table implementation
  3. ffi extension takes care of implementing all callbacks, keeps important data out of GC hands, converts exceptions to sqlite errors etc.

This way - relatively simple - we can have cake & eat cake. And this gives us ability to use any other extension of sqlite that will popup in future, or is available now and have special c api.

That just fast idea - but what do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants