Seed a single file! #801
👍
+1
👍 This is very useful for tests.
Must have!
👍 I'd like to join this request. It would be very convenient for tests!
I support this request. In Laravel you can easily do this:
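(The Laravel snippet appears to have been lost in extraction. For illustration only: Laravel's artisan CLI can run a single seeder class by name; the class name below is hypothetical.)

```shell
php artisan db:seed --class=UserSeeder
```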
👍 Highly relevant and would be extremely useful!
Much needed feature :) 👍
As a workaround, you can already seed a single file by writing a Node script that does the seeding and running it directly:

```js
const knexConf = require('./knexfile');
const knex = require('knex')(knexConf);
// insert anything you like...
```

I don't see much benefit in having a separate seed file. For running tests, creating separate custom populate schemes is a lot more flexible and fast than using the knex client + seed file system.
Adding a table after the app has gone to production and seeding it is my actual use case.
@asequeir-public You should use migrations for that. |
I am starting a new project to seed tables with knex.js, here. Any help would be appreciated!
For schema changes, and for changes to existing data as a result of a schema change, sure. But new data (especially large amounts of it) can be quite problematic in migrations. It seems to be a general consensus across frameworks, from talking to peers and googling around a bit, that migrations are kept free of data where possible.
Well, if it is static data to populate into every DB (for example information about categories, which are the same for every installation, etc.), migrations are exactly the correct place to do it. Otherwise it depends on the data. Populate scripts are not good for adding a large amount of data which is suitable only for a single database and not for other installations / test setups. If it's one-time data to add to production, it should be done with whatever script pushes the data in where it belongs. After that it should be restored from DB backups, since applying it again would break everything.

I would rather have a feature which drops the whole populate script system from knex, and maybe an external package for the people who like to have it (it doesn't implement any functionality which wouldn't be trivial to implement with a simple Node.js script, as mentioned in #801 (comment)).
I also disagree with this. That is precisely what seeds are for, not migrations. Migrations are only meant to change the database structure, not its data. On top of that, migrations are supposed to be transitory and can be deleted once they have served their purpose. If they contain data this will not be possible.
@ricardograca Theoretically, yes. Practically, migrations are the best way to ensure that you have consistent datasets across different environments. And I disagree that "migrations can be deleted": typically you want to keep them around forever so that you can always recreate an environment from scratch (at least the part that is not dependent on external input).
@pmead What we typically do in production with big datasets is use migrations to call importers, and store the actual data in CSV files that may or may not be kept in the repo (based on how much long-lasting value they have and on their size).
Not really. You only want to keep them around forever now because there's no better system to recreate the schema from scratch with Knex, like the schema file functionality from Rails. But even so, once all machines that are running your code are updated to the latest version of the schema, you can safely delete all migrations and create a new one based on the current state, which takes the place of the initial migration. If you work on a big project you'll end up with hundreds or even thousands of migrations if you don't clean them up from time to time.

It is super wasteful to have to run a bunch of migrations on every new machine where you deploy your code (or every time you run tests). In that case you probably don't want to go through the whole history of migration changes just to get to the latest state; you just want the latest state.
One particular use case where population scripts are useful is when you want a data set to be refreshed, but it doesn't happen frequently enough to warrant another method, e.g. an annual import of data. Migrations at that point feel somewhat cumbersome, because in some environments they will have already run, meaning you would need a new migration and couldn't just change the CSV (which I think is what would happen in the approach you're mentioning, @kibertoad?), and new environments end up unnecessarily loading potentially huge data sets again and again. I also think the split of schema -> migrations, data -> seed / populate scripts is a nicer separation of concerns and holds fewer surprises in large projects. Interestingly @ricardograca, Rails is also the paradigm I'm mentally mapping this to, although that has its own set of problems.
Just to start with, on actual support for seeds, I want to say clearly: having support for seeds in knex doesn't help with any of the points discussed in this reply. Now, that being said, we can continue the philosophical discussion about migrations and schema things :)
That sounds like a really Rails-centric and purist view of the subject. I would like to hear an actual reason why migrations would need to be disposable after they're used, and why they should not contain any data. If the only reason is the accumulation of thousands of migration files, that can be prevented with knex too, using initial / midway database dumps (I have really good tools for that in my own projects, for various purposes). You can throw migrations away when they have served their purpose (for example when a new initial database dump is created with all the necessary initial data). Usually people find it easier to keep migrations in the repo, though.
Knex cannot adopt that approach any time soon, and even Rails' schema file is quite limited in the SQL features it supports. Knex will never be Rails (but it might eventually get something like schema file support).
Yes, with knex, database dumps are really useful on larger projects and in tests, for resetting to the latest initial schema + data. I know, it would be nice to offer better tooling and some best practices for that.
It doesn't work so well in real life, and you can easily have a directory full of database population scripts without any seed wrapping around them. Then you can just call:
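(The example invocation seems to have been cut off here; presumably something along these lines, with a hypothetical script path:)

```shell
node populate/add-categories.js
```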
Yeah, Rails is quite a different setup, and Active Record is a much higher-level tool than knex. Sure, we can learn from there too.
Well, it's mainly an issue when it comes to setting up a new machine. For example, why would you want to go through the whole process of creating a column, modifying it, modifying it again, and again, and then dropping it? This will happen if you keep all the migrations, and it makes no sense. This is just an example; the same thing will happen with tables, and a large project can easily accumulate a lot of small changes. In my mind it makes no sense to go through the whole history of changes; that's why migrations should be transitory. Note that this also applies to a test suite when using Knex, where the whole migration list is run between each test, which slows down the process.

The only upside I can think of to keeping migrations forever is that you can go back in time to an older commit and rebuild the database as it was at that point in time. You'll have to drop all tables before attempting that though, so the usefulness is debatable. All of these issues would be solved by a "schema file"-like functionality. And BTW, the limitations mentioned about Rails are true, but it can also generate an SQL-format schema dump that doesn't suffer from them.

The bottom line is that I think this feature request is valid, and even if you prefer to use migrations for inserting data into the database, I don't see a conflict with what this is asking.
No need to run migrations before every test; once per test suite is enough, and after that truncating data + inserting new data is sufficient. Running/rolling back migrations before each test is a bad practice. Also, initial SQL dumps loaded into the DB before running the tests eliminate the need for that.
As mentioned in this thread a few times already, one does not need the seed file system for that at all, since it is as easy to initialize a new knex instance from a simple Node.js script as it is to write it in the seed-file format. Seed files just add unnecessary complexity.
There are paradigms where you don't want to maintain consistent DB state across your environments. A/B testing is one example. Separating DB schema from DB data adds additional flexibility. |
Discussion is running in circles. Locking further discussion. If someone wants to extend support for how seeds are run through the knex client, a pull request is welcome. However, we should not add support for storing information about which seeds have already been run. If you need to verify that the same seed is not run multiple times, then you should use migrations instead.
This isn't asking to store information about which seeds have run, or to verify that the same seed is not run multiple times. It's only asking for the ability to specify a single, arbitrary seed file to be run. E.g.
@ricardograca It does make sense. How do you think the seed file would be identified? By full filename?
I think that would make the most sense, yes. However, I'm not sure if full grep-like functionality is needed, or just something like
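(The concrete example was cut off here. The shape under discussion would presumably be something like the following; the flag name is hypothetical, not an actual knex option:)

```shell
knex seed:run --seed-file=create-users.js
```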
Glob has been the most common tool I have seen in the Node world for matching multiple or single file names.
I don't see any option for seeding a single file.
For example, I have a folder called "development" where I keep all my seed files.
How do I seed only a single file from that folder?

```bash
knex seed:run --env=development
```

This command runs the seeds for all files in the development folder.