
Seed a single file! #801

Closed
lfaz opened this issue Apr 28, 2015 · 30 comments

Comments

@lfaz

lfaz commented Apr 28, 2015

I don't see any option for seeding a single file.
For example, I have a folder called "development" where I keep all my seed files.
How do I seed only a single file from that folder?

knex seed:run --env=development
This command seeds all files in the development folder.

@mmestani

mmestani commented Jun 4, 2015

+1 @tgriesser @bendrucker

@kimmobrunfeldt

👍

1 similar comment
@nabati

nabati commented Jan 24, 2016

+1

@miguelcobain

👍 This is very useful for tests.

@barinbritva

Must have!

@Zikoel

Zikoel commented Aug 21, 2017

👍 I join the request. It would be very convenient for tests!

@iqbalsafian

I support this request. In Laravel you can easily do this:
php artisan db:seed --class=UsersTableSeeder

@marcpre

marcpre commented Nov 9, 2017

👍 Highly relevant and would be extremely useful!

@jsartisan

jsartisan commented Jan 5, 2018

Much needed feature :) 👍

@elhigu
Member

elhigu commented Jan 5, 2018

As a workaround, you can already seed a single file by writing a node script that does the seeding and running it directly:

const knexConf = require('./knexfile');
const knex = require('knex')(knexConf);

// insert anything you like, then release the connection pool, e.g.:
// knex('users').insert({ name: 'admin' }).then(() => knex.destroy());

I don't see much benefit in having a separate seed file. For running tests, creating separate custom populate schemes is a lot more flexible and faster than using the knex client + seed file system.

@asequeir-public

Adding a table after the app has gone to production and seeding it is my actual use case.

@elhigu
Member

elhigu commented May 27, 2018

@asequeir-public You should use migrations for that.

@renerpdev

I am starting a new project to seed tables with knex.js, here. Any help would be nice!

@pmead

pmead commented Sep 24, 2018

@asequeir-public You should use migrations for that.

For schema changes, and for changes to existing data that result from a schema change, sure; but new data (especially large amounts of it) can be quite problematic in migrations. Talking to peers and googling around a bit, there seems to be a general consensus across frameworks that migrations are kept free of data where possible.

@elhigu
Member

elhigu commented Sep 25, 2018

but for new data (especially large amounts of it)

Well, if that is static data to populate into every DB (for example information about categories, which are the same for every installation, etc.), migrations are exactly the correct place to do it.

Otherwise it depends on the data. Populate scripts are not good for adding a large amount of data which is suitable only for a single database and not for other installations / test setups.

If it's one-time data to add to production, it should be done with a script which pushes the data in where it belongs. After that it should be restored from DB backups, since applying it again would break everything.

I would rather like to have a feature which drops the whole populate script system from knex, and maybe an external package for the people who like to have it (seeds don't implement any functionality that wouldn't be trivial to implement with a simple node.js script, as mentioned in comment #801 (comment)).

@ricardograca
Member

Well if that is static data to populate to every DB (for example information about categories, which are the same for every installation etc.), migrations are exactly correct place to do it.

I also disagree with this. That is precisely what seeds are for, not migrations. Migrations are only meant to change the database structure, not its data. On top of that migrations are supposed to be transitory, and can be deleted once they have served their useful purpose. If they contain data this will not be possible.

@kibertoad
Collaborator

@ricardograca Theoretically, yes. Practically, migrations are the best way to ensure that you have consistent datasets across different environments. And I disagree that "migrations can be deleted"; typically you want to keep them around forever so that you can always recreate an environment from scratch (at least the part that is not dependent on external input).

@kibertoad
Collaborator

@pmead What we typically do in production with big datasets is use migrations to call importers, and store the actual data in CSV files that may or may not be stored in the repo (based on how much long-lasting value they have and on their size).

@ricardograca
Member

...typically you want to keep them around forever so that you can always recreate an environment from scratch (at least the part that is not dependent on external input).

Not really. You only want to keep them around forever now because there's no better system to recreate the schema from scratch with Knex, like the schema file functionality from Rails. Even so, once all machines that are running your code are updated to the latest version of the schema, you can safely delete all migrations and create a new one based on the current state that takes the place of the initial migration. If you work on a big project you'll end up with hundreds or even thousands of migrations if you don't clean them up from time to time.

It is super wasteful to have to run a bunch of migrations on a new machine where you deploy your code (or every time you run tests). In this case you probably don't want to go through the whole history of migration changes just to get to the latest state; you just want the latest state.

@pmead

pmead commented Sep 25, 2018

One particular use case where population scripts are useful is when you want the data set to be refreshed, but it doesn't happen frequently enough to warrant another method, e.g. an annual import of data.

Migrations at that point feel somewhat cumbersome: for some environments they will have already run, meaning you would need a new migration and couldn't just change the CSV (which I think is what would happen in what you're mentioning, @kibertoad?), and new environments end up unnecessarily loading potentially huge data sets again and again.

I also think the split of schema -> migrations, data -> seed / populate scripts is a nicer separation of concerns and holds fewer surprises in large projects.

Interestingly @ricardograca Rails is also the paradigm I'm mentally mapping this to, although that has its own set of problems.

@elhigu
Member

elhigu commented Sep 26, 2018

Just to start, regarding actual support for seeds, I want to say clearly:

Having support for seeds in knex doesn't help with any of the points discussed in this reply.

Now that being said, we can continue philosophical discussions about migrations and schema things :)

I also disagree with this. That is precisely what seeds are for, not migrations. Migrations are only meant to change the database structure, not its data. On top of that migrations are supposed to be transitory, and can be deleted once they have served their useful purpose. If they contain data this will not be possible.

That sounds like a really Rails-centric and purist view of the subject. I would like to hear any real, actual reason why a migration would need to be thrown away after it's used, and why migrations should not contain any data.

If the only reason is the accumulation of thousands of migration files, that can also be prevented with knex, using initial / midway database dumps (I have really good tools for that in my own projects, for various purposes).

You can throw them away when they have served their purpose (for example when new initial database dump is created with all necessary initial data). Usually people find it easier to keep migrations in repo though.

Not really. You only want to keep them around forever now because there's no better system to recreate the schema from scratch with Knex, like the schema file functionality from Rails.

Knex cannot adopt that approach any time soon, and even Rails' schema file is quite limited in the SQL features it supports. Knex will never be Rails (but it might eventually get something like schema file support).

It is super wasteful to have to run a bunch of migrations on a new machine where you deploy your code (or every time when running tests). In this case you probably don't want to go through all the history of migration changes just so that you can get to the latest state, you just want the latest state.

Yes, with knex, database dumps are really useful for larger projects and for tests, to reset to the latest initial schema + data. I know it would be nice to offer better tooling and some best practices for that.

@elhigu
Member

elhigu commented Sep 26, 2018

@pmead

I also think the split of schema -> migrations, data -> seed / populate scripts is a nicer separation of concerns and holds less surprises in large projects.

It doesn't work so well in real life, and you can easily have a directory full of database population scripts without any seed wrapping around them. Then you can just call node seeds/refresh-yearly.js and your data gets populated.
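Such a populate script might look like this (the table and column names are invented for illustration; the knex instance is passed in so the logic stays testable):

```javascript
// seeds/refresh-yearly.js — hypothetical standalone populate script.
// Table and column names are assumptions, not part of any real schema.
function refreshYearly(knex) {
  // wipe the old data set, then load the new one
  return knex('yearly_rates')
    .truncate()
    .then(() => knex('yearly_rates').insert([
      { year: 2018, rate: 0.21 },
      { year: 2019, rate: 0.23 },
    ]));
}

module.exports = refreshYearly;
```

To run it directly with `node seeds/refresh-yearly.js`, wire up a knex instance from your knexfile, call the function, and remember to call `knex.destroy()` afterwards so the process can exit.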

Interestingly @ricardograca Rails is also the paradigm I'm mentally mapping this to, although that has its own set of problems.

Yeah, Rails is quite a different setup, and Active Record is a much higher-level tool than knex. Sure, we can learn from there too.

@ricardograca
Member

I would like to hear any real actual reason why migration would need to be able to throw away after its used and why they should not contain any data.

Well, it's mainly an issue when it comes to setting up a new machine. For example, why would you want to go through the whole process of creating a column, modifying it, modifying it again, and again, and then dropping it? This will happen if you keep all the migrations, and it makes no sense. This is just an example; the same thing will happen with tables, and a large project can easily accumulate a lot of small changes. In my mind it makes no sense to go through the whole history of changes; that's why migrations should be transitory. Note that this also applies to a test suite when using Knex, where the whole migration list is run in between each test, which slows down the process.

The only upside I can think of when it comes to keeping migrations forever is that you can go back in time to an older commit and rebuild the database as it was at that point in time. You'll have to drop all tables before attempting that though, so the usefulness is debatable.

All of these issues would be solved by "schema file"-like functionality. And by the way, the limitations mentioned about Rails are true, but it can also generate an SQL file which doesn't have such limitations, and that file can be used with the integrated schema creation tool, so there's no need to run extra scripts or anything. I know it's not related to this feature, so I'm not even sure why we're discussing this here 😅

The bottom line is that I think this feature request is valid, and even if you prefer to use migrations for inserting data into the database I don't see a conflict with what this is asking.

@elhigu
Member

elhigu commented Sep 26, 2018

it makes no sense to go through the whole history of changes, that's why migrations should be transitory. Note that this also applies to a test suite when using Knex, where the whole migration list is run in between each test, which slows down the process.

No need to run migrations before every test; once per test suite is enough, and after that truncating data + inserting new data is enough. Running/rolling back migrations before each test is a bad practice. Also, initial SQL dumps loaded into the DB before running the tests eliminate the need for that.

The bottom line is that I think this feature request is valid, and even if you prefer to use migrations for inserting data into the database I don't see a conflict with what this is asking.

As mentioned in this thread a few times already, one does not need the seed file system for that at all, since it is as easy to initialize a new knex instance from a simple node.js script as it is to write it in the seed-file format. Seed files just add unnecessary complexity.

@acoulon99

There are paradigms where you don't want to maintain consistent DB state across your environments. A/B testing is one example. Separating DB schema from DB data adds additional flexibility.

@elhigu
Member

elhigu commented Nov 23, 2018

Discussion is running in circles. Locking further discussion.

If someone wants to extend how seeds are run through the knex client, a pull request is welcome.

However, we should not add support for storing information about which seeds have already been run. If you need to verify that the same seed is not run multiple times, then you should use migrations instead.

@knex knex locked as resolved and limited conversation to collaborators Nov 23, 2018
@ricardograca
Member

ricardograca commented Nov 25, 2018

This isn't asking to store information about which seeds have run, or to verify that the same seed is not run multiple times. It's only asking for the ability to specify a single, arbitrary seed file to be run, e.g. knex seed:run --grep=users.

@knex knex unlocked this conversation Nov 25, 2018
@kibertoad
Collaborator

@ricardograca It does make sense. How do you think the seed file could be identified? By full filename?

@knex knex locked as resolved and limited conversation to collaborators Nov 25, 2018
@ricardograca
Member

I think that would make the most sense, yes. However, I'm not sure if full grep-like functionality is needed, or just something like --single=my_seed_file.js.

@elhigu
Member

elhigu commented Nov 25, 2018

Glob has been the most common tool I have seen for matching multiple or single file names in the node world.

felixmosh added a commit to felixmosh/knex that referenced this issue Jul 6, 2019
felixmosh added a commit to felixmosh/knex that referenced this issue Jul 13, 2019