Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] Improved submodule support #572

Open
lorenwest opened this issue Aug 4, 2019 · 35 comments · May be fixed by #670
Open

[feature] Improved submodule support #572

lorenwest opened this issue Aug 4, 2019 · 35 comments · May be fixed by #670

Comments

@lorenwest
Copy link
Collaborator

lorenwest commented Aug 4, 2019

Let's use this thread to host issues related to upgrading submodule support. We should start by listing the functionality we'd like to see. I'll start:

  • Dual purpose - Submodules run both as submodules and as first class applications. Mostly they run as apps for testing, but they must be able to run in either mode. When running in application mode, all configurations at the top level of their config object are used. When running in submodule mode, only the top level config branch matching the module name is used.

  • Submodule config as default - When loaded as a submodule, a full config object should be loaded from files in the /config directory of the submodule. Once loaded, the top level branch of that config object named by the module name is folded as defaults into the same top branch of the main app config.

  • Late loading - A healthy modular system doesn't require the main app to have direct knowledge of all modules in the dependency graph. The main app knows only of dependencies listed in package.json. It should not be forced to change as the dependency graph changes over time. In addition to this lack of knowledge, the main app doesn't control the timing of submodule loading. Submodule configs must be able to be loaded as the runtime discovers the need to load the module.

  • Backward compatibility - I'm listing this as a highly desired but optional feature. This feature may force us to hide the config object, requiring all access through get() in order to guarantee immutability.

@lorenwest
Copy link
Collaborator Author

Ideally we can establish a pattern that doesn't classify a module as submodule - where all modules using node-config work in both modes.

@lorenwest
Copy link
Collaborator Author

We must address the reality that some submodules will be loaded multiple times with different versions. We can either:

  1. Try to address the complexity of multiple config values for different submodule versions, or
  2. Allow only one version - first loaded wins, then gets merged into the top level app config

I fear that if we support #1, this will make app level overrides impossible (which version is the app overriding?), leaving #2 the only viable option - unless there's a #3 I'm not seeing.

@iMoses
Copy link

iMoses commented Aug 6, 2019

Ideally we can establish a pattern that doesn't classify a module as submodule - where all modules using node-config work in both modes.

I'm not sure this is desirable. In some cases you would like to have a completely separate and independent config instance, like when using 3rd party packages. A package owner in most cases shouldn't pollute the configuration instance of its users. For these cases I differentiated between Config.create() and Config.subModule(). The former creates a completely new instance while the later creates an instance derived from the original configuration instance, and keeps the connection to it. I believe this is an important distinction.

Can you provide some sort of "code samples" of how you imagine the functionality to look and function like?

@lorenwest
Copy link
Collaborator Author

First some definitions to clarify communication

  • module - Any module that has a package.json file and a ./config directory.
  • app - A module running as the entry point.
  • submodule - A module not running as the entry point.

When a module is running in app mode, it's the entry point. All modules run in this mode at some point - many for testing, and a few as long lived app entry points. There is no difference between an app module and a submodule except how its being run in a particular invocation.

The reason is to allow all modules to configure not only their configs, but also their dependent submodules for testing purposes.

Imagine a module called myModule. Here's an example from the default.js under its config directory:

{
  myModule: {
    ...all myModule configs
  },
  dependentModule1: {
    ...my overrides for the defaults from dependentModule1
  },
  ...
}

This places all module definitions under a top level branch named by the module. It doesn't require independent config objects and is consistent with the existing submodule naming convention.

@lorenwest
Copy link
Collaborator Author

Most importantly this supports the concept of modular based coding. It recognizes that all modules are run as apps at some point, and some (hopefully most) are also reusable as submodules of another module running as an app.

Another term used in this discussion:

  • folding - The process of starting at default.json and applying configurations based on NODE_ENV, etc.

Given the above example, when a module runs in submodule mode, only the myModule branch is brought into the app. It's brought in at time of myModule load, and it's brought in with all folding done under its ./config directory. That folded branch becomes the default for that branch of the app module. The app module then folds app module specific configurations of myModule.

@lorenwest
Copy link
Collaborator Author

lorenwest commented Aug 6, 2019

The above paradigm may use multiple objects under the covers, but only to build and fold the submodule branch into the main config object.

This keeps existing applications running as-is, supports the submodule naming convention already in place, and keeps the single config object exposed to the application.

It remains 100% backwardly compatible and allows node-config users to start using the new functionality at their own pace. In my opinion the biggest advantage is the realization that all modules are created equally, and the only difference is how they're invoked.

@lorenwest
Copy link
Collaborator Author

This prevents a submodule from polluting the main config object. Only configurations from the submodule branch are folded into the main config object. Other configs (used when the module is run as an app) are ignored.

@iMoses
Copy link

iMoses commented Aug 6, 2019

@lorenwest I'm not sure how to enter this discussion as I don't believe we are on the same page regarding the topic to discuss.

Personally I'd like to discuss refactoring the API, remove support for utilities methods and offer official methods to handle the configuration instance. In my proposal implementing the logic you presented is very easy, like two lines of code easy... But if I had to implement this feautre on the existing API then I'm stuck and my intial proposal goes back to "let's refactor the API first".

But I'm assuming that this is something you are trying to avoid and implement with as little change to the API as possible, and I don't see the clear benefit of it. The API today is borken and patchy, this is part of the reason why such a simple feature has been on the waiting list for over 5 years, and even if we do implement this feature it's still gonna operate in a broken environment with borken immutability and such.. So why not take care of the real issue first and then add support for this kind of logic on top of it?

@iMoses
Copy link

iMoses commented Aug 6, 2019

If I understand you correctly then all I need from the user of the library is to know the main directory of the module, to override the cwd which holds the main directory of the application, this way I can apply all the original logic on the config directory local to the module, so it might look something like that:

const config = require('config').subModule('MyModule', __dirname);

And of course the ability to lazy load the main configuration files so that they be resolved last.

@lorenwest
Copy link
Collaborator Author

I'm not opposed to a refactor, but that's the subject of a different thread. In fact I'm in favor of a refactor as long as justifications for possible compatibility issues aren't confused with this feature.

You're right that the underlying code is patchy, but that alone doesn't justify user work. Any work we ask the user to do because of incompatibility must be justified by benefit to the user.

@lorenwest
Copy link
Collaborator Author

If you'd like to put this thread on hold and start another on the API refactor, I'd be happy to chime in there.

@lorenwest
Copy link
Collaborator Author

In reference to this:

const config = require('config').subModule('MyModule', __dirname);

That's very close to how I see it. This should work if you're running as an app or a submodule, so maybe the way you register your module to participate as either an app or submodule is this:

const config = require('config').registerModule('MyModule', __dirname);

If __dirname is cwd(), then it's running as an app, otherwise it's running as a submodule.

This would be done in the .js file representing the module - specified as main in package.json or index.js by default.

@iMoses
Copy link

iMoses commented Aug 6, 2019

I agree that we should have a good enough justification to force users to migrate their code, that's why I actually believe it's a good idea to discuss sub-module as you see them because if support to such features as sub-modules and the parser is by far better with the refactor (that only effects advanced users - aka the users who uses them) then I believe we'll have this justification.

So as long as we talk about this feature in a version agnostic way I'm on board.. Later we'll see what can be done regarding the actual implemantation.

@iMoses
Copy link

iMoses commented Aug 6, 2019

@lorenwest of course you're right about the naming, this should be simply called a module and not a sub-module..

What would you expect to return from this methods call in case of sub-modules, the entire configuration object or just the sub-config representing the module, disabling access to anything outside the module?

P.S.
on second thought it may be better to take an options object for easier refactoring and extensions in the future.

const config = require('config').registerModule('MyModule', {cwd: __dirname});
config.get('prop'); // will it return config.prop or config.MyModule.prop?

@lorenwest
Copy link
Collaborator Author

I believe the return from the above method would be the same as require('config') - the top level config object. No need for registerModule to hide other top level configs by doing an implicit get('myModule').

const config = require('config').registerModule('MyModule', __dirname);
config.get('myModule.prop');

I'm on the fence regarding {cwd: __dirname}


I'm on board with discussing a refactor and a new API as long as any major version bumps stand on their own merit. Many features justify a major version bump, just not this one.

Also - if other features justify a major version bump, I'm not opposed to including this feature in the new version, even implementing this with features in the major version.

@iMoses
Copy link

iMoses commented Aug 6, 2019

I'm on the fence regarding {cwd: __dirname}

Do you think it's possible that in the future we'll want to set more options here? I can think of a couple of possibilities..

  • version - in case we wanna manage different versions of the same module and allow fetching them by version
  • environment - override certain env-vars (or other options?) that effect files loading

But if your reason to dislike it is because you wanna keep it as simple as possible then there's no need.. I just don't want it to come back and bite us in the ass a year from now :)

@markstos
Copy link
Collaborator

markstos commented Aug 6, 2019

I'm just going to add a note about backcompat here. As a user of the module, of course I want to minimize my own upgrade pain. Breaking changes should be meaningfully different.

As a module maintainer, in the last five years our user-base has doubled at least 4 times (ref: https://npm-stat.com/charts.html?package=config&from=2014-08-05&to=2019-08-05 ). That's almost one doubling per year!

Each time that happens, that means the majority of users are new users. If we want to optimize for giving the best experience to the most users, we should expect the doubling trend to continue at least one more time and optimize for new users, making breaking changes if they make for a better module. Existing users can stick with the current major version, or upgrade and adapt.

@lorenwest
Copy link
Collaborator Author

I noticed a misspelling in the module name (MyModule vs. myModule):

const config = require('config').registerModule('MyModule', {cwd: __dirname});
config.get('myModule.prop');

Due to this, lets use module name from package.json to prevent these bugs - we have the __dirname to load it:

const config = require('config').registerModule({dir: __dirname});
config.get('myModule.prop');

(also - the key cwd isn't correct as it's not the current working directory, it's the module directory)

@iMoses
Copy link

iMoses commented Aug 7, 2019

@lorenwest just a thought.. if we're reading the package name it may contain a scope. in such cases. do we want to ignore it?

e.g.

"name": "@imoses/my-module"

What should the module name be inside the configuration object?
"@imoses/my-module"? "my-module"? "myModule"? "MyModule"?
naming convension is a bit of a problem when relying on the package name..

I made a first implementation for testing purposes..
https://github.com/iMoses/node-config/blob/next-gen/lib/config.js#L151

@lorenwest
Copy link
Collaborator Author

If done well, I can imagine a future where a /config directory is commonplace for modules. Module developers can't choose random names for their top level object or you'll surely get clashes. Using npm as the arbitrator makes good sense.

In that world I would imagine config.get('@imoses/my-module.param') would work as well as config.get('@imoses.my-module.param').

Either way I think we should enforce name uniqueness using the package.json file, and expect a large number of disconnected modules to use this feature of node-config.

@iMoses
Copy link

iMoses commented Aug 7, 2019

Just to be clear, we're talking only about cases in which users decide to registerModule, otherwise we remain with the current mechanism. If that's true, how do we prevent clashed between registered modules and the app module which is loaded by the client and isn't forced to registerModule?

Perhaps the modules system should be a side mechanism? Either manage a separate global configuration object for registered modules or even completely separate the functionality (but still use the Config class) at config/module instead of config..

@lorenwest
Copy link
Collaborator Author

tl;dr: This should be considered the new way to use node-config vs. separate functionality.


Users generally don't decide to call registerModule(), modules decide on their own that they want to participate as a module config. The app may not even know that a module is doing this.

So there's a chance of namespace collision in the config namespace. Our philosophy of locking the config object on get prevents accidental clashing - or at least exposes the problem quickly.

The migration to this mechanism at our company wasn't difficult. The main change was that apps were considered modules in their own right, and created a top level namespace for the app module to prevent collisions.

Once a person realizes all modules can participate this way, adding a top level namespace for your module quickly becomes the new normal. The migration was fast in our company.

@iMoses
Copy link

iMoses commented Aug 14, 2019

I have to say that I don't like this direction.. I believe (and my proposal is based on that belief) that the library should provide more flexibility to help users accomplish their needs even if it's not part of the library's default heuristics. Your proposal is to instead change the heuristics to strengthen strictness and force the library's methodology on our users.

As a user of this library I don't like this approach, and as a contributor I'm asking myself how common is this use case that you are trying to promote? Because it seems to me that very few users of this library would benefit from such a change, while most will suffer from more strictness and may very well choose a different library to manage their configurations, and I think that's a shame.

I'm not against having this functionality, but after taking a few days to think about it I still believe that separate functionality (config/modules) is the right way to do it. I wish to keep the library's next version heuristics the same as they're today and allow access to the internal mechanisms so that power users can modify and influence some of the logic, and this discussion seems to be going in the other direction.

@lorenwest
Copy link
Collaborator Author

I'd be inclined to agree if I thought this were fringe functionality, of the type that one module author could choose one implementation and another choose a different mechanism.

Modules will use this, and I believe it's of little use - even a disservice - to have multiple mechanisms that may or may not interoperate.

That said, none of this proposal has to do with where the code lives to manage the feature. I've been careful to keep the discussion at the functionality level vs. the implementation level, and if config/modules is where you think it should live, I don't have any problem with that.

The one thing we may disagree on is managing independent objects for each module vs. using the existing pattern of assigning a top level namespace in the config object to a submodule. I don't see the value in changing this convention, but I'd be open to a discussion on that or any other functionality.

@lorenwest
Copy link
Collaborator Author

Of the major features, backward compatibility is at the bottom - as a nice to have.

If we could get the other features and it comes with a different API for interacting with the config object (or objects), I'd still be supportive.

If this is the biggest issue, let's move the discussion to the pros/cons of managing multiple config objects. If the outcome is to support multiple config objects and it happens to make submodule configs easier to implement, that'd be ok as long as the major features are intact.

My guess is that multi-object support can stand on it's own - I just didn't want to conflate that concern with advanced submodule support.

@iMoses
Copy link

iMoses commented Aug 15, 2019

The one thing we may disagree on is managing independent objects for each module vs. using the existing pattern of assigning a top level namespace in the config object to a submodule. I don't see the value in changing this convention, but I'd be open to a discussion on that or any other functionality.

I'm not suggesting we separate between the configuration object of each module, but that we have a separation between registered modules as a whole and the main app's configuration (in case it isn't registered as a module), and that we don't force main app's to register as such.

It could live in a separate file with a separate API or be part of the main Config instance API, but whatever we choose in that regard I believe we should manage them differently.

Modules configuration should be linked to a single global config object that holds only and all of the registered modules configuration (including different versions of it, if exist), and in addition we'll have the current/new API that will remain the same and will not be effected or polluted by registered modules.

We could decide to change the current API to support fetching modules config (e.g. get() will know how to distinct and fetch from both config objects) or simply add new methods to interact with modules, if we decide to keep it as part of the default Config API, but that's something I think we should discuss only once we agreed on the basics.

What I wouldn't wanna see happening is us forcing users to register modules if they don't wanna use this mechanism. The idea for separate configuration instances is a part of the new proposal which is meant to provide an alternative to package owners who want to keep their configuration privately managed and perhaps even through different heuristics. (e.g. usage of middleware)

@lorenwest
Copy link
Collaborator Author

So instead of multiple config objects you're suggesting two. I could get on board with that, but I'd need to wrap my head around the mechanics of usage.

For example, how would you propose we separate submodule config extension by the main app, with environment specific overrides? As a set of parallel files next to the main app environment overrides?

For discussion purposes, let's use a real example. Let's say we have a cache module that uses cache provider modules (in-memory, filesystem, memcached, redis, etc.). The app that uses the cache module needs to configure the cache provider on a deploy-by-deploy basis. Local development uses in-memory, pre-prod uses memcached, and prod uses redis.

Where would you define these providers in the cache module for development/testing, and where would you define the providers in the host app configs?

In my proposal they'd all be in the same config files we use today - under the namespace of the cache module.

@iMoses
Copy link

iMoses commented Aug 15, 2019

I was thinking that if a main app wants to play along with modules then it should register as such. As you said about the process your company went through:

Once a person realizes all modules can participate this way, adding a top level namespace for your module quickly becomes the new normal. The migration was fast in our company.

Does it make sense to you? It's basically offering two modes of operation, and I understand your reasons to dislike it, I share them myself..

Personally, I don't see a use-case in which a user would want to access the configuration of modules from outside of their domain (company), or a package that will allow its clients to override the configuration only if they install our library.

The only use-case I see is when you are managing under your domain multiple modules and would like them to share configuration, in which case I think it's okay to always force you to register a module, even in case of a main app that would never run as a sub-module.

BUT if you don't need this mechanism then this should be invisible to you, even if some of your pacakges are using this mechanism behind the scenes.

@lorenwest
Copy link
Collaborator Author

You can't imagine a generic cache library with multiple cache strategies (each as a module) and multiple backend providers (each as a module) needing configuration from outside a domain?

I can imagine the standard way of module development including a /config directory and module instantiation parameters. A module using another module could choose to store configurations under its own /config directory (and use node-config), or use a different config library and pass configs onto the module constructor.

If they choose node-config, then they have a standard for the placement of configs. They don't have a million places configs could reside because we don't support storing your configs wherever you want.

@lorenwest
Copy link
Collaborator Author

Having multiple config objects does solve the variable clash problem, but now you have to separate those configs - either into multiple files, or multiple sections within one file.

I'd like to hear your solution to this.

@iMoses
Copy link

iMoses commented Aug 15, 2019

You can't imagine a generic cache library with multiple cache strategies (each as a module) and multiple backend providers (each as a module) needing configuration from outside a domain?

Let's say I'm the proud owner of the express.js alternative Xpress. I have middleware, cache, templates, etc. Some of them I've created and some belong to the community, many plugins and middleware.

Personally I use the modules mechanism of node-config to centralize configuration management between my modules. Because all of my library is run as sub modules, no module in my domain can override another module's configuration. It's strictly for centralization and the ability to access one module's config from another.

If my users wanna override my configuraion they can do it, but they must install node-config as a dependency. Most likely is I won't wanna force my users to be dependent on another package, so I'll offer a regular mechanism to pass/override default configuration, and although the node-config feature is nice it is also unnecessary because I already have a way of passing config and supporting both may even require a specific architecture from me, unlike what I currently have.

I can perhaps force my plugins and middleware creators to install node-config and manage their configuration with mine, that makes more sense to me, but that's a something I'll have to consider as package owners usually want to be dependent on as little as possible, and I'll most likely be forced to add a default way to access these as well. Anyhow, only the main app will be able to override configuration from outside these modules, and I don't have control over it.

Am I getting this wrong?


Having multiple config objects does solve the variable clash problem, but now you have to separate those configs - either into multiple files, or multiple sections within one file.

I'd rather have multiple sections than files. I think we can solve it nicely by using a predefined Symbol name as a top level property.

// config.js
module.exports.moduleSymbol = Symbol('config-modules');

// default.js
const config = require('config');
module.exports = {
  myProp: {},
  [config.moduleSymbol]: {
    '@some/module': {
      prop: 123,
    }
  }
};

Or alternatively:

module.exports = {
  myProp: {},
  [config.moduleDef('@some/module')]: {
    prop: 123,
  }
};

We could add something to none js files such as "~@some/module"... I'm not creazy about this idea, I'll try to think of alternatives.. Though I'm still not convinced there's a use-case for mixing between the two mechanisms, it'd be much simpler if either defined yourself as a module or not.

@lorenwest
Copy link
Collaborator Author

We agree that a good module pattern is to define your configs and default values using node-config, and allow users of your module to choose between using node-config or module construction parameters for config overrides. Users of your module should have their own choice for configuration.

One thing you may yet to see is my #1 requirement at the top of the thread. 100% of the modules we write are invoked both independently, and as a dependent of another module.

At a minimum, test invocations require local overrides of both your module and dependent modules. Your module is an application in this mode, and this can't be overlooked when evaluating implementations.

@lorenwest
Copy link
Collaborator Author

The value of multiple config objects (even just 2) must be weighed against the complexity of namespace separation, and must take into account node-config installations that choose .json or other non-executable config file formats.

@markstos
Copy link
Collaborator

This conversation seems productive so far. @iMoses What do you think about @lorenwest most recent feedback?

@iMoses
Copy link

iMoses commented Oct 19, 2019

@markstos unfortunately life happened, my company closed down and I don't have as much time atm to invest in this feature, which honestly I basically understand but am not the audience for it so it's harder for me to solve a problem I never had or wanted to address.

I was mainly aimming towards making this library more flexible towards its users with my next-gen proposal but this new approach is quite the opposite of it. If we are aimming to go in different directions then I have no choice but to fork this library for the time being so that I could use "next-gen" in my current projects and neglect this feature.

Sorry, I just don't see enough value in it to invest more time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants