long list vs single file per prototype (AKA: YAML vs TOML) #1236

Yam76 · 2020-08-17T02:31:51Z

This is with regard to prototype file structure.

Long list pros:

All objects of a category are contained in one list, one file

Long list cons:

representing a prototype file as a list is not semantically meaningful (i.e., we don't actually care about order, as far as I can tell, but YAML variable references exist)
extremely long lists are annoying to edit

Single file per prototype pros:

Easy to tell at a glance what is contained in the category
Easier to edit, add, and delete prototypes

Single file per prototype cons:

Objects are scattered across multiple files
naming files is annoying, e.g. microwave recipes don't need id's

YAML pros:

top level arrays and human-readable, makes it good for these long lists

YAML cons:

Billion laughs attack (do we even care about this?)
bloated spec

TOML: A human-readable configuration language.

TOML pros:

No billion laughs (again, do we even care?)
slightly less bloated spec
spec more geared to dictionaries - good for prototype configuration

TOML cons:

No top level array - more fitting for single file prototype format
No variable references

Note that I'm not particularly advocating for this version of prototyping, but I didn't catch any discussion about TOML, etc., so I thought I'd throw this out there.

Part of the problem (that I see) is that the current format ties meaning (a set of prototypes) to structures (a long list) when it could and probably would be more convenient to separate meaning from structure, allowing separate files for different prototypes when it suits the situation. But it's entirely possible that this is not a big enough issue to worry about.

PJB3005 · 2020-08-17T11:59:40Z

extremely long lists are annoying to edit

We could make it so that prototypes can directly be put at the root of the document, then the YAML could look like this, which would make it slightly easier but not by much:

type: entity
id: foo
---
type: entity
id: bar

Single file per prototype pros:

It might be a good idea to just do this, yeah. Without IDE support the only way to "find" prototypes is by doing a project-wide text search. If prototype names matched the files they're in that'd be much easier.

Billion laughs attack (do we even care about this?)

YamlDotNet actually deserializes references as actual references to the same instance, at least when using the AST stuff (like we do). So this isn't an actual issue for us I don't think. Even then we're not accepting any malicious YAML currently so it's fine.

bloated spec

This isn't the worst problem though related is poor loading performance (see this issue I raised on the YamlDotNet repo)

spec more geared to dictionaries - good for prototype configuration

Not really. TOML is basically only good for simple k/v configuration files. Prototypes (with many layers of nesting of complex data) are completely incompatible with this. Trust me, I put all the config files for MoMMIv2 (my Discord bot) in TOML and immensely regret it (nested datastructures for every server's data and such).

Yam76 · 2020-08-18T05:35:34Z

spec more geared to dictionaries - good for prototype configuration

Not really. TOML is basically only good for simple k/v configuration files. Prototypes (with many layers of nesting of complex data) are completely incompatible with this. Trust me, I put all the config files for MoMMIv2 (my Discord bot) in TOML and immensely regret it (nested datastructures for every server's data and such).

I got the impression that TOML is more suited towards flat files, but I am surprised that it had such a high cost since it can basically be JSON with slightly nicer syntax (though JSON isn't so nice to write either way).

We could make it so that prototypes can directly be put at the root of the document, then the YAML could look like this, which would make it slightly easier but not by much:

Single file per prototype pros:

It might be a good idea to just do this, yeah. Without IDE support the only way to "find" prototypes is by doing a project-wide text search. If prototype names matched the files they're in that'd be much easier.

Maybe we could do something like this:

YAML files with a top level array are deserialized as we are currently doing.
YAML files with a top level map are treated as a single prototype.

Then it's very easy to

Group related prototypes into one file when they don't have a clear naming scheme, e.g. microwave recipes
Separate prototypes into their own files when needed (for e.g. heavy nesting, convenient browsing).

This isn't the worst problem though related is poor loading performance (see this issue I raised on the YamlDotNet repo)

If this is a large enough problem, I wonder if it would be worth it to gut the YamlDotNet and replace it with a C/C++/Rust YAML library?

PJB3005 · 2020-08-18T06:08:44Z

I got the impression that TOML is more suited towards flat files, but I am surprised that it had such a high cost since it can basically be JSON with slightly nicer syntax (though JSON isn't so nice to write either way).

The core problem with TOML is that actually nesting stuff becomes an unreadable mess. The need to constantly re-specify headers for nested tables is awful and readability is trash because of lack of indentation. JSON at the very least doesn't have these issues. Though obviously JSON has its own issues.

Compare, I converted one of the power prototypes to "what TOML would look like":

- type: entity
  id: DebugSmes
  name: Debug Smes
  placement:
    mode: SnapgridCenter
  components:
  - type: Clickable
  - type: InteractionOutline
  - type: Collidable
    shapes:
    - !type:PhysShapeAabb
      bounds: "-0.5, -0.5, 0.5, 0.5"
      layer: [MobMask, Opaque]

  - type: SnapGrid
    offset: Center
  
  - type: Sprite
    netsync: false
    sprite: Constructible/Power/smes.rsi
    state: smes
    layers:
    - state: smes-display
      shader: unshaded
  
  - type: Icon
    sprite: Constructible/Power/smes.rsi
    state: smes
  
  - type: Smes
  - type: Appearance
    visuals:
    - type: SmesVisualizer
  
  - type: Battery
    maxCharge: 1000
    startingCharge: 1000
  
  - type: NodeContainer
    nodes:
    - !type:AdjacentNode
      nodeGroupID: HVPower
  
  - type: PowerConsumer
  - type: BatteryStorage
    activeDrawRate: 1500
  
  - type: PowerSupplier
  - type: BatteryDischarger
    activeSupplyRate: 1000
  
  - type: Anchorable

[[prototypes]]
type = "entity"
id = "DebugSmes"
name = "Debug Smes"

[prototypes.placement]
mode = "SnapgridCenter"

[prototypes.components.Clickable]
[prototypes.components.InteractionOutline]
[[prototypes.components.Collidable.shapes]]
type = "PhysShapeAabb"
bounds = "-0.5, -0.5, 0.5, 0.5"
layer = ["MobMask", "Opaque"]

[prototypes.components.SnapGrid]
offset = "Center"

[prototypes.components.Sprite]
netsync = false
sprite = "Constructible/Power/smes.rsi"
state = "smes"

[[prototypes.components.Sprite.layers]]
state = "smes-display"
shader = "unshaded"

[prototypes.components.Icon]
sprite = "Constructible/Power/smes.rsi"
state = "smes"

[prototypes.components.Smes]
[[prototypes.components.Appearance.Visuals]]
type = "AdjacentNode"
nodeGroupID = "HVPower"

[prototypes.components.PowerConsumer]
[prototypes.components.BatteryStorage]
activeDrawRate = 1500

[prototypes.components.PowerSupplier]
[prototypes.components.BatteryDischarger]
activeSupplyRate = 1000

[prototypes.components.Anchorable]

I'd say YAML wins hands down here already (and we could still make some improvements like mapping the components sequence a proper mapping instead).

Maybe we could do something like this:

YAML files with a top level array are deserialized as we are currently doing.

YAML files with a top level map are treated as a single prototype.

Yes, this would work fine

If this is a large enough problem, I wonder if it would be worth it to gut the YamlDotNet and replace it with a C/C++/Rust YAML library?

Forking/optimizing YamlDotNet would probably be easier.

PJB3005 · 2020-08-18T06:11:05Z

Also here's a version of the prototype using proper YAML type tags:

- type: entity
  id: DebugSmes
  name: Debug Smes
  placement:
    mode: SnapgridCenter
  components:
  - !Clickable {}
  - !InteractionOutline {}
  - !Collidable
    shapes:
    - !type:PhysShapeAabb
      bounds: "-0.5, -0.5, 0.5, 0.5"
      layer: [MobMask, Opaque]

  - !SnapGrid
    offset: Center
  
  - !Sprite
    netsync: false
    sprite: Constructible/Power/smes.rsi
    state: smes
    layers:
    - state: smes-display
      shader: unshaded
  
  - !Icon
    sprite: Constructible/Power/smes.rsi
    state: smes
  
  - !Smes {}
  - !Appearance
    visuals:
    - type: SmesVisualizer
  
  - !Battery
    maxCharge: 1000
    startingCharge: 1000
  
  - !NodeContainer
    nodes:
    - !type:AdjacentNode
      nodeGroupID: HVPower
  
  - !PowerConsumer {}
  - !BatteryStorage
    activeDrawRate: 1500
  
  - !PowerSupplier {}
  - !BatteryDischarger
    activeSupplyRate: 1000
  
  - !Anchorable {}

Which might be even better.

Yam76 · 2020-08-19T04:49:08Z

That expanded TOML is pretty bad, thanks for writing it out.

The YAML type tags are interesting, but I wonder if the additional complexity is a significant enough barrier to be worth just keeping it simple.

PJB3005 · 2020-08-21T15:59:32Z

The YAML type tags are interesting, but I wonder if the additional complexity is a significant enough barrier to be worth just keeping it simple.

I mean all the parsing complexity is handled by the YAML library, the extra syntax distinction helps (better syntax highlighting) and they're very easy to "work with" in the engine.

Yam76 · 2020-08-22T03:09:23Z

Good point. Thank you for discussing these questions with me.

ShadowCommander added Feature: Prototypes Status: Needs Discussion I want to ask an open-ended question that has no way of being resolved. labels Aug 17, 2020

Yam76 mentioned this issue Aug 19, 2020

YAML files with a top level map should be interpreted as a single prototype #1238

Open

Yam76 closed this as completed Aug 22, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

long list vs single file per prototype (AKA: YAML vs TOML) #1236

long list vs single file per prototype (AKA: YAML vs TOML) #1236

Yam76 commented Aug 17, 2020

PJB3005 commented Aug 17, 2020

Yam76 commented Aug 18, 2020 •

edited

PJB3005 commented Aug 18, 2020

PJB3005 commented Aug 18, 2020

Yam76 commented Aug 19, 2020

PJB3005 commented Aug 21, 2020

Yam76 commented Aug 22, 2020

long list vs single file per prototype (AKA: YAML vs TOML) #1236

long list vs single file per prototype (AKA: YAML vs TOML) #1236

Comments

Yam76 commented Aug 17, 2020

PJB3005 commented Aug 17, 2020

Yam76 commented Aug 18, 2020 • edited

PJB3005 commented Aug 18, 2020

PJB3005 commented Aug 18, 2020

Yam76 commented Aug 19, 2020

PJB3005 commented Aug 21, 2020

Yam76 commented Aug 22, 2020

Yam76 commented Aug 18, 2020 •

edited