Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nested function calls leads to extreme parsing times #82

Closed
archoversight opened this issue Oct 4, 2022 · 13 comments · Fixed by #90 or #89
Closed

Nested function calls leads to extreme parsing times #82

archoversight opened this issue Oct 4, 2022 · 13 comments · Fixed by #90 or #89

Comments

@archoversight
Copy link

Here's a heavily reduced version of a terraform variable definition:

variable "test" {
  level1 = map(object({
    level2 = map(object({
      level3 = map(object({
        level4 = map(object({
        }))
      }))
    }))
  }))
}

On my system the above takes ~24 seconds to parse.

Removing a level by commenting it out:

variable "test" {
  level1 = map(object({
    level2 = map(object({
      level3 = map(object({
        # level4 = map(object({
        # }))
      }))
    }))
  }))
}

Takes about ~1.77 seconds to parse

Removing one additional level (level3) and parsing this file drops to ~0.455 seconds to parse.

It seems that there is something that goes quadratic when having multiple function calls nested. We have a bunch of terraform that heavily uses this to make sure the type of the variable is correct:

variable "network_integration" {
  description = "Map of networking integrations between accounts"
  type = map(object({
    friendly_name = string,
    vpcs = map(object({
      id           = string
      cidr         = string
      region       = string
      description  = string
      subnets      = map(string)
      route_tables = map(string)
      security_groups = map(object({
        id = string
        rules = map(object({
          direction   = string
          protocol    = string
          from_port   = string
          to_port     = string
          description = string
        }))
      }))
    }))
    additional_propagated_vpcs   = list(string)
    additional_static_vpc_routes = list(string)
  }))
  default = {}
}

As a full example (which this takes ~130 seconds to parse).

@martinohmann
Copy link
Owner

martinohmann commented Oct 7, 2022

Hi! This one is interesting, thanks for reporting.

Can you tell me if it makes a big difference if you use hcl::parse instead of hcl::from_str (the former bypasses serde and is generally preferred if you want to have a hcl::Body anyways)? The performance issue could be either in the grammar, the parser or the serde layer.

If I find some time during the weekend i'll try to investigate this.

@martinohmann
Copy link
Owner

I experimented a bit with different grammars for parsing the function calls, but none of them got rid of the performance issues. Weirdly, using a grammar that is almost identical to the one used for parsing arrays is still seeing massive degradation in performance, whereas deeply nested arrays do not have this issue.

I fear that this is one of these cases where the generated pest parser isn't optimal: pest-parser/pest#685

I wanted to try switching to a parser combinators library like chumsky when performance becomes an issue. Looks like now would be the time to do that, but it will take a bit to rewrite the parser.

@archoversight
Copy link
Author

@martinohmann I am using hcl::from_reader with an open file. I would assume that it would be more optimized than reading directly from a string with hcl::from_str, but maybe not. I will see what I need to do to switch to hcl::parse instead.

I will note that when I added a pest::set_call_limit(NonZeroUsize::new(1_000_000_000)); pest will return part-way through parsing that last example because of its call limit.

If I profile the binary while it is running using macOS's sample application, the call-stack is incredibly deep and it looks like it recurses a ton.

@archoversight
Copy link
Author

Using hcl::parse instead does not change the timing unfortunately.

@martinohmann
Copy link
Owner

Thanks for the hint with the call-stack depth. I did some further tests and realised that map(object({})) is actually 3 nested expressions. So your example

variable "test" {
  level1 = map(object({
    level2 = map(object({
      level3 = map(object({
        level4 = map(object({
        }))
      }))
    }))
  }))
}

is 12 nested expressions on the right hand side of level1 if I counted right.

I did some benchmarks with nested arrays, objects and func calls, and all of these perform roughly as bad as your example:

variable "nested_arrays" {
  level1 = [[[[[[[[[[[[]]]]]]]]]]]]
}

variable "nested_objects" {
  level1 = {
    level2 = {
      level3 = {
        level4 = {
          level5 = {
            level6 = {
              level7 = {
                level8 = {
                  level9 = {
                    level10 = {
                      level11 = {
                        level12 = {
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

variable "nested_func_calls" {
  level1 = map(map(map(map(map(map(map(map(map(map(map(map())))))))))))
}

Here, nested_arrays has the "best" performance and nested_objects has the worst by far. This is expected, since the grammar for arrays is simpler than the one for objects and function calls are somewhere in the middle.

So the issue is not about nested function calls specifically, but deeply nested expressions that contain themselves. It looks like pest cannot handle this very well. I'll keep digging if there's a way to improve recursion performance in pest and avoid large call stacks.

@martinohmann
Copy link
Owner

martinohmann commented Oct 11, 2022

@archoversight I did some changes to the grammar which improve the performance a bit in the nested-expr-perf, maybe you want to try that against your usecase to see if it at least brings the performance to an acceptable level.

It's still not fast, but in my benchmarks it shows a ~60% performance increase. There are probably other improvements that can be made, i'll see if i can find some more performance bottlenecks.

Parsing RHS expressions is the slowest part of the parser as there are a lot of possible expressions in HCL. The biggest performance issues are caused by objects, since object keys can be any valid HCL expression instead of just quoted strings or identifiers. So for each object item it tries to parse complicated expressions on both the LHS and RHS of the = which becomes very expensive the deeper the nesting level gets.

My guess is that the parser needs to do a lot of backtracking since needs to try to match conditionals and binary operators (which are less common than e.g. strings or identifiers) before everything else due to ambiguities in the grammar. It might be possible to solve this by not re-parsing partial results, but this will require quite a bit of restructuring of the grammar and parsing logic.

@martinohmann
Copy link
Owner

martinohmann commented Oct 11, 2022

@archoversight Good news! Looks like I found the performance issue. Conditionals and binary operators were the culpit since they caused unnecessary re-parsing of the same expressions when they did not match (which happens in most cases).

You can try out the nested-expr-perf branch.

Here is the result for your first example from the issue:

variable "test" {
  level1 = map(object({
    level2 = map(object({
      level3 = map(object({
        level4 = map(object({
        }))
      }))
    }))
  }))
}

This is the parsing time without the fix on my machine:

hcl::parse(&nested_func_calls)
                        time:   [1.3608 s 1.3616 s 1.3623 s]

and this is the parsing time after the fix:

hcl::parse(&nested_func_calls)
                        time:   [18.882 µs 18.912 µs 18.957 µs]
                        change: [-99.999% -99.999% -99.999%] (p = 0.00 < 0.05)
                        Performance has improved.

The branch is a bit messy to be merged as is, so I need to clean it up in the next few days when I have time.

@digitalresistor
Copy link

digitalresistor commented Oct 12, 2022

This is @archoversight on the personal Github account (vs the work account)! Thank you for tracking this down. I will give this a shot tomorrow. That looks like a massive performance improvement, which should hopefully make a world of difference.

I will note that the timing above was with time cargo run -- file.tf so it is not the most accurate... probably not as accurate as your timing.

@martinohmann
Copy link
Owner

martinohmann commented Oct 12, 2022

I'm still investigating an issue which causes the parser to hang when it tries to parse certain kinds of nested function calls. So if it locks up completely in your test, it's probably that. But I'm confident that i'll find the cause later today if I find the time.

Update: the issue is actually not in the parser, but in the formatter, so this is unrelated. Found it when doing some roundtrip testing. Essentially, formatting nested func calls currently hangs (or takes too long). I already identified the root cause and will fix it separately.

@archoversight
Copy link
Author

@martinohmann good news, just tested out that branch, and I am now able to parse our entire code base in 12 seconds in debug mode, and ~3 seconds in release mode. This is excellent.

At this point in time I am not using the formatter, so no issue there :-D

Thanks for the great work!

@martinohmann
Copy link
Owner

@archoversight That's great! I'll ping you when the fix is released, I'm almost done with the cleanup ;).

@archoversight
Copy link
Author

@martinohmann much appreciated!

martinohmann added a commit that referenced this issue Oct 12, 2022
Previously, the parser needed to do a lot of backtracking while parsing conditionals and operations which massively degraded performance for deeply nested expressions.

This fix inlines the handling for operations and conditionals into the `Expression` rule so that an already matched `ExprTerm` rule does not need to be re-evaluated when no operation/conditional is matched (which is the case in the vast majority of cases since literal values and collections are way more common than these.

Fixes #82
@martinohmann
Copy link
Owner

@archoversight I just published https://crates.io/crates/hcl-rs/0.8.5 which includes the fix. Thanks for reporting the issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants