Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize AddCaller+AddStacktrace by sharing stacks #1052

Merged
merged 7 commits into from Jan 27, 2022

Commits on Jan 25, 2022

  1. Benchmark for AddCaller+AddStacktrace

    Adds a benchmark that tests the performance of a Zap logger with both,
    `AddCaller` and `AddStacktrace` options enabled.
    abhinav committed Jan 25, 2022
    Copy the full SHA
    648b2be View commit details
    Browse the repository at this point in the history
  2. Optimize AddCaller+AddStacktrace by sharing stacks

    If we use the `AddCaller(true)` option for a logger with
    `AddStacktrace(level)`, which captures a stack trace for the specified
    level or higher, Zap currently captures inspects the stack twice:
    
    - `getCallerFrame` retrieves information about the immediate caller of
      the log entry
    - `StackSkip` calls `takeStacktrace` to capture the entire stack trace
      and build a string from it
    
    For the case where both caller and stack trace are requested,
    we can avoid redundant work by sharing information about the call stack
    between the logic for the two.
    
    To accomplish this, the following high-level pieces were added:
    
        type stacktrace
          captureStacktrace(skip int, depth stacktraceDepth) *stacktrace
          func (*stacktrace) Next() (runtime.Frame, bool)
    
        type stackFormatter
          func newStackFormatter(*buffer.Buffer)
          func (*stackFormatter) FormatStack(*stacktrace)
          func (*stackFormatter) FormatFrame(runtime.Frame)
    
    `captureStacktrace` captures stack traces (either just one frame or the
    entire stack) that can then be inspected manually (to extract caller
    information) and formatted into a string representation either
    one-frame-at-a-time or wholesale with `stackFormatter`.
    
    This allows us reorganize logic that applies the AddCaller and
    AddStacktrace options so that it can:
    
    - capture as much of the stack as it needs
    - extract caller information from the first frame if we need it
    - format that frame and the remaining frames into a stack trace
    
    ---
    
    Benchmarking:
    I ran the included benchmark before and after the patch with the
    following flags and compared the results with benchstat.
    
    ```
    % go test -run '^$' -bench AddCallerAndStack -benchmem -count 10
    [...]
    % benchstat before.txt after.txt
    name                      old time/op    new time/op    delta
    AddCallerAndStacktrace-2    4.68µs ± 7%    3.63µs ± 7%  -22.35%  (p=0.000 n=10+10)
    
    name                      old alloc/op   new alloc/op   delta
    AddCallerAndStacktrace-2      632B ± 0%      416B ± 0%  -34.18%  (p=0.000 n=10+10)
    
    name                      old allocs/op  new allocs/op  delta
    AddCallerAndStacktrace-2      5.00 ± 0%      3.00 ± 0%  -40.00%  (p=0.000 n=10+10)
    ```
    
    Allocations for this code path are down from 5 to 2, and CPU down by
    ~20%.
    
    To check for regressions, I also ran all existing benchmarks with
    "Caller" or "Stack" in their names:
    
    ```
    % go test -run '^$' -bench 'Caller|Stack' -benchmem -count 10
    [...]
    % benchstat before.txt after.txt
    name              old time/op    new time/op    delta
    StackField-2        3.28µs ± 2%    3.49µs ± 2%   +6.38%  (p=0.000 n=9+9)
    AddCallerHook-2     1.89µs ± 2%    1.92µs ± 3%     ~     (p=0.055 n=10+9)
    TakeStacktrace-2    3.17µs ± 1%    3.60µs ± 8%  +13.63%  (p=0.000 n=8+9)
    
    name              old alloc/op   new alloc/op   delta
    StackField-2          528B ± 0%      528B ± 0%     ~     (all equal)
    AddCallerHook-2       248B ± 0%      240B ± 0%   -3.23%  (p=0.000 n=10+10)
    TakeStacktrace-2      416B ± 0%      416B ± 0%     ~     (all equal)
    
    name              old allocs/op  new allocs/op  delta
    StackField-2          3.00 ± 0%      3.00 ± 0%     ~     (all equal)
    AddCallerHook-2       3.00 ± 0%      2.00 ± 0%  -33.33%  (p=0.000 n=10+10)
    TakeStacktrace-2      2.00 ± 0%      2.00 ± 0%     ~     (all equal)
    ```
    
    AddCaller costs one less allocation, but the cost of capturing a stack
    trace is *slightly* higher. I think that may be worth it for the 20%+
    improvement on Caller+Stack and reduced allocations.
    abhinav committed Jan 25, 2022
    Copy the full SHA
    de178a2 View commit details
    Browse the repository at this point in the history
  3. Copy the full SHA
    b09d402 View commit details
    Browse the repository at this point in the history

Commits on Jan 26, 2022

  1. Avoid CallersFrames from escaping (#1053)

    The `runtime.CallersFrames` method doesn't do much other than returning a struct pointer wrapping the passed in `pcs`. Calling this method twice ends up simplifying some of the logic (not having to format the first stack separately from the remaining), and ends up brining the performance of `TakeStacktrace` similar to `master`.
    
    ```
    # benchstat comparing 3 runs of master vs this branch, each run using benchtime=10s
    name            old time/op    new time/op    delta
    TakeStacktrace    1.47µs ± 2%    1.48µs ± 2%   ~     (p=1.000 n=3+3)
    
    name            old alloc/op   new alloc/op   delta
    TakeStacktrace      496B ± 0%      496B ± 0%   ~     (all equal)
    
    name            old allocs/op  new allocs/op  delta
    TakeStacktrace      2.00 ± 0%      2.00 ± 0%   ~     (all equal)
    ```
    prashantv committed Jan 26, 2022
    Copy the full SHA
    9cbe14e View commit details
    Browse the repository at this point in the history
  2. Revert "Avoid CallersFrames from escaping (#1053)"

    This reverts commit 9cbe14e.
    abhinav committed Jan 26, 2022
    Copy the full SHA
    440d598 View commit details
    Browse the repository at this point in the history

Commits on Jan 27, 2022

  1. typo in logger.go

    Co-authored-by: Sung Yoon Whang <sungyoon@uber.com>
    abhinav and sywhang committed Jan 27, 2022
    Copy the full SHA
    83f6340 View commit details
    Browse the repository at this point in the history
  2. stackFormatter: Replace counter with a bool

    Track whether we've written at least one frame,
    and use that to decide if we need the prefix.
    abhinav committed Jan 27, 2022
    Copy the full SHA
    4de4fc5 View commit details
    Browse the repository at this point in the history