tracing: reduce memory pressure throughout #64233
Conversation
+cc @abarganier, @dhartunian, @rimadeodhar.
Sorry this took so long! The improvements look reasonable, but I'm worried that the pooling is racy. I also think this should get a close review by the folks on observability-infra, to make sure that everything is documented well. You'll know better who to tag.
Reviewed 13 of 13 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif)
pkg/sql/sem/builtins/builtins.go, line 3736 at r1 (raw file):
var rootSpan *tracing.Span
if err := ctx.Settings.Tracer.VisitSpans(func(span *tracing.Span) error {
	if span.TraceID() == traceID || rootSpan != nil {
🤔 don't you want `span.TraceID() == traceID && rootSpan == nil`?
pkg/util/tracing/crdbspan.go, line 135 at r1 (raw file):
}

func (c *childSpanRefs) list() []*crdbSpan {
This allocs for no good reason, just pointing it out in case this is in a hot path.
pkg/util/tracing/grpc_interceptor.go, line 244 at r1 (raw file):
}

func getPooledMetadata() (md metadata.MD, putBack func(md metadata.MD)) {
I'm seeing that we are basically using this as follows
md, putBack := getPooledMetadata()
mutate(md)
ctx = context.WithValue(ctx, raw{md:md})
cs := newClientStream(ctx)
putBack()
return cs
so... data race? We're putting `md` into `ctx`, then returning it to the pool, but there's no reason to assume that it wouldn't be read from the `ctx` later.
pkg/util/tracing/tracer.go, line 410 at r1 (raw file):
	helper.crdbSpan.rootSpan = rootSpan
} else {
	helper.crdbSpan.traceEmpty.Store(true)
Does this allocate on every call? I assume it doesn't have to, but not sure it's smart enough to figure it out. Curious why you didn't use an atomic which would also keep `crdbSpan` smaller.
pkg/util/tracing/tracer.go, line 790 at r1 (raw file):
// differing only in the explicit carrier type. We do this to avoid allocating
// on the heap when iterating over kvs through the Carrier interface.
func (t *Tracer) extractMetaFromMapCarrier(carrier MapCarrier) (SpanMeta, error) {
Rather than duplicating this, can we remove MapCarrier? I think it's only used in two places, and these don't care about the particular type of carrier and could use a metadata carrier instead. If we have duplication like this we need testing that it doesn't diverge, and that will hurt given the amount of code duplicated here.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/sql/sem/builtins/builtins.go, line 3736 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
🤔 don't you want `span.TraceID() == traceID && rootSpan == nil`?
ha, wow, I would've failed this interview. Done.
pkg/util/tracing/crdbspan.go, line 135 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
This allocs for no good reason, just pointing it out in case this is in a hot path.
Done.
pkg/util/tracing/grpc_interceptor.go, line 244 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
I'm seeing that we are basically using this as follows
md, putBack := getPooledMetadata()
mutate(md)
ctx = context.WithValue(ctx, raw{md:md})
cs := newClientStream(ctx)
putBack()
return cs
so... data race? We're putting `md` into `ctx`, then returning it to the pool, but there's no reason to assume that it wouldn't be read from the `ctx` later.
Good catch. It's not applicable here, but it still should be for unary RPCs, which I think is most things. Removed.
pkg/util/tracing/tracer.go, line 410 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Does this allocate on every call? I assume it doesn't have to, but not sure it's smart enough to figure it out. Curious why you didn't use an atomic which would also keep `crdbSpan` smaller.
You're right, it doesn't. Changed it to use an atomic int instead (I thought they'd be equally performant, alloc-wise, but I see they aren't).
pkg/util/tracing/tracer.go, line 790 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Rather than duplicating this, can we remove MapCarrier? I think it's only used in two places, and these don't care about the particular type of carrier and could use a metadata carrier instead. If we have duplication like this we need testing that it doesn't diverge, and that will hurt given the amount of code duplicated here.
Don't think we can. The MapCarrier (map[string]string) representation is serialized over the wire. On the other end of the RPC, we could convert this representation into a metadataCarrier (map[string][]string), but doing so would incur allocations which we're trying to avoid. To start using the metadataCarrier upstream of raft, for backwards compatibility reasons we'd have to do it over multiple releases, keeping this duplicated code around anyway.
Actually, found a way to do it without duplicating all the code. PTAL.
Reviewed 4 of 4 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @abarganier and @irfansharif)
pkg/util/tracing/grpc_interceptor.go, line 305 at r2 (raw file):
Seeing as how invokeCtx's scope extends as far as our use of the metadata
what does this mean? If I had to explain why this is safe, I would say
This context is only used on the sending path of gRPC, and we trust that gRPC (and any other client interceptors stacked under us) won't access the embedded `md` object after it has returned.
But ultimately, this isn't really inherently safe. Is this a big part of the perf wins? I think we should back this out if it isn't.
PR message needs an update. I'll leave the actual review to @abarganier in the interest of building ownership muscle 💪🏽
(bump, I'll get to the rest of the proposed changes here after soliciting another round of reviews from y'all)
I'll leave the actual review to @abarganier in the interest of building ownership muscle 💪🏽
Thanks @tbg! And sorry for lagging on this @irfansharif - I'll be sure to chime in by EOD tomorrow
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian, @irfansharif, and @rimadeodhar)
Overall, LGTM. My apologies for asking some fairly basic questions, but I needed to build some context on this code.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian and @irfansharif)
pkg/util/tracing/crdbspan.go, line 109 at r2 (raw file):
}

func (c *childSpanRefs) len() int {
Nit: The len and add methods could be made simpler by adding a variable to childSpanRefs which stores the current spanLen. So the add method would look something like this:
func (c *childSpanRefs) add(ref *crdbSpan) {
	if c.spanLen < len(c.preAllocated) {
		c.preAllocated[c.spanLen] = ref
	} else {
		c.overflow = append(c.overflow, ref)
	}
	c.spanLen++
}
and len() simply returns spanLen.
However, given that tracing library is more memory constrained than cpu constrained, it might be okay to leave these methods as is in favor of not adding an additional variable.
pkg/util/tracing/grpc_interceptor.go, line 144 at r2 (raw file):
serverSpan.SetTag(gRPCComponentTag.Key, gRPCComponentTag.Value)
serverSpan.SetTag(ext.SpanKindRPCServer.Key, ext.SpanKindRPCServer.Value)
Why have we moved these out of the previous calls?
pkg/util/tracing/grpc_interceptor.go, line 254 at r2 (raw file):
Quoted 4 lines of code…
if md.Len() == 2 && len(md[fieldNameTraceID]) == 1 && len(md[fieldNameSpanID]) == 1 {
	md[fieldNameTraceID][0] = "0"
	md[fieldNameSpanID][0] = "0"
	grpcMetadataPool.Put(md)
I don't fully understand what this is doing. Why do we only add it to the pool if these conditions are met?
Apologies for the delay on this. Overall LGTM as well, piggybacking onto one of @rimadeodhar's questions and a convenience nit/curiosity question.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian, @irfansharif, and @rimadeodhar)
pkg/util/tracing/crdbspan.go, line 511 at r2 (raw file):
Quoted 8 lines of code…
var children []*crdbSpan
for i := 0; i < s.mu.recording.children.len(); i++ {
	children = append(children, s.mu.recording.children.get(i))
}
...
for _, child := range children {
	child.setVerboseRecursively(to)
}
This is more of a nit/curiosity question than anything else. Shallow copying children into a slice to then perform some operation on each child span appears to be somewhat common. Would a convenience method for applying some operation to each child make sense?
// Only requires a single pass over all of s.mu.recording.children.
func (c *childSpanRefs) apply(fn func(*crdbSpan)) {
	// Apply the provided fn first to preAllocated, then to overflow.
}
Then you could do...
s.mu.recording.children.apply(func(child *crdbSpan) {
child.setVerboseRecursively(to)
})
I know we're memory constrained here, so I'm also curious how something like this would compare memory-wise to the shallow copy.
pkg/util/tracing/grpc_interceptor.go, line 254 at r2 (raw file):
Previously, rimadeodhar wrote…
if md.Len() == 2 && len(md[fieldNameTraceID]) == 1 && len(md[fieldNameSpanID]) == 1 {
	md[fieldNameTraceID][0] = "0"
	md[fieldNameSpanID][0] = "0"
	grpcMetadataPool.Put(md)
I don't fully understand what this is doing. Why do we only add it to the pool if these conditions are met?
+1, also wondering about this
This commit attempts to reduce the memory overhead of tracing by doing a few things, guided mostly by BenchmarkTracing and kv95 (see results below). In decreasing order of impact:
- We no longer materialize recordings when the trace is "empty" or non-verbose. When traces straddle RPC boundaries, we serialize the recording to send it over the wire. For "empty" traces (ones with no structured recordings or log messages), this is needlessly allocation-heavy. We end up shipping the trace skeleton (parent-child span relationships, and the operation names + metadata that identify each span). Not only does this allocate within pkg/util/tracing, it also incurs allocations within gRPC where all this serialization is taking place. This commit takes the (opinionated) stance that we can avoid materializing these recordings altogether. We do this by having each span hold onto a reference to the root span, and updating an atomic value on the root span whenever anything is recorded. When materializing the recording, from any span in the tree, we can consult the root span to see if the trace was non-empty, and allocate if so.
- We pre-allocate a small list of child pointer slots for spans to refer to, instead of allocating every time a child span is created.
- Span tags aren't rendered unless the span is verbose, but we were still collecting them previously in order to render them if verbosity was toggled. Toggling an active span's verbosity rarely ever happens, so now we don't allocate for tags unless the span is already verbose. This also lets us avoid the WithTags option, which allocates even if the span will end up dropping tags.
- We were previously allocating SpanMeta, despite the objects being extremely short-lived in practice (they existed primarily to pass on the parent span's metadata to the newly created child spans). We now pass SpanMetas by value.
- When extracting SpanMetas for remote spans, through Carriers, instead of doing it through an interface (which incurs allocations), we can use explicit types.
- We can slightly improve lock contention around the tracing registry by locking only when we're dealing with root spans. We can also avoid computing the span duration outside the critical area.

Before this PR, comparing traced scans vs. not:

name                                 old time/op    new time/op    delta
Tracing/Cockroach/Scan1-24            403µs ± 3%     415µs ± 1%    +2.82%  (p=0.000 n=10+9)
Tracing/MultinodeCockroach/Scan1-24   878µs ± 1%     975µs ± 6%   +11.07%  (p=0.000 n=10+10)

name                                 old alloc/op   new alloc/op   delta
Tracing/Cockroach/Scan1-24           23.2kB ± 2%    29.8kB ± 2%   +28.64%  (p=0.000 n=10+10)
Tracing/MultinodeCockroach/Scan1-24  76.5kB ± 5%    95.1kB ± 1%   +24.31%  (p=0.000 n=10+10)

name                                 old allocs/op  new allocs/op  delta
Tracing/Cockroach/Scan1-24              235 ± 2%       258 ± 1%    +9.50%  (p=0.000 n=10+9)
Tracing/MultinodeCockroach/Scan1-24     760 ± 1%       891 ± 1%   +17.20%  (p=0.000 n=9+10)

After this PR:

name                                 old time/op    new time/op    delta
Tracing/Cockroach/Scan1-24            437µs ± 4%     443µs ± 3%      ~     (p=0.315 n=10+10)
Tracing/MultinodeCockroach/Scan1-24   925µs ± 2%     968µs ± 1%    +4.62%  (p=0.000 n=10+10)

name                                 old alloc/op   new alloc/op   delta
Tracing/Cockroach/Scan1-24           23.3kB ± 3%    26.3kB ± 2%   +13.24%  (p=0.000 n=10+10)
Tracing/MultinodeCockroach/Scan1-24  77.0kB ± 4%    81.7kB ± 3%    +6.08%  (p=0.000 n=10+10)

name                                 old allocs/op  new allocs/op  delta
Tracing/Cockroach/Scan1-24              236 ± 1%       241 ± 1%    +2.45%  (p=0.000 n=9+9)
Tracing/MultinodeCockroach/Scan1-24     758 ± 1%       793 ± 2%    +4.67%  (p=0.000 n=10+10)

kv95/enc=false/nodes=1/cpu=32 across a few runs also shows a modest improvement before and after this PR, using a sampling rate of 10%:

36929.02 v. 37415.52 (reads/s) +1.30%
1944.38 v. 1968.94 (writes/s) +1.24%

Release note: None
TFTRs! I've removed the metadata allocations, so it should be less controversial. I won't immediately go bump our gRPC dep to pick up grpc/grpc-go#4360 (I assume we will at some point), but that should help size down the allocation profiles for tracing as well.
Can a brother get a green stamp?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @abarganier, @dhartunian, @rimadeodhar, and @tbg)
pkg/util/tracing/crdbspan.go, line 109 at r2 (raw file):
Previously, rimadeodhar wrote…
Nit: The len and add methods could be made simpler by adding a variable to childSpanRefs which stores the current spanLen. So the add method would look something like this:
func (c *childSpanRefs) add(ref *crdbSpan) {
	if c.spanLen < len(c.preAllocated) {
		c.preAllocated[c.spanLen] = ref
	} else {
		c.overflow = append(c.overflow, ref)
	}
	c.spanLen++
}
and len() simply returns spanLen.
However, given that the tracing library is more memory constrained than cpu constrained, it might be okay to leave these methods as is in favor of not adding an additional variable.
Good idea, done. The pre-allocated array here only helps reduce the total number of individual allocations, by allocating 4 pointers all at once. The actual size of this struct matters less, so adding a refcount is more than fine. Thanks!
pkg/util/tracing/crdbspan.go, line 511 at r2 (raw file):
Previously, abarganier (Alex Barganier) wrote…
var children []*crdbSpan
for i := 0; i < s.mu.recording.children.len(); i++ {
	children = append(children, s.mu.recording.children.get(i))
}
...
for _, child := range children {
	child.setVerboseRecursively(to)
}
This is more of a nit/curiosity question than anything else. Shallow copying children into a slice to then perform some operation on each child span appears to be somewhat common. Would a convenience method for applying some operation to each child make sense?
// Only requires a single pass of the entirety of s.mu.recording.children.len()
func (c *childSpanRefs) apply(func(c *crdbSpan)) {
	// Apply provided fn first to preAllocated, then overflow
}
Then you could do...
s.mu.recording.children.apply(func(child *crdbSpan) {
	child.setVerboseRecursively(to)
})
I know we're memory constrained here, so I'm also curious how something like this would compare memory-wise to the shallow copy.
In this case it wouldn't have allocated, cause we're simply calling setVerboseRecursively on each element, but the general case, where we collect responses from each element, would allocate on the heap, as Go needs to ensure longevity across closure lifetimes. I think for that reason this pattern isn't a super familiar one in Go (also, in case you haven't noticed, we revel in the verbosity). I do agree it makes things cleaner though.
Example where we're collecting results per child:
for _, child := range children {
result = append(result, child.getRecording(everyoneIsV211, wantTags)...)
}
pkg/util/tracing/grpc_interceptor.go, line 144 at r2 (raw file):
Previously, rimadeodhar wrote…
serverSpan.SetTag(gRPCComponentTag.Key, gRPCComponentTag.Value)
serverSpan.SetTag(ext.SpanKindRPCServer.Key, ext.SpanKindRPCServer.Value)
Why have we moved these out of the previous calls?
The WithTags option allocates even if the underlying span is going to end up being non-verbose, and will drop the tags anyway. The SetTag API on the other hand can avoid allocating cause it has visibility into the fully constructed span, and can see whether or not it's verbose.
pkg/util/tracing/grpc_interceptor.go, line 254 at r2 (raw file):
Previously, abarganier (Alex Barganier) wrote…
+1, also wondering about this
This has gone away cause I've since given up on trying to pool metadata objects, but the idea here was to only pool metadata.MD objects that pkg/tracing mutated. It's possible that the caller into package tracing had also embedded metadata of its own into the ctx. It'd be erroneous to pool those objects and possibly re-use that metadata for subsequent requests that didn't intend to specify it.
pkg/util/tracing/grpc_interceptor.go, line 305 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Seeing as how invokeCtx's scope extends as far as our use of the metadata
what does this mean? If I had to explain why this is safe, I would say
This context is only used on the sending path of gRPC, and we trust that gRPC (and any other client interceptors stacked under us) won't access the embedded `md` object after it has returned.
But ultimately, this isn't really inherently safe. Is this a big part of the perf wins? I think we should back this out if it isn't.
Done. It isn't a big part, though not super small either. Seeing as how collapsing empty recordings away gave us most of the wins here, I've removed it.
Reviewed 9 of 13 files at r1, 2 of 4 files at r2, 2 of 2 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @abarganier, @dhartunian, and @rimadeodhar)
Woot! bors r=abarganier
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @abarganier, @dhartunian, and @irfansharif)
pkg/util/tracing/grpc_interceptor.go, line 144 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
The WithTags option allocates even if the underlying span is going to end up being non-verbose, and will drop the tags anyway. The SetTag API on the other hand can avoid allocating cause it has visibility into the fully constructed span, and can see whether or not it's verbose.
Aah I see, thanks for the explanation!
pkg/util/tracing/grpc_interceptor.go, line 254 at r2 (raw file):
Previously, irfansharif (irfan sharif) wrote…
This has gone away cause I've since given up on trying to pool metadata objects, but the idea here was to only pool metadata.MD objects that pkg/tracing mutated. It's possible that the caller into package tracing had also embedded into ctx metadata of its own. It'd be erroneous to pool those objects, and possibly re-use that metadata for subsequent requests that didn't intend to specify that metadata.
Thanks! That makes sense.
pkg/util/tracing/tracer.go, line 598 at r3 (raw file):
}

// var noopSpanMeta = (*SpanMeta)(nil)
Can we delete this comment?
Build succeeded.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/builtins/builtins.go, line 3736 at r1 (raw file):
Previously, irfansharif (irfan sharif) wrote…
ha, wow, I would've failed this interview. Done.
Is this code correct? What makes this truly return the root span and not just some random span from the trace - the first in a random iteration order?
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/builtins/builtins.go, line 3736 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Is this code correct? What makes this truly return the root span and not just some random span from the trace - the first in a random iteration order?
The map it's iterating over only contains root spans:
cockroach/pkg/util/tracing/tracer.go, lines 166 to 175 in 1fc4fc1:

// activeSpans is a map that references all non-Finish'ed local root spans,
// i.e. those for which no WithLocalParent(<non-nil>) option was supplied.
// The map is keyed on the span ID, which is deterministically unique.
//
// In normal operation, a local root Span is inserted on creation and
// removed on .Finish().
//
// The map can be introspected by `Tracer.VisitSpans`. A Span can also be
// retrieved from its ID by `Tracer.GetActiveSpanFromID`.
activeSpans struct {
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/builtins/builtins.go, line 3736 at r1 (raw file):
Previously, irfansharif (irfan sharif) wrote…
The map it's iterating over only contains root spans:
cockroach/pkg/util/tracing/tracer.go, lines 166 to 175 in 1fc4fc1:

// activeSpans is a map that references all non-Finish'ed local root spans,
// i.e. those for which no WithLocalParent(<non-nil>) option was supplied.
// The map is keyed on the span ID, which is deterministically unique.
//
// In normal operation, a local root Span is inserted on creation and
// removed on .Finish().
//
// The map can be introspected by `Tracer.VisitSpans`. A Span can also be
// retrieved from its ID by `Tracer.GetActiveSpanFromID`.
activeSpans struct {
oh doh I forgot cause I'm in the middle of mucking with that.
But then is the `rootSpan != nil` condition necessary?
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/sem/builtins/builtins.go, line 3736 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
oh doh I forgot cause I'm in the middle of mucking with that.
But then is the `rootSpan != nil` condition necessary?
Hm, no. We probably want to return an `iterutil.StopIteration()` when we find the first (and only) matching root span.
This commit attempts to reduce the memory overhead of tracing by doing a
few things, guided mostly by BenchmarkTracing and kv95 (see results
below). In decreasing order of impact:
- We no longer materialize recordings when the trace is "empty" or
non-verbose. When traces straddle RPC boundaries, we serialize the
recording to send it over the wire. For "empty" traces (ones with
no structured recordings or log messages), this is needlessly
allocation-heavy. We end up shipping the trace skeleton (parent-child
span relationships, and the operation names + metadata that identify
each span). Not only does this allocate within pkg/util/tracing, it
also incurs allocations within gRPC where all this serialization is
taking place. This commit takes the (opinionated) stance that we can
avoid materializing these recordings altogether.
We do this by having each span hold onto a reference to the rootspan,
and updating an atomic value on the rootspan whenever anything is
recorded. When materializing the recording, from any span in the tree,
we can consult the root span to see if the trace was non-empty, and
allocate if so.
- We pre-allocate a small list of child pointer slots for spans to refer
to, instead of allocating every time a child span is created.
- Span tags aren't rendered unless the span is verbose, but we were
still collecting them previously in order to render them if verbosity
was toggled. Toggling an active span's verbosity rarely ever happens,
so now we don't allocate for tags unless the span is already verbose.
This also lets us avoid the WithTags option, which allocates even if
the span will end up dropping tags.
- We were previously allocating SpanMeta, despite the objects being
extremely short-lived in practice (they existed primarily to pass on
the parent span's metadata to the newly created child spans). We now
pass SpanMetas by value.
- We can make use of explicit carriers when extracting SpanMeta from
remote spans, instead of accessing data through the Carrier interface.
This helps reduce SpanMeta allocations, at the cost of duplicating some
code.
- metadata.MD, used to carry span metadata across gRPC, is also
relatively short-lived (existing only for the duration of the RPC).
Its API is also relatively allocation-heavy (improved with
grpc/grpc-go#4360, "metadata: reduce memory footprint in
FromOutgoingContext"), allocating for every key
being written. Tracing has a very specific usage pattern (adding to
the metadata.MD only the trace and span ID), so we can pool our
allocations here.
- We can slightly improve lock contention around the tracing registry by
locking only when we're dealing with rootspans. We can also avoid
computing the span duration outside the critical area.
Before this PR, comparing traced scans vs. not:
After this PR:
kv95/enc=false/nodes=1/cpu=32 across a few runs also shows a modest
improvement before and after this PR, using a sampling rate of 10%:

36929.02 v. 37415.52 (reads/s) +1.30%
1944.38 v. 1968.94 (writes/s) +1.24%
Release note: None