Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heap addresses are not reliable to produce diffs #31

Open
mttkay opened this issue Aug 26, 2022 · 8 comments
Open

Heap addresses are not reliable to produce diffs #31

mttkay opened this issue Aug 26, 2022 · 8 comments

Comments

@mttkay
Copy link
Contributor

mttkay commented Aug 26, 2022

I've been looking at heapy diff, and one thing I noticed is that it keys on an object's address to decide whether it was present in the heap dump we compare against.

Is this reliable? I looked at MRI, and address is just an object's VALUE pointer: https://github.com/ruby/ruby/blob/e315f3a1341f123051b75e589b746132c3510079/ext/objspace/objspace_dump.c#L238

The way I understood the Ruby GC to work is that each 40B slot (once populated) always contains an RVALUE, which is a union type so it can "morph" into a different type. If this object gets GC'ed, it is not removed from the heap page, but rather a flag is cleared that tags this slot (or object) as "empty": https://github.com/ruby/ruby/blob/6ef46f71c743507a0e2ae0eef14dce0539b0ff52/gc.c#L569. This makes a slot reusable by changing its union type, but its memory address does not change.

Wouldn't this mean that if an object is GC'ed between two snapshots and the same slot is reused for a completely different object, it would then be omitted from the heapy diff, because the slot address already appeared in the first snapshot?

I'm sure I'm missing something but I wanted to make sure I understand how this works. Thanks!

@mttkay
Copy link
Contributor Author

mttkay commented Aug 26, 2022

A related problem with this is that diffs do not survive heap compaction:

require 'objspace'

BEFORE = 'before.json'
AFTER = 'after.json'

GC.start

ObjectSpace.dump_all(output: open(BEFORE, 'w'))

# GC.compact

ObjectSpace.dump_all(output: open(AFTER, 'w'))

Semantically, there should be no heap difference between the two calls to dump_all aside from objects allocated by the first dump_all call:

$ruby ./heap.rb && heapy diff before.json after.json
Allocated STRING 5 objects of size 256/824 (in bytes) at: :
Allocated HASH 2 objects of size 336/824 (in bytes) at: :
Allocated FILE 1 objects of size 232/824 (in bytes) at: :

When I uncomment the call to compact:

$ruby ./heap.rb && heapy diff before.json after.json
Allocated STRING 2153 objects of size 110441/657516 (in bytes) at: :
Allocated IMEMO 683 objects of size 461852/657516 (in bytes) at: :
Allocated ARRAY 232 objects of size 28784/657516 (in bytes) at: :
Allocated OBJECT 66 objects of size 7272/657516 (in bytes) at: :
Allocated CLASS 25 objects of size 16608/657516 (in bytes) at: :
Allocated HASH 24 objects of size 16736/657516 (in bytes) at: :
Allocated DATA 20 objects of size 1678/657516 (in bytes) at: :
Allocated REGEXP 19 objects of size 12769/657516 (in bytes) at: :
Allocated STRUCT 5 objects of size 200/657516 (in bytes) at: :
Allocated ICLASS 4 objects of size 160/657516 (in bytes) at: :
Allocated MODULE 2 objects of size 1016/657516 (in bytes) at: :

This makes sense because many of these objects moved around on the heap and had their address changed.

I wonder if there is a more stable way to identify an object on the heap? I'm not sure whether this is fixable, but it could be worth pointing out in the README?

@mttkay
Copy link
Contributor Author

mttkay commented Aug 27, 2022

I constructed a simple program that should demonstrate what I meant in the original question, i.e. without even considering heap compaction:

# frozen_string_literal: true

require 'objspace'

BEFORE = 'before.json'
AFTER = 'after.json'

class Before; end
class After; end

GC.start

ObjectSpace.trace_object_allocations_start

# Allocate nested array.
a = []
1000.times do
  a << Before.new
end

ObjectSpace.dump_all(output: open(BEFORE, 'w'))

# Free all array elements and make sure they are GC'ed;
# This will merely tag these object slots as T_NONE.
a.clear
GC.start

# Refill the original array with different objects. These will use the same heap slots
# we just freed, i.e. they will have the same memory address.
1000.times do
  a << After.new
end

ObjectSpace.dump_all(output: open(AFTER, 'w'))

Here, we first allocate an array of Before instances that require a heap slot, then dump heap to take the before-snapshot. We then GC all of these elements, and refill the array and dump heap again.

The expected diff should be 1000 instances of the After class, but it isn't:

$ruby ./heap.rb && heapy diff before.json after.json
Allocated HASH 1 objects of size 40/40 (in bytes) at: :

This is because when Ruby GCs an object, it merely uses type tagging to indicate that this heap slot can be reused by another allocation. But the memory address does not change. So heapy thinks these objects aren't new in the second dump.

You can double-check as follows: find the memory address of the Before and After types, then compare memory addresses of instances of these types in the before and after dumps and we see:

$ grep 0x15b4ae0 before.json | head
{"address":"0x1414028", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414050", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414118", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x14141e0", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414208", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414230", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414258", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414280", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x14142f8", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414410", "type":"OBJECT", "class":"0x15b4ae0", "ivars":0, "file":"./heap.rb", "line":18, "method":"new", "generation":9, "memsize":40, "flags":{"wb_protected":true}}
grep 0x15b49a0 after.json | head 
{"address":"0x1414028", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414050", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414118", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x14141e0", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414208", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414230", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414258", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414280", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x14142f8", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x1414410", "type":"OBJECT", "class":"0x15b49a0", "ivars":0, "file":"./heap.rb", "line":31, "method":"new", "generation":10, "memsize":40, "flags":{"wb_protected":true}}

@mttkay
Copy link
Contributor Author

mttkay commented Aug 27, 2022

Based on this, maybe a better (though not perfect) solution would be to key on both address and type? That is still not accurate of course since ivar values may change. It would also not solve the problem with compaction since unfortunately heap dumps don't appear to serialize the moved flag of an object slot (though that could be fixed by patching MRI)?

We could also incorporate line since clearly an object is never the same when allocated from a different program region. But any of these approaches will merely be best guesses.

@schneems
Copy link
Member

schneems commented Aug 27, 2022 via email

@mttkay
Copy link
Contributor Author

mttkay commented Aug 28, 2022

Fair enough. I wonder if the typical heapy user expects that heapy diff can yield both false negatives (objects being GC'ed whose heap slots are reused) and false positives (objects move around on the heap).

With that in mind, would it make sense to just document this as a known limitation? I think that's always a good first step.

I don't think the issue itself is dramatic, considering the most likely use case is to detect differences in tenured objects i.e. heap growth that isn't just temporary. In order to make this more reliable, the following ideas come to mind:

  • Instead of keying on address, key on the JSON entry in the heap diff. This might have to ignore the flags and potentially other fields but it would likely provide a better proxy for object identity than just a pointer value.
  • When taking diffs, only consider old objects. The issue with reusing addresses only occurs when objects get GC'ed and slots are reused. To detect long-term heap growth or memory leaks, temporary garbage is just noise, so we might as well exclude it.

I generally won’t be able to answer arbitrary questions. I suggest trying
to use heapy and if it’s behaving in a strange manner then you can ask
questions on other Q&A sites or sometimes Reddit to understand the
mechanics at play.

I believe the reproducible test case I provided is sufficient evidence that we have moved from "arbitrary question" territory into issue territory. Do you disagree?

@mttkay mttkay changed the title Are heap addresses reliable across heap diffs? Heap addresses are not reliable to produce diffs Aug 28, 2022
@mttkay
Copy link
Contributor Author

mttkay commented Aug 28, 2022

I updated the issue title and description to restate it as an issue report rather than a question. Let me know if I am missing anything.

@schneems
Copy link
Member

Do you disagree?

Nope. Seems good. That comment had less to do with this being a question originally and more with my availability.

With that in mind, would it make sense to just document this as a known limitation? I think that's always a good first step.

Totally. Send me a PR? While Calling GC.compact in the middle of a heap generation is probably unlikely (right now) its good to document edge cases . No matter how unlikely, even one in a million bugs still happen at enough scale.

mttkay added a commit to mttkay/heapy that referenced this issue Aug 30, 2022
Heapy diff can yield both false positives
and false negatives due to keying on heap
pointers. This is difficult to fix, so
we're just documenting this for now.

Refs zombocom#31
@mttkay
Copy link
Contributor Author

mttkay commented Aug 30, 2022

Happy to help out with this! I took a stab at it here: #32

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants