Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make compaction available for Oj::Doc #650

Merged
merged 4 commits into from Apr 8, 2021

Conversation

BuonOmo
Copy link
Contributor

@BuonOmo BuonOmo commented Apr 7, 2021

Note for review: I've made two commits, the first one is just a code formatting since it use to mix space and tabs and was really hard to read. The second one contains my work.

Also, I can consider doing this work for rails.c and valstack.c, but the patch is big enough for you to review in the first time! I'd be glad to do so later :)


Introduce compaction to fast.c, only if rb_gc_mark_movable is
available. Otherwise, keep a version that supports ruby <2.7.

Since rb_data_object_wrap and TypedData_Wrap_Struct is available
in every supported versions (eg >=2.4.10), I've also removed support
for earlier versions.

Result for compaction is great. I've tested it by allocating and
removing arbitrary sized json. For a count of 70 pages, we can see
that we used to need 68 pages after compaction,and now 61 pages. Of
course this test include also lots of noise, but if we count pinned
objects (objects that are marked in C extension as uncollectible),
we can see that for a benchmark with no allocations, we have the
same amount as a benchmark with this commit with thousands of allocations
(499 pinned objects total), and before this, Oj::Doc used to generate
5k more objects (5499 pinned objects total).

Hence this is a win.

Here's the result of my tests after a GC.compact (reddots are pinned objects):

before patch after patch
5499 pinned objects 499 pinned objects
before patch after

And here are the scripts:

Details
# arbitrary_json.rb
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'objspace'
require_relative '../../lib/oj'

srand 42
ObjectSpace.trace_object_allocations_start

def record_compaction
  GC.start
  GC.start
  GC.start

  File.open('out/before.ndjson', 'w') do |file|
    ObjectSpace.dump_all(output: file)
  end

  puts GC.compact

  File.open('out/after.ndjson', 'w') do |file|
    ObjectSpace.dump_all(output: file)
  end
end

class ArbitraryAllocator
  def initialize
    @array = []
    @jsons = [
      *['{ "foo": "bar", "baz": 1, "boom": true }'] * 10,
      *Array.new(10) { |i| "{ \"num\": #{i}, \"d\": \"#{'a' * i * 100}\" }" },
      *Dir[File.join __dir__, 'data', '*.json'].map { |path| File.read(path) }
    ]
  end

  def append_any_json
    sample = @jsons.sample
    @array.push Oj::Doc.open sample
  end

  def remove_any_json
    deleted = @array.delete_at(rand(0...@array.size))
    deleted.close if rand > 0.9
  end

  def clear_jsons
    @jsons.clear
  end
end

$alloc = ArbitraryAllocator.new

def allocate(iterations)
  [iterations, 5_000].min.times { $alloc.append_any_json }
  iterations.times do |i|
    print "#{i}\r"
    $alloc.remove_any_json
    $alloc.append_any_json
  end
  $alloc.clear_jsons
end

allocate Integer ARGV[0] || 1_000
record_compaction
# heapviz.rb
#!/usr/bin/env ruby
# frozen_string_literal: true

# function viz {
# 	ruby heapviz.rb before_compact-$1.txt before_compact-$1.txt.png
# 	ruby heapviz.rb after_compact-$1.txt after_compact-$1.txt.png
# 	open before_compact-$1.txt.png
# 	open after_compact-$1.txt.png
# }

require 'fiddle'

color_iter = DATA.readlines.map(&:chomp).map { |i|
  i = i.to_i(16)
  [(i >> 16) & 0xFF, (i >> 8) & 0xFF, i & 0xFF, 255]
}.each

SIZEOF_HEAP_PAGE_HEADER_STRUCT = Fiddle::SIZEOF_VOIDP

SIZEOF_RVALUE           = 40
HEAP_PAGE_ALIGN_LOG     = 14
HEAP_PAGE_ALIGN         = 1 << HEAP_PAGE_ALIGN_LOG      # 2 ^ 14
HEAP_PAGE_ALIGN_MASK    = ~(~0 << HEAP_PAGE_ALIGN_LOG)  # Mask for getting page address
REQUIRED_SIZE_BY_MALLOC = Fiddle::SIZEOF_SIZE_T * 5     # padding needed by malloc
HEAP_PAGE_SIZE          = HEAP_PAGE_ALIGN - REQUIRED_SIZE_BY_MALLOC # Actual page size
HEAP_PAGE_OBJ_LIMIT     = (HEAP_PAGE_SIZE - SIZEOF_HEAP_PAGE_HEADER_STRUCT) / SIZEOF_RVALUE

def page_address_from_object_address object_address
  object_address & ~HEAP_PAGE_ALIGN_MASK
end

class Page < Struct.new :address, :obj_start_address, :capacity
  attr_reader :live_objects

  def initialize address, obj_start_address, capacity
    super
    @live_objects = []
  end

  def add_object address
    @live_objects << address
  end

  def each_slot
    return enum_for(:each_slot) unless block_given?

    objs = @live_objects.sort_by { |o| o["address"].to_i(16) }

    capacity.times do |i|
      expected = obj_start_address + (i * SIZEOF_RVALUE)
      if objs.any? && objs.first["address"].to_i(16) == expected
        yield objs.shift
      else
        yield nil
      end
    end
  end

  def full?
    @live_objects.count == capacity
  end

  def pinned_count
    @pinned_count ||= live_objects.find_all { |lo| lo.dig("flags", "pinned") }.count
  end

  def live_object_count
    @live_objects.count
  end

  def fragmented?
    !full?
  end
end

def page_info page_address
  limit = HEAP_PAGE_OBJ_LIMIT # Max number of objects per page

  # Pages have a header with information, so we have to take that in to account
  obj_start_address = page_address + SIZEOF_HEAP_PAGE_HEADER_STRUCT

  # If the object start address isn't evenly divisible by the size of a
  # Ruby object, we need to calculate the padding required to find the first
  # address that is divisible by SIZEOF_RVALUE
  if obj_start_address % SIZEOF_RVALUE != 0
    delta = SIZEOF_RVALUE - (obj_start_address % SIZEOF_RVALUE)
    obj_start_address += delta # Move forward to first address

    # Calculate the number of objects this page can actually hold
    limit = (HEAP_PAGE_SIZE - (obj_start_address - page_address)) / SIZEOF_RVALUE
  end

  Page.new page_address, obj_start_address, limit
end

require 'json'

# Keep track of pages
pages = {}

File.open(ARGV[0]) do |f|
  f.each_line do |line|
    object = JSON.load line

    # Skip roots. I don't want to cover this today :)
    if object["type"] != "ROOT"
      # The object addresses are stored as strings in base 16
      address      = object["address"].to_i(16)

      # Get the address for the page
      page_address = page_address_from_object_address(address)

      # Get the page, or make a new one
      page         = pages[page_address] ||= page_info(page_address)

      page.add_object object
    end
  end
end

require 'chunky_png'

pages = pages.values

p COUNT: pages.count
p FULL_PAGES: pages.find_all(&:full?).count
p FRAGMENTED_PAGES: pages.find_all(&:fragmented?).count

# We're using 2x2 pixel squares to represent objects, so the height of
# the PNG will be 2x the max number of objects, and the width will be 2x the
# number of pages
height = HEAP_PAGE_OBJ_LIMIT * 2
width = pages.size * 2

png = ChunkyPNG::Image.new(width, height, ChunkyPNG::Color::TRANSPARENT)
type_colors = Hash.new { |h,k|
  h[k] = ChunkyPNG::Color.rgba(*color_iter.next)
}

pinned_pages, rest = pages.partition { |page| page.pinned_count > 0 }

print_pages = pinned_pages.sort_by(&:pinned_count).reverse +
  rest.sort_by(&:live_object_count).reverse

$pinned_count=0
print_pages.each_with_index do |page, i|
  i = i * 2

  page.each_slot.with_index do |slot, j|
    j = j * 2
    if slot
      if slot.dig("flags", "pinned")
        $pinned_count +=1
        color = type_colors[slot["type"]]
        red = ChunkyPNG::Color.rgba(255, 0, 0, 255)
        color = red

        png[i, j]         = color
        png[i + 1, j]     = color
        png[i, j + 1]     = color
        png[i + 1, j + 1] = color
      else
        png[i, j]         = ChunkyPNG::Color.rgba(0, 0, 0, 255)
        png[i + 1, j]     = ChunkyPNG::Color.rgba(0, 0, 0, 255)
        png[i, j + 1]     = ChunkyPNG::Color.rgba(0, 0, 0, 255)
        png[i + 1, j + 1] = ChunkyPNG::Color.rgba(0, 0, 0, 255)
      end
    else
      # white = ChunkyPNG::Color.rgba(255, 255, 255, 255)
      # color = white

      # png[i, j]         = color
      # png[i + 1, j]     = color
      # png[i, j + 1]     = color
      # png[i + 1, j + 1] = color
    end
  end
end

png.save(ARGV[1], :interlace => true)
p type_colors
puts $pinned_count

__END__
00FF00
0000FF
FF0000
01FFFE
FFA6FE
FFDB66
006401
010067
95003A
007DB5
FF00F6
FFEEE8
774D00
90FB92
0076FF
D5FF00
FF937E
6A826C
FF029D
FE8900
7A4782
7E2DD2
85A900
FF0056
A42400
00AE7E
683D3B
BDC6FF
263400
BDD393

This one was given by tenderlove: https://gist.github.com/tenderlove/578cfb4ce677465d0a8ceb4362928a89

rm -rf out
mkdir -p out
./arbitrary_json.rb "$1" # 1000000 in my case
./heapviz.rb out/before.ndjson out/before.png
./heapviz.rb out/after.ndjson out/after.png

open out/before.png
open out/after.png

Formatting done using VSCode's C extension and using tabular indent.

https://lea.verou.me/2012/01/why-tabs-are-clearly-superior/

Signed-off-by: Ulysse Buonomo <buonomo.ulysse@gmail.com>
Introduce compaction to `fast.c`, only if `rb_gc_mark_movable` is
available. Otherwise, keep a version that supports ruby <2.7.

Since `rb_data_object_wrap` and `TypedData_Wrap_Struct` is available
in every supported versions (eg >=2.4.10), I've also removed support
for earlier versions.

Result for compaction is great. I've tested it by allocating and
removing arbitrary sized json. For a count of 70 pages, we can see
that we used to need 68 pages after compaction,and now 61 pages. Of
course this test include also lots of noise, but if we count pinned
objects (objects that are marked in C extension as uncollectible),
we can see that for a benchmark with no allocations, we have the
same amount as a benchmark with this commit with thousands of allocations
(499 pinned objects total), and before this, `Oj::Doc` used to generate
5k more objects (5499 pinned objects total).

Hence this is a win, see PR for a graphical result.

Signed-off-by: Ulysse Buonomo <buonomo.ulysse@gmail.com>
@ohler55
Copy link
Owner

ohler55 commented Apr 7, 2021

The benchmark results are good but I'm not happy with the code reformatting. The fast.c code is now formatted much different than the rest of the code. Please don't take that as an incentive to reformat the whole project. I prefer the format I use. I like having the code line up nicely which it is with 4 space tabs. I also prefer the open { to be on the same line as the condition or loop. Revert the formatting changes and I'll be glad to accept the PR.

The compaction is a nice addition.

@BuonOmo
Copy link
Contributor Author

BuonOmo commented Apr 7, 2021

@ohler55 may I still format while using both 4 spaces and { on the same line as code ? The original code uses a mix of tab and spaces which is really deceiving when trying to read the code, see:

image

Or on github, where we can see that there is no identation inside functions:

oj/ext/oj/fast.c

Lines 293 to 310 in a4f4590

leaf_fixnum_value(Leaf leaf) {
char *s = leaf->str;
int64_t n = 0;
int neg = 0;
int big = 0;
if ('-' == *s) {
s++;
neg = 1;
} else if ('+' == *s) {
s++;
}
for (; '0' <= *s && *s <= '9'; s++) {
n = n * 10 + (*s - '0');
if (NUM_MAX <= n) {
big = 1;
}
}

@BuonOmo
Copy link
Contributor Author

BuonOmo commented Apr 7, 2021

Here's an option that seems to fit your coding style:

# .clang-format
BasedOnStyle: Mozilla
IndentWidth: 4
BreakBeforeBraces: Attach
ColumnLimit: 0
AlwaysBreakAfterReturnType: AllDefinitions
PointerBindsToType: false

(See https://releases.llvm.org/11.0.0/tools/clang/docs/ClangFormatStyleOptions.html for options)

EDIT: you can see it with the latest commit 😄
EDIT2: Added AlwaysBreakAfterReturnType: AllDefinitions, even closer
EDIT3: PointerBindsToType: false

The latest edit is as close as it can be as what you seem to expect: doesn't change anything but tabs into spaces.

If this is now okay for you, I can squash before merging, eventually apply formatting changes to the whole stack and dig for space sensitive macros if there are any. And tell me if you want I to add compaction support for rails.c and valstack.c (maybe in a follow-up PR)!

@BuonOmo BuonOmo force-pushed the gc-compaction-compatibility branch from 47d7f12 to 36f21d9 Compare April 7, 2021 16:30
@ohler55
Copy link
Owner

ohler55 commented Apr 7, 2021

Better. So you are using a code formatting tool. Is there a way to get struct member and variable declaration to line up?

For example, instead of:

typedef struct Foo {
    int a;
    char *b;
}

Something like this:

typedef struct Foo {
    int  a;
    char *b;
}

@BuonOmo
Copy link
Contributor Author

BuonOmo commented Apr 7, 2021

@ohler55 there is a way with AlignConsecutiveDeclarations, unfortunately it does mix well with having pointers next to variable (PointerAlignment: Right or DerivePointerAlignment: true). We get kind of a hanging pointer (lgtm, but I understand that it may not be convenient..)

typedef struct _doc {
    Leaf          data;
    Leaf *        where;                 // points to current location
    Leaf          where_path[MAX_STACK]; // points to head of path
    char *        json;
    unsigned long size; // number of leaves/branches in the doc
    VALUE         self;
    Batch         batches;
    struct _batch batch0;
} * Doc;

And yeah, I'm using clang's tool which should be pretty portable !

It seems that the alignement you desire is not implemented because of some situations where it cannot apply: https://stackoverflow.com/a/40312667/6320039

@ohler55
Copy link
Owner

ohler55 commented Apr 7, 2021

Thats too bad. Not sure which I prefer of the two choices. Maybe the use of AlignConsecutiveDeclaration and then I'll use a quick emacs macro to fix it up.

It might be time to set up a nice formatter though. Something like what the golang goimports does. Super nice with go code anyway.

@BuonOmo
Copy link
Contributor Author

BuonOmo commented Apr 7, 2021

@ohler55 unfortunately the implementation of that precise feature seems to have been waiting for a long time (https://reviews.llvm.org/D27651).

I've looker a bit at possible alternative, the only one viable is https://github.com/uncrustify/uncrustify. However, its settings are very hard to understand. I've spent way too much time on that option.. If you want to take a look yourself, here are the two important entry points IMHO:

The other issue with uncrustify is that it is also less used, and harder for newcomer to contribute to Oj. I would use clang-format and make a concession on either variable or star alignment. (my favorite is star alignment, I've seen that variable alignement messes way too much with git's history, making it harder to blame or just navigate through)

Let me know what you decide !

@ohler55
Copy link
Owner

ohler55 commented Apr 7, 2021

Okay, let's go with no alignment. Disappointing but maybe there is something out there that does the job. I'll look a little bit too. You do make a good point about the clang formatter vs uncrustify. Let me know when you are ready and I'll approve and merge.

Using a `.clang-format` file, we fixed formatting rules to make sur they
fit as much as possible current existing code, while still fixing spaces
and tabs mixing issues.

Signed-off-by: Ulysse Buonomo <buonomo.ulysse@gmail.com>
Signed-off-by: Ulysse Buonomo <buonomo.ulysse@gmail.com>
@BuonOmo BuonOmo force-pushed the gc-compaction-compatibility branch from ef1516d to 11cc133 Compare April 7, 2021 20:52
@BuonOmo
Copy link
Contributor Author

BuonOmo commented Apr 7, 2021

@ohler55 I've updated my commits. Since you mentionned having the same formatting for all files, I've ran clang format for every file in the last commit. Feel free to discard that one (or ask me to) if you are not OK for such a huge change right now. I would definitely understand seeing the 10k diff..

@ohler55 ohler55 merged commit 33c6dd3 into ohler55:develop Apr 8, 2021
ohler55 added a commit that referenced this pull request Apr 8, 2021
ohler55 added a commit that referenced this pull request Apr 8, 2021
@ohler55
Copy link
Owner

ohler55 commented Apr 8, 2021

I merged and then reverted. I took your suggestion and set up a .clang-format file but for some reason when starting with the conversions you made the formatter refused to honor the setting I used. I'm trying to resolve that now. I certainly want the changes you made for the compaction but they are kind of lost in the complete reformatting of all the files.

@ohler55
Copy link
Owner

ohler55 commented Apr 9, 2021

The only way I could get the changes to stick was by starting with the original. It seems some of the formatting considers previous formatting. That or I don't know how to configure the clang-format tool correctly. Anyway the code has been reformatted using the checked in clang-format configuration file in the repo root. It does not contain your changes for compaction. If you can make those changes on the current dev without reformatting all the files that would be great. If not I'll do a visual diff or look at the earlier commits and put in the changes. I'd rather give you credit for the compaction changes though if you don't mind the extra effort.

@BuonOmo BuonOmo deleted the gc-compaction-compatibility branch April 9, 2021 08:39
BuonOmo added a commit to BuonOmo/oj that referenced this pull request Apr 9, 2021
Introduce compaction to `fast.c`, only if `rb_gc_mark_movable` is
available. Otherwise, keep a version that supports ruby <2.7.

Since `rb_data_object_wrap` and `TypedData_Wrap_Struct` is available
in every supported versions (eg >=2.4.10), I've also removed support
for earlier versions.

Result for compaction is great. I've tested it by allocating and
removing arbitrary sized json. For a count of 70 pages, we can see
that we used to need 68 pages after compaction,and now 61 pages. Of
course this test include also lots of noise, but if we count pinned
objects (objects that are marked in C extension as uncollectible),
we can see that for a benchmark with no allocations, we have the
same amount as a benchmark with this commit with thousands of allocations
(499 pinned objects total), and before this, `Oj::Doc` used to generate
5k more objects (5499 pinned objects total).

Hence this is a win, see ohler55#650 for a graphical result.

Signed-off-by: Ulysse Buonomo <buonomo.ulysse@gmail.com>
@BuonOmo
Copy link
Contributor Author

BuonOmo commented Apr 9, 2021

@ohler55 I had the same issue while running the PR, here's the new one: #653.

Thank you for the work, and it looks like the options you found are a great fit to your coding style !

ohler55 pushed a commit that referenced this pull request Apr 9, 2021
* Make compaction available for `Oj::Doc`

Introduce compaction to `fast.c`, only if `rb_gc_mark_movable` is
available. Otherwise, keep a version that supports ruby <2.7.

Since `rb_data_object_wrap` and `TypedData_Wrap_Struct` is available
in every supported versions (eg >=2.4.10), I've also removed support
for earlier versions.

Result for compaction is great. I've tested it by allocating and
removing arbitrary sized json. For a count of 70 pages, we can see
that we used to need 68 pages after compaction,and now 61 pages. Of
course this test include also lots of noise, but if we count pinned
objects (objects that are marked in C extension as uncollectible),
we can see that for a benchmark with no allocations, we have the
same amount as a benchmark with this commit with thousands of allocations
(499 pinned objects total), and before this, `Oj::Doc` used to generate
5k more objects (5499 pinned objects total).

Hence this is a win, see #650 for a graphical result.

Signed-off-by: Ulysse Buonomo <buonomo.ulysse@gmail.com>

* Change `clang-format` -> `.clang-format`

According to clang's documentation:

> When using `-style=file`, **clang-format** for each input file will
> try to find the `.clang-format` file located in the closest parent
> directory of the input file. When the standard input is used, the
> search is started from the current directory.

Hence the `.clang-format` seems more appropriate

Signed-off-by: Ulysse Buonomo <buonomo.ulysse@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants