Use memcpy() to dump fixnum/float #734

Merged
merged 1 commit into from Jan 10, 2022

Watson1978
Collaborator

This patch uses the standard C library's memcpy() to copy the string when dumping fixnum and float values.
(Ref. ohler55#674)
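
As a rough illustration of the idea, the sketch below contrasts a hand-written byte loop with a single `memcpy()` call when appending an already-formatted number to an output buffer. It is not Oj's actual code: `struct out_buf`, `append_loop`, `append_memcpy`, and `dump_long` are hypothetical names invented for this example, and the fixed 64-byte buffer is an assumption made to keep the sketch short.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size output buffer (an assumption, not Oj's real struct). */
struct out_buf {
    char   data[64];
    size_t len;
};

/* Hand-written byte-by-byte copy: the style of loop a memcpy() call can replace.
 * The caller must ensure that n + 1 bytes fit in the remaining space. */
static void append_loop(struct out_buf *out, const char *src, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out->data[out->len++] = src[i];
    }
    out->data[out->len] = '\0';
}

/* Same result with one memcpy() call; the C library is free to pick an
 * optimized copy routine for the platform. Same space requirement as above. */
static void append_memcpy(struct out_buf *out, const char *src, size_t n) {
    memcpy(out->data + out->len, src, n);
    out->len += n;
    out->data[out->len] = '\0';
}

/* Format an integer into a scratch buffer, then append it if it fits. */
static void dump_long(struct out_buf *out, long long num) {
    char scratch[32];
    int  n = snprintf(scratch, sizeof(scratch), "%lld", num);

    if (n > 0 && (size_t)n < sizeof(out->data) - out->len) {
        append_memcpy(out, scratch, (size_t)n);
    }
}
```

Both append functions leave the buffer in the same state, so the change is purely about how the bytes are moved, which is consistent with the modest speedup reported below.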

Benchmark        | before (i/s) | after (i/s) | speedup
--               | --           | --          | --
Oj.dump          | 1.046M       | 1.102M      | 1.054x

### Environment
- Zorin OS 16
- AMD Ryzen 7 5700G
- gcc version 11.1.0
- Ruby 3.1.0

### Before
```
Warming up --------------------------------------
             Oj.dump   106.035k i/100ms
Calculating -------------------------------------
             Oj.dump      1.046M (± 1.0%) i/s -     15.799M in  15.098842s
```

### After
```
Warming up --------------------------------------
             Oj.dump   112.786k i/100ms
Calculating -------------------------------------
             Oj.dump      1.102M (± 1.1%) i/s -     16.580M in  15.051956s
```

### Test code
```ruby
require 'benchmark/ips'
require 'oj'

data = {
  float: 3.141592653589793,
  fixnum: 2 ** 60
}

Benchmark.ips do |x|
  x.time = 15

  x.report('Oj.dump') { Oj.dump(data, mode: :compat) }
end
```
@ohler55
Owner

ohler55 commented Jan 10, 2022

Nice catch.

@ohler55 ohler55 merged commit 5bf1d89 into ohler55:develop Jan 10, 2022
@ohler55
Owner

ohler55 commented Jan 10, 2022

Nice catch and an improvement.

@Watson1978 Watson1978 deleted the memcpy branch January 10, 2022 00:40
Watson1978 added a commit to Watson1978/oj that referenced this pull request Jan 14, 2022
The standard C library may use SIMD instructions,
which would make it faster than our own code.

Similar:
- ohler55#734
- ohler55#674
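
The benchmark for this commit dumps arrays of `true`, `false`, and `nil`, so a plausible reading is that the change concerns how those fixed literals are written. The sketch below shows that general technique only, under that assumption: `put_true_bytes` and `put_literal` are hypothetical helpers written for this example and do not come from the Oj source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "own code" style: store each byte of the literal separately.
 * Returns the new write position. */
static char *put_true_bytes(char *cur) {
    *cur++ = 't';
    *cur++ = 'r';
    *cur++ = 'u';
    *cur++ = 'e';
    return cur;
}

/* Hypothetical memcpy() style: copy the whole literal in one call.
 * With a small constant length, compilers can often turn this into a
 * single wide store instead of a function call. */
static char *put_literal(char *cur, const char *lit, size_t n) {
    memcpy(cur, lit, n);
    return cur + n;
}
```

A caller would chain the helpers, for example `p = put_literal(p, "null", 4);`, and must make sure the destination has room for every literal before writing.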

Benchmark        | before (i/s) | after (i/s) | speedup
--               | --           | --          | --
Oj.dump (macOS)  | 1.699M       | 2.020M      | 1.189x
Oj.dump (Linux)  | 1.849M       | 2.260M      | 1.222x

### Environment
- macOS
  - macOS 12.1
  - Apple M1 Max
  - Apple clang version 13.0.0 (clang-1300.0.29.30)
  - Ruby 3.1.0
- Linux
  - Zorin OS 16
  - AMD Ryzen 7 5700G
  - gcc version 11.1.0
  - Ruby 3.1.0

### macOS
#### Before
```
Warming up --------------------------------------
             Oj.dump   169.730k i/100ms
Calculating -------------------------------------
             Oj.dump      1.699M (± 0.7%) i/s -     25.629M in  15.089624s
```

#### After
```
Warming up --------------------------------------
             Oj.dump   201.206k i/100ms
Calculating -------------------------------------
             Oj.dump      2.020M (± 0.9%) i/s -     30.382M in  15.044372s
```

### Linux
#### Before
```
Warming up --------------------------------------
             Oj.dump   180.943k i/100ms
Calculating -------------------------------------
             Oj.dump      1.849M (± 1.1%) i/s -     27.865M in  15.072276s
```

#### After
```
Warming up --------------------------------------
             Oj.dump   224.695k i/100ms
Calculating -------------------------------------
             Oj.dump      2.260M (± 1.4%) i/s -     33.929M in  15.012352s
```

### Test code
```ruby
require 'benchmark/ips'
require 'oj'

data = {
  true: (0..10).map { true },
  false: (0..10).map { false },
  null: (0..10).map { nil },
}

Benchmark.ips do |x|
  x.time = 15

  x.report('Oj.dump') { Oj.dump(data) }
end
```
ohler55 pushed a commit that referenced this pull request Jan 14, 2022
casperisfine pushed a commit to Shopify/oj that referenced this pull request Jan 14, 2022
casperisfine pushed a commit to Shopify/oj that referenced this pull request Jan 14, 2022