Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alloc-free way of using SubImage #2902

Open
quasilyte opened this issue Feb 4, 2024 · 2 comments
Open

Alloc-free way of using SubImage #2902

quasilyte opened this issue Feb 4, 2024 · 2 comments

Comments

@quasilyte
Copy link
Contributor

quasilyte commented Feb 4, 2024

Operating System

n/a, all systems.

What feature would you like to be added?

A way to render/draw a subimage of an image without extra allocations.

Some ways to do it:

  • A method like img.TransferSubImage(&dst, rect) (suggested by @tinne26)
  • A new DrawImageOptions field that can take a subimage bounds (zero value - no subimage is needed)
  • A new DrawSubImage function that takes the bounds in addition to the image and options

There could be other ways to solve it.

I don't really need a generic solution, but I want there to be a way to avoid that extra allocation. Then I can wrap the Image type with my own code and make this allocation go away using any means possible.

Right now I'm using an unsafe approach.

https://github.com/quasilyte/ge/blob/master/ebiten_image.go
https://github.com/quasilyte/ge/blob/7696bd01a11f683b4067c8fdf14c63ab29dee920/image_cache.go#L20
https://github.com/quasilyte/ge/blob/7696bd01a11f683b4067c8fdf14c63ab29dee920/sprite.go#L299-L309

Existing ways of avoiding that:

  • Only do a SubImage when necessary (when bounds change, when image changes, etc.) instead of on every Draw
  • Use an unsafe approach above and pray for your gods if you have any

Why is this needed?

Let's assume that there is a user-defined Sprite type. It uses the ebitengine Image to do stuff.

Some of the use-cases of this Sprite may involve SubImage per Draw call: an underlying image represents a sprite-sheet with multiple frames (we only want to render single "frame"). This happens with animations, tilesets, user-defined atlases, etc.

Let's suppose that this Sprite calls an Image.SubImage for every Draw call (every sprite's Draw is called for every Game's Draw call). This will cause the game to potentially do hundreds of unintended heap allocations per every Draw call.

This is especially painful on mobiles and wasm targets.

Given this benchmark:

package mytest

import (
    "image"
    "testing"

    "github.com/hajimehoshi/ebiten/v2"
)

func BenchmarkSubImage(b *testing.B) {
    img := ebiten.NewImage(1, 1)
    rect := image.Rect(0, 0, 1, 1)

    b.ResetTimer() // Resets allocs as well
    for i := 0; i < b.N; i++ {
        img.SubImage(rect)
    }
}

We'll get these results:

$ go test -bench=. -benchmem .
goos: linux
goarch: amd64
pkg: github.com/quasilyte/mytest
cpu: 12th Gen Intel(R) Core(TM) i5-1235U
BenchmarkSubImage-12         8730554           128.8 ns/op         112 B/op           1 allocs/op
PASS
ok      github.com/quasilyte/mytest    1.314s

This allocation comes from here:

   38.63GB    38.63GB (flat, cum)   100% of Total
         .          .    823:func (i *Image) SubImage(r image.Rectangle) image.Image {
         .          .    824:	i.copyCheck()
         .          .    825:	if i.isDisposed() {
         .          .    826:		return nil
         .          .    827:	}
         .          .    828:
         .          .    829:	r = r.Intersect(i.Bounds())
         .          .    830:	// Need to check Empty explicitly. See the standard image package implementations.
         .          .    831:	if r.Empty() {
         .          .    832:		r = image.ZR
         .          .    833:	}
         .          .    834:
         .          .    835:	var orig = i
         .          .    836:	if i.isSubImage() {
         .          .    837:		orig = i.original
         .          .    838:	}
         .          .    839:
   38.63GB    38.63GB    840:	img := &Image{
         .          .    841:		image:    i.image,
         .          .    842:		bounds:   r,
         .          .    843:		original: orig,
         .          .    844:	}

And even if we would return Image instead of *Image, it would still allocate as we return it as an interface value.

Every allocation is of sizeof(ebiten.Image).

quasilyte added a commit to quasilyte/ebiten that referenced this issue Feb 4, 2024
Instead of allocating the result on heap for the user,
allow the caller side to decide whether it should be heap-allocated
or stack-allocated.

This new method allows us to have a zero-alloc SubImage routine
for the use cases where we don't want to make an extra allocation.

The performance difference is measurable (comparison is done with benchstat):

	name             old time/op    new time/op    delta
	SubImage-12     203ns ± 2%      33ns ± 2%   -83.91%  (p=0.000 n=20+19)

	name             old alloc/op   new alloc/op   delta
	SubImage-12      112B ± 0%        0B       -100.00%  (p=0.000 n=20+20)

Using SubImage is `~203ns`, SubImageInto is `~33ns`.
SubImage does 1 allocation, SubImageInto does none.

SubImage is rewritten in a way to avoid the code duplication.
It should work identically to the previous contract (it returns nil
if `i` is disposed, etc.)

Fixes hajimehoshi#2902
@hajimehoshi hajimehoshi added this to the v2.7.0 milestone Feb 4, 2024
@hajimehoshi hajimehoshi removed this from the v2.7.0 milestone Feb 11, 2024
@SolarLune
Copy link

SolarLune commented Feb 16, 2024

I would probably go with @tinne26's idea of transferring a subimage, mainly because that should maintain current functionality. I like the idea of adding a subimage field to DrawImageOptions, but you can't easily add subimage rects together (though I don't really think you'd usually want to do that, so it's not really a big deal).

@quasilyte Just wondering, is it possible or performant for you to simply use DrawImage() to manually handle subimaging? To, with an intermediate buffer, draw the spritesheet with the proper offset to "subimage"? I'd imagine not, but I'm not sure how efficiently ebiten handles drawing pixels that fall outside of a destination buffer's draw range.

@quasilyte
Copy link
Contributor Author

@SolarLune the intermediate image solution could work, but I haven't measured the performance difference.

For now, I'm caching the subimage inside the sprite object and updating it whether the "frame" fields were updated (like the frame size being changed, etc.)
This is good enough in most cases as frame sizes usually don't change.
It wouldn't work for things where frame offset changes very frequently (making caching less efficient).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants