Performance issues on Android when enabling config.useGL30 #7345

Open
1 of 2 tasks
LobbyDivinus opened this issue Feb 19, 2024 · 7 comments

Comments

@LobbyDivinus

LobbyDivinus commented Feb 19, 2024

Issue details

I'm seeing significantly reduced performance on Android when enabling AndroidApplicationConfiguration.useGL30. It seems to scale with the number of draw calls, regardless of whether I'm drawing meshes with custom shaders or just default sprites. I want to use GL30 functionality, which is why I'd like to keep it enabled. In the example below, my old but high-end Samsung device runs at 10 FPS with useGL30 enabled versus 60 FPS with it disabled.

Reproduction steps/code

In my project I experience this issue with 3D meshes and UI. However, I was able to reproduce it by just drawing 200 sprites (each with a separate draw call) in the template project.

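// AndroidLauncher.java
import android.os.Bundle;

import com.badlogic.gdx.backends.android.AndroidApplication;
import com.badlogic.gdx.backends.android.AndroidApplicationConfiguration;

public class AndroidLauncher extends AndroidApplication {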
	@Override
	protected void onCreate (Bundle savedInstanceState) {
		super.onCreate(savedInstanceState);
		AndroidApplicationConfiguration config = new AndroidApplicationConfiguration();
		config.useGL30 = true; // Toggle this to test difference
		initialize(new MyGdxGame(), config);
	}
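}

// MyGdxGame.java
import com.badlogic.gdx.ApplicationAdapter;
import com.badlogic.gdx.Gdx;
import com.badlogic.gdx.graphics.Color;
import com.badlogic.gdx.graphics.Texture;
import com.badlogic.gdx.graphics.g2d.BitmapFont;
import com.badlogic.gdx.graphics.g2d.SpriteBatch;
import com.badlogic.gdx.scenes.scene2d.Stage;
import com.badlogic.gdx.scenes.scene2d.ui.Label;
import com.badlogic.gdx.utils.Align;
import com.badlogic.gdx.utils.ScreenUtils;
import com.badlogic.gdx.utils.StringBuilder; // libGDX's StringBuilder, which provides clear()
import com.badlogic.gdx.utils.viewport.StretchViewport;

public class MyGdxGame extends ApplicationAdapter {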
	SpriteBatch batch;
	Texture img;

	Stage stage;
	Label label;
	StringBuilder labelBuilder = new StringBuilder();
	
	@Override
	public void create () {
		batch = new SpriteBatch();
		img = new Texture("badlogic.jpg");

		stage = new Stage();
		label = new Label(" ", new Label.LabelStyle(new BitmapFont(), Color.WHITE));
		label.setFillParent(true);
		label.setAlignment(Align.topLeft);
		stage.addActor(label);
		stage.setViewport(new StretchViewport(400f, 300f));
	}

	@Override
	public void render () {
		ScreenUtils.clear(0, 0, 0, 1);
		int xmax = 400;
		int ymax = 300;
		int ctr = 200;

		batch.getProjectionMatrix().setToOrtho2D(0f, 0f, xmax, ymax);
		for (int i = 0; i < ctr; i++) {
			batch.begin();
			batch.draw(img, xmax * (0.5f + ((float) i / (ctr - 1) - 0.5f) * 0.8f) - 20f, 0.5f * ymax, 40f, 40);
			batch.end();
		}

		stage.getViewport().update(Gdx.graphics.getBackBufferWidth(), Gdx.graphics.getBackBufferHeight(), true);
		labelBuilder.clear();
		labelBuilder.append(Gdx.graphics.getFramesPerSecond());
		label.setText(labelBuilder);
		stage.draw();
	}
	
	@Override
	public void dispose () {
		stage.dispose();
		batch.dispose();
		img.dispose();
	}
}

I'm using the SpriteBatch for convenience here. Of course, in a real project you would draw the sprites with a single draw call. However, this toy example aims to show draw-call-related performance issues, which matter when the number of draw calls is hard to reduce.
If your performance is too good to see a difference (that is, above 60 FPS), try increasing ctr until the FPS drops below that limit.

The difference seems to be so big that I wonder if there's some bigger underlying issue here.

Version of libGDX and/or relevant dependencies

1.12.1
No other dependencies were used.

Please select the affected platforms

  • Android
  • Windows
    I didn't test it on other platforms.
@Tom-Ski
Member

Tom-Ski commented Feb 19, 2024

We had some similar issue here a while back.
#3916

It's probably worth dialling it in to make sure it can be isolated to just VertexArray vs VBOWithVAO directly with a Mesh. SpriteBatch here shouldn't be important. tests/gdx-tests/src/com/badlogic/gdx/tests/VBOWithVAOPerformanceTest.java has a test for regressions; we can probably make a duplicate of it for comparing VBOWithVAO against the vertex array.

That said, I wonder if VertexArray may perform better when rendering very tiny meshes, as in your test scenario.

What kind of throughput were you doing with your 3D work? Were you uploading lots of vertices per frame?

@LobbyDivinus
Author

Hi, thank you for your fast response. The VAO thing sounds related since that's what differentiates the SpriteBatch implementation when using gles3 instead of gles2.

I did some more thorough testing by basically replacing the SpriteBatch with a model batch and came to the conclusion that meshes are actually not affected by this. The reason I thought otherwise in my project before is likely related to the UI having such a huge impact. The meshes in my project are static.

For the SpriteBatch I tried different VertexDataTypes:

  • VertexArray: Only works with GLES2
  • VertexBufferObjectWithVAO: This is the default for GLES3 atm. Shows the described performance issues.
  • VertexBufferObject: Noticeably better performance, though still not on par with GLES2
  • VertexBufferObjectSubData: same as VertexBufferObjectWithVAO

@LobbyDivinus
Author

I made an interesting observation. Commenting out the following three lines in SpriteBatch.flush() fixes the performance issues completely regardless of which VertexDataType is used:

                Buffer indicesBuffer = (Buffer)mesh.getIndicesBuffer(true);
                indicesBuffer.position(0);
                indicesBuffer.limit(count);

https://github.com/libgdx/libgdx/blob/7bed7d3666d3b343aebeb1c3f733eae63380dd10/gdx/src/com/badlogic/gdx/graphics/g2d/SpriteBatch.java#L959C1-L961C30

No idea how this is connected to VAO, though. Either way, I wonder why the index buffer is touched at all in flush() since it does not actually change so we don't have to re-upload it, right?

@Tom-Ski
Member

Tom-Ski commented Feb 19, 2024

Yes, that is strange. Passing true marks the index buffer as dirty, so its data gets re-uploaded every time the mesh is bound, i.e. on every flush.

Can you try changing it to just mesh.getIndicesBuffer(false) and keeping the rest of the code?
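
To illustrate the dirty-flag mechanism described above, here is a minimal, self-contained sketch. It is a simplified model and not libGDX's actual code: ModelIndexBuffer, its uploads counter, and simulate() are hypothetical names introduced only for this example, and the "upload" is just a counter increment where a real implementation would call into GL.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

/** Simplified model of an index buffer with a dirty flag (hypothetical, not libGDX code). */
public class DirtyFlagSketch {

    /** Hypothetical stand-in for a GPU-backed index buffer. */
    public static final class ModelIndexBuffer {
        private final ShortBuffer buffer;
        private boolean dirty = true; // the initial contents still need one upload
        public int uploads = 0;       // counts simulated GPU uploads

        public ModelIndexBuffer(int maxIndices) {
            buffer = ByteBuffer.allocateDirect(maxIndices * Short.BYTES)
                    .order(ByteOrder.nativeOrder())
                    .asShortBuffer();
        }

        /** Asking for the buffer "for writing" marks it dirty, even if nothing is written. */
        public ShortBuffer getBuffer(boolean forWriting) {
            dirty |= forWriting;
            return buffer;
        }

        /** Uploads only when dirty; a real implementation would issue a GL call here. */
        public void bind() {
            if (dirty) {
                uploads++;
                dirty = false;
            }
        }
    }

    /** Simulates the given number of flushes and returns how many uploads happened. */
    public static int simulate(int flushes, boolean forWriting) {
        ModelIndexBuffer indices = new ModelIndexBuffer(6000);
        for (int i = 0; i < flushes; i++) {
            indices.getBuffer(forWriting); // what flush() does before binding the mesh
            indices.bind();
        }
        return indices.uploads;
    }

    public static void main(String[] args) {
        System.out.println("getBuffer(true):  " + simulate(200, true) + " uploads");
        System.out.println("getBuffer(false): " + simulate(200, false) + " uploads");
    }
}
```

In this model, 200 flushes with getBuffer(true) trigger 200 uploads, while getBuffer(false) triggers only the single initial upload. That matches the pattern reported in this thread, where the slowdown grows with the number of flushes per frame.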

@LobbyDivinus
Copy link
Author

Indeed, not marking the buffer as changed also fixes the issue.

@Tom-Ski
Member

Tom-Ski commented Feb 19, 2024

Yup, I think this is completely unnecessary, as sprite batch indices never change after construction. It will be a nice little performance boost. I'll do some further testing, and if it's good in all scenarios I'll make the change.

@Tom-Ski
Member

Tom-Ski commented Feb 20, 2024

Seems to be no regressions in behaviour. I want to write some more sophisticated tests to check the behaviour under different types of load. On desktop I can't see anything noticeable yet, but I probably have to hit it with an app that issues a very high number of draw calls. Even then, I doubt my GPUs have little enough bandwidth for this to become a bottleneck. Many draw calls with a full batch should be the best stress test, I think, since that uploads a full index buffer on every flush. Will try that on Android asap.

Overall it makes sense to just upload once at the start, though. It just means that memory usage goes up to the max size of your batch from the get-go, whereas previously it depended on how well you were utilizing the batch.
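
As a rough sketch of the trade-off discussed above, the snippet below builds the fixed quad index pattern a sprite batch uses (two triangles per sprite, six 16-bit indices) and estimates the sizes involved. The class and method names are hypothetical, the batch capacity of 1000 sprites is an assumption chosen for illustration, and the per-frame figure is only an estimate of index data re-sent under the old behaviour, not a measurement.

```java
public class QuadIndexSketch {

    /** Builds the static index pattern for the given number of quads (0,1,2, 2,3,0 per quad). */
    public static short[] buildQuadIndices(int sprites) {
        // 16-bit indices limit a batch to 32767 / 4 = 8191 quads.
        if (sprites < 0 || sprites > 8191) {
            throw new IllegalArgumentException("sprites must be in [0, 8191]: " + sprites);
        }
        short[] indices = new short[sprites * 6];
        short vertex = 0;
        for (int i = 0; i < indices.length; i += 6, vertex += 4) {
            indices[i] = vertex;
            indices[i + 1] = (short) (vertex + 1);
            indices[i + 2] = (short) (vertex + 2);
            indices[i + 3] = (short) (vertex + 2);
            indices[i + 4] = (short) (vertex + 3);
            indices[i + 5] = vertex;
        }
        return indices;
    }

    /** Bytes occupied by the full index buffer when it is uploaded once up front. */
    public static long staticIndexBytes(int sprites) {
        return (long) sprites * 6 * Short.BYTES;
    }

    /** Estimated index bytes re-sent per frame if every flush re-uploads its indices. */
    public static long uploadedBytesPerFrame(int flushesPerFrame, int spritesPerFlush) {
        return (long) flushesPerFrame * staticIndexBytes(spritesPerFlush);
    }

    public static void main(String[] args) {
        int capacity = 1000; // assumed batch capacity for this example
        System.out.println("Upload once: " + staticIndexBytes(capacity) + " bytes");
        System.out.println("200 flushes x 1 sprite: " + uploadedBytesPerFrame(200, 1) + " bytes/frame");
        System.out.println("200 flushes x full batch: " + uploadedBytesPerFrame(200, capacity) + " bytes/frame");
    }
}
```

Under these assumptions, the one-time upload costs 12000 bytes, while 200 full-batch flushes would re-send about 2.4 MB of index data per frame. With one sprite per flush the data volume is tiny (2400 bytes per frame), which suggests the cost in the reproduction case comes from the per-flush upload calls rather than from bandwidth.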

2 participants