Performance issues on Android when enabling config.useGL30 #7345
Comments
We had a similar issue here a while back. It's probably worth narrowing it down to make sure it can be isolated to just VertexArray vs VBOWithVAO directly with a Mesh; SpriteBatch shouldn't be important here. tests/gdx-tests/src/com/badlogic/gdx/tests/VBOWithVAOPerformanceTest.java has a test for regressions, and we can probably duplicate it to compare VBOWithVAO against the vertex array. That said, I wonder if VertexArray may perform better when rendering very tiny meshes, as in your test scenario. What kind of throughput did you have in your 3D work? Were you uploading lots of vertices per frame?
Hi, thank you for your fast response. The VAO thing sounds related since that's what differentiates the SpriteBatch implementation when using gles3 instead of gles2. I did some more thorough testing by basically replacing the SpriteBatch with a model batch and came to the conclusion that meshes are actually not affected by this. The reason I thought otherwise in my project before is likely related to the UI having such a huge impact. The meshes in my project are static. For the SpriteBatch I tried different VertexDataTypes:
I made an interesting observation. Commenting out the following three lines in SpriteBatch.flush() fixes the performance issue completely, regardless of which VertexDataType is used:
    Buffer indicesBuffer = (Buffer)mesh.getIndicesBuffer(true);
    indicesBuffer.position(0);
    indicesBuffer.limit(count);
I have no idea how this is connected to VAO, though. Either way, I wonder why the index buffer is touched at all in flush(): it does not actually change, so we shouldn't have to re-upload it, right?
Yes, that is strange. Requesting the buffer that way marks it as dirty, so the buffer data is going to be uploaded every frame when the mesh is bound. Can you try changing it to just mesh.getIndicesBuffer(false) and keeping the rest of the code?
Indeed, not marking the buffer as changed also fixes the issue.
Yup, I think this is completely unnecessary, as the sprite batch indices never change after construction. It will be a nice little performance boost. I will do some further testing and, if it's good in all scenarios, make the change.
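For anyone following along, the dirty-flag behaviour discussed above can be sketched with a toy model. This is not libGDX's actual index buffer class: ToyIndexBuffer, its uploads counter, and simulate() are hypothetical names made up for illustration, and the "upload" is only a counter standing in for a GL buffer upload. It only assumes what was said in the thread, namely that asking for the buffer for writing marks it dirty and that a dirty buffer is re-uploaded when bound.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

/** Toy model of an index buffer with a dirty flag (hypothetical, not libGDX code). */
final class DirtyIndexBufferSketch {

    static final class ToyIndexBuffer {
        private final ShortBuffer buffer;
        private boolean dirty = true; // the initial contents need one upload
        int uploads = 0;              // counts simulated GPU uploads

        ToyIndexBuffer(int maxIndices) {
            buffer = ByteBuffer.allocateDirect(maxIndices * 2)
                .order(ByteOrder.nativeOrder())
                .asShortBuffer();
        }

        /** Requesting the buffer for writing marks it dirty, even if nothing is written. */
        ShortBuffer getBuffer(boolean forWriting) {
            dirty |= forWriting;
            return buffer;
        }

        /** Stand-in for binding before a draw call: "uploads" only when dirty. */
        void bind() {
            if (dirty) {
                uploads++; // a real implementation would upload to the GPU here
                dirty = false;
            }
        }
    }

    /** Simulates a number of flushes and returns how many uploads happened. */
    static int simulate(int flushes, boolean forWriting) {
        ToyIndexBuffer indices = new ToyIndexBuffer(6000);
        for (int i = 0; i < flushes; i++) {
            ShortBuffer b = indices.getBuffer(forWriting);
            b.position(0);
            b.limit(6); // one sprite = 6 indices
            indices.bind();
        }
        return indices.uploads;
    }

    public static void main(String[] args) {
        System.out.println("forWriting=true:  " + simulate(200, true) + " uploads");
        System.out.println("forWriting=false: " + simulate(200, false) + " uploads");
    }
}
```

In this model, 200 flushes with forWriting=true trigger 200 uploads, while forWriting=false leaves only the single initial upload. Changing position and limit does not touch the dirty flag, which matches the observation that keeping those two lines is harmless.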
There seem to be no regressions in behaviour. I want to make some more sophisticated tests to check the behaviour under different types of load. On desktop I can't really see anything noticeable yet, but I probably have to hit it with an app that makes a very high number of draw calls. Even then, I don't think my GPUs have little enough bandwidth to encounter this as a bottleneck. Many draw calls with a full batch should be the best stress test, I think, since it will be uploading a full index buffer every flush. I will try that on Android asap. Overall it makes sense to just upload at the start, though. It just means that the memory usage is going to be increased to the max size of your batch from the get-go, whereas with the previous behaviour it depended on how well you were utilizing the batch.
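As a rough back-of-the-envelope check of the "full index buffer every flush" stress case, here is a small estimate. It assumes 6 indices per sprite stored as 2-byte shorts and that an upload covers exactly the indices up to the buffer limit; both are assumptions of this sketch, as are the class and method names. The 200 flushes per frame and 60 FPS are simply the figures from the issue description, and 1000 is used as an example batch size.

```java
/** Rough estimate of index data re-uploaded by flushes (illustrative assumptions only). */
final class IndexUploadEstimate {
    static final int INDICES_PER_SPRITE = 6; // two triangles per quad
    static final int BYTES_PER_INDEX = 2;    // short indices

    /** Bytes of index data covered by one flush containing the given number of sprites. */
    static long bytesPerFlush(int spritesInFlush) {
        return (long) spritesInFlush * INDICES_PER_SPRITE * BYTES_PER_INDEX;
    }

    /** Bytes of index data uploaded per second if every flush re-uploads its indices. */
    static long bytesPerSecond(int spritesPerFlush, int flushesPerFrame, int fps) {
        return bytesPerFlush(spritesPerFlush) * flushesPerFrame * fps;
    }

    public static void main(String[] args) {
        // Full batches: 1000 sprites per flush, 200 flushes per frame, 60 FPS.
        System.out.println(bytesPerSecond(1000, 200, 60) + " bytes/s with full batches");
        // One sprite per flush, as in the reproduction from the issue.
        System.out.println(bytesPerSecond(1, 200, 60) + " bytes/s with one sprite per flush");
    }
}
```

Under these assumptions, a full 1000-sprite batch is 12,000 bytes of indices per flush, or 144,000,000 bytes per second at 200 flushes per frame and 60 FPS, while the one-sprite-per-flush case is only 144,000 bytes per second. The estimate does not model any per-call driver overhead.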
Issue details
I'm experiencing significantly reduced performance on Android when enabling AndroidApplicationConfiguration.useGL30. It seems to be related to the number of draw calls, regardless of whether I'm drawing meshes with custom shaders or just default sprites. I'd like to use GL30 functionality, which is why I want to enable it. In the example below, the performance difference on my old but high-end Samsung device is 10 FPS with useGL30 enabled vs 60 FPS without it.
Reproduction steps/code
In my project I experience this issue with 3D meshes and UI. However, I was able to reproduce it by just drawing 200 sprites (each with a separate draw call) in the template project.
I'm using the SpriteBatch for convenience here. Of course, in a real project you would draw the sprites with a single draw call. However, this toy example aims to show draw call related performance issues that are relevant when draw calls are hard to reduce.
If your performance is too good to see a difference (that is, above 60 FPS), you may try increasing ctr until the FPS drops below the limit.
The difference seems to be so big that I wonder if there's some bigger underlying issue here.
Version of libGDX and/or relevant dependencies
1.12.1
No other dependencies were used.
Please select the affected platforms
I didn't test it on other platforms.