Print some machinst metrics #8107

lpereira · 2024-03-12T22:15:06Z

This is just a first pass at trying to gather some insights that might lead to memory savings later. For instance, by compiling SpiderMonkey with this patch, we get these metrics out (each line lists the minimum, median, 90th percentile, and maximum for that key):

Key(machinst.buffer.data): 0.0: 4 0.5: 412.0315874877959 0.9: 1636.1480687972453 1.0: 110084
Key(machinst.buffer.relocs): 0.0: 0 0.5: 1.9999056357269998 0.9: 9.999149133010329 1.0: 600
Key(machinst.buffer.traps): 0.0: 0 0.5: 11.999920242205052 0.9: 57.00277828432678 1.0: 3833
Key(machinst.buffer.call_sites): 0.0: 0 0.5: 1.9999056357269998 0.9: 9.999149133010329 1.0: 600

These numbers suggest that the fields in the MachBuffer struct could be reduced in size:

- Half of the functions are under 412 bytes; 90% are under 1.6 kB.
- Half of the functions have at most 2 relocs; 90% have at most 10.
- Half of the functions have at most 12 traps; 90% have at most 57. The default allocation of 16 is probably OK for this field.
- Half of the functions have at most 2 call sites; 90% have at most 10.
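For readers who want to reproduce this kind of summary elsewhere, here is a minimal sketch of how per-key quantile lines in the format above could be computed. It is not the code from this patch: the `Metrics` type, its methods, and the sample values are hypothetical, and it stores every sample and computes exact, linearly interpolated quantiles, whereas the fractional values in the output above suggest the patch uses an approximating summary.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a metrics recorder: keeps every sample per key.
/// (Assumption for illustration; the patch's actual recorder is not shown.)
#[derive(Default)]
struct Metrics {
    samples: BTreeMap<&'static str, Vec<f64>>,
}

impl Metrics {
    /// Record one observation, e.g. the length of a buffer field for a function.
    fn record(&mut self, key: &'static str, value: usize) {
        self.samples.entry(key).or_default().push(value as f64);
    }

    /// Exact quantile with linear interpolation between neighboring samples.
    /// `q` is clamped to [0, 1]; returns `None` if the key has no samples.
    fn quantile(&self, key: &str, q: f64) -> Option<f64> {
        let mut v = self.samples.get(key)?.clone();
        if v.is_empty() {
            return None;
        }
        v.sort_by(|a, b| a.total_cmp(b));
        let pos = q.clamp(0.0, 1.0) * (v.len() - 1) as f64;
        let lo = pos.floor() as usize;
        let hi = pos.ceil() as usize;
        Some(v[lo] + (v[hi] - v[lo]) * (pos - lo as f64))
    }

    /// One line per key: minimum, median, 90th percentile, maximum.
    fn report(&self) -> String {
        let mut out = String::new();
        for key in self.samples.keys() {
            out.push_str(&format!("Key({key}):"));
            for q in [0.0, 0.5, 0.9, 1.0] {
                // Keys come from the map itself, so a quantile always exists.
                let value = self.quantile(key, q).unwrap();
                out.push_str(&format!(" {q:?}: {value}"));
            }
            out.push('\n');
        }
        out
    }
}

fn main() {
    let mut metrics = Metrics::default();
    // Made-up per-function code sizes in bytes, purely for demonstration.
    for size in [4, 120, 300, 412, 640, 1636, 110_084] {
        metrics.record("machinst.buffer.data", size);
    }
    print!("{}", metrics.report());
}
```

Keeping every sample is fine for a one-off experiment like this, but a long-running build would probably want a bounded-memory quantile sketch instead.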

Of course, there's only one MachBuffer per thread compiling code, so the savings here are marginal, but this approach could be used in other parts of the code to find more useful things.

The output formatting at this point is pretty crude as this is merely a proof-of-concept, but something better can be made later if necessary.


github-actions bot added the cranelift and cranelift:area:machinst labels on Mar 12, 2024
