"operation not permitted" errors for batch operations #147

Closed
kakkoyun opened this issue Apr 12, 2022 · 17 comments · Fixed by #157

Comments

@kakkoyun
Contributor

kakkoyun commented Apr 12, 2022

I have been trying to debug this but haven't succeeded so far, so I'm asking for help.

We call GetValueAndDeleteBatch and it returns an "operation not permitted" error:

https://github.com/parca-dev/parca-agent/blob/bd9807a3a0e16302b5944d570967ef5a828dfc80/pkg/profiler/profiler.go#L344-L358

I have tried bumping the rlimits (usually the culprit behind this error), but no luck with that either:
https://github.com/parca-dev/parca-agent/blob/bd9807a3a0e16302b5944d570967ef5a828dfc80/pkg/profiler/profiler.go#L623-L642

Do you happen to have any pointers or guidelines for me to debug this further? Or could this be related to error handling?

@grantseltzer
Contributor

Hm, I'll investigate this now. I typically run tracee in the background to see which syscall is returning the EPERM and go from there.

@kakkoyun
Contributor Author

I guess it's time for me to give tracee a spin as well :D

@grantseltzer
Contributor

@kakkoyun Do you have an issue on the parca agent describing this so we can discuss how to reproduce?

@grantseltzer
Contributor

I'm seeing that if the count parameter passed to bpf_map_lookup_and_delete_batch is greater than the number of elements in the map, I get EPERM.

Checking the documentation for the libbpf function, count is both an input and an output parameter, so you should be able to print that value to see whether anything is being read/deleted before the EPERM error occurs.

@kakkoyun
Contributor Author

> I'm seeing that if the count parameter passed to bpf_map_lookup_and_delete_batch is greater than the number of elements in the map, I get EPERM.
>
> Checking the documentation for the libbpf function, count is both an input and an output parameter, so you should be able to print that value to see whether anything is being read/deleted before the EPERM error occurs.

This is an amazing find. I guess I can start from here. Thank you very much.

I have the PR, and we have a demo setup to run it in minikube if it helps next time: parca-dev/parca-agent#326

I assume you debugged it using tracee, right? I want to add that to our debugging flow if that's the case :)

I'll update here with my findings.

@kakkoyun
Contributor Author

@grantseltzer I think we were passing the capacity of the map, and we need to pass the size of the map (still trying to validate).

In any case, from the documentation I understand that even though the API returns an error, it could be a partial success, and the index of the last successful operation is indicated by the count in/out parameter. Is this also what you understand from it?

If that's the case, the current batch APIs don't consider this fact.

@grantseltzer
Contributor

> I assume you debugged it using tracee, right? I want to add that to our debugging flow if that's the case :)

I would have, but no, I have not tried running parca-agent yet. I do recommend tracee for debugging though! It's easier to use for debugging than strace.

> In any case, from the documentation I understand that even though the API returns an error, it could be a partial success, and the index of the last successful operation is indicated by the count in/out parameter. Is this also what you understand from it?

Yes, that's what I understand as well (I wrote that documentation btw haha)

> If that's the case, the current batch APIs don't consider this fact.

Do you mean within libbpfgo? You may be right, I'm not sure if an error would surface if the 'count' output value isn't checked. This should also be elaborated on in the GoDocs for these functions. I will create an issue for improving them.

@kakkoyun
Contributor Author

Thanks 👍

I'll be working on this throughout the week, and I'll update here.
I would be happy to add documentation or API changes if necessary.

@kakkoyun
Contributor Author

Hey @grantseltzer, I made it work to a degree. My previous mistake was passing the capacity of the array as the batch size/count.

The "somewhat" working version is below.

https://github.com/parca-dev/parca-agent/blob/d44bf3134624064580b269c81621b85857bdd7e4/pkg/profiler/profiler.go#L342-L381

The problem with it is that I need to know the actual number of elements in the map before determining the maximum allowed batch count. My first question is: is there a neater way to fetch the number of elements in a BPF map?

The second and maybe more important question is concerning this: https://github.com/parca-dev/parca-agent/blob/d44bf3134624064580b269c81621b85857bdd7e4/pkg/profiler/profiler.go#L356

Is there a reconciliation lag or some implicit behavior between kernel and user space regarding BPF maps? Without waiting between operations, it constantly gives EPERM errors. I discovered this by pure coincidence: it worked when a debugger was attached and a breakpoint was set before the GetValueAndDeleteBatch call.

Do you have any idea? What's happening here? What am I doing wrong?

@kakkoyun
Contributor Author

I've also opened this PR based on what I've gathered from the libbpf docs: #152

@kakkoyun
Contributor Author

> The second and maybe more important question is concerning this: https://github.com/parca-dev/parca-agent/blob/d44bf3134624064580b269c81621b85857bdd7e4/pkg/profiler/profiler.go#L356
>
> Is there a reconciliation lag or some implicit behavior between kernel and user space regarding BPF maps? Without waiting between operations, it constantly gives EPERM errors. I discovered this by pure coincidence: it worked when a debugger was attached and a breakpoint was set before the GetValueAndDeleteBatch call.
>
> Do you have any idea? What's happening here? What am I doing wrong?

@grantseltzer Sorry to ping you again, any ideas about this part?

@grantseltzer
Contributor

@kakkoyun Hi, sorry about the delay, I will get back to you on this a little later today!

@grantseltzer
Contributor

grantseltzer commented Apr 19, 2022

> Hey @grantseltzer, I made it work to a degree. My previous mistake was passing the capacity of the array as the batch size/count.
>
> The "somewhat" working version is below.
>
> https://github.com/parca-dev/parca-agent/blob/d44bf3134624064580b269c81621b85857bdd7e4/pkg/profiler/profiler.go#L342-L381
>
> The problem with it is that I need to know the actual number of elements in the map before determining the maximum allowed batch count. My first question is: is there a neater way to fetch the number of elements in a BPF map?

Do I understand correctly that you get EPERM if you set the count to max_entries? That it has to be equal to the actual number of entries? I wonder if it's possible to initialize the map with max_entries number of empty values to avoid this?

It doesn't seem that there's any API available to get the number of loaded entries.

> The second and maybe more important question is concerning this: https://github.com/parca-dev/parca-agent/blob/d44bf3134624064580b269c81621b85857bdd7e4/pkg/profiler/profiler.go#L356
>
> Is there a reconciliation lag or some implicit behavior between kernel and user space regarding BPF maps? Without waiting between operations, it constantly gives EPERM errors. I discovered this by pure coincidence: it worked when a debugger was attached and a breakpoint was set before the GetValueAndDeleteBatch call.

Is it possible that the userspace program is updating the map in a different thread? Nowhere along the line down to the actual BPF system call invocation are threads being spawned. It's perhaps possible that the BPF syscall returns before completing its work, but I've never heard of that being done. Perhaps there's another way that EPERM can be returned besides the count being off.

Overall I don't have a very good answer for you :-/ I highly recommend asking both of these questions on the bpf mailing list where you'll get a much better in-depth answer than what I can tell you. Please let me know if you're not familiar with the mailing list and I can guide you in that.

@grantseltzer
Contributor

I'm also curious to see which syscall is actually returning the EPERM. Did you try running tracee in the background to see?

@kakkoyun
Contributor Author

> Do I understand correctly that you get EPERM if you set the count to max_entries? That it has to be equal to the actual number of entries? I wonder if it's possible to initialize the map with max_entries number of empty values to avoid this?

Yes, exactly. You need to pass a count value that's less than or equal to the number of elements in the BPF map; otherwise it seems you get an EPERM error, probably because it tries to read a memory chunk that isn't allocated for the map. I'll check whether it is possible to initialize a BPF map with zero values. The implementation on our side would be neater.

> It doesn't seem that there's any API available to get the number of loaded entries.

That's a bummer. Right now, we are just counting using an iterator. It does the job 🤷
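
A rough sketch of the count-then-clamp workaround in Go. keyIterator and sliceIterator are hypothetical stand-ins so the example runs without a BPF map; libbpfgo's map iterator is assumed to have a comparable Next/Err shape.

```go
package main

import "fmt"

// keyIterator is a hypothetical minimal iterator interface.
type keyIterator interface {
	Next() bool
	Err() error
}

// countEntries walks the iterator to the end and returns how many keys it saw.
func countEntries(it keyIterator) (uint32, error) {
	var n uint32
	for it.Next() {
		n++
	}
	return n, it.Err()
}

// batchCount clamps the requested batch size to the number of entries
// currently in the map, so the count passed down never exceeds it.
func batchCount(entries, maxBatch uint32) uint32 {
	if entries < maxBatch {
		return entries
	}
	return maxBatch
}

// sliceIterator is a hypothetical in-memory iterator used only for this demo.
type sliceIterator struct {
	keys [][]byte
	pos  int
}

func (s *sliceIterator) Next() bool {
	if s.pos >= len(s.keys) {
		return false
	}
	s.pos++
	return true
}

func (s *sliceIterator) Err() error { return nil }

func main() {
	it := &sliceIterator{keys: [][]byte{{1}, {2}, {3}}}
	n, err := countEntries(it)
	if err != nil {
		fmt.Println("iteration failed:", err)
		return
	}
	// Request at most as many entries as the map currently holds.
	fmt.Println(batchCount(n, 1024)) // prints 3
}
```

The count is only a snapshot: if the BPF program inserts or deletes entries between counting and the batch call, the clamped value can already be stale.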

> Is it possible that the userspace program is updating the map in a different thread? Nowhere along the line down to the actual BPF system call invocation are threads being spawned. It's perhaps possible that the BPF syscall returns before completing its work, but I've never heard of that being done. Perhaps there's another way that EPERM can be returned besides the count being off.

I don't think we are using a different thread knowingly. The Go runtime might be the culprit here. I'll make sure we lock the threads and test if it's the case. Nice pointer. Thanks!

> Please let me know if you're not familiar with the mailing list and I can guide you in that.

Thanks, your blogpost was helpful to dip my toes into Linux mailing lists. And I'll reach out if it comes to that 😊

@kakkoyun
Contributor Author

> I'm also curious to see which syscall is actually returning the EPERM. Did you try running tracee in the background to see?

I haven't done that. I guess I need to do that first. Is there a good place to start with tracee? A tutorial to run it on a minikube cluster maybe?

@grantseltzer
Contributor

We have one for running with vagrant, and a small sheet on installing in kubernetes. I'm not very experienced with k8s but my teammates are, so if you have any issues feel free to ask and I'll tag the appropriate people!
