Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

btf: NewHandle should gracefully handle unsupported types #895

Open
lmb opened this issue Jan 12, 2023 · 1 comment
Open

btf: NewHandle should gracefully handle unsupported types #895

lmb opened this issue Jan 12, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@lmb
Copy link
Collaborator

lmb commented Jan 12, 2023

BTF has been extended over time to support new metadata, but older kernels refuse to load such extended BTF. This is a problem when loading a program that has been compiled by a recent version of clang on an old kernel, so libbpf modifies the loaded BTF on the fly so that it is accepted by the kernel. We've copied this behaviour for one specific case: 5.6 allows specifying the linkage of a function (static, global, etc.). On kernels < 5.6 we replace all function linkage with static when loading BTF into the kernel via NewHandle. In the future I want to extend NewHandle to cover more kernel incompatibilities, like libbpf does (e.g. #713).

The rest of this issue explores the implications this has on my proposed API to create BTF blobs from Go:

spec := NewSpec()
id, err := spec.Add(fooType)
handle, err := NewHandle(spec)

Why skipping types is hard

From the commit that introduces TYPE_TAG to pahole:

[$ ~] cat t.c
#define __tag1 __attribute__((btf_type_tag("tag1")))
#define __tag2 __attribute__((btf_type_tag("tag2")))
int __tag1 * __tag1 __tag2 *g __attribute__((section(".data..percpu")));
[$ ~] clang -O2 -g -c t.c
[$ ~] pahole -JV t.o
Found per-CPU symbol 'g' at address 0x0
Found 1 per-CPU variables!
File t.o:
[1] TYPE_TAG tag1 type_id=5
[2] TYPE_TAG tag2 type_id=1
[3] PTR (anon) type_id=2
[4] TYPE_TAG tag1 type_id=6
[5] PTR (anon) type_id=4
[6] INT int size=4 nr_bits=32 encoding=SIGNED
search cu 't.c' for percpu global variables.
Variable 'g' from CU 't.c' at address 0x0 encoded
[7] VAR g type=3 linkage=1
[8] DATASEC .data..percpu size=8 vlen=1
		type=7 offset=0 size=8

You can see for the source:

  int __tag1 * __tag1 __tag2 *g __attribute__((section(".data..percpu")));

the following type chain is generated:

  var -> ptr -> tag2 -> tag1 -> ptr -> tag1 -> int

Stripping out TYPE_TAG means we need to reallocate IDs for PTR, INT, etc. and we also need to adjust the chain. I can't see a way how to do this with the proposed API since callers of Spec.Add will have put the IDs in syscall args, etc. already.

The libbpf way: replace types instead of stripping them

libbpf replaces types instead of removing them outright. This avoids having to reallocate IDs. Here are the current rules:

  • VAR / DECL_TAG replaced with INT.
  • TYPE_TAG becomes CONST
  • DATASEC replaced with STRUCT
  • FUNC_PROTO becomes ENUM
  • FUNC becomes TYPEDEF
  • ENUM64 replaced with UNION

This is very simple, and has the benefit of being "battle tested". If we stick to the same rules as libbpf we won't break additional things. Replacing DATASEC with STRUCT seems like a no brainer. I'm not a huge fan of the TYPE_TAG, DECL_TAG, FUNC_PROTO replacements. It'd be better if we didn't have to include those types in the first place.

Idea: delay allocating type IDs

There is a way to skip types during marshaling without reallocating IDs by simply delaying ID allocation until the last possible moment, during marshaling. It's easier to explain via some pseudo code:

s := NewSpec()
id, err := s.Add(foo) // Allocates an ID for foo only!
handle, err := NewHandle(s) // Allocates IDs for children of foo

The call to Add allocates an ID for foo, which the caller can use in syscalls, etc. NewHandle marshals foo and all of it's children. If a child is not supported by the kernel we can "simply" skip encoding it, since we know that the user can't refer to the child via ID. (Skipping TYPE_TAG will still be annoying since it's part of a chain, but oh well.)

What happens if we passed a type to Spec.Add that the kernel doesn't understand? In this case NewHandle will return an error instead of silently skipping the type. This is because Spec.Add makes a guarantee to the user: if NewHandle succeeds the returned type ID is valid.

A side effect of this proposal is that Spec.TypeByName, etc. won't find children of foo. I think that is an improvement over my current implementation. If there is a use case we can add AddAll or similar later.

Downsides:

  • There are now two places that allocate IDs: Add and NewHandle
  • Marshaling has to deal with skipped types. The worst case is probably TYPE_TAG.
  • Overall, more complex code for a cleaner API
@lmb lmb added the enhancement New feature or request label Jan 12, 2023
@lmb lmb self-assigned this Jan 17, 2023
@lmb lmb removed their assignment Mar 2, 2023
@lmb
Copy link
Collaborator Author

lmb commented Aug 15, 2023

I spent some more time thinking about this due to #1116. To skip interior types like DECL_TAG or qualifiers we can replace references to them with references to the type they point at. For leaf types like ENUM64 this doesn't work, we need to point at something. Earlier I assumed we should just return an error to the user at that point, but that doesn't really help them. They can't load the type yet would still like to. We should provide that workaround, at the risk that it won't work for some use cases.

With that in mind I'd probably try to implement this like so:

  • Strip intermediate types. A type passed to Add will never be stripped though.
  • Replace leaf types, probably like libbpf does. A type passed to Add will be replaced if necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant