Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cow based interface for StructObject::fields #157

Closed
mitsuhiko opened this issue Nov 20, 2022 · 11 comments
Closed

Cow based interface for StructObject::fields #157

mitsuhiko opened this issue Nov 20, 2022 · 11 comments

Comments

@mitsuhiko
Copy link
Owner

The current fields interface only allows borrowing a str out of self. While attempting to make a safe interface for stack borrowing I realized that there are cases where you really need to create an owned string there (#156).

There are two options here:

  • fn fields(&self) -> Box<dyn Iterator<Item = Cow<'_, str>> + '_> — which has the advantage that you can borrow in a lot of cases, but it means that code which cannot hold on to the lifetime needs to allocate a string in all cases
  • fn fields(&self) -> Box<dyn Iterator<Item = Cow<'static, str>> + '_> — would require more dynamic allocations in case the field name is not fully known, but has the advantage that we could internally start having a ValueRepr::StaticStr which might avoid unnecessary allocations in a lot of cases.
@mitsuhiko
Copy link
Owner Author

Looking through the code base a bit now the case for Cow<'static, str> I think is very strong. serde also passes &'static str in a few places where we can then benefit of a ValueRepr::StaticStr as well which looks very appealing to me.

@mitsuhiko
Copy link
Owner Author

@SergioBenitez so since I changed the fields interface originally for the dynamic object case I wonder if your case lets you return a &'static str or if that's really just a &str. &'static makes some things easier, but probably introduces more allocations because when you need to allocate, it bypasses the internal string interning.

An alternative would be to let it return Either<&'static str, Value> :)

@SergioBenitez
Copy link
Contributor

SergioBenitez commented Nov 22, 2022

@SergioBenitez so since I changed the fields interface originally for the dynamic object case I wonder if your case lets you return a &'static str or if that's really just a &str. &'static makes some things easier, but probably introduces more allocations because when you need to allocate, it bypasses the internal string interning.

An alternative would be to let it return Either<&'static str, Value> :)

You can model my case as a HashMap<String, serde_json::Value>. I have what surmounts to a wrapper around such a type that I want to expose, as free of charge as possible, as a template context. As such, static strings aren't available, and returning a String would mean allocating each iteration, which is expensive. An Arc<str> for each key would be the next best thing, but since I don't control the nested maps inside the serde_json::Value, it only helps at the outer most level. (In reality, I optimize for two levels, but the problem remains.)

The ideal would be to be able to return values that borrow from anything or values that can optimally borrow from nothing. I'm not sure what prohibits the former in minijinja's case. I would imagine that since the context is only held during the rendering call, this wouldn't be too difficult to pull off, but I haven't given it a closer look.

In the latter case, I have imagined an API by which the engine tracks object path accesses at compile-time and then asks once (say via an Object trait) for a terminal value that isn't compound, so anything but a map or sequence. For example, for foo.bar[0].baz, the engine would ask the foo object (assuming it is one) for bar[0].baz. The object can then perform an efficient lookup without any clones and return the terminal baz. Similarly, if we have:

{% set bar = foo.bar[0] %}
{{ bar.baz }}

The engine would again ask foo for bar[0].baz. This of course requires some thought and implementation grease, but, aside from allowing full borrows of anything, seems like the most sensible approach.

I'm in favor of any zero-overhead solution that allows me to make use of my HashMap wrapper as a template context. The less code I have to write to make that possible, the better, of course.

@mitsuhiko
Copy link
Owner Author

As such, static strings aren't available, and returning a String would mean allocating each iteration, which is expensive. An Arc for each key would be the next best thing, but since I don't control the nested maps inside the serde_json::Value, it only helps at the outer most level. (In reality, I optimize for two levels, but the problem remains.)

For what it's worth MiniJinja today already has an optimization for this, but it's not exposed. When minijinja converts string keys it auto interns. In some ways a Value::from_str_intern could be provided that exposes this optimization to the user in which case the allocation again is shared.

Exposing a compound lookup is definitely possible and I have considered this before, but for the case where someone needs to iterate over one of those objects probably the same issue applies.

@SergioBenitez
Copy link
Contributor

SergioBenitez commented Nov 23, 2022

As such, static strings aren't available, and returning a String would mean allocating each iteration, which is expensive. An Arc for each key would be the next best thing, but since I don't control the nested maps inside the serde_json::Value, it only helps at the outer most level. (In reality, I optimize for two levels, but the problem remains.)

For what it's worth MiniJinja today already has an optimization for this, but it's not exposed. When minijinja converts string keys it auto interns. In some ways a Value::from_str_intern could be provided that exposes this optimization to the user in which case the allocation again is shared.

Sorry, I conflated a couple of things without properly distinguishing them.

The optimization I was referring to was one where clones of compound values (maps, vectors) are avoided (in response to get_field() or get_item() calls) as long as the compound value is within k levels of the root value. That is, within k lookups, no deep cloning occurs, but values are cloned thereafter. It is here where it would be ideal to avoid cloning by either allowing borrows to be returned or by invoking the get() methods with as complete a path as possible so that the lookup can be performed internally.

In terms of the iterator of strings returned by fields(), I'd like to avoid allocating for each string, or at all, so I think I'd need to be able to return an &str from the iterator.

Exposing a compound lookup is definitely possible and I have considered this before, but for the case where someone needs to iterate over one of those objects probably the same issue applies.

I think if properly tracked, the final lookup should still be compound. So something like:

{% for k in map %}
{{ other.map[k].foo }}
{% endfor %}

Should still perform a single call per iteration to get_field() of other.

@mitsuhiko
Copy link
Owner Author

I will have a look at this, but in terms of this:

In terms of the iterator of strings returned by fields(), I'd like to avoid allocating for each string, or at all, so I think I'd need to be able to return an &str from the iterator.

If I were to expose Value::from_str_interned(val) then you can create a value with a string without allocating (at least usually). The reason for this is that MiniJinja internally already performs key interning and this could be exposed to user code.

@mitsuhiko
Copy link
Owner Author

mitsuhiko commented Nov 26, 2022

Today there is actually not a single code path that is likely to be encountered where fields being an iterator makes a lot of sense. While there is a theoretical benefit where the iteration over objects could be optimized, for the loop interface that would have to be an ExactSizeIterator which cannot be part of the interface.

I'm heavily leaning towards changing the interface to this:

pub trait StructObject: Send + Sync {
    fn get_field(&self, name: &str) -> Option<Value>;

    fn static_fields(&self) -> Option<&'static [&'static str]> {
        None
    }

    fn fields(&self) -> Vec<Arc<String>> {
        self.static_fields()
            .into_iter()
            .flat_map(|fields| fields.iter().copied().map(intern))
            .collect()
    }

    fn field_count(&self) -> usize {
        if let Some(fields) = self.static_fields() {
            fields.len()
        } else {
            self.fields().len()
        }
    }
}

Exposing the intern functionality makes sense anyways in which case for the actual strings most allocations are gone, and the allocation for the vector of fields was necessary anyways for the dynamic case. For code that wants to optimize it can first try to use the entire static_fields interface which requires no allocations at all. But again there for iteration purposes, the user likely has to use intern anyways.

Implementation is in #158

@SergioBenitez
Copy link
Contributor

SergioBenitez commented Nov 26, 2022

I'm still not entirely sure where fields() is used in minijinja, aside from iterating through the keys of a map in a template and printing the map for debug purposes. In both of these cases, returning an iterator saves on the cost of dynamically allocating the entire vector, especially when the size of the resulting vector is unknown and thus you can't allocate it with_capacity(). Furthermore, being able to return an &str means you can implement fields() allocation-free, but these changes make that impossible for anything that doesn't know its fields at compile-time. Finally, this API, of returning a vector, immediately feels wrong to me, perhaps because it breaks from what I expect fields() to be semantically.

In any case, I expect (but would like clarity) that fields() is rarely called, or that if it is, it's largely inconsequential to the overall performance. In that light, I'll offer some remarks on moving forward with this api: instead of Arc<String>, you could consider using something like ArcStr instead. Then, if instead of returning a Vec from fields() you return a Cow<'static, [ArcStr]>, you don't need the two methods and you don't lose any performance for static strings.

Arc<String> is really weird to me. It's odd to me that this isn't an Arc<str> to begin with, but it seems like you use make_mut() internally; I wonder if there's really any advantage here.

I also wonder what the benefit to string interning is here. If the strings are static, then you can just store the pointer. If the strings are dynamic, are they really used all that often to merit the cost of string interning? The alternative being, of course, that strings aren't interned and the method is allowed to return &strs as before.

@mitsuhiko
Copy link
Owner Author

I'm still not entirely sure where fields() is used in minijinja, aside from iterating through the keys of a map in a template and printing the map for debug purposes. In both of these cases, returning an iterator saves on the cost of dynamically allocating the entire vector, especially when the size of the resulting vector is unknown and thus you can't allocate it with_capacity()

Iteration is the only place where it's used, but iteration there is not lazy for structs. This is because the length needs to be known anyways for the struct (loop.revindex) and lifetimes make it (with safe code) impossible to hold on to the iterator. I would generally say that iteration over structs is not particularly common. However even if it is, the only pointless iteration is in fact the creation of the vector. The strings within need to be placed in a Value anyways for the evaluation so it's really only a single allocation that this optimizes away and that only if there was a way to resolve the challenge with the lifetimes.

instead of Arc<String>, you could consider using something like ArcStr instead. Then, if instead of returning a Vec from fields() you return a Cow<'static, [ArcStr]>, you don't need the two methods and you don't lose any performance for static strings.

Arc<String> is how a string is stored internally in MiniJinja for all string types.

I also wonder what the benefit to string interning is here. If the strings are static, then you can just store the pointer.

I thought so too but it turns out that having two representations (String(Arc<String>) and Str(&'static str)) causes extra branching everywhere, where strings are needed and the performance suffers significantly. The string interning on the other hand is incredible useful performance wise as without it, duplicate strings in the serialization system are impossible to detect and cause over allocation.

@mitsuhiko
Copy link
Owner Author

One other bit of information I suppose is that even in Jinja2, iteration over dictionaries typically takes place in the form of |dictsort and not |items due to hash randomization. Both of those actually allocate in both Jinja2 and MiniJinja. If you do |items today in MiniJinja you basically create a vector anyways.

@mitsuhiko
Copy link
Owner Author

I'm changing this on main for now and will play around with it before release to see if this works well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants