Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ndarray iterators don't always work well with impl Iterator #1290

Open
lePerdu opened this issue Jun 2, 2023 · 2 comments
Open

ndarray iterators don't always work well with impl Iterator #1290

lePerdu opened this issue Jun 2, 2023 · 2 comments

Comments

@lePerdu
Copy link

lePerdu commented Jun 2, 2023

When writing some iterator helper functions around the base ndarray iterators, I found a scenario where the borrow checker rejects what seems to be valid code. The scenario is generating an iterator over indices matching some criteria in an Array of references.

use ndarray::{iter::IndexedIter, Array1, Ix1};

/// This doesn't compile:
///
/// > hidden type for `impl Iterator<Item = usize> + 'a` captures lifetime that does not appear in bounds
/// > hidden type ... captures the lifetime `'s` as defined here
///
/// Various alternatives I tried which have the same issue:
/// - `.iter().enumerate()` instead of `.indexed_iter()`
/// - `.filter_map()` instead of `.filter().map()`
pub fn nd_iter_non_empty_indices_not_compiling<'s, 'a>(
    array: &'a Array1<&'s str>,
) -> impl Iterator<Item = usize> + 'a {
    array
        .indexed_iter()
        .filter(|(_index, elem)| !elem.is_empty())
        .map(|(index, _elem)| index)
}

/// Adding `'a: 's` makes the function compile, but now code using it doesn't
/// compile because the constraint `'a: 's` is impossible: the container can't
/// outlive the contained elements.
///
/// ```
/// use ndarray::array;
/// use broken_iter::*;
/// let arr = array!["", "abc", "", "123"];
/// assert_eq!(nd_iter_non_empty_indices_broken(&arr).collect::<Vec<_>>(), vec![1, 3]);
/// ```
pub fn nd_iter_non_empty_indices_broken<'s, 'a: 's>(
    array: &'a Array1<&'s str>,
) -> impl Iterator<Item = usize> + 'a {
    array
        .indexed_iter()
        .filter(|(_index, elem)| !elem.is_empty())
        .map(|(index, _elem)| index)
}

The original problem I had was with a 2D array with more complex types and filtering operation, but the error is the same.

Similar functions which work fine

I thought it might have been a limitation of the Rust compiler, but the same operation with a normal Rust slice (or Vec) works fine:

pub fn slice_iter_non_empty_indices<'s, 'a>(
    array: &'a [&'s str],
    // All of these also work:
    // array: &'a Box<[&'s str]>,
    // array: &'a Vec<&'s str>,
    // array: &'a LinkedList<&'s str>,
    // array: &'a HashSet<&'s str>,
) -> impl Iterator<Item = usize> + 'a {
    array
        .iter()
        .enumerate()
        .filter(|(_index, elem)| !elem.is_empty())
        .map(|(index, _elem)| index)
}

It also works if the iterator returns references:

/// A similar function which returns the `&str` elements themselves works fine.
/// ```
/// use ndarray::array;
/// use broken_iter::*;
/// let arr = array!["", "abc", "", "123"];
/// assert_eq!(nd_iter_non_empty_strings(&arr).collect::<Vec<_>>(), vec!["abc", "123"]);
/// ```
pub fn nd_iter_non_empty_strings<'s, 'a>(
    array: &'a Array1<&'s str>,
) -> impl Iterator<Item = &'s str> + 'a {
    array.iter().filter(|elem| !elem.is_empty()).cloned()
}

Workaround

I am able to work around the issue by creating a custom type for the iterator, but it seems like all the extra boilerplate shouldn't be necessary, especially since it isn't required when working with Vec:

/// It works fine if I use a custom struct to do the filter/map.
///
/// ```
/// use ndarray::array;
/// use broken_iter::*;
/// assert_eq!(nd_iter_non_empty_indices_custom(&array!["", "abc", "", "123"]).collect::<Vec<_>>(), vec![1, 3]);
/// ```
pub fn nd_iter_non_empty_indices_custom<'s, 'a>(
    array: &'a Array1<&'s str>,
) -> NonEmptyIndices<'a, 's> {
    NonEmptyIndices {
        indexed_iter: array.indexed_iter(),
    }
}

pub struct NonEmptyIndices<'a, 's> {
    indexed_iter: IndexedIter<'a, &'s str, Ix1>,
}

impl<'a, 's> Iterator for NonEmptyIndices<'a, 's> {
    type Item = usize;

    fn next(&mut self) -> Option<Self::Item> {
        // Manual loop for the filter/map operation
        loop {
            let (index, elem) = self.indexed_iter.next()?;
            if !elem.is_empty() {
                return Some(index);
            }
        }
    }
}

EDIT:

I found a much better work-around by specifying a more concrete return type. (Also changed it to filter_map to simplify the type).

pub fn nd_iter_non_empty_indices_better<'s, 'a>(
    array: &'a Array1<&'s str>,
) -> FilterMap<
    IndexedIter<'a, &'s str, Ix1>,
    impl FnMut((usize, &'a &'s str)) -> Option<usize>,
> {
    array.indexed_iter().filter_map(
        |(index, elem)| {
            if !elem.is_empty() {
                Some(index)
            } else {
                None
            }
        },
    )
}

This isn't terrible to work with, but it's still weird that all of the types in std::collections work fine with impl Iterator<Item = usize> + 'a but ndarray types need the explicit return type.

@adamreichold
Copy link
Collaborator

There is something else going on with the slice though, i.e. if you remove the lifetime on return value type, it shows the error message

error[[E0700]](https://doc.rust-lang.org/stable/error_codes/E0700.html): hidden type for `impl Iterator<Item = usize>` captures lifetime that does not appear in bounds
 --> src/main.rs:4:5
  |
1 |   pub fn slice_iter_non_empty_indices<'s, 'a>(
  |                                       -- hidden type `Map<Filter<Enumerate<std::slice::Iter<'s, &'s str>>, [closure@src/main.rs:7:17: 7:33]>, [closure@src/main.rs:8:14: 8:30]>` captures the lifetime `'s` as defined here
2 |       array: &'a [&'s str],
3 |   ) -> impl Iterator<Item = usize> {
  |        --------------------------- opaque type defined here
4 | /     array
5 | |         .iter()
6 | |         .enumerate()
7 | |         .filter(|(_index, elem)| !elem.is_empty())
8 | |         .map(|(index, _elem)| index)
  | |____________________________________^
  |
help: to declare that `impl Iterator<Item = usize>` captures `'s`, you can add an explicit `'s` lifetime bound
  |
3 | ) -> impl Iterator<Item = usize> + 's {
  |                                  ++++

For more information about this error, try `rustc --explain E0700`.

I.e. notice how the non-opaque type already contains only references to the lifetime 's but not 'a whereas the same error message for ndarray referenfces both 's and 'a.

@lePerdu
Copy link
Author

lePerdu commented Jun 2, 2023

That doesn't seem to be restricted to slices; the compiler also gives the same sort of warnings in the original example without any lifetime annotations on the return:

error[E0700]: hidden type for `impl Iterator<Item = usize>` captures lifetime that does not appear in bounds
  --> src/lib.rs:16:5
   |
13 |   pub fn nd_iter_non_empty_indices_not_compiling<'s, 'c>(
   |                                                  -- hidden type `Map<Filter<IndexedIter<'s, &'s str, Dim<[usize; 1]>>, [closure@src/lib.rs:18:17: 18:33]>, [closure@src/lib.rs:19:14: 19:30]>` captures the lifetime `'s` as defined here
14 |       array: &'c Array1<&'s str>,
15 |   ) -> impl Iterator<Item = usize> {
   |        --------------------------- opaque type defined here
16 | /     array
17 | |         .indexed_iter()
18 | |         .filter(|(_index, elem)| !elem.is_empty())
19 | |         .map(|(index, _elem)| index)
   | |____________________________________^
   |
help: to declare that `impl Iterator<Item = usize>` captures `'s`, you can add an explicit `'s` lifetime bound
   |
15 | ) -> impl Iterator<Item = usize> + 's {
   |                                  ++++

Note IndexedIter<'s, &'s str> which doesn't reference the lifetime of the array at all.

It seems like some bad type/lifetime inference is going on without lifetime annotations on the return type. std::slice::Iter<'s, &'s str> is incorrect in my understanding: the first lifetime parameter should be 'a (lifetime of the array) not 's (lifetime of the strings).

Following the compiler's suggestions there leads to non-working code:

  • Add + 's: Compiler says "lifetime may not live long enough consider adding the following bound: 'a: 's"
  • Add 'a: 's: The function compiles but usages of it don't compile because the array can't outlive the contained references.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants