Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify which dimension contains image channel in AnchorImage #487

Open
sakoush opened this issue Sep 24, 2021 · 11 comments
Open

Specify which dimension contains image channel in AnchorImage #487

sakoush opened this issue Sep 24, 2021 · 11 comments
Labels
Priority: Medium Type: API Anything affecting the public API (what is exposed, function signatures, return values etc.)

Comments

@sakoush
Copy link
Member

sakoush commented Sep 24, 2021

In AnchorImage the channel dimension is assumed to be the last dimension as defined here. This is not necessarily true, for example in mnist handwritten pytorch model the shape of the data is (1, 28*28).

Note sure if this is applicable in other explainers as well? so the fix needs to be applied on all relevant explainers.

Lets have this as a configurable parameter at object init?

@jklaise jklaise added the Type: API Anything affecting the public API (what is exposed, function signatures, return values etc.) label Sep 27, 2021
@jklaise
Copy link
Member

jklaise commented Sep 27, 2021

The easiest solution would be to introduce another argument with a default value, e.g. layout='nhwc', this seems relevant as "image metadata" for explainers (and detectors?) that need this info @arnaudvl @ascillitoe .

@jklaise
Copy link
Member

jklaise commented Oct 7, 2021

Annoyingly, it seems the latest release of scikit-image==0.18.3 doesn't support a different layout other than channel-last, however in the upcoming version 0.19 this is added and can be controlled via an argument. Fixing this in the meantime would require a bit of custom reshaping.

@jklaise
Copy link
Member

jklaise commented Oct 8, 2021

It seems like the 0.19 release of scikit-image should not be too far away now (scikit-image/scikit-image#5169), so it's tempting to wait and upgrade to avoid doing array conversions ourselves just to be removed later. Otherwise we would have to reimplement the logic of this decorator to patch the skimage segmentation functions that can only handle channel-last: https://github.com/scikit-image/scikit-image/blob/8a13f15b7629a3be643fe0eb45f6ede55ba69c84/skimage/_shared/utils.py#L219

@jklaise jklaise added the Blocked Issue is blocked by some bigger issue label Oct 12, 2021
@sakoush
Copy link
Member Author

sakoush commented Nov 2, 2021

Tthere are cases e.g. sklearn mnist that do not even have a channel dimension. Basically the image is flattened for grayscale images to 2D: (samples, features).

@jklaise
Copy link
Member

jklaise commented Nov 2, 2021

Yup, the standard response is to wrap the predictor so it's of the form that alibi expects (i.e. in this case adding a dummy channel dimension).

In this particular case it should be easy to support this internally (e.g. check image_shape, if 2D raise a warning and construct the wrapped predictor internally).

This is actually the same type of issue as #516, attempting to push the complexity of wrapping non-compliant predictors internally as opposed to keeping it a responsibility of the user.

@jklaise jklaise removed the Blocked Issue is blocked by some bigger issue label Dec 6, 2021
@jklaise
Copy link
Member

jklaise commented Dec 6, 2021

Since scikit-image 0.19 has been released (#551 ), this is unblocked. A fix would then need to bump the minimum required version of scikit-image to 0.19.

@jklaise
Copy link
Member

jklaise commented Jul 21, 2022

I've thought a little about this and there are two main choices, both requiring an extra argument:

  • image_layout: Literal['nhwc', 'nchw'] = 'nhwc'
    • By default have nhwc which is the tensorflow default image shape, also support the other popular layout nchw often used with torch
    • Could be extended in the future to include other permutations
  • channel_axis: int = -1
    • By default treat last axis as channel
    • For the nchw format channel_axis would have to be 1 (since 0 is for batch and using strings such as 'first' and 'last' might be confusing?)
    • Perhaps less flexible than image_layout because this would not be extensible to support more weird layouts (e.g. height and weight axes swapped around)
    • channel_axis is the way multi-channel images are treated in scikit-image from version 0.19: https://github.com/scikit-image/scikit-image/releases/tag/v0.19.0

I think image_layout has a slight edge do to being more general and extensible but would be great to hear another opinion @RobertSamoilescu @ascillitoe.

EDIT: it's also possible we stick with conventions like skimage and go the channel_axis route, see more discussion here: https://scikit-image.org/docs/stable/user_guide/numpy_images.html#coordinate-conventions

EDIT2: in the future supporting volumetric images may require even more conventions like skimage that could be more easily managed by image_layout instead

EDIT3: including the batch dimension n in image_layout may be slightly controversial as it's not part of the image layout and also (for the time being) AnchorImage operates only on single images. On the other hand, the implicit assumption then is that the underlying predictor has batch as leading dimension (but this is an assumptions we've made throughout the library).

@ascillitoe
Copy link
Contributor

ascillitoe commented Jul 21, 2022

Mmmn a tricky one. I tend to agree that image_layout is nicer, since it is more flexible and could be extended in the future. I also that channel_axis might be confusing due to the issue with whether the batch dimension n is included in the integer count.

Maybe image_layout could also aid scenarios like (samples, features) that @sakoush mentioned. If numeric values where given i.e. image_layout='n,h32,w32' (we'd need to think about formatting rules) this would help us avoid needing to try to infer anything when reshaping the flattened feature array. Or is this getting too complicated now?!

@jklaise
Copy link
Member

jklaise commented Jul 21, 2022

@ascillitoe I would avoid putting in extra functionality, especially something that would be a new custom format. We already have an image_shape argument which is also a standard concept in ML.

@ascillitoe
Copy link
Contributor

Yeh that's fair enough. Even without a need for the added flexibility of image_layout right now, I agree that it is easier to interpret, and might be useful for volumetric images etc in the future. I don't have a super strong opinion though.

@RobertSamoilescu
Copy link
Collaborator

PyTorch example for AnchorImage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: Medium Type: API Anything affecting the public API (what is exposed, function signatures, return values etc.)
Projects
None yet
Development

No branches or pull requests

4 participants