Drift Detection Methods -> Learned Kernel -> Dataset format incompatible #798

righelcpm · 2023-05-24T13:10:21Z

I am facing a format incompatibility issue. I have tried to follow the structure described here (https://docs.seldon.io/projects/alibi-detect/en/stable/cd/methods/learnedkerneldrift.html),
but I could not work out which input data format is required.
Could someone help me, please?

My code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats
import os
import tensorflow as tf

pip install alibi-detect

from alibi_detect.cd import LearnedKernelDrift
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input
from alibi_detect.utils.tensorflow import DeepKernel

sea_noise = pd.read_csv("/content/drive/MyDrive/Dataset/sea_0123_gradual_noise_0.2_1000.csv")

X_ref_2 = np.transpose(np.vstack([sea_noise.X1.values]))
X_test_2 = np.transpose(np.vstack([sea_noise.X3.values]))

# Learned Kernel Drift

# To define the projection phi
proj = tf.keras.Sequential(
  [
      Input(shape=(32, 32, 3)),
      Conv2D(8, 4, strides=2, padding='same', activation=tf.nn.relu),
      Conv2D(16, 4, strides=2, padding='same', activation=tf.nn.relu),
      Conv2D(32, 4, strides=2, padding='same', activation=tf.nn.relu),
      Flatten(),
  ]
)

Attached dataset: [sea_0123_gradual_noise_0.2_1000.csv](https://github.com/SeldonIO/alibi-detect/files/11554875/sea_0123_gradual_noise_0.2_1000.csv)


# To define the kernel
kernel = DeepKernel(proj, eps=0.01)

# To instantiate the detector
cd_7 = LearnedKernelDrift(X_ref_2, kernel, backend='tensorflow', p_val=.05, epochs=10, batch_size=32)

preds_7 = cd_7.predict(X_test_2, return_p_val=True, return_distance=True)


mauicv · 2023-05-25T09:25:49Z

Hey @righelcpm,
It's hard to help without more details as I'm not sure exactly what the error is. Can you include the full error message? What is the dataset exactly?

righelcpm · 2023-05-25T10:43:27Z

I forgot to include the most important part, the error message:

ValueError                                Traceback (most recent call last)
<ipython-input-18-059926eedbfd> in <cell line: 21>()
     19 cd_7 = LearnedKernelDrift(X_ref_2, kernel, backend='tensorflow', p_val=.05, epochs=10, batch_size=32)
     20 
---> 21 preds_7 = cd_7.predict(X_test_2, return_p_val=True, return_distance=True)

6 frames
/usr/local/lib/python3.10/dist-packages/alibi_detect/utils/tensorflow/kernels.py in call(self, x, y)
    169 
    170     def call(self, x: tf.Tensor, y: tf.Tensor) -> tf.Tensor:
--> 171         similarity = self.kernel_a(self.proj(x), self.proj(y))  # type: ignore[operator]
    172         if self.kernel_b is not None:
    173             similarity = (1-self.eps)*similarity + self.eps*self.kernel_b(x, y)  # type: ignore[operator]

ValueError: Exception encountered when calling layer 'sequential' (type Sequential).

Input 0 of layer "conv2d" is incompatible with the layer: expected min_ndim=4, found ndim=2. Full shape received: (32, 1)

Call arguments received by layer 'sequential' (type Sequential):
  • inputs=tf.Tensor(shape=(32, 1), dtype=float32)
  • training=None
  • mask=None

mauicv · 2023-05-25T12:29:54Z

It looks like your input shape is wrong. If you define the projection as:

proj = tf.keras.Sequential(
  [
      Input(shape=(32, 32, 3)),
      Conv2D(8, 4, strides=2, padding='same', activation=tf.nn.relu),
      Conv2D(16, 4, strides=2, padding='same', activation=tf.nn.relu),
      Conv2D(32, 4, strides=2, padding='same', activation=tf.nn.relu),
      Flatten(),
  ]
)

then it expects input data of shape (None, 32, 32, 3), but the dataset you've attached seems to have shape (None, 3). Assuming this is the issue, you'll have to change the model to match the data. You'll also probably want to use Dense layers instead of the Conv2D layers, since Conv2D is meant for image-like inputs rather than tabular rows.
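As a rough illustration of that suggestion, a projection built from Dense layers could look like the sketch below. It assumes three feature columns (matching the (None, 3) shape guessed above); the layer widths and the random stand-in batch are hypothetical choices, not values taken from the dataset or the docs.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input

n_features = 3  # assumption: three feature columns, per the (None, 3) shape above

# Dense projection for tabular rows, replacing the Conv2D stack meant for images.
proj = tf.keras.Sequential(
    [
        Input(shape=(n_features,)),
        Dense(16, activation=tf.nn.relu),  # hypothetical hidden width
        Dense(8),  # hypothetical embedding size
    ]
)

# Random stand-in batch laid out like the tabular data: (batch_size, n_features).
x = np.random.default_rng(0).normal(size=(32, n_features)).astype(np.float32)
z = proj(x)
print(z.shape)
```

A projection like this would then be passed to DeepKernel(proj, eps=0.01) as in the original post, with reference and test arrays of shape (n_samples, n_features).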

mauicv · 2023-05-25T12:34:02Z

That being said, I'm a little confused about what you're doing here:

sea_noise = pd.read_csv("/content/drive/MyDrive/Dataset/sea_0123_gradual_noise_0.2_1000.csv")

X_ref_2 = np.transpose(np.vstack([sea_noise.X1.values]))
X_test_2 = np.transpose(np.vstack([sea_noise.X3.values]))

It looks like you are fitting on one feature and testing on another. Is this what you mean to do? If you can explain what you're trying to do and what the data is, I might be able to give a better answer.
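For comparison, a more common setup is to keep the same feature columns in both sets and split by rows instead. The sketch below is only a guess at the intent: it assumes the rows are ordered in time and that the early rows are drift-free, and it uses a synthetic array as a stand-in for the CSV. The split point n_ref is hypothetical.

```python
import numpy as np

# Synthetic stand-in for the CSV: 1000 rows, 3 feature columns
# (assumed to correspond to X1, X2, X3 in the real file).
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 10.0, size=(1000, 3)).astype(np.float32)

n_ref = 500  # hypothetical split point between reference and test windows

# Reference window: early rows, assumed to be free of drift.
X_ref = data[:n_ref]
# Test window: later rows, using the SAME columns as the reference set.
X_test = data[n_ref:]

print(X_ref.shape, X_test.shape)
```

With this layout both arrays have shape (n_samples, 3), so the detector compares like with like rather than one feature against another.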
