can not find sample image #6

Closed
cindyyyl opened this issue Mar 19, 2024 · 67 comments

Comments

@cindyyyl

Hi, I ran into a problem when running sh scripts/maml/sample_mnist.sh: I cannot find the sample images. I opened the directory (inr_mnist/samples), but it was empty. However, when I run exp_mnist, it does generate images. Could you give me some instructions?

best,
xinxin

@giulio98
Owner

Hello @cindyyyl!
Are the sampled images displayed correctly on wandb?

@cindyyyl
Author

Copy that, I saw them, but I think the key problem may be that I did not run the scripts successfully. I am stuck in the evaluation part. I tried adjusting the number of samples from 10001 to 10000 (I think it was a bug), and I am sure that both directories contain files ending in xxx.npz. Could you give me any instructions?
[attached image]
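
When checking statistics files like the .npz files mentioned above, it can help to list what each file actually contains. The following is only a minimal sketch: `summarize_npz`, the file name `demo_stats.npz`, and the keys `mu` and `sigma` are made up for illustration and are not taken from the repository.

```python
import tempfile
from pathlib import Path

import numpy as np


def summarize_npz(path):
    """Return {array_name: shape} for every array stored in an .npz file."""
    with np.load(path) as data:
        return {name: data[name].shape for name in data.files}


# Demo on a throwaway file; the keys "mu" and "sigma" are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo_stats.npz"
    np.savez(demo, mu=np.zeros(4), sigma=np.eye(4))
    summary = summarize_npz(demo)
    # List every .npz file in the directory, as one would for a stats folder.
    npz_files = sorted(p.name for p in Path(tmp).glob("*.npz"))

print(summary)
print(npz_files)
```

Pointing `summarize_npz` at a real statistics file would show whether it holds the arrays, and the shapes, that the evaluation step expects.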

@giulio98
Owner

Hello @cindyyyl,

Sorry for the late reply. To better assist you, I need a clearer description of your issue. Could you please fill out the following details?

  1. Steps to Reproduce:
    • Step 1:
    • Step 2:
    • Step 3:

  2. Expected Behavior:
    • What you expected to happen after completing the steps above.

  3. Actual Behavior:
    • What actually happened. Please include any error messages or screenshots if possible.

  4. Additional Information:
    • Any other details or context you think might be helpful.

Best regards,

Giulio

@cindyyyl
Author

cindyyyl commented Apr 30, 2024 via email

@giulio98
Owner

giulio98 commented May 2, 2024

Hello @cindyyyl,

It appears there's some confusion regarding the FID calculations for MNIST. The FID Clean and FID CLIP scores cannot be computed with the current setup, as their feature extractor networks require 3-channel images. For MNIST, we instead use a pretrained LeNet5, taking its penultimate layer to extract features for the standard FID computation.

You can verify in the configuration file that LeNet is specified as the feature extractor for MNIST at this link:

.

Additionally, we have provided the pretrained LeNet5 model weights under models/lenet5 for reproducibility.

Therefore, it is expected that you won’t be able to obtain the FID CLIP and FID Clean scores—it’s not a bug. Could you confirm if you were able to calculate the standard FID score? If so, I'll initiate a pull request to clarify that FID CLIP and FID Clean scores should not be computed when dealing with single-channel images, to avoid further confusion.

Best regards,
Giulio
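
For readers unfamiliar with how a standard FID score is obtained from extracted features, here is a minimal sketch of the Fréchet distance between two Gaussians fitted to feature arrays. It is not the repository's implementation: `frechet_distance` and `feature_stats` are hypothetical names, and the random arrays only stand in for features that a network such as LeNet5 would produce.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Squared Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Symmetric square root of sigma1 via its eigendecomposition.
    vals, vecs = np.linalg.eigh(sigma1)
    sqrt_sigma1 = (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T
    # sigma1 @ sigma2 shares its eigenvalues with this symmetric PSD matrix,
    # so the trace of its square root can be read off from real eigenvalues.
    inner = sqrt_sigma1 @ sigma2 @ sqrt_sigma1
    eigs = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    trace_sqrt = np.sqrt(eigs).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)


def feature_stats(features):
    """Mean and covariance of an (n_samples, n_features) array."""
    return features.mean(axis=0), np.cov(features, rowvar=False)


rng = np.random.default_rng(0)
real = rng.normal(size=(2000, 8))            # stand-in for features of real images
fake = rng.normal(loc=0.5, size=(2000, 8))   # stand-in for features of sampled images

fid_same = frechet_distance(*feature_stats(real), *feature_stats(real))
fid_diff = frechet_distance(*feature_stats(real), *feature_stats(fake))
print(fid_same, fid_diff)
```

Comparing a feature set with itself gives a distance close to zero, while shifting the feature mean increases it. The computation depends only on the feature statistics, so the extractor decides which image shapes are supported.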

@cindyyyl
Author

cindyyyl commented May 2, 2024 via email

@giulio98
Owner

giulio98 commented May 3, 2024

Hi,

Can you post a screenshot of a batch of sampled images for MNIST here?

@cindyyyl
Author

cindyyyl commented May 4, 2024 via email

@cindyyyl
Author

cindyyyl commented May 4, 2024 via email

@giulio98
Owner

giulio98 commented May 5, 2024

Hey, sorry, I can't see your screenshot. I just see [image.png]. Can you try again?

@cindyyyl
Author

cindyyyl commented May 5, 2024 via email

@giulio98
Owner

giulio98 commented May 5, 2024

Hello, I ran the script and was able to get the FID score for MNIST.
This is my full log:

(fdp) corallo@atlas1:~/PycharmProjects/functional-diffusion-processes$ sh scripts/maml/eval_mnist.sh
2024-05-05 15:39:17.027317: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[2024-05-05 15:39:19,697][HYDRA] Launching 1 jobs locally
[2024-05-05 15:39:19,697][HYDRA]        #0 : +experiments_maml=eval_mnist
[2024-05-05 15:39:19,880][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,087][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,089][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,093][__main__][INFO] - Instantiating <functional_diffusion_processes.models.mlp_modulation.MLPModulationLR>
[2024-05-05 15:39:20,127][__main__][INFO] - Instantiating <functional_diffusion_processes.sdetools.heat_subvp_sde.HeatSubVPSDE>
[2024-05-05 15:39:20,209][absl][INFO] - Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker: 
[2024-05-05 15:39:20,776][absl][INFO] - Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter Host CUDA
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
[2024-05-05 15:39:21,560][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.correctors.langevin_corrector.LangevinCorrector>
[2024-05-05 15:39:21,564][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.predictors.euler_predictor.EulerMaruyamaPredictor>
[2024-05-05 15:39:21,568][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.pc_sampler.PCSampler>
[2024-05-05 15:39:21,571][__main__][INFO] - Instantiating <functional_diffusion_processes.losses.mse_loss.MSELoss>
[2024-05-05 15:39:21,573][__main__][INFO] - Instantiating <functional_diffusion_processes.trainers.trainer.Trainer>
[2024-05-05 15:39:21,591][absl][WARNING] - GlobalAsyncCheckpointManager is not imported correctly. Checkpointing of GlobalDeviceArrays will not be available.To use the feature, install tensorstore.
WARNING:tensorflow:From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,457][tensorflow][WARNING] - From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,465][__main__][INFO] - Instantiating <functional_diffusion_processes.metrics.fid_metric.FIDMetric>
[2024-05-05 15:39:23,604][functional_diffusion_processes.metrics.feature_extractor][INFO] - Extracting features from dataset...
[2024-05-05 15:39:23,605][absl][INFO] - Reusing dataset mnist (/home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1)
[2024-05-05 15:39:23,640][absl][INFO] - Constructing tf.data.Dataset mnist for split test, from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
WARNING:tensorflow:AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:25,504][tensorflow][WARNING] - AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:24,513][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Converting image to range [0,1]...
[2024-05-05 15:39:25,637][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Resizing image to size 32...
[2024-05-05 15:39:25,661][functional_diffusion_processes.datasets.image_dataset][INFO] - Preprocessing images for split test...
[2024-05-05 15:39:25,682][functional_diffusion_processes.datasets.image_dataset][INFO] - Image reshaped to shape (1024, 1)...
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:25,912][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 0
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow/python/util/nest.py:917: UserWarning: `tf.layers.flatten` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Flatten` instead.
  structure[0], [func(*x) for x in entries],
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/keras/legacy_tf_layers/base.py:627: UserWarning: `layer.updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  self.updates, tf.compat.v1.GraphKeys.UPDATE_OPS
[2024-05-05 15:39:27,005][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 1
[2024-05-05 15:39:27,542][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 2
[2024-05-05 15:39:28,074][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 3
[2024-05-05 15:39:28,655][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 4
[2024-05-05 15:39:29,643][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 5
[2024-05-05 15:39:30,365][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 6
[2024-05-05 15:39:30,861][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 7
[2024-05-05 15:39:31,510][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 8
[2024-05-05 15:39:32,186][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 9
[2024-05-05 15:39:32,710][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 10
[2024-05-05 15:39:33,367][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 11
[2024-05-05 15:39:34,191][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 12
[2024-05-05 15:39:34,736][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 13
[2024-05-05 15:39:35,238][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 14
[2024-05-05 15:39:35,738][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 15
[2024-05-05 15:39:36,249][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 16
[2024-05-05 15:39:36,761][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 17
[2024-05-05 15:39:37,254][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 18
[2024-05-05 15:39:37,758][functional_diffusion_processes.metrics.fid_metric][INFO] - Saving real dataset stats to: /home/corallo/PycharmProjects/functional-diffusion-processes/data/stats/mnist_test_stats.npz
[2024-05-05 15:39:37,980][__main__][INFO] - Starting testing!
wandb: Currently logged in as: giulio-corallo (eurecom-ds). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.16.6 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.14.0
wandb: Run data is saved locally in /home/corallo/PycharmProjects/functional-diffusion-processes/wandb/run-20240505_153938-lk5wmrth
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run inr_mnist
wandb: ⭐️ View project at https://wandb.ai/eurecom-ds/fpd
wandb: 🚀 View run at https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
[2024-05-05 15:39:48,607][functional_diffusion_processes.trainers.trainer][INFO] - Total number of parameters: 0.12M
[2024-05-05 15:39:48,904][functional_diffusion_processes.trainers.trainer][WARNING] - Resuming training from the latest checkpoint.
[2024-05-05 15:39:48,905][absl][INFO] - Restoring checkpoint from /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/checkpoints/checkpoint_27
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:48,989][absl][INFO] - Found no checkpoint files in /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist with prefix meta_0_
[2024-05-05 15:39:48,989][functional_diffusion_processes.trainers.trainer][INFO] - Starting sampling loop at step 0.
  0%|                                                                                                                                                                                 | 0/32 [00:00<?, ?it/s][2024-05-05 15:39:48,990][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 0
[2024-05-05 15:41:24,948][absl][INFO] - Saving checkpoint at step: 0
[2024-05-05 15:41:24,952][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  3%|█████▎                                                                                                                                                                   | 1/32 [01:35<49:34, 95.96s/it][2024-05-05 15:41:24,953][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 1
[2024-05-05 15:41:43,396][absl][INFO] - Saving checkpoint at step: 1
[2024-05-05 15:41:43,397][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
[2024-05-05 15:41:43,397][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  6%|██████████▌                                                                                                                                                              | 2/32 [01:54<25:10, 50.36s/it][2024-05-05 15:41:43,397][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 2
[2024-05-05 15:42:01,615][absl][INFO] - Saving checkpoint at step: 2
[2024-05-05 15:42:01,618][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
[2024-05-05 15:42:01,619][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
  9%|███████████████▊                                                                                                                                                         | 3/32 [02:12<17:14, 35.69s/it][2024-05-05 15:42:01,619][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 3
[2024-05-05 15:42:19,896][absl][INFO] - Saving checkpoint at step: 3
[2024-05-05 15:42:19,897][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
[2024-05-05 15:42:19,897][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
 12%|█████████████████████▏                                                                                                                                                   | 4/32 [02:30<13:26, 28.81s/it][2024-05-05 15:42:19,898][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 4
[2024-05-05 15:42:38,094][absl][INFO] - Saving checkpoint at step: 4
[2024-05-05 15:42:38,096][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
[2024-05-05 15:42:38,097][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
 16%|██████████████████████████▍                                                                                                                                              | 5/32 [02:49<11:14, 24.99s/it][2024-05-05 15:42:38,097][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 5
[2024-05-05 15:42:56,403][absl][INFO] - Saving checkpoint at step: 5
[2024-05-05 15:42:56,404][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
[2024-05-05 15:42:56,404][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
 19%|███████████████████████████████▋                                                                                                                                         | 6/32 [03:07<09:50, 22.72s/it][2024-05-05 15:42:56,405][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 6
[2024-05-05 15:43:14,647][absl][INFO] - Saving checkpoint at step: 6
[2024-05-05 15:43:14,648][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
[2024-05-05 15:43:14,648][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
 22%|████████████████████████████████████▉                                                                                                                                    | 7/32 [03:25<08:51, 21.25s/it][2024-05-05 15:43:14,651][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 7
[2024-05-05 15:43:32,949][absl][INFO] - Saving checkpoint at step: 7
[2024-05-05 15:43:32,956][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
[2024-05-05 15:43:32,956][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
 25%|██████████████████████████████████████████▎                                                                                                                              | 8/32 [03:43<08:07, 20.32s/it][2024-05-05 15:43:32,957][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 8
[2024-05-05 15:43:51,240][absl][INFO] - Saving checkpoint at step: 8
[2024-05-05 15:43:51,242][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
[2024-05-05 15:43:51,244][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
 28%|███████████████████████████████████████████████▌                                                                                                                         | 9/32 [04:02<07:32, 19.68s/it][2024-05-05 15:43:51,244][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 9
[2024-05-05 15:44:09,502][absl][INFO] - Saving checkpoint at step: 9
[2024-05-05 15:44:09,505][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
[2024-05-05 15:44:09,506][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
 31%|████████████████████████████████████████████████████▌                                                                                                                   | 10/32 [04:20<07:03, 19.24s/it][2024-05-05 15:44:09,507][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 10
[2024-05-05 15:44:27,768][absl][INFO] - Saving checkpoint at step: 10
[2024-05-05 15:44:27,776][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
[2024-05-05 15:44:27,776][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
 34%|█████████████████████████████████████████████████████████▊                                                                                                              | 11/32 [04:38<06:37, 18.95s/it][2024-05-05 15:44:27,777][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 11
[2024-05-05 15:44:45,992][absl][INFO] - Saving checkpoint at step: 11
[2024-05-05 15:44:46,002][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
[2024-05-05 15:44:46,004][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
 38%|███████████████████████████████████████████████████████████████                                                                                                         | 12/32 [04:57<06:14, 18.73s/it][2024-05-05 15:44:46,005][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 12
[2024-05-05 15:45:04,337][absl][INFO] - Saving checkpoint at step: 12
[2024-05-05 15:45:04,339][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
[2024-05-05 15:45:04,340][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
 41%|████████████████████████████████████████████████████████████████████▎                                                                                                   | 13/32 [05:15<05:53, 18.61s/it][2024-05-05 15:45:04,341][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 13
[2024-05-05 15:45:22,587][absl][INFO] - Saving checkpoint at step: 13
[2024-05-05 15:45:22,591][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
[2024-05-05 15:45:22,594][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
 44%|█████████████████████████████████████████████████████████████████████████▌                                                                                              | 14/32 [05:33<05:33, 18.50s/it][2024-05-05 15:45:22,594][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 14
[2024-05-05 15:45:40,872][absl][INFO] - Saving checkpoint at step: 14
[2024-05-05 15:45:40,880][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
[2024-05-05 15:45:40,881][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
 47%|██████████████████████████████████████████████████████████████████████████████▊                                                                                         | 15/32 [05:51<05:13, 18.44s/it][2024-05-05 15:45:40,881][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 15
[2024-05-05 15:45:59,246][absl][INFO] - Saving checkpoint at step: 15
[2024-05-05 15:45:59,251][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
[2024-05-05 15:45:59,252][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
 50%|████████████████████████████████████████████████████████████████████████████████████                                                                                    | 16/32 [06:10<04:54, 18.42s/it][2024-05-05 15:45:59,252][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 16
[2024-05-05 15:46:17,534][absl][INFO] - Saving checkpoint at step: 16
[2024-05-05 15:46:17,538][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
[2024-05-05 15:46:17,538][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
 53%|█████████████████████████████████████████████████████████████████████████████████████████▎                                                                              | 17/32 [06:28<04:35, 18.38s/it][2024-05-05 15:46:17,539][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 17
[2024-05-05 15:46:35,893][absl][INFO] - Saving checkpoint at step: 17
[2024-05-05 15:46:35,899][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
[2024-05-05 15:46:35,900][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
 56%|██████████████████████████████████████████████████████████████████████████████████████████████▌                                                                         | 18/32 [06:46<04:17, 18.37s/it][2024-05-05 15:46:35,900][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 18
[2024-05-05 15:46:54,148][absl][INFO] - Saving checkpoint at step: 18
[2024-05-05 15:46:54,150][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
[2024-05-05 15:46:54,150][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
 59%|███████████████████████████████████████████████████████████████████████████████████████████████████▊                                                                    | 19/32 [07:05<03:58, 18.34s/it][2024-05-05 15:46:54,151][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 19
[2024-05-05 15:47:12,467][absl][INFO] - Saving checkpoint at step: 19
[2024-05-05 15:47:12,468][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
[2024-05-05 15:47:12,469][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
 62%|█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                               | 20/32 [07:23<03:39, 18.33s/it][2024-05-05 15:47:12,469][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 20
[2024-05-05 15:47:30,717][absl][INFO] - Saving checkpoint at step: 20
[2024-05-05 15:47:30,718][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
[2024-05-05 15:47:30,719][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
 66%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                         | 21/32 [07:41<03:21, 18.31s/it][2024-05-05 15:47:30,719][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21
[2024-05-05 15:47:49,020][absl][INFO] - Saving checkpoint at step: 21
[2024-05-05 15:47:49,026][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
[2024-05-05 15:47:49,026][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                                                    | 22/32 [08:00<03:03, 18.31s/it][2024-05-05 15:47:49,027][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22
[2024-05-05 15:48:07,384][absl][INFO] - Saving checkpoint at step: 22
[2024-05-05 15:48:07,386][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
[2024-05-05 15:48:07,386][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
 72%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                               | 23/32 [08:18<02:44, 18.32s/it][2024-05-05 15:48:07,387][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23
[2024-05-05 15:48:25,666][absl][INFO] - Saving checkpoint at step: 23
[2024-05-05 15:48:25,671][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
[2024-05-05 15:48:25,672][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
 75%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                          | 24/32 [08:36<02:26, 18.31s/it][2024-05-05 15:48:25,672][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24
[2024-05-05 15:48:43,856][absl][INFO] - Saving checkpoint at step: 24
[2024-05-05 15:48:43,861][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
[2024-05-05 15:48:43,862][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
 78%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                    | 25/32 [08:54<02:07, 18.28s/it][2024-05-05 15:48:43,862][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25
[2024-05-05 15:49:02,126][absl][INFO] - Saving checkpoint at step: 25
[2024-05-05 15:49:02,128][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
[2024-05-05 15:49:02,129][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
 81%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                               | 26/32 [09:13<01:49, 18.27s/it][2024-05-05 15:49:02,129][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26
[2024-05-05 15:49:20,398][absl][INFO] - Saving checkpoint at step: 26
[2024-05-05 15:49:20,402][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
[2024-05-05 15:49:20,403][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
 84%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                          | 27/32 [09:31<01:31, 18.27s/it][2024-05-05 15:49:20,403][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27
[2024-05-05 15:49:38,634][absl][INFO] - Saving checkpoint at step: 27
[2024-05-05 15:49:38,636][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
[2024-05-05 15:49:38,637][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
 88%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                     | 28/32 [09:49<01:13, 18.26s/it][2024-05-05 15:49:38,637][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28
[2024-05-05 15:49:56,883][absl][INFO] - Saving checkpoint at step: 28
[2024-05-05 15:49:56,885][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
[2024-05-05 15:49:56,885][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
 91%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎               | 29/32 [10:07<00:54, 18.26s/it][2024-05-05 15:49:56,886][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29
[2024-05-05 15:50:15,084][absl][INFO] - Saving checkpoint at step: 29
[2024-05-05 15:50:15,089][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
[2024-05-05 15:50:15,089][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
 94%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌          | 30/32 [10:26<00:36, 18.24s/it][2024-05-05 15:50:15,090][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30
[2024-05-05 15:50:33,486][absl][INFO] - Saving checkpoint at step: 30
[2024-05-05 15:50:33,487][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_30
[2024-05-05 15:50:33,488][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
 97%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊     | 31/32 [10:44<00:18, 18.29s/it][2024-05-05 15:50:33,488][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:02<00:00, 18.31s/it][2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - FID: 1.288000e+00
[2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00
wandb: Waiting for W&B process to finish... (success).
wandb: 
wandb: Run history:
wandb:             FID ▁
wandb: inception score ▁
wandb: 
wandb: Run summary:
wandb:             FID 1.288
wandb: inception score -1.0
wandb: 
wandb: 🚀 View run inr_mnist at: https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
wandb: Synced 6 W&B file(s), 32 media file(s), 2 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240505_153938-lk5wmrth/logs
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:12<00:00, 21.01s/it]

Please do the following:
Synchronize your local repository with my changes:
git pull

Then carefully check your .env file.
In my case, for example, it is:

export WANDB_API_KEY=<my secret wandb key>
export HOME=/home/corallo/PycharmProjects
export CUDA_HOME=/usr/local/cuda
export PROJECT_ROOT=/home/corallo/PycharmProjects/functional-diffusion-processes
export DATA_ROOT=${PROJECT_ROOT}/data
export LOGS_ROOT=${PROJECT_ROOT}/logs
export TFDS_DATA_DIR=${DATA_ROOT}/tensorflow_datasets
export PYTHONPATH=${PROJECT_ROOT}
export PYTHONUNBUFFERED=1
export HYDRA_FULL_ERROR=1
export WANDB_DISABLE_SERVICE=true
export CUDA_VISIBLE_DEVICES=2

Please note the following: for me, HOME is /home/corallo/PycharmProjects because that is the directory containing the functional-diffusion-processes project. In your case, you have to check where yours is located; we can't know that in advance.
Likewise, PROJECT_ROOT for me is /home/corallo/PycharmProjects/functional-diffusion-processes, which corresponds to my HOME + functional-diffusion-processes.

Leave the other environment variables as they are, except that you have to fill in WANDB_API_KEY with your own key and CUDA_VISIBLE_DEVICES with the IDs (comma-separated) of the GPUs you would like to use.
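
A quick way to sanity-check these settings before launching a script is sketched below. It is only an illustration, not part of the repository: the function name `check_env` and the exact rules it applies (PROJECT_ROOT equal to HOME + functional-diffusion-processes, CUDA_VISIBLE_DEVICES as comma-separated integer IDs) are assumptions derived from the description above.

```python
import os
from pathlib import Path

# Variables listed in the example .env file that must be non-empty.
REQUIRED = (
    "WANDB_API_KEY", "HOME", "PROJECT_ROOT", "DATA_ROOT",
    "LOGS_ROOT", "TFDS_DATA_DIR", "PYTHONPATH",
)


def check_env(env, check_dirs=True):
    """Return a list of problems found in an environment mapping (hypothetical helper)."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    home, root = env.get("HOME"), env.get("PROJECT_ROOT")
    if home and root:
        # Assumption: the project is cloned directly under HOME, as in the example.
        expected = Path(home) / "functional-diffusion-processes"
        if Path(root) != expected:
            problems.append(f"PROJECT_ROOT is {root}, expected {expected}")
    if check_dirs and root and not Path(root).is_dir():
        problems.append(f"PROJECT_ROOT directory {root} does not exist")

    # CUDA_VISIBLE_DEVICES should be GPU ids separated by commas, e.g. "2" or "0,1".
    ids = [d.strip() for d in env.get("CUDA_VISIBLE_DEVICES", "").split(",")]
    if not all(d.isdigit() for d in ids):
        problems.append("CUDA_VISIBLE_DEVICES must be comma-separated GPU ids")
    return problems


if __name__ == "__main__":
    # Run after sourcing the .env file in the same shell.
    for problem in check_env(os.environ):
        print(problem)
```

If the script prints nothing after you source your .env file, the variables are at least consistent with the layout described above.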

Let me know if, after following these steps, you are able to get the FID score.
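
If the output is long, the FID line can be pulled out of a saved log with a few lines of Python. This is a minimal sketch that assumes the line format shown in the log above (`... - FID: 1.288000e+00`); the helper name `extract_fid` is hypothetical and not part of the repository.

```python
import re
from typing import Optional

# Matches the trainer's log line, e.g. "... [INFO] - FID: 1.288000e+00".
# The wandb summary line ("FID 1.288", no colon) is intentionally not matched.
FID_PATTERN = re.compile(r"FID:\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def extract_fid(log_text: str) -> Optional[float]:
    """Return the last FID value found in the log text, or None if there is none."""
    matches = FID_PATTERN.findall(log_text)
    return float(matches[-1]) if matches else None
```

For example, after saving the console output to a file of your choice, `extract_fid(open(path).read())` returns the score, or `None` if the evaluation never reached the FID computation.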

@cindyyyl
Author

cindyyyl commented May 6, 2024 via email

@giulio98
Owner

giulio98 commented May 6, 2024

Hello,

I can't see the images you shared with me.
This is what I see from your message:
image
Anyway, this is a batch I get from the provided checkpoint:
image

@cindyyyl
Author

cindyyyl commented May 6, 2024 via email

@giulio98
Owner

giulio98 commented May 6, 2024

No, I can't see it.

Can you try directly on GitHub?

@cindyyyl
Author

cindyyyl commented May 6, 2024

image

@cindyyyl
Author

cindyyyl commented May 6, 2024

hahahahaha

@giulio98
Owner

giulio98 commented May 6, 2024

Hey,

This is not supposed to happen. Please run the command
git pull
and let me know if you get the correct image. You should get something similar to the one in my previous comment.

@cindyyyl
Author

cindyyyl commented May 6, 2024 via email

@giulio98
Owner

giulio98 commented May 7, 2024

No, I mean pull my changes:
git pull

@cindyyyl
Author

cindyyyl commented May 7, 2024 via email

@giulio98
Owner

giulio98 commented May 7, 2024

I'm unable to reproduce your experiment. If I run
sh scripts/maml/eval_mnist.sh

with this .env file:

export WANDB_API_KEY=<my secret wandb key>
export HOME=/home/corallo/PycharmProjects
export CUDA_HOME=/usr/local/cuda
export PROJECT_ROOT=/home/corallo/PycharmProjects/functional-diffusion-processes
export DATA_ROOT=${PROJECT_ROOT}/data
export LOGS_ROOT=${PROJECT_ROOT}/logs
export TFDS_DATA_DIR=${DATA_ROOT}/tensorflow_datasets
export PYTHONPATH=${PROJECT_ROOT}
export PYTHONUNBUFFERED=1
export HYDRA_FULL_ERROR=1
export WANDB_DISABLE_SERVICE=true
export CUDA_VISIBLE_DEVICES=2

I get these logs:

(fdp) corallo@atlas1:~/PycharmProjects/functional-diffusion-processes$ sh scripts/maml/eval_mnist.sh
2024-05-05 15:39:17.027317: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[2024-05-05 15:39:19,697][HYDRA] Launching 1 jobs locally
[2024-05-05 15:39:19,697][HYDRA]        #0 : +experiments_maml=eval_mnist
[2024-05-05 15:39:19,880][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,087][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,089][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,093][__main__][INFO] - Instantiating <functional_diffusion_processes.models.mlp_modulation.MLPModulationLR>
[2024-05-05 15:39:20,127][__main__][INFO] - Instantiating <functional_diffusion_processes.sdetools.heat_subvp_sde.HeatSubVPSDE>
[2024-05-05 15:39:20,209][absl][INFO] - Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker: 
[2024-05-05 15:39:20,776][absl][INFO] - Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter Host CUDA
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
[2024-05-05 15:39:21,560][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.correctors.langevin_corrector.LangevinCorrector>
[2024-05-05 15:39:21,564][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.predictors.euler_predictor.EulerMaruyamaPredictor>
[2024-05-05 15:39:21,568][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.pc_sampler.PCSampler>
[2024-05-05 15:39:21,571][__main__][INFO] - Instantiating <functional_diffusion_processes.losses.mse_loss.MSELoss>
[2024-05-05 15:39:21,573][__main__][INFO] - Instantiating <functional_diffusion_processes.trainers.trainer.Trainer>
[2024-05-05 15:39:21,591][absl][WARNING] - GlobalAsyncCheckpointManager is not imported correctly. Checkpointing of GlobalDeviceArrays will not be available.To use the feature, install tensorstore.
WARNING:tensorflow:From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,457][tensorflow][WARNING] - From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,465][__main__][INFO] - Instantiating <functional_diffusion_processes.metrics.fid_metric.FIDMetric>
[2024-05-05 15:39:23,604][functional_diffusion_processes.metrics.feature_extractor][INFO] - Extracting features from dataset...
[2024-05-05 15:39:23,605][absl][INFO] - Reusing dataset mnist (/home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1)
[2024-05-05 15:39:23,640][absl][INFO] - Constructing tf.data.Dataset mnist for split test, from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
WARNING:tensorflow:AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:25,504][tensorflow][WARNING] - AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:24,513][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Converting image to range [0,1]...
[2024-05-05 15:39:25,637][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Resizing image to size 32...
[2024-05-05 15:39:25,661][functional_diffusion_processes.datasets.image_dataset][INFO] - Preprocessing images for split test...
[2024-05-05 15:39:25,682][functional_diffusion_processes.datasets.image_dataset][INFO] - Image reshaped to shape (1024, 1)...
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:25,912][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 0
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow/python/util/nest.py:917: UserWarning: `tf.layers.flatten` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Flatten` instead.
  structure[0], [func(*x) for x in entries],
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/keras/legacy_tf_layers/base.py:627: UserWarning: `layer.updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  self.updates, tf.compat.v1.GraphKeys.UPDATE_OPS
[2024-05-05 15:39:27,005][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 1
[2024-05-05 15:39:27,542][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 2
[2024-05-05 15:39:28,074][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 3
[2024-05-05 15:39:28,655][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 4
[2024-05-05 15:39:29,643][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 5
[2024-05-05 15:39:30,365][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 6
[2024-05-05 15:39:30,861][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 7
[2024-05-05 15:39:31,510][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 8
[2024-05-05 15:39:32,186][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 9
[2024-05-05 15:39:32,710][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 10
[2024-05-05 15:39:33,367][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 11
[2024-05-05 15:39:34,191][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 12
[2024-05-05 15:39:34,736][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 13
[2024-05-05 15:39:35,238][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 14
[2024-05-05 15:39:35,738][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 15
[2024-05-05 15:39:36,249][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 16
[2024-05-05 15:39:36,761][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 17
[2024-05-05 15:39:37,254][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 18
[2024-05-05 15:39:37,758][functional_diffusion_processes.metrics.fid_metric][INFO] - Saving real dataset stats to: /home/corallo/PycharmProjects/functional-diffusion-processes/data/stats/mnist_test_stats.npz
[2024-05-05 15:39:37,980][__main__][INFO] - Starting testing!
wandb: Currently logged in as: giulio-corallo (eurecom-ds). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.16.6 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.14.0
wandb: Run data is saved locally in /home/corallo/PycharmProjects/functional-diffusion-processes/wandb/run-20240505_153938-lk5wmrth
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run inr_mnist
wandb: ⭐️ View project at https://wandb.ai/eurecom-ds/fpd
wandb: 🚀 View run at https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
[2024-05-05 15:39:48,607][functional_diffusion_processes.trainers.trainer][INFO] - Total number of parameters: 0.12M
[2024-05-05 15:39:48,904][functional_diffusion_processes.trainers.trainer][WARNING] - Resuming training from the latest checkpoint.
[2024-05-05 15:39:48,905][absl][INFO] - Restoring checkpoint from /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/checkpoints/checkpoint_27
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:48,989][absl][INFO] - Found no checkpoint files in /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist with prefix meta_0_
[2024-05-05 15:39:48,989][functional_diffusion_processes.trainers.trainer][INFO] - Starting sampling loop at step 0.
  0%|                                                                                                                                                                                 | 0/32 [00:00<?, ?it/s][2024-05-05 15:39:48,990][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 0
[2024-05-05 15:41:24,948][absl][INFO] - Saving checkpoint at step: 0
[2024-05-05 15:41:24,952][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  3%|█████▎                                                                                                                                                                   | 1/32 [01:35<49:34, 95.96s/it][2024-05-05 15:41:24,953][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 1
[2024-05-05 15:41:43,396][absl][INFO] - Saving checkpoint at step: 1
[2024-05-05 15:41:43,397][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
[2024-05-05 15:41:43,397][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  6%|██████████▌                                                                                                                                                              | 2/32 [01:54<25:10, 50.36s/it][2024-05-05 15:41:43,397][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 2
[2024-05-05 15:42:01,615][absl][INFO] - Saving checkpoint at step: 2
[2024-05-05 15:42:01,618][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
[2024-05-05 15:42:01,619][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
  9%|███████████████▊                                                                                                                                                         | 3/32 [02:12<17:14, 35.69s/it][2024-05-05 15:42:01,619][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 3
[2024-05-05 15:42:19,896][absl][INFO] - Saving checkpoint at step: 3
[2024-05-05 15:42:19,897][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
[2024-05-05 15:42:19,897][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
 12%|█████████████████████▏                                                                                                                                                   | 4/32 [02:30<13:26, 28.81s/it][2024-05-05 15:42:19,898][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 4
[2024-05-05 15:42:38,094][absl][INFO] - Saving checkpoint at step: 4
[2024-05-05 15:42:38,096][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
[2024-05-05 15:42:38,097][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
 16%|██████████████████████████▍                                                                                                                                              | 5/32 [02:49<11:14, 24.99s/it][2024-05-05 15:42:38,097][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 5
[2024-05-05 15:42:56,403][absl][INFO] - Saving checkpoint at step: 5
[2024-05-05 15:42:56,404][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
[2024-05-05 15:42:56,404][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
 19%|███████████████████████████████▋                                                                                                                                         | 6/32 [03:07<09:50, 22.72s/it][2024-05-05 15:42:56,405][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 6
[2024-05-05 15:43:14,647][absl][INFO] - Saving checkpoint at step: 6
[2024-05-05 15:43:14,648][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
[2024-05-05 15:43:14,648][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
 22%|████████████████████████████████████▉                                                                                                                                    | 7/32 [03:25<08:51, 21.25s/it][2024-05-05 15:43:14,651][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 7
[2024-05-05 15:43:32,949][absl][INFO] - Saving checkpoint at step: 7
[2024-05-05 15:43:32,956][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
[2024-05-05 15:43:32,956][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
 25%|██████████████████████████████████████████▎                                                                                                                              | 8/32 [03:43<08:07, 20.32s/it][2024-05-05 15:43:32,957][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 8
[2024-05-05 15:43:51,240][absl][INFO] - Saving checkpoint at step: 8
[2024-05-05 15:43:51,242][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
[2024-05-05 15:43:51,244][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
 28%|███████████████████████████████████████████████▌                                                                                                                         | 9/32 [04:02<07:32, 19.68s/it][2024-05-05 15:43:51,244][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 9
[2024-05-05 15:44:09,502][absl][INFO] - Saving checkpoint at step: 9
[2024-05-05 15:44:09,505][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
[2024-05-05 15:44:09,506][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
 31%|████████████████████████████████████████████████████▌                                                                                                                   | 10/32 [04:20<07:03, 19.24s/it][2024-05-05 15:44:09,507][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 10
[2024-05-05 15:44:27,768][absl][INFO] - Saving checkpoint at step: 10
[2024-05-05 15:44:27,776][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
[2024-05-05 15:44:27,776][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
 34%|█████████████████████████████████████████████████████████▊                                                                                                              | 11/32 [04:38<06:37, 18.95s/it][2024-05-05 15:44:27,777][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 11
[2024-05-05 15:44:45,992][absl][INFO] - Saving checkpoint at step: 11
[2024-05-05 15:44:46,002][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
[2024-05-05 15:44:46,004][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
 38%|███████████████████████████████████████████████████████████████                                                                                                         | 12/32 [04:57<06:14, 18.73s/it][2024-05-05 15:44:46,005][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 12
[2024-05-05 15:45:04,337][absl][INFO] - Saving checkpoint at step: 12
[2024-05-05 15:45:04,339][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
[2024-05-05 15:45:04,340][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
 41%|████████████████████████████████████████████████████████████████████▎                                                                                                   | 13/32 [05:15<05:53, 18.61s/it][2024-05-05 15:45:04,341][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 13
[2024-05-05 15:45:22,587][absl][INFO] - Saving checkpoint at step: 13
[2024-05-05 15:45:22,591][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
[2024-05-05 15:45:22,594][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
 44%|█████████████████████████████████████████████████████████████████████████▌                                                                                              | 14/32 [05:33<05:33, 18.50s/it][2024-05-05 15:45:22,594][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 14
[2024-05-05 15:45:40,872][absl][INFO] - Saving checkpoint at step: 14
[2024-05-05 15:45:40,880][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
[2024-05-05 15:45:40,881][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
 47%|██████████████████████████████████████████████████████████████████████████████▊                                                                                         | 15/32 [05:51<05:13, 18.44s/it][2024-05-05 15:45:40,881][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 15
[2024-05-05 15:45:59,246][absl][INFO] - Saving checkpoint at step: 15
[2024-05-05 15:45:59,251][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
[2024-05-05 15:45:59,252][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
 50%|████████████████████████████████████████████████████████████████████████████████████                                                                                    | 16/32 [06:10<04:54, 18.42s/it][2024-05-05 15:45:59,252][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 16
[2024-05-05 15:46:17,534][absl][INFO] - Saving checkpoint at step: 16
[2024-05-05 15:46:17,538][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
[2024-05-05 15:46:17,538][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
 53%|█████████████████████████████████████████████████████████████████████████████████████████▎                                                                              | 17/32 [06:28<04:35, 18.38s/it][2024-05-05 15:46:17,539][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 17
[2024-05-05 15:46:35,893][absl][INFO] - Saving checkpoint at step: 17
[2024-05-05 15:46:35,899][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
[2024-05-05 15:46:35,900][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
 56%|██████████████████████████████████████████████████████████████████████████████████████████████▌                                                                         | 18/32 [06:46<04:17, 18.37s/it][2024-05-05 15:46:35,900][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 18
[2024-05-05 15:46:54,148][absl][INFO] - Saving checkpoint at step: 18
[2024-05-05 15:46:54,150][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
[2024-05-05 15:46:54,150][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
 59%|███████████████████████████████████████████████████████████████████████████████████████████████████▊                                                                    | 19/32 [07:05<03:58, 18.34s/it][2024-05-05 15:46:54,151][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 19
[2024-05-05 15:47:12,467][absl][INFO] - Saving checkpoint at step: 19
[2024-05-05 15:47:12,468][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
[2024-05-05 15:47:12,469][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
 62%|█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                               | 20/32 [07:23<03:39, 18.33s/it][2024-05-05 15:47:12,469][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 20
[2024-05-05 15:47:30,717][absl][INFO] - Saving checkpoint at step: 20
[2024-05-05 15:47:30,718][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
[2024-05-05 15:47:30,719][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
 66%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                         | 21/32 [07:41<03:21, 18.31s/it][2024-05-05 15:47:30,719][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21
[2024-05-05 15:47:49,020][absl][INFO] - Saving checkpoint at step: 21
[2024-05-05 15:47:49,026][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
[2024-05-05 15:47:49,026][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                                                    | 22/32 [08:00<03:03, 18.31s/it][2024-05-05 15:47:49,027][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22
[2024-05-05 15:48:07,384][absl][INFO] - Saving checkpoint at step: 22
[2024-05-05 15:48:07,386][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
[2024-05-05 15:48:07,386][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
 72%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                               | 23/32 [08:18<02:44, 18.32s/it][2024-05-05 15:48:07,387][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23
[2024-05-05 15:48:25,666][absl][INFO] - Saving checkpoint at step: 23
[2024-05-05 15:48:25,671][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
[2024-05-05 15:48:25,672][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
 75%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                          | 24/32 [08:36<02:26, 18.31s/it][2024-05-05 15:48:25,672][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24
[2024-05-05 15:48:43,856][absl][INFO] - Saving checkpoint at step: 24
[2024-05-05 15:48:43,861][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
[2024-05-05 15:48:43,862][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
 78%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                    | 25/32 [08:54<02:07, 18.28s/it][2024-05-05 15:48:43,862][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25
[2024-05-05 15:49:02,126][absl][INFO] - Saving checkpoint at step: 25
[2024-05-05 15:49:02,128][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
[2024-05-05 15:49:02,129][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
 81%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                               | 26/32 [09:13<01:49, 18.27s/it][2024-05-05 15:49:02,129][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26
[2024-05-05 15:49:20,398][absl][INFO] - Saving checkpoint at step: 26
[2024-05-05 15:49:20,402][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
[2024-05-05 15:49:20,403][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
 84%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                          | 27/32 [09:31<01:31, 18.27s/it][2024-05-05 15:49:20,403][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27
[2024-05-05 15:49:38,634][absl][INFO] - Saving checkpoint at step: 27
[2024-05-05 15:49:38,636][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
[2024-05-05 15:49:38,637][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
 88%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                     | 28/32 [09:49<01:13, 18.26s/it][2024-05-05 15:49:38,637][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28
[2024-05-05 15:49:56,883][absl][INFO] - Saving checkpoint at step: 28
[2024-05-05 15:49:56,885][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
[2024-05-05 15:49:56,885][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
 91%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎               | 29/32 [10:07<00:54, 18.26s/it][2024-05-05 15:49:56,886][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29
[2024-05-05 15:50:15,084][absl][INFO] - Saving checkpoint at step: 29
[2024-05-05 15:50:15,089][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
[2024-05-05 15:50:15,089][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
 94%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌          | 30/32 [10:26<00:36, 18.24s/it][2024-05-05 15:50:15,090][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30
[2024-05-05 15:50:33,486][absl][INFO] - Saving checkpoint at step: 30
[2024-05-05 15:50:33,487][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_30
[2024-05-05 15:50:33,488][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
 97%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊     | 31/32 [10:44<00:18, 18.29s/it][2024-05-05 15:50:33,488][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:02<00:00, 18.31s/it][2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - FID: 1.288000e+00
[2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00
wandb: Waiting for W&B process to finish... (success).
wandb: 
wandb: Run history:
wandb:             FID ▁
wandb: inception score ▁
wandb: 
wandb: Run summary:
wandb:             FID 1.288
wandb: inception score -1.0
wandb: 
wandb: 🚀 View run inr_mnist at: https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
wandb: Synced 6 W&B file(s), 32 media file(s), 2 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240505_153938-lk5wmrth/logs
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:12<00:00, 21.01s/it]

And these are the sampled images on wandb:
image

Did you follow the exact same steps?
Do you get the same logs as mine?
Please share the content of your .env file (obscuring your wandb API key):
cat .env
and your full logs.

@cindyyyl
Author

cindyyyl commented May 8, 2024

export WANDB_API_KEY=
export HOME=/cis/net/io93c/data/shuan124/
export CUDA_HOME=/usr/local/cuda
export PROJECT_ROOT=/cis/net/io93c/data/shuan124/functional-diffusion-processes # /home/username/functional_diffusion_processes
export DATA_ROOT=${PROJECT_ROOT}/data
export LOGS_ROOT=${PROJECT_ROOT}/logs
export TFDS_DATA_DIR= ${DATA_ROOT}/tensorflow_datasets
export PYTHONPATH=${PROJECT_ROOT}
export PYTHONUNBUFFERED=1
export HYDRA_FULL_ERROR=1
export WANDB_DISABLE_SERVICE=true
export CUDA_VISIBLE_DEVICES=6
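
One detail that may be worth checking in a file like this is whitespace around `=`. If the file is sourced by a POSIX shell, a line such as `export VAR= value` assigns an empty string to `VAR`; how this project actually loads `.env` is not stated in the thread, so treat this only as a possible sanity check. The helper name `find_spaced_assignments` below is hypothetical:

```python
import re

# Matches optional "export", a variable name, "=", and an optional value,
# capturing any whitespace on either side of "=".
_ASSIGNMENT = re.compile(
    r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)(\s*)=(\s*)(\S.*)?$"
)


def find_spaced_assignments(text):
    """Return (line_number, name) for assignments with whitespace around '='."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = _ASSIGNMENT.match(line)
        if not match:
            continue
        name, before, after, value = match.groups()
        # Flag "NAME =value" and "NAME= value"; an empty value ("NAME=") is fine.
        if before or (after and value):
            hits.append((lineno, name))
    return hits


# Usage (path is an assumption):
#   with open(".env") as fh:
#       print(find_spaced_assignments(fh.read()))
```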

@cindyyyl
Author

cindyyyl commented May 8, 2024

wandb logs, run inr_mnist (project third_time):
[2024-05-07 18:05:18,543][absl][INFO] - Saving checkpoint at step: 20
[2024-05-07 18:05:18,563][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_20
[2024-05-07 18:05:18,564][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_19
[2024-05-07 18:05:18,568][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21
[2024-05-07 18:05:56,161][absl][INFO] - Saving checkpoint at step: 21
[2024-05-07 18:05:56,183][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_21
[2024-05-07 18:05:56,185][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_20
[2024-05-07 18:05:56,189][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22
[2024-05-07 18:06:33,751][absl][INFO] - Saving checkpoint at step: 22
[2024-05-07 18:06:33,769][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_22
[2024-05-07 18:06:33,771][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_21
[2024-05-07 18:06:33,775][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23
[2024-05-07 18:07:11,030][absl][INFO] - Saving checkpoint at step: 23
[2024-05-07 18:07:11,049][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_23
[2024-05-07 18:07:11,051][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_22
[2024-05-07 18:07:11,055][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24
[2024-05-07 18:07:48,711][absl][INFO] - Saving checkpoint at step: 24
[2024-05-07 18:07:48,730][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_24
[2024-05-07 18:07:48,732][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_23
[2024-05-07 18:07:48,737][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25
[2024-05-07 18:08:26,313][absl][INFO] - Saving checkpoint at step: 25
[2024-05-07 18:08:26,331][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_25
[2024-05-07 18:08:26,333][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_24
[2024-05-07 18:08:26,337][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26
[2024-05-07 18:09:03,872][absl][INFO] - Saving checkpoint at step: 26
[2024-05-07 18:09:03,889][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_26
[2024-05-07 18:09:03,891][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_25
[2024-05-07 18:09:03,895][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27
[2024-05-07 18:09:41,522][absl][INFO] - Saving checkpoint at step: 27
[2024-05-07 18:09:41,543][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_27
[2024-05-07 18:09:41,545][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_26
[2024-05-07 18:09:41,549][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28
[2024-05-07 18:10:19,344][absl][INFO] - Saving checkpoint at step: 28
[2024-05-07 18:10:19,390][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_28
[2024-05-07 18:10:19,393][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_27
[2024-05-07 18:10:19,397][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29
[2024-05-07 18:10:56,875][absl][INFO] - Saving checkpoint at step: 29
[2024-05-07 18:10:56,894][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_29
[2024-05-07 18:10:56,896][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_28
[2024-05-07 18:10:56,901][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30
[2024-05-07 18:11:34,635][absl][INFO] - Saving checkpoint at step: 30
[2024-05-07 18:11:34,654][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_30
[2024-05-07 18:11:34,656][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/functional-diffusion-processes/logs/inr_mnist/meta_0_29
[2024-05-07 18:11:34,661][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31
[2024-05-07 18:12:12,547][functional_diffusion_processes.trainers.trainer][INFO] - FID: 1.020612e+02
[2024-05-07 18:12:12,547][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00
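
To compare runs without scrolling through the sampling rounds, the reported score can be pulled out of a saved log. This is only a minimal sketch that assumes the trainer keeps printing the score as `FID: <number>`, as in the lines above; the function name `extract_fid` is hypothetical:

```python
import re

# Matches the number printed after "FID:" (plain or scientific notation).
_FID_LINE = re.compile(r"FID:\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def extract_fid(log_text):
    """Return every FID value found in the log text, in order of appearance."""
    return [float(m.group(1)) for m in _FID_LINE.finditer(log_text)]


# Usage (file name is an assumption):
#   with open("output.log") as fh:
#       print(extract_fid(fh.read()))
```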

@cindyyyl
Author

cindyyyl commented May 8, 2024

Thank you so much for the help!!

@cindyyyl
Author

cindyyyl commented May 8, 2024

Hi, I hope we can have a meeting for efficiency, since the results of the experiments I run are still incorrect: with a larger N the FID should be smaller, but so far my results show a larger FID for larger N, and I cannot report these results in my work. /cry

best,

@giulio98
Owner

giulio98 commented May 8, 2024

Hi, from your logs it appears that sampling is skipped because checkpoints from a previous run were found. Please clean your logs directory (rm it) and run again.
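
As a rough sketch of that cleanup, the leftover sampling checkpoints can be listed before anything is deleted. The `meta_0_` prefix and the `logs/inr_mnist` location are taken from the logs in this thread; the helper name `remove_sampling_checkpoints` and the dry-run default are assumptions, not part of the repository:

```python
import shutil
from pathlib import Path


def remove_sampling_checkpoints(run_dir, prefix="meta_0_", dry_run=True):
    """List (and optionally delete) entries in run_dir whose name starts with prefix."""
    run_dir = Path(run_dir)
    matched = []
    for path in sorted(run_dir.glob(prefix + "*")):
        matched.append(path.name)
        if dry_run:
            continue  # only report what would be removed
        if path.is_dir():
            shutil.rmtree(path)  # checkpoint stored as a directory
        else:
            path.unlink()  # checkpoint stored as a single file
    return matched


# Usage (path is an assumption, adjust to your LOGS_ROOT):
#   print(remove_sampling_checkpoints("logs/inr_mnist"))            # dry run
#   remove_sampling_checkpoints("logs/inr_mnist", dry_run=False)    # delete
```

Running it first with the default `dry_run=True` shows which entries would be removed, which helps avoid deleting the model checkpoints kept elsewhere in the run directory.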

@cindyyyl
Author

cindyyyl commented May 8, 2024

I think it should be here?
image

Should it be logs_test, not logs?

@giulio98
Owner

giulio98 commented May 8, 2024

I get an FID score of 1.28 using our config.

@giulio98
Owner

giulio98 commented May 8, 2024

Please rm your mnist_stats, since it could have been broken before the changes I made.

@cindyyyl
Author

cindyyyl commented May 8, 2024

Emmm, I am a little confused about this: "Please note that you will get a higher FID score because the checkpoint we provided does not include y-corrupted data at the input of the INR."

Also, since I've set up a new environment and performed a fresh Git clone, I ran everything from the beginning. Could this issue be because I did not run the training script 'sh scripts/maml/train_mnist.sh'?

@cindyyyl
Author

cindyyyl commented May 8, 2024

I see, but this time I created everything from scratch, both the environment and the git repo, so this should not apply: "Please rm your mnist_stats since could be broken before the changes i made"
image
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$

@cindyyyl
Author

cindyyyl commented May 8, 2024

I still think the problem is in how I calculate the FID, not in the sampling.
image

@giulio98
Owner

giulio98 commented May 8, 2024

Please recalculate mnist_stats.npz
and reload the dataset:
rm -rf ./data

You should get FID 1.28.
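
For context on why stale statistics can change the score: the standard FID is the Fréchet distance between two Gaussians fitted to feature vectors, here the penultimate-layer LeNet5 features mentioned earlier in the thread. The symbols below are introduced only for illustration, and it is an assumption that mnist_stats.npz caches the reference mean and covariance:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Here $\mu_r, \Sigma_r$ are the mean and covariance of the features of real MNIST images and $\mu_g, \Sigma_g$ those of the generated samples. If the cached reference pair was computed with a different preprocessing or feature extractor than the one applied to the samples, the score can be large even when the samples look fine.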

@cindyyyl
Author

cindyyyl commented May 8, 2024

Hi, it is still the same \cry
image
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$ sh scripts/maml/eval_mnist.sh
2024-05-08 12:21:25.670212: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[2024-05-08 12:21:31,422][HYDRA] Launching 1 jobs locally
[2024-05-08 12:21:31,422][HYDRA] #0 : +experiments_maml=eval_mnist
[2024-05-08 12:21:31,574][main][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
[2024-05-08 12:21:32,447][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:21:32,448][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:21:32,462][absl][INFO] - Load dataset info from /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-08 12:21:32,464][main][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
[2024-05-08 12:21:32,471][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:21:32,475][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:21:32,476][absl][INFO] - Load dataset info from /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-08 12:21:32,483][main][INFO] - Instantiating <functional_diffusion_processes.models.mlp_modulation.MLPModulationLR>
[2024-05-08 12:21:32,594][main][INFO] - Instantiating <functional_diffusion_processes.sdetools.heat_subvp_sde.HeatSubVPSDE>
[2024-05-08 12:21:32,612][absl][INFO] - Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
[2024-05-08 12:21:32,729][absl][INFO] - Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter CUDA Host
[2024-05-08 12:21:32,730][absl][INFO] - Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
[2024-05-08 12:21:32,736][absl][INFO] - Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
[2024-05-08 12:21:33,412][main][INFO] - Instantiating <functional_diffusion_processes.samplers.correctors.langevin_corrector.LangevinCorrector>
[2024-05-08 12:21:33,432][main][INFO] - Instantiating <functional_diffusion_processes.samplers.predictors.euler_predictor.EulerMaruyamaPredictor>
[2024-05-08 12:21:33,436][main][INFO] - Instantiating <functional_diffusion_processes.samplers.pc_sampler.PCSampler>
[2024-05-08 12:21:33,440][main][INFO] - Instantiating <functional_diffusion_processes.losses.mse_loss.MSELoss>
[2024-05-08 12:21:33,450][main][INFO] - Instantiating <functional_diffusion_processes.trainers.trainer.Trainer>
[2024-05-08 12:21:33,508][absl][WARNING] - GlobalAsyncCheckpointManager is not imported correctly. Checkpointing of GlobalDeviceArrays will not be available.To use the feature, install tensorstore.
WARNING:tensorflow:From /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-08 12:21:35,639][tensorflow][WARNING] - From /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-08 12:21:35,664][main][INFO] - Instantiating <functional_diffusion_processes.metrics.fid_metric.FIDMetric>
[2024-05-08 12:21:35,759][main][INFO] - Starting testing!
wandb: Currently logged in as: zxcvzxcv980234. Use wandb login --relogin to force relogin
wandb: wandb version 0.17.0 is available! To upgrade, please run:
wandb: $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.14.0
wandb: Run data is saved locally in /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/wandb/run-20240508_122135-lk25x14t
wandb: Run wandb offline to turn off syncing.
wandb: Syncing run inr_mnist
wandb: ⭐️ View project at https://wandb.ai/zxcvzxcv980234/final
wandb: 🚀 View run at https://wandb.ai/zxcvzxcv980234/final/runs/lk25x14t
[2024-05-08 12:21:47,063][functional_diffusion_processes.trainers.trainer][INFO] - Total number of parameters: 0.12M
[2024-05-08 12:21:47,314][functional_diffusion_processes.trainers.trainer][WARNING] - Resuming training from the latest checkpoint.
[2024-05-08 12:21:47,315][absl][INFO] - Restoring checkpoint from /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/checkpoints/checkpoint_27
/cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/jax/src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
warnings.warn(
[2024-05-08 12:21:47,399][absl][INFO] - Found no checkpoint files in /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist with prefix meta_0

[2024-05-08 12:21:47,400][functional_diffusion_processes.trainers.trainer][INFO] - Starting sampling loop at step 0.
0%| | 0/32 [00:00<?, ?it/s][2024-05-08 12:21:47,407][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 0
/cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/tensorflow/python/util/nest.py:917: UserWarning: tf.layers.flatten is deprecated and will be removed in a future version. Please use tf.keras.layers.Flatten instead.
structure[0], [func(*x) for x in entries],
/cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/keras/legacy_tf_layers/base.py:627: UserWarning: layer.updates will be removed in a future version. This property should not be used in TensorFlow 2.0, as updates are applied automatically.
self.updates, tf.compat.v1.GraphKeys.UPDATE_OPS
[2024-05-08 12:23:06,841][absl][INFO] - Saving checkpoint at step: 0
[2024-05-08 12:23:06,869][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_0
3%|███ | 1/32 [01:19<41:03, 79.46s/it][2024-05-08 12:23:06,871][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 1
[2024-05-08 12:23:19,436][absl][INFO] - Saving checkpoint at step: 1
[2024-05-08 12:23:19,461][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_1
[2024-05-08 12:23:19,465][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_0
6%|██████ | 2/32 [01:32<20:04, 40.13s/it][2024-05-08 12:23:19,475][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 2
[2024-05-08 12:23:32,672][absl][INFO] - Saving checkpoint at step: 2
[2024-05-08 12:23:32,695][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_2
[2024-05-08 12:23:32,698][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_1
9%|█████████ | 3/32 [01:45<13:27, 27.85s/it][2024-05-08 12:23:32,707][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 3
[2024-05-08 12:23:44,812][absl][INFO] - Saving checkpoint at step: 3
[2024-05-08 12:23:44,842][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_3
[2024-05-08 12:23:44,844][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_2
12%|████████████ | 4/32 [01:57<10:06, 21.65s/it][2024-05-08 12:23:44,854][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 4
[2024-05-08 12:23:57,303][absl][INFO] - Saving checkpoint at step: 4
[2024-05-08 12:23:57,325][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_4
[2024-05-08 12:23:57,326][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_3
16%|███████████████ | 5/32 [02:09<08:15, 18.34s/it][2024-05-08 12:23:57,336][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 5
[2024-05-08 12:24:09,414][absl][INFO] - Saving checkpoint at step: 5
[2024-05-08 12:24:09,437][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_5
[2024-05-08 12:24:09,439][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_4
19%|██████████████████ | 6/32 [02:22<07:01, 16.23s/it][2024-05-08 12:24:09,449][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 6
[2024-05-08 12:24:22,034][absl][INFO] - Saving checkpoint at step: 6
[2024-05-08 12:24:22,057][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_6
[2024-05-08 12:24:22,058][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_5
22%|█████████████████████ | 7/32 [02:34<06:16, 15.05s/it][2024-05-08 12:24:22,069][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 7
[2024-05-08 12:24:34,772][absl][INFO] - Saving checkpoint at step: 7
[2024-05-08 12:24:34,802][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_7
[2024-05-08 12:24:34,803][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_6
25%|████████████████████████ | 8/32 [02:47<05:43, 14.31s/it][2024-05-08 12:24:34,813][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 8
[2024-05-08 12:24:47,208][absl][INFO] - Saving checkpoint at step: 8
[2024-05-08 12:24:47,230][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_8
[2024-05-08 12:24:47,231][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_7
28%|███████████████████████████ | 9/32 [02:59<05:15, 13.73s/it][2024-05-08 12:24:47,244][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 9
[2024-05-08 12:24:59,672][absl][INFO] - Saving checkpoint at step: 9
[2024-05-08 12:24:59,699][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_9
[2024-05-08 12:24:59,700][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_8
31%|█████████████████████████████▋ | 10/32 [03:12<04:53, 13.34s/it][2024-05-08 12:24:59,712][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 10
[2024-05-08 12:25:11,951][absl][INFO] - Saving checkpoint at step: 10
[2024-05-08 12:25:11,976][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_10
[2024-05-08 12:25:11,978][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_9
34%|████████████████████████████████▋ | 11/32 [03:24<04:33, 13.01s/it][2024-05-08 12:25:11,986][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 11
[2024-05-08 12:25:24,033][absl][INFO] - Saving checkpoint at step: 11
[2024-05-08 12:25:24,055][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_11
[2024-05-08 12:25:24,056][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_10
38%|███████████████████████████████████▋ | 12/32 [03:36<04:14, 12.73s/it][2024-05-08 12:25:24,068][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 12
[2024-05-08 12:25:36,608][absl][INFO] - Saving checkpoint at step: 12
[2024-05-08 12:25:36,641][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_12
[2024-05-08 12:25:36,643][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_11
41%|██████████████████████████████████████▌ | 13/32 [03:49<04:01, 12.69s/it][2024-05-08 12:25:36,653][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 13
[2024-05-08 12:25:48,740][absl][INFO] - Saving checkpoint at step: 13
[2024-05-08 12:25:48,763][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_13
[2024-05-08 12:25:48,764][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_12
44%|█████████████████████████████████████████▌ | 14/32 [04:01<03:45, 12.51s/it][2024-05-08 12:25:48,774][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 14
[2024-05-08 12:26:00,879][absl][INFO] - Saving checkpoint at step: 14
[2024-05-08 12:26:00,908][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_14
[2024-05-08 12:26:00,909][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_13
47%|████████████████████████████████████████████▌ | 15/32 [04:13<03:30, 12.40s/it][2024-05-08 12:26:00,919][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 15
[2024-05-08 12:26:13,237][absl][INFO] - Saving checkpoint at step: 15
[2024-05-08 12:26:13,259][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_15
[2024-05-08 12:26:13,261][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_14
50%|███████████████████████████████████████████████▌ | 16/32 [04:25<03:18, 12.39s/it][2024-05-08 12:26:13,271][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 16
[2024-05-08 12:26:25,451][absl][INFO] - Saving checkpoint at step: 16
[2024-05-08 12:26:25,472][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_16
[2024-05-08 12:26:25,474][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_15
53%|██████████████████████████████████████████████████▍ | 17/32 [04:38<03:05, 12.33s/it][2024-05-08 12:26:25,483][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 17
[2024-05-08 12:26:37,780][absl][INFO] - Saving checkpoint at step: 17
[2024-05-08 12:26:37,806][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_17
[2024-05-08 12:26:37,808][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_16
56%|█████████████████████████████████████████████████████▍ | 18/32 [04:50<02:52, 12.34s/it][2024-05-08 12:26:37,818][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 18
[2024-05-08 12:26:50,356][absl][INFO] - Saving checkpoint at step: 18
[2024-05-08 12:26:50,377][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_18
[2024-05-08 12:26:50,379][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_17
59%|████████████████████████████████████████████████████████▍ | 19/32 [05:02<02:41, 12.41s/it][2024-05-08 12:26:50,389][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 19
[2024-05-08 12:27:02,621][absl][INFO] - Saving checkpoint at step: 19
[2024-05-08 12:27:02,643][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_19
[2024-05-08 12:27:02,645][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_18
62%|███████████████████████████████████████████████████████████▍ | 20/32 [05:15<02:28, 12.36s/it][2024-05-08 12:27:02,654][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 20
[2024-05-08 12:27:14,775][absl][INFO] - Saving checkpoint at step: 20
[2024-05-08 12:27:14,809][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_20
[2024-05-08 12:27:14,816][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_19
66%|██████████████████████████████████████████████████████████████▎ | 21/32 [05:27<02:15, 12.30s/it][2024-05-08 12:27:14,820][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21
[2024-05-08 12:27:26,709][absl][INFO] - Saving checkpoint at step: 21
[2024-05-08 12:27:26,738][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_21
[2024-05-08 12:27:26,740][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_20
69%|█████████████████████████████████████████████████████████████████▎ | 22/32 [05:39<02:01, 12.19s/it][2024-05-08 12:27:26,748][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22
[2024-05-08 12:27:38,806][absl][INFO] - Saving checkpoint at step: 22
[2024-05-08 12:27:38,829][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_22
[2024-05-08 12:27:38,831][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_21
72%|████████████████████████████████████████████████████████████████████▎ | 23/32 [05:51<01:49, 12.16s/it][2024-05-08 12:27:38,840][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23
[2024-05-08 12:27:51,517][absl][INFO] - Saving checkpoint at step: 23
[2024-05-08 12:27:51,539][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_23
[2024-05-08 12:27:51,540][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_22
75%|███████████████████████████████████████████████████████████████████████▎ | 24/32 [06:04<01:38, 12.33s/it][2024-05-08 12:27:51,552][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24
[2024-05-08 12:28:03,336][absl][INFO] - Saving checkpoint at step: 24
[2024-05-08 12:28:03,382][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_24
[2024-05-08 12:28:03,384][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_23
78%|██████████████████████████████████████████████████████████████████████████▏ | 25/32 [06:15<01:25, 12.18s/it][2024-05-08 12:28:03,394][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25
[2024-05-08 12:28:15,250][absl][INFO] - Saving checkpoint at step: 25
[2024-05-08 12:28:15,272][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_25
[2024-05-08 12:28:15,274][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_24
81%|█████████████████████████████████████████████████████████████████████████████▏ | 26/32 [06:27<01:12, 12.09s/it][2024-05-08 12:28:15,283][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26
[2024-05-08 12:28:27,419][absl][INFO] - Saving checkpoint at step: 26
[2024-05-08 12:28:27,440][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_26
[2024-05-08 12:28:27,442][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_25
84%|████████████████████████████████████████████████████████████████████████████████▏ | 27/32 [06:40<01:00, 12.12s/it][2024-05-08 12:28:27,451][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27
[2024-05-08 12:28:39,309][absl][INFO] - Saving checkpoint at step: 27
[2024-05-08 12:28:39,340][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_27
[2024-05-08 12:28:39,342][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_26
88%|███████████████████████████████████████████████████████████████████████████████████▏ | 28/32 [06:51<00:48, 12.05s/it][2024-05-08 12:28:39,351][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28
[2024-05-08 12:28:51,955][absl][INFO] - Saving checkpoint at step: 28
[2024-05-08 12:28:51,977][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_28
[2024-05-08 12:28:51,980][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_27
91%|██████████████████████████████████████████████████████████████████████████████████████ | 29/32 [07:04<00:36, 12.23s/it][2024-05-08 12:28:51,989][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29
[2024-05-08 12:29:04,399][absl][INFO] - Saving checkpoint at step: 29
[2024-05-08 12:29:04,421][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_29
[2024-05-08 12:29:04,423][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_28
94%|█████████████████████████████████████████████████████████████████████████████████████████ | 30/32 [07:17<00:24, 12.30s/it][2024-05-08 12:29:04,458][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30
[2024-05-08 12:29:16,601][absl][INFO] - Saving checkpoint at step: 30
[2024-05-08 12:29:16,623][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_30
[2024-05-08 12:29:16,625][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_29
97%|████████████████████████████████████████████████████████████████████████████████████████████ | 31/32 [07:29<00:12, 12.26s/it][2024-05-08 12:29:16,636][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [07:41<00:00, 12.33s/it][2024-05-08 12:29:29,178][functional_diffusion_processes.trainers.trainer][INFO] - FID: 7.572990e+01
[2024-05-08 12:29:29,179][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00
wandb: Waiting for W&B process to finish... (success).
wandb: \ 39.926 MB of 39.966 MB uploaded (0.000 MB deduped)
wandb: Run history:
wandb: FID ▁
wandb: inception score ▁
wandb:
wandb: Run summary:
wandb: FID 75.7299
wandb: inception score -1.0
wandb:
wandb: 🚀 View run inr_mnist at: https://wandb.ai/zxcvzxcv980234/final/runs/lk25x14t
wandb: Synced 6 W&B file(s), 32 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240508_122135-lk25x14t/logs
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [07:45<00:00, 14.56s/it]
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$

@giulio98
Owner

giulio98 commented May 8, 2024

Your logs don't include the steps where mnist_stats.npz is calculated, which means a precomputed file is being reused. Did you remove it?
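One way to check whether a precomputed stats file is present, and what it holds, is to list its arrays. This is a minimal sketch, not code from the repository: the helper name summarize_stats is hypothetical, and the only assumption about the file is that it is a NumPy .npz archive (the FIDMetric code saves the real features under the key pool_3).

```python
import os

import numpy as np


def summarize_stats(path: str) -> dict:
    """Map each array name in an .npz stats file to its shape.

    Returns an empty dict if the file does not exist, which would mean
    the stats have to be recomputed on the next run.
    """
    if not os.path.exists(path):
        return {}
    with np.load(path) as data:
        return {name: data[name].shape for name in data.files}
```

Calling summarize_stats on the stats path from the metric config shows whether a cached file exists and how many feature vectors it contains.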

@giulio98
Owner

giulio98 commented May 8, 2024

Please run this:
rm -rf ./data
It will delete the MNIST dataset and the stats. Then rerun the script.
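If deleting the whole ./data directory is too heavy-handed (it also forces MNIST to be downloaded again), a narrower option is to remove only the cached statistics files. The following is a minimal sketch under assumptions: the helper name clear_cached_stats is hypothetical and not part of the repository, and the default directory ./data/stats is a guess that should be replaced with the real_features_path from your config.

```python
from pathlib import Path


def clear_cached_stats(stats_dir: str = "./data/stats", dry_run: bool = True) -> list[str]:
    """Return the cached *.npz files under stats_dir, deleting them unless dry_run is True."""
    root = Path(stats_dir)
    if not root.is_dir():
        # Nothing cached: the stats will be recomputed on the next run anyway.
        return []
    matched = []
    for path in sorted(root.glob("*.npz")):
        if not dry_run:
            path.unlink()
        matched.append(str(path))
    return matched


if __name__ == "__main__":
    # Dry run first: only report what would be deleted.
    for name in clear_cached_stats(dry_run=True):
        print("would remove", name)
```

Running it with dry_run=True first makes it easy to confirm that only the stats files are matched before anything is deleted.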

@cindyyyl
Author

cindyyyl commented May 8, 2024

Yes, I removed all the .npz files before running a new FID evaluation. I think the problem may be in this part of the code, since I remember you fixed something about jax .... last time. Could that affect it here?
class FIDMetric:
    """Class for computing the Frechet Inception Distance (FID) metric.

    This class facilitates the computation of the FID metric, which measures the similarity between two distributions of images.
    It precomputes features for the real dataset using a specified Inception feature extractor and provides methods to compute
    and store features for generated images, and to compute the FID and Inception Score (IS).

    Attributes:
        metric_config (DictConfig): Configuration parameters for the FID metric.
        feature_extractor (InceptionFeatureExtractor): Inception feature extractor for computing the FID metric.
        dataset (BaseDataset): Dataset object providing real samples for FID computation.
        generated_pools (list): List to store features of generated images.
        generated_logits (list): List to store logits of generated images.
        real_features (dict): Dictionary to store precomputed features of real dataset.
    """

    def __init__(
        self,
        metric_config: DictConfig,
        feature_extractor: InceptionFeatureExtractor,
        dataset: BaseDataset,
    ) -> None:
        """Initializes the FIDMetric class with specified configurations, feature extractor, and dataset.

        Args:
            metric_config (DictConfig): Configuration parameters for the FID metric.
            feature_extractor (InceptionFeatureExtractor): Inception feature extractor for computing the FID metric.
            dataset (BaseDataset): Dataset object providing real samples for FID computation.
        """
        self.metric_config = metric_config
        self.feature_extractor = feature_extractor
        self.dataset = dataset
        self.generated_pools = []
        self.generated_logits = []
        try:
            self.real_features = load_dataset_stats(
                save_path=metric_config.real_features_path,
                dataset_name=metric_config.dataset_name,
            )
        except FileNotFoundError:
            self._precompute_features(
                dataset_name=metric_config.dataset_name,
                save_path=metric_config.real_features_path,
            )
            self.real_features = load_dataset_stats(
                save_path=metric_config.real_features_path,
                dataset_name=metric_config.dataset_name,
            )

    def _precompute_features(self, dataset_name: str, save_path: str) -> None:
        """Precomputes and saves features for the real dataset.

        Args:
            dataset_name (str): Name of the dataset.
            save_path (str): Path where the computed features will be saved.
        """
        tf.io.gfile.makedirs(path=save_path)

        tf.io.gfile.makedirs(os.path.join(save_path, f"{dataset_name.lower()}_clean"))

        # Use the feature extractor to compute features for the real dataset
        all_pools = self.feature_extractor.extract_features(
            dataset=self.dataset, save_path=save_path, dataset_name=dataset_name
        )

        # Save latent represents of the Inception network to disk or Google Cloud Storage
        filename = f"{dataset_name.lower()}_stats.npz"

        if jax.host_id() == 0:
            pylogger.info("Saving real dataset stats to: %s" % os.path.join(save_path, filename))

        with tf.io.gfile.GFile(os.path.join(save_path, filename), "wb") as f_out:
            io_buffer = io.BytesIO()
            np.savez_compressed(io_buffer, pool_3=all_pools)
            f_out.write(io_buffer.getvalue())

    def compute_fid(self, eval_dir, num_sampling_round) -> Tuple[float, float]:
        """Computes the FID and Inception Score (IS) for the generated and real images.

        Args:
            eval_dir (str): Directory path for evaluation.
            num_sampling_round (int): Number of sampling rounds.

        Returns:
            Tuple[float, float]: A tuple containing the FID and Inception Score.
        """
        real_pools = self.real_features["pool_3"]
        if not self.feature_extractor.inception_v3 and not self.feature_extractor.inception_v3 == "lenet":
            if len(self.generated_logits) == 0 or len(self.generated_pools) == 0:
                if jax.host_id() == 0:
                    # Load all statistics that have been previously computed and saved for each host
                    for host in range(jax.host_count()):
                        stats = tf.io.gfile.glob(os.path.join(eval_dir, "statistics_*.npz"))
                        wait_message = False
                        while len(stats) < num_sampling_round:
                            if not wait_message:
                                print("Waiting for statistics on host %d" % (host,))
                                wait_message = True
                            stats = tf.io.gfile.glob(os.path.join(eval_dir, "statistics_*.npz"))
                            time.sleep(10)

                        for stat_file in stats:
                            with tf.io.gfile.GFile(stat_file, "rb") as fin:
                                stat = np.load(fin)

                                self.generated_pools.append(stat["pool_3"])
                                self.generated_logits.append(stat["logits"])

            all_logits = np.concatenate(self.generated_logits, axis=0)[: self.metric_config.num_samples]
            inception_score = tfgan.eval.classifier_score_from_logits(logits=all_logits)
        else:
            inception_score = -1

        all_pools = np.concatenate(self.generated_pools, axis=0)[: self.metric_config.num_samples]

        fid = tfgan.eval.frechet_classifier_distance_from_activations(activations1=real_pools, activations2=all_pools)

        return fid, inception_score
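For reference, the quantity computed in the last lines of compute_fid is the Fréchet distance between Gaussians fitted to the real and generated activations: the squared distance between the means plus Tr(S_r + S_g - 2 (S_r S_g)^(1/2)). A minimal NumPy sketch of that formula is given here as a sanity check. It is an illustration only, not the tensorflow_gan implementation, and the function name frechet_distance is hypothetical.

```python
import numpy as np


def frechet_distance(act1: np.ndarray, act2: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to two (n_samples, n_features) arrays."""
    mu1, mu2 = act1.mean(axis=0), act2.mean(axis=0)
    sigma1 = np.atleast_2d(np.cov(act1, rowvar=False))
    sigma2 = np.atleast_2d(np.cov(act2, rowvar=False))
    # Tr((S1 S2)^(1/2)) is the sum of the square roots of the eigenvalues of S1 S2,
    # which are real and non-negative when S1 and S2 are positive semi-definite.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)
```

Feeding the same array twice should give a value close to zero, which makes this a quick way to test whether the real and generated feature arrays are being loaded as expected.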

@cindyyyl
Author

cindyyyl commented May 8, 2024

[Two screenshots attached]

@cindyyyl
Author

cindyyyl commented May 8, 2024

I will run rm -rf ./data to delete the MNIST dataset and the stats, then rerun the script, as you suggested.
Thank you!

@cindyyyl
Author

cindyyyl commented May 8, 2024

I think this time it may be on the right track! Thank you!

@cindyyyl
Author

cindyyyl commented May 8, 2024

It still happens. I cleared the *.npz files in both inr_mnist and meta_0_30 and ran rm -rf ./data:
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$ sh scripts/maml/eval_mnist.sh
2024-05-08 12:58:52.177511: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[2024-05-08 12:58:58,048][HYDRA] Launching 1 jobs locally
[2024-05-08 12:58:58,048][HYDRA] #0 : +experiments_maml=eval_mnist
[2024-05-08 12:58:58,198][main][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
[2024-05-08 12:58:59,062][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:58:59,062][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:58:59,077][main][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
[2024-05-08 12:58:59,081][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:58:59,084][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-08 12:58:59,085][main][INFO] - Instantiating <functional_diffusion_processes.models.mlp_modulation.MLPModulationLR>
[2024-05-08 12:58:59,199][main][INFO] - Instantiating <functional_diffusion_processes.sdetools.heat_subvp_sde.HeatSubVPSDE>
[2024-05-08 12:58:59,219][absl][INFO] - Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
[2024-05-08 12:58:59,337][absl][INFO] - Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Host Interpreter CUDA
[2024-05-08 12:58:59,337][absl][INFO] - Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
[2024-05-08 12:58:59,344][absl][INFO] - Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
[2024-05-08 12:59:00,007][main][INFO] - Instantiating <functional_diffusion_processes.samplers.correctors.langevin_corrector.LangevinCorrector>
[2024-05-08 12:59:00,028][main][INFO] - Instantiating <functional_diffusion_processes.samplers.predictors.euler_predictor.EulerMaruyamaPredictor>
[2024-05-08 12:59:00,033][main][INFO] - Instantiating <functional_diffusion_processes.samplers.pc_sampler.PCSampler>
[2024-05-08 12:59:00,037][main][INFO] - Instantiating <functional_diffusion_processes.losses.mse_loss.MSELoss>
[2024-05-08 12:59:00,048][main][INFO] - Instantiating <functional_diffusion_processes.trainers.trainer.Trainer>
[2024-05-08 12:59:00,082][absl][WARNING] - GlobalAsyncCheckpointManager is not imported correctly. Checkpointing of GlobalDeviceArrays will not be available.To use the feature, install tensorstore.
WARNING:tensorflow:From /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-08 12:59:02,223][tensorflow][WARNING] - From /cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-08 12:59:02,246][main][INFO] - Instantiating <functional_diffusion_processes.metrics.fid_metric.FIDMetric>
[2024-05-08 12:59:02,358][functional_diffusion_processes.metrics.feature_extractor][INFO] - Extracting features from dataset...
[2024-05-08 12:59:02,359][absl][INFO] - Generating dataset mnist (/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1)
Downloading and preparing dataset Unknown size (download: Unknown size, generated: Unknown size, total: Unknown size) to /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1...
2024-05-08 12:59:02.460464: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "NOT_FOUND: Could not locate the credentials file.". Retrieving token from GCE failed with "FAILED_PRECONDITION: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata".
Dl Completed...: 0 url [00:00, ? url/s] [2024-05-08 13:00:28,013][absl][INFO] - Downloading https://storage.googleapis.com/cvdf-datasets/mnist/t10k-images-idx3-ubyte.gz into /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/downloads/cvdf-datasets_mnist_t10k-images-idx3-ubytedDnaEPiC58ZczHNOp6ks9L4_JLids_rpvUj38kJNGMc.gz.tmp.740d2973d3604c8fbd79ec6edc1c10c4...
Dl Completed...: 0%| [2024-05-08 13:00:28,049][absl][INFO] - Downloading https://storage.googleapis.com/cvdf-datasets/mnist/t10k-labels-idx1-ubyte.gz into /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/downloads/cvdf-datasets_mnist_t10k-labels-idx1-ubyte4Mqf5UL1fRrpd5pIeeAh8c8ZzsY2gbIPBuKwiyfSD_I.gz.tmp.0728e263677c40f09040a51d0f978486...
Dl Completed...: 0%| [2024-05-08 13:00:28,060][absl][INFO] - Downloading https://storage.googleapis.com/cvdf-datasets/mnist/train-images-idx3-ubyte.gz into /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/downloads/cvdf-datasets_mnist_train-images-idx3-ubyteJAsxAi0QnOBEygBw_XW2X7zp-LBZAIqqYSHN8ru4ZO4.gz.tmp.9d4a633e123642f6a031084df424687b...
Dl Completed...: 0%| [2024-05-08 13:00:28,072][absl][INFO] - Downloading https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz into /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/downloads/cvdf-datasets_mnist_train-labels-idx1-ubytedcDWkl3FO9T-WMEH1f1Xt51eIRmePRIMAk6X147Qw8w.gz.tmp.28ed7d4ef4dc4d159d2f1b8b3f17f8b6...
Extraction completed...: 100%|█████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00, 2.24 file/s]
Dl Size...: 100%|█████████████████████████████████████████████████████████████████████████████████| 10/10 [00:01<00:00, 5.60 MiB/s]
Dl Completed...: 100%|██████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00, 2.24 url/s]
Generating splits...: 0%| | 0/2 [00:00<?, ? splits/s[2024-05-08 13:00:39,848][absl][INFO] - Done writing /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1.incompleteB351A2/mnist-train.tfrecord*. Number of examples: 60000 (shards: [60000])
Generating splits...: 50%|███████████████████████████████████ | 1/2 [00:10<00:10, 10.07s/ splits[2024-05-08 13:00:41,542][absl][INFO] - Done writing /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1.incompleteB351A2/mnist-test.tfrecord*. Number of examples: 10000 (shards: [10000])
Dataset mnist downloaded and prepared to /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1. Subsequent calls will reuse this data.
[2024-05-08 13:00:41,610][absl][INFO] - Constructing tf.data.Dataset mnist for split test, from /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
WARNING:tensorflow:AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7f2db42ed120>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-08 13:00:43,039][tensorflow][WARNING] - AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7f2db42ed120>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-08 13:00:42,288][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Converting image to range [0,1]...
[2024-05-08 13:00:43,141][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Resizing image to size 32...
[2024-05-08 13:00:43,159][functional_diffusion_processes.datasets.image_dataset][INFO] - Preprocessing images for split test...
[2024-05-08 13:00:43,175][functional_diffusion_processes.datasets.image_dataset][INFO] - Image reshaped to shape (1024, 1)...
/cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
warnings.warn(
[2024-05-08 13:00:43,429][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 0
/cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/tensorflow/python/util/nest.py:917: UserWarning: tf.layers.flatten is deprecated and will be removed in a future version. Please use tf.keras.layers.Flatten instead.
structure[0], [func(*x) for x in entries],
/cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/keras/legacy_tf_layers/base.py:627: UserWarning: layer.updates will be removed in a future version. This property should not be used in TensorFlow 2.0, as updates are applied automatically.
self.updates, tf.compat.v1.GraphKeys.UPDATE_OPS
[2024-05-08 13:00:53,779][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 1
[2024-05-08 13:01:03,710][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 2
[2024-05-08 13:01:14,135][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 3
[2024-05-08 13:01:24,131][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 4
[2024-05-08 13:01:34,221][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 5
[2024-05-08 13:01:44,314][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 6
[2024-05-08 13:01:54,057][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 7
[2024-05-08 13:02:04,196][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 8
[2024-05-08 13:02:14,321][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 9
[2024-05-08 13:02:24,116][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 10
[2024-05-08 13:02:34,053][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 11
[2024-05-08 13:02:43,856][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 12
[2024-05-08 13:02:53,518][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 13
[2024-05-08 13:03:03,617][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 14
[2024-05-08 13:03:13,475][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 15
[2024-05-08 13:03:23,326][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 16
[2024-05-08 13:03:33,394][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 17
[2024-05-08 13:03:43,111][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 18
[2024-05-08 13:03:53,045][functional_diffusion_processes.metrics.fid_metric][INFO] - Saving real dataset stats to: /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/data/stats/mnist_test_stats.npz
[2024-05-08 13:03:53,270][main][INFO] - Starting testing!
wandb: Currently logged in as: zxcvzxcv980234. Use wandb login --relogin to force relogin
wandb: wandb version 0.17.0 is available! To upgrade, please run:
wandb: $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.14.0
wandb: Run data is saved locally in /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/wandb/run-20240508_130353-9shiirs7
wandb: Run wandb offline to turn off syncing.
wandb: Syncing run inr_mnist
wandb: ⭐️ View project at https://wandb.ai/zxcvzxcv980234/final
wandb: 🚀 View run at https://wandb.ai/zxcvzxcv980234/final/runs/9shiirs7
[2024-05-08 13:04:04,421][functional_diffusion_processes.trainers.trainer][INFO] - Total number of parameters: 0.12M
[2024-05-08 13:04:04,660][functional_diffusion_processes.trainers.trainer][WARNING] - Resuming training from the latest checkpoint.
[2024-05-08 13:04:04,663][absl][INFO] - Restoring checkpoint from /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/checkpoints/checkpoint_27
/cis/home/shuan124/anaconda3/envs/lxxfdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
warnings.warn(
[2024-05-08 13:04:04,746][absl][INFO] - Found no checkpoint files in /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist with prefix meta_0

[2024-05-08 13:04:04,746][functional_diffusion_processes.trainers.trainer][INFO] - Starting sampling loop at step 0.
0%| | 0/32 [00:00<?, ?it/s][2024-05-08 13:04:04,753][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 0
[2024-05-08 13:05:24,172][absl][INFO] - Saving checkpoint at step: 0
[2024-05-08 13:05:24,204][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_0
3%|███ | 1/32 [01:19<41:03, 79.45s/it][2024-05-08 13:05:24,205][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 1
[2024-05-08 13:05:36,189][absl][INFO] - Saving checkpoint at step: 1
[2024-05-08 13:05:36,210][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_1
[2024-05-08 13:05:36,211][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_0
6%|██████ | 2/32 [01:31<19:53, 39.78s/it][2024-05-08 13:05:36,221][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 2
[2024-05-08 13:05:48,381][absl][INFO] - Saving checkpoint at step: 2
[2024-05-08 13:05:48,404][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_2
[2024-05-08 13:05:48,406][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_1
9%|█████████ | 3/32 [01:43<13:08, 27.19s/it][2024-05-08 13:05:48,417][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 3
[2024-05-08 13:06:00,483][absl][INFO] - Saving checkpoint at step: 3
[2024-05-08 13:06:00,512][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_3
[2024-05-08 13:06:00,514][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_2
12%|████████████ | 4/32 [01:55<09:54, 21.23s/it][2024-05-08 13:06:00,523][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 4
[2024-05-08 13:06:12,557][absl][INFO] - Saving checkpoint at step: 4
[2024-05-08 13:06:12,579][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_4
[2024-05-08 13:06:12,580][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_3
16%|███████████████ | 5/32 [02:07<08:04, 17.93s/it][2024-05-08 13:06:12,591][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 5
[2024-05-08 13:06:25,096][absl][INFO] - Saving checkpoint at step: 5
[2024-05-08 13:06:25,122][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_5
[2024-05-08 13:06:25,123][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_4
19%|██████████████████ | 6/32 [02:20<06:58, 16.10s/it][2024-05-08 13:06:25,134][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 6
[2024-05-08 13:06:37,433][absl][INFO] - Saving checkpoint at step: 6
[2024-05-08 13:06:37,454][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_6
[2024-05-08 13:06:37,455][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_5
22%|█████████████████████ | 7/32 [02:32<06:11, 14.87s/it][2024-05-08 13:06:37,465][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 7
[2024-05-08 13:06:49,675][absl][INFO] - Saving checkpoint at step: 7
[2024-05-08 13:06:49,697][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_7
[2024-05-08 13:06:49,698][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_6
25%|████████████████████████ | 8/32 [02:44<05:36, 14.03s/it][2024-05-08 13:06:49,708][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 8
[2024-05-08 13:07:02,048][absl][INFO] - Saving checkpoint at step: 8
[2024-05-08 13:07:02,069][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_8
[2024-05-08 13:07:02,070][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_7
28%|███████████████████████████ | 9/32 [02:57<05:10, 13.51s/it][2024-05-08 13:07:02,080][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 9
[2024-05-08 13:07:14,028][absl][INFO] - Saving checkpoint at step: 9
[2024-05-08 13:07:14,058][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_9
[2024-05-08 13:07:14,059][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_8
31%|█████████████████████████████▋ | 10/32 [03:09<04:46, 13.04s/it][2024-05-08 13:07:14,069][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 10
[2024-05-08 13:07:26,524][absl][INFO] - Saving checkpoint at step: 10
[2024-05-08 13:07:26,546][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_10
[2024-05-08 13:07:26,548][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_9
34%|████████████████████████████████▋ | 11/32 [03:21<04:30, 12.87s/it][2024-05-08 13:07:26,558][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 11
[2024-05-08 13:07:38,804][absl][INFO] - Saving checkpoint at step: 11
[2024-05-08 13:07:38,826][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_11
[2024-05-08 13:07:38,828][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_10
38%|███████████████████████████████████▋ | 12/32 [03:34<04:13, 12.69s/it][2024-05-08 13:07:38,837][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 12
[2024-05-08 13:07:51,029][absl][INFO] - Saving checkpoint at step: 12
[2024-05-08 13:07:51,051][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_12
[2024-05-08 13:07:51,053][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_11
41%|██████████████████████████████████████▌ | 13/32 [03:46<03:58, 12.55s/it][2024-05-08 13:07:51,064][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 13
[2024-05-08 13:08:03,532][absl][INFO] - Saving checkpoint at step: 13
[2024-05-08 13:08:03,554][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_13
[2024-05-08 13:08:03,556][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_12
44%|█████████████████████████████████████████▌ | 14/32 [03:58<03:45, 12.54s/it][2024-05-08 13:08:03,565][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 14
[2024-05-08 13:08:15,668][absl][INFO] - Saving checkpoint at step: 14
[2024-05-08 13:08:15,691][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_14
[2024-05-08 13:08:15,693][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_13
47%|████████████████████████████████████████████▌ | 15/32 [04:10<03:31, 12.42s/it][2024-05-08 13:08:15,704][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 15
[2024-05-08 13:08:28,020][absl][INFO] - Saving checkpoint at step: 15
[2024-05-08 13:08:28,043][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_15
[2024-05-08 13:08:28,044][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_14
50%|███████████████████████████████████████████████▌ | 16/32 [04:23<03:18, 12.40s/it][2024-05-08 13:08:28,054][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 16
[2024-05-08 13:08:40,610][absl][INFO] - Saving checkpoint at step: 16
[2024-05-08 13:08:40,631][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_16
[2024-05-08 13:08:40,633][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_15
53%|██████████████████████████████████████████████████▍ | 17/32 [04:35<03:06, 12.45s/it][2024-05-08 13:08:40,642][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 17
[2024-05-08 13:08:52,806][absl][INFO] - Saving checkpoint at step: 17
[2024-05-08 13:08:52,828][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_17
[2024-05-08 13:08:52,829][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_16
56%|█████████████████████████████████████████████████████▍ | 18/32 [04:48<02:53, 12.38s/it][2024-05-08 13:08:52,840][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 18
[2024-05-08 13:09:05,014][absl][INFO] - Saving checkpoint at step: 18
[2024-05-08 13:09:05,039][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_18
[2024-05-08 13:09:05,043][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_17
59%|████████████████████████████████████████████████████████▍ | 19/32 [05:00<02:40, 12.33s/it][2024-05-08 13:09:05,052][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 19
[2024-05-08 13:09:17,288][absl][INFO] - Saving checkpoint at step: 19
[2024-05-08 13:09:17,317][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_19
[2024-05-08 13:09:17,319][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_18
62%|███████████████████████████████████████████████████████████▍ | 20/32 [05:12<02:27, 12.31s/it][2024-05-08 13:09:17,328][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 20
[2024-05-08 13:09:29,298][absl][INFO] - Saving checkpoint at step: 20
[2024-05-08 13:09:29,329][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_20
[2024-05-08 13:09:29,331][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_19
66%|██████████████████████████████████████████████████████████████▎ | 21/32 [05:24<02:14, 12.22s/it][2024-05-08 13:09:29,341][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21
[2024-05-08 13:09:41,956][absl][INFO] - Saving checkpoint at step: 21
[2024-05-08 13:09:41,979][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_21
[2024-05-08 13:09:41,981][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_20
69%|█████████████████████████████████████████████████████████████████▎ | 22/32 [05:37<02:03, 12.35s/it][2024-05-08 13:09:41,992][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22
[2024-05-08 13:09:54,395][absl][INFO] - Saving checkpoint at step: 22
[2024-05-08 13:09:54,420][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_22
[2024-05-08 13:09:54,422][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_21
72%|████████████████████████████████████████████████████████████████████▎ | 23/32 [05:49<01:51, 12.38s/it][2024-05-08 13:09:54,431][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23
[2024-05-08 13:10:06,786][absl][INFO] - Saving checkpoint at step: 23
[2024-05-08 13:10:06,816][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_23
[2024-05-08 13:10:06,818][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_22
75%|███████████████████████████████████████████████████████████████████████▎ | 24/32 [06:02<01:39, 12.38s/it][2024-05-08 13:10:06,827][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24
[2024-05-08 13:10:19,285][absl][INFO] - Saving checkpoint at step: 24
[2024-05-08 13:10:19,309][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_24
[2024-05-08 13:10:19,311][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_23
78%|██████████████████████████████████████████████████████████████████████████▏ | 25/32 [06:14<01:26, 12.42s/it][2024-05-08 13:10:19,320][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25
[2024-05-08 13:10:31,462][absl][INFO] - Saving checkpoint at step: 25
[2024-05-08 13:10:31,485][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_25
[2024-05-08 13:10:31,487][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_24
81%|█████████████████████████████████████████████████████████████████████████████▏ | 26/32 [06:26<01:14, 12.34s/it][2024-05-08 13:10:31,497][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26
[2024-05-08 13:10:44,044][absl][INFO] - Saving checkpoint at step: 26
[2024-05-08 13:10:44,066][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_26
[2024-05-08 13:10:44,068][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_25
84%|████████████████████████████████████████████████████████████████████████████████▏ | 27/32 [06:39<01:02, 12.42s/it][2024-05-08 13:10:44,078][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27
[2024-05-08 13:10:56,560][absl][INFO] - Saving checkpoint at step: 27
[2024-05-08 13:10:56,590][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_27
[2024-05-08 13:10:56,592][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_26
88%|███████████████████████████████████████████████████████████████████████████████████▏ | 28/32 [06:51<00:49, 12.45s/it][2024-05-08 13:10:56,602][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28
[2024-05-08 13:11:09,017][absl][INFO] - Saving checkpoint at step: 28
[2024-05-08 13:11:09,039][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_28
[2024-05-08 13:11:09,041][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_27
91%|██████████████████████████████████████████████████████████████████████████████████████ | 29/32 [07:04<00:37, 12.45s/it][2024-05-08 13:11:09,050][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29
[2024-05-08 13:11:21,610][absl][INFO] - Saving checkpoint at step: 29
[2024-05-08 13:11:21,633][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_29
[2024-05-08 13:11:21,635][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_28
94%|█████████████████████████████████████████████████████████████████████████████████████████ | 30/32 [07:16<00:24, 12.49s/it][2024-05-08 13:11:21,646][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30
[2024-05-08 13:11:33,831][absl][INFO] - Saving checkpoint at step: 30
[2024-05-08 13:11:33,853][absl][INFO] - Saved checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_30
[2024-05-08 13:11:33,855][absl][INFO] - Removing checkpoint at /cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist/meta_0_29
97%|████████████████████████████████████████████████████████████████████████████████████████████ | 31/32 [07:29<00:12, 12.41s/it][2024-05-08 13:11:33,864][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [07:41<00:00, 12.41s/it][2024-05-08 13:11:46,368][functional_diffusion_processes.trainers.trainer][INFO] - FID: 7.572990e+01
[2024-05-08 13:11:46,368][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00
wandb: Waiting for W&B process to finish... (success).
wandb: \ 39.926 MB of 39.965 MB uploaded (0.000 MB deduped)
wandb: Run history:
wandb: FID ▁
wandb: inception score ▁
wandb:
wandb: Run summary:
wandb: FID 75.7299
wandb: inception score -1.0
wandb:
wandb: 🚀 View run inr_mnist at: https://wandb.ai/zxcvzxcv980234/final/runs/9shiirs7
wandb: Synced 6 W&B file(s), 32 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240508_130353-9shiirs7/logs
100%|███████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [07:46<00:00, 14.57s/it]
(lxxfdp) shuan124@r34:/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes$

@giulio98
Owner

giulio98 commented May 8, 2024

I'm sorry, but I don't understand what the reason could be.
Are you changing the parameters for sampling?
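
One quick way to check this is to diff the config of your run against the reference one. Below is a minimal sketch, assuming both configs have already been loaded as nested Python dicts (for example from the YAML that Hydra saves with a run, if the default output directory is kept). The helper names `flatten` and `diff_configs`, and all keys and values in the example, are hypothetical placeholders and not the project's actual settings:

```python
def flatten(cfg, prefix=""):
    """Flatten a nested dict into {"a.b.c": value} pairs."""
    flat = {}
    for key, value in cfg.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(mine, reference):
    """Return {path: (my_value, reference_value)} for every key that differs."""
    a, b = flatten(mine), flatten(reference)
    return {
        key: (a.get(key, "<missing>"), b.get(key, "<missing>"))
        for key in sorted(set(a) | set(b))
        if a.get(key, "<missing>") != b.get(key, "<missing>")
    }


# Hypothetical configs, for illustration only.
mine = {"sampler": {"steps": 500, "snr": 0.2}, "seed": 42}
reference = {"sampler": {"steps": 1000, "snr": 0.2}, "seed": 42}

for path, (my_value, ref_value) in diff_configs(mine, reference).items():
    print(f"{path}: mine={my_value!r} reference={ref_value!r}")
```

Any key printed by the loop is a setting that differs between the two runs, which narrows down where to look first.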

If I run the code using our config file, I get the following:

(fdp) corallo@atlas1:~/PycharmProjects/functional-diffusion-processes$ sh scripts/maml/eval_mnist.sh
2024-05-05 15:39:17.027317: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[2024-05-05 15:39:19,697][HYDRA] Launching 1 jobs locally
[2024-05-05 15:39:19,697][HYDRA]        #0 : +experiments_maml=eval_mnist
[2024-05-05 15:39:19,880][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
  warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,082][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,087][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,089][__main__][INFO] - Instantiating <functional_diffusion_processes.datasets.mnist_dataset.MNISTDataset>
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][WARNING] - options.experimental_threading is deprecated. Use options.threading instead.
[2024-05-05 15:39:20,092][absl][INFO] - Load dataset info from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
[2024-05-05 15:39:20,093][__main__][INFO] - Instantiating <functional_diffusion_processes.models.mlp_modulation.MLPModulationLR>
[2024-05-05 15:39:20,127][__main__][INFO] - Instantiating <functional_diffusion_processes.sdetools.heat_subvp_sde.HeatSubVPSDE>
[2024-05-05 15:39:20,209][absl][INFO] - Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker: 
[2024-05-05 15:39:20,776][absl][INFO] - Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter Host CUDA
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
[2024-05-05 15:39:20,777][absl][INFO] - Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
[2024-05-05 15:39:21,560][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.correctors.langevin_corrector.LangevinCorrector>
[2024-05-05 15:39:21,564][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.predictors.euler_predictor.EulerMaruyamaPredictor>
[2024-05-05 15:39:21,568][__main__][INFO] - Instantiating <functional_diffusion_processes.samplers.pc_sampler.PCSampler>
[2024-05-05 15:39:21,571][__main__][INFO] - Instantiating <functional_diffusion_processes.losses.mse_loss.MSELoss>
[2024-05-05 15:39:21,573][__main__][INFO] - Instantiating <functional_diffusion_processes.trainers.trainer.Trainer>
[2024-05-05 15:39:21,591][absl][WARNING] - GlobalAsyncCheckpointManager is not imported correctly. Checkpointing of GlobalDeviceArrays will not be available.To use the feature, install tensorstore.
WARNING:tensorflow:From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,457][tensorflow][WARNING] - From /home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow_gan/python/estimator/tpu_gan_estimator.py:42: The name tf.estimator.tpu.TPUEstimator is deprecated. Please use tf.compat.v1.estimator.tpu.TPUEstimator instead.

[2024-05-05 15:39:23,465][__main__][INFO] - Instantiating <functional_diffusion_processes.metrics.fid_metric.FIDMetric>
[2024-05-05 15:39:23,604][functional_diffusion_processes.metrics.feature_extractor][INFO] - Extracting features from dataset...
[2024-05-05 15:39:23,605][absl][INFO] - Reusing dataset mnist (/home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1)
[2024-05-05 15:39:23,640][absl][INFO] - Constructing tf.data.Dataset mnist for split test, from /home/corallo/PycharmProjects/functional-diffusion-processes/data/tensorflow_datasets/mnist/3.0.1
WARNING:tensorflow:AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:25,504][tensorflow][WARNING] - AutoGraph could not transform <bound method PercentStyle._format of <logging.PercentStyle object at 0x7fb89752cb80>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: 'NoneType' object has no attribute '_fields'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
[2024-05-05 15:39:24,513][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Converting image to range [0,1]...
[2024-05-05 15:39:25,637][functional_diffusion_processes.datasets.mnist_dataset][INFO] - Resizing image to size 32...
[2024-05-05 15:39:25,661][functional_diffusion_processes.datasets.image_dataset][INFO] - Preprocessing images for split test...
[2024-05-05 15:39:25,682][functional_diffusion_processes.datasets.image_dataset][INFO] - Image reshaped to shape (1024, 1)...
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:25,912][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 0
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/tensorflow/python/util/nest.py:917: UserWarning: `tf.layers.flatten` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Flatten` instead.
  structure[0], [func(*x) for x in entries],
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/keras/legacy_tf_layers/base.py:627: UserWarning: `layer.updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  self.updates, tf.compat.v1.GraphKeys.UPDATE_OPS
[2024-05-05 15:39:27,005][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 1
[2024-05-05 15:39:27,542][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 2
[2024-05-05 15:39:28,074][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 3
[2024-05-05 15:39:28,655][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 4
[2024-05-05 15:39:29,643][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 5
[2024-05-05 15:39:30,365][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 6
[2024-05-05 15:39:30,861][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 7
[2024-05-05 15:39:31,510][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 8
[2024-05-05 15:39:32,186][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 9
[2024-05-05 15:39:32,710][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 10
[2024-05-05 15:39:33,367][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 11
[2024-05-05 15:39:34,191][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 12
[2024-05-05 15:39:34,736][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 13
[2024-05-05 15:39:35,238][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 14
[2024-05-05 15:39:35,738][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 15
[2024-05-05 15:39:36,249][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 16
[2024-05-05 15:39:36,761][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 17
[2024-05-05 15:39:37,254][functional_diffusion_processes.metrics.feature_extractor][INFO] - Making FID stats -- step 18
[2024-05-05 15:39:37,758][functional_diffusion_processes.metrics.fid_metric][INFO] - Saving real dataset stats to: /home/corallo/PycharmProjects/functional-diffusion-processes/data/stats/mnist_test_stats.npz
[2024-05-05 15:39:37,980][__main__][INFO] - Starting testing!
wandb: Currently logged in as: giulio-corallo (eurecom-ds). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.16.6 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.14.0
wandb: Run data is saved locally in /home/corallo/PycharmProjects/functional-diffusion-processes/wandb/run-20240505_153938-lk5wmrth
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run inr_mnist
wandb: ⭐️ View project at https://wandb.ai/eurecom-ds/fpd
wandb: 🚀 View run at https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
[2024-05-05 15:39:48,607][functional_diffusion_processes.trainers.trainer][INFO] - Total number of parameters: 0.12M
[2024-05-05 15:39:48,904][functional_diffusion_processes.trainers.trainer][WARNING] - Resuming training from the latest checkpoint.
[2024-05-05 15:39:48,905][absl][INFO] - Restoring checkpoint from /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/checkpoints/checkpoint_27
/home/corallo/miniconda3/envs/fdp/lib/python3.10/site-packages/jax/_src/lib/xla_bridge.py:544: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
  warnings.warn(
[2024-05-05 15:39:48,989][absl][INFO] - Found no checkpoint files in /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist with prefix meta_0_
[2024-05-05 15:39:48,989][functional_diffusion_processes.trainers.trainer][INFO] - Starting sampling loop at step 0.
  0%|                                                                                                                                                                                 | 0/32 [00:00<?, ?it/s][2024-05-05 15:39:48,990][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 0
[2024-05-05 15:41:24,948][absl][INFO] - Saving checkpoint at step: 0
[2024-05-05 15:41:24,952][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  3%|█████▎                                                                                                                                                                   | 1/32 [01:35<49:34, 95.96s/it][2024-05-05 15:41:24,953][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 1
[2024-05-05 15:41:43,396][absl][INFO] - Saving checkpoint at step: 1
[2024-05-05 15:41:43,397][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
[2024-05-05 15:41:43,397][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_0
  6%|██████████▌                                                                                                                                                              | 2/32 [01:54<25:10, 50.36s/it][2024-05-05 15:41:43,397][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 2
[2024-05-05 15:42:01,615][absl][INFO] - Saving checkpoint at step: 2
[2024-05-05 15:42:01,618][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
[2024-05-05 15:42:01,619][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_1
  9%|███████████████▊                                                                                                                                                         | 3/32 [02:12<17:14, 35.69s/it][2024-05-05 15:42:01,619][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 3
[2024-05-05 15:42:19,896][absl][INFO] - Saving checkpoint at step: 3
[2024-05-05 15:42:19,897][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
[2024-05-05 15:42:19,897][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_2
 12%|█████████████████████▏                                                                                                                                                   | 4/32 [02:30<13:26, 28.81s/it][2024-05-05 15:42:19,898][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 4
[2024-05-05 15:42:38,094][absl][INFO] - Saving checkpoint at step: 4
[2024-05-05 15:42:38,096][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
[2024-05-05 15:42:38,097][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_3
 16%|██████████████████████████▍                                                                                                                                              | 5/32 [02:49<11:14, 24.99s/it][2024-05-05 15:42:38,097][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 5
[2024-05-05 15:42:56,403][absl][INFO] - Saving checkpoint at step: 5
[2024-05-05 15:42:56,404][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
[2024-05-05 15:42:56,404][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_4
 19%|███████████████████████████████▋                                                                                                                                         | 6/32 [03:07<09:50, 22.72s/it][2024-05-05 15:42:56,405][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 6
[2024-05-05 15:43:14,647][absl][INFO] - Saving checkpoint at step: 6
[2024-05-05 15:43:14,648][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
[2024-05-05 15:43:14,648][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_5
 22%|████████████████████████████████████▉                                                                                                                                    | 7/32 [03:25<08:51, 21.25s/it][2024-05-05 15:43:14,651][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 7
[2024-05-05 15:43:32,949][absl][INFO] - Saving checkpoint at step: 7
[2024-05-05 15:43:32,956][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
[2024-05-05 15:43:32,956][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_6
 25%|██████████████████████████████████████████▎                                                                                                                              | 8/32 [03:43<08:07, 20.32s/it][2024-05-05 15:43:32,957][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 8
[2024-05-05 15:43:51,240][absl][INFO] - Saving checkpoint at step: 8
[2024-05-05 15:43:51,242][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
[2024-05-05 15:43:51,244][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_7
 28%|███████████████████████████████████████████████▌                                                                                                                         | 9/32 [04:02<07:32, 19.68s/it][2024-05-05 15:43:51,244][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 9
[2024-05-05 15:44:09,502][absl][INFO] - Saving checkpoint at step: 9
[2024-05-05 15:44:09,505][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
[2024-05-05 15:44:09,506][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_8
 31%|████████████████████████████████████████████████████▌                                                                                                                   | 10/32 [04:20<07:03, 19.24s/it][2024-05-05 15:44:09,507][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 10
[2024-05-05 15:44:27,768][absl][INFO] - Saving checkpoint at step: 10
[2024-05-05 15:44:27,776][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
[2024-05-05 15:44:27,776][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_9
 34%|█████████████████████████████████████████████████████████▊                                                                                                              | 11/32 [04:38<06:37, 18.95s/it][2024-05-05 15:44:27,777][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 11
[2024-05-05 15:44:45,992][absl][INFO] - Saving checkpoint at step: 11
[2024-05-05 15:44:46,002][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
[2024-05-05 15:44:46,004][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_10
 38%|███████████████████████████████████████████████████████████████                                                                                                         | 12/32 [04:57<06:14, 18.73s/it][2024-05-05 15:44:46,005][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 12
[2024-05-05 15:45:04,337][absl][INFO] - Saving checkpoint at step: 12
[2024-05-05 15:45:04,339][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
[2024-05-05 15:45:04,340][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_11
 41%|████████████████████████████████████████████████████████████████████▎                                                                                                   | 13/32 [05:15<05:53, 18.61s/it][2024-05-05 15:45:04,341][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 13
[2024-05-05 15:45:22,587][absl][INFO] - Saving checkpoint at step: 13
[2024-05-05 15:45:22,591][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
[2024-05-05 15:45:22,594][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_12
 44%|█████████████████████████████████████████████████████████████████████████▌                                                                                              | 14/32 [05:33<05:33, 18.50s/it][2024-05-05 15:45:22,594][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 14
[2024-05-05 15:45:40,872][absl][INFO] - Saving checkpoint at step: 14
[2024-05-05 15:45:40,880][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
[2024-05-05 15:45:40,881][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_13
 47%|██████████████████████████████████████████████████████████████████████████████▊                                                                                         | 15/32 [05:51<05:13, 18.44s/it][2024-05-05 15:45:40,881][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 15
[2024-05-05 15:45:59,246][absl][INFO] - Saving checkpoint at step: 15
[2024-05-05 15:45:59,251][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
[2024-05-05 15:45:59,252][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_14
 50%|████████████████████████████████████████████████████████████████████████████████████                                                                                    | 16/32 [06:10<04:54, 18.42s/it][2024-05-05 15:45:59,252][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 16
[2024-05-05 15:46:17,534][absl][INFO] - Saving checkpoint at step: 16
[2024-05-05 15:46:17,538][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
[2024-05-05 15:46:17,538][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_15
 53%|█████████████████████████████████████████████████████████████████████████████████████████▎                                                                              | 17/32 [06:28<04:35, 18.38s/it][2024-05-05 15:46:17,539][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 17
[2024-05-05 15:46:35,893][absl][INFO] - Saving checkpoint at step: 17
[2024-05-05 15:46:35,899][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
[2024-05-05 15:46:35,900][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_16
 56%|██████████████████████████████████████████████████████████████████████████████████████████████▌                                                                         | 18/32 [06:46<04:17, 18.37s/it][2024-05-05 15:46:35,900][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 18
[2024-05-05 15:46:54,148][absl][INFO] - Saving checkpoint at step: 18
[2024-05-05 15:46:54,150][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
[2024-05-05 15:46:54,150][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_17
 59%|███████████████████████████████████████████████████████████████████████████████████████████████████▊                                                                    | 19/32 [07:05<03:58, 18.34s/it][2024-05-05 15:46:54,151][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 19
[2024-05-05 15:47:12,467][absl][INFO] - Saving checkpoint at step: 19
[2024-05-05 15:47:12,468][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
[2024-05-05 15:47:12,469][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_18
 62%|█████████████████████████████████████████████████████████████████████████████████████████████████████████                                                               | 20/32 [07:23<03:39, 18.33s/it][2024-05-05 15:47:12,469][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 20
[2024-05-05 15:47:30,717][absl][INFO] - Saving checkpoint at step: 20
[2024-05-05 15:47:30,718][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
[2024-05-05 15:47:30,719][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_19
 66%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                                         | 21/32 [07:41<03:21, 18.31s/it][2024-05-05 15:47:30,719][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 21
[2024-05-05 15:47:49,020][absl][INFO] - Saving checkpoint at step: 21
[2024-05-05 15:47:49,026][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
[2024-05-05 15:47:49,026][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_20
 69%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                                                    | 22/32 [08:00<03:03, 18.31s/it][2024-05-05 15:47:49,027][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 22
[2024-05-05 15:48:07,384][absl][INFO] - Saving checkpoint at step: 22
[2024-05-05 15:48:07,386][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
[2024-05-05 15:48:07,386][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_21
 72%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                                               | 23/32 [08:18<02:44, 18.32s/it][2024-05-05 15:48:07,387][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 23
[2024-05-05 15:48:25,666][absl][INFO] - Saving checkpoint at step: 23
[2024-05-05 15:48:25,671][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
[2024-05-05 15:48:25,672][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_22
 75%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                          | 24/32 [08:36<02:26, 18.31s/it][2024-05-05 15:48:25,672][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 24
[2024-05-05 15:48:43,856][absl][INFO] - Saving checkpoint at step: 24
[2024-05-05 15:48:43,861][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
[2024-05-05 15:48:43,862][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_23
 78%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                                    | 25/32 [08:54<02:07, 18.28s/it][2024-05-05 15:48:43,862][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 25
[2024-05-05 15:49:02,126][absl][INFO] - Saving checkpoint at step: 25
[2024-05-05 15:49:02,128][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
[2024-05-05 15:49:02,129][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_24
 81%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌                               | 26/32 [09:13<01:49, 18.27s/it][2024-05-05 15:49:02,129][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 26
[2024-05-05 15:49:20,398][absl][INFO] - Saving checkpoint at step: 26
[2024-05-05 15:49:20,402][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
[2024-05-05 15:49:20,403][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_25
 84%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊                          | 27/32 [09:31<01:31, 18.27s/it][2024-05-05 15:49:20,403][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 27
[2024-05-05 15:49:38,634][absl][INFO] - Saving checkpoint at step: 27
[2024-05-05 15:49:38,636][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
[2024-05-05 15:49:38,637][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_26
 88%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                     | 28/32 [09:49<01:13, 18.26s/it][2024-05-05 15:49:38,637][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 28
[2024-05-05 15:49:56,883][absl][INFO] - Saving checkpoint at step: 28
[2024-05-05 15:49:56,885][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
[2024-05-05 15:49:56,885][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_27
 91%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎               | 29/32 [10:07<00:54, 18.26s/it][2024-05-05 15:49:56,886][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 29
[2024-05-05 15:50:15,084][absl][INFO] - Saving checkpoint at step: 29
[2024-05-05 15:50:15,089][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
[2024-05-05 15:50:15,089][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_28
 94%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌          | 30/32 [10:26<00:36, 18.24s/it][2024-05-05 15:50:15,090][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 30
[2024-05-05 15:50:33,486][absl][INFO] - Saving checkpoint at step: 30
[2024-05-05 15:50:33,487][absl][INFO] - Saved checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_30
[2024-05-05 15:50:33,488][absl][INFO] - Removing checkpoint at /home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist/meta_0_29
 97%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▊     | 31/32 [10:44<00:18, 18.29s/it][2024-05-05 15:50:33,488][functional_diffusion_processes.trainers.trainer][INFO] - sampling -- round: 31
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:02<00:00, 18.31s/it][2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - FID: 1.288000e+00
[2024-05-05 15:50:51,953][functional_diffusion_processes.trainers.trainer][INFO] - Inception score -1.000000e+00
wandb: Waiting for W&B process to finish... (success).
wandb: 
wandb: Run history:
wandb:             FID ▁
wandb: inception score ▁
wandb: 
wandb: Run summary:
wandb:             FID 1.288
wandb: inception score -1.0
wandb: 
wandb: 🚀 View run inr_mnist at: https://wandb.ai/eurecom-ds/fpd/runs/lk5wmrth
wandb: Synced 6 W&B file(s), 32 media file(s), 2 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240505_153938-lk5wmrth/logs
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [11:12<00:00, 21.01s/it]

Please share your eval_mnist.yaml.

@cindyyyl
Author

cindyyyl commented May 8, 2024

Hi, this is my eval_mnist.yaml:

# @package _global_

defaults:
  - override /trainers: trainer_maml
  - override /models: mlp_modulation
  - override /datasets: mnist
  - override /sdes: heat_subvp
  - override /samplers: pc_sampler
  - override /predictors: euler
  - override /correctors: langevin
  - override /metrics: metrics_mnist

trainers:
  mode: "eval"
  model_name: "local"
  training_config:
    inner_steps: 3
    use_meta_sgd: False
    ema_rate: 0.9999
    save_dir: ${oc.env:LOGS_ROOT}/inr_mnist
  trainer_logging:
    use_wandb: True

  evaluation_config:
    seed: 43 # random seed for reproducibility
    eval_dir: ${oc.env:LOGS_ROOT}/inr_mnist # directory where evaluation results are saved
    num_samples: 16000 # number of samples to be generated for evaluation

sdes:
  sde_config:
    beta_max: 5.0
    const: 0.02
    psm_type: "time_independent"
    probability_flow: False
    factor: 2.0
    x_norm: 32
    energy_norm: 1

correctors:
  snr: 0.19

samplers:
  sampler_config:
    N: 3
    k: 1
    denoise: True

models:
  model_config:
    uniform_min_val: 0.005
    uniform_max_val: 0.1
    use_dense_lr: False

    layer_sizes:
      - 128
      - 128
      - 128
      - 128
      - 128
      - 128
      - 128
      - 128
      - ${datasets.train.data_config.output_size}
    y_input: False

datasets:
  test:
    data_config:
      image_height_size: 32
      image_width_size: 32
      batch_size: 512

@giulio98
Owner

giulio98 commented May 8, 2024

You changed N to 3, so it is normal that you get a high FID score. Please use our configuration to get the same result.

@cindyyyl
Author

cindyyyl commented May 8, 2024

I totally understand the problem. But if I run your configuration with N=50, my FID score is almost 100 times larger than yours.
image

@cindyyyl
Author

cindyyyl commented May 8, 2024

N= 50 image

image

image

Thank you so much. However, a new problem has come up. The first image is from running eval_mnist with N = 50, where the FID score is 1.022488e+02. When I set N = 3, the FID score is 7.572990e+01, which is lower than with N = 50. That would mean 3 steps are better than 50 steps, which is counterintuitive.

You can see this. I think the problem is still in the fid.compute function, since I have already sampled the images successfully.
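
For reference, the score under discussion is normally defined as a Fréchet distance between Gaussian fits of feature statistics. The thread does not show the repository's fid.compute, so the formula below is the standard definition, not a description of that function. The symbols are assumptions for illustration: μ_r, Σ_r are the mean and covariance of features from real images, and μ_g, Σ_g those of generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Because the value depends only on the two sets of feature statistics, two runs that score the same stored samples give essentially the same FID, whatever sampler settings are in the config.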

@giulio98
Owner

giulio98 commented May 8, 2024

Sorry, I cloned the repository from scratch, loaded the model from scratch, and ran the script, and I still get FID: 1.28.
I'm not able to reproduce your result.

@giulio98
Owner

giulio98 commented May 8, 2024

I think that when you changed N from 3 to 50, the run still skipped the sampling.

@giulio98
Owner

giulio98 commented May 8, 2024

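Please run this:

PYTHONPATH=. python3 src/functional_diffusion_processes/run.py --multirun +experiments_maml=eval_mnist 'trainers.evaluation_config.eval_dir=${oc.env:LOGS_ROOT}/inr_mnist_new'

This way you specify a new folder for the evaluation, and the images are generated from scratch. The override is single-quoted, with no space after the equals sign, so that the shell passes ${oc.env:LOGS_ROOT} through unchanged.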

@giulio98
Owner

giulio98 commented May 8, 2024

Please remember that whenever you change the config you also have to change the eval dir; otherwise the code detects that samples already exist there and uses them for the FID computation.

@cindyyyl
Author

cindyyyl commented May 9, 2024

Yes, I see. Today's code is a completely new clone of the repo with a new environment, which I downloaded this morning. The first time I ran it I did not change N, and the FID score was still 100 times larger than yours; I followed the GitHub instructions exactly, without any change. After that, I changed N to 3 and found that the FID score is lower than with N = 50. I think that despite the fresh git clone, for some unknown reason we still have something different (for example, your project uses logs and mine uses logs_test). So now I will try to calculate the FID another way, without your code, since so far I can see the images on wandb and they look right for both N = 3 and N = 50.

@cindyyyl
Author

cindyyyl commented May 9, 2024

Please remember that whenever you change the config you also have to change the eval dir; otherwise the code detects that samples already exist there and uses them for the FID computation.

Yes. Before each run I rm the .data that you told me about and move the *.npz files, including the meta file, to another folder to keep the eval dir clean.

@giulio98
Owner

giulio98 commented May 9, 2024

You don't have to remove the data; you just have to change the eval dir name. Have you tried that?

@giulio98
Owner

giulio98 commented May 9, 2024

Hello, I'm able to reproduce your experiment.
If I run:

PYTHONPATH=. python3 src/functional_diffusion_processes/run.py --multirun +experiments_maml=eval_mnist trainers.evaluation_config.eval_dir=/home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist_3 samplers.sampler_config.N=3

I do indeed get FID score: 75.73

and these sampled images:
image

When I run instead:

PYTHONPATH=. python3 src/functional_diffusion_processes/run.py --multirun +experiments_maml=eval_mnist trainers.evaluation_config.eval_dir=/home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist_50 samplers.sampler_config.N=50

I get FID score: 1.28
and these sampled images:
image

If I instead run:

PYTHONPATH=. python3 src/functional_diffusion_processes/run.py --multirun +experiments_maml=eval_mnist trainers.evaluation_config.eval_dir=/home/corallo/PycharmProjects/functional-diffusion-processes/logs/inr_mnist_3 samplers.sampler_config.N=50

Here I am specifying on purpose the eval_dir corresponding to N=3.
THIS IS WRONG AND IS NOT HOW YOU SHOULD DO IT! IN THIS CASE THE CODE WILL SKIP SAMPLING BECAUSE IT DETECTS SAMPLES IN THAT FOLDER!!!
Then I get FID score: 75.81

Please, whenever you change N or any other parameter that affects the FID score calculation, remember to also change
trainers.evaluation_config.eval_dir.

I hope the message is clear and that you will be able to get the FID score.
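
The stale-sample pitfall described above can be guarded against with a small check. The following is only a sketch of the idea, not code from this repository: the marker file name `sampling_params.json` and both helper functions are hypothetical, and the only detail taken from the thread is that samples are stored as .npz files in the eval dir.

```python
import json
from pathlib import Path

# Hypothetical marker file; the repository does not necessarily write one.
PARAMS_FILE = "sampling_params.json"


def record_params(eval_dir, params):
    """Store the sampling parameters next to the generated samples."""
    eval_dir = Path(eval_dir)
    eval_dir.mkdir(parents=True, exist_ok=True)
    (eval_dir / PARAMS_FILE).write_text(json.dumps(params, sort_keys=True))


def can_reuse_samples(eval_dir, params):
    """Return True only if eval_dir holds .npz samples made with exactly `params`."""
    eval_dir = Path(eval_dir)
    marker = eval_dir / PARAMS_FILE
    has_samples = any(eval_dir.glob("*.npz"))
    if not has_samples or not marker.exists():
        return False
    return json.loads(marker.read_text()) == params
```

With such a check, running with N=50 against a folder whose marker records N=3 would report that the stored samples cannot be reused, instead of silently scoring the old ones.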

@cindyyyl
Author

cindyyyl commented May 9, 2024

Thanks so much for your kind help. Understood; I am not good with argument-parser operations. This is the score when I use the probability flow ODE to run eval_mnist with N=10; the fid_score is 1.1349964141845703:
/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/src/functional_diffusion_processes/run.py --multirun +experiments_maml=eval_mnist sdes.sde_config.probability_flow=True trainers.evaluation_config.eval_dir=/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist_ODE_3_10 samplers.sampler_config.N=10


When I run eval_mnist with the SDE and N=10, the score is 12.630059242248535:

/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/src/functional_diffusion_processes/run.py --multirun +experiments_maml=eval_mnist trainers.evaluation_config.eval_dir=/cis/net/io93c/data/shuan124/lxxfinal/functional-diffusion-processes/logs_test/inr_mnist_SDE_10 samplers.sampler_config.N=10

I think this is reasonable for the SDE, but the PF ODE result for N=10 seems a little unreasonable. Do I need to make any extra changes to run the ODE for eval_mnist?

@giulio98
Owner

Look at the sampled images and check the results.
The ODE is not guaranteed to be better than the SDE; remember that we are simulating a partial differential equation. You should tune these hyperparameters all together:
N=10,50,100,100,...2000
factor:0.4,0.6,0.8,2,4
Ode:True,False
snr:between 0.05 to 0.2

Always remember to change the eval dir when you compute the FID.
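
A joint sweep like this is easy to get wrong if two runs share an eval dir, so one option is to generate the commands with a folder name derived from the swept values. This is a hedged sketch, not part of the repository: `GRID`, `BASE`, and `sweep_commands` are illustrative names, the grid values are a small subset of the ranges listed above, and the override paths for `factor` and `snr` are assumed from the config pasted earlier in this thread.

```python
from itertools import product

# Override keys follow the commands and config quoted in this thread;
# the nesting of `factor` and `snr` is an assumption.
GRID = {
    "samplers.sampler_config.N": [10, 50, 100],
    "sdes.sde_config.factor": [0.6, 2.0],
    "sdes.sde_config.probability_flow": [True, False],
    "correctors.snr": [0.05, 0.19],
}

BASE = (
    "PYTHONPATH=. python3 src/functional_diffusion_processes/run.py "
    "--multirun +experiments_maml=eval_mnist"
)


def sweep_commands(grid, logs_root="logs"):
    """Build one command per grid point, each with its own eval dir."""
    keys = list(grid)
    commands = []
    for values in product(*(grid[k] for k in keys)):
        overrides = [f"{k}={v}" for k, v in zip(keys, values)]
        # Encode every swept value in the folder name so no two runs share samples.
        tag = "_".join(f"{k.rsplit('.', 1)[-1]}{v}" for k, v in zip(keys, values))
        overrides.append(f"trainers.evaluation_config.eval_dir={logs_root}/inr_mnist_{tag}")
        commands.append(" ".join([BASE, *overrides]))
    return commands
```

Printing `sweep_commands(GRID)` gives one line per combination that can be pasted into a shell or a job script; `logs_root` is a placeholder for your own logs folder.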

@cindyyyl
Author

Thank you so much for all the help! I really appreciate it!
