Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review normalization of package and extra names #736

Closed
tcompa opened this issue Jun 6, 2023 · 6 comments · Fixed by #1188
Closed

Review normalization of package and extra names #736

tcompa opened this issue Jun 6, 2023 · 6 comments · Fixed by #1188

Comments

@tcompa
Copy link
Collaborator

tcompa commented Jun 6, 2023

Ref: https://packaging.python.org/en/latest/specifications/name-normalization:

import re

def normalize(name):
    return re.sub(r"[-_.]+", "-", name).lower()

Note that this is not a big deal on fractal-server side, but it is creating some issues in the fractal-tasks-core for defining extras (ref python-poetry/poetry#6819 and fractal-analytics-platform/fractal-tasks-core#390 (comment)). Thus it's better to also make the right decision on the fractal-server side.

@tcompa tcompa added the High Priority Current Priorities & Blocking Issues label Jun 6, 2023
@tcompa tcompa added july2023 maintenance and removed High Priority Current Priorities & Blocking Issues labels Jul 17, 2023
@tcompa tcompa removed the july2023 label Sep 15, 2023
@jluethi
Copy link
Collaborator

jluethi commented Jan 23, 2024

This came up here:
https://github.com/fmi-basel/cellpose_2d3d_merging/blob/main/pyproject.toml

With the pyproject.toml

# Project metadata (see https://peps.python.org/pep-0621)
[project]
name = "cellpose-2D3D-merging"
version = "0.0.1"
description = "Cellpose based segmentation combining 2D and 3D segmentation"
readme = "README.md"
license = { text = "BSD-3-Clause" }
authors = [
    { name = "Franziska Moos", email = "franziska.moos@fmi.ch" },
]

# Required Python version and dependencies
requires-python = ">=3.8"
dependencies = [
    "fractal-tasks-core==0.14.1",
    "anndata",
    "cellpose",
    "dask",
    "numpy",
    "pandas",
    "zarr",
    "pydantic",
    "scikit-image==0.20.0",
]

# Optional dependencies (e.g. for `pip install -e ".[dev]"`, see
# https://peps.python.org/pep-0621/#dependencies-optional-dependencies)
[project.optional-dependencies]
dev = ["devtools", "pytest", "requests", "build", "jsonschema"]

# Build options (see https://peps.python.org/pep-0517)
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[tool.setuptools.packages.find]
where = ["src"]
include = ["cellpose_2d3d_merging"]

# Always include the __FRACTAL_MANIFEST__.json file in the package
[tool.setuptools.package-data]
"*" = ["__FRACTAL_MANIFEST__.json"]

And the following manifest:

{
  "manifest_version": "1",
  "task_list": [
    {
      "name": "Cellpose 2D 3D merging",
      "executable": "cellpose_2D3D_merging.py",
      "input_type": "zarr",
      "output_type": "zarr",
      "meta": {
        "cpus_per_task": 4,
        "mem": 16000,
        "needs_gpu": true,
        "parallelization_level": "image"
      },
      "args_schema": {
        "title": "Cellpose2d3dMerging",
        "type": "object",
        "properties": {
          "input_paths": {
            "title": "Input Paths",
            "type": "array",
            "items": {
              "type": "string"
            },
            "description": "List of input paths where the image data is stored as OME-Zarrs. Should point to the parent folder containing one or many OME-Zarr files, not the actual OME-Zarr file. Example: `[\"/some/path/\"]`. This task only supports a single input path. (standard argument for Fractal tasks, managed by Fractal server)."
          },
          "output_path": {
            "title": "Output Path",
            "type": "string",
            "description": "This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server)."
          },
          "component": {
            "title": "Component",
            "type": "string",
            "description": "Path to the OME-Zarr image in the OME-Zarr plate that is processed. Example: `\"some_plate.zarr/B/03/0\"`. (standard argument for Fractal tasks, managed by Fractal server)."
          },
          "metadata": {
            "title": "Metadata",
            "type": "object",
            "description": "This parameter is not used by this task. (standard argument for Fractal tasks, managed by Fractal server)."
          },
          "level": {
            "title": "Level",
            "type": "integer",
            "description": "Pyramid level of the image to be segmented. Choose `0` to process at full resolution."
          },
          "channel1": {
            "$ref": "#/definitions/ChannelInputModel",
            "title": "Channel1",
            "description": "Primary channel for segmentation; requires either `wavelength_id` (e.g. `A01_C01`) or `label` (e.g. `DAPI`)."
          },
          "channel2": {
            "$ref": "#/definitions/ChannelInputModel",
            "title": "Channel2",
            "description": "Second channel for segmentation (in the same format as `channel1`). If specified, cellpose runs in dual channel mode. For dual channel segmentation of cells, the first channel should contain the membrane marker, the second channel should contain the nuclear marker."
          },
          "input_ROI_table": {
            "title": "Input Roi Table",
            "default": "organoid_ROI_table",
            "type": "string",
            "description": "Name of the ROI table over which the task loops to apply Cellpose segmentation. Examples: `FOV_ROI_table` => loop over the field of views, `organoid_ROI_table` => loop over the organoid ROI table (generated by another task), `well_ROI_table` => process the whole well as one image."
          },
          "output_ROI_table": {
            "title": "Output Roi Table",
            "type": "string",
            "description": "If provided, a ROI table with that name is created, which will contain the bounding boxes of the newly segmented labels. ROI tables should have `ROI` in their name."
          },
          "output_label_name": {
            "title": "Output Label Name",
            "type": "string",
            "description": "Name of the output label image (e.g. `\"organoids\"`)."
          },
          "use_masks": {
            "title": "Use Masks",
            "default": true,
            "type": "boolean",
            "description": "If `True`, try to use masked loading and fall back to `use_masks=False` if the ROI table is not suitable. Masked loading is relevant when only a subset of the bounding box should actually be processed (e.g. running within `organoid_ROI_table`)."
          },
          "relabeling": {
            "title": "Relabeling",
            "default": true,
            "type": "boolean",
            "description": "If `True`, apply relabeling so that label values are unique for all objects in the well."
          },
          "diameter_level0": {
            "title": "Diameter Level0",
            "default": 50.0,
            "type": "number",
            "description": "Expected diameter of the objects that should be segmented in pixels at level 0. Initial diameter is rescaled using the `level` that was selected. The rescaled value is passed as the diameter to the `CellposeModel.eval` method."
          },
          "model_type_1": {
            "title": "Model Type 1",
            "default": "cyto2",
            "type": "string",
            "description": "Parameter of `CellposeModel` class. Defines which model should be used for predicition. Typical choices are `nuclei`, `cyto`, `cyto2`, etc. If merging is True, this must correspond to a 3D model."
          },
          "model_type_2": {
            "title": "Model Type 2",
            "default": "cyto2",
            "type": "string",
            "description": "Same as model_type_1. If merging is True, this must correspond to a 2D model."
          },
          "pretrained_model_1": {
            "title": "Pretrained Model 1",
            "type": "string",
            "description": "Parameter of `CellposeModel` class (takes precedence over `model_type_1`). Allows you to specify the path of a custom trained cellpose model. If merging is True, this must correspond to a 3D model."
          },
          "pretrained_model_2": {
            "title": "Pretrained Model 2",
            "type": "string",
            "description": "Same as pretrained_model_2. If merging is True, this must correspond to a 2D model."
          },
          "cellprob_threshold_1": {
            "title": "Cellprob Threshold 1",
            "default": 10,
            "type": "number",
            "description": "Same as cellprob_threshold_2. If merging is True, this must correspond to a 2D model."
          },
          "cellprob_threshold_2": {
            "title": "Cellprob Threshold 2",
            "default": 0.0,
            "type": "number",
            "description": "Missing description"
          },
          "flow_threshold": {
            "title": "Flow Threshold",
            "default": 0.4,
            "type": "number",
            "description": "Parameter of `CellposeModel.eval` method. Valid values between 0.0 and 1.0. From Cellpose documentation: \"Increase this threshold if cellpose is not returning as many ROIs as you\u2019d expect. Similarly, decrease this threshold if cellpose is returning too many ill-shaped ROIs.\""
          },
          "anisotropy": {
            "title": "Anisotropy",
            "type": "number",
            "description": "Ratio of the pixel sizes along Z and XY axis (ignored if the image is not three-dimensional). If `None`, it is inferred from the OME-NGFF metadata."
          },
          "min_size": {
            "title": "Min Size",
            "default": 15,
            "type": "integer",
            "description": "Parameter of `CellposeModel` class. Minimum size of the segmented objects (in pixels). Use `-1` to turn off the size filter."
          },
          "channel_axis": {
            "title": "Channel Axis",
            "default": 0,
            "type": "integer",
            "description": "Missing description"
          },
          "z_axis": {
            "title": "Z Axis",
            "default": 1,
            "type": "integer",
            "description": "Missing description"
          },
          "augment": {
            "title": "Augment",
            "default": false,
            "type": "boolean",
            "description": "Parameter of `CellposeModel` class. Whether to use cellpose augmentation to tile images with overlap."
          },
          "net_avg": {
            "title": "Net Avg",
            "default": false,
            "type": "boolean",
            "description": "Parameter of `CellposeModel` class. Whether to use cellpose net averaging to run the 4 built-in networks (useful for `nuclei`, `cyto` and `cyto2`, not sure it works for the others)."
          },
          "use_gpu": {
            "title": "Use Gpu",
            "default": true,
            "type": "boolean",
            "description": "If `False`, always use the CPU; if `True`, use the GPU if possible (as defined in `cellpose.core.use_gpu()`) and fall-back to the CPU otherwise."
          },
          "overwrite": {
            "title": "Overwrite",
            "default": true,
            "type": "boolean",
            "description": "If `True`, overwrite the task output."
          },
          "how_big_do_you_want_it": {
            "title": "How Big Do You Want It",
            "default": 20,
            "type": "integer",
            "description": "Missing description"
          },
          "merging": {
            "title": "Merging",
            "default": false,
            "type": "boolean",
            "description": "If `True` merges two label maps generated with cellpose. pretrained_model_1, model_1, cellprob_threshold_1 must correspond to a 3D model, pretrained_model_2, model_2, cellprob_threshold_2 must correspond to a 2D model."
          }
        },
        "required": [
          "input_paths",
          "output_path",
          "component",
          "metadata",
          "level",
          "channel1"
        ],
        "additionalProperties": false,
        "definitions": {
          "ChannelInputModel": {
            "title": "ChannelInputModel",
            "description": "A channel which is specified by either `wavelength_id` or `label`.",
            "type": "object",
            "properties": {
              "wavelength_id": {
                "title": "Wavelength Id",
                "type": "string",
                "description": "Unique ID for the channel wavelength, e.g. `A01_C01`."
              },
              "label": {
                "title": "Label",
                "type": "string",
                "description": "Name of the channel."
              }
            }
          }
        }
      },
      "docs_info": "Run cellpose segmentation on the ROIs of a single OME-Zarr image.\nOptionally runs two different cellpose segmentations and merges the\nresults.",
      "docs_link": "https://example.com"
    }
  ],
  "has_args_schemas": true,
  "args_schema_version": "pydantic_v1"
}

It lead to this error:

ERROR - Task manifest loading failed
2024-01-23 14:23:15,643 - cellpose_2D3D_merging - DEBUG - Task-collection status: fail
2024-01-23 14:23:15,643 - cellpose_2D3D_merging - INFO - Background collection failed. 
Original error: Could not determine package installation location.

@tcompa
Copy link
Collaborator Author

tcompa commented Jan 25, 2024

Quick fix: do not use capital letters in project name, that is go from

[project]
name = "cellpose-2D3D-merging"
...

to

[project]
name = "cellpose-2d3d-merging"
...

Meanwhile we'll improve on this, based on https://packaging.python.org/en/latest/specifications/pyproject-toml/#name ("Tools SHOULD normalize this name, as soon as it is read for internal consistency.") and https://packaging.python.org/en/latest/specifications/name-normalization/#name-normalization (which defines how to normalize a name).

@enricotagliavini
Copy link

Hello @tcompa , we did switch to lower case, but we still got the same error and I see the folder created for the task virtualenv is still with the upper case letters. We are sure we used the correct wheel (we removed the previous one and the timestamp in the web interface is also from today). There might be more places affected than just the project name.

We are now trying to remove the upper case letters from all the occurrences of the module name.

@tcompa tcompa linked a pull request Jan 25, 2024 that will close this issue
2 tasks
@tcompa
Copy link
Collaborator Author

tcompa commented Jan 25, 2024

Hello @tcompa , we did switch to lower case, but we still got the same error and I see the folder created for the task virtualenv is still with the upper case letters. We are sure we used the correct wheel (we removed the previous one and the timestamp in the web interface is also from today). There might be more places affected than just the project name.

We are now trying to remove the upper case letters from all the occurrences of the module name.

Thanks for your feedback. I have a fractal-server patch which seems to work (ref #1188), but it will require a little while more since I'm trying to find some sanity in the Python-packaging specifications (and then we'll need to release the new version and have it in your Fractal instance).

Meanwhile, the best would be to modify your package - to get it to work quickly. I'll also debug it on my side and provide feedback if I can.

@enricotagliavini
Copy link

I think we found the mistake on our side actually. But we anyway removed any occurrence of upper case letters from everywhere. Thank you.

@tcompa
Copy link
Collaborator Author

tcompa commented Jan 25, 2024

I think we found the mistake on our side actually. But we anyway removed any occurrence of upper case letters from everywhere. Thank you.

Good to hear.
For the record, I had just created https://github.com/tcompa/example-fractal-server-issue-736. This package can be successfully collected, as in:
image

A slight variation, where I just change the name in pyproject.toml into something with capital letters (e.g. package-NAME) leads to a failure in task collection. The plan is obviously that after PR #1188 this case will also be accepted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants