UnboundLocalError: cannot access local variable 'pipelines_package' where it is not associated with a value #3847

Open
JenspederM opened this issue May 2, 2024 · 3 comments
Labels
Community Issue/PR opened by the open-source community

Comments

@JenspederM

JenspederM commented May 2, 2024

Description

An error is thrown when trying to print the result of find_pipelines() from the kedro.framework.project module.

Context

Unable to use find_pipelines

Steps to Reproduce

  1. Add print(find_pipelines()) to the bottom of the pipeline_registry.py file
  2. Run the file: python ./src/<project>/pipeline_registry.py

Expected Result

A dict of pipelines.

Actual Result

I get the following error:

[05/02/24 18:05:49] WARNING  /Users/.../.venv/lib/python3.12/site-packages/kedro/framework/project/__init__.py:350: UserWarning: An error occurred while importing the 'None.pipeline' module. Nothing defined therein will be returned by 'find_pipelines'.  (warnings.py:110)

Traceback (most recent call last):
  File "/Users/.../.venv/lib/python3.12/site-packages/kedro/framework/project/__init__.py", line 347, in find_pipelines
    pipeline_module = importlib.import_module(pipeline_module_name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/.../.rye/py/cpython@3.12.2/install/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1310, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1324, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'None'

  warnings.warn(

                                                                                                                       
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/.../project/src/project/pipeline_registry.py:21 in <module>                                                                            │
│                                                                                                  │
│   18                                                                                             │
│   19                                                                                             │
│   20 if __name__ == "__main__":                                                                  │
│ ❱ 21 │   print(register_pipelines())                                                             │
│   22                                                                                             │
│                                                                                                  │
│ /Users/.../project/src/project/pipeline_registry.py:15 in register_pipelines                                                                   │
│                                                                                                  │
│   12 │   Returns:                                                                                │
│   13 │   │   A mapping from pipeline names to ``Pipeline`` objects.                              │
│   14 │   """                                                                                     │
│ ❱ 15 │   pipelines = find_pipelines()                                                            │
│   16 │   pipelines["__default__"] = sum(pipelines.values())                                      │
│   17 │   return pipelines                                                                        │
│   18                                                                                             │
│                                                                                                  │
│ /Users/.../.venv/lib/python3.12/site-packages/kedro/framework/project/__init__.py:367 in find_pipelines                                                        │
│                                                                                                  │
│   364 │   │   if str(exc) == f"No module named '{PACKAGE_NAME}.pipelines'":                      │
│   365 │   │   │   return pipelines_dict                                                          │
│   366 │                                                                                          │
│ ❱ 367 │   for pipeline_dir in pipelines_package.iterdir():                                       │
│   368 │   │   if not pipeline_dir.is_dir():                                                      │
│   369 │   │   │   continue                                                                       │
│   370                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
UnboundLocalError: cannot access local variable 'pipelines_package' where it is not associated with a value
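
The control flow visible around lines 364-367 of the traceback may explain the error. The following is a minimal standalone sketch of that pattern, not Kedro's actual source: find_things and its package_name parameter are hypothetical stand-ins, and a plain importlib.import_module call is assumed in place of whatever Kedro uses to locate the pipelines package. When the package name is None, the import fails with "No module named 'None'", which does not match the message checked in the except block, so execution falls through to a variable that was never bound.

```python
import importlib


def find_things(package_name):
    """Hypothetical stand-in mirroring the control flow seen in the traceback."""
    found = {}
    try:
        pipelines_package = importlib.import_module(f"{package_name}.pipelines")
    except ModuleNotFoundError as exc:
        # Only this exact message triggers the early return.
        if str(exc) == f"No module named '{package_name}.pipelines'":
            return found
        # Any other message (e.g. "No module named 'None'") falls through.

    # Reached with pipelines_package unbound if the import failed above.
    return {"package": pipelines_package}


try:
    find_things(None)
except UnboundLocalError as err:
    error_name = type(err).__name__
    print(error_name)
```

If this reading is right, the UnboundLocalError is a symptom of the package name not having been configured before find_pipelines() ran, rather than a problem with the pipelines themselves.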

Your Environment

  • Kedro version used (pip show kedro or kedro -V): kedro, version 0.19.5
  • Python version used (python -V): Python 3.12.2 using rye as package manager
  • Operating system and version: M1 Mac with macOS Sonoma Version 14.4.1
@merelcht
Member

Hi @JenspederM, thanks for flagging this issue. Can I ask what your use case is for printing the result of find_pipelines()?

This method was added to enable auto-discovery of pipelines, and it does some work in the background to make sure your project and its modules are discoverable (https://docs.kedro.org/en/stable/nodes_and_pipelines/pipeline_registry.html). It's meant to run as part of a "regular" Kedro flow, where it's preceded by certain project setup methods. You can fix your script by calling bootstrap_project() before find_pipelines() (https://docs.kedro.org/en/stable/kedro_project_setup/session.html#bootstrap-project-and-configure-project). However, I would only recommend doing that for exploration, not if you're planning to run that code in production.

Let me know if this makes sense!
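
For exploration, the suggestion above could look roughly like the sketch below. The helper name discover_pipelines and the project_root argument are assumptions for illustration; bootstrap_project and find_pipelines are imported lazily inside the function so the snippet can be loaded even where Kedro is not installed.

```python
from pathlib import Path


def discover_pipelines(project_root: Path) -> dict:
    """Bootstrap a Kedro project, then return the auto-discovered pipelines.

    project_root is assumed to be the directory containing pyproject.toml.
    """
    # Lazy imports: Kedro is only needed when the function is called.
    from kedro.framework.project import find_pipelines
    from kedro.framework.startup import bootstrap_project

    # Sets up the project so its package and modules can be discovered.
    bootstrap_project(project_root)
    return find_pipelines()


# Example call (assumes the current working directory is the project root):
# print(discover_pipelines(Path.cwd()))
```

As noted above, this is intended for exploration rather than production code.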

@merelcht merelcht added the Community Issue/PR opened by the open-source community label May 21, 2024
@JenspederM
Author

JenspederM commented May 21, 2024

Hi @merelcht,

Thank you for your reply.

I am using find_pipelines() to generate Databricks Asset Bundle resources. I am working on a template for asset bundles that uses Kedro for defining pipelines and dependencies, and Databricks Workflows for scheduling. You can find the project here

Thanks for suggesting bootstrap_project(). For now, I have been using configure_project(<package-name>), as done in databricks_run.py in the databricks-iris starter.

You can see my exact usage right here
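
A minimal sketch of the configure_project() approach mentioned above, for readers without access to the linked code. The helper name pipelines_for_package is hypothetical, and the snippet assumes the package is already importable (for example, installed in the active environment); it is not the exact usage from the linked project.

```python
def pipelines_for_package(package_name: str) -> dict:
    """Configure Kedro for an installed package, then discover its pipelines.

    package_name is assumed to be the importable name of the Kedro project package.
    """
    # Lazy imports: Kedro is only needed when the function is called.
    from kedro.framework.project import configure_project, find_pipelines

    # Tells Kedro which package to inspect before discovery runs.
    configure_project(package_name)
    return find_pipelines()


# Example call (replace the placeholder with the real package name):
# print(pipelines_for_package("<package-name>"))
```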

@JenspederM
Author

@merelcht

I have been thinking of making a cookiecutter for Kedro as well. Do you think there would be any interest in this?

I made the template based on my own experience of running large-scale Databricks projects in production with many contributors of varying levels of experience.
