OxidizedFinder.iter_modules() support #253

Closed
philipkimmey opened this issue May 15, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

philipkimmey commented May 15, 2020

Hello! What a cool project!

Over the last several days I've been engaged in what may be the longest yak-shaving exercise of my life, trying to improve how we distribute some internal tools. The path of least resistance is almost certainly https://github.com/pantsbuild/pex, but I've also tried https://github.com/Nuitka/Nuitka, which is a very cool project as well.

Our internal tool has one big dependency: botocore. I've been working to eliminate __file__ dependencies from that project, and I have something that needs some cleanup but is (mostly?) functional: boto/botocore#2046.

In turn, that let me build my package using PyOxidizer and actually execute it. Unfortunately, importlib.resources behaves quite differently in a normal Python runtime environment than in the PyOxidizer context, for reasons that aren't entirely clear to me.

Specifically, here is how the behavior differs. With that PR applied, a normal Python runtime gives:

>>> import botocore.resource_file_adapter
>>> [i for i in botocore.resource_file_adapter.os_adapter.listdir('pkg://botocore/data')]
['dataexchange', 'lex-models', 'storagegateway', 'endpoints.json', 'elb', 'alexaforbusiness', 'docdb', 'ds', 'devicefarm', 'groundstation', 'athena', 'kinesis-video-media', 'serverlessrepo', 'iotevents-data', 'license-manager', 'sms-voice', 'personalize', 'appconfig', 'pinpoint-email', 'forecast', 'elasticache', 'iotthingsgraph', 'detective', 'logs', 'securityhub', 'neptune', 'route53domains', 'macie', 'synthetics', 'ec2-instance-connect', 'robomaker', 'workspaces', 'acm', 'appsync', 'iot1click-devices', 'events', 'iotanalytics', 'ram', 'connectparticipant', 'pricing', '_retry.json', 'gamelift', 'marketplace-entitlement', 'lambda', 'emr', 'codecommit', 'kinesis', 'lakeformation', 'comprehendmedical', 'chime', 'ec2', 'sesv2', 'firehose', 'fsx', 'textract', 'opsworkscm', 'pinpoint', 'sns', 'mediaconvert', 'meteringmarketplace', 'sqs', 'xray', 'cloudtrail', 'sdb', 'migrationhub-config', 'iotsecuretunneling', 'application-autoscaling', 'mediapackage-vod', 'frauddetector', 'imagebuilder', 'sso-oidc', 'apigatewaymanagementapi', 'kinesisanalytics', 'dynamodb', 'route53resolver', 'sagemaker-a2i-runtime', 'polly', 'codestar-connections', 'codestar-notifications', 'redshift', 'waf', 'kafka', 'servicecatalog', 'globalaccelerator', 'ecr', 'codestar', 'amplify', 'datasync', 'transfer', 'es', 'rekognition', 'sso', 'iot1click-projects', 'cloudwatch', 'iot-data', 'iot-jobs-data', 'iotevents', 'dynamodbstreams', 'forecastquery', 'kinesis-video-signaling', 'config', 'efs', 'translate', 's3', 'pinpoint-sms-voice', 'cloudfront', 'kms', 'support', 'cloudhsm', 'connect', 'servicediscovery', 'codeguruprofiler', 'mgh', 'application-insights', 'cloudformation', 'apigatewayv2', 'transcribe', 'workmailmessageflow', 'elastictranscoder', 'snowball', 'greengrass', 'guardduty', 'kinesisanalyticsv2', 'mediatailor', 'autoscaling-plans', 'kendra', 'pi', 'ssm', 'personalize-events', 'signer', 'cur', 'iot', 'backup', 'mediapackage', 'dms', 'service-quotas', 'mediastore', 'mediastore-data', 'appmesh', 
'batch', 'worklink', 'sagemaker-runtime', 'qldb', 'savingsplans', 'organizations', 'cognito-identity', 'sts', 'ebs', 'schemas', 'wafv2', 'rds', 'marketplacecommerceanalytics', 'personalize-runtime', 'lightsail', 'mq', 'autoscaling', 'apigateway', 'mobile', 'inspector', 'networkmanager', 'cloudhsmv2', 'acm-pca', 'secretsmanager', 'cloudsearchdomain', 'dax', 'cloudsearch', 'ce', 'importexport', 'comprehend', 'machinelearning', 'kinesisvideo', 'kinesis-video-archived-media', 'compute-optimizer', 'workdocs', 'sms', 'cognito-sync', 'medialive', 'sagemaker', 'resource-groups', 'health', 'managedblockchain', 'clouddirectory', 'qldb-session', 's3control', 'directconnect', '__init__.py', 'opsworks', 'accessanalyzer', 'route53', 'discovery', 'elasticbeanstalk', 'codedeploy', 'quicksight', 'iotsitewise', 'ses', 'elbv2', 'workmail', 'fms', 'mturk', 'budgets', 'resourcegroupstaggingapi', 'appstream', 'iam', 'elastic-inference', 'stepfunctions', 'glue', 'marketplace-catalog', 'ecs', 'swf', 'cloud9', 'datapipeline', 'waf-regional', 'codeguru-reviewer', 'mediaconnect', 'codebuild', 'dlm', 'glacier', 'eks', 'lex-runtime', 'shield', 'cognito-idp', 'outposts', 'codepipeline', 'rds-data']

By comparison, in the PyOxidizer context we get this:

>>> import botocore.resource_file_adapter
>>> [i for i in botocore.resource_file_adapter.os_adapter.listdir('pkg://botocore/data')]
['endpoints.json', '_retry.json']

The documentation is pretty clear that importlib.abc.ResourceReader.contents may or may not return non-resource contents, so the current behavior is allowed. Another option, though, would be to change PyOxidizer's content_impl to return non-resources as well, which appears to be closer to how the CPython finder ends up behaving.
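To make the resource versus non-resource distinction concrete, here is a minimal sketch against an ordinary filesystem package. It uses importlib.resources.files() (Python 3.9+) rather than ResourceReader.contents directly, and the package name demo_data_pkg and its files are hypothetical stand-ins for botocore/data, not anything taken from botocore or PyOxidizer.

```python
import importlib
import importlib.resources
import pathlib
import sys
import tempfile

# Build a throwaway package (hypothetical name): one data file plus one
# subdirectory, loosely mirroring the layout of botocore/data.
root = pathlib.Path(tempfile.mkdtemp())
pkg_dir = root / "demo_data_pkg"
(pkg_dir / "ec2").mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "endpoints.json").write_text("{}")
(pkg_dir / "ec2" / "service.json").write_text("{}")

# Make the package importable from the temporary directory.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# List the package contents and split them into files (resources) and
# directories (entries that are not resources themselves).
entries = list(importlib.resources.files("demo_data_pkg").iterdir())
resources = sorted(e.name for e in entries if e.is_file())
non_resources = sorted(e.name for e in entries if e.is_dir())

print("resources:", resources)
print("non-resources:", non_resources)
```

On a filesystem-backed package the directory listing includes both kinds of entry, which matches the long listing above; a reader that only reports files would produce something like the two-item list seen under PyOxidizer.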

Long story short, I think implementing iter_modules on OxidizedFinder would let me use pkgutil.walk_packages to find those non-resource packages, but I'm now pretty far out of my depth and could use some guidance.
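For reference, a rough sketch of the hook pkgutil looks for: when called without a path, pkgutil.iter_modules() asks each finder on sys.meta_path for an optional iter_modules(prefix) method that yields (name, ispkg) pairs. The InMemoryFinder class and the module names below are hypothetical and only illustrate the protocol; this is not how OxidizedFinder is implemented.

```python
import pkgutil
import sys


class InMemoryFinder:
    """Hypothetical stand-in for a finder that serves modules from memory."""

    def __init__(self, module_names):
        self._names = set(module_names)

    def find_spec(self, fullname, path=None, target=None):
        # Importing is out of scope for this sketch; only enumeration matters.
        return None

    def iter_modules(self, prefix=""):
        # Yield (name, ispkg) pairs for the top-level names this finder knows.
        top_level = {name.split(".")[0] for name in self._names}
        for name in sorted(top_level):
            is_pkg = any(n.startswith(name + ".") for n in self._names)
            yield prefix + name, is_pkg


finder = InMemoryFinder(
    ["oxidized_demo_pkg", "oxidized_demo_pkg.data", "oxidized_demo_mod"]
)
sys.meta_path.append(finder)
try:
    # Keep only the entries reported by our finder.
    found = {
        info.name: info.ispkg
        for info in pkgutil.iter_modules()
        if info.module_finder is finder
    }
finally:
    sys.meta_path.remove(finder)

print(found)
```

As far as I can tell, pkgutil.walk_packages descends into subpackages by resolving each package's __path__ entries through sys.path_hooks, so a meta path finder that only implements iter_modules would cover the top level but may need more work for nested packages.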

Thanks!

@philipkimmey
Author

I also just noticed some of the related discussion in #237.

indygreg added the enhancement label May 17, 2020
@indygreg
Owner

Oh, I had no clue there was an optional Finder.iter_modules() that things in the wild look for! We should definitely implement that!

As for exposing Python modules as resources, I actually have a half-finished patch somewhere that attempts to do this. One of the big areas of focus for the upcoming release has been shoring up the code in the pyembed crate. And exposing Python modules as resources is definitely on the short list of things I'd like to do before the next release.

Thank you for the excellent report. And thank you for fighting the good fight and porting botocore to the new Python API! Feel free to reference #69 in that PR so we have a better record of changes to __file__ in external projects.
