Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DataViewSchema overloads to ConvertToOnnx #6449

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

FranklinWhale
Copy link
Contributor

Fixes #6448

@codecov
Copy link

codecov bot commented Nov 13, 2022

Codecov Report

Merging #6449 (0180ff6) into main (87337c0) will decrease coverage by 0.05%.
The diff coverage is 0.00%.

❗ Current head 0180ff6 differs from pull request most recent head aed491f. Consider uploading reports for the commit aed491f to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6449      +/-   ##
==========================================
- Coverage   68.55%   68.49%   -0.06%     
==========================================
  Files        1170     1170              
  Lines      246977   246980       +3     
  Branches    25792    25792              
==========================================
- Hits       169303   169164     -139     
- Misses      70951    71075     +124     
- Partials     6723     6741      +18     
Flag Coverage Δ
Debug 68.49% <0.00%> (-0.06%) ⬇️
production 62.87% <0.00%> (-0.08%) ⬇️
test 88.96% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...Microsoft.ML.OnnxConverter/OnnxExportExtensions.cs 91.42% <0.00%> (-8.58%) ⬇️
...osoft.ML.KMeansClustering/KMeansPlusPlusTrainer.cs 83.92% <0.00%> (-7.16%) ⬇️
src/Microsoft.ML.FastTree/Training/StepSearch.cs 57.42% <0.00%> (-4.96%) ⬇️
src/Microsoft.ML.Data/Training/TrainerUtils.cs 66.26% <0.00%> (-3.82%) ⬇️
...crosoft.ML.StandardTrainers/Standard/SdcaBinary.cs 85.40% <0.00%> (-3.42%) ⬇️
src/Microsoft.ML.Sweeper/AsyncSweeper.cs 71.42% <0.00%> (-1.37%) ⬇️
...oft.ML.StandardTrainers/Standard/SdcaMulticlass.cs 91.46% <0.00%> (-1.03%) ⬇️
src/Microsoft.ML.Data/Utils/LossFunctions.cs 66.83% <0.00%> (-0.52%) ⬇️
...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs 89.61% <0.00%> (-0.16%) ⬇️
...ML.Transforms/Text/StopWordsRemovingTransformer.cs 86.38% <0.00%> (-0.15%) ⬇️
... and 1 more

@FranklinWhale
Copy link
Contributor Author

FranklinWhale commented Nov 13, 2022

No idea why UseKeyDataViewTypeAsUInt32InOnnxInput keeps failing... The exception occurs during cursor disposal.

@FranklinWhale FranklinWhale marked this pull request as ready for review November 14, 2022 18:39
@michaelgsharp
Copy link
Member

I added a comment in the issue itself, but I'll add it here as well.

So the call to Transform itself is a lazy call. Unless you get a cursor and start itterating over the data itself this call should basically be a NO-OP (not quite true as it does some validations and still sets up the pipeline, but no actual "processing" should take place).

Are you seeing something that is showing its actually doing data processing there?

@michaelgsharp
Copy link
Member

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@FranklinWhale
Copy link
Contributor Author

Actually the original issue is about more the need to convert an ITransformer to ONNX with DataViewSchema only, because the data that trained the model is not always available (e.g. the model is imported).

However, the failed test case UseKeyDataViewTypeAsUInt32InOnnxInput seems to suggest that some data processing was involved during the conversion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ConvertToOnnx should also accept DataViewSchema
2 participants