Hi everyone, I would like to know whether layer-by-layer inference on a pre-trained model (in fp32 or int8) is possible on a GPU with CUDA 11.2.
My idea is to take several fp32 and int8-quantized models from the ONNX Model Zoo repo and run inference layer by layer for feature extraction. I would then modify the output of each layer and feed it as the input to the following layer, with the final layer's output corresponding to the output of the original model.
The approximate code would look something like this:
```python
import numpy as np
import onnxruntime as ort

model_path = "model.onnx"
ort_session = ort.InferenceSession(model_path)
input_data = np.random.randn(1, 3, 32, 32).astype(np.float32)
# 'input1' / 'input2' below are placeholder names for per-layer inputs
conv1_output = ort_session.run(None, {'input1': input_data})[0]
conv2_output = ort_session.run(None, {'input2': conv1_output})[0]
# Now I can work with intermediate outputs, modify them and use them as new inputs
```
However, I tried to reproduce this code with a ResNet50 pre-trained model from the ONNX Model Zoo repo, but it seems this model, like the rest of the pre-trained models, only exposes one input and one output, so there is no way to access intermediate outputs.
So, is there any way I could do this? I have seen the Evaluation Step by Step documentation, but I am unsure whether ReferenceEvaluator also works for pre-trained/quantized models and, more importantly, whether it can be used to measure accuracy on a dataset like ImageNet.
Thank you!
What is your intended use-case? Is this to debug/understand what's happening (where performance does not matter), or is it for production use? For debugging/understanding, IIRC @xadupre has written utilities that will execute models node by node. He might be able to point you to them and answer your question about the reference evaluator as well.
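As a rough illustration in the meantime (not the utility mentioned above): a common workaround is to append the names of the intermediate tensors you care about to the model's graph outputs and then run a standard onnxruntime session. The sketch below assumes hypothetical tensor names (`conv1_out`, `layer1_out`); you would replace them with the real names found by inspecting the graph. In my experience recent onnxruntime versions accept outputs declared with only a name; if the session complains about missing type information, build the entries with `onnx.helper.make_tensor_value_info` and the correct dtype instead.

```python
import numpy as np
import onnx
import onnxruntime as ort

model = onnx.load("resnet50.onnx")

# Hypothetical tensor names; inspect the graph, e.g. [n.output for n in model.graph.node],
# to find the real names of the intermediate tensors you want to expose.
intermediate_names = ["conv1_out", "layer1_out"]

for name in intermediate_names:
    # Append an output entry carrying just the name; onnxruntime resolves type/shape at load time.
    value_info = onnx.ValueInfoProto()
    value_info.name = name
    model.graph.output.append(value_info)

sess = ort.InferenceSession(
    model.SerializeToString(),
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
x = np.random.randn(1, 3, 224, 224).astype(np.float32)
input_name = sess.get_inputs()[0].name
# Returns the original output(s) followed by the added intermediates.
outputs = sess.run(None, {input_name: x})
```

This keeps the whole model on the GPU execution provider, so it also works for the fp32 and int8 Model Zoo variants; the per-layer "modify and re-feed" step would then be done by slicing the graph or editing the intermediate tensors between runs.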
### Description
Intermediate results can only be printed right now. With this PR, they
can be returned as well.
### Motivation and Context
See #6025.
---------
Signed-off-by: Xavier Dupre <xadupre@microsoft.com>
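For context, a minimal sketch of how the reference evaluator is typically driven; the `intermediate=True` keyword below is an assumption about the option this PR introduces, so check the merged signature for the exact name, and `"input"` is a placeholder feed name:

```python
import numpy as np
from onnx.reference import ReferenceEvaluator

# Pure-Python evaluator: slow, but executes the graph node by node.
sess = ReferenceEvaluator("model.onnx")
x = np.random.randn(1, 3, 224, 224).astype(np.float32)

# Standard call: returns only the declared graph outputs.
outputs = sess.run(None, {"input": x})

# With this change, intermediate results can be returned as well
# (keyword name assumed here; see the merged PR for the actual API).
all_results = sess.run(None, {"input": x}, intermediate=True)
```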