
Is there a way to swap a set of parameters inside of an .onnx / .ort graph with an identically shaped set of parameters? #6090

Closed
jakemdaly opened this issue Apr 18, 2024 · 5 comments
Labels
question Questions about ONNX

Comments

@jakemdaly

Ask a Question

Question

I want to be able to swap params at inference time to facilitate a LoRA deployment.

E.g., in torch, I could do:

import torch
import torch.nn as nn

class myModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 8)
        self.act = nn.Sigmoid()

    def forward(self, x):
        return self.act(self.linear(x))

m = myModel()
x = torch.randn(1, 4)

pred_1 = m(x)

# swap weights (nn.Linear(4, 8) stores its weight as an (8, 4) tensor)
new_weights = nn.Parameter(torch.randn(8, 4))
m.linear.weight = new_weights

# call model with new weights
pred_2 = m(x)

Notes

I am using an ORT file for inference if that matters

@jakemdaly jakemdaly added the question Questions about ONNX label Apr 18, 2024
@gramalingam
Contributor

If you use the external-data format, you can replace the data file holding the external tensors with new values as you wish.

Alternatively, you can make the weights input parameters of the model and then vary them for each invocation. However, this incurs a performance penalty (potentially a large one) if ORT has to do things like move the weights to GPU or transpose them, etc. (work that is done once at session creation when the weights are not inputs).
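For concreteness, here is a minimal sketch of the weights-as-inputs approach using the onnx and onnxruntime Python APIs (the graph name, tensor names, and shapes are illustrative assumptions, not from the thread):

import numpy as np
import onnx
from onnx import TensorProto, helper
import onnxruntime as ort

# Build a tiny graph Y = Sigmoid(MatMul(X, W)) where the weight W is a
# graph input rather than an initializer, so it can be swapped per call.
# (Note: W is laid out (4, 8) here, i.e. not transposed like torch's Linear.)
X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 4])
W = helper.make_tensor_value_info("W", TensorProto.FLOAT, [4, 8])
Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [1, 8])
nodes = [
    helper.make_node("MatMul", ["X", "W"], ["Z"]),
    helper.make_node("Sigmoid", ["Z"], ["Y"]),
]
graph = helper.make_graph(nodes, "weight_swap_demo", [X, W], [Y])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
onnx.checker.check_model(model)

sess = ort.InferenceSession(model.SerializeToString())
x = np.random.randn(1, 4).astype(np.float32)
w1 = np.random.randn(4, 8).astype(np.float32)
w2 = np.random.randn(4, 8).astype(np.float32)
pred_1 = sess.run(["Y"], {"X": x, "W": w1})[0]
pred_2 = sess.run(["Y"], {"X": x, "W": w2})[0]  # same session, new weights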

@jakemdaly
Author

@gramalingam After reading the docs and tinkering with some of those functions, I am still not sure I quite understand the purpose of the external-data format, or whether it is compatible with the onnxruntime API (as opposed to onnx). What is the purpose of the format, and could you provide pseudo code showing how to load a subset of params with onnxruntime?

@gramalingam
Contributor

Yes, onnxruntime also supports the external-data format, which is part of the onnx standard. The external-data format serves a couple of purposes.

First, the protobuf format imposes a 2 GB limit on the size of a protobuf object (in terms of its serialized representation). Models that exceed this size can use the external-data format to get around the limitation.

Second, even when a model is under 2 GB, the weights dominate the size of its representation, so it is convenient and efficient to load them only when required. This helps analysis/optimization tools that care about the graph but not so much about the weights.
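To make this concrete, a rough sketch of how the external-data format could be used for weight swapping (the file paths here are hypothetical assumptions, not from the thread):

import onnx
import onnxruntime as ort

# Re-save an existing model so that all initializers live in a separate
# binary file next to the model instead of inside the protobuf.
model = onnx.load("model.onnx")
onnx.save_model(
    model,
    "model_external.onnx",
    save_as_external_data=True,
    all_tensors_to_one_file=True,
    location="weights.bin",  # stored relative to the model's directory
    size_threshold=0,        # externalize every tensor, not just large ones
)

# To swap parameters, overwrite (the relevant byte ranges of) weights.bin
# with new values of identical dtype/shape/order, then create a fresh
# session; the graph in model_external.onnx is untouched.
sess = ort.InferenceSession("model_external.onnx")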

@jakemdaly
Author

jakemdaly commented Apr 22, 2024

Is there a way to specify which parameters in the graph to load weights into? Or does this capability not exist yet?

@jakemdaly
Author

jakemdaly commented May 1, 2024

In my application I am adding an initializer with the AddInitializer method and then creating the session via the CreateSessionFromArray API. I am getting the following initialization error when I call CreateSessionFromArray:

[E:onnxruntime:, inference_session.cc:1935 onnxruntime::InferenceSession::Initialize::<lambda_5a23845ba810e30de3b9e7b450415bf5>::operator ()] Exception during initialization: C:\a\_work\1\s\onnxruntime\core\optimizer\initializer.cc:35 onnxruntime::Initializer::Initializer !model_path.IsEmpty() was false. model_path must not be empty. Ensure that a path is provided when the model is created or loaded.

Because I am not supplying a model path (I'm initializing from an array), does this imply the two methods are not compatible?
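For reference, the equivalent setup in the onnxruntime Python API looks roughly like this (the tensor name "linear.weight" and the model path are hypothetical; this mirrors AddInitializer plus CreateSessionFromArray, and is not a confirmed workaround for the error above):

import numpy as np
import onnxruntime as ort

with open("model.onnx", "rb") as f:  # hypothetical path
    model_bytes = f.read()           # analogous to CreateSessionFromArray's buffer

# Pre-allocated OrtValue that overrides the initializer of the same name.
new_w = np.random.randn(8, 4).astype(np.float32)
w_value = ort.OrtValue.ortvalue_from_numpy(new_w)

so = ort.SessionOptions()
so.add_initializer("linear.weight", w_value)  # analogous to AddInitializer
sess = ort.InferenceSession(model_bytes, so)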
