RuntimeError: Unable to find data type for weight_name='/encoder/layer.0/attention/output/dense/MatMul_output_0'. shape_inference failed to return a type probably this node is from a different domain or using an input produced by such an operator. This may happen if you quantize a model already quantized. You may use extra_options `DefaultTensorType` to indicate the default weight type, usually `onnx.TensorProto.FLOAT`. #2598

ARES3366 · 2024-04-17T07:13:48Z

RuntimeError: Unable to find data type for weight_name='/encoder/layer.0/attention/output/dense/MatMul_output_0'. shape_inference failed to return a type probably this node is from a different domain or using an input produced by such an operator. This may happen if you quantize a model already quantized. You may use extra_options DefaultTensorType to indicate the default weight type, usually onnx.TensorProto.FLOAT.


ARES3366 · 2024-04-17T07:15:49Z

import onnx
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

dynamic_quantizer = ORTQuantizer.from_pretrained(output_model_path, 'model_optimized.onnx')

# Note: extra_options is defined here but never passed to the quantizer.
extra_options = {'DefaultTensorType': onnx.TensorProto.FLOAT}
dqconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

dynamic_quantizer.quantize(save_dir=output_model_path, quantization_config=dqconfig)
tokenizer.save_pretrained(output_model_path)

How should I change this code to get rid of the error?
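
One possible direction, offered as a sketch rather than a confirmed fix: the snippet above builds `extra_options` but never hands it to anything, so the `DefaultTensorType` hint from the error message never reaches the quantizer. I am not certain that `AutoQuantizationConfig.avx512_vnni` or `ORTQuantizer.quantize` forward arbitrary extra options, so the example below bypasses optimum and calls `onnxruntime.quantization.quantize_dynamic` directly, which does take an `extra_options` dict. The function names `quantized_path` and `quantize_with_default_type` and the `_quantized` file suffix are hypothetical, and the settings (int8 weights, `per_channel=False`) only roughly mirror the dynamic config above. It assumes an onnxruntime release recent enough to recognize `DefaultTensorType`.

```python
from pathlib import Path


def quantized_path(model_path: str) -> Path:
    """Hypothetical helper: 'model_optimized.onnx' -> 'model_optimized_quantized.onnx'."""
    src = Path(model_path)
    return src.with_name(f"{src.stem}_quantized{src.suffix}")


def quantize_with_default_type(model_path: str) -> Path:
    """Dynamically quantize an ONNX file, declaring FLOAT as the fallback tensor type."""
    # Imported lazily so the path helper above works without these packages installed.
    import onnx
    from onnxruntime.quantization import QuantType, quantize_dynamic

    dst = quantized_path(model_path)
    quantize_dynamic(
        model_input=model_path,
        model_output=str(dst),
        per_channel=False,
        weight_type=QuantType.QInt8,
        # The option named in the error message; used when shape inference
        # cannot determine a tensor's type.
        extra_options={"DefaultTensorType": onnx.TensorProto.FLOAT},
    )
    return dst
```

With the paths from the question, the call would look like `quantize_with_default_type(f"{output_model_path}/model_optimized.onnx")`. The error message also says this can happen when quantizing a model that is already quantized, so it may be worth checking whether `model_optimized.onnx` has been through quantization before, and if so, starting from the original export instead.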
