Create error codes and reference pages for checker errors #6082

Open
justinchuby opened this issue Apr 12, 2024 · 10 comments
@justinchuby
Contributor

justinchuby commented Apr 12, 2024

With some ChatGPT help:

Based on the provided code, here are some possible errors that the graph checker function (check_graph) is designed to detect:

  1. Empty Name Field: Ensuring that the name field of the graph is not empty.
  2. Duplicate Input or Output Names: Checking if input or output names are unique.
  3. Tensor Initializer Name Uniqueness: Ensuring that tensor initializer names are unique.
  4. Sparse Tensor Initializer Name Uniqueness: Ensuring that sparse tensor initializer names are unique across initializers and sparse initializers.
  5. Tensor Initializer Non-Empty Name: Checking that tensor initializers have non-empty names.
  6. Sparse Tensor Initializer Non-Empty Name: Checking that sparse tensor initializers have non-empty names.
  7. Initializer in Graph Input: For IR versions prior to 4, ensuring that initializers are also present in graph inputs.
  8. Topological Sort of Nodes: Ensuring that nodes are in topologically sorted order.
  9. Input of Node Not Output of Previous Nodes: Detecting when the input of a node is not the output of any previous nodes, violating topological sorting.
  10. Experimental Op Usage: Checking if any experimental operations are used.
  11. Node Output Name Uniqueness: Ensuring that node output names are unique.
  12. Graph Output Not Output of Any Node: Detecting when a graph output is not the output of any node in the graph.

These checks aim to ensure that the graph adheres to certain constraints and conventions, such as single static assignment (SSA) form and topological sorting of nodes, which are common requirements in graph-based computational frameworks like ONNX.
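Items 8, 9, 11 and 12 above can be sketched in a few lines of plain Python. This is only an illustration on a simplified graph model, not the checker's actual code: the `Node` class and the `check_nodes` function are hypothetical names, and the graph is reduced to lists of tensor names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a NodeProto: only input and output names."""
    inputs: list
    outputs: list


def check_nodes(graph_inputs, initializers, nodes, graph_outputs):
    """Return a list of (check title, detail) pairs; empty means all checks passed."""
    errors = []
    # Names available before any node runs.
    defined = set(graph_inputs) | set(initializers)
    # Names produced by nodes, tracked separately for the graph output check.
    produced = set()
    for index, node in enumerate(nodes):
        for name in node.inputs:
            # An empty name marks an omitted optional input and is skipped.
            if name and name not in defined:
                errors.append(("Input of Node Not Output of Previous Nodes",
                               f"node {index} reads '{name}'"))
        for name in node.outputs:
            if not name:
                continue
            if name in defined:
                errors.append(("Node Output Name Uniqueness",
                               f"node {index} redefines '{name}'"))
            defined.add(name)
            produced.add(name)
    for name in graph_outputs:
        if name not in produced:
            errors.append(("Graph Output Not Output of Any Node", f"'{name}'"))
    return errors
```

Walking the nodes once in list order covers both the topological check and the SSA check, because a name is only added to `defined` after the node that produces it has been visited.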

@justinchuby justinchuby self-assigned this Apr 12, 2024
@justinchuby
Contributor Author

justinchuby commented Apr 14, 2024

errors:
  - code: empty-graph-name
    explanation: The 'name' field of a GraphProto must not be empty.
  
  - code: duplicate-input-output-names
    explanation: Input and output names within a GraphProto must be unique.
  
  - code: non-unique-tensor-initializer-name
    explanation: Tensor initializer names within a GraphProto must be unique.
  
  - code: non-unique-sparse-tensor-initializer-name
    explanation: Sparse tensor initializer names within a GraphProto must be unique across both initializers and sparse initializers.
  
  - code: empty-tensor-initializer-name
    explanation: Tensor initializers within a GraphProto must have non-empty names.
  
  - code: empty-sparse-tensor-initializer-name
    explanation: Sparse tensor initializers within a GraphProto must have non-empty names.
  
  - code: initializer-not-in-graph-input
    explanation: For IR versions prior to 4, tensor initializers within a GraphProto must also be present in the graph inputs.
  
  - code: node-topological-sorting
    explanation: Nodes within a GraphProto must be in topologically sorted order.
  
  - code: input-not-output-of-previous-nodes
    explanation: Each input of a node within a GraphProto must be a graph input, an initializer, or the output of a previous node; otherwise the nodes are not topologically sorted.
  
  - code: experimental-op-usage
    explanation: Experimental operations should be avoided within a GraphProto.
  
  - code: non-unique-node-output-name
    explanation: Output names of nodes within a GraphProto must be unique.
  
  - code: graph-output-not-from-any-node
    explanation: Graph outputs within a GraphProto must be outputs of nodes within the same graph.

  - code: ssa-violation
    explanation: "The GraphProto must be in Single Static Assignment (SSA) form, meaning each variable (tensor) is assigned exactly once. This error occurs when a variable is assigned multiple times within the graph, violating the SSA form."
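One way such a list could be consumed is a small registry plus an exception type that carries the code, so that every failure message starts with a stable identifier. The sketch below is an assumption about how this might look, not an existing ONNX API: `CheckerError`, `ERRORS` and `CodedValidationError` are hypothetical names, and only two entries from the list are copied in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckerError:
    """One entry of the error list: a stable code and its explanation."""
    code: str
    explanation: str


# Two entries copied from the list above; the others would follow the same pattern.
ERRORS = {
    entry.code: entry
    for entry in [
        CheckerError("empty-graph-name",
                     "The 'name' field of a GraphProto must not be empty."),
        CheckerError("node-topological-sorting",
                     "Nodes within a GraphProto must be in topologically sorted order."),
    ]
}


class CodedValidationError(Exception):
    """Hypothetical exception type that carries an error code."""

    def __init__(self, code, detail=""):
        # Looking up the code first means unregistered codes fail with KeyError.
        self.error = ERRORS[code]
        self.code = code
        message = f"[{code}] {self.error.explanation}"
        if detail:
            message += f" ({detail})"
        super().__init__(message)
```

Raising `CodedValidationError("empty-graph-name", "graph #0")` would then produce a message beginning with `[empty-graph-name]`, which a reference page could be keyed on.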

@justinchuby
Contributor Author

errors:
  - code: empty-value-info-name
    explanation: The 'name' field of a ValueInfoProto must not be empty.
  - code: invalid-tensor-type
    explanation: The 'type' field of a TensorTypeProto in a ValueInfoProto must be present and valid.
  - code: invalid-optional-type
    explanation: The 'type' field of an OptionalTypeProto in a ValueInfoProto must be present and valid.
  - code: invalid-sequence-type
    explanation: The 'type' field of a SequenceTypeProto in a ValueInfoProto must be present and valid.
  - code: invalid-map-type
    explanation: The 'type' field of a MapTypeProto in a ValueInfoProto must be present and valid.
  - code: unrecognized-type-value-case
    explanation: The 'type' field of a ValueInfoProto contains an unrecognized value case.
  - code: undefined-tensor-data-type
    explanation: The 'data_type' field of a TensorProto must be defined and not set to UNDEFINED.
  - code: tensor-data-field-mismatch
    explanation: The data field of a TensorProto does not match the specified data type.
  - code: externally-stored-tensor-has-data-field
    explanation: Externally stored tensors should not have data fields.
  - code: externally-stored-tensor-missing-location
    explanation: Externally stored tensors must have a 'location' specified.
  - code: zero-element-tensor-has-data
    explanation: A tensor with zero elements should not contain any data.
  - code: multi-value-field-for-non-zero-element-tensor
    explanation: A tensor with non-zero elements should contain exactly one value field.
  - code: string-data-in-raw-data-field
    explanation: STRING data should not be stored in the raw_data field of a TensorProto.
  - code: invalid-sparse-tensor-indices-size
    explanation: The size of the indices field in a SparseTensorProto does not match the number of non-zero elements (NNZ).
  - code: sparse-tensor-indices-out-of-range
    explanation: Indices in a SparseTensorProto are out of range.
  - code: unsorted-sparse-tensor-indices
    explanation: Indices in a SparseTensorProto are not in sorted order.
  - code: sparse-tensor-indices-not-in-lexicographic-order
    explanation: Indices in a SparseTensorProto are not in lexicographically sorted order.
  - code: attribute-multiple-value-fields
    explanation: An attribute should not contain more than one value field.
  - code: attribute-refers-to-parent-attribute-but-has-value-field
    explanation: An attribute referring to a parent attribute should not have its own value field set.
  - code: invalid-opset-import
    explanation: No opset import found for the domain referenced by the node.
  - code: deprecated-op
    explanation: The operator referenced by the node is deprecated.
  - code: no-op-registered
    explanation: No operator is registered for the domain and version referenced by the node.
  - code: op-is-deprecated
    explanation: The operator referenced by the node is deprecated in the specified domain version.
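As an illustration of the three sparse tensor index codes above, here is a minimal sketch for the simplest case, where indices are given as a flat list of linearized offsets. The function name `check_linear_sparse_indices` and its parameters are hypothetical, and treating "sorted" as strictly ascending is an assumption.

```python
def check_linear_sparse_indices(indices, nnz, dense_size):
    """Return the error codes that apply to a flat list of linearized indices.

    indices:    offsets into the dense tensor, one per non-zero value
    nnz:        number of non-zero values stored in the sparse tensor
    dense_size: total number of elements in the dense tensor
    """
    codes = []
    if len(indices) != nnz:
        codes.append("invalid-sparse-tensor-indices-size")
    if any(i < 0 or i >= dense_size for i in indices):
        codes.append("sparse-tensor-indices-out-of-range")
    # Assumption: sorted means strictly ascending, so repeated indices also fail.
    if any(a >= b for a, b in zip(indices, indices[1:])):
        codes.append("unsorted-sparse-tensor-indices")
    return codes
```

Returning a list rather than raising on the first failure lets a caller report every applicable code at once.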

@justinchuby
Contributor Author

Based on the provided code, here are the possible errors that could occur during the model checking process:

  1. No Opset registered for domain: This error occurs if there is no Opset registered for a specific domain.

  2. Opset import missing for a node in a function body: This error is raised when a node in a function body uses a domain for which the model has no opset import.

  3. Opset versions are incompatible: This error occurs when the Opset versions imported by the function and the model are incompatible.

  4. Graph must be in single static assignment (SSA) form: This error indicates that a name has been assigned (used as an output) more than once in the graph.

  5. Function should not have duplicate outputs specified: This error occurs when duplicate outputs are specified for a function.

  6. Function should not have duplicate attributes specified: This error occurs when duplicate attributes are specified for a function.

  7. Nodes in a function must be topologically sorted: This error is raised when nodes in a function are not in topologically sorted order.

  8. Function must be in single static assignment (SSA) form: This error indicates that the function must be in SSA form, but a variable has been used as output names multiple times.

  9. Location of external TensorProto should be a relative path: This error occurs when the location of an external TensorProto is specified as an absolute path.

  10. Location of external TensorProto should not be empty: This error indicates that the location of an external TensorProto is empty.

  11. Location of external TensorProto points outside the directory: This error occurs when the location of an external TensorProto points outside the specified directory.

  12. Data of TensorProto should be stored in a valid location: This error indicates that the data of a TensorProto should be stored in a valid location, but it either doesn't exist or is not accessible.

  13. Data of TensorProto should be stored in a regular file: This error occurs when the data of a TensorProto is not stored in a regular file.

  14. Model does not have an ir_version set properly: This error occurs when the model does not have the ir_version set properly.

  15. Model ir_version is higher than the checker's: This error indicates that the model's ir_version is higher than the checker's.

  16. Model has duplicate keys in metadata_props: This error occurs when the model has duplicate keys in metadata_props.

  17. Model with IR version >= 3 must specify opset_import for ONNX: This error occurs when a model with IR version 3 or higher does not specify opset_import for ONNX.

  18. Model with IR version < 3 cannot have opset_import specified: This error occurs when a model with IR version lower than 3 specifies opset_import.

These are the possible errors that could be encountered during the model checking process based on the provided code.

@justinchuby
Contributor Author

justinchuby commented Apr 14, 2024

- code: no-opset-registered
  explanation: No Opset registered for the specified domain.

- code: opset-import-missing
  explanation: The model has no opset import for the domain used by a node in a function body.

- code: incompatible-opset-versions
  explanation: Opset versions imported by the function and the model are incompatible.

- code: ssa-form-multiple-usage
  explanation: Graph must be in single static assignment (SSA) form, but a name has been assigned (used as an output) more than once.

- code: duplicate-function-outputs
  explanation: Function should not have duplicate outputs specified.

- code: duplicate-function-attributes
  explanation: Function should not have duplicate attributes specified.

- code: unsorted-function-nodes
  explanation: Nodes in a function must be topologically sorted.

- code: ssa-form-multiple-outputs
  explanation: Function must be in single static assignment (SSA) form, however, a variable has been used as output names multiple times.

- code: external-tensorpath-absolute
  explanation: Location of external TensorProto should be a relative path, but it is specified as an absolute path.

- code: empty-external-tensorpath
  explanation: Location of external TensorProto should not be empty.

- code: external-tensorpath-outside-directory
  explanation: Location of external TensorProto points outside the specified directory.

- code: invalid-tensorpath
  explanation: Data of TensorProto should be stored in a valid location, but it either doesn't exist or is not accessible.

- code: non-regular-file
  explanation: Data of TensorProto should be stored in a regular file.

- code: invalid-ir-version
  explanation: Model does not have an ir_version set properly.

- code: higher-ir-version
  explanation: Model ir_version is higher than the checker's.

- code: duplicate-metadata-props
  explanation: Model has duplicate keys in metadata_props.

- code: missing-opset-import
  explanation: Model with IR version >= 3 must specify opset_import for ONNX.

- code: invalid-opset-import
  explanation: Model with IR version < 3 cannot have opset_import specified.
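The last five codes in this list describe model-level rules that are easy to state as code. The following is a rough sketch under stated assumptions, not the checker's implementation: `check_model_header` and its parameters are hypothetical, and the default ONNX domain is assumed to be written as either the empty string or `ai.onnx`.

```python
# Assumption: the default ONNX operator domain is spelled "" or "ai.onnx".
ONNX_DOMAINS = {"", "ai.onnx"}


def check_model_header(ir_version, checker_ir_version, opset_domains, metadata_keys):
    """Return the model-level error codes that apply, in the order checked.

    ir_version:         the model's ir_version, or None if it is unset
    checker_ir_version: the highest IR version the checker understands
    opset_domains:      domains listed in the model's opset_import
    metadata_keys:      keys of metadata_props, in file order
    """
    if not ir_version:
        # Without a valid IR version the remaining rules cannot be applied.
        return ["invalid-ir-version"]
    codes = []
    if ir_version > checker_ir_version:
        codes.append("higher-ir-version")
    if len(set(metadata_keys)) != len(metadata_keys):
        codes.append("duplicate-metadata-props")
    if ir_version >= 3:
        if not ONNX_DOMAINS.intersection(opset_domains):
            codes.append("missing-opset-import")
    elif opset_domains:
        codes.append("invalid-opset-import")
    return codes
```

Note that `invalid-opset-import` is also used in an earlier list in this thread with a different meaning, so one of the two would need a new name before the codes are finalized.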

@justinchuby
Contributor Author

- code: opset-no-registered-domain
  explanation: No Opset is registered for the domain specified in the node.

- code: opset-model-missing-import
  explanation: The model does not include an opset import for a node present in a function body.

- code: opset-version-mismatch
  explanation: Opset versions imported by the function and the model are not compatible.

- code: opset-schema-not-found
  explanation: Schemas for the specified op and opset versions are not found, possibly because the op belongs to a custom domain.

- code: function-no-name
  explanation: The name field of a function must not be empty.

- code: function-no-domain
  explanation: For IR versions greater than or equal to 8, functions must specify the domain.

- code: function-input-multiple-uses
  explanation: Function inputs must have unique names and cannot be used multiple times.

- code: function-duplicate-outputs
  explanation: Functions should not have duplicate outputs specified.

- code: function-duplicate-attributes
  explanation: Functions should not have duplicate attributes specified.

- code: function-topological-sorting
  explanation: Nodes in a function must be in topologically sorted order, and each node's inputs should come from outputs of previous nodes.

- code: function-output-non-unique-name
  explanation: Function output names must be unique within the function.

- code: function-not-in-ssa-form
  explanation: Functions must be in single static assignment (SSA) form, meaning each variable is assigned exactly once.

- code: function-opset-incompatible
  explanation: Opset import for a domain in the function's operation is not compatible with the version imported by the model.

- code: model-no-ir-version
  explanation: The model does not have the ir_version set properly.

- code: model-ir-version-higher-than-checker
  explanation: The model's ir_version is higher than the checker's ir_version.

- code: model-duplicate-metadata-keys
  explanation: The model has duplicate keys in metadata_props.

- code: model-no-opset-import-ir3
  explanation: Models with IR version greater than or equal to 3 must specify opset_import for ONNX.

- code: model-opset-import-not-specified
  explanation: Models with IR version less than 3 cannot have opset_import specified.

- code: external-tensor-relative-path
  explanation: Location of an external TensorProto should be a relative path, not an absolute path.

- code: external-tensor-outside-directory
  explanation: The location of an external TensorProto points outside the specified directory.

- code: external-tensor-path-not-accessible
  explanation: The location of an external TensorProto does not exist or is not accessible.

- code: external-tensor-empty-location
  explanation: The location of an external TensorProto should not be empty.

- code: external-tensor-not-regular-file
  explanation: Data of an external TensorProto should be stored in a regular file, not a directory or a symlink.
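The five `external-tensor-*` codes above map naturally onto a sequence of path checks. The sketch below uses only `pathlib` and is an approximation of the rules as described in this list, not the checker's own logic; `check_external_location` is a hypothetical helper and `model_dir` stands for the directory the model file is loaded from.

```python
from pathlib import Path


def check_external_location(location, model_dir):
    """Return the first error code that applies, or None if the location is fine."""
    if not location:
        return "external-tensor-empty-location"
    if Path(location).is_absolute():
        return "external-tensor-relative-path"
    base = Path(model_dir).resolve()
    candidate = base / location
    # Resolving collapses ".." segments and follows symlinks before comparing.
    if not candidate.resolve().is_relative_to(base):
        return "external-tensor-outside-directory"
    if not candidate.exists():
        return "external-tensor-path-not-accessible"
    # Per the explanation above, directories and symlinks are both rejected.
    if candidate.is_symlink() or not candidate.is_file():
        return "external-tensor-not-regular-file"
    return None
```

`Path.is_relative_to` requires Python 3.9 or later. The order of the checks matters: the containment test runs before the existence test so that a path escaping the directory is reported as such even when the target does not exist.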

@justinchuby
Contributor Author

Still to cover: graph input/output shape and type checks.

@justinchuby
Contributor Author

Shape inference

Sure, here are YAML files for each of the errors along with their error codes and explanations:

  1. Mismatched Element Types:
- code: mismatched-element-types
  explanation: |
    The inferred element type of a tensor does not match the existing element type.
  2. Differences in Rank:
- code: rank-mismatch
  explanation: |
    The inferred shape and existing shape have different ranks.
  3. Mismatched Dimension Sizes:
- code: dimension-size-mismatch
  explanation: |
    The inferred shape and existing shape differ in dimension sizes.
  4. Unsupported Type Cases:
- code: unsupported-type-case
  explanation: |
    The type case encountered during shape inference is unsupported.
  5. Unsupported Operations:
- code: unsupported-operation
  explanation: |
    The shape inference encountered an unsupported operation.
  6. Incomplete Schema or Function Information:
- code: incomplete-schema-information
  explanation: |
    The schema for the operation is not defined or incomplete.
  7. Undefined Value Cases:
- code: undefined-value-case
  explanation: |
    The value case of a type is unset, indicating an undefined type.
  8. Errors during Node Processing:
- code: node-processing-error
  explanation: |
    An error occurred during node processing, such as missing attribute values or unsupported operations.
  9. Missing Symbolic Shapes:
- code: missing-symbolic-shapes
  explanation: |
    Symbolic shapes were not properly generated or materialized.
  10. Inconsistent Initializer Information:
- code: inconsistent-initializer-information
  explanation: |
    There is inconsistency between the information provided in the initializers and the input/output definitions.

You can use these YAML files to organize and manage the errors in your system.
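The first three codes in this list (element type, rank and dimension size) describe a comparison between an inferred type and one already recorded in the model. A minimal sketch of that comparison might look as follows; `compare_inferred` is a hypothetical function, and the shape encoding (an `int` for a known size, a `str` for a symbolic dimension, `None` for an unknown dimension or unknown rank) is an assumption made for the example.

```python
def compare_inferred(existing_elem, inferred_elem, existing_shape, inferred_shape):
    """Return the first mismatch code, or None if the two descriptions agree.

    Shapes are lists whose entries are an int (known size), a str (symbolic
    dimension) or None (unknown dimension). A shape of None means unknown rank.
    """
    if existing_elem != inferred_elem:
        return "mismatched-element-types"
    if existing_shape is None or inferred_shape is None:
        # Nothing to compare when either rank is unknown.
        return None
    if len(existing_shape) != len(inferred_shape):
        return "rank-mismatch"
    for existing_dim, inferred_dim in zip(existing_shape, inferred_shape):
        # Only two concrete sizes can conflict; symbolic or unknown dims are compatible.
        both_known = isinstance(existing_dim, int) and isinstance(inferred_dim, int)
        if both_known and existing_dim != inferred_dim:
            return "dimension-size-mismatch"
    return None
```

For example, comparing an existing shape `[1, "N"]` against an inferred shape `[1, 3]` reports no error, because the symbolic dimension `"N"` is compatible with any concrete size.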

@justinchuby
Contributor Author

- code: shape-inference-failure
  explanation: |
    An error occurred during shape inference for the given model or function. This error indicates that the shape inference process failed to determine the shapes of one or more tensors in the model or function.

- code: function-opset-imports-error
  explanation: |
    An error occurred while retrieving the opset imports for a function. This error indicates that there was an issue extracting opset imports from the function's definition.

- code: graph-opset-imports-error
  explanation: |
    An error occurred while retrieving the opset imports for a graph. This error indicates that there was an issue extracting opset imports from the graph's definition.

- code: invalid-inputs-error
  explanation: |
    An error occurred due to invalid inputs provided to the shape inference process. This error indicates that the inputs provided to the shape inference process were not valid or were incompatible with the model or function being analyzed.

- code: initializer-name-conflict-error
  explanation: |
    An error occurred due to conflicting names between initializers and subgraph inputs. This error indicates that there was a naming conflict between initializers and subgraph inputs, which is not allowed.

- code: graph-input-mismatch-error
  explanation: |
    An error occurred due to a mismatch between the number of graph inputs and the number of provided inputs. This error indicates that the number of inputs provided to the graph does not match the expected number of inputs.

- code: missing-input-error
  explanation: |
    An error occurred due to missing inputs for the shape inference process. This error indicates that one or more required inputs were missing, which prevented the shape inference process from completing successfully.

- code: infer-function-output-types-error
  explanation: |
    An error occurred during the inference of function output types. This error indicates that there was an issue while inferring the output types of a function, which prevented the process from completing successfully.

- code: symbol-table-update-error
  explanation: |
    An error occurred while updating the symbol table. This error indicates that there was an issue updating the symbol table during the shape inference process.

- code: node-processing-error
  explanation: |
    An error occurred during the processing of a node. This error indicates that there was an issue processing a node in the model or function being analyzed, which prevented the shape inference process from completing successfully.

- code: subgraph-processing-error
  explanation: |
    An error occurred during the processing of a subgraph. This error indicates that there was an issue processing a subgraph within the model or function being analyzed, which prevented the shape inference process from completing successfully.

- code: unsupported-operation-error
  explanation: |
    An error occurred due to an unsupported operation encountered during shape inference. This error indicates that the shape inference process encountered an operation that is not supported, which prevented the process from completing successfully.
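Because these lists were drafted in several passes, some codes repeat: `node-processing-error` appears both in this list and in the earlier shape inference list. A tiny lint along these lines could flag such collisions before the codes are finalized. The function name `find_duplicate_codes` is hypothetical, and the input is assumed to be plain lists of code strings already extracted from the YAML.

```python
from collections import Counter


def find_duplicate_codes(*code_lists):
    """Return, sorted, every code that occurs more than once across all lists."""
    counts = Counter(code for codes in code_lists for code in codes)
    return sorted(code for code, count in counts.items() if count > 1)


# A few codes from the two shape inference lists in this thread.
first_draft = ["unsupported-operation", "node-processing-error"]
second_draft = ["shape-inference-failure", "node-processing-error"]
duplicates = find_duplicate_codes(first_draft, second_draft)
```

Here `duplicates` contains only `node-processing-error`, since the other three codes are each used once.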

@justinchuby
Contributor Author

- code: subgraph-initializer-duplicate-name
  explanation: |
    Cannot use the same name as both a subgraph initializer and subgraph input.

@justinchuby justinchuby changed the title Create error codes for checker errors Create error codes and a knowledge base for checker errors Apr 14, 2024
@justinchuby justinchuby changed the title Create error codes and a knowledge base for checker errors Create error codes and reference pages for checker errors Apr 14, 2024
@justinchuby
Contributor Author

justinchuby commented Apr 15, 2024

Still to cover: IR version checks, for features, types, etc. that are unsupported in a given IR version.
