Olive-ai 0.2.0

@leqiao-1 released this on 17 May · fb639ba

Examples

New examples have been added in this release.

General

  • Simplify the data loading experience by adding transformers data config support. For transformer models, users can use hf_config.dataset to leverage online Hugging Face datasets (see the sketch after this list).
  • Ease the process of setting up the environment: users can run olive.workflows.run --config config.json --setup to install the packages required by the configured passes.
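
A minimal sketch of these two features together, assuming a Hugging Face text-classification model; the dataset field names (data_name, subset, split, input_cols, label_cols) are taken from the Olive BERT example and should be treated as assumptions for other tasks:

```python
# Sketch: workflow config that loads data via hf_config.dataset.
# The dataset field names below are assumptions based on the Olive BERT example.
config = {
    "input_model": {
        "type": "PyTorchModel",
        "config": {
            "hf_config": {
                "model_name": "Intel/bert-base-uncased-mrpc",
                "task": "text-classification",
                "dataset": {
                    "data_name": "glue",
                    "subset": "mrpc",
                    "split": "validation",
                    "input_cols": ["sentence1", "sentence2"],
                    "label_cols": ["label"],
                    "batch_size": 1,
                },
            }
        },
    },
    # "passes", "evaluators", "engine", ... omitted
}
```

With this saved as config.json, running olive.workflows.run --config config.json --setup would install the packages needed by the configured passes before the workflow is executed.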

Passes (optimization techniques)

  • Integrate Intel® Neural Compressor into Olive: introduce new passes IncStaticQuantization, IncDynamicQuantization, and IncQuantization.
  • Integrate Vitis-AI into Olive: introduce new pass VitisAIQuantization.
  • Introduce OnnxFloatToFloat16: converts a model to float16. It is based on onnxconverter-common.convert_float_to_float16.
  • Introduce OrtMixedPrecision: converts model to mixed precision to retain a certain level of accuracy.
  • Introduce AppendPrePostProcessingOps: adds pre/post-processing nodes to the input model.
  • Introduce InsertBeamSearch: chains two model components (for example, encoder and decoder) together by inserting a beam search op between them.
  • Support external data for all ONNX passes.
  • Enable transformer optimization fusion options in the workflow file.
  • Expose extra_options in ONNX quantization passes (see the sketch after this list).
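
A sketch of how a passes section might pull several of these together. The OnnxFloatToFloat16 type name comes from this release; the OrtTransformersOptimization and OnnxStaticQuantization names and the config fields shown (optimization_options, save_as_external_data, extra_options) are assumptions drawn from the Olive examples:

```python
# Sketch: "passes" section of a workflow config using some of the new features.
# Config field names are assumptions based on the Olive examples.
passes = {
    "transformers_opt": {
        "type": "OrtTransformersOptimization",
        "config": {
            # fusion options exposed in the workflow file
            "optimization_options": {"enable_gelu": True, "enable_layer_norm": True},
        },
    },
    "to_fp16": {
        "type": "OnnxFloatToFloat16",  # based on onnxconverter-common
        "config": {
            "save_as_external_data": True,  # external data support for ONNX passes
        },
    },
    "quantize": {
        "type": "OnnxStaticQuantization",
        "config": {
            # extra_options are forwarded to onnxruntime quantization
            "extra_options": {"WeightSymmetric": True},
        },
    },
}
```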

Models

  • Introduce DistributedOnnxModel to support distributed inferencing.
  • Introduce CompositeOnnxModel to represent models with encoder and decoder subcomponents as individual OnnxModels.
  • Add io_config to PyTorchModel, including input_names, input_shapes, output_names and dynamic_axes (see the sketch after this list).
  • Add MLflow model loader.
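
A sketch of an input_model entry using the new io_config fields; only the field names come from the notes above, and the shapes and axis names are illustrative:

```python
# Sketch: PyTorchModel input with the new io_config fields. Values are
# illustrative placeholders, not taken from a real model.
input_model = {
    "type": "PyTorchModel",
    "config": {
        "model_path": "model.pt",
        "io_config": {
            "input_names": ["input_ids", "attention_mask"],
            "input_shapes": [[1, 128], [1, 128]],
            "output_names": ["logits"],
            "dynamic_axes": {
                "input_ids": {"0": "batch_size", "1": "seq_len"},
                "attention_mask": {"0": "batch_size", "1": "seq_len"},
            },
        },
    },
}
```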

Systems

  • Introduce PythonEnvironmentSystem: a Python environment on the host machine. This system allows users to evaluate models using onnxruntime or packages installed in a different Python environment (see the sketch below).
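
A sketch of a systems entry for this new system type; the type string "PythonEnvironment" and the python_environment_path field are assumptions based on the Olive documentation:

```python
# Sketch: a PythonEnvironmentSystem used as an evaluation target. The type string
# and config key are assumptions; the path points at the bin/Scripts directory of
# an environment that has onnxruntime (and any evaluation packages) installed.
systems = {
    "python_env": {
        "type": "PythonEnvironment",
        "config": {
            "python_environment_path": "/home/user/.venvs/ort-env/bin",
        },
    },
}
```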

Evaluator

  • Remove target from the evaluator config.
  • Introduce a dummy dataloader for latency evaluation (see the sketch below).
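
A sketch of a latency metric that leans on the dummy dataloader; the assumption here is that when no user dataloader or dataset is supplied, dummy inputs are generated from the model's io_config input shapes:

```python
# Sketch: latency metric with no user-supplied dataloader; dummy inputs are
# assumed to be generated from io_config.input_shapes. Field names follow the
# Olive examples and are assumptions for this release.
latency_metric = {
    "name": "latency",
    "type": "latency",
    "sub_types": [{"name": "avg"}],
    # no user_config/dataloader: the dummy dataloader supplies the inputs
}
```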

Metrics

  • Introduce priority_rank: when multiple metrics are configured, users need to specify "priority_rank": rank_num for each metric. Olive uses the priority_ranks of the metrics to determine the best model (see the sketch below).
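
A sketch with two metrics ranked by priority_rank; the metric and sub_type field names follow the Olive examples and are assumptions here:

```python
# Sketch: accuracy is ranked ahead of latency when Olive picks the best model.
metrics = [
    {
        "name": "accuracy",
        "type": "accuracy",
        "sub_types": [{"name": "accuracy_score"}],
        "priority_rank": 1,  # highest priority
    },
    {
        "name": "latency",
        "type": "latency",
        "sub_types": [{"name": "avg"}],
        "priority_rank": 2,
    },
]
```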

Engine

  • Introduce Olive Footprint: generate report JSON files, including footprints.json and the Pareto frontier footprints, and dump the frontier to an HTML file or image.
  • Introduce Packaging Olive artifacts: packages CandidateModels, SampleCode and ONNXRuntimePackages in the output_dir folder when packaging is configured in the engine configuration.
  • Introduce log_severity_level to control logging verbosity (see the sketch below).
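
A sketch of an engine section using packaging and log_severity_level; the packaging_config type name "Zipfile" and the exact severity mapping are assumptions based on the Olive documentation:

```python
# Sketch: engine config with artifact packaging and logging verbosity.
# Field layout is an assumption based on the Olive documentation.
engine = {
    "packaging_config": {"type": "Zipfile", "name": "OutputModels"},
    "output_dir": "outputs",      # CandidateModels, SampleCode, ONNXRuntimePackages land here
    "log_severity_level": 0,      # assumed to follow ORT severity levels: 0 = most verbose
}
```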