It is important that you make it easy for users with different levels of experience to use your algorithm in AWS Marketplace. To do this, we recommend that you include the following elements as part of your AWS Marketplace algorithm listing.
# | Section | Description | Mandatory/Highly Recommended | Sample Example |
---|---|---|---|---|
SD1 | Short product description | List most important product use case and supported input content type (short description). | Mandatory | An AutoML algorithm that trains a multi-layer stack ensemble model to predict on regression/classification datasets directly from CSV data. |
# | Section | Description | Mandatory/Highly Recommended | Sample Example |
---|---|---|---|---|
PO1 | Product overview | Describe algorithm category (e.g. Tree, Neural Net, Ensemble). | Mandatory | An AutoML algorithm that trains a multi-layer stack ensemble model to predict on regression/classification datasets directly from CSV data. |
PO2 | Product overview | Summarize how the algorithm works, including any feature engineering. | Mandatory | AutoGluon-Tabular can save time by automating time-consuming manual steps: handling missing data, manual feature transformations, data splitting, model selection, algorithm selection, hyperparameter selection and tuning, ensembling multiple models, and repeating the process when data changes. |
PO3 | Product overview | List the core framework that the model/algorithm was built on. | Highly Recommended | Unlike existing AutoML frameworks that primarily focus on model/hyperparameter selection, AutoGluon-Tabular succeeds by ensembling multiple models and stacking them in multiple layers. |
PO4 | Product overview | Differentiated capabilities of model/algorithm. | Mandatory | Dynamic factor models (DFMs) can be used to analyze and forecast large sets of time-series, such as measurements and indicators of national or multinational economies, prices of products or instruments constantly traded in markets, measurements and observations of natural or engineering processes, and trends in social media or sports tournaments. The evolutions of these time-series are influenced by evolutions of a number of unobserved factors commonly affecting all or many of the time-series. A long-memory DFM can estimate influences of longer histories of factor evolutions. |
PO5 | Product overview | List most important use case(s) for this product. | Mandatory | The long-memory dynamic factor model (LMDFM) algorithm is developed to analyze and forecast large sets of time-series when the time-series are influenced by evolution histories of a number of unobserved factors commonly affecting all or many of the time-series. By applying objective data-driven constraints, the LMDFM algorithm can estimate the influences of longer histories of common factors. The algorithm accommodates wider ranges of values of model parameters, especially model learning parameters. The wider ranges can further enhance the power of machine learning. The current version of the LMDFM algorithm estimates: (a) dynamic factor loadings matrixes, (b) vector autoregressive (VAR) coefficients of the factors, (c) time-series of factor scores, and (d) forecasts of the set of time-series. |
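For context on the dynamic factor model samples above, the standard DFM formulation is usually written as follows. This is a textbook sketch, not taken from the listing itself: observed series load on a small set of unobserved factors, and the factors follow a VAR; a long-memory DFM corresponds to letting a longer factor history (larger lag order) enter the factor equation.

```latex
% Standard dynamic factor model (illustrative; symbols are not from the listing)
x_t = \Lambda f_t + \varepsilon_t
\qquad \text{(observation equation: $N$ series, $K \ll N$ factors)}

f_t = \sum_{i=1}^{p} A_i f_{t-i} + \eta_t
\qquad \text{(factor VAR; long-memory variants use a large lag order $p$)}
```

Here \( \Lambda \) is the factor loadings matrix and the \( A_i \) are the VAR coefficients, matching items (a) and (b) that the LMDFM sample says the algorithm estimates.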
# | Section | Description | Mandatory/Highly Recommended | Sample Example |
---|---|---|---|---|
H1 | Highlights | Summarize algorithm performance metric. | Mandatory | In benchmarks from the AutoGluon-Tabular paper, AutoGluon outperformed many popular open-source/commercial AutoML platforms on 50 classification/regression datasets from Kaggle/OpenML. AutoGluon is faster, more robust, and much more accurate than other tools, even outperforming the best of five other AutoML platforms on most datasets. In two popular Kaggle competitions, AutoGluon beat 99% of the participating data scientists after just four hours of training. |
H2 | Highlights | Summarize model performance metric. | Must-have for an algorithm with a pre-trained model. Highly recommended for others. | The model applies transfer learning on top of an existing model trained on the Pascal VOC 2007-2012 dataset, extended with 200 annotated images from the XXXX dataset and 4,000 augmented images representing blur and foggy conditions. The algorithm accepts customer data and fine-tunes the base ML model further to achieve a higher mean average precision (mAP) in a short time. |
H3 | Highlights | Specify inference latency metric and/or transactions per second on the recommended Amazon SageMaker compute instance. | Mandatory | The average response time for a single-image, single-vehicle inference on the compute-optimized ml.c5.2xlarge instance with 8 vCPUs and 16 GB memory is approximately 3.25 seconds. |
H4 | Highlights | Specify whether the algorithm is compatible with, for example, model auto tuning functionality, distributed training, or GPUs on Amazon SageMaker. | Mandatory | The model can be trained using the automatic model tuning capability, and you can specify multiple instances while running a model training job. |
H5 | Highlights | Applicable research paper/repo related to the model/algorithm. | Highly Recommended | arXiv publication: AutoGluon-Tabular |
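The latency figure that H3 asks for can be produced with a simple timing harness. The sketch below is illustrative: the `invoke` function is a stand-in for a real endpoint call (swap in your own inference client), and the simulated latency is not a real measurement.

```python
import time

def measure_avg_latency(invoke, payload, warmup=2, runs=10):
    """Average wall-clock latency of `invoke(payload)` over `runs` calls."""
    for _ in range(warmup):            # discard cold-start calls
        invoke(payload)
    start = time.perf_counter()
    for _ in range(runs):
        invoke(payload)
    return (time.perf_counter() - start) / runs

# Stand-in for a real-time endpoint call; replace with your inference client.
def invoke(payload):
    time.sleep(0.01)                   # simulate model latency
    return b"KL40L5577"

avg = measure_avg_latency(invoke, b"<image bytes>")
print(f"average response time: {avg:.3f} s")
```

Running a few warmup calls before timing avoids counting cold-start overhead, which would otherwise inflate the average you publish in the listing.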
# | Section | Description | Mandatory/Highly Recommended | Sample Example |
---|---|---|---|---|
US1 | Usage Instructions | Describe how to use the algorithm: data pre-processing guidelines, training data format (e.g., mandatory fields), minimum data row requirements, and guidelines to create a good model (e.g., identify important hyperparameters and values). Also add details on the approximate duration of training based on data size. Clarify any feature engineering (e.g., scaling, imputing of values) performed by the algorithm. | Mandatory | The algorithm is an X-pass algorithm, and a training run on a XXX.XXXX instance for a dataset of size XX MB takes XX minutes of compute. We recommend providing at least 50 rows in the training data. For large datasets with > 100,000 rows or > 1,000 columns, use a larger instance type. For best results, provide all your data to AutoGluon as train_data rather than splitting off a validation set yourself, and specify which eval_metric will be used to evaluate predictions. The algorithm handles missing data, manual feature transformations, data splitting, model selection, algorithm selection, hyperparameter selection and tuning, ensembling multiple models, and repeating the process when data changes. |
US2 | Usage Instructions | Mime-type for input data. | Mandatory | Supported MIME Content Types: text/csv. |
US3 | Usage Instructions | Input data limitations (text) - for supervised learning algorithms, describe how labeled data are provided to the algorithm. | Mandatory | The first line of your CSV file should contain names for each column. Columns in your CSV file can be strings/text fields/numeric. |
US4 | Usage Instructions | Format and description for inference input for trained model. | Mandatory | AutoGluon-Tabular requires no manual data preprocessing as long as your data is a valid CSV table. Your data must contain the column that you identify as 'label' in your hyperparameter configuration. |
US5 | Usage Instructions | Mime-type for inference output. | Mandatory | Content type: text/plain. |
US6 | Usage Instructions | Format and description for inference output (text). | Mandatory | For this license plate image, the ML model returned the following output. Sample output: KL40L5577. If your output is complex, here is a sample description of output for your reference: The model returns a JSON object, detections, that includes an array with individual elements for each face detected. Each element has two attributes: 1) box_points: the bounding box pixels of the detected face, where the first value represents XX, the second value represents XX, the third value XX, and the fourth value XX. 2) classes: no_mask represents the probability score that the bounding box does not include a mask. When multiple faces are detected in the image, multiple inferences are returned as part of the array... |
US7 | Usage Instructions | Provide an example of how to pre-process data (text). | Highly Recommended | |
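The input requirements stated above (a header row on the first line, a column named 'label', and at least 50 data rows) can be checked before training with a short validation sketch. The file contents and column name below are illustrative, not from a real listing.

```python
import csv
import io

def validate_training_csv(fileobj, label_column="label", min_rows=50):
    """Check a CSV training file against the listing's stated requirements.

    Returns a list of problems; an empty list means the file passed.
    """
    reader = csv.reader(fileobj)
    problems = []
    try:
        header = next(reader)          # first line must contain column names
    except StopIteration:
        return ["file is empty"]
    if label_column not in header:
        problems.append(f"missing required column {label_column!r}")
    n_rows = sum(1 for _ in reader)    # count remaining data rows
    if n_rows < min_rows:
        problems.append(f"only {n_rows} data rows; at least {min_rows} recommended")
    return problems

# Example: a two-row file passes the label check but fails the row-count check.
sample = io.StringIO("feature_1,feature_2,label\n1.0,2.0,0\n3.0,4.0,1\n")
print(validate_training_csv(sample))
```

A seller could include a check like this in the validated sample notebook (AR1) so buyers catch format problems before submitting a training job.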
# | Section | Description | Mandatory/Highly Recommended | Sample Example |
---|---|---|---|---|
AR1 | Additional Resources | Provide a validated notebook, data, and other resources in GitHub. Note that this notebook and sample data will also be verified by MCO. Prepare a notebook using this template: https://github.com/awslabs/amazon-sagemaker-examples/tree/master/aws_marketplace/curating_aws_marketplace_listing_and_sample_notebook/Algorithm. | Mandatory | https://github.com/awslabs/amazon-sagemaker-examples/tree/master/aws_marketplace/using_algorithms/autogluon |
AR2 | Additional Resources | Link to sample training input data. | Mandatory | http://timeseriesclassification.com/Downloads/ECG200.zip |
AR3 | Additional Resources | Links to additional resources such as an architecture diagram or related listings to integrate the model with other applications and services. | Highly Recommended | A blog post or a link such as this one, which explains the architecture as well as the process for using the model in a real-world application: VITech Lab Healthcare introduces Automated PPE compliance control on Amazon Web Services. |
AR4 | Additional Resources | Sample inference input data for real-time invocation (text or link on GitHub). | Mandatory | https://gitlab.qdatalabs.com/quantiphi-sagemaker-marketplace-examples/vehicle-license-plate-recognition/tree/master/data/output/batch |
AR5 | Additional Resources | Sample inference input data for batch invocation (link on GitHub). | Mandatory | https://gitlab.qdatalabs.com/quantiphi-sagemaker-marketplace-examples/vehicle-license-plate-recognition/tree/master/data/output/batch |
AR6 | Additional Resources | Sample inference output for real-time invocation for the input sample provided (text or links on GitHub). | Mandatory | https://gitlab.qdatalabs.com/quantiphi-sagemaker-marketplace-examples/vehicle-license-plate-recognition/tree/master/data/output/batch |
AR7 | Additional Resources | Sample inference output for batch invocation corresponding to the batch input samples (text or links on GitHub). | Mandatory | https://gitlab.qdatalabs.com/quantiphi-sagemaker-marketplace-examples/vehicle-license-plate-recognition/tree/master/data/output/batch |