# MLHarness

MLHarness is a scalable benchmarking harness system for MLCommons Inference with three distinctive features:
- MLHarness codifies the standard benchmark process defined by MLCommons Inference, including the models, datasets, DL frameworks, and software and hardware systems;
- MLHarness provides an easy and declarative approach for model developers to contribute their models and datasets to MLCommons Inference; and
- MLHarness supports a wide range of models with varying input/output modalities, so it can scalably benchmark models across different datasets, frameworks, and hardware systems.
Please see the MLHarness Paper for detailed descriptions and case studies that demonstrate the unique value of MLHarness.
## Tutorial
The easiest way to use MLHarness is through the pre-built Docker images. Instructions for installing Docker can be found in Docker's official documentation.
To get started, choose a configuration from the table below that best fits your system.
| System | ONNX Runtime v1.7.1 | MXNet v1.8.0 | PyTorch v1.8.1 | TensorFlow v1.14.0 |
| --- | --- | --- | --- | --- |
| CPU Only | c3sr/mlharness:amd64-cpu-onnxruntime1.7.1-latest | c3sr/mlharness:amd64-cpu-mxnet1.8.0-latest | c3sr/mlharness:amd64-cpu-pytorch1.8.1-latest | c3sr/mlharness:amd64-cpu-tensorflow1.14.0-latest |
| GPU with CUDA 10.0 | — | c3sr/mlharness:amd64-gpu-mxnet1.8.0-cuda10.0-latest | c3sr/mlharness:amd64-gpu-pytorch1.8.1-cuda10.0-latest | c3sr/mlharness:amd64-gpu-tensorflow1.14.0-cuda10.0-latest |
| GPU with CUDA 10.1 | c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda10.1-latest | c3sr/mlharness:amd64-gpu-mxnet1.8.0-cuda10.1-latest | c3sr/mlharness:amd64-gpu-pytorch1.8.1-cuda10.1-latest | c3sr/mlharness:amd64-gpu-tensorflow1.14.0-cuda10.1-latest |
| GPU with CUDA 10.2 | c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda10.2-latest | c3sr/mlharness:amd64-gpu-mxnet1.8.0-cuda10.2-latest | c3sr/mlharness:amd64-gpu-pytorch1.8.1-cuda10.2-latest | c3sr/mlharness:amd64-gpu-tensorflow1.14.0-cuda10.2-latest |
| GPU with CUDA 11.0 | c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda11.0-latest | c3sr/mlharness:amd64-gpu-mxnet1.8.0-cuda11.0-latest | c3sr/mlharness:amd64-gpu-pytorch1.8.1-cuda11.0-latest | — |
| GPU with CUDA 11.1 | c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda11.1-latest | c3sr/mlharness:amd64-gpu-mxnet1.8.0-cuda11.1-latest | c3sr/mlharness:amd64-gpu-pytorch1.8.1-cuda11.1-latest | — |
| GPU with CUDA 11.2 | c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda11.2-latest | c3sr/mlharness:amd64-gpu-mxnet1.8.0-cuda11.2-latest | c3sr/mlharness:amd64-gpu-pytorch1.8.1-cuda11.2-latest | — |
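For example, to pull the image used in the tutorial below:

```
docker pull c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda11.2-latest
```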
After choosing a Docker image, two other components are required: models and datasets, both of which MLHarness codifies as manifests. Example model manifests can be found at dlmodel/models, and example dataset manifests at dldataset/datasets. Because not all models and datasets are public, some manifests only provide methods to manipulate models and data without a download method. To address this, set the environment variable $DATA_DIR to the directory containing models and datasets we have pre-downloaded; MLHarness uses this variable to locate them.
Here is an example run. Suppose we choose ONNX Runtime as our backend and have a GPU with CUDA 11.2, so we use c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda11.2-latest as the pre-built Docker image. We then benchmark the BERT model (manifest) on the SQuAD v1.1 dataset (manifest). We set up our directory as follows: we need dev-v1.1.json and vocab.txt from the SQuAD v1.1 dataset, and we can get the model and dataset manifests by cloning dlmodel and dldataset (see the setup sketch after the directory tree below).
```
~/data/
├── SQuAD
│   ├── dev-v1.1.json
│   └── vocab.txt
├── dlmodel
│   └── models
│       └── language
│           └── onnxruntime
│               └── BERT.yml
└── dldataset
    └── datasets
        └── squad.yml
```
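A minimal setup sketch follows. The repository URLs and the SQuAD download link are assumptions inferred from the names in this README rather than spelled out here, so adjust them to wherever you obtain these files:

```
# Sketch only: clone URLs and download links below are assumptions,
# not taken from this README.
mkdir -p ~/data/SQuAD
cd ~/data
git clone https://github.com/c3sr/dlmodel.git      # assumed location of dlmodel
git clone https://github.com/c3sr/dldataset.git    # assumed location of dldataset
# Official SQuAD v1.1 dev set:
wget -O SQuAD/dev-v1.1.json https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json
# vocab.txt is the BERT vocabulary file; copy it into SQuAD/vocab.txt from
# your BERT checkpoint or the MLCommons Inference reference assets.
```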
To get the help information of MLHarness, we can run the following command, setting $GPUID to the GPU ID you want to use:
```
docker run --rm --gpus device=$GPUID c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda11.2-latest -h
```
Following the help information, we can run the following command for a simple run:
```
docker run --rm \
    -v ~/data:/root/data \
    --env DATA_DIR=/root/data/SQuAD \
    --gpus device=$GPUID \
    --shm-size 1g --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true --network host \
    c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda11.2-latest \
    --dataset squad --dataset_path /root/data/dldataset/datasets/squad.yml \
    --backend onnxruntime --model_path /root/data/dlmodel/models/language/onnxruntime/BERT.yml \
    --use_gpu 1 --gpu_id $GPUID \
    --accuracy --count 10 \
    --scenario Offline
```
A description of each option follows:

- `docker run --rm`: Run MLHarness as a Docker container and remove the container after execution.
- `-v ~/data:/root/data`: Mount the directory we prepared.
- `--env DATA_DIR=/root/data/SQuAD`: Set the environment variable to the dataset directory we downloaded.
- `--gpus device=$GPUID`: Expose the GPU to Docker; replace $GPUID with the GPU ID you want to use.
- `--shm-size 1g --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true --network host`: Configure resources for the Docker container.
- `c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda11.2-latest`: The pre-built Docker image we chose above.
- `--dataset squad --dataset_path /root/data/dldataset/datasets/squad.yml`: The dataset and the path to the dataset manifest file in the mounted directory.
- `--backend onnxruntime --model_path /root/data/dlmodel/models/language/onnxruntime/BERT.yml`: The backend and the path to the model manifest file in the mounted directory.
- `--use_gpu 1 --gpu_id $GPUID`: Tell MLHarness to use the GPU; replace $GPUID with the GPU ID you want to use.
- `--accuracy --count 10`: Generate MLCommons Inference reports in accuracy mode, and run only 10 samples for simplicity.
- `--scenario Offline`: The scenario for MLCommons Inference.
After the execution, we should get `{"exact_match": 70.0, "f1": 70.0}` as the result for the first 10 samples.
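To measure performance rather than accuracy, the MLCommons Inference convention is to omit the `--accuracy` flag. A hedged sketch reusing the options described above (this variant is not spelled out in this README, so verify the exact behavior with `-h`):

```
# Assumed performance-mode run: same options as above, without --accuracy.
docker run --rm \
    -v ~/data:/root/data \
    --env DATA_DIR=/root/data/SQuAD \
    --gpus device=$GPUID \
    --shm-size 1g --ulimit memlock=-1 --ulimit stack=67108864 --privileged=true --network host \
    c3sr/mlharness:amd64-gpu-onnxruntime1.7.1-cuda11.2-latest \
    --dataset squad --dataset_path /root/data/dldataset/datasets/squad.yml \
    --backend onnxruntime --model_path /root/data/dlmodel/models/language/onnxruntime/BERT.yml \
    --use_gpu 1 --gpu_id $GPUID \
    --scenario Offline
```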
## Customization
Aside from using the manifests we already have above, we can also create and contribute our own manifests by replacing the corresponding fields in an existing manifest.
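A minimal sketch of that workflow, starting from the BERT manifest used above (my_model.yml is a hypothetical file name):

```
# Copy an existing manifest as a template (my_model.yml is hypothetical).
cp ~/data/dlmodel/models/language/onnxruntime/BERT.yml ~/data/my_model.yml
# Edit the fields in my_model.yml for your model, then point MLHarness at it:
#   --model_path /root/data/my_model.yml
```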