merlin

package

v0.8.0 | Published: Mar 16, 2023 | License: Apache-2.0 | Imports: 22 | Imported by: 0

Repository

github.com/goto/meteor

README ¶

merlin

Extracts machine learning (ML) models from Merlin.

The extractor uses the REST API exposed by Merlin to extract models. The REST API is documented with Swagger.

Usage

source:
  name: merlin
  scope: staging
  config:
    url: my-company.com/api/merlin/
    service_account_base64: |
      ____base64_encoded_service_account_credentials____

Inputs

Key	Value	Example	Description	Required?
`url`	`string`	`my-company.com/api/merlin/`	Merlin's API base URL	✅
`service_account_base64`	`string`	`____BASE64_ENCODED_SERVICE_ACCOUNT____`	Service Account credentials in base64 encoded string.	❌
`request_timeout`	`string`	`10s`	Timeout for HTTP requests to Merlin API	❌
`worker_count`	`int`	`5`	Number of workers to spawn for extracting projects from Merlin in parallel.	❌
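
To illustrate what `worker_count` controls, here is a minimal sketch of a bounded worker pool that processes project IDs in parallel. It is not the extractor's actual implementation: `processAll` and the `process` callback are hypothetical names, and the sketch assumes `workerCount` is at least 1, which matches the `min=1` validation on the config.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll fans project IDs out to workerCount goroutines and collects
// one result per project. The process callback is a hypothetical stand-in
// for the per-project extraction work. workerCount must be >= 1.
func processAll(projectIDs []int64, workerCount int, process func(int64) string) []string {
	jobs := make(chan int64)
	results := make(chan string)

	var wg sync.WaitGroup
	for i := 0; i < workerCount; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				results <- process(id)
			}
		}()
	}

	// Feed the jobs, then close results once every worker has finished.
	go func() {
		for _, id := range projectIDs {
			jobs <- id
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	var out []string
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	out := processAll([]int64{1, 2, 3}, 5, func(id int64) string {
		return fmt.Sprintf("project-%d", id)
	})
	// The order of results is not deterministic.
	fmt.Println(len(out), "projects processed")
}
```

A higher worker count shortens extraction when Merlin hosts many projects, at the cost of more concurrent requests against the Merlin API.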

Notes

Leaving `service_account_base64` blank falls back to Google's default authentication. This is recommended when the Meteor instance runs inside the same Google Cloud environment as Merlin.
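
The blank-versus-set behaviour can be sketched as follows. This is an assumption about how such a value might be handled, not the extractor's code: `decodeServiceAccount` is a hypothetical helper that returns `nil` for a blank value, which a caller could treat as the signal to use default authentication.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodeServiceAccount returns the decoded credentials, or nil when the
// value is blank, signalling that default authentication should be used.
// Hypothetical helper for illustration only.
func decodeServiceAccount(encoded string) ([]byte, error) {
	// A YAML block scalar ("|") keeps a trailing newline, so trim first.
	encoded = strings.TrimSpace(encoded)
	if encoded == "" {
		return nil, nil
	}
	creds, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return nil, fmt.Errorf("decode service account: %w", err)
	}
	return creds, nil
}

func main() {
	encoded := base64.StdEncoding.EncodeToString([]byte(`{"type":"service_account"}`))
	creds, err := decodeServiceAccount(encoded + "\n")
	fmt.Println(string(creds), err)
}
```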

Outputs

The models are mapped to an Asset, with model-specific metadata stored using Model. Please refer to the proto definitions for more information.

A single model asset includes all the active model versions. A model version is considered active if it has an endpoint.
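
The "active" rule above can be sketched as a simple filter. The `ModelVersion` and `Endpoint` types here are simplified stand-ins defined only for this example; the real Merlin API types have more fields and may represent endpoints differently.

```go
package main

import "fmt"

// Endpoint and ModelVersion are simplified, hypothetical stand-ins for
// the Merlin API types.
type Endpoint struct{ URL string }

type ModelVersion struct {
	ID       int64
	Endpoint *Endpoint // nil when the version has no endpoint
}

// activeVersions keeps only the versions that have an endpoint.
func activeVersions(versions []ModelVersion) []ModelVersion {
	var active []ModelVersion
	for _, v := range versions {
		if v.Endpoint != nil {
			active = append(active, v)
		}
	}
	return active
}

func main() {
	versions := []ModelVersion{
		{ID: 10},
		{ID: 11, Endpoint: &Endpoint{URL: "tensorflow-sample.integration-test.models.mycompany.com"}},
	}
	for _, v := range activeVersions(versions) {
		fmt.Println(v.ID, v.Endpoint.URL)
	}
}
```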

Field	Value	Sample Value
`resource.urn`	`urn:merlin:{scope}:model:{model.project_id}.{model.id}`	`urn:merlin:staging:model:15.1512`
`resource.name`	`{model.name}`	`tensorflow-sample`
`resource.service`	`merlin`	`merlin`
`resource.type`	`model`	`model`
`resource.url`	`{model.endpoints[0].url}`	`tensorflow-sample.integration-test.models.mycompany.com`
`namespace`	`{project.name}`	`integration-test`
`flavor`	`model.type`	`pyfunc`
`versions`	`[]ModelVersion`
`attributes.project_id`	`project.id`	`23`
`attributes.mlflow_experiment_id`	`model.mlflow_experiment_id`	`721`
`attributes.mlflow_experiment_url`	`model.mlflow_url`	`http://mlflow.mycompany.com/#/experiments/721`
`attributes.endpoint_urls[]`	`model.endpoints[].url`	`["tensorflow-sample.integration-test.models.mycompany.com"]`
`create_time`	`model.created_at`	`2021-03-01T18:42:50.564685Z`
`update_time`	`model.updated_at`	`2022-01-27T10:21:26.121941Z`
`resource.owners[].urn`	`{project.administrators[]}`	`giga.chad@knowyourmeme.com`
`resource.owners[].email`	`{project.administrators[]}`	`giga.chad@knowyourmeme.com`
`lineage.upstreams`	`[]Resource` upstreams
`resource.labels`	`{"team": {project.team}, "stream": {project.stream}} + project.labels`	`{"stream": "relevance","team": "search"}`
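
As an illustration of the `resource.urn` and `resource.labels` rows above, the following sketch builds both values. The function names are hypothetical, and letting project labels overwrite `team` and `stream` on a key collision is an assumption; the table does not say which side wins.

```go
package main

import "fmt"

// modelURN follows the pattern urn:merlin:{scope}:model:{project_id}.{model_id}.
func modelURN(scope string, projectID, modelID int64) string {
	return fmt.Sprintf("urn:merlin:%s:model:%d.%d", scope, projectID, modelID)
}

// buildLabels merges the project's team and stream with its own labels.
// Assumption: project labels win if they reuse the "team" or "stream" key.
func buildLabels(team, stream string, projectLabels map[string]string) map[string]string {
	labels := map[string]string{"team": team, "stream": stream}
	for k, v := range projectLabels {
		labels[k] = v
	}
	return labels
}

func main() {
	fmt.Println(modelURN("staging", 15, 1512)) // urn:merlin:staging:model:15.1512
	fmt.Println(buildLabels("search", "relevance", nil))
}
```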

`ModelVersion`

A ModelVersion represents each combination of a Merlin model version and its endpoint destination. A single model version has an endpoint for each environment it is deployed in. Please refer to the proto definitions for more information.

Field	Value	Sample Value
`status`	`model_version.status`	`running`
`version`	`model_version.id`	`11`
`attributes.endpoint_id`	`endpoint.id`	`187`
`attributes.mlflow_run_id`	`model_version.mlflow_run_id`	`3c7067f3770441ebbd66a0dce91b8724`
`attributes.mlflow_run_url`	`model_version.mlflow_url`	`http://mlflow.mycompany.com/#/experiments/721/runs/3c7067f3770441ebbd66a0dce91b8724`
`attributes.endpoint_url`	`endpoint.url`	`tensorflow-sample.integration-test.models.mycompany.com`
`attributes.version_endpoint_url`	`version_endpoint.url`	`http://tensorflow-sample-11.integration-test.models.mycompany.com/v1/models/tensorflow-sample-11`
`attributes.monitoring_url`	`version_endpoint.monitoring_url`	`https://grafana.mycompany.com/graph/d/z9MBKR1Az/model-version-dashboard?params`
`attributes.message`	`version_endpoint.message`	`timeout creating inference service`
`attributes.environment_name`	`endpoint.environment_name`	`aws-staging`
`attributes.deployment_mode`	`version_endpoint.deployment_mode`	`serverless`
`attributes.service_name`	`version_endpoint.service_name`	`tensorflow-sample-11-predictor-default.integration-test.models.mycompany.com`
`attributes.env_vars`	`version_endpoint.env_vars`	`{"INIT_HEAP_SIZE_IN_MB": "2250","WORKERS": "1"}`
`attributes.transformer`	`version_endpoint.transformer`	Attributes including `transformer.{enabled, type, image, command, args, env_vars}`
`attributes.weight`	`endpoint.rule.destinations[].weight`	`100`
`labels`	`model_version.labels`
`create_time`	`model_version.created_at`	`2022-11-13T07:21:07.888150Z`
`update_time`	`model_version.updated_at`	`2022-11-13T07:21:07.888150Z`

`Resource` upstreams

The extractor currently has limited support for constructing upstreams: it handles only models that configure the standard transformer through env vars. It parses the feature table specs in those env vars, which specify the project name and feature table name of each CaraML Store feature table, and uses this information to construct the upstreams for the model.

Field	Value	Sample Value
`urn`	`urn:caramlstore:{scope}:feature_table:{ft.project}.{ft.name}`	`urn:kafka:int-kafka.yonkou.io:topic:staging_30min_demand`
`type`	`feature_table`	`topic`
`service`	`caramlstore`	`kafka`
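
A rough sketch of how feature table specs could be turned into upstream resources, following the `urn`, `type`, and `service` values in the Value column above. The JSON shape of the specs (an array of objects with `project` and `name` keys) and all names in the code are assumptions made for this example; the actual env var name and format used by the standard transformer are not stated here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// featureTableSpec is a hypothetical shape for one feature table spec.
type featureTableSpec struct {
	Project string `json:"project"`
	Name    string `json:"name"`
}

// Resource is a simplified stand-in for an upstream lineage resource.
type Resource struct {
	URN     string
	Type    string
	Service string
}

// upstreams parses the (assumed) JSON specs and builds one resource per
// feature table using urn:caramlstore:{scope}:feature_table:{project}.{name}.
func upstreams(scope, specsJSON string) ([]Resource, error) {
	var specs []featureTableSpec
	if err := json.Unmarshal([]byte(specsJSON), &specs); err != nil {
		return nil, fmt.Errorf("parse feature table specs: %w", err)
	}
	out := make([]Resource, 0, len(specs))
	for _, s := range specs {
		out = append(out, Resource{
			URN:     fmt.Sprintf("urn:caramlstore:%s:feature_table:%s.%s", scope, s.Project, s.Name),
			Type:    "feature_table",
			Service: "caramlstore",
		})
	}
	return out, nil
}

func main() {
	res, err := upstreams("staging", `[{"project":"demo","name":"driver_stats"}]`)
	fmt.Println(res, err)
}
```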

Contributing

Refer to the contribution guidelines for information on contributing to this module.

Documentation ¶

Index ¶

type Client
type Config
type Extractor
- func New(logger log.Logger, newClient NewClientFunc) *Extractor
- func (e *Extractor) Extract(ctx context.Context, emit plugins.Emit) error
- func (e *Extractor) Init(ctx context.Context, config plugins.Config) error
type NewClientFunc

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Client ¶

type Client interface {
	Projects(ctx context.Context) ([]merlin.Project, error)
	Models(ctx context.Context, projectID int64) ([]merlin.Model, error)
	ModelVersion(ctx context.Context, modelID, versionID int64) (merlin.ModelVersion, error)
}

type Config ¶

type Config struct {
	URL                  string        `mapstructure:"url" validate:"required"`
	ServiceAccountBase64 string        `mapstructure:"service_account_base64"`
	RequestTimeout       time.Duration `mapstructure:"request_timeout" validate:"min=1ms" default:"10s"`
	WorkerCount          int           `mapstructure:"worker_count" validate:"min=1" default:"5"`
}

Config holds the set of configuration for the Merlin extractor.
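
The struct tags above encode defaults and validation rules. The sketch below spells out the same rules in plain Go for illustration; it redefines `Config` locally without tags, and `withDefaults` and `validate` are hypothetical helpers. The package itself relies on the tags, and treating a zero value as "unset" when applying defaults is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Config mirrors the fields of the documented struct, without tags.
type Config struct {
	URL                  string
	ServiceAccountBase64 string
	RequestTimeout       time.Duration
	WorkerCount          int
}

// withDefaults fills unset (zero) fields with the documented defaults.
func (c Config) withDefaults() Config {
	if c.RequestTimeout == 0 {
		c.RequestTimeout = 10 * time.Second
	}
	if c.WorkerCount == 0 {
		c.WorkerCount = 5
	}
	return c
}

// validate applies the rules from the validate tags:
// url is required, request_timeout >= 1ms, worker_count >= 1.
func (c Config) validate() error {
	switch {
	case c.URL == "":
		return errors.New("url is required")
	case c.RequestTimeout < time.Millisecond:
		return errors.New("request_timeout must be at least 1ms")
	case c.WorkerCount < 1:
		return errors.New("worker_count must be at least 1")
	}
	return nil
}

func main() {
	cfg := Config{URL: "my-company.com/api/merlin/"}.withDefaults()
	fmt.Println(cfg.RequestTimeout, cfg.WorkerCount, cfg.validate())
}
```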

type Extractor ¶

type Extractor struct {
	plugins.BaseExtractor
	// contains filtered or unexported fields
}

Extractor manages the communication with the Merlin service.

func New ¶

func New(logger log.Logger, newClient NewClientFunc) *Extractor

New returns a pointer to an initialized Extractor.

func (*Extractor) Extract ¶

func (e *Extractor) Extract(ctx context.Context, emit plugins.Emit) error

func (*Extractor) Init ¶

func (e *Extractor) Init(ctx context.Context, config plugins.Config) error

Init initializes the extractor.

type NewClientFunc ¶

type NewClientFunc func(ctx context.Context, cfg Config) (Client, error)
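
Accepting a `NewClientFunc` in `New` lets callers decide how the client is built, which is convenient for substituting a stub in tests. The sketch below shows the pattern with simplified stand-in types declared locally: `Project`, the one-method `Client`, `stubClient`, `newStub`, and `countProjects` are all hypothetical and exist only for this example.

```go
package main

import (
	"context"
	"fmt"
)

// Simplified, hypothetical stand-ins for the package's types.
type Project struct {
	ID   int64
	Name string
}

type Client interface {
	Projects(ctx context.Context) ([]Project, error)
}

type Config struct{ URL string }

type NewClientFunc func(ctx context.Context, cfg Config) (Client, error)

// stubClient returns a fixed list of projects without any network calls.
type stubClient struct{ projects []Project }

func (s stubClient) Projects(context.Context) ([]Project, error) {
	return s.projects, nil
}

// newStub builds a NewClientFunc that always yields the stub client.
func newStub(projects ...Project) NewClientFunc {
	return func(context.Context, Config) (Client, error) {
		return stubClient{projects: projects}, nil
	}
}

// countProjects plays the role of code that receives the factory,
// constructs the client, and uses it.
func countProjects(newClient NewClientFunc) (int, error) {
	ctx := context.Background()
	client, err := newClient(ctx, Config{URL: "my-company.com/api/merlin/"})
	if err != nil {
		return 0, fmt.Errorf("create client: %w", err)
	}
	projects, err := client.Projects(ctx)
	if err != nil {
		return 0, fmt.Errorf("list projects: %w", err)
	}
	return len(projects), nil
}

func main() {
	n, err := countProjects(newStub(Project{ID: 23, Name: "integration-test"}))
	fmt.Println(n, err)
}
```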

Directories ¶

Path	Synopsis
internal
merlin
mocks
