http

package

v0.10.6 (latest) | Published: Aug 27, 2024 | License: Apache-2.0 | Imports: 25 | Imported by: 0

Repository

github.com/goto/meteor

README ¶

http

Generic extractor that uses the HTTP response from an external API to construct assets of any type supported by new_asset (see Script Globals).

If the API call succeeds, the user-specified script gets access to the response and can use it to construct and emit assets. Currently, Tengo is the only supported script engine.

Refer to the Tengo documentation for the script language syntax and supported functionality: https://github.com/d5/tengo/tree/v2.13.0#references. Tengo standard library modules can also be imported and used if required (except the os module).

Usage

source:
  scope: gotocompany
  type: http
  config:
    request:
      route_pattern: "/api/v1/endpoint"
      url: "https://example.com/api/v1/endpoint"
      query_params:
        - key: param_key
          value: param_value
      method: "POST"
      headers:
        "User-Id": "1a4336bc-bc6a-4972-83c1-d6426b4d79c3"
      content_type: application/json
      accept: application/json
      body:
        key: value
      timeout: 5s
    success_codes: [ 200 ]
    concurrency: 3
    script:
      engine: tengo
      source: |
        asset := new_asset("user")
        // modify the asset using 'response'...
        emit(asset)

Inputs

Key	Value	Example	Description	Required?
`request`	`Object`	see Request	The configuration for constructing and sending HTTP request.	✅
`success_codes`	`[]int`	`[200]`	The list of status codes that would be considered as a successful response. Default is `[200]`.	✘
`concurrency`	`int`	`5`	Number of concurrent child requests to execute. Default is `5`	✘
`script.engine`	`string`	`tengo`	Script engine. Only `"tengo"` is supported currently	✅
`script.source`	`string`	see Worked Example.	Tengo script used to map the response into 0 or more assets.	✅
`script.max_allocs`	`int`	10000	The max number of object allocations allowed during the script run time. Default is `5000`.	✘
`script.max_const_objects`	`int`	1000	The maximum number of constant objects in the compiled script. Default is `500`.	✘

Request

Key	Value	Example	Description	Required?
`route_pattern`	`string`	`/api/v1/endpoint`	A route pattern to use in metrics as `http.route` tag. Default is an empty string.	✘
`url`	`string`	`http://example.com/api/v1/endpoint`	The HTTP endpoint to send request to	✅
`query_params`	`[]{key, value}`	`[{"key":"s","value":"One Piece"}]`	The query parameters to be added to the request URL.	✘
`method`	`string`	`GET`/`POST`	The HTTP verb/method to use with request. Default is `GET`.	✘
`headers`	`map[string]string`	`{"Api-Token": "..."}`	Headers to send in the HTTP request.	✘
`content_type`	`string`	`application/json`	Content type for encoding request body. Also sent as a header.	✅
`accept`	`string`	`application/json`	Sent as the `Accept` header. Also indicates the format to use for decoding.	✅
`body`	`Object`	`{"key": "value"}`	The request body to be sent.	✘
`timeout`	`string`	`1s`	Timeout for the HTTP request. Default is 5s.	✘

Notes

In case of conflicts between query parameters present in request.url and request.query_params, request.query_params takes precedence.
Currently, only application/json is supported for encoding the request body and for decoding the response body. If Content-Type and Accept headers are added under request.headers, they will be ignored and overridden.
Script is only executed if the response status code matches the success_codes provided.
Tengo is the only supported script engine.
Tengo's os stdlib module cannot be imported and used in the script.
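
The precedence rule for query parameters can be illustrated with a minimal Go sketch. This is not necessarily how the extractor implements it; the mergeQuery helper and the simplified QueryParam type are hypothetical names introduced only for this example, which relies on the standard net/url package.

```go
package main

import (
	"fmt"
	"net/url"
)

// QueryParam is a simplified stand-in for one entry of request.query_params.
type QueryParam struct {
	Key   string
	Value string
}

// mergeQuery (hypothetical helper) returns rawURL with params applied on top
// of any query string already present. On a key conflict, params wins.
func mergeQuery(rawURL string, params []QueryParam) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	for _, p := range params {
		q.Set(p.Key, p.Value) // Set replaces every existing value for the key
	}
	u.RawQuery = q.Encode() // Encode writes the keys in sorted order
	return u.String(), nil
}

func main() {
	merged, err := mergeQuery(
		"https://example.com/api/v1/endpoint?page=1&limit=10",
		[]QueryParam{{Key: "page", Value: "2"}},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(merged) // https://example.com/api/v1/endpoint?limit=10&page=2
}
```

Here page=1 from the URL is replaced by page=2 from the parameter list, while limit=10 is left untouched.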

Script Globals

recipe_scope
response
new_asset(string): Asset
emit(Asset)
execute_request(...requests): []Response
exit

`recipe_scope`

The value of the scope specified in the recipe (string).

With the following example recipe:

source:
  scope: integration
  type: http
  config:
  #...

The value of recipe_scope will be integration.

`response`

The HTTP response received, with the status_code, header and body fields. Example:

{
  "status_code": "200",
  "header": {
    "link": "</products?page=5&perPage=20>;rel=self,</products?page=0&perPage=20>;rel=first,</products?page=4&perPage=20>;rel=previous,</products?page=6&perPage=20>;rel=next,</products?page=26&perPage=20>;rel=last"
  },
  "body": [
    {
      "id": 1,
      "name": "Widget #1"
    },
    {
      "id": 2,
      "name": "Widget #2"
    },
    {
      "id": 3,
      "name": "Widget #3"
    }
  ]
}

The header names are always in lower case. See Worked Example for detailed usage.

`new_asset(string): Asset`

Takes a single string parameter and returns an asset instance. The type parameter can be one of the following:

"bucket"
"dashboard"
"experiment"
"feature_table"
"group"
"job"
"metric"
"model"
"application"
"table"
"topic"
"user"

The asset can then be modified in the script to set properties that are available for the given asset type.

WARNING: Do not overwrite the data property, set fields on it instead. Translating script object into proto fails otherwise.

// Bad
asset.data = {full_name: "Daiyamondo Jozu"}

// Good
asset.data.full_name = "Daiyamondo Jozu"

`emit(Asset)`

Takes an asset and emits it so that it can be consumed by the processor/sink.

`execute_request(...requests): []Response`

Takes 1 or more requests and executes the requests with the concurrency defined in the recipe. The results are returned as an array. Each item in the array can be an error or the HTTP response. The request object supports the properties defined in the Request input section.

When a request is executed, it can fail due to temporary errors such as network errors. These instances need to be handled in the script.

if !response.body.success {
	exit()
}

reqs := []
for j in response.body.jobs {
	reqs = append(reqs, {
		url: format("http://my.server.com/jobs/%s/config", j.id),
		method: "GET",
		content_type: "application/json", 
		accept: "application/json",
		timeout: "5s" 
	})
}

responses := execute_request(reqs...)
for r in responses {
	if is_error(r) {
		// TODO: Handle it appropriately. The error value has the request and 
		//  error string:
		//  r.value.{request, error}
		continue 
	}
	
	asset := new_asset("job")
	asset.name = r.body.name
	exec_cfg := r.body["execution-config"]
	asset.data.attributes = {
		"job_id": r.body.jid,
		"job_parallelism": exec_cfg["job-parallelism"],
		"config": exec_cfg["user-config"]
	}
	emit(asset)
}

If the request passed to the function fails validation, a runtime error is thrown.

`exit()`

Terminates the script execution.

Output

The output of the extractor depends on the user specified script. It can emit 0 or more assets.

Worked Example

Let's consider a service that returns a list of users in the following format in response to a GET call on the endpoint http://my_user_service.company.com/api/v1/users:

{
  "success": "<bool>",
  "message": "<string>",
  "data": [
    {
      "manager_name": "<string>",
      "terminated": "<string: true/false>",
      "fullname": "<string>",
      "location_name": "<string>",
      "work_email": "<string: email>",
      "supervisory_org_id": "<string>",
      "supervisory_org_name": "<string>",
      "preferred_last_name": "<string>",
      "business_title": "<string>",
      "company_name": "<string>",
      "cost_center_id": "<string>",
      "preferred_first_name": "<string>",
      "product_name": "<string>",
      "cost_center_name": "<string>",
      "employee_id": "<string>",
      "manager_id": "<string>",
      "location_id": "<string: ID/IN>",
      "manager_id_2": "<string>",
      "termination_date": "<string: YYYY-MM-DD>",
      "company_hierarchy": "<string>",
      "company_id": "<string>",
      "preferred_middle_name": "<string>",
      "preferred_social_suffix": "<string>",
      "legal_middle_name": "<string>",
      "manager_email_2": "<string: email>",
      "legal_first_name": "<string>",
      "manager_name_2": "<string>",
      "manager_email": "<string: email>",
      "legal_last_name": "<string>"
    }
  ]
}

Assuming the authentication can be done using an Api-Token header, we can use the following recipe:

source:
  scope: production
  type: http
  config:
    request:
      url: "http://my_user_service.company.com/api/v1/users"
      method: "GET"
      headers:
        "Api-Token": "1a4336bc-bc6a-4972-83c1-d6426b4d79c3"
      content_type: application/json
      accept: application/json
      timeout: 5s
    success_codes: [ 200 ]
    script:
      engine: tengo
      source: |
        if !response.body.success {
          exit()
        }

        users := response.body.data
        for u in users {
          if u.work_email == "" {
            continue
          }

          asset := new_asset("user")
          // URN format: "urn:{service}:{scope}:{type}:{id}"
          asset.urn = format("urn:%s:%s:user:%s", "my_usr_svc", recipe_scope, u.employee_id)
          asset.name = u.fullname
          asset.service = "my_usr_svc"
          // asset.type = "user" // not required, new_asset("user") sets the field.
          asset.data.email = u.work_email
          asset.data.username = u.employee_id
          asset.data.first_name = u.legal_first_name
          asset.data.last_name = u.legal_last_name
          asset.data.full_name = u.fullname
          asset.data.display_name = u.fullname
          asset.data.title = u.business_title
          asset.data.status = u.terminated == "true" ? "suspended" : "active"
          asset.data.manager_email = u.manager_email
          asset.data.attributes = {
            manager_id:           u.manager_id,
            cost_center_id:       u.cost_center_id, 
            supervisory_org_name: u.supervisory_org_name,
            location_id:          u.location_id,
            service_job_id:       response.header["x-job-id"]
          }
          emit(asset)
        }

This would emit a 'User' asset for each user object in response.body.data. Note that the response headers can be accessed under response.header and can be used as needed.

Caveats

The following features are currently not supported:

Explicit authentication support, ex: Basic auth/OAuth/OAuth2/JWT etc.
Retries with configurable backoff.
Content type for request/response body other than application/json.

Contributing

Refer to the contribution guidelines for information on contributing to this module.

Documentation ¶

Index ¶

type Config
type Extractor
- func New(logger log.Logger) *Extractor
- func (e *Extractor) Extract(ctx context.Context, emit plugins.Emit) error
- func (e *Extractor) Init(ctx context.Context, config plugins.Config) error
type QueryParam
type RequestConfig
type Script

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Config ¶

type Config struct {
	Request      RequestConfig `mapstructure:"request"`
	SuccessCodes []int         `mapstructure:"success_codes" validate:"dive,gte=200,lt=300" default:"[200]"`
	Concurrency  int           `mapstructure:"concurrency" validate:"gte=1,lte=100" default:"5"`
	Script       Script        `mapstructure:"script"`
	BeforeScript *Script       `mapstructure:"before_script"`
}

Config holds the set of configuration for the HTTP extractor.
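
The validate and default tags on Config imply concrete limits: every success code must be in the range 200 to 299, and concurrency must be between 1 and 100. The following standard-library sketch spells those rules out. The limits type and the withDefaults and check helpers are hypothetical, and treating a zero value as "unset" when applying defaults is an assumption of this example.

```go
package main

import (
	"errors"
	"fmt"
)

// limits is a trimmed-down, hypothetical stand-in for Config.
type limits struct {
	SuccessCodes []int
	Concurrency  int
}

// withDefaults fills unset fields with the defaults from the struct tags.
func withDefaults(l limits) limits {
	if len(l.SuccessCodes) == 0 {
		l.SuccessCodes = []int{200}
	}
	if l.Concurrency == 0 {
		l.Concurrency = 5
	}
	return l
}

// check mirrors the validate tags: 200 <= code < 300 for every success code
// and 1 <= concurrency <= 100.
func check(l limits) error {
	for _, c := range l.SuccessCodes {
		if c < 200 || c >= 300 {
			return fmt.Errorf("success code %d is outside [200, 300)", c)
		}
	}
	if l.Concurrency < 1 || l.Concurrency > 100 {
		return errors.New("concurrency must be between 1 and 100")
	}
	return nil
}

func main() {
	fmt.Println(check(withDefaults(limits{})))                         // <nil>
	fmt.Println(check(withDefaults(limits{SuccessCodes: []int{404}}))) // success code 404 is outside [200, 300)
}
```

One consequence of the tag on SuccessCodes is that non-2xx codes such as 404 cannot be declared as successful responses.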

type Extractor ¶

type Extractor struct {
	plugins.BaseExtractor
	// contains filtered or unexported fields
}

Extractor is responsible for executing an HTTP request as per configuration and executing the script with the response to 'extract' assets from within the script.

func New ¶

func New(logger log.Logger) *Extractor

New returns a pointer to an initialized Extractor object.

func (*Extractor) Extract ¶

func (e *Extractor) Extract(ctx context.Context, emit plugins.Emit) error

Extract executes an HTTP request as per the configuration and if successful, executes the script. The script has access to the response and can use the same to 'emit' assets from within the script.

func (*Extractor) Init ¶

func (e *Extractor) Init(ctx context.Context, config plugins.Config) error

Init initializes the extractor.

type QueryParam ¶

type QueryParam struct {
	Key   string `mapstructure:"key" validate:"required"`
	Value string `mapstructure:"value" validate:"required"`
}

type RequestConfig ¶

type RequestConfig struct {
	RoutePattern string            `mapstructure:"route_pattern" default:""`
	URL          string            `mapstructure:"url" validate:"required,url"`
	QueryParams  []QueryParam      `mapstructure:"query_params" validate:"dive"`
	Method       string            `mapstructure:"method" validate:"oneof=GET POST" default:"GET"`
	Headers      map[string]string `mapstructure:"headers"`
	ContentType  string            `mapstructure:"content_type" validate:"required,oneof=application/json"`
	Accept       string            `mapstructure:"accept" validate:"required,oneof=application/json"`
	Body         interface{}       `mapstructure:"body"`
	Timeout      time.Duration     `mapstructure:"timeout" validate:"min=1ms" default:"5s"`
}

type Script ¶ added in v0.10.3

type Script struct {
	Engine          string `mapstructure:"engine" validate:"required,oneof=tengo"`
	Source          string `mapstructure:"source" validate:"required"`
	MaxAllocs       int64  `mapstructure:"max_allocs" validate:"gt=100" default:"5000"`
	MaxConstObjects int    `mapstructure:"max_const_objects" validate:"gt=10" default:"500"`
}
