http

package
v0.10.6
Warning

This package is not in the latest version of its module.

Published: Aug 27, 2024 License: Apache-2.0 Imports: 25 Imported by: 0

README

http

Generic extractor that uses the HTTP response from an external API to construct assets of the supported types.

If the API call succeeds, the user-specified script has access to the response and can use it to construct and emit assets. Currently, Tengo is the only supported script engine.

Refer to the Tengo documentation for the script language syntax and supported functionality: https://github.com/d5/tengo/tree/v2.13.0#references. Tengo standard library modules can also be imported and used if required (except the os module).

Usage

source:
  scope: gotocompany
  type: http
  config:
    request:
      route_pattern: "/api/v1/endpoint"
      url: "https://example.com/api/v1/endpoint"
      query_params:
        - key: param_key
          value: param_value
      method: "POST"
      headers:
        "User-Id": "1a4336bc-bc6a-4972-83c1-d6426b4d79c3"
      content_type: application/json
      accept: application/json
      body:
        key: value
      timeout: 5s
    success_codes: [ 200 ]
    concurrency: 3
    script:
      engine: tengo
      source: |
        asset := new_asset("user")
        // modify the asset using 'response'...
        emit(asset)

Inputs

| Key | Value | Example | Description | Required? |
|---|---|---|---|---|
| request | Object | see Request | The configuration for constructing and sending the HTTP request. | Yes |
| success_codes | []int | [200] | The list of status codes that are considered a successful response. Default is [200]. | No |
| concurrency | int | 5 | Number of concurrent child requests to execute. Default is 5. | No |
| script.engine | string | tengo | Script engine. Only "tengo" is supported currently. | Yes |
| script.source | string | see Worked Example | Tengo script used to map the response into 0 or more assets. | Yes |
| script.max_allocs | int | 10000 | The maximum number of object allocations allowed while the script runs. Default is 5000. | No |
| script.max_const_objects | int | 1000 | The maximum number of constant objects in the compiled script. Default is 500. | No |

Request

| Key | Value | Example | Description | Required? |
|---|---|---|---|---|
| route_pattern | string | /api/v1/endpoint | Route pattern reported in metrics as the http.route tag. | No |
| url | string | http://example.com/api/v1/endpoint | The HTTP endpoint to send the request to. | Yes |
| query_params | []{key, value} | [{"key":"s","value":"One Piece"}] | The query parameters to be added to the request URL. | No |
| method | string | GET/POST | The HTTP verb/method to use with the request. Default is GET. | No |
| headers | map[string]string | {"Api-Token": "..."} | Headers to send in the HTTP request. | No |
| content_type | string | application/json | Content type for encoding the request body. Also sent as a header. | Yes |
| accept | string | application/json | Sent as the Accept header. Also indicates the format to use for decoding the response. | Yes |
| body | Object | {"key": "value"} | The request body to be sent. | No |
| timeout | string | 1s | Timeout for the HTTP request. Default is 5s. | No |

Notes
  • In case of conflicts between query parameters present in request.url and request.query_params, request.query_params takes precedence.
  • Currently, only application/json is supported for encoding the request body and for decoding the response body. If Content-Type and Accept headers are added under request.headers, they will be ignored and overridden.
  • Script is only executed if the response status code matches the success_codes provided.
  • Tengo is the only supported script engine.
  • Tengo's os stdlib module cannot be imported and used in the script.
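
The first two notes can be illustrated with a minimal Go sketch. It is not the extractor's actual code: buildRequest and its parameters are hypothetical names, and the sketch assumes that "takes precedence" means a key in request.query_params replaces the same key in the URL.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// QueryParam mirrors one entry of request.query_params.
type QueryParam struct {
	Key, Value string
}

// buildRequest is a hypothetical helper showing one way the notes above
// could be implemented. It sends no body, to keep the sketch short.
func buildRequest(method, rawURL string, params []QueryParam, headers map[string]string) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, fmt.Errorf("parse url: %w", err)
	}

	// Start from the query string already present in the URL, then let
	// query_params overwrite any key that appears in both places.
	q := u.Query()
	for _, p := range params {
		q.Set(p.Key, p.Value)
	}
	u.RawQuery = q.Encode()

	req, err := http.NewRequest(method, u.String(), nil)
	if err != nil {
		return nil, fmt.Errorf("new request: %w", err)
	}

	// Apply the user-supplied headers first.
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	// Then force the only supported media type, overriding any
	// Content-Type or Accept given under request.headers.
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "application/json")
	return req, nil
}

func main() {
	req, err := buildRequest(
		"GET",
		"https://example.com/api/v1/endpoint?page=1&size=10",
		[]QueryParam{{Key: "page", Value: "2"}},
		map[string]string{"Accept": "text/xml", "Api-Token": "..."},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.URL.String())         // page comes from query_params
	fmt.Println(req.Header.Get("Accept")) // the user-supplied Accept is overridden
}
```

In this sketch the order of operations does the work: values set later win, so query_params are applied after the URL is parsed and the fixed headers are applied after the user headers.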
Script Globals
recipe_scope

The value of the scope specified in the recipe (string).

With the following example recipe:

source:
  scope: integration
  type: http
  config:
  #...

The value of recipe_scope will be integration.

response

The HTTP response received, with its status_code, header and body. Example:

{
  "status_code": "200",
  "header": {
    "link": "</products?page=5&perPage=20>;rel=self,</products?page=0&perPage=20>;rel=first,</products?page=4&perPage=20>;rel=previous,</products?page=6&perPage=20>;rel=next,</products?page=26&perPage=20>;rel=last"
  },
  "body": [
    {
      "id": 1,
      "name": "Widget #1"
    },
    {
      "id": 2,
      "name": "Widget #2"
    },
    {
      "id": 3,
      "name": "Widget #3"
    }
  ]
}

The header names are always in lower case. See Worked Example for detailed usage.
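
As a rough illustration of why scripts must look up headers by lower-case name, the Go sketch below flattens a standard http.Header into a map with lower-cased keys. The function name lowerCaseHeaders is hypothetical, and joining repeated header values with "," is an assumption based on the link value shown above, not a confirmed detail of the extractor.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// lowerCaseHeaders flattens an http.Header into a map keyed by lower-cased
// header names. Joining repeated values with "," is an assumption.
func lowerCaseHeaders(h http.Header) map[string]string {
	out := make(map[string]string, len(h))
	for name, values := range h {
		out[strings.ToLower(name)] = strings.Join(values, ",")
	}
	return out
}

func main() {
	h := http.Header{}
	h.Set("X-Job-Id", "42")
	h.Add("Link", "</products?page=6&perPage=20>;rel=next")
	h.Add("Link", "</products?page=26&perPage=20>;rel=last")

	hdr := lowerCaseHeaders(h)
	fmt.Println(hdr["x-job-id"]) // looked up by lower-case name
	fmt.Println(hdr["link"])
}
```

With such a map, a script-side lookup like response.header["x-job-id"] works regardless of how the server capitalised the header.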

new_asset(string): Asset

Takes a single string parameter, the asset type, and returns an asset instance of that type. The examples in this document use the "user" and "job" types.

The asset can then be modified in the script to set properties that are available for the given asset type.

WARNING: Do not overwrite the data property; set fields on it instead. Otherwise, translating the script object into proto fails.

// Bad
asset.data = {full_name: "Daiyamondo Jozu"}

// Good
asset.data.full_name = "Daiyamondo Jozu"

emit(Asset)

Takes an asset and emits it so that it can be consumed by the processor/sink.

execute_request(...requests): []Response

Takes 1 or more requests and executes the requests with the concurrency defined in the recipe. The results are returned as an array. Each item in the array can be an error or the HTTP response. The request object supports the properties defined in the Request input section.

When a request is executed, it can fail due to temporary errors such as network errors. These instances need to be handled in the script.

if !response.body.success {
	exit()
}

reqs := []
for j in response.body.jobs {
	reqs = append(reqs, {
		url: format("http://my.server.com/jobs/%s/config", j.id),
		method: "GET",
		content_type: "application/json", 
		accept: "application/json",
		timeout: "5s" 
	})
}

responses := execute_request(reqs...)
for r in responses {
	if is_error(r) {
		// TODO: Handle it appropriately. The error value has the request and 
		//  error string:
		//  r.value.{request, error}
		continue 
	}
	
	asset := new_asset("job")
	asset.name = r.body.name
	exec_cfg := r.body["execution-config"]
	asset.data.attributes = {
		"job_id": r.body.jid,
		"job_parallelism": exec_cfg["job-parallelism"],
		"config": exec_cfg["user-config"]
	}
	emit(asset)
}

If the request passed to the function fails validation, a runtime error is thrown.
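
The behaviour described for execute_request (bounded concurrency, results returned in request order, each item either an error or a response) can be sketched in Go as follows. This is a simplified illustration, not the extractor's implementation: Request, Response, Result, executeAll and errUnreachable are hypothetical names, and the HTTP call is replaced by an injected function so the sketch runs without a network.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Request and Response are simplified stand-ins for the script-level objects.
type Request struct{ URL string }

type Response struct {
	StatusCode int
	Body       string
}

// Result holds either a response or an error for one request.
type Result struct {
	Request  Request
	Response *Response
	Err      error
}

// errUnreachable is a sample temporary failure used by the demo below.
var errUnreachable = errors.New("server unreachable")

// executeAll runs do for every request with at most `concurrency` calls in
// flight and returns the results in the same order as the requests.
func executeAll(reqs []Request, concurrency int, do func(Request) (*Response, error)) []Result {
	if concurrency < 1 {
		concurrency = 1
	}
	results := make([]Result, len(reqs))
	sem := make(chan struct{}, concurrency) // counting semaphore
	var wg sync.WaitGroup

	for i, r := range reqs {
		wg.Add(1)
		sem <- struct{}{} // blocks while `concurrency` calls are running
		go func(i int, r Request) {
			defer wg.Done()
			defer func() { <-sem }()
			resp, err := do(r)
			// Each goroutine writes to its own index, so no lock is needed.
			results[i] = Result{Request: r, Response: resp, Err: err}
		}(i, r)
	}
	wg.Wait()
	return results
}

func main() {
	fake := func(r Request) (*Response, error) {
		if r.URL == "http://my.server.com/jobs/2/config" {
			return nil, errUnreachable
		}
		return &Response{StatusCode: 200, Body: "{}"}, nil
	}
	reqs := []Request{
		{URL: "http://my.server.com/jobs/1/config"},
		{URL: "http://my.server.com/jobs/2/config"},
		{URL: "http://my.server.com/jobs/3/config"},
	}
	for _, res := range executeAll(reqs, 2, fake) {
		if res.Err != nil {
			// The equivalent of the is_error(r) branch in the script.
			fmt.Printf("skipping %s: %v\n", res.Request.URL, res.Err)
			continue
		}
		fmt.Printf("ok %s: %d\n", res.Request.URL, res.Response.StatusCode)
	}
}
```

Keeping failed requests in the result list, rather than aborting on the first error, is what lets the caller decide per item whether to skip, log or stop.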

exit()

Terminates the script execution.

Output

The output of the extractor depends on the user-specified script. It can emit 0 or more assets.

Worked Example

Let's consider a service that returns a list of users in the following format in response to a GET call on the endpoint http://my_user_service.company.com/api/v1/users:

{
  "success": "<bool>",
  "message": "<string>",
  "data": [
    {
      "manager_name": "<string>",
      "terminated": "<string: true/false>",
      "fullname": "<string>",
      "location_name": "<string>",
      "work_email": "<string: email>",
      "supervisory_org_id": "<string>",
      "supervisory_org_name": "<string>",
      "preferred_last_name": "<string>",
      "business_title": "<string>",
      "company_name": "<string>",
      "cost_center_id": "<string>",
      "preferred_first_name": "<string>",
      "product_name": "<string>",
      "cost_center_name": "<string>",
      "employee_id": "<string>",
      "manager_id": "<string>",
      "location_id": "<string: ID/IN>",
      "manager_id_2": "<string>",
      "termination_date": "<string: YYYY-MM-DD>",
      "company_hierarchy": "<string>",
      "company_id": "<string>",
      "preferred_middle_name": "<string>",
      "preferred_social_suffix": "<string>",
      "legal_middle_name": "<string>",
      "manager_email_2": "<string: email>",
      "legal_first_name": "<string>",
      "manager_name_2": "<string>",
      "manager_email": "<string: email>",
      "legal_last_name": "<string>"
    }
  ]
}

Assuming the authentication can be done using an Api-Token header, we can use the following recipe:

source:
  scope: production
  type: http
  config:
    request:
      url: "http://my_user_service.company.com/api/v1/users"
      method: "GET"
      headers:
        "Api-Token": "1a4336bc-bc6a-4972-83c1-d6426b4d79c3"
      content_type: application/json
      accept: application/json
      timeout: 5s
    success_codes: [ 200 ]
    script:
      engine: tengo
      source: |
        if !response.body.success {
          exit()
        }

        users := response.body.data
        for u in users {
          if u.work_email == "" {
            continue
          }

          asset := new_asset("user")
          // URN format: "urn:{service}:{scope}:{type}:{id}"
          asset.urn = format("urn:%s:%s:user:%s", "my_usr_svc", recipe_scope, u.employee_id)
          asset.name = u.fullname
          asset.service = "my_usr_svc"
          // asset.type = "user" // not required, new_asset("user") sets the field.
          asset.data.email = u.work_email
          asset.data.username = u.employee_id
          asset.data.first_name = u.legal_first_name
          asset.data.last_name = u.legal_last_name
          asset.data.full_name = u.fullname
          asset.data.display_name = u.fullname
          asset.data.title = u.business_title
          asset.data.status = u.terminated == "true" ? "suspended" : "active"
          asset.data.manager_email = u.manager_email
          asset.data.attributes = {
            manager_id:           u.manager_id,
            cost_center_id:       u.cost_center_id, 
            supervisory_org_name: u.supervisory_org_name,
            location_id:          u.location_id,
            service_job_id:       response.header["x-job-id"]
          }
          emit(asset)
        }

This emits a 'User' asset for each user object in response.body.data. Response headers are available under response.header and can be used as needed.
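
For readers more familiar with Go than Tengo, the mapping logic can be sketched as plain Go. This is only an illustration under stated assumptions: the asset struct is a hypothetical, trimmed-down stand-in for the real user asset, only a few fields of the response are decoded, users without a work_email are skipped, and the URN scope is passed in as a parameter.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// user holds the subset of response fields used in this sketch.
type user struct {
	EmployeeID string `json:"employee_id"`
	Fullname   string `json:"fullname"`
	WorkEmail  string `json:"work_email"`
	Terminated string `json:"terminated"` // "true" or "false", as a string
}

type usersResponse struct {
	Success bool   `json:"success"`
	Data    []user `json:"data"`
}

// asset is a hypothetical stand-in for the emitted user asset.
type asset struct {
	URN, Name, Email, Status string
}

// mapUsers decodes the response body and returns one asset per user that
// has a work email. It returns no assets when success is false, which
// corresponds to calling exit() in the script.
func mapUsers(body []byte, service, scope string) ([]asset, error) {
	var resp usersResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode body: %w", err)
	}
	if !resp.Success {
		return nil, nil
	}

	var assets []asset
	for _, u := range resp.Data {
		if u.WorkEmail == "" {
			continue
		}
		status := "active"
		if u.Terminated == "true" {
			status = "suspended"
		}
		assets = append(assets, asset{
			// URN format: "urn:{service}:{scope}:{type}:{id}"
			URN:    fmt.Sprintf("urn:%s:%s:user:%s", service, scope, u.EmployeeID),
			Name:   u.Fullname,
			Email:  u.WorkEmail,
			Status: status,
		})
	}
	return assets, nil
}

func main() {
	body := []byte(`{
	  "success": true,
	  "message": "",
	  "data": [
	    {"employee_id": "E1", "fullname": "Ann Lee", "work_email": "ann@company.com", "terminated": "false"},
	    {"employee_id": "E2", "fullname": "No Mail", "work_email": "", "terminated": "false"}
	  ]
	}`)
	assets, err := mapUsers(body, "my_usr_svc", "production")
	if err != nil {
		panic(err)
	}
	for _, a := range assets {
		fmt.Printf("%s %s %s\n", a.URN, a.Email, a.Status)
	}
}
```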

Caveats

The following features are currently not supported:

  • Explicit authentication support, e.g. Basic auth, OAuth, OAuth2 or JWT.
  • Retries with configurable backoff.
  • Content type for request/response body other than application/json.

Contributing

Refer to the contribution guidelines for information on contributing to this module.

Documentation

Types

type Config

type Config struct {
	Request      RequestConfig `mapstructure:"request"`
	SuccessCodes []int         `mapstructure:"success_codes" validate:"dive,gte=200,lt=300" default:"[200]"`
	Concurrency  int           `mapstructure:"concurrency" validate:"gte=1,lte=100" default:"5"`
	Script       Script        `mapstructure:"script"`
	BeforeScript *Script       `mapstructure:"before_script"`
}

Config holds the set of configuration for the HTTP extractor.
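
The default and validate tags on SuccessCodes and Concurrency encode a few simple rules. The sketch below spells them out by hand in plain Go for illustration; the config type, applyDefaults and validate are hypothetical names, and treating a zero value as "unset" when applying defaults is an assumption about how the defaults are filled in.

```go
package main

import "fmt"

// config mirrors only the two top-level numeric settings of Config.
type config struct {
	SuccessCodes []int
	Concurrency  int
}

// applyDefaults fills in the documented defaults for unset fields.
// Treating the zero value as "unset" is an assumption of this sketch.
func applyDefaults(c config) config {
	if len(c.SuccessCodes) == 0 {
		c.SuccessCodes = []int{200}
	}
	if c.Concurrency == 0 {
		c.Concurrency = 5
	}
	return c
}

// validate hand-codes the rules expressed by the struct tags:
// every success code in [200, 300) and concurrency in [1, 100].
func validate(c config) error {
	for _, code := range c.SuccessCodes {
		if code < 200 || code >= 300 {
			return fmt.Errorf("success_codes: %d is outside [200, 300)", code)
		}
	}
	if c.Concurrency < 1 || c.Concurrency > 100 {
		return fmt.Errorf("concurrency: %d is outside [1, 100]", c.Concurrency)
	}
	return nil
}

func main() {
	cfg := applyDefaults(config{})
	fmt.Println(cfg.SuccessCodes, cfg.Concurrency, validate(cfg))

	bad := applyDefaults(config{SuccessCodes: []int{200, 404}})
	fmt.Println(validate(bad))
}
```

One consequence worth noting from the tags themselves: only 2xx status codes can be listed under success_codes.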

type Extractor

type Extractor struct {
	plugins.BaseExtractor
	// contains filtered or unexported fields
}

Extractor is responsible for executing an HTTP request as per configuration and executing the script with the response to 'extract' assets from within the script.

func New

func New(logger log.Logger) *Extractor

New returns a pointer to an initialized Extractor.

func (*Extractor) Extract

func (e *Extractor) Extract(ctx context.Context, emit plugins.Emit) error

Extract executes an HTTP request as per the configuration and if successful, executes the script. The script has access to the response and can use the same to 'emit' assets from within the script.

func (*Extractor) Init

func (e *Extractor) Init(ctx context.Context, config plugins.Config) error

Init initializes the extractor.

type QueryParam

type QueryParam struct {
	Key   string `mapstructure:"key" validate:"required"`
	Value string `mapstructure:"value" validate:"required"`
}

type RequestConfig

type RequestConfig struct {
	RoutePattern string            `mapstructure:"route_pattern" default:""`
	URL          string            `mapstructure:"url" validate:"required,url"`
	QueryParams  []QueryParam      `mapstructure:"query_params" validate:"dive"`
	Method       string            `mapstructure:"method" validate:"oneof=GET POST" default:"GET"`
	Headers      map[string]string `mapstructure:"headers"`
	ContentType  string            `mapstructure:"content_type" validate:"required,oneof=application/json"`
	Accept       string            `mapstructure:"accept" validate:"required,oneof=application/json"`
	Body         interface{}       `mapstructure:"body"`
	Timeout      time.Duration     `mapstructure:"timeout" validate:"min=1ms" default:"5s"`
}

type Script (added in v0.10.3)

type Script struct {
	Engine          string `mapstructure:"engine" validate:"required,oneof=tengo"`
	Source          string `mapstructure:"source" validate:"required"`
	MaxAllocs       int64  `mapstructure:"max_allocs" validate:"gt=100" default:"5000"`
	MaxConstObjects int    `mapstructure:"max_const_objects" validate:"gt=10" default:"500"`
}
