data_filter_elasticsearch

Module version: v0.0.0-...-0626e7f
Published: Nov 29, 2024 License: Apache-2.0

OPA-Elasticsearch Data Filtering Example

This directory contains an example of how to perform data filtering in Elasticsearch using the queries provided by OPA's Compile API.

The example server is written in Go. When it receives an API request, it asks OPA for a set of conditions to apply to the Elasticsearch query that serves the request. OPA is integrated as a library.
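As a rough illustration of how such conditions might be applied, the sketch below assumes OPA's partial evaluation returns a set of alternative queries (any one may hold), each made of expressions that must all hold. It combines them into a single Elasticsearch Bool query. The helper names (term, allOf, anyOf, render) and the field names are hypothetical and are not taken from the example server.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Query is a generic Elasticsearch query clause encoded as nested maps.
type Query map[string]any

// term builds a Term query matching an exact value in a field.
func term(field string, value any) Query {
	return Query{"term": Query{field: value}}
}

// allOf requires every clause to match (Bool query, filter context).
func allOf(clauses ...Query) Query {
	return Query{"bool": Query{"filter": clauses}}
}

// anyOf requires at least one clause to match (Bool query, should context).
func anyOf(clauses ...Query) Query {
	return Query{"bool": Query{"should": clauses, "minimum_should_match": 1}}
}

// render serializes a query to compact JSON; encoding/json sorts map keys.
func render(q Query) string {
	b, err := json.Marshal(q)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Hypothetical conditions: bob's own posts, or posts in a department
	// at a given clearance level. The field names are assumptions.
	q := anyOf(
		allOf(term("author", "bob")),
		allOf(term("department", "dev"), term("clearance", 2)),
	)
	fmt.Println(render(q))
}
```

The resulting clause could then be attached as a filter to whatever query the endpoint would otherwise run, so that only permitted documents are returned.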

Building

Build the example by running make build

Running the example

  1. Run Elasticsearch with security turned off (the example assumes HTTP and default credentials). Dockerized example:
docker run --rm -d -p 9200:9200 -e "xpack.security.enabled=false" -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:8.1.1

See Elasticsearch's Installation docs for other methods of installation.

  2. Build the policy bundle:
opa build -b policy

This will produce a bundle.tar.gz file from the policy/example.rego file.

  3. Run an nginx server to serve the bundle. The application needs a bundle server to fetch the policy bundle from; this example uses nginx.
docker run --rm -d --name bundle-server -p 8888:80 -v ${PWD}/bundle.tar.gz:/usr/share/nginx/html/bundle.tar.gz:ro nginx:latest
  4. Start the example server. This will use the opa-conf.yaml file to configure OPA to download bundles from nginx.
./opa-es-filtering

The server listens on :8080 and exposes two endpoints, /posts and /posts/{post_id}. OPA is loaded with an example policy from the file example.rego, which has rules for both endpoints.

  5. Open a new window and make a request:

    curl -H "Authorization: bob" localhost:8080/posts | jq .

    This will return all the posts that bob is allowed to see depending on the policy loaded into OPA. All policies are defined in the example.rego file.
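The same request can be made from Go. The following is a minimal sketch that mirrors the curl command above; the helper name newPostsRequest is hypothetical, and the base URL assumes the server is running locally on :8080 as described.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

// newPostsRequest builds a GET request for the /posts endpoint and
// identifies the caller through the Authorization header.
func newPostsRequest(baseURL, user string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/posts", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", user)
	return req, nil
}

// run sends the request as user "bob" and copies the raw JSON response to stdout.
func run() error {
	req, err := newPostsRequest("http://localhost:8080", "bob")
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(os.Stdout, resp.Body)
	return err
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Changing the user argument should change which posts come back, since the policy decides visibility based on the Authorization header value.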

Supported OPA Built-in Functions

Comparison
  • ==
  • !=
  • <
  • <=
  • >
  • >=
Strings
  • contains
Regex
  • re_match

Support for OPA references

References are used to access nested documents in OPA. OPA policies can be written over deeply nested structures, which the server then translates to Elasticsearch Nested queries.
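One way such a translation could work is sketched below. It assumes that every parent object along a dotted field path is mapped with the nested type and wraps a Term query in one Nested query per parent path. The function nestedTerm and the field names are hypothetical; the example server's actual translation may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Query is a generic Elasticsearch query clause encoded as nested maps.
type Query map[string]any

// nestedTerm wraps a Term query on a dotted field in one Nested query per
// parent path, innermost first, so "a.b.c" yields nested(a, nested(a.b, term)).
// Assumption: every parent object is mapped with the "nested" type.
func nestedTerm(field string, value any) Query {
	q := Query{"term": Query{field: value}}
	parts := strings.Split(field, ".")
	for i := len(parts) - 1; i >= 1; i-- {
		path := strings.Join(parts[:i], ".")
		q = Query{"nested": Query{"path": path, "query": q}}
	}
	return q
}

// render serializes a query to compact JSON; encoding/json sorts map keys.
func render(q Query) string {
	b, err := json.Marshal(q)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Hypothetical field path two levels deep.
	fmt.Println(render(nestedTerm("stats.author.name", "bob")))
}
```

A field without dots is returned as a plain Term query, since there is no parent path to wrap.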

Generated Elasticsearch queries

For the OPA operators listed above, the server generates the following Elasticsearch queries:

Term level Queries
  • Term Query
  • Range Query
  • Regexp Query
Joining Queries
  • Nested Query
Compound Queries
  • Bool Query
Full Text Queries
  • Match Query
  • Query String Query
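To make the relationship between operators and query types concrete, here is a minimal sketch of one possible mapping. The pairing of each operator with a query type is an assumption made for illustration (for instance, contains as a wildcard Query String query and != as a Bool must_not around a Term query). It does not cover the Match query, and the function translate is hypothetical rather than the server's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Query is a generic Elasticsearch query clause encoded as nested maps.
type Query map[string]any

// rangeKeys maps OPA comparison operators to Range query parameters.
var rangeKeys = map[string]string{"<": "lt", "<=": "lte", ">": "gt", ">=": "gte"}

// translate converts one "field op value" condition into a query clause.
// The operator-to-query pairing here is assumed, not taken from the server.
func translate(op, field string, value any) (Query, error) {
	switch op {
	case "==":
		return Query{"term": Query{field: value}}, nil
	case "!=":
		return Query{"bool": Query{"must_not": []Query{{"term": Query{field: value}}}}}, nil
	case "<", "<=", ">", ">=":
		return Query{"range": Query{field: Query{rangeKeys[op]: value}}}, nil
	case "contains":
		// Wildcards on both sides; special characters in value are not escaped.
		return Query{"query_string": Query{"default_field": field, "query": fmt.Sprintf("*%v*", value)}}, nil
	case "re_match":
		return Query{"regexp": Query{field: Query{"value": value}}}, nil
	default:
		return nil, fmt.Errorf("unsupported operator %q", op)
	}
}

// render serializes a query to compact JSON; encoding/json sorts map keys.
func render(q Query) string {
	b, err := json.Marshal(q)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	q, err := translate(">=", "clearance", 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(render(q))
}
```

Returning an error from the default case keeps unsupported operators from being silently dropped, which would otherwise widen the set of documents a user can see.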

Limitations

  • The server loads an Elasticsearch index template that defines the settings and the mapping for the posts index. The index itself is also created when the server starts.

  • The OPA policies should be written against the fields that exist in the Elasticsearch documents to get the desired results. How Elasticsearch handles unmapped fields depends on the type of query. For example, a Term query returns no matches if it refers to a field that is not in the mapping. A Nested query, on the other hand, fails if the defined path doesn't point to an object field in the mapping. To obtain uniform behaviour across queries, the server generates Nested queries that ignore an unmapped path, so that such a query matches no documents instead of throwing an exception.

  • The server supports only a limited set of OPA operators and returns an error if the OPA policy contains an unsupported operator.

  • The server supports only two endpoints /posts and /posts/{post_id} for fetching posts created when the server starts.
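The Nested query behaviour described in the limitations can be sketched as follows. Elasticsearch's Nested query accepts an ignore_unmapped parameter. Setting it to true is presumably how an unmapped path is made to match nothing rather than raise an error, though the README does not name the parameter, so treat that as an assumption. The helper nestedIgnoreUnmapped and the field names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Query is a generic Elasticsearch query clause encoded as nested maps.
type Query map[string]any

// nestedIgnoreUnmapped builds a Nested query that matches no documents,
// instead of failing, when path is not present in the index mapping.
func nestedIgnoreUnmapped(path string, inner Query) Query {
	return Query{"nested": Query{
		"path":            path,
		"query":           inner,
		"ignore_unmapped": true,
	}}
}

// render serializes a query to compact JSON; encoding/json sorts map keys.
func render(q Query) string {
	b, err := json.Marshal(q)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Hypothetical nested field; the names are assumptions.
	inner := Query{"term": Query{"author.name": "bob"}}
	fmt.Println(render(nestedIgnoreUnmapped("author", inner)))
}
```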

Directories

Path Synopsis
cmd
internal
api
es
opa
