meilibridge

module
v0.6.5
Published: Aug 26, 2024 License: MIT

meilibridge is a robust package designed to seamlessly sync data from both SQL and NoSQL databases to Meilisearch, providing an efficient and unified search solution.

Features

  • Supports multiple data sources
  • Compatible with various databases such as MongoDB (currently supported), MySQL, and PostgreSQL
  • Index configuration options
  • Real-time synchronization (not yet supported for MySQL and PostgreSQL)
  • Bulk sync support with options to continue or reindex
  • Concurrent data bridging to Meilisearch
  • Customizable fields for indexing
  • Configurable primary key for each index
  • A separate Meilisearch instance for each bridge
  • Scheduled automatic bulk sync on existing indexes in the background
  • Webhook-triggered sync for individual indexes

Installation

You can install meilibridge using different methods.

Release

Download the latest release from the project's releases page.

Go Installation

Install the package using Go:

go install github.com/Ja7ad/meilibridge/cmd/meilibridge@latest

Docker

Run meilibridge with real-time sync using Docker:

  • Docker
docker run --rm -it -v ./config.yml:/etc/meilibridge/config.yml --name meilibridge ja7adr/meilibridge
  • Docker compose
version: "3"
services:
  meilibridge:
    image: ja7adr/meilibridge:latest
    volumes:
      - ./config.yml:/etc/meilibridge/config.yml
    restart: always

Example Configuration

An example configuration for running meilibridge:

general:
  # Trigger sync synchronizes data when a signal is received from a webhook.
  # It exposes a route for each index of each bridge to receive the signal and start the sync.
  # For example: http://127.0.0.1:8800/{bridge_name}/{index_name}
  trigger_sync:
    token: foobar # The token secures your webhook and must be sent in the header with the key "x-token-key".
    listen: 127.0.0.1:8800
  auto_bulk_interval: 1800 # interval in seconds between automatic bulk syncs (continue mode) on existing indexes, default is 1800 (30 min)
  pprof:
    enable: false
    listen: 127.0.0.1:9900

bridges:
  - name: bridge1 # name is required

    meilisearch:
      # API address of meilisearch
      api_url: http://127.0.0.1:7700
      # master key https://www.meilisearch.com/docs/learn/security/differences_master_api_keys#master-key
      # optional
      api_key: foobar

    database:
      # database engine: mongo, mysql, or postgres
      engine: mongo
      host: "localhost"
      port: 27017
      user: "foo"
      password: "bar"
      database: "foobar"
      # custom parameters for the database engine, as key: value pairs
      custom_params:
        directConnection: true
        replicaSet: test

    # index_map maps each source collection or table to a Meilisearch index
    # source collection or table -> index
    index_map:
      # to sync a view, use the form original_table_name:view_name
      col1:col1_view_name:
        index_name: idx1
        # primary key of the Meilisearch index (required).
        # if the field is renamed under "fields", enter the renamed value here, not the database key.
        # for MongoDB, use the _id field as the primary key.
        # https://www.meilisearch.com/docs/learn/core_concepts/primary_key#primary-field
        primary_key: id
        fields:
          _id: id
          first_name:
          last_name:
          age:
          created_at:

        settings:
          # list of strings Meilisearch should parse as a single term, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#dictionary
          dictionary:
            - foo
            - bar

          # the distinct attribute is a special, user-designated field. It is most commonly used to prevent Meilisearch
          # from returning a set of several similar documents, instead forcing it to return only one, default is empty
          # https://www.meilisearch.com/docs/learn/relevancy/distinct_attribute#setting-a-distinct-attribute-during-configuration
          distinct_attribute: foo

          # fields displayed in the returned documents, default is all attributes
          # https://www.meilisearch.com/docs/reference/api/settings#displayed-attributes
          displayed_attributes:
            - foo
            - bar

          # faceting settings
          # https://www.meilisearch.com/docs/reference/api/settings#faceting-object
          faceting:
            # maximum number of facet values returned for each facet. Values are sorted in ascending lexicographical order
            # default is 100
            max_values_per_facet: 100

          # attributes to use as filters and facets, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#filterable-attributes
          filterable_attributes:
            - first_name
            - last_name

          # fields in which to search for matching query words sorted by order of importance, default is all attributes ["*"]
          # https://www.meilisearch.com/docs/reference/api/settings#searchable-attributes
          searchable_attributes:
            - first_name
            - last_name
            - age

          # attributes to use when sorting search results, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#sortable-attributes
          sortable_attributes:
            - age

          # pagination settings
          # https://www.meilisearch.com/docs/reference/api/settings#pagination
          pagination:
            # the maximum number of search results Meilisearch can return, default is 1000
            # note: setting maxTotalHits to a value higher than the default will negatively impact search performance.
            # setting maxTotalHits to values over 20000 may result in queries taking seconds to complete.
            max_total_hits: 500

          # precision level when calculating the proximity ranking rule, default is "byWord"
          # https://www.meilisearch.com/docs/reference/api/settings#proximity-precision
          proximity_precision: "byWord"

          # list of ranking rules in order of importance,
          # default is ["words", "typo", "proximity", "attribute", "sort", "exactness"]
          # https://www.meilisearch.com/docs/reference/api/settings#ranking-rules
          ranking_rules:
            - "words"
            - "typo"

          # maximum duration of a search query in milliseconds; set 0 for null, default is 1500
          # https://www.meilisearch.com/docs/reference/api/settings#search-cutoff
          search_cutoff_ms: 500

          # list of characters delimiting where one term begins and ends, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#separator-tokens
          separator_tokens:
            - foo
            - bar

          # list of characters not delimiting where one term begins and ends, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#non-separator-tokens
          non_separator_tokens:
            - foo
            - bar

          # list of words ignored by Meilisearch when present in search queries, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#stop-words
          stop_words:
            - foo
            - bar

          # list of associated words treated similarly, default is empty
          # https://www.meilisearch.com/docs/reference/api/settings#synonyms
          synonyms:
            wolverine:
              - foo
              - bar
            logan:
              - x
              - y
              - z

          # typo tolerance settings
          # https://www.meilisearch.com/docs/reference/api/settings#typo-tolerance
          typo_tolerance:
            # whether typo tolerance is enabled or not, default is true
            enabled: true

            # minimum word sizes for accepting 1 typo (default is 5) and 2 typos (must be between oneTypo and 255, default is 9)
            min_word_size_for_typos:
              one_typo: 5
              two_typos: 9

            # an array of words for which the typo tolerance feature is disabled, default is empty
            disable_on_words:
              - foo
              - bar

            # an array of attributes for which the typo tolerance feature is disabled, default is empty
            disable_on_attributes:
              - foo
              - bar

          # embedders translate documents and queries into vector embeddings. You must configure at
          # least one embedder to use AI-powered search, this is experimental.
          # https://www.meilisearch.com/docs/reference/api/settings#embedders-experimental
          embedders:
            embedder1:
              source: source1
              api_key: apikey1
              model: model1
              dimensions: 128
              document_template: template1

            embedder2:
              source: source2
              api_key: apikey2
              model: model2
              dimensions: 128
              document_template: template2

      col2:
        index_name: idx2
        primary_key: id
        fields:
        settings:

  - name: bridge2

    meilisearch:
      api_url: http://127.0.0.1:7700
      api_key: foobar

    database:
      engine: mysql
      host: "localhost"
      port: 6315
      user: "foo"
      password: "bar"
      database: "foobar"

    index_map:
      col1:
        index_name: idx1
        primary_key: id
        fields:
        settings:
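
The index_map comments in the configuration describe two conventions: a table:view key for syncing a view, and a fields mapping from source field names to index field names. The following is a minimal Python sketch of how such a mapping could be applied to a single record. The helper names split_source_key and apply_field_map are hypothetical, and the treatment of unlisted fields and of an empty fields block is an assumption, not behavior confirmed by meilibridge.

```python
from typing import Any, Dict, Optional, Tuple


def split_source_key(key: str) -> Tuple[str, Optional[str]]:
    """Split an index_map key of the form 'table' or 'table:view'."""
    table, sep, view = key.partition(":")
    return table, (view if sep else None)


def apply_field_map(
    doc: Dict[str, Any],
    fields: Optional[Dict[str, Optional[str]]],
) -> Dict[str, Any]:
    """Project a source record onto the configured fields.

    Assumed semantics (not confirmed by the README):
    - an empty or missing mapping keeps the record unchanged;
    - 'source: target' renames the field;
    - 'source:' with no value keeps the original name;
    - fields that are not listed are dropped.
    """
    if not fields:
        return dict(doc)
    out: Dict[str, Any] = {}
    for source, target in fields.items():
        if source in doc:
            out[target or source] = doc[source]
    return out


# Mirrors the "fields" block of the first index in the example configuration.
fields = {"_id": "id", "first_name": None, "last_name": None,
          "age": None, "created_at": None}
record = {"_id": "65d0982320ff6c9a9a09eca2", "first_name": "Ada",
          "last_name": "Lovelace", "age": 36, "password": "secret"}

print(split_source_key("col1:col1_view_name"))
print(apply_field_map(record, fields))
```

Under these assumptions the _id field is renamed to id, which is why primary_key in the example is set to id rather than _id.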

How to run?

$ meilibridge -h
Meilibridge is a robust package designed to seamlessly sync data from both SQL and NoSQL databases to Meilisearch,
providing an efficient and unified search solution.

Usage:
  meilibridge [command]

Available Commands:
  help        Help about any command
  sync        Bulk or real-time sync
  version     Print the version number

Flags:
  -h, --help   Help for meilibridge

Use "meilibridge [command] --help" for more information about a command.

Bulk Sync

Bulk sync recreates the index and syncs all data to Meilisearch.

$ meilibridge sync bulk -h
Start bulk sync operation.

Usage:
  meilibridge sync bulk [flags]

Flags:
  -c, --config string   Path to config file (default "/etc/meilibridge/config.yml")
      --continue        Sync new data on existing index
  -h, --help            Help for bulk

Example:

$ meilibridge sync bulk -c ./config.yml

Bulk Sync with Continue

Bulk sync continues to sync new data to Meilisearch on an existing index.
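
The difference between a plain bulk sync and a bulk sync with --continue can be illustrated with an in-memory model. This is a sketch only: the dictionary stands in for a Meilisearch index, the function name bulk_sync is hypothetical, and the rule that already-indexed documents are skipped in continue mode is an assumption about what "sync new data" means.

```python
from typing import Any, Dict, Iterable


def bulk_sync(
    source: Iterable[Dict[str, Any]],
    index: Dict[Any, Dict[str, Any]],
    primary_key: str = "id",
    continue_sync: bool = False,
) -> int:
    """Copy documents from source into index and return how many were written."""
    if not continue_sync:
        # Plain bulk sync: the index is recreated, so start from empty.
        index.clear()
    written = 0
    for doc in source:
        key = doc[primary_key]
        if continue_sync and key in index:
            # Assumed: documents that are already indexed are left untouched.
            continue
        index[key] = doc
        written += 1
    return written


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
index: Dict[Any, Dict[str, Any]] = {}

print(bulk_sync(rows, index))                      # full sync of both rows
rows.append({"id": 3, "name": "c"})
print(bulk_sync(rows, index, continue_sync=True))  # only the new row is written
```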

Example:

$ meilibridge sync bulk -c ./config.yml --continue

Auto bulk sync

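Auto bulk sync runs a bulk sync on existing indexes in the background on a schedule. The interval is set by auto_bulk_interval in the general section of the configuration (1800 seconds by default).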

$ meilibridge sync bulk -h
Start bulk sync operation.

Usage:
  meilibridge sync bulk [flags]

Flags:
  -c, --config string   Path to config file (default "/etc/meilibridge/config.yml")
      --continue        Sync new data on existing index
      --auto            Auto bulk sync on exists index every n seconds
  -h, --help            Help for bulk

Example:

$ meilibridge sync bulk -c ./config.yml --auto

Real-time Sync

meilibridge supports real-time data synchronization on write operations of the database by watching or triggering events.

$ meilibridge sync start -h
Start real-time sync operation.

Usage:
  meilibridge sync start [flags]

Flags:
  -c, --config string   Path to config file (default "/etc/meilibridge/config.yml")
  -h, --help            Help for start

Example:

$ meilibridge sync start -c ./config.yml

Trigger Sync

meilibridge supports trigger synchronization with specific webhooks for indexes. Each index has a unique webhook at the path /{bridge_name}/{index_name}.
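
As a rough illustration of the path scheme, the sketch below lists the webhook URLs that a configuration would be expected to expose. It works on a plain dictionary shaped like the example configuration. The function name webhook_urls is hypothetical, and two things are assumed rather than confirmed: that the {index_name} segment is the index_name value from index_map, and that the routes are served over plain http on the trigger_sync listen address.

```python
from typing import Any, Dict, List


def webhook_urls(config: Dict[str, Any]) -> List[str]:
    """Build one URL per (bridge, index) pair found in the configuration."""
    listen = config["general"]["trigger_sync"]["listen"]
    urls: List[str] = []
    for bridge in config["bridges"]:
        for index_settings in bridge["index_map"].values():
            # Assumed path layout: /{bridge_name}/{index_name}
            urls.append(
                f"http://{listen}/{bridge['name']}/{index_settings['index_name']}"
            )
    return urls


# A trimmed-down dictionary with the same shape as the example configuration.
example_config = {
    "general": {"trigger_sync": {"token": "foobar", "listen": "127.0.0.1:8800"}},
    "bridges": [
        {
            "name": "bridge1",
            "index_map": {
                "col1:col1_view_name": {"index_name": "idx1"},
                "col2": {"index_name": "idx2"},
            },
        },
    ],
}

for url in webhook_urls(example_config):
    print(url)
```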

$ meilibridge sync trigger -h
start trigger sync

Usage:
  meilibridge sync trigger [flags]

Flags:
  -c, --config string   path to config file (default "/etc/meilibridge/config.yml")
  -h, --help            help for trigger

Example:

$ meilibridge sync trigger -c ./config.yml

Example API call to the webhook:

curl --location 'http://127.0.0.1:8800/bridge/foo' \
--header 'x-token-key: foobar' \
--header 'Content-Type: application/json' \
--data '{
    "index_uid": "foo",
    "type": "UPDATE",
    "document": {
        "primary_key": "_id",
        "primary_value": "65d0982320ff6c9a9a09eca2"
    }
}'

  • index_uid: The index UID in Meilisearch.
  • type: The operation type for the document (INSERT, UPDATE, DELETE).
  • document: The object used to find the document.
  • document.primary_key: The column name or field name that serves as the primary key for the Meilisearch index.
  • document.primary_value: The specific value used to find the document in the database table or collection for synchronization with Meilisearch.
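
The same webhook call can be prepared from Python with the standard library. The sketch below only builds the request and reuses the URL, header names, and payload fields from the curl example. The function name build_trigger_request and its parameters are hypothetical, and the call is not sent, so the code runs without a meilibridge instance listening.

```python
import json
import urllib.request

ALLOWED_TYPES = {"INSERT", "UPDATE", "DELETE"}


def build_trigger_request(
    base_url: str,
    bridge: str,
    index_uid: str,
    op_type: str,
    primary_key: str,
    primary_value: str,
    token: str,
) -> urllib.request.Request:
    """Build the POST request for one trigger sync webhook call."""
    if op_type not in ALLOWED_TYPES:
        raise ValueError(f"type must be one of {sorted(ALLOWED_TYPES)}")
    payload = {
        "index_uid": index_uid,
        "type": op_type,
        "document": {
            "primary_key": primary_key,
            "primary_value": primary_value,
        },
    }
    return urllib.request.Request(
        url=f"{base_url}/{bridge}/{index_uid}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-token-key": token, "Content-Type": "application/json"},
        method="POST",
    )


req = build_trigger_request(
    "http://127.0.0.1:8800", "bridge", "foo",
    "UPDATE", "_id", "65d0982320ff6c9a9a09eca2", "foobar",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the returned object to urllib.request.urlopen sends the call. The shape of the response is not described here, so the sketch does not try to parse it.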

Directories

Path Synopsis
cmd
pkg
