cohere

package
v0.28.0-beta
Published: Sep 24, 2024 License: MIT Imports: 10 Imported by: 0

README

---
title: "Cohere"
lang: "en-US"
draft: false
description: "Learn about how to set up a VDP Cohere component https://github.com/instill-ai/instill-core"
---

The Cohere component is an AI component that lets users connect to the AI models served on the Cohere Platform.
It can carry out the following tasks:
- [Text Generation Chat](#text-generation-chat)
- [Text Embeddings](#text-embeddings)
- [Text Reranking](#text-reranking)

## Release Stage

`Alpha`

## Configuration

The component definition and tasks are defined in the [definition.json](https://github.com/instill-ai/component/blob/main/ai/cohere/v0/config/definition.json) and [tasks.json](https://github.com/instill-ai/component/blob/main/ai/cohere/v0/config/tasks.json) files respectively.

## Setup


In order to communicate with Cohere, the following connection details need to be
provided. You may specify them directly in a pipeline recipe as key-value pairs
within the component's `setup` block, or you can create a **Connection** from
the [**Integration Settings**](https://www.instill.tech/docs/vdp/integration)
page and reference the whole `setup` as
`setup: ${connection.<my-connection-id>}`.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| API Key | `api-key` | string | Fill in your Cohere API key. To find your keys, visit the Cohere dashboard page.  |




## Supported Tasks

### Text Generation Chat

Cohere's text generation models (often called generative pre-trained transformers or large language models) have been trained to understand natural language and code. The models provide text outputs in response to their inputs. The inputs to these models are also referred to as "prompts". Designing a prompt is essentially how you “program” a large language model, usually by providing instructions or some examples of how to successfully complete a task.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TEXT_GENERATION_CHAT` |
| Model Name (required) | `model-name` | string | The Cohere command model to be used |
| Prompt (required) | `prompt` | string | The prompt text |
| System message | `system-message` | string | The system message helps set the behavior of the assistant. For example, you can modify the personality of the assistant or provide specific instructions about how it should behave throughout the conversation. By default, the model uses the generic message "You are a helpful assistant." |
| Documents | `documents` | array[string] | The documents to be used by the model. For optimal performance, each document should be shorter than 300 words. |
| Prompt Images | `prompt-images` | array[string] | The prompt images (Note: As of 2024-06-24, Cohere models are not multimodal, so images will be ignored.) |
| [Chat history](#text-generation-chat-chat-history) | `chat-history` | array[object] | Incorporate external chat history, specifically previous messages within the conversation. Each message should adhere to the format: \{"role": "The message role, i.e. 'USER' or 'CHATBOT'", "content": "message content"\}. |
| Seed | `seed` | integer | The seed (default=42) |
| Temperature | `temperature` | number | The temperature for sampling (default=0.7) |
| Top K | `top-k` | integer | Top k for sampling (default=10) |
| Max new tokens | `max-new-tokens` | integer | The maximum number of tokens for the model to generate (default=50) |

<details>
<summary> Input Objects in Text Generation Chat</summary>

<h4 id="text-generation-chat-chat-history">Chat History</h4>

Incorporate external chat history, specifically previous messages within the conversation. Each message should adhere to the format: {"role": "The message role, i.e. 'USER' or 'CHATBOT'", "content": "message content"}.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| [Content](#text-generation-chat-content) | `content` | array | The message content  |
| Role | `role` | string | The message role, i.e. 'system', 'user' or 'assistant'  |
<h4 id="text-generation-chat-content">Content</h4>

The message content

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| [Image URL](#text-generation-chat-image-url) | `image-url` | object | The image URL  |
| Text | `text` | string | The text content.  |
| Type | `type` | string | The type of the content part.  <br/><details><summary><strong>Enum values</strong></summary><ul><li>`text`</li><li>`image_url`</li></ul></details>  |
<h4 id="text-generation-chat-image-url">Image Url</h4>

The image URL

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| URL | `url` | string | Either a URL of the image or the base64 encoded image data.  |
</details>



| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Text | `text` | string | Model Output |
| [Citations](#text-generation-chat-citations) (optional) | `citations` | array[object] | Citations |
| [Usage](#text-generation-chat-usage) (optional) | `usage` | object | Token Usage on the Cohere Platform Command Models |

<details>
<summary> Output Objects in Text Generation Chat</summary>

<h4 id="text-generation-chat-citations">Citations</h4>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| End | `end` | integer | The end position of the citation |
| Start | `start` | integer | The start position of the citation |
| Text | `text` | string | The text body of the citation |

<h4 id="text-generation-chat-usage">Usage</h4>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Input Tokens | `input-tokens` | number | The input tokens used by Cohere Models |
| Output Tokens | `output-tokens` | number | The output tokens generated by Cohere Models |
</details>
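
The input table above maps directly onto a JSON payload keyed by the field IDs. The following is a minimal sketch of such a payload, assuming local copies of the package's exported `TextGenerationInput`, `ChatMessage`, `MultiModalContent` and `URL` types so that it compiles on its own. The model name, the messages and the helpers `buildChatInput` and `encodeInput` are illustrative assumptions, not part of the component.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirrors of the package's exported input types, copied here so the
// sketch is self-contained.
type URL struct {
	URL string `json:"url"`
}

type MultiModalContent struct {
	ImageURL URL    `json:"image-url"`
	Text     string `json:"text"`
	Type     string `json:"type"`
}

type ChatMessage struct {
	Role    string              `json:"role"`
	Content []MultiModalContent `json:"content"`
}

type TextGenerationInput struct {
	ChatHistory  []ChatMessage `json:"chat-history"`
	MaxNewTokens int           `json:"max-new-tokens"`
	ModelName    string        `json:"model-name"`
	Prompt       string        `json:"prompt"`
	PromptImages []string      `json:"prompt-images"`
	Seed         int           `json:"seed"`
	SystemMsg    string        `json:"system-message"`
	Temperature  float64       `json:"temperature"`
	TopK         int           `json:"top-k"`
	Documents    []string      `json:"documents"`
}

// buildChatInput (hypothetical helper) fills in a chat request using the
// documented defaults. Every value here is illustrative.
func buildChatInput() TextGenerationInput {
	return TextGenerationInput{
		ModelName: "command-r-plus", // assumed model name
		Prompt:    "And what is its population?",
		SystemMsg: "You are a helpful assistant.",
		ChatHistory: []ChatMessage{
			{Role: "USER", Content: []MultiModalContent{{Type: "text", Text: "What is the capital of France?"}}},
			{Role: "CHATBOT", Content: []MultiModalContent{{Type: "text", Text: "The capital of France is Paris."}}},
		},
		Documents:    []string{"Paris is the capital and largest city of France."},
		Seed:         42,
		Temperature:  0.7,
		TopK:         10,
		MaxNewTokens: 50,
	}
}

// encodeInput (hypothetical helper) renders the input as indented JSON.
func encodeInput(in TextGenerationInput) ([]byte, error) {
	return json.MarshalIndent(in, "", "  ")
}

func main() {
	payload, err := encodeInput(buildChatInput())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))
}
```

The printed keys (`chat-history`, `max-new-tokens`, `model-name`, and so on) match the ID column of the input table.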

### Text Embeddings

An embedding is a list of floating point numbers that captures semantic information about the text that it represents.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TEXT_EMBEDDINGS` |
| Embedding Type (required) | `embedding-type` | string | Specifies the return type of the embedding. Note that with the 'binary' and 'ubinary' options the component returns packed unsigned binary embeddings. The length of each binary embedding is 1/8 the length of the float embeddings of the provided model. |
| Input Type (required) | `input-type` | string | Specifies the type of input passed to the model |
| Model Name (required) | `model-name` | string | The Cohere embed model to be used |
| Text (required) | `text` | string | The text |





| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Embedding | `embedding` | array[number] | Embedding of the input text |
| [Usage](#text-embeddings-usage) (optional) | `usage` | object | Token usage on the Cohere platform embed models |

<details>
<summary> Output Objects in Text Embeddings</summary>

<h4 id="text-embeddings-usage">Usage</h4>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Token Count | `tokens` | number | The token count used by Cohere Models |
</details>
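
The 1/8 length ratio for binary embeddings follows from packing eight one-bit dimensions into each byte. The sketch below illustrates only that arithmetic. The helpers `packedLen` and `packSignBits`, the sign-based thresholding and the most-significant-bit-first order are assumptions made for illustration, not a description of how Cohere computes its binary embeddings.

```go
package main

import "fmt"

// packedLen returns the number of bytes needed to hold dims one-bit values.
func packedLen(dims int) int {
	return (dims + 7) / 8
}

// packSignBits stores one bit per dimension (1 when the value is positive),
// filling each byte from its most significant bit. Illustrative only.
func packSignBits(vec []float64) []uint8 {
	out := make([]uint8, packedLen(len(vec)))
	for i, v := range vec {
		if v > 0 {
			out[i/8] |= 1 << (7 - uint(i%8))
		}
	}
	return out
}

func main() {
	vec := []float64{0.3, -0.1, 0.8, -0.5, 0.2, 0.9, -0.7, 0.4, -0.2, 0.6}
	packed := packSignBits(vec)
	fmt.Println("float dims:", len(vec), "packed bytes:", len(packed))
	fmt.Println("a 1024-dimension float embedding packs into", packedLen(1024), "bytes")
}
```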

### Text Reranking

Rerank models sort text inputs by semantic relevance to a specified query. They are often used to sort search results returned from an existing search solution.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TEXT_RERANKING` |
| Model Name (required) | `model-name` | string | The Cohere rerank model to be used |
| Query (required) | `query` | string | The query |
| Documents (required) | `documents` | array[string] | The documents to be used for reranking |
| Top N | `top-n` | integer | The number of most relevant documents or indices to return. Defaults to the length of the documents (default=3) |
| Maximum number of chunks per document | `max-chunks-per-doc` | integer | The maximum number of chunks to produce internally from a document (default=10) |





| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Reranked documents | `ranking` | array[string] | Reranked documents |
| Reranked documents relevance (optional) | `relevance` | array[number] | The relevance scores of the reranked documents |
| [Usage](#text-reranking-usage) (optional) | `usage` | object | Search Usage on the Cohere Platform Rerank Models |

<details>
<summary> Output Objects in Text Reranking</summary>

<h4 id="text-reranking-usage">Usage</h4>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Search Counts | `search-counts` | number | The search count used by Cohere Models |
</details>
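
To make the relationship between `documents`, `top-n`, `ranking` and `relevance` concrete, here is a minimal sketch that sorts documents by score and keeps the first `n`. The scores are hard-coded stand-ins for what a rerank model would return, and the helper `topN` is hypothetical. Treating a non-positive or oversized `n` as "keep everything" is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// topN orders documents by descending relevance score and keeps the first n.
// It returns two parallel slices shaped like the `ranking` and `relevance`
// outputs. A non-positive or oversized n keeps every document.
func topN(docs []string, scores []float64, n int) ([]string, []float64) {
	count := len(docs)
	if len(scores) < count {
		count = len(scores)
	}
	idx := make([]int, count)
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool {
		return scores[idx[a]] > scores[idx[b]]
	})
	if n <= 0 || n > count {
		n = count
	}
	ranking := make([]string, 0, n)
	relevance := make([]float64, 0, n)
	for _, i := range idx[:n] {
		ranking = append(ranking, docs[i])
		relevance = append(relevance, scores[i])
	}
	return ranking, relevance
}

func main() {
	docs := []string{
		"Carson City is the capital city of Nevada.",
		"Washington, D.C. is the capital of the United States.",
		"Capital punishment has existed in the United States for centuries.",
	}
	scores := []float64{0.12, 0.97, 0.45} // hypothetical model scores
	ranking, relevance := topN(docs, scores, 2)
	for i := range ranking {
		fmt.Printf("%.2f  %s\n", relevance[i], ranking[i])
	}
}
```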

Documentation

Constants

const (
	TextGenerationTask = "TASK_TEXT_GENERATION_CHAT"
	TextEmbeddingTask  = "TASK_TEXT_EMBEDDINGS"
	TextRerankTask     = "TASK_TEXT_RERANKING"
)
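
These constants are the values passed in the `task` input field. As a rough sketch of how a caller might branch on them, the example below redeclares the constants locally and uses a hypothetical helper `describeTask`. It does not reflect the component's internal dispatch logic.

```go
package main

import "fmt"

// Local copies of the package constants so the sketch is self-contained.
const (
	TextGenerationTask = "TASK_TEXT_GENERATION_CHAT"
	TextEmbeddingTask  = "TASK_TEXT_EMBEDDINGS"
	TextRerankTask     = "TASK_TEXT_RERANKING"
)

// describeTask (hypothetical helper) maps a task ID to a short description
// and rejects any ID the component does not list.
func describeTask(task string) (string, error) {
	switch task {
	case TextGenerationTask:
		return "chat text generation", nil
	case TextEmbeddingTask:
		return "text embeddings", nil
	case TextRerankTask:
		return "text reranking", nil
	default:
		return "", fmt.Errorf("unsupported task: %q", task)
	}
}

func main() {
	for _, task := range []string{TextGenerationTask, TextRerankTask, "TASK_UNKNOWN"} {
		desc, err := describeTask(task)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(task, "->", desc)
	}
}
```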

Variables

This section is empty.

Functions

func Init

func Init(bc base.Component) *component

func IsEmbeddingOutputInt

func IsEmbeddingOutputInt(embeddingType string) bool
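
Only the signature of `IsEmbeddingOutputInt` is shown here, so its exact behavior is not documented. Given the separate `EmbeddingFloatOutput` and `EmbeddingIntOutput` types, one plausible reading is that it reports whether an embedding type yields integer values. The sketch below is built on that assumption. `isIntEmbeddingType` and `decodeEmbedding` are hypothetical stand-ins, and the set of integer-valued types (`int8`, `uint8`, `binary`, `ubinary`) is assumed from the embedding types Cohere offers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isIntEmbeddingType is a hypothetical stand-in for IsEmbeddingOutputInt.
// It assumes every embedding type other than "float" yields integers.
func isIntEmbeddingType(embeddingType string) bool {
	switch embeddingType {
	case "int8", "uint8", "binary", "ubinary":
		return true
	}
	return false
}

// decodeEmbedding unmarshals the `embedding` field into []int or []float64,
// depending on the requested embedding type.
func decodeEmbedding(raw []byte, embeddingType string) (any, error) {
	if isIntEmbeddingType(embeddingType) {
		var out struct {
			Embedding []int `json:"embedding"`
		}
		if err := json.Unmarshal(raw, &out); err != nil {
			return nil, err
		}
		return out.Embedding, nil
	}
	var out struct {
		Embedding []float64 `json:"embedding"`
	}
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, err
	}
	return out.Embedding, nil
}

func main() {
	ints, err := decodeEmbedding([]byte(`{"embedding":[12,-7,3]}`), "int8")
	if err != nil {
		panic(err)
	}
	floats, err := decodeEmbedding([]byte(`{"embedding":[0.25,-0.5]}`), "float")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T %v\n", ints, ints)
	fmt.Printf("%T %v\n", floats, floats)
}
```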

Types

type ChatMessage

type ChatMessage struct {
	Role    string              `json:"role"`
	Content []MultiModalContent `json:"content"`
}

type CohereClient

type CohereClient interface {
	// contains filtered or unexported methods
}

type EmbeddingFloatOutput

type EmbeddingFloatOutput struct {
	Usage     embedUsage `json:"usage"`
	Embedding []float64  `json:"embedding"`
}

type EmbeddingInput

type EmbeddingInput struct {
	Text          string `json:"text"`
	ModelName     string `json:"model-name"`
	InputType     string `json:"input-type"`
	EmbeddingType string `json:"embedding-type"`
}

type EmbeddingIntOutput

type EmbeddingIntOutput struct {
	Usage     embedUsage `json:"usage"`
	Embedding []int      `json:"embedding"`
}

type MultiModalContent

type MultiModalContent struct {
	ImageURL URL    `json:"image-url"`
	Text     string `json:"text"`
	Type     string `json:"type"`
}

type RerankInput

type RerankInput struct {
	Query     string   `json:"query"`
	Documents []string `json:"documents"`
	ModelName string   `json:"model-name"`
}

type RerankOutput

type RerankOutput struct {
	Ranking   []string    `json:"ranking"`
	Usage     rerankUsage `json:"usage"`
	Relevance []float64   `json:"relevance"`
}

type TextGenerationInput

type TextGenerationInput struct {
	ChatHistory  []ChatMessage `json:"chat-history"`
	MaxNewTokens int           `json:"max-new-tokens"`
	ModelName    string        `json:"model-name"`
	Prompt       string        `json:"prompt"`
	PromptImages []string      `json:"prompt-images"`
	Seed         int           `json:"seed"`
	SystemMsg    string        `json:"system-message"`
	Temperature  float64       `json:"temperature"`
	TopK         int           `json:"top-k"`
	Documents    []string      `json:"documents"`
}

type TextGenerationOutput

type TextGenerationOutput struct {
	Text      string       `json:"text"`
	Citations []citation   `json:"citations"`
	Usage     commandUsage `json:"usage"`
}
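
As a rough illustration of consuming this output, the sketch below decodes a response and extracts the span each citation points at. The unexported `citation` type is not shown on this page, so the local mirror uses the `start`, `end` and `text` fields from the README's Citations table. Treating `start` and `end` as character offsets into the generated text is an assumption, and the sample JSON and the helpers `parseOutput` and `citedSpans` are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// citation mirrors the fields listed in the README's Citations table.
type citation struct {
	Start int    `json:"start"`
	End   int    `json:"end"`
	Text  string `json:"text"`
}

// textGenerationOutput is a local mirror of TextGenerationOutput without
// the usage field.
type textGenerationOutput struct {
	Text      string     `json:"text"`
	Citations []citation `json:"citations"`
}

// parseOutput (hypothetical helper) decodes a JSON response.
func parseOutput(raw []byte) (textGenerationOutput, error) {
	var out textGenerationOutput
	err := json.Unmarshal(raw, &out)
	return out, err
}

// citedSpans returns the part of the output text covered by each citation,
// assuming start/end are character offsets. Out-of-range citations are
// skipped.
func citedSpans(out textGenerationOutput) []string {
	runes := []rune(out.Text)
	spans := make([]string, 0, len(out.Citations))
	for _, c := range out.Citations {
		if c.Start < 0 || c.End > len(runes) || c.Start > c.End {
			continue
		}
		spans = append(spans, string(runes[c.Start:c.End]))
	}
	return spans
}

func main() {
	raw := []byte(`{"text":"Paris is the capital of France.","citations":[{"start":0,"end":5,"text":"Paris"}]}`)
	out, err := parseOutput(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(citedSpans(out))
}
```

Converting the text to runes before slicing keeps the offsets correct for non-ASCII output, since Go string indexing is byte-based.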

type URL

type URL struct {
	URL string `json:"url"`
}
