googlesearch

package
v0.29.0-beta Latest
Warning

This package is not in the latest version of its module.

Published: Sep 30, 2024 License: MIT Imports: 13 Imported by: 0

README

---
title: "Google Search"
lang: "en-US"
draft: false
description: "Learn how to set up a VDP Google Search component https://github.com/instill-ai/instill-core"
---

The Google Search component is an application component that lets users query the Google Search engine.
It supports the following task:
- [Search](#search)

## Release Stage

`Alpha`

## Configuration

The component definition and tasks are defined in the [definition.json](https://github.com/instill-ai/component/blob/main/application/googlesearch/v0/config/definition.json) and [tasks.json](https://github.com/instill-ai/component/blob/main/application/googlesearch/v0/config/tasks.json) files respectively.

## Setup


To communicate with Google, the following connection details must be
provided. You may specify them directly in a pipeline recipe as key-value pairs
within the component's `setup` block, or you can create a **Connection** from
the [**Integration Settings**](https://www.instill.tech/docs/vdp/integration)
page and reference the whole `setup` as
`setup: ${connection.<my-connection-id>}`.

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| API Key (required) | `api-key` | string | API key for the Google Custom Search API. You can create one <a href="https://developers.google.com/custom-search/v1/overview#api-key">here</a>.  |
| Search Engine ID (required) | `cse-id` | string | ID of the search engine to use. Before using the Custom Search JSON API, you need to create and configure a Programmable Search Engine. If you have not created one yet, start at [the Programmable Search Engine control panel](https://programmablesearchengine.google.com/controlpanel/all). <br /> The ID appears in the URL of your search engine. For example, if the URL is `https://cse.google.com/cse.js?cx=012345678910`, the ID value is `012345678910`.  |

</div>
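The `cse-id` value is the `cx` query parameter of the search engine URL, as described in the table above. A minimal sketch of pulling it out with the standard library follows; `extractEngineID` is a hypothetical helper written for illustration and is not part of this package.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// extractEngineID is a hypothetical helper (not part of this package) that
// returns the value of the `cx` query parameter from a search engine URL.
func extractEngineID(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("parse URL: %w", err)
	}
	id := u.Query().Get("cx")
	if id == "" {
		return "", errors.New("URL has no cx query parameter")
	}
	return id, nil
}

func main() {
	id, err := extractEngineID("https://cse.google.com/cse.js?cx=012345678910")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id) // prints 012345678910
}
```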




## Supported Tasks

### Search

Search the web using the Google Search engine.

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_SEARCH` |
| Query (required) | `query` | string | The search query for Google |
| Top K | `top-k` | integer | The number of results to return for each query |
| Include Link Text | `include-link-text` | boolean | Whether to scrape the linked page and include its plain text in the `link-text` field of each search result |
| Include Link HTML | `include-link-html` | boolean | Whether to scrape the linked page and include its raw HTML in the `link-html` field of each search result |
</div>
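To make the field IDs in the input table concrete, here is a small sketch that serializes a search input as JSON. The `searchInput` struct and `encodeInput` helper are illustrative assumptions that mirror the table; the pointer fields with `omitempty` are one way to leave optional inputs out when they are not set.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchInput mirrors the input table above. It is an illustrative type,
// not the package's own definition.
type searchInput struct {
	Task            string `json:"task"`
	Query           string `json:"query"`
	TopK            *int   `json:"top-k,omitempty"`
	IncludeLinkText *bool  `json:"include-link-text,omitempty"`
	IncludeLinkHTML *bool  `json:"include-link-html,omitempty"`
}

// encodeInput is a hypothetical helper that builds a TASK_SEARCH input and
// leaves include-link-html unset so that it is omitted from the JSON.
func encodeInput(query string, topK int, includeText bool) ([]byte, error) {
	in := searchInput{
		Task:            "TASK_SEARCH",
		Query:           query,
		TopK:            &topK,
		IncludeLinkText: &includeText,
	}
	return json.Marshal(in)
}

func main() {
	b, err := encodeInput("instill ai", 5, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(b))
	// prints {"task":"TASK_SEARCH","query":"instill ai","top-k":5,"include-link-text":true}
}
```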






<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| [Results](#search-results) | `results` | array[object] | The returned search results from Google |
</div>

<details>
<summary> Output Objects in Search</summary>

<h4 id="search-results">Results</h4>

<div class="markdown-col-no-wrap" data-col-1 data-col-2>

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| Link | `link` | string | The full URL to which the search result is pointing, e.g., http://www.example.com/foo/bar. |
| Link HTML | `link-html` | string | The scraped raw HTML of the link associated with this search result |
| Link Text | `link-text` | string | The scraped text of the link associated with this search result, in plain text |
| Snippet | `snippet` | string | The snippet from the page associated with this search result, in plain text |
| Title | `title` | string | The title of a search result, in plain text |
</div>
</details>
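As a rough illustration of the output shape, the sketch below decodes a JSON document with a `results` array into Go values. The `result` and `searchOutput` types are local copies written for this example, and the sample payload is made up; only the field IDs come from the tables above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// result and searchOutput are illustrative local types that follow the
// field IDs documented above.
type result struct {
	Title    string `json:"title"`
	Link     string `json:"link"`
	Snippet  string `json:"snippet"`
	LinkText string `json:"link-text"`
	LinkHTML string `json:"link-html"`
}

type searchOutput struct {
	Results []*result `json:"results"`
}

// decodeResults is a hypothetical helper that parses a task output.
func decodeResults(data []byte) ([]*result, error) {
	var out searchOutput
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, fmt.Errorf("decode output: %w", err)
	}
	return out.Results, nil
}

func main() {
	// Made-up sample payload, for illustration only.
	sample := []byte(`{"results":[{"title":"Example","link":"http://www.example.com/foo/bar","snippet":"An example page"}]}`)
	results, err := decodeResults(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, r := range results {
		fmt.Println(r.Title, r.Link)
	}
}
```

When `include-link-text` and `include-link-html` are not requested, the corresponding fields simply decode to empty strings in this sketch.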

Documentation

Constants

const (
	// MaxResultsPerPage is the default max number of search results per page
	MaxResultsPerPage = 10
	// MaxResults is the maximum number of search results
	MaxResults = 100
)

Variables

This section is empty.

Functions

func Init

func Init(bc base.Component) *component

func Min

func Min(x, y int) int

Min returns the smaller of x or y.
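The two constants and `Min` suggest that a `top-k` request is capped at `MaxResults` and fetched in pages of at most `MaxResultsPerPage`. The page does not show the actual implementation, so the following is only a sketch under that assumption; `pageSizes` and `minInt` are hypothetical names.

```go
package main

import "fmt"

const (
	maxResultsPerPage = 10  // mirrors MaxResultsPerPage
	maxResults        = 100 // mirrors MaxResults
)

// minInt returns the smaller of x or y, like the package's Min.
func minInt(x, y int) int {
	if x < y {
		return x
	}
	return y
}

// pageSizes is a hypothetical helper: it caps topK at maxResults and splits
// it into per-page request sizes of at most maxResultsPerPage each.
func pageSizes(topK int) []int {
	if topK <= 0 {
		return nil
	}
	topK = minInt(topK, maxResults)
	var sizes []int
	for remaining := topK; remaining > 0; remaining -= maxResultsPerPage {
		sizes = append(sizes, minInt(remaining, maxResultsPerPage))
	}
	return sizes
}

func main() {
	fmt.Println(pageSizes(25)) // prints [10 10 5]
}
```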

func NewService

func NewService(apiKey string) (*customsearch.Service, error)

NewService creates a Google Custom Search service.
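`NewService` returns a client for the Custom Search API. For orientation, the sketch below builds the equivalent request URL for the Custom Search JSON API using only the standard library; it is not how this package issues requests. `buildSearchURL` is a hypothetical helper, and `API_KEY` and `ENGINE_ID` are placeholders for the `api-key` and `cse-id` setup values.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// buildSearchURL is a hypothetical helper that assembles a Custom Search
// JSON API request URL. num is the page size (the API allows 1 to 10) and
// start is the 1-based index of the first result to return.
func buildSearchURL(apiKey, cseID, query string, num, start int) string {
	params := url.Values{}
	params.Set("key", apiKey)
	params.Set("cx", cseID)
	params.Set("q", query)
	params.Set("num", strconv.Itoa(num))
	params.Set("start", strconv.Itoa(start))
	// Encode sorts parameters by key and escapes their values.
	return "https://www.googleapis.com/customsearch/v1?" + params.Encode()
}

func main() {
	fmt.Println(buildSearchURL("API_KEY", "ENGINE_ID", "golang generics", 10, 1))
	// prints https://www.googleapis.com/customsearch/v1?cx=ENGINE_ID&key=API_KEY&num=10&q=golang+generics&start=1
}
```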

Types

type Result

type Result struct {
	// Title: The title of the search result, in plain text.
	Title string `json:"title"`

	// Link: The full URL to which the search result is pointing, e.g.
	// http://www.example.com/foo/bar.
	Link string `json:"link"`

	// Snippet: The snippet of the search result, in plain text.
	Snippet string `json:"snippet"`

	// LinkText: The scraped text of the search web page result, in plain text.
	LinkText string `json:"link-text"`

	// LinkHTML: The full raw HTML of the search web page result.
	LinkHTML string `json:"link-html"`
}

type SearchInput

type SearchInput struct {
	// Query: The search query.
	Query string `json:"query"`

	// TopK: The number of search results to return.
	TopK *int `json:"top-k,omitempty"`

	// IncludeLinkText: Whether to include the scraped text of the search web page result.
	IncludeLinkText *bool `json:"include-link-text,omitempty"`

	// IncludeLinkHTML: Whether to include the scraped HTML of the search web page result.
	IncludeLinkHTML *bool `json:"include-link-html,omitempty"`
}

SearchInput defines the input of the search task

type SearchOutput

type SearchOutput struct {
	// Results: The search results.
	Results []*Result `json:"results"`
}

SearchOutput defines the output of the search task
