mongoimport

package
Version: v0.0.0-...-ad3090c
Warning

This package is not in the latest version of its module.

Published: Dec 9, 2024 License: Apache-2.0 Imports: 32 Imported by: 19

Documentation

Overview

Package mongoimport allows importing content from a JSON, CSV, or TSV source into a MongoDB instance.

Constants

const (
	CSV  = "csv"
	TSV  = "tsv"
	JSON = "json"
)

Input format types accepted by mongoimport.
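As an illustration of how these constants might be used, the sketch below checks a user-supplied format string against the three accepted values. The constants are redeclared locally so the example compiles on its own, and normalizeInputType is a hypothetical helper, not a function exported by this package. Lower-casing the input is a choice made for this sketch only.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the documented constants, so the sketch compiles
// without importing the mongoimport package.
const (
	CSV  = "csv"
	TSV  = "tsv"
	JSON = "json"
)

// normalizeInputType is a hypothetical helper (not part of mongoimport).
// It trims and lower-cases the value, then rejects anything other than
// the three documented formats.
func normalizeInputType(t string) (string, error) {
	switch v := strings.ToLower(strings.TrimSpace(t)); v {
	case CSV, TSV, JSON:
		return v, nil
	default:
		return "", fmt.Errorf("unsupported input type %q", t)
	}
}

func main() {
	for _, t := range []string{"CSV", "json", "xml"} {
		v, err := normalizeInputType(t)
		fmt.Println(v, err)
	}
}
```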

Variables

var (
	// ErrNoOpeningBracket means that the input source did not contain any
	// opening bracket - returned only if --jsonArray is passed in.
	ErrNoOpeningBracket = errors.New("bad JSON array format - found no " +
		"opening bracket '[' in input source")

	// ErrNoClosingBracket means that the input source did not contain any
	// closing bracket - returned only if --jsonArray is passed in.
	ErrNoClosingBracket = errors.New("bad JSON array format - found no " +
		"closing bracket ']' in input source")
)
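The two errors above are sentinel values, so callers can compare against them with errors.Is. The sketch below shows one way such a check could look. It redeclares the errors locally and uses a hypothetical whole-buffer helper, checkArrayBrackets. The real reader works on a stream, so this is only an approximation of the idea.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// Local stand-ins for ErrNoOpeningBracket and ErrNoClosingBracket.
var (
	errNoOpeningBracket = errors.New("bad JSON array format - found no " +
		"opening bracket '[' in input source")
	errNoClosingBracket = errors.New("bad JSON array format - found no " +
		"closing bracket ']' in input source")
)

// checkArrayBrackets is a hypothetical helper: it ignores surrounding
// whitespace and verifies that the buffer starts with '[' and ends with ']'.
func checkArrayBrackets(data []byte) error {
	trimmed := bytes.TrimSpace(data)
	if len(trimmed) == 0 || trimmed[0] != '[' {
		return errNoOpeningBracket
	}
	if trimmed[len(trimmed)-1] != ']' {
		return errNoClosingBracket
	}
	return nil
}

func main() {
	err := checkArrayBrackets([]byte(`{"a": 1}`))
	if errors.Is(err, errNoOpeningBracket) {
		fmt.Println("input is not a JSON array:", err)
	}
}
```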
var (
	UTF8_BOM = []byte{0xEF, 0xBB, 0xBF}
)
var Usage = `` /* 280-byte string literal not displayed */
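UTF8_BOM holds the three-byte UTF-8 byte order mark that some editors place at the start of a file. A minimal sketch of how such a prefix can be removed before parsing is shown below. The variable is redeclared locally and stripBOM is a hypothetical helper, not part of this package.

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM mirrors the documented UTF8_BOM value.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// stripBOM returns data without a leading UTF-8 byte order mark.
// Input that does not start with the mark is returned unchanged.
func stripBOM(data []byte) []byte {
	return bytes.TrimPrefix(data, utf8BOM)
}

func main() {
	in := append([]byte{0xEF, 0xBB, 0xBF}, []byte("name,age\n")...)
	fmt.Printf("%q\n", stripBOM(in))
}
```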

Functions

func ColumnNames

func ColumnNames(fs []ColumnSpec) (s []string)

ColumnNames maps a slice of ColumnSpecs to their associated names.
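The mapping can be pictured with the sketch below. It uses a local stand-in type, columnSpec, that mirrors only the Name field of ColumnSpec, and a local columnNames function. Both are illustrative and are not the package's own code.

```go
package main

import "fmt"

// columnSpec is a stand-in that mirrors only the Name field of ColumnSpec.
type columnSpec struct {
	Name string
}

// columnNames collects the Name of every spec, preserving order.
func columnNames(fs []columnSpec) []string {
	names := make([]string, 0, len(fs))
	for _, f := range fs {
		names = append(names, f.Name)
	}
	return names
}

func main() {
	specs := []columnSpec{{Name: "name"}, {Name: "age"}}
	fmt.Println(columnNames(specs))
}
```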

Types

type CSVConverter

type CSVConverter struct {
	// contains filtered or unexported fields
}

CSVConverter implements the Converter interface for CSV input.

func (CSVConverter) Convert

func (c CSVConverter) Convert() (b bson.D, err error)

Convert implements the Converter interface for CSV input. It converts a CSVConverter struct to a BSON document.

func (CSVConverter) Print

func (c CSVConverter) Print() error

type CSVInputReader

type CSVInputReader struct {
	// contains filtered or unexported fields
}

CSVInputReader implements the InputReader interface for CSV input types.

func NewCSVInputReader

func NewCSVInputReader(
	colSpecs []ColumnSpec,
	in io.Reader,
	rejects io.Writer,
	numDecoders int,
	ignoreBlanks bool,
	useArrayIndexFields bool,
) *CSVInputReader

NewCSVInputReader returns a CSVInputReader configured to read data from the given io.Reader, extracting only the specified columns using exactly "numDecoders" goroutines.
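To make the roles of the column list and the ignoreBlanks flag concrete, here is a single-goroutine sketch built on the standard library's encoding/csv rather than this package's reader. The field type, readRows, and parseCSV are hypothetical names. Dropping fields beyond the column list is a simplification made for the sketch, not a statement about how CSVInputReader treats them.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// field is an ordered key/value pair, standing in for one element of a bson.D.
type field struct {
	Key   string
	Value string
}

// readRows pairs each CSV record with the given column names. When
// ignoreBlanks is true, empty values are left out of the document.
func readRows(in io.Reader, columns []string, ignoreBlanks bool) ([][]field, error) {
	r := csv.NewReader(in)
	r.FieldsPerRecord = -1 // do not enforce a fixed number of fields per row
	var docs [][]field
	for {
		record, err := r.Read()
		if err == io.EOF {
			return docs, nil
		}
		if err != nil {
			return nil, err
		}
		var doc []field
		for i, value := range record {
			if i >= len(columns) {
				break // extra fields are dropped in this sketch
			}
			if ignoreBlanks && value == "" {
				continue
			}
			doc = append(doc, field{Key: columns[i], Value: value})
		}
		docs = append(docs, doc)
	}
}

// parseCSV is a convenience wrapper for string input.
func parseCSV(input string, columns []string, ignoreBlanks bool) ([][]field, error) {
	return readRows(strings.NewReader(input), columns, ignoreBlanks)
}

func main() {
	docs, err := parseCSV("ada,36\ngrace,\n", []string{"name", "age"}, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, d := range docs {
		fmt.Println(d)
	}
}
```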

func (*CSVInputReader) ReadAndValidateHeader

func (r *CSVInputReader) ReadAndValidateHeader() (err error)

ReadAndValidateHeader reads the header from the underlying reader and validates the header fields. It sets err if the read/validation fails.

func (*CSVInputReader) ReadAndValidateTypedHeader

func (r *CSVInputReader) ReadAndValidateTypedHeader(parseGrace ParseGrace) (err error)

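ReadAndValidateTypedHeader reads the header from the underlying reader, parses the type information from each header field, and validates the result. Parse errors are handled according to parseGrace. It sets err if the read/validation fails.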

func (*CSVInputReader) StreamDocument

func (r *CSVInputReader) StreamDocument(ordered bool, readDocs chan bson.D) (retErr error)

StreamDocument takes a boolean indicating if the documents should be streamed in read order and a channel on which to stream the documents processed from the underlying reader. Returns a non-nil error if streaming fails.

type ColumnSpec

type ColumnSpec struct {
	Name       string
	Parser     FieldParser
	ParseGrace ParseGrace
	TypeName   string
	NameParts  []string
}

ColumnSpec keeps information for each 'column' of import.

func ParseAutoHeaders

func ParseAutoHeaders(headers []string) (fs []ColumnSpec)

ParseAutoHeaders converts a list of header items to ColumnSpec objects, with automatic parsers.

func ParseTypedHeader

func ParseTypedHeader(header string, parseGrace ParseGrace) (f ColumnSpec, err error)

ParseTypedHeader produces a ColumnSpec from a header item, extracting type information from it. The parseGrace is passed along to the new ColumnSpec.
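The sketch below illustrates the kind of parsing involved. It assumes typed header items have the form <name>.<type>(<arg>), as described in the mongoimport command-line documentation for --columnsHaveTypes. That format is not shown on this page, so treat it as an assumption. The typedHeader type, the regular expression, and parseTypedHeader are hypothetical and omit the ParseGrace handling of the real function.

```go
package main

import (
	"fmt"
	"regexp"
)

// typedHeaderRE matches "<name>.<type>(<arg>)". The name may itself
// contain dots; the argument may be empty. (Assumed format.)
var typedHeaderRE = regexp.MustCompile(`^(.*?)\.([a-zA-Z0-9_]+)\((.*)\)$`)

// typedHeader is a hypothetical result type for this sketch.
type typedHeader struct {
	Name     string
	TypeName string
	Arg      string
}

// parseTypedHeader splits one header item into name, type, and argument.
func parseTypedHeader(header string) (typedHeader, error) {
	m := typedHeaderRE.FindStringSubmatch(header)
	if m == nil {
		return typedHeader{}, fmt.Errorf("could not parse type from header %q", header)
	}
	return typedHeader{Name: m[1], TypeName: m[2], Arg: m[3]}, nil
}

func main() {
	for _, h := range []string{"price.decimal()", "created.date(2006-01-02)", "plain"} {
		th, err := parseTypedHeader(h)
		fmt.Printf("%+v %v\n", th, err)
	}
}
```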

func ParseTypedHeaders

func ParseTypedHeaders(headers []string, parseGrace ParseGrace) (fs []ColumnSpec, err error)

ParseTypedHeaders performs ParseTypedHeader on each item, returning an error if any single one fails.

type Converter

type Converter interface {
	Convert() (document bson.D, err error)
}

Converter is an interface that adds the basic Convert method which returns a valid BSON document that has been converted by the underlying implementation. If conversion fails, err will be set.

type FieldAutoParser

type FieldAutoParser struct{}

func (*FieldAutoParser) Parse

func (ap *FieldAutoParser) Parse(in string) (interface{}, error)

type FieldBinaryParser

type FieldBinaryParser struct {
	// contains filtered or unexported fields
}

func NewFieldBinaryParser

func NewFieldBinaryParser(arg string) (*FieldBinaryParser, error)

func (*FieldBinaryParser) Parse

func (bp *FieldBinaryParser) Parse(in string) (interface{}, error)

type FieldBooleanParser

type FieldBooleanParser struct{}

func (*FieldBooleanParser) Parse

func (bp *FieldBooleanParser) Parse(in string) (interface{}, error)

type FieldDateParser

type FieldDateParser struct {
	// contains filtered or unexported fields
}

func (*FieldDateParser) Parse

func (dp *FieldDateParser) Parse(in string) (interface{}, error)

type FieldDecimalParser

type FieldDecimalParser struct{}

func (*FieldDecimalParser) Parse

func (ip *FieldDecimalParser) Parse(in string) (interface{}, error)

type FieldDoubleParser

type FieldDoubleParser struct{}

func (*FieldDoubleParser) Parse

func (dp *FieldDoubleParser) Parse(in string) (interface{}, error)

type FieldInt32Parser

type FieldInt32Parser struct{}

func (*FieldInt32Parser) Parse

func (ip *FieldInt32Parser) Parse(in string) (interface{}, error)

type FieldInt64Parser

type FieldInt64Parser struct{}

func (*FieldInt64Parser) Parse

func (ip *FieldInt64Parser) Parse(in string) (interface{}, error)

type FieldParser

type FieldParser interface {
	Parse(in string) (interface{}, error)
}

FieldParser is the interface for any parser of a field item.
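A type satisfies this interface by providing a single Parse method. The sketch below mirrors the interface locally and adds two hypothetical implementations built on strconv. They are illustrations only and may differ from the package's own FieldInt32Parser and FieldBooleanParser.

```go
package main

import (
	"fmt"
	"strconv"
)

// fieldParser mirrors the documented FieldParser interface.
type fieldParser interface {
	Parse(in string) (interface{}, error)
}

// int32Parser is a hypothetical parser for 32-bit integers.
type int32Parser struct{}

func (int32Parser) Parse(in string) (interface{}, error) {
	v, err := strconv.ParseInt(in, 10, 32)
	if err != nil {
		return nil, err
	}
	return int32(v), nil
}

// boolParser is a hypothetical parser for boolean values.
type boolParser struct{}

func (boolParser) Parse(in string) (interface{}, error) {
	v, err := strconv.ParseBool(in)
	if err != nil {
		return nil, err
	}
	return v, nil
}

func main() {
	parsers := []fieldParser{int32Parser{}, boolParser{}}
	for _, p := range parsers {
		v, err := p.Parse("1")
		fmt.Printf("%T %v %v\n", v, v, err)
	}
}
```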

func NewFieldParser

func NewFieldParser(t columnType, arg string) (parser FieldParser, err error)

NewFieldParser yields a FieldParser corresponding to the given columnType. arg is passed along to the specific type's parser, if it permits an argument. An error will be raised if arg is not valid for the type's parser.

type FieldStringParser

type FieldStringParser struct{}

func (*FieldStringParser) Parse

func (sp *FieldStringParser) Parse(in string) (interface{}, error)

type IngestOptions

type IngestOptions struct {
	// Drops target collection before importing.
	Drop bool `long:"drop" description:"drop collection before inserting documents"`

	// Ignores fields with empty values in CSV and TSV imports.
	IgnoreBlanks bool `long:"ignoreBlanks" description:"ignore fields with empty values in CSV and TSV"`

	// Indicates that documents will be inserted in the order of their appearance in the input source.
	MaintainInsertionOrder bool `` /* 286-byte string literal not displayed */

	// Sets the number of insertion routines to use
	NumInsertionWorkers int `` /* 149-byte string literal not displayed */

	// Forces mongoimport to halt the import operation at the first insert or upsert error.
	StopOnError bool `` /* 411-byte string literal not displayed */

	// Modify the import process.
	// For existing documents (match --upsertFields) in the database:
	// "insert": Insert only, skip existing documents.
	// "upsert": Insert new documents or replace existing ones.
	// "merge": Insert new documents or modify existing ones; Preserve values in the database that are not overwritten.
	// "delete": Skip new documents or delete existing ones that match --upsertFields.
	// We don't set `default: insert` here since we need to be able to set mode to upsert if --mode isn't set and --upsertFields is set.
	//
	//nolint:staticcheck
	Mode string `` /* 389-byte string literal not displayed */

	Upsert bool `long:"upsert" hidden:"true" description:"(deprecated; same as --mode=upsert) insert or update objects that already exist"`

	// Specifies a list of fields for the query portion of the upsert; defaults to _id field.
	UpsertFields string `` /* 145-byte string literal not displayed */

	// Sets write concern level for write operations.
	// By default mongoimport uses a write concern of 'majority'.
	// Cannot be used simultaneously with write concern options in a URI.
	WriteConcern string `` /* 202-byte string literal not displayed */

	// Indicates that the server should bypass document validation on import.
	BypassDocumentValidation bool `long:"bypassDocumentValidation" description:"bypass document validation"`

	// Specifies the number of threads to use in processing data read from the input source
	NumDecodingWorkers int `long:"numDecodingWorkers" default:"0" hidden:"true"`

	BulkBufferSize int `long:"batchSize" default:"1000" hidden:"true"`
}

IngestOptions defines the set of options for storing data.
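The comment on Mode explains why it has no default: the tool must be able to fall back to upsert when --mode is unset but --upsertFields is given. A minimal sketch of that resolution is shown below. effectiveMode is a hypothetical helper, and it does not cover any validation the tool may perform on conflicting flags.

```go
package main

import "fmt"

// effectiveMode is a hypothetical helper that resolves the import mode.
// An explicit mode wins; otherwise the deprecated upsert flag or a
// non-empty upsertFields value selects "upsert"; otherwise "insert".
func effectiveMode(mode string, upsert bool, upsertFields string) string {
	if mode != "" {
		return mode
	}
	if upsert || upsertFields != "" {
		return "upsert"
	}
	return "insert"
}

func main() {
	fmt.Println(effectiveMode("", false, ""))
	fmt.Println(effectiveMode("", false, "email"))
	fmt.Println(effectiveMode("merge", false, "email"))
}
```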

func (*IngestOptions) Name

func (_ *IngestOptions) Name() string

Name returns a description of the IngestOptions struct.

type InputOptions

type InputOptions struct {
	// Fields directly specifies the comma-separated list of fields to import from CSV or TSV input.
	Fields *string `long:"fields" value-name:"<field>[,<field>]*" short:"f" description:"comma separated list of fields, e.g. -f name,age"`

	// FieldFile is a filename that refers to a list of fields to import, 1 per line.
	FieldFile *string `long:"fieldFile" value-name:"<filename>" description:"file with field names - 1 per line"`

	// Specifies the location and name of a file containing the data to import.
	File string `long:"file" value-name:"<filename>" description:"file to import from; if not specified, stdin is used"`

	// Treats the input source's first line as field list (csv and tsv only).
	HeaderLine bool `long:"headerline" description:"use first line in input source as the field list (CSV and TSV only)"`

	// Indicates that the underlying input source contains a single JSON array with the documents to import.
	JSONArray bool `long:"jsonArray" description:"treat input source as a JSON array"`

	// Indicates how to handle type coercion failures
	ParseGrace string `` /* 155-byte string literal not displayed */

	// Specifies the file type to import. The default format is JSON, but it’s possible to import CSV and TSV files.
	Type string `long:"type" value-name:"<type>" default:"json" default-mask:"-" description:"input format to import: json, csv, or tsv"`

	// Indicates that field names include type descriptions
	ColumnsHaveTypes bool `` /* 573-byte string literal not displayed */

	// Indicates that the legacy extended JSON format should be used to parse JSON documents. Defaults to false.
	Legacy bool `long:"legacy" description:"use the legacy extended JSON format"`

	UseArrayIndexFields bool `` /* 245-byte string literal not displayed */
}

InputOptions defines the set of options for reading input data.

func (*InputOptions) Name

func (_ *InputOptions) Name() string

Name returns a description of the InputOptions struct.

type InputReader

type InputReader interface {
	// StreamDocument takes a boolean indicating if the documents should be streamed
	// in read order and a channel on which to stream the documents processed from
	// the underlying reader.  Returns a non-nil error if encountered.
	StreamDocument(ordered bool, read chan bson.D) error

	// ReadAndValidateHeader reads the header line from the InputReader and returns
	// a non-nil error if the fields from the header line are invalid; returns
	// nil otherwise. No-op for JSON input readers.
	ReadAndValidateHeader() error

	// ReadAndValidateTypedHeader is the same as ReadAndValidateHeader,
	// except it also parses types from the fields of the header. Parse errors
	// will be handled according to parseGrace.
	ReadAndValidateTypedHeader(parseGrace ParseGrace) error
	// contains filtered or unexported methods
}
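StreamDocument hands documents to the caller over a channel. The sketch below shows one way a producer and consumer could be wired together. The doc type, streamDocs, and consume are hypothetical stand-ins. Closing the channel inside the producer is an assumption of this sketch and is not something this page documents about the real readers.

```go
package main

import "fmt"

// doc is a simplified stand-in for bson.D.
type doc map[string]string

// streamDocs is a hypothetical producer: it sends every document on out
// in slice order and closes out when it is done (assumed behavior).
func streamDocs(docs []doc, out chan<- doc) error {
	defer close(out)
	for _, d := range docs {
		out <- d
	}
	return nil
}

// consume runs the producer in a goroutine, counts the documents it
// receives, and then collects the producer's error.
func consume(docs []doc) (int, error) {
	out := make(chan doc)
	errCh := make(chan error, 1)
	go func() { errCh <- streamDocs(docs, out) }()
	n := 0
	for range out {
		n++
	}
	return n, <-errCh
}

func main() {
	n, err := consume([]doc{{"name": "ada"}, {"name": "grace"}})
	fmt.Println(n, err)
}
```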

type JSONConverter

type JSONConverter struct {
	// contains filtered or unexported fields
}

JSONConverter implements the Converter interface for JSON input.

func (JSONConverter) Convert

func (c JSONConverter) Convert() (bson.D, error)

Convert implements the Converter interface for JSON input. It converts a JSONConverter struct to a BSON document.

type JSONInputReader

type JSONInputReader struct {
	// contains filtered or unexported fields
}

JSONInputReader is an implementation of InputReader that reads documents in JSON format.

func NewJSONInputReader

func NewJSONInputReader(
	isArray bool,
	legacyExtJSON bool,
	in io.Reader,
	numDecoders int,
) *JSONInputReader

NewJSONInputReader creates a new JSONInputReader in array mode if specified, configured to read data from the given io.Reader.

func (*JSONInputReader) ReadAndValidateHeader

func (r *JSONInputReader) ReadAndValidateHeader() error

ReadAndValidateHeader is a no-op for JSON imports; always returns nil.

func (*JSONInputReader) ReadAndValidateTypedHeader

func (r *JSONInputReader) ReadAndValidateTypedHeader(parseGrace ParseGrace) error

ReadAndValidateTypedHeader is a no-op for JSON imports; always returns nil.

func (*JSONInputReader) StreamDocument

func (r *JSONInputReader) StreamDocument(ordered bool, readChan chan bson.D) (retErr error)

StreamDocument takes a boolean indicating if the documents should be streamed in read order and a channel on which to stream the documents processed from the underlying reader. Returns a non-nil error if encountered.

type MongoImport

type MongoImport struct {

	// generic mongo tool options
	ToolOptions *options.ToolOptions

	// InputOptions defines options used to read data to be ingested
	InputOptions *InputOptions

	// IngestOptions defines options used to ingest data into MongoDB
	IngestOptions *IngestOptions

	// SessionProvider is used for connecting to the database
	SessionProvider *db.SessionProvider

	// the tomb is used to synchronize ingestion goroutines and causes
	// other sibling goroutines to terminate immediately if one errors out
	tomb.Tomb
	// contains filtered or unexported fields
}

MongoImport is a container for the user-specified options and internal state used for running mongoimport.

func New

func New(opts Options) (*MongoImport, error)

New constructs a new MongoImport instance from the provided options. This will fail if the options are invalid or if it cannot establish a new connection to the server.

func (*MongoImport) Close

func (imp *MongoImport) Close()

Close disconnects from the server.

func (*MongoImport) ImportDocuments

func (imp *MongoImport) ImportDocuments() (uint64, uint64, error)

ImportDocuments is used to write input data to the database. It returns the number of documents successfully imported to the appropriate namespace, the number of failures, and any error encountered in doing this.

type Options

type Options struct {
	*options.ToolOptions
	*InputOptions
	*IngestOptions
	ParsedArgs []string
}

Options contains all the possible options that can be used to configure mongoimport.

func ParseOptions

func ParseOptions(rawArgs []string, versionStr, gitCommit string) (Options, error)

ParseOptions reads command line arguments and converts them into options used to configure mongoimport.

type ParseGrace

type ParseGrace int

func ParsePG

func ParsePG(pg string) (res ParseGrace)

ParsePG interprets the user-provided parseGrace, assuming it is valid.

func ValidatePG

func ValidatePG(pg string) (ParseGrace, error)

ValidatePG ensures the user-provided parseGrace is one of the allowed values.
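As a rough picture of this validation, the sketch below maps a string onto a small enumeration. The four names (autoCast, skipField, skipRow, stop) are taken from the mongoimport command-line documentation for --parseGrace and are not listed on this page, so they are an assumption here. The parseGrace type and validateParseGrace are local, hypothetical stand-ins.

```go
package main

import "fmt"

// parseGrace is a local stand-in for the ParseGrace type.
type parseGrace int

// Assumed set of allowed values (names from the CLI documentation).
const (
	autoCast parseGrace = iota
	skipField
	skipRow
	stop
)

// validateParseGrace is a hypothetical helper that accepts only the
// four assumed names and reports an error for anything else.
func validateParseGrace(pg string) (parseGrace, error) {
	switch pg {
	case "autoCast":
		return autoCast, nil
	case "skipField":
		return skipField, nil
	case "skipRow":
		return skipRow, nil
	case "stop":
		return stop, nil
	default:
		return stop, fmt.Errorf("invalid parse grace: %s", pg)
	}
}

func main() {
	for _, s := range []string{"skipRow", "ignore"} {
		g, err := validateParseGrace(s)
		fmt.Println(g, err)
	}
}
```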

type TSVConverter

type TSVConverter struct {
	// contains filtered or unexported fields
}

TSVConverter implements the Converter interface for TSV input.

func (TSVConverter) Convert

func (c TSVConverter) Convert() (b bson.D, err error)

Convert implements the Converter interface for TSV input. It converts a TSVConverter struct to a BSON document.

func (TSVConverter) Print

func (c TSVConverter) Print() error

type TSVInputReader

type TSVInputReader struct {
	// contains filtered or unexported fields
}

TSVInputReader is a struct that implements the InputReader interface for a TSV input source.

func NewTSVInputReader

func NewTSVInputReader(
	colSpecs []ColumnSpec,
	in io.Reader,
	rejects io.Writer,
	numDecoders int,
	ignoreBlanks bool,
	useArrayIndexFields bool,
) *TSVInputReader

NewTSVInputReader returns a TSVInputReader configured to read input from the given io.Reader, extracting the specified columns only.

func (*TSVInputReader) ReadAndValidateHeader

func (r *TSVInputReader) ReadAndValidateHeader() (err error)

ReadAndValidateHeader reads the header from the underlying reader and validates the header fields. It sets err if the read/validation fails.

func (*TSVInputReader) ReadAndValidateTypedHeader

func (r *TSVInputReader) ReadAndValidateTypedHeader(parseGrace ParseGrace) (err error)

ReadAndValidateTypedHeader reads the header from the underlying reader, parses the type information from each header field, and validates the result. Parse errors are handled according to parseGrace. It sets err if the read/validation fails.

func (*TSVInputReader) StreamDocument

func (r *TSVInputReader) StreamDocument(ordered bool, readDocs chan bson.D) (retErr error)

StreamDocument takes a boolean indicating if the documents should be streamed in read order and a channel on which to stream the documents processed from the underlying reader. Returns a non-nil error if streaming fails.

Directories

Path Synopsis
Package csv reads and writes comma-separated values (CSV) files.
Main package for the mongoimport tool.
