csv

package
v0.2.9
Published: Jul 24, 2024 License: Apache-2.0 Imports: 11 Imported by: 0

README

Go CSV Encoder

A Go package for encoding and decoding CSV-structured text files to and from arbitrary Go types.

Features

  • RFC 4180 compliant CSV reader and writer
  • MarshalCSV/UnmarshalCSV interfaces
  • mapping to strings, integers, floats and boolean values
  • bulk or stream processing
  • custom separator and comment characters
  • optional whitespace trimming for headers and string values
  • optional `any` flag for capturing unknown CSV fields

Examples

Reading a well defined CSV file

This example assumes your CSV file contains a header whose values match the struct tags defined on the Go type FrameInfo. CSV fields that are not defined in the type are ignored.

import (
	"io"
	"os"

	"blockwatch.cc/knoxdb/encoding/csv"
)

type FrameInfo struct {
	ActiveImageHeight  int      `csv:"Active Image Height"`
	ActiveImageLeft    int      `csv:"Active Image Left"`
	ActiveImageTop     int      `csv:"Active Image Top"`
	ActiveImageWidth   int      `csv:"Active Image Width"`
	CameraClipName     string   `csv:"Camera Clip Name"`
	CameraRoll         float32  `csv:"Camera Roll"`
	CameraTilt         float32  `csv:"Camera Tilt"`
	MasterTC           string   `csv:"Master TC"`
	MasterTCFrameCount int      `csv:"Master TC Frame Count"`
	SensorFPS          float32  `csv:"Sensor FPS"`
}

type FrameSequence []*FrameInfo

func ReadFile(path string) (FrameSequence, error) {
	// os.ReadFile replaces the deprecated ioutil.ReadFile
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	seq := make(FrameSequence, 0)
	if err := csv.Unmarshal(b, &seq); err != nil {
		return nil, err
	}
	return seq, nil
}

Fail when encountering unknown CSV fields

func ReadFileUnknown(path string) (FrameSequence, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	dec := csv.NewDecoder(f).SkipUnknown(false)
	c := make(FrameSequence, 0)
	if err := dec.Decode(&c); err != nil {
		return nil, err
	}
	return c, nil
}

Parsing an unknown CSV file into a slice of maps

type GenericRecord struct {
	// the 'any' flag captures all unmapped CSV fields of a record
	Record map[string]string `csv:",any"`
}

type GenericCSV []GenericRecord

func ReadFileIntoMap(path string) (GenericCSV, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	dec := csv.NewDecoder(f)
	c := make(GenericCSV, 0)
	if err := dec.Decode(&c); err != nil {
		return nil, err
	}
	return c, nil
}

Stream-process CSV input

func ReadStream(r io.Reader) error {
	dec := csv.NewDecoder(r)

	// read and decode the file header
	line, err := dec.ReadLine()
	if err != nil {
		return err
	}
	// DecodeHeader also returns the parsed header fields, which are not needed here
	if _, err = dec.DecodeHeader(line); err != nil {
		return err
	}

	// loop until EOF (i.e. dec.ReadLine returns an empty line and nil error);
	// any other error during read will result in a non-nil error
	for {
		// read the next line from stream
		line, err = dec.ReadLine()

		// check for read errors other than EOF
		if err != nil {
			return err
		}

		// check for EOF condition
		if line == "" {
			break
		}

		// decode the record
		v := &FrameInfo{}
		if err = dec.DecodeRecord(v, line); err != nil {
			return err
		}

		// process the record here
		Process(v)
	}
	return nil
}

License

Originally (c) Alexander Eichhorn, available under the Apache License, Version 2.0.

Documentation

Overview

Package csv decodes and encodes comma-separated values (CSV) files to and from arbitrary Go types. Because there are many different kinds of CSV files, this package implements the format described in RFC 4180.

A CSV file may contain an optional header and zero or more records of one or more fields per record. The number of fields must be the same for each record and the optional header. The field separator is configurable and defaults to comma ',' (0x2C). Empty lines and lines starting with a comment character are ignored. The comment character is configurable as well and defaults to the number sign '#' (0x23). Records are separated by the newline character '\n' (0x0A) and the final record may or may not be followed by a newline. Carriage returns '\r' (0x0D) before newline characters are silently removed.
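
The line-level rules above can be illustrated with a small standard-library sketch. It is not the package's implementation: the helper name recordLines is hypothetical, and the sketch ignores quoted fields that span several lines. It only shows which lines would remain as header or record candidates once empty lines, comment lines and carriage returns are handled.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// recordLines (hypothetical helper) returns the lines of input that would
// carry a header or a record: empty lines and lines starting with the
// comment rune are dropped. bufio.Scanner's default line splitting already
// strips a carriage return that precedes a newline.
func recordLines(input string, comment rune) []string {
	var lines []string
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		line := sc.Text()
		if line == "" || strings.HasPrefix(line, string(comment)) {
			continue
		}
		lines = append(lines, line)
	}
	return lines
}

func main() {
	// mixed line endings, a comment, an empty line and no final newline
	input := "# exported by camera\r\nname,fps\r\n\r\nA001,24\nA002,25"
	for _, l := range recordLines(input, '#') {
		fmt.Println(l)
	}
}
```

With the sample input, only the header line and the two record lines are printed.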

White space is considered part of a field. Leading or trailing whitespace can optionally be trimmed when parsing a value. Fields may optionally be quoted, in which case the surrounding double quotes '"' (0x22) are removed before processing. Inside a quoted field, a double quote may be escaped by a preceding second double quote, which is removed during parsing.
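
As a rough illustration of the quoting rules, the following sketch processes a single raw field. The function parseField is hypothetical and is not part of this package; in particular, trimming before unquoting is an assumption made for the example, and the package may order these steps differently.

```go
package main

import (
	"fmt"
	"strings"
)

// parseField (hypothetical helper) applies the rules described above to one
// raw field: optional whitespace trimming, removal of surrounding double
// quotes, and collapsing of escaped (doubled) double quotes.
func parseField(raw string, trim bool) string {
	if trim {
		// assumption: trimming happens before quotes are inspected
		raw = strings.TrimSpace(raw)
	}
	if len(raw) >= 2 && strings.HasPrefix(raw, `"`) && strings.HasSuffix(raw, `"`) {
		raw = raw[1 : len(raw)-1]
		raw = strings.ReplaceAll(raw, `""`, `"`)
	}
	return raw
}

func main() {
	fmt.Printf("%q\n", parseField(`"say ""hi"""`, false)) // "say \"hi\""
	fmt.Printf("%q\n", parseField("  24 ", false))        // "  24 "
	fmt.Printf("%q\n", parseField("  24 ", true))         // "24"
}
```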

Index

Constants

const (
	Separator = ','
	Comment   = '#'
	Wrapper   = "\""
)

Variables

This section is empty.

Functions

func Marshal

func Marshal(v interface{}) ([]byte, error)

Marshal returns the CSV encoding of slice v.

When the slice's element type implements the Marshaler interface, MarshalCSV is called for each element and the resulting string slice is written in the order returned by MarshalCSV to the output stream. Otherwise, CSV records are ordered like type attributes in the element's type definition.

CSV header field names are taken from the csv struct tag of each attribute or, when the tag is missing, from the attribute name as specified in the Go type.

// CSV field "name" will be assigned to struct field "Field".
Field int64 `csv:"name"`

// Field is ignored by this package.
Field int `csv:"-"`

Marshal only supports strings, integers, floats, booleans, []byte slices and [N]byte arrays as well as pointers to these types. Slices of other types, maps, interfaces and channels are not supported and result in an error when passed to Marshal.
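
The naming rules described for Marshal can be sketched with the reflect package. This is only an illustration of how a header could be derived from struct tags, not the package's actual code: the function headerNames and the type frame are hypothetical, and tag options after a comma (such as any) are simply ignored here.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// frame is a hypothetical record type used only for this sketch.
type frame struct {
	Height int     `csv:"Active Image Height"`
	FPS    float32 `csv:"Sensor FPS"`
	Clip   string  // no tag: the field name is used
	Notes  string  `csv:"-"` // ignored
}

// headerNames (hypothetical helper) lists CSV header names in struct field
// order: the csv tag name when present, the Go field name otherwise, and
// nothing for fields tagged "-" or unexported fields.
func headerNames(v any) []string {
	t := reflect.TypeOf(v)
	if t == nil {
		return nil
	}
	for t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct {
		return nil
	}
	var names []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		// keep only the name part of the tag, dropping options after a comma
		name, _, _ := strings.Cut(f.Tag.Get("csv"), ",")
		if name == "-" {
			continue
		}
		if name == "" {
			name = f.Name
		}
		names = append(names, name)
	}
	return names
}

func main() {
	fmt.Println(strings.Join(headerNames(frame{}), ","))
	// prints: Active Image Height,Sensor FPS,Clip
}
```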

func Unmarshal

func Unmarshal(data []byte, v interface{}) error

Unmarshal parses CSV encoded data and stores the result in the slice v.

Unmarshal allocates new slice elements for each CSV record encountered in the input. The first non-empty and non-commented line of input is expected to contain a CSV header that will be used to map the order of values in each CSV record to fields in the Go type.

When the slice element type implements the Unmarshaler interface, UnmarshalCSV is called for each record. Otherwise, CSV record fields are assigned to the struct fields with a corresponding name in their csv struct tag.

// CSV field "name" will be assigned to struct field "Field".
Field int64 `csv:"name"`

// Field is used to store all unmapped CSV fields.
Field map[string]string `csv:",any"`

A special flag 'any' can be used on a map or on any other field type implementing the TextUnmarshaler interface to capture all unmapped CSV fields of a record.

Types

type DecodeError

type DecodeError struct {
	// contains filtered or unexported fields
}

func (*DecodeError) Error

func (e *DecodeError) Error() string

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

A Decoder reads and decodes records and fields from a CSV stream.

Using a Decoder is only required when the default behaviour of Unmarshal is undesired. This is the case when no headers are present in the CSV file, when special parsing is required or for stream processing when files are too large to fit into memory.

When headers are present in a file, a Decoder will interpret the number and order of values in each record from the header and map record fields to Go struct fields according to their struct tags.

If a header is missing, a Decoder will use the type definition of the first value passed to DecodeRecord() or the type of slice elements passed to Decode(), assuming records in the CSV file have the same order as the attributes defined for the Go type.

func NewDecoder

func NewDecoder(r io.Reader) *Decoder

NewDecoder returns a new decoder that reads from r.

func (*Decoder) Buffer

func (d *Decoder) Buffer(buf []byte) *Decoder

Buffer sets a buffer buf to be used by the underlying bufio.Scanner for reading from io.Reader r.

func (*Decoder) Comment

func (d *Decoder) Comment(c rune) *Decoder

Comment sets rune c as comment line identifier. Comments must start with rune c as first character to be skipped.

func (*Decoder) Decode

func (d *Decoder) Decode(v interface{}) error

Decode reads CSV records from the input and stores their decoded values in the slice pointed to by v.

See the documentation for Unmarshal for details about the conversion of CSV records into a Go value.

func (*Decoder) DecodeHeader

func (d *Decoder) DecodeHeader(line string) ([]string, error)

DecodeHeader reads CSV header fields from line and stores them as internal Decoder state required to map CSV records later on.

func (*Decoder) DecodeRecord

func (d *Decoder) DecodeRecord(v interface{}, line string) error

DecodeRecord extracts CSV record fields from line and stores them into Go value v.

func (*Decoder) Header

func (d *Decoder) Header(h bool) *Decoder

Header controls if the decoder expects the input stream to contain header fields.

func (*Decoder) ReadLine

func (d *Decoder) ReadLine() (string, error)

ReadLine returns the next non-empty and non-commented line of input. It is intended for use in combination with DecodeHeader() and DecodeRecord() in loops for stream-processing of CSV input. ReadLine returns an error when the underlying io.Reader fails. On EOF, ReadLine returns an empty string and a nil error.

The canonical way of using ReadLine is (header error handling omitted)

dec := csv.NewDecoder(r)
line, _ := dec.ReadLine()
_, _ = dec.DecodeHeader(line)
for {
    line, err := dec.ReadLine()
    if err != nil {
        return err
    }
    if line == "" {
        break
    }
    // process the next record here
}

func (*Decoder) Separator

func (d *Decoder) Separator(r rune) *Decoder

Separator sets rune r as record field separator that will be used for parsing.

func (*Decoder) SkipUnknown

func (d *Decoder) SkipUnknown(t bool) *Decoder

SkipUnknown controls if the Decoder will return an error when encountering a CSV header field that cannot be mapped to a struct tag. When true, such fields will be silently ignored in all CSV records.

func (*Decoder) Trim

func (d *Decoder) Trim(t bool) *Decoder

Trim controls if the Decoder will trim whitespace surrounding header fields and records before processing them.

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

Encoder writes CSV header and CSV records to an output stream. The encoder may be configured to omit the header, to use a user-defined separator and to trim string values before writing them as CSV fields.

func NewEncoder

func NewEncoder(w io.Writer) *Encoder

NewEncoder returns a new encoder that writes to w.

func (*Encoder) Encode

func (e *Encoder) Encode(v interface{}) error

Encode writes the CSV encoding of slice v to the stream.

See the documentation for Marshal for details about the conversion of Go values to CSV.

func (*Encoder) EncodeHeader

func (e *Encoder) EncodeHeader(fields []string, v interface{}) error

EncodeHeader prepares and optionally writes a CSV header. When fields is not empty, it determines which header fields and subsequently which attributes from a Go type will be written as CSV record fields.

When fields is nil or empty, the value of v will be used to determine the type of records and their field names. v in this case is an element of the slice you would pass to Marshal, not a slice itself.

func (*Encoder) EncodeRecord

func (e *Encoder) EncodeRecord(v interface{}) error

EncodeRecord writes the CSV encoding of v to the output stream.

func (*Encoder) Header

func (e *Encoder) Header(h bool) *Encoder

Header controls if the encoder will write a CSV header to the first line of the output stream.

func (*Encoder) HeaderWritten

func (e *Encoder) HeaderWritten() bool

HeaderWritten returns true if the CSV header has already been written to the output.

func (*Encoder) Separator

func (e *Encoder) Separator(r rune) *Encoder

Separator sets the rune r that will be used to separate header fields and CSV record fields.

func (*Encoder) Trim

func (e *Encoder) Trim(t bool) *Encoder

Trim controls if the Encoder will trim whitespace surrounding string values before writing them to the output stream.

func (*Encoder) Write

func (e *Encoder) Write(p []byte) (n int, err error)

Write allows the encoder to be used as an io.Writer.

type Marshaler

type Marshaler interface {
	MarshalCSV() ([]string, error)
}

Marshaler is the interface implemented by types that can marshal themselves as CSV records. The return value is expected to be a slice of strings that has the same length for all records and the header.
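
A minimal sketch of a type satisfying this interface is shown below. To keep the example self-contained, the interface is redeclared locally with the same method set as documented above; the point type and its field layout are hypothetical. The example calls MarshalCSV directly and does not involve the Encoder.

```go
package main

import (
	"fmt"
	"strconv"
)

// Marshaler is redeclared here only so the sketch compiles on its own; it
// mirrors the method set documented for this package.
type Marshaler interface {
	MarshalCSV() ([]string, error)
}

// point is a hypothetical record type.
type point struct {
	Label string
	X, Y  float64
}

// MarshalCSV returns the record fields in a fixed order and with a fixed
// length, so every record lines up with a three-column header.
func (p point) MarshalCSV() ([]string, error) {
	return []string{
		p.Label,
		strconv.FormatFloat(p.X, 'f', -1, 64),
		strconv.FormatFloat(p.Y, 'f', -1, 64),
	}, nil
}

// compile-time check that point implements the interface
var _ Marshaler = point{}

func main() {
	rec, err := point{Label: "origin", X: 0, Y: 1.5}.MarshalCSV()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rec) // [origin 0 1.5]
}
```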

type Unmarshaler

type Unmarshaler interface {
	UnmarshalCSV(header, values []string) error
}

Unmarshaler is the interface implemented by types that can unmarshal a CSV record from a slice of strings. The input is the scanned header array followed by all fields for a record. Both slices are guaranteed to be of equal length.
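
The following sketch shows one way a type could implement UnmarshalCSV with the documented signature. The reading type, its column names and the Extra map for unmapped columns are assumptions made for illustration. The length check is defensive only, since the documentation above states that both slices have equal length.

```go
package main

import (
	"fmt"
	"strconv"
)

// reading is a hypothetical record type.
type reading struct {
	Sensor string
	Value  float64
	Extra  map[string]string // columns this type does not know about
}

// UnmarshalCSV maps values to fields by looking at the header name found at
// the same index; unknown columns are collected in Extra.
func (r *reading) UnmarshalCSV(header, values []string) error {
	if len(header) != len(values) {
		return fmt.Errorf("header has %d fields, record has %d", len(header), len(values))
	}
	for i, name := range header {
		switch name {
		case "sensor":
			r.Sensor = values[i]
		case "value":
			f, err := strconv.ParseFloat(values[i], 64)
			if err != nil {
				return fmt.Errorf("field %q: %w", name, err)
			}
			r.Value = f
		default:
			if r.Extra == nil {
				r.Extra = make(map[string]string)
			}
			r.Extra[name] = values[i]
		}
	}
	return nil
}

func main() {
	var r reading
	err := r.UnmarshalCSV(
		[]string{"sensor", "value", "unit"},
		[]string{"t1", "21.5", "C"},
	)
	fmt.Println(r.Sensor, r.Value, r.Extra["unit"], err)
}
```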
