table

package

v0.1.0 Published: Feb 15, 2025 License: BSD-3-Clause Imports: 23 Imported by: 14

README ¶

table

table provides a DataTable / DataFrame structure similar to pandas and xarray in Python, and Apache Arrow Table, using tensor n-dimensional columns aligned by common outermost row dimension.

See examples/dataproc for a demo of how to use this system for data analysis, paralleling the example in Python Data Science using pandas, to see directly how that translates into this framework.

Whereas an individual Tensor can only hold one data type, the Table allows coordinated storage and processing of heterogeneous data types, aligned by the outermost row dimension. The main tensor data processing functions are defined on the individual tensors (which are the universal computational element in the tensor system), but the coordinated row-wise indexing in the table is important for sorting or filtering a collection of data in the same way, and grouping data by a common set of "splits" for data analysis. Plotting is also driven by the table, with one column providing a shared X axis for the rest of the columns.

The Table mainly provides "infrastructure" methods for adding tensor columns and CSV (comma separated values, and related tab separated values, TSV) file reading and writing. Any function that can be performed on an individual column should be done using the tensor.Rows and Tensor methods directly.

As a general convention, it is safest, clearest, and quite fast to access columns by name instead of index (there is a map from name to index), so the base access method names generally take a column name argument, and those that take a column index have an Index suffix.
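The name-to-index map mentioned above can be pictured with a small standalone sketch. The `columns` type and its `add` and `byName` methods are hypothetical names for illustration only; they are not the package's `Columns` type, which is built on `keylist.List`.

```go
package main

import "fmt"

// columns is a hypothetical ordered list of named float64 columns,
// with a map from column name to position for fast lookup by name.
type columns struct {
	names  []string
	index  map[string]int
	values [][]float64
}

// add appends a column, rejecting duplicate names.
func (c *columns) add(name string, vals []float64) error {
	if _, ok := c.index[name]; ok {
		return fmt.Errorf("column %q already exists", name)
	}
	if c.index == nil {
		c.index = make(map[string]int)
	}
	c.index[name] = len(c.names)
	c.names = append(c.names, name)
	c.values = append(c.values, vals)
	return nil
}

// byName returns the column data for name, or nil if it is not present.
func (c *columns) byName(name string) []float64 {
	i, ok := c.index[name]
	if !ok {
		return nil
	}
	return c.values[i]
}

func main() {
	c := &columns{}
	c.add("Values", []float64{1, 2, 3})
	c.add("Errors", []float64{0.1, 0.2, 0.3})
	fmt.Println(c.byName("Errors")[1], c.index["Values"])
}
```

Lookup by name stays valid when columns are inserted or reordered, as long as the map is kept in sync, which is one reason name-based access is the safer default.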

The table itself stores raw data tensor.Tensor values, and the Column (by name) and ColumnByIndex methods return a tensor.Rows with the Indexes pointing to the shared table-wide Indexes (which can be nil if standard sequential order is being used).

If you call Sort, Filter or other routines on an individual column tensor, then you can grab the updated indexes via the IndexesFromTensor method so that they apply to the entire table. The SortColumn and FilterString methods do this for you.

There are also multi-column Sort and Filter methods on the Table itself.

It is very low-cost to create a new View of an existing Table, via NewView, as they can share the underlying Columns data.
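To make the shared-indexes idea concrete, here is a minimal standalone sketch using only the standard library. The `view` type, `newView`, `sortByValues`, and `name` are hypothetical stand-ins, not the package's API: the point is that sorting one view only reorders that view's indexes, while the column data stays shared and untouched.

```go
package main

import (
	"fmt"
	"sort"
)

// view is a hypothetical stand-in for a table: two columns whose data can be
// shared between views, plus per-view row indexes (nil means sequential order).
type view struct {
	names   []string
	values  []float64
	indexes []int
}

// newView shares the column slices with src and copies its indexes.
func newView(src *view) *view {
	v := &view{names: src.names, values: src.values}
	if src.indexes != nil {
		v.indexes = append([]int(nil), src.indexes...)
	}
	return v
}

// sortByValues reorders only the indexes, according to the values column.
func (v *view) sortByValues() {
	if v.indexes == nil {
		v.indexes = make([]int, len(v.values))
		for i := range v.indexes {
			v.indexes[i] = i
		}
	}
	sort.SliceStable(v.indexes, func(a, b int) bool {
		return v.values[v.indexes[a]] < v.values[v.indexes[b]]
	})
}

// name returns the names entry at view row i, projected through the indexes.
func (v *view) name(i int) string {
	if v.indexes == nil {
		return v.names[i]
	}
	return v.names[v.indexes[i]]
}

func main() {
	base := &view{names: []string{"c", "a", "b"}, values: []float64{3, 1, 2}}
	sorted := newView(base)
	sorted.sortByValues()
	fmt.Println(sorted.name(0), base.name(0)) // a c
}
```

Because every column is read through the same index list, sorting by one column reorders all of them consistently, and creating a view costs only a copy of the index slice.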

Cheat Sheet

dt is the Table pointer variable for examples below:

Table Access

Column data access:

// FloatRow is a method on the `tensor.Rows` returned from the `Column` method.
// This is the best method to use in general for generic 1D data access,
// as it works on any data from 1D on up (although it only samples the first value
// from higher dimensional data) .
val := dt.Column("Values").FloatRow(3)

dt.Column("Name").SetStringRow("Alia", 4)

To access higher-dimensional "cell" level data using a simple 1D index into the cell patterns:

// For columns with higher-dimensional cells, the argument after the row
// is a simple 1D index into the values of that row's cell,
// so this reads cell element 2 of row 3.
val := dt.Column("Values").FloatRow(3, 2)

dt.Column("Name").SetStringRow("Alia", 4, 1)

todo: more

Sorting and Filtering

Splits ("pivot tables" etc), Aggregation

Create a table of mean values of the "Data" column grouped by unique entries in the "Name" column; the resulting table will be called "DataMean":

byNm := split.GroupBy(ix, []string{"Name"}) // column name(s) to group by
split.Agg(byNm, "Data", agg.AggMean) // mean of "Data" within each group
gps := byNm.AggsToTable(etable.AddAggName) // etable.AddAggName or etable.ColNameOnly for naming cols

Describe (basic stats) all columns in a table:

ix := etable.NewRows(et) // new view with all rows
desc := agg.DescAll(ix) // summary stats of all columns
// get value at given column name (from original table), row "Mean"
mean := desc.Float("ColNm", desc.RowsByString("Agg", "Mean", etable.Equals, etable.UseCase)[0])

CSV / TSV file format

Tables can be saved to and loaded from CSV (comma separated values) or TSV (tab separated values) files. See the next section for special formatting of header strings in these files to record the type and tensor cell shapes.

Type and Tensor Headers

To capture the type and shape of the columns, we support the following header formatting. We weren't able to find any other widely supported standard (please let us know if there is one that we've missed!)

Here is the mapping of special header prefix characters to standard types:

'$': etensor.STRING,
'%': etensor.FLOAT32,
'#': etensor.FLOAT64,
'|': etensor.INT64,
'@': etensor.UINT8,
'^': etensor.BOOL,

Columns that have tensor cell shapes (not just scalars) are marked as such with the first such column having a <ndim:dim,dim..> suffix indicating the shape of the cells in this column, e.g., <2:5,4> indicates a 2D cell Y=5,X=4. Each individual column is then indexed as [ndims:x,y..] e.g., the first would be [2:0,0], then [2:0,1] etc.

Example

Here's a TSV file for a scalar String column (Name), a 2D 1x4 tensor float32 column (Input), and a 2D 1x2 float32 Output column.

_H:	$Name	%Input[2:0,0]<2:1,4>	%Input[2:0,1]	%Input[2:0,2]	%Input[2:0,3]	%Output[2:0,0]<2:1,2>	%Output[2:0,1]
_D:	Event_0	1	0	0	0	1	0
_D:	Event_1	0	1	0	0	1	0
_D:	Event_2	0	0	1	0	0	1
_D:	Event_3	0	0	0	1	0	1
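The header row in this example can be reproduced with a short standalone sketch. The helper names `cellHeaders` and `joinInts` are hypothetical, and the row-major ordering of cell indexes (last index varying fastest) is an assumption read off the example above, not taken from the package source.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// joinInts formats ints as a comma-separated list, e.g. "1,4".
func joinInts(vals []int) string {
	parts := make([]string, len(vals))
	for i, v := range vals {
		parts[i] = strconv.Itoa(v)
	}
	return strings.Join(parts, ",")
}

// cellHeaders builds one header per cell element for a column with the given
// type prefix character, name, and cell shape. The first header also carries
// the <ndim:dim,dim..> shape suffix. A scalar column (empty shape) gets a
// single header with no index or shape information.
func cellHeaders(prefix byte, name string, shape []int) []string {
	if len(shape) == 0 {
		return []string{string(prefix) + name}
	}
	n := 1
	for _, d := range shape {
		n *= d
	}
	hdrs := make([]string, 0, n)
	idx := make([]int, len(shape))
	for i := 0; i < n; i++ {
		h := fmt.Sprintf("%c%s[%d:%s]", prefix, name, len(shape), joinInts(idx))
		if i == 0 {
			h += fmt.Sprintf("<%d:%s>", len(shape), joinInts(shape))
		}
		hdrs = append(hdrs, h)
		// Advance the cell index, last dimension fastest (assumed order).
		for d := len(shape) - 1; d >= 0; d-- {
			idx[d]++
			if idx[d] < shape[d] {
				break
			}
			idx[d] = 0
		}
	}
	return hdrs
}

func main() {
	hdrs := []string{"_H:"}
	hdrs = append(hdrs, cellHeaders('$', "Name", nil)...)
	hdrs = append(hdrs, cellHeaders('%', "Input", []int{1, 4})...)
	hdrs = append(hdrs, cellHeaders('%', "Output", []int{1, 2})...)
	fmt.Println(strings.Join(hdrs, "\t"))
}
```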

Logging one row at a time

Documentation ¶

Index ¶

Constants
Variables
func AddColumn[T tensor.DataTypes](dt *Table, name string, cellSizes ...int) tensor.Tensor
func CleanCatTSV(filename string, sorts ...string) error
func ConfigFromDataValues(dt *Table, hdrs []string, rec [][]string) error
func ConfigFromHeaders(dt *Table, hdrs []string, rec [][]string) error
func ConfigFromTableHeaders(dt *Table, hdrs []string) error
func DetectTableHeaders(hdrs []string) bool
func InferDataType(str string) reflect.Kind
func InsertColumn[T tensor.DataTypes](dt *Table, name string, idx int, cellSizes ...int) tensor.Tensor
func ShapeFromString(dims string) []int
func TableColumnType(nm string) (reflect.Kind, string)
func TableHeaderChar(typ reflect.Kind) byte
func UpdateSliceTable(st any, dt *Table)
type Columns
- func NewColumns() *Columns
- func (cl *Columns) AddColumn(name string, tsr tensor.Values) error
- func (cl *Columns) AppendRows(cl2 *Columns)
- func (cl *Columns) Clone() *Columns
- func (cl *Columns) InsertColumn(idx int, name string, tsr tensor.Values) error
- func (cl *Columns) SetNumRows(rows int) *Columns
type FilterFunc
type Table
- func New(name ...string) *Table
- func NewSliceTable(st any) (*Table, error)
- func NewView(src *Table) *Table
- func (dt *Table) AddColumn(name string, tsr tensor.Values) error
- func (dt *Table) AddColumnOfType(name string, typ reflect.Kind, cellSizes ...int) tensor.Tensor
- func (dt *Table) AddFloat32Column(name string, cellSizes ...int) *tensor.Float32
- func (dt *Table) AddFloat64Column(name string, cellSizes ...int) *tensor.Float64
- func (dt *Table) AddIntColumn(name string, cellSizes ...int) *tensor.Int
- func (dt *Table) AddRows(n int) *Table
- func (dt *Table) AddStringColumn(name string, cellSizes ...int) *tensor.String
- func (dt *Table) AppendRows(dt2 *Table)
- func (dt *Table) Clone() *Table
- func (dt *Table) CloseLog()
- func (dt *Table) Column(name string) *tensor.Rows
- func (dt *Table) ColumnByIndex(idx int) *tensor.Rows
- func (dt *Table) ColumnIndex(name string) int
- func (dt *Table) ColumnIndexList(names ...string) []int
- func (dt *Table) ColumnList(names ...string) []tensor.Tensor
- func (dt *Table) ColumnName(i int) string
- func (dt *Table) ColumnTry(name string) (*tensor.Rows, error)
- func (dt *Table) ConfigFromTable(ft *Table) error
- func (dt *Table) DeleteAll()
- func (dt *Table) DeleteColumnByIndex(i, j int)
- func (dt *Table) DeleteColumnName(name string) bool
- func (dt *Table) DeleteRows(at, n int)
- func (dt *Table) Filter(filterer func(dt *Table, row int) bool)
- func (dt *Table) FilterString(columnName string, str string, opts tensor.StringMatch) error
- func (dt *Table) IndexesFromTensor(ix *tensor.Rows)
- func (dt *Table) IndexesNeeded()
- func (dt *Table) Init()
- func (dt *Table) InsertColumn(idx int, name string, tsr tensor.Values) error
- func (dt *Table) InsertKeyColumns(args ...string) *Table
- func (dt *Table) InsertRows(at, n int) *Table
- func (dt *Table) IsValidRow(row int) error
- func (dt *Table) Metadata() *metadata.Data
- func (dt *Table) New() *Table
- func (dt *Table) NumColumns() int
- func (dt *Table) NumRows() int
- func (dt *Table) OpenCSV(filename fsx.Filename, delim tensor.Delims) error
- func (dt *Table) OpenFS(fsys fs.FS, filename string, delim tensor.Delims) error
- func (dt *Table) OpenLog(filename string, delim tensor.Delims) error
- func (dt *Table) Permuted()
- func (dt *Table) ReadCSV(r io.Reader, delim tensor.Delims) error
- func (dt *Table) ReadCSVRow(rec []string, row int)
- func (dt *Table) RowIndex(idx int) int
- func (dt *Table) SaveCSV(filename fsx.Filename, delim tensor.Delims, headers bool) error
- func (dt *Table) Sequential()
- func (dt *Table) SetNumRows(rows int) *Table
- func (dt *Table) SetNumRowsToMax()
- func (dt *Table) SortColumn(columnName string, ascending bool) error
- func (dt *Table) SortColumnIndexes(ascending, stable bool, colIndexes ...int)
- func (dt *Table) SortColumns(ascending, stable bool, columns ...string)
- func (dt *Table) SortFunc(cmp func(dt *Table, i, j int) int)
- func (dt *Table) SortIndexes()
- func (dt *Table) SortStableFunc(cmp func(dt *Table, i, j int) int)
- func (dt *Table) Swap(i, j int)
- func (dt *Table) TableHeaders() []string
- func (dt *Table) ValidIndexes()
- func (dt *Table) WriteCSV(w io.Writer, delim tensor.Delims, headers bool) error
- func (dt *Table) WriteCSVHeaders(w io.Writer, delim tensor.Delims) (int, error)
- func (dt *Table) WriteCSVRow(w io.Writer, row int, delim tensor.Delims) error
- func (dt *Table) WriteCSVRowWriter(cw *csv.Writer, row int, ncol int) error
- func (dt *Table) WriteToLog() error

Constants ¶

const (
	// Headers is passed to CSV methods for the headers arg, to use headers
	// that capture full type and tensor shape information.
	Headers = true

	// NoHeaders is passed to CSV methods for the headers arg, to not use headers
	NoHeaders = false
)

Variables ¶

var (
	ErrLogNoNewRows = errors.New("no new rows to write")
)

var TableHeaderToType = map[byte]reflect.Kind{
	'$': reflect.String,
	'%': reflect.Float32,
	'#': reflect.Float64,
	'|': reflect.Int,
	'^': reflect.Bool,
}

TableHeaderToType maps special header characters to data type

Functions ¶

func AddColumn ¶

func AddColumn[T tensor.DataTypes](dt *Table, name string, cellSizes ...int) tensor.Tensor

AddColumn adds a new column to the table, of given type and column name (which must be unique). If no cellSizes are specified, it holds scalar values, otherwise the cells are n-dimensional tensors of given size.

func CleanCatTSV ¶

func CleanCatTSV(filename string, sorts ...string) error

CleanCatTSV cleans a TSV file formed by concatenating multiple files together. Removes redundant headers and then sorts by given set of columns.
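The header-removal step can be sketched on in-memory lines as follows. This is only an illustration under the assumption that every concatenated file starts with an identical header line; `dropRepeatedHeaders` is a hypothetical helper, and the file handling and sorting that CleanCatTSV performs are not shown.

```go
package main

import "fmt"

// dropRepeatedHeaders keeps the first line as the header and removes any
// later line that is identical to it, as happens when files that each start
// with the same header are concatenated.
func dropRepeatedHeaders(lines []string) []string {
	if len(lines) == 0 {
		return nil
	}
	hdr := lines[0]
	out := []string{hdr}
	for _, ln := range lines[1:] {
		if ln == hdr {
			continue
		}
		out = append(out, ln)
	}
	return out
}

func main() {
	lines := []string{
		"$Name\t#Value",
		"a\t1",
		"$Name\t#Value",
		"b\t2",
	}
	for _, ln := range dropRepeatedHeaders(lines) {
		fmt.Println(ln)
	}
}
```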

func ConfigFromDataValues ¶

func ConfigFromDataValues(dt *Table, hdrs []string, rec [][]string) error

ConfigFromDataValues configures a Table based on data types inferred from the string representation of given records, using header names if present.

func ConfigFromHeaders ¶

func ConfigFromHeaders(dt *Table, hdrs []string, rec [][]string) error

ConfigFromHeaders attempts to configure a Table based on the headers. For non-table headers, data is examined to determine types.

func ConfigFromTableHeaders ¶

func ConfigFromTableHeaders(dt *Table, hdrs []string) error

ConfigFromTableHeaders attempts to configure a Table based on special table headers

func DetectTableHeaders ¶

func DetectTableHeaders(hdrs []string) bool

DetectTableHeaders looks for special header characters -- returns true if found

func InferDataType ¶

func InferDataType(str string) reflect.Kind

InferDataType returns the inferred data type for the given string. It only deals with float64, int, and string types.
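One plausible way to do this kind of inference with the standard library is sketched below. The function name `inferKind` and the order of the checks (integer first, then float, else string) are assumptions; the package's actual rules may differ in edge cases.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// inferKind guesses a data type from a string value: int if it parses as a
// base-10 integer, float64 if it parses as a float, and string otherwise.
// Note that strconv.ParseFloat also accepts forms such as "NaN" and "Inf".
func inferKind(s string) reflect.Kind {
	s = strings.TrimSpace(s)
	if _, err := strconv.ParseInt(s, 10, 64); err == nil {
		return reflect.Int
	}
	if _, err := strconv.ParseFloat(s, 64); err == nil {
		return reflect.Float64
	}
	return reflect.String
}

func main() {
	for _, s := range []string{"42", "3.14", "Event_0"} {
		fmt.Println(s, inferKind(s))
	}
}
```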

func InsertColumn ¶

func InsertColumn[T tensor.DataTypes](dt *Table, name string, idx int, cellSizes ...int) tensor.Tensor

InsertColumn inserts a new column to the table, of given type and column name (which must be unique), at given index. If no cellSizes are specified, it holds scalar values, otherwise the cells are n-dimensional tensors of given size.

func ShapeFromString ¶

func ShapeFromString(dims string) []int

ShapeFromString parses string representation of shape as N:d,d,..
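A standalone sketch of parsing the N:d,d,.. form is shown below. The name `parseShape` is hypothetical, and unlike ShapeFromString (which returns only []int) this version returns an error so that malformed input is visible in the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseShape parses a shape string such as "2:5,4" into []int{5, 4}.
// The number before the colon is the number of dimensions and must match
// the count of comma-separated sizes that follow.
func parseShape(dims string) ([]int, error) {
	nd, list, ok := strings.Cut(dims, ":")
	if !ok {
		return nil, fmt.Errorf("shape %q has no ':' separator", dims)
	}
	n, err := strconv.Atoi(strings.TrimSpace(nd))
	if err != nil {
		return nil, err
	}
	parts := strings.Split(list, ",")
	if len(parts) != n {
		return nil, fmt.Errorf("shape %q declares %d dims but lists %d", dims, n, len(parts))
	}
	shape := make([]int, n)
	for i, p := range parts {
		d, err := strconv.Atoi(strings.TrimSpace(p))
		if err != nil {
			return nil, err
		}
		shape[i] = d
	}
	return shape, nil
}

func main() {
	shape, err := parseShape("2:5,4")
	fmt.Println(shape, err)
}
```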

func TableColumnType ¶

func TableColumnType(nm string) (reflect.Kind, string)

TableColumnType parses the column header for special table type information

func TableHeaderChar ¶

func TableHeaderChar(typ reflect.Kind) byte

TableHeaderChar returns the special header character based on given data type

func UpdateSliceTable ¶

func UpdateSliceTable(st any, dt *Table)

UpdateSliceTable updates given Table with data from the given slice of structs, which must be the same type as used to configure the table

Types ¶

type Columns ¶

type Columns struct {
	keylist.List[string, tensor.Values]

	// number of rows, which is enforced to be the size of the
	// outermost row dimension of the column tensors.
	Rows int `edit:"-"`
}

Columns is the underlying column list and number of rows for Table. Each column is a raw tensor.Values tensor, and Table provides a tensor.Rows indexed view onto the Columns.

func NewColumns ¶

func NewColumns() *Columns

NewColumns returns a new Columns.

func (*Columns) AddColumn ¶

func (cl *Columns) AddColumn(name string, tsr tensor.Values) error

AddColumn adds the given tensor (as a tensor.Values) as a column, returning an error and not adding if the name is not unique. Automatically adjusts the shape to fit the current number of rows, (setting Rows if this is the first column added) and calls the metadata SetName with column name.

func (*Columns) AppendRows ¶

func (cl *Columns) AppendRows(cl2 *Columns)

AppendRows appends shared columns in both tables with input table rows.

func (*Columns) Clone ¶

func (cl *Columns) Clone() *Columns

Clone returns a complete copy of this set of columns.

func (*Columns) InsertColumn ¶

func (cl *Columns) InsertColumn(idx int, name string, tsr tensor.Values) error

InsertColumn inserts the given tensor as a column at given index, returning an error and not adding if the name is not unique. Automatically adjusts the shape to fit the current number of rows.

func (*Columns) SetNumRows ¶

func (cl *Columns) SetNumRows(rows int) *Columns

SetNumRows sets the number of rows in the table, across all columns. It is safe to set this to 0. For incrementally growing tables (e.g., a log) it is best to first set the anticipated full size, which allocates the full amount of memory, and then set to 0 and grow incrementally.
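The preallocate-then-grow advice has a direct analogue in plain Go slices, sketched here. This illustrates the general idea only; it says nothing about how the tensor types manage their memory, and `newLogColumn` is a hypothetical helper.

```go
package main

import "fmt"

// newLogColumn reserves capacity for the expected number of rows up front,
// while starting with zero rows in use.
func newLogColumn(expected int) []float64 {
	return make([]float64, 0, expected)
}

func main() {
	col := newLogColumn(1000)
	reserved := cap(col)
	for i := 0; i < 1000; i++ {
		// Appending within the reserved capacity does not reallocate.
		col = append(col, float64(i))
	}
	fmt.Println(len(col), cap(col) == reserved)
}
```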

type FilterFunc ¶

type FilterFunc func(dt *Table, row int) bool

FilterFunc is a function used for filtering that returns true if Table row should be included in the current filtered view of the table, and false if it should be removed.

type Table ¶

type Table struct {
	// Columns has the list of column tensor data for this table.
	// Different tables can provide different indexed views onto the same Columns.
	Columns *Columns

	// Indexes are the indexes into Tensor rows, with nil = sequential.
	// Only set if order is different from default sequential order.
	// These indexes are shared into the `tensor.Rows` Column values
	// to provide a coordinated indexed view into the underlying data.
	Indexes []int

	// Meta is misc metadata for the table. Use lower-case key names
	// following the struct tag convention:
	//	- name string = name of table
	//	- doc string = documentation, description
	//	- read-only bool = gui is read-only
	//	- precision int = n for precision to write out floats in csv.
	Meta metadata.Data
}

Table is a table of Tensor columns aligned by a common outermost row dimension. Use the Table.Column (by name) and Table.ColumnByIndex methods to obtain a tensor.Rows view of the column, using the shared [Table.Indexes] of the Table. Thus, a coordinated sorted and filtered view of the column data is automatically available for any of the tensor package functions that use tensor.Tensor as the one common data representation for all operations. Tensor Columns are always raw value types and support SubSpace operations on cells.

func New ¶

func New(name ...string) *Table

New returns a new Table with its own (empty) set of Columns. Can pass an optional name which calls metadata SetName.

func NewSliceTable ¶

func NewSliceTable(st any) (*Table, error)

NewSliceTable returns a new Table with data from the given slice of structs.
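The general technique of turning a slice of structs into named columns can be sketched with reflection. This is a simplified illustration, not the package's implementation: `sliceColumns` is a hypothetical helper, it keeps only exported fields, and it stores every value as a string.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// sliceColumns converts a slice of structs into column names (one per
// exported field) and string-valued columns (one entry per slice element).
func sliceColumns(st any) (names []string, cols [][]string, err error) {
	v := reflect.ValueOf(st)
	if v.Kind() != reflect.Slice || v.Type().Elem().Kind() != reflect.Struct {
		return nil, nil, errors.New("need a slice of structs")
	}
	t := v.Type().Elem()
	for f := 0; f < t.NumField(); f++ {
		if !t.Field(f).IsExported() {
			continue
		}
		names = append(names, t.Field(f).Name)
		col := make([]string, v.Len())
		for r := 0; r < v.Len(); r++ {
			col[r] = fmt.Sprint(v.Index(r).Field(f).Interface())
		}
		cols = append(cols, col)
	}
	return names, cols, nil
}

type event struct {
	Name  string
	Score float64
}

func main() {
	names, cols, err := sliceColumns([]event{{"Event_0", 1.5}, {"Event_1", 2}})
	fmt.Println(names, cols, err)
}
```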

func NewView ¶

func NewView(src *Table) *Table

NewView returns a new Table with its own Rows view into the same underlying set of Column tensor data as the source table. Indexes are copied from the existing table -- use Sequential to reset to full sequential view.

func (*Table) AddColumn ¶

func (dt *Table) AddColumn(name string, tsr tensor.Values) error

AddColumn adds the given tensor.Values as a column to the table, returning an error and not adding if the name is not unique. Automatically adjusts the shape to fit the current number of rows.

func (*Table) AddColumnOfType ¶

func (dt *Table) AddColumnOfType(name string, typ reflect.Kind, cellSizes ...int) tensor.Tensor

AddColumnOfType adds a new column to the table, of given reflect type and column name (which must be unique). If no cellSizes are specified, it holds scalar values, otherwise the cells are n-dimensional tensors of given size. Supported types include string, bool (for tensor.Bool), float32, float64, int, int32, and byte.

func (*Table) AddFloat32Column ¶

func (dt *Table) AddFloat32Column(name string, cellSizes ...int) *tensor.Float32

AddFloat32Column adds a new float32 column with given name. If no cellSizes are specified, it holds scalar values, otherwise the cells are n-dimensional tensors of given size.

func (*Table) AddFloat64Column ¶

func (dt *Table) AddFloat64Column(name string, cellSizes ...int) *tensor.Float64

AddFloat64Column adds a new float64 column with given name. If no cellSizes are specified, it holds scalar values, otherwise the cells are n-dimensional tensors of given size.

func (*Table) AddIntColumn ¶

func (dt *Table) AddIntColumn(name string, cellSizes ...int) *tensor.Int

AddIntColumn adds a new int column with given name. If no cellSizes are specified, it holds scalar values, otherwise the cells are n-dimensional tensors of given size.

func (*Table) AddRows ¶

func (dt *Table) AddRows(n int) *Table

AddRows adds n rows to end of underlying Table, and to the indexes in this view.

func (*Table) AddStringColumn ¶

func (dt *Table) AddStringColumn(name string, cellSizes ...int) *tensor.String

AddStringColumn adds a new String column with given name. If no cellSizes are specified, it holds scalar values, otherwise the cells are n-dimensional tensors of given size.

func (*Table) AppendRows ¶

func (dt *Table) AppendRows(dt2 *Table)

AppendRows appends shared columns in both tables with input table rows.

func (*Table) Clone ¶

func (dt *Table) Clone() *Table

Clone returns a complete copy of this table, including cloning the underlying Columns tensors, and the current [Table.Indexes]. See also Table.New to flatten the current indexes.

func (*Table) CloseLog ¶

func (dt *Table) CloseLog()

CloseLog closes the log file opened by Table.OpenLog.

func (*Table) Column ¶

func (dt *Table) Column(name string) *tensor.Rows

Column returns the tensor with given column name, as a tensor.Rows with the shared [Table.Indexes] from this table. It is best practice to access columns by name, and direct access through [Table.Columns] does not provide the shared table-wide Indexes. Returns nil if not found.

func (*Table) ColumnByIndex ¶

func (dt *Table) ColumnByIndex(idx int) *tensor.Rows

ColumnByIndex returns the tensor at the given column index, as a tensor.Rows with the shared [Table.Indexes] from this table. It is best practice to instead access columns by name using Table.Column. Direct access through [Table.Columns] does not provide the shared table-wide Indexes. Will panic if out of range.

func (*Table) ColumnIndex ¶

func (dt *Table) ColumnIndex(name string) int

ColumnIndex returns the index for given column name.

func (*Table) ColumnIndexList ¶

func (dt *Table) ColumnIndexList(names ...string) []int

ColumnIndexList returns a list of indexes to columns of given names.

func (*Table) ColumnList ¶

func (dt *Table) ColumnList(names ...string) []tensor.Tensor

ColumnList returns a list of tensors with given column names, as tensor.Rows with the shared [Table.Indexes] from this table.

func (*Table) ColumnName ¶

func (dt *Table) ColumnName(i int) string

ColumnName returns the name of given column.

func (*Table) ColumnTry ¶

func (dt *Table) ColumnTry(name string) (*tensor.Rows, error)

ColumnTry is a version of Table.Column that also returns an error if the column name is not found, for cases when error is needed.

func (*Table) ConfigFromTable ¶

func (dt *Table) ConfigFromTable(ft *Table) error

ConfigFromTable configures the columns of this table according to the values in the first two columns of given format table, conventionally named Name, Type (but names are not used), which must be of the string type.

func (*Table) DeleteAll ¶

func (dt *Table) DeleteAll()

DeleteAll deletes all columns, does full reset.

func (*Table) DeleteColumnByIndex ¶

func (dt *Table) DeleteColumnByIndex(i, j int)

DeleteColumnByIndex deletes columns within the index range [i:j].

func (*Table) DeleteColumnName ¶

func (dt *Table) DeleteColumnName(name string) bool

DeleteColumnName deletes the column of given name. Returns false if not found.

func (*Table) DeleteRows ¶

func (dt *Table) DeleteRows(at, n int)

DeleteRows deletes n rows of Indexes starting at given index in the list of indexes. This does not affect the underlying tensor data; to create an actual in-memory ordering with rows deleted, use Table.New.

func (*Table) Filter ¶

func (dt *Table) Filter(filterer func(dt *Table, row int) bool)

Filter filters the indexes into our Table using given Filter function. The Filter function operates directly on row numbers into the Table as these row numbers have already been projected through the indexes.
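The following standalone sketch shows the idea of filtering an index list with a function that receives underlying row numbers. The helper `filterIndexes` is hypothetical and works on plain slices rather than a Table, so the callback signature is simplified to a single row argument.

```go
package main

import "fmt"

// filterIndexes returns the subset of indexes whose underlying row passes
// keep. A nil indexes slice stands for sequential order over numRows rows.
// The keep function is given underlying row numbers, that is, values already
// taken from the index list, so it can address column data directly.
func filterIndexes(indexes []int, numRows int, keep func(row int) bool) []int {
	if indexes == nil {
		indexes = make([]int, numRows)
		for i := range indexes {
			indexes[i] = i
		}
	}
	out := make([]int, 0, len(indexes))
	for _, row := range indexes {
		if keep(row) {
			out = append(out, row)
		}
	}
	return out
}

func main() {
	values := []float64{5, -1, 3, -2}
	ix := filterIndexes(nil, len(values), func(row int) bool {
		return values[row] >= 0
	})
	fmt.Println(ix) // [0 2]
}
```

Applying a second filter to the result narrows the view further, since only rows still present in the index list are tested.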

func (*Table) FilterString ¶

func (dt *Table) FilterString(columnName string, str string, opts tensor.StringMatch) error

FilterString filters the indexes using string values in column compared to given string. Includes rows with matching values unless the Exclude option is set. If Contains option is set, it only checks if row contains string; if IgnoreCase, ignores case, otherwise filtering is case sensitive. Uses first cell from higher dimensions. Returns error if column name not found.
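The option semantics described here can be sketched as a small predicate. The `matchOpts` struct and its `keep` method are hypothetical and only mirror the three options named above; they are not the definition of tensor.StringMatch.

```go
package main

import (
	"fmt"
	"strings"
)

// matchOpts is a hypothetical set of string matching options.
type matchOpts struct {
	Contains   bool // match if the value contains the string
	IgnoreCase bool // compare without regard to case
	Exclude    bool // keep rows that do NOT match
}

// keep reports whether a row with the given value should stay in the view.
func (o matchOpts) keep(val, str string) bool {
	if o.IgnoreCase {
		val = strings.ToLower(val)
		str = strings.ToLower(str)
	}
	var matched bool
	if o.Contains {
		matched = strings.Contains(val, str)
	} else {
		matched = val == str
	}
	if o.Exclude {
		return !matched
	}
	return matched
}

func main() {
	opts := matchOpts{Contains: true, IgnoreCase: true}
	for _, name := range []string{"Event_0", "trial_1", "EVENT_2"} {
		fmt.Println(name, opts.keep(name, "event"))
	}
}
```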

func (*Table) IndexesFromTensor ¶

func (dt *Table) IndexesFromTensor(ix *tensor.Rows)

IndexesFromTensor copies Indexes from the given tensor.Rows tensor, including if they are nil. This allows column-specific Sort, Filter and other such methods to be applied to the entire table.

func (*Table) IndexesNeeded ¶

func (dt *Table) IndexesNeeded()

IndexesNeeded is called prior to an operation that needs actual indexes, e.g., Sort, Filter. If Indexes == nil, they are set to all rows, otherwise current indexes are left as is. Use Sequential, then IndexesNeeded to ensure all rows are represented.

func (*Table) Init ¶

func (dt *Table) Init()

Init initializes a new empty table with NewColumns.

func (*Table) InsertColumn ¶

func (dt *Table) InsertColumn(idx int, name string, tsr tensor.Values) error

InsertColumn inserts the given tensor.Values as a column to the table at given index, returning an error and not adding if the name is not unique. Automatically adjusts the shape to fit the current number of rows.

func (*Table) InsertKeyColumns ¶

func (dt *Table) InsertKeyColumns(args ...string) *Table

InsertKeyColumns returns a copy of the given Table with new columns having given values, inserted at the start, used as legend keys etc. args must be in pairs: column name, value. All rows get the same value.

func (*Table) InsertRows ¶

func (dt *Table) InsertRows(at, n int) *Table

InsertRows adds n rows to end of underlying Table, and to the indexes starting at given index in this view, providing an efficient insertion operation that only exists in the indexed view. To create an in-memory ordering, use Table.New.

func (*Table) IsValidRow ¶

func (dt *Table) IsValidRow(row int) error

IsValidRow returns error if the row is invalid, if error checking is needed.

func (*Table) Metadata ¶

func (dt *Table) Metadata() *metadata.Data

func (*Table) New ¶

func (dt *Table) New() *Table

New returns a new table with column data organized according to the indexes. If Indexes are nil, a clone of the current table is returned, but this function is only sensible if there is an indexed view in place.
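Flattening an indexed view into a new in-memory ordering amounts to copying rows in index order, as in this sketch for a single string column. The helper `flatten` is hypothetical and ignores multi-dimensional cells.

```go
package main

import "fmt"

// flatten copies col into a new slice ordered according to indexes.
// With nil indexes (sequential order) it returns a plain copy.
func flatten(col []string, indexes []int) []string {
	if indexes == nil {
		return append([]string(nil), col...)
	}
	out := make([]string, len(indexes))
	for i, ix := range indexes {
		out[i] = col[ix]
	}
	return out
}

func main() {
	col := []string{"x", "y", "z"}
	// A view that was filtered to two rows and reordered.
	fmt.Println(flatten(col, []int{2, 0})) // [z x]
}
```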

func (*Table) NumColumns ¶

func (dt *Table) NumColumns() int

NumColumns returns the number of columns.

func (*Table) NumRows ¶

func (dt *Table) NumRows() int

NumRows returns the number of rows, which is the number of Indexes if present, else actual number of [Columns.Rows].

func (*Table) OpenCSV ¶

func (dt *Table) OpenCSV(filename fsx.Filename, delim tensor.Delims) error

OpenCSV reads a table from a comma-separated-values (CSV) file (where comma = any delimiter, specified in the delim arg), using the Go standard encoding/csv reader conforming to the official CSV standard. If the table does not currently have any columns, the first row of the file is assumed to be headers, and columns are constructed therefrom. If the file was saved from table with headers, then these have full configuration information for tensor type and dimensionality. If the table DOES have existing columns, then those are used robustly for whatever information fits from each row of the file.

func (*Table) OpenFS ¶

func (dt *Table) OpenFS(fsys fs.FS, filename string, delim tensor.Delims) error

OpenFS is the version of Table.OpenCSV that uses an fs.FS filesystem.

func (*Table) OpenLog ¶

func (dt *Table) OpenLog(filename string, delim tensor.Delims) error

OpenLog opens a log file for this table, which supports incremental output of table data as it is generated, using the standard Table.SaveCSV output formatting, using given delimiter between values on a line. Call Table.WriteToLog to write any new data rows to the open log file, and Table.CloseLog to close the file.

func (*Table) Permuted ¶

func (dt *Table) Permuted()

Permuted sets indexes to a permuted order -- if indexes already exist then existing list of indexes is permuted, otherwise a new set of permuted indexes are generated
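Both cases described here can be sketched with math/rand. The helper `permuted` and its explicit seed parameter are assumptions made for a reproducible example; the package's own random source is not shown in this documentation.

```go
package main

import (
	"fmt"
	"math/rand"
)

// permuted returns a randomly ordered index list. If indexes is nil, it
// generates a permutation of all numRows rows; otherwise it shuffles a copy
// of the existing indexes, so any filtering already applied is preserved.
func permuted(indexes []int, numRows int, seed int64) []int {
	rng := rand.New(rand.NewSource(seed))
	if indexes == nil {
		return rng.Perm(numRows)
	}
	out := append([]int(nil), indexes...)
	rng.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	return out
}

func main() {
	fmt.Println(permuted(nil, 5, 1))
	fmt.Println(permuted([]int{4, 2, 7}, 10, 1))
}
```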

func (*Table) ReadCSV ¶

func (dt *Table) ReadCSV(r io.Reader, delim tensor.Delims) error

ReadCSV reads a table from a comma-separated-values (CSV) file (where comma = any delimiter, specified in the delim arg), using the Go standard encoding/csv reader conforming to the official CSV standard. If the table does not currently have any columns, the first row of the file is assumed to be headers, and columns are constructed therefrom. If the file was saved from table with headers, then these have full configuration information for tensor type and dimensionality. If the table DOES have existing columns, then those are used robustly for whatever information fits from each row of the file.
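The underlying reader setup can be sketched with encoding/csv directly: the delimiter is set through the reader's Comma field, and the first record is treated as the header row. The helper `parseTable` is hypothetical, takes its input as a string for brevity, and leaves all values as strings.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseTable reads delimited text, returning the first record as headers
// and the remaining records as data rows.
func parseTable(data string, delim rune) (hdrs []string, rows [][]string, err error) {
	cr := csv.NewReader(strings.NewReader(data))
	cr.Comma = delim // any delimiter, e.g. ',' or '\t'
	recs, err := cr.ReadAll()
	if err != nil {
		return nil, nil, err
	}
	if len(recs) == 0 {
		return nil, nil, nil
	}
	return recs[0], recs[1:], nil
}

func main() {
	data := "_H:\t$Name\t%Score\n_D:\tEvent_0\t1.5\n_D:\tEvent_1\t2\n"
	hdrs, rows, err := parseTable(data, '\t')
	fmt.Println(hdrs, len(rows), err)
}
```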

func (*Table) ReadCSVRow ¶

func (dt *Table) ReadCSVRow(rec []string, row int)

ReadCSVRow reads a record of CSV data into given row in table

func (*Table) RowIndex ¶

func (dt *Table) RowIndex(idx int) int

RowIndex returns the actual index into underlying tensor row based on given index value. If Indexes == nil, index is passed through.

func (*Table) SaveCSV ¶

func (dt *Table) SaveCSV(filename fsx.Filename, delim tensor.Delims, headers bool) error

SaveCSV writes a table to a comma-separated-values (CSV) file (where comma = any delimiter, specified in the delim arg). If headers = true then generate column headers that capture the type and tensor cell geometry of the columns, enabling full reloading of exactly the same table format and data (recommended). Otherwise, only the data is written.

func (*Table) Sequential ¶

func (dt *Table) Sequential()

Sequential sets Indexes to nil, resulting in sequential row-wise access into tensor.

func (*Table) SetNumRows ¶

func (dt *Table) SetNumRows(rows int) *Table

SetNumRows sets the number of rows in the table, across all columns. If rows = 0 then effective number of rows in tensors is 1, as this dim cannot be 0. If indexes are in place and rows are added, indexes for the new rows are added.

func (*Table) SetNumRowsToMax ¶

func (dt *Table) SetNumRowsToMax()

SetNumRowsToMax gets the current max number of rows across all the column tensors, and sets the number of rows to that. This will automatically pad shorter columns so they all have the same number of rows. If a table has columns that are not fully under its own control, they can change size, so this reestablishes a common row dimension.
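A minimal sketch of re-establishing a common row count follows. The helper `padToMax` is hypothetical, works on plain float64 slices, and pads with zero values, which is an assumption about what padding means here.

```go
package main

import "fmt"

// padToMax finds the longest column and extends every shorter column with
// zero values so that all columns have the same number of rows.
// It returns the resulting common row count.
func padToMax(cols map[string][]float64) int {
	max := 0
	for _, c := range cols {
		if len(c) > max {
			max = len(c)
		}
	}
	for name, c := range cols {
		for len(c) < max {
			c = append(c, 0)
		}
		cols[name] = c
	}
	return max
}

func main() {
	cols := map[string][]float64{
		"a": {1, 2, 3},
		"b": {9},
	}
	n := padToMax(cols)
	fmt.Println(n, cols["b"])
}
```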

func (*Table) SortColumn ¶

func (dt *Table) SortColumn(columnName string, ascending bool) error

SortColumn sorts the indexes into our Table according to values in given column, using either ascending or descending order, (use tensor.Ascending or tensor.Descending for self-documentation). Uses first cell of higher dimensional data. Returns error if column name not found.

func (*Table) SortColumnIndexes ¶

func (dt *Table) SortColumnIndexes(ascending, stable bool, colIndexes ...int)

SortColumnIndexes sorts the indexes into our Table according to values in given list of column indexes, using either ascending or descending order for all of the columns. Uses first cell of higher dimensional data.

func (*Table) SortColumns ¶

func (dt *Table) SortColumns(ascending, stable bool, columns ...string)

SortColumns sorts the indexes into our Table according to values in given column names, using either ascending or descending order (use tensor.Ascending or tensor.Descending for self-documentation), and optionally using a stable sort. Uses first cell of higher dimensional data.

func (*Table) SortFunc ¶

func (dt *Table) SortFunc(cmp func(dt *Table, i, j int) int)

SortFunc sorts the indexes into our Table using given compare function. The compare function operates directly on row numbers into the Table as these row numbers have already been projected through the indexes. cmp(a, b) should return a negative number when a < b, a positive number when a > b and zero when a == b.

func (*Table) SortIndexes ¶

func (dt *Table) SortIndexes()

SortIndexes sorts the indexes into our Table directly in numerical order, producing the native ordering, while preserving any filtering that might have occurred.

func (*Table) SortStableFunc ¶

func (dt *Table) SortStableFunc(cmp func(dt *Table, i, j int) int)

SortStableFunc stably sorts the indexes into our Table using given compare function. The compare function operates directly on row numbers into the Table as these row numbers have already been projected through the indexes. cmp(a, b) should return a negative number when a < b, a positive number when a > b and zero when a == b. It is *essential* that it always returns 0 when the two are equal for the stable function to actually work.
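The compare-function contract can be illustrated with a standalone stable sort over an index list. The helpers `sortStable` and `byNameThenScore` are hypothetical and operate on plain slices; the compare function receives underlying row numbers and returns exactly 0 for ties, which is what lets equal rows keep their prior order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortStable stably sorts indexes using cmp, which compares two underlying
// row numbers and returns negative, zero, or positive.
func sortStable(indexes []int, cmp func(i, j int) int) {
	sort.SliceStable(indexes, func(a, b int) bool {
		return cmp(indexes[a], indexes[b]) < 0
	})
}

// byNameThenScore builds a compare function over two columns:
// first by name, then by score, returning 0 only when both are equal.
func byNameThenScore(names []string, scores []float64) func(i, j int) int {
	return func(i, j int) int {
		if c := strings.Compare(names[i], names[j]); c != 0 {
			return c
		}
		switch {
		case scores[i] < scores[j]:
			return -1
		case scores[i] > scores[j]:
			return 1
		}
		return 0
	}
}

func main() {
	names := []string{"b", "a", "b", "a"}
	scores := []float64{2, 9, 1, 9}
	indexes := []int{0, 1, 2, 3}
	sortStable(indexes, byNameThenScore(names, scores))
	fmt.Println(indexes) // [1 3 2 0]
}
```

Rows 1 and 3 compare as equal, so the stable sort leaves them in their original relative order.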

func (*Table) Swap ¶

func (dt *Table) Swap(i, j int)

Swap switches the indexes for i and j

func (*Table) TableHeaders ¶

func (dt *Table) TableHeaders() []string

TableHeaders generates special header strings from the table with full information about type and tensor cell dimensionality.

func (*Table) ValidIndexes ¶

func (dt *Table) ValidIndexes()

ValidIndexes deletes all invalid indexes from the list. Call this if rows (could) have been deleted from table.

func (*Table) WriteCSV ¶

func (dt *Table) WriteCSV(w io.Writer, delim tensor.Delims, headers bool) error

WriteCSV writes only rows in table idx view to a comma-separated-values (CSV) file (where comma = any delimiter, specified in the delim arg). If headers = true then generate column headers that capture the type and tensor cell geometry of the columns, enabling full reloading of exactly the same table format and data (recommended). Otherwise, only the data is written.

func (*Table) WriteCSVHeaders ¶

func (dt *Table) WriteCSVHeaders(w io.Writer, delim tensor.Delims) (int, error)

WriteCSVHeaders writes headers to a comma-separated-values (CSV) file (where comma = any delimiter, specified in the delim arg). Returns number of columns in header

func (*Table) WriteCSVRow ¶

func (dt *Table) WriteCSVRow(w io.Writer, row int, delim tensor.Delims) error

WriteCSVRow writes given row to a comma-separated-values (CSV) file (where comma = any delimiter, specified in the delim arg)

func (*Table) WriteCSVRowWriter ¶

func (dt *Table) WriteCSVRowWriter(cw *csv.Writer, row int, ncol int) error

WriteCSVRowWriter uses csv.Writer to write one row

func (*Table) WriteToLog ¶

func (dt *Table) WriteToLog() error

WriteToLog writes any accumulated rows in the table to the file opened by Table.OpenLog. A Header row is written for the first output. If the current number of rows is less than the last number of rows, all of those rows are written under the assumption that the rows were reset via Table.SetNumRows. Returns error for any failure, including ErrLogNoNewRows if no new rows are available to write.
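The incremental behavior described above can be sketched with encoding/csv and a counter of rows already written. Everything here is hypothetical: `rowLog`, `newRowLog`, `write`, and `errNoNewRows` are illustrative names, an in-memory buffer stands in for the log file, and rows are plain string slices.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"errors"
	"fmt"
)

var errNoNewRows = errors.New("no new rows to write")

// rowLog writes rows incrementally, remembering how many were written.
type rowLog struct {
	out     bytes.Buffer // stands in for the open log file
	w       *csv.Writer
	written int  // number of rows written so far
	header  bool // whether the header row has been written
}

func newRowLog(delim rune) *rowLog {
	l := &rowLog{}
	l.w = csv.NewWriter(&l.out)
	l.w.Comma = delim
	return l
}

// write appends any rows not yet written. If there are now fewer rows than
// were written before, the rows are assumed to have been reset and all of
// them are written.
func (l *rowLog) write(hdrs []string, rows [][]string) error {
	if len(rows) < l.written {
		l.written = 0
	}
	if len(rows) == l.written {
		return errNoNewRows
	}
	if !l.header {
		if err := l.w.Write(hdrs); err != nil {
			return err
		}
		l.header = true
	}
	for _, r := range rows[l.written:] {
		if err := l.w.Write(r); err != nil {
			return err
		}
	}
	l.written = len(rows)
	l.w.Flush()
	return l.w.Error()
}

func main() {
	l := newRowLog('\t')
	hdrs := []string{"Name", "Val"}
	rows := [][]string{{"a", "1"}}
	fmt.Println(l.write(hdrs, rows))
	fmt.Println(l.write(hdrs, rows)) // no new rows to write
	rows = append(rows, []string{"b", "2"})
	fmt.Println(l.write(hdrs, rows))
	fmt.Print(l.out.String())
}
```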
