Documentation ¶
Index ¶
- type CSVReader
- type ColumnTypeT
- type SAS7BDAT
- type Series
- func (ser *Series) AllClose(other *Series, tol float64) (bool, int)
- func (ser *Series) AllEqual(other *Series) (bool, int)
- func (ser *Series) AsFloat64Slice() ([]float64, []bool, error)
- func (ser *Series) AsStringSlice() ([]string, []bool, error)
- func (ser *Series) AsUint64Slice() ([]uint64, []bool, error)
- func (ser *Series) CountMissing() int
- func (ser *Series) Data() interface{}
- func (ser *Series) DateFromDuration(base time.Time, units string) (*Series, error)
- func (ser *Series) ForceNumeric() *Series
- func (ser *Series) Length() int
- func (ser *Series) Missing() []bool
- func (ser *Series) NullStringMissing() *Series
- func (ser *Series) Print()
- func (ser *Series) PrintRange(first, last int)
- func (ser *Series) StringFunc(f func(string) string) *Series
- func (ser *Series) ToString() *Series
- func (ser *Series) UpcastNumeric() *Series
- func (ser *Series) Write(w io.Writer)
- func (ser *Series) WriteRange(w io.Writer, first, last int)
- type SeriesArray
- type StataReader
- type StatfileReader
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type CSVReader ¶
type CSVReader struct {
	// Skip this number of rows before reading the header.
	SkipRows int
	// If true, there is a header to read, otherwise default column names are used
	HasHeader bool
	// The column names, in the order that they appear in the
	// file. Can be set by caller.
	ColumnNames []string
	// User-specified data types (maps column name to type name).
	TypeHintsName map[string]string
	// User-specified data types (indexed by column number).
	TypeHintsPos []string
	// The data type for each column.
	DataTypes []string
	// contains filtered or unexported fields
}
A CSVReader specifies how a data set in CSV format can be read from a text file.
func NewCSVReader ¶
NewCSVReader returns a CSVReader that reads CSV data from the given io.Reader, with type inference and chunking.
type ColumnTypeT ¶
type ColumnTypeT uint16
ColumnTypeT is the type of a data column in a SAS or Stata file.
const (
	SASNumericType ColumnTypeT = iota
	SASStringType
)
const (
	StataFloat64Type ColumnTypeT = 65526
	StataFloat32Type ColumnTypeT = 65527
	StataInt32Type   ColumnTypeT = 65528
	StataInt16Type   ColumnTypeT = 65529
	StataInt8Type    ColumnTypeT = 65530
	StataStrlType    ColumnTypeT = 32768
)
These are constants used in Dta files to represent different data types.
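As a rough illustration of how these codes might be consumed, the sketch below maps each Stata constant to a readable label. The type and constants are copied from this page so the example compiles on its own; typeName is a hypothetical helper, not part of the package, and codes outside the named constants are simply reported as "other".

```go
package main

import "fmt"

// ColumnTypeT and the constants below are copied from this page so the
// example is self-contained; real code would use the package's own values.
type ColumnTypeT uint16

const (
	StataFloat64Type ColumnTypeT = 65526
	StataFloat32Type ColumnTypeT = 65527
	StataInt32Type   ColumnTypeT = 65528
	StataInt16Type   ColumnTypeT = 65529
	StataInt8Type    ColumnTypeT = 65530
	StataStrlType    ColumnTypeT = 32768
)

// typeName is a hypothetical helper that returns a readable label for a
// Stata column type code. Codes that are not one of the named constants
// are reported as "other" together with their numeric value.
func typeName(t ColumnTypeT) string {
	switch t {
	case StataFloat64Type:
		return "float64"
	case StataFloat32Type:
		return "float32"
	case StataInt32Type:
		return "int32"
	case StataInt16Type:
		return "int16"
	case StataInt8Type:
		return "int8"
	case StataStrlType:
		return "strl"
	default:
		return fmt.Sprintf("other (%d)", t)
	}
}

func main() {
	codes := []ColumnTypeT{StataFloat64Type, StataInt8Type, StataStrlType, 20}
	for _, c := range codes {
		fmt.Println(typeName(c))
	}
}
```

With the real package, the codes would come from the ColumnTypes method of a reader rather than from a hand-written slice.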
type SAS7BDAT ¶
type SAS7BDAT struct {
	// Formats for the columns
	ColumnFormats []string
	// If true, trim whitespace from right of each string variable
	// (SAS7BDAT strings are fixed width)
	TrimStrings bool
	// If true, converts some date formats to Go date values (does
	// not work for all SAS date formats)
	ConvertDates bool
	// If true, strings are represented as uint64 values. Call
	// the StringFactorMap method to obtain the mapping from these
	// coded values to the actual strings that they represent.
	FactorizeStrings bool
	// If true, turns off alignment correction when reading mix-type pages.
	// In general this should be set to false. However some files
	// are read incorrectly and need this flag set to true. At present,
	// we do not know how to automatically detect the correct setting, so
	// we leave this as a configurable option.
	NoAlignCorrection bool
	// The creation date of the file
	DateCreated time.Time
	// The modification date of the file
	DateModified time.Time
	// The name of the data set
	Name string
	// The platform used to create the file
	Platform string
	// The SAS release used to create the file
	SASRelease string
	// The server type used to create the file
	ServerType string
	// The operating system type used to create the file
	OSType string
	// The operating system name used to create the file
	OSName string
	// The SAS file type
	FileType string
	// The encoding name
	FileEncoding string
	// True if the file was created on a 64 bit architecture
	U64 bool
	// The byte order of the file
	ByteOrder binary.ByteOrder
	// The compression mode of the file
	Compression string
	// A decoder for decoding text to unicode
	TextDecoder *xencoding.Decoder
	// contains filtered or unexported fields
}
SAS7BDAT represents a SAS data file in SAS7BDAT format.
func NewSAS7BDATReader ¶
func NewSAS7BDATReader(r io.ReadSeeker) (*SAS7BDAT, error)
NewSAS7BDATReader returns a new reader object for SAS7BDAT files. Call the Read method to obtain the data.
func (*SAS7BDAT) ColumnLabels ¶
ColumnLabels returns the column labels.
func (*SAS7BDAT) ColumnNames ¶
ColumnNames returns the names of the columns.
func (*SAS7BDAT) ColumnTypes ¶
func (sas *SAS7BDAT) ColumnTypes() []ColumnTypeT
ColumnTypes returns integer codes for the column data types.
func (*SAS7BDAT) Read ¶
Read returns up to num_rows rows of data from the SAS7BDAT file, as an array of Series objects. The Series data types are either float64 or string. If num_rows is negative, the remainder of the file is read. Returns (nil, io.EOF) when no rows remain.
SAS string variables have a fixed width and are right-padded with whitespace. The TrimStrings field of the SAS7BDAT struct can be set to true to automatically trim this whitespace.
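To make the effect of right-trimming concrete, here is a minimal sketch that strips trailing whitespace from fixed-width values using only the standard library. It illustrates the result one would expect from the trimming option, not the package's actual implementation; trimRight is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimRight is a hypothetical helper that returns a copy of values with
// trailing whitespace removed from each element. Leading and interior
// whitespace is preserved.
func trimRight(values []string) []string {
	out := make([]string, len(values))
	for i, v := range values {
		out[i] = strings.TrimRightFunc(v, unicode.IsSpace)
	}
	return out
}

func main() {
	// Values padded to a fixed width of 8, as a fixed-width column would be.
	padded := []string{"NY      ", "Boston  ", "        "}
	for _, v := range trimRight(padded) {
		fmt.Printf("%q\n", v)
	}
}
```

Note that a value consisting only of padding becomes the empty string, which is worth keeping in mind when deciding how to treat missing strings.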
func (*SAS7BDAT) StringFactorMap ¶
StringFactorMap returns a map that associates integer codes with the string value that each code represents. This is only relevant if FactorizeStrings is set to true.
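The following sketch shows how factorized codes could be turned back into strings once such a map is available. It assumes the map has type map[uint64]string (the page does not show the method signature, so this is a guess based on the FactorizeStrings comment), and decode is a hypothetical helper.

```go
package main

import "fmt"

// decode is a hypothetical helper that replaces each code with the string
// it stands for. The map type map[uint64]string is an assumption; the
// second return value flags codes that were found in the map.
func decode(codes []uint64, factors map[uint64]string) ([]string, []bool) {
	out := make([]string, len(codes))
	found := make([]bool, len(codes))
	for i, c := range codes {
		out[i], found[i] = factors[c]
	}
	return out, found
}

func main() {
	// Made-up codes and labels, for illustration only.
	factors := map[uint64]string{0: "low", 1: "high"}
	codes := []uint64{1, 0, 1, 7}

	values, found := decode(codes, factors)
	for i := range values {
		fmt.Printf("%d -> %q (found: %v)\n", codes[i], values[i], found[i])
	}
}
```

In real use the codes would come from a string column read with FactorizeStrings enabled, for example via AsUint64Slice.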
type Series ¶
type Series struct { // A name describing what is in this series. Name string // contains filtered or unexported fields }
A Series is a fixed-type one-dimensional sequence of data values, with an optional mask for missing values.
func NewSeries ¶
NewSeries returns a new Series value with the given name and data contents. The data slice parameter is not copied.
func (*Series) AllClose ¶
AllClose returns (true, 0) if every value in the Series is within tol of the corresponding value in the other Series. If the Series have different lengths, AllClose returns (false, -1). If the Series have different types, AllClose returns (false, -2). If the Series have the same type and the same length but are not equal, AllClose returns (false, j), where j is the index of the first position where the two Series differ.
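The return convention can be sketched on plain float64 slices as follows. This is an illustration of the documented result codes only: allClose is a hypothetical stand-in, the type-mismatch case (-2) cannot arise with typed slices and is not modeled, and missing-value masks are ignored.

```go
package main

import (
	"fmt"
	"math"
)

// allClose is a hypothetical stand-in that mimics the documented return
// convention for two float64 slices:
//   (true, 0)   every pair of values differs by at most tol
//   (false, -1) the slices have different lengths
//   (false, j)  j is the first index where the difference exceeds tol
// Missing values and type mismatches are not modeled here.
func allClose(a, b []float64, tol float64) (bool, int) {
	if len(a) != len(b) {
		return false, -1
	}
	for j := range a {
		if math.Abs(a[j]-b[j]) > tol {
			return false, j
		}
	}
	return true, 0
}

func main() {
	x := []float64{1, 2, 3}
	fmt.Println(allClose(x, []float64{1, 2.0000001, 3}, 1e-6))
	fmt.Println(allClose(x, []float64{1, 2.5, 3}, 1e-6))
	fmt.Println(allClose(x, []float64{1, 2}, 1e-6))
}
```

Returning the index alongside the boolean makes test failures easier to diagnose, since the caller can print the offending values directly.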
func (*Series) AsFloat64Slice ¶
AsFloat64Slice returns the data of the series as a float64 slice, and a boolean slice for the missing value indicators.
func (*Series) AsStringSlice ¶
AsStringSlice returns the data of the series as a string slice, and a boolean slice for the missing value indicators.
func (*Series) AsUint64Slice ¶
AsUint64Slice returns the data of the series as a uint64 slice, and a boolean slice for the missing value indicators.
func (*Series) CountMissing ¶
CountMissing returns the number of missing values in the Series.
func (*Series) Data ¶
func (ser *Series) Data() interface{}
Data returns the data component of the Series.
func (*Series) DateFromDuration ¶
DateFromDuration returns a new Series in which the data are dates, derived from a given duration value. Currently, units must be "days".
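The underlying arithmetic for "days" units can be sketched with the standard time package. The example assumes a base of 1 January 1960, the origin that both SAS and Stata use for daily dates; datesFromDays and epoch are hypothetical names, and fractional days are simply truncated here, which may differ from what the package does.

```go
package main

import (
	"fmt"
	"time"
)

// epoch is the base date used in this sketch: 1 January 1960 (UTC).
var epoch = time.Date(1960, time.January, 1, 0, 0, 0, 0, time.UTC)

// datesFromDays is a hypothetical helper that adds each day count to base
// and returns the resulting dates. Fractional parts are truncated toward
// zero, which is a simplification made for this sketch.
func datesFromDays(base time.Time, days []float64) []time.Time {
	out := make([]time.Time, len(days))
	for i, d := range days {
		out[i] = base.AddDate(0, 0, int(d))
	}
	return out
}

func main() {
	for _, t := range datesFromDays(epoch, []float64{0, 366, 21915}) {
		fmt.Println(t.Format("2006-01-02"))
	}
}
```

AddDate is used instead of adding a multiple of 24 hours so that the result is expressed in calendar days regardless of the base time's location.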
func (*Series) ForceNumeric ¶
ForceNumeric converts string values to float64 values, creating missing values where the conversion is not possible. If the data is not string type, it is unaffected.
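A conversion of this kind can be sketched with strconv.ParseFloat, marking positions that fail to parse as missing. forceNumeric below is a hypothetical helper operating on plain slices, and trimming surrounding whitespace before parsing is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// forceNumeric is a hypothetical helper that parses each string as a
// float64. Where parsing fails, the value is left at zero and the matching
// entry of the missing mask is set to true. Surrounding whitespace is
// trimmed before parsing (an assumption made for this sketch).
func forceNumeric(values []string) ([]float64, []bool) {
	out := make([]float64, len(values))
	missing := make([]bool, len(values))
	for i, v := range values {
		x, err := strconv.ParseFloat(strings.TrimSpace(v), 64)
		if err != nil {
			missing[i] = true
			continue
		}
		out[i] = x
	}
	return out, missing
}

func main() {
	vals, missing := forceNumeric([]string{"1.5", "abc", " 2 ", ""})
	for i := range vals {
		fmt.Printf("%v (missing: %v)\n", vals[i], missing[i])
	}
}
```

Keeping the mask separate from the values matches the shape returned by AsFloat64Slice, where a boolean slice carries the missing value indicators.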
func (*Series) NullStringMissing ¶
NullStringMissing returns a copy of a string series in which zero-length strings are treated as missing values. If the method is applied to a series that is not of string type, the series is returned unchanged.
func (*Series) Print ¶
func (ser *Series) Print()
Print prints the entire Series to the standard output.
func (*Series) PrintRange ¶
PrintRange prints a slice of the Series to the standard output.
func (*Series) StringFunc ¶
StringFunc applies the given function to all values in the series, if the series holds string values. Otherwise calling this method has no effect.
func (*Series) ToString ¶
ToString returns a Series with string values, derived from the given series.
func (*Series) UpcastNumeric ¶
UpcastNumeric converts all numeric data in the Series to float64 values, in place. Non-numeric data is not affected.
type SeriesArray ¶
type SeriesArray []*Series
SeriesArray is an array of pointers to Series objects. It can represent a dataset consisting of several variables.
func (SeriesArray) AllClose ¶
AllClose returns (true, 0, 0) if all numeric values in corresponding columns of the two arrays of Series objects are within the given tolerance. If any corresponding columns are not identically equal, returns (false, j, i), where j is the index of a column and i is the index of a row where the two Series are not identical. If the two SeriesArray objects have different numbers of columns, returns (false, -1, -1). If column j of the two SeriesArray objects have different lengths, returns (false, j, -1). If column j of the two SeriesArray objects have different types, returns (false, j, -2).
type StataReader ¶
type StataReader struct {
	// If true, the strl numerical codes are replaced with their
	// string values when available.
	InsertStrls bool
	// If true, the categorical numerical codes are replaced with
	// their string labels when available.
	InsertCategoryLabels bool
	// If true, dates are converted to Go date format.
	ConvertDates bool
	// A short text label for the data set.
	DatasetLabel string
	// The time stamp for the data set
	TimeStamp string
	// Number of variables
	Nvar int
	// An additional text entry describing each variable
	ColumnNamesLong []string
	// String labels for categorical variables
	ValueLabels     map[string]map[int32]string
	ValueLabelNames []string
	// Format codes for each variable
	Formats []string
	// Maps from strl keys to values
	Strls      map[uint64]string
	StrlsBytes map[uint64][]byte
	// The format version of the dta file
	FormatVersion int
	// The endian-ness of the file
	ByteOrder binary.ByteOrder
	// contains filtered or unexported fields
}
StataReader reads Stata dta data files. Currently dta format versions 114, 115, 117, and 118 can be read.
The Read method reads and returns the data. Several fields of the StataReader struct may also be of interest.
Technical information about the file format can be found here: http://www.stata.com/help.cgi?dta
func NewStataReader ¶
func NewStataReader(r io.ReadSeeker) (*StataReader, error)
NewStataReader returns a StataReader for reading from the given io.ReadSeeker.
func (*StataReader) ColumnNames ¶
func (rdr *StataReader) ColumnNames() []string
ColumnNames returns the names of the columns in the data file.
func (*StataReader) ColumnTypes ¶
func (rdr *StataReader) ColumnTypes() []ColumnTypeT
ColumnTypes returns integer codes corresponding to the data types in the Stata file. See the Stata dta documentation for more information.
func (*StataReader) Read ¶
func (rdr *StataReader) Read(rows int) ([]*Series, error)
Read returns the given number of rows of data from the Stata data file. The data are returned as an array of Series objects. If rows is negative, the remainder of the file is read.
func (*StataReader) RowCount ¶
func (rdr *StataReader) RowCount() int
RowCount returns the number of rows in the data set.
type StatfileReader ¶
type StatfileReader interface {
	ColumnNames() []string
	ColumnTypes() []ColumnTypeT
	RowCount() int
	Read(int) ([]*Series, error)
}
StatfileReader is an interface that can be used to work interchangeably with StataReader and SAS7BDAT objects.
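Code written against this interface can process a file in chunks without knowing which format it came from. The sketch below is self-contained: Series, ColumnTypeT and the interface are simplified local stand-ins for the package's types, fakeReader is a hypothetical in-memory reader, and countRows is a hypothetical helper. It assumes the end of the data is signalled either by io.EOF, as documented for the SAS7BDAT Read method, or by an empty result.

```go
package main

import (
	"fmt"
	"io"
)

// Series and ColumnTypeT are simplified stand-ins for the package's types.
type Series struct {
	Name string
	vals []float64
}

func (s *Series) Length() int { return len(s.vals) }

type ColumnTypeT uint16

// StatfileReader mirrors the interface shown on this page.
type StatfileReader interface {
	ColumnNames() []string
	ColumnTypes() []ColumnTypeT
	RowCount() int
	Read(int) ([]*Series, error)
}

// fakeReader is a hypothetical in-memory reader with a single column.
type fakeReader struct {
	data []float64
	pos  int
}

func (f *fakeReader) ColumnNames() []string      { return []string{"x"} }
func (f *fakeReader) ColumnTypes() []ColumnTypeT { return []ColumnTypeT{0} }
func (f *fakeReader) RowCount() int              { return len(f.data) }

// Read returns up to rows rows, or all remaining rows if rows is negative.
// It returns (nil, io.EOF) once the data is exhausted.
func (f *fakeReader) Read(rows int) ([]*Series, error) {
	if f.pos >= len(f.data) {
		return nil, io.EOF
	}
	end := len(f.data)
	if rows >= 0 && f.pos+rows < end {
		end = f.pos + rows
	}
	chunk := f.data[f.pos:end]
	f.pos = end
	return []*Series{{Name: "x", vals: chunk}}, nil
}

// countRows reads r in chunks of the given size and returns the number of
// rows seen. It stops on io.EOF or when a read yields no rows.
func countRows(r StatfileReader, chunk int) (int, error) {
	total := 0
	for {
		cols, err := r.Read(chunk)
		if err == io.EOF {
			return total, nil
		}
		if err != nil {
			return total, err
		}
		if len(cols) == 0 || cols[0].Length() == 0 {
			return total, nil
		}
		total += cols[0].Length()
	}
}

func main() {
	r := &fakeReader{data: make([]float64, 10)}
	n, err := countRows(r, 3)
	fmt.Println(n, err)
}
```

With the real package, a value returned by NewStataReader or NewSAS7BDATReader would take the place of fakeReader, and the loop body would process each chunk rather than just counting rows.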