Documentation ¶
Overview ¶
Package parquet provides a Parser which can interpret Parquet data.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Parser ¶
type Parser struct {
// contains filtered or unexported fields
}
Parser produces Partitions from Parquet data.
func CreateParser ¶
func CreateParser(conf *ParserConf) *Parser
CreateParser returns a new Parquet Parser configured by conf. Columns are parsed lazily from each row using their column name. Values which do not correspond to a Schema column are ignored.
func (*Parser) Parse ¶
func (p *Parser) Parse(r io.Reader, source sif.DataSource, schema sif.Schema, widestInitialSchema sif.Schema, onIteratorEnd func()) (sif.PartitionIterator, error)
Parse parses Parquet data read from r to produce Partitions.
func (*Parser) PartitionSize ¶
PartitionSize returns the maximum size, in rows, of Partitions produced by this Parser.
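To illustrate what a maximum partition size in rows means, here is a minimal sketch that splits a list of rows into consecutive batches no larger than a given size. The helper `chunkRows` and the use of plain strings as rows are hypothetical and are not part of this package; the real Parser produces sif Partitions through a PartitionIterator.

```go
package main

import "fmt"

// chunkRows is a hypothetical helper, not part of the parquet package.
// It splits rows into consecutive batches of at most size rows each.
// A non-positive size yields no batches.
func chunkRows(rows []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(rows); start += size {
		end := start + size
		if end > len(rows) {
			end = len(rows) // the final batch may be smaller than size
		}
		batches = append(batches, rows[start:end])
	}
	return batches
}

func main() {
	rows := []string{"a", "b", "c", "d", "e"}
	for i, batch := range chunkRows(rows, 2) {
		fmt.Printf("batch %d: %d rows\n", i, len(batch))
	}
}
```

With five rows and a size of 2, the sketch yields three batches, the last holding the single leftover row.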
type ParserConf ¶
type ParserConf struct {
	PartitionSize int // The maximum number of rows per Partition. Defaults to 128.
}
ParserConf configures a Parquet Parser.
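The documented default of 128 rows could be applied along the following lines. This is only a sketch: `parserConf` is a local stand-in for ParserConf, `effectivePartitionSize` is a hypothetical helper, and treating a zero or negative value as "unset" is an assumption, since the page does not say how the package detects an unset field.

```go
package main

import "fmt"

// defaultPartitionSize mirrors the default documented on ParserConf.
const defaultPartitionSize = 128

// parserConf is a local stand-in for ParserConf so the sketch compiles
// without importing the real package.
type parserConf struct {
	PartitionSize int
}

// effectivePartitionSize is a hypothetical helper. It returns the configured
// size, or the default when conf is nil or the size is zero or negative
// (an assumed convention for "unset").
func effectivePartitionSize(conf *parserConf) int {
	if conf == nil || conf.PartitionSize <= 0 {
		return defaultPartitionSize
	}
	return conf.PartitionSize
}

func main() {
	fmt.Println(effectivePartitionSize(&parserConf{}))                   // unset: falls back to the default
	fmt.Println(effectivePartitionSize(&parserConf{PartitionSize: 512})) // explicit value is kept
}
```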