geobuf

v0.0.0-...-cb44b2d
Published: May 18, 2021 License: Apache-2.0 Imports: 15 Imported by: 0

README

Geobuf - A GeoJSON Interchange Format

What is it?

At a high level, geobuf provides simple APIs to convert from GeoJSON and to read and write geobuf. Geobuf is a custom protobuf implementation of GeoJSON features; it is much faster than JSON unmarshalling and much smaller for geometry-heavy features. Reading and writing are roughly 5-10x faster than what you will see with plain JSON. With some larger files, reading concurrently is around 18x faster (measured on a 1 GB California roads GeoJSON file).

Why Should I Consider Using This?

Beyond being much faster for serialization, reads can be done iteratively and, more importantly, piecewise: one could do a partial read of just the values needed for a filter, then read the entire feature only if those conditions are satisfied.

I think the main deviation from Mapbox's geobuf is the use of flat GeoJSON features at the top level. The problem with Mapbox's geobuf format is that it can't be streamed: to assemble an entire GeoJSON feature from Mapbox's geobuf you need the end of the file, which holds the key and value lists, and I also recall there being another structure you need to iterate in parallel to assemble the feature. That effectively relegates it to a faster and smaller GeoJSON feature collection implementation, but it doesn't solve the problem I was having: needing to deserialize every feature in the feature collection into memory before any of the features can be operated on.

As for the fairly large property sizes that my features have as a result of storing properties inline instead of using shared key / value lists: without any formal testing, the size difference between the two approaches is minimal once gzipped, which is about what you would expect anyway.

Features
  • Straightforward Reader / Writer methods for everything that is done to geobufs
  • Expect at least 5-10x performance gains in both read / write against line-delimited geojson
  • CLI executables to convert to and from geobuf from geojson
  • In-place geobuf sorts, for example to map features across a file out of memory for tiling
Internals

Below is the geobuf proto file that this implementation is based on.

syntax = "proto3";

// Variant type encoding
// The use of values is described in section 4.1 of the specification
message Value {
	// Exactly one of these values must be present in a valid message
	string string_value = 1;
	float float_value = 2;
	double double_value = 3;
	int64 int_value = 4;
	uint64 uint_value = 5;
	sint64 sint_value = 6;
	bool bool_value = 7;
}

// GeomType is described in section 4.3.4 of the specification
enum GeomType {
	UNKNOWN = 0;
	POINT = 1;
	LINESTRING = 2;
	POLYGON = 3;
	MULTIPOINT = 4;
	MULTILINESTRING = 5;
	MULTIPOLYGON = 6;
}

message Feature {
	uint64 Id = 1;
	map<string, Value> Properties = 2;
	GeomType type = 3;
	repeated uint64 Geometry = 4 [ packed = true ];
	repeated int64 BoundingBox = 5 [ packed = true ]; // N,S,E,W
}

message FeatureCollection {
	repeated Feature Features = 1;
}
Usage

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func AddFeatures

func AddFeatures(geobuf *Writer, feats []string, count int, s time.Time) int

adds features

func BenchmarkRead

func BenchmarkRead(filename_geojson string, filename_geobuf string)

func BenchmarkWrite

func BenchmarkWrite(filename_geojson string, filename_geobuf string)

func BoundingBox_GeometryCollection

func BoundingBox_GeometryCollection(gs []*geojson.Geometry) []float64

Returns a BoundingBox for a geometry collection

func BoundingBox_LineStringGeometry

func BoundingBox_LineStringGeometry(line [][]float64) []float64

Returns BoundingBox for a LineString

func BoundingBox_MultiLineStringGeometry

func BoundingBox_MultiLineStringGeometry(multiline [][][]float64) []float64

Returns BoundingBox for a MultiLineString

func BoundingBox_MultiPointGeometry

func BoundingBox_MultiPointGeometry(pts [][]float64) []float64

Returns BoundingBox for a MultiPoint

func BoundingBox_MultiPolygonGeometry

func BoundingBox_MultiPolygonGeometry(multipolygon [][][][]float64) []float64

Returns BoundingBox for a MultiPolygon

func BoundingBox_PointGeometry

func BoundingBox_PointGeometry(pt []float64) []float64

bounding box on a normal point geometry; relatively useless

func BoundingBox_Points

func BoundingBox_Points(pts [][]float64) []float64

BoundingBox implementation as per https://tools.ietf.org/html/rfc7946
BoundingBox syntax: "bbox": [west, south, east, north]
BoundingBox defaults: "bbox": [-180.0, -90.0, 180.0, 90.0]
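The [west, south, east, north] ordering can be illustrated with a minimal sketch. boundsOf is a hypothetical stand-in, not the implementation of BoundingBox_Points; it assumes each point is [longitude, latitude], returns nil for an empty input (an assumption), and ignores geometries that cross the antimeridian.

```go
package main

import "fmt"

// boundsOf (hypothetical) returns [west, south, east, north] for a set of
// [lon, lat] points, following the RFC 7946 bbox ordering. It returns nil
// for an empty input and does not handle antimeridian crossings.
func boundsOf(pts [][]float64) []float64 {
	if len(pts) == 0 {
		return nil
	}
	west, south := pts[0][0], pts[0][1]
	east, north := west, south
	for _, pt := range pts[1:] {
		if pt[0] < west {
			west = pt[0]
		}
		if pt[0] > east {
			east = pt[0]
		}
		if pt[1] < south {
			south = pt[1]
		}
		if pt[1] > north {
			north = pt[1]
		}
	}
	return []float64{west, south, east, north}
}

func main() {
	pts := [][]float64{{-122.4, 37.8}, {-118.2, 34.1}, {-121.5, 38.6}}
	fmt.Println(boundsOf(pts))
}
```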

func BoundingBox_PolygonGeometry

func BoundingBox_PolygonGeometry(polygon [][][]float64) []float64

Returns BoundingBox for a Polygon

func ConvertGeobuf

func ConvertGeobuf(infile string, outfile string)

function used for converting geobuf to geojson

func ConvertGeojson

func ConvertGeojson(infile string, outfile string)

function used for converting geojson to geobuf

func EncodeVarint

func EncodeVarint(x uint64) []byte

func Expand_BoundingBoxs

func Expand_BoundingBoxs(bboxs [][]float64) []float64

this function takes an array of bounding boxes and pushes them all out into a single bounding box that covers every one of them
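Merging bounding boxes amounts to taking the minimum of the lower corners and the maximum of the upper corners. The following is a minimal sketch under the assumption that every box uses the [west, south, east, north] layout; expand is a hypothetical helper, not the body of Expand_BoundingBoxs.

```go
package main

import (
	"fmt"
	"math"
)

// expand (hypothetical) returns the smallest [west, south, east, north]
// box that covers every box in bboxes. It returns nil for an empty input
// and does not modify its arguments.
func expand(bboxes [][]float64) []float64 {
	if len(bboxes) == 0 {
		return nil
	}
	// Copy the first box so the caller's slice is left untouched.
	out := append([]float64(nil), bboxes[0]...)
	for _, bb := range bboxes[1:] {
		out[0] = math.Min(out[0], bb[0]) // west
		out[1] = math.Min(out[1], bb[1]) // south
		out[2] = math.Max(out[2], bb[2]) // east
		out[3] = math.Max(out[3], bb[3]) // north
	}
	return out
}

func main() {
	boxes := [][]float64{{0, 0, 1, 1}, {-1, 0.5, 0.5, 2}}
	fmt.Println(expand(boxes))
}
```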

func GetBoundingBox

func GetBoundingBox(g *geojson.Geometry) []float64

retrieves a bounding box for a given geometry

func GetFilesize

func GetFilesize(filename string) int

func GetKeys

func GetKeys(buf *Reader) ([]string, int)

func Increment

func Increment(buf *Reader, increment int) ([]byte, bool)

increments returning the given bytes of a feature collection

func MapGeobuf

func MapGeobuf(infile string, newfile string, mapfunc MapFunc)

function used for applying a MapFunc to each feature of a geobuf file and writing the results to a new file

func Push_Two_BoundingBoxs

func Push_Two_BoundingBoxs(bb1 []float64, bb2 []float64) []float64

func ReadBoundingBox

func ReadBoundingBox(bytevals []byte) []float64

reads the bounding box of a feature from bytes

func ReadFeature

func ReadFeature(bytevals []byte) *geojson.Feature

reads a single feature from bytes

func ReadGeobufCSV

func ReadGeobufCSV(filename string)

func ReadKeys

func ReadKeys(bytevals []byte) []string

reads the property keys of a feature from bytes
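ReadKeys and ReadBoundingBox take the raw bytes of a feature, which fits the partial-read idea from the README: pull out one field and skip the rest. A minimal sketch of that technique follows, assuming features are protobuf-encoded with Id in field 1 as in the proto file under Internals. readID and encode are hypothetical helpers; how the package itself performs partial reads is not shown here.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readID (hypothetical) scans a protobuf-encoded message and returns the
// value of field 1 (a varint), skipping every other field without
// decoding its contents.
func readID(msg []byte) (uint64, bool) {
	for len(msg) > 0 {
		key, n := binary.Uvarint(msg)
		if n <= 0 {
			return 0, false
		}
		msg = msg[n:]
		field, wire := key>>3, key&7
		switch wire {
		case 0: // varint
			v, n := binary.Uvarint(msg)
			if n <= 0 {
				return 0, false
			}
			if field == 1 {
				return v, true
			}
			msg = msg[n:]
		case 1: // fixed 64-bit
			if len(msg) < 8 {
				return 0, false
			}
			msg = msg[8:]
		case 2: // length-delimited: jump over the payload
			size, n := binary.Uvarint(msg)
			if n <= 0 || uint64(len(msg)-n) < size {
				return 0, false
			}
			msg = msg[n+int(size):]
		case 5: // fixed 32-bit
			if len(msg) < 4 {
				return 0, false
			}
			msg = msg[4:]
		default:
			return 0, false
		}
	}
	return 0, false
}

// encode (hypothetical) builds a tiny message with a length-delimited
// field 4 placed before the varint field 1, so readID has to skip it.
func encode(id uint64, geometry []byte) []byte {
	var tmp [binary.MaxVarintLen64]byte
	out := []byte{0x22} // field 4, wire type 2
	n := binary.PutUvarint(tmp[:], uint64(len(geometry)))
	out = append(out, tmp[:n]...)
	out = append(out, geometry...)
	out = append(out, 0x08) // field 1, wire type 0
	n = binary.PutUvarint(tmp[:], id)
	return append(out, tmp[:n]...)
}

func main() {
	msgs := [][]byte{encode(7, []byte{1, 2, 3}), encode(42, []byte{4, 5})}
	for _, m := range msgs {
		// Only messages that pass the cheap filter would be fully decoded.
		if id, ok := readID(m); ok && id == 42 {
			fmt.Println("full decode needed for feature", id)
		}
	}
}
```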

func WriteMetaData

func WriteMetaData(meta MetaData) interface{}

writes out a MetaData struct (the counterpart of ReadMetaData)

func WriteRow

func WriteRow(feature *geojson.Feature, keys []string)

Types

type Concurrent

type Concurrent struct {
	Reader       *Reader
	C            chan *geojson.Feature
	Count        int
	Limit        int
	FeatureCount int
}

func NewConcurrent

func NewConcurrent(buf *Reader, limit int) *Concurrent

initiating a new concurrent reader

func (*Concurrent) Feature

func (con *Concurrent) Feature() *geojson.Feature

receiving a feature from a channel

func (*Concurrent) Next

func (con *Concurrent) Next() bool

advances a concurrent read to the next feature

func (*Concurrent) StartProcesses

func (con *Concurrent) StartProcesses()

starts the processes that read concurrently

type Geojson_File

type Geojson_File struct {
	Features []*geojson.Feature
	Count    int
	File     *os.File
	Pos      int64
	Feat_Pos int
}

structure used for converting geojson

func NewGeojson

func NewGeojson(filename string) Geojson_File

creates a Geojson_File

func (*Geojson_File) ReadChunk

func (geojsonfile *Geojson_File) ReadChunk(size int) []string

reads a chunk of a geojson file

type MapFunc

type MapFunc func(feature *geojson.Feature) *geojson.Feature
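A MapFunc receives a feature and returns a (possibly modified) feature. The sketch below shows the shape of such a function using a stand-in Feature type so that it runs without external packages; in real use the argument is a *geojson.Feature. Treating a nil result as "drop this feature" is an assumption made for the example, not documented behavior of MapGeobuf.

```go
package main

import "fmt"

// Feature is a stand-in for *geojson.Feature, used only in this sketch.
type Feature struct {
	ID         int
	Properties map[string]interface{}
}

// MapFunc mirrors the shape of the package's MapFunc type.
type MapFunc func(feature *Feature) *Feature

// mapFeatures (hypothetical) applies fn to every feature and keeps the
// non-nil results.
func mapFeatures(in []*Feature, fn MapFunc) []*Feature {
	out := make([]*Feature, 0, len(in))
	for _, f := range in {
		if mapped := fn(f); mapped != nil {
			out = append(out, mapped)
		}
	}
	return out
}

func main() {
	feats := []*Feature{
		{ID: 1, Properties: map[string]interface{}{"highway": "primary"}},
		{ID: 2, Properties: map[string]interface{}{"highway": "footway"}},
	}
	// Tag primary roads and drop everything else.
	primaryOnly := func(f *Feature) *Feature {
		if f.Properties["highway"] != "primary" {
			return nil
		}
		f.Properties["major"] = true
		return f
	}
	for _, f := range mapFeatures(feats, primaryOnly) {
		fmt.Println(f.ID, f.Properties["major"])
	}
}
```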

type MetaData

type MetaData struct {
	FileSize       int
	NumberFeatures int
	Files          map[string]*SubFile
	Bounds         m.Extrema
}

struct for handling metadata

func ReadMetaData

func ReadMetaData(bytevals []byte) MetaData

reads the metadata from a raw bytes set

func (*MetaData) LintMetaData

func (metadata *MetaData) LintMetaData(pos int)

lints metadata

type Reader

type Reader struct {
	FileBool     bool                       // a boolean for whether its a file or byte buffer
	Reader       *protoscan.ProtobufScanner // underlying protoscan reader
	Filename     string                     // filename
	File         *os.File                   // file object
	Buf          []byte                     // buffer if applicable
	MetaData     MetaData                   // metadata
	MetaDataBool bool                       // metadatabool
	SubFileEnd   int                        // the end point of a given subfile
	FeatureCount int                        // number of features iterated through
}

protobuf scanner implementation

func ReaderBuf

func ReaderBuf(bytevals []byte) *Reader

creates a reader for a byte array

func ReaderFile

func ReaderFile(filename string) *Reader

creates a reader for file

func (*Reader) Bytes

func (reader *Reader) Bytes() []byte

alias for the Protobuf() method; again, more expressive for our use case

func (*Reader) BytesIndicies

func (reader *Reader) BytesIndicies() ([]byte, [2]int)

like Bytes(), but also returns the start and end indices of those bytes

func (*Reader) CheckMetaData

func (reader *Reader) CheckMetaData()

checks for metadata

func (*Reader) Close

func (reader *Reader) Close()

closes an underlying file

func (*Reader) Feature

func (reader *Reader) Feature() *geojson.Feature

like Bytes(), but returns the current bytes decoded into a geojson feature

func (*Reader) FeatureIndicies

func (reader *Reader) FeatureIndicies() (*geojson.Feature, [2]int)

like Feature(), but also returns the start and end indices of the feature's bytes

func (*Reader) Next

func (reader *Reader) Next() bool

alias for the Scan method on the reader; Next is a little more expressive

func (*Reader) ReadAll

func (reader *Reader) ReadAll() []*geojson.Feature

func (*Reader) ReadIndAppend

func (reader *Reader) ReadIndAppend(inds [2]int) []byte

reads the bytes at a pair of indices, ready to append

func (*Reader) ReadIndFeature

func (reader *Reader) ReadIndFeature(inds [2]int) *geojson.Feature

reads a feature from a pair of indices

func (*Reader) ReadIndicies

func (reader *Reader) ReadIndicies(inds [2]int) []byte

a simple read of the bytes between two indices in a reader

func (*Reader) Reset

func (reader *Reader) Reset()

resets a reader to be read again

func (*Reader) Seek

func (reader *Reader) Seek(pos int)

this function taps into the underlying protoscan implementation and reconfigures it to start reading at a certain position

func (*Reader) SubFileBytes

func (reader *Reader) SubFileBytes(key string) *Reader

this function takes a subfile map key, reads the entire byte array from that section of the file, and returns a NEW geobuf reader object

func (*Reader) SubFileNext

func (reader *Reader) SubFileNext() bool

alias for the Scan method on the reader (Next is a little more expressive); this Next specifically pertains to the features within a subfile

func (*Reader) SubFileSeek

func (reader *Reader) SubFileSeek(key string)

this function seeks a specific key in the file map, if the reader contains metadata; given a key, the underlying reader is moved to the exact position where that subfile starts

type SubFile

type SubFile struct {
	Positions      [2]int
	NumberFeatures int
	Size           int
}

sub file contained within a geobuf

type Writer

type Writer struct {
	Filename  string
	Writer    *bufio.Writer
	FileBool  bool
	Buffer    *bytes.Buffer
	File      *os.File
	Bytesvals []byte
}

the writer struct

func WriterBuf

func WriterBuf(bytevals []byte) *Writer

creates a writer buffer

func WriterBufNew

func WriterBufNew() *Writer

creates a new writer buffer

func WriterFile

func WriterFile(filename string) *Writer

creates a writer struct

func WriterFileNew

func WriterFileNew(filename string) *Writer

creates a writer struct

func (*Writer) AddGeobuf

func (writer *Writer) AddGeobuf(buf *Writer)

adds a geobuf buffer value to an existing geobuf

func (*Writer) Bytes

func (writer *Writer) Bytes() []byte

returns the bytes present in an underlying writer type buffer

func (*Writer) Close

func (writer *Writer) Close()

closes an underlying writer

func (*Writer) Reader

func (writer *Writer) Reader() *Reader

converts a writer into a reader

func (*Writer) Write

func (writer *Writer) Write(bytevals []byte)

writes a set of byte values representing a feature to the underlying writer

func (*Writer) WriteFeature

func (writer *Writer) WriteFeature(feature *geojson.Feature)

writes a feature

func (*Writer) WriteRaw

func (writer *Writer) WriteRaw(bytevals []byte)

writes a set of byte values representing a feature to the underlying writer

Directories

Path Synopsis
cmd
