Documentation ¶
Overview ¶
Package ais provides types and methods for conducting data science on signals generated by maritime entities radiating from an Automatic Identification System (AIS) transponder, as mandated by the International Maritime Organization (IMO) for all vessels over 300 gross tons and all passenger vessels.
Example ¶
This example shows basic usage: creating a new RecordSet, writing a Record to it, and finally saving the RecordSet to a csv file.
package main

import (
	"strings"

	"github.com/FATHOM5/ais"
)

func main() {
	rs := ais.NewRecordSet()
	defer rs.Close()

	h := ais.Headers{
		Fields: strings.Split("MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselName,IMO,CallSign,VesselType,Status,Length,Width,Draft,Cargo", ","),
	}
	data := strings.Split("477307900,2017-12-01T00:00:03,36.90512,-76.32652,0.0,131.0,352.0,FIRST,IMO9739666,VRPJ6,1004,moored,337,,,", ",")

	rs.SetHeaders(h)

	rec1 := ais.Record(data)
	err := rs.Write(rec1)
	if err != nil {
		panic(err)
	}
	err = rs.Flush()
	if err != nil {
		panic(err)
	}

	err = rs.Save("test.csv")
	if err != nil {
		panic(err)
	}
}
Output:
Example (Distance) ¶
This example demonstrates how to construct two ais.Record types and compute the haversine distance between them.
package main

import (
	"fmt"
	"strings"

	"github.com/FATHOM5/ais"
)

func main() {
	h := ais.Headers{
		Fields: strings.Split("MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselName,IMO,CallSign,VesselType,Status,Length,Width,Draft,Cargo", ","),
	}
	idxMap, ok := h.ContainsMulti("LAT", "LON")
	if !ok {
		panic("missing one or more required headers LAT and LON")
	}

	data1 := strings.Split("477307900,2017-12-01T00:00:03,36.90512,-76.32652,0.0,131.0,352.0,FIRST,IMO9739666,VRPJ6,1004,moored,337,,,", ",")
	data2 := strings.Split("477307902,2017-12-01T00:00:03,36.91512,-76.22652,2.3,311.0,182.0,SECOND,IMO9739800,XHYSF,,underway using engines,337,,,", ",")
	rec1 := ais.Record(data1)
	rec2 := ais.Record(data2)

	nm, err := rec1.Distance(rec2, idxMap["LAT"].Idx, idxMap["LON"].Idx)
	if err != nil {
		panic(err)
	}
	fmt.Printf("The ships are %.1fnm away from one another.\n", nm)
}
Output: The ships are 4.8nm away from one another.
Index ¶
- Constants
- Variables
- func PairHash64(rec1, rec2 *Record, indices [4]int) (uint64, error)
- type Box
- type ByTimestamp
- type Cluster
- type ClusterMap
- type Field
- type Generator
- type Geohasher
- type HeaderMap
- type Headers
- type Interactions
- type Matching
- type Record
- func (r Record) Data() []byte
- func (r Record) Distance(r2 Record, latIndex, lonIndex int) (nm float64, err error)
- func (r Record) Hash() uint64
- func (r Record) ParseFloat(index int) (float64, error)
- func (r Record) ParseInt(index int) (int64, error)
- func (r Record) ParseTime(index int) (time.Time, error)
- func (r *Record) Value(idx int) (val string, ok bool)
- func (r *Record) ValueFrom(hm HeaderMap) (val string, ok bool)
- type RecordPair
- type RecordSet
- func (rs *RecordSet) AppendField(newField string, requiredHeaders []string, gen Generator) (*RecordSet, error)
- func (rs *RecordSet) Close() error
- func (rs *RecordSet) Flush() error
- func (rs *RecordSet) Headers() Headers
- func (rs *RecordSet) Read() (*Record, error)
- func (rs *RecordSet) Save(name string) error
- func (rs *RecordSet) SetHeaders(h Headers)
- func (rs *RecordSet) SortByTime() (*RecordSet, error)
- func (rs *RecordSet) Stash(rec *Record)
- func (rs *RecordSet) Subset(m Matching) (*RecordSet, error)
- func (rs *RecordSet) SubsetLimit(m Matching, n int, multipass bool) (*RecordSet, error)
- func (rs *RecordSet) UniqueVessels() (VesselSet, error)
- func (rs *RecordSet) UniqueVesselsMulti(multipass bool) (VesselSet, error)
- func (rs *RecordSet) Write(rec Record) error
- type Vessel
- type VesselSet
- type Window
- func (win *Window) AddRecord(rec Record)
- func (win *Window) Config() string
- func (win *Window) FindClusters(geohashIndex int) ClusterMap
- func (win *Window) InWindow(t time.Time) bool
- func (win *Window) Left() time.Time
- func (win *Window) Len() int
- func (win *Window) RecordInWindow(rec *Record) (bool, error)
- func (win *Window) Right() time.Time
- func (win *Window) SetIndex(index int)
- func (win *Window) SetLeft(marker time.Time)
- func (win *Window) SetRight(marker time.Time)
- func (win *Window) SetWidth(dur time.Duration)
- func (win *Window) Slide(dur time.Duration)
- func (win *Window) String() string
- func (win *Window) Width() time.Duration
Examples ¶
Constants ¶
const InteractionFields = "InteractionHash,Distance(nm)," +
"MMSI_1,BaseDateTime_1,LAT_1,LON_1,SOG_1,COG_1,Heading_1,VesselName_1,IMO_1,CallSign_1,VesselType_1,Status_1,Length_1,Width_1,Draft_1,Cargo_1,Geohash_1," +
"MMSI_2,BaseDateTime_2,LAT_2,LON_2,SOG_2,COG_2,Heading_2,VesselName_2,IMO_2,CallSign_2,VesselType_2,Status_2,Length_2,Width_2,Draft_2,Cargo_2,Geohash_2"
InteractionFields are the default column headers used to write a csv file of two-vessel interactions. The first field, InteractionHash, is a PairHash64 return value that uniquely identifies this interaction, and Distance(nm) is the haversine distance between the two vessels.
const TimeLayout = `2006-01-02T15:04:05`
TimeLayout is the timestamp format for the MarineCadastre.gov AIS data available from the U.S. Government. An example timestamp from the data set is `2017-12-05T00:01:14`. This layout is designed to be passed to the time.Parse function as the layout string.
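As a minimal sketch of how such a layout is used, the following parses the sample timestamp with the standard library. The local constant timeLayout simply mirrors the documented value of TimeLayout, and parseStamp is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"time"
)

// timeLayout mirrors the documented value of ais.TimeLayout.
const timeLayout = "2006-01-02T15:04:05"

// parseStamp is a hypothetical helper that converts a BaseDateTime
// string into a time.Time using the layout above.
func parseStamp(s string) (time.Time, error) {
	return time.Parse(timeLayout, s)
}

func main() {
	ts, err := parseStamp("2017-12-05T00:01:14")
	if err != nil {
		panic(err)
	}
	// The layout carries no zone, so time.Parse returns the time in UTC.
	fmt.Println(ts.Format(time.RFC3339))
}
```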
Variables ¶
var ErrEmptySet = errors.New("ErrEmptySet")
ErrEmptySet is the error returned by Subset variants when there are no records in the returned *RecordSet because nothing matched the selection criteria. Functions should only return ErrEmptySet when all processing occurred successfully, but the subset criteria provided no matches to return.
Functions ¶
func PairHash64 ¶
PairHash64 returns a 64 bit fnv hash from two AIS records based on the string values of MMSI, BaseDateTime, LAT, and LON for each vessel. Indices must contain the index values in rec1 and rec2 for MMSI, BaseDateTime, LAT and LON.
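The idea can be sketched with the standard library's hash/fnv package. This is an illustration under assumptions, not the package's implementation: the function name pairHash, the FNV-1a variant, the field order, and the zero-byte separator are all choices made here for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pairHash is a hypothetical stand-in for PairHash64. It feeds the four
// indexed fields of each record into a 64-bit FNV-1a hash. The variant,
// field order, and separator byte are assumptions for illustration.
func pairHash(rec1, rec2 []string, indices [4]int) (uint64, error) {
	h := fnv.New64a()
	for _, rec := range [][]string{rec1, rec2} {
		for _, idx := range indices {
			if idx < 0 || idx >= len(rec) {
				return 0, fmt.Errorf("pairHash: index %d out of range", idx)
			}
			h.Write([]byte(rec[idx]))
			h.Write([]byte{0}) // separator so "ab","c" hashes differently from "a","bc"
		}
	}
	return h.Sum64(), nil
}

func main() {
	a := []string{"477307900", "2017-12-01T00:00:03", "36.90512", "-76.32652"}
	b := []string{"477307902", "2017-12-01T00:00:03", "36.91512", "-76.22652"}
	h, err := pairHash(a, b, [4]int{0, 1, 2, 3})
	if err != nil {
		panic(err)
	}
	fmt.Printf("interaction hash: %#x\n", h)
}
```

Note that this sketch is order-sensitive: swapping rec1 and rec2 yields a different hash. Whether the package normalizes the order is not stated in this documentation.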
Types ¶
type Box ¶
Box provides a type with min and max values for latitude and longitude, and Box implements the Matching interface. This provides a convenient way to create a Box and pass the new object to Subset in order to get a *RecordSet defined with a geographic boundary. Box includes records that are on the border and at the vertices of the geographic boundary. Constructing a Box also requires the index values for latitude and longitude in a *Record. These index values are passed to *Record.ParseFloat(index) from the Match method of a Box in order to see if the Record is in the Box.
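The inclusive boundary test described above might look like the following sketch. The type box, its field names, and its match method are hypothetical and operate on a plain []string; the sketch also assumes the box does not cross the antimeridian.

```go
package main

import (
	"fmt"
	"strconv"
)

// box is a hypothetical bounding box; the field names are illustrative only.
type box struct {
	minLat, maxLat, minLon, maxLon float64
	latIndex, lonIndex             int
}

// match reports whether rec lies inside the box, borders and vertices included.
func (b *box) match(rec []string) (bool, error) {
	if b.latIndex >= len(rec) || b.lonIndex >= len(rec) {
		return false, fmt.Errorf("box: record has only %d fields", len(rec))
	}
	lat, err := strconv.ParseFloat(rec[b.latIndex], 64)
	if err != nil {
		return false, fmt.Errorf("box: parsing latitude: %v", err)
	}
	lon, err := strconv.ParseFloat(rec[b.lonIndex], 64)
	if err != nil {
		return false, fmt.Errorf("box: parsing longitude: %v", err)
	}
	inside := lat >= b.minLat && lat <= b.maxLat &&
		lon >= b.minLon && lon <= b.maxLon
	return inside, nil
}

func main() {
	b := &box{minLat: 36.5, maxLat: 37.0, minLon: -76.5, maxLon: -76.0, latIndex: 2, lonIndex: 3}
	rec := []string{"477307900", "2017-12-01T00:00:03", "36.90512", "-76.32652"}
	ok, err := b.match(rec)
	if err != nil {
		panic(err)
	}
	fmt.Println("in box:", ok)
}
```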
type ByTimestamp ¶
type ByTimestamp struct {
// contains filtered or unexported fields
}
ByTimestamp implements the sort.Interface for creating a RecordSet sorted by BaseDateTime. The ByTimestamp struct and its Len, Swap, and Less methods are exported in order to serve as examples for how to implement the sort.Interface for a RecordSet. If you want to sort a RecordSet by time you do not need to call these methods. Just call RecordSet.SortByTime() directly to take advantage of the implementation provided in the package.
func NewByTimestamp ¶
func NewByTimestamp(rs *RecordSet) (*ByTimestamp, error)
NewByTimestamp returns a data structure suitable for sorting using the sort.Interface tools.
func (ByTimestamp) Len ¶
func (bt ByTimestamp) Len() int
Len function to implement the sort.Interface.
func (ByTimestamp) Less ¶
func (bt ByTimestamp) Less(i, j int) bool
Less function to implement the sort.Interface.
func (ByTimestamp) Swap ¶
func (bt ByTimestamp) Swap(i, j int)
Swap function to implement the sort.Interface.
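For readers who want to see the three sort.Interface methods side by side, here is a small standalone sketch over plain [][]string data. The type byTime and the helper sortByTime are hypothetical. The sketch compares timestamp strings directly, which is chronological only because the TimeLayout format is fixed-width with the most significant unit first; the package itself may parse the times instead.

```go
package main

import (
	"fmt"
	"sort"
)

// byTime is a hypothetical sorter over records, keyed on the timestamp
// string found at timeIndex.
type byTime struct {
	recs      [][]string
	timeIndex int
}

func (b byTime) Len() int      { return len(b.recs) }
func (b byTime) Swap(i, j int) { b.recs[i], b.recs[j] = b.recs[j], b.recs[i] }

// Less compares the timestamp strings lexicographically. This assumes the
// fixed-width 2006-01-02T15:04:05 layout.
func (b byTime) Less(i, j int) bool {
	return b.recs[i][b.timeIndex] < b.recs[j][b.timeIndex]
}

// sortByTime sorts recs in place in ascending timestamp order.
func sortByTime(recs [][]string, timeIndex int) {
	sort.Sort(byTime{recs: recs, timeIndex: timeIndex})
}

func main() {
	recs := [][]string{
		{"477307902", "2017-12-01T00:05:00"},
		{"477307900", "2017-12-01T00:00:03"},
	}
	sortByTime(recs, 1)
	for _, rec := range recs {
		fmt.Println(rec[1], rec[0])
	}
}
```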
type Cluster ¶
type Cluster struct {
// contains filtered or unexported fields
}
Cluster is an abstraction for a []*Record. The intent is that the Records in a Cluster are vessels that share the same geohash.
type ClusterMap ¶
ClusterMap is an abstraction for a map[geohash]*Cluster.
type Field ¶
type Field string
Field is an abstraction for string values that are read from and written to AIS Records.
type Generator ¶
Generator is the interface that is implemented to create a new Field from the index values of existing Fields in a Record. The receiver for Generator should be a pointer in order to avoid creating a copy of the Record when Generate is called millions of times iterating over a large RecordSet. Concrete implementations of the Generator interface are required arguments to RecordSet.AppendField(...).
type Geohasher ¶
type Geohasher RecordSet
Geohasher is the base type for implementing the Generator interface to append a github.com/mmcloughlin/geohash to each Record in the RecordSet. Pass NewGeohasher(rs *RecordSet) as the gen argument of RecordSet.AppendField to add a geohash to a RecordSet.
func NewGeohasher ¶
NewGeohasher returns a pointer to a new Geohasher.
func (*Geohasher) Generate ¶
Generate implements the Generator interface to create a geohash Field. The returned geohash is accurate to 22 bits of precision, which corresponds to about 0.1 degree differences in latitude and longitude. The index values for the variadic function on a *Geohasher must be the index of "LAT" and "LON" in the rec. Field will come back nil for any non-nil error returned.
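To make the 22-bit precision concrete, the following is a hand-rolled sketch of integer geohash encoding by repeated bisection. It is not the library the package uses; encodeInt is a hypothetical function written only for illustration. With 22 bits, 11 bits refine longitude (cells about 0.18 degrees wide) and 11 refine latitude (cells about 0.09 degrees tall).

```go
package main

import "fmt"

// encodeInt is a hypothetical integer geohash encoder. It interleaves
// longitude and latitude bits, starting with longitude, by halving the
// remaining interval at each step.
func encodeInt(lat, lon float64, bits uint) uint64 {
	latLo, latHi := -90.0, 90.0
	lonLo, lonHi := -180.0, 180.0
	var hash uint64
	for i := uint(0); i < bits; i++ {
		hash <<= 1
		if i%2 == 0 { // even steps refine longitude
			mid := (lonLo + lonHi) / 2
			if lon >= mid {
				hash |= 1
				lonLo = mid
			} else {
				lonHi = mid
			}
		} else { // odd steps refine latitude
			mid := (latLo + latHi) / 2
			if lat >= mid {
				hash |= 1
				latLo = mid
			} else {
				latHi = mid
			}
		}
	}
	return hash
}

func main() {
	h := encodeInt(36.90512, -76.32652, 22)
	fmt.Printf("22-bit geohash: %#x\n", h)
}
```

Two positions that fall in the same cell produce the same integer, which is what makes the geohash usable as a clustering key.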
type HeaderMap ¶
HeaderMap is the returned map value for ContainsMulti. See the distance example for using the HeaderMap.
type Headers ¶
type Headers struct {
	// Fields is an encapsulated []string. It is initialized from the first
	// non-comment line of an AIS .csv file when ais.OpenRecordSet(filename string)
	// is called.
	Fields []string
}
Headers are the field names for AIS data elements in a Record.
func (Headers) Contains ¶
Contains returns the index of a specific header. This provides a nice syntax ais.Headers().Contains("LAT") to ensure an ais.Record contains a specific field. If the Headers do not contain the requested field ok is false.
func (Headers) ContainsMulti ¶
ContainsMulti returns a map[string]int where the map keys are the field names and the int values are the index positions of the various fields in the Headers set. If there is an error determining an index position for any field then idxMap returns nil and ok is false. Users should always check for !ok and handle accordingly.
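A lookup of this kind can be sketched as follows. The type entry and the function containsMulti are hypothetical; the Idx and Present field names are borrowed from how the examples and the ValueFrom description use the returned values.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for the value type held in the map
// returned by ContainsMulti.
type entry struct {
	Idx     int
	Present bool
}

// containsMulti looks up each requested name in fields. It returns nil and
// false as soon as any name is missing.
func containsMulti(fields []string, names ...string) (map[string]entry, bool) {
	pos := make(map[string]int, len(fields))
	for i, f := range fields {
		pos[f] = i
	}
	out := make(map[string]entry, len(names))
	for _, n := range names {
		i, ok := pos[n]
		if !ok {
			return nil, false
		}
		out[n] = entry{Idx: i, Present: true}
	}
	return out, true
}

func main() {
	fields := []string{"MMSI", "BaseDateTime", "LAT", "LON"}
	idxMap, ok := containsMulti(fields, "LAT", "LON")
	if !ok {
		panic("missing one or more required headers")
	}
	fmt.Println("LAT at", idxMap["LAT"].Idx, "LON at", idxMap["LON"].Idx)
}
```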
type Interactions ¶
type Interactions struct {
	RecordHeaders Headers // for the Records that will be used to create interactions
	OutputHeaders Headers // for an output RecordSet that may be written from the 2-ship interactions
	// contains filtered or unexported fields
}
Interactions is an abstraction for two-vessel interactions. It requires a set of Headers that correspond to the Record slices being compared and it requires a set of Headers for the output. The default for OutputHeaders is the const InteractionFields with a nil dictionary. The data held by interactions is a map[hash]*RecordPair. This guarantees a non-duplicative set of interactions in the output.
func NewInteractions ¶
func NewInteractions(h Headers) (*Interactions, error)
NewInteractions creates a new set of interactions. It requires a set of Headers from the RecordSet that will be searched for Interactions. These Headers are required to contain "MMSI", "BaseDateTime", "LAT", and "LON" in order to uniquely identify an interaction. The returned *Interactions has its output file Headers set to ais.InteractionHeaders by default.
func (*Interactions) AddCluster ¶
func (inter *Interactions) AddCluster(c *Cluster) error
AddCluster adds all of the interactions in a given cluster to the set of Interactions.
func (*Interactions) Len ¶
func (inter *Interactions) Len() int
Len returns the number of Interactions in the set.
func (*Interactions) Save ¶
func (inter *Interactions) Save(filename string) error
Save the interactions to a CSV file.
type Matching ¶
Matching provides an interface to pass into the Subset and SubsetLimit functions of a RecordSet.
type Record ¶
type Record []string
Record wraps the return value from a csv.Reader because many publicly available data sources provide AIS records in large csv files. The Record type and its associated methods allow clients of the package to deal directly with the abstraction of individual AIS records and handle the csv file read/write operations internally.
func (Record) Distance ¶
Distance calculates the haversine distance between two AIS records that contain a latitude and longitude measurement identified by their index number in the Record slice.
Example ¶
Example demonstrates a simple use of the Distance function.
package main

import (
	"fmt"
	"strings"

	"github.com/FATHOM5/ais"
)

func main() {
	h := strings.Split("MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselName,IMO,CallSign,VesselType,Status,Length,Width,Draft,Cargo", ",")
	headers := ais.Headers{Fields: h}
	latIndex, _ := headers.Contains("LAT")
	lonIndex, _ := headers.Contains("LON")

	data1 := strings.Split("477307900,2017-12-01T00:00:03,36.90512,-76.32652,0.0,131.0,352.0,FIRST,IMO9739666,VRPJ6,1004,moored,337,,,", ",")
	data2 := strings.Split("477307902,2017-12-01T00:00:03,36.91512,-76.22652,2.3,311.0,182.0,SECOND,IMO9739800,XHYSF,,underway using engines,337,,,", ",")
	rec1 := ais.Record(data1)
	rec2 := ais.Record(data2)

	nm, err := rec1.Distance(rec2, latIndex, lonIndex)
	if err != nil {
		panic(err)
	}
	fmt.Printf("The ships are %.1fnm away from one another.\n", nm)
}
Output: The ships are 4.8nm away from one another.
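For reference, the haversine formula itself can be sketched with only the math package. The function haversineNM and the choice of a 6371 km mean Earth radius are assumptions made for this illustration; the package may use a different radius or rounding.

```go
package main

import (
	"fmt"
	"math"
)

// earthRadiusNM is the mean Earth radius (6371 km) expressed in nautical
// miles (1 nm = 1.852 km). The radius used by the package is not documented.
const earthRadiusNM = 6371.0 / 1.852

// haversineNM returns the great-circle distance in nautical miles between
// two points given in decimal degrees.
func haversineNM(lat1, lon1, lat2, lon2 float64) float64 {
	toRad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := toRad(lat2 - lat1)
	dLon := toRad(lon2 - lon1)
	a := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(toRad(lat1))*math.Cos(toRad(lat2))*math.Sin(dLon/2)*math.Sin(dLon/2)
	return 2 * earthRadiusNM * math.Asin(math.Sqrt(a))
}

func main() {
	nm := haversineNM(36.90512, -76.32652, 36.91512, -76.22652)
	fmt.Printf("The ships are %.1fnm away from one another.\n", nm)
}
```

For the two positions used in the example, this sketch lands at roughly 4.8 nm, in line with the documented output.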
func (Record) ParseFloat ¶
ParseFloat wraps strconv.ParseFloat with a method to return a float64 from the index value of a field in the AIS Record. Useful for getting a LAT, LON, SOG or other numeric value from an ais.Record.
func (Record) ParseInt ¶
ParseInt wraps strconv.ParseInt with a method to return an int64 from the index value of a field in the AIS Record. Useful for getting int values from the Records such as MMSI and IMO number.
func (Record) ParseTime ¶
ParseTime wraps time.Parse with a method to return a time.Time from the index value of a field in the AIS Record. Useful for converting the BaseDateTime from the Record. NOTE: FUTURE VERSIONS OF THIS METHOD SHOULD NOT RELY ON A PACKAGE CONSTANT FOR THE LAYOUT FIELD. THIS FIELD SHOULD BE INFERRED FROM A LIST OF FORMATS SEEN IN COMMON DATASOURCES.
Example ¶
package main

import (
	"fmt"
	"strings"

	"github.com/FATHOM5/ais"
)

func main() {
	h := strings.Split("MMSI,BaseDateTime,LAT,LON,SOG,COG,Heading,VesselName,IMO,CallSign,VesselType,Status,Length,Width,Draft,Cargo", ",")
	data := strings.Split("477307900,2017-12-01T00:00:03,36.90512,-76.32652,0.0,131.0,352.0,FIRST,IMO9739666,VRPJ6,1004,moored,337,,,", ",")

	headers := ais.Headers{Fields: h}
	rec := ais.Record(data)

	timeIndex, _ := headers.Contains("BaseDateTime")

	t, err := rec.ParseTime(timeIndex)
	if err != nil {
		panic(err)
	}
	fmt.Printf("The record timestamp is at %s\n", t.Format(ais.TimeLayout))
}
Output: The record timestamp is at 2017-12-01T00:00:03
func (*Record) Value ¶
Value returns the record value for the []string index. For out-of-bounds idx arguments or other errors, Value returns an empty string for val and false for ok.
func (*Record) ValueFrom ¶
ValueFrom returns the record value described in the HeaderMap. The argument is a HeaderMap and normal usage has the nice syntax rec.ValueFrom(idxMap["LAT"]), where idxMap is the returned value from ContainsMulti(...). Returns an empty string and false when hm.Present == false.
type RecordPair ¶
type RecordPair struct {
// contains filtered or unexported fields
}
RecordPair holds pointers to two Records.
type RecordSet ¶
type RecordSet struct {
// contains filtered or unexported fields
}
RecordSet is the high-level interface for dealing with comma-separated value files of AIS records. A RecordSet is not usually constructed from the struct. Use NewRecordSet() to create an empty set, or OpenRecordSet(filename) to read a file on disk.
func NewRecordSet ¶
func NewRecordSet() *RecordSet
NewRecordSet returns a *RecordSet that has an in-memory data buffer for the underlying Records that may be written to it. Additionally, the new *RecordSet is configured so that the encoding/csv objects it uses internally have LazyQuotes = true and Comment = '#'.
func OpenRecordSet ¶
OpenRecordSet takes the filename of an AIS data file as its input. It returns a pointer to the RecordSet and a nil error upon successfully validating that the file can be read by an encoding/csv Reader. It returns a nil RecordSet on any non-nil error.
func (*RecordSet) AppendField ¶
func (rs *RecordSet) AppendField(newField string, requiredHeaders []string, gen Generator) (*RecordSet, error)
AppendField calls the Generator on each Record in the RecordSet and adds the resulting Field to each record under the newField provided as the argument. The requiredHeaders argument is a []string of the required Headers that must be present in the RecordSet in order for Generator to be successful. If no errors are encountered it returns a pointer to a new *RecordSet and a nil value for error. If there is an error it will return a nil value for the *RecordSet and an error.
func (*RecordSet) Close ¶
Close calls close on the unexported RecordSet data handle. It is the responsibility of the RecordSet user to call Close. This is usually accomplished by a call to
defer rs.Close()
immediately after creating a NewRecordSet.
func (*RecordSet) Flush ¶
Flush empties the buffer in the underlying csv.Writer held by the RecordSet and returns any error that has occurred in a previous write or flush.
func (*RecordSet) Read ¶
Read calls Read() on the csv.Reader held by the RecordSet and returns a Record. The idiomatic way to iterate over a recordset comes from the same idiom to read a file using encoding/csv.
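The encoding/csv idiom referred to here is a loop that reads until io.EOF. The sketch below shows it against a plain csv.Reader configured with LazyQuotes and Comment as NewRecordSet is documented to do; readAll and newReader are hypothetical helpers, and a RecordSet would be iterated the same way through its own Read method.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// newReader builds a csv.Reader over in-memory data with the settings the
// documentation lists for NewRecordSet.
func newReader(data string) *csv.Reader {
	r := csv.NewReader(strings.NewReader(data))
	r.LazyQuotes = true
	r.Comment = '#'
	return r
}

// readAll drains the reader one record at a time, stopping at io.EOF.
func readAll(r *csv.Reader) ([][]string, error) {
	var recs [][]string
	for {
		rec, err := r.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		recs = append(recs, rec)
	}
	return recs, nil
}

func main() {
	data := "# comment line\n" +
		"MMSI,BaseDateTime,LAT,LON\n" +
		"477307900,2017-12-01T00:00:03,36.90512,-76.32652\n"
	recs, err := readAll(newReader(data))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(recs), "rows read, header row included")
}
```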
func (*RecordSet) SetHeaders ¶
SetHeaders provides the expected interface for setting the Headers of a RecordSet.
func (*RecordSet) SortByTime ¶
SortByTime returns a pointer to a new RecordSet sorted in ascending order by BaseDateTime.
func (*RecordSet) Stash ¶
Stash allows a client to take a Record that has been previously retrieved through Read() and ensure the next call to Read() returns this same Record.
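The push-back behavior can be sketched as a thin wrapper around a csv.Reader that holds at most one stashed record. The type stashReader and the constructor newStashReader are hypothetical and only illustrate the pattern; how the package stores the stashed Record is not documented here.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// stashReader is a hypothetical wrapper that can push one record back so
// the next Read returns it again.
type stashReader struct {
	r     *csv.Reader
	stash []string
}

// newStashReader builds a stashReader over in-memory csv data.
func newStashReader(data string) *stashReader {
	return &stashReader{r: csv.NewReader(strings.NewReader(data))}
}

// Read returns the stashed record if there is one, otherwise the next
// record from the underlying reader.
func (s *stashReader) Read() ([]string, error) {
	if s.stash != nil {
		rec := s.stash
		s.stash = nil
		return rec, nil
	}
	return s.r.Read()
}

// Stash saves rec so that the next call to Read returns it.
func (s *stashReader) Stash(rec []string) { s.stash = rec }

func main() {
	sr := newStashReader("477307900,FIRST\n477307902,SECOND\n")
	first, err := sr.Read()
	if err != nil {
		panic(err)
	}
	sr.Stash(first) // peeked at the record; put it back
	again, _ := sr.Read()
	fmt.Println(again[1])
}
```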
func (*RecordSet) Subset ¶
Subset returns a pointer to a new *RecordSet that contains all of the records that return true from calls to Match(*Record) (bool, error) on the provided argument m that implements the Matching interface. Returns nil for the *RecordSet when error is non-nil.
Example ¶
package main

import (
	"fmt"
	"time"

	"github.com/FATHOM5/ais"
)

type subsetOneDay struct {
	rs        *ais.RecordSet
	d1        time.Time
	timeIndex int
}

func (sod *subsetOneDay) Match(rec *ais.Record) (bool, error) {
	d2, err := time.Parse(ais.TimeLayout, (*rec)[sod.timeIndex])
	if err != nil {
		return false, fmt.Errorf("subsetOneDay: %v", err)
	}
	d2 = d2.Truncate(24 * time.Hour)
	return sod.d1.Equal(d2), nil
}

func main() {
	rs, _ := ais.OpenRecordSet("testdata/ten.csv")
	defer rs.Close()

	// Implement a concrete type of subsetOneDay to return records
	// from 25 Dec 2017.
	timeIndex, ok := rs.Headers().Contains("BaseDateTime")
	if !ok {
		panic("recordset does not contain the header BaseDateTime")
	}
	targetDate, _ := time.Parse("2006-01-02", "2017-12-25")
	sod := &subsetOneDay{
		rs:        rs,
		d1:        targetDate,
		timeIndex: timeIndex,
	}

	matches, _ := rs.Subset(sod)
	//matches.Save("newSet.csv")

	subsetRec, _ := matches.Read()
	subsetDate := (*subsetRec)[timeIndex]
	date, _ := time.Parse(ais.TimeLayout, subsetDate)
	fmt.Printf("The first record in the subset has BaseDateTime %v\n", date.Format("2006-01-02"))
}
Output: The first record in the subset has BaseDateTime 2017-12-25
func (*RecordSet) SubsetLimit ¶
SubsetLimit returns a pointer to a new RecordSet with the first n records that return true from calls to Match(*Record) (bool, error) on the provided argument m that implements the Matching interface. Returns nil for the *RecordSet when error is non-nil. For n values less than zero, SubsetLimit will return all matches in the set.
SubsetLimit also takes a bool argument, multipass, that resets the read pointer in the RecordSet to the beginning of the data when set to true. This has two important impacts. First, it allows the same rs receiver to be used multiple times in a row because the read pointer is reset each time after hitting EOF. Second, it has a significant performance penalty when dealing with a RecordSet of about one million or more records. When the convenience of avoiding additional boilerplate code outweighs the performance impact of setting multipass to true, it is quite helpful. In situations where the performance impact is causing an issue, use rs.Close() and then OpenRecordSet(filename) to get a fresh copy of the data.
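The limit semantics (first n matches, with a negative n meaning no limit) can be sketched over an in-memory slice. The matcher interface, subsetLimit, and mmsiIs are hypothetical names for this illustration. Because the data is already in memory, the sketch has no read pointer and does not model multipass, and it does not return ErrEmptySet.

```go
package main

import "fmt"

// matcher is a hypothetical stand-in for the Matching interface, operating
// on plain []string records.
type matcher interface {
	Match(rec []string) (bool, error)
}

// subsetLimit returns the first n records accepted by m. A negative n
// means every match is returned.
func subsetLimit(recs [][]string, m matcher, n int) ([][]string, error) {
	var out [][]string
	for _, rec := range recs {
		if n >= 0 && len(out) >= n {
			break
		}
		ok, err := m.Match(rec)
		if err != nil {
			return nil, err
		}
		if ok {
			out = append(out, rec)
		}
	}
	return out, nil
}

// mmsiIs matches records whose first field equals the wanted MMSI.
type mmsiIs string

func (want mmsiIs) Match(rec []string) (bool, error) {
	if len(rec) == 0 {
		return false, fmt.Errorf("mmsiIs: empty record")
	}
	return rec[0] == string(want), nil
}

func main() {
	recs := [][]string{{"477307900"}, {"477307902"}, {"477307900"}, {"477307900"}}
	firstTwo, err := subsetLimit(recs, mmsiIs("477307900"), 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(firstTwo), "records in the limited subset")
}
```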
func (*RecordSet) UniqueVessels ¶
UniqueVessels returns a VesselSet, map[Vessel]int, that includes a unique key for each Vessel in the RecordSet. The value of each key is the number of Records for that Vessel in the data.
func (*RecordSet) UniqueVesselsMulti ¶
UniqueVesselsMulti provides an option to control whether the RecordSet read pointer is returned to the top of the file. Using this option has a significant performance cost and is not recommended for any RecordSet with more than one million records. However, setting multipass to true is valuable when the returned VesselSet is going to be used for additional queries on the same receiver. For example, ranging over the returned VesselSet to create a Subset of data for each ship requires reusing the rs receiver in most cases.
type Vessel ¶
Vessel is a struct for the identifying information about a specific ship in an AIS dataset. NOTE: REFINEMENT OF PACKAGE AIS WILL INCORPORATE MORE OF THE SHIP IDENTIFYING DATA COMMENTED OUT IN THIS MINIMALLY VIABLE IMPLEMENTATION.
type VesselSet ¶
VesselSet is a set of unique vessels usually obtained by the return value of RecordSet.UniqueVessels(). For each Record of a Vessel in the set the int value of the VesselSet is incremented.
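The counting behavior amounts to a map keyed on a comparable struct. In the sketch below, the vessel struct and its MMSI and Name fields are hypothetical, since the fields of Vessel are not listed in this documentation, and uniqueVessels works on plain [][]string data.

```go
package main

import "fmt"

// vessel is a hypothetical key type; the real Vessel fields are not shown
// in this documentation.
type vessel struct {
	MMSI string
	Name string
}

// uniqueVessels counts how many records belong to each vessel.
func uniqueVessels(recs [][]string, mmsiIndex, nameIndex int) map[vessel]int {
	set := make(map[vessel]int)
	for _, rec := range recs {
		key := vessel{MMSI: rec[mmsiIndex], Name: rec[nameIndex]}
		set[key]++ // a missing key starts at zero
	}
	return set
}

func main() {
	recs := [][]string{
		{"477307900", "FIRST"},
		{"477307902", "SECOND"},
		{"477307900", "FIRST"},
	}
	for v, n := range uniqueVessels(recs, 0, 1) {
		fmt.Printf("%s (%s): %d records\n", v.Name, v.MMSI, n)
	}
}
```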
type Window ¶
Window is used to create a convolution algorithm that slides down a RecordSet and performs analysis on Records that are within a time window.
func NewWindow ¶
NewWindow returns a *Window with the left marker set to the time in the next record read from the RecordSet. The Window width will be set from the argument provided and the right marker will be derived from left and width. When a Window is created right after opening a RecordSet, the Window will be set to the first Record in the set, but that first record will still be available to the client's first call to rs.Read(). For any non-nil error NewWindow returns nil and the error.
func (*Window) FindClusters ¶
func (win *Window) FindClusters(geohashIndex int) ClusterMap
FindClusters returns a ClusterMap that groups Records in the window into common Clusters that share the same geohash. It requires that the RecordSet Window it is operating on has a 'Geohash' field stored as a Uint64 with the proper prefix for the hash (i.e. 0x for hex representation).
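Grouping by a prefixed hash string can be sketched with strconv.ParseUint, which infers the base from a 0x prefix when the base argument is 0. The function clusterByGeohash is hypothetical and returns an error for unparsable values, which differs from the FindClusters signature shown above.

```go
package main

import (
	"fmt"
	"strconv"
)

// clusterByGeohash is a hypothetical grouping helper. It parses the field
// at geohashIndex as a uint64 and collects records that share a value.
func clusterByGeohash(recs [][]string, geohashIndex int) (map[uint64][][]string, error) {
	clusters := make(map[uint64][][]string)
	for _, rec := range recs {
		// Base 0 lets ParseUint honor the "0x" prefix.
		gh, err := strconv.ParseUint(rec[geohashIndex], 0, 64)
		if err != nil {
			return nil, fmt.Errorf("clusterByGeohash: %v", err)
		}
		clusters[gh] = append(clusters[gh], rec)
	}
	return clusters, nil
}

func main() {
	recs := [][]string{
		{"477307900", "0x1a"},
		{"477307902", "0x1a"},
		{"477307999", "0x2b"},
	}
	clusters, err := clusterByGeohash(recs, 1)
	if err != nil {
		panic(err)
	}
	for gh, members := range clusters {
		fmt.Printf("geohash %#x: %d vessel records\n", gh, len(members))
	}
}
```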
func (*Window) RecordInWindow ¶
RecordInWindow returns true if the record is in the Window. Errors are possible from parsing the BaseDateTime field of the Record.
func (*Window) SetIndex ¶
SetIndex provides the integer index of the BaseDateTime field in the Records stored in the Window.
func (*Window) Slide ¶
Slide moves the window down by the time provided in the argument dur. Slide also removes any data from the Window that would no longer return true from InWindow for the new left and right markers after the Slide.
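A sliding time window of this kind can be sketched as follows. The window type, its fields, and the helpers are hypothetical, and the sketch assumes the left marker is inclusive and the right marker exclusive, which this documentation does not specify.

```go
package main

import (
	"fmt"
	"time"
)

const timeLayout = "2006-01-02T15:04:05" // mirrors ais.TimeLayout

// step is the slide distance used in the demonstration below.
const step = 4 * time.Minute

// window is a hypothetical sliding time window over plain records.
type window struct {
	left      time.Time
	width     time.Duration
	timeIndex int
	recs      [][]string
}

// slide advances the window by dur and drops records that fall outside
// the new [left, left+width) interval (interval bounds are an assumption).
func (w *window) slide(dur time.Duration) error {
	newLeft := w.left.Add(dur)
	newRight := newLeft.Add(w.width)
	kept := make([][]string, 0, len(w.recs))
	for _, rec := range w.recs {
		ts, err := time.Parse(timeLayout, rec[w.timeIndex])
		if err != nil {
			return fmt.Errorf("slide: %v", err)
		}
		if !ts.Before(newLeft) && ts.Before(newRight) {
			kept = append(kept, rec)
		}
	}
	w.left, w.recs = newLeft, kept
	return nil
}

// newDemoWindow returns a ten-minute window holding three records.
func newDemoWindow() *window {
	return &window{
		left:      time.Date(2017, time.December, 1, 0, 0, 0, 0, time.UTC),
		width:     10 * time.Minute,
		timeIndex: 1,
		recs: [][]string{
			{"477307900", "2017-12-01T00:00:03"},
			{"477307902", "2017-12-01T00:05:00"},
			{"477307999", "2017-12-01T00:09:00"},
		},
	}
}

func main() {
	w := newDemoWindow()
	if err := w.slide(step); err != nil {
		panic(err)
	}
	fmt.Println(len(w.recs), "records remain after the slide")
}
```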