Documentation ¶
Overview ¶
Package gff provides types to read and write version 2 General Feature Format files according to the Sanger Institute specification.
The specification can be found at http://www.sanger.ac.uk/resources/software/gff/spec.html.
Constants ¶
const Astronomical = "2006-1-02"
"Astronomical" time format is the format specified in the GFF specification
const Version = 2
Version is the GFF version that is read and written.
Variables ¶
var (
	ErrBadFeature     = Error{"gff: feature start not less than feature end"}
	ErrBadStrandField = Error{"gff: bad strand field"}
	ErrBadStrand      = Error{"gff: invalid strand"}
	ErrClosed         = Error{"gff: writer closed"}
	ErrBadTag         = Error{"gff: invalid tag"}
	ErrCannotHeader   = Error{"gff: cannot write header: data written"}
	ErrNotHandled     = Error{"gff: type not handled"}
	ErrFieldMissing   = Error{"gff: missing fields"}
	ErrBadMoltype     = Error{"gff: invalid moltype"}
	ErrEmptyMetaLine  = Error{"gff: empty comment metaline"}
	ErrBadMetaLine    = Error{"gff: incomplete metaline"}
	ErrBadSequence    = Error{"gff: corrupt metasequence"}
)
Functions ¶
This section is empty.
Types ¶
type Attribute ¶
type Attribute struct {
Tag, Value string
}
An Attribute represents a GFF2 attribute field record. Attribute field records must have a tag value structure following the syntax used within objects in a .ace file, flattened onto one line by semicolon separators. Tags must be standard identifiers ([A-Za-z][A-Za-z0-9_]*). Free text values must be quoted with double quotes.
Note: all non-printing characters in free text value strings (e.g. newlines, tabs, control characters, etc) must be explicitly represented by their C (UNIX) style backslash-escaped representation.
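The syntax rules above can be sketched as follows. This is an illustration only: the `attribute` type and the `formatAttributes` helper are hypothetical stand-ins, not the package's own code. The sketch assumes that every value is free text, that records are joined with "; ", and that strconv.Quote is an acceptable approximation of C-style escaping (it agrees with C for common escapes such as \n, \t, \" and \\).

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// attribute mirrors the Tag/Value pair of gff.Attribute for illustration.
type attribute struct {
	Tag, Value string
}

// validTag matches the standard identifier syntax required for tags.
var validTag = regexp.MustCompile(`^[A-Za-z][A-Za-z0-9_]*$`)

// formatAttributes is a hypothetical helper that flattens attributes onto
// one line, quoting each value and escaping non-printing characters.
func formatAttributes(attrs []attribute) (string, error) {
	parts := make([]string, 0, len(attrs))
	for _, a := range attrs {
		if !validTag.MatchString(a.Tag) {
			return "", fmt.Errorf("invalid tag %q", a.Tag)
		}
		parts = append(parts, a.Tag+" "+strconv.Quote(a.Value))
	}
	return strings.Join(parts, "; "), nil
}

func main() {
	line, err := formatAttributes([]attribute{
		{Tag: "Note", Value: "line one\nline two"},
		{Tag: "Gene", Value: "abc-1"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(line)
}
```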
type Attributes ¶
type Attributes []Attribute
func (Attributes) Get ¶
func (a Attributes) Get(tag string) string
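The page gives no description for Get. The sketch below shows one plausible reading of the signature, under the assumption that Get returns the value of the first attribute whose tag matches and an empty string when no tag matches. The `attribute` and `attributes` types are local stand-ins, and the actual behavior should be confirmed against the package source.

```go
package main

import "fmt"

// attribute and attributes are local stand-ins for gff.Attribute
// and gff.Attributes.
type attribute struct {
	Tag, Value string
}

type attributes []attribute

// get returns the value of the first attribute with the given tag,
// or "" if there is none. This behavior is an assumption.
func (a attributes) get(tag string) string {
	for _, tv := range a {
		if tv.Tag == tag {
			return tv.Value
		}
	}
	return ""
}

func main() {
	a := attributes{{Tag: "Gene", Value: "abc-1"}, {Tag: "Note", Value: "example"}}
	fmt.Printf("%q %q\n", a.get("Note"), a.get("Missing"))
}
```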
type Feature ¶
type Feature struct {
	// The name of the sequence. Having an explicit sequence name allows
	// a feature file to be prepared for a data set of multiple sequences.
	// Normally the seqname will be the identifier of the sequence in an
	// accompanying fasta format file. An alternative is that SeqName is
	// the identifier for a sequence in a public database, such as an
	// EMBL/Genbank/DDBJ accession number. Which is the case, and which
	// file or database to use, should be explained in accompanying
	// information.
	SeqName string

	// The source of this feature. This field will normally be used to
	// indicate the program making the prediction, or if it comes from
	// public database annotation, or is experimentally verified, etc.
	Source string

	// The feature type name.
	Feature string

	// FeatStart must be less than FeatEnd and non-negative - GFF indexing
	// is one-based and GFF features cannot have a zero length or a negative
	// position. gff.Feature indexing is, to be consistent with the rest of
	// the library, zero-based half-open. Translation between zero- and one-
	// based indexing is handled by the gff package.
	FeatStart, FeatEnd int

	// A floating point value representing the score for the feature. A nil
	// value indicates the score is not available.
	FeatScore *float64

	// The strand of the feature - one of seq.Plus, seq.Minus or seq.None.
	// seq.None should be used when strand is not relevant, e.g. for
	// dinucleotide repeats. This field should be set to seq.None for RNA
	// and protein features.
	FeatStrand seq.Strand

	// FeatFrame indicates the frame of the feature and takes the values
	// Frame0, Frame1, Frame2 or NoFrame. Frame0 indicates that the
	// specified region is in frame. Frame1 indicates that there is one
	// extra base, and Frame2 means that the third base of the region
	// is the first base of a codon. If the FeatStrand is seq.Minus, then
	// the first base of the region is the value of FeatEnd, because the
	// corresponding coding region will run from FeatEnd to FeatStart on
	// the reverse strand. As with FeatStrand, if the frame is not relevant
	// then set FeatFrame to NoFrame. This field should be set to NoFrame
	// for RNA and protein features.
	FeatFrame Frame

	// FeatAttributes represents a collection of GFF2 attributes.
	FeatAttributes Attributes

	// Free comments.
	Comments string
}
A Feature represents a standard GFF2 feature.
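The FeatStart/FeatEnd comment says the package translates between zero-based half-open coordinates and the one-based coordinates of a GFF line. The sketch below shows that arithmetic in isolation. The helpers `toGFF` and `fromGFF` and the `errBadFeature` value are hypothetical names for this illustration and do not reproduce the package's internals.

```go
package main

import (
	"errors"
	"fmt"
)

// errBadFeature is a local stand-in for the package's ErrBadFeature.
var errBadFeature = errors.New("feature start not less than feature end")

// toGFF converts a zero-based half-open interval [start, end) into the
// one-based closed interval written in a GFF line. It rejects negative
// positions and zero-length features.
func toGFF(start, end int) (int, int, error) {
	if start < 0 || start >= end {
		return 0, 0, errBadFeature
	}
	return start + 1, end, nil
}

// fromGFF converts a one-based closed interval read from a GFF line back
// into a zero-based half-open interval.
func fromGFF(start, end int) (int, int) {
	return start - 1, end
}

func main() {
	s, e, err := toGFF(0, 10)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("GFF coordinates:", s, e)
	fmt.Println(fromGFF(s, e))
}
```

Only the start coordinate shifts: a half-open end and a closed one-based end have the same numeric value.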
func (*Feature) Description ¶
type Reader ¶
type Reader struct {
	TimeFormat string // Required for parsing date fields. Defaults to astronomical format.

	Metadata
	// contains filtered or unexported fields
}
A Reader parses GFFv2 formatted data from an io.Reader and returns feat.Features.
type Writer ¶
type Writer struct {
	TimeFormat string
	Precision  int
	Width      int
	// contains filtered or unexported fields
}
A Writer outputs features and sequences into GFFv2 format.
func NewWriter ¶
NewWriter returns a new GFF format writer using w. When header is true, a version header will be written to the GFF.
func (*Writer) Write ¶
Write writes a single feature and returns the number of bytes written and any error. gff.Features are written as a canonical GFF line, seq.Sequences are written as inline sequence in GFF format (note that only sequences of feat.Moltype DNA, RNA and Protein are supported). gff.Sequences are not handled as they have a zero length. All other feat.Features are written as sequence region metadata lines.
func (*Writer) WriteComment ¶
WriteComment writes a comment line to a GFF file.
func (*Writer) WriteMetaData ¶
WriteMetaData writes a meta data line to a GFF file. The type of metadata line depends on the type of d: strings and byte slices are written verbatim, an int is interpreted as a version number and can only be written before any other data, feat.Moltype and gff.Sequence types are written as sequence type lines, gff.Features and gff.Regions are written as sequence regions, sequences are written in GFF format and time.Time values are written as a date line. All other types return ErrNotHandled.