Documentation ¶
Overview ¶
Package variants annotates mutations relative to a reference sequence for every record in a multiple sequence alignment in FASTA format.
Index ¶
- func AggregateWriteVariants(w io.Writer, start, end int, appendSNP bool, threshold float64, refID string, ...)
- func FormatVariant(v Variant, appendSNP bool) (string, error)
- func GetMSAOffsets(refseq []byte) ([]int, []int)
- func Variants(msaIn io.Reader, stdin bool, refID string, annoIn io.Reader, annoSuffix string, ...) error
- func WriteVariants(w io.Writer, start, end int, firstmissing bool, appendSNP bool, refID string, ...)
- type AnnoStructs
- type Region
- func CDSRegion2fromGFF(fs []gff.Feature, refSeqDegapped string) (Region, error)
- func CDSRegion2fromGenbank(f genbank.GenbankFeature) (Region, error)
- func RegionsFromGFF(anno gff.GFF, refSeqDegapped string) ([]Region, []int, error)
- func RegionsFromGenbank(gb genbank.Genbank, refLength int) ([]Region, []int, error)
- type Variant
Functions ¶
func AggregateWriteVariants ¶
func AggregateWriteVariants(w io.Writer, start, end int, appendSNP bool, threshold float64, refID string, cVariants chan AnnoStructs, cWriteDone chan bool, cErr chan error)
AggregateWriteVariants aggregates the mutations that are present at a frequency greater than or equal to threshold, and writes their frequencies to file or stdout
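As a rough illustration of thresholded aggregation, the sketch below counts how many records carry each mutation and keeps only those whose frequency is at least the threshold. The helper name `aggregate`, the use of plain strings for mutations, and the placeholder labels `m1` to `m3` are assumptions for illustration; the package's own implementation and output format are not shown in this documentation.

```go
package main

import (
	"fmt"
	"sort"
)

// aggregate is a hypothetical helper, not part of the package. It takes one
// slice of mutation labels per record and returns the mutations whose
// frequency (records carrying it / total records) is >= threshold.
func aggregate(records [][]string, threshold float64) map[string]float64 {
	freqs := make(map[string]float64)
	if len(records) == 0 {
		return freqs
	}
	counts := make(map[string]int)
	for _, muts := range records {
		seen := make(map[string]bool) // count each mutation once per record
		for _, m := range muts {
			if !seen[m] {
				seen[m] = true
				counts[m]++
			}
		}
	}
	for m, c := range counts {
		f := float64(c) / float64(len(records))
		if f >= threshold {
			freqs[m] = f
		}
	}
	return freqs
}

func main() {
	// Placeholder mutation labels; real output strings may look different.
	records := [][]string{
		{"m1", "m2"},
		{"m1"},
		{"m1", "m3"},
		{"m2"},
	}
	freqs := aggregate(records, 0.5)
	names := make([]string, 0, len(freqs))
	for m := range freqs {
		names = append(names, m)
	}
	sort.Strings(names) // map iteration order is random, so sort for stable output
	for _, m := range names {
		fmt.Printf("%s\t%.2f\n", m, freqs[m])
	}
}
```

With a threshold of 0.5, `m1` (3 of 4 records) and `m2` (2 of 4) are kept, while `m3` (1 of 4) is dropped.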
func FormatVariant ¶
func FormatVariant(v Variant, appendSNP bool) (string, error)
FormatVariant returns a string representation of a single mutation; the format varies with the mutation's type (aa/nuc/indel)
func GetMSAOffsets ¶
func GetMSAOffsets(refseq []byte) ([]int, []int)
GetMSAOffsets returns two slices containing coordinate-shifting information that can be used to convert reference coordinates to MSA coordinates, and vice versa
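One way such offsets can work is sketched below. This is not the package's implementation: the helper `msaOffsets`, its 0-based indexing, and the exact meaning of each slice are assumptions chosen for illustration. Here `refToMSA[i]` is the number of gap columns before reference base `i`, and `msaToRef[j]` is the number of gap columns at or before MSA column `j`.

```go
package main

import "fmt"

// msaOffsets is a hypothetical sketch of gap-offset bookkeeping.
// For reference base i (0-based), its MSA column is i + refToMSA[i].
// For MSA column j (0-based), j - msaToRef[j] is the index of the last
// reference base at or before that column.
func msaOffsets(refseq []byte) (refToMSA []int, msaToRef []int) {
	msaToRef = make([]int, len(refseq))
	gaps := 0 // running count of gap columns seen so far
	for j, b := range refseq {
		if b == '-' {
			gaps++
		} else {
			refToMSA = append(refToMSA, gaps)
		}
		msaToRef[j] = gaps
	}
	return refToMSA, msaToRef
}

func main() {
	refToMSA, msaToRef := msaOffsets([]byte("AC--GT-A"))
	fmt.Println(refToMSA)        // [0 0 2 2 3]
	fmt.Println(msaToRef)        // [0 0 1 2 2 2 3 3]
	fmt.Println(2 + refToMSA[2]) // 4: reference base 2 (G) sits in MSA column 4
	fmt.Println(5 - msaToRef[5]) // 3: MSA column 5 (T) is reference base 3
}
```

Storing cumulative gap counts makes each conversion a single lookup plus an addition or subtraction.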
Types ¶
type AnnoStructs ¶
AnnoStructs is for passing groups of Variant structs around with an index which is used to retain input order in the output
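The idea of carrying an index so that output order matches input order can be sketched as follows. The `indexed` type and `inOrder` function are hypothetical stand-ins, since the fields of AnnoStructs are not listed here; the sketch only shows the general pattern of buffering results that arrive early.

```go
package main

import "fmt"

// indexed is a hypothetical stand-in for AnnoStructs: a payload plus the
// position of its record in the input.
type indexed struct {
	idx  int
	text string
}

// inOrder consumes results that may arrive in any order and returns them in
// input order, holding back anything that arrives before its turn.
func inOrder(c <-chan indexed) []string {
	pending := make(map[int]string)
	next := 0 // index of the next result that may be emitted
	var out []string
	for r := range c {
		pending[r.idx] = r.text
		// Flush every consecutive result that is now available.
		for {
			t, ok := pending[next]
			if !ok {
				break
			}
			out = append(out, t)
			delete(pending, next)
			next++
		}
	}
	return out
}

func main() {
	c := make(chan indexed, 3)
	c <- indexed{idx: 2, text: "third"}
	c <- indexed{idx: 0, text: "first"}
	c <- indexed{idx: 1, text: "second"}
	close(c)
	fmt.Println(inOrder(c)) // [first second third]
}
```

This lets worker goroutines finish in any order while the writer still emits records in the order they were read.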
type Region ¶
type Region struct {
	Whichtype   string // only "protein-coding" for now
	Name        string // name of feature, if it has one
	Start       int    // 1-based 5'-most position of region on the forward strand, inclusive
	Stop        int    // 1-based 3'-most position of region on the forward strand, inclusive
	Translation string // amino acid sequence of this region if it is CDS
	Strand      int    // values in the set {-1, +1} only (and "0" for a mixture?!)
	Positions   []int  // all the (1-based, unadjusted) positions in order, on the reverse strand if needs be
}
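The relationship between Start, Stop, Strand and Positions might look like the sketch below for a single contiguous region. The helper `positions` is hypothetical, and reading "in order, on the reverse strand if needs be" as descending order for strand -1 is an interpretation of the field comment, not confirmed behaviour.

```go
package main

import "fmt"

// positions is a hypothetical helper. It lists every 1-based position of a
// contiguous region from start to stop (inclusive): ascending for the
// forward strand (+1), descending for the reverse strand (-1).
func positions(start, stop, strand int) []int {
	if stop < start {
		return nil // empty or invalid region
	}
	out := make([]int, 0, stop-start+1)
	if strand == -1 {
		for p := stop; p >= start; p-- {
			out = append(out, p)
		}
		return out
	}
	for p := start; p <= stop; p++ {
		out = append(out, p)
	}
	return out
}

func main() {
	fmt.Println(positions(10, 15, 1))  // [10 11 12 13 14 15]
	fmt.Println(positions(10, 15, -1)) // [15 14 13 12 11 10]
}
```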
func CDSRegion2fromGFF ¶ added in v1.2.0
func CDSRegion2fromGFF(fs []gff.Feature, refSeqDegapped string) (Region, error)
func CDSRegion2fromGenbank ¶ added in v1.2.0
func CDSRegion2fromGenbank(f genbank.GenbankFeature) (Region, error)
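func RegionsFromGFF ¶ added in v1.2.0
func RegionsFromGFF(anno gff.GFF, refSeqDegapped string) ([]Region, []int, error)
func RegionsFromGenbank ¶ added in v1.2.0
func RegionsFromGenbank(gb genbank.Genbank, refLength int) ([]Region, []int, error)
RegionsFromGenbank parses a GenBank flat-format file of genome annotations to extract the positions of CDS and intergenic regions, so that mutations can be annotated within each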
type Variant ¶
type Variant struct {
	Queryname      string
	RefAl          string
	QueAl          string
	Position       int    // (1-based) genomic location (for an amino acid change, this is the first position of the codon)
	Residue        int    // (1-based) amino acid location
	Changetype     string // one of {nuc,aa,ins,del}
	Feature        string // this should be, for example, the name of the CDS that the thing is in
	Length         int    // for indels
	SNPs           string // if this is an amino acid change, what are the snps
	Representation string
}
A Variant is a struct that contains information about one mutation (nuc, amino acid, indel) between reference and query
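For an amino acid change, Position is the first genomic position of the codon and Residue is the amino acid location. For the simple case of a contiguous CDS on the forward strand, the two are related as in the sketch below. The helper `codonStart` and the example coordinates are hypothetical; reverse-strand or multi-part features would need the region's position list instead.

```go
package main

import "fmt"

// codonStart is a hypothetical helper. For a contiguous forward-strand CDS
// that begins at the 1-based genomic position cdsStart, it returns the
// genomic position of the first base of the codon for the 1-based residue.
func codonStart(cdsStart, residue int) int {
	return cdsStart + 3*(residue-1)
}

func main() {
	fmt.Println(codonStart(266, 1)) // 266: the first codon starts at the CDS start
	fmt.Println(codonStart(266, 3)) // 272: two codons (6 bases) further along
}
```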