Documentation ¶
Overview ¶
Package dna implements a data structure for storage and manipulation of sequences of DNA.
Index ¶
- Variables
- func AllToLower(bases []Base)
- func AllToUpper(bases []Base)
- func AminoAcidToShortString(a AminoAcid) string
- func AminoAcidToString(a AminoAcid) string
- func BaseToRune(base Base) rune
- func BaseToString(b Base) string
- func BasesToString(bases []Base) string
- func CompareSeqsCaseSensitive(alpha []Base, beta []Base) int
- func CompareSeqsCaseSensitiveIgnoreGaps(alpha []Base, beta []Base) int
- func CompareSeqsIgnoreCase(alpha []Base, beta []Base) int
- func CompareSeqsIgnoreCaseAndGaps(alpha []Base, beta []Base) int
- func CompareTwoDSeqsCaseSensitive(alpha [][]Base, beta [][]Base) int
- func CompareTwoDSeqsCaseSensitiveIgnoreGaps(alpha [][]Base, beta [][]Base) int
- func CompareTwoDSeqsIgnoreCase(alpha [][]Base, beta [][]Base) int
- func CompareTwoDSeqsIgnoreCaseAndGaps(alpha [][]Base, beta [][]Base) int
- func Complement(bases []Base)
- func Count(seq []Base) (ACount int, CCount int, GCount int, TCount int, NCount int, aCount int, ...)
- func CountBase(seq []Base, b Base) int
- func CountBaseInterval(seq []Base, b Base, start int, end int) int
- func CountGaps(seq []Base) int
- func CountMask(seq []Base) (unmaskedCount int, maskedCount int, gapCount int)
- func DefineBase(b Base) bool
- func Dist(a []Base, b []Base) (dist int)
- func GCContent(seq []Base) (gcContent float64)
- func IsEqual(c1 Codon, c2 Codon) bool
- func IsLower(b Base) bool
- func IsSeqOfACGT(seq []Base) bool
- func MeltingTemp(seq []Base) float64
- func NonSynonymous(c1 Codon, c2 Codon) bool
- func PeptideToShortString(a []AminoAcid) string
- func PeptideToString(a []AminoAcid) string
- func RangeToLower(bases []Base, start int, end int)
- func RangeToUpper(bases []Base, start int, end int)
- func ReverseComplement(bases []Base)
- func SeqsAreSimilar(a, b []Base, numAllowedMismatch int) bool
- func Synonymous(c1 Codon, c2 Codon) bool
- func TranslateToShortString(b []Base) string
- func TranslateToString(b []Base) string
- type AminoAcid
- type Base
- func ByteSliceToDnaBases(b []byte) []Base
- func ByteToBase(b byte) (Base, error)
- func CodonsToBases(c []Codon) []Base
- func ComplementSingleBase(b Base) Base
- func CreateAllGaps(numGaps int) []Base
- func CreateAllNs(numGaps int) []Base
- func Delete(seq []Base, delStart int, delEnd int) []Base
- func Extract(rec []Base, start int, end int) []Base
- func Insert(seq []Base, insPos int, insSeq []Base) []Base
- func RemoveBase(bases []Base, baseToRemove Base) []Base
- func RemoveGaps(bases []Base) []Base
- func Replace(seq []Base, start int, end int, insSeq []Base) []Base
- func ReverseComplementAndCopy(bases []Base) []Base
- func RuneToBase(r rune) (Base, error)
- func StringToBase(s string) Base
- func StringToBases(s string) []Base
- func StringToBasesForced(s string) []Base
- func ToLower(b Base) Base
- func ToUpper(b Base) Base
- type Codon
Constants ¶
This section is empty.
Variables ¶
var (
ErrLenInputSeqNotDivThree = errors.New("length of input sequence is not a factor of three. remaining bases were ignored")
)
var GeneticCode = map[Codon]AminoAcid{
	{T, G, A}: Stop, {T, A, A}: Stop, {T, A, G}: Stop,
	{G, T, A}: Val, {G, T, C}: Val, {G, T, G}: Val, {G, T, T}: Val,
	{T, A, T}: Tyr, {T, A, C}: Tyr,
	{T, G, G}: Trp,
	{A, C, A}: Thr, {A, C, G}: Thr, {A, C, T}: Thr, {A, C, C}: Thr,
	{T, C, A}: Ser, {T, C, C}: Ser, {T, C, G}: Ser, {T, C, T}: Ser, {A, G, T}: Ser, {A, G, C}: Ser,
	{C, C, C}: Pro, {C, C, T}: Pro, {C, C, A}: Pro, {C, C, G}: Pro,
	{T, T, T}: Phe, {T, T, C}: Phe,
	{A, T, G}: Met,
	{A, A, A}: Lys, {A, A, G}: Lys,
	{T, T, A}: Leu, {T, T, G}: Leu, {C, T, C}: Leu, {C, T, G}: Leu, {C, T, A}: Leu, {C, T, T}: Leu,
	{A, T, T}: Ile, {A, T, C}: Ile, {A, T, A}: Ile,
	{C, A, T}: His, {C, A, C}: His,
	{G, G, G}: Gly, {G, G, A}: Gly, {G, G, T}: Gly, {G, G, C}: Gly,
	{G, A, A}: Glu, {G, A, G}: Glu,
	{C, A, A}: Gln, {C, A, G}: Gln,
	{T, G, T}: Cys, {T, G, C}: Cys,
	{G, A, T}: Asp, {G, A, C}: Asp,
	{A, A, T}: Asn, {A, A, C}: Asn,
	{A, G, A}: Arg, {A, G, G}: Arg, {C, G, C}: Arg, {C, G, G}: Arg, {C, G, A}: Arg, {C, G, T}: Arg,
	{G, C, A}: Ala, {G, C, G}: Ala, {G, C, T}: Ala, {G, C, C}: Ala,
}
GeneticCode is a map of codon arrays to amino acids. Used for translating coding sequences to protein sequences.
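Because Codon is an array type, it is comparable and can be used directly as a map key. The following is a minimal sketch of such a lookup, not the package's own code: Base, Codon, AminoAcid, miniCode, and lookup are local stand-ins declared only for illustration. The base values follow the [0 1 2 3] encoding shown in the BasesToString example, and the amino acid values are copied from the AminoAcid constants.

```go
package main

import "fmt"

// Local stand-ins for the package's Base, Codon, and AminoAcid types.
type Base byte
type Codon [3]Base
type AminoAcid byte

// Base values follow the [0 1 2 3] encoding of A, C, G, T.
const (
	A Base = iota
	C
	G
	T
)

// Amino acid values copied from the AminoAcid constants.
const (
	Met  AminoAcid = 12
	Trp  AminoAcid = 17
	Stop AminoAcid = 20
)

// miniCode is a hypothetical three-entry excerpt of a codon table.
var miniCode = map[Codon]AminoAcid{
	{A, T, G}: Met,
	{T, G, G}: Trp,
	{T, A, A}: Stop,
}

// lookup returns the amino acid for c and whether c is present in the table.
func lookup(c Codon) (AminoAcid, bool) {
	aa, ok := miniCode[c]
	return aa, ok
}

func main() {
	aa, ok := lookup(Codon{A, T, G})
	fmt.Println(aa, ok) // 12 true

	_, ok = lookup(Codon{C, C, C}) // not in the excerpt
	fmt.Println(ok)                // false
}
```

The two-value map lookup distinguishes a missing codon from a codon that maps to the zero value, which matters here because Ala is 0.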
Functions ¶
func AllToLower ¶
func AllToLower(bases []Base)
AllToLower changes all bases in a sequence to lowercase.
func AllToUpper ¶
func AllToUpper(bases []Base)
AllToUpper changes all bases in a sequence to uppercase.
func AminoAcidToShortString ¶
AminoAcidToShortString converts type AminoAcid into single character amino acid symbols.
func AminoAcidToString ¶
AminoAcidToString converts type AminoAcid into three letter amino acid symbols.
func BaseToString ¶
BaseToString converts a DNA base to a string by casting a BaseToRune result to a string.
func BasesToString ¶
BasesToString converts a slice of DNA bases into a string. Useful for writing to files.
Example ¶
var baseSeq []Base
baseSeq = []Base{A, C, G, T}
fmt.Println(baseSeq)
var stringSeq string
stringSeq = BasesToString(baseSeq)
fmt.Println(stringSeq)
Output:
[0 1 2 3]
ACGT
func CompareSeqsCaseSensitive ¶
CompareSeqsCaseSensitive returns an integer defining the relationship between two input sequences. 1 if alpha > beta, -1 if beta > alpha, 0 if the sequences are equal. Case sensitive.
func CompareSeqsCaseSensitiveIgnoreGaps ¶
CompareSeqsCaseSensitiveIgnoreGaps returns an integer defining the relationship between two input sequences. 1 if alpha > beta, -1 if beta > alpha, 0 if the sequences are equal. Case sensitive. Ignores gaps.
func CompareSeqsIgnoreCase ¶
CompareSeqsIgnoreCase returns an integer defining the relationship between two input sequences. 1 if alpha > beta, -1 if beta > alpha, 0 if the sequences are equal. Case insensitive.
func CompareSeqsIgnoreCaseAndGaps ¶
CompareSeqsIgnoreCaseAndGaps returns an integer defining the relationship between two input sequences. 1 if alpha > beta, -1 if beta > alpha, 0 if the sequences are equal. Case insensitive. Ignores gaps.
func CompareTwoDSeqsCaseSensitive ¶
CompareTwoDSeqsCaseSensitive returns an integer defining the relationship between two input lists of sequences. 1 if alpha > beta, -1 if beta > alpha, 0 if the sequences are equal. Case sensitive.
func CompareTwoDSeqsCaseSensitiveIgnoreGaps ¶
CompareTwoDSeqsCaseSensitiveIgnoreGaps returns an integer defining the relationship between two input lists of sequences. 1 if alpha > beta, -1 if beta > alpha, 0 if the sequences are equal. Case sensitive. Ignores gaps.
func CompareTwoDSeqsIgnoreCase ¶
CompareTwoDSeqsIgnoreCase returns an integer defining the relationship between two input lists of sequences. 1 if alpha > beta, -1 if beta > alpha, 0 if the sequences are equal. Case insensitive.
func CompareTwoDSeqsIgnoreCaseAndGaps ¶
CompareTwoDSeqsIgnoreCaseAndGaps returns an integer defining the relationship between two input lists of sequences. 1 if alpha > beta, -1 if beta > alpha, 0 if the sequences are equal. Case insensitive. Ignores gaps.
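The three-way result described for the Compare functions above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: sequences are held as ASCII bytes rather than []Base, the ordering is assumed to be position-by-position, and a longer sequence is assumed to sort after its own prefix. The names compareSeqs and compareSeqsIgnoreCase are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// compareSeqs is a hypothetical three-way comparison of two sequences stored
// as ASCII bytes. It returns 1 if alpha > beta, -1 if beta > alpha, 0 if equal.
func compareSeqs(alpha, beta []byte) int {
	n := len(alpha)
	if len(beta) < n {
		n = len(beta)
	}
	// The first differing position decides the result.
	for i := 0; i < n; i++ {
		if alpha[i] > beta[i] {
			return 1
		}
		if alpha[i] < beta[i] {
			return -1
		}
	}
	// Assumption: when one sequence is a prefix of the other, the longer one is greater.
	switch {
	case len(alpha) > len(beta):
		return 1
	case len(alpha) < len(beta):
		return -1
	}
	return 0
}

// compareSeqsIgnoreCase compares upper-cased copies, leaving the inputs unchanged.
func compareSeqsIgnoreCase(alpha, beta []byte) int {
	return compareSeqs(bytes.ToUpper(alpha), bytes.ToUpper(beta))
}

func main() {
	fmt.Println(compareSeqs([]byte("ACGT"), []byte("ACGA")))           // 1
	fmt.Println(compareSeqs([]byte("ACG"), []byte("ACGT")))            // -1
	fmt.Println(compareSeqsIgnoreCase([]byte("acgt"), []byte("ACGT"))) // 0
}
```

A function with this contract can be passed to a sort routine to order sequences deterministically.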
func Complement ¶
func Complement(bases []Base)
Complement replaces each base in a sequence with its complementary base. The input slice is modified in place.
Example ¶
var baseSeq []Base
baseSeq = []Base{A, T, G}
fmt.Println(BasesToString(baseSeq))
// Complement modifies the slice in place so no return value
Complement(baseSeq)
fmt.Println(BasesToString(baseSeq))
Output:
ATG
TAC
func Count ¶
func Count(seq []Base) (ACount int, CCount int, GCount int, TCount int, NCount int, aCount int, cCount int, gCount int, tCount int, nCount int, gapCount int)
Count returns the number of each base present in the input sequence.
func CountBase ¶
CountBase returns the number of the designated base present in the input sequence.
Example ¶
var seq []Base
seq = []Base{A, A, C, T, T, T}
fmt.Println(CountBase(seq, A))
fmt.Println(CountBase(seq, C))
fmt.Println(CountBase(seq, G))
fmt.Println(CountBase(seq, T))
fmt.Println(CountBase(seq, N))
Output:
2
1
0
3
0
func CountBaseInterval ¶
CountBaseInterval returns the number of the designated base present in the input range of the sequence.
func CountMask ¶
CountMask returns the number of unmasked (uppercase) bases, masked (lowercase) bases, and gaps in the input sequence.
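A minimal sketch of this kind of tally is given below. It assumes sequences stored as ASCII bytes, '-' as the gap character, and that every uppercase letter (including N) counts as unmasked; the package's treatment of N is not stated here. The name countMask is a local stand-in.

```go
package main

import "fmt"

// countMask tallies uppercase (unmasked) letters, lowercase (masked) letters,
// and '-' gap characters in an ASCII-encoded sequence.
func countMask(seq []byte) (unmasked, masked, gaps int) {
	for _, b := range seq {
		switch {
		case b == '-':
			gaps++
		case b >= 'a' && b <= 'z':
			masked++
		case b >= 'A' && b <= 'Z':
			unmasked++
		}
	}
	return unmasked, masked, gaps
}

func main() {
	u, m, g := countMask([]byte("ACgt-N-n"))
	fmt.Println(u, m, g) // 3 3 2
}
```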
func DefineBase ¶
DefineBase returns false if the input base is an N, Gap, Dot, or Nil.
func Dist ¶
Dist returns the number of bases that do not match between the input sequences. Input sequences must be the same length.
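The mismatch count described above is a Hamming distance. The sketch below assumes ASCII-encoded sequences and returns an error for unequal lengths; how the package itself handles unequal lengths is not stated here, so that choice is an assumption. The name hamming is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// hamming counts the positions at which a and b differ.
// Assumption: unequal lengths are reported as an error.
func hamming(a, b []byte) (int, error) {
	if len(a) != len(b) {
		return 0, errors.New("sequences must be the same length")
	}
	dist := 0
	for i := range a {
		if a[i] != b[i] {
			dist++
		}
	}
	return dist, nil
}

func main() {
	d, err := hamming([]byte("ACGT"), []byte("AGGA"))
	fmt.Println(d, err) // 2 <nil>

	_, err = hamming([]byte("ACG"), []byte("ACGT"))
	fmt.Println(err != nil) // true
}
```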
func GCContent ¶ added in v1.0.1
GCContent returns the GC content for the input sequence. Note that n/Ns are ignored.
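One way to compute a GC content that ignores N is sketched below. It assumes the result is a fraction between 0 and 1, that case is ignored, and that an input with no A/C/G/T bases yields 0; none of these details are confirmed by the text above. The name gcFraction is hypothetical.

```go
package main

import "fmt"

// gcFraction returns (G+C)/(A+C+G+T) for an ASCII-encoded sequence.
// Characters other than A/C/G/T (for example N) are left out of both counts.
func gcFraction(seq []byte) float64 {
	var gc, total int
	for _, b := range seq {
		switch b {
		case 'G', 'C', 'g', 'c':
			gc++
			total++
		case 'A', 'T', 'a', 't':
			total++
		}
	}
	if total == 0 {
		return 0 // assumption: avoid dividing by zero for all-N input
	}
	return float64(gc) / float64(total)
}

func main() {
	fmt.Println(gcFraction([]byte("GCNNAT"))) // 0.5
}
```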
func IsEqual ¶
IsEqual compares two Codons and returns true if the underlying sequences are identical.
func IsSeqOfACGT ¶
IsSeqOfACGT returns true if the input sequence contains only uppercase A/C/G/T.
func MeltingTemp ¶ added in v1.0.1
MeltingTemp calculates the melting temperature of a slice of Base, in degrees Celsius, using the nearest-neighbor algorithm. Assumes 500 nM of both oligo and template, and 50 mM Na+.
func NonSynonymous ¶
NonSynonymous compares two Codons and returns true if they encode different AminoAcids.
func PeptideToShortString ¶
PeptideToShortString converts a slice of AminoAcids into a string of one character amino acid symbols.
func PeptideToString ¶
PeptideToString converts a slice of AminoAcids into a string of three character amino acid symbols.
func RangeToLower ¶
RangeToLower changes the bases in a set range to lowercase. start is closed, end is open, both are zero-based.
func RangeToUpper ¶
RangeToUpper changes the bases in a set range to uppercase. start is closed, end is open, both are zero-based.
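The closed-start, open-end convention means a range [start, end) touches indices start through end-1. A minimal sketch on ASCII bytes follows; rangeToLower and rangeToUpper here are local stand-ins that mirror the documented behavior, not the package's code.

```go
package main

import "fmt"

// rangeToLower lowercases seq[start:end] in place (start inclusive, end exclusive).
func rangeToLower(seq []byte, start, end int) {
	for i := start; i < end; i++ {
		if seq[i] >= 'A' && seq[i] <= 'Z' {
			seq[i] += 'a' - 'A'
		}
	}
}

// rangeToUpper uppercases seq[start:end] in place (start inclusive, end exclusive).
func rangeToUpper(seq []byte, start, end int) {
	for i := start; i < end; i++ {
		if seq[i] >= 'a' && seq[i] <= 'z' {
			seq[i] -= 'a' - 'A'
		}
	}
}

func main() {
	seq := []byte("ACGTACGT")
	rangeToLower(seq, 2, 5) // touches indices 2, 3, and 4
	fmt.Println(string(seq)) // ACgtaCGT

	rangeToUpper(seq, 3, 5) // touches indices 3 and 4
	fmt.Println(string(seq)) // ACgTACGT
}
```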
func ReverseComplement ¶
func ReverseComplement(bases []Base)
ReverseComplement reverses a sequence of bases and complements each base. Used to switch strands and maintain 5' -> 3' orientation.
Example ¶
var baseSeq []Base
baseSeq = []Base{A, T, G}
fmt.Println(BasesToString(baseSeq))
// Reverse complement modifies the slice in place so no return value
ReverseComplement(baseSeq)
fmt.Println(BasesToString(baseSeq))
Output:
ATG
CAT
func SeqsAreSimilar ¶ added in v1.0.1
SeqsAreSimilar returns true if the number of mismatches between the two input sequences is less than or equal to the user-specified threshold. If the two sequences differ in length, the function returns false. Comparison is case-insensitive.
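A thresholded, case-insensitive mismatch check of this kind can be sketched as below. The code works on ASCII bytes and uses hypothetical names (seqsAreSimilar, upper); it stops early once the threshold is exceeded, which is a design choice of the sketch rather than a documented property of the package.

```go
package main

import "fmt"

// upper maps an ASCII lowercase letter to uppercase and leaves other bytes alone.
func upper(b byte) byte {
	if b >= 'a' && b <= 'z' {
		return b - ('a' - 'A')
	}
	return b
}

// seqsAreSimilar reports whether a and b have equal length and differ at no
// more than maxMismatch positions, ignoring case.
func seqsAreSimilar(a, b []byte, maxMismatch int) bool {
	if len(a) != len(b) {
		return false
	}
	mismatches := 0
	for i := range a {
		if upper(a[i]) != upper(b[i]) {
			mismatches++
			if mismatches > maxMismatch {
				return false // threshold exceeded, no need to scan further
			}
		}
	}
	return true
}

func main() {
	fmt.Println(seqsAreSimilar([]byte("ACGT"), []byte("acga"), 1)) // true
	fmt.Println(seqsAreSimilar([]byte("ACGT"), []byte("acga"), 0)) // false
	fmt.Println(seqsAreSimilar([]byte("ACG"), []byte("ACGT"), 5))  // false
}
```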
func Synonymous ¶
Synonymous compares two codons and returns true if the codons code for the same amino acid.
func TranslateToShortString ¶
TranslateToShortString converts a sequence of DNA bases into a string of one character amino acid symbols. Input expects bases to be in-frame. If the length of the input sequence is not a multiple of three, the function will panic.
func TranslateToString ¶
TranslateToString converts a sequence of DNA bases into a string of three character amino acid symbols. Input expects bases to be in-frame. If the length of the input sequence is not a multiple of three, the function will panic.
Types ¶
type AminoAcid ¶
type AminoAcid byte
AminoAcid represents the twenty canonical amino acids and the stop codon as bytes.
const (
	Ala  AminoAcid = 0
	Arg  AminoAcid = 1
	Asn  AminoAcid = 2
	Asp  AminoAcid = 3
	Cys  AminoAcid = 4
	Gln  AminoAcid = 5
	Glu  AminoAcid = 6
	Gly  AminoAcid = 7
	His  AminoAcid = 8
	Ile  AminoAcid = 9
	Leu  AminoAcid = 10
	Lys  AminoAcid = 11
	Met  AminoAcid = 12
	Phe  AminoAcid = 13
	Pro  AminoAcid = 14
	Ser  AminoAcid = 15
	Thr  AminoAcid = 16
	Trp  AminoAcid = 17
	Tyr  AminoAcid = 18
	Val  AminoAcid = 19
	Stop AminoAcid = 20
)
func OneLetterToAminoAcid ¶
OneLetterToAminoAcid converts a one letter amino acid byte into an AminoAcid type.
func StringToAminoAcid ¶
StringToAminoAcid converts a string into type AminoAcid. If singleLetter is false, the input string is interpreted as a three letter code.
func ThreeLetterToAminoAcid ¶
ThreeLetterToAminoAcid converts a three letter amino acid string into an AminoAcid type.
func TranslateCodon ¶
TranslateCodon converts an individual Codon into the corresponding AminoAcid type.
func TranslateSeq ¶
TranslateSeq takes a sequence of DNA bases and translates it into a slice of AminoAcids. Input expects bases to be in-frame. If the length of the input sequence is not a multiple of three, the function will panic.
func TranslateSeqToTer ¶
TranslateSeqToTer takes a sequence of DNA bases and translates it into a slice of AminoAcids. Translation ends at the first stop codon, and the returned protein sequence includes that trailing stop codon. Any bases beyond the stop codon, and any bases left over after all 3-base codons have been made, are ignored.
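The stop-at-first-terminator behavior can be sketched as follows. For brevity the sketch uses strings instead of []Base and []AminoAcid, and a hypothetical four-entry table (miniCode) in place of the full genetic code; codons missing from that excerpt are reported as "Xaa", which is a choice made only for this illustration.

```go
package main

import "fmt"

// miniCode is a hypothetical excerpt of a codon table, keyed by codon text.
var miniCode = map[string]string{
	"ATG": "Met",
	"GGA": "Gly",
	"TGG": "Trp",
	"TAA": "Stop",
}

// translateToTer reads seq three bases at a time and stops after the first
// stop codon, which is included in the result. Leftover bases are ignored.
func translateToTer(seq string) []string {
	var peptide []string
	for i := 0; i+3 <= len(seq); i += 3 {
		aa, ok := miniCode[seq[i:i+3]]
		if !ok {
			aa = "Xaa" // codon not present in the excerpt
		}
		peptide = append(peptide, aa)
		if aa == "Stop" {
			break
		}
	}
	return peptide
}

func main() {
	// ATG GGA TAA are translated; the TGG and the trailing A after the stop are ignored.
	fmt.Println(translateToTer("ATGGGATAATGGA")) // [Met Gly Stop]

	// No stop codon: two full codons are translated and the final G is ignored.
	fmt.Println(translateToTer("ATGTGGG")) // [Met Trp]
}
```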
type Base ¶
type Base byte
Base stores a single nucleotide as a byte.
func ByteSliceToDnaBases ¶
ByteSliceToDnaBases will convert a slice of bytes into a slice of Bases.
func ByteToBase ¶
ByteToBase converts a byte into a dna.Base if it matches one of the acceptable DNA characters. Notes: It also accepts lowercase characters and returns them as uppercase bases. '*', used by VCF to denote deleted alleles, becomes a Gap.
func CodonsToBases ¶
CodonsToBases converts a slice of Codons into a slice of DNA bases.
func ComplementSingleBase ¶
ComplementSingleBase returns the nucleotide complementary to the input base.
func CreateAllGaps ¶
CreateAllGaps creates a DNA sequence of Gap with length of numGaps.
func CreateAllNs ¶
CreateAllNs creates a DNA sequence of N with length of numGaps.
func Delete ¶
Delete removes bases from a sequence of bases. All base positions are zero-based and left closed, right open.
func Extract ¶
Extract returns a subsequence of an input slice of DNA bases from an input start and end point.
func Insert ¶
Insert adds bases to a sequence of bases. The base position is zero-based, and the insertion happens before the specified base. Giving the length of the sequence as the position puts the insertion at the end.
func RemoveBase ¶
RemoveBase returns a sequence of bases without any of the designated base.
func RemoveGaps ¶
RemoveGaps returns a sequence of bases with no gaps.
func Replace ¶
Replace performs both a deletion and an insertion, replacing the input interval with the input insSeq. All base positions are zero-based and left closed, right open.
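The coordinate conventions for Delete, Insert, and Replace can be illustrated with a short sketch on ASCII bytes. The helpers deleteRange, insertAt, and replaceRange are hypothetical; they always build a new slice, whereas whether the package's functions reuse the input's backing array is not stated here.

```go
package main

import "fmt"

// deleteRange returns a copy of seq without the bases in [start, end).
func deleteRange(seq []byte, start, end int) []byte {
	out := make([]byte, 0, len(seq)-(end-start))
	out = append(out, seq[:start]...)
	return append(out, seq[end:]...)
}

// insertAt returns a copy of seq with ins placed before index pos.
// Passing pos == len(seq) appends ins at the end.
func insertAt(seq []byte, pos int, ins []byte) []byte {
	out := make([]byte, 0, len(seq)+len(ins))
	out = append(out, seq[:pos]...)
	out = append(out, ins...)
	return append(out, seq[pos:]...)
}

// replaceRange deletes [start, end) and inserts ins at start.
func replaceRange(seq []byte, start, end int, ins []byte) []byte {
	return insertAt(deleteRange(seq, start, end), start, ins)
}

func main() {
	seq := []byte("ACGTAC")
	fmt.Println(string(deleteRange(seq, 1, 3)))                 // ATAC
	fmt.Println(string(insertAt(seq, 1, []byte("NN"))))         // ANNCGTAC
	fmt.Println(string(insertAt(seq, len(seq), []byte("NN"))))  // ACGTACNN
	fmt.Println(string(replaceRange(seq, 1, 3, []byte("TTT")))) // ATTTTAC
}
```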
func ReverseComplementAndCopy ¶ added in v1.0.1
ReverseComplementAndCopy returns a reverse-complemented copy of a sequence of bases. Used to switch strands and maintain 5' -> 3' orientation.
func RuneToBase ¶
RuneToBase converts a rune into a dna.Base if it matches one of the acceptable DNA characters. Note: '*', used by VCF to denote deleted alleles, becomes Nil.
func StringToBase ¶
StringToBase parses a string into a single DNA base.
func StringToBases ¶
StringToBases parses a string into a slice of DNA bases.
Example ¶
var stringSeq string
stringSeq = "ACGT"
fmt.Println(stringSeq)
var baseSeq []Base
baseSeq = StringToBases(stringSeq)
fmt.Println(baseSeq)
Output:
ACGT
[0 1 2 3]
func StringToBasesForced ¶
StringToBasesForced parses a string into a slice of DNA bases and N-masks any invalid characters.
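N-masking of invalid characters can be sketched as below. The set of accepted characters (upper and lower case A, C, G, T, N, plus '-') is an assumption made for this illustration, and forceToBases is a hypothetical name; the output stays in ASCII rather than the package's Base encoding.

```go
package main

import "fmt"

// forceToBases copies s byte by byte, replacing every character outside the
// assumed DNA alphabet with 'N'.
func forceToBases(s string) []byte {
	out := make([]byte, len(s))
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case 'A', 'C', 'G', 'T', 'N', 'a', 'c', 'g', 't', 'n', '-':
			out[i] = s[i]
		default:
			out[i] = 'N' // invalid character, masked
		}
	}
	return out
}

func main() {
	fmt.Println(string(forceToBases("ACXGT?"))) // ACNGTN
}
```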
type Codon ¶
type Codon [3]Base
Codon is an array of three DNA bases for genetic analysis of proteins and amino acids.
func BasesToCodons ¶
BasesToCodons converts a slice of DNA bases into a slice of Codons. Input expects bases to be in-frame. If the length of the input sequence is not a multiple of three, the function will panic.
func BasesToCodonsIgnoreRemainder ¶
BasesToCodonsIgnoreRemainder converts a slice of DNA bases into a slice of Codons. Any bases remaining after all 3-base codons have been assembled will be ignored.
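The difference between the strict and remainder-ignoring conversions can be sketched with local stand-in types. Here Base and Codon are redeclared for illustration, and toCodonsStrict returns an error instead of panicking, which is a deliberate departure from the documented behavior so that the example can run to completion.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the package's Base and Codon types.
type Base byte
type Codon [3]Base

// toCodonsIgnoreRemainder groups bases into codons and drops any trailing
// one or two bases that do not fill a codon.
func toCodonsIgnoreRemainder(bases []Base) []Codon {
	codons := make([]Codon, 0, len(bases)/3)
	for i := 0; i+3 <= len(bases); i += 3 {
		codons = append(codons, Codon{bases[i], bases[i+1], bases[i+2]})
	}
	return codons
}

// toCodonsStrict rejects input whose length is not a multiple of three.
// Assumption: an error is returned here rather than a panic.
func toCodonsStrict(bases []Base) ([]Codon, error) {
	if len(bases)%3 != 0 {
		return nil, errors.New("length of input sequence is not a multiple of three")
	}
	return toCodonsIgnoreRemainder(bases), nil
}

func main() {
	bases := []Base{0, 3, 2, 1, 1, 1, 2} // seven bases: two codons plus one leftover
	fmt.Println(len(toCodonsIgnoreRemainder(bases))) // 2

	_, err := toCodonsStrict(bases)
	fmt.Println(err != nil) // true
}
```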