seq

package
v0.8.2
Published: Nov 16, 2022 License: MIT Imports: 9 Imported by: 22

README

seq

This package defines the Seq and Alphabet types and provides basic sequence operations, such as validating DNA/RNA/protein sequences, computing the reverse complement, and translating RNA to protein.

This package was inspired by biogo.

Documentation

Overview

Package seq defines a *Seq* type and provides basic sequence operations, such as validating DNA/RNA/protein sequences and computing the reverse complement.

This package was inspired by [biogo](https://code.google.com/p/biogo/source/browse/#git%2Falphabet).

IUPAC nucleotide code: ACGTURYSWKMBDHVN

http://droog.gs.washington.edu/parc/images/iupac.html

code	base	Complement
A	A	T
C	C	G
G	G	C
T/U	T	A

M	A/C	K
R	A/G	Y
W	A/T	W
S	C/G	S
Y	C/T	R
K	G/T	M

V	A/C/G	B
H	A/C/T	D
D	A/G/T	H
B	C/G/T	V

X/N	A/C/G/T	X
.	not A/C/G/T
. or -	gap
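The complement column above can be turned into a lookup table. The following is a minimal sketch using only the standard library, not the package's own implementation; the names `complement` and `revCom` are hypothetical, and mapping N to N is an assumption made for this example.

```go
package main

import "fmt"

// complement maps each upper-case IUPAC nucleotide code to its complement,
// following the table above. N -> N is an assumption of this sketch.
var complement = map[byte]byte{
	'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'U': 'A',
	'M': 'K', 'R': 'Y', 'W': 'W', 'S': 'S', 'Y': 'R', 'K': 'M',
	'V': 'B', 'H': 'D', 'D': 'H', 'B': 'V',
	'N': 'N',
}

// revCom returns the reverse complement of s and preserves letter case.
// Bytes that are not in the table (gaps, for example) are copied unchanged.
func revCom(s []byte) []byte {
	out := make([]byte, len(s))
	for i, b := range s {
		lower := b >= 'a' && b <= 'z'
		key := b
		if lower {
			key = b - 'a' + 'A'
		}
		c, ok := complement[key]
		switch {
		case !ok:
			c = b // unknown byte: keep as is
		case lower:
			c = c - 'A' + 'a' // restore lower case
		}
		out[len(s)-1-i] = c
	}
	return out
}

func main() {
	fmt.Printf("%s\n", revCom([]byte("ACGTNacgr-")))
}
```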

IUPAC amino acid code

A	Ala	Alanine
B	Asx	Aspartic acid or Asparagine [2]
C	Cys	Cysteine
D	Asp	Aspartic Acid
E	Glu	Glutamic Acid
F	Phe	Phenylalanine
G	Gly	Glycine
H	His	Histidine
I	Ile	Isoleucine
J	Xle	Isoleucine or Leucine [4]
K	Lys	Lysine
L	Leu	Leucine
M	Met	Methionine
N	Asn	Asparagine
O	Pyl	Pyrrolysine [6]
P	Pro	Proline
Q	Gln	Glutamine
R	Arg	Arginine
S	Ser	Serine
T	Thr	Threonine
U	Sec	Selenocysteine [5,6]
V	Val	Valine
W	Trp	Tryptophan
Y	Tyr	Tyrosine
Z	Glx	Glutamine or Glutamic acid [2]

X	unknown amino acid
.	gaps
*	End

References:

  1. http://www.bioinformatics.org/sms/iupac.html
  2. http://www.dnabaser.com/articles/IUPAC%20ambiguity%20codes.html
  3. http://www.bioinformatics.org/sms2/iupac.html
  4. http://www.matrixscience.com/blog/non-standard-amino-acid-residues.html
  5. http://www.sbcs.qmul.ac.uk/iupac/AminoAcid/A2021.html#AA21
  6. https://en.wikipedia.org/wiki/Amino_acid

https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=tgencodes

Constants

const NQualityEncoding int = 6

NQualityEncoding is the number of QualityEncoding values plus one: 5 + 1 = 6.

Variables

var AlphabetGuessSeqLengthThreshold = 10000

AlphabetGuessSeqLengthThreshold is the length of the sequence prefix of the first FASTA record on which FastaRecord bases its guess of the sequence type. Use 0 to examine the whole sequence.

var AmbBase2Bases = map[byte][]byte{
	'A': {'A'},
	'a': {'A'},
	'C': {'C'},
	'c': {'C'},
	'G': {'G'},
	'g': {'G'},
	'T': {'T'},
	't': {'T'},
	'U': {'T'},
	'u': {'T'},

	'M': {'A', 'C', 'M'},
	'm': {'A', 'C', 'M'},
	'R': {'A', 'G', 'R'},
	'r': {'A', 'G', 'R'},
	'W': {'A', 'T', 'W'},
	'w': {'A', 'T', 'W'},
	'S': {'C', 'G', 'S'},
	's': {'C', 'G', 'S'},
	'Y': {'C', 'T', 'Y'},
	'y': {'C', 'T', 'Y'},
	'K': {'G', 'T', 'K'},
	'k': {'G', 'T', 'K'},

	'V': {'A', 'C', 'G', 'M', 'R', 'S', 'V'},
	'v': {'A', 'C', 'G', 'M', 'R', 'S', 'V'},
	'H': {'A', 'C', 'T', 'M', 'W', 'Y', 'H'},
	'h': {'A', 'C', 'T', 'M', 'W', 'Y', 'H'},
	'D': {'A', 'G', 'T', 'R', 'W', 'K', 'D'},
	'd': {'A', 'G', 'T', 'R', 'W', 'K', 'D'},
	'B': {'C', 'G', 'T', 'S', 'Y', 'K', 'B'},
	'b': {'C', 'G', 'T', 'S', 'Y', 'K', 'B'},

	'N': {'A', 'C', 'M', 'G', 'R', 'S', 'V', 'T', 'W', 'Y', 'H', 'K', 'D', 'B', 'N'},
	'n': {'A', 'C', 'M', 'G', 'R', 'S', 'V', 'T', 'W', 'Y', 'H', 'K', 'D', 'B', 'N'},
}

AmbBase2Bases maps each ambiguous base to the bases it represents. Looking up this map is faster than calling AmbBase2Bases0.

var AmbCodes2Codes = map[int][]int{
	1: {1},
	2: {2},
	4: {4},
	8: {8},

	3:  {1, 2, 3},
	5:  {1, 4, 5},
	9:  {1, 8, 9},
	6:  {2, 4, 6},
	10: {2, 8, 10},
	12: {4, 8, 12},

	7:  {1, 2, 4, 3, 5, 6, 7},
	11: {1, 2, 8, 3, 9, 10, 11},
	13: {1, 4, 8, 5, 9, 12, 13},
	14: {2, 4, 8, 6, 10, 12, 14},

	15: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15},
}

AmbCodes2Codes is the integer-code version of AmbBase2Bases.
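The integer codes in this table are consistent with a bit-flag encoding in which A=1, C=2, G=4 and T=8, so an ambiguity code is the bitwise OR of the bases it covers. The sketch below illustrates that reading; the flag values are inferred from the table, and `ambCode` and `subCodes` are hypothetical helpers, not package functions. The sketch lists sub-codes in increasing order, which differs from the order used in the table.

```go
package main

import "fmt"

// Bit flags inferred from the table above (an assumption of this sketch).
const (
	baseA = 1 << iota // 1
	baseC             // 2
	baseG             // 4
	baseT             // 8
)

// ambCode ORs the flags of the given base codes into one ambiguity code.
func ambCode(codes ...int) int {
	code := 0
	for _, c := range codes {
		code |= c
	}
	return code
}

// subCodes lists, in increasing order, every non-zero code whose bits
// are all contained in code.
func subCodes(code int) []int {
	var out []int
	for c := 1; c <= code; c++ {
		if c&code == c {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	r := ambCode(baseA, baseG) // R = A or G
	fmt.Println(r, subCodes(r)) // 5 [1 4 5]
}
```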

var CodonTables map[int]*CodonTable

CodonTables contains all the codon tables from NCBI:

1: The Standard Code
2: The Vertebrate Mitochondrial Code
3: The Yeast Mitochondrial Code
4: The Mold, Protozoan, and Coelenterate Mitochondrial Code and the Mycoplasma/Spiroplasma Code
5: The Invertebrate Mitochondrial Code
6: The Ciliate, Dasycladacean and Hexamita Nuclear Code
9: The Echinoderm and Flatworm Mitochondrial Code
10: The Euplotid Nuclear Code
11: The Bacterial, Archaeal and Plant Plastid Code
12: The Alternative Yeast Nuclear Code
13: The Ascidian Mitochondrial Code
14: The Alternative Flatworm Mitochondrial Code
16: Chlorophycean Mitochondrial Code
21: Trematode Mitochondrial Code
22: Scenedesmus obliquus Mitochondrial Code
23: Thraustochytrium Mitochondrial Code
24: Pterobranchia Mitochondrial Code
25: Candidate Division SR1 and Gracilibacteria Code
26: Pachysolen tannophilus Nuclear Code
27: Karyorelict Nuclear
28: Condylostoma Nuclear
29: Mesodinium Nuclear
30: Peritrich Nuclear
31: Blastocrithidia Nuclear

var ComplementSeqLenThreshold = 1000

ComplementSeqLenThreshold is the sequence-length threshold for complementing a sequence in parallel.

var ComplementThreads = runtime.NumCPU()

ComplementThreads is the number of threads used to complement a sequence in parallel.

var DegenerateBaseMapNucl = map[byte]string{
	'A': "A",
	'T': "[TU]",
	'U': "[TU]",
	'C': "C",
	'G': "G",
	'R': "[AG]",
	'Y': "[CTU]",
	'M': "[AC]",
	'K': "[GTU]",
	'S': "[CG]",
	'W': "[ATU]",
	'H': "[ACTU]",
	'B': "[CGTU]",
	'V': "[ACG]",
	'D': "[AGTU]",
	'N': "[ACGTU]",
	'a': "a",
	't': "[tu]",
	'u': "[tu]",
	'c': "c",
	'g': "g",
	'r': "[ag]",
	'y': "[ctu]",
	'm': "[ac]",
	'k': "[gtu]",
	's': "[cg]",
	'w': "[atu]",
	'h': "[actu]",
	'b': "[cgtu]",
	'v': "[acg]",
	'd': "[agtu]",
	'n': "[acgtu]",
}

DegenerateBaseMapNucl maps each degenerate nucleotide base to a regular expression.
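A map of this shape can be used to compile a degenerate sequence, such as a primer, into a regular expression. The sketch below uses a small upper-case subset copied from the table; the `degenerate` map and the `toRegexp` helper are hypothetical and independent of the package.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// degenerate is an upper-case-only subset of the table above, kept small
// for illustration.
var degenerate = map[byte]string{
	'A': "A", 'C': "C", 'G': "G", 'T': "[TU]",
	'R': "[AG]", 'Y': "[CTU]", 'N': "[ACGTU]",
}

// toRegexp replaces every byte found in the map with its pattern and
// quotes every other byte so that it matches literally.
func toRegexp(s string) string {
	var sb strings.Builder
	for i := 0; i < len(s); i++ {
		if p, ok := degenerate[s[i]]; ok {
			sb.WriteString(p)
		} else {
			sb.WriteString(regexp.QuoteMeta(string(s[i])))
		}
	}
	return sb.String()
}

func main() {
	pattern := toRegexp("ACRN")
	re := regexp.MustCompile(pattern)
	fmt.Println(pattern, re.MatchString("ACGT"))
}
```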

var DegenerateBaseMapNucl2 = map[byte]string{
	'A': "A",
	'T': "TU",
	'U': "TU",
	'C': "C",
	'G': "G",
	'R': "AG",
	'Y': "CTU",
	'M': "AC",
	'K': "GTU",
	'S': "CG",
	'W': "ATU",
	'H': "ACTU",
	'B': "CGTU",
	'V': "ACG",
	'D': "AGTU",
	'N': "ACGTU",
	'a': "a",
	't': "tu",
	'u': "tu",
	'c': "c",
	'g': "g",
	'r': "ag",
	'y': "ctu",
	'm': "ac",
	'k': "gtu",
	's': "cg",
	'w': "atu",
	'h': "actu",
	'b': "cgtu",
	'v': "acg",
	'd': "agtu",
	'n': "acgtu",
}

DegenerateBaseMapNucl2 maps each degenerate nucleotide base to all the bases it represents.

var DegenerateBaseMapProt = map[byte]string{
	'A': "A",
	'B': "[DN]",
	'C': "C",
	'D': "D",
	'E': "E",
	'F': "F",
	'G': "G",
	'H': "H",
	'I': "I",
	'J': "[IL]",
	'K': "K",
	'L': "L",
	'M': "M",
	'N': "N",
	'P': "P",
	'Q': "Q",
	'R': "R",
	'S': "S",
	'T': "T",
	'V': "V",
	'W': "W",
	'X': "[A-Z]",
	'Y': "Y",
	'Z': "[QE]",
	'a': "a",
	'b': "[dn]",
	'c': "c",
	'd': "d",
	'e': "e",
	'f': "f",
	'g': "g",
	'h': "h",
	'i': "i",
	'j': "[il]",
	'k': "k",
	'l': "l",
	'm': "m",
	'n': "n",
	'p': "p",
	'q': "q",
	'r': "r",
	's': "s",
	't': "t",
	'v': "v",
	'w': "w",
	'x': "[a-z]",
	'y': "y",
	'z': "[qe]",
}

DegenerateBaseMapProt maps each degenerate amino acid code to a regular expression.

var ErrInvalidCodon = errors.New("seq: invalid codon")

ErrInvalidCodon means the length of codon is not 3.

var ErrInvalidDNABase = errors.New("seq: invalid DNA base")

ErrInvalidDNABase means the sequence contains an invalid DNA base.

var ErrInvalidPhredQuality = errors.New("seq: invalid Phred quality")

ErrInvalidPhredQuality occurs for Phred quality less than 0.

var ErrInvalidSolexaQuality = errors.New("seq: invalid Solexa quality")

ErrInvalidSolexaQuality occurs for Solexa quality less than -5.

var ErrUnknownCodon = errors.New("seq: unknown codon")

ErrUnknownCodon means the codon is not in the codon table, or the codon contains bases other than A, C, G, T and U.

var ErrUnknownQualityEncoding = errors.New("unknown quality encoding")

ErrUnknownQualityEncoding is the error for an unknown quality encoding type.

var NMostCommonThreshold = 2

NMostCommonThreshold is the threshold of 'B' in top N most common quality for guessing Illumina 1.5.

var QUAL_MAP [256]float64

var ValidSeqLengthThreshold = 10000

ValidSeqLengthThreshold is the sequence-length threshold for checking a sequence in parallel.

var ValidSeqThreads = runtime.NumCPU()

ValidSeqThreads is the number of threads used to check a sequence in parallel.

var ValidateSeq = true

ValidateSeq decides whether sequences are checked.

var ValidateWholeSeq = true

ValidateWholeSeq determines whether all bases of a sequence are validated.

Functions

func AmbBase2Bases0

func AmbBase2Bases0(b byte) ([]byte, error)

AmbBase2Bases0 converts an ambiguous base to the bases it represents. It is slower than a lookup in the AmbBase2Bases map.

func Bases2AmbBase

func Bases2AmbBase(bs []byte) (byte, error)

Bases2AmbBase converts a list of bases to an ambiguous base.

func Codes2AmbCode

func Codes2AmbCode(codes []int) (int, error)

Codes2AmbCode converts a list of base codes to the code of an ambiguous base.

func Degenerate2Seqs

func Degenerate2Seqs(s []byte) (dseqs [][]byte, err error)

Degenerate2Seqs transforms a sequence containing degenerate bases into all possible sequences.
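Expanding a degenerate sequence amounts to taking the Cartesian product of the bases allowed at each position. The following sketch shows the idea with a hypothetical DNA-only subset map (`bases`) and a hypothetical `expand` helper; the package's own function may order or validate its output differently.

```go
package main

import "fmt"

// bases is a hypothetical, upper-case, DNA-only subset of a
// degenerate-base map.
var bases = map[byte]string{
	'A': "A", 'C': "C", 'G': "G", 'T': "T",
	'R': "AG", 'Y': "CT", 'N': "ACGT",
}

// expand returns every concrete sequence that the degenerate sequence s
// can stand for, built position by position.
func expand(s string) ([]string, error) {
	seqs := []string{""}
	for i := 0; i < len(s); i++ {
		choices, ok := bases[s[i]]
		if !ok {
			return nil, fmt.Errorf("unsupported base %q at position %d", s[i], i+1)
		}
		next := make([]string, 0, len(seqs)*len(choices))
		for _, prefix := range seqs {
			for j := 0; j < len(choices); j++ {
				next = append(next, prefix+string(choices[j]))
			}
		}
		seqs = next
	}
	return seqs, nil
}

func main() {
	seqs, err := expand("ARY")
	fmt.Println(seqs, err)
}
```

The number of results grows multiplicatively with each degenerate position, so long sequences with many N bases can become very large.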

func Phred2Solexa

func Phred2Solexa(q float64) (float64, error)

Phred2Solexa converts Phred quality to Solexa quality.

func QualityConvert

func QualityConvert(from, to QualityEncoding, quality []byte, force bool) ([]byte, error)

QualityConvert converts quality from one encoding to another. When force is true, scores > 40 are truncated to 40 when converting Illumina-1.8+ to Sanger.

func QualityValue

func QualityValue(encoding QualityEncoding, quality []byte) ([]int, error)

QualityValue returns quality value for given encoding and quality string
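Decoding a quality string generally means subtracting the encoding's ASCII offset from each byte. A minimal sketch follows, assuming an offset of 33 for Phred+33 data; `qualityValues` is a hypothetical helper that takes the offset directly instead of a QualityEncoding.

```go
package main

import "fmt"

// qualityValues subtracts the ASCII offset from every quality byte.
// It fails if a byte is below the offset.
func qualityValues(quality []byte, offset int) ([]int, error) {
	values := make([]int, len(quality))
	for i, c := range quality {
		v := int(c) - offset
		if v < 0 {
			return nil, fmt.Errorf("byte %q at position %d is below offset %d", c, i+1, offset)
		}
		values[i] = v
	}
	return values, nil
}

func main() {
	values, err := qualityValues([]byte("!5I"), 33)
	fmt.Println(values, err) // [0 20 40] <nil>
}
```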

func Solexa2Phred

func Solexa2Phred(q float64) (float64, error)

Solexa2Phred converts Solexa quality to Phred quality.
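The commonly cited relationship between the two scales is Q_solexa = 10·log10(10^(Q_phred/10) − 1) and Q_phred = 10·log10(10^(Q_solexa/10) + 1). The sketch below implements these formulas with hypothetical helper names; how the package treats boundary values such as a Phred score of 0 is not stated in this documentation, so the sketch simply rejects non-positive Phred input.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// phredToSolexa applies Q_solexa = 10*log10(10^(Q_phred/10) - 1).
// The formula is undefined at 0, so this sketch rejects q <= 0.
func phredToSolexa(q float64) (float64, error) {
	if q <= 0 {
		return 0, errors.New("phred quality must be positive in this sketch")
	}
	return 10 * math.Log10(math.Pow(10, q/10)-1), nil
}

// solexaToPhred applies Q_phred = 10*log10(10^(Q_solexa/10) + 1).
func solexaToPhred(q float64) float64 {
	return 10 * math.Log10(math.Pow(10, q/10)+1)
}

func main() {
	s, err := phredToSolexa(20)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("phred 20 -> solexa %.3f -> phred %.3f\n", s, solexaToPhred(s))
}
```

The two scales diverge mainly at low quality and are nearly identical for higher scores.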

func SubLocation

func SubLocation(length, start, end int) (int, int, bool)

SubLocation implements the package's sublocation strategy. The start and end arguments and the returned start and end are all 1-based.

1-based index    1 2 3 4 5 6 7 8 9 10

negative index 0-9-8-7-6-5-4-3-2-1

           seq    A C G T N a c g t n
           1:1    A
           2:4      C G T
         -4:-2                c g t
         -4:-1                c g t n
         -1:-1                      n
          2:-2      C G T N a c g t
          1:-1    A C G T N a c g t n
          1:12    A C G T N a c g t n
        -12:-1    A C G T N a c g t n
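The rows of this table can be reproduced by converting negative positions to positive ones and clamping the result to the sequence. The function below is a hypothetical re-implementation written to match the table, not the package's source; in particular, rejecting a position of 0 is an assumption of this sketch.

```go
package main

import "fmt"

// subLocation converts a 1-based, possibly negative start/end pair into a
// 1-based positive range on a sequence of the given length. The boolean is
// false when no valid range exists.
func subLocation(length, start, end int) (int, int, bool) {
	if length <= 0 || start == 0 || end == 0 {
		return 0, 0, false
	}
	if start < 0 {
		start = length + start + 1 // -1 is the last position
	}
	if end < 0 {
		end = length + end + 1
	}
	if start < 1 {
		start = 1 // clamp, as in the -12:-1 row
	}
	if end > length {
		end = length // clamp, as in the 1:12 row
	}
	if start > end {
		return 0, 0, false
	}
	return start, end, true
}

func main() {
	seq := "ACGTNacgtn"
	for _, loc := range [][2]int{{1, 1}, {2, 4}, {-4, -2}, {1, 12}, {-12, -1}} {
		if s, e, ok := subLocation(len(seq), loc[0], loc[1]); ok {
			fmt.Printf("%d:%d -> %s\n", loc[0], loc[1], seq[s-1:e])
		}
	}
}
```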

Types

type Alphabet

type Alphabet struct {
	// contains filtered or unexported fields
}

An Alphabet can be user-defined. Note that **the letters are case sensitive**.

For example, DNA:

DNA, _ = NewAlphabet(
	"DNA",
	false,
	[]byte("acgtACGT"),
	[]byte("tgcaTGCA"),
	[]byte(" -"),
	[]byte("nN"))
var (
	DNA          *Alphabet
	DNAredundant *Alphabet
	RNA          *Alphabet
	RNAredundant *Alphabet
	Protein      *Alphabet
	Unlimit      *Alphabet
)

Six alphabets are predefined:

DNA           Deoxyribonucleotide code
DNAredundant  DNA + Ambiguity Codes
RNA           Ribonucleotide code
RNAredundant  RNA + Ambiguity Codes
Protein       Amino Acid single-letter Code
Unlimit       Self-defined, including all 26 English letters

func GuessAlphabet

func GuessAlphabet(seqs []byte) *Alphabet

GuessAlphabet guesses the alphabet of the given sequence.

func GuessAlphabetLessConservatively

func GuessAlphabetLessConservatively(seqs []byte) *Alphabet

GuessAlphabetLessConservatively changes DNA to DNAredundant and RNA to RNAredundant.

func NewAlphabet

func NewAlphabet(
	t string,
	isUnlimit bool,
	letters []byte,
	pairs []byte,
	gap []byte,
	ambiguous []byte,
) (*Alphabet, error)

NewAlphabet is the constructor for type *Alphabet*.

func (*Alphabet) AllLetters

func (a *Alphabet) AllLetters() []byte

AllLetters returns all letters.

func (*Alphabet) AmbiguousLetters

func (a *Alphabet) AmbiguousLetters() []byte

AmbiguousLetters returns AmbiguousLetters

func (*Alphabet) Clone

func (a *Alphabet) Clone() *Alphabet

Clone returns a copy of the Alphabet.

func (*Alphabet) Gaps

func (a *Alphabet) Gaps() []byte

Gaps returns gaps

func (*Alphabet) IsValid

func (a *Alphabet) IsValid(s []byte) error

IsValid is used to validate a byte slice

func (*Alphabet) IsValidLetter

func (a *Alphabet) IsValidLetter(b byte) bool

IsValidLetter is used to validate a letter

func (*Alphabet) Letters

func (a *Alphabet) Letters() []byte

Letters returns letters

func (*Alphabet) PairLetter

func (a *Alphabet) PairLetter(b byte) (byte, error)

PairLetter returns the pair letter of b.

func (*Alphabet) String

func (a *Alphabet) String() string

String returns type of the alphabet

func (*Alphabet) Type

func (a *Alphabet) Type() string

Type returns type of the alphabet

type CodonTable

type CodonTable struct {
	ID         int
	Name       string
	InitCodons map[string]struct{} // upper-case of codon as string, map for fast querying
	StopCodons map[string]struct{} // upper-case of codon as string, map for fast querying
	// contains filtered or unexported fields
}

CodonTable represents a codon table

func NewCodonTable

func NewCodonTable(id int, name string) *CodonTable

NewCodonTable constructs a CodonTable with an ID and a Name. You need to set the detailed codon table by calling Set or Set2.

func (*CodonTable) Clone

func (t *CodonTable) Clone() CodonTable

Clone returns a deep copy of the CodonTable.

func (*CodonTable) Get

func (t *CodonTable) Get(codon []byte, allowUnknownCodon bool) (byte, error)

Get returns the amino acid of the codon ([]byte); the codon can be DNA or RNA. When allowUnknownCodon is true, codons that are not in the codon table are still translated to 'X', and "---" is translated to "-".

func (*CodonTable) Get2

func (t *CodonTable) Get2(codon string, allowUnknownCodon bool) (byte, error)

Get2 returns the amino acid of the codon (string), codon can be DNA or RNA.

func (*CodonTable) Set

func (t *CodonTable) Set(codon []byte, aminoAcid byte) error

Set sets a codon of byte slice.

func (*CodonTable) Set2

func (t *CodonTable) Set2(codon string, aminoAcid byte) error

Set2 sets a codon of string.

func (CodonTable) String

func (t CodonTable) String() string

String returns details of the CodonTable.

func (CodonTable) StringWithAmbiguousCodons

func (t CodonTable) StringWithAmbiguousCodons() string

StringWithAmbiguousCodons returns details of the CodonTable, including ambiguous codons.

func (*CodonTable) Translate

func (t *CodonTable) Translate(sequence []byte, frame int, trim bool, clean bool, allowUnknownCodon bool, markInitCodonAsM bool) ([]byte, error)

Translate translates a DNA/RNA sequence to an amino acid sequence. Available frames: 1, 2, 3, -1, -2, -3. If option trim is true, it removes all 'X' and '*' characters from the right end of the translation. If option clean is true, it changes all STOP codon positions from the '*' character to 'X' (an unknown residue). If option allowUnknownCodon is true, codons not in the codon table will be translated to 'X'. If option markInitCodonAsM is true, an initial codon at the beginning will be represented as 'M'.
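The core of frame-based translation is reading the sequence three bases at a time from a frame offset and looking each codon up in a table. The sketch below covers forward frames only and uses a hypothetical four-codon excerpt of the standard code (`miniTable`); it always maps unknown codons to 'X' and does not model the trim, clean, or markInitCodonAsM options.

```go
package main

import (
	"fmt"
	"strings"
)

// miniTable is a hypothetical four-codon excerpt of the standard code.
var miniTable = map[string]byte{
	"ATG": 'M', "GCC": 'A', "TGG": 'W', "TAA": '*',
}

// translate reads codons from the forward strand starting at frame
// (1, 2 or 3). RNA input is accepted by treating U as T. Codons missing
// from miniTable become 'X'; a trailing partial codon is ignored.
func translate(seq string, frame int) (string, error) {
	if frame < 1 || frame > 3 {
		return "", fmt.Errorf("unsupported frame %d", frame)
	}
	s := strings.ReplaceAll(strings.ToUpper(seq), "U", "T")
	var sb strings.Builder
	for i := frame - 1; i+3 <= len(s); i += 3 {
		aa, ok := miniTable[s[i:i+3]]
		if !ok {
			aa = 'X'
		}
		sb.WriteByte(aa)
	}
	return sb.String(), nil
}

func main() {
	protein, err := translate("ATGGCCTAA", 1)
	fmt.Println(protein, err) // MA* <nil>
}
```

Negative frames could be handled by translating the reverse complement with the corresponding positive frame.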

type QualityEncoding

type QualityEncoding int

QualityEncoding is the type of quality encoding

const (
	// Unknown quality encoding
	Unknown QualityEncoding = iota
	// Sanger format can encode a Phred quality score from 0 to 93 using
	// ASCII 33 to 126 (although in raw read data the Phred quality score
	// rarely exceeds 60, higher scores are possible in assemblies or read maps).
	Sanger
	// Solexa /Illumina 1.0 format can encode a Solexa/Illumina quality score
	// from -5 to 62 using ASCII 59 to 126 (although in raw read data Solexa
	// scores from -5 to 40 only are expected).
	Solexa
	// Illumina1p3 means Illumina 1.3+.
	// Starting with Illumina 1.3 and before Illumina 1.8, the format
	// encoded a Phred quality score from 0 to 62 using ASCII 64 to 126
	// (although in raw read data Phred scores from 0 to 40 only are expected).
	Illumina1p3
	// Illumina1p5 means Illumina 1.5+.
	// Starting in Illumina 1.5 and before Illumina 1.8, the Phred scores
	//  0 to 2 have a slightly different meaning. The values 0 and 1 are
	// no longer used and the value 2, encoded by ASCII 66 "B", is used
	// also at the end of reads as a Read Segment Quality Control Indicator.
	Illumina1p5
	// Illumina1p8 means Illumina 1.8+.
	// Starting in Illumina 1.8, the quality scores have basically
	// returned to the use of the Sanger format (Phred+33)
	Illumina1p8
)

func GuessQualityEncoding

func GuessQualityEncoding(quality []byte) []QualityEncoding

GuessQualityEncoding returns potential quality encodings.
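One simple way to shortlist encodings is to compare the smallest and largest byte of the quality string with the ASCII range of each format. The sketch below uses only the ranges stated in the constant comments above; the type and function names are hypothetical, and the package's actual heuristic (which also involves NMostCommonThreshold) may differ.

```go
package main

import "fmt"

// encodingRange describes the inclusive ASCII range of one format.
type encodingRange struct {
	name     string
	min, max byte
}

// knownRanges lists the ranges given in the QualityEncoding comments.
var knownRanges = []encodingRange{
	{"Sanger", 33, 126},
	{"Solexa", 59, 126},
	{"Illumina-1.3+", 64, 126},
}

// guessEncodings returns the names of all formats whose ASCII range
// contains every byte of quality.
func guessEncodings(quality []byte) []string {
	if len(quality) == 0 {
		return nil
	}
	lo, hi := quality[0], quality[0]
	for _, c := range quality[1:] {
		if c < lo {
			lo = c
		}
		if c > hi {
			hi = c
		}
	}
	var out []string
	for _, r := range knownRanges {
		if lo >= r.min && hi <= r.max {
			out = append(out, r.name)
		}
	}
	return out
}

func main() {
	fmt.Println(guessEncodings([]byte("#5I"))) // [Sanger]
}
```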

func (QualityEncoding) IsSolexa

func (qe QualityEncoding) IsSolexa() bool

IsSolexa tells whether the encoding is Solexa

func (QualityEncoding) Offset

func (qe QualityEncoding) Offset() int

Offset is the ASCII offset

func (QualityEncoding) QualityRange

func (qe QualityEncoding) QualityRange() []int

QualityRange is the typical quality range

func (QualityEncoding) String

func (qe QualityEncoding) String() string

type Seq

type Seq struct {
	Alphabet  *Alphabet
	Seq       []byte
	Qual      []byte
	QualValue []int
}

Seq represents a FASTA/Q record

func NewSeq

func NewSeq(t *Alphabet, s []byte) (*Seq, error)

NewSeq is constructor for type *Seq*

func NewSeqWithQual

func NewSeqWithQual(t *Alphabet, s []byte, q []byte) (*Seq, error)

NewSeqWithQual is used to store fastq sequence

func NewSeqWithQualWithoutValidation

func NewSeqWithQualWithoutValidation(t *Alphabet, s []byte, q []byte) (*Seq, error)

NewSeqWithQualWithoutValidation creates a Seq with quality without checking the sequence.

func NewSeqWithoutValidation

func NewSeqWithoutValidation(t *Alphabet, s []byte) (*Seq, error)

NewSeqWithoutValidation creates a Seq without checking the sequence.

func (*Seq) AvgQual

func (seq *Seq) AvgQual(asciiBase int) float64

AvgQual calculates average quality value.

func (*Seq) BaseContent

func (seq *Seq) BaseContent(list string) float64

BaseContent returns base content for given bases. For example:

seq.BaseContent("gc")
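Base content is the fraction of positions whose base appears in the given list. A minimal, case-insensitive sketch on plain strings follows; `baseContent` is a hypothetical helper, and returning 0 for an empty sequence is an assumption of this example.

```go
package main

import (
	"fmt"
	"strings"
)

// baseContent returns the fraction of bytes in seq that occur in list,
// ignoring letter case. An empty sequence yields 0.
func baseContent(seq, list string) float64 {
	if len(seq) == 0 {
		return 0
	}
	wanted := strings.ToUpper(list)
	n := 0
	for _, b := range strings.ToUpper(seq) {
		if strings.ContainsRune(wanted, b) {
			n++
		}
	}
	return float64(n) / float64(len(seq))
}

func main() {
	fmt.Println(baseContent("ACGTacgt", "gc")) // 0.5
}
```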

func (*Seq) BaseContentCaseSensitive

func (seq *Seq) BaseContentCaseSensitive(list string) float64

BaseContentCaseSensitive returns base content for given case sensitive bases.

func (*Seq) BaseCount

func (seq *Seq) BaseCount(list string) int

BaseCount counts bases

func (*Seq) BaseCountCaseSensitive

func (seq *Seq) BaseCountCaseSensitive(list string) int

BaseCountCaseSensitive counts bases, case is not ignored.

func (*Seq) Bases added in v0.1.1

func (seq *Seq) Bases(gapLetters string) int

Bases counts non-gap bases

func (*Seq) Clone

func (seq *Seq) Clone() *Seq

Clone returns a copy of the Seq.

func (*Seq) Clone2

func (seq *Seq) Clone2() *Seq

Clone2 clones the sequence except the alphabet

func (*Seq) Complement

func (seq *Seq) Complement() *Seq

Complement returns complement sequence.

func (*Seq) ComplementInplace

func (seq *Seq) ComplementInplace() *Seq

ComplementInplace complements the sequence in place.

func (*Seq) Degenerate2Regexp

func (seq *Seq) Degenerate2Regexp() string

Degenerate2Regexp transforms a sequence containing degenerate bases into a regular expression.

func (*Seq) FormatSeq

func (seq *Seq) FormatSeq(width int) []byte

FormatSeq wraps the sequence to the given line width.
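Wrapping is typically done by inserting a newline after every width bytes, as in FASTA output. The sketch below is a hypothetical `wrap` helper on byte slices; leaving the input untouched for widths below 1 is an assumption of this example.

```go
package main

import "fmt"

// wrap inserts a newline after every width bytes of s. The input slice is
// returned as is when width is below 1 or the sequence fits on one line.
func wrap(s []byte, width int) []byte {
	if width < 1 || len(s) <= width {
		return s
	}
	out := make([]byte, 0, len(s)+len(s)/width)
	for i := 0; i < len(s); i += width {
		end := i + width
		if end > len(s) {
			end = len(s)
		}
		if i > 0 {
			out = append(out, '\n')
		}
		out = append(out, s[i:end]...)
	}
	return out
}

func main() {
	fmt.Printf("%s\n", wrap([]byte("ACGTACGT"), 3))
}
```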

func (*Seq) GC

func (seq *Seq) GC() float64

GC returns the GC content

func (*Seq) Length

func (seq *Seq) Length() int

Length returns the length of sequence

func (*Seq) ParseQual

func (seq *Seq) ParseQual(asciiBase int)

ParseQual parses sequence quality, asciiBase = 33 for Phred+33.

func (*Seq) RemoveGaps

func (seq *Seq) RemoveGaps(letters string) *Seq

RemoveGaps returns a new seq without gaps.

func (*Seq) RemoveGapsInplace

func (seq *Seq) RemoveGapsInplace(letters string) *Seq

RemoveGapsInplace removes gaps in place

func (*Seq) RevCom

func (seq *Seq) RevCom() *Seq

RevCom returns reverse complement sequence

func (*Seq) RevComInplace

func (seq *Seq) RevComInplace() *Seq

RevComInplace reverse-complements the sequence in place.

func (*Seq) Reverse

func (seq *Seq) Reverse() *Seq

Reverse returns the reversed sequence.

func (*Seq) ReverseInplace

func (seq *Seq) ReverseInplace() *Seq

ReverseInplace reverses the sequence content

func (*Seq) Slider

func (seq *Seq) Slider(window int, step int, circular bool, greedy bool) func() (*Seq, bool)

Slider returns a function for sliding along the sequence. circular is for circular genomes and overrides greedy. If circular is false and greedy is true, the last fragment, even if shorter than window, will be returned.
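The returned function acts as an iterator: each call yields the next window and a flag that becomes false when the sequence is exhausted. The closure below sketches the non-circular case on plain strings; `slider` is a hypothetical helper, and returning the short tail exactly once in greedy mode is this sketch's interpretation of the description above.

```go
package main

import "fmt"

// slider returns an iterator over windows of s. Each call yields the next
// fragment and true, or "" and false when no fragment is left. With greedy
// set, one final fragment shorter than window is returned.
func slider(s string, window, step int, greedy bool) func() (string, bool) {
	start := 0
	return func() (string, bool) {
		if window < 1 || step < 1 || start >= len(s) {
			return "", false
		}
		end := start + window
		if end > len(s) {
			if !greedy {
				start = len(s)
				return "", false
			}
			frag := s[start:]
			start = len(s) // the short tail is returned only once
			return frag, true
		}
		frag := s[start:end]
		start += step
		return frag, true
	}
}

func main() {
	next := slider("ACGTA", 2, 2, true)
	for frag, ok := next(); ok; frag, ok = next() {
		fmt.Println(frag)
	}
}
```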

func (*Seq) String

func (seq *Seq) String() string

func (*Seq) SubSeq

func (seq *Seq) SubSeq(start int, end int) *Seq

SubSeq returns a subsequence. start and end are 1-based.

Examples:

1-based index    1 2 3 4 5 6 7 8 9 10

negative index 0-9-8-7-6-5-4-3-2-1

           seq    A C G T N a c g t n
           1:1    A
           2:4      C G T
         -4:-2                c g t
         -4:-1                c g t n
         -1:-1                      n
          2:-2      C G T N a c g t
          1:-1    A C G T N a c g t n
          1:12    A C G T N a c g t n
        -12:-1    A C G T N a c g t n

func (*Seq) SubSeqInplace

func (seq *Seq) SubSeqInplace(start int, end int) *Seq

SubSeqInplace returns the subsequence in place.

func (*Seq) Translate

func (seq *Seq) Translate(transl_table int, frame int, trim bool, clean bool, allowUnknownCodon bool, markInitCodonAsM bool) (*Seq, error)

Translate translates the RNA/DNA to an amino acid sequence. Available frames: 1, 2, 3, -1, -2, -3. If option trim is true, it removes all 'X' and '*' characters from the right end of the translation. If option clean is true, it changes all STOP codon positions from the '*' character to 'X' (an unknown residue). If option allowUnknownCodon is true, codons not in the codon table will be translated to 'X'. If option markInitCodonAsM is true, an initial codon at the beginning will be represented as 'M'.
