Documentation ¶
Overview ¶
Package sam is a library for parsing and representing SAM files, and for efficiently executing sequencing pipelines on .sam/.bam/.cram files, taking advantage of modern multi-core processors.
Modifications to headers and alignments are expressed as filters. The library comes with a number of commonly used pre-defined filters, but you can also define and use your own filters. A pipeline can be executed with the RunPipeline method of the PipelineInput interface, which accepts SAM/BAM/CRAM files as input and/or output sources, but can also operate on an in-memory representation of such files. PipelineInput and PipelineOutput can be implemented to also operate on other input/output sources, such as databases.
elPrep provides high-level Filter and AlignmentFilter types that operate on SAM file header and alignment structs. elPrep then uses the pargo library for expressing pipelines of such filters for efficient parallel execution. It is normally not necessary to deal with pargo pipelines directly, but you can check the documentation at https://godoc.org/github.com/ExaScience/pargo/pipeline for details of pargo pipelines if necessary.
Index ¶
- Constants
- Variables
- func AlignmentToString(p *pipeline.Pipeline, _ pipeline.NodeKind, _ *int) (receiver pipeline.Receiver, _ pipeline.Finalizer)
- func ComposeFilters(header *Header, hdrFilters []Filter) (receiver pipeline.Receiver)
- func CoordinateLess(aln1, aln2 *Alignment) bool
- func FormatComment(out *bufio.Writer, code, comment string) error
- func FormatHeaderLine(out *bufio.Writer, code string, record utils.StringMap) error
- func FormatString(out *bufio.Writer, tag, value string) error
- func FormatTag(out []byte, tag utils.Symbol, value interface{}) ([]byte, error)
- func IsHeaderUserTag(code string) bool
- func MergeSingleEndFilesSplitPerChromosome(inputPath, output, fai, fasta, inputPrefix, inputExtension string, ...) (err error)
- func MergeSortedFilesSplitPerChromosome(inputPath, output, fai, fasta, inputPrefix, inputExtension string, ...) (err error)
- func MergeUnsortedFilesSplitPerChromosome(inputPath, output, fai, fasta, inputPrefix, inputExtension string, ...) (err error)
- func ParseHeaderLineFromString(line string) (utils.StringMap, error)
- func QNAMELess(aln1, aln2 *Alignment) bool
- func SQLN(record utils.StringMap) (int32, error)
- func SetSQLN(record utils.StringMap, value int32)
- func SkipHeader(reader *bufio.Reader) (lines int, err error)
- func SplitFilePerChromosome(input, outputPath, outputPrefix, outputExtension, fai, fasta string) (err error)
- func SplitSingleEndFilePerChromosome(input, outputPath, outputPrefix, outputExtension, fai, fasta string) (err error)
- func StringToAlignment(p *pipeline.Pipeline, _ pipeline.NodeKind, _ *int) (receiver pipeline.Receiver, _ pipeline.Finalizer)
- type Alignment
- func (aln *Alignment) FlagEvery(flag uint16) bool
- func (aln *Alignment) FlagNotAny(flag uint16) bool
- func (aln *Alignment) FlagNotEvery(flag uint16) bool
- func (aln *Alignment) FlagSome(flag uint16) bool
- func (aln *Alignment) Format(out []byte) ([]byte, error)
- func (aln *Alignment) IsDuplicate() bool
- func (aln *Alignment) IsFirst() bool
- func (aln *Alignment) IsLast() bool
- func (aln *Alignment) IsMultiple() bool
- func (aln *Alignment) IsNextReversed() bool
- func (aln *Alignment) IsNextUnmapped() bool
- func (aln *Alignment) IsProper() bool
- func (aln *Alignment) IsQCFailed() bool
- func (aln *Alignment) IsReversed() bool
- func (aln *Alignment) IsSecondary() bool
- func (aln *Alignment) IsSupplementary() bool
- func (aln *Alignment) IsUnmapped() bool
- func (aln *Alignment) LIBID() interface{}
- func (aln *Alignment) REFID() int32
- func (aln *Alignment) RG() interface{}
- func (aln *Alignment) SetLIBID(libid interface{})
- func (aln *Alignment) SetREFID(refid int32)
- func (aln *Alignment) SetRG(rg interface{})
- type AlignmentFilter
- type AlignmentSorter
- type By
- type ByteArray
- type CigarOperation
- type FieldParser
- type Filter
- type GroupingOrder
- type Header
- func (hdr *Header) AddUserRecord(code string, record utils.StringMap)
- func (hdr *Header) EnsureHD() utils.StringMap
- func (hdr *Header) EnsureUserRecords() map[string][]utils.StringMap
- func (hdr *Header) Format(out *bufio.Writer) (err error)
- func (hdr *Header) HDGO() GroupingOrder
- func (hdr *Header) HDSO() SortingOrder
- func (hdr *Header) SetHDGO(value GroupingOrder)
- func (hdr *Header) SetHDSO(value SortingOrder)
- type InputFile
- type OutputFile
- type PipelineInput
- type PipelineOutput
- type Reader
- type Sam
- type SortingOrder
- type StringScanner
- func (sc *StringScanner) Err() error
- func (sc *StringScanner) Len() int
- func (sc *StringScanner) ParseAlignment() *Alignment
- func (sc *StringScanner) ParseByteArray(tag utils.Symbol) (utils.Symbol, interface{})
- func (sc *StringScanner) ParseChar(tag utils.Symbol) (utils.Symbol, interface{})
- func (sc *StringScanner) ParseFloat(tag utils.Symbol) (utils.Symbol, interface{})
- func (sc *StringScanner) ParseHeaderField() (tag, value string)
- func (sc *StringScanner) ParseHeaderLine() utils.StringMap
- func (sc *StringScanner) ParseInteger(tag utils.Symbol) (utils.Symbol, interface{})
- func (sc *StringScanner) ParseMandatoryField() string
- func (sc *StringScanner) ParseNumericArray(tag utils.Symbol) (utils.Symbol, interface{})
- func (sc *StringScanner) ParseOptionalField() (tag utils.Symbol, value interface{})
- func (sc *StringScanner) ParseString(tag utils.Symbol) (utils.Symbol, interface{})
- func (sc *StringScanner) Reset(s string)
- type Writer
Constants ¶
const (
	SamExt  = ".sam"
	BamExt  = ".bam"
	CramExt = ".cram"
)
SAM file extensions.
const (
	FileFormatVersion = "1.5"
	FileFormatDate    = "1 Jun 2017"
)
The SAM file format version and date strings supported by this library. This is entered by default in an @HD line in the header section of a SAM file, unless user code explicitly asks for a different version number. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
const (
	// Template having multiple segments in sequencing.
	Multiple = 0x1
	// Each segment properly aligned according to the aligner.
	Proper = 0x2
	// Segment unmapped.
	Unmapped = 0x4
	// Next segment in the template unmapped.
	NextUnmapped = 0x8
	// SEQ being reversed complemented.
	Reversed = 0x10
	// SEQ of the next segment in the template being reverse
	// complemented.
	NextReversed = 0x20
	// The first segment in the template.
	First = 0x40
	// The last segment in the template.
	Last = 0x80
	// Secondary alignment.
	Secondary = 0x100
	// Not passing filters, such as platform/vendor quality controls.
	QCFailed = 0x200
	// PCR or optical duplicate.
	Duplicate = 0x400
	// Supplementary alignment.
	Supplementary = 0x800
)
Bit values for the FLAG field in the Alignment struct. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
const CigarOperations = "MmIiDdNnSsHhPpXx="
CigarOperations contains all valid CIGAR operations.
Variables ¶
var (
	CC = utils.Intern("CC")
	LB = utils.Intern("LB")
	PG = utils.Intern("PG")
	PU = utils.Intern("PU")
	RG = utils.Intern("RG")
)
Symbols for some commonly used optional fields. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
var (
	LIBID = utils.Intern("LIBID")
	REFID = utils.Intern("REFID")
)
Symbols for some temporary fields.
Functions ¶
func AlignmentToString ¶
func AlignmentToString(p *pipeline.Pipeline, _ pipeline.NodeKind, _ *int) (receiver pipeline.Receiver, _ pipeline.Finalizer)
AlignmentToString returns a pargo pipeline.Receiver that formats slices of Alignment pointers into slices of strings representing these alignments according to the SAM file format. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.
func ComposeFilters ¶
ComposeFilters takes a Header and a slice of Filter functions, and successively calls these functions to generate the corresponding AlignmentFilter predicates. It then returns a pargo pipeline.Receiver that applies these AlignmentFilter predicates on the slices of Alignment pointers it receives. ComposeFilters may return nil if all AlignmentFilters are nil.
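The composition described above can be illustrated with a minimal, self-contained sketch. It does not use the real package or pargo: Header, Alignment, AlignmentFilter, and Filter are reduced stand-ins, and compose, dropUnmapped, and touchHeader are hypothetical names. The sketch returns a plain function over a slice rather than a pipeline.Receiver.

```go
package main

import "fmt"

// Reduced stand-ins for the package's types (assumption: only the
// fields needed for this illustration are included).
type Header struct{ HD map[string]string }

type Alignment struct {
	QNAME string
	FLAG  uint16
}

type AlignmentFilter func(*Alignment) bool
type Filter func(*Header) AlignmentFilter

// compose calls every Filter with the header, collects the non-nil
// predicates, and returns a function that keeps only the alignments
// accepted by all of them. It returns nil if no predicate remains.
func compose(hdr *Header, filters []Filter) func([]*Alignment) []*Alignment {
	var preds []AlignmentFilter
	for _, f := range filters {
		if p := f(hdr); p != nil {
			preds = append(preds, p)
		}
	}
	if len(preds) == 0 {
		return nil
	}
	return func(alns []*Alignment) []*Alignment {
		kept := alns[:0] // filter in place, reusing the backing array
	outer:
		for _, aln := range alns {
			for _, p := range preds {
				if !p(aln) {
					continue outer
				}
			}
			kept = append(kept, aln)
		}
		return kept
	}
}

// dropUnmapped removes alignments whose Unmapped bit (0x4) is set.
func dropUnmapped(*Header) AlignmentFilter {
	return func(aln *Alignment) bool { return aln.FLAG&0x4 == 0 }
}

// touchHeader only modifies the header and needs no per-alignment work,
// so it returns nil.
func touchHeader(hdr *Header) AlignmentFilter {
	hdr.HD = map[string]string{"VN": "1.5"}
	return nil
}

func main() {
	hdr := &Header{}
	run := compose(hdr, []Filter{touchHeader, dropUnmapped})
	alns := []*Alignment{{QNAME: "r1", FLAG: 0x4}, {QNAME: "r2"}}
	for _, aln := range run(alns) {
		fmt.Println(aln.QNAME)
	}
}
```

A filter that returns nil costs nothing per alignment, which is why header-only filters are skipped when the predicates are collected.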
func CoordinateLess ¶
CoordinateLess compares two alignments according to their coordinate. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD, SO.
func FormatComment ¶
FormatComment writes a header comment line in a SAM file header section. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
func FormatHeaderLine ¶
FormatHeaderLine writes a header line in a SAM file header section. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
func FormatString ¶
FormatString writes a SAM file TAG of type string.
func FormatTag ¶
FormatTag writes a SAM file TAG by appending its ASCII-string representation to out and returning the result, dispatching on the actual type of the given value. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
The following types are accepted: byte (A), int32 (i), float32 (f), string (Z), ByteArray (H), []int8 (B:c), []uint8 (B:C), []int16 (B:s), []uint16 (B:S), []int32 (B:i), []uint32 (B:I), and []float32 (B:f).
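As an illustration of dispatching on the value type, here is a minimal sketch that appends TAG:TYPE:VALUE to a byte slice for a subset of the accepted types. It is not the library's implementation: appendTag is a hypothetical name, the tag is a plain string instead of a utils.Symbol, and whether the real function also writes a leading tab is not stated in this documentation.

```go
package main

import (
	"fmt"
	"strconv"
)

// appendTag appends the textual form of one optional field to out.
// Only byte, int32, float32, string, and []int32 are handled here.
func appendTag(out []byte, tag string, value interface{}) ([]byte, error) {
	out = append(out, tag...)
	switch v := value.(type) {
	case byte:
		out = append(out, ":A:"...)
		out = append(out, v)
	case int32:
		out = append(out, ":i:"...)
		out = strconv.AppendInt(out, int64(v), 10)
	case float32:
		out = append(out, ":f:"...)
		out = strconv.AppendFloat(out, float64(v), 'g', -1, 32)
	case string:
		out = append(out, ":Z:"...)
		out = append(out, v...)
	case []int32:
		out = append(out, ":B:i"...)
		for _, n := range v {
			out = append(out, ',')
			out = strconv.AppendInt(out, int64(n), 10)
		}
	default:
		return nil, fmt.Errorf("unsupported tag value type %T", value)
	}
	return out, nil
}

func main() {
	out, err := appendTag(nil, "NM", int32(3))
	fmt.Println(string(out), err) // NM:i:3 <nil>
}
```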
func IsHeaderUserTag ¶
IsHeaderUserTag determines whether this tag string represents a user-defined tag.
func MergeSingleEndFilesSplitPerChromosome ¶
func MergeSingleEndFilesSplitPerChromosome(inputPath, output, fai, fasta, inputPrefix, inputExtension string, header *Header) (err error)
MergeSingleEndFilesSplitPerChromosome merges files containing single-end reads that were split with SplitSingleEndFilePerChromosome.
func MergeSortedFilesSplitPerChromosome ¶
func MergeSortedFilesSplitPerChromosome(inputPath, output, fai, fasta, inputPrefix, inputExtension string, header *Header) (err error)
MergeSortedFilesSplitPerChromosome merges files that were split with SplitFilePerChromosome and sorted in coordinate order.
func MergeUnsortedFilesSplitPerChromosome ¶
func MergeUnsortedFilesSplitPerChromosome(inputPath, output, fai, fasta, inputPrefix, inputExtension string, header *Header) (err error)
MergeUnsortedFilesSplitPerChromosome merges files that were split with SplitFilePerChromosome and are unsorted.
func ParseHeaderLineFromString ¶
ParseHeaderLineFromString parses a SAM header line from a string, except that entries are separated by white space instead of tabulators. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
The @ record type code must have already been scanned. ParseHeaderLineFromString cannot be used for @CO lines.
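The whitespace-separated variant can be pictured with the following sketch. It is an assumption-laden stand-in, not the library's parser: parseHeaderFields is a hypothetical name, it returns a plain map[string]string, and it does not handle values that themselves contain white space.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeaderFields splits a header line (without its @ record type
// code) on white space and stores each TAG:VALUE entry in a map.
func parseHeaderFields(line string) (map[string]string, error) {
	record := make(map[string]string)
	for _, field := range strings.Fields(line) {
		// Split at the first colon only, since values may contain colons.
		i := strings.IndexByte(field, ':')
		if i != 2 {
			return nil, fmt.Errorf("invalid header field %q", field)
		}
		record[field[:i]] = field[i+1:]
	}
	return record, nil
}

func main() {
	record, err := parseHeaderFields("SN:chr1 LN:248956422")
	fmt.Println(record["SN"], record["LN"], err)
}
```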
func QNAMELess ¶
QNAMELess compares two alignments according to their query template name. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD, SO.
func SQLN ¶
SQLN returns the LN field value, assuming that the given record represents an @SQ line in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
If the LN field is present, error is nil unless the value cannot be successfully parsed into an int32. If the LN field is not present, SQLN returns the maximum possible value for LN and a non-nil error value.
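The return conventions above can be sketched as follows. This is a hypothetical stand-in (sqLength operating on a plain map), assuming that the maximum possible value for LN is math.MaxInt32; the value returned on a parse failure is this sketch's own choice, since the documentation does not specify it.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
)

// sqLength returns the LN value of an @SQ record.
// If LN is missing, it returns math.MaxInt32 and a non-nil error.
func sqLength(record map[string]string) (int32, error) {
	ln, ok := record["LN"]
	if !ok {
		return math.MaxInt32, errors.New("missing LN field in @SQ record")
	}
	n, err := strconv.ParseInt(ln, 10, 32)
	if err != nil {
		return 0, err // assumption: 0 on parse failure
	}
	return int32(n), nil
}

func main() {
	fmt.Println(sqLength(map[string]string{"SN": "chr1", "LN": "1000"}))
	fmt.Println(sqLength(map[string]string{"SN": "chr2"}))
}
```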
func SetSQLN ¶
SetSQLN sets the LN field value, assuming that the given record represents an @SQ line in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
func SkipHeader ¶
SkipHeader skips the complete header in a SAM file. This is more efficient than calling ParseHeader and ignoring its result.
Returns the number of header lines and a non-nil error value if an error occurred.
func SplitFilePerChromosome ¶
func SplitFilePerChromosome(input, outputPath, outputPrefix, outputExtension, fai, fasta string) (err error)
SplitFilePerChromosome splits a SAM file into: a file containing all unmapped reads, a file containing all pairs where reads map to different chromosomes, and a file per chromosome containing all pairs where the reads map to that chromosome. There are no requirements on the input file for splitting.
func SplitSingleEndFilePerChromosome ¶
func SplitSingleEndFilePerChromosome(input, outputPath, outputPrefix, outputExtension, fai, fasta string) (err error)
SplitSingleEndFilePerChromosome splits a SAM file containing single-end reads into a file for the unmapped reads, and a file per chromosome, containing all reads that map to that chromosome. There are no requirements on the input file for splitting.
func StringToAlignment ¶
func StringToAlignment(p *pipeline.Pipeline, _ pipeline.NodeKind, _ *int) (receiver pipeline.Receiver, _ pipeline.Finalizer)
StringToAlignment returns a pargo pipeline.Receiver that parses slices of strings representing alignments according to the SAM file format into slices of pointers to freshly allocated Alignment values. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.
Types ¶
type Alignment ¶
type Alignment struct {
	// The Query template NAME.
	QNAME string
	// The bitwise FLAG.
	FLAG uint16
	// The Reference sequence NAME.
	RNAME string
	// The 1-based leftmost mapping POSition.
	POS int32
	// The MAPping Quality.
	MAPQ byte
	// The CIGAR string.
	CIGAR string
	// The Reference sequence name of the mate/NEXT read.
	RNEXT string
	// The 1-based leftmost mapping Position of the mate/NEXT read.
	PNEXT int32
	// The observed Template LENgth.
	TLEN int32
	// The segment SEQuence.
	SEQ string
	// The ASCII of Phred-scaled base QUALity+33.
	QUAL string
	// The optional fields in a read alignment.
	TAGS utils.SmallMap
	// Additional optional fields which are not stored in SAM files, but
	// reserved for temporary values in filters.
	Temps utils.SmallMap
}
An Alignment represents a single read alignment with mandatory and optional fields that can be contained in a SAM file alignment line. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.
func NewAlignment ¶
func NewAlignment() *Alignment
NewAlignment allocates and initializes an empty alignment.
func (*Alignment) FlagEvery ¶
FlagEvery reports whether every bit set in the given flag is also set in aln.FLAG.
func (*Alignment) FlagNotAny ¶
FlagNotAny reports whether none of the bits set in the given flag are set in aln.FLAG.
func (*Alignment) FlagNotEvery ¶
FlagNotEvery reports whether at least one bit set in the given flag is not set in aln.FLAG.
func (*Alignment) FlagSome ¶
FlagSome reports whether at least one bit set in the given flag is also set in aln.FLAG.
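The four flag predicates correspond to simple bit-mask tests. The sketch below spells them out as free functions on a uint16 (hypothetical names flagEvery, flagSome, flagNotAny, flagNotEvery), reusing a few of the bit values from the constants section. It illustrates the semantics and is not copied from the library.

```go
package main

import "fmt"

// A subset of the FLAG bit values documented in the constants section.
const (
	Multiple  = 0x1
	Proper    = 0x2
	Unmapped  = 0x4
	Duplicate = 0x400
)

// flagEvery: all bits of mask are set in flag.
func flagEvery(flag, mask uint16) bool { return flag&mask == mask }

// flagSome: at least one bit of mask is set in flag.
func flagSome(flag, mask uint16) bool { return flag&mask != 0 }

// flagNotAny: no bit of mask is set in flag.
func flagNotAny(flag, mask uint16) bool { return flag&mask == 0 }

// flagNotEvery: at least one bit of mask is missing from flag.
func flagNotEvery(flag, mask uint16) bool { return flag&mask != mask }

func main() {
	var flag uint16 = Multiple | Proper

	fmt.Println(flagEvery(flag, Multiple|Proper))     // true
	fmt.Println(flagSome(flag, Proper|Unmapped))      // true
	fmt.Println(flagNotAny(flag, Unmapped|Duplicate)) // true
	fmt.Println(flagNotEvery(flag, Proper|Unmapped))  // true
}
```

Note that flagNotAny is the negation of flagSome, and flagNotEvery is the negation of flagEvery.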
func (*Alignment) Format ¶
Format writes a SAM file read alignment line by appending its ASCII-string representation to out and returning the result. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.
func (*Alignment) IsDuplicate ¶
IsDuplicate checks for PCR or optical duplicate. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsFirst ¶
IsFirst checks for being the first segment in the template. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsLast ¶
IsLast checks for being the last segment in the template. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsMultiple ¶
IsMultiple checks for template having multiple segments in sequencing. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsNextReversed ¶
IsNextReversed checks for SEQ of the next segment in the template being reverse complemented. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsNextUnmapped ¶
IsNextUnmapped checks for next segment in the template unmapped. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsProper ¶
IsProper checks for each segment being properly aligned according to the aligner. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsQCFailed ¶
IsQCFailed checks for not passing filters, such as platform/vendor quality controls. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsReversed ¶
IsReversed checks for SEQ being reverse complemented. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsSecondary ¶
IsSecondary checks for secondary alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsSupplementary ¶
IsSupplementary checks for supplementary alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsUnmapped ¶
IsUnmapped checks for segment unmapped. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) LIBID ¶
func (aln *Alignment) LIBID() interface{}
LIBID returns the LIBID temporary field.
func (*Alignment) REFID ¶
REFID returns the REFID temporary field.
If the REFID field is not set, this will panic with a log message. The AddREFID filter can be used to avoid this situation. (The elPrep command line tool ensures that AddREFID is correctly used for its default pipelines.)
func (*Alignment) RG ¶
func (aln *Alignment) RG() interface{}
RG returns the (potentially empty) RG optional field.
func (*Alignment) SetLIBID ¶
func (aln *Alignment) SetLIBID(libid interface{})
SetLIBID sets the LIBID temporary field.
type AlignmentFilter ¶
An AlignmentFilter receives an Alignment which it can modify. It returns true if the alignment should be kept, and false if the alignment should be removed.
type AlignmentSorter ¶
type AlignmentSorter struct {
// contains filtered or unexported fields
}
AlignmentSorter is a helper for sorting Alignment slices that implements https://godoc.org/github.com/ExaScience/pargo/sort#StableSorter
func (AlignmentSorter) Assign ¶
func (s AlignmentSorter) Assign(p psort.StableSorter) func(i, j, len int)
Assign implements the method of the StableSorter interface.
func (AlignmentSorter) Len ¶
func (s AlignmentSorter) Len() int
Len implements the method of the sort.Interface.
func (AlignmentSorter) Less ¶
func (s AlignmentSorter) Less(i, j int) bool
Less implements the method of the sort.Interface.
func (AlignmentSorter) NewTemp ¶
func (s AlignmentSorter) NewTemp() psort.StableSorter
NewTemp implements the method of the StableSorter interface.
func (AlignmentSorter) SequentialSort ¶
func (s AlignmentSorter) SequentialSort(i, j int)
SequentialSort implements the method of the SequentialSorter interface.
type By ¶
By is a type for comparison predicates on Alignment pointers.
func (By) ParallelStableSort ¶
ParallelStableSort sorts a slice of alignments according to the given comparison predicate.
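To show how a comparison predicate drives a stable sort, here is a sequential sketch built on sort.SliceStable from the standard library. It does not reproduce the parallel pargo-based implementation. The alignment type, its RefID field (a stand-in for the REFID temporary field), and the ordering used in coordinateLess (reference index first, then position) are assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// alignment is a reduced stand-in for the package's Alignment type.
type alignment struct {
	QNAME string
	RefID int32 // hypothetical stand-in for the REFID temporary field
	POS   int32
}

// by is a comparison predicate on alignment pointers.
type by func(a, b *alignment) bool

// stableSort sorts alns with the predicate, preserving the relative
// order of alignments that compare as equal.
func (less by) stableSort(alns []*alignment) {
	sort.SliceStable(alns, func(i, j int) bool { return less(alns[i], alns[j]) })
}

// coordinateLess orders by reference index, then by position (assumption).
func coordinateLess(a, b *alignment) bool {
	if a.RefID != b.RefID {
		return a.RefID < b.RefID
	}
	return a.POS < b.POS
}

// qnameLess orders by query template name.
func qnameLess(a, b *alignment) bool { return a.QNAME < b.QNAME }

func main() {
	alns := []*alignment{
		{QNAME: "b", RefID: 1, POS: 50},
		{QNAME: "a", RefID: 0, POS: 100},
		{QNAME: "c", RefID: 0, POS: 100},
	}
	by(coordinateLess).stableSort(alns)
	for _, aln := range alns {
		fmt.Println(aln.QNAME, aln.RefID, aln.POS)
	}
}
```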
type ByteArray ¶
type ByteArray []byte
ByteArray is a representation for byte arrays as stored in optional fields of read alignment lines using type H. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
type CigarOperation ¶
type CigarOperation struct {
	Length    int32
	Operation byte // 'M', 'I', 'D', 'N', 'S', 'H', 'P', '=', or 'X'
}
CigarOperation represents a CIGAR operation. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.6.
func ScanCigarString ¶
func ScanCigarString(cigar string) ([]CigarOperation, error)
ScanCigarString converts a CIGAR string to a slice of CigarOperation. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.6.
Uses an internal cache to reduce memory overhead. It is safe for multiple goroutines to call ScanCigarString concurrently.
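The conversion itself can be sketched without the cache. The following stand-in (hypothetical name scanCigar) accepts only the nine upper-case operations listed in the struct comment and treats "*" as an empty result. Both choices are assumptions of this sketch, not documented behaviour of ScanCigarString.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// CigarOperation mirrors the struct documented above.
type CigarOperation struct {
	Length    int32
	Operation byte
}

// scanCigar splits a CIGAR string into (length, operation) pairs.
func scanCigar(cigar string) ([]CigarOperation, error) {
	if cigar == "*" {
		return nil, nil // assumption: "*" means no CIGAR available
	}
	var ops []CigarOperation
	for i := 0; i < len(cigar); {
		start := i
		for i < len(cigar) && cigar[i] >= '0' && cigar[i] <= '9' {
			i++
		}
		// A length must be present and must be followed by an operation.
		if i == start || i == len(cigar) {
			return nil, fmt.Errorf("malformed CIGAR string %q", cigar)
		}
		n, err := strconv.ParseInt(cigar[start:i], 10, 32)
		if err != nil {
			return nil, err
		}
		op := cigar[i]
		if strings.IndexByte("MIDNSHP=X", op) < 0 {
			return nil, fmt.Errorf("invalid CIGAR operation %q in %q", op, cigar)
		}
		ops = append(ops, CigarOperation{Length: int32(n), Operation: op})
		i++
	}
	return ops, nil
}

func main() {
	ops, err := scanCigar("3M1I2D")
	fmt.Println(len(ops), err)
	for _, op := range ops {
		fmt.Printf("%d%c\n", op.Length, op.Operation)
	}
}
```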
type FieldParser ¶
type FieldParser func(*StringScanner, utils.Symbol) (utils.Symbol, interface{})
FieldParser is the signature for all parsers for optional fields in read alignment lines in SAM files.
type Filter ¶
type Filter func(*Header) AlignmentFilter
A Filter receives a Header and returns an AlignmentFilter or nil.
type GroupingOrder ¶
type GroupingOrder string
GroupingOrder represents the possible values for the GO tag stored in the @HD line of a header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
const (
	None      GroupingOrder = "none"
	Query     GroupingOrder = "query"
	Reference GroupingOrder = "reference"
)
Grouping orders.
type Header ¶
type Header struct {
	// The @HD line.
	HD utils.StringMap
	// The @SQ, @RG, and @PG lines, in the order they occur in the
	// header.
	SQ, RG, PG []utils.StringMap
	// The @CO lines in the order they occur in the header.
	CO []string
	// The lines with user-defined @ tags, for each tag in the order
	// they occur in the header.
	UserRecords map[string][]utils.StringMap
}
Header represents the information stored in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
Each line (except for @CO) is represented as a map[string]string, mapping string tags to string values.
The zero Header is valid and empty.
func ParseHeader ¶
ParseHeader parses a complete header in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
Returns a freshly allocated header, the number of header lines, and a non-nil error value if an error occurred during parsing.
func (*Header) AddUserRecord ¶
AddUserRecord adds a header line for the given user-defined @ tag to the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
func (*Header) EnsureHD ¶
EnsureHD ensures that an @HD line is present in the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
If an @HD line already exists, it is returned unchanged. Otherwise, the HD field is initialized with a default VN value.
func (*Header) EnsureUserRecords ¶
EnsureUserRecords ensures that a map for user-defined @ tags exists in the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
If the map already exists, it is returned unchanged. Otherwise, the UserRecords field is initialized with an empty map.
func (*Header) Format ¶
Format writes the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
func (*Header) HDGO ¶
func (hdr *Header) HDGO() GroupingOrder
HDGO returns the grouping order (GO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
If there is no @HD line, or the GO field is not set, returns "none".
func (*Header) HDSO ¶
func (hdr *Header) HDSO() SortingOrder
HDSO returns the sorting order (SO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
If there is no @HD line, or the SO field is not set, returns "unknown".
func (*Header) SetHDGO ¶
func (hdr *Header) SetHDGO(value GroupingOrder)
SetHDGO sets the grouping order (GO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
This also deletes the value for the SO field if it is set.
func (*Header) SetHDSO ¶
func (hdr *Header) SetHDSO(value SortingOrder)
SetHDSO sets the sorting order (SO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
This also deletes the value for the GO field if it is set.
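The interplay between the SO and GO setters and getters can be modelled with a small map-based sketch. The header type and its methods below are hypothetical stand-ins using plain strings, and the assumption that the setters create the @HD line on demand (with VN "1.5", the documented FileFormatVersion) is this sketch's own.

```go
package main

import "fmt"

// header is a reduced stand-in holding only the @HD line.
type header struct{ HD map[string]string }

// ensureHD creates the @HD line with a default VN if it is missing.
func (h *header) ensureHD() map[string]string {
	if h.HD == nil {
		h.HD = map[string]string{"VN": "1.5"}
	}
	return h.HD
}

// setSO sets the sorting order and drops any grouping order.
func (h *header) setSO(value string) {
	hd := h.ensureHD()
	delete(hd, "GO")
	hd["SO"] = value
}

// setGO sets the grouping order and drops any sorting order.
func (h *header) setGO(value string) {
	hd := h.ensureHD()
	delete(hd, "SO")
	hd["GO"] = value
}

// so returns the sorting order, or "unknown" if it is not set.
func (h *header) so() string {
	if v, ok := h.HD["SO"]; ok { // reading a nil map is safe in Go
		return v
	}
	return "unknown"
}

// grouping returns the grouping order, or "none" if it is not set.
func (h *header) grouping() string {
	if v, ok := h.HD["GO"]; ok {
		return v
	}
	return "none"
}

func main() {
	var h header
	fmt.Println(h.so(), h.grouping()) // unknown none
	h.setSO("coordinate")
	h.setGO("query")
	fmt.Println(h.so(), h.grouping()) // unknown query
}
```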
type InputFile ¶
InputFile represents a SAM, BAM, or CRAM file for input.
func Open ¶
Open a SAM file for input.
If the filename extension is .bam or .cram, use samtools view for input. Tell samtools view to only return the header section for input when headerOnly is true.
samtools must be visible in the directories named by the PATH environment variable for .bam or .cram input.
If the filename extension is not .bam or .cram, then .sam is always assumed.
If the name is "/dev/stdin", then the input is read from os.Stdin.
type OutputFile ¶
OutputFile represents a SAM, BAM, or CRAM file for output.
func Create ¶
func Create(name, fai, fasta string) (*OutputFile, error)
Create a SAM file for output.
If the filename extension is .bam or .cram, use samtools view for output. If the filename extension is .cram, then either fai or fasta must be a filename, and the other must be "". If fai is a filename, it is passed as the -t option to samtools view. If fasta is a filename, it is passed as the -T option to samtools view.
samtools must be visible in the directories named by the PATH environment variable for .bam or .cram output.
If the filename extension is not .bam or .cram, then .sam is always assumed.
If the name is "/dev/stdout", then the output is written to os.Stdout.
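The extension rules and the .cram reference options described above can be sketched as two small helpers. Both formatFor and cramReferenceArgs are hypothetical names. The sketch matches extensions case-sensitively (an assumption) and only builds the option list: it does not start samtools.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// formatFor picks the file format from the filename extension;
// anything other than .bam or .cram is treated as .sam.
func formatFor(name string) string {
	switch filepath.Ext(name) {
	case ".bam":
		return "bam"
	case ".cram":
		return "cram"
	default:
		return "sam"
	}
}

// cramReferenceArgs returns the samtools view reference option for
// .cram output: -t for a .fai file, -T for a FASTA file. Exactly one
// of the two names must be non-empty.
func cramReferenceArgs(fai, fasta string) ([]string, error) {
	switch {
	case fai != "" && fasta == "":
		return []string{"-t", fai}, nil
	case fai == "" && fasta != "":
		return []string{"-T", fasta}, nil
	default:
		return nil, errors.New("exactly one of fai or fasta must be given for .cram output")
	}
}

func main() {
	fmt.Println(formatFor("sample.bam"), formatFor("sample.txt"))
	fmt.Println(cramReferenceArgs("ref.fai", ""))
	fmt.Println(cramReferenceArgs("", ""))
}
```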
func (*OutputFile) Close ¶
func (output *OutputFile) Close() error
Close the SAM output file. If samtools view is used for output, wait for its process to finish.
func (*OutputFile) SamWriter ¶
func (output *OutputFile) SamWriter() *Writer
SamWriter returns the Writer for a SAM, BAM or CRAM OutputFile.
type PipelineInput ¶
type PipelineInput interface {
RunPipeline(output PipelineOutput, filters []Filter, sortingOrder SortingOrder) error
}
A PipelineInput arranges for a pargo pipeline to be properly initialized, arranges for the pipeline to run the given filters, calls output.AddNodes(...), and eventually runs the pipeline. If RunPipeline doesn't encounter an error of its own, it should return the error of its pargo pipeline, if any.
type PipelineOutput ¶
type PipelineOutput interface {
AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)
}
A PipelineOutput can add nodes to the given pargo pipeline. AddNodes also receives a header that should be added to the output, and a sortingOrder. AddNodes should arrange for the alignments that it receives to be sorted according to that sortingOrder if possible, or report an error if it can't perform such a sort. Any error should be reported to the pipeline by calling p.Err(err) with a non-nil error value.
type Reader ¶
Reader is a bufio.Reader for a SAM, BAM or CRAM InputFile.
func (*Reader) RunPipeline ¶
func (input *Reader) RunPipeline(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder) error
RunPipeline implements the PipelineInput interface for Reader values that produce input in the SAM file format.
type Sam ¶
Sam represents a complete SAM data set that can be contained in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.
func (*Sam) AddNodes ¶
func (sam *Sam) AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)
AddNodes implements the PipelineOutput interface for Sam values to represent complete SAM files in memory.
func (*Sam) Format ¶
Format writes a complete SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.
func (*Sam) RunPipeline ¶
func (sam *Sam) RunPipeline(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder) error
RunPipeline implements the PipelineInput interface for Sam values that represent complete SAM files in memory.
type SortingOrder ¶
type SortingOrder string
SortingOrder represents the possible values for the SO tag stored in the @HD line of a header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
const (
	Keep       SortingOrder = "keep"
	Unknown    SortingOrder = "unknown"
	Unsorted   SortingOrder = "unsorted"
	Queryname  SortingOrder = "queryname"
	Coordinate SortingOrder = "coordinate"
)
Sorting orders.
type StringScanner ¶
type StringScanner struct {
// contains filtered or unexported fields
}
A StringScanner can be used to scan/parse ASCII strings representing lines in SAM files.
The zero StringScanner is valid and empty.
func (*StringScanner) Err ¶
func (sc *StringScanner) Err() error
Err returns the error that occurred during scanning/parsing.
func (*StringScanner) Len ¶
func (sc *StringScanner) Len() int
Len returns the number of ASCII characters that still need to be scanned/parsed. Returns 0 if Err() would return a non-nil value.
func (*StringScanner) ParseAlignment ¶
func (sc *StringScanner) ParseAlignment() *Alignment
ParseAlignment parses a read alignment line in a SAM file and returns a freshly allocated alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.
func (*StringScanner) ParseByteArray ¶
func (sc *StringScanner) ParseByteArray(tag utils.Symbol) (utils.Symbol, interface{})
ParseByteArray parses a byte array in the tab-delimited Hex format and returns it as a ByteArray. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
func (*StringScanner) ParseChar ¶
func (sc *StringScanner) ParseChar(tag utils.Symbol) (utils.Symbol, interface{})
ParseChar parses a single tab-delimited character and returns it as a byte. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
func (*StringScanner) ParseFloat ¶
func (sc *StringScanner) ParseFloat(tag utils.Symbol) (utils.Symbol, interface{})
ParseFloat parses a single tab-delimited float and returns it as a float32. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
func (*StringScanner) ParseHeaderField ¶
func (sc *StringScanner) ParseHeaderField() (tag, value string)
ParseHeaderField parses a field in a header line in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
func (*StringScanner) ParseHeaderLine ¶
func (sc *StringScanner) ParseHeaderLine() utils.StringMap
ParseHeaderLine parses a header line in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
The @ record type code must have already been scanned. ParseHeaderLine cannot be used for @CO lines.
func (*StringScanner) ParseInteger ¶
func (sc *StringScanner) ParseInteger(tag utils.Symbol) (utils.Symbol, interface{})
ParseInteger parses a single tab-delimited integer and returns it as an int32. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
func (*StringScanner) ParseMandatoryField ¶
func (sc *StringScanner) ParseMandatoryField() string
ParseMandatoryField parses a single tab-delimited mandatory field in a SAM read alignment line and returns it as a string. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.
func (*StringScanner) ParseNumericArray ¶
func (sc *StringScanner) ParseNumericArray(tag utils.Symbol) (utils.Symbol, interface{})
ParseNumericArray parses a typed, tab-delimited, and comma-separated integer or numeric array and returns it as a []int8, []uint8, []int16, []uint16, []int32, []uint32, or []float32. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
func (*StringScanner) ParseOptionalField ¶
func (sc *StringScanner) ParseOptionalField() (tag utils.Symbol, value interface{})
ParseOptionalField parses a single tab-delimited optional field in a SAM read alignment line and returns it as a tag/value pair. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
The second return value is one of byte (representing an ASCII character), int32, float32, string, ByteArray, []int8, []uint8, []int16, []uint16, []int32, []uint32, or []float32.
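How a TAG:TYPE:VALUE field maps to a dynamically typed value can be illustrated with a reduced sketch. It is not the StringScanner implementation: parseOptional is a hypothetical name, it works on an already isolated field string, it returns the tag as a plain string, and it covers only the A, i, f, and Z types.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOptional parses one optional field such as "NM:i:3" and returns
// the tag together with a value of type byte, int32, float32, or string.
func parseOptional(field string) (string, interface{}, error) {
	parts := strings.SplitN(field, ":", 3)
	if len(parts) != 3 || len(parts[0]) != 2 || len(parts[1]) != 1 {
		return "", nil, fmt.Errorf("malformed optional field %q", field)
	}
	tag, typ, raw := parts[0], parts[1][0], parts[2]
	switch typ {
	case 'A':
		if len(raw) != 1 {
			return "", nil, fmt.Errorf("invalid character value in %q", field)
		}
		return tag, raw[0], nil
	case 'i':
		n, err := strconv.ParseInt(raw, 10, 32)
		if err != nil {
			return "", nil, err
		}
		return tag, int32(n), nil
	case 'f':
		f, err := strconv.ParseFloat(raw, 32)
		if err != nil {
			return "", nil, err
		}
		return tag, float32(f), nil
	case 'Z':
		return tag, raw, nil
	default:
		return "", nil, fmt.Errorf("type %q not handled in this sketch", typ)
	}
}

func main() {
	tag, value, err := parseOptional("NM:i:3")
	fmt.Printf("%s %T %v %v\n", tag, value, value, err)
}
```

Callers recover the concrete type with a type switch or type assertion, for example value.(int32) for an i field.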
func (*StringScanner) ParseString ¶
func (sc *StringScanner) ParseString(tag utils.Symbol) (utils.Symbol, interface{})
ParseString parses a single tab-delimited string and returns it. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
func (*StringScanner) Reset ¶
func (sc *StringScanner) Reset(s string)
Reset resets the scanner, and initializes it with the given string.