Documentation ¶
Overview ¶
Package sam is a library for parsing and representing SAM files, and for efficiently executing sequencing pipelines on .sam/.bam files, taking advantage of modern multi-core processors.
Modifications to headers and alignments are expressed as filters. The library comes with a number of commonly used pre-defined filters, but you can also define and use your own filters. A pipeline can be executed with the RunPipeline method of the PipelineInput interface, which accepts SAM/BAM files as input and/or output sources, but can also operate on an in-memory representation of such files. PipelineInput and PipelineOutput can be implemented to also operate on other input/output sources, such as databases.
elPrep provides high-level Filter and AlignmentFilter types that operate on SAM file header and alignment structs. elPrep then uses the pargo library for expressing pipelines of such filters for efficient parallel execution. It is normally not necessary to deal with pargo pipelines directly, but you can check the documentation at https://godoc.org/github.com/ExaScience/pargo/pipeline for details of pargo pipelines if necessary.
Index ¶
- Constants
- Variables
- func AlignmentToBytes(writer *OutputFile) pipeline.Filter
- func BytesToAlignment(reader *InputFile) pipeline.Filter
- func BytesToAlignmentFI(reader *InputFile, setFileIndex bool) pipeline.Filter
- func ComposeFilters(header *Header, hdrFilters []Filter) (receiver pipeline.Receiver)
- func CoordinateLess(aln1, aln2 *Alignment) bool
- func IsHeaderUserTag(code string) bool
- func MergeSingleEndFilesSplitPerChromosome(inputPath, output, inputPrefix, inputExtension string, header *Header, _ int) (funcErr error)
- func MergeSortedFilesSplitPerChromosome(inputPath, output, inputPrefix, inputExtension string, header *Header, _ int) (funcErr error)
- func MergeUnsortedFilesSplitPerChromosome(inputPath, output, inputPrefix, inputExtension string, header *Header, _ int) (funcErr error)
- func ParseBamHeader(reader io.Reader) (hdr *Header, references []BAMReference, err error)
- func ParseHeaderLineFromString(line string) (utils.StringMap, error)
- func QNAMELess(aln1, aln2 *Alignment) bool
- func SQLN(record utils.StringMap) (int32, error)
- func SetSQLN(record utils.StringMap, value int32)
- func SkipSamHeader(reader *bufio.Reader) (err error)
- func SplitFilePerChromosome(input, outputPath, outputPrefix, outputExtension string, contigGroupSize int) (funcErr error)
- func SplitSingleEndFilePerChromosome(input, outputPath, outputPrefix, outputExtension string, contigGroupSize int) (funcErr error)
- type Alignment
- func (aln *Alignment) FileIndex() int
- func (aln *Alignment) FlagEvery(flag uint16) bool
- func (aln *Alignment) FlagNotAny(flag uint16) bool
- func (aln *Alignment) FlagNotEvery(flag uint16) bool
- func (aln *Alignment) FlagSome(flag uint16) bool
- func (aln *Alignment) IsDuplicate() bool
- func (aln *Alignment) IsFirst() bool
- func (aln *Alignment) IsLast() bool
- func (aln *Alignment) IsMultiple() bool
- func (aln *Alignment) IsNextReversed() bool
- func (aln *Alignment) IsNextUnmapped() bool
- func (aln *Alignment) IsProper() bool
- func (aln *Alignment) IsQCFailed() bool
- func (aln *Alignment) IsReversed() bool
- func (aln *Alignment) IsSecondary() bool
- func (aln *Alignment) IsSupplementary() bool
- func (aln *Alignment) IsUnmapped() bool
- func (aln *Alignment) LIBID() interface{}
- func (aln *Alignment) REFID() int32
- func (aln *Alignment) RG() interface{}
- func (aln *Alignment) SetLIBID(libid interface{})
- func (aln *Alignment) SetREFID(refid int32)
- func (aln *Alignment) SetRG(rg interface{})
- type AlignmentFilter
- type AlignmentSorter
- type BAMReference
- type BGZFReader
- type BGZFWriter
- type By
- type ByteArray
- type CigarOperation
- type Filter
- type GroupingOrder
- type Header
- func (hdr *Header) AddUserRecord(code string, record utils.StringMap)
- func (hdr *Header) EnsureHD() utils.StringMap
- func (hdr *Header) EnsureUserRecords() map[string][]utils.StringMap
- func (hdr *Header) FormatBam(out []byte) []byte
- func (hdr *Header) FormatSam(out []byte) []byte
- func (hdr *Header) HDGO() GroupingOrder
- func (hdr *Header) HDSO() SortingOrder
- func (hdr *Header) SetHDGO(value GroupingOrder)
- func (hdr *Header) SetHDSO(value SortingOrder)
- type InputFile
- func (f *InputFile) Close() error
- func (f *InputFile) Data() interface{}
- func (f *InputFile) Err() error
- func (f *InputFile) Fetch(size int) int
- func (f *InputFile) ParseAlignment(block []byte) (*Alignment, error)
- func (f *InputFile) ParseHeader() (*Header, error)
- func (f *InputFile) Prepare(ctx context.Context) int
- func (f *InputFile) RunPipeline(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder) error
- func (f *InputFile) RunPipelineFI(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder, ...) error
- func (f *InputFile) SkipHeader() error
- type OutputFile
- func (f *OutputFile) AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)
- func (f *OutputFile) Close() error
- func (f *OutputFile) FormatAlignment(aln *Alignment, out []byte) ([]byte, error)
- func (f *OutputFile) FormatHeader(hdr *Header) error
- func (f *OutputFile) Write(p []byte) (int, error)
- type PipelineInput
- type PipelineOutput
- type Sam
- type Sequence
- type SortingOrder
Constants ¶
const ( SamExt = ".sam" BamExt = ".bam" )
SAM file extensions.
const ( FileFormatVersion = "1.6" FileFormatDate = "22 May 2018" )
The SAM file format version and date strings supported by this library. This is entered by default in an @HD line in the header section of a SAM file, unless user code explicitly asks for a different version number. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
const ( // Template having multiple segments in sequencing. Multiple = 0x1 // Each segment properly aligned according to the aligner. Proper = 0x2 // Segment unmapped. Unmapped = 0x4 // Next segment in the template unmapped. NextUnmapped = 0x8 // SEQ being reverse complemented. Reversed = 0x10 // SEQ of the next segment in the template being reverse // complemented. NextReversed = 0x20 // The first segment in the template. First = 0x40 // The last segment in the template. Last = 0x80 // Secondary alignment. Secondary = 0x100 // Not passing filters, such as platform/vendor quality controls. QCFailed = 0x200 // PCR or optical duplicate. Duplicate = 0x400 // Supplementary alignment. Supplementary = 0x800 )
Bit values for the FLAG field in the Alignment struct. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
Variables ¶
var ( CC = utils.Intern("CC") LB = utils.Intern("LB") PG = utils.Intern("PG") PU = utils.Intern("PU") RG = utils.Intern("RG") )
Symbols for some commonly used optional fields. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
var ( LIBID = utils.Intern("LIBID") REFID = utils.Intern("REFID") )
Symbols for some temporary fields.
Functions ¶
func AlignmentToBytes ¶
func AlignmentToBytes(writer *OutputFile) pipeline.Filter
AlignmentToBytes returns a pargo pipeline.Filter that formats slices of Alignment pointers into slices of bytes representing these alignments according to the SAM/BAM file format.
func BytesToAlignment ¶
BytesToAlignment returns a pargo pipeline.Filter that parses slices of bytes representing alignments according to the SAM/BAM file format into slices of pointers to freshly allocated Alignment values.
func BytesToAlignmentFI ¶
BytesToAlignmentFI returns a pargo pipeline.Filter that parses slices of bytes representing alignments according to the SAM/BAM file format into slices of pointers to freshly allocated Alignment values, with an additional option to indicate whether a file index should be recorded with each alignment or not.
func ComposeFilters ¶
ComposeFilters takes a Header and a slice of Filter functions, and successively calls these functions to generate the corresponding AlignmentFilter predicates. It then returns a pargo pipeline.Receiver that applies these AlignmentFilter predicates on the slices of Alignment pointers it receives. ComposeFilters may return nil if all AlignmentFilters are nil.
func CoordinateLess ¶
CoordinateLess compares two alignments according to their coordinate. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD, SO.
func IsHeaderUserTag ¶
IsHeaderUserTag determines whether this tag string represents a user-defined tag.
func MergeSingleEndFilesSplitPerChromosome ¶
func MergeSingleEndFilesSplitPerChromosome(inputPath, output, inputPrefix, inputExtension string, header *Header, _ int) (funcErr error)
MergeSingleEndFilesSplitPerChromosome merges files containing single-end reads that were split with SplitSingleEndFilePerChromosome.
func MergeSortedFilesSplitPerChromosome ¶
func MergeSortedFilesSplitPerChromosome(inputPath, output, inputPrefix, inputExtension string, header *Header, _ int) (funcErr error)
MergeSortedFilesSplitPerChromosome merges files that were split with SplitFilePerChromosome and sorted in coordinate order.
func MergeUnsortedFilesSplitPerChromosome ¶
func MergeUnsortedFilesSplitPerChromosome(inputPath, output, inputPrefix, inputExtension string, header *Header, _ int) (funcErr error)
MergeUnsortedFilesSplitPerChromosome merges files that were split with SplitFilePerChromosome and are unsorted.
func ParseBamHeader ¶
func ParseBamHeader(reader io.Reader) (hdr *Header, references []BAMReference, err error)
ParseBamHeader parses a complete header in a BAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 4.2.
Returns a freshly allocated header, the BAM-encoded sequence dictionary, and a non-nil error value if an error occurred during parsing.
func ParseHeaderLineFromString ¶
ParseHeaderLineFromString parses a SAM header line from a string, except that entries are separated by white space instead of tabulators. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
The @ record type code must have already been scanned. ParseHeaderLineFromString cannot be used for @CO lines.
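A minimal sketch of the parsing described above, assuming the record is a plain map[string]string and that each entry has the form TAG:VALUE with a two-character tag. The name parseHeaderLine and the duplicate-tag check are illustrative assumptions, not the library's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeaderLine is a hypothetical stand-in for ParseHeaderLineFromString.
// It splits the line on white space and each entry on its first colon.
func parseHeaderLine(line string) (map[string]string, error) {
	record := make(map[string]string)
	for _, field := range strings.Fields(line) {
		i := strings.IndexByte(field, ':')
		if i != 2 {
			return nil, fmt.Errorf("invalid header entry %q", field)
		}
		tag, value := field[:i], field[i+1:]
		// Assumption: a tag may occur only once per header line.
		if _, dup := record[tag]; dup {
			return nil, fmt.Errorf("duplicate tag %q", tag)
		}
		record[tag] = value
	}
	return record, nil
}

func main() {
	record, err := parseHeaderLine("ID:group1 SM:sample1 PL:ILLUMINA")
	fmt.Println(record, err)
}
```

Because entries are separated by white space, values that themselves contain spaces cannot be expressed in this form.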
func QNAMELess ¶
QNAMELess compares two alignments according to their query template name. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD, SO.
func SQLN ¶
SQLN returns the LN field value, assuming that the given record represents an @SQ line in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
If the LN field is present, error is nil unless the value cannot be successfully parsed into an int32. If the LN field is not present, SQLN returns the maximum possible value for LN and a non-nil error value.
func SetSQLN ¶
SetSQLN sets the LN field value, assuming that the given record represents an @SQ line in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
func SkipSamHeader ¶
SkipSamHeader skips the complete header in a SAM file. This is more efficient than calling ParseHeader and ignoring its result.
Returns a non-nil error value if an error occurred.
func SplitFilePerChromosome ¶
func SplitFilePerChromosome(input, outputPath, outputPrefix, outputExtension string, contigGroupSize int) (funcErr error)
SplitFilePerChromosome splits a SAM file into: a file containing all unmapped reads, a file containing all pairs where reads map to different chromosomes, and a file per chromosome containing all pairs where the reads map to that chromosome. There are no requirements on the input file for splitting.
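One way to read the three categories above is as a per-read classification. The sketch below is an assumption-laden illustration, not the library's code: the Alignment type is a reduced stand-in, the bucket labels "unmapped" and "spread" are hypothetical, paired-end input with a filled-in RNEXT is assumed, and the contigGroupSize parameter is ignored.

```go
package main

import "fmt"

// Alignment is a reduced stand-in with only the fields the sketch needs.
type Alignment struct {
	RNAME, RNEXT string
	FLAG         uint16
}

// Unmapped is the FLAG bit documented in the Constants section.
const Unmapped = 0x4

// bucketFor returns a hypothetical label for the output a read would go to.
func bucketFor(aln *Alignment) string {
	switch {
	case aln.FLAG&Unmapped != 0:
		return "unmapped"
	case aln.RNEXT == "=" || aln.RNEXT == aln.RNAME:
		return aln.RNAME // both reads of the pair map to this chromosome
	default:
		return "spread" // the reads of the pair map to different chromosomes
	}
}

func main() {
	reads := []*Alignment{
		{RNAME: "chr1", RNEXT: "="},
		{RNAME: "chr1", RNEXT: "chr2"},
		{RNAME: "*", RNEXT: "*", FLAG: Unmapped},
	}
	for _, aln := range reads {
		fmt.Println(bucketFor(aln))
	}
}
```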
func SplitSingleEndFilePerChromosome ¶
func SplitSingleEndFilePerChromosome(input, outputPath, outputPrefix, outputExtension string, contigGroupSize int) (funcErr error)
SplitSingleEndFilePerChromosome splits a SAM file containing single-end reads into a file for the unmapped reads, and a file per chromosome, containing all reads that map to that chromosome. There are no requirements on the input file for splitting.
Types ¶
type Alignment ¶
type Alignment struct { // The Query template NAME. QNAME string // The Reference sequence NAME. RNAME string // The 1-based leftmost mapping POSition (as in the SAM format). POS int32 // The bitwise FLAG. FLAG uint16 // The MAPping Quality. MAPQ byte // The CIGAR string as a slice of CIGAR operations. CIGAR []CigarOperation // The Reference sequence name of the mate/NEXT read. RNEXT string // The 1-based leftmost mapping Position of the mate/NEXT read (as in the SAM format). PNEXT int32 // The observed Template LENgth. TLEN int32 // The segment SEQuence (as in the BAM format). SEQ Sequence // The ASCII of Phred-scaled base QUALity+33. // A slice of the Phred-scaled base quality values (as in the BAM format, // without the increment of 33 to turn the values into printable ASCII characters). QUAL []byte // The optional fields in a read alignment. TAGS utils.SmallMap // Additional optional fields which are not stored in SAM files, but // reserved for temporary values in filters. Temps utils.SmallMap }
An Alignment represents a single read alignment with mandatory and optional fields that can be contained in a SAM file alignment line. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5. SEQ and QUAL are represented as in the BAM format, see Section 4.2.
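Since QUAL holds raw Phred scores rather than the printable SAM characters, converting between the two representations is a matter of adding or subtracting 33. The helper names below are hypothetical and only illustrate that relationship.

```go
package main

import "fmt"

// qualToASCII turns raw Phred scores (as stored in QUAL) into the
// printable Phred+33 string used in SAM text files.
func qualToASCII(qual []byte) string {
	out := make([]byte, len(qual))
	for i, q := range qual {
		out[i] = q + 33
	}
	return string(out)
}

// asciiToQual is the inverse: it strips the +33 offset again.
func asciiToQual(text string) []byte {
	out := make([]byte, len(text))
	for i := 0; i < len(text); i++ {
		out[i] = text[i] - 33
	}
	return out
}

func main() {
	fmt.Println(qualToASCII([]byte{0, 30, 40}))
	fmt.Println(asciiToQual("!?I"))
}
```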
func (*Alignment) FileIndex ¶
FileIndex returns the index of the alignment in the original input file. May return -1 if unknown. This function may be deprecated in the future.
func (*Alignment) FlagEvery ¶
FlagEvery checks for every bit set in the given flag being also set in aln.FLAG.
func (*Alignment) FlagNotAny ¶
FlagNotAny checks for not any bit set in the given flag being also set in aln.FLAG.
func (*Alignment) FlagNotEvery ¶
FlagNotEvery checks for not every bit set in the given flag being also set in aln.FLAG.
func (*Alignment) FlagSome ¶
FlagSome checks for some bits set in the given flag being also set in aln.FLAG.
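The four flag predicates reduce to simple bit-mask tests. The sketch below restates them as free functions on a uint16 so that it is self-contained; it follows the descriptions above and is not copied from the library. The constants repeat a few of the FLAG bit values from the Constants section.

```go
package main

import "fmt"

// A subset of the documented FLAG bit values.
const (
	Multiple  uint16 = 0x1
	Proper    uint16 = 0x2
	Unmapped  uint16 = 0x4
	Duplicate uint16 = 0x400
)

// flagEvery reports whether every bit set in mask is also set in flag.
func flagEvery(flag, mask uint16) bool { return flag&mask == mask }

// flagSome reports whether at least one bit set in mask is also set in flag.
func flagSome(flag, mask uint16) bool { return flag&mask != 0 }

// flagNotAny reports whether none of the bits set in mask is set in flag.
func flagNotAny(flag, mask uint16) bool { return flag&mask == 0 }

// flagNotEvery reports whether at least one bit set in mask is missing in flag.
func flagNotEvery(flag, mask uint16) bool { return flag&mask != mask }

func main() {
	flag := Multiple | Proper // a paired, properly aligned read
	fmt.Println(flagEvery(flag, Multiple|Proper))
	fmt.Println(flagSome(flag, Proper|Duplicate))
	fmt.Println(flagNotAny(flag, Unmapped|Duplicate))
	fmt.Println(flagNotEvery(flag, Proper|Duplicate))
}
```

Combining several bits into one mask lets a filter test a whole condition with a single call, for example "neither unmapped nor duplicate".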
func (*Alignment) IsDuplicate ¶
IsDuplicate checks for PCR or optical duplicate. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsFirst ¶
IsFirst checks for being the first segment in the template. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsLast ¶
IsLast checks for being the last segment in the template. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsMultiple ¶
IsMultiple checks for template having multiple segments in sequencing. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsNextReversed ¶
IsNextReversed checks for SEQ of the next segment in the template being reverse complemented. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsNextUnmapped ¶
IsNextUnmapped checks for next segment in the template unmapped. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsProper ¶
IsProper checks for each segment being properly aligned according to the aligner. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsQCFailed ¶
IsQCFailed checks for not passing filters, such as platform/vendor quality controls. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsReversed ¶
IsReversed checks for SEQ being reverse complemented. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsSecondary ¶
IsSecondary checks for secondary alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsSupplementary ¶
IsSupplementary checks for supplementary alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) IsUnmapped ¶
IsUnmapped checks for segment unmapped. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
func (*Alignment) LIBID ¶
func (aln *Alignment) LIBID() interface{}
LIBID returns the LIBID temporary field.
func (*Alignment) REFID ¶
REFID returns the REFID temporary field.
If the REFID field is not set, this will panic with a log message. The AddREFID filter can be used to avoid this situation. (The elPrep command line tool ensures that AddREFID is correctly used for its default pipelines.)
func (*Alignment) RG ¶
func (aln *Alignment) RG() interface{}
RG returns the (potentially empty) RG optional field.
func (*Alignment) SetLIBID ¶
func (aln *Alignment) SetLIBID(libid interface{})
SetLIBID sets the LIBID temporary field.
type AlignmentFilter ¶
An AlignmentFilter receives an Alignment which it can modify. It returns true if the alignment should be kept, and false if the alignment should be removed.
type AlignmentSorter ¶
type AlignmentSorter struct {
// contains filtered or unexported fields
}
AlignmentSorter is a helper for sorting Alignment slices that implements https://godoc.org/github.com/ExaScience/pargo/sort#StableSorter
func (AlignmentSorter) Assign ¶
func (s AlignmentSorter) Assign(p psort.StableSorter) func(i, j, len int)
Assign implements the method of the StableSorter interface.
func (AlignmentSorter) Len ¶
func (s AlignmentSorter) Len() int
Len implements the method of the sort.Interface.
func (AlignmentSorter) Less ¶
func (s AlignmentSorter) Less(i, j int) bool
Less implements the method of the sort.Interface.
func (AlignmentSorter) NewTemp ¶
func (s AlignmentSorter) NewTemp() psort.StableSorter
NewTemp implements the method of the StableSorter interface.
func (AlignmentSorter) SequentialSort ¶
func (s AlignmentSorter) SequentialSort(i, j int)
SequentialSort implements the method of the SequentialSorter interface.
type BAMReference ¶
BAMReference is an entry in a slice of BAM-encoded sequence dictionary entries. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 4.2.
func SkipBamHeader ¶
func SkipBamHeader(reader io.Reader) (references []BAMReference, err error)
SkipBamHeader skips the complete header in a BAM file. This is more efficient than calling ParseBamHeader and ignoring its result.
Returns the BAM-encoded sequence dictionary and a non-nil error value if an error occurred during parsing.
type BGZFReader ¶
type BGZFReader struct {
// contains filtered or unexported fields
}
BGZFReader reads in parallel from a BGZF file.
func NewBGZFReader ¶
func NewBGZFReader(r flate.Reader) (*BGZFReader, error)
NewBGZFReader returns a BGZFReader for the given flate.Reader.
func (*BGZFReader) Close ¶
func (bgzf *BGZFReader) Close() error
Close implements the corresponding method of io.Closer.
type BGZFWriter ¶
type BGZFWriter struct {
// contains filtered or unexported fields
}
BGZFWriter writes in parallel to a BGZF file.
func NewBGZFWriter ¶
func NewBGZFWriter(w io.Writer) *BGZFWriter
NewBGZFWriter returns a BGZFWriter for the given io.Writer.
type By ¶
By is a type for comparison predicates on Alignment pointers.
func (By) ParallelStableSort ¶
ParallelStableSort sorts a slice of alignments according to the given comparison predicate.
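The idea of sorting by a comparison predicate can be sketched sequentially with the standard library. Everything here is a stand-in: the Alignment type is reduced to three fields with REFID stored directly as RefID, the By signature is inferred from the description above, sort.SliceStable replaces the parallel sort, and placing reads with a negative reference id last is an assumption about how unmapped reads are ordered.

```go
package main

import (
	"fmt"
	"sort"
)

// Alignment is a reduced stand-in for the documented struct.
type Alignment struct {
	QNAME string
	RefID int32
	POS   int32
}

// By is assumed to be a comparison predicate on Alignment pointers.
type By func(aln1, aln2 *Alignment) bool

// coordinateLess orders by reference id, then by position.
// Assumption: reads without a reference (negative id) sort last.
func coordinateLess(a, b *Alignment) bool {
	switch {
	case a.RefID == b.RefID:
		return a.POS < b.POS
	case a.RefID < 0:
		return false
	case b.RefID < 0:
		return true
	default:
		return a.RefID < b.RefID
	}
}

// stableSort is a sequential stand-in for ParallelStableSort.
func (by By) stableSort(alns []*Alignment) {
	sort.SliceStable(alns, func(i, j int) bool { return by(alns[i], alns[j]) })
}

func main() {
	alns := []*Alignment{{"r3", -1, 0}, {"r2", 1, 5}, {"r1", 0, 100}, {"r0", 0, 7}}
	By(coordinateLess).stableSort(alns)
	for _, aln := range alns {
		fmt.Println(aln.QNAME, aln.RefID, aln.POS)
	}
}
```

A stable sort keeps reads that compare as equal in their original relative order, which matters when several reads share the same reference and position.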
type ByteArray ¶
type ByteArray []byte
ByteArray is a representation for byte arrays as stored in optional fields of read alignment lines using type H. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
type CigarOperation ¶
type CigarOperation struct { Length int32 Operation byte // 'M', 'I', 'D', 'N', 'S', 'H', 'P', '=', or 'X' }
CigarOperation represents a CIGAR operation. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.6.
func ScanCigarString ¶
func ScanCigarString(cigar string) ([]CigarOperation, error)
ScanCigarString converts a CIGAR string to a slice of CigarOperation. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.6.
Uses an internal cache to reduce memory overhead. It is safe for multiple goroutines to call ScanCigarString concurrently.
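For orientation, converting a CIGAR string into operations could look like the sketch below. It is a simplified illustration rather than the library's implementation: the name scanCigar is hypothetical, there is no cache, overflow of very long lengths is not checked, and treating "*" as an empty CIGAR is an assumption based on the SAM format.

```go
package main

import (
	"fmt"
	"strings"
)

// CigarOperation mirrors the documented struct.
type CigarOperation struct {
	Length    int32
	Operation byte
}

// scanCigar is a hypothetical, cache-free stand-in for ScanCigarString.
func scanCigar(cigar string) ([]CigarOperation, error) {
	if cigar == "*" {
		return nil, nil // assumption: "*" means no CIGAR is available
	}
	var ops []CigarOperation
	length, digits := int32(0), 0
	for i := 0; i < len(cigar); i++ {
		c := cigar[i]
		switch {
		case c >= '0' && c <= '9':
			length = length*10 + int32(c-'0')
			digits++
		case strings.IndexByte("MIDNSHP=X", c) >= 0:
			if digits == 0 {
				return nil, fmt.Errorf("missing length before %q in CIGAR %q", c, cigar)
			}
			ops = append(ops, CigarOperation{Length: length, Operation: c})
			length, digits = 0, 0
		default:
			return nil, fmt.Errorf("invalid character %q in CIGAR %q", c, cigar)
		}
	}
	if digits != 0 {
		return nil, fmt.Errorf("trailing length without operation in CIGAR %q", cigar)
	}
	return ops, nil
}

func main() {
	ops, err := scanCigar("3M1I2D")
	fmt.Println(ops, err)
}
```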
type Filter ¶
type Filter func(*Header) AlignmentFilter
A Filter receives a Header and returns an AlignmentFilter or nil.
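The two-stage design (a Filter runs once per header and may hand back a per-alignment predicate) can be illustrated with reduced stand-in types. In this sketch the Header and Alignment structs are simplified, the AlignmentFilter signature func(*Alignment) bool is inferred from its description, and removeUnmapped, addComment, and apply are hypothetical names; apply is a sequential illustration of the composition that ComposeFilters is described as performing, not the pargo-based implementation.

```go
package main

import "fmt"

// Reduced stand-ins for the documented types.
type Header struct{ CO []string }

type Alignment struct {
	QNAME string
	FLAG  uint16
}

type AlignmentFilter func(*Alignment) bool

type Filter func(*Header) AlignmentFilter

const Unmapped = 0x4

// removeUnmapped leaves the header alone and drops unmapped reads.
func removeUnmapped(_ *Header) AlignmentFilter {
	return func(aln *Alignment) bool { return aln.FLAG&Unmapped == 0 }
}

// addComment only modifies the header, so it returns nil.
func addComment(hdr *Header) AlignmentFilter {
	hdr.CO = append(hdr.CO, "processed by example filter")
	return nil
}

// apply calls each Filter once, then runs the non-nil predicates per read.
func apply(hdr *Header, filters []Filter, alns []*Alignment) []*Alignment {
	var preds []AlignmentFilter
	for _, filter := range filters {
		if pred := filter(hdr); pred != nil {
			preds = append(preds, pred)
		}
	}
	var kept []*Alignment
	for _, aln := range alns {
		keep := true
		for _, pred := range preds {
			if !pred(aln) {
				keep = false
				break
			}
		}
		if keep {
			kept = append(kept, aln)
		}
	}
	return kept
}

func main() {
	hdr := &Header{}
	alns := []*Alignment{{QNAME: "r1"}, {QNAME: "r2", FLAG: Unmapped}}
	kept := apply(hdr, []Filter{addComment, removeUnmapped}, alns)
	fmt.Println(len(kept), hdr.CO)
}
```

Returning nil from a header-only filter means no per-alignment work is scheduled for it, which keeps the alignment stage as small as possible.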
type GroupingOrder ¶
type GroupingOrder string
GroupingOrder represents the possible values for the GO tag stored in the @HD line of a header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
const ( None GroupingOrder = "none" Query GroupingOrder = "query" Reference GroupingOrder = "reference" )
Grouping orders.
type Header ¶
type Header struct { // The @HD line. HD utils.StringMap // The @SQ, @RG, and @PG lines, in the order they occur in the // header. SQ, RG, PG []utils.StringMap // The @CO lines in the order they occur in the header. CO []string // The lines with user-defined @ tags, for each tag in the order // they occur in the header. UserRecords map[string][]utils.StringMap }
Header represents the information stored in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
Each line (except for @CO) is represented as a map[string]string, mapping string tags to string values.
The zero Header is valid and empty.
func ParseSamHeader ¶
ParseSamHeader parses a complete header in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
Returns a freshly allocated header and a non-nil error value if an error occurred during parsing.
func (*Header) AddUserRecord ¶
AddUserRecord adds a header line for the given user-defined @ tag to the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
func (*Header) EnsureHD ¶
EnsureHD ensures that an @HD line is present in the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
If an @HD line already exists, it is returned unchanged. Otherwise, the HD field is initialized with a default VN value.
func (*Header) EnsureUserRecords ¶
EnsureUserRecords ensures that a map for user-defined @ tags exists in the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
If the map already exists, it is returned unchanged. Otherwise, the UserRecords field is initialized with an empty map.
func (*Header) FormatBam ¶
FormatBam writes the header section of a BAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 4.2.
func (*Header) FormatSam ¶
FormatSam writes the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.
func (*Header) HDGO ¶
func (hdr *Header) HDGO() GroupingOrder
HDGO returns the grouping order (GO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
If there is no @HD line, or the GO field is not set, returns "none".
func (*Header) HDSO ¶
func (hdr *Header) HDSO() SortingOrder
HDSO returns the sorting order (SO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
If there is no @HD line, or the SO field is not set, returns "unknown".
func (*Header) SetHDGO ¶
func (hdr *Header) SetHDGO(value GroupingOrder)
SetHDGO sets the grouping order (GO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
This also deletes the value for the SO field if it is set.
func (*Header) SetHDSO ¶
func (hdr *Header) SetHDSO(value SortingOrder)
SetHDSO sets the sorting order (SO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
This also deletes the value for the GO field if it is set.
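The interplay between the SO and GO accessors can be sketched on a reduced Header that holds only the @HD map. The lowercase method names are hypothetical stand-ins, plain strings replace SortingOrder and GroupingOrder, and having the setters create the @HD line on demand (with VN "1.6" as the default) is an assumption.

```go
package main

import "fmt"

// Header is a reduced stand-in that keeps only the @HD line.
type Header struct{ HD map[string]string }

// ensureHD creates the @HD line with a default VN if it is missing.
func (hdr *Header) ensureHD() map[string]string {
	if hdr.HD == nil {
		hdr.HD = map[string]string{"VN": "1.6"}
	}
	return hdr.HD
}

// hdso returns the SO value, or "unknown" if it is not set.
func (hdr *Header) hdso() string {
	if so, ok := hdr.HD["SO"]; ok { // reading from a nil map is safe in Go
		return so
	}
	return "unknown"
}

// hdgo returns the GO value, or "none" if it is not set.
func (hdr *Header) hdgo() string {
	if grouping, ok := hdr.HD["GO"]; ok {
		return grouping
	}
	return "none"
}

// setHDSO sets SO and deletes GO, so only one of the two is present.
func (hdr *Header) setHDSO(value string) {
	hd := hdr.ensureHD()
	delete(hd, "GO")
	hd["SO"] = value
}

// setHDGO sets GO and deletes SO.
func (hdr *Header) setHDGO(value string) {
	hd := hdr.ensureHD()
	delete(hd, "SO")
	hd["GO"] = value
}

func main() {
	var hdr Header // the zero Header is valid and empty
	fmt.Println(hdr.hdso(), hdr.hdgo())
	hdr.setHDGO("query")
	hdr.setHDSO("coordinate")
	fmt.Println(hdr.hdso(), hdr.hdgo())
}
```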
type InputFile ¶
type InputFile struct {
// contains filtered or unexported fields
}
InputFile represents a SAM or BAM file for input.
func Open ¶
Open a SAM or BAM file for input.
If the filename extension is not .bam, then .sam is always assumed.
If the name is "/dev/stdin", then the input is read from os.Stdin.
func (*InputFile) Data ¶
func (f *InputFile) Data() interface{}
Data implements the method of the pipeline.Source interface.
func (*InputFile) ParseAlignment ¶
ParseAlignment parses a block of bytes into an alignment. For example in a SAM file, each block of bytes must be one line from the alignment section.
func (*InputFile) ParseHeader ¶
ParseHeader fetches a header from a SAM or BAM file.
func (*InputFile) RunPipeline ¶
func (f *InputFile) RunPipeline(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder) error
RunPipeline implements the PipelineInput interface for SAM/BAM InputFile values.
func (*InputFile) RunPipelineFI ¶
func (f *InputFile) RunPipelineFI(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder, setFileIndex bool) error
RunPipelineFI implements a variant of the PipelineInput interface for SAM/BAM InputFile values, with an additional option to indicate whether a file index should be recorded with each alignment or not.
func (*InputFile) SkipHeader ¶
SkipHeader skips the header section of a SAM or BAM file. This is more efficient than calling ParseHeader and ignoring its result.
type OutputFile ¶
type OutputFile struct {
// contains filtered or unexported fields
}
OutputFile represents a SAM or BAM file for output.
func Create ¶
func Create(name string) (*OutputFile, error)
Create a SAM or BAM file for output.
If the filename extension is not .bam, then .sam is always assumed.
If the name is "/dev/stdout", then the output is written to os.Stdout.
func (*OutputFile) AddNodes ¶
func (f *OutputFile) AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)
AddNodes implements the PipelineOutput interface for SAM/BAM OutputFile values.
func (*OutputFile) Close ¶
func (f *OutputFile) Close() error
Close closes a SAM or BAM output file.
func (*OutputFile) FormatAlignment ¶
func (f *OutputFile) FormatAlignment(aln *Alignment, out []byte) ([]byte, error)
FormatAlignment formats an alignment into a block of bytes for a SAM or BAM file.
func (*OutputFile) FormatHeader ¶
func (f *OutputFile) FormatHeader(hdr *Header) error
FormatHeader writes the header to a SAM or BAM file.
type PipelineInput ¶
type PipelineInput interface {
RunPipeline(output PipelineOutput, filters []Filter, sortingOrder SortingOrder) error
}
A PipelineInput arranges for a pargo pipeline to be properly initialized, arranges for the pipeline to run the given filters, calls output.AddNodes(...), and eventually runs the pipeline. If RunPipeline doesn't encounter an error of its own, it should return the error of its pargo pipeline, if any.
type PipelineOutput ¶
type PipelineOutput interface {
AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)
}
A PipelineOutput can add nodes to the given pargo pipeline. AddNodes also receives a header that should be added to the output, and a sortingOrder. AddNodes should arrange for the alignments that it receives to be sorted according to that sortingOrder if possible, or report an error if it can't perform such a sort. Any error should be reported to the pipeline by calling p.Err(err) with a non-nil error value.
type Sam ¶
type Sam struct { Header *Header Alignments []*Alignment // contains filtered or unexported fields }
Sam represents a complete SAM data set that can be contained in a SAM or BAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.
func (*Sam) AddNodes ¶
func (sam *Sam) AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)
AddNodes implements the PipelineOutput interface for Sam values to represent complete SAM/BAM files in memory.
func (*Sam) NofBatches ¶
NofBatches sets or gets the number of batches that are created from this Sam value for the next call of RunPipeline.
NofBatches can be called safely by user programs before RunPipeline is called.
If user programs do not call NofBatches, or call it with a value < 1, then the pipeline will choose a reasonable default value that takes runtime.GOMAXPROCS(0) into account.
func (*Sam) RunPipeline ¶
func (sam *Sam) RunPipeline(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder) error
RunPipeline implements the PipelineInput interface for Sam values that represent complete SAM/BAM files in memory.
type Sequence ¶
Sequence encodes a SAM segment SEQuence as in the BAM format. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 4.2.
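As background, the BAM format packs two bases into each byte using 4-bit codes from the table "=ACMGRSVTWYHKDBN", with the first base in the high nibble. The sketch below illustrates only that encoding; how the Sequence type stores its data internally is not shown in this documentation, and the pack and unpack helpers are hypothetical. Unlike the format specification, which maps unrecognized characters to N, this simplified sketch rejects them.

```go
package main

import (
	"fmt"
	"strings"
)

// bases lists the BAM 4-bit codes: the index of a character is its code.
const bases = "=ACMGRSVTWYHKDBN"

// pack encodes a base string into 4-bit codes, two bases per byte.
func pack(seq string) ([]byte, error) {
	out := make([]byte, (len(seq)+1)/2)
	for i := 0; i < len(seq); i++ {
		code := strings.IndexByte(bases, seq[i])
		if code < 0 {
			return nil, fmt.Errorf("unsupported base %q", seq[i])
		}
		if i%2 == 0 {
			out[i/2] = byte(code) << 4 // first base of a pair: high nibble
		} else {
			out[i/2] |= byte(code) // second base of a pair: low nibble
		}
	}
	return out, nil
}

// unpack decodes n bases from a packed byte slice.
func unpack(packed []byte, n int) string {
	out := make([]byte, n)
	for i := range out {
		b := packed[i/2]
		if i%2 == 0 {
			b >>= 4
		} else {
			b &= 0x0F
		}
		out[i] = bases[b]
	}
	return string(out)
}

func main() {
	packed, err := pack("ACGTN")
	fmt.Printf("%x %v\n", packed, err)
	fmt.Println(unpack(packed, 5))
}
```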
type SortingOrder ¶
type SortingOrder string
SortingOrder represents the possible values for the SO tag stored in the @HD line of a header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.
const ( Keep SortingOrder = "keep" Unknown SortingOrder = "unknown" Unsorted SortingOrder = "unsorted" Queryname SortingOrder = "queryname" Coordinate SortingOrder = "coordinate" )
Sorting orders.