Documentation ¶
Overview ¶
Package filters defines filters for .sam/.bam sequencing pipelines.
Index ¶
- Variables
- func AddOrReplaceReadGroup(readGroup utils.StringMap) sam.Filter
- func AddPGLine(newPG utils.StringMap) sam.Filter
- func AddREFID(header *sam.Header) sam.AlignmentFilter
- func CleanSam(header *sam.Header) sam.AlignmentFilter
- func KeepOptionalFields(tags []string) sam.Filter
- func MarkDuplicates(alsoOpticals bool) (sam.Filter, *sync.Map, *sync.Map)
- func PrintDuplicatesMetrics(input, output, metrics string, removeDuplicates bool, ctrs DuplicatesCtrMap) (err error)
- func PrintDuplicatesMetricsToIntermediateFile(name string, ctrs DuplicatesCtrMap) (err error)
- func RemoveDuplicateReads(_ *sam.Header) sam.AlignmentFilter
- func RemoveMappingQualityLessThan(mq int) sam.Filter
- func RemoveNonExactMappingReads(_ *sam.Header) sam.AlignmentFilter
- func RemoveNonExactMappingReadsStrict(header *sam.Header) sam.AlignmentFilter
- func RemoveNonOverlappingReads(bed *bed.Bed) sam.Filter
- func RemoveOptionalFields(tags []string) sam.Filter
- func RemoveOptionalReads(header *sam.Header) sam.AlignmentFilter
- func RemoveUnmappedReads(_ *sam.Header) sam.AlignmentFilter
- func RemoveUnmappedReadsStrict(_ *sam.Header) sam.AlignmentFilter
- func RenameChromosomes(header *sam.Header) sam.AlignmentFilter
- func ReplaceReferenceSequenceDictionary(dict []utils.StringMap) sam.Filter
- func ReplaceReferenceSequenceDictionaryFromSamFile(samFile string) (f sam.Filter, err error)
- type BaseRecalibrator
- type BaseRecalibratorTables
- func (recal BaseRecalibratorTables) ApplyBQSR(quantizeLevels int, sqqList []uint8) sam.Filter
- func (recal BaseRecalibratorTables) ApplyBQSRWithMaxCycle(quantizeLevels int, sqqList []uint8, maxCycle int) sam.Filter
- func (recal BaseRecalibratorTables) Err() error
- func (recal BaseRecalibratorTables) FinalizeBQSRTables()
- func (recal *BaseRecalibratorTables) PrintBQSRTables(name string) error
- func (recal *BaseRecalibratorTables) PrintBQSRTablesToIntermediateFile(name string) error
- type DuplicatesCountsHistograms
- type DuplicatesCtr
- type DuplicatesCtrMap
- func LoadAndCombineDuplicateMetrics(metricsPath string) DuplicatesCtrMap
- func MarkOpticalDuplicates(reads *sam.Sam, _, pairs *sync.Map, deterministic bool) DuplicatesCtrMap
- func MarkOpticalDuplicatesWithPixelDistance(reads *sam.Sam, pairs *sync.Map, deterministic bool, opticalPixelDistance int) DuplicatesCtrMap
Constants ¶
This section is empty.
Variables ¶
var (
	X0 = utils.Intern("X0")
	X1 = utils.Intern("X1")
	XM = utils.Intern("XM")
	XO = utils.Intern("XO")
	XG = utils.Intern("XG")
)
Symbols for optional fields used for determining exact matches. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
Functions ¶
func AddOrReplaceReadGroup ¶
func AddOrReplaceReadGroup(readGroup utils.StringMap) sam.Filter
AddOrReplaceReadGroup returns a filter for adding or replacing the read group both in the Header and in each Alignment.
func AddPGLine ¶
func AddPGLine(newPG utils.StringMap) sam.Filter
AddPGLine returns a filter for adding a @PG tag to a Header, and ensuring that it is the first one in the chain.
func AddREFID ¶
func AddREFID(header *sam.Header) sam.AlignmentFilter
AddREFID is a filter for adding the refid (index in the reference sequence dictionary) to alignments as temporary values.
func CleanSam ¶
func CleanSam(header *sam.Header) sam.AlignmentFilter
CleanSam is a filter for soft-clipping an alignment at the end of a reference sequence, and setting MAPQ to 0 if unmapped.
func KeepOptionalFields ¶
func KeepOptionalFields(tags []string) sam.Filter
KeepOptionalFields returns a filter for removing all but a list of given optional fields in an alignment.
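The keep-list behaviour can be sketched in isolation. This is an illustration, not the elPrep implementation: the `field` type and the `keepFields` helper are hypothetical stand-ins for elPrep's alignment representation.

```go
package main

import "fmt"

// field is a hypothetical stand-in for one optional SAM field.
type field struct {
	Tag   string
	Value string
}

// keepFields returns only the fields whose tag appears in tags,
// preserving their original order.
func keepFields(fields []field, tags []string) []field {
	keep := make(map[string]bool, len(tags))
	for _, t := range tags {
		keep[t] = true
	}
	var out []field
	for _, f := range fields {
		if keep[f.Tag] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	fields := []field{{"RG", "group1"}, {"NM", "0"}, {"MD", "76"}}
	for _, f := range keepFields(fields, []string{"RG", "MD"}) {
		fmt.Println(f.Tag, f.Value)
	}
}
```

RemoveOptionalFields is the complement: the same loop with the membership test inverted.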
func MarkDuplicates ¶
func MarkDuplicates(alsoOpticals bool) (sam.Filter, *sync.Map, *sync.Map)
MarkDuplicates returns a filter for marking duplicate alignments. It depends on the AddREFID filter having been applied beforehand to fill in the refid.
Duplicate marking is based on an adapted Phred score. In case of ties, the QNAME is used as a tie-breaker.
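A rough sketch of score-based selection with a QNAME tie-breaker follows. Several details are assumptions, not taken from elPrep: the `read` type is hypothetical, the score is taken to be the sum of base qualities at or above a threshold, and on ties the lexicographically smaller QNAME is assumed to win.

```go
package main

import "fmt"

// read is a hypothetical stand-in for an alignment: a QNAME plus
// Phred-scaled base qualities already decoded from the QUAL string.
type read struct {
	QName string
	Quals []int
}

// score sums the base qualities that are at least minQual.
// The thresholded sum is an assumption of this sketch.
func score(r read, minQual int) int {
	total := 0
	for _, q := range r.Quals {
		if q >= minQual {
			total += q
		}
	}
	return total
}

// best returns the index of the read that stays unmarked: the highest
// score wins, and on ties the smaller QNAME wins (an assumed ordering).
// It returns 0 for an empty slice, so callers should check the length.
func best(reads []read, minQual int) int {
	bestIdx := 0
	for i := 1; i < len(reads); i++ {
		si := score(reads[i], minQual)
		sb := score(reads[bestIdx], minQual)
		if si > sb || (si == sb && reads[i].QName < reads[bestIdx].QName) {
			bestIdx = i
		}
	}
	return bestIdx
}

func main() {
	reads := []read{
		{"b", []int{30, 30}},
		{"a", []int{30, 30}},
		{"c", []int{10, 40}},
	}
	fmt.Println(reads[best(reads, 15)].QName)
}
```

All reads in the group other than the selected one would be flagged as duplicates.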
func PrintDuplicatesMetrics ¶
func PrintDuplicatesMetrics(input, output, metrics string, removeDuplicates bool, ctrs DuplicatesCtrMap) (err error)
PrintDuplicatesMetrics writes the duplication metrics for a set of reads to a file.
func PrintDuplicatesMetricsToIntermediateFile ¶
func PrintDuplicatesMetricsToIntermediateFile(name string, ctrs DuplicatesCtrMap) (err error)
PrintDuplicatesMetricsToIntermediateFile writes the duplicate metrics to a gob file.
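A gob round trip of per-library counters can be sketched with the standard library alone. The `counters` struct is a hypothetical subset of the exported DuplicatesCtr fields, and the sketch encodes to memory where the real function writes to a file.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// counters is a hypothetical subset of the exported DuplicatesCtr fields.
type counters struct {
	UnpairedReadsExamined int
	ReadPairsExamined     int
	ReadPairDuplicates    int
}

// encode serializes a library-name-to-counters map in gob format.
func encode(m map[string]*counters) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(m); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode reads back a map written by encode.
func decode(data []byte) (map[string]*counters, error) {
	var m map[string]*counters
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&m)
	return m, err
}

func main() {
	data, err := encode(map[string]*counters{
		"libA": {ReadPairsExamined: 20, ReadPairDuplicates: 3},
	})
	if err != nil {
		panic(err)
	}
	m, err := decode(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(m["libA"].ReadPairsExamined, m["libA"].ReadPairDuplicates)
}
```

Writing to disk only requires swapping the buffer for a file opened with os.Create.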
func RemoveDuplicateReads ¶
func RemoveDuplicateReads(_ *sam.Header) sam.AlignmentFilter
RemoveDuplicateReads is a filter for removing duplicate sam-alignment instances, based on FLAG.
func RemoveMappingQualityLessThan ¶
func RemoveMappingQualityLessThan(mq int) sam.Filter
RemoveMappingQualityLessThan returns a filter for removing reads whose mapping quality is below the given value.
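The threshold filter can be sketched as a closure. The `alignment` and `alignmentFilter` types below are hypothetical and only loosely modeled on the filter signatures in this package; the convention that a filter returns true to keep an alignment is an assumption of the sketch.

```go
package main

import "fmt"

// alignment is a hypothetical, pared-down alignment record.
type alignment struct {
	QName string
	MapQ  byte
}

// alignmentFilter reports whether an alignment should be kept.
// The keep-on-true convention is an assumption of this sketch.
type alignmentFilter func(*alignment) bool

// minMapQ returns a filter that keeps alignments with MAPQ >= mq.
func minMapQ(mq int) alignmentFilter {
	return func(a *alignment) bool { return int(a.MapQ) >= mq }
}

// apply returns the alignments accepted by f, in their original order.
func apply(alns []*alignment, f alignmentFilter) []*alignment {
	var kept []*alignment
	for _, a := range alns {
		if f(a) {
			kept = append(kept, a)
		}
	}
	return kept
}

func main() {
	alns := []*alignment{{"r1", 60}, {"r2", 10}, {"r3", 20}}
	for _, a := range apply(alns, minMapQ(20)) {
		fmt.Println(a.QName, a.MapQ)
	}
}
```

A read whose MAPQ equals the threshold is kept; only strictly lower values are removed.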
func RemoveNonExactMappingReads ¶
func RemoveNonExactMappingReads(_ *sam.Header) sam.AlignmentFilter
RemoveNonExactMappingReads is a filter that removes all reads that are not exact matches with the reference (soft-clipping ok), based on CIGAR string (only M and S allowed).
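The CIGAR test can be sketched as a small scanner. The helper name `onlyMatchAndSoftClip` is hypothetical, and rejecting an unavailable CIGAR ("*") or a malformed string is an assumption of this sketch, not confirmed elPrep behaviour.

```go
package main

import "fmt"

// onlyMatchAndSoftClip reports whether every operation in a CIGAR string
// is M (alignment match) or S (soft clip). An empty, unavailable ("*"),
// or malformed CIGAR yields false.
func onlyMatchAndSoftClip(cigar string) bool {
	if cigar == "" || cigar == "*" {
		return false
	}
	digits := 0 // digits seen since the last operation
	for _, c := range cigar {
		switch {
		case c >= '0' && c <= '9':
			digits++
		case (c == 'M' || c == 'S') && digits > 0:
			digits = 0
		default:
			return false
		}
	}
	// A trailing run of digits without an operation is malformed.
	return digits == 0
}

func main() {
	for _, c := range []string{"76M", "5S71M", "30M2I44M", "*"} {
		fmt.Println(c, onlyMatchAndSoftClip(c))
	}
}
```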
func RemoveNonExactMappingReadsStrict ¶
func RemoveNonExactMappingReadsStrict(header *sam.Header) sam.AlignmentFilter
RemoveNonExactMappingReadsStrict is a filter that removes all reads that are not exact matches with the reference, based on the optional fields X0=1 (unique mapping), X1=0 (no suboptimal hit), XM=0 (no mismatch), XO=0 (no gap opening), XG=0 (no gap extension).
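The five conditions can be sketched as a lookup over a read's integer-valued optional fields. The map representation and the helper name are hypothetical, and treating a missing field as a failed condition is an assumption of this sketch.

```go
package main

import "fmt"

// isStrictExact reports whether the optional fields describe a unique,
// gap-free, mismatch-free hit: X0=1, X1=0, XM=0, XO=0, XG=0.
// A missing field counts as a failed condition (an assumption).
func isStrictExact(opt map[string]int) bool {
	want := map[string]int{"X0": 1, "X1": 0, "XM": 0, "XO": 0, "XG": 0}
	for tag, v := range want {
		got, ok := opt[tag]
		if !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	exact := map[string]int{"X0": 1, "X1": 0, "XM": 0, "XO": 0, "XG": 0}
	mismatch := map[string]int{"X0": 1, "X1": 0, "XM": 2, "XO": 0, "XG": 0}
	fmt.Println(isStrictExact(exact), isStrictExact(mismatch))
}
```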
func RemoveNonOverlappingReads ¶
func RemoveNonOverlappingReads(bed *bed.Bed) sam.Filter
RemoveNonOverlappingReads returns a filter for removing all reads that do not overlap with a set of regions specified by a bed file.
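The core of an overlap test is a coordinate conversion: BED intervals are 0-based and half-open, while SAM POS is 1-based. The sketch below uses a hypothetical `region` type and takes the read's reference span as a parameter; how elPrep derives that span and searches the regions is not shown here.

```go
package main

import "fmt"

// region is a BED interval: 0-based, half-open [Start, End).
type region struct {
	Chrom      string
	Start, End int
}

// overlaps reports whether a read on chrom, with 1-based leftmost
// position pos and covering refLen reference bases, overlaps r.
func overlaps(r region, chrom string, pos, refLen int) bool {
	if chrom != r.Chrom || refLen <= 0 {
		return false
	}
	readStart := pos - 1 // convert SAM's 1-based POS to 0-based
	readEnd := readStart + refLen
	return readStart < r.End && r.Start < readEnd
}

func main() {
	r := region{"chr1", 100, 200}
	fmt.Println(overlaps(r, "chr1", 200, 10)) // last base of the region
	fmt.Println(overlaps(r, "chr1", 201, 10)) // starts just past the region
}
```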
func RemoveOptionalFields ¶
func RemoveOptionalFields(tags []string) sam.Filter
RemoveOptionalFields returns a filter for removing optional fields in an alignment.
func RemoveOptionalReads ¶
func RemoveOptionalReads(header *sam.Header) sam.AlignmentFilter
RemoveOptionalReads is a filter for removing alignments that represent optional information in elPrep.
func RemoveUnmappedReads ¶
func RemoveUnmappedReads(_ *sam.Header) sam.AlignmentFilter
RemoveUnmappedReads is a filter for removing unmapped sam-alignment instances, based on FLAG.
func RemoveUnmappedReadsStrict ¶
func RemoveUnmappedReadsStrict(_ *sam.Header) sam.AlignmentFilter
RemoveUnmappedReadsStrict is a filter for removing unmapped sam-alignment instances, based on FLAG, or POS=0, or RNAME=*.
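The difference between the plain and strict checks can be sketched with a hypothetical `alignment` struct. The 0x4 bit is the "segment unmapped" FLAG defined by the SAM specification; the struct and helper names are illustrative only.

```go
package main

import "fmt"

// alignment is a hypothetical, pared-down alignment record.
type alignment struct {
	Flag  uint16
	RName string
	Pos   int32
}

// flagUnmapped is the SAM FLAG bit for an unmapped segment.
const flagUnmapped = 0x4

// isUnmapped checks the FLAG bit only.
func isUnmapped(a alignment) bool {
	return a.Flag&flagUnmapped != 0
}

// isUnmappedStrict also treats POS=0 or RNAME=* as unmapped.
func isUnmappedStrict(a alignment) bool {
	return isUnmapped(a) || a.Pos == 0 || a.RName == "*"
}

func main() {
	a := alignment{Flag: 0, RName: "*", Pos: 0}
	fmt.Println(isUnmapped(a), isUnmappedStrict(a))
}
```

The strict variant catches records whose FLAG claims a mapping but whose position or reference name says otherwise.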
func RenameChromosomes ¶
func RenameChromosomes(header *sam.Header) sam.AlignmentFilter
RenameChromosomes is a filter for prepending "chr" to the reference sequence names in a Header, and in RNAME and RNEXT in each Alignment.
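The renaming step can be sketched as a single string helper. The name `addChr` is hypothetical, and leaving the SAM placeholders "*" (unavailable) and "=" (same as RNAME) untouched is an assumption of this sketch.

```go
package main

import "fmt"

// addChr prepends "chr" to a reference sequence name. The placeholders
// "*" and "=" are returned unchanged (an assumption of this sketch).
func addChr(name string) string {
	if name == "*" || name == "=" {
		return name
	}
	return "chr" + name
}

func main() {
	for _, n := range []string{"1", "X", "=", "*"} {
		fmt.Println(addChr(n))
	}
}
```

The same helper would be applied to each name in the header's sequence dictionary and to RNAME and RNEXT of every alignment, so that the two stay consistent.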
func ReplaceReferenceSequenceDictionary ¶
func ReplaceReferenceSequenceDictionary(dict []utils.StringMap) sam.Filter
ReplaceReferenceSequenceDictionary returns a filter for replacing the reference sequence dictionary in a Header.
func ReplaceReferenceSequenceDictionaryFromSamFile ¶
func ReplaceReferenceSequenceDictionaryFromSamFile(samFile string) (f sam.Filter, err error)
ReplaceReferenceSequenceDictionaryFromSamFile returns a filter for replacing the reference sequence dictionary in a Header with one parsed from the given SAM/DICT file.
Types ¶
type BaseRecalibrator ¶
type BaseRecalibrator struct {
// contains filtered or unexported fields
}
BaseRecalibrator implements the first step of base recalibration.
func NewBaseRecalibrator ¶
func NewBaseRecalibrator(knownSites []string, referenceFasta string) (recal *BaseRecalibrator)
NewBaseRecalibrator returns a struct for the first step of base recalibration.
func (*BaseRecalibrator) Recalibrate ¶
func (recal *BaseRecalibrator) Recalibrate(reads *sam.Sam) (tables BaseRecalibratorTables)
Recalibrate implements the first step of base recalibration.
func (*BaseRecalibrator) RecalibrateWithMaxCycle ¶ added in v4.1.5
func (recal *BaseRecalibrator) RecalibrateWithMaxCycle(reads *sam.Sam, maxCycle int) (tables BaseRecalibratorTables)
RecalibrateWithMaxCycle implements the first step of base recalibration.
type BaseRecalibratorTables ¶
type BaseRecalibratorTables struct {
QualityScores, Cycles, Contexts bqsrTable
// contains filtered or unexported fields
}
BaseRecalibratorTables is the result of the base recalibration. All subsequent steps, including ApplyBQSR, are based on these tables.
func LoadAndCombineBQSRTables ¶
func LoadAndCombineBQSRTables(bqsrPath string) (BaseRecalibratorTables, error)
LoadAndCombineBQSRTables loads and merges multiple recalibration tables from file into a single, new recalibration table.
func NewBaseRecalibratorTables ¶
func NewBaseRecalibratorTables() BaseRecalibratorTables
NewBaseRecalibratorTables returns a struct for storing the result of the base recalibration.
func (BaseRecalibratorTables) ApplyBQSR ¶
func (recal BaseRecalibratorTables) ApplyBQSR(quantizeLevels int, sqqList []uint8) sam.Filter
ApplyBQSR applies the base recalibration result to the QUAL strings of the given reads.
func (BaseRecalibratorTables) ApplyBQSRWithMaxCycle ¶ added in v4.1.5
func (recal BaseRecalibratorTables) ApplyBQSRWithMaxCycle(quantizeLevels int, sqqList []uint8, maxCycle int) sam.Filter
ApplyBQSRWithMaxCycle applies the base recalibration result to the QUAL strings of the given reads.
func (BaseRecalibratorTables) Err ¶
func (recal BaseRecalibratorTables) Err() error
Err returns the error stored in these BaseRecalibratorTables.
func (BaseRecalibratorTables) FinalizeBQSRTables ¶
func (recal BaseRecalibratorTables) FinalizeBQSRTables()
FinalizeBQSRTables finalizes the first step of base recalibration.
func (*BaseRecalibratorTables) PrintBQSRTables ¶
func (recal *BaseRecalibratorTables) PrintBQSRTables(name string) error
PrintBQSRTables creates a recalibration report file.
func (*BaseRecalibratorTables) PrintBQSRTablesToIntermediateFile ¶
func (recal *BaseRecalibratorTables) PrintBQSRTablesToIntermediateFile(name string) error
PrintBQSRTablesToIntermediateFile prints the recalibration tables to a gob file.
type DuplicatesCountsHistograms ¶ added in v4.1.2
type DuplicatesCountsHistograms struct {
// contains filtered or unexported fields
}
DuplicatesCountsHistograms keeps track of metrics for the number of PCR vs. optical duplicates per list of duplicates.
type DuplicatesCtr ¶
type DuplicatesCtr struct {
	UnpairedReadsExamined         int
	ReadPairsExamined             int
	SecondaryOrSupplementaryReads int
	UnmappedReads                 int
	UnpairedReadDuplicates        int
	ReadPairDuplicates            int
	ReadPairOpticalDuplicates     int
	// contains filtered or unexported fields
}
DuplicatesCtr implements a struct that stores metrics about reads such as the number of (optical) duplicates, unmapped reads, etc.
type DuplicatesCtrMap ¶
type DuplicatesCtrMap struct {
	Map map[string]*DuplicatesCtr
	// contains filtered or unexported fields
}
DuplicatesCtrMap maps library names to duplicate counters.
func LoadAndCombineDuplicateMetrics ¶
func LoadAndCombineDuplicateMetrics(metricsPath string) DuplicatesCtrMap
LoadAndCombineDuplicateMetrics loads partial duplication metrics from file and combines them.
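Combining partial metrics can be pictured as a per-library merge. The sketch below assumes the combination is a field-wise sum, which the source does not state; the `counters` struct is a hypothetical subset of DuplicatesCtr.

```go
package main

import "fmt"

// counters is a hypothetical subset of the exported DuplicatesCtr fields.
type counters struct {
	ReadPairsExamined  int
	ReadPairDuplicates int
	UnmappedReads      int
}

// combine merges partial per-library counters by summing each field.
// Field-wise summation is an assumption of this sketch.
func combine(parts ...map[string]counters) map[string]counters {
	total := make(map[string]counters)
	for _, part := range parts {
		for lib, c := range part {
			t := total[lib]
			t.ReadPairsExamined += c.ReadPairsExamined
			t.ReadPairDuplicates += c.ReadPairDuplicates
			t.UnmappedReads += c.UnmappedReads
			total[lib] = t
		}
	}
	return total
}

func main() {
	a := map[string]counters{"libA": {100, 5, 1}}
	b := map[string]counters{"libA": {50, 2, 0}, "libB": {10, 0, 3}}
	fmt.Println(combine(a, b))
}
```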
func MarkOpticalDuplicates ¶
func MarkOpticalDuplicates(reads *sam.Sam, _, pairs *sync.Map, deterministic bool) DuplicatesCtrMap
MarkOpticalDuplicates calculates duplication metrics for a set of reads, using an optical pixel distance of 100.
func MarkOpticalDuplicatesWithPixelDistance ¶ added in v4.1.0
func MarkOpticalDuplicatesWithPixelDistance(reads *sam.Sam, pairs *sync.Map, deterministic bool, opticalPixelDistance int) DuplicatesCtrMap
MarkOpticalDuplicatesWithPixelDistance calculates duplication metrics for a set of reads, using the given optical pixel distance.
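The role of the pixel distance can be sketched as a proximity test on flow-cell coordinates. Everything here is an assumption borrowed from how optical duplicates are commonly defined, not from elPrep's code: the `location` type is hypothetical, coordinates are taken as already parsed from the read name, and two reads count as close when they share a tile and differ by at most the distance on both axes.

```go
package main

import "fmt"

// location is a hypothetical flow-cell position parsed from a read name.
type location struct {
	Tile, X, Y int
}

// abs returns the absolute value of n.
func abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// withinPixelDistance reports whether a and b lie on the same tile and
// differ by at most dist pixels on both the X and Y axes.
func withinPixelDistance(a, b location, dist int) bool {
	return a.Tile == b.Tile && abs(a.X-b.X) <= dist && abs(a.Y-b.Y) <= dist
}

func main() {
	a := location{1101, 1000, 2000}
	b := location{1101, 1050, 2090}
	fmt.Println(withinPixelDistance(a, b, 100))
}
```

A larger distance classifies more duplicates as optical rather than PCR.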
func (DuplicatesCtrMap) Err ¶
func (ctrMap DuplicatesCtrMap) Err() error
Err returns the error stored in this DuplicatesCtrMap.