filters

package
v4.1.6
Published: Aug 19, 2020
License: AGPL-3.0, AGPL-3.0-or-later
Imports: 25
Imported by: 0

Documentation

Overview

Package filters defines filters for .sam/.bam sequencing pipelines.

Constants

This section is empty.

Variables

var (
	X0 = utils.Intern("X0")
	X1 = utils.Intern("X1")
	XM = utils.Intern("XM")
	XO = utils.Intern("XO")
	XG = utils.Intern("XG")
)

Symbols for optional fields used for determining exact matches. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.
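
As background on why these tags are stored as interned symbols: interning maps equal strings to a single shared value, so later comparisons reduce to a pointer comparison. The following is a minimal sketch of that idea only. The `Symbol` type and `intern` helper are hypothetical stand-ins, and the sketch does not claim to reproduce how `utils.Intern` is implemented.

```go
package main

import "sync"

// Symbol is a hypothetical stand-in for an interned string: two symbols
// created from the same text share one pointer, so == compares them cheaply.
type Symbol *string

// table maps each string seen so far to its unique Symbol.
var table sync.Map

// intern returns the unique Symbol for s, creating it on first use.
func intern(s string) Symbol {
	if sym, ok := table.Load(s); ok {
		return sym.(Symbol)
	}
	p := new(string)
	*p = s
	// LoadOrStore keeps the first stored value if two goroutines race.
	sym, _ := table.LoadOrStore(s, Symbol(p))
	return sym.(Symbol)
}
```

With this sketch, `intern("X0") == intern("X0")` holds, while `intern("X0")` and `intern("X1")` are distinct pointers.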

Functions

func AddOrReplaceReadGroup

func AddOrReplaceReadGroup(readGroup utils.StringMap) sam.Filter

AddOrReplaceReadGroup returns a filter for adding or replacing the read group both in the Header and in each Alignment.

func AddPGLine

func AddPGLine(newPG utils.StringMap) sam.Filter

AddPGLine returns a filter for adding a @PG tag to a Header, and ensuring that it is the first one in the chain.
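
In the SAM format, @PG records are chained through the PP tag, which names the ID of the previous program record. The sketch below illustrates one way to make a new record the first one in such a chain: find an existing record whose ID is not referenced by any PP tag and point the new record at it. The `addPG` helper and the plain `map[string]string` records are assumptions for illustration; handling of duplicate IDs is left out, and this is not presented as the package's actual logic.

```go
package main

// addPG appends newPG to pgs. It looks for an existing record whose ID is
// not referenced by any PP tag (the current head of a chain) and links
// newPG to it, so that newPG becomes the new head of that chain.
func addPG(pgs []map[string]string, newPG map[string]string) []map[string]string {
	// Collect every ID that some record names as its predecessor.
	referenced := make(map[string]bool)
	for _, pg := range pgs {
		if pp, ok := pg["PP"]; ok {
			referenced[pp] = true
		}
	}
	// The first unreferenced record is treated as the current chain head.
	for _, pg := range pgs {
		if id := pg["ID"]; !referenced[id] {
			newPG["PP"] = id
			break
		}
	}
	return append(pgs, newPG)
}
```

For a header with records `bwa` and `samtools` (PP: `bwa`), adding `elprep` would give the new record PP: `samtools`.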

func AddREFID

func AddREFID(header *sam.Header) sam.AlignmentFilter

AddREFID is a filter for adding the refid (index in the reference sequence dictionary) to alignments as temporary values.
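
The refid is the position of an alignment's RNAME in the header's sequence dictionary. A minimal sketch of that lookup follows; `refIDTable` and `refID` are hypothetical helpers, and returning -1 for unknown names is an assumption of this sketch.

```go
package main

// refIDTable maps each reference sequence name (the SN tag of an @SQ line)
// to its index in the sequence dictionary.
func refIDTable(names []string) map[string]int32 {
	table := make(map[string]int32, len(names))
	for i, name := range names {
		table[name] = int32(i)
	}
	return table
}

// refID returns the index for rname, or -1 if rname is not in the
// dictionary (for example "*" on an unmapped read). The -1 convention is
// an assumption of this sketch.
func refID(table map[string]int32, rname string) int32 {
	if id, ok := table[rname]; ok {
		return id
	}
	return -1
}
```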

func CleanSam

func CleanSam(header *sam.Header) sam.AlignmentFilter

CleanSam is a filter for soft-clipping an alignment at the end of a reference sequence, and setting MAPQ to 0 if unmapped.

func KeepOptionalFields

func KeepOptionalFields(tags []string) sam.Filter

KeepOptionalFields returns a filter for removing all but a list of given optional fields in an alignment.

func MarkDuplicates

func MarkDuplicates(alsoOpticals bool) (sam.Filter, *sync.Map, *sync.Map)

MarkDuplicates returns a filter for marking duplicate alignments. It depends on the AddREFID filter having been applied beforehand to fill in the refid.

Duplicate marking is based on an adapted Phred score. In case of ties, the QNAME is used as a tie-breaker.
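
To make the scoring rule concrete, here is a sketch of choosing which of two candidate reads to keep. It assumes the adapted score is the sum of base qualities of at least 15, as Picard's MarkDuplicates does, and that the lexicographically smaller QNAME wins a tie. Both are assumptions of this sketch, and the `read`, `score`, and `keepFirst` names are hypothetical.

```go
package main

// read is a hypothetical stand-in holding only what the comparison needs.
type read struct {
	QName string
	Quals []byte // Phred scores, already decoded from the QUAL string
}

// score sums the base qualities that are at least minQual.
func score(quals []byte) int {
	const minQual = 15 // assumed cutoff, as used by Picard MarkDuplicates
	total := 0
	for _, q := range quals {
		if q >= minQual {
			total += int(q)
		}
	}
	return total
}

// keepFirst reports whether a should be kept in preference to b.
// The higher score wins; on equal scores the QNAME decides.
func keepFirst(a, b read) bool {
	sa, sb := score(a.Quals), score(b.Quals)
	if sa != sb {
		return sa > sb
	}
	return a.QName < b.QName // assumed tie-break direction
}
```

A deterministic tie-breaker such as the QNAME keeps the outcome independent of the order in which reads are processed.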

func PrintDuplicatesMetrics

func PrintDuplicatesMetrics(input, output, metrics string, removeDuplicates bool, ctrs DuplicatesCtrMap) (err error)

PrintDuplicatesMetrics writes the duplication metrics for a set of reads to a file.

func PrintDuplicatesMetricsToIntermediateFile

func PrintDuplicatesMetricsToIntermediateFile(name string, ctrs DuplicatesCtrMap) (err error)

PrintDuplicatesMetricsToIntermediateFile writes the duplicate metrics to a gob file.
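
Gob is the binary encoding provided by Go's standard `encoding/gob` package. As a rough illustration of what storing counters in that format involves, the sketch below encodes a map of counters to bytes and decodes it again. The `counters` type and both helpers are hypothetical, and writing the bytes to a file is left out.

```go
package main

import (
	"bytes"
	"encoding/gob"
)

// counters is a hypothetical, reduced stand-in for a per-library counter.
// Gob only transmits exported fields.
type counters struct {
	ReadPairsExamined  int
	ReadPairDuplicates int
}

// encodeCounters serializes the map with gob.
func encodeCounters(m map[string]*counters) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(m); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeCounters restores a map produced by encodeCounters.
func decodeCounters(data []byte) (map[string]*counters, error) {
	var m map[string]*counters
	err := gob.NewDecoder(bytes.NewReader(data)).Decode(&m)
	return m, err
}
```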

func RemoveDuplicateReads

func RemoveDuplicateReads(_ *sam.Header) sam.AlignmentFilter

RemoveDuplicateReads is a filter for removing duplicate sam-alignment instances, based on FLAG.
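
In the SAM specification, bit 0x400 of FLAG marks a read as a PCR or optical duplicate. A FLAG-based check can therefore be sketched as a single bit test; `keepNonDuplicate` is a hypothetical helper, not part of this package.

```go
package main

// flagDuplicate is the SAM FLAG bit for "PCR or optical duplicate".
const flagDuplicate = 0x400

// keepNonDuplicate reports whether an alignment with the given FLAG should
// be kept, that is, whether its duplicate bit is clear.
func keepNonDuplicate(flag uint16) bool {
	return flag&flagDuplicate == 0
}
```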

func RemoveMappingQualityLessThan

func RemoveMappingQualityLessThan(mq int) sam.Filter

RemoveMappingQualityLessThan returns a filter for removing reads whose mapping quality (MAPQ) is lower than the given value.
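
The signatures in this section suggest a two-stage design: a filter first receives the header and then returns a per-alignment predicate. The sketch below mirrors that shape with stand-in types to show how a parameterized filter such as this one can be built as a closure. All types and names here are hypothetical, and it is an assumption of this sketch that the predicate returns true for alignments to keep.

```go
package main

// Stand-in types, loosely modelled on the signatures in this section.
type header struct{}
type alignment struct{ MAPQ byte }
type alignmentFilter func(*alignment) bool
type filter func(*header) alignmentFilter

// removeMapQLessThan returns a filter that keeps alignments whose MAPQ
// is at least mq.
func removeMapQLessThan(mq int) filter {
	return func(_ *header) alignmentFilter {
		return func(aln *alignment) bool {
			return int(aln.MAPQ) >= mq
		}
	}
}

// apply runs one filter over a slice and returns the alignments it keeps.
func apply(f filter, hdr *header, alns []*alignment) []*alignment {
	keep := f(hdr)
	var kept []*alignment
	for _, aln := range alns {
		if keep(aln) {
			kept = append(kept, aln)
		}
	}
	return kept
}
```

Capturing `mq` in the closure lets the header stage and the alignment stage share configuration without global state.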

func RemoveNonExactMappingReads

func RemoveNonExactMappingReads(_ *sam.Header) sam.AlignmentFilter

RemoveNonExactMappingReads is a filter that removes all reads that are not exact matches with the reference (soft-clipping ok), based on CIGAR string (only M and S allowed).
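
The CIGAR rule above can be illustrated by scanning the string and rejecting any operation other than M or S. This sketch works on the textual CIGAR; `onlyMatchAndSoftClip` is a hypothetical helper, and treating a missing CIGAR ("*") as a non-match is an assumption.

```go
package main

// onlyMatchAndSoftClip reports whether every operation in a CIGAR string
// is M (alignment match) or S (soft clip). An empty or "*" CIGAR returns
// false, which is an assumption of this sketch.
func onlyMatchAndSoftClip(cigar string) bool {
	if cigar == "" || cigar == "*" {
		return false
	}
	digits := 0 // number of length digits seen for the current operation
	for i := 0; i < len(cigar); i++ {
		c := cigar[i]
		switch {
		case c >= '0' && c <= '9':
			digits++
		case (c == 'M' || c == 'S') && digits > 0:
			digits = 0
		default:
			return false
		}
	}
	// A trailing length without an operation is malformed.
	return digits == 0
}
```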

func RemoveNonExactMappingReadsStrict

func RemoveNonExactMappingReadsStrict(header *sam.Header) sam.AlignmentFilter

RemoveNonExactMappingReadsStrict is a filter that removes all reads that are not exact matches with the reference, based on the optional fields X0=1 (unique mapping), X1=0 (no suboptimal hit), XM=0 (no mismatch), XO=0 (no gap opening), XG=0 (no gap extension).
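
The five conditions listed above amount to comparing each optional field against a fixed expected value. A minimal sketch, assuming the fields are available as a map from tag to integer and that a missing tag counts as a failed check (both assumptions of this sketch; `exactMatch` is a hypothetical helper):

```go
package main

// exactMatch reports whether the optional fields describe a unique, exact
// hit: X0=1, X1=0, XM=0, XO=0, XG=0.
func exactMatch(tags map[string]int) bool {
	want := map[string]int{"X0": 1, "X1": 0, "XM": 0, "XO": 0, "XG": 0}
	for tag, expected := range want {
		// A missing tag is treated as a failed check (assumption).
		if value, ok := tags[tag]; !ok || value != expected {
			return false
		}
	}
	return true
}
```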

func RemoveNonOverlappingReads

func RemoveNonOverlappingReads(bed *bed.Bed) sam.Filter

RemoveNonOverlappingReads returns a filter for removing all reads that do not overlap with a set of regions specified by a bed file.
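
Overlap tests between reads and BED regions need care with coordinates: BED intervals are 0-based and half-open, while SAM POS is 1-based. The sketch below shows the interval test under those conventions. The `region` type and `overlaps` helper are hypothetical, and `refLen` stands for the number of reference bases the alignment covers.

```go
package main

// region is a hypothetical stand-in for one BED interval.
type region struct {
	Chrom      string
	Start, End int // BED coordinates: 0-based, half-open [Start, End)
}

// overlaps reports whether an alignment that starts at 1-based position
// pos on rname and covers refLen reference bases overlaps region r.
func overlaps(rname string, pos, refLen int, r region) bool {
	if rname != r.Chrom {
		return false
	}
	start := pos - 1      // convert SAM POS to 0-based
	end := start + refLen // exclusive end
	return start < r.End && r.Start < end
}
```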

func RemoveOptionalFields

func RemoveOptionalFields(tags []string) sam.Filter

RemoveOptionalFields returns a filter for removing optional fields in an alignment.
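
RemoveOptionalFields and KeepOptionalFields are complementary selections over an alignment's optional fields. The sketch below shows both on a simplified representation; the `field` type and the helper names are hypothetical.

```go
package main

// field is a hypothetical stand-in for one optional field of an alignment.
type field struct {
	Tag   string
	Value string
}

// selectTags returns the fields whose membership in tags equals keep.
func selectTags(fields []field, tags []string, keep bool) []field {
	listed := make(map[string]bool, len(tags))
	for _, tag := range tags {
		listed[tag] = true
	}
	var out []field
	for _, f := range fields {
		if listed[f.Tag] == keep {
			out = append(out, f)
		}
	}
	return out
}

// removeTags drops every field whose tag is listed.
func removeTags(fields []field, tags []string) []field {
	return selectTags(fields, tags, false)
}

// keepTags drops every field whose tag is not listed.
func keepTags(fields []field, tags []string) []field {
	return selectTags(fields, tags, true)
}
```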

func RemoveOptionalReads

func RemoveOptionalReads(header *sam.Header) sam.AlignmentFilter

RemoveOptionalReads is a filter for removing alignments that represent optional information in elPrep.

func RemoveUnmappedReads

func RemoveUnmappedReads(_ *sam.Header) sam.AlignmentFilter

RemoveUnmappedReads is a filter for removing unmapped sam-alignment instances, based on FLAG.

func RemoveUnmappedReadsStrict

func RemoveUnmappedReadsStrict(_ *sam.Header) sam.AlignmentFilter

RemoveUnmappedReadsStrict is a filter for removing unmapped sam-alignment instances, based on FLAG, or POS=0, or RNAME=*.
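
The difference between the plain and the strict variant can be sketched as two predicates: one tests only the unmapped bit (0x4) of FLAG, the other additionally rejects alignments with POS=0 or RNAME=*. The `alignment` type and function names are hypothetical stand-ins.

```go
package main

// flagUnmapped is the SAM FLAG bit for "segment unmapped".
const flagUnmapped = 0x4

// alignment is a hypothetical stand-in with only the fields tested here.
type alignment struct {
	Flag  uint16
	RName string
	Pos   int32
}

// keepMapped keeps alignments whose unmapped bit is clear.
func keepMapped(aln alignment) bool {
	return aln.Flag&flagUnmapped == 0
}

// keepMappedStrict also rejects alignments without a position or
// reference name, even if the unmapped bit is clear.
func keepMappedStrict(aln alignment) bool {
	return aln.Flag&flagUnmapped == 0 && aln.Pos != 0 && aln.RName != "*"
}
```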

func RenameChromosomes

func RenameChromosomes(header *sam.Header) sam.AlignmentFilter

RenameChromosomes is a filter for prepending "chr" to the reference sequence names in a Header, and in RNAME and RNEXT in each Alignment.
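
A renaming step has to account for the SAM placeholders: "*" means no reference is available, and "=" in RNEXT means the same reference as RNAME. The sketch below prepends "chr" while leaving those placeholders alone. `addChr` is a hypothetical helper, and skipping the placeholders is an assumption of this sketch rather than documented behavior of the filter.

```go
package main

// addChr prepends "chr" to a reference sequence name, leaving the SAM
// placeholders "*" (unavailable) and "=" (same as RNAME) untouched.
func addChr(name string) string {
	if name == "*" || name == "=" {
		return name
	}
	return "chr" + name
}
```

Note that the sketch prepends unconditionally, so a name that already starts with "chr" would receive the prefix a second time.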

func ReplaceReferenceSequenceDictionary

func ReplaceReferenceSequenceDictionary(dict []utils.StringMap) sam.Filter

ReplaceReferenceSequenceDictionary returns a filter for replacing the reference sequence dictionary in a Header.

func ReplaceReferenceSequenceDictionaryFromSamFile

func ReplaceReferenceSequenceDictionaryFromSamFile(samFile string) (f sam.Filter, err error)

ReplaceReferenceSequenceDictionaryFromSamFile returns a filter for replacing the reference sequence dictionary in a Header with one parsed from the given SAM/DICT file.

Types

type BaseRecalibrator

type BaseRecalibrator struct {
	// contains filtered or unexported fields
}

BaseRecalibrator implements the first step of base recalibration.

func NewBaseRecalibrator

func NewBaseRecalibrator(knownSites []string, referenceFasta string) (recal *BaseRecalibrator)

NewBaseRecalibrator returns a struct for the first step of base recalibration.

func (*BaseRecalibrator) Recalibrate

func (recal *BaseRecalibrator) Recalibrate(reads *sam.Sam) (tables BaseRecalibratorTables)

Recalibrate implements the first step of base recalibration.

func (*BaseRecalibrator) RecalibrateWithMaxCycle added in v4.1.5

func (recal *BaseRecalibrator) RecalibrateWithMaxCycle(reads *sam.Sam, maxCycle int) (tables BaseRecalibratorTables)

RecalibrateWithMaxCycle implements the first step of base recalibration, using the given maximum cycle value.

type BaseRecalibratorTables

type BaseRecalibratorTables struct {
	QualityScores, Cycles, Contexts bqsrTable
	// contains filtered or unexported fields
}

BaseRecalibratorTables is the result of the base recalibration. All subsequent steps, including ApplyBQSR, are based on these tables.

func LoadAndCombineBQSRTables

func LoadAndCombineBQSRTables(bqsrPath string) (BaseRecalibratorTables, error)

LoadAndCombineBQSRTables loads and merges multiple recalibration tables from file into a single, new recalibration table.

func NewBaseRecalibratorTables

func NewBaseRecalibratorTables() BaseRecalibratorTables

NewBaseRecalibratorTables returns a struct for storing the result of the base recalibration.

func (BaseRecalibratorTables) ApplyBQSR

func (recal BaseRecalibratorTables) ApplyBQSR(quantizeLevels int, sqqList []uint8) sam.Filter

ApplyBQSR applies the base recalibration result to the QUAL strings of the given reads.

func (BaseRecalibratorTables) ApplyBQSRWithMaxCycle added in v4.1.5

func (recal BaseRecalibratorTables) ApplyBQSRWithMaxCycle(quantizeLevels int, sqqList []uint8, maxCycle int) sam.Filter

ApplyBQSRWithMaxCycle applies the base recalibration result to the QUAL strings of the given reads, using the given maximum cycle value.

func (BaseRecalibratorTables) Err

func (recal BaseRecalibratorTables) Err() error

Err returns the error stored in these BaseRecalibratorTables.

func (BaseRecalibratorTables) FinalizeBQSRTables

func (recal BaseRecalibratorTables) FinalizeBQSRTables()

FinalizeBQSRTables finalizes the first step of base recalibration.

func (*BaseRecalibratorTables) PrintBQSRTables

func (recal *BaseRecalibratorTables) PrintBQSRTables(name string) error

PrintBQSRTables creates a recalibration report file.

func (*BaseRecalibratorTables) PrintBQSRTablesToIntermediateFile

func (recal *BaseRecalibratorTables) PrintBQSRTablesToIntermediateFile(name string) error

PrintBQSRTablesToIntermediateFile prints the recalibration tables to a gob file.

type DuplicatesCountsHistograms added in v4.1.2

type DuplicatesCountsHistograms struct {
	// contains filtered or unexported fields
}

DuplicatesCountsHistograms keeps track of metrics for the number of PCR vs. optical duplicates per list of duplicates.

type DuplicatesCtr

type DuplicatesCtr struct {
	UnpairedReadsExamined         int
	ReadPairsExamined             int
	SecondaryOrSupplementaryReads int
	UnmappedReads                 int
	UnpairedReadDuplicates        int
	ReadPairDuplicates            int
	ReadPairOpticalDuplicates     int
	// contains filtered or unexported fields
}

DuplicatesCtr implements a struct that stores metrics about reads such as the number of (optical) duplicates, unmapped reads, etc.

type DuplicatesCtrMap

type DuplicatesCtrMap struct {
	Map map[string]*DuplicatesCtr
	// contains filtered or unexported fields
}

DuplicatesCtrMap maps library names to duplicate counters.

func LoadAndCombineDuplicateMetrics

func LoadAndCombineDuplicateMetrics(metricsPath string) DuplicatesCtrMap

LoadAndCombineDuplicateMetrics loads partial duplication metrics from file and combines them.
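
Combining partial metrics can be pictured as adding up the counters per library. The sketch below does that for a reduced counter type. It assumes that combination is plain field-wise addition, which is an assumption of this sketch, and the `counters` type and `combine` helper are hypothetical.

```go
package main

// counters is a hypothetical, reduced stand-in for a per-library counter.
type counters struct {
	ReadPairsExamined         int
	ReadPairDuplicates        int
	ReadPairOpticalDuplicates int
}

// combine adds the counters of all partial maps, keyed by library name.
func combine(parts ...map[string]*counters) map[string]*counters {
	total := make(map[string]*counters)
	for _, part := range parts {
		for lib, c := range part {
			t, ok := total[lib]
			if !ok {
				t = &counters{}
				total[lib] = t
			}
			t.ReadPairsExamined += c.ReadPairsExamined
			t.ReadPairDuplicates += c.ReadPairDuplicates
			t.ReadPairOpticalDuplicates += c.ReadPairOpticalDuplicates
		}
	}
	return total
}
```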

func MarkOpticalDuplicates

func MarkOpticalDuplicates(reads *sam.Sam, _, pairs *sync.Map, deterministic bool) DuplicatesCtrMap

MarkOpticalDuplicates calculates duplication metrics for a set of reads, using an optical pixel distance of 100.

func MarkOpticalDuplicatesWithPixelDistance added in v4.1.0

func MarkOpticalDuplicatesWithPixelDistance(reads *sam.Sam, pairs *sync.Map, deterministic bool, opticalPixelDistance int) DuplicatesCtrMap

MarkOpticalDuplicatesWithPixelDistance calculates duplication metrics for a set of reads, using the given optical pixel distance.
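
The pixel distance bounds how far apart two duplicate reads may lie on the flow cell and still count as optical duplicates. The sketch below assumes the rule used by Picard: two reads on the same tile are optical duplicates if their x and y coordinates each differ by at most the pixel distance. The `location` type and `opticalDuplicates` helper are hypothetical, and parsing the coordinates from the read name is left out.

```go
package main

// location is a hypothetical stand-in for the physical position of a read
// on the flow cell, as usually encoded in the read name.
type location struct {
	Tile, X, Y int
}

// abs returns the absolute value of n.
func abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// opticalDuplicates reports whether two duplicate reads are close enough
// on the same tile to be counted as optical duplicates (assumed rule).
func opticalDuplicates(a, b location, pixelDistance int) bool {
	return a.Tile == b.Tile &&
		abs(a.X-b.X) <= pixelDistance &&
		abs(a.Y-b.Y) <= pixelDistance
}
```

With the distance of 100 used by MarkOpticalDuplicates, reads at (1000, 2000) and (1050, 2100) on the same tile would qualify under this rule, while a y offset of 101 would not.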

func (DuplicatesCtrMap) Err

func (ctrMap DuplicatesCtrMap) Err() error

Err returns the error stored in this DuplicatesCtrMap.
