Documentation ¶
Constants ¶
const DefaultParallelism = 2
DefaultParallelism is the default value for SortOptions.Parallelism.
const DefaultSortBatchSize = 1 << 20
DefaultSortBatchSize is the default number of records to keep in memory before resorting to external sorting.
Variables ¶
This section is empty.
Functions ¶
func BAMFromSortShards ¶
BAMFromSortShards merges a set of sortshard files into a single BAM file.
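Merging several individually sorted shards into one sorted output is commonly done with a k-way merge over a min-heap. The sketch below illustrates that general technique on plain integer slices. It is not the package's implementation: the names item, minHeap, and mergeShards are hypothetical, and real sortshard files hold sam.Records on disk rather than ints in memory.

```go
package main

import (
	"container/heap"
	"fmt"
)

// item is one pending value plus the index of the shard it came from.
type item struct {
	val   int
	shard int // breaks ties between equal values from different shards
}

// minHeap orders items by value, then by shard index.
type minHeap []item

func (h minHeap) Len() int { return len(h) }
func (h minHeap) Less(i, j int) bool {
	if h[i].val != h[j].val {
		return h[i].val < h[j].val
	}
	return h[i].shard < h[j].shard
}
func (h minHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)   { *h = append(*h, x.(item)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	it := old[n-1]
	*h = old[:n-1]
	return it
}

// mergeShards merges already-sorted shards into one sorted slice.
func mergeShards(shards [][]int) []int {
	next := make([]int, len(shards)) // next unread offset in each shard
	h := &minHeap{}
	for i, s := range shards {
		if len(s) > 0 {
			heap.Push(h, item{val: s[0], shard: i})
			next[i] = 1
		}
	}
	var out []int
	for h.Len() > 0 {
		// Emit the smallest pending value, then refill from the same shard.
		it := heap.Pop(h).(item)
		out = append(out, it.val)
		if n := next[it.shard]; n < len(shards[it.shard]) {
			heap.Push(h, item{val: shards[it.shard][n], shard: it.shard})
			next[it.shard] = n + 1
		}
	}
	return out
}

func main() {
	fmt.Println(mergeShards([][]int{{1, 4, 9}, {2, 3, 10}, {5}})) // prints [1 2 3 4 5 9 10]
}
```

Breaking ties by shard index in this sketch mirrors the role the documentation gives to ShardIndex: it fixes the relative order of equal keys that come from different sorters.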
Types ¶
type SortOptions ¶
type SortOptions struct {
	// ShardIndex must be a number unique to this sorter, across all sorters for
	// shards that are eventually merged into one BAM or PAM file.
	//
	// ShardIndex defines the sort order of reads at the same (ref,pos), but on
	// different Sorters. If ShardIndex==0, it is set to sha(sortshardpath).
	ShardIndex uint32

	// SortBatchSize is the number of sam.Records to keep in memory before
	// resorting to external sorting. Not for general use; the default value
	// should suffice for most applications.
	SortBatchSize int

	// Parallelism limits the number of background sorts. Max memory
	// consumption of the sorter grows linearly with this value. If <= 0,
	// DefaultParallelism is used.
	Parallelism int

	// NoCompressTmpFiles, if false (default), compresses sortshards using snappy.
	// Compression is a big win on EC2 EBS. It slows the sort down slightly
	// on fast NVMe disks.
	NoCompressTmpFiles bool

	// TmpDir defines the directory to store temp files created during merge. ""
	// means the system default, usually /tmp.
	TmpDir string
}
SortOptions controls options passed to the toplevel Sort.
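The field comments describe fallback values for unset options. A minimal sketch of how such defaults could be applied is shown below. The options type and the withDefaults function are hypothetical stand-ins, not part of the package. The rule that a non-positive SortBatchSize falls back to the default is an assumption; the documentation states the "<= 0" rule only for Parallelism.

```go
package main

import "fmt"

// Values copied from the Constants section of the documentation.
const (
	defaultParallelism   = 2
	defaultSortBatchSize = 1 << 20
)

// options is a hypothetical, trimmed-down stand-in for SortOptions.
type options struct {
	SortBatchSize int
	Parallelism   int
}

// withDefaults returns a copy of o in which non-positive fields are
// replaced by the documented default values.
func withDefaults(o options) options {
	if o.SortBatchSize <= 0 { // assumed rule; not stated in the documentation
		o.SortBatchSize = defaultSortBatchSize
	}
	if o.Parallelism <= 0 { // documented: "If <= 0, DefaultParallelism is used"
		o.Parallelism = defaultParallelism
	}
	return o
}

func main() {
	// Only Parallelism is set; SortBatchSize falls back to its default.
	fmt.Printf("%+v\n", withDefaults(options{Parallelism: 8}))
	// prints {SortBatchSize:1048576 Parallelism:8}
}
```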
type Sorter ¶
type Sorter struct {
// contains filtered or unexported fields
}
Sorter sorts a list of sam.Records and produces a sortshard file in "outPath". BAMFromSortShards can later be used to merge multiple sortshard files into a BAM file. "header" must contain all the references used by records to be added later.
Sorter orders records in the following way:
- Increasing reference sequence IDs, then
- increasing alignment positions, then
- a forward read before a reverse read.
- All else being equal, records keep their order of appearance in the input (i.e., the sort is stable).
These criteria are the same as "samtools sort" and "sambamba sort".
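The ordering criteria above can be sketched as a comparison function combined with a stable sort. The read type, less, and sortReads below are hypothetical names used only for illustration; the real Sorter operates on sam.Records, and this sketch does not model special cases such as unmapped reads.

```go
package main

import (
	"fmt"
	"sort"
)

// read is a hypothetical stand-in for sam.Record with only the sort keys.
type read struct {
	Name    string
	RefID   int  // reference sequence ID
	Pos     int  // alignment position
	Reverse bool // true for a reverse-strand read
}

// less reports whether a must sort strictly before b.
func less(a, b read) bool {
	if a.RefID != b.RefID {
		return a.RefID < b.RefID
	}
	if a.Pos != b.Pos {
		return a.Pos < b.Pos
	}
	// Forward reads sort before reverse reads.
	return !a.Reverse && b.Reverse
}

// sortReads sorts in place. SliceStable keeps input order for equal keys.
func sortReads(reads []read) {
	sort.SliceStable(reads, func(i, j int) bool {
		return less(reads[i], reads[j])
	})
}

func main() {
	reads := []read{
		{Name: "a", RefID: 1, Pos: 100, Reverse: false},
		{Name: "b", RefID: 0, Pos: 200, Reverse: true},
		{Name: "c", RefID: 0, Pos: 200, Reverse: false},
		{Name: "d", RefID: 0, Pos: 200, Reverse: false},
		{Name: "e", RefID: 0, Pos: 50, Reverse: true},
	}
	sortReads(reads)
	for _, r := range reads {
		fmt.Print(r.Name, " ")
	}
	fmt.Println() // prints: e c d b a
}
```

Reads "c" and "d" share every key, so the stable sort leaves them in input order, which is the fourth criterion.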
Example:
sorter := NewSorter("tmp0.sort", header)
for _, rec := range recordlist {
	sorter.AddRecord(rec)
}
err := sorter.Close()
// Similarly, produce tmp1.sort, .., tmpN.sort, possibly on
// different processes or machines.

// Merge all the sorted shards into one BAM file.
err = BAMFromSortShards([]string{"tmp0.sort", ..., "tmpN.sort"}, "foo.bam")
func NewSorter ¶
func NewSorter(outPath string, header *sam.Header, optList ...SortOptions) *Sorter
NewSorter creates a Sorter object.