hts

v1.0.2-0...-030e6b9
Published: Mar 3, 2020 License: BSD-3-Clause

README

bíogo

HTS

Installation

    $ go get github.com/biogo/hts/...

Overview

SAM and BAM handling for the Go language.

bíogo/hts provides a native Go implementation of the SAM specification for the SAM and BAM alignment formats commonly used to represent high throughput genomic data, the BAI, CSI and tabix indexing formats, and the BGZF blocked compression format. The bíogo/hts packages perform parallelized read and write operations and can cache recent reads according to user-specified caching methods. The bíogo/hts APIs provide a consistent interface to sequence alignment data and the underlying compression system, to make the packages easier to use and to simplify tool development.
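
To make the BGZF part of this concrete, the sketch below uses only the Go standard library. It relies on the SAM specification's definition of the BGZF end-of-file marker: a fixed 28-byte gzip member with an empty payload, whose absence suggests a truncated file. The helper names `endsWithEOFMarker` and `decompressedLen` are hypothetical and are not part of bíogo/hts; this is an illustration of the format, not the library's implementation.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// bgzfEOF is the 28-byte empty BGZF block that the SAM specification
// defines as the end-of-file marker.
var bgzfEOF = []byte{
	0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00,
	0x00, 0xff, 0x06, 0x00, 0x42, 0x43, 0x02, 0x00,
	0x1b, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00,
}

// endsWithEOFMarker is a hypothetical helper (not a bíogo/hts function)
// that reports whether data ends with the BGZF EOF marker.
func endsWithEOFMarker(data []byte) bool {
	return bytes.HasSuffix(data, bgzfEOF)
}

// decompressedLen decompresses a gzip stream with the standard library
// and returns the size of its payload. A BGZF block is a valid gzip
// member, so the EOF marker decompresses to zero bytes.
func decompressedLen(data []byte) (int, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return 0, err
	}
	defer zr.Close()
	p, err := io.ReadAll(zr)
	return len(p), err
}

func main() {
	n, err := decompressedLen(bgzfEOF)
	fmt.Println(len(bgzfEOF), n, err)
	fmt.Println(endsWithEOFMarker(bgzfEOF), endsWithEOFMarker([]byte("truncated")))
}
```

For real files, prefer the library's own bgzf package over a hand-rolled check like this one.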

Example usage

The following code implements the equivalent of samtools view -c -f n -F N file.bam.

package main

import (
	"flag"
	"fmt"
	"io"
	"log"
	"os"

	"github.com/biogo/hts/bam"
	"github.com/biogo/hts/bgzf"
	"github.com/biogo/hts/sam"
)

var (
	require = flag.Int("f", 0, "required flags")
	exclude = flag.Int("F", 0, "excluded flags")
	file    = flag.String("file", "", "input file (empty for stdin)")
	conc    = flag.Int("threads", 0, "number of threads to use (0 = auto)")
	help    = flag.Bool("help", false, "display help")
)

const maxFlag = int(^sam.Flags(0))

func main() {
	flag.Parse()
	if *help {
		flag.Usage()
		os.Exit(0)
	}

	if *require < 0 || *require > maxFlag {
		flag.Usage()
		log.Fatal("required flags (f) out of range")
	}
	reqFlag := sam.Flags(*require)

	if *exclude < 0 || *exclude > maxFlag {
		flag.Usage()
		log.Fatal("excluded flags (F) out of range")
	}
	excFlag := sam.Flags(*exclude)

	var r io.Reader
	if *file == "" {
		r = os.Stdin
	} else {
		f, err := os.Open(*file)
		if err != nil {
			log.Fatalf("could not open file %q: %v", *file, err)
		}
		defer f.Close()
		ok, err := bgzf.HasEOF(f)
		if err != nil {
			log.Fatalf("could not check file %q for bgzf EOF block: %v", *file, err)
		}
		if !ok {
			log.Printf("file %q has no bgzf magic block: may be truncated", *file)
		}
		r = f
	}

	b, err := bam.NewReader(r, *conc)
	if err != nil {
		log.Fatalf("could not read bam: %v", err)
	}
	defer b.Close()

	// We only need flags, so skip variable length data.
	b.Omit(bam.AllVariableLengthData)

	var n int
	for {
		rec, err := b.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatalf("error reading bam: %v", err)
		}
		if rec.Flags&reqFlag == reqFlag && rec.Flags&excFlag == 0 {
			n++
		}
	}

	fmt.Println(n)
}

Getting help

Help or similar requests are preferred on the biogo-user Google Group.

https://groups.google.com/forum/#!forum/biogo-user

Contributing

If you find any bugs, feel free to file an issue on the GitHub issue tracker. Pull requests are welcome, though if they involve changes to the API or the addition of features, please first open a discussion at the biogo-dev Google Group.

https://groups.google.com/forum/#!forum/biogo-dev

Citing

If you use bíogo/hts, please cite Kortschak, Pedersen and Adelson "bíogo/hts: high throughput sequence handling for the Go language", doi:10.21105/joss.00168.

Library Structure and Coding Style

The coding style should be aligned with normal Go idioms as represented in the Go core libraries.

Copyright ©2011-2013 The bíogo Authors except where otherwise noted. All rights reserved. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file.

The bíogo logo is derived from Bitstream Charter, Copyright ©1989-1992 Bitstream Inc., Cambridge, MA.

BITSTREAM CHARTER is a registered trademark of Bitstream Inc.

Directories

Path	Synopsis
bam	Package bam implements BAM file format reading, writing and indexing.
bgzf	Package bgzf implements BGZF format reading and writing according to the SAM specification.
cache	Package cache provides basic block cache types for the bgzf package.
index	Package index provides common code for CSI and tabix BGZF indexing.
csi	Package csi implements CSIv1 and CSIv2 coordinate sorted indexing.
internal	Package internal provides shared code for BAI and tabix index implementations.
paper
examples/flagstat	This program tabulates statistics on a bam file from the sam flag.
	Code generated by generate_randomized_freepool.py.
tabix	Package tabix implements tabix coordinate sorted indexing.
