slow5

package
v0.0.0-...-f005bc5 Latest
Published: Dec 14, 2024 License: MIT Imports: 7 Imported by: 0

Documentation

Overview

Package slow5 contains slow5 parsers and writers.

Right now, only parsing slow5 files is supported. Support for writing slow5 and for the blow5 format is coming soon.

slow5 is a file format alternative to fast5, the file format output by Oxford Nanopore sequencing devices. fast5 is built on hdf5, a complex file format that can only be read and written with a single software library, first built in 1998. slow5, by contrast, uses a .tsv file format, which is easy to both parse and write.

slow5 files contain both general metadata about the sequencing run and raw signal reads from the sequencing run. This raw signal can be used directly or basecalled and used for alignment.

More information on slow5 can be found here: https://github.com/hasindu2008/slow5tools
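
To make the "easy to parse" point concrete, here is a minimal sketch of splitting one tab-separated slow5 data line by hand. It is not this package's parser. It assumes the primary columns appear in the same order as the fields of the Read type documented below and that the raw signal is a comma-separated list; the record type, the parseLine helper, and the sample line (apart from the five signal values taken from the package example) are made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// record is a hypothetical stand-in that holds only the primary columns.
type record struct {
	ReadID       string
	ReadGroupID  uint32
	Digitisation float64
	Offset       float64
	Range        float64
	SamplingRate float64
	LenRawSignal uint64
	RawSignal    []int16
}

// parseLine splits one tab-separated data line into a record.
// The column order is an assumption (see the lead-in above).
func parseLine(line string) (record, error) {
	fields := strings.Split(line, "\t")
	if len(fields) < 8 {
		return record{}, fmt.Errorf("expected at least 8 columns, got %d", len(fields))
	}
	var rec record
	rec.ReadID = fields[0]

	group, err := strconv.ParseUint(fields[1], 10, 32)
	if err != nil {
		return record{}, fmt.Errorf("read group: %w", err)
	}
	rec.ReadGroupID = uint32(group)

	// Columns 2 to 5 are floating-point calibration values.
	floats := []*float64{&rec.Digitisation, &rec.Offset, &rec.Range, &rec.SamplingRate}
	for i, dst := range floats {
		if *dst, err = strconv.ParseFloat(fields[2+i], 64); err != nil {
			return record{}, fmt.Errorf("column %d: %w", 2+i, err)
		}
	}

	if rec.LenRawSignal, err = strconv.ParseUint(fields[6], 10, 64); err != nil {
		return record{}, fmt.Errorf("signal length: %w", err)
	}

	// The raw signal is assumed to be a comma-separated list of int16 values.
	for _, s := range strings.Split(fields[7], ",") {
		v, err := strconv.ParseInt(s, 10, 16)
		if err != nil {
			return record{}, fmt.Errorf("raw signal: %w", err)
		}
		rec.RawSignal = append(rec.RawSignal, int16(v))
	}
	return rec, nil
}

func main() {
	// A made-up data line, not taken from a real file.
	line := "read_0\t0\t8192\t6\t1467.61\t4000\t5\t430,472,463,467,454"
	rec, err := parseLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(rec.ReadID, rec.RawSignal) // prints "read_0 [430 472 463 467 454]"
}
```

A real file also carries header lines and auxiliary columns, which this sketch ignores; use NewParser for actual data.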

Constants

This section is empty.

Variables

This section is empty.

Functions

func SvbCompressRawSignal

func SvbCompressRawSignal(rawSignal []int16) (mask, data []byte)

SvbCompressRawSignal takes the raw signal of a read and compresses it into two byte slices: a mask and a data array. Both are needed for decompression.

Example
package main

import (
	"fmt"
	"os"

	"github.com/koeng101/dnadesign/lib/bio/slow5"
)

func main() {
	// example.slow5 is a file I generated using slow5tools from a nanopore
	// fast5 run in which I was testing nanopore sequencing for COVID testing.
	// It contains real nanopore data.
	file, _ := os.Open("data/example.slow5")
	defer file.Close()
	// Set maxLineSize to 64 KiB. If you expect longer reads,
	// make maxLineSize larger!
	const maxLineSize = 2 * 32 * 1024
	parser, _ := slow5.NewParser(file, maxLineSize)
	read, _ := parser.Next()

	// Get the raw signal from a read
	rawSignal := read.RawSignal

	// Compress that raw signal into a mask and data
	mask, data := slow5.SvbCompressRawSignal(rawSignal)

	// Decompress mask and data back into raw signal
	rawSignalDecompressed := slow5.SvbDecompressRawSignal(len(rawSignal), mask, data)

	for idx := range rawSignal {
		if rawSignal[idx] != rawSignalDecompressed[idx] {
			fmt.Println("Compression failed!")
		}
	}
	fmt.Println(data[:10])

}
Output:

[174 1 216 1 207 1 211 1 198 1]

func SvbDecompressRawSignal

func SvbDecompressRawSignal(lenRawSignal int, mask, data []byte) []int16

SvbDecompressRawSignal decompresses a raw signal back to a []int16. It requires not only the mask and data arrays returned by SvbCompressRawSignal, but also the length of the original raw signal.
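
To illustrate why a mask, a data array, and a length are all needed, here is a simplified sketch of a StreamVByte-style scheme. It is not the package's implementation, and the real mask layout may differ. The encodeSketch and decodeSketch names are hypothetical. The sketch assumes each sample is stored in as few little-endian bytes as it needs, with a 2-bit code per sample (byte count minus one) packed four to a mask byte. Because the last mask byte may be only partly used, the decoder cannot tell how many samples there are without the length.

```go
package main

import "fmt"

// encodeSketch stores each sample in 1 or 2 little-endian bytes and records
// (byte count - 1) as a 2-bit code, four codes per mask byte.
func encodeSketch(signal []int16) (mask, data []byte) {
	mask = make([]byte, (len(signal)+3)/4)
	for i, s := range signal {
		v := uint32(uint16(s)) // reinterpret the bits; negative values need 2 bytes
		n := 1
		if v > 0xFF {
			n = 2
		}
		mask[i/4] |= byte(n-1) << (2 * uint(i%4))
		for b := 0; b < n; b++ {
			data = append(data, byte(v>>(8*uint(b))))
		}
	}
	return mask, data
}

// decodeSketch reverses encodeSketch. The sample count n is required because
// unused 2-bit slots in the final mask byte look like 1-byte samples.
func decodeSketch(n int, mask, data []byte) []int16 {
	out := make([]int16, n)
	pos := 0
	for i := range out {
		size := int(mask[i/4]>>(2*uint(i%4))&3) + 1
		var v uint32
		for b := 0; b < size; b++ {
			v |= uint32(data[pos]) << (8 * uint(b))
			pos++
		}
		out[i] = int16(uint16(v))
	}
	return out
}

func main() {
	signal := []int16{430, 472, 463, 467, 454}
	mask, data := encodeSketch(signal)
	fmt.Println(data) // prints "[174 1 216 1 207 1 211 1 198 1]"
	fmt.Println(decodeSketch(len(signal), mask, data))
}
```

For these five samples the sketch happens to produce the same leading data bytes as the example output above (430 is 174 + 1×256), but that agreement is an observation, not a statement about how the package encodes its mask.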

Types

type Header

type Header struct {
	HeaderValues []HeaderValue
}

Header contains metadata about the sequencing run in general.

func (*Header) WriteTo

func (header *Header) WriteTo(w io.Writer) (int64, error)

type HeaderValue

type HeaderValue struct {
	ReadGroupID        uint32
	Slow5Version       string
	Attributes         map[string]string
	EndReasonHeaderMap map[string]int
}

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

Parser is a flexible parser that provides ample control over reading slow5 sequences. It is initialized with NewParser.

func NewParser

func NewParser(r io.Reader, maxLineSize int) (*Parser, error)

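NewParser creates a parser for a slow5 file read from r. maxLineSize bounds the size of a single line; make it larger if you expect longer reads.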

Example
package main

import (
	"fmt"
	"os"

	"github.com/koeng101/dnadesign/lib/bio/slow5"
)

func main() {
	// example.slow5 is a file I generated using slow5tools from a nanopore
	// fast5 run in which I was testing nanopore sequencing for COVID testing.
	// It contains real nanopore data.
	file, _ := os.Open("data/example.slow5")
	defer file.Close()
	// Set maxLineSize to 64 KiB. If you expect longer reads,
	// make maxLineSize larger!
	const maxLineSize = 2 * 32 * 1024
	parser, _ := slow5.NewParser(file, maxLineSize)

	var outputReads []slow5.Read
	for {
		read, err := parser.Next()
		if err != nil {
			// Break at EOF. Note that any other error also ends the loop.
			break
		}
		outputReads = append(outputReads, read)
	}

	fmt.Println(outputReads[0].RawSignal[0:10])
}
Output:

[430 472 463 467 454 465 463 450 450 449]

func (*Parser) Header

func (parser *Parser) Header() (Header, error)

Header returns the header of the slow5 file being parsed.

func (*Parser) Next

func (parser *Parser) Next() (Read, error)

Next parses the next read from a parser.
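
The NewParser example stops on any error from Next. A stricter loop might separate a clean end of input from a real failure. The sketch below assumes that Next signals the end of the file with io.EOF, which this page does not confirm. The collect helper, the fakeSource stand-in, and errBroken are hypothetical; collect is generic so that it needs nothing from this package to run.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errBroken is a hypothetical failure used by the stand-in source.
var errBroken = errors.New("broken line")

// collect drains next until it reports io.EOF (assumed to mean a clean end
// of input) and returns early, with the items read so far, on other errors.
func collect[R any](next func() (R, error)) ([]R, error) {
	var out []R
	for {
		item, err := next()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out = append(out, item)
	}
}

// fakeSource is a stand-in for a parser: it yields 0..n-1, then either
// errBroken (fail == true) or io.EOF.
func fakeSource(n int, fail bool) func() (int, error) {
	i := 0
	return func() (int, error) {
		if i >= n {
			if fail {
				return 0, errBroken
			}
			return 0, io.EOF
		}
		i++
		return i - 1, nil
	}
}

func main() {
	items, err := collect(fakeSource(3, false))
	fmt.Println(items, err) // prints "[0 1 2] <nil>"

	items, err = collect(fakeSource(1, true))
	fmt.Println(items, err) // prints "[0] broken line"
}
```

With a real parser, the method value could be passed directly, for example collect(parser.Next), since Next has the signature func() (Read, error).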

type Read

type Read struct {
	ReadID       string
	ReadGroupID  uint32
	Digitisation float64
	Offset       float64
	Range        float64
	SamplingRate float64
	LenRawSignal uint64
	RawSignal    []int16

	// Auxiliary fields
	ChannelNumber string
	MedianBefore  float64
	ReadNumber    int32
	StartMux      uint8
	StartTime     uint64
	EndReason     string // enum{unknown,partial,mux_change,unblock_mux_change,data_service_unblock_mux_change,signal_positive,signal_negative}

	EndReasonMap map[string]int // Used for writing
}

Read contains metadata and raw signal strengths for a single nanopore read.
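
The Digitisation, Offset, and Range fields exist so that raw integer samples can be scaled to a current in picoamperes. A minimal sketch of the conventional nanopore conversion, pA = (raw + offset) × range / digitisation, follows. The toPicoamps helper is hypothetical and not part of this package, and the calibration values in main are made up for illustration.

```go
package main

import "fmt"

// toPicoamps scales raw samples to picoamperes using
// pA = (raw + offset) * range / digitisation.
func toPicoamps(raw []int16, digitisation, offset, signalRange float64) []float64 {
	scale := signalRange / digitisation
	out := make([]float64, len(raw))
	for i, sample := range raw {
		out[i] = (float64(sample) + offset) * scale
	}
	return out
}

func main() {
	// The first three samples from the package example; the calibration
	// values below are invented for this sketch.
	raw := []int16{430, 472, 463}
	for _, pA := range toPicoamps(raw, 8192, 6, 1467.61) {
		fmt.Printf("%.2f pA\n", pA)
	}
}
```

With a Read value named read, the call would be toPicoamps(read.RawSignal, read.Digitisation, read.Offset, read.Range).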

func (*Read) WriteTo

func (read *Read) WriteTo(w io.Writer) (int64, error)

Directories

Path Synopsis
svb
cpuid
Package cpuid provides access to the information available through the CPUID instruction.
