genemap

Version: v0.0.0-...-f9e32f2
Published: May 7, 2021 License: MIT Imports: 2 Imported by: 0

README

Map between gene identifiers in Go

Currently supported identifiers

Identifier | Setting name | Examples
BioGRID | Biogrid | 116400
Ensembl gene | EnsemblGene | ENSG00000114209
Ensembl protein | EnsemblProtein | ENSP00000376506
Entrez gene ID | Entrez | 11235
Gene symbol | Symbol | PDCD10, CCM3
HGNC | HGNC | 8761
Protein name | Name | Programmed cell death protein 10
RefSeq mRNA | RefseqMRNA | NM_007217
RefSeq protein | RefseqProtein | NP_009148
UniProt Accession | Accession | Q9BUL8
UniProt ID | ID | PDC10_HUMAN
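
The setting names in the table above are the strings used to select the source and target types. As a minimal sketch, a caller could check a name against that list before using it. The helper isSupportedType and the supportedTypes set are hypothetical and not part of genemap, and the sketch assumes the names are case-sensitive.

```go
package main

import "fmt"

// supportedTypes lists the setting names from the table above.
// This set is an illustration, not something exported by genemap.
var supportedTypes = map[string]bool{
	"Biogrid":        true,
	"EnsemblGene":    true,
	"EnsemblProtein": true,
	"Entrez":         true,
	"Symbol":         true,
	"HGNC":           true,
	"Name":           true,
	"RefseqMRNA":     true,
	"RefseqProtein":  true,
	"Accession":      true,
	"ID":             true,
}

// isSupportedType is a hypothetical helper that reports whether name
// exactly matches one of the setting names (case-sensitivity is assumed).
func isSupportedType(name string) bool {
	return supportedTypes[name]
}

func main() {
	for _, name := range []string{"Symbol", "Accession", "symbol"} {
		fmt.Printf("%s supported: %t\n", name, isSupportedType(name))
	}
}
```

A check like this catches typos in a type name before any conversion is attempted.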

Installation for CLI (optional)

> go get -v github.com/knightjdr/genemap
> cd $GOPATH/src/github.com/knightjdr/genemap
> go install ./...

Usage

Generate file for mapping.

The first step in mapping is to download reviewed UniProt entries and approved HGNC entries and create a JSON file containing the supported identifiers. Currently only human genes are supported. By default the file is created in the current directory, but a target directory can be passed as a flag/option. The mapping file's name includes the creation date, formatted as genemap-YYYY-MM-DD.json. This lets you maintain your own mapping database and update it as you see fit.

Import
package main

import "github.com/knightjdr/genemap"

func main() {
  // Download the source data and write genemap-YYYY-MM-DD.json to this folder.
  genemap.Generate("/path/to/output/folder/")
}

CLI

The map generation step can also be run from the command line:

> genemapcreate --folder="/path/to/output/folder/"
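
Because the creation date is embedded in the file name as genemap-YYYY-MM-DD.json, several mapping files can sit side by side in one folder. The following is a minimal sketch of picking the most recent one by name. The helper latestMapFile is hypothetical and not part of genemap; it relies only on the file name format described above, where sorting by name is the same as sorting by date.

```go
package main

import (
	"fmt"
	"log"
	"path/filepath"
	"regexp"
	"sort"
)

// mapFilePattern matches file names of the form genemap-YYYY-MM-DD.json.
var mapFilePattern = regexp.MustCompile(`^genemap-\d{4}-\d{2}-\d{2}\.json$`)

// latestMapFile is a hypothetical helper. It returns the path whose base
// name carries the latest date, or false if no path matches the pattern.
func latestMapFile(paths []string) (string, bool) {
	var matches []string
	for _, p := range paths {
		if mapFilePattern.MatchString(filepath.Base(p)) {
			matches = append(matches, p)
		}
	}
	if len(matches) == 0 {
		return "", false
	}
	// ISO-style dates sort chronologically when compared as strings.
	sort.Slice(matches, func(i, j int) bool {
		return filepath.Base(matches[i]) < filepath.Base(matches[j])
	})
	return matches[len(matches)-1], true
}

func main() {
	paths, err := filepath.Glob("genemap-*.json")
	if err != nil {
		log.Fatal(err)
	}
	if latest, ok := latestMapFile(paths); ok {
		fmt.Println("latest mapping file:", latest)
	} else {
		fmt.Println("no mapping file found")
	}
}
```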

Convert identifiers

You can use the mapping file created in the previous step to convert between supported gene identifiers. The Mapper struct is used for setting mapping parameters and will contain the results of the mapping.

Import
package main

import (
  "fmt"
  "log"

  "github.com/knightjdr/genemap"
)

func main() {
  // Identifiers to convert; a []string is assumed here.
  ids := []string{"PDCD10"}

  mapper := genemap.CreateMapper()
  mapper.Load("/path/to/map/genemap.json")
  mapper.FromType = "Symbol"
  mapper.ToType = "Accession"

  err := mapper.Convert(ids)
  if err != nil {
    log.Fatalf("Error converting IDs: %v", err)
  }

  fmt.Println(mapper.Converted)
}

The Mapper struct is defined as follows:

type Mapper struct {
	Converted           map[string]string
	FromType            string
	PossibleConversions map[string][]string
	ToType              string
	Unconverted         []string
}

Converted holds a one-to-one mapping between the requested identifier types, and Unconverted lists the identifiers that could not be converted. PossibleConversions holds every identifier of the target type that a supplied identifier could be converted to. For example, if you request a mapping to UniProt identifiers, Converted will map each identifier to a reviewed UniProt entry, while PossibleConversions will contain all consistent UniProt entries, whether reviewed or not. When converting to a UniProt accession or ID, Converted always maps to the reviewed entry. Similarly, when converting to Symbol, Converted always maps to the official gene symbol. For other target types, if several identifiers are possible, Converted simply contains one of them.
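
As a rough sketch of how these three fields might be inspected after a conversion, the example below flags identifiers that have more than one possible conversion. To stay runnable without a mapping file, it uses a local result type that mirrors the result fields of Mapper rather than the genemap type itself. The helper ambiguous, the assumption that PossibleConversions is keyed by the input identifier, and the placeholder values are all illustrative.

```go
package main

import (
	"fmt"
	"sort"
)

// result mirrors the result fields of the Mapper struct shown above.
// It is a stand-in for illustration, not the genemap type.
type result struct {
	Converted           map[string]string
	PossibleConversions map[string][]string
	Unconverted         []string
}

// ambiguous is a hypothetical helper that returns, in sorted order, the
// identifiers with more than one possible conversion. It assumes
// PossibleConversions is keyed by the input identifier.
func ambiguous(r result) []string {
	var ids []string
	for id, options := range r.PossibleConversions {
		if len(options) > 1 {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids
}

func main() {
	// Placeholder values only; this is not real output from genemap.
	r := result{
		Converted: map[string]string{"PDCD10": "Q9BUL8"},
		PossibleConversions: map[string][]string{
			"PDCD10": {"Q9BUL8", "UNREVIEWED_PLACEHOLDER"},
		},
		Unconverted: []string{"UNKNOWN_PLACEHOLDER"},
	}

	fmt.Println("converted:", r.Converted)
	fmt.Println("needs review:", ambiguous(r))
	fmt.Println("not converted:", r.Unconverted)
}
```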

CLI

ID conversion can also be done from the command line. It requires a mapping file (.json format) and a text file with the IDs to convert (one entry per line), and it writes the converted identifiers to a text file. If no output file name is supplied, the output is written to conversion.txt.

> genemapconvert --fromType="Symbol" --toType="Accession" --idFile="ids.txt" --mapFile="genemap.json" --outFile="my-out-file.txt"

Documentation

Overview

Package genemap contains functions for generating a map of gene identifiers and converting between them.

Variables

var CreateMapper = terms.CreateMapper

CreateMapper creates a structure for mapping between gene identifiers.

Generate creates a mapping file for converting between gene identifiers.

Types

type Mapper

type Mapper terms.Mapper

Mapper is the structure for setting mapping options and retrieving results.

Directories

Path | Synopsis
cmd/genemapconvert | Package main reads gene identifiers from a file and converts between requested types.
cmd/genemapcreate | Package main generates a mapping file between various types.
internal/generate | Package generate generates a mapping file between various types.
internal/terms | Package terms maps gene identifiers between types.
pkg/flags | Package flags handles flag parsing.
pkg/fs | Package fs creates a filesystem to use (for easy mocking).
pkg/slice | Package slice contains methods for slice conversion and modification.
pkg/zip | Package zip handles file compression and decompression.
