This page lists bioinformatics tools and software installed across several BioCommons infrastructure partner systems, including Gadi, the Australian BioCommons Tools and Workflows repository at NCI (project if89), Setonix, Bunya, and Galaxy Australia.
Please let us know if you have any feedback.
|
|||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Tool metadata | Availability on Australian compute infrastructures | ||||||||||||
Tool Name | Description | Registry link | Tool identifier (e.g. module name) | Topic(s) | Publications | Containers available? (BioContainers) | License | Resources / documentation | Galaxy Australia | NCI (Gadi) | NCI (if89) | Pawsey (Setonix) | QRIScloud / UQ-RCC (Bunya) |
3D de novo assembly (3D-DNA) is a pipeline for de novo assembly using HiC. | 3d-dna | De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds | MIT | 201008 | |||||||||
Mass screening of contigs for antimicrobial resistance or virulence genes. | abricate | ABRicate | GPL-2.0 | 3 tools | 1.0.0-gompi-2021a | ||||||||
De novo genome sequence assembler using short reads. | abyss | 2 publications | ABySS | GPL-3.0 | ABySS 2.3.7+galaxy0 | 2.2.3 | |||||||
Another Gff Analysis Toolkit (AGAT): a suite of tools to handle gene annotations in any GTF/GFF format. | agat | 10.5281/zenodo.3552717 | GPL-3.0 | AGAT 1.4.0+galaxy0 | 1.4.0 | ||||||||
The Assisted Model Building with Energy Refinement tool refers to two things: a set of molecular mechanical force fields for the simulation of biomolecules (which are in the public domain, and are used in a variety of simulation programs); and a package of molecular simulation programs which includes source code and demos. | amber | 2 publications | Generate MD topologies for small molecules 21.10+galaxy0 | 19.19.12 20 20-tools21 22 | |||||||||
Consists of several independently developed packages that work well by themselves, and with Amber (Assisted Model Building with Energy Refinement) itself. The suite can also be used to carry out complete (non-periodic) molecular dynamics simulations (using NAB), with generalized Born solvent models. | ambertools | The Amber biomolecular simulation programs | MMPBSA/MMGBSA 21.10+galaxy0 | ||||||||||
Software package specially developed for the study of genes’ primary structure. It uses gene sequences downloaded from public databases, as FASTA and GenBank, and it applies a set of statistical and visualization methods in different ways, to reveal information about codon context, codon usage, nucleotide repeats within open reading frames (ORFeome) and others. | anaconda | Statistical, computational and visualization methodologies to unveil gene primary structure features | 2022.05 | ||||||||||
From https://anndata.readthedocs.io/en/latest/ "Python package for handling annotated data matrices in memory and on disk, positioned between pandas and xarray." | anndata | 10.1101/2021.12.16.473007 | BSD-3-Clause | 4 tools | |||||||||
This tool can get annotation for a generic set of IDs, using the Bioconductor annotation data packages. Supported organisms are human, mouse, rat, fruit fly and zebrafish. The org.db packages that are used here are primarily based on mapping using Entrez Gene identifiers. More information on the annotation packages can be found at the Bioconductor website, for example, information on the human annotation package (org.Hs.eg.db) can be found here. | annotatemyids | MIT | annotateMyIDs 3.18.0+galaxy0 | ||||||||||
Rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes. It integrates and cross-links with a large number of in silico secondary metabolite analysis tools that have been published earlier. | antismash | 5 publications | antiSMASH | Antismash 6.1.1+galaxy1 | |||||||||
Convert various sequence formats to FASTA | any2fasta | GPL-3.0 | 0.4.2-gcccore-10.3.0 | ||||||||||
Apollo is a genome annotation viewer and editor. Apollo allows researchers to explore genomic annotations at many levels of detail, and to perform expert annotation curation, all in a graphical environment. | apollo | Apollo: a sequence annotation editor. | |||||||||||
ARAGORN detects tRNA, mtRNA, and tmRNA genes. | aragorn | ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences | Not licensed | tRNA and tmRNA 0.6 | |||||||||
argtable | 2.13 | ||||||||||||
Arriba is a command-line tool to detect gene fusions from RNA-Seq data based on the STAR aligner. In addition to fusions, it can detect exon duplications/inversions and truncations of genes (i.e., breakpoints in introns and intergenic regions). Arriba is the winner of the DREAM SMC-RNA Challenge. | arriba | MIT | 3 tools | ||||||||||
AUGUSTUS is a eukaryotic gene prediction tool. It can integrate evidence, e.g. from RNA-Seq, ESTs, proteomics, but can also predict genes ab initio. The PPX extension to AUGUSTUS can take a protein sequence multiple sequence alignment as input to find new members of the family in a genome. It can be run through a web interface (see https://bio.tools/webaugustus), or downloaded and run locally. | augustus | 9 publications | Augustus | Artistic-1.0 | 2 tools | 3.4.0 3.5.0 | 3.4.0-foss-2021a 3.5.0-foss-2022a (D) | ||||||
AutoDock Vina is a new open-source program for drug discovery, molecular docking and virtual screening, offering multi-core capability, high performance and enhanced accuracy and ease of use. | autodock_vina | 10.1002/jcc.21334 | 4 tools | ||||||||||
Rapid & standardized annotation of bacterial genomes, MAGs & plasmids | bakta | 10.1099/mgen.0.000685 | GPL-3.0 | Bakta 1.9.2+galaxy0 | |||||||||
BamTools provides a fast, flexible C++ API & toolkit for reading, writing, and managing BAM files. | bamtools | Bamtools: A C++ API and toolkit for analyzing and managing BAM files | bamtools | MIT | 5 tools | 2.5.2 | 2.5.1--hd03093a_10 | 2.5.2-gcc-10.3.0 2.5.2-gcc-11.3.0 (D) | |||||
Bamutil provides a series of programs to work on SAM/BAM files. | bamutil_diff | An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data | GPL-3.0 | BamUtil diff 1.0.15+galaxy1 | |||||||||
GUI program that allows users to interact with the assembly graphs made by de novo assemblers such as Velvet, SPAdes, MEGAHIT and others. It visualises assembly graphs, with connections, using graph layout algorithms. | bandage | 10.1093/bioinformatics/btv383 | bandage | GPL-3.0 | 2 tools | ||||||||
Predict the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S). | barrnap | barrnap | GPL-3.0 | barrnap 1.2.2 | |||||||||
basespace | 1.5.3 matlab | ||||||||||||
BBMap is a fast splice-aware aligner for RNA and DNA. It is faster than almost all short-read aligners, yet retains unrivaled sensitivity and specificity, particularly for reads with many errors and indels. | bbmap | bbmap | BSD-3-Clause | BBTools: BBduk 39.08+galaxy0 | 38.93 | 38.96--h5c4e2a8_0 | 38.96-gcc-10.3.0 39.01-gcc-11.3.0 (D) | ||||||
A tool for filling the gap created by genomic data processing/analysis by rebasing some analysis results against the parent features which were originally analysed. | bcbiogff | Biopython: Freely available Python tools for computational molecular biology and bioinformatics | 3 tools | ||||||||||
BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed. | bcftools | 2 publications | BCFtools | MIT | 30 tools | 1.9 1.12 | 1.15--haf5b3da_0 | 1.12-gcc-10.3.0 1.15.1-gcc-11.3.0 (D) | |||||
bcl2fastq2 | 2.20.0-gcc-11.3.0 | ||||||||||||
Beagle is a software package that performs genotype calling, genotype phasing, imputation of ungenotyped markers, and identity-by-descent segment detection. | beagle | 4 publications | Beagle | GPL-3.0 | 5.4.22jul22.46e-java-11 | ||||||||
BEAGLE is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages. | beagle-lib | BEAGLE: An application programming interface and high-performance computing library for statistical phylogenetics | GPL-3.0 | 3.1.2 | 3.1.2-gcc-11.3.0 4.0.0-gcc-11.3.0 (D) | ||||||||
The Bayesian Evolutionary Analysis Sampling Trees is a cross-platform program for Bayesian analysis of molecular sequences using MCMC (Markov chain Monte Carlo). It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. | beast | 3 publications | BEAST | 1.10.4 | 1.10.4 | ||||||||
Bayesian phylogenetic analysis of molecular sequences. It estimates rooted, time-measured phylogenies using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. It uses Markov chain Monte Carlo (MCMC) to average over tree space, so that each tree is weighted proportional to its posterior probability. It includes a graphical user interface for setting up standard analyses and a suite of programs for analysing the results. | beast2 | BEAST 2: A Software Platform for Bayesian Evolutionary Analysis | 2.6.7 | ||||||||||
Convert a BED format file of the proteins from a proteomics search database into a tabular format for the Multiomics Visualization Platform (MVP). | bed_to_protein_map | Not licensed | bed to protein map 0.2.0 | ||||||||||
BEDTools is an extensive suite of utilities for comparing genomic features in BED format. | bedtools | BEDTools: A flexible suite of utilities for comparing genomic features | BEDTools | GPL-2.0 | 39 tools | 2.28.0 | 2.30.0--h468198e_3 | 2.30.0-gcc-10.3.0 2.30.0-gcc-11.3.0 (D) | |||||
Bellerophon is a pipeline created to remove falsely assembled chimeric transcripts in de novo transcriptome assemblies. The pipeline can be downloaded as a Vagrant virtual machine (https://app.vagrantup.com/bellerophon/boxes/bellerophon). This is recommended, as it avoids backwards compatibility problems with TransRate. | bellerophon | The Bellerophon pipeline, improving de novo transcriptomes and removing chimeras | Filter and merge 1.0+galaxy1 | ||||||||||
Trim, circularise, orient and filter long read bacterial genome assemblies | berokka | GPL-3.0 | Berokka 0.2.3 | ||||||||||
bftools | 2 tools | ||||||||||||
bio-db-hts | 3.01-gcc-11.3.0 | ||||||||||||
bio-searchio-hmmer | 1.7.3-gcc-10.3.0 1.7.3-gcc-11.3.0 (D) | ||||||||||||
Bio3D is an R package containing utilities for the analysis of protein structure, sequence and trajectory data. | bio3d | The Bio3D packages for structural bioinformatics | GPL-3.0 | 4 tools | |||||||||
biobakery_workflows | 3.1 | ||||||||||||
Tools for early stage NGS alignment file processing including fast sorting and duplicate marking. | biobambam | Biobambam: Tools for read pair collation based algorithms on BAM files | GPL-3.0 | 2.0.182 | |||||||||
This package includes basic tools for reading biom-format files, accessing and subsetting data tables from a biom object, as well as limited support for writing a biom-object back to a biom-format file. The design of this API is intended to match the python API and other tools included with the biom-format project, but with a decidedly "R flavor" that should be familiar to R users. This includes S4 classes and methods, as well as extensions of common core functions/methods. | biom-format | Orchestrating high-throughput genomic analysis with Bioconductor | biom-format | GPL-2.0 | 2 tools | ||||||||
bionano_solve | Bionano Hybrid Scaffold 3.7.0+galaxy3 | ||||||||||||
A collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications. It provides software modules for many of the typical tasks of bioinformatics programming. | bioperl | An introduction to BioPerl | bioperl | 1.7.8-gcccore-10.3.0 1.7.8-gcccore-11.3.0 (D) | |||||||||
Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. | biopython | Biopython: Freely available Python tools for computational molecular biology and bioinformatics | MIT | 3 tools | 1.79 | 1.79-foss-2021a 1.79-foss-2022a (D) | |||||||
BioTransformer is a freely available web server that supports accurate, rapid and comprehensive in silico metabolism prediction. | biotransformer | BioTransformer 3.0 - a web server for accurately predicting metabolic transformation products | LGPL-3.0 | BioTransformer 3.0.20230403+galaxy3 | |||||||||
Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion. | bismark | Bismark: A flexible aligner and methylation caller for Bisulfite-Seq applications | bismark | GPL-3.0 | 4 tools | ||||||||
A tool that finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance. | blast | 4 publications | 2.10.1 2.11.0 | 2.13.0 2.14.1 | 2.12.0--pl5262h3289130_0 | 2.11.0-linux_x86_64 2.13.0--hf3cf87c_0 | |||||||
A tool that finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance. | blast+ | 4 publications | 16 tools | 2.11.0-gompi-2021a 2.13.0-gompi-2022a (D) | |||||||||
Fast, accurate spliced alignment of DNA sequences. | blat | BLAT - The BLAST-like alignment tool | 37 | 3.7-gcc-11.3.0 | |||||||||
Detects blocks of overlapping reads using a Gaussian-distribution approach. | Blockbuster | Evidence for human microRNA-offset RNAs in small RNA sequencing data | blockbuster 0.1.2 | ||||||||||
BlockClust | BlockClust 1.1.1 | ||||||||||||
bolt-lmm | 2.4.1-intel-2022a | ||||||||||||
Boost is a set of libraries for the C++ programming language that provides support for tasks and structures such as linear algebra, pseudorandom number generation, multithreading, image processing, regular expressions, and unit testing. | boost | Other | 1.71.0 1.72.0 1.77.0 1.79.0 1.80.0 | 1.76.0-gcc-10.3.0 1.79.0-gcc-11.3.0 (D) | |||||||||
Bowtie is an ultrafast, memory-efficient short read aligner. | bowtie | 3 publications | Bowtie | 2 tools | 1.3.1-gcc-11.3.0 | ||||||||
Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes. | bowtie2 | 6 publications | Bowtie2 | GPL-3.0 | 2.3.5.1 | 2.4.5--py36hd4290be_0 | 2.4.4-gcc-10.3.0 2.4.5-gcc-11.3.0 (D) | ||||||
Statistical method that computes the abundance of species in DNA sequences from a metagenomics sample. | bracken | Bracken: Estimating species abundance in metagenomics data | GPL-3.0 | Bracken 3.0+galaxy0 | |||||||||
Pipeline for unsupervised RNA-Seq-based genome annotation with GeneMark-ET and AUGUSTUS. | braker | BRAKER1: Unsupervised RNA-Seq-based genome annotation with GeneMark-ET and AUGUSTUS | BRAKER3 3.0.6+galaxy2 | 3.0.3 | |||||||||
Runs Breseq software on a set of fastq files. | breseq | 3 publications | breseq 0.35.5+0 | ||||||||||
Provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs. | busco | 4 publications | BUSCO | Busco 5.5.0+galaxy0 | 5.2.1 5.4.0 | 5.4.2-foss-2021a 5.4.5-foss-2022a (D) | |||||||
Accelerated nanopore basecalling with SLOW5 data format. | buttery-eel | Accelerated nanopore basecalling with SLOW5 data format | MIT | 0.3.1+guppy6.4.2 0.4.1+guppy6.5.7 0.4.2+dorado7.2.13 0.4.2+guppy6.5.7 | |||||||||
Fast, accurate, memory-efficient aligner for short and long sequencing reads | bwa | 6 publications | bwa | MIT | 2 tools | 0.7.17 | 0.7.17--h7132678_9 | 0.7.17-gcc-10.3.0 0.7.17-gcccore-11.3.0 (D) | |||||
BWA-meth | bwameth 0.2.7+galaxy0 | ||||||||||||
Bwa-mem2 is the next version of the bwa-mem algorithm in bwa. It produces alignment identical to bwa and is ~1.3-3.1x faster depending on the use-case, dataset and the running machine. | bwa-mem2 | Efficient architecture-aware acceleration of BWA-MEM for multicore systems | MIT | BWA-MEM2 2.2.1+galaxy1 | 2.2.1 | 2.2.1--hd03093a_2 | |||||||
bwakit | 0.7.11 0.7.17 | ||||||||||||
Tools for manipulating biological data, particularly multiple sequence alignments. | bx-python | MIT | 13 tools | ||||||||||
c3s | Copernicus Climate Data Store 0.1.0 | ||||||||||||
Cactus is a reference-free whole-genome multiple alignment program. | cactus | 3 publications | 2 tools | 2.0.3 | |||||||||
Annotation of peaklists generated by xcms, rule based annotation of isotopes and adducts, isotope validation, EIC correlation based tagging of unknown adducts and fragments. | camera | CAMERA: An integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets | GPL-2.0 | 2 tools | |||||||||
De-novo assembly tool for long read chemistry like Nanopore data and PacBio data. | canu | Canu: Scalable and accurate long-read assembly via adaptive κ-mer weighting and repeat separation | Canu | Canu assembler 2.2+galaxy0 | 2.0 2.0t 1.9 2.1.1 | 2.2--ha47f30e_0 | 2.2-gcccore-10.3.0 2.2-gcccore-11.3.0 (D) | ||||||
Web-based contig assembly. | cap3 | CAP3: A DNA sequence assembly program | cap3 2.0.0 | ||||||||||
Implements statistical and computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification. | cardinal | 10.1093/bioinformatics/btv146 | Artistic-2.0 | 9 tools | |||||||||
Contig Annotation Tool (CAT) and Bin Annotation Tool (BAT) are pipelines for the taxonomic classification of long DNA sequences and metagenome assembled genomes (MAGs/bins) of both known and (highly) unknown microorganisms, as generated by contemporary metagenomics studies. The core algorithm of both programs involves gene calling, mapping of predicted ORFs against the nr protein database, and voting-based classification of the entire contig / MAG based on classification of the individual ORFs. | cat_bins | Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT | MIT | 2 tools | |||||||||
Cluster a nucleotide dataset into representative sequences. | cd-hit | 5 publications | cd-hit | 2 tools | 4.8.1 | 4.8.1-gcc-10.3.0 4.8.1-gcc-11.3.0 (D) | |||||||
celseq2 is a Python framework for generating the UMI count matrix from CEL-Seq2 sequencing data. | celseq2 | CEL-Seq2: Sensitive highly-multiplexed single-cell RNA-Seq | BSD-2-Clause | 0.5.3 | |||||||||
Tool for quantifying data from biological images, particularly in high-throughput experiments. | CellProfiler | 2 publications | CellProfiler | BSD-3-Clause | 23 tools | ||||||||
cellranger | 6.1.2 | 7.1.0 | |||||||||||
cellxgene (pronounced "cell-by-gene") is an interactive data explorer for single-cell transcriptomics datasets, such as those coming from the Human Cell Atlas. | cellxgene | 10.1101/2021.04.05.438318 | MIT | Interactive CellXgene Environment 1.1.1 | |||||||||
CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes. | checkm | CheckM: Assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes | GPL-3.0 | 1.1.3-foss-2021a 1.2.2-foss-2022a (D) | |||||||||
CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes. | checkm-database | CheckM: Assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes | GPL-3.0 | 2015_01_16 | |||||||||
Database of bioactive compounds, their quantitative properties and bioactivities (binding constants, pharmacology and ADMET, etc). The data is abstracted and curated from the primary scientific literature. | chembl | 2 publications | 2 tools | ||||||||||
Fast cheminformatics fingerprint search, at your fingertips. Chemfp is a set of command-line tools and a Python library for fingerprint generation and high-performance similarity search. There are two ways to try out chemfp. From the download page you can request an evaluation copy of the most recent version of chemfp, or you can download an earlier version for no cost under the MIT license. | chemfp | 2 publications | 4 tools | ||||||||||
ChemicalToolbox is a publicly available web server for performing cheminformatics analysis. The ChemicalToolbox provides an intuitive, graphical interface for common tools for downloading, filtering, visualizing and simulating small molecules and proteins. The ChemicalToolbox is based on Galaxy, an open-source web-based platform which enables accessible and reproducible data analysis. There is already an active Galaxy cheminformatics community using and developing tools. Based on their work, we provide four example workflows which illustrate the capabilities of the ChemicalToolbox, covering assembly of a compound library, hole filling, protein-ligand docking, and construction of a quantitative structure-activity relationship (QSAR) model. | chemicaltoolbox | The ChemicalToolbox: Reproducible, user-friendly cheminformatics analysis on the Galaxy platform | 3 tools | ||||||||||
This package implements functions to retrieve the nearest genes around the peak, annotate the genomic region of the peak, and estimate the statistical significance of overlap among ChIP peak data sets. It also incorporates the GEO database so that users can compare their own dataset with those deposited in the database. The comparison can be used to infer cooperative regulation and thus can be used to generate hypotheses. | chipseeker | ChIP seeker: An R/Bioconductor package for ChIP peak annotation, comparison and visualization | chipseeker | Artistic-2.0 | ChIPseeker 1.28.3+galaxy0 | ||||||||
ChiRA is a tool suite to analyze RNA-RNA interactome experimental data such as CLASH, CLEAR-CLIP, PARIS, SPLASH, etc. | chira | GPL-3.0 | 5 tools | ||||||||||
An ultra fast, heuristic approach to detect conserved signals in extremely large pairwise genome comparisons (dotplot). | chromeister | Ultra-fast genome comparison for large-scale genomic experiments | GPL-3.0 | Chromeister 1.5.a+galaxy1 | |||||||||
circexplorer | CIRCexplorer 1.1.9.0 | ||||||||||||
circlator | 1.5.5--py_3 | ||||||||||||
Circos is a tool for visualizing data in a circular format. It was developed for genomic data but can work for many other kinds of data as well. | circos | 2 publications | circos | 12 tools | |||||||||
clifinder | CLIFinder 0.5.1 | ||||||||||||
climate_stripes | climate stripes 1.0.1 | ||||||||||||
Automatic generation of gene cluster comparison figures. Gene cluster comparison figure generator. A d3 chart for generating gene cluster comparison figures. clinker is a pipeline for easily generating publication-quality gene cluster comparison figures. Given a set of GenBank files, clinker will automatically extract protein translations, perform global alignments between sequences in each cluster, determine the optimal display order based on cluster similarity, and generate an interactive visualisation (using clustermap.js) that can be extensively tweaked before being exported as an SVG file. clustermap.js is an interactive, reusable d3 chart designed to visualise homology between multiple gene clusters. | clinker | 10.1101/2020.11.08.370650 | MIT | clinker 0.0.23+galaxy0 | |||||||||
clipkit | ClipKIT. Alignment trimming software for phylogenetics. 0.1.0 | ||||||||||||
Multiple sequence alignment software. The name is occasionally spelled as ClustalOmega, Clustal Ω, ClustalΩ, Clustal O, ClustalO. | clustalo | 3 publications | Clustal Omega | GPL-2.0 | 1.2.4 | 1.2.4--h87f3376_5 | |||||||
Multiple sequence alignment software. Old deprecated versions. Even older versions were CLUSTAL and CLUSTAL V (ClustalV). | clustalw | 5 publications | clustalw | ClustalW 2.1+galaxy1 | 2.1 | ||||||||
ColabFold databases are MMseqs2 expandable profile databases to generate diverse multiple sequence alignments to predict protein structures. | colabfold_batch | 2 publications | MIT | 1.4.0 1.5.2 | |||||||||
compose_text_param | Compose text parameter value 0.1.1 | ||||||||||||
cookiecutter | 2.4.0 | ||||||||||||
Count1 | Count 1.0.3 | ||||||||||||
CPAT (Coding-Potential Assessment Tool) is a logistic regression model-based classifier that can accurately and quickly distinguish protein-coding and noncoding RNAs using pure linguistic features calculated from the RNA sequences. CPAT takes as input the nucleotide sequences or genomic coordinates of RNAs and outputs the probabilities p (0 ≤ p ≤ 1), which measure the likelihood of protein coding. | cpat | RNA Coding Potential Prediction Using Alignment-Free Logistic Regression Model | CPAT (Coding-Potential Assessment Tool) | GPL-3.0 | CPAT 3.0.5+galaxy1 | ||||||||
Software to generate CRISPR guide RNAs against genomes annotated with individual variation. | crisflash | Crisflash: Open-source software to generate CRISPR guide RNAs against genomes annotated with individual variation | GPL-3.0 | 1.2.0 star-ccm+ | |||||||||
ctsm_fates | CTSM/FATES-EMERALD 2.0.1 | ||||||||||||
cuda | 10.1 11.0.3 11.2.2 11.4.1 11.6.1 11.7.0 | ||||||||||||
The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. | cudnn | 7.6.5-cuda10.1 8.1.1-cuda11 8.2.2-cuda11.4 8.6.0-cuda11 | |||||||||||
Cufflinks assembles transcripts and estimates their abundances in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one. | cufflinks | 5 publications | Cufflinks | BSL-1.0 | 5 tools | ||||||||
Allows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations. | cummeRbund | Erratum: Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks (Nature Protocols (2012) 7 (562-578)) | cummeRbund | Artistic-2.0 | cummeRbund 2.16.0+galaxy1 | ||||||||
Generate customized protein sequence database from RNA-Seq data for proteomics search. | customprodb | CustomProDB: An R package to generate customized protein databases from RNA-Seq data for proteomics search | Artistic-2.0 | CustomProDB 1.22.0 | |||||||||
Find and remove adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads. | cutadapt | 2 publications | MIT | Cutadapt 4.9+galaxy1 | 3.7 | 3.7--py38hbff2b2d_0 | 3.4-gcccore-10.3.0 4.2-gcccore-11.3.0 (D) | ||||||
Long-read-based human genomic structural variation detection with cuteSV. Long-read sequencing technologies enable comprehensive discovery of structural variations (SVs). However, it is still non-trivial for state-of-the-art approaches to detect SVs with high sensitivity, high performance, or both. cuteSV is a sensitive, fast and lightweight SV detection approach. It uses tailored methods to comprehensively collect various types of SV signatures, and a clustering-and-refinement method to implement stepwise SV detection, which achieves high sensitivity without loss of accuracy. Benchmark results demonstrate that cuteSV has better yields on real datasets. Further, its speed and scalability are outstanding and promising for large-scale data analysis. | cutesv | 2 publications | MIT | cuteSV 1.0.8+galaxy0 | 1.0.13 | ||||||||
This package infers exact sequence variants (SVs) from amplicon data, replacing the commonly used and coarser OTU clustering approach. This pipeline inputs demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier. | dada2 | 10.1038/nmeth.3869 | GPL-3.0 | 10 tools | |||||||||
dadi | 2.0.5 | ||||||||||||
datamash | Datamash 1.8+galaxy0 | ||||||||||||
dbbuilder | Protein Database Downloader 0.3.4 | ||||||||||||
dcm2niix | 1.0.20220720 1.0.20230411 | ||||||||||||
deepconsensus-cpu | 1.0.0 | ||||||||||||
deepconsensus-gpu | 1.2.0 | ||||||||||||
User-friendly tools for the normalization and visualization of deep-sequencing data. | deeptools | DeepTools: A flexible platform for exploring deep-sequencing data | DeepTools | GPL-3.0 | 17 tools | 3.5.0-foss-2021a 3.5.2-foss-2022a (D) | |||||||
DendroPy is a Python library for phylogenetic computing. It provides classes and functions for the simulation, processing, and manipulation of phylogenetic trees and character matrices, and supports the reading and writing of phylogenetic data in a range of formats. | dendropy | DendroPy: A Python library for phylogenetic computing | dendropy | BSD-3-Clause | 4.5.2-gcccore-10.3.0 4.5.2-gcccore-11.3.0 (D) | ||||||||
R/Bioconductor package for differential gene expression analysis based on the negative binomial distribution. Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution. | deseq2 | Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 | LGPL-2.1 | DESeq2 2.11.40.8+galaxy0 | |||||||||
The package is focused on finding differential exon usage using RNA-seq exon counts between samples with different experimental designs. It provides functions that allows the user to make the necessary statistical tests based on a model that uses the negative binomial distribution to estimate the variance between biological replicates and generalized linear models for testing. The package also provides functions for the visualization and exploration of the results. | DEXSeq | Drift and conservation of differential exon usage across tissues in primate species | DEXSeq | GPL-3.0 | 3 tools | ||||||||
The Dfam database is an open collection of Transposable Element DNA sequence alignments, hidden Markov Models (HMMs), consensus sequences, and genome annotations. | dfam | 10.21203/RS.3.RS-76062/V1 | 3.3--hdfd78af_0 | ||||||||||
DIA-NN is a fast and easy-to-use tool for processing data-independent acquisition (DIA) proteomics data. It uses neural networks and interference correction to enable deep proteome coverage in high throughput. Two executables are provided: DiaNN.exe (a command-line tool) and DIA-NN.exe (a GUI implemented as a wrapper for DiaNN.exe). | diann | DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput | DIA-NN 1.8.1+galaxy3 | ||||||||||
DIA-Umpire is an open source Java program for computational analysis of data independent acquisition (DIA) mass spectrometry-based proteomics data. It enables untargeted peptide and protein identification and quantitation using DIA data, and also incorporates targeted extraction to reduce the number of cases of missing quantitation. | diaumpire | DIA-Umpire: Comprehensive computational framework for data-independent acquisition proteomics | Apache-2.0 | DIA_Umpire_SE 2.1.3.0 | |||||||||
Sequence aligner for protein and translated DNA searches that functions as a drop-in replacement for the NCBI BLAST software tools. It is suitable for protein-protein search as well as DNA-protein search on short reads and longer sequences including contigs and assemblies, providing a speedup over BLAST of up to 20,000x. | diamond | Fast and sensitive protein alignment using DIAMOND | AGPL-3.0 | 3 tools | 2.1.9 | 2.0.14--hdcc8f71_0 | 2.0.13-gcc-10.3.0 2.1.0-gcc-11.3.0 (D) 2.1.7 | ||||||
diaPASEF is an approach for parallel accumulation-serial fragmentation combined with data-independent acquisition. | diapysef | diaPASEF: parallel accumulation–serial fragmentation combined with data-independent acquisition | diapysef library generation 0.3.5.0 | ||||||||||
Compute differentially bound sites from multiple ChIP-seq experiments using affinity (quantitative) data. Also enables occupancy (overlap) analysis and plotting functions. | diffbind | VAV3 mediates resistance to breast cancer endocrine therapy | Artistic-2.0 | DiffBind 3.12.0+galaxy0 | |||||||||
Dorado is a high-performance, easy-to-use, open source basecaller for Oxford Nanopore reads. | dorado | 2 tools | |||||||||||
orad | 2.6.1 | ||||||||||||
drishti | 3.0 3.0.1 | ||||||||||||
Provides a number of utility functions for handling single-cell (RNA-seq) data from droplet technologies such as 10X Genomics. This includes data loading, identification of cells from empty droplets, removal of barcode-swapped pseudo-cells, and downsampling of the count matrix. | dropletutils | 2 publications | GPL-3.0 | DropletUtils 1.10.0+galaxy2 | |||||||||
dwt | 5 tools | ||||||||||||
Fast and Accurate Genome-wide Phasing and Imputation in a Single Tool. | eagleimp | 10.1101/2022.01.11.475810 | GPL-3.0 | 1.10 | |||||||||
EasyBuild is a software build and installation framework that allows you to manage (scientific) software on High Performance Computing (HPC) systems in an efficient way. | easybuild | GPL-2.0 | 4.8.0 | ||||||||||
Differential expression analysis of RNA-seq expression profiles with biological replication. Implements a range of statistical methodology based on the negative binomial distribution, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. As well as RNA-seq, it can be applied to differential signal analysis of other types of genomic data that produce counts, including ChIP-seq, SAGE and CAGE. | edger | 3 publications | edger | GPL-2.0 | edgeR 3.36.0+galaxy4 | ||||||||
Differential expression analysis of RNA-seq expression profiles with biological replication. Implements a range of statistical methodology based on the negative binomial distribution, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. As well as RNA-seq, it can be applied to differential signal analysis of other types of genomic data that produce counts, including ChIP-seq, SAGE and CAGE. | edger-repenrich | 3 publications | GPL-2.0 | edgeR-repenrich 1.5.2 | |||||||||
Entrez Direct (EDirect) is a command-line tool for Entrez databases. EDirect connects to Entrez through the Entrez Programming Utilities interface. It supports searching by indexed terms, looking up precomputed neighbors or links, filtering results by date or category, and downloading record summaries or reports. | edirect | Freeware | 16.2 | ||||||||||
For fast functional annotation of novel sequences. It uses precomputed orthologous groups and phylogenies from the eggNOG database to transfer functional information from fine-grained orthologs only. Its common uses include the annotation of novel genomes, transcriptomes or even metagenomic gene catalogs. The use of orthology predictions for functional annotation permits a higher precision than traditional homology searches, as it avoids transferring annotations from close paralogs. | eggnog-mapper | 3 publications | GPL-3.0 | 3 tools | |||||||||
This package implements the Ensemble of Gene Set Enrichment Analyses method for gene set testing. | egsea | Combining multiple tools outperforms individual methods in gene set enrichment analyses | egsea | EGSEA 1.20.0 | |||||||||
elastix is a toolbox for rigid and nonrigid registration of images. elastix is open source software, based on the well-known Insight Segmentation and Registration Toolkit (ITK). The software consists of a collection of algorithms that are commonly used to solve (medical) image registration problems. The modular design of elastix allows the user to quickly configure, test, and compare different registration methods for a specific application. A command-line interface enables automated processing of large numbers of data sets, by means of scripting. elastix is accompanied by SimpleElastix, making it available in languages like C++, Python, Java, R, Ruby, C# and Lua. | elastix | 3 publications | 4.9.0 5.1.0 | ||||||||||
Diverse suite of tools for sequence analysis; many programs analogous to GCG; context-sensitive help for each tool. | emboss | EMBOSS: The European Molecular Biology Open Software Suite | EMBOSS (European Molecular Biology Open Software Suite) | 107 tools | |||||||||
A globally comprehensive data resource for nucleotide sequence, spanning raw data, alignments and assemblies, functional and taxonomic annotation and rich contextual data relating to sequenced samples and experimental design. Serving both as the database of record for the output of the world's sequencing activity and as a platform for the management, sharing and publication of sequence data. | ena_upload | 2 publications | 2 tools | ||||||||||
EncyclopeDIA is a library search engine comprising several algorithms for DIA data analysis. It can search for peptides using either DDA-based spectrum libraries or DIA-based chromatogram libraries. | encyclopedia | Chromatogram libraries improve peptide detection and quantification by data independent acquisition mass spectrometry | Apache-2.0 | 4 tools | |||||||||
enrichm | 0.6.5 | ||||||||||||
ensembl-vep | 106.1 | ||||||||||||
The Environment for Tree Exploration (ETE) is a computational framework that simplifies the reconstruction, analysis, and visualization of phylogenetic trees and multiple sequence alignments. Here, we present ETE v3, featuring numerous improvements in the underlying library of methods, and providing a novel set of standalone tools to perform common tasks in comparative genomics and phylogenetics. The new features include (i) building gene-based and supermatrix-based phylogenies using a single command, (ii) testing and visualizing evolutionary models, (iii) calculating distances between trees of different size or including duplications, and (iv) providing seamless integration with the NCBI taxonomy database. ETE is freely available at http://etetoolkit.org | ete3 | 3.1.3 | |||||||||||
ethercalc | EtherCalc 0.1 | ||||||||||||
Integrated database covering the eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. The database portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera. | eupathdb | EuPathDB: A portal to eukaryotic pathogen databases | EuPathDB 1.0.0 | ||||||||||
ExaBayes is a software package for Bayesian phylogenetic tree inference. It is particularly suitable for large-scale analyses on computer clusters. | exabayes | GPL-3.0 | 1.5.1 | ||||||||||
Tool for phylogenomic analyses on supercomputers. | examl | ExaML version 3: A tool for phylogenomic analyses on supercomputers | GPL-3.0 | 3.0.22 | |||||||||
A tool for pairwise sequence alignment. It enables alignment for DNA-DNA and DNA-protein pairs and also gapped and ungapped alignment. | exonerate | Automated generation of heuristics for biological sequence comparison | Exonerate | GPL-3.0 | Exonerate 2.4.0+galaxy2 | 2.2.0 2.4.0 | 2.4.0--hf34a1b8_7 | ||||||
export2graphlan is a conversion tool that produces both the annotation and tree files for GraPhlAn. In particular, the annotation file highlights specific sub-trees by automatically deriving from the input file which nodes are important. | export2graphlan | Compact graphical representation of phylogenetic data and metadata with GraPhlAn | MIT | Export to GraPhlAn 0.20+galaxy0 | |||||||||
export_remote | Export datasets 0.1.0 | ||||||||||||
Streaming tool for quantifying the abundances of a set of target sequences from sampled subsequences. Example applications include transcript-level RNA-Seq quantification, allele-specific/haplotype expression analysis (from RNA-Seq), transcription factor binding quantification in ChIP-Seq, and analysis of metagenomic data. It can be used to resolve ambiguous mappings in other high-throughput sequencing based applications. | eXpress | 2 publications | Apache-2.0 | eXpress 1.1.1 | |||||||||
sdf_to_tab | Extract values from an SD-file 2020.03.4+galaxy0 | ||||||||||||
GPU-accelerated adaptive banded event alignment for rapid comparative nanopore signal analysis. An optimised re-implementation of the call-methylation module in Nanopolish (supports CUDA acceleration). Given a set of basecalled Nanopore reads and the raw signals, f5c detects the methylated cytosine bases. f5c can optionally utilise NVIDIA graphics cards for acceleration. | f5c | 2 publications | MIT | 1.3 | 1.1--h0326b38_1 | ||||||||
Experimental PacBio diploid assembler. | pb-assembly | 10.5281/zenodo.35745 | 0.0.8--hdfd78af_1 | ||||||||||
Add length of sequence to fasta header. | fasta_compute_length | 2 publications | Compute sequence length 1.0.3 | ||||||||||
fastahack | 1.0.0-gcccore-10.3.0 | ||||||||||||
FastANI is developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI). ANI is defined as mean nucleotide identity of orthologous gene pairs shared between two microbial genomes. FastANI supports pairwise comparison of both complete and draft genome assemblies. | fastani | Apache-2.0 | 1.33-gcc-10.3.0 | ||||||||||
Read huge FastQ and FastA files (both normal and gzipped) and manipulate them. | fastool | MIT | 0.1.4--h7132678_6 | ||||||||||
A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance. | fastp | 10.1093/bioinformatics/bty560 | fastp | MIT | fastp 0.23.4+galaxy1 | 0.20.0 | 0.23.2-gcc-11.3.0 | ||||||
This tool aims to provide a QC report which can spot problems or biases which originate either in the sequencer or in the starting library material. It can be run in one of two modes. It can either run as a stand alone interactive application for the immediate analysis of small numbers of FastQ files, or it can be run in a non-interactive mode where it would be suitable for integrating into a larger analysis pipeline for the systematic processing of large numbers of files. | fastqc | 10.7490/f1000research.1114334.1 | FASTQC | GPL-3.0 | FastQC 0.74+galaxy1 | 0.11.7 | 0.12.1 | 0.11.9--hdfd78af_1 | 0.11.9-java-11 | ||||
Compute quality stats for FASTQ files and print those stats as emoji... for some reason. | fastqe | FASTQE 0.3.1+galaxy0 | |||||||||||
Infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. | fasttree | 2 publications | FastTree | FASTTREE 2.1.10+galaxy1 | 2.1.11 | 2.1.11-gcccore-10.3.0 | |||||||
Collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing. | fastx | Comparison of DNA sequences with protein sequences | FASTX-Toolkit | AGPL-3.0 | 9 tools | ||||||||
Collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing. | fastx_toolkit | Comparison of DNA sequences with protein sequences | AGPL-3.0 | 5 tools | |||||||||
featureCounts is a very efficient read quantifier. It can be used to summarize RNA-seq reads and gDNA-seq reads to a variety of genomic features such as genes, exons, promoters, gene bodies and genomic bins. It is included in the Bioconductor Rsubread package and also in the SourceForge Subread package. | featurecounts | FeatureCounts: An efficient general purpose program for assigning sequence reads to genomic features | GPL-3.0 | featureCounts 2.0.3+galaxy2 | |||||||||
A tool to annotate long non-coding RNAs from RNA-seq assembled transcripts. | feelnc | FEELnc: A tool for long non-coding RNA annotation and its application to the dog transcriptome | GPL-3.0 | FEELnc 0.2.1+galaxy0 | |||||||||
fermi-lite | 20190320-gcccore-10.3.0 | ||||||||||||
HMM-based gene structure prediction (multiple genes, both chains); Program for predicting multiple genes in genomic DNA sequences. | fgenesh | 10.1186/gb-2006-7-s1-s10 | FGENESH get protein 1.0.0+galaxy0 | ||||||||||
The package implements an algorithm for fast gene set enrichment analysis. The fast algorithm makes it possible to run more permutations and obtain more fine-grained p-values, which in turn allows accurate standard approaches to multiple hypothesis correction to be used. | fgsea | 10.1101/060012 | MIT | fgsea 1.8.0+galaxy1 | |||||||||
filter_transcripts_via_tracking | Filter Combined Transcripts 0.1 | ||||||||||||
pileup_parser | Filter pileup 1.0.2 | ||||||||||||
sam_bitwise_flag_filter | Filter SAM 1.0.0 | ||||||||||||
Filtlong is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter. | filtlong | GPL-3.0 | filtlong 0.2.1+galaxy0 | ||||||||||
FlashLFQ is an ultrafast label-free quantification algorithm for mass-spectrometry proteomics. | flashlfq | LGPL-3.0 | FlashLFQ 1.0.3.1 | ||||||||||
flex | 2.6.4-gcccore-10.3.0 2.6.4-gcccore-11.3.0 2.6.4-gcccore-12.3.0 (D) | ||||||||||||
Flye is a de novo assembler for single molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies. It is designed for a wide range of datasets, from small bacterial projects to large mammalian-scale assemblies. The package represents a complete pipeline: it takes raw PB / ONT reads as input and outputs polished contigs. | flye | 3 publications | Flye 2.9.4+galaxy0 | 2.9 2.9.1 2.9.3 | 2.9-gcc-10.3.0 | ||||||||
An integrated database for Drosophila and Anopheles genomics. | flymine | 2 publications | LGPL-2.1 | Flymine 1.0.0 | |||||||||
Foldseek enables fast and sensitive comparisons of large structure sets. It reaches sensitivities similar to state-of-the-art structural aligners while being at least 20,000 times faster. | foldseek | 2 publications | GPL-3.0 | 3-915ef7d | |||||||||
Web server which detects small molecule pockets by relying on the geometric alpha sphere theory. It also tracks pockets during molecular dynamics so to provide insight on pocket dynamics (mdpocket) and transposes mdpocket to the combined analysis of homologous structures (hpocket). | fpocket | 2 publications | Freeware | 2 tools | |||||||||
Application for finding (fragmented) genes in short reads | fraggenescan | 2 publications | GPL-3.0 | FragGeneScan 1.30.0 | |||||||||
Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs, indels, multi-nucleotide polymorphisms, and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment. | freebayes | freebayes | MIT | 2 tools | 1.3.6-foss-2021a-r-4.1.0 | ||||||||
fsom | 20141119-gcccore-10.3.0 | ||||||||||||
funannotate is a pipeline for genome annotation (built specifically for fungi, but will also work with higher eukaryotes). | funannotate | BSD-2-Clause | 5 tools | ||||||||||
Gene Annotation EVAluation. | gaeval | Not licensed | 4 tools | ||||||||||
galaxy_genomic_intervals | 2 tools | ||||||||||||
text_processing | 19 tools | ||||||||||||
galaxy_collection_operations | 19 tools | ||||||||||||
Galaxy CONVERTER | 75 tools | ||||||||||||
galaxy_data_sources | 11 tools | ||||||||||||
galaxy_fetch_alignments_sequences | 12 tools | ||||||||||||
galaxy_filter_and_sort | 10 tools | ||||||||||||
galaxy_graph_display | 11 tools | ||||||||||||
galaxy_join_subtract_and_group | 5 tools | ||||||||||||
galaxy_sequence_utils | 21 tools | ||||||||||||
galaxy_statistics | 8 tools | ||||||||||||
galaxy_text_manipulation | 36 tools | ||||||||||||
The Genome Analysis Toolkit (GATK) is a set of bioinformatic tools for analyzing high-throughput sequencing (HTS) and variant call format (VCF) data. The toolkit is well established for germline short variant discovery from whole genome and exome sequencing data. GATK4 expands functionality into copy number and somatic analyses and offers pipeline scripts for workflows. Version 4 (GATK4) is open-source at https://github.com/broadinstitute/gatk. | gatk | The genome analysis toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data | GATK | 4.1.4.0 4.2.1.0 4.1.8.1 4.2.5.0 | 4.2.5.0--hdfd78af_0 | 4.3.0.0-gcccore-11.3.0-java-11 | |||||||
Cleaning aligned sequences. | gblock | 2 publications | Gblocks 0.91b | ||||||||||
Genome-wide Complex Trait Analysis. Estimates the proportion of phenotypic variance explained by genome- or chromosome-wide SNPs for complex traits (the GREML method), and has subsequently been extended for many other analyses to better understand the genetic architecture of complex traits. | gcta | GCTA: A tool for genome-wide complex trait analysis | gcta | MIT | 1.94.0beta-gfbf-2022a 1.94.1--h9ee0642_0 | ||||||||
Software aimed at pairwise sequence comparison generating high quality results (equivalent to MUMmer) with controlled memory consumption and comparable or faster execution times particularly with long sequences. | gecko | Breaking the computational barriers of pairwise genome comparison | GPL-3.0 | Gecko 1.2 | |||||||||
GEMINI (GEnome MINIng) is a flexible framework for exploring genetic variation in the context of the wealth of genome annotations available for the human genome. By placing genetic variants, sample phenotypes and genotypes, as well as genome annotations into an integrated database framework, GEMINI provides a simple, flexible, and powerful system for exploring genetic variation for disease and population genetics. | gemini | 10.1371/journal.pcbi.1003153 | MIT | 24 tools | |||||||||
Gene Model Mapper is a homology-based gene prediction program. GeMoMa uses the annotation of protein-coding genes in a reference genome to infer the annotation of protein-coding genes in a target genome. To do so, it utilizes amino acid sequence and intron position conservation. In addition, it can incorporate RNA-seq evidence for splice site prediction. | gemoma | Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi | GPL-3.0 | 1.8 1.9 | |||||||||
An interactive web tool for versatile, clinically-driven variant interrogation and prioritization. IOBIO is a suite of web apps for visually driven real-time analysis of genomic data. | geneiobio | 10.1101/2020.11.05.20224865 | gene.iobio visualisation 4.7.1+galaxy1 | ||||||||||
generate_count_matrix | Generate count matrix 1.0 | ||||||||||||
generate_pc_lda_matrix | Generate A Matrix 1.0.0 | ||||||||||||
generode | 0.5.1 | ||||||||||||
Reference-free profiling of polyploid genomes. GenomeScope 2.0 applies classical insights from combinatorial theory to establish a detailed mathematical model of how k-mer frequencies will be distributed in heterozygous and polyploid genomes. It takes as input the k-mer count results from running Jellyfish or KMC. | genomescope | 10.1101/747568 | Apache-2.0 | GenomeScope 2.0.1+galaxy0 | 1.0.0 1.0.0 | ||||||||
Free collection of bioinformatics tools for genome informatics. | genometools | Genome tools: A comprehensive software library for efficient processing of structured genome annotations | GenomeTools | BSD-3-Clause | 1.6.2 | ||||||||
genrich | Genrich 0.5+galaxy2 | ||||||||||||
get_pdb_file | Get PDB file 0.1.0 | ||||||||||||
A fast and versatile toolkit for accurate de novo assembly of organelle genomes. This toolkit assembles organelle genomes from genomic skimming data. | getorganelle | GetOrganelle: A fast and versatile toolkit for accurate de novo assembly of organelle genomes | GPL-3.0 | 2 tools | |||||||||
gfa_to_fa | GFA to FASTA 0.1.2 | ||||||||||||
gfastats is a single fast and exhaustive tool for summary statistics and simultaneous genome assembly file manipulation. gfastats also allows seamless fasta/fastq/gfa conversion. | gfastats | MIT | gfastats 1.3.6+galaxy0 | 1.3.6 | |||||||||
gff2bed1 | GFF-to-BED 1.0.1 | ||||||||||||
gff3sort | 0.1.a1a2bc9--hdfd78af_2 | ||||||||||||
Program for comparing, annotating, merging and tracking transcripts in GFF files. | gffcompare | MIT | GffCompare 0.12.6+galaxy0 | 0.12.2-gcc-10.3.0 | |||||||||
gffcompare_to_bed | Convert gffCompare annotated GTF to BED 0.2.1 | ||||||||||||
Program for filtering, converting and manipulating GFF files. | gffread | MIT | gffread 2.2.1.4+galaxy0 | 0.12.7 | 0.12.7-gcccore-10.3.0 | ||||||||
Plotting system for R, based on the grammar of graphics. | ggplot2 | 10.1007/978-3-319-24277-4 | 6 tools | ||||||||||
ghostscript | 9.54.0-gcccore-10.3.0 9.56.1-gcccore-11.3.0 (D) | ||||||||||||
GMAJ | GMAJ 2.0.1 | ||||||||||||
Genomic Mapping and Alignment Program for mRNA and EST Sequences. | gmap | 2 publications | gmap | 2023.04.28 | |||||||||
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel. | parallel | GPL-3.0 | 20191022 | 20210622-gcccore-10.3.0 20220722-gcccore-11.3.0 (D) | |||||||||
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. | gsl | GPL-3.0 | 2.6 | 2.7-gcc-10.3.0 2.7-gcc-11.3.0 (D) | |||||||||
GOEnrichment is a tool for performing GO enrichment analysis of gene sets, such as those obtained from RNA-seq or Microarray experiments, to help characterize them at the functional level. It is available in Galaxy Europe and as a stand-alone tool. GOEnrichment is flexible in that it allows the user to use any version of the Gene Ontology and any GO annotation file they desire. To enable the use of GO slims, it is accompanied by a sister tool GOSlimmer, which can convert annotation files from full GO to any specified GO slim. The tool features an optional graph clustering algorithm to reduce the redundancy in the set of enriched GO terms and simplify its output. It was developed by the BioData.pt / ELIXIR-PT team at the Instituto Gulbenkian de Ciência. | goenrichment | Apache-2.0 | 2 tools | ||||||||||
Detect Gene Ontology and/or other user defined categories which are over/under represented in RNA-seq data. | goseq | 10.1186/gb-2010-11-2-r14 | goseq | GPL-2.0 | goseq 1.50.0+galaxy0 | ||||||||
gramenemart | GrameneMart 1.0.1 | ||||||||||||
GraPhlAn is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees. GraPhlAn focuses on concise, integrative, informative, and publication-ready representations of phylogenetically- and taxonomically-driven investigation. | graphlan | Compact graphical representation of phylogenetic data and metadata with GraPhlAn | MIT | 2 tools | |||||||||
Versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since it is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers. | gromacs | 10 publications | LGPL-2.1 | 8 tools | 2019.3 2019.3-plumed 2019.3-gpuvolta 2018.3 2018.3-plumed 2020.1 2020.1-plumed 2020.1-gpuvolta 2020.3 2020.3-gpuvolta 2021-gpuvolta 2021 2021.2 2021.2-gpuvolta 2021.4 2021.4-gpuvolta 2022 2022-gpuvolta 2020.3-gpuampere 2019.3-gpuampere 2021-gpuampere 2020.1-gpuampere 2021.2-gpuampere 2021.4-gpuampere 2022-gpuampere | 2021.3-foss-2021a | |||||||
GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB). It is designed to work with recent advances that allow hundreds or thousands of metagenome-assembled genomes (MAGs) to be obtained directly from environmental samples. It can also be applied to isolate and single-cell genomes. GTDB-Tk is open source and released under the GNU General Public License (Version 3). | gtdb-tk | GTDB-Tk: A toolkit to classify genomes with the genome taxonomy database | GPL-3.0 | GTDB-Tk Classify genomes 2.3.2+galaxy1 | 2.0.0-foss-2021a 2.2.6 | ||||||||
gtf2bedgraph | GTF-to-BEDGraph 1.0.0 | ||||||||||||
gtf2gene_list | GTF2GeneList 1.52.0+galaxy0 | ||||||||||||
Gubbins is a tool for rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences. | gubbins | 10.1093/nar/gku1196 | GPL-2.0 | Gubbins 3.2.1+galaxy0 | |||||||||
gzip | Compress file(s) 0.1.0 | 1.10-gcccore-10.3.0 1.12-gcccore-11.3.0 (D) | |||||||||||
HapCUT2 is a maximum-likelihood-based tool for assembling haplotypes from DNA sequence reads, designed to "just work" with excellent speed and accuracy across a range of long- and short-read sequencing technologies. The output is in Haplotype block format described here: https://github.com/vibansal/HapCUT2/blob/master/outputformat.md | hapcut2 | Hapcut2 1.3.3+galaxy0+ga1 | |||||||||||
hbvar | HbVar 2.0.0 | ||||||||||||
Tool for single-species active module discovery. | heinz | XHeinz: An algorithm for mining cross-species network modules under a flexible conservation model | 4 tools | ||||||||||
Deep Learning to predict gene annotations | helixer | Helixer: Cross-species gene annotation of large eukaryotic genomes using deep learning | GPL-3.0 | Helixer 0.3.2 | |||||||||
This tool provides functional annotation for a list of genes by connecting with DAVID database. | hgv_david | 3 publications | DAVID 1.0.1 | ||||||||||
This tool can be used to analyze the patterns of linkage disequilibrium (LD) between polymorphic sites in a locus. | hgv_ldtools | 3 publications | LD 1.0.0 | ||||||||||
This tool creates a link to the g:GOSt tool (Gene Group Functional Profiling), which provides functional profiling of gene lists. | hgv_linkToGProfile | 3 publications | g:Profiler 1.0.0 | ||||||||||
A web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization. | hicexplorer | Galaxy HiCExplorer 3: A web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization | 57 tools | ||||||||||
Remove CCS reads with remnant PacBio adapter sequences and convert outputs to a compressed .fastq (.fastq.gz). | hifiadapterfilt | HiFiAdapterFilt, a memory efficient read processing pipeline, prevents occurrence of adapter sequence in PacBio HiFi reads and their negative impacts on genome assembly | GPL-3.0 | HiFi Adapter Filter 2.0.0+galaxy0 | 2.0.0 | ||||||||
Hifiasm: a haplotype-resolved assembler for accurate HiFi reads. | hifiasm | Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm | MIT | Hifiasm 0.19.9+galaxy0 | 0.16.1 0.18.9 0.19.6 0.19.8 0.19.9 | 0.16.1-gcccore-10.3.0 | |||||||
Hifiasm_meta - de novo metagenome assembler, based on hifiasm, a haplotype-resolved de novo assembler for PacBio Hifi reads. | hifiasm_meta | MIT | Hifiasm_meta 0.3.1+galaxy0 | ||||||||||
Alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as to a single reference genome). | hisat2 | 3 publications | HISAT2 | GPL-3.0 | HISAT2 2.2.1+galaxy1 | 2.2.1 | 2.2.1-gompi-2021a 2.2.1-gompi-2022a 2.2.1--h87f3376_4 | ||||||
hmmcleaner | 0.180750 | ||||||||||||
This tool is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models. With the HMMER3 project, HMMER is now as fast as BLAST for protein search. | hmmer | 4 publications | Other | 12 tools | 3.3.2 | 3.3.2-gompi-2021a 3.3.2-gompi-2022a (D) | |||||||
Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. The goal of Horovod is to make distributed deep learning fast and easy to use. | horovod | Apache-2.0 | 0.19.0 0.22.1 | ||||||||||
Python framework to process and analyse high-throughput sequencing (HTS) data | htseq | HTSeq-A Python framework to work with high-throughput sequencing data | HTSeq | GPL-3.0 | htseq-count 2.0.5+galaxy0 | 2.0.2-foss-2022a | |||||||
The main purpose of HTSlib is to provide access to genomic information files, both alignment data (SAM, BAM, and CRAM formats) and variant data (VCF and BCF formats). The library also provides interfaces to access and index genome reference data in FASTA format and tab-delimited files with genomic coordinates. It is utilized and incorporated into both SAMtools and BCFtools. | htslib | HTSlib: C library for reading/writing high-Throughput sequencing data | HTSlib | MIT | 1.9 1.12 1.16 | 1.19.1 1.20 | 1.12-gcc-10.3.0 1.15.1-gcc-11.3.0 (D) | ||||||
HUMAnN is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads). This process, referred to as functional profiling, aims to describe the metabolic potential of a microbial community and its members. More generally, functional profiling answers the question “What are the microbes in my community-of-interest doing (or are capable of doing)?” | humann | Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with biobakery 3 | MIT | 12 tools | 3.6-foss-2022a | ||||||||
HUMAnN 2.0 is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads). This process, referred to as functional profiling, aims to describe the metabolic potential of a microbial community and its members. More generally, functional profiling answers the question “What are the microbes in my community-of-interest doing (or capable of doing)?” | humann2 | Species-level functional profiling of metagenomes and metatranscriptomes | 7 tools | ||||||||||
Paralogs and off-target sequences improve phylogenetic resolution in a densely-sampled study of the breadfruit genus (Artocarpus, Moraceae). Recovering genes from targeted sequence capture data. Current version: 1.3.1 (August 2018). HybPiper was designed for targeted sequence capture, in which DNA sequencing libraries are enriched for gene regions of interest, especially for phylogenetics. HybPiper is a suite of Python scripts that wrap and connect bioinformatics tools in order to extract target sequences from high-throughput DNA sequencing reads. | hybpiper | 10.1101/854232 | GPL-3.0 | HybPiper 2.1.6+galaxy0 | |||||||||
Software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning. | HyPhy | 2 publications | Unlicense | 2 tools | |||||||||
idl | 8.6 8.8 | ||||||||||||
imagemagick | 7.0.11 | 7.0.11-14-gcccore-10.3.0 7.1.0-37-gcccore-11.3.0 (D) | |||||||||||
Improved Phased Assembler (IPA) is the official PacBio software for HiFi genome assembly. IPA was designed to utilize the accuracy of PacBio HiFi reads to produce high-quality phased genome assemblies. | pbipa | 1.5.0 1.8.0 | |||||||||||
Infernal ("INFERence of RNA ALignment") is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs). A CM is like a sequence profile, but it scores a combination of sequence consensus and RNA secondary structure consensus, so in many cases, it is more capable of identifying RNA homologs that conserve their secondary structure more than their primary sequence. | infernal | Infernal 1.1: 100-fold faster RNA homology searches | BSD-3-Clause | 6 tools | |||||||||
insight-toolkit | 5.2.1 | ||||||||||||
High-performance visualization tool for interactive exploration of large, integrated datasets. It supports a wide variety of data types and formats, including short-read alignments in the SAM/BAM format. Data can be viewed from local files or over the web via HTTP. | igv | 3 publications | Integrated Genomics Viewer | LGPL-2.1 | 2.13.1 | ||||||||
A tool to detect Integron in DNA sequences. | integron_finder | 2 publications | GPL-3.0 | Integron Finder 2.0.5+galaxy0 | |||||||||
Open source data warehouse built specifically for the integration and analysis of complex biological data. It enables the creation of biological databases accessed by sophisticated web query tools. Parsers are provided for integrating data from many common biological data sources and formats, and there is a framework for adding your own data. | intermine | 2 publications | LGPL-2.1 | InterMine 1.0.0 | |||||||||
Scan sequences against the InterPro protein signature databases. | interproscan | 2 publications | InterProScan 5.59-91.0+galaxy3 | 5.55-88.0-foss-2021a | |||||||||
Interactive assembly and analysis of RADseq datasets. ipyrad is an interactive toolkit for assembly and analysis of restriction-site associated genomic data sets (e.g., RAD, ddRAD, GBS) for population genetic and phylogenetic studies. | ipyrad | Ipyrad: Interactive assembly and analysis of RADseq datasets | GPL-3.0 | 0.9.84 | 0.9.93 | ||||||||
Very efficient phylogenetic software for reconstructing maximum-likelihood trees and assessing branch supports with the ultrafast bootstrap approximation. It is based on the IQPNNI algorithm, with a 10-fold speedup and substantial additional features. | iq-tree | W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis | IQ-TREE 2.3.6+galaxy0 | 2.1.2 | 2.2.2.3--h2202e69_2 | ||||||||
Provides functions for creating an interactive Shiny-based graphical user interface for exploring data stored in SummarizedExperiment objects, including row- and column-level metadata. Particular attention is given to single-cell data in a SingleCellExperiment object with visualization of dimensionality reduction results. | isee | iSEE: Interactive SummarizedExperiment Explorer [version 1; referees: 2 approved] | iSEE | MIT | iSEE 1.0.0 | ||||||||
Automated identification of insertion sequence elements in prokaryotic genomes. | isescan | ISEScan: automated identification of insertion sequence elements in prokaryotic genomes | ISEScan 1.7.2.3+galaxy1 | ||||||||||
Enables identification of isoform switches with predicted functional consequences from RNA-seq data. Consequences can be chosen from a long list that includes protein domain gain/loss, changes in NMD sensitivity, etc. It directly supports import of data from Cufflinks/Cuffdiff, Kallisto, Salmon and RSEM, but data from other transcript quantification tools are easy to import as well. | isoformswitchanalyzer | The landscape of isoform switches in human cancers | GPL-2.0 | IsoformSwitchAnalyzeR 1.20.0+galaxy5 | |||||||||
IsoSeq v3 contains the newest tools to identify transcripts in PacBio single-molecule sequencing data. Starting in SMRT Link v6.0.0, those tools power the IsoSeq GUI-based analysis application. A composable workflow of existing tools and algorithms, combined with a new clustering technique. | isoseq3 | BSD-3-Clause-Clear | 4.0.0--h9ee0642_0 | ||||||||||
Interpretation-oriented tool to manage the update and revision of variant annotation and classification. iVar - DataBase of Genomics Variants. | ivar | 10.22541/AU.160610419.99549785/V1 | AGPL-3.0 | 6 tools | |||||||||
Implementation of the Interval-Wise Testing (IWT) for omics data. This inferential procedure tests for differences in "Omics" data between two groups of genomic regions (or between a group of genomic regions and a reference center of symmetry), and does not require fixing location and scale at the outset. | iwtomics | IWTomics: Testing high-resolution sequence-based 'Omics' data at multiple locations and scales | GPL-2.0 | 3 tools | |||||||||
JAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation. | jags | GPL-2.0 | 4.3.0-foss-2021a | ||||||||||
JASMINE (Jointly Accurate Sv Merging with Intersample Network Edges) is an automated pipeline for alignment and SV calling in long-read datasets. The tool is used to merge structural variants (SVs) across samples. Each sample has a number of SV calls, consisting of position information (chromosome, start, end, length), type and strand information, and a number of other values. Jasmine represents the set of all SVs across samples as a network, and uses a modified minimum spanning forest algorithm to determine the best way of merging the variants such that each merged variant represents a set of analogous variants occurring in different samples. | jasminesv | 10.1101/2021.05.27.445886 | MIT | 1.1.4 | |||||||||
jdk | 11.0.2 | ||||||||||||
jbigkit | 2.1-gcccore-10.3.0 2.1-gcccore-11.3.0 (D) | ||||||||||||
Slick, speedy genome browser with a responsive and dynamic AJAX interface for visualization of genome data. Being developed by the GMOD project as a successor to GBrowse. | jbrowse | 10.1101/gr.094607.109 | JBrowse | 2 tools | |||||||||
jcvi | Genome annotation statistics 0.8.4 | ||||||||||||
A command-line algorithm for counting k-mers in DNA sequence. | jellyfish | 10.1093/bioinformatics/btr011 | jellyfish | GPL-3.0 | jellyfish 2.3.0+galaxy1 | 2.3.0 | 2.3.0-gcc-10.3.0 2.3.0-gcc-11.3.0 (D) | ||||||
jq | JQ 1.0 | ||||||||||||
Juicer is a platform for analyzing kilobase resolution Hi-C data. In this distribution, we include the pipeline for generating Hi-C maps from fastq raw data files and command line tools for feature annotation on the Hi-C maps. | juicer | Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments | MIT | 1.6 | |||||||||
jupyterlab | 3.4.3-py3.9 | 3.5.0-gcccore-11.3.0 | |||||||||||
A program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment. | kallisto | Near-optimal probabilistic RNA-seq quantification | BSD-2-Clause | 2 tools | 0.48.0-gompi-2021a 0.48.0-gompi-2022a 0.48.0--h15996b6_2 | ||||||||
Suite of tools that generate, analyse and compare k-mer spectra produced from sequence files | kat | KAT: A K-mer analysis toolkit to quality control NGS datasets and genome assemblies | GPL-3.0 | 2.4.2 | |||||||||
kentutils | 0.0 | ||||||||||||
khmer is a set of command-line tools for working with DNA shotgun sequencing data from genomes, transcriptomes, metagenomes, and single cells. khmer can make de novo assemblies faster, and sometimes better. khmer can also identify (and fix) problems with shotgun data. | khmer | 4 publications | khmer | BSD-3-Clause | 8 tools | ||||||||
KMC is a utility designed for counting k-mers (sequences of consecutive k symbols) in a set of reads from genome sequencing projects. | kmc | KMC 2: Fast and resource-frugal k-mer counting | KMC | 3.2.1 3.2.4 | |||||||||
KofamScan is a gene function annotation tool based on KEGG Orthology and hidden Markov models. You need the KOfam database to use this tool. | kofamscan | MIT | 1.3.0--hdfd78af_2 | ||||||||||
System for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinformatics software to accomplish this task have often used sequence alignment or machine learning techniques that were quite slow, leading to the development of less sensitive but much faster abundance estimation programs. It aims to achieve high sensitivity and high speed by utilizing exact alignments of k-mers and a novel classification algorithm. | kraken | Kraken: Ultrafast metagenomic sequence classification using exact alignments | kraken | GFDL-1.3 | 9 tools | ||||||||
Kraken 2 is the newest version of Kraken, a taxonomic classification system using exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer. The k-mer assignments inform the classification algorithm. | kraken2 | 10.1101/762302 | MIT | Kraken2 2.1.3+galaxy1 | 2.1.2-gompi-2021a 2.1.2-gompi-2022a (D) | ||||||||
KrakenTools provides individual scripts to analyze Kraken/Kraken2/Bracken/KrakenUniq output files | krakentools | krakentools | GPL-3.0 | 4 tools | |||||||||
Krona creates interactive HTML5 charts of hierarchical data (such as taxonomic abundance in a metagenome). | krona | Interactive metagenomic visualization in a Web browser | krona | Proprietary | 2 tools | ||||||||
kronatools | 2.8.1-gcccore-11.3.0 | ||||||||||||
Automated image analysis for developmental phenotyping of mouse embryos. LAMA (Lightweight Analysis of Morphological Abnormalities). Welcome to LAMA, an open source pipeline to automatically identify embryo dysmorphology from 3D volumetric images. | lama | 10.1101/2020.05.04.075853 | 0.9.100 1.0.0 1.0.1 1.0.2 | ||||||||||
A tool for (1) aligning two DNA sequences, and (2) inferring appropriate scoring parameters automatically. | lastz | 3 tools | 1.04.15 | ||||||||||
length_and_gc_content | Gene length and GC content 0.1.2 | ||||||||||||
LFQ-Analyst is an easy-to-use, interactive web application for analysing and visualising label-free quantitative proteomics datasets preprocessed with MaxQuant (https://bioinformatics.erc.monash.edu/apps/LFQ-Analyst/). It performs differential expression analysis with “one click”, provides a wealth of user-analytic features, and offers numerous publication-quality output graphics and tables to facilitate statistical and exploratory analysis of label-free quantitative datasets. | LFQ-Analyst | Lfq-Analyst: An easy-To-use interactive web platform to analyze and visualize label-free proteomics data preprocessed with maxquant | GPL-3.0 | LFQ Analyst 1.2.6+galaxy0 | |||||||||
liftOver1 | Convert genome coordinates 1.0.6 | ||||||||||||
Data analysis, linear models and differential expression for microarray data. | limma | Limma powers differential expression analyses for RNA-sequencing and microarray studies | limma | GPL-2.0 | limma 3.58.1+galaxy0 | ||||||||
LINKS (Long Interval Nucleotide K-mer Scaffolder) is a genomics application for scaffolding genome assemblies with long reads, such as those produced by Oxford Nanopore Technologies Ltd. It can be used to scaffold high-quality draft genome assemblies with any long sequences (e.g. ONT reads, PacBio reads, other draft genomes). It is also used to scaffold contig pairs linked by ARCS/ARKS. | links | LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads | GPL-3.0 | LINKS 2.0.1+galaxy+1 | |||||||||
LoFreq* (i.e. LoFreq version 2) is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. It makes full use of base-call qualities and other sources of errors inherent in sequencing (e.g. mapping or base/indel alignment uncertainty), which are usually ignored by other methods or only used for filtering. | lofreq | LoFreq: A sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets | MIT | 5 tools | |||||||||
Longshot is a variant calling tool for diploid genomes using long error prone reads such as Pacific Biosciences (PacBio) SMRT and Oxford Nanopore Technologies (ONT). It takes as input an aligned BAM file and outputs a phased VCF file with variants and haplotype information. It can also genotype and phase input VCF files. It can output haplotype-separated BAM files that can be used for downstream analysis. Currently, it only calls single nucleotide variants (SNVs), but it can genotype indels if they are given in an input VCF. | longshot | Longshot enables accurate variant calling in diploid genomes from single-molecule long read sequencing | MIT | 0.4.1 | |||||||||
lpsolve | 5.5.2.11 | 5.5.2.11-gcc-10.3.0 5.5.2.11-gcc-11.3.0 (D) | |||||||||||
LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons; The LTR Assembly Index (LAI) is also included in this package. | ltr_retriever | LTR_retriever: A highly accurate and sensitive program for identification of long terminal repeat retrotransposons | GPL-3.0 | 2.9.4 | |||||||||
lua | 5.4.3-gcccore-10.3.0 5.4.4-gcccore-11.3.0 (D) | ||||||||||||
Model-based Analysis of ChIP-seq data. | macs2 | Model-based analysis of ChIP-Seq (MACS) | Artistic-2.0 | 10 tools | 2.2.9.1 | ||||||||
maeparser | 1.3.0-gompi-2021a 1.3.0-gompi-2022a (D) | ||||||||||||
MAFFT (Multiple Alignment using Fast Fourier Transform) is a high speed multiple sequence alignment program. | mafft | 6 publications | BSD-Source-Code | 2 tools | 7.505 7.525 | 7.490-gcc-10.3.0-with-extensions | |||||||
Computational tool to identify important genes from the recent genome-scale CRISPR-Cas9 knockout screens technology. | MAGeCK | Quality control, modeling, and visualization of CRISPR screens with MAGeCK-VISPR | 5 tools | ||||||||||
Portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. | maker | 2 publications | MAKER | Artistic-2.0 | 2 tools | 3.01.04 | 3.01.03--pl526hb8757ab_0 | 3.01.03--pl5262h8f1cd36_2 | |||||
MALDIquant is a complete analysis pipeline for matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) and other two-dimensional mass spectrometry data. In addition to commonly used plotting and processing methods it includes distinctive features, namely baseline subtraction methods such as morphological filters (TopHat) or the statistics-sensitive non-linear iterative peak-clipping algorithm (SNIP), peak alignment using warping functions, handling of replicated measurements as well as allowing spectra with different resolutions. | maldi_quant | Maldiquant: A versatile R package for the analysis of mass spectrometry data | GPL-3.0 | 2 tools | |||||||||
music_manipulate_eset | Manipulate Expression Set Object 0.1.1+galaxy4 | ||||||||||||
Fast genome and metagenome distance estimation using MinHash. | mash | 10.1186/s13059-016-0997-x | mash | CC-BY-4.0 | 2 tools | 2.3-gcc-10.3.0 | |||||||
master2pgSnp | MasterVar to pgSnp 1.0.0 | ||||||||||||
Whole genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches. MaSuRCA can assemble data sets containing only short reads from Illumina sequencing or a mixture of short reads and long reads (Sanger, 454). | masurca | The MaSuRCA genome assembler | MaSuRCA simple 4.0.6+galaxy0 | ||||||||||
Tool to import, process, clean, and compare mass spectrometry data. | matchms | Apache-2.0 | 11 tools | ||||||||||
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible. | matplotlib | MIT | 3.4.2-foss-2021a 3.5.2-foss-2022a (D) | ||||||||||
Software for binning assembled metagenomic sequences based on an Expectation-Maximization algorithm. | maxbin | 26515820 | MaxBin2 2.2.7+galaxy2 | ||||||||||
Quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. It is specifically aimed at high-resolution MS data. | maxquant | 4 publications | MaxQuant | 3 tools | 2.2.0.0-gcccore-11.3.0 | ||||||||
MCL is a clustering algorithm widely used in bioinformatics and gaining traction in other fields. | mcl | 10.1007/978-1-61779-361-5_15 | mcl | GPL-3.0 | 14-137 | ||||||||
mcquant | MCQUANT 1.5.3+galaxy1 | ||||||||||||
MDAnalysis is an object-oriented python toolkit to analyze molecular dynamics trajectories generated by CHARMM, Gromacs, NAMD, LAMMPS, Amber or DL_POLY; it also reads other formats (e.g. PDB files and XYZ format trajectories; see the supported coordinate formats for the full list). It can write most of these formats, too, together with atom selections for use in Gromacs, CHARMM, VMD and PyMol | mdanalysis | MDAnalysis: A toolkit for the analysis of molecular dynamics simulations | GPL-2.0 | Cosine Content 1.0.0+galaxy0 | |||||||||
MDTraj | 2 tools | ||||||||||||
medaka is a tool to create consensus sequences and variant calls from nanopore sequencing data. This task is performed using neural networks applied to a pileup of individual sequencing reads against a draft assembly. | medaka | MPL-2.0 | 4 tools | 1.9.1 | |||||||||
Single node assembler for large and complex metagenomics NGS reads, such as soil. It makes use of succinct de Bruijn graph to achieve low memory usage, whereas its goal is not to make memory usage as low as possible. | megahit | MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph | 2 tools | 1.2.9-gcccore-10.3.0 1.2.9-gcccore-11.3.0 (D) | |||||||||
merge_cols | Merge Columns 1.0.3 | ||||||||||||
Reference-free quality, completeness, and phasing assessment for genome assemblies. Evaluate genome assemblies with k-mers and more. Often, genome assembly projects have Illumina whole genome sequencing reads available for the assembled individual. Merqury provides a set of tools for this purpose. | merqury | 10.1101/2020.03.15.992941 | 2 tools | 1.3 | |||||||||
Meryl is a tool for counting and working with sets of k-mers that was originally developed for use in the Celera Assembler and has since been migrated and maintained as part of Canu. | meryl | 10.1186/s13059-020-02134-9 | Freeware | Meryl 1.3+galaxy6 | 1.4.1 | ||||||||
An adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. MetaBAT2 clusters metagenomic contigs into different "bins", each of which should correspond to a putative genome. MetaBAT2 uses nucleotide composition information and source strain abundance (measured by depth-of-coverage by aligning the reads to the contigs) to perform binning. | metabat | 2 publications | MetaBAT2 2.15+galaxy3 | ||||||||||
Galaxy workflow for differential abundance analysis of 16s metagenomic data. | MetaDEGalaxy | MetaDEGalaxy: Galaxy workflow for differential abundance analysis of 16s metagenomic data | 9 tools | ||||||||||
MetaEuk - sensitive, high-throughput gene discovery and annotation for large-scale eukaryotic metagenomics | metaeuk | 2 publications | GPL-3.0 | 5-34c21f2 | 5-gcc-10.3.0 6-gcc-11.3.0 (D) | ||||||||
Computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. | metaphlan | Metagenomic microbial community profiling using unique clade-specific marker genes | MIT | 4 tools | |||||||||
metaQuantome software suite analyzes the state of a microbiome by leveraging complex taxonomic and functional hierarchies to summarize peptide-level quantitative information. metaQuantome offers differential abundance analysis, principal components analysis, and clustered heat map visualizations, as well as exploratory analysis for a single sample or experimental condition. | metaQuantome | 2 publications | 6 tools | ||||||||||
Genome assembler for metagenomics datasets. | metaspades | 3 publications | metaSPAdes 3.15.5+galaxy2 | ||||||||||
MetaWRAP aims to be an easy-to-use metagenomic wrapper suite that accomplishes the core tasks of metagenomic analysis from start to finish: read quality control, assembly, visualization, taxonomic profiling, extracting draft genomes (binning), and functional annotation. | metawrap | MetaWRAP - A flexible pipeline for genome-resolved metagenomic data analysis | MIT | MetaWRAP 1.3.0+galaxy1 | |||||||||
A (mostly) universal methylation extractor for BS-seq experiments. | MethylDackel | MIT | MethylDackel 0.5.2+galaxy0 | ||||||||||
metilene | metilene 0.2.6.1 | ||||||||||||
Estimates effective population sizes, past migration rates between n populations assuming a migration matrix model with asymmetric migration rates and different subpopulation sizes, and population divergences or admixture. | MIGRATE | 2 publications | MIT | ||||||||||
A lightweight Python3 pipeline whose purpose is to facilitate the identification of expressed loci from RNA-Seq data and to select the best models in each locus. | mikado | Leveraging multiple transcriptome assembly methods for improved gene structure annotation | LGPL-3.0 | 2.2.4--py39h70b41aa_0 | |||||||||
mimodd | 14 tools | ||||||||||||
Short-read assembler based on a de Bruijn graph, capable of assembling a human genome on a desktop computer in a day. | minia | Using cascading Bloom filters to improve the memory usage for de Brujin graphs | minia | CECILL-2.0 | Minia 3.2.6 | ||||||||
Miniasm is a very fast OLC-based de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically by minimap) as input and outputs an assembly graph in the GFA format. | miniasm | Minimap and miniasm: Fast mapping and de novo assembly for noisy long sequences | MIT | miniasm 0.3_r179+galaxy1 | 0.3 | ||||||||
miniconda | 4.12.0 | 4.12.0 | |||||||||||
minigraph | 0.20 | ||||||||||||
Pairwise aligner for genomic and spliced nucleotide sequences | minimap2 | Minimap2: Pairwise alignment for nucleotide sequences | minimap2 | MIT | Map with minimap2 2.28+galaxy0 | 2.17 2.22 2.24 | 2.26 | 2.24-gcccore-11.3.0 | |||||
Miniprot aligns a protein sequence against a genome with affine gap penalty, splicing and frameshift. It is primarily intended for annotating protein-coding genes in a new species using known genes from other species. | miniprot | MIT | 0.5 | ||||||||||
MIRA 3 - Whole Genome Shotgun and EST Sequence Assembler | mira | Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs | mira | GPL-3.0 | 4.9.6--1 | ||||||||
mircounts | miRcounts 1.4.0 | ||||||||||||
miRDeep2 discovers active known or novel miRNAs from deep sequencing data. | mirdeep2 | GPL-3.0 | 3 tools | ||||||||||
mitobim | MITObim 1.9.1 | ||||||||||||
Find, circularise and annotate mitogenome from PacBio assemblies | mitohifi | MitoFinder: Efficient automated large-scale extraction of mitogenomic data in target enrichment phylogenomics | MIT | MitoHiFi 3+galaxy0 | |||||||||
De novo metazoan mitochondrial genome annotation. | mitos2 | 10.1016/j.ympev.2012.08.023 | MITOS2 2.1.9+galaxy0 | ||||||||||
Multi Locus Sequence Typing from an assembled genome or from a set of reads. | mlst | Multilocus sequence typing of total-genome-sequenced bacteria | MLST | Other | 2 tools | 2.23.0--hdfd78af_1 | |||||||
MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein and nucleotide sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for Linux, MacOS, and (as beta version, via cygwin) Windows. The software is designed to run on multiple cores and servers and exhibits very good scalability. MMseqs2 can run 10000 times faster than BLAST. At 100 times its speed it achieves almost the same sensitivity. It can perform profile searches with the same sensitivity as PSI-BLAST at over 400 times its speed. MMseqs2 includes Linclust, the first clustering algorithm whose runtime scales linearly. With Linclust we clustered 1.6 billion metagenomic sequence fragments in 10 h on a single server to 50% sequence identity. | mmseqs2 | 6 publications | GPL-3.0 | 13-45111 | |||||||||
MOB-suite: Software tools for clustering, reconstruction and typing of plasmids from draft assemblies. Plasmids are mobile genetic elements (MGEs), which allow for rapid evolution and adaptation of bacteria to new niches through horizontal transmission of novel traits to different genetic backgrounds. The MOB-suite is designed to be a modular set of tools for the typing and reconstruction of plasmid sequences from WGS assemblies. | mob-suite | Universal whole-sequence-based plasmid typing and its utility to prediction of host range and epidemiological surveillance | 2 tools | ||||||||||
A modest Feature Finder to extract features in MS1 Data. | moff | MoFF: A robust and automated approach to extract peptide ion intensities | moFF 2.0.3.0 | ||||||||||
monailabel | 0.6.0 0.7.0 0.8.0 | ||||||||||||
A proteomics search algorithm specifically designed for high-resolution tandem mass spectra. | morpheus | A proteomics search algorithm specifically designed for high-resolution tandem mass spectra | MIT | Morpheus 288+galaxy0 | |||||||||
Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing. | mosdepth | Mosdepth: Quick coverage calculation for genomes and exomes | mosdepth | MIT | mosdepth 0.3.8+galaxy0 | ||||||||
Open-source, platform-independent, community-supported software for describing and comparing microbial communities | mothur | 10.1128/AEM.01541-09 | GPL-3.0 | 131 tools | |||||||||
Data warehouse for accessing mouse data from Mouse Genome Informatics (MGI). Supports powerful query, reporting, and analysis capabilities, the ability to save and combine results from different queries, easy integration into larger workflows, and a comprehensive Web Services layer. | mousemine | MouseMine: a new data warehouse for MGI | MouseMine 1.0.0 | ||||||||||
Program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. It uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters. | mrbayes | 3 publications | GPL-3.0 | 3.2.7--h19cf415_2 | 3.2.7a-foss-2022a | ||||||||
A fast, flexible and open software framework for medical image processing and visualisation. MRtrix3 is an open-source, cross-platform software package for medical image processing, analysis and visualisation, with a particular emphasis on the investigation of the brain using diffusion MRI. It is implemented using a fast, modular and flexible general-purpose code framework for image data access and manipulation, enabling efficient development of new applications, whilst retaining high computational performance and a consistent command-line interface between applications. In this article, we provide a high-level overview of the features of the MRtrix3 framework and general-purpose image processing applications provided with the software. | mrtrix | MRtrix3: A fast, flexible and open software framework for medical image processing and visualisation | 3.0.3-foss-2021a | ||||||||||
msConvert is a command-line utility for converting between various mass spectrometry data formats, including from raw data from several commercial companies (with vendor libraries, Windows-only). For Windows users, there is also a GUI, msConvertGUI. | msconvert | A cross-platform toolkit for mass spectrometry and proteomics | Apache-2.0 | msconvert 3.0.20287.4 | |||||||||
Tool for mass spectra metadata annotation. | msmetaenhancer | 10.21105/joss.04494 | MIT | MSMetaEnhancer 0.4.0+galaxy1 | |||||||||
Statistical tool for quantitative mass spectrometry-based proteomics. | msstats | MSstats: An R package for statistical analysis of quantitative mass spectrometry-based proteomic experiments | MSstats | MSstats 4.0.0+galaxy1 | |||||||||
Tools for detecting differentially abundant peptides and proteins in shotgun mass spectrometry-based proteomic experiments with tandem mass tag (TMT) labeling | bioconductor-msstatstmt | MSstatsTMT 2.0.0+galaxy1 | |||||||||||
mtag | 20230414 | ||||||||||||
MultiQC aggregates results from multiple bioinformatics analyses across many samples into a single report. It searches a given directory for analysis logs and compiles an HTML report. It's a general-use tool, perfect for summarising the output from numerous bioinformatics tools. | multiqc | 10.1093/bioinformatics/btw354 | MultiQC | GPL-3.0 | MultiQC 1.11+galaxy1 | 1.9 | 1.11-foss-2021a 1.14-foss-2022a (D) | ||||||
MUMmer is a modular system for the rapid whole-genome alignment of finished or draft sequence. It provides ultra-fast alignment of large-scale DNA and protein sequences. | mummer | 4 publications | MUMmer | Artistic-2.0 | 6 tools | 3.23--pl5321h1b792b2_13 | |||||||
MuSiC is a suite of programs that evaluate the biophysical effects of amino acid mutations in proteins. They request the experimental or modeled 3-dimensional protein structure as input, and predict the impact of specific single-site mutations requested by the user or of all possible single-site mutations. PoPMuSiC and HoTMuSiC predict the changes in thermodynamic and thermal stability, respectively, upon mutation. They are helpful for the rational design of modified proteins with controlled stability properties. SNPMuSiC predicts whether protein variants are deleterious or benign due to stability issues, thus providing a molecular-level interpretation of disease phenotype. | music_compare | 6 publications | MuSiC Compare 0.1.1+galaxy4 | ||||||||||
music_deconvolution | 3 tools | ||||||||||||
Convert proteomics data files into a SQLite database | mztosqlite | mz to sqlite 2.1.1+galaxy0 | |||||||||||
nag | nll6i27dbl nll6i27dbl-i8 nll6i27dbl-mkl fll6i26dcl fll6i26dcl-mkl | ||||||||||||
nvc | Naive Variant Caller (NVC) 0.0.4 | ||||||||||||
Nanocompore detects RNA modifications from Nanopore direct RNA sequencing (dRNA-Seq) data. It identifies differences in the ONT nanopore raw signal that correspond to RNA modifications by comparing two datasets from different experimental conditions expected to have a significant impact on RNA modifications. At least 2 replicates per condition are recommended. For example, one can use a control condition with a significantly reduced number of modifications, such as a cell line in which a modification-writing enzyme was knocked down or knocked out. Alternatively, on a smaller scale, transcripts of interest could be synthesized in vitro. | nanocompore | 10.1101/843136 | GPL-3.0 | SampComp 1.0.0rc3.post2+galaxy1 | 1.0.4--pyhdfd78af_0 | ||||||||
nanofilt | NanoFilt 0.1.0 | ||||||||||||
NanoPlot is a tool with various visualizations of sequencing data in bam, cram, fastq, fasta or platform-specific TSV summaries, mainly intended for long-read sequencing from Oxford Nanopore Technologies and Pacific Biosciences | nanoplot | NanoPack: Visualizing and processing long-read sequencing data | GPL-3.0 | NanoPlot 1.43.0+galaxy0 | |||||||||
A package for detecting cytosine methylations and genetic variations from nanopore MinION sequencing data. | nanopolish | Detecting DNA cytosine methylation using nanopore sequencing | Nanopolish | MIT | 4 tools | 0.14.0--hb24e783_1 | |||||||
nanosv | 1.2.4 | ||||||||||||
natural_product_likeness_scorer | Natural Product likeness calculator 2.1 | ||||||||||||
NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. Find and download sequence, annotation, and metadata for genes and genomes using our command-line tools or web interface. | ncbi-datasets-cli | NCBI Datasets | 14.2.2 14.13.0 14.29.1 16.6.0 | ||||||||||
ncbi-vdb | 2.10.9-gompi-2021a 3.0.2-gompi-2022a (D) | ||||||||||||
The National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. | ncbi_acc_download | 14 publications | NCBI Accession Download 0.2.8+galaxy0 | ||||||||||
nccmp compares two NetCDF files bitwise, semantically or with a user defined tolerance (absolute or relative percentage). Parallel comparisons are done in local memory without requiring temporary files. Highly recommended for regression testing scientific models or datasets in a test-driven development environment. | nccmp | GPL-2.0 | 1.8.5.0 | ||||||||||
ncl | 6.6.2 | ||||||||||||
The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formats, including DAP, HDF4, and HDF5. | nco | BSD-3-Clause | 4.7.7 4.9.2 5.0.5 | ||||||||||
Ncview is a netCDF visual browser. | ncview | GPL-1.0 | 2.1.7 | ||||||||||
NetCDF (Network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. | netcdf | 4.7.1 4.7.1p 4.6.3 4.6.3p 4.7.3 4.7.3p 4.7.4 4.7.4p 4.6.3-i8r8 4.8.0 4.8.0p 4.9.0 4.9.0p | 4.8.0-gompi-2021a 4.9.0-gompi-2022a (D) | ||||||||||
The Newick Utilities are a set of command-line tools for processing phylogenetic trees. They can process arbitrarily large amounts of data and do not require user interaction, which makes them suitable for automating phylogeny processing tasks. | newick_utils | 10.1093/bioinformatics/btq243 | Newick Display 1.6+galaxy1 | ||||||||||
Nextclade is an open-source project for viral genome alignment, mutation calling, clade assignment, quality checks and phylogenetic placement. | nextclade | MIT | Nextclade 2.7.0+galaxy0 | ||||||||||
Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages. | nextflow | Nextflow enables reproducible computational workflows | Apache-2.0 | 21.04.3 22.04.3 | 22.10.1 | ||||||||
A fast and efficient genome polishing tool for long-read assemblies. NextPolish fixes base errors (SNVs/indels) in genomes generated from noisy long reads; it can be used with short-read data only, long-read data only, or a combination of both. It contains two core modules and uses a stepwise approach to correct erroneous bases in the reference genome. To correct raw third-generation sequencing (TGS) long reads with approximately 10-15% sequencing errors, please use NextDenovo. | nextpolish | NextPolish: A fast and efficient genome polishing tool for long-read assembly | 1.4.1--py311he4a0461_1 | ||||||||||
A fast and efficient genome polishing tool for long-read assemblies. NextPolish fixes base errors (SNVs/indels) in genomes generated from noisy long reads; it can be used with short-read data only, long-read data only, or a combination of both. It contains two core modules and uses a stepwise approach to correct erroneous bases in the reference genome. To correct raw third-generation sequencing (TGS) long reads with approximately 10-15% sequencing errors, please use NextDenovo. | nextpolish2 | NextPolish: A fast and efficient genome polishing tool for long-read assembly | 0.1.0--hd03093a_0 | ||||||||||
ngs | 2.10.9-gcccore-10.3.0 | ||||||||||||
NGSUtils is a suite of software tools for working with next-generation sequencing datasets | ngsutils | NGSUtils: A software suite for analyzing and manipulating next-generation sequencing datasets | ngsutils | GPL-3.0 | BAM filter 0.5.9 | ||||||||
Nearly Infinite Neighbor Joining Application | ninja | MIT | 0.98-cluster_only | 1.10.2-gcccore-10.3.0 1.10.2-gcccore-11.3.0 (D) | |||||||||
NOVOPlasty | NOVOplasty 4.3.1+galaxy0 | ||||||||||||
nseg | 1.0.1 | ||||||||||||
bg_find_subsequences | Nucleotide subsequence search 0.2 | ||||||||||||
Set of Python programs developed to simplify the manipulation of sequence files. They were mainly designed for analyzing Next Generation Sequencer outputs (454 or Illumina) in the context of DNA metabarcoding. | obitools | obitools: A unix-inspired software package for DNA metabarcoding | OBITools | 10 tools | |||||||||
ont-fast5-api | 4.1.1--pyhdfd78af_0 | ||||||||||||
Open source library and a collection of tools and interfaces for the analysis of mass spectrometry data. Includes over 200 standalone (TOPP) tools that can be combined to a workflow with the integrated workflow editor TOPPAS. Raw and intermediate mass spectrometry data can be visualised with the included viewer TOPPView. | openms | 2 publications | OpenMS | BSD-3-Clause | 35 tools | ||||||||
OrthoFinder is a fast, accurate and comprehensive platform for comparative genomics. It finds orthogroups and orthologs, infers rooted gene trees for all orthogroups and identifies all of the gene duplication events in those gene trees. It also infers a rooted species tree for the species being analysed and maps the gene duplication events from the gene trees to branches in the species tree. OrthoFinder also provides comprehensive statistics for comparative genomic analyses. | orthofinder | 2 publications | GPL-3.0 | OrthoFinder 2.5.5+galaxy0 | |||||||||
pampa | 5 tools | ||||||||||||
pandoc | 3.1.2 | ||||||||||||
Pangolin is a deep-learning based method for predicting splice site strengths (for details, see Zeng and Li, Genome Biology 2022). It is available as a command-line tool that can be run on a VCF or CSV file containing variants of interest; Pangolin will predict changes in splice site strength due to each variant, and return a file of the same format. Pangolin's models can also be used with custom sequences. | pangolin | Predicting RNA splicing from DNA sequence using Pangolin | GPL-3.0 | Pangolin 4.3+galaxy2 | |||||||||
ParaView is an open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. The data exploration can be done interactively in 3D or programmatically using ParaView’s batch processing capabilities. | paraview | BSD-3-Clause | 5.8.0 5.8.0-mesa 5.8.0-gpu 5.9.1 5.9.1-mesa 5.9.1-gpu 5.10.1 5.9.1-mesa 5.10.1-mesa 5.9.1-gpu | ||||||||||
parse_mito_blast | Parse mitochondrial blast 1.0.2+galaxy0 | ||||||||||||
param_value_from_file | Parse parameter value 0.1.0 | ||||||||||||
Parsnp is a command-line-tool for efficient microbial core genome alignment and SNP detection. Parsnp was designed to work in tandem with Gingr, a flexible platform for visualizing genome alignments and phylogenetic trees. | parsnp | BSD-3-Clause | 1.7.4--hdcf5f25_2 | ||||||||||
Tool set for pathway-based data integration and visualization that maps and renders a wide variety of biological data on relevant pathway graphs. It downloads the pathway graph data, parses the data file, maps user data to the pathway, and renders the pathway graph with the mapped data. In addition, it integrates with pathway and gene set (enrichment) analysis tools for large-scale and fully automated analysis. | pathview | Pathview: An R/Bioconductor package for pathway-based data integration and visualization | Pathview | GPL-3.0 | Pathview 1.34.0+galaxy0 | ||||||||
Web application for exploring metagenomics classification results, with a special focus on infectious disease diagnosis. Pinpointing pathogens in metagenomics classification results is often complicated by host and laboratory contaminants as well as many non-pathogenic microbiota. Researchers can analyze, display and transform results from the Kraken and Centrifuge classifiers using interactive tables, heatmaps and flow diagrams. | pavian | 10.1101/084715 | GPL-3.0 | Pavian 1.0 | |||||||||
Multithread blat algorithm speeding up aligning sequences to genomes. | pblat | pblat: A multithread blat algorithm speeding up aligning sequences to genomes | Unlicense | 2.5 | |||||||||
Productive visualization of high-throughput sequencing data using the SeqCode open portable platform. | pe_histogram | Productive visualization of high-throughput sequencing data using the SeqCode open portable platform | GPL-3.0 | Paired-end histogram 1.0.1 | |||||||||
Paired-end read merger. PEAR evaluates all possible paired-end read overlaps without requiring the target fragment size as input. In addition, it implements a statistical test for minimizing false-positive results. | pear | PEAR: A fast and accurate Illumina Paired-End reAd mergeR | CC-BY-NC-1.0 | Pear 0.9.6.3 | 0.9.6--h9d449c0_10 | ||||||||
PepPointer | PepPointer 0.1.3+galaxy1 | ||||||||||||
peptide_genomic_coordinate | Peptide Genomic Coordinate 1.0.0 | ||||||||||||
PeptideShaker is a search engine independent platform for interpretation of proteomics identification results from multiple search engines, currently supporting X!Tandem, MS-GF+, MS Amanda, OMSSA, MyriMatch, Comet, Tide, Mascot, Andromeda and mzIdentML. By combining the results from multiple search engines, while re-calculating PTM localization scores and redoing the protein inference, PeptideShaker attempts to give you the best possible understanding of your proteomics data | peptideshaker | PeptideShaker enables reanalysis of MS-derived proteomics data sets: To the editor | Apache-2.0 | 4 tools | |||||||||
Semi-supervised learning for peptide identification from MS/MS data. | percolator | Semi-supervised learning for peptide identification from shotgun proteomics datasets | Percolator | 4 tools | |||||||||
perllib | v5.26.3 | ||||||||||||
This tool is used to search a FASTA sequence against a library of Pfam HMM. | pfamscan | PfamScan 1.6+galaxy0 | |||||||||||
Pharokka is a rapid standardised annotation tool for bacteriophage genomes and metagenomes. | pharokka | Pharokka: a fast scalable bacteriophage annotation tool | MIT | pharokka 1.3.2+galaxy0 | |||||||||
phinch | Phinch Visualisation 0.1 | ||||||||||||
php | 5.6.40 | ||||||||||||
Provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data. | phyloseq | Phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data | Phyloseq | GPL-3.0 | 2 tools | ||||||||
Phylogenetic estimation software using Maximum Likelihood | phyml | 5 publications | PhyML | GPL-2.0 | PhyML 3.3.20220408+galaxy0 | ||||||||
A set of command line tools for manipulating high-throughput sequencing (HTS) data in formats such as SAM/BAM/CRAM and VCF. Available as a standalone program or within the GATK4 program. | picard | PICARD | MIT | 31 tools | 2.27.4 3.1.1 | 2.25.1-java-11 | |||||||
PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) is a bioinformatics software package designed to predict metagenome functional content from marker gene (e.g., 16S rRNA) surveys and full genomes. | picrust | Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences | 6 tools | ||||||||||
PICRUSt2 (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) is a software for predicting functional abundances based only on marker gene sequences. | picrust2 | 10.1038/s41587-020-0548-6 | GPL-3.0 | 7 tools | |||||||||
pileup_interval | Pileup-to-Interval 1.0.3 | ||||||||||||
Read alignment analysis to diagnose, report, and automatically improve de novo genome assemblies. | pilon | Pilon: An integrated tool for comprehensive microbial variant detection and genome assembly improvement | Pilon | pilon 1.20.1 | |||||||||
PIPE-T is a Galaxy workflow for processing and analyzing miR expression profiles from RT-qPCR. It offers several state-of-the-art options for parsing, filtering, normalizing, imputing and analyzing RT-qPCR expression data. Integration of PIPE-T into Galaxy allows experimentalists with a strong bioinformatics background, as well as those without any programming or development expertise, to perform complex analyses in a simple-to-use, transparent, accessible, reproducible, and user-friendly environment. | pipe_t | PIPE-T: a new Galaxy tool for the analysis of RT-qPCR expression data | MIT | PIPE-T 1.0 | |||||||||
PlasFlow is a set of scripts used for prediction of plasmid sequences in metagenomic contigs. | PlasFlow | GPL-3.0 | PlasFlow 1.1.0+galaxy0 | ||||||||||
PlasmidFinder is a tool for the identification and typing of Plasmid Replicons in Whole-Genome Sequencing (WGS). | plasmidfinder | PlasmidFinder and In Silico pMLST: Identification and Typing of Plasmid Replicons in Whole-Genome Sequencing (WGS) | PlasmidFinder 2.1.6+galaxy1 | ||||||||||
Free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. | plink | PLINK: A tool set for whole-genome association and population-based linkage analyses | plink | GPL-2.0 | v2.00a3.7 | 2.00a3.6-gcc-11.3.0 | |||||||
PnetCDF: A Parallel I/O Library for NetCDF File Access | pnetcdf | 10.1109/SC.2003.10053 | Freeware | 1.11.2 | |||||||||
poisson2test | Poisson two-sample test 1.0.0 | ||||||||||||
porechop | Porechop 0.2.4+galaxy0 | ||||||||||||
Flexible toolkit for exploring datasets generated by nanopore sequencing devices from MinION for the purposes of quality control and downstream analysis. | poretools | Poretools: A toolkit for analyzing nanopore sequence data | poretools | 13 tools | |||||||||
Tools for performing taxonomic assignment based on phylogeny using pplacer and clst. | pplacer | Orchestrating high-throughput genomic analysis with Bioconductor | pplacer | GPL-3.0 | 1.1.alpha19 | ||||||||
pretext_map | PretextMap 0.1.9+galaxy1 | ||||||||||||
Pretext is an OpenGL-powered pretext contact map viewer. | pretextview | MIT | Pretext Snapshot 0.0.3+galaxy2 | ||||||||||
The pipeline runs PRODIGAL gene predictions on all genomes, runs pan-reciprocal BLAST, and identifies ortholog sets. For a set of orthologous genes, if the positions of the PRODIGAL selected starts coincide in a multiple sequence alignment, they are accepted. If they do not coincide, a consistent start position is sought where a majority of the highest-scoring PRODIGAL selected sites coincide. If such a position is found, it is accepted, and the predictions are changed for the outlying genes. | prodigal | Genome majority vote improves gene predictions | 2.6.3 | 2.6.3-gcccore-10.3.0 2.6.3-gcccore-11.3.0 (D) | |||||||||
Flow Injection Analysis coupled to High-Resolution Mass Spectrometry is a promising approach for high-throughput metabolomics. FIA-HRMS data, however, cannot be pre-processed with current software tools which rely on liquid chromatography separation, or handle low resolution data only. Here we present the package that implements a new methodology to pre-process FIA-HRMS raw data (netCDF, mzData, mzXML, and mzML) and generates the peak table. | profia | Orchestrating high-throughput genomic analysis with Bioconductor | CECILL-2.1 | proFIA 3.1.0 | |||||||||
progressbar2 | 4.2.0 | ||||||||||||
Software tool to annotate bacterial, archaeal and viral genomes quickly and produce standards-compliant output files. | prokka | 10.1093/bioinformatics/btu153 | Prokka | Prokka 1.14.6+galaxy1 | 1.14.5-gompi-2021a 1.14.5-gompi-2022a (D) | ||||||||
pslcdnafilter | 0 | ||||||||||||
purge_dups identifies and removes haplotypic duplication in primary genome assemblies, purging haplotigs and overlaps based on read depth. It takes a primary assembly and, optionally, an alternative assembly as input. Running the final purging step is highly recommended because assemblers may produce overrepresented sequences. | purge_dups | 10.1101/729962 | MIT | Purge overlaps 1.2.6+galaxy0 | |||||||||
pybigwig | 0.3.18-foss-2021a 0.3.18-foss-2022a (D) | ||||||||||||
PycoQC computes metrics and generates interactive QC plots for Oxford Nanopore technologies sequencing data. | pycoqc | GPL-3.0 | Pycoqc 2.5.2+galaxy0 | 2.5.2-foss-2021a | |||||||||
Standalone program and library for plotting high-quality, highly customizable genome browser tracks, producing reproducible plots for multivariate genomic data sets. | pygenometracks | pyGenomeTracks: reproducible plots for multivariate genomic datasets | GPL-3.0 | pyGenomeTracks 3.8+galaxy2 | |||||||||
pyprophet | 4 tools | ||||||||||||
A Python module for reading and manipulating SAM/BAM/VCF/BCF files. | pysam | The Sequence Alignment/Map format and SAMtools | MIT | 0.16.0.1-gcc-10.3.0 0.19.1-gcc-11.3.0 (D) | |||||||||
PyTorch is an optimized tensor library for deep learning using GPUs and CPUs. | pytorch | BSD-3-Clause | 1.4.0a0 1.5.1 1.9.0 1.10.0 1.12.1 | ||||||||||
QIIME 2™ is a next-generation microbiome bioinformatics platform that is extensible, free, open source, and community developed. | qiime2 | Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2 | BSD-3-Clause | 161 tools | 2022.8 | ||||||||
Platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data. | qualimap | 22914218 | qualimap | 4 tools | |||||||||
quasitools is a collection of open-source tools for analyzing viral quasispecies data. The application suite includes tools with the ability to create consensus sequences, call nucleotide, codon, and amino acid variants, calculate the complexity of a quasispecies, and measure the genetic distance between two similar quasispecies. These tools may be run independently or in user-created workflows. The source code, documentation, and file specifications are available at https://phac-nml.github.io/quasitools | quasitools | 10.1101/733238 | Apache-2.0 | 12 tools | |||||||||
QUAST stands for QUality ASsessment Tool. It evaluates the quality of genome assemblies by computing various metrics and providing detailed reports. | quast | QUAST: Quality assessment tool for genome assemblies | GPL-2.0 | Quast 5.2.0+galaxy1 | 5.1.0rc1 5.2.0 | 5.0.2-foss-2021a 5.2.0-foss-2022a (D) | |||||||
Query Tabular is a Galaxy-based tool that manipulates tabular files. It automatically creates a SQLite database directly from a tabular file within a Galaxy workflow. The SQLite database can be saved to the Galaxy history and further processed to generate tabular outputs containing the desired information and formatting. | query_tabular | Improve your Galaxy text life: The Query Tabular Tool | CC-BY-4.0 | Query Tabular 3.3.2 | |||||||||
Free software environment for statistical computing and graphics. | r | 10.11120/msor.2001.01010023 | 3.6.1 4.0.0 4.1.0 4.2.1 | 4.1.0-foss-2021a 4.2.1-foss-2021a 4.2.1-foss-2022a (D) | |||||||||
r-raceid | 5 tools | ||||||||||||
Consensus module for raw de novo DNA assembly of long uncorrected reads Racon is intended as a standalone consensus module to correct raw contigs generated by rapid assembly methods which do not include a consensus step. The goal of Racon is to generate genomic consensus which is of similar or better quality compared to the output generated by assembly methods which employ both error correction and consensus steps, while providing a speedup of several times compared to those methods. It supports data produced by both Pacific Biosciences and Oxford Nanopore Technologies. | racon | Constructing a reference genome in a single lab: The possibility to use oxford nanopore technology | MIT | Racon 1.5.0+galaxy1 | 1.4.3 | ||||||||
RaGOO is a tool for ordering and orienting genome assembly contigs into pseudochromosomes via Minimap2 alignments to a closely related reference genome, providing fast and accurate reference-guided scaffolding of draft genomes. Contig and reference FASTA files may be gzipped. | ragoo | RaGOO: Fast and accurate reference-guided scaffolding of draft genomes | MIT | RaGOO 1.0 | |||||||||
RagTag is a collection of software tools for scaffolding and improving modern genome assemblies. | ragtag | RaGOO: Fast and accurate reference-guided scaffolding of draft genomes | MIT | RagTag 2.1.0+galaxy1 | |||||||||
A feature clustering algorithm for non-targeted mass spectrometric metabolomics data. | ramclustr | RAMClust: A novel feature clustering method enables spectral-matching-based annotation for metabolomics data | GPL-2.0 | 2 tools | |||||||||
Ratatosk is a phased hybrid error correction tool for erroneous long reads, based on compacted and colored de Bruijn graphs built from accurate short reads. Hybrid error correction of long reads enables accurate variant calling and assembly. | ratatosk | 10.1101/2020.07.15.204925 | BSD-2-Clause | 0.7.6.3--h43eeafb_2 | |||||||||
Raven is a de novo genome assembler for long uncorrected reads. | raven | 10.1101/2020.08.07.242461 | MIT | Raven 1.8.3+galaxy0 | |||||||||
A standalone tool for extracting data directly from raw files generated by Thermo Orbitrap family instruments. | rawtools | RawTools: Rapid and Dynamic Interrogation of Orbitrap Data Files for Mass Spectrometer System Management | Apache-2.0 | Raw Tools 1.4.2.0 | |||||||||
A tool for Phylogenetic Analysis and Post-Analysis of Large Phylogenies. | raxml | 2 publications | RAxML | RAxML 8.2.12+galaxy1 | 8.2.12 | ||||||||
RDKit is an Open-Source Cheminformatics Software. Fast, Efficient Fragment-Based Coordinate Generation for Open Babel. | rdkit | 10.26434/CHEMRXIV.7791947.V2 | 2 tools | ||||||||||
rdock | Create Frankenstein ligand 2013.1-0+galaxy0 | ||||||||||||
recetox-aplcms is a tool for peak detection in mass spectrometry data. The tool performs (1) noise removal, (2) peak detection, (3) retention time drift correction, (4) peak alignment and (5) weaker signal recovery as well as (6) suspect screening. | recetox-aplcms | GPL-2.0 | 8 tools | ||||||||||
Tool for calculating the probability of nucleosome formation along a DNA sequence input by the user. | recon | RECON: A program for prediction of nucleosome formation potential | 1.08 | ||||||||||
This is a program to detect and visualize RNA editing events at genomic scale using next-generation sequencing data. | red | RED: A Java-MySQL software for identifying and visualizing RNA editing sites using rule-based and statistical filters | Red 2018.09.10+galaxy1 | ||||||||||
regenie | 3.2.9 | ||||||||||||
A program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). | repeatmasker | OSL-2.1 | RepeatMasker 4.1.5+galaxy0 | 4.1.2-p1 4.1.5 | 4.1.5--pl5321hdfd78af_0 | ||||||||
RepeatModeler is a de novo transposable element (TE) family identification and modeling package. At the heart of RepeatModeler are three de novo repeat finding programs (RECON, RepeatScout and LtrHarvest/Ltr_retriever) which employ complementary computational methods for identifying repeat element boundaries and family relationships from sequence data. | repeatmodeler | 10.1101/856591 | RepeatModeler 2.0.4+galaxy1 | 2.0.3 2.0.4 2.0.4-conda | 2.0.4--pl5321hdfd78af_0 | ||||||||
RepeatScout is a tool to discover repetitive substrings in DNA. | repeatscout | 10.1093/bioinformatics/bti1018 | 1.0.6 | ||||||||||
repenrich | RepEnrich 1.6.1 | ||||||||||||
The rjags package provides an interface from R to the JAGS library for Bayesian data analysis. JAGS uses Markov Chain Monte Carlo (MCMC) to generate a sequence of dependent samples from the posterior distribution of the parameters. | rjags | GPL-2.0 | 4-10-foss-2021a-r-4.1.0 | ||||||||||
Workflow to process tandem MS files and build MassBank records. Functions include automated extraction of tandem MS spectra, formula assignment to tandem MS fragments, recalibration of tandem MS spectra with assigned fragments, spectrum cleanup, automated retrieval of compound information from Internet databases, and export to MassBank records. | rmassbank | Automatic recalibration and processing of tandem mass spectra using formula annotation | Artistic-2.0 | RMassBank 3.0.0+galaxy3 | |||||||||
rmats-turbo | 4.1.2 | ||||||||||||
RMBlast is a RepeatMasker compatible version of the standard NCBI blastn program. The primary difference between this distribution and the NCBI distribution is the addition of a new program "rmblastn" for use with RepeatMasker and RepeatModeler. | rmblast | OSL-2.1 | 2.11.0 2.14.0 | ||||||||||
rnachipintegrator | 2 tools | ||||||||||||
Quality assessment tool for de novo transcriptome assemblies. | rnaquast | RnaQUAST: A quality assessment tool for de novo transcriptome assemblies | Unlicense | rnaQUAST 2.2.3+galaxy0 | |||||||||
A high speed stand alone pan genome pipeline, which takes annotated assemblies in GFF3 format (produced by Prokka (Seemann, 2014)) and calculates the pan genome. | roary | 10.1093/bioinformatics/btv421 | roary | Roary 3.13.0+galaxy3 | |||||||||
We present a generative statistical model and associated inference methods that handle read mapping uncertainty in a principled manner. Through simulations parameterized by real RNASeq data, we show that our method is more accurate than previous methods. Our improved accuracy is the result of handling read mapping uncertainty with a statistical model and the estimation of gene expression levels as the sum of isoform expression levels. | rsem | RNA-Seq gene expression estimation with read mapping uncertainty | RSEM | 1.3.3--pl5321ha04fe3b_5 | |||||||||
Provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data. Some basic modules quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while RNA-seq specific modules evaluate sequencing saturation, mapped reads distribution, coverage uniformity, strand specificity, transcript level RNA integrity etc. | rseqc | 2 publications | rseqc | 22 tools | 5.0.1 | ||||||||
Integrated development environment (IDE) for the R programming language. | rstudio | RSTUDIO: A platform-independent IDE for R and sweave | RStudio 0.3 | ||||||||||
RTG Core: Software for alignment and analysis of next-gen sequencing data. | rtg-tools | Other | 3.12.1 | ||||||||||
rxdock | 2 tools | ||||||||||||
s3segmenter | s3segmenter 1.3.12+galaxy0 | ||||||||||||
A software tool that implements a novel alignment-free algorithm for the estimation of isoform abundances directly from a set of reference sequences and RNA-seq reads. | sailfish | Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms | Sailfish 0.10.1.1 | ||||||||||
A tool for transcript expression quantification from RNA-seq data | salmon | Salmon provides fast and bias-aware quantification of transcript expression | GPL-3.0 | Salmon quant 1.10.1+galaxy2 | 1.1.0 | 1.4.0-gompi-2021a 1.9.0-gcc-11.3.0 (D) | |||||||
salmon_kallisto_mtx_to_10x | salmonKallistoMtxTo10x 0.0.1+galaxy6 | ||||||||||||
SALSA: A tool to scaffold long read assemblies with Hi-C data. This version implements some improvements in the original SALSA algorithm. If you want to use the old version, it can be found in the old_salsa branch. | salsa | Integrating Hi-C links with assembly graphs for chromosome-scale assembly | MIT | 2.3 | |||||||||
sam2interval | Convert SAM 1.0.2 | ||||||||||||
sam_pileup | Generate pileup 1.1.3 | ||||||||||||
A high-performance, modern, robust and fast tool (and library), written in the D programming language, for working with SAM, BAM and CRAM formats. | sambamba | Sambamba: Fast processing of NGS alignment formats | Sample, Slice or Filter BAM 0.7.1+galaxy1 | 0.8.1 | 0.8.1--h41abebc_0 | ||||||||
A tool to mark duplicates and extract discordant and split reads from SAM files. | samblaster | SAMBLASTER: Fast duplicate marking and structural variant read extraction | samblaster | 0.1.24 | |||||||||
SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods. | samtools | 3 publications | SAMTools | MIT | 23 tools | 1.9 1.10 1.12 | 1.18 1.19.2 | 1.15--h3843a85_0 | 1.13-gcc-10.3.0 1.13-gcc-11.3.0 1.16.1-gcc-11.3.0 (D) | ||||
Scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells. | scanpy | SCANPY: Large-scale single-cell gene expression data analysis | BSD-3-Clause | 34 tools | |||||||||
Pre-processing, quality control, normalization and visualization of single-cell RNA-seq data. | scater | Scater: Pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R | 6 tools | ||||||||||
scikit-build | 0.11.1-gcccore-10.3.0 | ||||||||||||
Scikit-image contains image processing algorithms for SciPy, including IO, morphology, filtering, warping, color manipulation, object detection, etc. | scikit-image | 10.7287/PEERJ.PREPRINTS.336V2 | BSD-3-Clause | 7 tools | |||||||||
scikit | 14 tools | ||||||||||||
scipio | 1.4 | ||||||||||||
send_to_cloud | Send to cloud 0.1.0 | ||||||||||||
SEPP stands for SATé-Enabled Phylogenetic Placement and addresses the problem of phylogenetic placement for meta-genomic short reads | sepp | 10.1142/9789814366496_0024 | sepp | GPL-3.0 | 4.5.1 | 4.5.0-foss-2021a 4.5.1-foss-2022a (D) | |||||||
seq_select_by_id | Select sequences by ID 0.0.14 | ||||||||||||
FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q files include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools only implement some of these manipulations, and not particularly efficiently, and some are only available for certain operating systems. Furthermore, the complicated installation process of required packages and running environments can render these programs less user friendly. SeqKit demonstrates competitive performance in execution time and memory usage compared to similar tools. The efficiency and usability of SeqKit enable researchers to rapidly accomplish common FASTA/Q file manipulations. | seqkit | 10.1371/journal.pone.0163962 | 2 tools | 2.2.0 2.3.1 2.5.1 | |||||||||
seqlib | 1.2.0-gcc-10.3.0 | ||||||||||||
A tool for processing sequences in the FASTA or FASTQ format. It parses both FASTA and FASTQ files which can also be optionally compressed by gzip. | seqtk | FastQ-brew: Module for analysis, preprocessing, and reformatting of FASTQ sequence data | seqtk | MIT | 15 tools | 1.3 | 1.4 | 1.3-gcc-10.3.0 1.3-gcc-11.3.0 (D) | |||||
Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data. Seurat aims to enable users to identify and interpret sources of heterogeneity from single-cell transcriptomic measurements, and to integrate diverse types of single-cell data. | seurat | Integrated analysis of multimodal single-cell data | MIT | 16 tools | |||||||||
De novo assembly from Oxford Nanopore reads. | shasta | Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes | MIT | Shasta 0.6.0+galaxy0 | |||||||||
Shovill is a pipeline for assembly of bacterial isolate genomes from Illumina paired-end reads. Shovill uses SPAdes at its core, but alters the steps before and after the primary assembly step to get similar results in less time. Shovill also supports other assemblers like SKESA, Velvet and Megahit, so you can take advantage of the pre- and post-processing the Shovill provides with those too. | shovill | shovill | GPL-3.0 | Shovill 1.1.0+galaxy2 | 1.1.0 | ||||||||
A clustering approach for identification of enriched domains from histone modification ChIP-seq data. | sicer | A clustering approach for identification of enriched domains from histone modification ChIP-Seq data | SICER 1.1 | ||||||||||
A text mining framework for interactive analysis and visualization of similarities among biomedical entities. For each search query, PMIDs or abstracts from PubMed are saved. For all PMIDs in each row of a table, the corresponding abstracts are saved in additional columns. | simtext | 10.1101/2020.07.06.190629 | 2 tools | ||||||||||
sip | 6.7.9 | ||||||||||||
The Salmonella In Silico Typing Resource (SISTR) is an open-source and freely available web application for rapid in silico typing and serovar prediction from Salmonella genome assemblies using cgMLST and O and H antigen gene searching. | sistr | Performance and accuracy of four open-source tools for in silico serotyping of salmonella spp. Based on whole-genome short-read sequencing data | Apache-2.0 | sistr_cmd 1.1.1+galaxy1 | |||||||||
slow5-dorado | 0.2.1 | ||||||||||||
slow5-guppy | 6.0.1 | ||||||||||||
Slow5tools is a simple toolkit for converting (FAST5 <-> SLOW5), compressing, viewing, indexing and manipulating data in SLOW5 format. About SLOW5 format: SLOW5 is a new file format for storing signal data from Oxford Nanopore Technologies (ONT) devices. SLOW5 was developed to overcome inherent limitations in the standard FAST5 signal data format that prevent efficient, scalable analysis and cause many headaches for developers. SLOW5 can be encoded in human-readable ASCII format, or a more compact and efficient binary format (BLOW5) - this is analogous to the seminal SAM/BAM format for storing DNA sequence alignments. The BLOW5 binary format supports zlib (DEFLATE) compression, or other compression methods, thereby minimising the data storage footprint while still permitting efficient parallel access. Detailed benchmarking experiments have shown that SLOW5 format is an order of magnitude faster and significantly smaller than FAST5. | slow5tools | 0.8.0 | 0.3.0 1.0.0 1.1.0 | ||||||||||
smithwaterman | 20160702-gcccore-10.3.0 | ||||||||||||
Reference-free profiling of polyploid genomes: inference of ploidy and heterozygosity structure using whole genome sequencing data. Smudgeplots are computed from raw or, even better, trimmed reads and show the haplotype structure using heterozygous kmer pairs. This tool extracts heterozygous kmer pairs from kmer dump files and disentangles genome structure by comparing the sum of kmer pair coverages (CovA + CovB) to their relative coverage (CovA / (CovA + CovB)). Such an approach also allows analysis of obscure genomes with duplications, various ploidy levels, etc. | Smudgeplots | 10.1101/747568 | Apache-2.0 | Smudgeplot 0.2.5+galaxy3 | |||||||||
Workflow engine and language. It aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain specific specification language (DSL) in python style. | snakemake | Snakemake-a scalable bioinformatics workflow engine | Snakemake | 7.18.2 | 6.6.1-foss-2021a 7.22.0-foss-2022a (D) | ||||||||
The Semi-HMM-based Nucleic Acid Parser is a gene prediction tool. | snap | 10.1186/1471-2105-5-59 | SNAP | Train SNAP 2013_11_29+galaxy1 | 2006 | 2013_11_29 | |||||||
SnapATAC (Single Nucleus Analysis Pipeline for ATAC-seq) is a fast, accurate and comprehensive method for analyzing single cell ATAC-seq datasets. | snapatac | Comprehensive analysis of single cell ATAC-seq data with SnapATAC | GPL-3.0 | 4 tools | |||||||||
SNAPPy is a Snakemake pipeline for scalable HIV-1 subtyping by phylogenetic pairing. It allows high-throughput HIV-1 subtyping locally while being resource efficient and scalable. The pipeline was constructed using Snakemake, and it uses MAFFT for multiple sequence alignment, BLAST for similarity queries, IQ-TREE for phylogenetic inference, and several Biopython modules for data parsing and analysis. SNAPPy was designed for Linux-based operating systems. | snappy | SNAPPy: A snakemake pipeline for scalable HIV-1 subtyping by phylogenetic pairing | MIT | 1.1.8-gcccore-10.3.0 1.1.9-gcccore-11.3.0 (D) | |||||||||
An algorithm for structural variation detection from third generation sequencing alignment. | sniffles | Accurate detection of complex structural variations using single-molecule sequencing | sniffles | MIT | sniffles 1.0.12+galaxy0 | 2.0.2 2.3.3 2.4 | |||||||
Rapid haploid variant calling and core SNP phylogeny generation. | snippy | snippy | GPL-3.0 | 3 tools | |||||||||
snp-dists | SNP distance matrix 0.8.2+galaxy0 | ||||||||||||
snp_sites | Finds SNP sites 2.5.1+galaxy0 | ||||||||||||
Variant annotation and effect prediction tool. It annotates and predicts the effects of variants on genes and proteins (such as amino acid changes). | snpeff | 22728672 | snpeff | 6 tools | |||||||||
snpfreq | snpFreq 1.0.1 | ||||||||||||
snpfreqplot | Variant Frequency Plot 1.0+galaxy3 | ||||||||||||
Toolbox that allows you to filter and manipulate annotated vcf files. | snpsift | Using Drosophila melanogaster as a model for genotoxic chemical mutational studies with a new program, SnpSift | LGPL-3.0 | 8 tools | |||||||||
SOAPdenovo2 is a next generation sequencing reads de novo assembler. | soapdenovo2 | SOAPdenovo2: An empirically improved memory-efficient short-read de novo assembler | GPL-3.0 | 2.41 | |||||||||
Sequence analysis tool for filtering, mapping and OTU-picking NGS reads. | sortmerna | SortMeRNA: Fast and accurate filtering of ribosomal RNAs in metatranscriptomic data | sortmerna | Filter with SortMeRNA 4.3.6+galaxy0 | 4.3.6--h9ee0642_0 | ||||||||
spaceranger | 2.0.1-gcc-11.3.0 | ||||||||||||
St. Petersburg genome assembler, intended for both standard isolates and single-cell MDA bacteria assemblies. SPAdes 3.9 works with Illumina or IonTorrent reads and is capable of providing hybrid assemblies using PacBio, Oxford Nanopore and Sanger reads. Additional contigs can be provided and can be used as long reads. | spades | 2 publications | SPAdes | GPL-2.0 | 8 tools | 3.15.4--h95f258a_0 | 3.15.3-gcc-10.3.0 3.15.5-gcc-11.3.0 (D) | ||||||
spectra | 1.0.1-gcccore-11.3.0 | ||||||||||||
Spectral Repeat Finder (SRF) is a program to find repeats through an analysis of the power spectrum of a given DNA sequence. | srf | Spectral repeat finders (SRF): Identification of repetitive sequences using Fourier transformation | 2022.11.22 | ||||||||||
sqlite | 3.36 | ||||||||||||
The SRA Toolkit and SDK from NCBI is a collection of tools and libraries for using data in the INSDC Sequence Read Archives. | sra-tools | Database resources of the National Center for Biotechnology Information. | 3 tools | 3.0.2 | 3.0.3-gompi-2022a 3.0.3--h87f3376_0 | ||||||||
srst2 | 0.2.0--py_4 | ||||||||||||
A fast implementation of the Smith-Waterman algorithm whose API can be flexibly used by programs written in C, C++ and other languages. | ssw | SSW library: An SIMD Smith-Waterman C/C++ library for use in genomic applications | 1.1-gcccore-10.3.0 | ||||||||||
Developed to work with restriction enzyme based sequence data, such as RADseq, for building genetic maps and conducting population genomics and phylogeography analysis. | stacks | Stacks: An analysis tool set for population genomics | Stacks | GPL-3.0 | 25 tools | ||||||||
Ultrafast universal RNA-seq data aligner | star | 3 publications | star | GPL-3.0 | 2 tools | 2.7.10a | 2.7.10a--h9ee0642_0 | 2.7.9a-gcc-10.3.0 2.7.10b-gcc-11.3.0 (D) 2.7.10a--h9ee0642_0 | |||||
STAR-Fusion, a method that is both fast and accurate in identifying fusion transcripts from RNA-Seq data | star_fusion | STAR-Fusion 0.5.4-3+galaxy1 | |||||||||||
staramr (*AMR) scans bacterial genome contigs against the ResFinder, PointFinder, and PlasmidFinder databases (used by the ResFinder webservice and other webservices offered by the Center for Genomic Epidemiology) and compiles a summary report of detected antimicrobial resistance genes. The star (*) in staramr indicates that it can handle all of the ResFinder, PointFinder, and PlasmidFinder databases. | staramr | staramr 0.10.0+galaxy1 | |||||||||||
Fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It uses a novel network flow algorithm as well as an optional de novo assembly step to assemble and quantitate full-length transcripts representing multiple splice variants for each gene locus. | stringtie | StringTie enables improved reconstruction of a transcriptome from RNA-seq reads | Artistic-2.0 | 2 tools | 2.1.7-gcc-10.3.0 | ||||||||
Subread is a general-purpose read aligner which can be used to map both genomic DNA-seq reads and RNA-seq reads. It uses a new mapping paradigm called "seed-and-vote" to achieve fast, accurate and scalable read mapping. It automatically determines if a read should be globally or locally aligned, and is therefore particularly powerful in mapping RNA-seq reads. It supports indel detection and can map reads with both fixed and variable lengths. | subread | 2 publications | subread | GPL-3.0 | 2.0.3 2.0.6 | ||||||||
An agile homology-based approach using a reduced SEED database to report the subsystems present in metagenomic samples and profile their abundances. | superfocus | SUPER-FOCUS: A tool for agile functional analysis of shotgun metagenomic data | 1.4.1 | ||||||||||
This tool generates Alternative Splicing (AS) events from an annotation and calculates the PSI ("Percentage Spliced In") value for each event exploiting fast quantification of transcript abundances from multiple samples. | suppa | Leveraging transcript quantification for fast computation of alternative splicing profiles | MIT | 2.3--py_2 | |||||||||
Structural variant detection from haploid and diploid genome assemblies. SVIM-asm - Structural variant identification method (Assembly edition). SVIM-asm (pronounced SWIM-assem) is a structural variant caller for haploid or diploid genome-genome alignments. It analyzes a given sorted BAM file (preferably from minimap2) and detects five different variant classes between the query assembly and the reference: deletions, insertions, tandem and interspersed duplications and inversions. | svim-asm | 10.1101/2020.10.27.356907 | GPL-3.0 | 1.0.3 | |||||||||
SyRI is a tool for finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genomic differences range from single nucleotide differences to complex structural variations. Current methods typically annotate sequence differences ranging from SNPs to large indels accurately but do not unravel the full complexity of structural rearrangements, including inversions, translocations, and duplications, where highly similar sequence changes in location, orientation, or copy number. Here, we present SyRI, a pairwise whole-genome comparison tool for chromosome-level assemblies. SyRI starts by finding rearranged regions and then searches for differences in the sequences, which are distinguished for residing in syntenic or rearranged regions. This distinction is important as rearranged regions are inherited differently compared to syntenic regions. | syri | SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies | 1.6 | ||||||||||
Targetfinder.org provides a web based resource that allows users to find genes that have a similar expression to a query gene signature. | targetfinder | Targetfinder.org: A resource for systematic discovery of transcription factor target genes | TargetFinder 1.7.0+galaxy1 | ||||||||||
tb_variant_filter | TB Variant Filter 0.4.0+galaxy0 | ||||||||||||
Tbl2asn is a command-line program that automates the creation of sequence records for submission to GenBank. It uses many of the same functions as Genome Workbench but is driven generally by data files. Tbl2asn generates .sqn files for submission to GenBank. | tbl2asn | 20220427-linux64 20230119-linux64 (D) | |||||||||||
A tool for drug resistance prediction from _M. tuberculosis_ genomic data (sequencing reads, alignments or variants). | tbprofiler | Rapid determination of anti-tuberculosis drug resistance from whole-genome sequences | GPL-3.0 | TB-Profiler Profile 6.2.1+galaxy1 | |||||||||
The COMBAT-TB Workbench is an IRIDA-based, modular workbench for M. tuberculosis bioinformatics. It is designed to be easily deployed on a single server. | tbvcfreport | 10.1101/2021.09.23.21263983 | Apache-2.0 | TB Variant Report 1.0.1+galaxy0 | |||||||||
TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications. | tensorflow | Prediction of cognitive impairment via deep learning trained with multi-center neuropsychological test data | 2.0.0 2.1.0 2.3.0 2.4.1 2.6.0 2.8.0 | ||||||||||
Lineage-level classification of transposable elements using conserved protein domains. Note: do not move or hard link TEsorter.py alone to anywhere else, as it relies on database/ and bin/. You can add the directory to PATH or soft link TEsorter.py to PATH. | tesorter | 10.1101/800177 | 1.4.6 | ||||||||||
TEtranscripts | TEtranscripts 2.2.3+galaxy0 | ||||||||||||
Open-source, cross-platform tool that converts Thermo RAW files into open file formats such as MGF and to the HUPO-PSI standard file format mzML | ThermoRawFileParser | 10.1101/622852 | Apache-2.0 | Thermo 1.3.4+galaxy0 | |||||||||
tmt-analyst | TMT Analyst 0.11+galaxy0 | ||||||||||||
Program that aligns RNA-Seq reads to a genome in order to identify exon-exon splice junctions. It is built on the ultrafast short read mapping program Bowtie. A stable SAMtools version is now packaged with the program. | tophat | 2 publications | tophat | 2 tools | 2.1.1 | ||||||||
TotalView is a debugger for High Performance Computing applications. | totalview | Proprietary | 2019.3.14 2020.1.13 | ||||||||||
Institute for Systems Biology "Trans-Proteomic Pipeline" | tpp | 4 publications | 5 tools | ||||||||||
TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks. | transdecoder | TransDecoder 5.5.0+galaxy2 | 5.5.0--pl5321hdfd78af_5 | ||||||||||
A tool for the analysis of Tn-Seq data. It provides an easy to use graphical interface and access to three different analysis methods that allow the user to determine essentiality in a single condition as well as between conditions. | transit | TRANSIT - A Software Tool for Himar1 TnSeq Analysis | 5 tools | ||||||||||
transvar | 2.4.0 | ||||||||||||
treeshrink | 1.3.9 | ||||||||||||
Tandem Repeats Finder. Find tandem repeats in DNA sequences without the need to specify either the pattern or pattern size. It uses the method of k-tuple matching to avoid the need for full scale alignment matrix computations. It requires no a priori knowledge of the pattern, pattern size or number of copies. There are no restrictions on the size of the repeats that can be detected. It determines a consensus pattern for the smallest repetitive unit in the tandem repeat. | trf | Tandem repeats finder: A program to analyze DNA sequences | Other | 4.09.1 | |||||||||
trf-mod | 4.10.0 | ||||||||||||
A wrapper tool around Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files, with some extra functionality for MspI-digested RRBS-type (Reduced Representation Bisulfite-Seq) libraries. | trim_galore | Trim Galore! 0.6.7+galaxy0 | |||||||||||
Tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment. | trimal | trimAl: A tool for automated alignment trimming in large-scale phylogenetic analyses | 1.4.1 | ||||||||||
A flexible read trimming tool for Illumina NGS data | trimmomatic | RobiNA: A user-friendly, integrated software solution for RNA-Seq-based transcriptomics | Trimmomatic | Trimmomatic 0.36.6 | 0.39 | 0.39--hdfd78af_2 | 0.39-java-11 | ||||||
Trinity is a transcriptome assembler which relies on three different tools: inchworm, an assembler; chrysalis, which pools contigs; and butterfly, which among other things compacts a graph resulting from chrysalis with reads. | trinity | 2 publications | 13 tools | 2.9.1 2.12.0 | 2.13.2--ha140323_0 | 2.9.1-foss-2021a 2.15.1--h6ab5fc9_2 | |||||||
Comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms. | trinotate | A Tissue-Mapped Axolotl De Novo Transcriptome Enables Identification of Limb Regeneration Factors | Trinotate 3.2.2+galaxy0 | 3.2.2--pl5321hdfd78af_1 | |||||||||
A program for improved detection of transfer RNA genes in genomic sequence. | trnascan-se | 2 publications | tRNA prediction 0.4 | ||||||||||
Trycycler: consensus long-read assemblies for bacterial genomes | trycycler | Trycycler: consensus long-read assemblies for bacterial genomes | GPL-3.0 | 5 tools | |||||||||
Utilities for handling sequences and assemblies from the UCSC Genome Browser project. | ucsc | Other | 6 tools | ||||||||||
Tools for handling Unique Molecular Identifiers in NGS data sets. | umi_tools | UMI-tools: Modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy | MIT | 5 tools | |||||||||
A tool for assembling bacterial genomes from a combination of short (2nd generation) and long (3rd generation) sequencing reads. | unicycler | Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads | Unicycler | GPL-3.0 | Create assemblies with Unicycler 0.5.1+galaxy0 | ||||||||
Metaproteomics data analysis with a focus on interactive data visualizations. | unipept | 2 publications | MIT | Unipept 4.5.1 | |||||||||
The universal protein knowledgebase in 2021. | UniProt_Downloader | 14 publications | CC-BY-4.0 | UniProt 2.4.0 | |||||||||
The universal protein knowledgebase in 2021. | uniprot_rest_interface | 14 publications | CC-BY-4.0 | UniProt 0.6 | |||||||||
unzip | Unzip 6.0+galaxy0 | 6.0-gcccore-10.3.0 6.0-gcccore-11.3.0 6.0-gcccore-12.3.0 (D) | |||||||||||
Tool for predicting effects of variants for any genome in Ensembl or with genome annotation (via GFF). This includes vertebrates and also plants, fungi, protists, metazoa and bacteria. There is a web and a REST API version but the most powerful is the Perl script version. See McLaren et al., 2016, Genome Biology | vep | The Ensembl Variant Effect Predictor | Apache-2.0 | 107-gcc-11.3.0 | |||||||||
VarScan, an open source tool for variant detection that is compatible with several short read aligners. | varscan | 2 publications | VarScan | 4 tools | |||||||||
API and command line utilities for the manipulation of VCF files. | vcflib | 10.1101/023754 | MIT | 23 tools | 1.0.3-foss-2021a-r-4.1.0 | ||||||||
Provide easily accessible methods for working with complex genetic variation data in the form of VCF files. | vcftools | The variant call format and VCFtools | VCFTools | GPL-3.0 | 2 tools | 0.1.16 | 0.1.16--pl5321h9a82719_6 | 0.1.16-gcc-10.3.0 0.1.16-gcc-11.3.0 (D) | |||||
A de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454 or SOLiD. | velvet | 10.1101/gr.074492.107 | Velvet | GPL-3.0 | 3 tools | 1.2.10 | 1.2.10--h7132678_5 | ||||||
verkko | 1.1 | ||||||||||||
visit | 3.1.2 | ||||||||||||
High-throughput search and clustering sequence analysis tool. It supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering and conversion. | vsearch | 10.7717/peerj.2584 | vsearch | GPL-3.0 | 8 tools | ||||||||
weblogo3 | Sequence Logo 3.5.0 | ||||||||||||
Software for phasing genomic variants using DNA sequencing reads, also called haplotype assembly. It is especially suitable for long reads, but also works well with short reads. | whatshap | 2 publications | MIT | 1.7 2.3 | |||||||||
windowmasker identifies and masks highly repetitive DNA sequences in a genome, using only the sequence of the genome itself. | windowmasker | 2 tools | |||||||||||
Winnowmap is a long-read mapping algorithm optimized for mapping ONT and PacBio reads to repetitive reference sequences. Winnowmap development began on top of the minimap2 codebase, with later changes aimed at improving mapping accuracy within repeats. | winnowmap | 2 publications | Not licensed | 2.03 | |||||||||
First fully open-source and collaborative online platform for computational metabolomics. It includes preprocessing, normalization, quality control, statistical analysis of LC/MS, FIA-MS, GC/MS and NMR data. | workflow4metabolomics | 2 publications | 19 tools | ||||||||||
Caenorhabditis elegans genome database. International consortium of biologists and computer scientists dedicated to providing the research community with accurate, current, accessible information concerning the genetics, genomics and biology of C. elegans and related nematodes. Founded in 2000, the Consortium is led by Paul Sternberg of CalTech, Paul Kersey of the EBI, Matt Berriman of the Wellcome Trust Sanger Institute, and Lincoln Stein of the Ontario Institute for Cancer Research. | wormbase | WormBase 2014: New views of curated biology | WormBase 1.0.1 | ||||||||||
xarray | 2 tools | ||||||||||||
Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. The packages enables imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files and preprocesses data for high-throughput, untargeted analyte profiling. | xcms | Correction of mass calibration gaps in liquid chromatography-mass spectrometry metabolomics data | GPL-2.0 | 10 tools | |||||||||
xml4ena | |||||||||||||
Detection of differential RNA modifications from direct RNA sequencing of human cell lines. Python package for detection of differential RNA modifications from direct RNA sequencing. | xpore | 10.1101/2020.06.18.160010 | xpore | MIT | 2.1--pyh5e36f6f_0 | ||||||||
Matches tandem mass spectra with peptide sequences. | xtandem | TANDEM: Matching proteins with tandem mass spectra | xtandem | 2 tools | |||||||||
YaHS is a scaffolding tool using Hi-C data. It relies on a new algorithm for contig joining detection which considers the topological distribution of Hi-C signals, aiming to distinguish real interaction signals from mapping noise. | yahs | 10.1101/2022.06.09.495093 | MIT | YAHS 1.2a.2+galaxy2 | 1.1 | ||||||||
Search and retrieve S. cerevisiae data, populated by SGD and powered by InterMine | yeastmine | InterMine: A flexible data warehouse system for the integration and analysis of heterogeneous biological data | LGPL-2.1 | YeastMine 1.0.0 | |||||||||
ZebrafishMine is powered by the InterMine data warehouse system, and integrates biological data sets from multiple sources. It currently includes updates of data from ZFIN, the zebrafish model organism database. There is also data from the Panther database. | zebrafishmine | InterMine: A flexible data warehouse system for the integration and analysis of heterogeneous biological data | LGPL-2.1 | ZebrafishMine 1.0.0 |
- The Tool identifier column (hidden by default) contains an identifier for the tool / workflow: typically the module name (used for matching to HPC lists).
- The Topic(s) column categorises the tools by purpose, using an EDAM concept where possible.
- More information about a tool can be found by following the bio.tools links.
- When a tool has been containerised to allow for easier installation on any compute infrastructure, a link to the containerised software that can be downloaded from BioContainers is shown in the Containers available? (BioContainers) column.
- The primary source material for the table is manually curated, and while we endeavour to keep the information as current as possible, there is a natural limit to the volume of information maintained here. Production of this information will be automated over time, and tools that are not relevant for bioinformatics analyses will be removed.