This page provides a comprehensive overview of tools, containers, workflows, and datasets available to ABLeS researchers through the Australian BioCommons Tools and Workflows project (project if89 @ NCI).
Software
The tools available through the Australian BioCommons Shared Tools and Workflows repository (project if89 at NCI) are listed in ToolFinder.
To make the if89 modules visible in your session, run the following command:
module use -a /g/data/if89/apps/modulefiles
After that, you can load any tool and use it directly with the following command:
module load $tool/$version
You can list all available modules using the following command:
module available
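The commands above can be combined into a small helper for scripts and job files. This is a minimal sketch, not something provided by if89: `load_if89_tool` is a hypothetical function name, and the sketch assumes the `module` command is available in your shell (for example in an NCI session).

```shell
#!/usr/bin/env bash
# Modulefiles directory, taken from the text above.
IF89_MODULEFILES=/g/data/if89/apps/modulefiles

# Hypothetical helper: make the if89 modules visible, then load one tool.
# Usage: load_if89_tool <tool> <version>
load_if89_tool() {
    local tool="$1"
    local version="$2"
    # Fail early with a clear message if the module system is not set up.
    if ! type module >/dev/null 2>&1; then
        echo "module command not found; is the module system initialised?" >&2
        return 1
    fi
    module use -a "$IF89_MODULEFILES"
    module load "${tool}/${version}"
}

# Example call (commented out because it needs access to if89).
# RepeatMasker/4.2.0 is the if89 module named in the Dfam entry on this page.
# load_if89_tool RepeatMasker 4.2.0
```

Keeping the tool name and version as separate arguments makes it easy to pin versions in one place at the top of a script.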
Software databases
Some of the databases required by different bioinformatics software tools are made available through the if89 project.
They are located at /g/data/if89/data_library. You can request other databases to be included by contacting us.
A list of the currently available (as of 25 Aug 2025) databases is included below:
| Dataset | Source | Download date | Location | Details |
|---|---|---|---|---|
| Blast | Blast Webpage | 28 Aug 2022 | blast_db/28082022/ | nr.*.gz: non-redundant protein sequence database with entries from GenPept, Swissprot, PIR, PRF, PDB, and RefSeq. nt.*.gz: nucleotide sequence database, with entries from all traditional divisions of GenBank, EMBL, and DDBJ. |
| Blast | Blast Webpage | 7 Nov 2023 | blast_db/07112023 | nt.*: nucleotide sequence database, with entries from all traditional divisions of GenBank, EMBL, and DDBJ. |
| AlphaFold/UniProt | Foldseek GitHub pages and AlphaFold Protein Structure Database webpage | 30 Nov 2022 | AlphaFoldDB/aminoacid/UniProt/30112022/ | Amino acid dataset for the foldseek tool. Downloaded with the foldseek databases command. |
| AlphaFold/UniProt-NO-CA | Foldseek GitHub pages and AlphaFold Protein Structure Database webpage | 30 Nov 2022 | AlphaFoldDB/aminoacid/UniProt-NO-CA/30112022/ | Amino acid dataset for the foldseek tool. Downloaded with the foldseek databases command. |
| AlphaFold/UniProt50 | Foldseek GitHub pages and AlphaFold Protein Structure Database webpage | 30 Nov 2022 | AlphaFoldDB/aminoacid/UniProt50/30112022/ | Amino acid dataset for the foldseek tool. Downloaded with the foldseek databases command. |
| AlphaFold/Proteome | Foldseek GitHub pages and AlphaFold Protein Structure Database webpage | 29 Nov 2022 | AlphaFoldDB/aminoacid/Proteome/29112022/ | Amino acid dataset for the foldseek tool. Downloaded with the foldseek databases command. |
| AlphaFold/Swiss-Prot | Foldseek GitHub pages and AlphaFold Protein Structure Database webpage | 30 Nov 2022 | AlphaFoldDB/aminoacid/Swiss-Prot/30112022/ | Amino acid dataset for the foldseek tool. Downloaded with the foldseek databases command. |
| Busco/eukaryota_odb10 | Busco webpages | 14 Aug 2023 | busco_db/14082023/lineages | Lineage datasets for the busco tool. Downloaded manually. |
| Kaiju | Kaiju Webpage | 26 May 2023 | kaiju/26052023/kaiju_db_rvdb | Kaiju pre-built indexes for protein sequences from RVDB-prot v26.0. Contains the Kaiju .fmi index file, as well as nodes.dmp and names.dmp from the NCBI taxonomy. |
| Kraken2 | Kraken 2, KrakenUniq and Bracken indexes | 9 Oct 2023 | kraken2/09102023/k2_pluspf | Kraken2 pre-built index for RefSeq database (archaea, bacteria, viral, plasmid, human, protozoa & fungi) plus UniVec_Core. |
| Kraken2 | Kraken 2, KrakenUniq and Bracken indexes | 9 Jun 2025 | kraken2/09062025/k2_core_nt | Kraken2 pre-built core_nt index: a very large collection, inclusive of GenBank, RefSeq, TPA and PDB. |
| Kraken2 | Kraken 2, KrakenUniq and Bracken indexes | 9 Jun 2025 | kraken2/09062025/k2_standard | Kraken2 pre-built "standard" database: includes RefSeq archaea, bacteria, viral, plasmid, human, plus UniVec_Core. |
| Dfam | Dfam | 4 Aug 2025 | dfam/04082025/dfam39 | Dfam 3.9; FamDB Format 2.0; Partition 7 [dfam39_full.7.h5]: Mammalia (57 GB). More info: https://www.dfam.org/releases/Dfam_3.9/families/FamDB/README.txt. Available with RepeatMasker/4.2.0 (if89 module). |
| Human reference genome | GRCh38.p14 | Feb 2022 | Homo_Ref/GRCh38.p14 | Homo sapiens reference genome GRCh38 patch release 14 (GCF_000001405.40). Includes primary assembly FASTA, gene annotation (GTF), transcript and protein FASTA files from NCBI RefSeq. This assembly is the current standard reference for WGS/WES alignment, variant calling, and annotation pipelines (e.g. BWA-MEM2, GATK, DeepVariant, VEP). |
| Human reference genome | GRCh37.p13 | Feb 2014 | Homo_Ref/GRCh37.p13 | Homo sapiens reference genome GRCh37 patch release 13 (GCF_000001405.25), maintained for backward compatibility with legacy datasets and resources (e.g. older GWAS, ExAC, early gnomAD releases). Includes primary assembly FASTA, gene annotation (GTF), transcript and protein FASTA files from NCBI RefSeq. |
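The Location values in the table are relative to /g/data/if89/data_library, so a script needs to join the two before handing a path to a tool. The sketch below shows one way to do that. The helper names `if89_db_path` and `require_if89_db` are hypothetical, and the commented Kraken2 call assumes a Kraken2 module is already loaded and uses placeholder input and output file names.

```shell
#!/usr/bin/env bash
# Base directory of the shared databases, taken from the text above.
IF89_DATA_LIBRARY=/g/data/if89/data_library

# Hypothetical helper: turn a "Location" value from the table into a full path.
# A trailing slash, if present, is dropped so paths are formed consistently.
if89_db_path() {
    local location="${1%/}"
    printf '%s/%s\n' "$IF89_DATA_LIBRARY" "$location"
}

# Hypothetical helper: resolve a location and fail early if it is not visible
# from the current session or job.
require_if89_db() {
    local path
    path="$(if89_db_path "$1")"
    if [ ! -e "$path" ]; then
        echo "database not found: $path" >&2
        return 1
    fi
    printf '%s\n' "$path"
}

# Example (commented out: needs access to if89 and a loaded Kraken2 module;
# reads.fastq and the output file names are placeholders).
# db="$(require_if89_db kraken2/09062025/k2_standard)" &&
#     kraken2 --db "$db" --report sample.report.txt \
#         --output sample.kraken2.txt reads.fastq
```

Checking that the path exists before launching a tool gives a clearer error than the tool's own failure. If the check fails inside a batch job, one likely cause (an assumption about your job setup) is that the job did not request access to the if89 storage.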
if89 Contributors
Hardip Patel
National Centre for Indigenous Genomics, John Curtin School of Medical Research, The Australian National University
J King Chang
School of Biotechnology and Biomolecular Science, Faculty of Science, UNSW, Sydney
Ziad Al Bkhetan
Australian BioCommons, University of Melbourne
Andre Martins Reis
Garvan Institute of Medical Research
Hasindu Gamaarachchi
Kyle Drover
Tim Amos
Garvan Institute of Medical Research
Leah Kemp
Garvan Institute of Medical Research
Kisaru Liyanage
National Computational Infrastructure (NCI)
Terry Bertozzi
South Australian Museum
Javed Shaikh
National Computational Infrastructure (NCI), Australian National University (ANU)
Kirat Alreja
National Centre for Indigenous Genomics, Australian National University
Andrey Bliznyuk
Hyungtaek Jung
National Centre for Indigenous Genomics, Australian National University
Wenjing Xue
National Computational Infrastructure (NCI)
Farhad Masoomi-Aladizgeh
The University of Sydney
Haixia Guan
The University of Sydney
Johan Gustafsson
Australian BioCommons, University of Melbourne
Dale Roberts
National Computational Infrastructure (NCI) (at the time of this work), ARC Centre of Excellence for Climate Extremes