Project title
Developing agentic systems autonomously using public biomedical data to accelerate and enrich research workflows.
Collaborators and funding
- The Kids Research Institute Australia - https://www.thekids.org.au/
- The University of Western Australia - https://www.uwa.edu.au
Contact(s)
Kevin Chen, The Kids Research Institute Australia, kevin.chen@thekids.org.au
Project description and aims
A large number of bioinformatic tools are available to researchers, with new tools regularly released. While these offer potential, using and evaluating them is challenging. For example, sufficient bioinformatic expertise is needed to understand how these tools could be used, and understanding documentation, testing, and evaluating results is time-consuming. As a consequence, researchers tend to prefer tools they are familiar with rather than those that would produce the most accurate results, and some tools have unexplored potential. Agentic workflows offer an opportunity to overcome these limitations - by designing workflows that emulate the manual process used to test tools, it becomes possible to autonomously investigate the capabilities of these tools, systematically benchmarking them to determine the strengths and limitations of each one.
This project aims to develop agentic workflows that will autonomously identify and test tools, including downloading appropriate testing and benchmarking data and metrics. These findings will then be used to autonomously construct agentic workflows that are tailored to a user’s use case, allowing for analyses where no established pipeline exists.
Upon completion, it is envisaged that the project will:
- assist researchers in determining which tools are most appropriate for their use case
- help standardise benchmarking/testing procedures in domains where they are currently heterogeneous
- provide a means for researchers, including those with little or no bioinformatic expertise, to use bioinformatic tools for their desired analyses
Downstream, these are expected to assist researchers with analysing their data more efficiently, to uncover findings such as mechanisms underpinning diseases or drug targets.
The following analyses are planned:
- following development of the basic tool-testing agentic workflow, it will be tested against a shortlist of 50 tools with varying hardware/software requirements (e.g. from those that can be run on a laptop to those requiring GPUs)
- the capacity of the agentic workflow to a) correctly determine parameters and necessary hardware, and b) generate biologically correct results against ground truths will be quantified
- the performance of these agentic workflows against general coding agents, such as Claude Code and Codex, will be investigated
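As an illustration only, the quantification described above could be summarised along the following lines. This is a minimal sketch, not the project's method: the `ToolRun` record, the `f1_score` and `summarise` helpers, and the choice of F1 as the ground-truth agreement metric are all hypothetical assumptions, since the project description does not specify metrics or data structures.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToolRun:
    """One agent attempt at running a tool (hypothetical record layout)."""
    tool: str
    params_correct: bool    # did the agent choose valid parameters?
    hardware_correct: bool  # did the agent identify the required hardware?
    predicted: set[str]     # items reported by the tool, e.g. feature IDs
    truth: set[str]         # ground-truth items for the same test data


def f1_score(predicted: set[str], truth: set[str]) -> float:
    """F1 agreement between predicted and ground-truth sets (assumed metric)."""
    if not predicted and not truth:
        return 1.0  # nothing expected and nothing reported
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


def summarise(runs: list[ToolRun]) -> dict[str, float]:
    """Aggregate per-run outcomes into the three quantities of interest."""
    if not runs:
        raise ValueError("at least one run is required")
    n = len(runs)
    return {
        "param_accuracy": sum(r.params_correct for r in runs) / n,
        "hardware_accuracy": sum(r.hardware_correct for r in runs) / n,
        "mean_f1": sum(f1_score(r.predicted, r.truth) for r in runs) / n,
    }
```

The same summary could be computed for a general coding agent on the same shortlist of tools, giving a like-for-like comparison under these assumed metrics.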
How is ABLeS supporting this work?
This work is supported through the Production Bioinformatics scheme provided by ABLeS.
Expected outputs enabled by participation in ABLeS
Findings are intended to be published via a manuscript - the target journal is Nature Methods. Furthermore, the code repository will be made publicly available via GitHub.
These details have been provided by project members at project initiation. For more information on the project, please consult the contact(s) or project links above.