Sequence- and profile-based annotation scripts

This directory contains all scripts needed to rerun the sequence- and profile-search-based annotations.

MMseqs2

We perform a profile-to-sequence search: the query profiles are built from the generated ColabFold MSAs, and the target sequences are UniRef100 (part of ColabFold's UniRef30 2021_03).

setup_mmseqs.sh

This script will download the same MMseqs2 version that was used to generate the presented results. It assumes that an archive with all ColabFold MSAs (msa.tar.gz) is present in the same folder.
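Because the setup script assumes the MSA archive is already in place, a quick pre-flight check can catch a missing or truncated file early. The following is a minimal sketch, not part of the repository: the function name check_msa_archive is hypothetical, and the only thing taken from this README is the archive name msa.tar.gz.

```shell
# Hypothetical pre-flight helper (an assumption, not part of the repository).
# Verifies that the MSA archive exists and that tar can read it.
# Note: plain POSIX sh has no "local", so the variable below is global.
check_msa_archive() {
  archive="${1:-msa.tar.gz}"

  if [ ! -f "$archive" ]; then
    echo "missing archive: $archive" >&2
    return 1
  fi

  # "tar -tzf" lists the contents without extracting anything;
  # it fails if the file is not a readable gzip-compressed tarball.
  if ! tar -tzf "$archive" >/dev/null 2>&1; then
    echo "unreadable archive: $archive" >&2
    return 1
  fi

  echo "ok: $archive"
}
```

One possible use is to run check_msa_archive msa.tar.gz in the folder containing setup_mmseqs.sh before starting the setup. Listing a large archive can take a while, since tar has to decompress the whole file.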

run_mmseqs.sh

Executes MMseqs2. Please adjust the COLABFOLD_DB_PATH variable to point to a folder containing the UniRef30 2021_03 database from: http://wwwuser.gwdg.de/~compbiol/colabfold/uniref30_2103.tar.gz
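Fetching and unpacking the database into a folder could look like the sketch below. It is an illustration under assumptions: the helper name fetch_and_unpack and the target folder in the usage note are hypothetical, wget is assumed to be installed, and this README does not state whether COLABFOLD_DB_PATH is edited inside run_mmseqs.sh or read from the environment.

```shell
# Hypothetical helper (an assumption, not part of the repository).
# Usage: fetch_and_unpack URL TARGET_DIR
# Downloads the archive into TARGET_DIR unless it is already there,
# unpacks it in place, and prints TARGET_DIR on success.
fetch_and_unpack() {
  url="$1"
  target_dir="$2"
  archive="$target_dir/$(basename "$url")"

  mkdir -p "$target_dir" || return 1

  if [ ! -f "$archive" ]; then
    # Download to a temporary name first so that an interrupted
    # transfer is never mistaken for a complete archive.
    wget -O "$archive.part" "$url" || return 1
    mv "$archive.part" "$archive" || return 1
  fi

  tar -xzf "$archive" -C "$target_dir" || return 1
  echo "$target_dir"
}
```

For example, fetch_and_unpack "http://wwwuser.gwdg.de/~compbiol/colabfold/uniref30_2103.tar.gz" "$HOME/colabfold_db" would unpack into the (hypothetical) folder $HOME/colabfold_db, which is then the value to use for COLABFOLD_DB_PATH. Make sure there is enough free disk space for both the archive and its unpacked contents.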

HHblits

We perform a profile-to-profile search: the query profiles are built from the generated ColabFold MSAs, and the target profiles come from UniRef30_2021_03.

setup_hhblits.sh

This script will download the same HHblits version that was used to generate the presented results and create an HHblits-readable version of the previously generated MSA database. It assumes that setup_mmseqs.sh was already executed.

run_hhblits.sh

Executes HHblits. Please adjust the HHSUITE_DB_PATH variable to point to a folder containing the UniRef30 2021_03 database from: http://wwwuser.gwdg.de/~compbiol/uniclust/2021_03/UniRef30_2021_03.tar.gz
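Before starting a long HHblits run, it can help to confirm that the folder really contains a complete database. The sketch below assumes the usual HH-suite3 layout, in which a database consists of six files named PREFIX_a3m, PREFIX_cs219 and PREFIX_hhm, each with an .ffdata and an .ffindex extension. The function name check_hhsuite_db is hypothetical, and the prefix UniRef30_2021_03 used in the usage note is an assumption based on the archive name.

```shell
# Hypothetical sanity check (an assumption, not part of the repository).
# Usage: check_hhsuite_db DB_DIR PREFIX
# Reports every expected database file that is missing and returns
# non-zero if at least one is absent.
check_hhsuite_db() {
  db_dir="$1"
  prefix="$2"
  missing=0

  for kind in a3m cs219 hhm; do
    for ext in ffdata ffindex; do
      file="$db_dir/${prefix}_${kind}.${ext}"
      if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        missing=1
      fi
    done
  done

  return "$missing"
}
```

For example, check_hhsuite_db "$HHSUITE_DB_PATH" UniRef30_2021_03 returns success only if all six files are found. If the unpacked archive uses a different prefix or places the files in a subfolder, adjust the arguments accordingly.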

hhblits_lock.sh

Helper script that works around a performance issue with hhblits_omp on machines with many CPU cores.