"In this example, we load a multiple sequence alignment from a file, but if your program produces alignment and you wish to make an HMM out of them, you can instantiate a `TextMSA` object yourself, e.g.:\n",
"Because we need a `DigitalMSA` to build the HMM, you will have to convert it first:\n",
...
...
@@ -186,7 +186,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
...
...
@@ -200,7 +200,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.2"
"version": "3.10.4"
}
},
"nbformat": 4,
...
...
%% Cell type:markdown id: tags:
# Build an HMM from a multiple sequence alignment
%% Cell type:code id: tags:
``` python
import pyhmmer
pyhmmer.__version__
```
%% Cell type:code id: tags:
``` python
alphabet = pyhmmer.easel.Alphabet.amino()
```
%% Cell type:markdown id: tags:
## Loading the alignment
%% Cell type:markdown id: tags:
A new HMM can be built from a single sequence, or from a multiple sequence alignment. Let's load an alignment in digital mode so that we can build our HMM:
In this example, we load a multiple sequence alignment from a file, but if your program produces alignments and you wish to build an HMM from them, you can instantiate a `TextMSA` object yourself, e.g.:
Because we need a `DigitalMSA` to build the HMM, you will have to convert it first:
```python
msa_d = msa.digitize(alphabet)
```
</div>
%% Cell type:markdown id: tags:
## Building an HMM
Now that we have a multiple alignment loaded in memory, we can build a profile HMM (pHMM) using a `pyhmmer.plan7.Builder`. This also requires a Plan7 background model to compute the transition probabilities.
%% Cell type:code id: tags:
``` python
builder = pyhmmer.plan7.Builder(alphabet)
background = pyhmmer.plan7.Background(alphabet)
hmm, _, _ = builder.build_msa(msa, background)
```
%% Cell type:markdown id: tags:
We can have a look at the consensus sequence of the HMM with the `consensus` property:
%% Cell type:code id: tags:
``` python
hmm.consensus
```
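%% Cell type:markdown id: tags:
As a rough intuition for what a consensus sequence summarises, the sketch below takes a majority vote over each column of a toy alignment. This is a simplification and not what pyHMMER does: the `consensus` property is derived from the match-state emission probabilities of the model, not from a column vote. The function name and the sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(rows, gap="-"):
    """Return the most common non-gap residue of each alignment column."""
    consensus = []
    for column in zip(*rows):
        residues = [r for r in column if r != gap]
        # A column made only of gaps contributes a gap to the consensus.
        consensus.append(Counter(residues).most_common(1)[0][0] if residues else gap)
    return "".join(consensus)


# Made-up aligned sequences of equal length.
rows = ["LAARKADTANFYT", "LAARKADTDNFYT", "LAGRKADTANFYT"]
print(majority_consensus(rows))
```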
%% Cell type:markdown id: tags:
## Saving the resulting HMM
%% Cell type:markdown id: tags:
Now that we have an HMM, we can save it to a file to avoid having to rebuild it every time. Using the `HMM.write` method lets us write the HMM in ASCII or binary format to an arbitrary file. The resulting file will also be compatible with the `hmmsearch` binary if you wish to use that one instead of pyHMMER.
%% Cell type:code id: tags:
``` python
with open("LuxC.hmm", "wb") as output_file:
    hmm.write(output_file)
```
%% Cell type:markdown id: tags:
## Applying the HMM to a sequence database
%% Cell type:markdown id: tags:
Once a pHMM has been obtained, it can be applied to a sequence database with the `pyhmmer.plan7.Pipeline` object. Let's iterate over the protein sequences in a FASTA file to see if our new HMM gets any hits: