Authored by Martin Larralde.
To find the state of this project's repository at the time of any of these versions, check out the tags.
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
Unreleased
v0.10.11 - 2024-03-27
Fixed
- Compilation of Easel and HMMER code not using SSE4.1 extensions.
v0.10.10 - 2024-03-18
Fixed
- Implement `write` function for `fopencookie` with `off_t` instead of `off64_t` for compatibility.
- Fix handling of NULL buffers passed to `read` and `write` methods of `fopencookie`.
v0.10.9 - 2024-03-12
Fixed
- Reallocation issue causing segmentation faults in `nhmmer` with more than 64 sequences (#62).
v0.10.8 - 2024-03-06
Added
- Getter to access the strand of a `Domain` produced by a `LongTargetsPipeline`.
Changed
- Display model and cutoff names in `MissingCutoffs` error message, if any.
- Allow `LongTargetsPipeline` to be configured with window length and beta parameters.
- Make `nhmmer` use the window length and beta from the options when creating a `Builder`.
Fixed
- `nhmmer` not computing E-values for non-default window lengths (moshi4/pybarrnap#2).
- `SequenceFile` and `MSAFile` crashing with a segmentation fault when given the path to a folder rather than a file.
v0.10.7 - 2024-03-04
Added
- Pre-compiled wheels for PyPy 3.10.
Fixed
- Invalid pointer cast in `__getbuffer__` method of `Matrix` and `Vector` objects.
- Remaining tests failing to run on missing `importlib-resources`.
- `pyhmmer.hmmer` dispatchers possibly dead-locking on background thread errors (#60).
v0.10.6 - 2024-02-20
Added
- `armv7` and `aarch64` to the `PKGBUILD` architectures.
Changed
- `SSIReader` and `SSIWriter` constructors now accept path-like objects.
- Skip tests depending on `importlib.resources.files` when it is not available on the host machine.
Fixed
- Memory leak caused by alphabet allocation in `Pipeline._scan_loop_file`.
v0.10.5 - 2024-02-16
Added
- `Alignment` properties to get the original lengths of the sequence and HMM being stored.
- `Hit.length` property storing the length of the hit sequence (or HMM).
- `TopHits.query_length` storing the length of the hit HMM (or query).
- `Alignment.posterior_probabilities` property showing an encoded representation of posteriors (#59, by @arajkovic).
- `Trace.score` method to compute a trace score from a given profile and sequence.
- `Alignment.__sizeof__` implementation leveraging `p7_alidisplay_SizeOf`.
Fixed
- `Cutoffs` proxy objects not recording their owner to prevent deallocation.
- Avoid GIL re-acquisition in `GeneticCode.translate`.
- Query metadata not being recorded in `Hits` obtained from `daemon.Client`.
- Empty `MatrixU8` creation attempting zero-allocation.
- `VectorU8.zeros` allocating 4x more memory than required.
- Memory leak caused by string duplication in `__getbuffer__` methods of `Matrix` and `Vector` types.
v0.10.4 - 2023-10-29
Added
- `residue_markups` argument to `TextSequence` and `DigitalSequence` constructors.
- `__reduce__` implementation to `TextSequence`, `DigitalSequence`, `TextSequenceBlock` and `DigitalSequenceBlock`.
Changed
- Handling of `easel` I/O methods to avoid implicit GIL acquisition for error checking.
Fixed
- Syntax errors in type annotation files.
v0.10.3 - 2023-10-22
Added
- Out-of-band pickle serialization of `Bitfield` objects.
- Getters for `float` attributes and forward/backward parameters of `OptimizedProfile`.
- `InvalidHMM` error raised by `HMM.validate`.
Changed
- Mark `HMM.zero` method as `noexcept`.
- Increase size of buffer for the query queue in the `hmmer` dispatcher.
Fixed
- Unneeded semaphore in `pyhmmer.hmmer` message passing implementation.
- Broken assertion in `Bitfield._from_raw_bytes`.
- Relax tolerance of HMM validation in `TraceAligner.align_traces`.
v0.10.2 - 2023-08-20
Fixed
- Invalid buffer write in `DigitalSequenceBlock.translate` (#50).
v0.10.1 - 2023-08-17
Added
- `HMM.set_consensus` method to set the consensus for a method or compute it from the emission probabilities.
Fixed
- Platform detection for MacOS and Armv7 platforms in `setup.py`.
- `pyhmmer.plan7.HMM` constructor setting a consensus string forcefully.
v0.10.0 - 2023-08-16
Added
- Support for compiling wheels for Aarch64 and NEON-enabled Arm platforms.
Changed
- Updated HMMER to `v3.4`.
- Updated Easel to `v0.49`.
- Use `cibuildwheel` to build wheel distributions.
Fixed
- Patch missing `PyInterpreterState_GetID` preventing the package from working on PyPy 3.9.
v0.9.0 - 2023-08-03
Added
- `TopHits.mode` property showing from which pipeline mode (search or scan) the hits were obtained.
Changed
- Updated the code for Cython `v3.0`.
Fixed
v0.8.2 - 2023-06-07
Added
- Bracket-style `repr` implementation to `HMM`, `Profile` and `OptimizedProfile` showing model alphabet, length and name.
- `MissingCutoffs` and `InvalidParameter` exceptions inheriting `ValueError`.
Changed
- Replace `pthread` locks with `PyThread` API for synchronizing models in `OptimizedProfileBlock`.
Fixed
- Sequence length extraction in `LongTargetsPipeline.search_hmm` (#42).
- `LongTargetsPipeline.search_msa` not building a HMM with `Builder.build_msa`.
v0.8.1 - 2023-05-19
Added
- `HMM.validate` method to ensure a HMM holds HMMER structural constraints.
- `plan7.Transitions` enum with transition names for indexing `HMM.transition_probabilities`.
v0.8.0 - 2023-05-01
PyHMMER has been accepted for publication in Bioinformatics. Paper can be reached at doi:10.1093/bioinformatics/btad214.
Added
- `pyhmmer.hmmer.jackhmmer` function to run several JackHMMER iterative searches in parallel using multithreading (#35, by @zdk123).
- `HMM.to_profile` shortcut method to allocate and configure a new `Profile` object.
Fixed
- Type annotations of `Pipeline.iterate_seq` and `Pipeline.iterate_hmm`.
- Potential memory leak on exceptions raised by `HMMPressedFile.read`.
- `Offsets.profile` not recording offsets properly, causing `pyhmmer.hmmer.hmmpress` to produce invalid pressed files (#37).
Changed
- `HMM.__init__` and `HMM.sample` now take the `Alphabet` as the first argument, for consistency with the rest of the API.
- `HMM` now requires a `name` argument.
Removed
- Deprecated `ignore_gaps` argument in `SequenceFile.__init__`.
- Deprecated `Sequence.taxonomy_id` property.
v0.7.4 - 2023-04-14
Added
- Recipes page to the documentation with code example for loading multiple HMM files (#24, by @zdk123).
Fixed
- `TraceAligner` methods causing a segfault when passed an uninitialized HMM (#36).
Changed
- `HMM` default constructor now always creates a valid HMM (with respect to probability arrays).
- `TraceAligner` now validates the input `HMM` before calling the HMMER code.
- Use stack allocation for all error buffers instead of creating empty `bytearray` objects where applicable.
v0.7.3 - 2023-03-24
Fixed
v0.7.2 - 2023-02-17
Added
- `easel.GeneticCode` class wrapping an `ESL_GENCODE` struct for configuring translation.
- `DigitalSequence.translate` method to translate a nucleotide sequence to a protein sequence. Metadata is copied from the source sequence to its translation (#31, by @valentynbez).
Deprecated
- `Sequence.taxonomy_id` property, as it is not used by Easel and implementation is not consistent (see EddyRivasLab/easel#68).
v0.7.1 - 2022-12-15
Added
- Missing `__reduce__` method to `TopHits`.
Fixed
- Build detection of available platform functions in `setup.py`.
v0.7.0 - 2022-12-04
Added
- `Bitfield.zeros` and `Bitfield.ones` classmethods for constructing an empty bitfield of known size.
- `Bitfield.copy` method to copy a bitfield object.
- `SequenceBlock` and `OptimizedProfileBlock` classes to store Python objects next to a contiguous array of pointers for iterating with the GIL released.
- `SequenceFile.read_block` method to read a whole sequence block from a file.
- `HMM.sample` class method to generate a HMM at random given a `Randomness` source.
- `hmmscan` function to scan a profile database with sequence queries.
- `deepcopy` implementations to `HMM`, `Profile` and `OptimizedProfile` classes of `plan7`.
- `rewind` method to `HMMFile`, `HMMPressedFile` and `SequenceFile` to reset a file back to its initial position.
- `name` attribute to `HMMFile`, `HMMPressedFile`, `MSAFile` and `SequenceFile` to expose the path of a file (when it was created from path).
- `local` property to `Profile` and `OptimizedProfile`, indicating whether a profile is in local or global mode.
- `multihit` property to `Profile` and `OptimizedProfile`, indicating whether a profile is in unihit or multihit mode, with a setter taking care of the reconfiguration.
- `Domain.included` and `Domain.reported` settable properties to report the inclusion and reporting status of a single domain.
- `TopHits.included` and `TopHits.reported` sized iterator to iterate only on included and reported hits.
- `Domains.included` and `Domains.reported` sized iterator to iterate only on included and reported domains.
Changed
- `Bitfield`, `Vector` and `Matrix` can now be created from an iterable.
- `Pipeline` search methods now expect a `DigitalSequenceBlock` or a `SequenceFile` for the target sequence database.
- `Pipeline` scan methods now expect an `OptimizedProfileBlock` or a `HMMPressedFile` for the target profile database.
- `TraceAligner` now expects a `DigitalSequenceBlock` for the sequences to align to the HMM.
- `Profile.configure` now uses a default value of 400 for the `L` argument.
- `hmmsearch`, `nhmmer` and `phmmer` support being given a single query instead of requiring an iterable.
- `HMMPressedFile` can now be created, closed and used as a context manager directly without having to manage the source `HMMFile`.
- Renamed `Profile.optimized` method to `Profile.to_optimized`.
- Replaced `Randomness.is_fast` method with the `Randomness.fast` property.
- Rewrite handling of `Hit` flags using settable properties (`Hit.included`, `Hit.reported`, `Hit.new`, `Hit.dropped`, `Hit.duplicate`) instead of methods.
Fixed
- Memory leak in the `LongTargetsPipeline` search loop.
- PyPy behaviour change of `readinto` methods now expecting `unsigned char*` instead of `char*` memoryview.
- `NULL`-pointer dereference in `Pipeline.search_hmm` when given a query without name.
- `LongTargetsPipeline` not recording the query name and accession.
- Memory leak caused by using a non-default prior scheme when constructing a `Builder`.
Removed
- `PipelineSearchTargets`, replaced in functionality with `easel.DigitalSequenceBlock`.
- `is_local` and `is_multihit` methods of `Profile` and `OptimizedProfile`, replaced with equivalent properties.
- `Hit.manually_drop` and `Hit.manually_include` methods, replaced with the different `Hit` properties.
v0.6.3 - 2022-09-09
Fixed
- Error not being raised on alphabet detection failure in `SequenceFile` or `MSAFile`.
- Add check in `DigitalSequence` constructor to make sure encoded characters are in valid range (#25).
Added
- `SequenceFile.guess_alphabet` and `MSAFile.guess_alphabet` to guess the alphabet from an open file.
- `Alphabet.encode` and `Alphabet.decode` to convert raw sequences between digital and text format.
v0.6.2 - 2022-08-12
Changed
- `hmmsearch`, `phmmer` and `nhmmer` functions will reduce the requested number of threads to the number of queries, if it can be detected using `operator.length_hint`.
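The thread-capping rule described in this entry can be sketched with the standard library alone. This is only an illustration, not PyHMMER's actual code: `effective_threads` is a hypothetical helper, and treating a hint of zero as "length unknown" is an assumption made for the sketch.

```python
import operator


def effective_threads(queries, requested):
    """Hypothetical helper: cap `requested` threads to the number of queries."""
    # length_hint returns len(queries) when available, otherwise the value of
    # the object's __length_hint__() method, otherwise the default (0).
    hint = operator.length_hint(queries)
    if hint > 0:
        return min(requested, hint)
    # Length could not be detected (e.g. a generator): keep the request as-is.
    return requested


print(effective_threads(["q1", "q2", "q3"], 8))         # sized list
print(effective_threads((q for q in ["q1", "q2"]), 8))  # generator, no hint
```

With a sized collection the thread count drops to the number of queries; with a plain generator nothing can be detected, so the requested count is kept.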
Added
- Documentation for loading all HMMs from an `HMMFile` object at once (#23).
- List of projects depending on PyHMMER to the `Examples` page of the documentation.
v0.6.1 - 2022-06-28
Added
- `pickle` protocol support for `TopHits` objects, using the HMMER network serialization.
- `TopHits.write` method to write hits to a file in tabular format.
- `query_name` and `query_accession` properties to `TopHits` objects to access the name and accession of the query that produced the hits.
Fixed
- Extraction of filename from file-like objects in the `HMMFile` constructor.
- Use `os.cpu_count` instead of `multiprocessing.cpu_count` where applicable to preserve OS scheduling.
- Wrong return type in docstring of `HMM.insert_emissions`.
- `TopHits.searched_nodes` returning the searched number of residues instead of the searched number of model nodes.
- Unsound decoding of pickled `MatrixF` or `VectorF` when data comes from a source of different endianness.
Changed
- Rewrite `pyhmmer.hmmer` threading code using `Deque` instead of `collections.Queue` to store the queries and results.
- Reduce memory consumption of `pyhmmer.hmmer` by reducing the number of semaphores and event flags used concurrently.
- Make `pyhmmer.hmmer` main threads block on query insertion rather than result retrieval to make sure worker threads are never idling.
v0.6.0 - 2022-05-01
Added
- `pyhmmer.daemon` module with a client implementation to communicate with a `hmmpgmd` server.
- `Pipeline.arguments` method to get a list of CLI arguments from the parameters used to initialize the `Pipeline`.
- Setters for `name`, `accession` and `description` properties of `plan7.Hit`.
- Constructor for individual `plan7.Trace` objects outside a `plan7.Traces` list.
- `plan7.Trace.from_sequence` constructor to create a faux trace from a single sequence.
- `manually_include` and `manually_drop` methods to `plan7.Hit` for manually selecting the inclusion status of a `Hit` in a `TopHits` instance.
- `compare_ranking` method to `plan7.TopHits` for comparing the order of the hits compared to a previous run on the same targets stored in an `easel.KeyHash` object.
- `Pipeline.iterate_seq` and `Pipeline.iterate_hmm` to run iterative queries like JackHMMER.
- `repr` implementations for `easel.MSAFile`, `easel.SequenceFile` and `easel.HMMFile` showing the path or file object they were created from.
- `repr` implementation for `easel.Randomness` showing the seed and the RNG algorithm in use.
- `str` implementation for `plan7.Alignment` using HMMER original code to display a domain alignment like in search/scan results.
Changed
- `plan7.Trace.posterior_probabilities` property may now be `None` in case no memory is allocated for the posteriors in the `P7_TRACE` struct.
- `TopHits.to_msa` can now add additional sequences passed as arguments to the alignment.
- `plan7.HMMPressedFile` now raises an exception on attempts to create a new instance manually.
- `ignore_gaps` argument of `easel.SequenceFile` is now deprecated.
- `repr` implementations for `easel` types now use the fully qualified class name.
Fixed
- `easel.SequenceFile.readinto` docstring not rendering properly in documentation.
- Type annotations of `hits_included` and `hits_reported` of `plan7.TopHits` marking these properties as `bool` instead of `int`.
- Setters of `name`, `accession`, `description` and `author` properties of `easel.MSA` crashing when given `None` values.
- Exception value raised from Easel code not being properly extracted.
- Plain strings being used in example for `easel.TextSequence` and `easel.TextMSA` constructors where byte strings are expected (#20).
v0.5.0 - 2022-03-14
Added
- `plan7.PipelineSearchTargets` to reduce the overhead when searching the same sequences several times with different query profiles.
- `TopHits.copy` method to duplicate a `TopHits` instance.
- `TopHits.merge` method to merge hits obtained with the same query on different targets.
- Buffer protocol implementation for `pyhmmer.easel.Bitfield`.
Changed
- Renamed `TopHits.included` and `TopHits.reported` properties to `TopHits.hits_included` and `TopHits.hits_reported`.
- `MSAFile` and `SequenceFile` are now directly in digital mode if they are instantiated with `digital=True`.
- `SequenceFile.parse` can now return a sequence in digital mode.
- Reorganized tests to make them runnable from a site install.
Fixed
- Usage of `memcpy` in contexts where it may have had undefined behaviour.
- `VectorF.__eq__` crashing when comparing two empty objects.
- `SequenceFile` and `MSAFile` not closing file handles when raising an error in `__init__`.
v0.4.11 - 2021-12-15
Added
- `plan7.HMMFile.read` method to read a single `plan7.HMM` from a `plan7.HMMFile` (instead of using `next`).
- `closed` property on `easel.SequenceFile`, `easel.MSAFile` and `plan7.HMMFile` to mark whether a file object is closed.
- `plan7.HMMFile.is_pressed` method to check whether a HMM file has associated pressed data.
- `plan7.HMMFile.optimized_profiles` method to read the `plan7.OptimizedProfile` entries in a `plan7.HMMFile` if there is associated pressed data available.
- Getters for the `name`, `accession`, `description`, `consensus`, `consensus_structure`, `evalue_parameters` and `cutoffs` properties of a `plan7.OptimizedProfile`.
- `plan7.OptimizedProfile.__eq__` implementation to compare two optimized profiles.
- `__sizeof__` implementations for `plan7.OptimizedProfile` and `plan7.Profile` to get the allocated size of a profile.
Fixed
- Double-free caused by the Cython cycle breaking feature on several view types (`easel.Randomness`, `easel.Vector`, `easel.Matrix`, `plan7.Cutoffs`, `plan7.EvalueParameters`, `plan7.Offsets`, `plan7.Trace`).
- `plan7.Hit.description` using the pointer to the accession string erroneously, causing occasional NULL dereference.
- `plan7.OptimizedProfile.copy` performing a shallow copy instead of a deep copy as expected.
Changed
- `pyhmmer.hmmer` type annotations now explicitly support `plan7.Profile` or `plan7.OptimizedProfile` inputs where applicable.
v0.4.10 - 2021-12-06
Added
- `entropy` and `relative_entropy` methods to `easel.VectorF` to compute the Shannon entropy of a vector and the Kullback-Leibler divergence of two vectors.
- `mean_match_entropy`, `mean_match_information` and `mean_match_relative_entropy` methods to `plan7.HMM` to get information statistics of an HMM model.
- `match_occupancy` method to `plan7.HMM` to compute the occupancy for each match state as an `easel.VectorF`.
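For readers unfamiliar with the two quantities named in the `entropy` and `relative_entropy` entry, here is a minimal pure-Python sketch of what they measure. It does not call PyHMMER: the function names mirror the entry for readability only, and the use of base-2 logarithms (results in bits) is an assumption of the sketch.

```python
import math


def entropy(p):
    """Shannon entropy of a probability vector (bits, assumed unit)."""
    # Terms with zero probability contribute nothing to the sum.
    return -sum(x * math.log2(x) for x in p if x > 0.0)


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) (bits, assumed unit)."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0.0)


uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.70, 0.10, 0.10, 0.10]
print(entropy(uniform))                   # 2.0 bits for 4 equiprobable symbols
print(entropy(skewed))                    # lower than the uniform case
print(relative_entropy(skewed, uniform))  # positive when distributions differ
```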
Fixed
- `plan7.Builder.build_msa` using the gap-open and gap-extend probabilities instead of the MSA itself to compute the transition probabilities for the new HMM.
Changed
- `plan7.Builder.build` will now only load the score system once and reuse it unless a different score system is requested between calls.
v0.4.9 - 2021-11-11
Added
- `plan7.ScoreData` class to store the substitution scores and maximal extensions for a long target search.
- `plan7.LongTargetsPipeline` to run searches on targets longer than 100,000 residues.
- `Alphabet` methods to check whether an `Alphabet` object is a DNA, RNA, nucleotide or protein alphabet.
- `window_length` and `window_beta` arguments to `plan7.Builder` to set the max length of nucleotide `HMM` created by builder objects.
Changed
- `pyhmmer.hmmer.nhmmer` now uses a `LongTargetsPipeline` instead of a `Pipeline` to search the target sequences.
- `pyhmmer.hmmer.nhmmer` now supports `HMM` queries in addition to `DigitalSequence` and `DigitalMSA` queries.
- `pyhmmer.hmmer.phmmer` now always assumes protein queries.
- `Z` and `domZ` attributes of `plan7.TopHits` objects are now read-only.
Fixed
- `nhmmer` now uses DNA as the default alphabet instead of amino acid alphabet like it did before (#12).
v0.4.8 - 2021-10-27
Added
- Constructor arguments and properties to `plan7.Pipeline` to support bit score thresholds instead to filter top hits.
- Support for creating a `SequenceFile` and an `MSAFile` using a Python file-like object instead of only supporting filenames.
- Support for reading individual sequences from an MSA file with `SequenceFile`.
- `TextMSA.alignment` to access the actual alignment as a tuple of strings.
- Subtraction and division support for `easel.Vector` subclasses.
Changed
- `plan7.Cutoffs` now supports setting the bit score cutoffs, but requires both to be set or cleared at the same time.
- `easel.Vector` will always allocate some memory when created manually to avoid having a special empty case in every vector method.
- `pyhmmer.easel.AllocationError` now stores the size it failed to allocate, and the number of elements when allocating an array.
Fixed
- `TextSequence.digitize` will now raise a `ValueError` when the sequence contains invalid characters for the alphabet (previously was an `UnexpectedError`).
v0.4.7 - 2021-09-28
Added
-
TraceAligner
,Trace
andTraces
classes topyhmmer.plan7
to get tracebacks after aligning several sequences against an HMM. -
pyhmmer.hmmalign
function with the same features as thehmmalign
binary from HMMER3. - Support for out-of-band pickling in
easel.Vector
andeasel.Matrix
.
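Out-of-band pickling, mentioned in the last entry, is a feature of pickle protocol 5 (Python 3.8+) that lets large buffers travel beside the pickle stream instead of being copied into it. The sketch below uses only the standard library, with a plain `bytearray` standing in for a vector's memory; it illustrates the mechanism and is not PyHMMER's implementation.

```python
import pickle

# Stand-in for the memory backing a large numeric vector (assumption).
data = bytearray(1024 * 1024)

# With protocol 5, buffers wrapped in PickleBuffer are handed to the
# callback instead of being copied into the pickle stream. A callback
# returning a false value (list.append returns None) keeps them out-of-band.
buffers = []
payload = pickle.dumps(
    pickle.PickleBuffer(data),
    protocol=5,
    buffer_callback=buffers.append,
)

# The stream itself stays tiny; the buffers must be supplied again on load.
restored = pickle.loads(payload, buffers=buffers)

print(len(payload) < len(data))         # True: no copy of the data in the stream
print(bytes(restored) == bytes(data))   # True: same contents after the round trip
```

The receiving side is responsible for transporting `buffers` alongside `payload`, which is what allows zero-copy transfers between processes or over shared memory.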
Changed
- Allow creating an empty `Vector` or `Matrix` by calling their constructor without arguments.
Fixed
- Potential unreported exceptions in `plan7.OptimizedProfile.write` and several `plan7.SSIWriter` methods.
v0.4.6 - 2021-09-10
Added
- `pickle` protocol for `easel.Alphabet`, `easel.Bitfield`, `easel.KeyHash`, `easel.Vector`, `easel.Matrix` and `plan7.HMM`.
- `taxonomy_id` and `residue_markups` properties to `easel.Sequence`.
- `sum_score` property to `plan7.Hit`.
- `plan7.EvalueParameters` class to expose the e-value parameters of a `plan7.HMM` or a `plan7.Profile`.
- Equality checks and slicing for `easel.Matrix` and `easel.Vector`.
- Support for creating and manipulating zero-sized `easel` matrices and vectors.
- `plan7.Cutoffs` class to expose the Pfam score cutoffs of a `plan7.HMM` or a `plan7.Profile`.
- Keyword arguments to configure E-value thresholds when creating a `plan7.Pipeline` object.
- Support for using model-specific thresholding options in `plan7.Pipeline`.
Changed
- Use the replace error handler when decoding error messages to skip potential decoding issues when already building an exception.
- Improve `pyhmmer.hmmer` to ensure background threads exit on a `KeyboardInterrupt`.
- `easel.VectorU8.__eq__` accepts any object implementing the buffer protocol.
- `plan7.HMM.creation_time` now takes and returns a `datetime.datetime` object, assuming the field is only ever set with `asctime`.
- Refactor `easel.Vector` and `easel.Matrix` and mark exposed memory as C-contiguous.
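The first entry of this list refers to Python's `replace` error handler for decoding. A short standard-library sketch of the behaviour it relies on is shown below; the byte string is a made-up example, not an actual Easel error message.

```python
# Hypothetical raw error message containing a byte that is not valid UTF-8.
raw = b"unexpected byte \xff in input"

# Strict decoding (the default) raises while we are already handling an error.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# The "replace" handler substitutes U+FFFD for undecodable bytes instead.
message = raw.decode("utf-8", "replace")

print(strict_failed)  # True
print(message)        # the invalid byte appears as the replacement character
```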
Fixed
- `easel.Alphabet` not reporting potential allocation errors.
- Potential buffer overflow in `easel.Matrix` and `easel.Vector` when calling `__init__` more than once.
v0.4.5 - 2021-07-19
Added
-
OptimizedProfile.convert
method to configure an optimized profile from aProfile
without reallocating a newP7_OPROFILE
struct.
Changed
- Rewrite the
plan7.Pipeline
search loop to avoid reacquiring the GIL between reference sequences. - Require the reference sequences to be stored in a collection (instead of an iterable) when passing them to the
search_hmm
,search_msa
andsearch_seq
methods ofplan7.Pipeline
. - Avoid reallocating a new
OptimizedProfile
every time a new HMM is passed toPipeline.search_hmm
. - Relax the GIL while sorting and thresholding
TopHits
inPipeline
search methods.
v0.4.4 - 2021-07-07
Added
- `ignore_gaps` parameter to `pyhmmer.plan7.SequenceFile`, allowing to skip the gap characters when reading a sequence from an ungapped format.
- `__sizeof__` implementation for some
- Dedicated check for sequence length before running the platform-specific code in `pyhmmer.plan7.Pipeline`.
Fixed
- Score system not being set in `pyhmmer.plan7.Builder.build_msa`.
- Alphabet not being checked after the first sequence in `Pipeline` search and scan methods.
v0.4.3 - 2021-07-03
Fixed
- File object wrappers not reporting exceptions raised when seeking on OSX/BSD platforms.
v0.4.2 - 2021-06-20
Added
- `pyhmmer.easel.Randomness` class exposing a deterministic random number generator.
- `pyhmmer.plan7.Builder.randomness` and `pyhmmer.plan7.Pipeline.randomness` attributes exposing the internal random number generator used by each object.
- `pyhmmer.plan7.Hit.best_domain` property mapping to the highest scoring domain of a hit.
- `pyhmmer.plan7.OptimizedProfile.rbv` property exposing match scores.
- `pyhmmer.plan7.Domain.pvalue` and `pyhmmer.plan7.Hit.pvalue` reporting the p-value for a domain or hit bitscore.
Fixed
- Dimensions of the `pyhmmer.plan7.OptimizedProfile.sbv` matrix not being properly set.
v0.4.1 - 2021-06-06
Fixed
- Main buffer not being freed in `MatrixF.__dealloc__` and `MatrixU8.__dealloc__` when created without owner.
Added
- Additional configuration values for `pyhmmer.plan7.Pipeline` as both constructor arguments and mutable properties.
- `consensus`, `consensus_structure` and `offsets` properties to `pyhmmer.plan7.Profile` objects.
Changed
- Make `OptimizedProfile.ssv_filter` check the alphabet of the given sequence.
v0.4.0 - 2021-06-05 - YANKED
Added
- Linear algebra primitives to expose 1D (`Vector`) and 2D (`Matrix`) contiguous buffers containing numerical values to `pyhmmer.easel`.
- Documentation for the `Z` and `domZ` parameters of the `pyhmmer.plan7.Pipeline` constructor.
- `pyhmmer.errors.AlphabetMismatch` exception deriving from `ValueError` to specifically report mismatching Easel alphabets where applicable.
- `scale` and `normalize` methods to `pyhmmer.plan7.HMM` objects.
- Property to access `pyhmmer.plan7.Background` residue frequencies as a `VectorF` object.
- Property to access `pyhmmer.plan7.HMM` mean residue composition as a `VectorF` object.
- Property to access `pyhmmer.plan7.HMM` probabilities and emissions as `MatrixF` objects.
- `ssv_filter` methods to `pyhmmer.plan7.OptimizedProfile` to get the SSV filter score of the profile for a given sequence.
- Several additional properties to access the `pyhmmer.plan7.OptimizedProfile` internals.
Removed
- Unused `report_e` parameter of `pyhmmer.plan7.Pipeline` constructor.
- `pyhmmer.plan7.TopHits.clear` method which could lead to segfault if it was called while a `Hit` is being held.
Changed
- Multithreaded loop in `pyhmmer.hmmer` to reduce memory consumption while still yielding hits in order.
- `pyhmmer.easel.DigitalSequence.sequence` property is now a `VectorU8`.
Fixed
- Type annotations in `pyhmmer.hmmer`.
- Potential double free in `pyhmmer.plan7.HMM.command_line` property setter.
- Minor floating-point precision issues in `pyhmmer.plan7.Builder` constructor.
- Segfault in `TextMSA.digitize` caused by `esl_msa_Copy` not digitizing on-the-fly like `esl_sq_Copy`.
- Exceptions not being raised in some methods of `pyhmmer.plan7.Profile` and `pyhmmer.plan7.TopHits`.
v0.3.1 - 2021-05-08
Added
- `Pipeline.scan_seq` method to query a database of profiles with one or more sequences.
- `transition_probabilities`, `match_emissions`, `insert_emissions` properties to the `HMM` class, providing access to the numerical parameters of the HMM.
- `consensus_structure` and `consensus_accessibility` properties to the `HMM` class to get consensus lines from the source alignment if the HMM was created from a MSA.
- `nseq` and `nseq_effective` properties to the `HMM` class to get the number of training sequences and effective sequences used to build the HMM.
Changed
- `HMM.checksum` is now `None` if the `p7H_CHKSUM` flag is not set.
- `Builder` methods will now record `sys.argv` when creating a HMM.
Fixed
- `HMM.write(..., binary=False)` crashing on HMMs without a consensus line (#5). Fixed upstream in EddyRivasLab/HMMER#236.
- `Pipeline.reset` mishandling the `Z` and `domZ` values if those were detected from the number of targets.
- `pyhmmer.hmmer` functions will not block until all results have been collected anymore when run in multithreaded mode.
v0.3.0 - 2021-03-11
Added
- `easel.MSAFile` to read from a file containing
- `accession`, `author`, `name` and `description` properties to `easel.MSA` objects.
- `plan7.Builder.build_msa` to build a pHMM from a sequence alignment.
- Additional methods to `easel.KeyHash`, allowing to use it as a `dict`/`set` hybrid.
- `Sequence.write` and `MSA.write` methods to format a sequence or an alignment to a file handle.
- `plan7.TopHits.to_msa` method to convert all the top hits of a query against a database into a multiple sequence alignment.
- `easel.MSA.sequences` attribute to access individual sequences of an alignment using the `collections.abc.Sequence` interface.
- `easel.DigitalMSA.textize` method to convert a multiple sequence alignment in digital mode to its text-mode counterpart.
- Read-only `name`, `accession` and `description` properties to `plan7.Profile` showing attributes inherited from the HMM it was configured with.
- `plan7.HMM.consensus` property, allowing to access the consensus sequence of a pHMM.
- `plan7.HMM` equality implementation, using zero tolerance.
- `plan7.Pipeline.search_msa` to query a MSA against a sequence database.
- `easel.Sequence.reverse_complement` method allowing to reverse-complement inplace or to build a copy.
- `errors.AlphabetMismatch` exception for use in cases where an alphabet is expected but not matched by the input.
- `hmmer.nhmmer` function with the same behaviour as `hmmer.phmmer`, except it expects inputs with a DNA alphabet.
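As a reminder of what the reverse-complement operation in this list does, here is a small pure-Python sketch on plain strings. It is not the Easel implementation: the `reverse_complement` function below is a hypothetical stand-alone helper that always builds a copy and only handles the unambiguous DNA letters plus `N`.

```python
# Translation table for the unambiguous DNA letters (assumption: other
# IUPAC ambiguity codes are left untouched by this sketch).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence):
    """Return a new string holding the reverse complement of `sequence`."""
    # Complement every base, then read the strand in the opposite direction.
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```

Applying the operation twice gives back the original sequence, which is a convenient sanity check.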
Fixed
- `plan7.Builder.copy` not copying some parameters correctly, causing `pyhmmer.hmmer.phmmer` to give inconsistent results in multithreaded mode.
- `easel.Bitfield` not properly handling index overflows.
- Documentation not rendering for the `__init__` method of all classes.
Changed
- `plan7.Builder` gap-open and gap-extend probabilities are now set on instantiation and depend on the alphabet type.
- Constructors for `easel.TextMSA` and `easel.DigitalMSA`, which can now be given an iterable of `easel.Sequence` objects to store in the alignment.
Removed
- Unimplemented `easel.SequenceFile.fetch` and `easel.SequenceFile.fetchinto` methods.
v0.2.2 - 2021-03-04
Fixed
- Linking issues on OSX caused by aggressive stripping of intermediate libraries.
- `plan7.Builder` RNG not reseeding between different HMMs.
v0.2.1 - 2021-01-29
Added
- `pyhmmer.plan7.HMM.checksum` property to get the 32-bit checksum of an HMM.
v0.2.0 - 2021-01-21
Added
- `pyhmmer.plan7.Builder` class to handle building a HMM from a sequence.
- `Pipeline.search_seq` to query a sequence against a sequence database.
- `psutil` dependency to detect the most efficient thread count for `hmmsearch` based on the number of physical CPUs.
- `pyhmmer.hmmer.phmmer` function to run a search of query sequences against a sequence database.
Changed
- `Pipeline.search` was renamed to `Pipeline.search_hmm` for disambiguation.
- `libeasel.random` sequences do not require the GIL anymore.
- Public API now has proper signature annotations.
Fixed
- Inaccurate exception messages in `Pipeline.search_hmm`.
- Unneeded RNG reallocation, replaced with re-initialisation where possible.
- `SequenceFile.__next__` not working after being set in digital mode.
- `sequences` argument of `hmmsearch` now only requires a `typing.Collection[DigitalSequence]` instead of a `typing.Collection[Sequence]` (no more `__getitem__` needed).
Removed
- `hits` argument to `Pipeline.search_hmm` to reduce risk of issues with `TopHits` reuse.
- Broken alignment coordinates on `Domain` classes.
v0.1.4 - 2021-01-15
Added
- `DigitalSequence.textize` to convert a digital sequence to a text sequence.
- `DigitalSequence.__init__` method allowing to create a digital sequence from any object implementing the buffer protocol.
- `Alignment.hmm_accession` property to retrieve the accession of the HMM in an alignment.
v0.1.3 - 2021-01-08
Fixed
- Compilation issues in OSX-specific Cython code.
v0.1.2 - 2021-01-07
Fixed
- Required Cython files not being included in source distribution.
v0.1.1 - 2020-12-02
Fixed
- `HMMFile` calling `file.peek` without arguments, causing it to crash when passed some types, e.g. `gzip.GzipFile`.
- `HMMFile` failing to work with PyPy file objects because of a bug with their implementation of `readinto`.
- C/Python file object implementation using `strcpy` instead of `memcpy`, causing issues when null bytes were read.
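
The `file.peek` entry above can be reproduced with the standard library, without PyHMMER. This is a minimal sketch assuming CPython, where `io.BufferedReader.peek` has an optional size argument but `gzip.GzipFile.peek` requires one; the payload is placeholder bytes, not a real HMM file.

```python
import gzip
import io

payload = b"HMMER3/f placeholder bytes"  # illustrative content only

# io.BufferedReader.peek() can be called without arguments.
plain = io.BufferedReader(io.BytesIO(payload))
plain_head = plain.peek()

# gzip.GzipFile.peek(n) requires its argument in CPython.
gz = gzip.GzipFile(fileobj=io.BytesIO(gzip.compress(payload)), mode="rb")
try:
    gz.peek()
    peek_without_args_failed = False
except TypeError:
    peek_without_args_failed = True

# Passing an explicit size works for both kinds of file object.
gz_head = gz.peek(1)
```

Always passing a size is the portable way to call `peek` on an arbitrary file-like object.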
v0.1.0 - 2020-12-01
Initial beta release.
Fixed
- `TextSequence` uses the sequence argument it's given on instantiation.
- Segmentation fault in `Sequence.__eq__` caused by implicit type conversion.
- Segmentation fault on `SequenceFile.read` failure.
- Missing type annotations for the `pyhmmer.easel` module.
v0.1.0-a5 - 2020-11-28
Added
- `Sequence.__len__` magic method so that `len(seq)` returns the number of letters in `seq`.
- Python file-handle support when opening a `pyhmmer.plan7.HMMFile`.
- Context manager protocol to `pyhmmer.easel.SSIWriter`.
- Type annotations for `pyhmmer.easel.SSIWriter`.
- `add_alias` to `pyhmmer.easel.SSIWriter`.
- `write` method to `pyhmmer.plan7.OptimizedProfile` to write an optimized profile in binary format.
- `offsets` property to interact with the disk offsets of a `pyhmmer.plan7.OptimizedProfile` instance.
- `pyhmmer.hmmer.hmmpress` emulating the `hmmpress` binary from HMMER.
- `M` property to `pyhmmer.plan7.HMM` exposing the number of nodes in the model.
Changed
- Bumped vendored Easel to `v0.48`.
- Bumped vendored HMMER to `v3.3.2`.
- `pyhmmer.plan7.HMMFile` will raise an `EOFError` when given an empty file.
- Renamed `length` property to `L` in `pyhmmer.plan7.Background`.
Fixed
- Segmentation fault when `close` method of `pyhmmer.easel.SSIWriter` was called more than once.
- `close` method of `pyhmmer.easel.SSIWriter` not writing the index contents.
v0.1.0-a4 - 2020-11-24
Added
- `MSA`, `TextMSA` and `DigitalMSA` classes representing a multiple sequence alignment to `pyhmmer.easel`.
- Methods and protocol to copy a `Sequence` and an `MSA`.
- `pyhmmer.plan7.OptimizedProfile` wrapping a platform-specific optimized profile.
- `SSIReader` and `SSIWriter` classes interacting with sequence/subsequence indices to `pyhmmer.easel`.
- Exception handler using Python exceptions to report Easel errors.
Changed
- `pyhmmer.hmmsearch` returns an iterator of `TopHits`, with one instance per `HMM` in the input.
- `pyhmmer.hmmsearch` properly raises errors happening in the background threads without deadlock.
- `pyhmmer.plan7.Pipeline` recycles memory between `Pipeline.search` calls.
Fixed
- Missing type annotations for the `pyhmmer.errors` module.
Removed
- Unneeded or private methods from `pyhmmer.plan7`.
v0.1.0-a3 - 2020-11-19
Added
- `TextSequence` and `DigitalSequence` representing a `Sequence` in a given mode.
- E-value properties to `Hit` and `Domain`.
- `TopHits` now stores a reference to the pipeline it was obtained from.
- `Pipeline.Z` and `Pipeline.domZ` properties.
- Experimental pickling support to `Alphabet`.
- Experimental freelist to `Sequence` class to avoid allocation bottlenecks when iterating on a `SequenceFile` without recycling sequence buffers.
Changed
- Made `Sequence` an abstract base class.
- Additional `Pipeline` parameters can be passed as keyword arguments to `pyhmmer.hmmsearch`.
- `SequenceFile.read` can now be configured to skip reading the metadata or the content of a sequence.
Removed
- Redundant `SequenceFile` methods.
Fixed
- `doctest` loader crashing on Python 3.5.
- `TopHits.threshold` segfaulting when being called without prior `TopHits.sort` call.
- Unknown `format` argument to `SequenceFile` constructor not raising the right error.
v0.1.0-a2 - 2020-11-12
Added
- Support for compilation on PowerPC big-endian platforms.
- Type annotations and stub files for Cython modules.
Changed
- `distutils` is now used to compile the package, instead of calling `autotools` and letting HMMER configure itself.
- `Bitfield.count` now allows passing an argument (for compatibility with `collections.abc.Sequence`).
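
To show what compatibility with `collections.abc.Sequence` means for `count`, here is a pure-Python sketch. `ToyBitfield` is a hypothetical stand-in and not the `Bitfield` implementation from PyHMMER; in particular, the default of `True` for the argument is an assumption made for illustration.

```python
from collections.abc import Sequence


class ToyBitfield(Sequence):
    """Hypothetical stand-in for a bitfield, for illustration only."""

    def __init__(self, bits):
        self._bits = [bool(b) for b in bits]

    def __len__(self):
        return len(self._bits)

    def __getitem__(self, index):
        return self._bits[index]

    def count(self, value=True):
        # Sequence.count(value) takes the value to look for; the default
        # (an assumption here) keeps a no-argument call working as well.
        wanted = bool(value)
        return sum(1 for bit in self._bits if bit == wanted)


bits = ToyBitfield([1, 0, 1, 1, 0])
set_bits = bits.count()         # number of bits that are set
unset_bits = bits.count(False)  # number of bits that are not set
```

Code written against the `Sequence` interface can then call `count(value)` on such an object without special-casing it.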
v0.1.0-a1 - 2020-11-10
Initial alpha release (test deployment to PyPI).