- Allow skipping some sequences in `gecco embed`; this is used in `gecco-training-data` to use the same negative sequences to embed different MIBIG versions.
- Support for several input data tables in
- Support for alternative model in
- `gecco cv` command to perform cross-validation from the CLI.
- `gecco tune` command to tune the hyperparameters using several cross-validation rounds.
- Unit tests framework and documentation tests where applicable.
- Fast loading of CLI using delegate imports mechanism of the
- `gecco run` now performs annotation with different HMMs in parallel (1 thread per HMM).
- Greatly reorganised and improved code in
`gecco.crf.preprocessing` as the code is only used within the
`ClusterCRF` require a weight column in the input data.
- Renamed method arguments with more explicit variants in
- Reduced coupling of functions in
- TIGRFAM is now downloaded in merged version, which increases download speed when building a wheel.
- Properly relabel domain names after annotation for all databases (instead of only Pfam previously).
- `gecco embed` (slowly) concatenating dataframe rows within a training loop.
- `rev_i_Evalue` statistic not being computed using vectorized functions.
- Improved performance of CRF features extraction by avoiding