galumph issues
https://git.embl.de/grp-svergun/galumph/-/issues
2020-05-27T04:10:51Z

https://git.embl.de/grp-svergun/galumph/-/issues/58
Tests are very flaky
2020-05-27T04:10:51Z
Chris Kerr

e.g.
* [#38212](https://git.embl.de/grp-svergun/galumph/-/jobs/38212)
* [#38211](https://git.embl.de/grp-svergun/galumph/-/jobs/38211)
* [#38204](https://git.embl.de/grp-svergun/galumph/-/jobs/38204)
* [#38174](https://git.embl.de/grp-svergun/galumph/-/jobs/38174)

https://git.embl.de/grp-svergun/galumph/-/issues/57
Lint OpenCL code with cppcheck
2020-05-24T16:11:27Z
Chris Kerr

https://git.embl.de/grp-svergun/galumph/-/issues/56
Check docstrings with flake8-docstrings
2020-05-24T16:11:09Z
Chris Kerr

https://git.embl.de/grp-svergun/galumph/-/issues/50
Add 'simple' versions of kernels with no explicit caching
2019-03-29T20:36:02Z
Chris Kerr

To test:
* Does the explicit caching actually make any difference to speed?
* Is the explicit caching implemented correctly?

https://git.embl.de/grp-svergun/galumph/-/issues/47
Use pyopencl.cltypes
2019-03-29T21:26:48Z
Chris Kerr

The cltypes module has the numpy dtypes corresponding to each OpenCL C type:
https://documen.tician.de/pyopencl/types.html

https://git.embl.de/grp-svergun/galumph/-/issues/46
Handle endianness mismatch between host and device
2019-03-17T07:21:43Z
Chris Kerr

The OpenCL library does not perform any automatic endianness conversion if the device endianness does not match the host.
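
One way to handle the mismatch is to serialize host data in the device's byte order before uploading it. The following is a minimal sketch using only the Python standard library. The function names are hypothetical and not part of galumph or PyOpenCL, and the `device_is_little_endian` flag is assumed to come from the device's `CL_DEVICE_ENDIAN_LITTLE` query.

```python
import struct
import sys


def needs_byte_swap(device_is_little_endian):
    """Return True if the host and the device disagree on byte order."""
    host_is_little_endian = sys.byteorder == "little"
    return host_is_little_endian != device_is_little_endian


def pack_floats_for_device(values, device_is_little_endian):
    """Pack values as 32-bit IEEE 754 floats in the device's byte order."""
    # In struct format strings '<' means little-endian and '>' big-endian.
    # Both select standard sizes, so 'f' is always exactly 4 bytes.
    order = "<" if device_is_little_endian else ">"
    return struct.pack(f"{order}{len(values)}f", *values)


# The same two values serialized for a little-endian and a big-endian device.
little = pack_floats_for_device([1.0, 2.5], device_is_little_endian=True)
big = pack_floats_for_device([1.0, 2.5], device_is_little_endian=False)
print(little.hex(), big.hex())
```

Packing explicitly in the target order avoids depending on the host's native layout. Data read back from the device would need the matching `struct.unpack` call with the same order prefix.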
See this issue on PyOpenCL: https://github.com/inducer/pyopencl/issues/282

https://git.embl.de/grp-svergun/galumph/-/issues/45
Pass S and form factor arrays as __constant
2019-02-28T19:09:04Z
Chris Kerr

These arrays should never be written by the CL kernel, only by the host. Therefore they can be `__constant` instead of `__global const`.

https://git.embl.de/grp-svergun/galumph/-/issues/44
Define local arrays statically if the relevant macros are defined
2019-02-28T19:00:50Z
Chris Kerr

Arrays in the `__local` address space can be defined in the source code if the size is known at compile time. When the macros `LMAX`, `NSWORK`, and so on are defined, the array sizes can be calculated. This should hopefully avoid tedious argument binding code in the C bindings (#14).

Add to ATSAS

https://git.embl.de/grp-svergun/galumph/-/issues/43
Version of alm_add_atom using constant form factors
2019-02-28T18:47:25Z
Chris Kerr

Allow using a single value rather than a function of S. This is equivalent to assuming that the atom is a single point.

Add to ATSAS

https://git.embl.de/grp-svergun/galumph/-/issues/42
Compiling the OpenCL code on demand gives flaky variable test times
2019-02-17T21:20:29Z
Chris Kerr

Hypothesis gives lots of random errors about flaky tests which fail the `deadline` check on the first run but not on the second. This is almost certainly due to the time taken to compile the OpenCL program (for the second run, the compiled program is fetched from the cache).
* [ ] Stick to a single set of values for `LMAX`, `NSWORK`, `NATWORK` etc., rather than drawing these from a hypothesis strategy
* [ ] Have a pytest fixture that compiles the program once, before the tests are run, to ensure that it is in the cache.

https://git.embl.de/grp-svergun/galumph/-/issues/41
Use pytest_generate_tests_for_pyopencl to run tests on all installed opencl platforms
2019-02-17T15:23:30Z
Chris Kerr

e.g. both pocl and the GPU driver
https://documen.tician.de/pyopencl/tools.html?highlight=environment#testing

https://git.embl.de/grp-svergun/galumph/-/issues/39
Remove dependency on pyopencl-complex.h
2019-03-18T21:21:28Z
Chris Kerr

To use the CL source files from C or Fortran (#14) independently of PyOpenCL, we will need to replace the PyOpenCL complex functions with a standalone implementation. Alternatively, given PyOpenCL's permissive licence, we could just copy the code into the galumph source.

Add to ATSAS

https://git.embl.de/grp-svergun/galumph/-/issues/36
Rename 'rotate' and 'zshift' to 'gyre' and 'gymble'
2018-03-19T07:27:49Z
Chris Kerr

... to stick with the 'Jabberwocky' theme ;)

https://git.embl.de/grp-svergun/galumph/-/issues/34
Improve z-shift kernel
2019-02-28T19:00:50Z
Chris Kerr

The z-shift kernel currently does a sequential sum where it has a data race on writing data into the BLM array. Here are some options for resolving this:
* Use a parallel sum to add up the terms going into BLM
* Load the whole `dlmkp` M slice into local memory at once - this can only be done for relatively small LMAX (approx < 30 depending on NSWORK).
* Slightly unpack the `dlmkp` array so that all k values are present in each slice rather than just K<=L

https://git.embl.de/grp-svergun/galumph/-/issues/33
Optimized spherical Bessel representation for dummy atoms on a regular grid
2018-03-16T21:36:58Z
Chris Kerr

In dummy atom models where the atoms are on a regular lattice, the squared distance to the origin is always an integer multiple of the (squared) lattice vector. If a non-uniform $`s`$ axis is used such that $`s^2`$ is linearly spaced, the $`r s`$ values of all atoms will line up so that many values are shared between atoms and only need to be calculated once.

https://git.embl.de/grp-svergun/galumph/-/issues/32
Derivative backpropagating versions of the rotation and translation functions
2018-03-16T21:31:07Z
Chris Kerr

The functions used for rotation and translation have analytical derivatives. This means that it is in principle possible to go from a measured scattering curve to the partial derivatives of the chi-squared with respect to the various rotation and shift angles. These can then be passed to a gradient descent solver to find a local minimum. An algorithm like basin-hopping global optimization can be used to search for the global minimum.

https://git.embl.de/grp-svergun/galumph/-/issues/30
Reorganize kernels to reduce busy waiting
2019-02-28T19:00:50Z
Chris Kerr

Many of the array calculations are triangular or even pyramidal because of symmetries and selection rules. When an OpenCL work group spans one of the array dimensions, this means that half of the execution time is spent busy waiting for other workers to complete a calculation which is not necessary for this worker.
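
The "half" figure can be checked with a small counting sketch. It assumes, purely for illustration, a work group of `lmax + 1` items spanning the m axis while the kernel steps through l = 0..lmax, with only the items satisfying m <= l having work to do. This layout and the function name are assumptions, not taken from the galumph kernels.

```python
def triangular_utilization(lmax):
    """Fraction of work-item slots that do useful work in a triangular loop."""
    group_size = lmax + 1  # one work item per m value, 0..lmax
    steps = lmax + 1       # one step per ell value, 0..lmax
    # At step ell only the ell + 1 items with m <= ell are active;
    # the remaining items idle until the step is finished.
    useful = sum(ell + 1 for ell in range(steps))
    return useful / (group_size * steps)


for lmax in (3, 15, 63):
    print(lmax, triangular_utilization(lmax))
```

The count has the closed form (lmax + 2) / (2 (lmax + 1)), which tends to one half as `lmax` grows, so under these assumptions nearly half of the slots are idle for large `lmax`.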
It should in theory be possible to 'fold over' the top corner of a triangle to make a more rectangular execution pattern. This wouldn't work where there is a parallel scan calculation along the work group axis, but most of the kernels could benefit.

https://git.embl.de/grp-svergun/galumph/-/issues/29
More packed representation of the Wigner small d rotation matrix
2019-02-28T19:00:50Z
Chris Kerr

This matrix has an additional symmetry which I didn't notice when I first wrote the code.
```math
d^{j}_{m', m}(\beta) = (-1)^{m'-m} d^{j}_{m, m'}(\beta) = d^{j}_{-m, -m'}(\beta)
```
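
The relation can be checked numerically, reading the sign factor as $`(-1)^{m'-m}`$. The sketch below evaluates the matrix elements with Wigner's explicit sum formula for integer $`j`$, in the convention where $`d^{1}_{1,0}(\beta) = -\sin\beta / \sqrt{2}`$. The function names are hypothetical, and the convention used inside galumph may differ, so treat this as an illustration rather than the project's implementation.

```python
from math import cos, factorial, sin, sqrt


def wigner_small_d(j, mp, m, beta):
    """Element d^j_{mp,m}(beta) for integer j, from Wigner's sum formula."""
    prefactor = sqrt(
        factorial(j + mp) * factorial(j - mp)
        * factorial(j + m) * factorial(j - m)
    )
    cos_half, sin_half = cos(beta / 2), sin(beta / 2)
    total = 0.0
    # k takes every value that keeps all four factorial arguments >= 0.
    for k in range(max(0, m - mp), min(j + m, j - mp) + 1):
        denominator = (
            factorial(j + m - k) * factorial(k)
            * factorial(mp - m + k) * factorial(j - mp - k)
        )
        total += (
            (-1) ** (mp - m + k)
            * cos_half ** (2 * j + m - mp - 2 * k)
            * sin_half ** (mp - m + 2 * k)
            / denominator
        )
    return prefactor * total


def max_symmetry_error(j, beta):
    """Largest deviation from both symmetry relations over all (mp, m)."""
    worst = 0.0
    for mp in range(-j, j + 1):
        for m in range(-j, j + 1):
            d = wigner_small_d(j, mp, m, beta)
            swapped = (-1) ** abs(mp - m) * wigner_small_d(j, m, mp, beta)
            negated = wigner_small_d(j, -m, -mp, beta)
            worst = max(worst, abs(d - swapped), abs(d - negated))
    return worst


print(max_symmetry_error(4, 0.7))
```

The printed value should be at the level of floating-point rounding, which is what makes it safe to store only one representative of each symmetry-related group of elements.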
This allows omitting elements which can be obtained from the symmetry relation.

https://git.embl.de/grp-svergun/galumph/-/issues/28
Test algorithms using sympy
2018-03-14T17:57:56Z
Chris Kerr

Add a test script which runs the implemented algorithms in symbolic algebra and checks that they give the exact correct answer when there are no rounding errors.

https://git.embl.de/grp-svergun/galumph/-/issues/27
Consistent variable names
2019-02-28T19:00:50Z
Chris Kerr

e.g. uppercase vs lowercase, J vs L
Also choose names that do not conflict, e.g. `s` for the summation index in the Wigner matrix vs `s` for the scattering vector.