Speed-up calculation of the matrix elements
Created by: femtobit
My initial profiling results seemed to confirm my suspicion that there is some noticeable overhead from the `std::vector` insertion operations in `FindConn`. To get rid of this overhead, I propose to change the way the matrix elements are computed. This is noticeably faster, at least on my machine (compare `FindConn` with `ForEachConn`):
| Model | N  | Method      | Runtime (s) | StdDev (s) | #Runs |
|-------|----|-------------|-------------|------------|-------|
| Ising | 14 | FindConn    | 1.986       | 0.012      | 16    |
| Ising | 14 | ForEachConn | 0.795       | 0.003      | 16    |
| Ising | 16 | FindConn    | 10.175      | 0.168      | 5     |
| Ising | 16 | ForEachConn | 3.461       | 0.020      | 5     |
| Ising | 20 | FindConn    | 246.630     | 2.620      | 2     |
| Ising | 20 | ForEachConn | 84.738      | 0.712      | 2     |
This PR provides a new method `AbstractHamiltonian::ForEachConn` that iterates over all reachable configurations, calling a callback function for each connected configuration. It should be called like this:
https://github.com/netket/netket/blob/77ef5c2124c02115eee730a0a9e050bd583a5d5c/NetKet/Hamiltonian/MatrixWrapper/direct_matrix_wrapper.hpp#L51-L54
For backwards compatibility, `ForEachConn` is implemented for the `AbstractHamiltonian` in terms of `FindConn`:
https://github.com/netket/netket/blob/77ef5c2124c02115eee730a0a9e050bd583a5d5c/NetKet/Hamiltonian/abstract_hamiltonian.hpp#L87-L88
Thus, all existing Hamiltonians will work with `ForEachConn`. To get an actual speedup, subclasses need to override `ForEachConn`. In this PR, this is implemented for `Ising`:
https://github.com/netket/netket/blob/77ef5c2124c02115eee730a0a9e050bd583a5d5c/NetKet/Hamiltonian/ising.hpp#L140-L165
This is a PR on the `lanczos` branch (PR #67) because that branch already contains changes to the direct matrix wrapper relative to `master`.