Add FFT-based computation of the autocorrelation time
Created by: femtobit
This PR improves our estimate of the autocorrelation time in `netket.stats.statistics` by importing code from emcee, which computes it via FFT.
The integrated autocorrelation time $\tau_c$ is computed separately for each chain $c$. To summarize it for the user, `Stats.tau_corr` is changed to contain the average over all chains, and a new field `Stats.tau_corr_max` is added containing the maximum autocorrelation time among all chains (which helps to identify outliers). Using the average $\tau$ over all chains seems like a good choice, as it results in a low-variance estimate (see here for a good discussion).
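For reference, here is a minimal sketch of an emcee-style FFT estimator with Sokal's automatic windowing (roughly the approach the imported code follows), assuming chains stored as an array of shape `(n_chains, n_samples)`; the function names are illustrative and not the actual identifiers in `netket.stats`:

```python
import numpy as np


def _autocorr_function(x):
    """Normalized autocorrelation function of a single 1D chain, via FFT."""
    n = len(x)
    # Zero-pad to at least 2n (next power of two) to avoid circular correlation.
    n_fft = 1 << (2 * n - 1).bit_length()
    f = np.fft.fft(x - np.mean(x), n=n_fft)
    acf = np.fft.ifft(f * np.conjugate(f))[:n].real
    return acf / acf[0]


def autocorr_time(chains, c=5.0):
    """Integrated autocorrelation time tau (statistics convention, tau >= 1)
    per chain, using Sokal's automatic windowing: take the smallest window M
    with M >= c * tau(M)."""
    taus = []
    for x in np.atleast_2d(chains):
        rho = _autocorr_function(np.asarray(x, dtype=float))
        # Cumulative estimate tau(M) = 1 + 2 * sum_{t=1..M} rho(t).
        tau_m = 2.0 * np.cumsum(rho) - 1.0
        windows = np.nonzero(np.arange(len(tau_m)) >= c * tau_m)[0]
        window = windows[0] if len(windows) > 0 else len(tau_m) - 1
        taus.append(tau_m[window])
    taus = np.asarray(taus)
    return taus.mean(), taus.max()  # -> what goes into tau_corr, tau_corr_max
```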
Definition of `tau_corr`
Note that I've encountered two definitions of the autocorrelation time in the literature:
- The statistics convention, where the effective sample size is $N_{\mathrm{eff}} = \frac{N}{\tau}$. Here, $\tau = 1$ is the minimum and is achieved for an uncorrelated chain. (Because samples $\tau = 1$ steps apart are uncorrelated in a correlation-free chain.) See emcee or Vehtari et al. (2021).
- The convention $N_{\mathrm{eff}} = \frac{N}{1 + 2\tau'}$, which is used e.g. by Ambegaokar and Troyer (2009) and might be more common in (parts of) physics. Here, an uncorrelated chain has $\tau' = 0$.
This PR chooses the first convention because a) it is useful to be able to estimate the effective sample size at a glance (in this convention, if $\tau = 2$ we know that we have about half the effective samples compared to the total sample size) and b) the estimate can be a bit noisy for shorter chains and thus fluctuate around $\tau = 1$ or $\tau' = 0$ for mostly uncorrelated chains. In the second convention, this can result in slightly negative values, which are somewhat ugly.
Effect on runtime
Note also that the FFT-based computation slows down `statistics` by a factor of 5x to 20x for sample sizes between 1000 and 100000 on CPU. However, when testing the performance of this PR in VMC applications, the overhead was essentially unnoticeable, as computing the stats is an almost negligible contribution to overall runtime compared to MC sampling. I believe the increased diagnostic value of the FFT estimate and the avoidance of issues such as #1149 (closed), which this PR fixes, are easily worth it.
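As a rough way to reproduce ballpark timings, one could time the illustrative `autocorr_time` sketch from above on white-noise data (this checks only the sketch, not the actual `netket.stats` code):

```python
import timeit

import numpy as np

# Time the illustrative `autocorr_time` sketch from above on uncorrelated data.
for n_samples in (1_000, 10_000, 100_000):
    x = np.random.randn(16, n_samples)  # 16 chains of white noise
    dt = timeit.timeit(lambda: autocorr_time(x), number=10) / 10
    print(f"n_samples = {n_samples:>7d}: {1e3 * dt:.2f} ms per call")
```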
Block-based estimates
This PR also changes the `statistics` code to only use block-based statistics (instead of the chain-based ones) if there is only a single chain (as in autoregressive applications). This simplifies the logic, since in the case of multiple chains the previous block-based estimates could be harder to interpret as they ignored per-chain information, but there might be a statistical impact if someone runs, say, two chains. (Maybe it would be ideal to combine block- and chain-based information for two chains, but that would be more complicated than the current code.) IMHO, the version in this PR is a reasonable tradeoff, and the chain-based stats do give usable results even with two chains. If we decide so, we can always further improve "low-n_chain statistics" in a later PR.