Improve docstrings for `ARNN*.features`
Created by: wdphy16
In the autoregressively masked linear and conv layers, `features` is a dimension separate from the input size, and it is also called 'channels'. In most models, however, it is called 'feature density'.
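
To illustrate the distinction between input size and `features`, here is a minimal pure-Python sketch of a masked linear map. It is not the layers' actual implementation: `masked_linear`, the kernel layout, and the `exclusive` flag are hypothetical names chosen for this example. The point is only that the output has shape `(size, out_features)`, so the features axis is independent of the number of sites.

```python
import random


def masked_linear(x, kernel, exclusive):
    """Apply an autoregressively masked linear map (illustrative only).

    x:      nested list of shape (size, in_features)
    kernel: nested list of shape (size, in_features, size, out_features),
            indexed as kernel[j][a][i][b] for input site j, input feature a,
            output site i, output feature b
    Output site i only sees input sites j < i (exclusive) or j <= i.
    """
    size = len(x)
    in_features = len(x[0])
    out_features = len(kernel[0][0][0])
    y = [[0.0] * out_features for _ in range(size)]
    for i in range(size):
        last = i if exclusive else i + 1  # the autoregressive mask
        for j in range(last):
            for a in range(in_features):
                for b in range(out_features):
                    y[i][b] += x[j][a] * kernel[j][a][i][b]
    return y


# Assumed toy dimensions: 4 sites, 1 input feature, 3 output features.
size, in_features, out_features = 4, 1, 3
rng = random.Random(0)
x = [[rng.choice([-1.0, 1.0]) for _ in range(in_features)] for _ in range(size)]
kernel = [
    [
        [[rng.gauss(0, 1) for _ in range(out_features)] for _ in range(size)]
        for _ in range(in_features)
    ]
    for _ in range(size)
]

y = masked_linear(x, kernel, exclusive=True)
print(len(y), len(y[0]))  # number of sites, number of output features
```

Changing `out_features` changes only the trailing axis of `y`; the number of sites stays fixed by the input.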