What do end-to-end speech models learn about speaker, language and channel information? A layer-wise and neuron-level analysis