It can visualize audio as the frequency patterns of the sound (the redder the color, the greater the frequency). To check that it is working, start by making a "shshshshs" sound.
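To make the idea of "frequency patterns" concrete, here is a minimal sketch of how such a picture (often called a spectrogram) can be computed. It is not this project's actual code: the function names, frame size, hop, and sample rate are all assumptions chosen for illustration, and it uses a naive DFT from the Python standard library rather than any audio library.

```python
import cmath
import math

def frame_spectrum(frame):
    """Magnitudes of the DFT of one frame (naive O(n^2), fine for short frames)."""
    n = len(frame)
    # A Hann window reduces leakage between neighbouring frequency bins.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(samples, frame_size=64, hop=32):
    """One spectrum per overlapping frame: rows are time steps, columns are frequency bins."""
    return [frame_spectrum(samples[s:s + frame_size])
            for s in range(0, len(samples) - frame_size + 1, hop)]

# Hypothetical test signal: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]

spec = spectrogram(tone)
# The strongest bin in the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

A visualizer would then map each value in `spec` to a color. A hissing "sh" sound is noise-like, so it should spread energy across many bins instead of producing one sharp peak like the tone above, which makes it an easy first check.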
Using transfer learning, it can also predict your audio activity from the classes it has been trained on.
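The transfer-learning idea can be sketched as follows: keep a pretrained base fixed and train only a small new "head" on a few examples of your own classes. Everything in this sketch is an assumption made for illustration. In particular, `frozen_base` is a hypothetical stand-in for a real pretrained network (it only measures loudness and zero-crossing rate), the head is a simple nearest-centroid classifier rather than a trained neural layer, and the class names `hum` and `whistle` are invented.

```python
import math

def frozen_base(samples):
    """Hypothetical stand-in for a pretrained model's frozen layers.
    Returns a 2-number embedding: mean energy and zero-crossing rate."""
    energy = sum(x * x for x in samples) / len(samples)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return [energy, crossings / len(samples)]

def train_head(examples):
    """Fit the new head: one centroid per class over the frozen embeddings."""
    grouped = {}
    for label, samples in examples:
        grouped.setdefault(label, []).append(frozen_base(samples))
    return {label: [sum(col) / len(col) for col in zip(*embeddings)]
            for label, embeddings in grouped.items()}

def predict(head, samples):
    """Return the class whose centroid is closest to the sample's embedding."""
    embedding = frozen_base(samples)
    return min(head, key=lambda label: math.dist(embedding, head[label]))

def tone(freq_hz, n=400, rate=8000):
    """Synthetic sine tone used here in place of recorded audio."""
    return [math.sin(2 * math.pi * freq_hz * t / rate) for t in range(n)]

# Only the head is trained; frozen_base is never modified.
head = train_head([
    ("hum", tone(200)), ("hum", tone(250)),
    ("whistle", tone(2000)), ("whistle", tone(2200)),
])
label = predict(head, tone(2100))
```

The point of the design is that the expensive part (the base) is reused as-is, so a handful of examples per class is enough to teach the system new sounds.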