morndewey
To explain @njgphotography's point better:
Sound is a physical thing, akin to a rock, or a leaf, or water. The song on your phone (the one that takes up 3.5 MB) is information. Information is an abstract thing; it is not like a rock, but rather like the lyrics to a song or the steps to cook a chicken.
You cannot share a rock over the internet, and neither can you share sound in that manner. That's why information about those physical things is captured and then shared: the rock is 3 mm in diameter, it looks reddish, etc. The sound is described by the audio file you have; think of Stairway_to_heaven.mp3 as a music sheet.
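To make the "description, not the thing" idea concrete, here is a minimal Python sketch that describes a short tone as a list of numbers. The function name `describe_tone`, the 8000 samples-per-second rate, and the 440 Hz pitch are assumptions picked for illustration. A real MP3 is also compressed, so its bytes are not laid out this simply, but the principle is the same: the file holds numbers about the sound, not the sound.

```python
import math
import struct

SAMPLE_RATE = 8000  # samples per second (assumed value for illustration)

def describe_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Describe a pure tone as a list of signed 16-bit integers (hypothetical helper)."""
    n = round(duration_s * sample_rate)
    return [
        round(32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    ]

# 10 milliseconds of a 440 Hz tone, written down as plain numbers.
samples = describe_tone(440, 0.01)

# Packed as little-endian 16-bit values: this is pure data, and data is
# what can travel over Bluetooth, USB, or the internet.
data = struct.pack("<%dh" % len(samples), *samples)
```

Nothing in `data` makes any noise on its own; it is only the recipe.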
Every audio player contains a component called a DAC (digital-to-analog converter) that has to read the description and faithfully (to the best of its capability) reproduce the physical sound. Like a chef reading a recipe to create an actual roast chicken, or a songstress singing the lyrics out loud, something has to turn information about the thing into the thing itself.
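As a rough sketch of what the DAC does, the Python below maps each stored number to a voltage level. The function name `dac`, the 16-bit sample size, and the 1.0 V reference are assumptions for illustration only; a real DAC does this in hardware and also smooths the steps between samples.

```python
def dac(samples, bits=16, v_ref=1.0):
    """Map signed integer samples to voltages in roughly [-v_ref, +v_ref].

    Hypothetical model: a real DAC is a circuit, not a function.
    """
    full_scale = 2 ** (bits - 1)  # 32768 for 16-bit samples
    return [v_ref * s / full_scale for s in samples]

# Four example sample values: silence, half scale, and the two extremes.
voltages = dac([0, 16384, -32768, 32767])
```

The input is information (integers); the output stands in for a physical quantity (volts) that can actually move a speaker cone.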
That's what this device does. It receives the information from your phone via the Bluetooth connection, reads it, then re-creates the song (with a very talented miniature music band inside) in the form of an actual analog audio signal. That signal is then sent via the output cable to the speakers, which amplify it. The reason this cable cannot be USB is that USB was never designed to carry the physical stuff, only data.