Agreed. So it “alters the pattern of the sound wave”, but what does that really mean? I’m thinking this isn’t something that can just be worked out theoretically: it will take actual measurements and comparisons, first to recognize what those pattern differences are, and second to come up with a filter that recreates them and can be applied to the audio data. How realistic can a system be without this component? Well, I’ll just have to see, I suppose. I’ll keep working on the other components and maybe come back to this one later.
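Just to make the idea concrete for future me: once a measured impulse response exists for a given direction, “applying the filter” is basically a convolution with the audio. Everything in this little Python sketch is a placeholder, especially the numbers in the fake impulse response; it’s only the shape of the operation I’m after, not real measured data.

```python
# Minimal sketch of "apply a measured filter to the audio data".
# The impulse response below is made up; a real one would come from
# actual measurements, as discussed above.
import numpy as np
from scipy.signal import fftconvolve

sample_rate = 44100

# Placeholder "measured" impulse response for one ear, one direction.
hrir = np.zeros(256)
hrir[0] = 1.0          # direct sound
hrir[30] = 0.4         # crude stand-ins for pinna reflections
hrir[75] = -0.2

# One second of test audio: a 440 Hz tone.
t = np.arange(sample_rate) / sample_rate
dry = np.sin(2 * np.pi * 440 * t)

# Applying the filter is just convolving the dry signal with the
# impulse response for the desired direction.
positioned = fftconvolve(dry, hrir)[: len(dry)]
```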
To counter my earlier argument, the skull actually isn’t “more or less spherical”. It’s more like an upside-down bowl (a better representation might be a half-sphere with a flat bottom). The ears themselves are positioned toward the lower back, not smack in the middle. So the echo will take longer to return to the ear if a sound is coming from behind than if it is coming from the front (and longer if it is coming from below than from above). Likewise, it will pass through more space and attenuate more if it is coming from behind or below. So even without the pinna component, the brain can probably make the distinction by taking the phase and attenuation differences between the ears and comparing those to the phase and attenuation differences between the initial sound and its echo on each side.
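To sanity-check just the between-ear half of that reasoning, here’s a toy calculation of the delay and attenuation each ear would see from a point source. The head dimensions, ear positions, and inverse-square attenuation are all assumptions I’ve plucked out of the air, and it ignores the skull and its echo entirely; it only shows where the between-ear phase and attenuation differences come from.

```python
# Toy illustration of between-ear delay (phase) and attenuation differences.
# Geometry and attenuation model are made-up simplifications, not measurements.
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s

# Ears placed slightly toward the lower back of the head, as noted above.
# Coordinates are (x, y, z) in metres: x right, y forward, z up.
LEFT_EAR = np.array([-0.09, -0.02, -0.01])
RIGHT_EAR = np.array([0.09, -0.02, -0.01])

def ear_delays_and_gains(source):
    """Return (delay_seconds, gain) for each ear, for a point source.

    Straight-line distance only: no skull occlusion, no echoes, so this
    captures only the crude inter-ear differences.
    """
    source = np.asarray(source, dtype=float)
    results = []
    for ear in (LEFT_EAR, RIGHT_EAR):
        dist = np.linalg.norm(source - ear)
        delay = dist / SPEED_OF_SOUND       # phase difference shows up as a delay
        gain = 1.0 / max(dist, 0.1) ** 2    # inverse-square attenuation
        results.append((delay, gain))
    return results

# A source one metre away, 45 degrees to the front-right.
print(ear_delays_and_gains([0.707, 0.707, 0.0]))
```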
All this is really driving home to me just how complex positional audio is in the real world. As far as we’ve come with 3D graphics and virtual reality, on the audio side we are still practically in the stone age as we continue to simulate positional sounds using the cosine function!! It is about time some advances were made in this area.
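For contrast, this is roughly what I mean by “using the cosine function”: plain constant-power stereo panning, where the source’s angle does nothing but set two channel gains. The exact formula varies from engine to engine, so take this as the textbook version rather than anyone’s actual implementation.

```python
# Constant-power stereo panning: the source angle just sets left/right gains.
import numpy as np

def pan_gains(azimuth_radians):
    """Azimuth in [-pi/2, pi/2], from full left to full right."""
    pan = (azimuth_radians + np.pi / 2) / 2   # map azimuth to [0, pi/2]
    return np.cos(pan), np.sin(pan)           # (left gain, right gain)

left, right = pan_gains(np.pi / 4)  # source 45 degrees to the right
```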