Simple Line Mixer

I was going to suggest the same thing, but happened to realize that I have no idea if it really is more efficient today (it was in the past), given all those fancy cache architectures. Maybe some cache prefetch today makes chunked processing more efficient than the per-frame/byte one…

I am using similar algorithms to what’s in that library (also working with float buffers, which is fairly straightforward as long as you are dealing only with PCM data). I just realized my previous post sounded like I was asking how to do format conversions - I actually have that part coded from my other mixer project. That being said, I’ll definitely take a look through the link you posted to see if there is anything useful I could learn from it for optimizing my own code (although I personally shy away from anything GPL, even with those convoluted lawyer-speak “classpath exception” or “lesser” clauses - I don’t like the controls that the designers of these licenses try to place on the developer. I much prefer to give people a “do whatever you want, just don’t come crying to me if you screw up your sh**” type of license). Anyway, what I meant to say is that I need to work format conversions into this system, decide how many formats I want to support, and spend some time testing and debugging.

[quote=“philfrei,post:20,topic:37165”]
How would that work, though, since each line is going to have its own unique set of transformations to apply? Even if you do it at the line level, presenting a pre-defined amount of transformed data to the mixer thread, you are just pushing the same operation to another class - in the end you are still performing transformations on each line in a linear fashion (that’s the only way a CPU can operate). So whether it is done by the mixer after it draws the data from the line, or done by the line before it is drawn by the mixer, it is the same performance-wise, right? Perhaps I’m not understanding the process you are describing (I sometimes need unfamiliar concepts to be “dumbed down” for me). Since cylab mentioned that this is what he was suggesting as well, I’m probably just on the wrong wave-length, so to speak :smiley:

EDIT: see second thoughts on post #29.

I guess I’m thinking of constructing each line using a Factory pattern that implements the appropriate Strategy pattern, such that in the innermost loop of the mixer, all you have to worry about is adding the signals, dividing by the number of signals (or whatever that algorithm is), and applying a global mixer volume to the result.
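
For instance (purely a hypothetical sketch - none of these names exist yet), the strategy boundary might look something like this, so the mixer’s inner loop never has to branch on what processing a line needs:

interface LineTransform
{
    // Hypothetical shape for the Factory/Strategy idea: each line is
    // built by the factory with its transform chain baked in, so the
    // mixer just asks for processed samples.
    // Applies this line's gain/pan/pitch chain to one block of samples.
    void apply(float[] block, int frames);
}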

Suppose you had a wrapper for each AudioInputStream that contained its own buffer, and a collection of these wrappers. I’m writing the following inner loop for the Mixer as I sit here, just to illustrate the idea.

while (playing)
{
    // Fill each wrapper's buffer with the next chunk from its stream.
    // (bytesRead, sumLeft, sumRight, buffer, frameSize and globalVolume
    // are assumed to be declared elsewhere.)
    for (AISWrapper aisw : aisWrappers)
    {
        bytesRead = aisw.read(aisw.getBuffer(), 0, bytesToRead);
    }

    // Walk the buffers one frame at a time (frameSize = 4 bytes for
    // 16-bit stereo: left LSB, left MSB, right LSB, right MSB).
    for (int i = 0; i < bytesToRead; i += frameSize)
    {
        sumLeft = 0;  // reset the accumulators each frame
        sumRight = 0;
        for (AISWrapper aisw : aisWrappers)
        {
            sumLeft += aisw.getBuffer().getAudioVal(i);
            sumRight += aisw.getBuffer().getAudioVal(i + 2);
        }
        sumLeft /= aisWrappers.size();
        sumRight /= aisWrappers.size();
        sumLeft *= globalVolume;
        sumRight *= globalVolume;
        buffer[i] = leastSigByte(sumLeft);       // assumed helpers that
        buffer[i + 1] = mostSigByte(sumLeft);    // split a 16-bit sample
        buffer[i + 2] = leastSigByte(sumRight);
        buffer[i + 3] = mostSigByte(sumRight);
    }

    playbackLine(buffer, 0, bytesToRead);
}

Have you been looking into multicore processing? I’m trying to get a grip on it. This could affect whether there is a difference between the inner loop of the mixer progressing via frames vs tracks.

I’m sorry if I’m not doing a good job of getting my head around the issues in a way to clearly answer the points you raised.

For my JTheremin, I used the code presented here to feed output from Swing components to the audio processing loop on a per-frame basis. I don’t know if you saw it, or if the idea or code is useful to you. (Probably lots of room for improvements here, too.) This tool is geared to a per-frame processing order.

http://www.java-gaming.org/index.php/topic,24605.msg208379.html#msg208379

I would attach a few of these RTESmoothers, on an as-needed basis, to whatever lines are going into the Mixer. The exact configuration might be containable as a “Strategy” implementation.

Here is some code showing how I used these with the JTheremin (mouse controls the volume and rates, slider controls tone).

pitchCtrl = min 0.25, max 64 (used to iterate through a wave table)
toneCtrl = min 0.0, max 1.0 (used to interpolate between a sine wave table and the selected square or sawtooth wave)
volumeCtrl = min 0.0, max 1.0, max delta 0.0005, start/stop from 0 = true.
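
For anyone who doesn’t want to click through, here is a minimal sketch of the kind of smoother those settings describe (the real RTESmoother is in the linked thread; this class and its names are just illustrative):

class SmoothedControl
{
    private final float min, max, maxDelta;
    private volatile float target; // written by the Swing side
    private float current;         // touched only by the audio side

    SmoothedControl(float min, float max, float maxDelta)
    {
        this.min = min;
        this.max = max;
        this.maxDelta = maxDelta;
        this.current = min;
        this.target = min;
    }

    // Called from the GUI thread.
    void set(float value)
    {
        target = Math.max(min, Math.min(max, value));
    }

    // Called once per frame from the audio thread; moves toward the
    // target by at most maxDelta to avoid zipper noise and clicks.
    float tick()
    {
        float diff = target - current;
        if (diff > maxDelta) diff = maxDelta;
        else if (diff < -maxDelta) diff = -maxDelta;
        current += diff;
        return current;
    }
}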

The read() is for a TargetDataLine implementation.

@Override
public int read(byte[] buffer, int idxBuf, int bytesToRead)
{
    if (!running) return 0;

    int framesToRead = bytesToRead / 4; // 4 bytes per frame: 16-bit stereo
    int bufferIdx = idxBuf;

    while (framesToRead-- > 0)
    {
        // Smoothed control values, updated once per frame.
        pitchVal = pitchCtrl.tick();
        av1 = sineTable.get(pitchVal);
        av2 = activeWT.get(pitchVal); // might be square or sawtooth wavetable

        // Crossfade between the sine table and the selected wavetable.
        toneVal = toneCtrl.tick();
        audioVal = av1 * (1 - toneVal) + av2 * toneVal;

        audioVal *= volumeCtrl.tick();
        audioVal *= 32767; // scale [-1, 1] to the 16-bit range

        // Same sample to both channels, 16-bit little endian.
        buffer[bufferIdx++] = (byte)((int)audioVal & 0xff);
        buffer[bufferIdx++] = (byte)((int)audioVal >> 8);
        buffer[bufferIdx++] = (byte)((int)audioVal & 0xff);
        buffer[bufferIdx++] = (byte)((int)audioVal >> 8);
    }
    return (bufferIdx - idxBuf);
}
    

Thanks, I’ll take a look at the code you linked to. I like the basic infrastructure as I understand it, because whether or not it improves performance, it does clean up the code in the mixing loop.

Also, what I’m calling a “buffer” can be any arbitrary size divisible by the sample size. I may very well use the sample size itself if a reasonable number of lines can be mixed on my crappy Netbook without any skipping in the output. I do still need to determine the most efficient way to handle format differences between input and output that result in different sample sizes between the two, but that’s just a matter of trying things out to see what works fastest (I currently have the mixer reading until it has enough data, and telling the lines to store any “leftovers” until the next iteration).
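
For concreteness, the “leftovers” idea is just something like this (class and method names made up):

class LeftoverBuffer
{
    // The line keeps any tail the mixer didn't consume and serves it
    // first on the next read.
    private byte[] leftover = new byte[0];

    // Copies stored leftovers into the start of dest; returns how many
    // bytes were copied, so the caller knows where fresh data goes.
    int drainInto(byte[] dest)
    {
        int n = Math.min(leftover.length, dest.length);
        System.arraycopy(leftover, 0, dest, 0, n);
        leftover = java.util.Arrays.copyOfRange(leftover, n, leftover.length);
        return n;
    }

    // Stores the unconsumed tail of a read until the next iteration.
    void store(byte[] data, int from, int length)
    {
        leftover = java.util.Arrays.copyOfRange(data, from, from + length);
    }
}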

Great to see this project starting - would seem to be highly needed given recent conversations!

Sometime in October I’ll be releasing the rest of the audio libraries from Praxis as GPL w/CPE, which do most of this and a whole lot more. However, I also understand the need for a lightweight and more liberally licensed library. I’m happy to donate any fragments of the Praxis stuff that are useful under BSD, as long as it’s either code I’ve written or third-party code with a compatible license.

The AudioServer code (minus the conversion code from Gervill) I’ve posted elsewhere might be useful (just the general concept of not blocking on the SourceDataLine write()).

I’ve got some interpolation code (cubic spline if I remember rightly) ported from Pure Data (BSD) that should sound better than linear interpolation for pitch changes / sample rate conversion.
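
For reference, a standard 4-point cubic (Catmull-Rom) interpolator - not necessarily the exact formula from the Pure Data port - looks something like this:

// Interpolates between y1 and y2; y0 and y3 are the neighbouring
// samples, frac is the fractional position in [0, 1).
static float cubicInterp(float y0, float y1, float y2, float y3, float frac)
{
    float a = -0.5f * y0 + 1.5f * y1 - 1.5f * y2 + 0.5f * y3;
    float b = y0 - 2.5f * y1 + 2.0f * y2 - 0.5f * y3;
    float c = -0.5f * y0 + 0.5f * y2;
    return ((a * frac + b) * frac + c) * frac + y1;
}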

I’ve also got various bits of stuff to do with gain / panning / mixing that might be useful.

This made me laugh - perhaps biggest understatement of the year! ;D

As I mentioned in the other JavaSound thread, I’ve been programming audio stuff with pure-Java DSP for about 7 years now. Everything you’ve mentioned (gain / pan / pitch) is easily achievable at low latency - I’ve built things that do far more than that! I’d also question how you’ve framed your concern about latency - these operations don’t add latency as such; that is set in the soundcard driver. Each will add some (minimal) CPU usage, which can eventually cause audio break-up if the CPU gets saturated - you can lower CPU usage by increasing latency, but that isn’t the same as saying every operation increases latency.
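
The rough arithmetic behind that point, with an assumed buffer size for illustration:

// Latency comes from the buffer size the driver hands you, not from
// how much DSP runs per sample. 256 frames is just an example value.
double sampleRate = 44100.0;
int bufferFrames = 256;
double latencyMs = bufferFrames / sampleRate * 1000.0; // ~5.8 ms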

If it’s useful to know, I can get sub-5ms latency doing all this and more. Once your code is up I’m happy to have a look through and see if there’s any optimisations I can offer.

You don’t want to be dividing by the number of signals, or every time you add a signal all the others will get quieter. You do however need to limit the signal so it doesn’t go out of range. There are various approaches to that. An auto-gain control (measuring the ongoing absolute maximum signal and dividing all values by that) is simple and possibly suitable in this case.
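
Something like this, for illustration (names made up; assumes float samples nominally in [-1, 1]):

// Track the running absolute peak of the mixed signal and divide by
// it, so gain only drops once the sum actually exceeds full scale.
float runningPeak = 1.0f; // never scale up, only down

float autoGain(float mixedSample)
{
    float abs = Math.abs(mixedSample);
    if (abs > runningPeak) runningPeak = abs;
    return mixedSample / runningPeak;
}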

This probably isn’t worth the effort unless you’re doing a lot of DSP work. Programming multicore audio isn’t easy - you need to use lock-free structures throughout (synchronized or other locks can really mess with your latency).
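
In Java that tends to mean java.util.concurrent atomics and queues rather than synchronized blocks - e.g. a minimal lock-free parameter handoff might look like this (names made up):

import java.util.concurrent.atomic.AtomicLong;

// The control thread publishes with a single atomic write; the audio
// thread reads the latest value without ever taking a lock.
class GainHandoff
{
    private final AtomicLong bits = new AtomicLong(Double.doubleToLongBits(1.0));

    void setGain(double g) { bits.set(Double.doubleToLongBits(g)); }

    double getGain() { return Double.longBitsToDouble(bits.get()); }
}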

Best wishes, Neil

I’ve actually learned quite a bit by starting over from scratch with a new project and a new infrastructure. I’m really not experiencing the latency issues that I’m getting with my other mixer project (which, even with as few as 32 total lines, is complete garbage for anything I’d use it for).

Anyway, I’m going to just scrap that project and turn this one into the base that I’ll use in my 3D Sound System. I can add per-line pitch/pan/gain capabilities without damaging the optimized pipeline, simply by having the mixer check whether those features are requested and, if not, skip the calculations. That way, there would be no performance hit as long as the user used the same format for the output and all inputs, and didn’t change the pitch, pan, or gain. This should give the capability to mix hundreds of lines at a decent speed.
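
For illustration, the check I have in mind is something like this (all names made up):

// Fast path when a line matches the output format and leaves
// pitch/pan/gain at their defaults; the transform math only runs
// when actually requested.
for (Line line : lines)
{
    if (line.hasPitchChange() || line.hasPan() || line.hasGainChange())
    {
        mixTransformed(line); // slower path: resample, pan, scale
    }
    else
    {
        mixRaw(line);         // straight summation, no extra math
    }
}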

[quote=“paulscode,post:27,topic:37165”]
Does it change anything for people not using JavaSound but rather JOAL?

The mixer itself is independent of Java Sound, and could be used with JOAL. There is a class that provides the linkage with Java Sound (chooses a device, opens a line, etc.), and it can be removed easily enough. As for how this mixer will fit into my SoundSystem library, it will be part of a new library plug-in (something along the lines of LibrarySimpleMixer), and will link with Java Sound. The LibraryJOAL plug-in will not be affected.

Just a heads up. My waking thought today is that my suggestion in post #22 is doomed to sub-par performance. (Very likely the more skilled programmers in the forum knew enough to steer clear?)

Grabbing bigger blocks of data at once should be more “data-oriented”. I’m working through this problem via the Theremin project. Nothing like the learning process of making your own mistakes…

Thanks. I haven’t actually done much work on this project since my last post, because I’ve been busy with my N64 emulator for Android project. I’ll cycle back around to this project eventually ;D

Hi

Sorry for the off-topic. I tried to use JArio to play Goldeneye 007, but some required opcodes are not yet supported. Is your own emulator open source? I would like to make a nice demo of Goldeneye 007 with JOGL 2.0 ;D

What exactly do you think is wrong with the approach you suggested in #22? That seems pretty standard from a glance. There’s probably something wrong with how you’re suggesting reading sumLeft / sumRight, but other than that the principle seems ‘sound’. :slight_smile:

Yes, it is open source, but it’s mainly written in native C/C++ using the Android NDK, so it’s not really suitable for porting to Java.

@nsigma, post #32
Probably just confused thinking on my part. Looking at what I suggested more closely, I see that it is NOT the same as what is happening in the Theremin, where I’ve started to have dropouts and latency problems after adding an Echo. In the Theremin I only work on single frames, even when extracting from the WaveTables, instead of extracting blocks. My worry is that the operations are more fragmented than they need to be as a result, rather than “data-oriented.”

But here, in the suggestion in #22, I see that blocks of audio data ARE being used. So the loops are progressing through the arrays in a way that should be reasonably “data-oriented”? I’m just not clear enough on what goes on underneath to know for sure, hence the doubts.

On the left & right summing, I kind of fluffed over the details of how the audio data is represented…if it is streams of floats or PCM. I assume that is what you mean by this section not being correct?

I’ll post a progress report over on the Theremin thread to look at the drop out problems I’ve created for myself, so as not to hijack this thread.

Hey Paul! I hit a point where a little audio mixer would be a big plus, so I started working on one as well. You are surely further along than I am, but there’s stuff you plan to do which I am going to take a pass on, like offering a broad range of format conversions.

The program I wrote today mixes six lines, but still lacks channel and master volumes, and needs some click prevention for the starts & stops. Four lines are for standard signed PCM .wav files (stereo, 16-bit, little-endian). Two are for an Interface I am calling (unless someone has a better suggestion) NormalizedFloatAudioLine. This functions sort of like a TargetDataLine, in that you can read audio data from it, but it delivers an array of floats assumed to be between -1 and 1 rather than bytes. (Nothing in the program or the Interface enforces this, though.)

Maybe there is already something out there like this? It seems like a logical thing for audio.


public interface NormalizedFloatAudioLine
{
    // Fills buffer with up to n samples in the range [-1, 1];
    // returns the number of samples actually written.
    int read(float[] buffer, int n);

    void start();
    void stop();
    boolean isRunning();
}

The point of this is that I’m starting to do more with synthesizing sounds on the fly, and most of the work is done with floats. It seems pointless to make the output a byte stream only to convert back to floats when the mixing happens. Yes?

Because the data is generated at the time of the read, there is no need to hassle with draining lines. Lots of other DataLine & TargetDataLine methods seem unnecessary as well. I don’t know if this is something you’d consider supporting, but I thought I’d put it out there.
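
For example, a trivial on-the-fly generator implementing it might look like this (just a sketch):

// A sine generator that fills the float buffer on demand - nothing
// to drain, nothing to convert from bytes.
class SineLine implements NormalizedFloatAudioLine
{
    private boolean running;
    private double phase;
    private final double phaseInc = 2 * Math.PI * 440.0 / 44100.0; // A440 at 44.1kHz

    public int read(float[] buffer, int n)
    {
        if (!running) return 0;
        for (int i = 0; i < n; i++)
        {
            buffer[i] = (float) Math.sin(phase); // already in [-1, 1]
            phase += phaseInc;
        }
        return n;
    }

    public void start() { running = true; }
    public void stop() { running = false; }
    public boolean isRunning() { return running; }
}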

I just started playing with FM synthesis (frequency modulation), using the WaveTables from the JTheremin code. I wrote my first sirens and ray guns this week! It is a marvelously efficient way to do SF/X, imho, and I just need to get a little mixer running so that I can show them off more easily. :wink:
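
To give a flavor of why it’s so cheap, a bare-bones siren is just a slow sine sweeping the pitch of another sine. This sketch uses Math.sin instead of the WaveTables, and all the numbers are made up:

// Renders a mono siren: a 2 Hz LFO sweeps the carrier between
// 300 and 900 Hz. Assumes 44100 Hz float output.
float[] renderSiren(int numSamples)
{
    float[] out = new float[numSamples];
    double carrierPhase = 0, lfoPhase = 0;
    double sampleRate = 44100;
    double baseFreq = 600, sweepDepth = 300, lfoFreq = 2; // Hz

    for (int i = 0; i < numSamples; i++)
    {
        // The LFO modulates the carrier's frequency each sample.
        double freq = baseFreq + sweepDepth * Math.sin(lfoPhase);
        out[i] = (float) Math.sin(carrierPhase);
        carrierPhase += 2 * Math.PI * freq / sampleRate;
        lfoPhase += 2 * Math.PI * lfoFreq / sampleRate;
    }
    return out;
}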