Audio: Write generated PCM to file

The following code will allow you to write a wav file from procedurally generated audio data.

For example, maybe you have written a mono synthesizer or are mixing sounds and want to save the result, as opposed to playing it back. The function that Java provides for writing audio data to file is a method belonging to javax.sound.sampled.AudioSystem:


static int write(AudioInputStream stream, AudioFileFormat.Type fileType, File out)

Writes a stream of bytes representing an audio file of the specified file type to the external file provided.

A tricky aspect is creating an AudioInputStream from procedurally generated audio data (assumed to be PCM values encoded as signed, normalized floats). The AudioInputStream constructors take either a TargetDataLine or an InputStream. Rather than streaming from a line, this implementation subclasses InputStream so that it outputs a predefined number of frames. The inner class PCMInputStream can be extracted from the example code and modified. Feel free to adapt it to stereo or other audio formats, or to accept the audio source function as a parameter instead of having it hard-coded.

The example code is in the “get it to work” stage, with extra comments. It creates a 2-second-long note, pitched at E above middle C (330 Hz). As it starts and stops abruptly, there will probably be a click at the beginning and end.


import java.io.File;
import java.io.IOException;
import java.io.InputStream;

import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

public class DevAudioWrite
{
	PCMInputStream ps;

	public static void main(String[] args)
	{
		DevAudioWrite dw = new DevAudioWrite();
		dw.ps = dw.new PCMInputStream();
		dw.ps.framesToFetch = 44100 * 2; // two seconds at 44100 fps
		
		// MONO wav format used in example
		AudioFormat audioFormat = new AudioFormat(
			AudioFormat.Encoding.PCM_SIGNED, 
			44100, 16, 1, 2, 44100, false);

		// params: InputStream, AudioFormat, length in frames
		AudioInputStream ais = new AudioInputStream(dw.ps, audioFormat, dw.ps.framesToFetch);
		
		try {
			System.out.println("ais.format()=" + ais.getFormat());
			System.out.println("ais.frameLength()=" + ais.getFrameLength());	
			System.out.println("ais.available()=" + ais.available());
		} catch (IOException e1) {
			e1.printStackTrace();
		}			
		
		File file = new File("test.wav");
		System.out.println("file is at following location:");
		System.out.println("" + file.getAbsolutePath());
		
		try {
			
			AudioSystem.write(ais, AudioFileFormat.Type.WAVE, file);

			System.out.println("finished AIS, available() = " + ais.available());

		} catch (IOException e) {
			e.printStackTrace();
		}
	}

	class PCMInputStream extends InputStream
	{
		private int cursor, idx;
		private int[] frameBytes = new int[2];
		int framesToFetch;
		
		@Override
		public int read() throws IOException
		{
			while(available() > 0)
			{
				idx &= 1; 
				if (idx == 0) // set up next frame's worth of data
				{
					cursor++; // count elapsing frames
					
					// Your audio data source call goes here.
					float audioVal = audioGet(cursor);
					
					// convert signed, normalized float to bytes:
					audioVal *= 32767; // scale value to 16 bits
					frameBytes[0] = (char)audioVal; // little byte
					frameBytes[1] = (char)((int)audioVal >> 8 ); // big byte
				}
				return frameBytes[idx++]; // but only return one of the bytes per read()
			}
			return -1;
		}	

		// Following is a substitute for your audio data source. Can be
		// an external audio call instead.
		// Input: if your function needs no inputs, eliminate the input param
		// Output: must be normalized signed float, one track of one frame.
		private float audioGet(long ii)
		{
			int frequency = 330;
			return (float)Math.sin((ii * frequency) / 44100f * 2 * Math.PI);
		}
		
		@Override 
		public int available()
		{
			// Took a while to get this! 
			// NOTE: not concurrency safe.
			// 1st half of sum: 2 reads for every frame not yet started
			// 2nd half of sum: the high byte of the current frame, if only
			// its low byte has been read so far (idx == 1)
			return 2 * (framesToFetch - cursor) + (idx == 1 ? 1 : 0);
		}
		
		@Override
		public void reset()
		{
			cursor = 0;
			idx = 0;
		}
	}
}
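
The click mentioned above comes from the note starting and stopping at full amplitude. One way to soften it is a short linear fade at each end. The following is only a sketch: the class name FadeEnvelope, the method gain(), and the 10 ms (441 frame) ramp are assumptions for illustration, not part of the original code.

```java
/** Hypothetical helper: linear fade-in and fade-out gain for a fixed-length note. */
class FadeEnvelope {

    /**
     * Returns a gain in [0, 1] for a zero-based frame index.
     * totalFrames is the length of the note; fadeFrames is the ramp length
     * applied at both the start and the end.
     */
    static float gain(long frame, long totalFrames, long fadeFrames) {
        if (fadeFrames <= 0) {
            return 1f; // no ramp requested
        }
        long fromEnd = totalFrames - 1 - frame;
        long nearestEdge = Math.min(frame, fromEnd); // distance to the closer end
        if (nearestEdge >= fadeFrames) {
            return 1f; // in the middle of the note: full volume
        }
        return Math.max(0f, (float) nearestEdge / fadeFrames);
    }

    public static void main(String[] args) {
        long total = 44100 * 2; // two seconds, as in the example above
        long fade = 441;        // assumed ramp of about 10 ms at 44100 fps
        System.out.println("gain at first frame: " + gain(0, total, fade));
        System.out.println("gain mid-note:       " + gain(total / 2, total, fade));
        System.out.println("gain at last frame:  " + gain(total - 1, total, fade));
    }
}
```

In the example above, where cursor counts from 1, the call in read() could become audioGet(cursor) * FadeEnvelope.gain(cursor - 1, framesToFetch, 441).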

A nice, easy-to-read bit of code! Although I know what this bit does


               // convert signed, normalized float to bytes:
               audioVal *= 32767; // scale value to 16 bits
               frameBytes[0] = (char)audioVal; // little byte
               frameBytes[1] = (char)((int)audioVal >> 8 ); // big byte

I would have written it as below, for clarity


               // convert signed, normalized float to bytes:

               short pcmValue = audioVal * 32767; // scale audioValue to 16bits
               // convert short to byte array
               frameBytes[0] = (byte)pcmValue;
               frameBytes[1] = (byte)(pcmValue >> 8);

Thanks for the compliment! :)
I put a bunch of time into revising the code so it would be a readable expression of the algorithm. Nice to have this recognized. And happy to have improvements posted!

InputStream has important requirements. read() returns an int that must be in the range 0…255, with -1 reserved to signal end-of-stream. A (byte) cast gives values from -128 to 127, so every byte with its top bit set comes out negative, and in particular the byte 0xFF comes out as -1, which prematurely signals the end of the InputStream. A (char) cast is unsigned, so it never produces -1. Strictly speaking, though, char is 16 bits wide (0 to 65535), so only a mask such as & 0xFF guarantees 0…255. As far as I can tell, the (char) version works in practice because the default InputStream.read(byte[], int, int) keeps only the low eight bits of each value.

What happens at the bit level can be illustrated with the audio byte [1000 0011]. When that byte is placed into an int, Java sign-extends it: the upper 24 bits are filled with copies of the sign bit, giving [1111 1111 1111 1111 1111 1111 1000 0011], or -125, instead of the intended 131. The low eight bits are unchanged, so the real damage comes from the byte [1111 1111], which widens to -1 and cuts the stream short.

In fact, I made this very mistake, trying to use (byte) cast and getting truncated noise in my test files. Being thrown by this was part of what made me miss my “What-I-did-today” deadline.
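
To make the difference between the casts concrete, here is a small sketch that shows what read() would hand back for the low byte of a sample under each approach. The class and method names (CastComparison, viaByteCast, viaCharCast, viaMask) are made up for this illustration.

```java
/** Hypothetical demo: three ways to turn a 16-bit sample into a read() return value. */
class CastComparison {

    // (byte) cast: signed result, -128..127, so 0xFF collides with the -1 end marker
    static int viaByteCast(int pcm) {
        return (byte) pcm;
    }

    // (char) cast: unsigned but 16 bits wide, so the result can be as large as 65535
    static int viaCharCast(int pcm) {
        return (char) pcm;
    }

    // mask: keeps only the low eight bits, always 0..255
    static int viaMask(int pcm) {
        return pcm & 0xFF;
    }

    public static void main(String[] args) {
        int[] samples = { 255, -125 };
        for (int pcm : samples) {
            System.out.println("sample " + pcm
                    + "  (byte): " + viaByteCast(pcm)
                    + "  (char): " + viaCharCast(pcm)
                    + "  & 0xFF: " + viaMask(pcm));
        }
        // For the sample 255 the (byte) cast yields -1, which read() would
        // report as end-of-stream; the mask yields 255.
    }
}
```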

Also, I think in order to compile, you need the first line to be as follows:

    short pcmValue = (short)(audioVal * 32767);

Maybe the following is a step in the right direction:


    int pcmValue = (int)(audioVal * Short.MAX_VALUE); // scale value to signed 16 bits
    frameBytes[0] = pcmValue & 0xFF;  // "little" byte, as an int in 0..255
    frameBytes[1] = (pcmValue >> 8) & 0xFF;  // "big" byte, as an int in 0..255

The >> operator does work with a short: Java promotes the operand to int before shifting, so pcmValue could be declared as a short and the result would be the same. I was a little shaky on the specifics. Hence the overly long debugging session!
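
For what it's worth, the short-based version can be sketched as follows. The class name Pcm16 and its methods are hypothetical, and the clamping to [-1, 1] is an added assumption (the code above trusts the source to stay in range).

```java
/** Hypothetical helper: normalized float to 16-bit PCM, split into two bytes. */
class Pcm16 {

    /** Clamps to [-1, 1] (an added safeguard), then scales to a signed 16-bit sample. */
    static short toPcm16(float audioVal) {
        float clamped = Math.max(-1f, Math.min(1f, audioVal));
        return (short) (clamped * Short.MAX_VALUE);
    }

    /** "Little" byte as an int in 0..255; the short is promoted to int before masking. */
    static int lowByte(short pcm) {
        return pcm & 0xFF;
    }

    /** "Big" byte as an int in 0..255; the short is promoted to int before shifting. */
    static int highByte(short pcm) {
        return (pcm >> 8) & 0xFF;
    }

    public static void main(String[] args) {
        short pcm = toPcm16(-1f);
        System.out.println("pcm=" + pcm + " low=" + lowByte(pcm) + " high=" + highByte(pcm));
    }
}
```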

The following is one of two InputStreams that I am now using with my audio code, and with the new Shepard Chord builder, when saving data to wav files. The other is specifically tied to the output of my mixer, so won’t be of as much use for people. But for this one, it allows you to load an array and save it as a playable stereo wav file. Or more precisely, it allows you to use it as an input parameter when creating an AudioInputStream which in turn can be used for writing via AudioSystem.write().

The expected data array should hold interleaved stereo floats (left track, then right track), one pair per frame, with each float in the range -1 to 1.

import java.io.IOException;
import java.io.InputStream;

public class StereoPcmInputStream extends InputStream
{
	private float[] dataFrames;
	private int framesCounter;
	private int cursor;
	private int[] pcmOut = new int[2];
	private int[] frameBytes = new int[4];
	private int idx;
	
	private int framesToRead;
	
	public void setDataFrames(float[] dataFrames)
	{
		this.dataFrames = dataFrames;
		framesToRead = dataFrames.length / 2;
	}
	
	@Override
	public int read() throws IOException
	{
		while(available() > 0)
		{
			idx &= 3; 
			if (idx == 0) // set up next frame's worth of data
			{
				framesCounter++; // count elapsing frames
				
				// scale to 16 bits
				pcmOut[0] = (int)(dataFrames[cursor++] * Short.MAX_VALUE);
				pcmOut[1] = (int)(dataFrames[cursor++] * Short.MAX_VALUE);
				
				// output as unsigned bytes, in range [0..255]
				// (little-endian: low byte first, then high byte)
				frameBytes[0] = pcmOut[0] & 0xFF;
				frameBytes[1] = (pcmOut[0] >> 8) & 0xFF;
				frameBytes[2] = pcmOut[1] & 0xFF;
				frameBytes[3] = (pcmOut[1] >> 8) & 0xFF;
				
			}
			return frameBytes[idx++]; 
		}
		return -1;
	}

	@Override 
	public int available()
	{
		// NOTE: not concurrency safe.
		// 1st half of sum: 4 reads for every frame not yet started
		// 2nd half of sum: the bytes of the current frame that remain to be read
		return 4 * (framesToRead - framesCounter) + ((4 - idx) % 4);
	}

	@Override
	public void reset()
	{
		cursor = 0;
		framesCounter = 0;
		idx = 0;
	}
	
	@Override
	public void close()
	{
		// nothing to close, actually
	}
}
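
If the whole array fits comfortably in memory, a different approach from the custom InputStream above is to convert everything to bytes up front and wrap the result in a ByteArrayInputStream. The sketch below assumes 44100 fps, 16-bit, little-endian stereo; the class name StereoArrayWriter and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;

import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

/** Hypothetical alternative: pre-convert the float array instead of subclassing InputStream. */
class StereoArrayWriter {

    static final float SAMPLE_RATE = 44100f; // assumed sample rate

    /** Converts normalized floats to 16-bit little-endian PCM bytes. */
    static byte[] toPcm16LittleEndian(float[] samples) {
        byte[] out = new byte[samples.length * 2];
        for (int i = 0; i < samples.length; i++) {
            int pcm = (int) (samples[i] * Short.MAX_VALUE);
            // A (byte) cast is fine here: the values go into a byte[],
            // not through the int-returning read().
            out[2 * i] = (byte) pcm;            // low byte
            out[2 * i + 1] = (byte) (pcm >> 8); // high byte
        }
        return out;
    }

    /** Wraps interleaved stereo floats (left, right, left, right, ...) as an AudioInputStream. */
    static AudioInputStream toStream(float[] interleaved) {
        byte[] bytes = toPcm16LittleEndian(interleaved);
        // sample rate, bits per sample, channels, signed, bigEndian
        AudioFormat format = new AudioFormat(SAMPLE_RATE, 16, 2, true, false);
        long frames = interleaved.length / 2; // one left/right pair per frame
        return new AudioInputStream(new ByteArrayInputStream(bytes), format, frames);
    }

    /** Writes the interleaved stereo data to a wav file. */
    static void write(float[] interleaved, File out) throws IOException {
        AudioSystem.write(toStream(interleaved), AudioFileFormat.Type.WAVE, out);
    }
}
```

The trade-off is memory: this holds a second, byte-sized copy of the audio, whereas the InputStream subclass converts one frame at a time.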