Peak clipping when layering audio files in Java

Question:

As part of a project I'm working on, I'm trying to layer multiple audio clips over one another to create the sound of a crowd, and write the result to a new .WAV file.

First I create a byte[] representation of a file (a 16-bit PCM .WAV file), which doesn't seem to cause any problems.

public byte[] toByteArray(File file)
{
    try
    {
        AudioInputStream in = AudioSystem.getAudioInputStream(file);

        byte[] byteArray = new byte[(int) file.length()];//file length is an upper bound: the stream skips the WAV header

        int read, total = 0;
        while (total < byteArray.length
                && (read = in.read(byteArray, total, byteArray.length - total)) != -1)
        {
            total += read;//advance the write position so later reads don't overwrite earlier data
        }

        return byteArray;//return the new byte array
    }
    catch (IOException | UnsupportedAudioFileException e)
    {
        throw new RuntimeException(e);
    }
}

Then, I create a buffer (an integer array so as to prevent byte overflow when adding bytes) and try layering in the byte array version of my files.

int[] buffer = new int[bufferLength];//buffer of appropriate length
int offset = 0;//no offset for the very first file

while (!convertedFiles.isEmpty())//until every sample has been added
{
    byte[] curr = convertedFiles.pop();//get a sample from list

    if (curr.length + offset < bufferLength)
    {
        for (int i = 0; i < curr.length; i++)
        {
            buffer[i] += curr[i];
        }
    }

    offset = randomiseOffset();//next sample placed in a random location in the buffer
}

The problem arises when I try to implement a sort of random offset. I can add all the audio to my buffer from index 0 (buffer[0]), so everything plays at once, and that works. However, if I try to disperse the individual clips randomly throughout the buffer, I run into problems.

When I try to offset the addition of files relative to the length of the buffer, I get awful static and peak clipping.

 buffer[i+offset] += curr[i];

I realise I need to be careful about avoiding overflow, which is why I tried using an integer buffer as opposed to a byte one.

What I don't understand though is why it only breaks when I introduce offsetting.

I didn't post the code that actually uses the AudioSystem object to create a new file, as it doesn't seem to have an effect either way.

This is my first time working with audio programming so any help is much appreciated.

EDIT:

Hendrik's answer solved my problem; I just needed to slightly change the suggested code (some type conversion issues):

private static short byteToShortLittleEndian(final byte[] buf, final int offset)
{
    int sample = (buf[offset] & 0xff) + ((buf[offset+1] & 0xff) << 8);
    return (short)sample;
}

private static byte[] shortToByteLittleEndian(final short[] samples, final int offset)
{
    byte[] buf = new byte[2];
    int sample = samples[offset];
    buf[0] = (byte) (sample & 0xFF);
    buf[1] = (byte) ((sample >> 8) & 0xFF);
    return buf;
}

Answer 1:

What does your randomiseOffset() method look like? Does it take into account that each audio sample is two bytes long? If randomiseOffset() gives you odd offsets, you end up mixing the low bytes of one sample with the high bytes of another sample, which sounds like (usually awful) noise. Perhaps that's the sound you identified as clipping.
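To illustrate the alignment point, here is a sketch of what a sample-aligned offset helper could look like. The name `randomiseOffset` and its parameters are assumptions (the original method isn't shown); the key idea is clearing the low bit so the offset is always a multiple of 2 bytes, i.e. it lands on a 16-bit sample boundary (mono audio assumed):

```java
import java.util.Random;

public class OffsetDemo {
    private static final Random RNG = new Random();

    // Hypothetical helper: picks a random start position within the buffer,
    // then clears the low bit so the offset is a multiple of 2 bytes and
    // therefore aligned to a 16-bit sample boundary (mono assumed).
    static int randomiseOffset(int bufferLength, int clipLength) {
        int maxStart = Math.max(0, bufferLength - clipLength);
        return RNG.nextInt(maxStart + 1) & ~1; // force an even offset
    }
}
```

With stereo audio the frame is 4 bytes (2 bytes per channel), so the offset would need to be a multiple of the full frame size instead.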

To do this right, you need to decode the audio first, i.e. take sample length (2 bytes) and channel count (?) into account, do your manipulation and then encode the audio again into a byte stream.

Let's assume that you have only one channel and the byte order is little-endian. Then you'd decode two bytes into a sample value like this:

private static int byteToShortLittleEndian(final byte[] buf, final int offset) {
    int sample = (buf[offset] & 0xff) + ((buf[offset+1] & 0xff) << 8);
    return (short)sample;
}

To encode, you'd use something like this:

private static byte[] shortToByteLittleEndian(final int[] samples, final int offset) {
    byte[] buf = new byte[2];
    int sample = samples[offset];
    buf[0] = sample & 0xFF;
    buf[1] = (sample >> 8) & 0xFF;
    return buf;
}

Here's how the two methods are used in your case:

byte[] byteArray = ...;  // your array
// DECODE: convert to sample values
int[] samples = new int[byteArray.length / 2];
for (int i=0; i<samples.length; i++) {
    samples[i] = byteToShortLittleEndian(byteArray, i*2);
}
// now do your manipulation on the samples array
[...]
// ENCODE: convert back to byte values
byte[] byteOut = new byte[byteArray.length];
for (int i=0; i<samples.length; i++) {
    byte[] b = shortToByteLittleEndian(samples, i);
    byteOut[2*i] = b[0];
    byteOut[2*i+1] = b[1];
}
// do something with byteOut ...

(Note that you can easily make this more efficient by bulk decoding/encoding and not working on the individual sample as shown above. I just figured it's easier to understand.)
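For instance, the bulk decode/encode could be sketched with `java.nio` buffers (this is one possible approach, still assuming 16-bit little-endian mono as above):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

public class BulkCodec {
    // Decode a whole 16-bit little-endian PCM byte array into samples at once.
    static short[] decode(byte[] bytes) {
        ShortBuffer sb = ByteBuffer.wrap(bytes)
                                   .order(ByteOrder.LITTLE_ENDIAN)
                                   .asShortBuffer();
        short[] samples = new short[sb.remaining()];
        sb.get(samples); // bulk copy, no per-sample loop
        return samples;
    }

    // Encode the samples back into a little-endian byte array.
    static byte[] encode(short[] samples) {
        ByteBuffer bb = ByteBuffer.allocate(samples.length * 2)
                                  .order(ByteOrder.LITTLE_ENDIAN);
        bb.asShortBuffer().put(samples);
        return bb.array();
    }
}
```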

During your manipulation you have to pay attention to your sample values. They must not be greater than Short.MAX_VALUE or less than Short.MIN_VALUE. If you detect that you're outside the valid range, simply scale the whole array. That way you avoid clipping.
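A minimal sketch of that scaling step (the helper name `scaleToShort` is made up; it assumes the int buffer holds summed 16-bit samples):

```java
public class MixScaler {
    // Scale the whole mix buffer down so the loudest sample fits in
    // 16-bit range, then narrow to short. This trades overall volume
    // for the absence of hard clipping.
    static short[] scaleToShort(int[] mix) {
        int peak = 0;
        for (int v : mix) {
            peak = Math.max(peak, Math.abs(v)); // find the loudest sample
        }
        double scale = peak > Short.MAX_VALUE
                ? (double) Short.MAX_VALUE / peak
                : 1.0; // already in range: leave untouched
        short[] out = new short[mix.length];
        for (int i = 0; i < mix.length; i++) {
            out[i] = (short) Math.round(mix[i] * scale);
        }
        return out;
    }
}
```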

Good luck!

  • Posted 2019-03-05 13:41
