Hello! I have an Android client app, which gets the byte array data of a wav file and then sends it to a server for inference:
public static byte[] getByteArrayFromWavFile(String filePath) {
    try {
        FileInputStream fileInputStream = new FileInputStream(filePath);
        // Read the WAV file header
        byte[] header = new byte[44];
        fileInputStream.read(header);
        // Check if it's a valid WAV file (contains "RIFF" and "WAVE" markers)
        String headerStr = new String(header, 0, 4);
        if (!headerStr.equals("RIFF")) {
            System.err.println("Not a valid WAV file");
            return new byte[0];
        }
        // Get the audio format details from the header
        int sampleRate = byteArrayToNumber(header, 24, 4);
        int bitsPerSample = byteArrayToNumber(header, 34, 2);
        if (bitsPerSample != 16 && bitsPerSample != 32) {
            System.err.println("Unsupported bits per sample: " + bitsPerSample);
            return new byte[0];
        }
        // Get the size of the data section (all PCM data)
        int dataLength = fileInputStream.available(); // byteArrayToInt(header, 40, 4);
        // Calculate the number of samples
        int bytesPerSample = bitsPerSample / 8;
        int numSamples = dataLength / bytesPerSample;
        // Read the audio data
        byte[] audioData = new byte[dataLength];
        fileInputStream.read(audioData);
        ByteBuffer byteBuffer = ByteBuffer.wrap(audioData);
        byteBuffer.order(ByteOrder.nativeOrder());
        return audioData;
    } catch (IOException e) {
        e.printStackTrace();
        Log.e(TAG, "Error...", e);
    }
    return new byte[0];
}
However, there is this error message returned from Whisper:
<built-in method with_traceback of InvalidDataError object at 0x7fa96815eb00>
With the following Python code, the inference works perfectly:
from io import BytesIO

import requests

with open(args.audio_file, 'rb') as f:
    wav_bytes = f.read()
print(f"byte size of wav file: {len(wav_bytes)}")
audio_bytes = BytesIO(wav_bytes)
response = requests.post(FAST_API_URL, files={"audio": audio_bytes}, data={"initial_prompt": ""}, stream=True)
Both approaches read the byte array data of the wav file, so the Android approach is supposed to work as well. I did, however, notice a difference: with the Android app, the size of the resulting byte array is 214386, whereas with the Python app, the size is 214430. I'm not sure if this has anything to do with the error. Maybe the byte array produced by the Android app is not what Whisper expects? Any help would be greatly appreciated!
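One observation on the two sizes: 214430 - 214386 = 44, which is exactly the number of bytes that getByteArrayFromWavFile reads into header and then leaves out of the returned array. The Python client sends the file unmodified, header included, so a possible explanation is that the server receives headerless PCM data from the Android app and cannot recognize it as a WAV file. Below is a minimal sketch of reading the complete file instead, assuming the server expects the same bytes the Python client sends. The class name WavFileBytes and the method name readAllBytesFromFile are hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper class, not part of the original app.
public class WavFileBytes {

    /**
     * Reads the entire file, including the WAV header, into a byte array.
     * Returns an empty array if the file cannot be read.
     */
    public static byte[] readAllBytesFromFile(String filePath) {
        // try-with-resources closes both streams even if reading fails
        try (InputStream in = new FileInputStream(filePath);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            // read() may return fewer bytes than requested, so loop until EOF (-1)
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            e.printStackTrace();
            return new byte[0];
        }
    }
}
```

With this approach the returned array should have the same length as the file on disk, which makes it easy to compare against the size printed by the Python client. The loop also avoids relying on available() or on a single read() call filling the whole buffer, neither of which is guaranteed by the stream API.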