Segmentation Fault with Distilled Models on CPU when word_timestamps=True #1283


Open
zhitao-zeng opened this issue Apr 8, 2025 · 11 comments

@zhitao-zeng
I am encountering a "Segmentation fault (core dumped)" error when using faster-whisper under specific conditions:

  1. Running inference on the CPU (device='cpu').
  2. Using a distilled Whisper model (e.g., distil-faster-whisper-base-it).
  3. Requesting word-level timestamps (word_timestamps=True).

The error does not occur if:

  1. Word timestamps are disabled (word_timestamps=False).
  2. A standard (non-distilled) fine-tuned faster-whisper model (e.g., base, small; tested with base) is used, even with word_timestamps=True on CPU.
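For reference, the failing call can be sketched like this (a sketch; model_path and audio_path stand in for the local converted model directory and the test audio):

```python
def reproduce(model_path: str, audio_path: str) -> None:
    from faster_whisper import WhisperModel  # pip install faster-whisper

    # CPU + distilled model + word_timestamps=True is the failing combination
    model = WhisperModel(model_path, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio_path, word_timestamps=True)
    for segment in segments:  # decoding is lazy; the segfault shows up here
        for word in segment.words:
            print(f"[{word.start:.2f} -> {word.end:.2f}]{word.word}")
```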
@Purfview
Contributor

Purfview commented Apr 8, 2025

> distil-faster-whisper-base-it

Post config.json file of that model.

@zhitao-zeng
Author

> distil-faster-whisper-base-it
>
> Post config.json file of that model.
Here is the model I used
https://huggingface.co/gustavv-andrzejewski/distil-whisper-base-it/blob/main/config.json

@Purfview
Contributor

Purfview commented Apr 8, 2025

> https://huggingface.co/gustavv-andrzejewski/distil-whisper-base-it/blob/main/config.json

This model is for whisper, not for faster-whisper.

@zhitao-zeng
Author

config.json

Oh I see, this file should be the faster-whisper one

@Purfview
Contributor

Purfview commented Apr 8, 2025

> config.json Oh I see, this file should be the faster-whisper one

"alignment_heads" is same as in original whisper https://huggingface.co/openai/whisper-base/blob/main/generation_config.json

I think it should be different for distil model.
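The mismatch is easy to check mechanically: each entry in "alignment_heads" is a [layer, head] pair, and every layer index must be below the model's decoder layer count. A sketch (the head list below is what whisper-base's generation_config.json ships, worth re-checking against the linked file; the distilled layer count of 2 is an assumption, see decoder_layers in the model's config.json):

```python
# alignment_heads as listed in openai/whisper-base's generation_config.json;
# each entry is a [decoder_layer, attention_head] pair.
base_alignment_heads = [[3, 1], [4, 2], [4, 3], [4, 7], [5, 1], [5, 2], [5, 4], [5, 6]]

def invalid_heads(alignment_heads, num_decoder_layers):
    """Pairs whose layer index points past the model's last decoder layer."""
    return [(l, h) for l, h in alignment_heads if l >= num_decoder_layers]

# whisper-base has 6 decoder layers, so the list is fine:
print(invalid_heads(base_alignment_heads, 6))  # -> []

# a distilled model that keeps, say, only 2 decoder layers invalidates all of them:
print(invalid_heads(base_alignment_heads, 2))  # -> all 8 pairs
```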

@zhitao-zeng
Author

> > config.json Oh I see, this file should be the faster-whisper one
>
> "alignment_heads" is the same as in the original whisper: https://huggingface.co/openai/whisper-base/blob/main/generation_config.json
>
> I think it should be different for the distil model.

That's what I got directly from the following code:

```python
from ctranslate2.converters import TransformersConverter

converter = TransformersConverter(model_path)
converted_model_path = converter.convert(output_dir, quantization="int8", force=True)
```

@Purfview
Contributor

Purfview commented Apr 9, 2025

Did you check in original Whisper if word_timestamps=True works there with this model?

@zhitao-zeng
Author

zhitao-zeng commented Apr 10, 2025

> Did you check in original Whisper if word_timestamps=True works there with this model?

Thanks for the suggestion. I've tested this further.

Using the Hugging Face transformers library:

  • The distilled (gustavv-andrzejewski/distil-whisper-base-it), fine-tuned, and original base models all correctly produce word timestamps when requested (return_timestamps=True). The output format includes inline time markers within the text (e.g., <|time|>word).

Using the faster-whisper library (with CTranslate2 models):

  • The CTranslate2 versions of fine-tuned and original base models work correctly with word_timestamps=True on CPU, providing detailed Word objects (with start/end times and probability).
  • However, the CTranslate2 version of the distilled model (gustavv-andrzejewski/distil-whisper-base-it) still causes a segmentation fault when word_timestamps=True is used on the CPU. It works if timestamps are off or if a non-distilled model is used.

Using the openai-whisper library:

  • The fine-tuned and original base models work correctly with word_timestamps=True on CPU, providing detailed Word objects (with start/end times and probability).
  • However, the distilled model cannot be loaded successfully because of missing key(s) in state_dict. Since this distilled model uses the Hugging Face framework, it is hard for me to convert it and load it into the standard openai-whisper framework. I would need to construct the model and load the weights manually, and I am not sure whether that is feasible.
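For what it's worth, the "missing key(s) in state_dict" message is PyTorch's standard strict-loading check; passing strict=False reports the mismatch instead of raising. A toy illustration of the mechanism (not the Whisper model itself):

```python
import torch.nn as nn

class Small(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = nn.Linear(2, 2)

class Big(Small):
    def __init__(self):
        super().__init__()
        self.b = nn.Linear(2, 2)  # extra layer the checkpoint lacks

# Loading a smaller checkpoint into a bigger model reports the gap
# instead of raising RuntimeError("Missing key(s) in state_dict ...").
result = Big().load_state_dict(Small().state_dict(), strict=False)
print(result.missing_keys)  # -> ['b.weight', 'b.bias']
```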

@Purfview
Contributor

Can you share that ct2 model?

@zhitao-zeng
Author

@sssshhhhhh

In transformers, word timestamps require return_timestamps='word', not True. This model also errors in hf and openai whisper because the distil process removes layers, which makes the alignment heads refer to heads in nonexistent layers.
