Alibaba-NLP/gte-multilingual-reranker-base fails to load #498

Closed
GarciaLnk opened this issue Feb 18, 2025 · 1 comment
@GarciaLnk

Alibaba-NLP/gte-multilingual-reranker-base is listed as a supported re-ranking model in the README, but it fails with a `cannot find tensor embeddings.word_embeddings.weight` error when trying to load it using the 1.6 images:

```
❯ docker run --gpus all --pull always ghcr.io/huggingface/text-embeddings-inference:86-1.6 --model-id Alibaba-NLP/gte-multilingual-reranker-base
86-1.6: Pulling from huggingface/text-embeddings-inference
Digest: sha256:6f6988335bcebbe9f6e96c7e3b91677670b6cd7eacfc89a7a53b2d015a9ef1ee
Status: Image is up to date for ghcr.io/huggingface/text-embeddings-inference:86-1.6
2025-02-18T13:28:01.462762Z  INFO text_embeddings_router: router/src/main.rs:175: Args { model_id: "Ali****-***/***-************-********-*ase", revision: None, tokenization_workers: None, dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: None, max_client_batch_size: 32, auto_truncate: false, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "991cf4eb0b5f", port: 80, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2025-02-18T13:28:01.462886Z  INFO hf_hub: /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"    
2025-02-18T13:28:01.559022Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:20: Starting download
2025-02-18T13:28:01.559033Z  INFO download_artifacts:download_pool_config: text_embeddings_core::download: core/src/download.rs:53: Downloading `1_Pooling/config.json`
2025-02-18T13:28:01.775297Z  WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Download failed: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/Alibaba-NLP/gte-multilingual-reranker-base/resolve/main/1_Pooling/config.json)
2025-02-18T13:28:02.693958Z  INFO download_artifacts:download_new_st_config: text_embeddings_core::download: core/src/download.rs:77: Downloading `config_sentence_transformers.json`
2025-02-18T13:28:02.825517Z  WARN download_artifacts: text_embeddings_core::download: core/src/download.rs:36: Download failed: request error: HTTP status client error (404 Not Found) for url (https://huggingface.co/Alibaba-NLP/gte-multilingual-reranker-base/resolve/main/config_sentence_transformers.json)
2025-02-18T13:28:02.825537Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:40: Downloading `config.json`
2025-02-18T13:28:03.122270Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:43: Downloading `tokenizer.json`
2025-02-18T13:28:04.351601Z  INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:47: Model artifacts downloaded in 2.792576329s
2025-02-18T13:28:04.879533Z  WARN text_embeddings_router: router/src/lib.rs:184: Could not find a Sentence Transformers config
2025-02-18T13:28:04.879549Z  INFO text_embeddings_router: router/src/lib.rs:188: Maximum number of tokens per request: 8192
2025-02-18T13:28:04.879562Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 12 tokenization workers
2025-02-18T13:28:07.561409Z  INFO text_embeddings_router: router/src/lib.rs:230: Starting model backend
2025-02-18T13:28:07.561437Z  INFO text_embeddings_backend: backends/src/lib.rs:360: Downloading `model.safetensors`
2025-02-18T13:28:13.593028Z  INFO text_embeddings_backend: backends/src/lib.rs:244: Model weights downloaded in 6.031586368s
2025-02-18T13:28:14.065445Z  INFO text_embeddings_backend_candle: backends/candle/src/lib.rs:362: Starting FlashGTE model on Cuda(CudaDevice(DeviceId(1)))
2025-02-18T13:28:14.075857Z ERROR text_embeddings_backend: backends/src/lib.rs:255: Could not start Candle backend: Could not start backend: cannot find tensor embeddings.word_embeddings.weight
Error: Could not create backend

Caused by:
    Could not start backend: Could not start a suitable backend
```
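
A quick way to see which tensor names the checkpoint actually contains (and why the backend cannot resolve `embeddings.word_embeddings.weight`) is to list the keys in `model.safetensors`. The snippet below is only a diagnostic sketch; it assumes `huggingface_hub` and `safetensors` are installed and that the weights file name matches the one in the log above.

```python
# Diagnostic sketch: list the tensor names stored in the model's safetensors
# checkpoint to compare against the name the TEI backend expects.
from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Download (or reuse the cached copy of) the weights file.
path = hf_hub_download(
    repo_id="Alibaba-NLP/gte-multilingual-reranker-base",
    filename="model.safetensors",
)

with safe_open(path, framework="pt") as f:
    for name in f.keys():
        if "word_embeddings" in name:
            print(name)  # compare against embeddings.word_embeddings.weight
```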
@GarciaLnk
Author

nvm, duplicate of #471
