[Question] New demo type/use case: semantic search (SemanticFinder) #84
This is so cool! I plan to completely rewrite the demo application which, as you can tell, is extremely simple... so this definitely sounds like something I can add!
---
@do-me Just a heads up that I updated the feature-extraction API to support other models (not just sentence-transformers). To use the updated API, you just need to add the pooling and normalization options to the call. For example:

Before:

```js
let extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
let result = await extractor('This is a simple test.');
console.log(result);
// Tensor {
//   type: 'float32',
//   data: Float32Array [0.09094982594251633, -0.014774246141314507, ...],
//   dims: [1, 384]
// }
```

After:

```js
let extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
let result = await extractor('This is a simple test.', { pooling: 'mean', normalize: true });
console.log(result);
// Tensor {
//   type: 'float32',
//   data: Float32Array [0.09094982594251633, -0.014774246141314507, ...],
//   dims: [1, 384]
// }
```

And if you don't want to do pooling/normalization, you can leave those options out; you will then get the embeddings for each token in the sequence.

---
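For the semantic-search use case, the pooled, normalized vectors above are typically compared with cosine similarity. A minimal sketch in plain JavaScript (no library assumptions; with `normalize: true` the vectors are unit-length, so this reduces to a dot product):

```javascript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Ranking documents for a query is then just sorting by `cosineSimilarity(queryEmbedding, docEmbedding)` in descending order.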
Also - we're planning on releasing a semantic search demo next week 🥳 (so, watch this space!)

---
This is awesome, thanks for pinging me! I'm very interested in this feature, mainly for speed improvements. Do you have any benchmarks at hand for how the new pooling approach compares to sequential processing? Also, I'd be curious to know whether there's a sweet spot for how many elements could/should be passed to the model at once. And one more detail (though it's probably model-dependent): can you track the progress of a batch/pool that has been passed to the model? E.g. if I pass 1000 elements at once, is there any theoretical way to return the progress so I can update the progress bar in the frontend in the meantime?

---
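One way to get a progress signal without model-level support is to split the input into chunks and report after each batched call. A hypothetical sketch (`embedFn` and the chunk size are assumptions, standing in for a feature-extraction pipeline call; it is not a measured sweet spot):

```javascript
// Embed a large list of texts in chunks, invoking onProgress after each
// batched call so a frontend progress bar can be updated between chunks.
async function embedWithProgress(texts, embedFn, { chunkSize = 50, onProgress } = {}) {
  const results = [];
  for (let i = 0; i < texts.length; i += chunkSize) {
    const chunk = texts.slice(i, i + chunkSize);
    results.push(...(await embedFn(chunk))); // one batched model call per chunk
    if (onProgress) onProgress(Math.min(i + chunkSize, texts.length), texts.length);
  }
  return results;
}
```

Smaller chunks give finer-grained progress at the cost of more model invocations, so the right size depends on the model and hardware.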
fyi

---
Hey, joining the semantic-search-on-the-frontend party 🥳. I'm wondering if we can leverage the power of threads in this scenario by setting

---
@lizozom Hi there! 👋 So, the most likely reason for this is that the required cross-origin isolation headers are not being set. To fix this, it depends on where you are hosting the website, as these headers must be set by the server. At the moment, GitHub Pages does not offer this (https://github.com/orgs/community/discussions/13309), but there are some workarounds (cc @josephrocca). On the other hand, we are actively working to support this feature in Hugging Face Spaces (huggingface/huggingface_hub#1525), which should hopefully be ready soon!

---
Seems like Netlify offers a little more flexibility. I'm a very happy user of Netlify (I've been hosting my blog there since 2019 without any trouble) and it's pretty easy to link a GitHub repo to it. @lizozom if needed, we might consider switching from GitHub Pages to Netlify.

---
Cool!

---
Current workaround is to put this file beside your HTML file, and then import it with a script tag in your document. I personally wouldn't go with Netlify, since their pricing is a bit too aggressive for my use cases, but it depends on what you're doing. Netlify's free 100 GB could be used up very quickly if you have a few assets like ML models or videos (even with just a few thousand visitors, e.g. due to being shared on Twitter or HN). Cloudflare Pages is much better imo (unlimited bandwidth and requests for free), but again it depends on your use case; Netlify may suffice.

---
Thanks for the hint! Does Cloudflare Pages offer custom headers?

---
I haven't actually had to do that with Cloudflare Pages yet, but here are their docs for custom headers: https://developers.cloudflare.com/pages/platform/headers/

---
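For reference, on Cloudflare Pages (and Netlify, which uses the same format) custom headers go in a `_headers` file at the root of the deployed site. A minimal sketch setting the two headers required for cross-origin isolation (which `SharedArrayBuffer`-based multithreading needs):

```text
/*
  Cross-Origin-Opener-Policy: same-origin
  Cross-Origin-Embedder-Policy: require-corp
```

Note that `require-corp` also constrains how cross-origin resources (scripts, images, model files) may be loaded, so third-party assets may need CORS or `Cross-Origin-Resource-Policy` headers of their own.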
I tested this out on a local server, and indeed, this causes the threaded version to be used.

@xenova In your opinion, should I expect to see performance improvements if I'm running a large batch of inputs?

---
Sweet, I'll keep track.

---
@VarunNSrivastava built a really nice Chrome extension for SemanticFinder. You can already install it locally as explained here. We submitted it for review, so it should be a matter of days (hopefully) or a few weeks in the worst case. It's working very well for many different types of pages (even PDFs if they end with .pdf!). There is a settings page too, where it's highly recommended to raise the minimum segment length if there is lots of text on a page (more than 10 pages' worth, for example). You can also choose a different model if you're working with non-English content.

I spotted the gap in the HF docs about developing a browser extension and was wondering whether we could give a hand in filling it? In the end, our application isn't too complex in terms of "moving" parts, so it might make for a good example. Also, we already learnt about some caveats that might be good to write down.

---
That would be amazing! 🤯 Yes please! You could even strip down the tutorial quite a bit if you want (the simpler, the better).

---
We're using Vue components in the extension, which might already be slightly too complex for a beginner's tutorial (this would be more of an intermediate/slightly advanced version, I guess). However, I have plans to write yet another extension with similar functionality and really keep it super simple. Will keep you posted, but probably better in a new issue.

I just have one question which is relevant to both the extension and SemanticFinder, and which I just couldn't quite understand from the HF docs. When using:

```js
var outputElement = document.getElementById("output");

async function allocatePipeline(instruction) {
  let classifier = await pipeline('text2text-generation', 'Xenova/LaMini-Flan-T5-783M');
  let output = await classifier(instruction, {
    max_new_tokens: 100
  });
  outputElement.innerHTML = output[0];
}

allocatePipeline("some test instruction");
```

or:

```js
var outputElement = document.getElementById("output");

async function allocatePipeline(inText) {
  let generator = await pipeline('summarization', 'Xenova/distilbart-cnn-6-6');
  let out = await generator(inText, {
    max_new_tokens: 100,
  });
  outputElement.innerHTML = out[0].summary_text;
}

allocatePipeline("some test text to summarize");
```

how can I add a callback so that my HTML component is updated each time a new token is created? I tried different kinds of callbacks and searched through the API, but I have the impression that I'm missing something quite obvious.

---
The callback functionality is not very well documented (perhaps for good reason), since it's non-standard and, at the time of its creation, didn't have an equivalent mechanism in transformers. For now, you can replicate what I did here using the `callback_function` option.

---
PS: please check out this PR; it removes the redundant

---
Thanks a lot, this pointed me in the right direction!

```js
let tokenizer = await AutoTokenizer.from_pretrained(model);
```

I noticed that without offloading the work, the HTML element only updates once generation has finished. However, for a minimal example demonstrating e.g. the speed of token generation, you can still log it to the console and watch it live:

```js
callback_function: function (beams) {
  const decodedText = tokenizer.decode(beams[0].output_token_ids, {
    skip_special_tokens: true,
  });
  console.log(decodedText);
}
```

Demo here.

---
Yes, that's correct. The best way I have found around this is to use the Web Worker API, post messages back to the main thread in the `callback_function`, and initialize the worker from the main script.

---
@xenova thank you for your extraordinary work. The code is simple:

```vue
<script setup lang="ts">
import { env, pipeline, AutoConfig } from '@xenova/transformers'

await AutoConfig.from_pretrained(repoid)
</script>
```

or in a TS file:

```ts
import { env, pipeline, AutoConfig } from '@xenova/transformers'
import { defineStore } from 'pinia'

export const TransformerJs = defineStore('transformers', () => {
  function setupOnnx() {
    // env.localModelPath = '@/assets/models/'
    env.allowRemoteModels = true
    env.allowLocalModels = false
  }
  async function downloadModel(repoid: string, taskid: any) {
    await AutoConfig.from_pretrained(repoid)
  }
  return { env, setupOnnx, downloadModel }
})
```

Did you change directly in the

---
@Fhrozen As long as you:

It should work. This will be fixed in Transformers.js v3, where

---
@Fhrozen, I'm pinging @VarunNSrivastava, who created the entire Vue-based browser plugin. Feel free to ask any questions!

---
Hi @xenova,
first of all thanks for the amazing library - it's awesome to be able to play around with the models without a backend!
I just created SemanticFinder, a semantic search engine in the browser with the help of transformers.js and sentence-transformers/all-MiniLM-L6-v2.
You can find some technical details in the blog post.
I was wondering whether you'd be interested in showcasing semantic search as a new demo type. Technically it's not a new model, but it is a new use case for an existing model, so I don't know whether it's out of scope.
Anyway, just wanted to let you know that your work is very much appreciated!