
feat(ingestion-api): add ingestion rest api and indexing ingested documents #4171


Merged
merged 11 commits into from
Apr 28, 2025

Conversation

zwpaper
Member

@zwpaper zwpaper commented Apr 17, 2025

API:
[screenshot: CleanShot 2025-04-22 at 23 55 12@2x]

DB:
[screenshot: CleanShot 2025-04-18 at 01 39 30@2x]

[screenshot: CleanShot 2025-04-18 at 01 32 09@2x]

Test:

  1. Request to ingest:
    [screenshot: CleanShot 2025-04-20 at 23 30 46@2x]

  2. Saved in the DB:
    [screenshot: CleanShot 2025-04-20 at 23 34 11@2x]

  3. Updated in Tantivy:

    {"chunk_attributes":{"chunk_body":"Getting Started with TabbyML/tabby\n\nThis is the main content of the document that will be ingested..."},"chunk_id":"ingested:tabby%20dev/page%20123-0","corpus":"structured_doc","id":"ingested:tabby%20dev/page%20123","source_id":"ingested:tabby%20dev","updated_at":"2025-04-24T13:40:23.487517Z"}
    {"attributes":{"kind":"ingested","link":"https://tabby.tabbyml.com/pages/123","title":"Getting Started with TabbyML/tabby"},"corpus":"structured_doc","id":"ingested:tabby%20dev/page%20123","source_id":"ingested:tabby%20dev","updated_at":"2025-04-24T13:40:23.487517Z"}
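The two index entries above share the same id and source_id, and the chunk entry adds a chunk_id that looks like the document id with a numeric suffix. A minimal Rust sketch of that relationship follows. The struct names, field selection, and the `chunk_entries` helper are hypothetical, and reading the `-0` suffix as a zero-based chunk index is an inference from the sample output, not something the PR states.

```rust
/// Hypothetical model of the document-level entry shown in the sample output.
#[derive(Debug, Clone, PartialEq)]
struct DocEntry {
    id: String,
    source_id: String,
    title: String,
    link: String,
}

/// Hypothetical model of a chunk-level entry belonging to a document.
#[derive(Debug, Clone, PartialEq)]
struct ChunkEntry {
    id: String,
    source_id: String,
    chunk_id: String,
    chunk_body: String,
}

/// Build one chunk entry per body, reusing the document's id and source_id.
/// Assumption: chunk_id is "{doc id}-{zero-based chunk index}", inferred
/// from the single sample entry ending in "-0".
fn chunk_entries(doc: &DocEntry, bodies: &[&str]) -> Vec<ChunkEntry> {
    bodies
        .iter()
        .enumerate()
        .map(|(index, body)| ChunkEntry {
            id: doc.id.clone(),
            source_id: doc.source_id.clone(),
            chunk_id: format!("{}-{}", doc.id, index),
            chunk_body: (*body).to_string(),
        })
        .collect()
}
```

Keeping id and source_id identical across both kinds of entry is what would let all chunks of one ingested document be looked up or removed together.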

@zwpaper zwpaper marked this pull request as draft April 17, 2025 17:42

codecov bot commented Apr 17, 2025

Codecov Report

Attention: Patch coverage is 0% with 372 lines in your changes missing coverage. Please review.

Project coverage is 56.34%. Comparing base (6be6e15) to head (e9f91eb).
Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rver/src/service/background_job/index_ingestion.rs | 0.00% | 102 Missing ⚠️ |
| ee/tabby-db/src/ingestion.rs | 0.00% | 90 Missing ⚠️ |
| ee/tabby-webserver/src/service/ingestion.rs | 0.00% | 53 Missing ⚠️ |
| ...s/tabby-index/src/structured_doc/types/ingested.rs | 0.00% | 43 Missing ⚠️ |
| ee/tabby-webserver/src/routes/ingestion.rs | 0.00% | 19 Missing ⚠️ |
| ee/tabby-schema/src/dao.rs | 0.00% | 18 Missing ⚠️ |
| .../tabby-webserver/src/service/background_job/mod.rs | 0.00% | 17 Missing ⚠️ |
| crates/tabby-common/src/api/ingestion.rs | 0.00% | 8 Missing ⚠️ |
| ee/tabby-webserver/src/routes/mod.rs | 0.00% | 6 Missing ⚠️ |
| ee/tabby-webserver/src/service/mod.rs | 0.00% | 6 Missing ⚠️ |
| ... and 3 more | | |

❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4171      +/-   ##
==========================================
- Coverage   57.14%   56.34%   -0.81%     
==========================================
  Files         229      236       +7     
  Lines       28933    29349     +416     
==========================================
+ Hits        16534    16536       +2     
- Misses      12399    12813     +414     


@zwpaper zwpaper marked this pull request as ready for review April 20, 2025 15:36
@zwpaper zwpaper requested a review from wsxiaoys April 23, 2025 01:18
Member

@wsxiaoys wsxiaoys left a comment


Please address the comments

@zwpaper
Member Author

zwpaper commented Apr 24, 2025

@wsxiaoys I made some updates since your last review, PTAL:

  1. Dropped the ingestion event and index status from the response, as requested in the review comment.
  2. Added an ingested: prefix to the source in both the DB and the index, to ensure source_id is unique.
  3. Because of change 2, dropped the manually added /ingested prefix in the id; the index id is now {source}/{id}.
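The id scheme from changes 2 and 3 can be sketched in Rust as below. This is an illustration under stated assumptions, not the code from the PR: the function names and the `percent_encode` helper are hypothetical, and the set of characters that get encoded is a guess. The sample index entries only show spaces becoming %20.

```rust
/// Prefix named in the PR discussion; all function names here are hypothetical.
const INGESTED_PREFIX: &str = "ingested:";

/// Percent-encode every byte that is not an RFC 3986 unreserved character.
/// Assumption: the real implementation may encode a different set; the
/// sample output only confirms that a space becomes "%20".
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Change 2: prefix the user-supplied source so it cannot collide with
/// source ids from other kinds of sources.
fn source_id(source: &str) -> String {
    format!("{}{}", INGESTED_PREFIX, percent_encode(source))
}

/// Change 3: the index id is "{source}/{id}", built on the prefixed source.
fn index_id(source: &str, id: &str) -> String {
    format!("{}/{}", source_id(source), percent_encode(id))
}
```

With the source "tabby dev" and the id "page 123", this yields the source_id and id values that appear in the Tantivy entries in the PR description.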

@zwpaper zwpaper requested a review from wsxiaoys April 24, 2025 13:49
@zwpaper zwpaper merged commit 65a9082 into main Apr 28, 2025
5 of 8 checks passed
@zwpaper zwpaper deleted the feat/ingestion branch April 28, 2025 10:45