Commit a5267a8

changelog and docs
1 parent 6c59219 commit a5267a8

4 files changed: +49 -13 lines changed

CHANGELOG.md

+25 -4
@@ -4,6 +4,27 @@ All notable changes to this project will be documented in this file.
 
 The format is based on [Keep a Changelog](http://keepachangelog.com/).
 
+## [0.1.52] - 2024-07-06
+### Added
+- `llm` command for Retrieval-Augmented Generation on channels with embeddings
+- https://github.com/NotJoeMartinez/yt-fts/pull/156
+- Way to specify time interval when generating embeddings
+- https://github.com/NotJoeMartinez/yt-fts/pull/155
+- pytest unit testing for basic cli functionality
+- https://github.com/NotJoeMartinez/yt-fts/pull/151
+### Changed
+- Changed `get-embeddings` command to `embeddings` (it's cleaner)
+- https://github.com/NotJoeMartinez/yt-fts/pull/155
+- Refomatted most files to follow PEP 8 style guides
+- https://github.com/NotJoeMartinez/yt-fts/pull/153
+- Most of the commands now exit with status code
+- https://github.com/NotJoeMartinez/yt-fts/pull/152
+- Refactored to not use `import *`
+- https://github.com/NotJoeMartinez/yt-fts/pull/154
+## Fixed
+- Removed Regex warning when first running cli
+- Delete not working if you use a capital Y
+
 ## [0.1.51] - 2024-07-04
 ### Fixed
 - Fixed broken `get_channel_id` function cause by YouTube change to video page html
@@ -53,7 +74,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/).
 
 ### Added
 - [yt-fts-132](https://github.com/NotJoeMartinez/yt-fts/pull/132)
-- Github actions integration
+- GitHub actions integration
 
 
 
@@ -86,7 +107,7 @@ Special thanks to [@danlamanna](https://github.com/danlamanna) for these fixes
 ## [0.1.39] - 2023-12-31
 ### Fixed
 - [yt-fts-118](https://github.com/NotJoeMartinez/yt-fts/pull/118)
-- Major: Fixed bug where download will fail if channel does not have live stream page
+- Major: Fixed bug where download will fail if channel does not have live-stream page
 
 ## [0.1.38] - 2023-12-29
 ### Added
@@ -106,7 +127,7 @@ Special thanks to [@danlamanna](https://github.com/danlamanna) for these fixes
 ## [0.1.36] - 2023-12-25
 ### Fixed
 - [yt-fts-112](https://github.com/NotJoeMartinez/yt-fts/pull/112)
-- Medium: Fixed issue with download command not downloading live streamed videos
+- Medium: Fixed issue with download command not downloading live-streamed videos
 
 ### Added
 - [yt-fts-111](https://github.com/NotJoeMartinez/yt-fts/pull/111)
@@ -176,5 +197,5 @@ Special thanks to [@danlamanna](https://github.com/danlamanna) for these fixes
 
 - [yt-fts-67](https://github.com/NotJoeMartinez/yt-fts/issues/67)
 
-Minor: YouTube URL validation now allows for /@channelName and /channle/channelID
+Minor: YouTube URL validation now allows for /@channelName and /channel/channelID
 instead of forcing /@channel/videos.

README.md

+22 -7
@@ -1,12 +1,15 @@
 
-# yt-fts - Youtube Full Text Search
-`yt-fts` is a command line program that uses [yt-dlp](https://github.com/yt-dlp/yt-dlp) to scrape all of a youtube channels subtitles and load them into an sqlite database that is searchable from the command line. It allows you to query a channel for specific key word or phrase and will generate time stamped youtube urls to
+# yt-fts - YouTube Full Text Search
+`yt-fts` is a command line program that uses [yt-dlp](https://github.com/yt-dlp/yt-dlp) to scrape all of a YouTube
+channels subtitles and load them into a sqlite database that is searchable from the command line. It allows you to
+query a channel for specific key word or phrase and will generate time stamped YouTube urls to
 the video containing the keyword.
 
 It also supports semantic search via the [OpenAI embeddings API](https://beta.openai.com/docs/api-reference/) using [chromadb](https://github.com/chroma-core/chroma).
 
 - [Blog Post](https://notjoemartinez.com/blog/youtube_full_text_search/)
-- [Semantic Search](#Semantic-Search-via-OpenAI-embeddings-API)
+- [LLM/RAG Chat Bot](#llm-chat-bot)
+- [Semantic Search](#vsearch-semantic-search)
 - [CHANGELOG](CHANGELOG.md)
 
 https://github.com/NotJoeMartinez/yt-fts/assets/39905973/6ffd8962-d060-490f-9e73-9ab179402f14
@@ -90,7 +93,7 @@ yt-fts search "rea* kni* Mali*" --channel "The Tim Dillon Show"
 ```
 
 
-# Semantic Search
+# Semantic Search and RAG
 You can enable semantic search for a channel by using the `get-embeddings` command.
 This requires an OpenAI API key set in the environment variable `OPENAI_API_KEY`, or
 you can pass the key with the `--openai-api-key` flag.
@@ -106,11 +109,12 @@ Fetches OpenAI embeddings for specified channel
 yt-fts embeddings --channel "3Blue1Brown"
 
 # specify time interval in seconds to split text by default is 10
+# the larger the interval the more accurate the llm response
+# but semantic search will have more text for you to read.
 yt-fts embeddings --interval 60 --channel "3Blue1Brown"
 ```
-
 After the embeddings are saved you will see a `(ss)` next to the channel name when you
-list channels and you will be able to use the `vsearch` command for that channel.
+list channels, and you will be able to use the `vsearch` command for that channel.
 
 ## `vsearch` (Semantic Search)
 `vsearch` is for "Vector search". This requires that you enable semantic
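
Note on `--interval`: the README comment in this hunk describes the interval as the number of seconds used to split transcript text, defaulting to 10. A minimal sketch of that idea follows, assuming subtitles arrive as `(start_seconds, text)` pairs. `chunk_by_interval` is a hypothetical helper written for illustration and is not taken from the yt-fts source.

```python
from typing import Dict, List, Tuple


def chunk_by_interval(
    segments: List[Tuple[float, str]], interval: int = 10
) -> List[Tuple[int, str]]:
    """Group (start_seconds, text) subtitle segments into fixed-width time windows.

    Hypothetical helper: illustrates splitting by a time interval, not yt-fts's code.
    """
    if interval <= 0:
        raise ValueError("interval must be a positive number of seconds")

    buckets: Dict[int, List[str]] = {}
    for start, text in segments:
        # Every segment falls into the window that contains its start time.
        window_start = int(start // interval) * interval
        buckets.setdefault(window_start, []).append(text)

    # One chunk per window, ordered by time; each chunk would get one embedding.
    return [(start, " ".join(texts)) for start, texts in sorted(buckets.items())]


segments = [(0.0, "a"), (4.0, "b"), (12.5, "c"), (61.0, "d")]
short_chunks = chunk_by_interval(segments, interval=10)
long_chunks = chunk_by_interval(segments, interval=60)
```

With `interval=60` the first three segments collapse into a single chunk. That is the trade-off the README comment points at: a larger window gives the LLM more text per hit, while `vsearch` results become longer to read.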
@@ -133,11 +137,21 @@ yt-fts vsearch "[search query]" --export --channel "[channel name or id]"
 
 ```
 
+## `llm` (Chat Bot)
+Starts interactive chat session with `gpt-4o` OpenAI model using
+the semantic search results of your initial prompt as the context
+to answer questions. If it can't answer your question, it has a
+mechanism to update the context by running targeted query based
+off the conversation. The channel must have semantic search enabled.
 
+```sh
+yt-fts llm --channel "3Blue1Brown" "How does back propagation work?"
+```
 
 ## How To
 
 **Export search results:**
+
 For both the `search` and `vsearch` commands you can export the results to a csv file with
 the `--export` flag. and it will save the results to a csv file in the current directory.
 ```bash
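
Note on the `llm` flow: the description added in this hunk amounts to a retrieve, ask, and refresh loop. The sketch below is one way to structure such a loop under stated assumptions. `search` and `ask` are hypothetical callables standing in for the semantic search and the `gpt-4o` call, and the `NEED_MORE_CONTEXT` sentinel is an invented convention. The actual yt-fts mechanism may differ.

```python
from typing import Callable, List

# Invented convention for this sketch: the model prefixes its reply with this
# marker, followed by a targeted query, when the context is not sufficient.
NEED_MORE = "NEED_MORE_CONTEXT"


def answer_with_rag(
    question: str,
    search: Callable[[str], List[str]],    # hypothetical semantic search
    ask: Callable[[str, List[str]], str],  # hypothetical LLM call
    max_refreshes: int = 1,
) -> str:
    """Answer a question from retrieved context, refreshing the context on demand."""
    # Seed the context with search results for the initial prompt.
    context = search(question)
    for _ in range(max_refreshes + 1):
        reply = ask(question, context)
        if not reply.startswith(NEED_MORE):
            return reply
        # Treat the rest of the reply as a targeted follow-up query.
        follow_up = reply[len(NEED_MORE):].strip() or question
        context = context + search(follow_up)
    return "No answer found in the retrieved transcripts."


def fake_search(query: str) -> List[str]:
    # Stand-in for a vector store lookup.
    store = {
        "How does back propagation work?": ["intro to neural networks"],
        "chain rule": ["gradients flow backwards via the chain rule"],
    }
    return store.get(query, [])


def fake_ask(question: str, context: List[str]) -> str:
    # Stand-in for the model: answers only once the needed passage is present.
    if any("chain rule" in passage for passage in context):
        return "It applies the chain rule layer by layer."
    return NEED_MORE + " chain rule"


answer = answer_with_rag("How does back propagation work?", fake_search, fake_ask)
no_refresh = answer_with_rag(
    "How does back propagation work?", fake_search, fake_ask, max_refreshes=0
)
```

Passing the search and model functions in as arguments keeps the loop testable without network access or an API key.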
@@ -163,7 +177,8 @@ yt-fts update --channel "3Blue1Brown"
 
 
 **Export all of a channel's transcript:**
-This command will create a directory in current working directory with the youtube
+
+This command will create a directory in current working directory with the YouTube
 channel id of the specified channel.
 ```bash
 # Export to vtt

pyproject.toml

+1 -1
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "yt-fts"
-version = "0.1.51"
+version = "0.1.52"
 description = "Search all of a YouTube channel from the command line"
 readme = "README.md"
 requires-python = ">=3.8"
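
Note on the version bump: this commit changes the version string in two places, `pyproject.toml` and the `YT_FTS_VERSION` constant in `yt_fts/yt_fts.py`. A small check like the one below could guard against the two drifting apart. It is a sketch only: `find_version` and `versions_match` are hypothetical helpers, and nothing in the commit indicates the project has such a check.

```python
import re
from typing import Optional


def find_version(text: str, pattern: str) -> Optional[str]:
    """Return the first captured version string in text, or None if absent."""
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1) if match else None


def versions_match(pyproject_text: str, module_text: str) -> bool:
    """True when both files declare a version and the two strings are equal."""
    declared = find_version(pyproject_text, r'^version\s*=\s*"([^"]+)"')
    constant = find_version(module_text, r'^YT_FTS_VERSION\s*=\s*"([^"]+)"')
    return declared is not None and declared == constant


pyproject_text = '[project]\nname = "yt-fts"\nversion = "0.1.52"\n'
module_text = 'YT_FTS_VERSION = "0.1.52"\nconsole = Console()\n'
in_sync = versions_match(pyproject_text, module_text)
```

In a test suite the two strings would come from reading the real files. They are inlined here so the example runs on its own.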

yt_fts/yt_fts.py

+1 -1
@@ -26,7 +26,7 @@
 show_message
 )
 
-YT_FTS_VERSION = "0.1.51"
+YT_FTS_VERSION = "0.1.52"
 console = Console()
 
 
