add sub proxy pool mechanics #213

Merged on Mar 25, 2024 (7 commits)

inVains (Contributor) commented Mar 25, 2024

With a separate tester class provided, this can test and store the proxy IPs that work for a specific URL, and expose them through an API.
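A rough sketch of the idea, not the code merged in this PR: each site gets its own tester with its own storage key, and one shared pass over the fetched proxies fills every per-site list. The names `SubTester`, `SubPoolRegistry`, `check`, and `available` are hypothetical, and an in-memory dict stands in for the real store.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class SubTester:
    """Hypothetical per-site tester: a storage key, a target URL, and a check."""
    key: str
    test_url: str
    # check(proxy, test_url) returns True if the proxy works for this site.
    check: Callable[[str, str], bool]


@dataclass
class SubPoolRegistry:
    """In-memory stand-in for the store: maps tester key to usable proxies."""
    testers: List[SubTester] = field(default_factory=list)
    store: Dict[str, Set[str]] = field(default_factory=dict)

    def register(self, tester: SubTester) -> None:
        self.testers.append(tester)
        self.store.setdefault(tester.key, set())

    def run(self, proxies: List[str]) -> None:
        # One pass over the shared proxy list; every tester judges each proxy.
        for proxy in proxies:
            for tester in self.testers:
                if tester.check(proxy, tester.test_url):
                    self.store[tester.key].add(proxy)
                else:
                    self.store[tester.key].discard(proxy)

    def available(self, key: str) -> List[str]:
        # What a per-site API endpoint could return.
        return sorted(self.store.get(key, set()))
```

Because all testers share one proxy list, the proxies only need to be fetched once, however many sites are registered.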

Germey (Member) commented Mar 25, 2024

Could you describe the scenarios where introducing this helps? What problem does it mainly solve?

inVains (Contributor, Author) commented Mar 25, 2024

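It addresses two problems:

  1. Building several dedicated proxy pools means configuring TEST_URL and REDIS_KEY for each one and starting multiple separate proxypool instances.
    • Over long runs, multiple pools use a lot of memory.
    • Some private proxy APIs limit how often IPs can be fetched (one every 10 seconds). When several pools call get at the same time, they hit the limit, fail to obtain IPs, and run inefficiently (on a tight budget).
  2. The tester's default check only looks at the status code, so in complex cases it cannot tell whether a proxy actually works.
    • Some sites, when they restrict an IP, return JSON with a 200 status even though the request actually failed and the IP is restricted.
    • Some sites require fetching a dynamic token before the request, and only then can validity be confirmed.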
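To illustrate the second problem, here is a minimal sketch of a check that looks past the status code. It assumes a hypothetical site that reports failures inside the JSON body as a non-zero `code` field. The function name `proxy_passed` and the payload shape are illustrative assumptions, not part of this PR.

```python
import json


def proxy_passed(status_code: int, body: str) -> bool:
    """Return True only if the response looks like a real success."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all, e.g. an HTML block page served with status 200.
        return False
    if not isinstance(payload, dict):
        return False
    # Assumed convention: {"code": 0, ...} means success, anything else is a failure.
    return payload.get("code") == 0
```

A site that needs a dynamic token would require a two-step check (fetch the token, then make the real request through the same proxy). The same idea applies: the tester decides validity from the final response content, not the status code alone.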

Germey merged commit 78b3244 into Python3WebSpider:master on Mar 25, 2024

Germey (Member) commented Mar 25, 2024

Thanks for the explanation. The code looks fine, so I have merged it.
