This is a simple Scrapy-based web crawler designed to scrape articles from the Elkhabar website and store them in an SQLite database.
- Scrapes article details such as title, author, publish date, number of readers, and content.
- Stores scraped data in an SQLite database (elkhabar_articles.db).
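The storage step could be implemented as a Scrapy item pipeline along the following lines. This is a minimal sketch, not the project's actual code: the class name SQLitePipeline, the db_path parameter, and the column types are assumptions, while the database file name, the article table, and the column names come from this README.

```python
import sqlite3


class SQLitePipeline:
    """Hypothetical item pipeline; the project's real pipeline may differ."""

    def __init__(self, db_path="elkhabar_articles.db"):
        # db_path is an assumed parameter; the default matches the README.
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Scrapy calls this when the spider starts: open the database
        # and create the table if it does not exist yet.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            """
            CREATE TABLE IF NOT EXISTS article (
                title TEXT,
                author TEXT,
                publish_date TEXT,
                number_readers INTEGER,  -- column types are assumptions
                content TEXT
            )
            """
        )

    def process_item(self, item, spider):
        # Insert one scraped article; missing fields are stored as NULL.
        self.conn.execute(
            "INSERT INTO article "
            "(title, author, publish_date, number_readers, content) "
            "VALUES (?, ?, ?, ?, ?)",
            (
                item.get("title"),
                item.get("author"),
                item.get("publish_date"),
                item.get("number_readers"),
                item.get("content"),
            ),
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # Scrapy calls this when the spider finishes.
        self.conn.close()
```

In a Scrapy project, a pipeline like this is enabled through the ITEM_PIPELINES setting in settings.py.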
git clone https://github.com/AmeUr56/El-Khabar-Crawler
pip install -r requirements.txt
scrapy crawl elkhabar_spider
The crawler saves article data in the article table of the elkhabar_articles.db file. The table includes the following columns:
- title
- author
- publish_date
- number_readers
- content
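Once a crawl has finished, the stored articles can be read back with Python's built-in sqlite3 module. The sketch below is illustrative: the function name most_read_articles and its parameters are hypothetical, and it assumes number_readers is stored as a number (if it is stored as text, the ordering would be lexicographic).

```python
import sqlite3


def most_read_articles(db_path="elkhabar_articles.db", limit=5):
    """Return (title, author, publish_date, number_readers) rows,
    most-read first. Hypothetical helper, not part of the project."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(
            "SELECT title, author, publish_date, number_readers "
            "FROM article "
            "ORDER BY number_readers DESC "
            "LIMIT ?",
            (limit,),
        )
        return cursor.fetchall()
    finally:
        # Always release the database handle, even if the query fails.
        conn.close()
```

Calling most_read_articles() from the directory that contains elkhabar_articles.db returns a list of tuples that can be printed or processed further.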
This spider and similar projects are intended for learning purposes only. Please ensure you comply with the website’s terms of service and robots.txt when using the spider.
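One way to check a URL against robots.txt rules before crawling is Python's standard urllib.robotparser module, sketched below. The helper name is_allowed and the sample rules are hypothetical and do not reflect the Elkhabar site's actual robots.txt. Within a crawl, Scrapy's own ROBOTSTXT_OBEY setting can enforce the same check; this README does not say whether the project enables it.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if robots_txt permits user_agent to fetch url.
    Hypothetical helper for illustration only."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules, used only to demonstrate the helper.
SAMPLE_RULES = """User-agent: *
Disallow: /admin/
"""
```

With these sample rules, a URL under /admin/ is rejected and any other path is allowed.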
This project is licensed under the MIT License.