# Price Tracker 🛒

A comprehensive web scraper for tracking product prices across multiple e-commerce sites. Built with Python, Beautiful Soup, and Flask.

## Features ✨

- **Multi-site Price Tracking**: Monitor prices across Amazon, eBay, Walmart, and more
- **Beautiful Web UI**: Clean, responsive interface for managing products and viewing price history
- **Price Alerts**: Get notified when products reach your target price
- **Historical Data**: View price trends with interactive charts
- **Automated Scraping**: Schedule regular price checks
- **Multiple Notifications**: Email and webhook notifications
- **Robust Scraping**: Built-in retry logic, rotating user agents, and rate limiting

## Quick Start 🚀

1. **Clone and Setup**:
```bash
git clone <your-repo-url>
cd price-tracker
chmod +x setup.sh
./setup.sh
```

2. **Start the Web UI**:
```bash
source venv/bin/activate
python main.py --mode web
```

3. **Visit**: http://localhost:5000

## Usage 📋

### Web Interface

The web interface provides:
- **Dashboard**: Overview of all tracked products with current prices
- **Add Products**: Easy form to add new products with URLs from multiple sites
- **Product Details**: Detailed view with price history charts and statistics
- **Settings**: Configuration management and system health checks

### Command Line

```bash
# Start web UI
python main.py --mode web

# Run scraping once
python main.py --mode scrape

# Add sample products for testing
python examples/add_sample_products.py

# Scheduled scraping (for cron jobs)
python scripts/scheduled_scraping.py
```

### Scheduled Scraping

Add to your crontab for automatic price checks:
```bash
# cron runs commands with /bin/sh by default, so use the portable `.` instead of `source`

# Every 6 hours
0 */6 * * * cd /path/to/price-tracker && . venv/bin/activate && python scripts/scheduled_scraping.py

# Daily at 8 AM
0 8 * * * cd /path/to/price-tracker && . venv/bin/activate && python scripts/scheduled_scraping.py
```

## Configuration ⚙️

Edit `config.json` to customize:

### Scraping Settings
```json
{
  "scraping": {
    "delay_between_requests": 2,
    "max_concurrent_requests": 5,
    "timeout": 30,
    "retry_attempts": 3
  }
}
```

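As a rough illustration of how these four values might be consumed, the sketch below reads the `scraping` block and keeps the values shown above as defaults when a key is missing. The `ScrapingSettings` class and `load_scraping_settings` function are hypothetical names and do not describe the project's `src/config.py`; the units (seconds) are also an assumption.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScrapingSettings:
    """Hypothetical container mirroring the "scraping" block of config.json."""
    delay_between_requests: float = 2   # pause between requests (assumed seconds)
    max_concurrent_requests: int = 5    # upper bound on parallel requests
    timeout: float = 30                 # per-request timeout (assumed seconds)
    retry_attempts: int = 3             # attempts before giving up on a URL


def load_scraping_settings(raw_json: str) -> ScrapingSettings:
    """Parse config text, ignoring unknown keys and keeping defaults for missing ones."""
    block = json.loads(raw_json).get("scraping", {})
    known = {f.name for f in fields(ScrapingSettings)}
    settings = ScrapingSettings(**{k: v for k, v in block.items() if k in known})
    if settings.delay_between_requests < 0 or settings.retry_attempts < 1:
        raise ValueError("delay must be >= 0 and retry_attempts >= 1")
    return settings


if __name__ == "__main__":
    # Only "timeout" is overridden; the other fields keep their defaults.
    print(load_scraping_settings('{"scraping": {"timeout": 10}}'))
```

Validating the values once at load time keeps a typo in `config.json` from surfacing later as a confusing scraping failure.
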
### Email Notifications
```json
{
  "notifications": {
    "email": {
      "enabled": true,
      "smtp_server": "smtp.gmail.com",
      "smtp_port": 587,
      "sender_email": "your-email@gmail.com",
      "sender_password": "your-app-password",
      "recipient_email": "alerts@yourdomain.com"
    }
  }
}
```

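To show how the fields above fit together, here is a minimal sketch that composes an alert and sends it with Python's standard `smtplib`, using STARTTLS as is usual on port 587. The function names `build_alert_email` and `send_email` and the message wording are assumptions for illustration; the project's `src/notification.py` may work differently.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_alert_email(cfg: dict, product: str, price: float, target: float) -> EmailMessage:
    """Compose a plain-text alert from the "email" block of config.json."""
    msg = EmailMessage()
    msg["From"] = cfg["sender_email"]
    msg["To"] = cfg["recipient_email"]
    msg["Subject"] = f"Price alert: {product} is now {price:.2f}"
    msg.set_content(f"{product} dropped to {price:.2f} (target: {target:.2f}).")
    return msg


def send_email(cfg: dict, msg: EmailMessage) -> None:
    """Send the message over SMTP, upgrading the connection with STARTTLS."""
    if not cfg.get("enabled", False):
        return  # notifications are switched off in the config
    context = ssl.create_default_context()
    with smtplib.SMTP(cfg["smtp_server"], cfg["smtp_port"], timeout=30) as server:
        server.starttls(context=context)
        server.login(cfg["sender_email"], cfg["sender_password"])
        server.send_message(msg)
```

Keeping message construction separate from sending makes the alert text easy to check without contacting a mail server.
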
### Adding New Sites

Add new e-commerce sites by extending the sites configuration:

```json
{
  "sites": {
    "your_site": {
      "enabled": true,
      "base_url": "https://www.yoursite.com",
      "selectors": {
        "price": [".price", ".cost"],
        "title": [".product-title"],
        "availability": [".stock-status"]
      }
    }
  }
}
```

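Before running a scrape against a newly added site, it can help to check the entry's shape. The sketch below is one possible check, not part of the project: `validate_site` is a hypothetical helper, and treating `price` and `title` as required while `availability` is optional is an assumption.

```python
from urllib.parse import urlparse

REQUIRED_SELECTORS = ("price", "title")  # assumption: "availability" is optional


def validate_site(name: str, site: dict) -> list[str]:
    """Return human-readable problems; an empty list means the entry looks usable."""
    problems = []
    parsed = urlparse(site.get("base_url", ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append(f"{name}: base_url must be an absolute http(s) URL")
    selectors = site.get("selectors", {})
    for key in REQUIRED_SELECTORS:
        values = selectors.get(key)
        if not isinstance(values, list) or not values:
            problems.append(f"{name}: selectors.{key} must be a non-empty list")
    for key, values in selectors.items():
        # Every selector in a list should be a non-blank string.
        if isinstance(values, list) and not all(isinstance(v, str) and v.strip() for v in values):
            problems.append(f"{name}: selectors.{key} contains an empty or non-string entry")
    return problems


if __name__ == "__main__":
    entry = {
        "enabled": True,
        "base_url": "https://www.yoursite.com",
        "selectors": {"price": [".price", ".cost"], "title": [".product-title"]},
    }
    print(validate_site("your_site", entry) or "looks fine")
```
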
## Architecture 🏗️

- **`main.py`**: Application entry point
- **`src/config.py`**: Configuration management
- **`src/database.py`**: SQLite database operations
- **`src/scraper.py`**: Core scraping logic with Beautiful Soup
- **`src/scraper_manager.py`**: Scraping coordination and task management
- **`src/notification.py`**: Email and webhook notifications
- **`src/web_ui.py`**: Flask web interface
- **`templates/`**: HTML templates with Bootstrap styling

## Features in Detail 🔍

### Smart Price Extraction
- Multiple CSS selectors per site for robust price detection
- Handles various price formats and currencies
- Availability detection (in stock/out of stock)
- Automatic retry with exponential backoff

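Handling "various price formats" mostly comes down to deciding which separator is the decimal mark. The following sketch shows one heuristic for that, using only the standard library. `parse_price` is a hypothetical helper rather than the function used in `src/scraper.py`, and it simply takes the first number it finds in the text.

```python
import re
from decimal import Decimal
from typing import Optional

# A digit followed by more digits, separators, or spaces, e.g. "1,299.99" or "1 299,99".
_NUMBER = re.compile(r"\d[\d.,\s]*")


def parse_price(text: str) -> Optional[Decimal]:
    """Extract the first price-like number from text such as "$1,299.99" or "1.299,99 €"."""
    match = _NUMBER.search(text)
    if match is None:
        return None
    raw = re.sub(r"\s", "", match.group()).rstrip(".,")
    last_sep = max(raw.rfind("."), raw.rfind(","))
    if last_sep == -1:
        return Decimal(raw)  # plain integer, no separators at all
    sep = raw[last_sep]
    digits_after = len(raw) - last_sep - 1
    has_both = "." in raw and "," in raw
    # The last separator is a decimal mark when both kinds appear, or when it occurs
    # once and is not followed by exactly three digits ("1,299" is read as grouping).
    is_decimal = has_both or (raw.count(sep) == 1 and digits_after != 3)
    if is_decimal:
        whole = re.sub(r"[.,]", "", raw[:last_sep])
        return Decimal(f"{whole}.{raw[last_sep + 1:]}")
    return Decimal(re.sub(r"[.,]", "", raw))


if __name__ == "__main__":
    for sample in ("$1,299.99", "1.299,99 €", "£1,299", "19,99 EUR", "Out of stock"):
        print(repr(sample), "->", parse_price(sample))
```

The three-digit rule is a guess: a value such as `1.250` is ambiguous without knowing the site's locale, so a per-site hint in the configuration would be more reliable than any heuristic.
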
### Data Storage
- SQLite database for price history
- Product management with URLs and target prices
- Price statistics and trend analysis

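A minimal sketch of how such storage could look with Python's built-in `sqlite3` module follows. The table and column names (`products`, `price_history`, `target_price`, `checked_at`) and the helper functions are assumptions for illustration and are not taken from `src/database.py`.

```python
import sqlite3

# Hypothetical schema: one row per product, one row per observed price.
SCHEMA = """
CREATE TABLE IF NOT EXISTS products (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    url          TEXT NOT NULL UNIQUE,
    target_price REAL
);
CREATE TABLE IF NOT EXISTS price_history (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES products(id),
    price      REAL NOT NULL,
    checked_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_product(conn: sqlite3.Connection, name: str, url: str, target_price=None) -> int:
    cur = conn.execute(
        "INSERT INTO products (name, url, target_price) VALUES (?, ?, ?)",
        (name, url, target_price),
    )
    conn.commit()
    return cur.lastrowid


def record_price(conn: sqlite3.Connection, product_id: int, price: float) -> None:
    conn.execute(
        "INSERT INTO price_history (product_id, price) VALUES (?, ?)",
        (product_id, price),
    )
    conn.commit()


def price_stats(conn: sqlite3.Connection, product_id: int) -> tuple:
    """Return (lowest, highest, average, number of samples) for one product."""
    return conn.execute(
        "SELECT MIN(price), MAX(price), AVG(price), COUNT(*) "
        "FROM price_history WHERE product_id = ?",
        (product_id,),
    ).fetchone()
```

Storing every observation instead of only the latest price is what makes the statistics and trend charts possible.
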
### Web Interface
- Responsive design with Bootstrap 5
- Interactive price charts with Plotly
- Real-time scraping from the UI
- Product comparison and best price highlighting

### Notifications
- Email alerts when target prices are reached
- Webhook integration for custom notifications
- Rich HTML email templates
- Test notification functionality

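For the webhook side, a receiver typically expects a JSON `POST`. The sketch below builds such a request with the standard library. The payload field names, the `should_alert` rule (alert when the price is at or below the target), and all function names are assumptions; the project's actual payload may differ.

```python
import json
import urllib.request
from datetime import datetime, timezone


def should_alert(price: float, target: float) -> bool:
    """Assumed rule: the target is 'reached' when the price is at or below it."""
    return price <= target


def build_webhook_request(webhook_url: str, product: str, price: float,
                          target: float) -> urllib.request.Request:
    """Build a JSON POST describing a price alert; field names are illustrative."""
    payload = {
        "event": "target_price_reached",
        "product": product,
        "price": price,
        "target_price": target,
        "sent_at": datetime.now(timezone.utc).isoformat(),
    }
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_webhook(request: urllib.request.Request, timeout: float = 10) -> int:
    """Deliver the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A test notification can reuse `build_webhook_request` with sample values, which exercises the same code path as a real alert.
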
## Tips for Best Results 📈

1. **Respectful Scraping**: The tool adds delays and rate limiting between requests so it does not overload websites
2. **URL Selection**: Use direct product page URLs, not search results or category pages
3. **Target Prices**: Set realistic target prices based on historical data
4. **Multiple Sites**: Track the same product on multiple sites to find the best deal
5. **Regular Updates**: Run scraping regularly but not too often; every few hours is usually enough

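The delay mentioned in the first tip can be pictured as a per-host throttle: each host gets its own timer, so tracking several sites does not slow any single one down more than necessary. This is only a sketch of the idea. `HostThrottle` is a hypothetical class, and whether the project applies `delay_between_requests` per host or globally is not stated here.

```python
import time
from urllib.parse import urlparse


class HostThrottle:
    """Enforce a minimum gap between requests to the same host."""

    def __init__(self, min_delay: float = 2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock            # injectable for testing
        self._sleep = sleep            # injectable for testing
        self._last_request = {}        # host -> time of the most recent request

    def wait(self, url: str) -> float:
        """Sleep if needed before requesting `url`; return the seconds waited."""
        host = urlparse(url).netloc
        now = self._clock()
        last = self._last_request.get(host)
        pause = 0.0 if last is None else max(0.0, self.min_delay - (now - last))
        if pause:
            self._sleep(pause)
        # Record when the request is actually allowed to go out.
        self._last_request[host] = now + pause
        return pause


if __name__ == "__main__":
    throttle = HostThrottle(min_delay=2.0)
    for url in ("https://www.yoursite.com/p/1", "https://www.yoursite.com/p/2"):
        waited = throttle.wait(url)
        print(f"waited {waited:.1f}s before {url}")
```
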
## Troubleshooting 🔧

### Common Issues

1. **No prices found**: Check that the CSS selectors in `config.json` still match the site's current markup
2. **403/429 errors**: The site may be blocking or throttling requests. Try a different user agent or increase `delay_between_requests`
3. **Database errors**: Ensure the database file is writable
4. **Email not working**: Verify the SMTP settings; for Gmail, use an app password

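When a site answers with 429, retrying immediately usually makes things worse. The sketch below illustrates the exponential backoff idea with a little random jitter. `backoff_delays` and `fetch_with_retry` are hypothetical names, and the base delay and cap are arbitrary example values, not the project's settings.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(attempts: int, base: float = 2.0, cap: float = 60.0) -> list[float]:
    """Delays before each retry: base, 2*base, 4*base, ... limited to `cap`."""
    return [min(cap, base * (2 ** i)) for i in range(attempts - 1)]


def fetch_with_retry(fetch: Callable[[], T], attempts: int = 3, base: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fetch` until it succeeds or `attempts` is used up, waiting longer each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(attempts, base)
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:  # a real scraper would catch only network/HTTP errors
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # Up to 25% random jitter so parallel workers do not retry in lockstep.
            sleep(delays[attempt] * (1 + random.uniform(0, 0.25)))
    raise AssertionError("unreachable")
```

With `attempts=3` and `base=2.0`, a failing request is retried after roughly 2 and then 4 seconds before the error is raised.
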
### Adding Debug Information

Enable debug logging by setting the logging level in `main.py`:
```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## Legal and Ethical Considerations ⚖️

- Respect robots.txt files
- Don't overload servers with too many requests
- Use for personal/educational purposes
- Check terms of service for each site
- Be mindful of rate limits

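Checking `robots.txt` can be automated with Python's standard `urllib.robotparser`. The sketch below shows one way to do it. The user agent string and helper names are hypothetical, and nothing here implies the project already performs this check.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "price-tracker-example"  # hypothetical user agent string


def parser_from_text(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def parser_from_site(base_url: str) -> RobotFileParser:
    """Download and parse <base_url>/robots.txt (performs a network request)."""
    parser = RobotFileParser(base_url.rstrip("/") + "/robots.txt")
    parser.read()
    return parser


def is_allowed(parser: RobotFileParser, url: str, user_agent: str = USER_AGENT) -> bool:
    """True when the site's rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /cart/\nCrawl-delay: 5\n"
    parser = parser_from_text(rules)
    print(is_allowed(parser, "https://www.yoursite.com/product/123"))
    print(is_allowed(parser, "https://www.yoursite.com/cart/checkout"))
    print(parser.crawl_delay(USER_AGENT))
```

If a site publishes a `Crawl-delay`, using the larger of that value and `delay_between_requests` is a reasonable way to honor it.
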
## Contributing 🤝

Feel free to contribute by:
- Adding support for new e-commerce sites
- Improving CSS selectors for existing sites
- Adding new notification methods
- Enhancing the web UI
- Fixing bugs and improving performance

## License 📄

This project is for educational purposes. Please review the terms of service of websites you scrape and use responsibly.

---

**Happy price tracking! 🛍️**