# Price Tracker 🛒

A comprehensive web scraper for tracking product prices across multiple e-commerce sites. Built with Python, Beautiful Soup, and Flask.

## Features ✨

- **Multi-site Price Tracking**: Monitor prices across JJ Food Service, A to Z Catering, and Amazon UK
- **Beautiful Web UI**: Clean, responsive interface for managing products and viewing price history
- **Daily Shopping Lists**: Automatically generate shopping lists with the best prices for each store
- **Price Alerts**: Get notified when products reach your target price
- **Historical Data**: View price trends with interactive charts
- **Automated Scraping**: Schedule regular price checks
- **Multiple Notifications**: Email and webhook notifications for price alerts and shopping lists
- **Robust Scraping**: Built-in retry logic, rotating user agents, and rate limiting
- **Special Pricing Detection**: Automatically detects and prioritizes delivery prices and special offers
- **Smart Shopping**: Compare prices across stores and get recommendations for the best deals

## Quick Start 🚀

1. **Clone and Set Up**:
   ```bash
   git clone <your-repo-url>
   cd price-tracker
   chmod +x setup.sh
   ./setup.sh
   ```

2. **Start the Web UI**:
   ```bash
   source venv/bin/activate
   python main.py --mode web
   ```

3. **Visit**: http://localhost:5000

## Usage 📋

### Web Interface

The web interface provides:
- **Dashboard**: Overview of all tracked products with current prices
- **Shopping Lists**: Daily shopping lists showing the best deals for each store
- **Add Products**: Easy form to add new products with URLs from multiple sites
- **Product Details**: Detailed view with price history charts and statistics
- **Settings**: Configuration management and system health checks

### Command Line

```bash
# Start web UI
python main.py --mode web

# Run scraping once
python main.py --mode scrape

# Add sample products for testing
python examples/add_sample_products.py

# Scheduled scraping (for cron jobs)
python scripts/scheduled_scraping.py
```

### Scheduled Scraping

Add to your crontab for automatic price checks. Cron runs commands with `/bin/sh` by default, where `source` may be unavailable, so the entries use the portable `.` form to activate the virtual environment:
```bash
# Every 6 hours
0 */6 * * * cd /path/to/price-tracker && . venv/bin/activate && python scripts/scheduled_scraping.py

# Daily at 8 AM
0 8 * * * cd /path/to/price-tracker && . venv/bin/activate && python scripts/scheduled_scraping.py
```

## Configuration ⚙️

Edit `config.json` to customize:

### Scraping Settings
```json
{
  "scraping": {
    "delay_between_requests": 2,
    "max_concurrent_requests": 5,
    "timeout": 30,
    "retry_attempts": 3
  }
}
```

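
As a rough illustration of how these values might be consumed, the sketch below overlays the `scraping` block of a parsed config on a set of defaults. `ScrapingSettings` and `load_scraping_settings` are hypothetical names, not part of the project, and the defaults simply copy the example values above.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ScrapingSettings:
    # Hypothetical container; defaults copy the example config values.
    delay_between_requests: float = 2
    max_concurrent_requests: int = 5
    timeout: float = 30
    retry_attempts: int = 3


def load_scraping_settings(raw_json: str) -> ScrapingSettings:
    """Parse config JSON text and overlay its "scraping" block on the defaults."""
    section = json.loads(raw_json).get("scraping", {})
    known = {f.name for f in fields(ScrapingSettings)}
    # Unknown keys are ignored so a stray entry does not raise a TypeError.
    return ScrapingSettings(**{k: v for k, v in section.items() if k in known})


# Only "timeout" is overridden; everything else falls back to the defaults.
settings = load_scraping_settings('{"scraping": {"timeout": 10}}')
print(settings)
```

Keeping defaults in one place means a partial `config.json` still yields a complete set of settings.
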
### Email Notifications
```json
{
  "notifications": {
    "email": {
      "enabled": true,
      "smtp_server": "smtp.gmail.com",
      "smtp_port": 587,
      "sender_email": "your-email@gmail.com",
      "sender_password": "your-app-password",
      "recipient_email": "alerts@yourdomain.com"
    }
  }
}
```

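
To show how these fields could fit together, here is a minimal sketch using Python's standard `smtplib` and `email.message` modules. The function names and the sample product are assumptions for illustration and may differ from what `src/notification.py` actually does. Port 587 conventionally expects STARTTLS, which is why `starttls()` is called before logging in.

```python
import smtplib
from email.message import EmailMessage


def build_alert_email(email_cfg: dict, product: str, price: float, target: float) -> EmailMessage:
    """Build a plain-text price alert from the "email" config block (hypothetical helper)."""
    msg = EmailMessage()
    msg["Subject"] = f"Price alert: {product} is now £{price:.2f}"
    msg["From"] = email_cfg["sender_email"]
    msg["To"] = email_cfg["recipient_email"]
    msg.set_content(f"{product} dropped to £{price:.2f} (target £{target:.2f}).")
    return msg


def send_email(email_cfg: dict, msg: EmailMessage) -> None:
    """Send the message over SMTP, upgrading the connection with STARTTLS first."""
    with smtplib.SMTP(email_cfg["smtp_server"], email_cfg["smtp_port"], timeout=30) as smtp:
        smtp.starttls()
        smtp.login(email_cfg["sender_email"], email_cfg["sender_password"])
        smtp.send_message(msg)


# Sample values copied from the config above; the product is made up.
cfg = {
    "smtp_server": "smtp.gmail.com",
    "smtp_port": 587,
    "sender_email": "your-email@gmail.com",
    "sender_password": "your-app-password",
    "recipient_email": "alerts@yourdomain.com",
}
msg = build_alert_email(cfg, "Olive Oil 5L", 18.5, 20.0)
print(msg["Subject"])
# send_email(cfg, msg)  # Uncomment once real credentials are configured.
```
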
### Adding New Sites

Add new e-commerce sites by extending the `sites` configuration:

```json
{
  "sites": {
    "atoz_catering": {
      "enabled": true,
      "base_url": "https://www.atoz-catering.co.uk",
      "selectors": {
        "price": [
          ".my-price.price-offer",
          ".delivery-price",
          ".price"
        ],
        "special_offer": [
          ".my-price.price-offer",
          ".special-offer",
          "del:contains('£')"
        ]
      }
    }
  }
}
```

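
One plausible reading of the selector lists is that they are tried in order and the first match wins. That ordering is an assumption, and the sketch below is not the project's scraper code. `first_match` is a hypothetical helper that takes any lookup callable, so it can be exercised without fetching a page.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def first_match(
    selectors: Iterable[str],
    select_one: Callable[[str], Optional[T]],
) -> Optional[Tuple[str, T]]:
    """Return (selector, element) for the first selector that matches, else None."""
    for selector in selectors:
        element = select_one(selector)
        if element is not None:
            return selector, element
    return None


# Stand-in for a parsed page: maps selector -> matched text (made-up values).
fake_page = {".delivery-price": "£12.49", ".price": "£13.99"}
price_selectors = [".my-price.price-offer", ".delivery-price", ".price"]

# The offer selector is absent, so the delivery price is picked before ".price".
print(first_match(price_selectors, fake_page.get))
```

With Beautiful Soup, `soup.select_one` could be passed as `select_one`, since it also returns `None` when nothing matches.
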
## Architecture 🏗️

- **`main.py`**: Application entry point
- **`src/config.py`**: Configuration management
- **`src/database.py`**: SQLite database operations
- **`src/scraper.py`**: Core scraping logic with Beautiful Soup
- **`src/scraper_manager.py`**: Scraping coordination and task management
- **`src/notification.py`**: Email and webhook notifications
- **`src/web_ui.py`**: Flask web interface
- **`templates/`**: HTML templates with Bootstrap styling

## Features in Detail 🔍

### Smart Price Extraction
- Multiple CSS selectors per site for robust price detection
- Handles various price formats and currencies
- Availability detection (in stock/out of stock)
- Automatic retry with exponential backoff

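
To make "various price formats" concrete, here is a small sketch of turning scraped text into a number. `parse_price` is a hypothetical helper, not the project's parser. It assumes UK/US-style separators (comma for thousands, dot for decimals) and simply takes the first number it finds, so text with a leading quantity such as "12 x 500ml" would need extra handling.

```python
import re
from decimal import Decimal
from typing import Optional

# Whole part with optional thousands separators, then optional pence.
_PRICE_RE = re.compile(r"(\d{1,3}(?:,\d{3})+|\d+)(?:\.(\d{1,2}))?")


def parse_price(text: str) -> Optional[Decimal]:
    """Extract the first price-like number from text such as "£1,299.50 inc VAT"."""
    match = _PRICE_RE.search(text)
    if match is None:
        return None
    whole = match.group(1).replace(",", "")
    pence = match.group(2) or "0"
    return Decimal(f"{whole}.{pence}")


for sample in ["£1,299.50 inc VAT", "$4.5", "Out of stock"]:
    print(sample, "->", parse_price(sample))
```

`Decimal` is used instead of `float` so that prices compare and sum exactly.
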
### Data Storage
- SQLite database for price history
- Product management with URLs and target prices
- Price statistics and trend analysis

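
The following sketch shows one way such a store could be laid out with the standard `sqlite3` module. The table names, columns, site keys other than `atoz_catering`, and sample rows are all assumptions for illustration. The real schema lives in `src/database.py` and may differ.

```python
import sqlite3

# Hypothetical schema: one row per product, one row per scraped price.
SCHEMA = """
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    target_price REAL
);
CREATE TABLE price_history (
    product_id INTEGER NOT NULL REFERENCES products(id),
    site TEXT NOT NULL,
    price REAL NOT NULL,
    scraped_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""

conn = sqlite3.connect(":memory:")  # In-memory database, discarded on exit.
conn.executescript(SCHEMA)
conn.execute("INSERT INTO products (id, name, target_price) VALUES (1, 'Olive Oil 5L', 20.0)")
conn.executemany(
    "INSERT INTO price_history (product_id, site, price) VALUES (?, ?, ?)",
    [(1, "jjfoodservice", 21.5), (1, "atoz_catering", 19.0), (1, "amazon_uk", 22.0)],
)

# Basic statistics for one product: lowest, highest, and average price.
stats = conn.execute(
    "SELECT MIN(price), MAX(price), ROUND(AVG(price), 2) "
    "FROM price_history WHERE product_id = 1"
).fetchone()
print(stats)
```
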
### Web Interface
- Responsive design with Bootstrap 5
- Interactive price charts with Plotly
- Real-time scraping from the UI
- Product comparison and best price highlighting

### Notifications
- Email alerts when target prices are reached
- Webhook integration for custom notifications
- Rich HTML email templates
- Test notification functionality

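
As an illustration of the alert condition, the sketch below builds a JSON body for a webhook only when the current price is at or below the target. The payload field names and the `build_alert_payload` helper are assumptions. The actual webhook format is defined in `src/notification.py`.

```python
import json
from typing import Optional


def build_alert_payload(product: str, site: str, price: float, target: float) -> Optional[str]:
    """Return a JSON webhook body if price is at or below target, else None."""
    if price > target:
        return None  # Target not reached: nothing to send.
    return json.dumps({
        "event": "price_alert",
        "product": product,
        "site": site,
        "price": price,
        "target_price": target,
        "saving": round(target - price, 2),
    })


# Made-up sample: 19.00 against a target of 20.00 triggers an alert.
print(build_alert_payload("Olive Oil 5L", "atoz_catering", 19.0, 20.0))
print(build_alert_payload("Olive Oil 5L", "atoz_catering", 21.0, 20.0))
```
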
## Tips for Best Results 📈

1. **Respectful Scraping**: The tool includes delays and rate limiting to avoid overloading websites
2. **URL Selection**: Use direct product page URLs, not search results or category pages
3. **Target Prices**: Set realistic target prices based on historical data
4. **Multiple Sites**: Track the same product on multiple sites to find the best deals
5. **Regular Updates**: Run scraping regularly but not too frequently (every few hours is a reasonable interval)

## Deployment 🚀

### Docker Deployment

1. **Build and run with Docker**:
   ```bash
   # Build the container
   docker build -t price-tracker .

   # Run with docker-compose
   docker-compose up -d
   ```

2. **Manual Docker deployment**:
   ```bash
   docker run -d \
     --name price-tracker \
     -p 5000:5000 \
     -v "$(pwd)/data:/app/data" \
     price-tracker
   ```

### CI/CD with Azure DevOps

The project includes Azure DevOps pipeline configuration for automated deployments:

1. **Set Up GitHub Integration**:
   - See `AZURE-DEVOPS-SETUP.md` for detailed instructions
   - Pipeline pulls directly from GitHub
   - Automatic builds on push to `main` or `develop` branches

2. **Pipeline Features**:
   - Docker image build and push to registry
   - Security scanning with Trivy
   - Automated testing
   - Multi-environment deployment (dev/prod)

3. **Quick Setup**:
   ```bash
   # Update azure-pipelines.yml with your GitHub repo
   # Create GitHub service connection in Azure DevOps
   # Create Docker registry service connection
   # Run the pipeline
   ```

## Troubleshooting 🔧

### Common Issues

1. **No prices found**: Check if the CSS selectors are correct for the site
2. **403/429 errors**: Sites may be blocking requests; try different user agents or increase delays
3. **Database errors**: Ensure the database file is writable
4. **Email not working**: Verify SMTP settings and app passwords for Gmail

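
For 429 responses in particular, spacing retries further apart often helps. The sketch below shows the general shape of retry with exponential backoff. `retry_with_backoff` and its parameters are hypothetical and not taken from `src/scraper.py`. The `sleep` argument exists only so the waits can be observed without actually pausing.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fetch: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            # base, 2x base, 4x base, ... plus up to 1s of jitter so that
            # parallel workers do not all retry at the same moment.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
    raise ValueError("attempts must be at least 1")


# Demo: a fetch that fails twice, then succeeds; waits are recorded, not slept.
calls, waits = [], []


def flaky_fetch() -> str:
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("HTTP 429")
    return "ok"


result = retry_with_backoff(flaky_fetch, attempts=3, base_delay=2.0, sleep=waits.append)
print(result, [round(w, 1) for w in waits])
```
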
### Adding Debug Information

Enable debug logging by modifying the logging level in `main.py`:
```python
logging.basicConfig(level=logging.DEBUG)
```

## Legal and Ethical Considerations ⚖️

- Respect robots.txt files
- Don't overload servers with too many requests
- Use for personal/educational purposes
- Check terms of service for each site
- Be mindful of rate limits

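
Checking robots.txt can be automated with the standard library's `urllib.robotparser`. The sketch below parses rules supplied as text so it runs offline. The `is_allowed` helper, the `price-tracker` user agent string, and the example rules and URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: everything is allowed except the checkout pages.
rules = """\
User-agent: *
Disallow: /checkout/
"""

print(is_allowed(rules, "price-tracker", "https://example.com/product/123"))
print(is_allowed(rules, "price-tracker", "https://example.com/checkout/basket"))
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` fetches and parses the file instead of passing the text in.
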
## Contributing 🤝

Feel free to contribute by:
- Adding support for new e-commerce sites
- Improving CSS selectors for existing sites
- Adding new notification methods
- Enhancing the web UI
- Fixing bugs and improving performance

## License 📄

This project is for educational purposes. Please review the terms of service of websites you scrape and use responsibly.

---

**Happy price tracking! 🛍️**

## Shopping Lists 🛍️

The price tracker automatically generates daily shopping lists showing the best deals for each store:

### Features
- **Best Price Selection**: Automatically finds the lowest current price for each product
- **Store-Specific Lists**: Separate shopping lists for each tracked store
- **Savings Calculation**: Shows how much you save compared to target prices
- **Customizable Preferences**: Set minimum savings thresholds, maximum items, and delivery schedules
- **Multiple Formats**: View online, print, export to CSV, or email daily lists

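
The best-price selection and savings calculation can be sketched as follows. This is an illustrative reading, not the project's implementation: `build_shopping_lists`, the data shapes, and the sample products and prices are all assumptions. Each product is assigned to the store with its lowest current price, and the saving is the target price minus that price.

```python
from typing import Dict, List, Tuple

Prices = Dict[str, Dict[str, float]]        # product -> {site: current price}
ShoppingLists = Dict[str, List[Tuple[str, float, float]]]  # site -> [(product, price, saving)]


def build_shopping_lists(prices: Prices, targets: Dict[str, float]) -> ShoppingLists:
    """Group each product under the store that currently sells it cheapest."""
    lists: ShoppingLists = {}
    for product, by_site in prices.items():
        site, price = min(by_site.items(), key=lambda item: item[1])
        saving = round(targets[product] - price, 2)  # Negative means above target.
        lists.setdefault(site, []).append((product, price, saving))
    return lists


# Made-up sample data.
prices = {
    "Olive Oil 5L": {"atoz_catering": 19.0, "amazon_uk": 22.0},
    "Basmati Rice 10kg": {"atoz_catering": 15.5, "amazon_uk": 14.25},
}
targets = {"Olive Oil 5L": 20.0, "Basmati Rice 10kg": 14.0}

for site, items in build_shopping_lists(prices, targets).items():
    print(site, items)
```
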
### Daily Automation
Set up automatic daily shopping lists:

```bash
# Generate and send shopping lists manually
python shopping_list_scheduler.py --force

# Test without sending (dry run)
python shopping_list_scheduler.py --dry-run

# Set up daily cron job (9 AM daily), keeping any existing crontab entries
(crontab -l 2>/dev/null; echo "0 9 * * * cd /path/to/price-tracker && python shopping_list_scheduler.py") | crontab -
```

### Shopping List Preferences

For each store, you can configure:
- **Enable/Disable**: Turn shopping lists on/off per store
- **Minimum Savings**: Only include items with savings above the threshold
- **Maximum Items**: Limit list size to keep lists manageable
- **Out of Stock**: Include/exclude unavailable items
- **Delivery Time**: Schedule when to send daily lists
- **Notifications**: Email and/or webhook delivery

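
A minimal sketch of how such preferences might be applied to a store's list is shown below. The `ListItem` and `StorePreferences` classes, their field names, and their defaults are hypothetical. Treating the threshold as inclusive and sorting by largest saving first are also assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ListItem:
    name: str
    saving: float
    in_stock: bool = True


@dataclass
class StorePreferences:
    enabled: bool = True
    min_savings: float = 0.0
    max_items: int = 50
    include_out_of_stock: bool = False


def apply_preferences(items: List[ListItem], prefs: StorePreferences) -> List[ListItem]:
    """Filter and trim one store's shopping list according to its preferences."""
    if not prefs.enabled:
        return []
    kept = [
        item for item in items
        if item.saving >= prefs.min_savings
        and (item.in_stock or prefs.include_out_of_stock)
    ]
    # Largest savings first, so the cut-off drops the least valuable items.
    kept.sort(key=lambda item: item.saving, reverse=True)
    return kept[: prefs.max_items]


# Made-up sample items.
items = [
    ListItem("A", 0.5),
    ListItem("B", 2.0),
    ListItem("C", 3.0, in_stock=False),
    ListItem("D", 1.0),
]
prefs = StorePreferences(min_savings=1.0, max_items=1)
print([item.name for item in apply_preferences(items, prefs)])
```
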
### Web Interface

Access shopping lists via the web interface:
- **View All Lists**: See current shopping lists for all stores
- **Store Details**: Detailed view with full product information
- **Send Immediately**: Manually trigger email/webhook delivery
- **Manage Preferences**: Configure settings per store