update augment + YOLO pipeline
423
README.md
@@ -1,132 +1,148 @@
# ID Cards Data Augmentation Tool
# ID Card Data Augmentation Pipeline

A comprehensive data augmentation tool specifically designed for ID card images, implementing 7 different augmentation techniques to simulate real-world scenarios.
A comprehensive data augmentation pipeline for ID card images with YOLO-based detection and advanced augmentation techniques.

## 🎯 Overview

This tool provides data augmentation capabilities for ID card images, implementing various transformation techniques that mimic real-world conditions such as worn-out cards, partial occlusion, different lighting conditions, and more.
## 🚀 Features

## ✨ Features
### **YOLO-based ID Card Detection**
- Automatic detection and cropping of ID cards from large images
- Configurable confidence and IoU thresholds
- Multiple cropping modes (bbox, square, aspect_ratio)
- Padding and target size customization

### 7 Augmentation Techniques
### **Advanced Data Augmentation**
- **Geometric Transformations**: Rotation with multiple angles
- **Random Cropping**: Simulates partially visible cards
- **Noise Addition**: Simulates worn-out cards
- **Partial Blockage**: Simulates occluded card details
- **Blurring**: Simulates blurred but readable images
- **Brightness/Contrast**: Mimics different lighting conditions
- **Grayscale Conversion**: Final preprocessing step for all images

1. **Rotation** - Simulates cards at different angles
2. **Random Cropping** - Simulates partially visible cards
3. **Random Noise** - Simulates worn-out cards
4. **Horizontal Blockage** - Simulates occluded card details
5. **Grayscale Transformation** - Simulates Xerox/scan copies
6. **Blurring** - Simulates blurred but readable cards
7. **Brightness & Contrast** - Simulates different lighting conditions
### **Flexible Configuration**
- YAML-based configuration system
- Command-line argument overrides
- Environment-specific settings
- Comprehensive logging

### Key Features
## 📋 Requirements

- **Separate Methods**: Each augmentation technique is applied independently
- **Quality Preservation**: Maintains image quality with white background preservation
- **OpenCV Integration**: Uses OpenCV functions for reliable image processing
- **Configurable**: Easy configuration through YAML files
- **Progress Tracking**: Real-time progress monitoring
- **Batch Processing**: Process multiple images efficiently
```bash
# Python 3.8+
conda create -n gpu python=3.8
conda activate gpu

## 🚀 Installation
# Install dependencies
pip install -r requirements.txt
```

### Prerequisites
### Dependencies
- `opencv-python>=4.5.0`
- `numpy>=1.21.0`
- `Pillow>=8.3.0`
- `PyYAML>=5.4.0`
- `ultralytics>=8.0.0` (for YOLO models)

- Python 3.7+
- OpenCV
- NumPy
- PyYAML
- PIL (Pillow)
## 🛠️ Installation

### Setup

1. **Clone the repository**:
1. **Clone the repository**
```bash
git clone <repository-url>
cd IDcardsGenerator
```

2. **Install dependencies**:
2. **Install dependencies**
```bash
pip install opencv-python numpy pyyaml pillow
pip install -r requirements.txt
```

3. **Activate conda environment** (if using GPU):
3. **Prepare YOLO model** (optional)
```bash
conda activate gpu
# Place your trained YOLO model at:
data/weights/id_cards_yolov8n.pt
```

## 📁 Project Structure
## 📖 Usage

```
IDcardsGenerator/
├── config/
│   └── config.yaml           # Main configuration file
├── data/
│   └── IDcards/
│       └── processed/        # Input images directory
├── src/
│   ├── data_augmentation.py  # Core augmentation logic
│   ├── config_manager.py     # Configuration management
│   ├── image_processor.py    # Image processing utilities
│   └── utils.py              # Utility functions
├── logs/                     # Log files
├── out/                      # Output directory
└── main.py                   # Main script
### **Basic Usage**

```bash
# Run with default configuration
python main.py

# Run with ID card detection enabled
python main.py --enable-id-detection

# Run with custom input/output directories
python main.py --input-dir "path/to/input" --output-dir "path/to/output"
```

## ⚙️ Configuration
### **Configuration Options**

### Main Configuration (`config/config.yaml`)
#### **ID Card Detection**
```bash
# Enable detection with custom model
python main.py --enable-id-detection --model-path "path/to/model.pt"

# Adjust detection parameters
python main.py --enable-id-detection --confidence 0.3 --crop-mode square

# Set target size for cropped cards
python main.py --enable-id-detection --crop-target-size "640,640"
```
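
The detection flags shown above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the project's actual `main.py`: the flag names and defaults come from the commands and `config.yaml` values in this README, while `parse_size` and `build_parser` are hypothetical helper names introduced here for illustration.

```python
import argparse


def parse_size(text):
    """Parse a "width,height" string such as "640,640" into a tuple of ints."""
    parts = text.split(",")
    if len(parts) != 2:
        raise argparse.ArgumentTypeError(f"expected 'width,height', got {text!r}")
    try:
        width, height = (int(p) for p in parts)
    except ValueError:
        raise argparse.ArgumentTypeError(f"non-integer size in {text!r}")
    if width <= 0 or height <= 0:
        raise argparse.ArgumentTypeError("width and height must be positive")
    return (width, height)


def build_parser():
    """Declare only the detection-related flags (hypothetical subset)."""
    parser = argparse.ArgumentParser(description="ID card detection options (sketch)")
    parser.add_argument("--enable-id-detection", action="store_true")
    parser.add_argument("--model-path", default="data/weights/id_cards_yolov8n.pt")
    parser.add_argument("--confidence", type=float, default=0.25)
    parser.add_argument(
        "--crop-mode", choices=["bbox", "square", "aspect_ratio"], default="bbox"
    )
    parser.add_argument("--crop-target-size", type=parse_size, default=None)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is reproducible.
args = build_parser().parse_args(
    ["--enable-id-detection", "--confidence", "0.3", "--crop-target-size", "640,640"]
)
print(args.enable_id_detection, args.confidence, args.crop_mode, args.crop_target_size)
```

Validating the size string in a `type=` callable means a malformed value is rejected by the parser itself, before any image is loaded.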

#### **Data Augmentation**
```bash
# Customize augmentation parameters
python main.py --num-augmentations 5 --target-size "512,512"

# Preview augmentation results
python main.py --preview
```

### **Configuration File**

Edit `config/config.yaml` for persistent settings:

```yaml
# Data augmentation parameters
# ID Card Detection
id_card_detection:
  enabled: false  # Enable/disable YOLO detection
  model_path: "data/weights/id_cards_yolov8n.pt"
  confidence_threshold: 0.25
  iou_threshold: 0.45
  padding: 10
  crop_mode: "bbox"
  target_size: null

# Data Augmentation
augmentation:
  # Rotation
  rotation:
    enabled: true
    angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
    probability: 1.0

  # Random cropping
  random_cropping:
    enabled: true
    ratio_range: [0.7, 1.0]
    probability: 1.0

  # Random noise
  random_noise:
    enabled: true
    mean_range: [0.0, 0.7]
    variance_range: [0.0, 0.1]
    probability: 1.0

  # Partial blockage
  partial_blockage:
    enabled: true
    num_occlusions_range: [1, 100]
    coverage_range: [0.0, 0.25]
    variance_range: [0.0, 0.1]
    probability: 1.0

  # Grayscale transformation
  grayscale:
    enabled: true
    probability: 1.0

  # Blurring
  blurring:
    enabled: true
    kernel_ratio_range: [0.0, 0.0084]
    probability: 1.0

  # Brightness and contrast
  brightness_contrast:
    enabled: true
    alpha_range: [0.4, 3.0]
    beta_range: [1, 100]
    probability: 1.0
  grayscale:
    enabled: true  # Applied as final step

# Processing configuration
# Processing
processing:
  target_size: [640, 640]
  num_augmentations: 3
@@ -134,156 +150,139 @@ processing:
  quality: 95
```
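
The README says command-line arguments override the YAML settings but does not show how the two are combined. One common approach is a recursive dictionary merge, sketched below under that assumption. `deep_merge` is a hypothetical helper, and the dictionaries stand in for the parsed `config.yaml` and the parsed CLI flags.

```python
import copy


def deep_merge(base, overrides):
    """Return a copy of `base` with `overrides` applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Values mirror the defaults listed in config/config.yaml above.
defaults = {
    "id_card_detection": {"enabled": False, "confidence_threshold": 0.25, "crop_mode": "bbox"},
    "processing": {"target_size": [640, 640], "num_augmentations": 3},
}
# Hypothetical result of: --enable-id-detection --confidence 0.3 --num-augmentations 5
cli_overrides = {
    "id_card_detection": {"enabled": True, "confidence_threshold": 0.3},
    "processing": {"num_augmentations": 5},
}

config = deep_merge(defaults, cli_overrides)
print(config["id_card_detection"])
print(config["processing"])
```

Merging into a copy leaves the defaults untouched, so the same base configuration can be reused across runs with different overrides.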

## 🎮 Usage
## 🔄 Workflow

### Basic Usage
### **Two-Step Processing Pipeline**

#### **Step 1: ID Card Detection (Optional)**
When `id_card_detection.enabled: true`:
1. **Input**: Large images containing multiple ID cards
2. **YOLO Detection**: Locate and detect ID cards
3. **Cropping**: Extract individual ID cards with padding
4. **Output**: Cropped ID cards saved to `out/processed/`
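
The cropping step can be illustrated with plain coordinate arithmetic. The sketch below assumes a detection arrives as pixel coordinates `(x1, y1, x2, y2)` and covers only the `bbox` and `square` modes. The semantics of `aspect_ratio` are not described in this README, so that mode is left out. `crop_box` is a hypothetical name, not the project's function.

```python
def crop_box(bbox, image_size, padding=10, mode="bbox"):
    """Return an (x1, y1, x2, y2) crop window for one detection.

    bbox is (x1, y1, x2, y2) in pixels; image_size is (width, height).
    """
    x1, y1, x2, y2 = bbox
    img_w, img_h = image_size
    if mode == "square":
        # Grow the shorter side around the box centre until both sides match.
        side = max(x2 - x1, y2 - y1)
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        x1, x2 = cx - side / 2, cx + side / 2
        y1, y2 = cy - side / 2, cy + side / 2
    elif mode != "bbox":
        raise ValueError(f"unsupported crop mode in this sketch: {mode}")
    # Add padding, then clamp to the image so the window stays valid.
    # Clamping near an edge can make a "square" window slightly non-square.
    x1 = max(0, int(round(x1 - padding)))
    y1 = max(0, int(round(y1 - padding)))
    x2 = min(img_w, int(round(x2 + padding)))
    y2 = min(img_h, int(round(y2 + padding)))
    return x1, y1, x2, y2


print(crop_box((100, 200, 300, 320), (1000, 800)))
print(crop_box((100, 200, 300, 320), (1000, 800), mode="square"))
```

With an image loaded as a NumPy array, the returned window would typically be applied as `image[y1:y2, x1:x2]`.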

#### **Step 2: Data Augmentation**
1. **Input**: Original images OR cropped ID cards
2. **Augmentation**: Apply 6 augmentation methods:
   - Rotation (9 different angles)
   - Random cropping (70-100% ratio)
   - Random noise (simulate wear)
   - Partial blockage (simulate occlusion)
   - Blurring (simulate motion blur)
   - Brightness/Contrast adjustment
3. **Grayscale**: Convert all images to grayscale (final step)
4. **Output**: Augmented images in main output directory
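
The final grayscale step can be written out per pixel. The weights below are the ones OpenCV documents for its RGB-to-gray conversion (`Y = 0.299 R + 0.587 G + 0.114 B`). Whether the pipeline uses exactly this conversion is an assumption, and `to_gray` and `grayscale_image` are hypothetical names working on nested lists rather than image arrays.

```python
def to_gray(pixel):
    """Convert one (R, G, B) pixel to a single 0-255 luma value."""
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def grayscale_image(rows):
    """Apply to_gray to every pixel of a nested-list RGB image."""
    return [[to_gray(px) for px in row] for row in rows]


# A 1x3 "image": pure red, pure white, pure black.
print(grayscale_image([[(255, 0, 0), (255, 255, 255), (0, 0, 0)]]))
```

Green carries the largest weight because perceived brightness depends on it most, which is why a red stamp on a card turns fairly dark after conversion.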

### **Direct Augmentation Mode**
When `id_card_detection.enabled: false`:
- Skips YOLO detection
- Applies augmentation directly to input images
- All images are converted to grayscale

## 📊 Output Structure

```
output_directory/
├── processed/                        # Cropped ID cards (if detection enabled)
│   ├── id_card_001.jpg
│   ├── id_card_002.jpg
│   └── processing_summary.json
├── im1__rotation_01.png              # Augmented images
├── im1__cropping_01.png
├── im1__noise_01.png
├── im1__blockage_01.png
├── im1__blurring_01.png
├── im1__brightness_contrast_01.png
└── augmentation_summary.json
```
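
The file names in the tree above follow the pattern `<stem>__<method>_<index>.png`. A small sketch of how such names could be generated follows. The helper names are hypothetical, and the assumption that the index runs from 1 to `num_augmentations` for each method is inferred from the listing, not confirmed by the source.

```python
from pathlib import Path

# Method labels as they appear in the output listing above.
METHODS = ["rotation", "cropping", "noise", "blockage", "blurring", "brightness_contrast"]


def augmented_name(source, method, index, ext=".png"):
    """Build a name like im1__rotation_01.png from a source image path."""
    return f"{Path(source).stem}__{method}_{index:02d}{ext}"


def planned_outputs(source, num_augmentations):
    """List every output file name expected for one source image."""
    return [
        augmented_name(source, method, i)
        for method in METHODS
        for i in range(1, num_augmentations + 1)
    ]


print(augmented_name("data/IDcards/processed/im1.jpg", "rotation", 1))
print(len(planned_outputs("im1.jpg", 3)))
```

Listing the expected names up front makes it easy to check a finished run for missing outputs.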

## 🎯 Use Cases

### **Training Data Generation**
```bash
python main.py --input-dir data/IDcards/processed --output-dir out
# Generate diverse training data
python main.py --enable-id-detection --num-augmentations 10
```

### Command Line Options

### **Quality Control**
```bash
python main.py [OPTIONS]

Options:
  --config CONFIG          Path to configuration file (default: config/config.yaml)
  --input-dir INPUT_DIR    Input directory containing images
  --output-dir OUTPUT_DIR  Output directory for augmented images
  --num-augmentations N    Number of augmented versions per image (default: 3)
  --target-size SIZE       Target size for images (width x height)
  --preview                Preview augmentation on first image only
  --info                   Show information about images in input directory
  --list-presets           List available presets and exit
  --log-level LEVEL        Logging level (DEBUG, INFO, WARNING, ERROR)
# Preview results before processing
python main.py --preview
```

### Examples

1. **Preview augmentation**:
### **Batch Processing**
```bash
python main.py --preview --input-dir data/IDcards/processed --output-dir test_output
# Process large datasets
python main.py --input-dir "large_dataset/" --output-dir "augmented_dataset/"
```

2. **Show image information**:
```bash
python main.py --info --input-dir data/IDcards/processed
## ⚙️ Advanced Configuration

### **Custom Augmentation Parameters**

```yaml
augmentation:
  rotation:
    angles: [45, 90, 135, 180, 225, 270, 315]  # Custom angles
  random_cropping:
    ratio_range: [0.8, 0.95]  # Tighter cropping
  random_noise:
    mean_range: [0.1, 0.5]  # More noise
    variance_range: [0.05, 0.15]
```

3. **Custom number of augmentations**:
```bash
python main.py --input-dir data/IDcards/processed --output-dir out --num-augmentations 5
### **Performance Optimization**

```yaml
performance:
  num_workers: 4
  prefetch_factor: 2
  pin_memory: true
  use_gpu: false
```

4. **Custom target size**:
```bash
python main.py --input-dir data/IDcards/processed --output-dir out --target-size 512x512
```

## 📊 Output

### File Naming Convention

The tool creates separate files for each augmentation method:

```
im1_rotation_01.png             # Rotation method
im1_cropping_01.png             # Random cropping method
im1_noise_01.png                # Random noise method
im1_blockage_01.png             # Partial blockage method
im1_grayscale_01.png            # Grayscale method
im1_blurring_01.png             # Blurring method
im1_brightness_contrast_01.png  # Brightness/contrast method
```

### Output Summary

After processing, you'll see a summary like:

```
==================================================
AUGMENTATION SUMMARY
==================================================
Original images: 106
Augmented images: 2226
Augmentation ratio: 21.00
Successful augmentations: 106
Output directory: out
==================================================
```
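
The figures in this summary are internally consistent, which a short calculation makes explicit. Reading the ratio of 21 as 7 methods times the default of 3 augmentations per image is an interpretation that fits the numbers, not something the summary itself states.

```python
original_images = 106
augmented_images = 2226

# The reported ratio is simply augmented / original.
ratio = augmented_images / original_images
print(f"Augmentation ratio: {ratio:.2f}")

# One decomposition that matches: 7 methods, 3 augmented versions each (assumed).
methods = 7
num_augmentations = 3
expected_total = original_images * methods * num_augmentations
print(expected_total == augmented_images)
```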

## 🔧 Augmentation Techniques Details

### 1. Rotation
- **Purpose**: Simulates cards at different angles
- **Angles**: 30°, 60°, 120°, 150°, 180°, 210°, 240°, 300°, 330°
- **Method**: OpenCV rotation with white background preservation
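
When an image is rotated by an angle that is not a multiple of 90°, its corners fall outside the original frame unless the canvas is enlarged. The sketch below computes the enlarged canvas size with plain trigonometry. Whether this tool enlarges the canvas or keeps the original size is not stated, so treat this as one possible approach. `rotated_canvas_size` is a hypothetical helper.

```python
import math


def rotated_canvas_size(width, height, angle_degrees):
    """Return the (width, height) of the smallest canvas holding the rotated image."""
    theta = math.radians(angle_degrees)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    # The small epsilon absorbs floating-point noise at 90-degree multiples.
    new_w = math.ceil(width * cos_t + height * sin_t - 1e-9)
    new_h = math.ceil(width * sin_t + height * cos_t - 1e-9)
    return new_w, new_h


for angle in (30, 90, 180):
    print(angle, rotated_canvas_size(640, 400, angle))
```

The area outside the rotated card would then be filled with white, `(255, 255, 255)`, to match the white background described above.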

### 2. Random Cropping
- **Purpose**: Simulates partially visible ID cards
- **Ratio Range**: 0.7 to 1.0 (70% to 100% of original size)
- **Method**: Random crop with white background preservation
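
A crop window within the stated ratio range can be drawn as sketched below. The sketch assumes the ratio applies to each side of the image, which the README does not specify (it could also refer to area). `random_crop_window` is a hypothetical name.

```python
import random


def random_crop_window(width, height, ratio_range=(0.7, 1.0), rng=random):
    """Pick a random (x, y, w, h) window whose sides are a ratio of the original."""
    ratio = rng.uniform(*ratio_range)
    crop_w = max(1, int(width * ratio))
    crop_h = max(1, int(height * ratio))
    # Choose the top-left corner so the window stays fully inside the image.
    x = rng.randint(0, width - crop_w)
    y = rng.randint(0, height - crop_h)
    return x, y, crop_w, crop_h


rng = random.Random(42)  # seeded so repeated runs give the same window
print(random_crop_window(640, 400, rng=rng))
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible, which helps when comparing runs.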

### 3. Random Noise
- **Purpose**: Simulates worn-out cards
- **Mean Range**: 0.0 to 0.7
- **Variance Range**: 0.0 to 0.1
- **Method**: Gaussian noise addition
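
Gaussian noise addition can be sketched on a flat list of pixel values. The mean and variance ranges above look like values on a normalised `[0, 1]` intensity scale, and the sketch treats them that way. That reading is an assumption, as is the helper name `add_gaussian_noise`.

```python
import random


def add_gaussian_noise(pixels, mean, variance, rng=random):
    """Add Gaussian noise to a flat list of 0-255 values.

    mean and variance are interpreted on a normalised [0, 1] scale (assumed).
    """
    sigma = variance ** 0.5  # standard deviation from variance
    noisy = []
    for value in pixels:
        noise = rng.gauss(mean, sigma) * 255.0
        # Clip so the result remains a valid 8-bit intensity.
        noisy.append(min(255, max(0, int(round(value + noise)))))
    return noisy


rng = random.Random(7)
print(add_gaussian_noise([0, 64, 128, 255], mean=0.0, variance=0.01, rng=rng))
```

Under this reading, a mean near the top of the range (0.7) would shift most pixels toward white, so high mean values wash the image out rather than just adding grain.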

### 4. Horizontal Blockage
- **Purpose**: Simulates occluded card details
- **Lines**: 1 to 100 horizontal lines
- **Coverage**: 0% to 25% of image area
- **Colors**: Multiple colors to simulate various objects

### 5. Grayscale Transformation
- **Purpose**: Simulates Xerox/scan copies
- **Method**: OpenCV `cv2.cvtColor()` function
- **Output**: 3-channel grayscale image

### 6. Blurring
- **Purpose**: Simulates blurred but readable cards
- **Kernel Ratio**: 0.0 to 0.0084
- **Method**: OpenCV `cv2.filter2D()` with Gaussian kernel

### 7. Brightness & Contrast
- **Purpose**: Simulates different lighting conditions
- **Alpha Range**: 0.4 to 3.0 (contrast)
- **Beta Range**: 1 to 100 (brightness)
- **Method**: OpenCV `cv2.convertScaleAbs()`
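
`cv2.convertScaleAbs()` scales each value by `alpha`, adds `beta`, takes the absolute value, and saturates the result to 8 bits. The pure-Python sketch below mirrors that per-pixel rule so the effect of the two ranges is easy to see. `adjust_pixel` is a hypothetical helper for illustration, not part of OpenCV or this project.

```python
def adjust_pixel(value, alpha, beta):
    """Apply saturate(|alpha * value + beta|) to one 0-255 intensity."""
    return min(255, int(round(abs(alpha * value + beta))))


# alpha stretches contrast, beta shifts brightness.
print(adjust_pixel(100, 1.5, 10))   # mid-grey becomes brighter
print(adjust_pixel(200, 3.0, 100))  # clips at the 8-bit maximum
print(adjust_pixel(50, 0.4, 1))     # low alpha darkens and flattens contrast
```

At the upper end of both ranges (`alpha` = 3.0, `beta` = 100), any input above about 52 saturates to 255, so the strongest settings leave only the darkest details visible.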

## 🛠️ Development

### Adding New Augmentation Methods

1. Add the method to `src/data_augmentation.py`
2. Update configuration in `config/config.yaml`
3. Update default config in `src/config_manager.py`
4. Test with preview mode

### Code Structure

- **`main.py`**: Entry point and command-line interface
- **`src/data_augmentation.py`**: Core augmentation logic
- **`src/config_manager.py`**: Configuration management
- **`src/image_processor.py`**: Image processing utilities
- **`src/utils.py`**: Utility functions

## 📝 Logging

The tool provides comprehensive logging:
The system provides comprehensive logging:
- **File**: `logs/data_augmentation.log`
- **Console**: Real-time progress updates
- **Summary**: JSON files with processing statistics

- **File logging**: `logs/data_augmentation.log`
- **Console logging**: Real-time progress updates
- **Log levels**: DEBUG, INFO, WARNING, ERROR
### **Log Levels**
- `INFO`: General processing information
- `WARNING`: Non-critical issues (e.g., no cards detected)
- `ERROR`: Critical errors
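
A file-plus-console setup like the one described can be built with the standard `logging` module. This is a minimal sketch: the log path comes from this README, while the logger name, the message format, and the `setup_logging` helper are assumptions.

```python
import logging
from pathlib import Path


def setup_logging(log_file="logs/data_augmentation.log", level="INFO"):
    """Configure a logger that writes to both a file and the console."""
    Path(log_file).parent.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger("data_augmentation")
    logger.setLevel(getattr(logging, level.upper()))
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    handlers = (logging.FileHandler(log_file, encoding="utf-8"), logging.StreamHandler())
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Calling `setup_logging(level="WARNING")` would hide the `INFO` progress messages while still recording warnings such as "no cards detected".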

## 🔧 Troubleshooting

### **Common Issues**

1. **No images detected**
   - Check input directory path
   - Verify image formats (jpg, png, bmp, tiff)
   - Ensure images are not corrupted

2. **YOLO model not found**
   - Place model file at `data/weights/id_cards_yolov8n.pt`
   - Or specify custom path with `--model-path`

3. **Memory issues**
   - Reduce `num_augmentations`
   - Use smaller `target_size`
   - Enable GPU if available
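
For the "No images detected" case, a quick standalone check of what the input directory actually contains can help. The sketch below matches files by extension. The extension set extends the formats listed above with `.jpeg` and `.tif` as an assumption, and `find_images` is a hypothetical helper rather than the tool's own scanner.

```python
from pathlib import Path

# jpg, png, bmp, tiff come from the list above; .jpeg and .tif are assumed variants.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def find_images(input_dir):
    """Return sorted image paths under input_dir, matched case-insensitively by extension."""
    root = Path(input_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"input directory not found: {root}")
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

An empty result for a directory that visibly contains files usually points to an unexpected extension or to a wrong `--input-dir` path.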

### **Performance Tips**

- **GPU Acceleration**: Set `use_gpu: true` in config
- **Batch Processing**: Use multiple workers for large datasets
- **Memory Management**: Process in smaller batches

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
4. Add tests if applicable
5. Submit a pull request

## 📄 License
@@ -292,18 +291,10 @@ This project is licensed under the MIT License - see the LICENSE file for detail

## 🙏 Acknowledgments

- OpenCV for image processing capabilities
- NumPy for numerical operations
- PyYAML for configuration management

## 📞 Support

For issues and questions:
1. Check the logs in `logs/data_augmentation.log`
2. Review the configuration in `config/config.yaml`
3. Test with preview mode first
4. Create an issue with detailed information
- **YOLOv8**: Ultralytics for the detection framework
- **OpenCV**: Computer vision operations
- **NumPy**: Numerical computations

---

**Note**: This tool is specifically designed for ID card augmentation and may need adjustments for other image types.
**For questions and support, please open an issue on GitHub.**