# ID Card Data Augmentation Pipeline

A comprehensive data augmentation pipeline for ID card images with YOLO-based detection, smart sampling strategies, and advanced augmentation techniques.

## 🚀 New Features v2.0

### **Smart Data Strategy**

- **Sampling Mode** (`factor < 1.0`): Process only a percentage of input data
- **Multiplication Mode** (`factor >= 1.0`): Multiply total dataset size
- **Balanced Output**: Includes both raw and augmented images
- **Configurable Sampling**: Random, stratified, or uniform selection

### **Enhanced Augmentation**

- **Random Method Combination**: Mix and match augmentation techniques
- **Method Probability Weights**: Control the frequency of each augmentation
- **Raw Image Preservation**: Always includes the original processed images
- **Flexible Processing Modes**: Individual, sequential, or random combination

## 🎯 Key Features

### **YOLO-based ID Card Detection**

- Automatic detection and cropping of ID cards from large images
- Configurable confidence and IoU thresholds
- Multiple cropping modes (`bbox`, `square`, `aspect_ratio`)
- Padding and target size customization

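The padding and square-crop behavior described above can be sketched as a small geometry helper. This is a minimal illustration, not the pipeline's actual code: `padded_crop_box` and its parameters are hypothetical names chosen to mirror the options listed.

```python
def padded_crop_box(box, img_w, img_h, padding=0.1, mode="square"):
    """Turn a detected (x1, y1, x2, y2) box into a crop rectangle:
    expand by `padding` (fraction of box size) on every side, optionally
    square it off, and clamp to the image bounds."""
    x1, y1, x2, y2 = box
    bw, bh = x2 - x1, y2 - y1
    # Expand the box by the padding fraction on every side
    x1 -= bw * padding; x2 += bw * padding
    y1 -= bh * padding; y2 += bh * padding
    if mode == "square":
        # Grow the shorter side so the crop becomes square
        side = max(x2 - x1, y2 - y1)
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        x1, x2 = cx - side / 2, cx + side / 2
        y1, y2 = cy - side / 2, cy + side / 2
    # Clamp to image bounds
    return (max(0, int(x1)), max(0, int(y1)),
            min(img_w, int(x2)), min(img_h, int(y2)))

print(padded_crop_box((100, 150, 300, 250), 600, 400))  # (80, 80, 320, 320)
```

The returned rectangle can then be used to slice the image array before resizing to the configured target size.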
### **Advanced Data Augmentation**

- **Geometric Transformations**: Rotation with multiple angles
- **Random Cropping**: Simulates partially visible cards
- **Noise Addition**: Simulates worn-out cards
- **Partial Blockage**: Simulates occluded card details
- **Blurring**: Simulates motion blur while keeping readability
- **Brightness/Contrast**: Mimics different lighting conditions
- **Color Jittering**: HSV adjustments for color variations
- **Perspective Transform**: Simulates viewing-angle changes
- **Grayscale Conversion**: Final preprocessing step for all images

### **Flexible Configuration**

- YAML-based configuration system
- Command-line argument overrides
- Smart data strategy configuration
- Comprehensive logging and statistics

## 📋 Requirements

```bash
# Python 3.8+
conda create -n gpu python=3.8
conda activate gpu

# Install dependencies
pip install -r requirements.txt
```

### Dependencies
### Dependencies
- `opencv-python>=4.5.0`
- `numpy>=1.21.0`
- `Pillow>=8.3.0`
- `PyYAML>=5.4.0`
- `ultralytics>=8.0.0` (for YOLO models)
- `torch>=1.12.0` (for GPU acceleration)

## 🛠️ Installation

1. **Clone the repository**

   ```bash
   git clone <repository-url>
   cd IDcardsGenerator
   ```

2. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```

3. **Prepare the YOLO model** (optional)

   ```bash
   # Place your trained YOLO model at:
   data/weights/id_cards_yolov8n.pt
   ```

## 📖 Usage

### **Basic Usage**

```bash
# Run with the default configuration (3x multiplication)
python main.py

# Run in sampling mode (30% of input data) by setting
# multiplication_factor: 0.3 in the config file
python main.py

# Run with ID card detection enabled
python main.py --enable-id-detection
```

### **Data Strategy Examples**

#### **Sampling Mode** (factor < 1.0)

```yaml
data_strategy:
  multiplication_factor: 0.3   # Process 30% of input images
  sampling:
    method: "random"           # random, stratified, uniform
    preserve_distribution: true
```

- Input: 100 images → select 30 images → output: 100 images total
- Each selected image generates ~3-4 versions (including the raw one)

#### **Multiplication Mode** (factor >= 1.0)

```yaml
data_strategy:
  multiplication_factor: 3.0   # 3x dataset size
```

- Input: 100 images → process all → output: 300 images total
- Each image generates 3 versions (1 raw + 2 augmented)

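The arithmetic behind the two modes can be sketched with a small helper. This is a hypothetical illustration of the numbers above; the real pipeline reads these values from the YAML config.

```python
import math

def plan_outputs(n_input, factor):
    """Return (n_selected, versions_per_image) for a given
    multiplication_factor, mirroring the two modes described above."""
    if factor < 1.0:
        # Sampling mode: select a fraction, keep the total dataset size
        n_selected = max(1, round(n_input * factor))
        versions = math.ceil(n_input / n_selected)
    else:
        # Multiplication mode: process everything, grow the dataset
        n_selected = n_input
        versions = round(factor)
    return n_selected, versions

print(plan_outputs(100, 0.3))  # (30, 4) - sampling mode, ~3-4 versions each
print(plan_outputs(100, 3.0))  # (100, 3) - multiplication mode, 3 versions each
```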
### **Augmentation Strategy**

```yaml
augmentation:
  strategy:
    mode: "random_combine"   # random_combine, sequential, individual
    min_methods: 2           # Min augmentation methods per image
    max_methods: 4           # Max augmentation methods per image

  methods:
    rotation:
      enabled: true
      probability: 0.8       # 80% chance of being selected
      angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]

    random_cropping:
      enabled: true
      probability: 0.7
      ratio_range: [0.7, 1.0]

    # ... other methods with probabilities
```

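The weighted random selection described by this config can be sketched as follows. The method names and weights are placeholders taken from the example above; this is not the pipeline's actual implementation.

```python
import random

# Hypothetical per-method probability weights, as in the config above
METHODS = {"rotation": 0.8, "random_cropping": 0.7, "noise": 0.5,
           "blur": 0.4, "perspective": 0.2}

def pick_methods(min_methods=2, max_methods=4, rng=random):
    """Select between min_methods and max_methods distinct method names,
    where each method's weight controls how often it is chosen."""
    k = rng.randint(min_methods, max_methods)
    names, weights = zip(*METHODS.items())
    chosen = []
    while len(chosen) < k:
        name = rng.choices(names, weights=weights)[0]
        if name not in chosen:   # no duplicate methods per image
            chosen.append(name)
    return chosen

random.seed(0)
combo = pick_methods()
print(combo)   # e.g. a list of 2-4 method names, rotation appearing most often
```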
## 🔄 Workflow

### **Smart Processing Pipeline**

#### **Step 1: Data Selection**
- **Sampling Mode**: Randomly select a subset of the input images
- **Multiplication Mode**: Process all input images
- **Stratified Sampling**: Preserve the file-type distribution

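Stratified selection as described above can be sketched like this: group files by extension, then sample the same fraction from each group. A minimal illustration with an assumed helper name, not the pipeline's actual code.

```python
import random
from collections import defaultdict
from pathlib import Path

def stratified_sample(paths, fraction, seed=42):
    """Sample `fraction` of paths from each file-type group so the
    extension distribution of the subset matches the input."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for p in paths:
        groups[Path(p).suffix.lower()].append(p)
    sample = []
    for ext, items in groups.items():
        k = max(1, round(len(items) * fraction))
        sample.extend(rng.sample(items, k))
    return sample

paths = [f"img_{i}.jpg" for i in range(80)] + [f"scan_{i}.png" for i in range(20)]
subset = stratified_sample(paths, 0.3)
print(len(subset))  # 30: 24 jpg + 6 png, same 80/20 split as the input
```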
#### **Step 2: ID Card Detection** (Optional)
When `id_card_detection.enabled: true`:
1. **YOLO Detection**: Locate ID cards in large images
2. **Cropping**: Extract individual ID cards with padding
3. **Output**: Cropped ID cards saved to `out/processed/`

#### **Step 3: Smart Augmentation**
1. **Raw Processing**: Always include the original (resized + grayscale)
2. **Random Combination**: Randomly select 2-4 augmentation methods
3. **Method Application**: Apply the selected methods with probability weights
4. **Final Processing**: Grayscale conversion for all outputs

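The raw-first planning in Step 3 can be sketched as follows. The method list and helper name are hypothetical; the real method set comes from the config.

```python
import random

AUG_METHODS = ["rotation", "random_cropping", "noise", "blur"]  # assumed names

def plan_versions(basename, n_aug=2, min_m=2, max_m=4, rng=random):
    """Plan the outputs for one image: the raw copy first, then n_aug
    augmented versions, each with its own random method combination."""
    plans = [(f"{basename}_raw_001.jpg", [])]   # raw copy: no augmentation
    for i in range(1, n_aug + 1):
        k = rng.randint(min_m, min(max_m, len(AUG_METHODS)))
        plans.append((f"{basename}_aug_{i:03d}.jpg", rng.sample(AUG_METHODS, k)))
    return plans

random.seed(1)
plans = plan_versions("im1")
for name, methods in plans:
    print(name, methods)
```

Every planned output, raw or augmented, then passes through the final grayscale conversion.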
## 📊 Output Structure

```
output_directory/
├── processed/                  # Cropped ID cards (if detection enabled)
│   ├── id_card_001.jpg
│   ├── id_card_002.jpg
│   └── processing_summary.json
├── im1_raw_001.jpg             # Raw processed images
├── im1_aug_001.jpg             # Augmented images (random combinations)
├── im1_aug_002.jpg
├── im2_raw_001.jpg
├── im2_aug_001.jpg
└── processing_summary.json
```

### **File Naming Convention**
- `{basename}_raw_001.jpg`: Original image (resized + grayscale)
- `{basename}_aug_001.jpg`: Augmented version 1 (random methods)
- `{basename}_aug_002.jpg`: Augmented version 2 (different methods)

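The convention above can be generated with a one-line helper (`output_name` is a hypothetical name for illustration):

```python
def output_name(basename, kind, index):
    """Build an output filename following the convention above,
    e.g. ('im1', 'aug', 2) -> 'im1_aug_002.jpg'."""
    return f"{basename}_{kind}_{index:03d}.jpg"

print(output_name("im1", "raw", 1))   # im1_raw_001.jpg
print(output_name("im2", "aug", 12))  # im2_aug_012.jpg
```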
## 🎯 Use Cases

### **Dataset Expansion**

```yaml
# Triple your dataset size with balanced augmentation
data_strategy:
  multiplication_factor: 3.0
```

### **Smart Sampling for Large Datasets**

```yaml
# Process only 20% of the inputs but keep the original dataset size
data_strategy:
  multiplication_factor: 0.2
  sampling:
    method: "stratified"   # Preserve file-type distribution
```

### **Quality Control**

```bash
# Preview results before full processing
python main.py --preview
```

## ⚙️ Advanced Configuration

### **Augmentation Strategy Modes**

#### **Random Combination** (Recommended)

```yaml
augmentation:
  strategy:
    mode: "random_combine"
    min_methods: 2
    max_methods: 4
```

Each image receives 2-4 randomly selected augmentation methods.

#### **Sequential Application**

```yaml
augmentation:
  strategy:
    mode: "sequential"
```

All enabled methods are applied to each image in sequence.

#### **Individual Methods**

```yaml
augmentation:
  strategy:
    mode: "individual"
```

Legacy mode: each method creates separate output images.

### **Method Probability Tuning**

```yaml
methods:
  rotation:
    probability: 0.9          # High chance - common transformation
  perspective:
    probability: 0.2          # Low chance - subtle effect
  partial_blockage:
    probability: 0.3          # Medium chance - specific use case
```

## 📊 Performance Statistics

The system produces detailed statistics:

```json
{
  "input_images": 100,
  "selected_images": 30,          // In sampling mode
  "target_total": 100,
  "actual_generated": 98,
  "multiplication_factor": 0.3,
  "mode": "sampling",
  "efficiency": 0.98              // 98% of target achieved
}
```

## 🔧 Troubleshooting

### **Common Issues**

1. **Low efficiency in sampling mode**
   - Increase `min_methods` or adjust `target_size`
   - Check the available augmentation methods

2. **Memory issues with large datasets**
   - Use sampling mode with a lower factor
   - Reduce the `target_size` resolution
   - Enable `memory_efficient` mode

3. **Inconsistent augmentation results**
   - Set `random_seed` for reproducibility
   - Adjust the method probabilities
   - Check the `min_methods`/`max_methods` balance

### **Performance Tips**

- **Sampling Mode**: Use for large datasets (>1000 images)
- **GPU Acceleration**: Enable for YOLO detection
- **Batch Processing**: Process in chunks for memory efficiency
- **Probability Tuning**: Use higher probabilities for stable methods

## 📈 Benchmarks

### **Processing Speed**
- **Direct Mode**: ~2-3 images/second
- **YOLO + Augmentation**: ~1-2 images/second
- **Memory Usage**: ~2-4 GB for 1000 images

### **Output Quality**
- **Raw Images**: Quality fully preserved
- **Augmented Images**: Balanced realism vs. diversity
- **Grayscale Conversion**: Consistent preprocessing

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- **YOLOv8**: Ultralytics detection framework
- **OpenCV**: Computer vision operations
- **NumPy**: Numerical computations
- **PyTorch**: Deep learning backend

---

**For questions and support, please open an issue on GitHub.**