# ID Card Data Augmentation Pipeline

A comprehensive data augmentation pipeline for ID card images with YOLO-based detection, smart sampling strategies, and advanced augmentation techniques.

![Pipeline Overview](docs/images/yolov8_pipeline.png)

## 🚀 New Features v2.0

### **Smart Data Strategy**

- **Sampling Mode** (`factor < 1.0`): Process only a percentage of the input data
- **Multiplication Mode** (`factor >= 1.0`): Multiply the total dataset size
- **Balanced Output**: Includes both raw and augmented images
- **Configurable Sampling**: Random, stratified, or uniform selection

### **Enhanced Augmentation**

- **Random Method Combination**: Mix and match augmentation techniques
- **Method Probability Weights**: Control the frequency of each augmentation
- **Raw Image Preservation**: Always includes the original processed images
- **Flexible Processing Modes**: Individual, sequential, or random combination

## 🎯 Key Features

### **YOLO-based ID Card Detection**

- Automatic detection and cropping of ID cards from large images
- Configurable confidence and IoU thresholds
- Multiple cropping modes (bbox, square, aspect_ratio)
- Padding and target size customization

### **Advanced Data Augmentation**

- **Geometric Transformations**: Rotation with multiple angles
- **Random Cropping**: Simulates partially visible cards
- **Noise Addition**: Simulates worn-out cards
- **Partial Blockage**: Simulates occluded card details
- **Blurring**: Simulates motion blur while keeping readability
- **Brightness/Contrast**: Mimics different lighting conditions
- **Color Jittering**: HSV adjustments for color variations
- **Perspective Transform**: Simulates viewing-angle changes
- **Grayscale Conversion**: Final preprocessing step for all images

### **Flexible Configuration**

- YAML-based configuration system
- Command-line argument overrides
- Smart data strategy configuration
- Comprehensive logging and statistics

## 📋 Requirements

```bash
# Python 3.8+
conda create -n gpu python=3.8
conda activate gpu
```
```bash
# Install dependencies
pip install -r requirements.txt
```

### Dependencies

- `opencv-python>=4.5.0`
- `numpy>=1.21.0`
- `Pillow>=8.3.0`
- `PyYAML>=5.4.0`
- `ultralytics>=8.0.0` (for YOLO models)
- `torch>=1.12.0` (for GPU acceleration)

## 🛠️ Installation

1. **Clone the repository**
   ```bash
   git clone <repository-url>
   cd IDcardsGenerator
   ```
2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```
3. **Prepare YOLO model** (optional)
   ```bash
   # Place your trained YOLO model at:
   # data/weights/id_cards_yolov8n.pt
   ```

## 📖 Usage

### **Basic Usage**

```bash
# Run with default configuration (3x multiplication)
python main.py

# Run with sampling mode (30% of input data)
python main.py  # Set multiplication_factor: 0.3 in config

# Run with ID card detection enabled
python main.py --enable-id-detection
```

### **Data Strategy Examples**

#### **Sampling Mode** (factor < 1.0)

```yaml
data_strategy:
  multiplication_factor: 0.3  # Process 30% of input images
  sampling:
    method: "random"  # random, stratified, uniform
    preserve_distribution: true
```

- Input: 100 images → Select 30 images → Output: 100 images total
- Each selected image generates ~3-4 versions (including raw)

#### **Multiplication Mode** (factor >= 1.0)

```yaml
data_strategy:
  multiplication_factor: 3.0  # 3x dataset size
```

- Input: 100 images → Process all → Output: 300 images total
- Each image generates 3 versions (1 raw + 2 augmented)

### **Augmentation Strategy**

```yaml
augmentation:
  strategy:
    mode: "random_combine"  # random_combine, sequential, individual
    min_methods: 2  # Min augmentation methods per image
    max_methods: 4  # Max augmentation methods per image
  methods:
    rotation:
      enabled: true
      probability: 0.8  # 80% chance to be selected
      angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
    random_cropping:
      enabled: true
      probability: 0.7
      ratio_range: [0.7, 1.0]
    # ... other methods with probabilities
```
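To make the two mechanisms above concrete, here is a minimal Python sketch of how the data-strategy factor and the `random_combine` selection could work. The function names (`plan_outputs`, `select_methods`) and the config shape are illustrative assumptions, not the pipeline's actual API.

```python
import random

def plan_outputs(n_input, factor):
    """Sketch of the data-strategy math: sampling vs. multiplication mode."""
    if factor < 1.0:
        selected = max(1, round(n_input * factor))  # sampling mode: take a subset
        target_total = n_input                      # but keep the dataset size
    else:
        selected = n_input                          # multiplication mode: use all inputs
        target_total = round(n_input * factor)
    versions_each = target_total / selected
    return selected, target_total, versions_each

def select_methods(methods, min_methods=2, max_methods=4, rng=None):
    """Pick a weighted random combination of enabled augmentation methods."""
    rng = rng or random.Random()
    enabled = [name for name, cfg in methods.items() if cfg.get("enabled", False)]
    # Keep each enabled method with its configured probability
    chosen = [m for m in enabled if rng.random() < methods[m].get("probability", 0.5)]
    # Enforce the min/max bounds on the combination size
    while len(chosen) < min(min_methods, len(enabled)):
        chosen.append(rng.choice([m for m in enabled if m not in chosen]))
    if len(chosen) > max_methods:
        chosen = rng.sample(chosen, max_methods)
    return chosen

# Hypothetical method table mirroring the YAML config above
methods = {
    "rotation":        {"enabled": True,  "probability": 0.8},
    "random_cropping": {"enabled": True,  "probability": 0.7},
    "noise":           {"enabled": True,  "probability": 0.5},
    "perspective":     {"enabled": True,  "probability": 0.2},
    "blur":            {"enabled": False, "probability": 0.4},
}

print(plan_outputs(100, 0.3))  # -> (30, 100, ...): 30 selected, ~3.3 versions each
print(plan_outputs(100, 3.0))  # -> (100, 300, 3.0): all inputs, 3 versions each
print(select_methods(methods, rng=random.Random(0)))
```

Disabled methods are never selected, and the per-method `probability` acts as an independent inclusion weight before the `min_methods`/`max_methods` bounds are applied.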
## 🔄 Workflow

### **Smart Processing Pipeline**

#### **Step 1: Data Selection**

- **Sampling Mode**: Randomly select a subset of input images
- **Multiplication Mode**: Process all input images
- **Stratified Sampling**: Preserve the file type distribution

#### **Step 2: ID Card Detection** (Optional)

When `id_card_detection.enabled: true`:

1. **YOLO Detection**: Locate ID cards in large images
2. **Cropping**: Extract individual ID cards with padding
3. **Output**: Cropped ID cards saved to `out/processed/`

#### **Step 3: Smart Augmentation**

1. **Raw Processing**: Always include the original (resized + grayscale)
2. **Random Combination**: Select 2-4 augmentation methods randomly
3. **Method Application**: Apply the selected methods with probability weights
4. **Final Processing**: Grayscale conversion for all outputs

## 📊 Output Structure

```
output_directory/
├── processed/                # Cropped ID cards (if detection enabled)
│   ├── id_card_001.jpg
│   ├── id_card_002.jpg
│   └── processing_summary.json
├── im1_raw_001.jpg           # Raw processed images
├── im1_aug_001.jpg           # Augmented images (random combinations)
├── im1_aug_002.jpg
├── im2_raw_001.jpg
├── im2_aug_001.jpg
└── processing_summary.json
```

### **File Naming Convention**

- `{basename}_raw_001.jpg`: Original image (resized + grayscale)
- `{basename}_aug_001.jpg`: Augmented version 1 (random methods)
- `{basename}_aug_002.jpg`: Augmented version 2 (different methods)

## 🎯 Use Cases

### **Dataset Expansion**

```yaml
# Triple your dataset size with balanced augmentation
data_strategy:
  multiplication_factor: 3.0
```

### **Smart Sampling for Large Datasets**

```yaml
# Process only 20% but maintain the original dataset size
data_strategy:
  multiplication_factor: 0.2
  sampling:
    method: "stratified"  # Preserve file type distribution
```

### **Quality Control**

```bash
# Preview results before full processing
python main.py --preview
```

## ⚙️ Advanced Configuration

### **Augmentation Strategy Modes**
#### **Random Combination** (Recommended)

```yaml
augmentation:
  strategy:
    mode: "random_combine"
    min_methods: 2
    max_methods: 4
```

Each image gets 2-4 randomly selected augmentation methods.

#### **Sequential Application**

```yaml
augmentation:
  strategy:
    mode: "sequential"
```

All enabled methods are applied to each image in sequence.

#### **Individual Methods**

```yaml
augmentation:
  strategy:
    mode: "individual"
```

Legacy mode: each method creates separate output images.

### **Method Probability Tuning**

```yaml
methods:
  rotation:
    probability: 0.9  # High chance - common transformation
  perspective:
    probability: 0.2  # Low chance - subtle effect
  partial_blockage:
    probability: 0.3  # Medium chance - specific use case
```

## 📊 Performance Statistics

The system provides detailed statistics:

```json
{
  "input_images": 100,
  "selected_images": 30,
  "target_total": 100,
  "actual_generated": 98,
  "multiplication_factor": 0.3,
  "mode": "sampling",
  "efficiency": 0.98
}
```

In sampling mode, `selected_images` is the number of sampled inputs; `efficiency` is the fraction of the target count actually generated (98% here).

## 🔧 Troubleshooting

### **Common Issues**

1. **Low efficiency in sampling mode**
   - Increase `min_methods` or adjust `target_size`
   - Check the available augmentation methods
2. **Memory issues with large datasets**
   - Use sampling mode with a lower factor
   - Reduce the `target_size` resolution
   - Enable `memory_efficient` mode
3. **Inconsistent augmentation results**
   - Set `random_seed` for reproducibility
   - Adjust the method probabilities
   - Check the `min_methods`/`max_methods` balance

### **Performance Tips**

- **Sampling Mode**: Use for large datasets (>1000 images)
- **GPU Acceleration**: Enable for YOLO detection
- **Batch Processing**: Process in chunks for memory efficiency
- **Probability Tuning**: Use higher probabilities for stable methods

## 📈 Benchmarks

### **Processing Speed**

- **Direct Mode**: ~2-3 images/second
- **YOLO + Augmentation**: ~1-2 images/second
- **Memory Usage**: ~2-4 GB for 1000 images

### **Output Quality**

- **Raw Images**: 100% preserved quality
- **Augmented Images**: Balanced realism vs. diversity
- **Grayscale Conversion**: Consistent preprocessing

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- **YOLOv8**: Ultralytics for the detection framework
- **OpenCV**: Computer vision operations
- **NumPy**: Numerical computations
- **PyTorch**: Deep learning backend

---

**For questions and support, please open an issue on GitHub.**