ID Card Data Augmentation Pipeline
A comprehensive data augmentation pipeline for ID card images with YOLO-based detection and advanced augmentation techniques.
🚀 Features
YOLO-based ID Card Detection
- Automatic detection and cropping of ID cards from large images
- Configurable confidence and IoU thresholds
- Multiple cropping modes (bbox, square, aspect_ratio)
- Padding and target size customization
Advanced Data Augmentation
- Geometric Transformations: Rotation with multiple angles
- Random Cropping: Simulates partially visible cards
- Noise Addition: Simulates worn-out cards
- Partial Blockage: Simulates occluded card details
- Blurring: Simulates blurred but readable images
- Brightness/Contrast: Mimics different lighting conditions
- Grayscale Conversion: Final preprocessing step for all images
Flexible Configuration
- YAML-based configuration system
- Command-line argument overrides
- Environment-specific settings
- Comprehensive logging
📋 Requirements
# Python 3.8+
conda create -n gpu python=3.8
conda activate gpu
# Install dependencies
pip install -r requirements.txt
Dependencies
opencv-python>=4.5.0
numpy>=1.21.0
Pillow>=8.3.0
PyYAML>=5.4.0
ultralytics>=8.0.0 (for YOLO models)
🛠️ Installation
- Clone the repository
git clone <repository-url>
cd IDcardsGenerator
- Install dependencies
pip install -r requirements.txt
- Prepare YOLO model (optional)
# Place your trained YOLO model at:
data/weights/id_cards_yolov8n.pt
📖 Usage
Basic Usage
# Run with default configuration
python main.py
# Run with ID card detection enabled
python main.py --enable-id-detection
# Run with custom input/output directories
python main.py --input-dir "path/to/input" --output-dir "path/to/output"
Configuration Options
ID Card Detection
# Enable detection with custom model
python main.py --enable-id-detection --model-path "path/to/model.pt"
# Adjust detection parameters
python main.py --enable-id-detection --confidence 0.3 --crop-mode square
# Set target size for cropped cards
python main.py --enable-id-detection --crop-target-size "640,640"
Data Augmentation
# Customize augmentation parameters
python main.py --num-augmentations 5 --target-size "512,512"
# Preview augmentation results
python main.py --preview
Configuration File
Edit config/config.yaml for persistent settings:
# ID Card Detection
id_card_detection:
  enabled: false  # Enable/disable YOLO detection
  model_path: "data/weights/id_cards_yolov8n.pt"
  confidence_threshold: 0.25
  iou_threshold: 0.45
  padding: 10
  crop_mode: "bbox"
  target_size: null
# Data Augmentation
augmentation:
  rotation:
    enabled: true
    angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
  random_cropping:
    enabled: true
    ratio_range: [0.7, 1.0]
  random_noise:
    enabled: true
    mean_range: [0.0, 0.7]
    variance_range: [0.0, 0.1]
  partial_blockage:
    enabled: true
    coverage_range: [0.0, 0.25]
  blurring:
    enabled: true
    kernel_ratio_range: [0.0, 0.0084]
  brightness_contrast:
    enabled: true
    alpha_range: [0.4, 3.0]
    beta_range: [1, 100]
  grayscale:
    enabled: true  # Applied as final step
# Processing
processing:
  target_size: [640, 640]
  num_augmentations: 3
  save_format: "jpg"
  quality: 95
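The command-line flags shown under Usage override values from this file. The sketch below illustrates one way such overrides could be layered on top of a loaded config dictionary. The flag names come from the usage examples; the helpers `parse_size`, `build_parser`, and `apply_overrides` are hypothetical and not taken from the project's code, and mapping `--target-size` to `processing.target_size` is an assumption.

```python
import argparse


def parse_size(text):
    """Turn a "W,H" string such as "640,640" into a [W, H] list of ints."""
    parts = [int(p) for p in text.split(",")]
    if len(parts) != 2:
        raise argparse.ArgumentTypeError(f"expected 'W,H', got {text!r}")
    return parts


def build_parser():
    # Flag names are the ones used in the usage examples of this README.
    parser = argparse.ArgumentParser()
    parser.add_argument("--enable-id-detection", action="store_true")
    parser.add_argument("--confidence", type=float)
    parser.add_argument("--crop-mode", choices=["bbox", "square", "aspect_ratio"])
    parser.add_argument("--num-augmentations", type=int)
    parser.add_argument("--target-size", type=parse_size)
    return parser


def apply_overrides(config, args):
    """Copy only the flags the user actually passed onto the config dict."""
    detection = config.setdefault("id_card_detection", {})
    processing = config.setdefault("processing", {})
    if args.enable_id_detection:
        detection["enabled"] = True
    if args.confidence is not None:
        detection["confidence_threshold"] = args.confidence
    if args.crop_mode is not None:
        detection["crop_mode"] = args.crop_mode
    if args.num_augmentations is not None:
        processing["num_augmentations"] = args.num_augmentations
    if args.target_size is not None:
        # Assumption: --target-size maps to processing.target_size.
        processing["target_size"] = args.target_size
    return config
```

In a real run the dictionary would typically come from `yaml.safe_load` on config/config.yaml; flags that are not passed leave the file's values untouched.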
🔄 Workflow
Two-Step Processing Pipeline
Step 1: ID Card Detection (Optional)
When id_card_detection.enabled is true:
- Input: Large images containing multiple ID cards
- YOLO Detection: Locate and detect ID cards
- Cropping: Extract individual ID cards with padding
- Output: Cropped ID cards saved to out/processed/
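The cropping step can be illustrated with a small sketch that turns a detected box into a padded crop window. It assumes boxes arrive as pixel coordinates `(x1, y1, x2, y2)`, as in the `xyxy` boxes of Ultralytics results. The function `crop_box` is hypothetical, and only the `bbox` and `square` modes are sketched because the README does not define how `aspect_ratio` behaves.

```python
import math


def crop_box(bbox, image_size, padding=10, mode="bbox"):
    """Return a (left, top, right, bottom) crop window clamped to the image.

    bbox is (x1, y1, x2, y2) in pixels; image_size is (width, height).
    """
    x1, y1, x2, y2 = bbox
    width, height = image_size
    if mode == "square":
        # Grow the shorter side around the box centre so both sides match.
        side = max(x2 - x1, y2 - y1)
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        x1, x2 = cx - side / 2, cx + side / 2
        y1, y2 = cy - side / 2, cy + side / 2
    elif mode != "bbox":
        raise ValueError(f"mode not covered by this sketch: {mode}")
    # Pad on every side, then clamp to the image borders.
    left = max(0, math.floor(x1 - padding))
    top = max(0, math.floor(y1 - padding))
    right = min(width, math.ceil(x2 + padding))
    bottom = min(height, math.ceil(y2 + padding))
    return left, top, right, bottom
```

With a NumPy or OpenCV image array, the crop itself would then be `image[top:bottom, left:right]`. Note that clamping can make a `square` crop non-square when the card sits near an image border.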
Step 2: Data Augmentation
- Input: Original images OR cropped ID cards
- Augmentation: Apply 6 augmentation methods:
  - Rotation (9 different angles)
  - Random cropping (70-100% ratio)
  - Random noise (simulate wear)
  - Partial blockage (simulate occlusion)
  - Blurring (simulate motion blur)
  - Brightness/Contrast adjustment
- Grayscale: Convert all images to grayscale (final step)
- Output: Augmented images in main output directory
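To make the role of the configured ranges concrete, here is a minimal sketch of how one parameter set per method might be drawn before the image operations run. It assumes each `*_range` is sampled uniformly and that one rotation angle is picked at random from the list; the README does not confirm either. The function `sample_params` is hypothetical.

```python
import random

# Same values as the augmentation section of config/config.yaml.
AUGMENTATION_CONFIG = {
    "rotation": {"enabled": True, "angles": [30, 60, 120, 150, 180, 210, 240, 300, 330]},
    "random_cropping": {"enabled": True, "ratio_range": [0.7, 1.0]},
    "random_noise": {"enabled": True, "mean_range": [0.0, 0.7], "variance_range": [0.0, 0.1]},
    "partial_blockage": {"enabled": True, "coverage_range": [0.0, 0.25]},
    "blurring": {"enabled": True, "kernel_ratio_range": [0.0, 0.0084]},
    "brightness_contrast": {"enabled": True, "alpha_range": [0.4, 3.0], "beta_range": [1, 100]},
}


def sample_params(config, rng):
    """Draw one concrete parameter set for every enabled method."""
    plan = []
    for method, options in config.items():
        if not options.get("enabled", True):
            continue
        params = {}
        for key, value in options.items():
            if key.endswith("_range"):
                low, high = value
                # Assumption: ranges are sampled uniformly.
                params[key[: -len("_range")]] = rng.uniform(low, high)
            elif key == "angles":
                params["angle"] = rng.choice(value)
        plan.append((method, params))
    # Grayscale is documented as the final step, so it always comes last.
    plan.append(("grayscale", {}))
    return plan


if __name__ == "__main__":
    for method, params in sample_params(AUGMENTATION_CONFIG, random.Random(0)):
        print(method, params)
```

Passing a seeded `random.Random` keeps a run reproducible, which helps when comparing augmented datasets.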
Direct Augmentation Mode
When id_card_detection.enabled is false:
- Skips YOLO detection
- Applies augmentation directly to input images
- All images are converted to grayscale
📊 Output Structure
output_directory/
├── processed/ # Cropped ID cards (if detection enabled)
│ ├── id_card_001.jpg
│ ├── id_card_002.jpg
│ └── processing_summary.json
├── im1__rotation_01.png # Augmented images
├── im1__cropping_01.png
├── im1__noise_01.png
├── im1__blockage_01.png
├── im1__blurring_01.png
├── im1__brightness_contrast_01.png
└── augmentation_summary.json
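The file names in the tree follow a visible pattern: source stem, two underscores, method name, and a two-digit index. A small sketch of building and parsing such names, inferred only from the listing above; `augmented_name` and `parse_augmented_name` are hypothetical helpers, not the project's API.

```python
from pathlib import Path


def augmented_name(source, method, index, extension="png"):
    """Build a name like im1__rotation_01.png from a source image path."""
    stem = Path(source).stem
    return f"{stem}__{method}_{index:02d}.{extension}"


def parse_augmented_name(name):
    """Split an augmented file name back into (source stem, method, index)."""
    base = Path(name).stem
    # The last "__" separates the source stem from the method and index.
    stem, _, tail = base.rpartition("__")
    # The last "_" separates the method name from the numeric index.
    method, _, index = tail.rpartition("_")
    return stem, method, int(index)
```

Parsing the names back is handy for grouping augmented files by their source image, for example when splitting a dataset so that variants of one card stay in the same split.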
🎯 Use Cases
Training Data Generation
# Generate diverse training data
python main.py --enable-id-detection --num-augmentations 10
Quality Control
# Preview results before processing
python main.py --preview
Batch Processing
# Process large datasets
python main.py --input-dir "large_dataset/" --output-dir "augmented_dataset/"
⚙️ Advanced Configuration
Custom Augmentation Parameters
augmentation:
  rotation:
    angles: [45, 90, 135, 180, 225, 270, 315]  # Custom angles
  random_cropping:
    ratio_range: [0.8, 0.95]  # Tighter cropping
  random_noise:
    mean_range: [0.1, 0.5]  # More noise
    variance_range: [0.05, 0.15]
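The snippet above lists only the keys being changed. Assuming omitted keys such as `enabled` keep their default values (the README does not state this), a partial override could be combined with the defaults through a recursive merge. The helper `deep_merge` below is a hypothetical illustration of that idea.

```python
import copy


def deep_merge(defaults, overrides):
    """Return a new dict where overrides replace defaults key by key."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so sibling keys in the nested section are preserved.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

Lists such as `angles` are replaced as a whole rather than concatenated, which matches the intent of supplying a custom angle set.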
Performance Optimization
performance:
  num_workers: 4
  prefetch_factor: 2
  pin_memory: true
  use_gpu: false
📝 Logging
The system provides comprehensive logging:
- File: logs/data_augmentation.log
- Console: Real-time progress updates
- Summary: JSON files with processing statistics
Log Levels
- INFO: General processing information
- WARNING: Non-critical issues (e.g., no cards detected)
- ERROR: Critical errors
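A file-plus-console setup like the one described can be put together with Python's standard `logging` module. This is a minimal sketch, not the project's actual configuration: the logger name `data_augmentation`, the message format, and the function `setup_logging` are assumptions, while the log file path is the one listed above.

```python
import logging
from pathlib import Path


def setup_logging(log_file="logs/data_augmentation.log", level=logging.INFO):
    """Log to both a file and the console with one shared format."""
    log_path = Path(log_file)
    log_path.parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger("data_augmentation")  # hypothetical name
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate lines if called twice

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = (
        logging.FileHandler(log_path, encoding="utf-8"),
        logging.StreamHandler(),
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = setup_logging()
    log.info("processing started")
    log.warning("no cards detected in im1.jpg")
```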
🔧 Troubleshooting
Common Issues
- No images detected
  - Check input directory path
  - Verify image formats (jpg, png, bmp, tiff)
  - Ensure images are not corrupted
- YOLO model not found
  - Place model file at data/weights/id_cards_yolov8n.pt
  - Or specify custom path with --model-path
- Memory issues
  - Reduce num_augmentations
  - Use smaller target_size
  - Enable GPU if available
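For the "No images detected" case, the first two checks can be automated with a short script. The sketch below assumes the pipeline selects files by extension; the `.jpeg` and `.tif` spellings are added as an assumption, and `find_images` is a hypothetical helper.

```python
from pathlib import Path

# jpg, png, bmp and tiff come from the list above; .jpeg and .tif are
# common alternative spellings added here as an assumption.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def find_images(input_dir):
    """Return every image file under input_dir, searching subfolders too."""
    root = Path(input_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"input directory not found: {root}")
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

An empty result points to a wrong path or unsupported formats. Detecting corrupted files additionally requires decoding each one, for example with `cv2.imread`, which returns `None` when a file cannot be read.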
Performance Tips
- GPU Acceleration: Set use_gpu: true in config
- Batch Processing: Use multiple workers for large datasets
- Memory Management: Process in smaller batches
🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- YOLOv8: Ultralytics for the detection framework
- OpenCV: Computer vision operations
- NumPy: Numerical computations
For questions and support, please open an issue on GitHub.