# ID Card Data Augmentation Pipeline
A comprehensive data augmentation pipeline for ID card images with YOLO-based detection, smart sampling strategies, and advanced augmentation techniques.

![Pipeline Overview](docs/images/yolov8_pipeline.png)
## 🚀 New Features v2.0
### **Smart Data Strategy**
- **Sampling Mode** (`factor < 1.0`): Process only a percentage of input data
- **Multiplication Mode** (`factor >= 1.0`): Multiply total dataset size
- **Balanced Output**: Includes both raw and augmented images
- **Configurable Sampling**: Random, stratified, or uniform selection
### **Enhanced Augmentation**
- **Random Method Combination**: Mix and match augmentation techniques
- **Method Probability Weights**: Control frequency of each augmentation
- **Raw Image Preservation**: Always includes original processed images
- **Flexible Processing Modes**: Individual, sequential, or random combination
## 🎯 Key Features
### **YOLO-based ID Card Detection**
- Automatic detection and cropping of ID cards from large images
- Configurable confidence and IoU thresholds
- Multiple cropping modes (bbox, square, aspect_ratio)
- Padding and target size customization
### **Advanced Data Augmentation**
- **Geometric Transformations**: Rotation with multiple angles
- **Random Cropping**: Simulates partially visible cards
- **Noise Addition**: Simulates worn-out cards
- **Partial Blockage**: Simulates occluded card details
- **Blurring**: Simulates motion blur while keeping readability
- **Brightness/Contrast**: Mimics different lighting conditions
- **Color Jittering**: HSV adjustments for color variations
- **Perspective Transform**: Simulates viewing angle changes
- **Grayscale Conversion**: Final preprocessing step for all images
### **Flexible Configuration**
- YAML-based configuration system
- Command-line argument overrides
- Smart data strategy configuration
- Comprehensive logging and statistics
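
The sketch below illustrates one way a command-line flag could override a value loaded from the YAML file. It is not taken from `main.py`: the function name `apply_cli_overrides` is hypothetical, and the mapping from `--enable-id-detection` to `id_card_detection.enabled` is an assumption based on the flag and key names used elsewhere in this README.

```python
import argparse
import copy


def apply_cli_overrides(config, argv):
    """Return a copy of `config` with command-line overrides applied.

    Hypothetical helper: assumes --enable-id-detection should set
    config["id_card_detection"]["enabled"] to True.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--enable-id-detection", action="store_true")
    args = parser.parse_args(argv)

    # Work on a deep copy so the values loaded from YAML stay untouched.
    merged = copy.deepcopy(config)
    if args.enable_id_detection:
        merged.setdefault("id_card_detection", {})["enabled"] = True
    return merged


# A dict standing in for the parsed YAML configuration.
yaml_config = {"id_card_detection": {"enabled": False}}
overridden = apply_cli_overrides(yaml_config, ["--enable-id-detection"])
print(overridden["id_card_detection"]["enabled"])
```

Keeping the merge in a single function means the YAML file remains the source of defaults, and each flag only has to touch the one key it overrides.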
## 📋 Requirements
```bash
# Python 3.8+
conda create -n gpu python=3.8
conda activate gpu

# Install dependencies
pip install -r requirements.txt
```
### Dependencies
- `opencv-python>=4.5.0`
- `numpy>=1.21.0`
- `Pillow>=8.3.0`
- `PyYAML>=5.4.0`
- `ultralytics>=8.0.0` (for YOLO models)
- `torch>=1.12.0` (for GPU acceleration)
## 🛠️ Installation
1. **Clone the repository**
   ```bash
   git clone <repository-url>
   cd IDcardsGenerator
   ```
2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```
3. **Prepare YOLO model** (optional)
   ```bash
   # Place your trained YOLO model at:
   #   data/weights/id_cards_yolov8n.pt
   ```
## 📖 Usage
### **Basic Usage**
```bash
# Run with default configuration (3x multiplication)
python main.py

# Run with sampling mode (30% of input data)
python main.py  # Set multiplication_factor: 0.3 in config

# Run with ID card detection enabled
python main.py --enable-id-detection
```
### **Data Strategy Examples**
#### **Sampling Mode** (factor < 1.0)
```yaml
data_strategy:
  multiplication_factor: 0.3  # Process 30% of input images
  sampling:
    method: "random"          # random, stratified, uniform
    preserve_distribution: true
```
- Input: 100 images → Select 30 images → Output: 100 images total
- Each selected image generates ~3-4 versions (including raw)
#### **Multiplication Mode** (factor >= 1.0)
```yaml
data_strategy:
  multiplication_factor: 3.0  # 3x dataset size
```
- Input: 100 images → Process all → Output: 300 images total
- Each image generates 3 versions (1 raw + 2 augmented)
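
The arithmetic behind the two modes can be sketched as follows. This is an illustration inferred from the examples above, not the pipeline's actual code: the function name `plan_dataset` and the rounding behaviour are assumptions.

```python
def plan_dataset(n_input, factor):
    """Estimate selection and output counts for a multiplication factor.

    Assumption: in sampling mode the target output size equals the input
    size, which matches the "100 -> 30 -> 100" example in this README.
    """
    if n_input <= 0 or factor <= 0:
        raise ValueError("n_input and factor must be positive")

    if factor < 1.0:
        # Sampling mode: process a fraction of the inputs.
        mode = "sampling"
        selected = max(1, round(n_input * factor))
        target_total = n_input
    else:
        # Multiplication mode: process everything, grow the dataset.
        mode = "multiplication"
        selected = n_input
        target_total = round(n_input * factor)

    return {
        "mode": mode,
        "selected_images": selected,
        "target_total": target_total,
        # Average outputs per selected image, raw version included.
        "versions_per_image": target_total / selected,
    }


print(plan_dataset(100, 0.3))
print(plan_dataset(100, 3.0))
```

With 100 inputs and a factor of 0.3, the average comes to about 3.3 versions per selected image, which is why the sampling example above quotes "~3-4 versions".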
### **Augmentation Strategy**
```yaml
augmentation:
  strategy:
    mode: "random_combine"  # random_combine, sequential, individual
    min_methods: 2          # Min augmentation methods per image
    max_methods: 4          # Max augmentation methods per image
  methods:
    rotation:
      enabled: true
      probability: 0.8      # 80% chance to be selected
      angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
    random_cropping:
      enabled: true
      probability: 0.7
      ratio_range: [0.7, 1.0]
    # ... other methods with probabilities
```
## 🔄 Workflow
### **Smart Processing Pipeline**
#### **Step 1: Data Selection**
- **Sampling Mode**: Randomly select subset of input images
- **Multiplication Mode**: Process all input images
- **Stratified Sampling**: Preserve file type distribution
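
As an illustration of stratified selection, the sketch below groups files by extension and samples the same fraction from each group, so the file type distribution is roughly preserved. The function name `stratified_sample` and the per-group rounding are assumptions, not the pipeline's implementation.

```python
import random
from collections import defaultdict
from pathlib import Path


def stratified_sample(paths, factor, rng):
    """Pick roughly `factor` of the files from each file-extension group."""
    groups = defaultdict(list)
    for path in paths:
        groups[Path(path).suffix.lower()].append(path)

    selected = []
    for ext in sorted(groups):  # sorted for a stable group order
        members = groups[ext]
        # Keep at least one file per group, never more than the group holds.
        k = min(len(members), max(1, round(len(members) * factor)))
        selected.extend(rng.sample(members, k))
    return selected


files = [f"img_{i}.jpg" for i in range(80)] + [f"scan_{i}.png" for i in range(20)]
subset = stratified_sample(files, 0.2, random.Random(42))
print(len(subset))
```

Passing in a seeded `random.Random` instance keeps the selection reproducible without touching the global random state.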
#### **Step 2: ID Card Detection** (Optional)
When `id_card_detection.enabled: true`:
1. **YOLO Detection**: Locate ID cards in large images
2. **Cropping**: Extract individual ID cards with padding
3. **Output**: Cropped ID cards saved to `out/processed/`
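
The cropping step can be sketched as below for the plain `bbox` mode. It assumes the detector returns a box as pixel coordinates `(x1, y1, x2, y2)`; the helper name `padded_crop_box` is hypothetical, and the `square` and `aspect_ratio` modes are not covered.

```python
def padded_crop_box(bbox, image_size, padding):
    """Expand a detection box by `padding` pixels, clamped to the image.

    bbox is (x1, y1, x2, y2) in pixels; image_size is (width, height).
    """
    x1, y1, x2, y2 = bbox
    width, height = image_size
    left = max(0, x1 - padding)
    top = max(0, y1 - padding)
    right = min(width, x2 + padding)
    bottom = min(height, y2 + padding)
    return left, top, right, bottom


# A card detected near the left edge of a 210x400 image.
print(padded_crop_box((5, 50, 200, 180), (210, 400), 10))
```

Clamping matters for cards near the image border: without it the padded box would extend outside the image and the crop would fail or wrap around.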
#### **Step 3: Smart Augmentation**
1. **Raw Processing**: Always include original (resized + grayscale)
2. **Random Combination**: Select 2-4 augmentation methods randomly
3. **Method Application**: Apply selected methods with probability weights
4. **Final Processing**: Grayscale conversion for all outputs
## 📊 Output Structure
```
output_directory/
├── processed/                   # Cropped ID cards (if detection enabled)
│   ├── id_card_001.jpg
│   ├── id_card_002.jpg
│   └── processing_summary.json
├── im1__raw_001.jpg             # Raw processed images
├── im1__aug_001.jpg             # Augmented images (random combinations)
├── im1__aug_002.jpg
├── im2__raw_001.jpg
├── im2__aug_001.jpg
└── processing_summary.json
```
### **File Naming Convention**
- `{basename}_raw_001.jpg`: Original image (resized + grayscale)
- `{basename}_aug_001.jpg`: Augmented version 1 (random methods)
- `{basename}_aug_002.jpg`: Augmented version 2 (different methods)
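
A small sketch of how output names following the pattern listed above could be generated. The helper name `output_names` is hypothetical, and the use of a single underscore and a `.jpg` extension for every output is taken from the convention list, not from the pipeline's code.

```python
from pathlib import Path


def output_names(source, n_augmented):
    """Build the raw and augmented file names for one input image."""
    base = Path(source).stem  # file name without directory or extension
    names = [f"{base}_raw_001.jpg"]
    names += [f"{base}_aug_{i:03d}.jpg" for i in range(1, n_augmented + 1)]
    return names


print(output_names("scans/im1.png", 2))
```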
## 🎯 Use Cases
### **Dataset Expansion**
```yaml
# Triple your dataset size with balanced augmentation
data_strategy:
  multiplication_factor: 3.0
```
### **Smart Sampling for Large Datasets**
```yaml
# Process only 20% but maintain original dataset size
data_strategy:
  multiplication_factor: 0.2
  sampling:
    method: "stratified"  # Preserve file type distribution
```
### **Quality Control**
```bash
# Preview results before full processing
python main.py --preview
```
## ⚙️ Advanced Configuration
### **Augmentation Strategy Modes**
#### **Random Combination** (Recommended)
```yaml
augmentation:
  strategy:
    mode: "random_combine"
    min_methods: 2
    max_methods: 4
```
Each image gets 2-4 randomly selected augmentation methods.
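
One possible reading of this mode is sketched below: draw a method count between `min_methods` and `max_methods`, then pick that many enabled methods without repetition, using each method's `probability` as a selection weight. The README does not specify how the probabilities are applied, so the weighting scheme and the function name `pick_methods` are assumptions.

```python
import random


def pick_methods(methods, min_methods, max_methods, rng):
    """Choose between min_methods and max_methods distinct enabled methods."""
    names = [n for n, cfg in methods.items() if cfg.get("enabled", True)]
    weights = [methods[n].get("probability", 1.0) for n in names]

    k = min(rng.randint(min_methods, max_methods), len(names))
    chosen = []
    for _ in range(k):
        if not names or sum(weights) <= 0:
            break  # nothing selectable is left
        pick = rng.choices(names, weights=weights, k=1)[0]
        index = names.index(pick)
        chosen.append(pick)
        # Remove the pick so a method is applied at most once per image.
        names.pop(index)
        weights.pop(index)
    return chosen


config = {
    "rotation": {"enabled": True, "probability": 0.8},
    "random_cropping": {"enabled": True, "probability": 0.7},
    "noise": {"enabled": True, "probability": 0.5},
    "blurring": {"enabled": True, "probability": 0.5},
    "perspective": {"enabled": False, "probability": 0.2},
}
print(pick_methods(config, 2, 4, random.Random(0)))
```

Methods with a higher weight appear in more of the generated combinations, while disabled methods are never picked.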
#### **Sequential Application**
```yaml
augmentation:
  strategy:
    mode: "sequential"
```
All enabled methods are applied to each image in sequence.
#### **Individual Methods**
```yaml
augmentation:
  strategy:
    mode: "individual"
```
Legacy mode - each method creates separate output images.
### **Method Probability Tuning**
```yaml
methods:
  rotation:
    probability: 0.9  # High chance - common transformation
  perspective:
    probability: 0.2  # Low chance - subtle effect
  partial_blockage:
    probability: 0.3  # Medium chance - specific use case
```
## 📊 Performance Statistics
The system provides detailed statistics:
```json
{
"input_images": 100,
"selected_images": 30, // In sampling mode
"target_total": 100,
"actual_generated": 98,
"multiplication_factor": 0.3,
"mode": "sampling",
"efficiency": 0.98 // 98% target achievement
}
```
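
The `efficiency` value in this example is consistent with the ratio of generated images to the target (98 / 100). The sketch below reproduces the summary under that assumption; the function name `summarize_run` is hypothetical.

```python
def summarize_run(input_images, selected_images, target_total,
                  actual_generated, multiplication_factor):
    """Assemble run statistics, assuming efficiency = generated / target."""
    return {
        "input_images": input_images,
        "selected_images": selected_images,
        "target_total": target_total,
        "actual_generated": actual_generated,
        "multiplication_factor": multiplication_factor,
        "mode": "sampling" if multiplication_factor < 1.0 else "multiplication",
        "efficiency": round(actual_generated / target_total, 2),
    }


stats = summarize_run(100, 30, 100, 98, 0.3)
print(stats["mode"], stats["efficiency"])
```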
## 🔧 Troubleshooting
### **Common Issues**
1. **Low efficiency in sampling mode**
   - Increase `min_methods` or adjust `target_size`
   - Check available augmentation methods
2. **Memory issues with large datasets**
   - Use sampling mode with lower factor
   - Reduce `target_size` resolution
   - Enable `memory_efficient` mode
3. **Inconsistent augmentation results**
   - Set `random_seed` for reproducibility
   - Adjust method probabilities
   - Check `min_methods`/`max_methods` balance
### **Performance Tips**
- **Sampling Mode**: Use for large datasets (>1000 images)
- **GPU Acceleration**: Enable for YOLO detection
- **Batch Processing**: Process in chunks for memory efficiency
- **Probability Tuning**: Higher probabilities for stable methods
## 📈 Benchmarks
### **Processing Speed**
- **Direct Mode**: ~2-3 images/second
- **YOLO + Augmentation**: ~1-2 images/second
- **Memory Usage**: ~2-4GB for 1000 images
### **Output Quality**
- **Raw Images**: No augmentation applied; only resized and converted to grayscale
- **Augmented Images**: Balanced realism vs. diversity
- **Grayscale Conversion**: Consistent preprocessing
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
## 🙏 Acknowledgments
- **YOLOv8**: Ultralytics for the detection framework
- **OpenCV**: Computer vision operations
- **NumPy**: Numerical computations
- **PyTorch**: Deep learning backend

---

**For questions and support, please open an issue on GitHub.**