combine augment
334
README.md
@@ -1,10 +1,24 @@
# ID Card Data Augmentation Pipeline

A comprehensive data augmentation pipeline for ID card images with YOLO-based detection and advanced augmentation techniques.
A comprehensive data augmentation pipeline for ID card images with YOLO-based detection, smart sampling strategies, and advanced augmentation techniques.

## 🚀 Features
## 🚀 New Features v2.0

### **Smart Data Strategy**
- **Sampling Mode** (`factor < 1.0`): Process only a percentage of input data
- **Multiplication Mode** (`factor >= 1.0`): Multiply total dataset size
- **Balanced Output**: Includes both raw and augmented images
- **Configurable Sampling**: Random, stratified, or uniform selection

### **Enhanced Augmentation**
- **Random Method Combination**: Mix and match augmentation techniques
- **Method Probability Weights**: Control frequency of each augmentation
- **Raw Image Preservation**: Always includes original processed images
- **Flexible Processing Modes**: Individual, sequential, or random combination

## 🎯 Key Features

### **YOLO-based ID Card Detection**
- Automatic detection and cropping of ID cards from large images
@@ -17,15 +31,17 @@ A comprehensive data augmentation pipeline for ID card images with YOLO-based de
- **Random Cropping**: Simulates partially visible cards
- **Noise Addition**: Simulates worn-out cards
- **Partial Blockage**: Simulates occluded card details
- **Blurring**: Simulates blurred but readable images
- **Blurring**: Simulates motion blur while keeping readability
- **Brightness/Contrast**: Mimics different lighting conditions
- **Color Jittering**: HSV adjustments for color variations
- **Perspective Transform**: Simulates viewing angle changes
- **Grayscale Conversion**: Final preprocessing step for all images

### **Flexible Configuration**
- YAML-based configuration system
- Command-line argument overrides
- Environment-specific settings
- Comprehensive logging
- Smart data strategy configuration
- Comprehensive logging and statistics

## 📋 Requirements

@@ -44,6 +60,7 @@ pip install -r requirements.txt
- `Pillow>=8.3.0`
- `PyYAML>=5.4.0`
- `ultralytics>=8.0.0` (for YOLO models)
- `torch>=1.12.0` (for GPU acceleration)

## 🛠️ Installation

@@ -69,115 +86,80 @@ data/weights/id_cards_yolov8n.pt
### **Basic Usage**

```bash
# Run with default configuration
# Run with default configuration (3x multiplication)
python main.py

# Run with sampling mode (30% of input data)
python main.py  # Set multiplication_factor: 0.3 in config

# Run with ID card detection enabled
python main.py --enable-id-detection

# Run with custom input/output directories
python main.py --input-dir "path/to/input" --output-dir "path/to/output"
```

### **Configuration Options**
### **Data Strategy Examples**

#### **ID Card Detection**
```bash
# Enable detection with custom model
python main.py --enable-id-detection --model-path "path/to/model.pt"

# Adjust detection parameters
python main.py --enable-id-detection --confidence 0.3 --crop-mode square

# Set target size for cropped cards
python main.py --enable-id-detection --crop-target-size "640,640"
```
#### **Sampling Mode** (factor < 1.0)
```yaml
data_strategy:
  multiplication_factor: 0.3  # Process 30% of input images
  sampling:
    method: "random"  # random, stratified, uniform
    preserve_distribution: true
```
- Input: 100 images → Select 30 images → Output: 100 images total
- Each selected image generates ~3-4 versions (including raw)
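
The split above (100 outputs spread across 30 selected images, so roughly 3-4 versions each) can be sketched as follows. This is only an illustration: `versions_per_image` is a hypothetical helper, and the pipeline may distribute the remainder differently.

```python
def versions_per_image(target_total: int, selected: int) -> list[int]:
    """Spread target_total outputs as evenly as possible over the selected images.

    Hypothetical helper for illustration; each count includes the raw version.
    """
    if selected <= 0:
        raise ValueError("selected must be positive")
    base, extra = divmod(target_total, selected)
    # The first `extra` images receive one additional version.
    return [base + 1 if i < extra else base for i in range(selected)]


counts = versions_per_image(target_total=100, selected=30)
print(min(counts), max(counts), sum(counts))  # prints: 3 4 100
```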

#### **Data Augmentation**
```bash
# Customize augmentation parameters
python main.py --num-augmentations 5 --target-size "512,512"

# Preview augmentation results
python main.py --preview
```
#### **Multiplication Mode** (factor >= 1.0)
```yaml
data_strategy:
  multiplication_factor: 3.0  # 3x dataset size
```
- Input: 100 images → Process all → Output: 300 images total
- Each image generates 3 versions (1 raw + 2 augmented)
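
A minimal sketch of how the two modes might be resolved from `multiplication_factor`, assuming the output target equals the input count in sampling mode and `inputs * factor` in multiplication mode, as in the examples above. `resolve_strategy` is a hypothetical name, not a function from this repository.

```python
def resolve_strategy(num_inputs: int, factor: float) -> dict:
    """Derive mode, number of images to process, and output target (illustrative)."""
    if factor <= 0:
        raise ValueError("multiplication_factor must be positive")
    if factor < 1.0:
        # Sampling mode: process a subset, but still aim for the input count.
        selected = max(1, round(num_inputs * factor))
        return {"mode": "sampling", "selected_images": selected,
                "target_total": num_inputs}
    # Multiplication mode: process everything and scale the dataset.
    return {"mode": "multiplication", "selected_images": num_inputs,
            "target_total": round(num_inputs * factor)}


print(resolve_strategy(100, 0.3))
print(resolve_strategy(100, 3.0))
```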

### **Configuration File**

Edit `config/config.yaml` for persistent settings:
### **Augmentation Strategy**

```yaml
# ID Card Detection
id_card_detection:
  enabled: false  # Enable/disable YOLO detection
  model_path: "data/weights/id_cards_yolov8n.pt"
  confidence_threshold: 0.25
  iou_threshold: 0.45
  padding: 10
  crop_mode: "bbox"
  target_size: null

# Data Augmentation
augmentation:
  rotation:
    enabled: true
    angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
  random_cropping:
    enabled: true
    ratio_range: [0.7, 1.0]
  random_noise:
    enabled: true
    mean_range: [0.0, 0.7]
    variance_range: [0.0, 0.1]
  partial_blockage:
    enabled: true
    coverage_range: [0.0, 0.25]
  blurring:
    enabled: true
    kernel_ratio_range: [0.0, 0.0084]
  brightness_contrast:
    enabled: true
    alpha_range: [0.4, 3.0]
    beta_range: [1, 100]
  grayscale:
    enabled: true  # Applied as final step

# Processing
processing:
  target_size: [640, 640]
  num_augmentations: 3
  save_format: "jpg"
  quality: 95
strategy:
  mode: "random_combine"  # random_combine, sequential, individual
  min_methods: 2  # Min augmentation methods per image
  max_methods: 4  # Max augmentation methods per image

methods:
  rotation:
    enabled: true
    probability: 0.8  # 80% chance to be selected
    angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]

  random_cropping:
    enabled: true
    probability: 0.7
    ratio_range: [0.7, 1.0]

  # ... other methods with probabilities
```

## 🔄 Workflow

### **Two-Step Processing Pipeline**
### **Smart Processing Pipeline**

#### **Step 1: ID Card Detection (Optional)**
#### **Step 1: Data Selection**
- **Sampling Mode**: Randomly select subset of input images
- **Multiplication Mode**: Process all input images
- **Stratified Sampling**: Preserve file type distribution
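
As a rough illustration of stratified selection, the sketch below groups files by extension and samples the same fraction from each group. It assumes "file type" means the file extension; `stratified_sample` and its parameters are hypothetical names, not part of the pipeline's API.

```python
import random
from collections import defaultdict
from pathlib import Path


def stratified_sample(paths, fraction, seed=None):
    """Pick roughly `fraction` of the files from each extension group."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for p in paths:
        groups[Path(p).suffix.lower()].append(p)
    chosen = []
    for ext in sorted(groups):  # sorted for a deterministic group order
        members = groups[ext]
        # Keep at least one file per group, never more than the group holds.
        k = min(len(members), max(1, round(len(members) * fraction)))
        chosen.extend(rng.sample(members, k))
    return chosen


files = [f"img_{i}.jpg" for i in range(80)] + [f"img_{i}.png" for i in range(20)]
subset = stratified_sample(files, 0.3, seed=0)
print(len(subset))  # 24 jpg + 6 png = 30
```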

#### **Step 2: ID Card Detection** (Optional)
When `id_card_detection.enabled: true`:
1. **Input**: Large images containing multiple ID cards
2. **YOLO Detection**: Locate and detect ID cards
3. **Cropping**: Extract individual ID cards with padding
4. **Output**: Cropped ID cards saved to `out/processed/`
1. **YOLO Detection**: Locate ID cards in large images
2. **Cropping**: Extract individual ID cards with padding
3. **Output**: Cropped ID cards saved to `out/processed/`
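
The "cropping with padding" step can be sketched as below. It assumes the detector returns boxes as `(x1, y1, x2, y2)` pixel coordinates and that padding is clamped to the image bounds; `padded_crop_box` is a hypothetical helper, and the default of 10 pixels mirrors the `padding: 10` setting in the configuration.

```python
def padded_crop_box(bbox, image_size, padding=10):
    """Expand a detection box by `padding` pixels, clamped to the image.

    bbox: (x1, y1, x2, y2) in pixels; image_size: (width, height).
    Returns (left, top, right, bottom), the box order Pillow's Image.crop expects.
    """
    x1, y1, x2, y2 = bbox
    width, height = image_size
    left = max(0, int(x1) - padding)
    top = max(0, int(y1) - padding)
    right = min(width, int(x2) + padding)
    bottom = min(height, int(y2) + padding)
    if right <= left or bottom <= top:
        raise ValueError("empty crop region")
    return left, top, right, bottom


box = padded_crop_box((5, 50, 300, 240), image_size=(640, 480))
print(box)  # (0, 40, 310, 250)
```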

#### **Step 2: Data Augmentation**
1. **Input**: Original images OR cropped ID cards
2. **Augmentation**: Apply 6 augmentation methods:
   - Rotation (9 different angles)
   - Random cropping (70-100% ratio)
   - Random noise (simulate wear)
   - Partial blockage (simulate occlusion)
   - Blurring (simulate motion blur)
   - Brightness/Contrast adjustment
3. **Grayscale**: Convert all images to grayscale (final step)
4. **Output**: Augmented images in main output directory

### **Direct Augmentation Mode**
When `id_card_detection.enabled: false`:
- Skips YOLO detection
- Applies augmentation directly to input images
- All images are converted to grayscale
#### **Step 3: Smart Augmentation**
1. **Raw Processing**: Always include original (resized + grayscale)
2. **Random Combination**: Select 2-4 augmentation methods randomly
3. **Method Application**: Apply selected methods with probability weights
4. **Final Processing**: Grayscale conversion for all outputs
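
One way the random-combination step could work is sketched below: draw a method count between `min_methods` and `max_methods`, then pick that many distinct methods with the configured probabilities acting as relative weights. Treating `probability` as a sampling weight is an assumption, `pick_methods` is a hypothetical helper, and the weights shown are example values.

```python
import random


def pick_methods(probabilities, min_methods=2, max_methods=4, rng=None):
    """Choose distinct augmentation methods, weighted by their probabilities.

    probabilities: mapping of method name -> positive weight.
    """
    rng = rng or random.Random()
    pool = dict(probabilities)
    upper = min(max_methods, len(pool))
    lower = min(min_methods, upper)
    count = rng.randint(lower, upper)
    chosen = []
    while len(chosen) < count:
        names = list(pool)
        weights = [pool[name] for name in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]  # sample without replacement
    return chosen


weights = {"rotation": 0.8, "random_cropping": 0.7,
           "partial_blockage": 0.3, "perspective": 0.2}
selected = pick_methods(weights, rng=random.Random(42))
print(selected)
```

Passing a seeded `random.Random` makes the selection repeatable, which helps when comparing runs.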

## 📊 Output Structure

@@ -185,105 +167,146 @@ When `id_card_detection.enabled: false`:
```
output_directory/
├── processed/                       # Cropped ID cards (if detection enabled)
│   ├── id_card_001.jpg
│   ├── id_card_002.jpg
│   └── processing_summary.json
├── im1__rotation_01.png             # Augmented images
├── im1__cropping_01.png
├── im1__noise_01.png
├── im1__blockage_01.png
├── im1__blurring_01.png
├── im1__brightness_contrast_01.png
└── augmentation_summary.json
├── im1__raw_001.jpg                 # Raw processed images
├── im1__aug_001.jpg                 # Augmented images (random combinations)
├── im1__aug_002.jpg
├── im2__raw_001.jpg
├── im2__aug_001.jpg
└── processing_summary.json
```

### **File Naming Convention**
- `{basename}__raw_001.jpg`: Original image (resized + grayscale)
- `{basename}__aug_001.jpg`: Augmented version 1 (random methods)
- `{basename}__aug_002.jpg`: Augmented version 2 (different methods)

## 🎯 Use Cases

### **Training Data Generation**
```bash
# Generate diverse training data
python main.py --enable-id-detection --num-augmentations 10
```
### **Dataset Expansion**
```yaml
# Triple your dataset size with balanced augmentation
data_strategy:
  multiplication_factor: 3.0
```

### **Smart Sampling for Large Datasets**
```yaml
# Process only 20% but maintain original dataset size
data_strategy:
  multiplication_factor: 0.2
  sampling:
    method: "stratified"  # Preserve file type distribution
```

### **Quality Control**
```bash
# Preview results before processing
# Preview results before full processing
python main.py --preview
```

### **Batch Processing**
```bash
# Process large datasets
python main.py --input-dir "large_dataset/" --output-dir "augmented_dataset/"
```

## ⚙️ Advanced Configuration

### **Custom Augmentation Parameters**
### **Augmentation Strategy Modes**

#### **Random Combination** (Recommended)
```yaml
augmentation:
  rotation:
    angles: [45, 90, 135, 180, 225, 270, 315]  # Custom angles
  random_cropping:
    ratio_range: [0.8, 0.95]  # Tighter cropping
  random_noise:
    mean_range: [0.1, 0.5]  # More noise
    variance_range: [0.05, 0.15]
  strategy:
    mode: "random_combine"
    min_methods: 2
    max_methods: 4
```
Each image gets 2-4 randomly selected augmentation methods.

### **Performance Optimization**

#### **Sequential Application**
```yaml
performance:
  num_workers: 4
  prefetch_factor: 2
  pin_memory: true
  use_gpu: false
augmentation:
  strategy:
    mode: "sequential"
```
All enabled methods applied to each image in sequence.

#### **Individual Methods**
```yaml
augmentation:
  strategy:
    mode: "individual"
```
Legacy mode - each method creates separate output images.
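
To make the difference between the three modes concrete, the sketch below lists which methods would be applied to each output image under each mode. It is an illustration under stated assumptions: `plan_outputs` and `num_outputs` are hypothetical names, and the real pipeline may pick combinations differently (for example, using the per-method probabilities).

```python
import random


def plan_outputs(mode, methods, num_outputs=2,
                 min_methods=2, max_methods=4, seed=None):
    """Return one list of method names per augmented output image."""
    rng = random.Random(seed)
    if mode == "individual":
        # One output per method, each with a single transformation.
        return [[m] for m in methods]
    if mode == "sequential":
        # A single output with every enabled method applied in order.
        return [list(methods)]
    if mode == "random_combine":
        upper = min(max_methods, len(methods))
        lower = min(min_methods, upper)
        return [rng.sample(methods, rng.randint(lower, upper))
                for _ in range(num_outputs)]
    raise ValueError(f"unknown mode: {mode!r}")


enabled = ["rotation", "random_cropping", "random_noise", "blurring"]
for mode in ("individual", "sequential", "random_combine"):
    print(mode, plan_outputs(mode, enabled, seed=1))
```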

### **Method Probability Tuning**
```yaml
methods:
  rotation:
    probability: 0.9  # High chance - common transformation
  perspective:
    probability: 0.2  # Low chance - subtle effect
  partial_blockage:
    probability: 0.3  # Medium chance - specific use case
```

## 📝 Logging
## 📊 Performance Statistics

The system provides comprehensive logging:
- **File**: `logs/data_augmentation.log`
- **Console**: Real-time progress updates
- **Summary**: JSON files with processing statistics
The system provides detailed statistics:

### **Log Levels**
- `INFO`: General processing information
- `WARNING`: Non-critical issues (e.g., no cards detected)
- `ERROR`: Critical errors
```json
{
  "input_images": 100,
  "selected_images": 30,
  "target_total": 100,
  "actual_generated": 98,
  "multiplication_factor": 0.3,
  "mode": "sampling",
  "efficiency": 0.98
}
```
`selected_images` is reported in sampling mode; an `efficiency` of 0.98 means 98% of the target was generated.
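
The `efficiency` figure appears to be the ratio of generated images to the target. A small sketch, assuming that definition and the field names from the summary above (`efficiency` here is a hypothetical helper, not the pipeline's own function):

```python
def efficiency(stats: dict) -> float:
    """Fraction of the output target that was actually generated."""
    if stats["target_total"] <= 0:
        raise ValueError("target_total must be positive")
    return round(stats["actual_generated"] / stats["target_total"], 2)


summary = {
    "input_images": 100,
    "selected_images": 30,
    "target_total": 100,
    "actual_generated": 98,
    "multiplication_factor": 0.3,
    "mode": "sampling",
}
print(efficiency(summary))  # 0.98
```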

## 🔧 Troubleshooting

### **Common Issues**

1. **No images detected**
   - Check input directory path
   - Verify image formats (jpg, png, bmp, tiff)
   - Ensure images are not corrupted
1. **Low efficiency in sampling mode**
   - Increase `min_methods` or adjust `target_size`
   - Check available augmentation methods

2. **YOLO model not found**
   - Place model file at `data/weights/id_cards_yolov8n.pt`
   - Or specify custom path with `--model-path`
2. **Memory issues with large datasets**
   - Use sampling mode with lower factor
   - Reduce `target_size` resolution
   - Enable `memory_efficient` mode

3. **Memory issues**
   - Reduce `num_augmentations`
   - Use smaller `target_size`
   - Enable GPU if available
3. **Inconsistent augmentation results**
   - Set `random_seed` for reproducibility
   - Adjust method probabilities
   - Check `min_methods`/`max_methods` balance

### **Performance Tips**

- **GPU Acceleration**: Set `use_gpu: true` in config
- **Batch Processing**: Use multiple workers for large datasets
- **Memory Management**: Process in smaller batches
- **Sampling Mode**: Use for large datasets (>1000 images)
- **GPU Acceleration**: Enable for YOLO detection
- **Batch Processing**: Process in chunks for memory efficiency
- **Probability Tuning**: Higher probabilities for stable methods

## 📈 Benchmarks

### **Processing Speed**
- **Direct Mode**: ~2-3 images/second
- **YOLO + Augmentation**: ~1-2 images/second
- **Memory Usage**: ~2-4GB for 1000 images

### **Output Quality**
- **Raw Images**: No augmentation applied (resized and converted to grayscale only)
- **Augmented Images**: Balanced realism vs. diversity
- **Grayscale Conversion**: Consistent preprocessing

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License

@@ -294,7 +317,8 @@ This project is licensed under the MIT License - see the LICENSE file for detail
- **YOLOv8**: Ultralytics for the detection framework
- **OpenCV**: Computer vision operations
- **NumPy**: Numerical computations
- **PyTorch**: Deep learning backend

---

**For questions and support, please open an issue on GitHub.**