update augment + YOLO pipeline

2025-08-06 20:52:39 +07:00
parent 4ee14f17d3
commit 51d3a66cc4
9 changed files with 989 additions and 407 deletions
--- a/README.md
+++ b/README.md
@@ -1,132 +1,148 @@
-# ID Cards Data Augmentation Tool
+# ID Card Data Augmentation Pipeline

-A comprehensive data augmentation tool specifically designed for ID card images, implementing 7 different augmentation techniques to simulate real-world scenarios.
+A data augmentation pipeline for ID card images that combines optional YOLO-based card detection with six augmentation techniques and a final grayscale conversion.

-## 🎯 Overview
+![Pipeline Overview](docs/images/yolov8_pipeline.png)

-This tool provides data augmentation capabilities for ID card images, implementing various transformation techniques that mimic real-world conditions such as worn-out cards, partial occlusion, different lighting conditions, and more.
+## 🚀 Features

-## ✨ Features
+### **YOLO-based ID Card Detection**
+- Automatic detection and cropping of ID cards from large images
+- Configurable confidence and IoU thresholds
+- Multiple cropping modes (bbox, square, aspect_ratio)
+- Padding and target size customization

-### 7 Augmentation Techniques
+### **Advanced Data Augmentation**
+- **Geometric Transformations**: Rotation with multiple angles
+- **Random Cropping**: Simulates partially visible cards
+- **Noise Addition**: Simulates worn-out cards
+- **Partial Blockage**: Simulates occluded card details
+- **Blurring**: Simulates blurred but readable images
+- **Brightness/Contrast**: Mimics different lighting conditions
+- **Grayscale Conversion**: Final step applied to all images

-1. **Rotation** - Simulates cards at different angles
-2. **Random Cropping** - Simulates partially visible cards
-3. **Random Noise** - Simulates worn-out cards
-4. **Horizontal Blockage** - Simulates occluded card details
-5. **Grayscale Transformation** - Simulates Xerox/scan copies
-6. **Blurring** - Simulates blurred but readable cards
-7. **Brightness & Contrast** - Simulates different lighting conditions
+### **Flexible Configuration**
+- YAML-based configuration system
+- Command-line argument overrides
+- Environment-specific settings
+- Comprehensive logging

-### Key Features
+## 📋 Requirements

- **Separate Methods**: Each augmentation technique is applied independently
- **Quality Preservation**: Maintains image quality with white background preservation
- **OpenCV Integration**: Uses OpenCV functions for reliable image processing
- **Configurable**: Easy configuration through YAML files
- **Progress Tracking**: Real-time progress monitoring
- **Batch Processing**: Process multiple images efficiently
+```bash
+# Python 3.8+
+conda create -n gpu python=3.8
+conda activate gpu

-## 🚀 Installation
+# Install dependencies
+pip install -r requirements.txt
+```

-### Prerequisites
+### Dependencies
+- `opencv-python>=4.5.0`
+- `numpy>=1.21.0`
+- `Pillow>=8.3.0`
+- `PyYAML>=5.4.0`
+- `ultralytics>=8.0.0` (for YOLO models)

- Python 3.7+
- OpenCV
- NumPy
- PyYAML
- PIL (Pillow)
+## 🛠️ Installation

-### Setup
-
-1. **Clone the repository**:
+1. **Clone the repository**
 ```bash
 git clone <repository-url>
 cd IDcardsGenerator
 ```

-2. **Install dependencies**:
+2. **Install dependencies**
 ```bash
-pip install opencv-python numpy pyyaml pillow
+pip install -r requirements.txt
 ```

-3. **Activate conda environment** (if using GPU):
+3. **Prepare YOLO model** (optional)
 ```bash
-conda activate gpu
+# Place your trained YOLO model at:
+data/weights/id_cards_yolov8n.pt
 ```

-## 📁 Project Structure
+## 📖 Usage

-```
-IDcardsGenerator/
-├── config/
-│   └── config.yaml          # Main configuration file
-├── data/
-│   └── IDcards/
-│       └── processed/       # Input images directory
-├── src/
-│   ├── data_augmentation.py # Core augmentation logic
-│   ├── config_manager.py    # Configuration management
-│   ├── image_processor.py   # Image processing utilities
-│   └── utils.py             # Utility functions
-├── logs/                    # Log files
-├── out/                     # Output directory
-└── main.py                  # Main script
+### **Basic Usage**
+
+```bash
+# Run with default configuration
+python main.py
+
+# Run with ID card detection enabled
+python main.py --enable-id-detection
+
+# Run with custom input/output directories
+python main.py --input-dir "path/to/input" --output-dir "path/to/output"
 ```

-## ⚙️ Configuration
+### **Configuration Options**

-### Main Configuration (`config/config.yaml`)
+#### **ID Card Detection**
+```bash
+# Enable detection with custom model
+python main.py --enable-id-detection --model-path "path/to/model.pt"
+
+# Adjust detection parameters
+python main.py --enable-id-detection --confidence 0.3 --crop-mode square
+
+# Set target size for cropped cards
+python main.py --enable-id-detection --crop-target-size "640,640"
+```
+
+#### **Data Augmentation**
+```bash
+# Customize augmentation parameters
+python main.py --num-augmentations 5 --target-size "512,512"
+
+# Preview augmentation results
+python main.py --preview
+```
+
+### **Configuration File**
+
+Edit `config/config.yaml` for persistent settings:

 ```yaml
-# Data augmentation parameters
+# ID Card Detection
+id_card_detection:
+  enabled: false  # Enable/disable YOLO detection
+  model_path: "data/weights/id_cards_yolov8n.pt"
+  confidence_threshold: 0.25
+  iou_threshold: 0.45
+  padding: 10
+  crop_mode: "bbox"
+  target_size: null
+
+# Data Augmentation
 augmentation:
-  # Rotation
  rotation:
    enabled: true
    angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
-    probability: 1.0
-  
-  # Random cropping
  random_cropping:
    enabled: true
    ratio_range: [0.7, 1.0]
-    probability: 1.0
-  
-  # Random noise
  random_noise:
    enabled: true
    mean_range: [0.0, 0.7]
    variance_range: [0.0, 0.1]
-    probability: 1.0
-  
-  # Partial blockage
  partial_blockage:
    enabled: true
-    num_occlusions_range: [1, 100]
    coverage_range: [0.0, 0.25]
-    variance_range: [0.0, 0.1]
-    probability: 1.0
-  
-  # Grayscale transformation
-  grayscale:
-    enabled: true
-    probability: 1.0
-  
-  # Blurring
  blurring:
    enabled: true
    kernel_ratio_range: [0.0, 0.0084]
-    probability: 1.0
-  
-  # Brightness and contrast
  brightness_contrast:
    enabled: true
    alpha_range: [0.4, 3.0]
    beta_range: [1, 100]
-    probability: 1.0
+  grayscale:
+    enabled: true  # Applied as final step

-# Processing configuration
+# Processing
 processing:
  target_size: [640, 640]
  num_augmentations: 3
@@ -134,156 +150,139 @@ processing:
  quality: 95
 ```
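
The precedence between `config/config.yaml` and command-line overrides can be illustrated with a recursive dictionary merge. This is only a sketch: `DEFAULTS`, `deep_merge`, and the mapping from flags such as `--confidence` to config keys are assumptions made for illustration, not the project's actual implementation.

```python
import copy

# Hypothetical defaults mirroring a few keys from config/config.yaml.
DEFAULTS = {
    "id_card_detection": {
        "enabled": False,
        "confidence_threshold": 0.25,
        "crop_mode": "bbox",
    },
    "processing": {"target_size": [640, 640], "num_augmentations": 3},
}


def deep_merge(base, overrides):
    """Return a copy of `base` with `overrides` applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Assumed mapping for:
#   --enable-id-detection --confidence 0.3 --num-augmentations 5
cli_overrides = {
    "id_card_detection": {"enabled": True, "confidence_threshold": 0.3},
    "processing": {"num_augmentations": 5},
}
config = deep_merge(DEFAULTS, cli_overrides)
print(config["id_card_detection"])
```

Keys that are not overridden (here `crop_mode` and `target_size`) keep their file values, and the defaults themselves are left unmodified.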

-## 🎮 Usage
+## 🔄 Workflow

-### Basic Usage
+### **Two-Step Processing Pipeline**

+#### **Step 1: ID Card Detection (Optional)**
+When `id_card_detection.enabled: true`:
+1. **Input**: Large images containing multiple ID cards
+2. **YOLO Detection**: Locate ID cards in each image
+3. **Cropping**: Extract individual ID cards with padding
+4. **Output**: Cropped ID cards saved to `out/processed/`
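
The cropping step above can be sketched in plain Python. The function name `crop_box` and the exact semantics of the `bbox` and `square` modes are assumptions for illustration; the `aspect_ratio` mode is left out, and the real pipeline may compute its crop regions differently.

```python
def crop_box(bbox, image_size, padding=10, mode="bbox"):
    """Expand a detection box by `padding` pixels and clamp it to the image.

    bbox is (x1, y1, x2, y2) in pixels; image_size is (width, height).
    In "square" mode the shorter side is grown around the box centre first.
    """
    x1, y1, x2, y2 = bbox
    width, height = image_size
    if mode == "square":
        side = max(x2 - x1, y2 - y1)
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        x1, x2 = cx - side / 2, cx + side / 2
        y1, y2 = cy - side / 2, cy + side / 2
    elif mode != "bbox":
        raise ValueError(f"unsupported crop mode: {mode}")
    # Apply padding, then clamp so the crop never leaves the image.
    left = max(0, int(round(x1 - padding)))
    top = max(0, int(round(y1 - padding)))
    right = min(width, int(round(x2 + padding)))
    bottom = min(height, int(round(y2 + padding)))
    return left, top, right, bottom


print(crop_box((100, 50, 400, 250), (1000, 800)))
print(crop_box((100, 50, 400, 250), (1000, 800), mode="square"))
```

With a NumPy image array, the result would be used as `image[top:bottom, left:right]`. Note that clamping can make a "square" crop slightly non-square when the card sits near an image border.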
+
+#### **Step 2: Data Augmentation**
+1. **Input**: Original images OR cropped ID cards
+2. **Augmentation**: Apply 6 augmentation methods:
+   - Rotation (9 different angles)
+   - Random cropping (70-100% ratio)
+   - Random noise (simulate wear)
+   - Partial blockage (simulate occlusion)
+   - Blurring (simulate blurred but readable images)
+   - Brightness/Contrast adjustment
+3. **Grayscale**: Convert all images to grayscale (final step)
+4. **Output**: Augmented images in main output directory
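
As a rough illustration of how the configured ranges might translate into concrete values for each augmented image, the sketch below draws one parameter set with a seeded random generator. The ranges are copied from the example configuration; `sample_parameters` and the choice of uniform sampling are assumptions, not the pipeline's confirmed behaviour.

```python
import random

# Ranges copied from the example config; the sampling scheme is assumed.
AUGMENTATION = {
    "rotation": {"angles": [30, 60, 120, 150, 180, 210, 240, 300, 330]},
    "random_cropping": {"ratio_range": [0.7, 1.0]},
    "brightness_contrast": {"alpha_range": [0.4, 3.0], "beta_range": [1, 100]},
}


def sample_parameters(cfg, rng):
    """Draw one set of augmentation parameters from the configured ranges."""
    bc = cfg["brightness_contrast"]
    return {
        "angle": rng.choice(cfg["rotation"]["angles"]),
        "crop_ratio": rng.uniform(*cfg["random_cropping"]["ratio_range"]),
        "alpha": rng.uniform(*bc["alpha_range"]),  # contrast gain
        "beta": rng.uniform(*bc["beta_range"]),    # brightness offset
    }


rng = random.Random(0)  # fixed seed so a run can be reproduced
params = sample_parameters(AUGMENTATION, rng)
print(params)
```

Passing the generator in explicitly keeps the sampling reproducible, which helps when comparing augmented datasets between runs.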
+
+### **Direct Augmentation Mode**
+When `id_card_detection.enabled: false`:
+- Skips YOLO detection
+- Applies augmentation directly to input images
+- All images are converted to grayscale
+
+## 📊 Output Structure
+
+```
+output_directory/
+├── processed/                    # Cropped ID cards (if detection enabled)
+│   ├── id_card_001.jpg
+│   ├── id_card_002.jpg
+│   └── processing_summary.json
+├── im1__rotation_01.png         # Augmented images
+├── im1__cropping_01.png
+├── im1__noise_01.png
+├── im1__blockage_01.png
+├── im1__blurring_01.png
+├── im1__brightness_contrast_01.png
+└── augmentation_summary.json
+```
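
The augmented file names in the tree above follow a `<stem>__<method>_<NN>.png` pattern. A minimal sketch of a helper that produces such names is shown below; `augmented_name` is a hypothetical function and the two-digit zero padding is inferred from the listed examples.

```python
from pathlib import Path


def augmented_name(source, method, index, ext=".png"):
    """Build '<stem>__<method>_<NN><ext>' from a source image path."""
    return f"{Path(source).stem}__{method}_{index:02d}{ext}"


print(augmented_name("data/IDcards/processed/im1.jpg", "rotation", 1))
# prints: im1__rotation_01.png
```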
+
+## 🎯 Use Cases
+
+### **Training Data Generation**
 ```bash
-python main.py --input-dir data/IDcards/processed --output-dir out
+# Generate diverse training data
+python main.py --enable-id-detection --num-augmentations 10
 ```

-### Command Line Options
-
+### **Quality Control**
 ```bash
-python main.py [OPTIONS]
-
-Options:
-  --config CONFIG           Path to configuration file (default: config/config.yaml)
-  --input-dir INPUT_DIR     Input directory containing images
-  --output-dir OUTPUT_DIR   Output directory for augmented images
-  --num-augmentations N     Number of augmented versions per image (default: 3)
-  --target-size SIZE        Target size for images (width x height)
-  --preview                 Preview augmentation on first image only
-  --info                    Show information about images in input directory
-  --list-presets           List available presets and exit
-  --log-level LEVEL        Logging level (DEBUG, INFO, WARNING, ERROR)
+# Preview results before processing
+python main.py --preview
 ```

-### Examples
-
-1. **Preview augmentation**:
+### **Batch Processing**
 ```bash
-python main.py --preview --input-dir data/IDcards/processed --output-dir test_output
+# Process large datasets
+python main.py --input-dir "large_dataset/" --output-dir "augmented_dataset/"
 ```

-2. **Show image information**:
-```bash
-python main.py --info --input-dir data/IDcards/processed
+## ⚙️ Advanced Configuration
+
+### **Custom Augmentation Parameters**
+
+```yaml
+augmentation:
+  rotation:
+    angles: [45, 90, 135, 180, 225, 270, 315]  # Custom angles
+  random_cropping:
+    ratio_range: [0.8, 0.95]  # Narrower range: keep 80-95% of the image
+  random_noise:
+    mean_range: [0.1, 0.5]    # Higher minimum noise level
+    variance_range: [0.05, 0.15]
 ```

-3. **Custom number of augmentations**:
-```bash
-python main.py --input-dir data/IDcards/processed --output-dir out --num-augmentations 5
+### **Performance Optimization**
+
+```yaml
+performance:
+  num_workers: 4
+  prefetch_factor: 2
+  pin_memory: true
+  use_gpu: false
 ```

-4. **Custom target size**:
-```bash
-python main.py --input-dir data/IDcards/processed --output-dir out --target-size 512x512
-```
-
-## 📊 Output
-
-### File Naming Convention
-
-The tool creates separate files for each augmentation method:
-
-```
-im1_rotation_01.png          # Rotation method
-im1_cropping_01.png          # Random cropping method
-im1_noise_01.png             # Random noise method
-im1_blockage_01.png          # Partial blockage method
-im1_grayscale_01.png         # Grayscale method
-im1_blurring_01.png          # Blurring method
-im1_brightness_contrast_01.png  # Brightness/contrast method
-```
-
-### Output Summary
-
-After processing, you'll see a summary like:
-
-```
-==================================================
-AUGMENTATION SUMMARY
-==================================================
-Original images: 106
-Augmented images: 2226
-Augmentation ratio: 21.00
-Successful augmentations: 106
-Output directory: out
-==================================================
-```
-
-## 🔧 Augmentation Techniques Details
-
-### 1. Rotation
- **Purpose**: Simulates cards at different angles
- **Angles**: 30°, 60°, 120°, 150°, 180°, 210°, 240°, 300°, 330°
- **Method**: OpenCV rotation with white background preservation
-
-### 2. Random Cropping
- **Purpose**: Simulates partially visible ID cards
- **Ratio Range**: 0.7 to 1.0 (70% to 100% of original size)
- **Method**: Random crop with white background preservation
-
-### 3. Random Noise
- **Purpose**: Simulates worn-out cards
- **Mean Range**: 0.0 to 0.7
- **Variance Range**: 0.0 to 0.1
- **Method**: Gaussian noise addition
-
-### 4. Horizontal Blockage
- **Purpose**: Simulates occluded card details
- **Lines**: 1 to 100 horizontal lines
- **Coverage**: 0% to 25% of image area
- **Colors**: Multiple colors to simulate various objects
-
-### 5. Grayscale Transformation
- **Purpose**: Simulates Xerox/scan copies
- **Method**: OpenCV `cv2.cvtColor()` function
- **Output**: 3-channel grayscale image
-
-### 6. Blurring
- **Purpose**: Simulates blurred but readable cards
- **Kernel Ratio**: 0.0 to 0.0084
- **Method**: OpenCV `cv2.filter2D()` with Gaussian kernel
-
-### 7. Brightness & Contrast
- **Purpose**: Simulates different lighting conditions
- **Alpha Range**: 0.4 to 3.0 (contrast)
- **Beta Range**: 1 to 100 (brightness)
- **Method**: OpenCV `cv2.convertScaleAbs()`
-
-## 🛠️ Development
-
-### Adding New Augmentation Methods
-
-1. Add the method to `src/data_augmentation.py`
-2. Update configuration in `config/config.yaml`
-3. Update default config in `src/config_manager.py`
-4. Test with preview mode
-
-### Code Structure
-
- **`main.py`**: Entry point and command-line interface
- **`src/data_augmentation.py`**: Core augmentation logic
- **`src/config_manager.py`**: Configuration management
- **`src/image_processor.py`**: Image processing utilities
- **`src/utils.py`**: Utility functions
-
 ## 📝 Logging

-The tool provides comprehensive logging:
+The system provides comprehensive logging:
+- **File**: `logs/data_augmentation.log`
+- **Console**: Real-time progress updates
+- **Summary**: JSON files with processing statistics

- **File logging**: `logs/data_augmentation.log`
- **Console logging**: Real-time progress updates
- **Log levels**: DEBUG, INFO, WARNING, ERROR
+### **Log Levels**
+- `INFO`: General processing information
+- `WARNING`: Non-critical issues (e.g., no cards detected)
+- `ERROR`: Critical errors
+
+## 🔧 Troubleshooting
+
+### **Common Issues**
+
+1. **No images found**
+   - Check input directory path
+   - Verify image formats (jpg, png, bmp, tiff)
+   - Ensure images are not corrupted
+
+2. **YOLO model not found**
+   - Place model file at `data/weights/id_cards_yolov8n.pt`
+   - Or specify custom path with `--model-path`
+
+3. **Memory issues**
+   - Reduce `num_augmentations`
+   - Use smaller `target_size`
+   - Enable GPU if available
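
To check the first issue quickly, a short script can list which files in the input directory have a supported extension. This is a sketch only: `find_images` is a hypothetical helper, and treating `.jpeg` and `.tif` as accepted aliases is an assumption.

```python
from pathlib import Path

# Extensions from the list above; .jpeg and .tif are assumed aliases.
SUPPORTED = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def find_images(input_dir):
    """Return supported image files directly inside input_dir, sorted."""
    root = Path(input_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"input directory not found: {root}")
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
```

An empty result for a directory that visibly contains images usually points to an unexpected extension or to images stored in subdirectories, which this sketch does not search.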
+
+### **Performance Tips**
+
+- **GPU Acceleration**: Set `use_gpu: true` in config
+- **Batch Processing**: Use multiple workers for large datasets
+- **Memory Management**: Process in smaller batches
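
Processing in smaller batches can be done with a simple generator, so that only one batch of images needs to be held in memory at a time. The helper name `batched` and the batch size below are illustrative assumptions rather than part of the project.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


for batch in batched([f"im{i}.png" for i in range(1, 6)], 2):
    print(batch)
```

Each batch could then be loaded, augmented, and written to disk before the next one is read.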

 ## 🤝 Contributing

 1. Fork the repository
 2. Create a feature branch
 3. Make your changes
-4. Test thoroughly
+4. Add tests if applicable
 5. Submit a pull request

 ## 📄 License
@@ -292,18 +291,10 @@ This project is licensed under the MIT License - see the LICENSE file for detail

 ## 🙏 Acknowledgments

- OpenCV for image processing capabilities
- NumPy for numerical operations
- PyYAML for configuration management
-
-## 📞 Support
-
-For issues and questions:
-1. Check the logs in `logs/data_augmentation.log`
-2. Review the configuration in `config/config.yaml`
-3. Test with preview mode first
-4. Create an issue with detailed information
+- **YOLOv8**: Ultralytics for the detection framework
+- **OpenCV**: Computer vision operations
+- **NumPy**: Numerical computations

 ---

-**Note**: This tool is specifically designed for ID card augmentation and may need adjustments for other image types. 
+**For questions and support, please open an issue on GitHub.**