Nguyễn Phước Thành 81c8afefe3 update textgen
2025-08-12 22:46:12 +07:00

ID Card Data Augmentation Pipeline

A comprehensive data augmentation pipeline for ID card images with YOLO-based detection, smart sampling strategies, and advanced augmentation techniques.

🚀 New Features v2.0

Smart Data Strategy

  • Sampling Mode (factor < 1.0): Process only a percentage of input data
  • Multiplication Mode (factor >= 1.0): Multiply total dataset size
  • Balanced Output: Includes both raw and augmented images
  • Configurable Sampling: Random, stratified, or uniform selection

Enhanced Augmentation

  • Random Method Combination: Mix and match augmentation techniques
  • Method Probability Weights: Control frequency of each augmentation
  • Raw Image Preservation: Always includes original processed images
  • Flexible Processing Modes: Individual, sequential, or random combination

🎯 Key Features

YOLO-based ID Card Detection

  • Automatic detection and cropping of ID cards from large images
  • Configurable confidence and IoU thresholds
  • Multiple cropping modes (bbox, square, aspect_ratio)
  • Padding and target size customization

Advanced Data Augmentation

  • Geometric Transformations: Rotation with multiple angles
  • Random Cropping: Simulates partially visible cards
  • Noise Addition: Simulates worn-out cards
  • Partial Blockage: Simulates occluded card details
  • Blurring: Simulates motion blur while keeping readability
  • Brightness/Contrast: Mimics different lighting conditions
  • Color Jittering: HSV adjustments for color variations
  • Perspective Transform: Simulates viewing angle changes
  • Grayscale Conversion: Final preprocessing step for all images

Flexible Configuration

  • YAML-based configuration system
  • Command-line argument overrides
  • Smart data strategy configuration
  • Comprehensive logging and statistics
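The override behaviour described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `apply_overrides` helper and the `preview` config key are hypothetical, and the config dict stands in for whatever the YAML file loads to. Only the `--enable-id-detection` and `--preview` flags and the `id_card_detection.enabled` key come from this README.

```python
import argparse
import copy


def apply_overrides(config, argv=None):
    """Return a copy of `config` with command-line overrides applied.

    Hypothetical helper: the real main.py may wire its flags differently.
    """
    parser = argparse.ArgumentParser(description="ID card augmentation pipeline")
    parser.add_argument("--enable-id-detection", action="store_true",
                        help="turn on YOLO-based ID card detection")
    parser.add_argument("--preview", action="store_true",
                        help="preview results before full processing")
    args = parser.parse_args(argv)

    merged = copy.deepcopy(config)  # leave the loaded config untouched
    if args.enable_id_detection:
        merged.setdefault("id_card_detection", {})["enabled"] = True
    merged["preview"] = args.preview  # assumed key name, for illustration only
    return merged


# Stand-in for the dict a YAML loader would return.
config = {
    "id_card_detection": {"enabled": False},
    "data_strategy": {"multiplication_factor": 3.0},
}
merged = apply_overrides(config, ["--enable-id-detection"])
print(merged["id_card_detection"]["enabled"])
```

Copying the config before merging keeps the file-based values available for logging alongside the effective values.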

📋 Requirements

# Python 3.8+
conda create -n gpu python=3.8
conda activate gpu

# Install dependencies
pip install -r requirements.txt

Dependencies

  • opencv-python>=4.5.0
  • numpy>=1.21.0
  • Pillow>=8.3.0
  • PyYAML>=5.4.0
  • ultralytics>=8.0.0 (for YOLO models)
  • torch>=1.12.0 (for GPU acceleration)

🛠️ Installation

  1. Clone the repository
git clone <repository-url>
cd IDcardsGenerator
  2. Install dependencies
pip install -r requirements.txt
  3. Prepare YOLO model (optional)
# Place your trained YOLO model at:
data/weights/id_cards_yolov8n.pt

📖 Usage

Basic Usage

# Run with default configuration (3x multiplication)
python main.py

# Run with sampling mode (30% of input data)
python main.py  # Set multiplication_factor: 0.3 in config

# Run with ID card detection enabled
python main.py --enable-id-detection

Data Strategy Examples

Sampling Mode (factor < 1.0)

data_strategy:
  multiplication_factor: 0.3  # Process 30% of input images
  sampling:
    method: "random"          # random, stratified, uniform
    preserve_distribution: true
  • Input: 100 images → Select 30 images → Output: 100 images total
  • Each selected image generates ~3-4 versions (including raw)

Multiplication Mode (factor >= 1.0)

data_strategy:
  multiplication_factor: 3.0  # 3x dataset size
  • Input: 100 images → Process all → Output: 300 images total
  • Each image generates 3 versions (1 raw + 2 augmented)
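The arithmetic behind the two modes can be sketched as below. This is an assumption about how the counts are derived from `multiplication_factor`, chosen to reproduce the numbers in the examples above; the `plan_dataset` function and its return keys are hypothetical names.

```python
def plan_dataset(n_input, factor):
    """Derive how many images to select and how many versions each one yields.

    Assumed rule: factor < 1.0 selects a subset but keeps the output at the
    original dataset size; factor >= 1.0 processes everything and scales the
    output size.
    """
    if n_input <= 0 or factor <= 0:
        raise ValueError("n_input and factor must be positive")

    if factor < 1.0:
        mode = "sampling"
        selected = max(1, round(n_input * factor))
        target_total = n_input
    else:
        mode = "multiplication"
        selected = n_input
        target_total = round(n_input * factor)

    # Spread the target evenly; the first `extra` images get one more version.
    base, extra = divmod(target_total, selected)
    versions = [base + 1] * extra + [base] * (selected - extra)
    return {
        "mode": mode,
        "selected": selected,
        "target_total": target_total,
        "versions_per_image": versions,  # each count includes the raw image
    }


sampling = plan_dataset(100, 0.3)
multiplication = plan_dataset(100, 3.0)
print(sampling["selected"], sorted(set(sampling["versions_per_image"])))
print(multiplication["selected"], sorted(set(multiplication["versions_per_image"])))
```

With 100 inputs and a factor of 0.3, 30 images are selected and each yields 3 or 4 versions, which is where the "~3-4 versions" figure comes from.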

Augmentation Strategy

augmentation:
  strategy:
    mode: "random_combine"     # random_combine, sequential, individual
    min_methods: 2             # Min augmentation methods per image
    max_methods: 4             # Max augmentation methods per image
    
  methods:
    rotation:
      enabled: true
      probability: 0.8         # 80% chance to be selected
      angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
      
    random_cropping:
      enabled: true  
      probability: 0.7
      ratio_range: [0.7, 1.0]
      
    # ... other methods with probabilities
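One plausible reading of this configuration is weighted selection without replacement: pick a count between `min_methods` and `max_methods`, then favour methods with higher `probability`. The sketch below uses Efraimidis-Spirakis keys for the weighted draw; this is an illustrative choice, not necessarily what the pipeline does. The `select_methods` function, the `blurring` key, and its probability are hypothetical.

```python
import random


def select_methods(methods, min_methods=2, max_methods=4, rng=None):
    """Pick between min_methods and max_methods enabled methods, weighted by probability."""
    rng = rng or random.Random()
    weights = {
        name: cfg["probability"]
        for name, cfg in methods.items()
        if cfg.get("enabled", True) and cfg.get("probability", 0) > 0
    }
    if not weights:
        return []

    k = min(rng.randint(min_methods, max_methods), len(weights))
    # Efraimidis-Spirakis: draw u in [0, 1) per item, rank by u ** (1 / weight),
    # and keep the k largest keys. Higher weights tend to produce larger keys.
    ranked = sorted(
        weights,
        key=lambda name: rng.random() ** (1.0 / weights[name]),
        reverse=True,
    )
    return ranked[:k]


methods = {
    "rotation": {"enabled": True, "probability": 0.8},
    "random_cropping": {"enabled": True, "probability": 0.7},
    "partial_blockage": {"enabled": True, "probability": 0.3},
    "perspective": {"enabled": True, "probability": 0.2},
    "blurring": {"enabled": False, "probability": 0.5},  # hypothetical entry
}
chosen = select_methods(methods, rng=random.Random(42))
print(chosen)
```

Passing a seeded `random.Random` instance keeps the selection reproducible across runs.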

🔄 Workflow

Smart Processing Pipeline

Step 1: Data Selection

  • Sampling Mode: Randomly select subset of input images
  • Multiplication Mode: Process all input images
  • Stratified Sampling: Preserve file type distribution
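Stratified selection by file type could look like the sketch below, which groups inputs by extension and samples the same fraction from each group. The `stratified_sample` function is a hypothetical name, and rounding each group to at least one image is an assumption.

```python
import random
from collections import defaultdict
from pathlib import Path


def stratified_sample(paths, factor, seed=None):
    """Sample roughly `factor` of the paths from each file-extension group."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for path in paths:
        groups[Path(path).suffix.lower()].append(path)

    selected = []
    for suffix in sorted(groups):  # sorted for a deterministic group order
        members = groups[suffix]
        # Keep at least one image per file type, never more than the group has.
        k = min(len(members), max(1, round(len(members) * factor)))
        selected.extend(rng.sample(members, k))
    return selected


paths = [f"img_{i:03d}.jpg" for i in range(70)] + [f"scan_{i:03d}.png" for i in range(30)]
picked = stratified_sample(paths, 0.3, seed=42)
print(len(picked))
```

Because every group is sampled at the same rate, the ratio of `.jpg` to `.png` files in the selection stays close to the ratio in the input.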

Step 2: ID Card Detection (Optional)

When id_card_detection.enabled: true:

  1. YOLO Detection: Locate ID cards in large images
  2. Cropping: Extract individual ID cards with padding
  3. Output: Cropped ID cards saved to out/processed/
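The cropping step can be illustrated without running a detector. The sketch below takes a bounding box such as a YOLO detection would provide and computes the padded crop region for the `bbox` and `square` modes; the `compute_crop_box` function is hypothetical, the `aspect_ratio` mode is omitted, and clamping to the image bounds is an assumption about how out-of-frame padding is handled.

```python
def compute_crop_box(bbox, image_size, padding=0, mode="bbox"):
    """Return an (x1, y1, x2, y2) crop region clamped to the image bounds.

    bbox is (x1, y1, x2, y2) in pixels; image_size is (width, height).
    """
    x1, y1, x2, y2 = bbox
    img_w, img_h = image_size

    # Grow the detection by the padding on every side.
    x1, y1, x2, y2 = x1 - padding, y1 - padding, x2 + padding, y2 + padding

    if mode == "square":
        # Expand the shorter side around the box centre.
        side = max(x2 - x1, y2 - y1)
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        x1, x2 = cx - side / 2, cx + side / 2
        y1, y2 = cy - side / 2, cy + side / 2
    elif mode != "bbox":
        raise ValueError(f"unsupported cropping mode: {mode}")

    return (
        max(0, round(x1)),
        max(0, round(y1)),
        min(img_w, round(x2)),
        min(img_h, round(y2)),
    )


tight = compute_crop_box((100, 50, 300, 150), (640, 480), padding=10, mode="bbox")
square = compute_crop_box((100, 50, 300, 150), (640, 480), padding=10, mode="square")
print(tight, square)
```

With a NumPy image array, the returned region would be applied as `image[y1:y2, x1:x2]`.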

Step 3: Smart Augmentation

  1. Raw Processing: Always include original (resized + grayscale)
  2. Random Combination: Select 2-4 augmentation methods randomly
  3. Method Application: Apply selected methods with probability weights
  4. Final Processing: Grayscale conversion for all outputs

📊 Output Structure

output_directory/
├── processed/                    # Cropped ID cards (if detection enabled)
│   ├── id_card_001.jpg
│   ├── id_card_002.jpg  
│   └── processing_summary.json
├── im1__raw_001.jpg             # Raw processed images
├── im1__aug_001.jpg             # Augmented images (random combinations)
├── im1__aug_002.jpg
├── im2__raw_001.jpg
├── im2__aug_001.jpg
└── processing_summary.json

File Naming Convention

  • {basename}_raw_001.jpg: Original image (resized + grayscale)
  • {basename}_aug_001.jpg: Augmented version 1 (random methods)
  • {basename}_aug_002.jpg: Augmented version 2 (different methods)
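A small helper that follows the convention listed above might look like this. The `output_names` function is a hypothetical name, and it assumes exactly one raw image per input with a zero-padded, three-digit counter.

```python
def output_names(basename, n_augmented, ext=".jpg"):
    """List the output file names for one input image: one raw plus N augmented."""
    names = [f"{basename}_raw_001{ext}"]
    names += [f"{basename}_aug_{i:03d}{ext}" for i in range(1, n_augmented + 1)]
    return names


print(output_names("im1", 2))
```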

🎯 Use Cases

Dataset Expansion

# Triple your dataset size with balanced augmentation
data_strategy:
  multiplication_factor: 3.0

Smart Sampling for Large Datasets

# Process only 20% but maintain original dataset size
data_strategy:
  multiplication_factor: 0.2
  sampling:
    method: "stratified"  # Preserve file type distribution

Quality Control

# Preview results before full processing
python main.py --preview

⚙️ Advanced Configuration

Augmentation Strategy Modes

Random Combination

augmentation:
  strategy:
    mode: "random_combine"
    min_methods: 2
    max_methods: 4

Each image gets 2-4 randomly selected augmentation methods.

Sequential Application

augmentation:
  strategy:
    mode: "sequential"

All enabled methods are applied to each image in sequence.

Individual Methods

augmentation:
  strategy:
    mode: "individual"

Legacy mode: each method creates a separate output image.
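The difference between the three modes is easiest to see by listing which method chains each one produces for a single input image. In this sketch, each inner list stands for one output image and the methods applied to it. The `plan_chains` function is hypothetical, and the random branch ignores probability weights for brevity.

```python
import random


def plan_chains(mode, enabled_methods, min_methods=2, max_methods=4, rng=None):
    """Return one list of method names per output image for the given mode."""
    rng = rng or random.Random()
    if mode == "random_combine":
        # One output image with a random subset of methods applied together.
        k = min(rng.randint(min_methods, max_methods), len(enabled_methods))
        return [rng.sample(enabled_methods, k)]
    if mode == "sequential":
        # One output image with every enabled method applied in order.
        return [list(enabled_methods)]
    if mode == "individual":
        # One output image per method, each applied on its own.
        return [[method] for method in enabled_methods]
    raise ValueError(f"unknown strategy mode: {mode}")


enabled = ["rotation", "random_cropping", "noise", "blurring", "perspective"]
for mode in ("random_combine", "sequential", "individual"):
    print(mode, plan_chains(mode, enabled, rng=random.Random(0)))
```

The method names in `enabled` are placeholders; only `rotation`, `random_cropping`, `perspective`, and `partial_blockage` appear as config keys in this README.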

Method Probability Tuning

methods:
  rotation:
    probability: 0.9      # High chance - common transformation
  perspective:
    probability: 0.2      # Low chance - subtle effect
  partial_blockage:
    probability: 0.3      # Medium chance - specific use case

📊 Performance Statistics

The system provides detailed statistics. In the sampling-mode example below, 30 of 100 input images were selected, and 98 of the 100 target images were generated, for an efficiency of 0.98 (98% of the target):

{
  "input_images": 100,
  "selected_images": 30,
  "target_total": 100,
  "actual_generated": 98,
  "multiplication_factor": 0.3,
  "mode": "sampling",
  "efficiency": 0.98
}
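The fields in this summary relate to each other in a simple way, sketched below under the assumption that efficiency is the ratio of generated images to the target total. The `summarize` function is hypothetical, and the `"multiplication"` mode label is assumed by analogy with `"sampling"`.

```python
import json


def summarize(input_images, multiplication_factor, actual_generated):
    """Build a statistics dict shaped like the example summary."""
    sampling = multiplication_factor < 1.0
    if sampling:
        selected = round(input_images * multiplication_factor)
        target_total = input_images
    else:
        selected = input_images
        target_total = round(input_images * multiplication_factor)
    return {
        "input_images": input_images,
        "selected_images": selected,
        "target_total": target_total,
        "actual_generated": actual_generated,
        "multiplication_factor": multiplication_factor,
        "mode": "sampling" if sampling else "multiplication",  # second label assumed
        "efficiency": round(actual_generated / target_total, 2),
    }


stats = summarize(input_images=100, multiplication_factor=0.3, actual_generated=98)
print(json.dumps(stats, indent=2))
```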

🔧 Troubleshooting

Common Issues

  1. Low efficiency in sampling mode
    • Increase min_methods or adjust target_size
    • Check available augmentation methods
  2. Memory issues with large datasets
    • Use sampling mode with lower factor
    • Reduce target_size resolution
    • Enable memory_efficient mode
  3. Inconsistent augmentation results
    • Set random_seed for reproducibility
    • Adjust method probabilities
    • Check min_methods/max_methods balance

Performance Tips

  • Sampling Mode: Use for large datasets (>1000 images)
  • GPU Acceleration: Enable for YOLO detection
  • Batch Processing: Process in chunks for memory efficiency
  • Probability Tuning: Higher probabilities for stable methods
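Chunked processing, as suggested in the tips above, can be done with a small generator so that only one batch of images is held in memory at a time. This is a generic sketch; the `chunked` helper and the batch size are hypothetical and not part of the pipeline's documented API.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


image_paths = [f"img_{i:04d}.jpg" for i in range(10)]
for batch in chunked(image_paths, 4):
    # Load, augment, and write this batch before moving to the next one.
    print(len(batch), batch[0])
```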

📈 Benchmarks

Processing Speed

  • Direct Mode: ~2-3 images/second
  • YOLO + Augmentation: ~1-2 images/second
  • Memory Usage: ~2-4GB for 1000 images

Output Quality

  • Raw Images: No augmentation applied; only resized and converted to grayscale
  • Augmented Images: Balanced realism vs. diversity
  • Grayscale Conversion: Consistent preprocessing

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • YOLOv8: Ultralytics for the detection framework
  • OpenCV: Computer vision operations
  • NumPy: Numerical computations
  • PyTorch: Deep learning backend

For questions and support, please open an issue on GitHub.
