update augment + YOLO pipeline
9
.gitignore
vendored
@@ -16,4 +16,11 @@
|
|||||||
*.pt
|
*.pt
|
||||||
*.ipynb
|
*.ipynb
|
||||||
*.pyc
|
*.pyc
|
||||||
*.log
|
*.log
|
||||||
|
|
||||||
|
!docs/
|
||||||
|
!docs/**/*.png
|
||||||
|
!docs/**/*.jpg
|
||||||
|
!docs/**/*.jpeg
|
||||||
|
!docs/**/*.gif
|
||||||
|
!docs/**/*.svg
|
423
README.md
@@ -1,132 +1,148 @@
|
|||||||
# ID Cards Data Augmentation Tool
|
# ID Card Data Augmentation Pipeline
|
||||||
|
|
||||||
A comprehensive data augmentation tool specifically designed for ID card images, implementing 7 different augmentation techniques to simulate real-world scenarios.
|
A comprehensive data augmentation pipeline for ID card images with YOLO-based detection and advanced augmentation techniques.
|
||||||
|
|
||||||
## 🎯 Overview
|

|
||||||
|
|
||||||
This tool provides data augmentation capabilities for ID card images, implementing various transformation techniques that mimic real-world conditions such as worn-out cards, partial occlusion, different lighting conditions, and more.
|
## 🚀 Features
|
||||||
|
|
||||||
## ✨ Features
|
### **YOLO-based ID Card Detection**
|
||||||
|
- Automatic detection and cropping of ID cards from large images
|
||||||
|
- Configurable confidence and IoU thresholds
|
||||||
|
- Multiple cropping modes (bbox, square, aspect_ratio)
|
||||||
|
- Padding and target size customization
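The three cropping modes and the padding option listed above can be pictured with a small geometry helper. This is a minimal sketch, not the code in `src/id_card_detector.py` (which this diff does not show): the function name `expand_box`, the `aspect_ratio` default, and the clamping behaviour are all assumptions made for illustration.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def expand_box(box: Box, image_size: Tuple[int, int], padding: int = 10,
               crop_mode: str = "bbox", aspect_ratio: float = 1.586) -> Box:
    """Hypothetical helper: turn a detection box into a crop rectangle.

    aspect_ratio is width / height; 1.586 approximates an ID-1 card and is
    an assumed default, not a value taken from the repository.
    """
    x1, y1, x2, y2 = box
    img_w, img_h = image_size
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    # Padding is added on every side before the mode is applied.
    w = (x2 - x1) + 2 * padding
    h = (y2 - y1) + 2 * padding

    if crop_mode == "square":
        w = h = max(w, h)
    elif crop_mode == "aspect_ratio":
        # Grow one side so that w / h equals the requested ratio.
        if w / h < aspect_ratio:
            w = h * aspect_ratio
        else:
            h = w / aspect_ratio
    elif crop_mode != "bbox":
        raise ValueError(f"unknown crop_mode: {crop_mode!r}")

    # Clamp to the image so the crop never reads outside it.
    left = max(0, int(round(cx - w / 2)))
    top = max(0, int(round(cy - h / 2)))
    right = min(img_w, int(round(cx + w / 2)))
    bottom = min(img_h, int(round(cy + h / 2)))
    return left, top, right, bottom
```

With `crop_mode="bbox"` the result is just the padded detection; the other two modes enlarge it around the same centre, so the card stays centred in the crop.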
|
||||||
|
|
||||||
### 7 Augmentation Techniques
|
### **Advanced Data Augmentation**
|
||||||
|
- **Geometric Transformations**: Rotation with multiple angles
|
||||||
|
- **Random Cropping**: Simulates partially visible cards
|
||||||
|
- **Noise Addition**: Simulates worn-out cards
|
||||||
|
- **Partial Blockage**: Simulates occluded card details
|
||||||
|
- **Blurring**: Simulates blurred but readable images
|
||||||
|
- **Brightness/Contrast**: Mimics different lighting conditions
|
||||||
|
- **Grayscale Conversion**: Final preprocessing step for all images
|
||||||
|
|
||||||
1. **Rotation** - Simulates cards at different angles
|
### **Flexible Configuration**
|
||||||
2. **Random Cropping** - Simulates partially visible cards
|
- YAML-based configuration system
|
||||||
3. **Random Noise** - Simulates worn-out cards
|
- Command-line argument overrides
|
||||||
4. **Horizontal Blockage** - Simulates occluded card details
|
- Environment-specific settings
|
||||||
5. **Grayscale Transformation** - Simulates Xerox/scan copies
|
- Comprehensive logging
|
||||||
6. **Blurring** - Simulates blurred but readable cards
|
|
||||||
7. **Brightness & Contrast** - Simulates different lighting conditions
|
|
||||||
|
|
||||||
### Key Features
|
## 📋 Requirements
|
||||||
|
|
||||||
- **Separate Methods**: Each augmentation technique is applied independently
|
```bash
|
||||||
- **Quality Preservation**: Maintains image quality with white background preservation
|
# Python 3.8+
|
||||||
- **OpenCV Integration**: Uses OpenCV functions for reliable image processing
|
conda create -n gpu python=3.8
|
||||||
- **Configurable**: Easy configuration through YAML files
|
conda activate gpu
|
||||||
- **Progress Tracking**: Real-time progress monitoring
|
|
||||||
- **Batch Processing**: Process multiple images efficiently
|
|
||||||
|
|
||||||
## 🚀 Installation
|
# Install dependencies
|
||||||
|
pip install -r requirements.txt
|
||||||
|
```
|
||||||
|
|
||||||
### Prerequisites
|
### Dependencies
|
||||||
|
- `opencv-python>=4.5.0`
|
||||||
|
- `numpy>=1.21.0`
|
||||||
|
- `Pillow>=8.3.0`
|
||||||
|
- `PyYAML>=5.4.0`
|
||||||
|
- `ultralytics>=8.0.0` (for YOLO models)
|
||||||
|
|
||||||
- Python 3.7+
|
## 🛠️ Installation
|
||||||
- OpenCV
|
|
||||||
- NumPy
|
|
||||||
- PyYAML
|
|
||||||
- PIL (Pillow)
|
|
||||||
|
|
||||||
### Setup
|
1. **Clone the repository**
|
||||||
|
|
||||||
1. **Clone the repository**:
|
|
||||||
```bash
|
```bash
|
||||||
git clone <repository-url>
|
git clone <repository-url>
|
||||||
cd IDcardsGenerator
|
cd IDcardsGenerator
|
||||||
```
|
```
|
||||||
|
|
||||||
2. **Install dependencies**:
|
2. **Install dependencies**
|
||||||
```bash
|
```bash
|
||||||
pip install opencv-python numpy pyyaml pillow
|
pip install -r requirements.txt
|
||||||
```
|
```
|
||||||
|
|
||||||
3. **Activate conda environment** (if using GPU):
|
3. **Prepare YOLO model** (optional)
|
||||||
```bash
|
```bash
|
||||||
conda activate gpu
|
# Place your trained YOLO model at:
|
||||||
|
data/weights/id_cards_yolov8n.pt
|
||||||
```
|
```
|
||||||
|
|
||||||
## 📁 Project Structure
|
## 📖 Usage
|
||||||
|
|
||||||
```
|
### **Basic Usage**
|
||||||
IDcardsGenerator/
|
|
||||||
├── config/
|
```bash
|
||||||
│ └── config.yaml # Main configuration file
|
# Run with default configuration
|
||||||
├── data/
|
python main.py
|
||||||
│ └── IDcards/
|
|
||||||
│ └── processed/ # Input images directory
|
# Run with ID card detection enabled
|
||||||
├── src/
|
python main.py --enable-id-detection
|
||||||
│ ├── data_augmentation.py # Core augmentation logic
|
|
||||||
│ ├── config_manager.py # Configuration management
|
# Run with custom input/output directories
|
||||||
│ ├── image_processor.py # Image processing utilities
|
python main.py --input-dir "path/to/input" --output-dir "path/to/output"
|
||||||
│ └── utils.py # Utility functions
|
|
||||||
├── logs/ # Log files
|
|
||||||
├── out/ # Output directory
|
|
||||||
└── main.py # Main script
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## ⚙️ Configuration
|
### **Configuration Options**
|
||||||
|
|
||||||
### Main Configuration (`config/config.yaml`)
|
#### **ID Card Detection**
|
||||||
|
```bash
|
||||||
|
# Enable detection with custom model
|
||||||
|
python main.py --enable-id-detection --model-path "path/to/model.pt"
|
||||||
|
|
||||||
|
# Adjust detection parameters
|
||||||
|
python main.py --enable-id-detection --confidence 0.3 --crop-mode square
|
||||||
|
|
||||||
|
# Set target size for cropped cards
|
||||||
|
python main.py --enable-id-detection --crop-target-size "640x640"
|
||||||
|
```
|
||||||
|
|
||||||
|
#### **Data Augmentation**
|
||||||
|
```bash
|
||||||
|
# Customize augmentation parameters
|
||||||
|
python main.py --num-augmentations 5 --target-size "512x512"
|
||||||
|
|
||||||
|
# Preview augmentation results
|
||||||
|
python main.py --preview
|
||||||
|
```
|
||||||
|
|
||||||
|
### **Configuration File**
|
||||||
|
|
||||||
|
Edit `config/config.yaml` for persistent settings:
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
# Data augmentation parameters
|
# ID Card Detection
|
||||||
|
id_card_detection:
|
||||||
|
enabled: false # Enable/disable YOLO detection
|
||||||
|
model_path: "data/weights/id_cards_yolov8n.pt"
|
||||||
|
confidence_threshold: 0.25
|
||||||
|
iou_threshold: 0.45
|
||||||
|
padding: 10
|
||||||
|
crop_mode: "bbox"
|
||||||
|
target_size: null
|
||||||
|
|
||||||
|
# Data Augmentation
|
||||||
augmentation:
|
augmentation:
|
||||||
# Rotation
|
|
||||||
rotation:
|
rotation:
|
||||||
enabled: true
|
enabled: true
|
||||||
angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
|
angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
|
||||||
probability: 1.0
|
|
||||||
|
|
||||||
# Random cropping
|
|
||||||
random_cropping:
|
random_cropping:
|
||||||
enabled: true
|
enabled: true
|
||||||
ratio_range: [0.7, 1.0]
|
ratio_range: [0.7, 1.0]
|
||||||
probability: 1.0
|
|
||||||
|
|
||||||
# Random noise
|
|
||||||
random_noise:
|
random_noise:
|
||||||
enabled: true
|
enabled: true
|
||||||
mean_range: [0.0, 0.7]
|
mean_range: [0.0, 0.7]
|
||||||
variance_range: [0.0, 0.1]
|
variance_range: [0.0, 0.1]
|
||||||
probability: 1.0
|
|
||||||
|
|
||||||
# Partial blockage
|
|
||||||
partial_blockage:
|
partial_blockage:
|
||||||
enabled: true
|
enabled: true
|
||||||
num_occlusions_range: [1, 100]
|
|
||||||
coverage_range: [0.0, 0.25]
|
coverage_range: [0.0, 0.25]
|
||||||
variance_range: [0.0, 0.1]
|
|
||||||
probability: 1.0
|
|
||||||
|
|
||||||
# Grayscale transformation
|
|
||||||
grayscale:
|
|
||||||
enabled: true
|
|
||||||
probability: 1.0
|
|
||||||
|
|
||||||
# Blurring
|
|
||||||
blurring:
|
blurring:
|
||||||
enabled: true
|
enabled: true
|
||||||
kernel_ratio_range: [0.0, 0.0084]
|
kernel_ratio_range: [0.0, 0.0084]
|
||||||
probability: 1.0
|
|
||||||
|
|
||||||
# Brightness and contrast
|
|
||||||
brightness_contrast:
|
brightness_contrast:
|
||||||
enabled: true
|
enabled: true
|
||||||
alpha_range: [0.4, 3.0]
|
alpha_range: [0.4, 3.0]
|
||||||
beta_range: [1, 100]
|
beta_range: [1, 100]
|
||||||
probability: 1.0
|
grayscale:
|
||||||
|
enabled: true # Applied as final step
|
||||||
|
|
||||||
# Processing configuration
|
# Processing
|
||||||
processing:
|
processing:
|
||||||
target_size: [640, 640]
|
target_size: [640, 640]
|
||||||
num_augmentations: 3
|
num_augmentations: 3
|
||||||
@@ -134,156 +150,139 @@ processing:
|
|||||||
quality: 95
|
quality: 95
|
||||||
```
|
```
|
||||||
|
|
||||||
## 🎮 Usage
|
## 🔄 Workflow
|
||||||
|
|
||||||
### Basic Usage
|
### **Two-Step Processing Pipeline**
|
||||||
|
|
||||||
|
#### **Step 1: ID Card Detection (Optional)**
|
||||||
|
When `id_card_detection.enabled: true`:
|
||||||
|
1. **Input**: Large images containing multiple ID cards
|
||||||
|
2. **YOLO Detection**: Locate ID cards in each image
|
||||||
|
3. **Cropping**: Extract individual ID cards with padding
|
||||||
|
4. **Output**: Cropped ID cards saved to `out/processed/`
|
||||||
|
|
||||||
|
#### **Step 2: Data Augmentation**
|
||||||
|
1. **Input**: Original images OR cropped ID cards
|
||||||
|
2. **Augmentation**: Apply 6 augmentation methods:
|
||||||
|
- Rotation (9 different angles)
|
||||||
|
- Random cropping (70-100% ratio)
|
||||||
|
- Random noise (simulate wear)
|
||||||
|
- Partial blockage (simulate occlusion)
|
||||||
|
- Blurring (simulate blurred but still readable cards)
|
||||||
|
- Brightness/Contrast adjustment
|
||||||
|
3. **Grayscale**: Convert all images to grayscale (final step)
|
||||||
|
4. **Output**: Augmented images in main output directory
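Of the methods in Step 2, brightness/contrast is the easiest to state exactly: each pixel is scaled by `alpha` (contrast), shifted by `beta` (brightness), and saturated to 8 bits. The sketch below mirrors the documented behaviour of `cv2.convertScaleAbs`, which the previous README named for this step. It is a pure-Python illustration, and `adjust_pixel` is a hypothetical name, not a function in the repository.

```python
from typing import List


def adjust_pixel(value: int, alpha: float, beta: float) -> int:
    """Hypothetical helper: |alpha * value + beta|, saturated to 0..255."""
    scaled = abs(alpha * value + beta)
    return max(0, min(255, int(round(scaled))))


def adjust_row(row: List[int], alpha: float, beta: float) -> List[int]:
    """Apply the same contrast/brightness change to every pixel in a row."""
    return [adjust_pixel(v, alpha, beta) for v in row]


# alpha=1.5 stretches contrast, beta=20 brightens; 200 saturates at 255.
print(adjust_row([0, 100, 200], alpha=1.5, beta=20))
```

Values near the top of `alpha_range: [0.4, 3.0]` saturate most bright pixels, which is worth keeping in mind when previewing results.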
|
||||||
|
|
||||||
|
### **Direct Augmentation Mode**
|
||||||
|
When `id_card_detection.enabled: false`:
|
||||||
|
- Skips YOLO detection
|
||||||
|
- Applies augmentation directly to input images
|
||||||
|
- All images are converted to grayscale
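For reference, the grayscale conversion applied as the final step is normally a weighted sum of the colour channels. The sketch below uses the weights OpenCV documents for its RGB-to-gray conversion; whether the pipeline calls `cv2.cvtColor` in its current form is an assumption based on the previous README, and `to_gray` is a hypothetical name.

```python
def to_gray(r: int, g: int, b: int) -> int:
    """Hypothetical helper: luma of one RGB pixel, Y = 0.299R + 0.587G + 0.114B."""
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


# Green contributes most to perceived brightness, blue least.
for name, rgb in {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}.items():
    print(name, to_gray(*rgb))
```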
|
||||||
|
|
||||||
|
## 📊 Output Structure
|
||||||
|
|
||||||
|
```
|
||||||
|
output_directory/
|
||||||
|
├── processed/ # Cropped ID cards (if detection enabled)
|
||||||
|
│ ├── id_card_001.jpg
|
||||||
|
│ ├── id_card_002.jpg
|
||||||
|
│ └── processing_summary.json
|
||||||
|
├── im1__rotation_01.png # Augmented images
|
||||||
|
├── im1__cropping_01.png
|
||||||
|
├── im1__noise_01.png
|
||||||
|
├── im1__blockage_01.png
|
||||||
|
├── im1__blurring_01.png
|
||||||
|
├── im1__brightness_contrast_01.png
|
||||||
|
└── augmentation_summary.json
|
||||||
|
```
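The file names in the tree above follow a `<source>__<method>_<NN>.png` pattern. A minimal sketch of building and parsing such names, assuming that pattern holds for every output; the helper names are hypothetical and not taken from the repository.

```python
from pathlib import Path
from typing import Tuple


def augmented_name(source: Path, method: str, index: int, ext: str = ".png") -> str:
    """Hypothetical helper: 'im1.jpg' + 'rotation' + 1 -> 'im1__rotation_01.png'."""
    return f"{source.stem}__{method}_{index:02d}{ext}"


def split_augmented_name(name: str) -> Tuple[str, str, int]:
    """Inverse of augmented_name: returns (source stem, method, index).

    Assumes the source stem itself contains no double underscore.
    """
    stem = Path(name).stem
    source, tail = stem.split("__", 1)
    method, index = tail.rsplit("_", 1)
    return source, method, int(index)


print(augmented_name(Path("data/im1.jpg"), "rotation", 1))
print(split_augmented_name("im1__brightness_contrast_01.png"))
```

Splitting the method from the index with `rsplit` keeps method names that contain underscores, such as `brightness_contrast`, intact.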
|
||||||
|
|
||||||
|
## 🎯 Use Cases
|
||||||
|
|
||||||
|
### **Training Data Generation**
|
||||||
```bash
|
```bash
|
||||||
python main.py --input-dir data/IDcards/processed --output-dir out
|
# Generate diverse training data
|
||||||
|
python main.py --enable-id-detection --num-augmentations 10
|
||||||
```
|
```
|
||||||
|
|
||||||
### Command Line Options
|
### **Quality Control**
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python main.py [OPTIONS]
|
# Preview results before processing
|
||||||
|
python main.py --preview
|
||||||
Options:
|
|
||||||
--config CONFIG Path to configuration file (default: config/config.yaml)
|
|
||||||
--input-dir INPUT_DIR Input directory containing images
|
|
||||||
--output-dir OUTPUT_DIR Output directory for augmented images
|
|
||||||
--num-augmentations N Number of augmented versions per image (default: 3)
|
|
||||||
--target-size SIZE Target size for images (width x height)
|
|
||||||
--preview Preview augmentation on first image only
|
|
||||||
--info Show information about images in input directory
|
|
||||||
--list-presets List available presets and exit
|
|
||||||
--log-level LEVEL Logging level (DEBUG, INFO, WARNING, ERROR)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
### Examples
|
### **Batch Processing**
|
||||||
|
|
||||||
1. **Preview augmentation**:
|
|
||||||
```bash
|
```bash
|
||||||
python main.py --preview --input-dir data/IDcards/processed --output-dir test_output
|
# Process large datasets
|
||||||
|
python main.py --input-dir "large_dataset/" --output-dir "augmented_dataset/"
|
||||||
```
|
```
|
||||||
|
|
||||||
2. **Show image information**:
|
## ⚙️ Advanced Configuration
|
||||||
```bash
|
|
||||||
python main.py --info --input-dir data/IDcards/processed
|
### **Custom Augmentation Parameters**
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
augmentation:
|
||||||
|
rotation:
|
||||||
|
angles: [45, 90, 135, 180, 225, 270, 315] # Custom angles
|
||||||
|
random_cropping:
|
||||||
|
ratio_range: [0.8, 0.95] # Tighter cropping
|
||||||
|
random_noise:
|
||||||
|
mean_range: [0.1, 0.5] # More noise
|
||||||
|
variance_range: [0.05, 0.15]
|
||||||
```
|
```
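To make `ratio_range` concrete: a ratio is drawn from the range and a window of that relative size is placed at a random position. The sketch below is one plausible reading, assuming the same ratio is used for width and height; the repository's actual cropping code is not part of this diff, and `random_crop_box` is a hypothetical name.

```python
import random
from typing import Optional, Tuple


def random_crop_box(width: int, height: int,
                    ratio_range: Tuple[float, float] = (0.7, 1.0),
                    rng: Optional[random.Random] = None) -> Tuple[int, int, int, int]:
    """Hypothetical helper: pick a crop window covering `ratio` of each side."""
    rng = rng or random.Random()
    ratio = rng.uniform(*ratio_range)
    crop_w = max(1, int(width * ratio))
    crop_h = max(1, int(height * ratio))
    # Any top-left corner that keeps the window inside the image is valid.
    left = rng.randint(0, width - crop_w)
    top = rng.randint(0, height - crop_h)
    return left, top, left + crop_w, top + crop_h


# A seeded generator makes a preview run reproducible.
print(random_crop_box(640, 640, ratio_range=(0.8, 0.95), rng=random.Random(0)))
```

Under this reading, `[0.8, 0.95]` always removes at least 5% of each side and never more than 20%.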
|
||||||
|
|
||||||
3. **Custom number of augmentations**:
|
### **Performance Optimization**
|
||||||
```bash
|
|
||||||
python main.py --input-dir data/IDcards/processed --output-dir out --num-augmentations 5
|
```yaml
|
||||||
|
performance:
|
||||||
|
num_workers: 4
|
||||||
|
prefetch_factor: 2
|
||||||
|
pin_memory: true
|
||||||
|
use_gpu: false
|
||||||
```
|
```
|
||||||
|
|
||||||
4. **Custom target size**:
|
|
||||||
```bash
|
|
||||||
python main.py --input-dir data/IDcards/processed --output-dir out --target-size 512x512
|
|
||||||
```
|
|
||||||
|
|
||||||
## 📊 Output
|
|
||||||
|
|
||||||
### File Naming Convention
|
|
||||||
|
|
||||||
The tool creates separate files for each augmentation method:
|
|
||||||
|
|
||||||
```
|
|
||||||
im1_rotation_01.png # Rotation method
|
|
||||||
im1_cropping_01.png # Random cropping method
|
|
||||||
im1_noise_01.png # Random noise method
|
|
||||||
im1_blockage_01.png # Partial blockage method
|
|
||||||
im1_grayscale_01.png # Grayscale method
|
|
||||||
im1_blurring_01.png # Blurring method
|
|
||||||
im1_brightness_contrast_01.png # Brightness/contrast method
|
|
||||||
```
|
|
||||||
|
|
||||||
### Output Summary
|
|
||||||
|
|
||||||
After processing, you'll see a summary like:
|
|
||||||
|
|
||||||
```
|
|
||||||
==================================================
|
|
||||||
AUGMENTATION SUMMARY
|
|
||||||
==================================================
|
|
||||||
Original images: 106
|
|
||||||
Augmented images: 2226
|
|
||||||
Augmentation ratio: 21.00
|
|
||||||
Successful augmentations: 106
|
|
||||||
Output directory: out
|
|
||||||
==================================================
|
|
||||||
```
|
|
||||||
|
|
||||||
## 🔧 Augmentation Techniques Details
|
|
||||||
|
|
||||||
### 1. Rotation
|
|
||||||
- **Purpose**: Simulates cards at different angles
|
|
||||||
- **Angles**: 30°, 60°, 120°, 150°, 180°, 210°, 240°, 300°, 330°
|
|
||||||
- **Method**: OpenCV rotation with white background preservation
|
|
||||||
|
|
||||||
### 2. Random Cropping
|
|
||||||
- **Purpose**: Simulates partially visible ID cards
|
|
||||||
- **Ratio Range**: 0.7 to 1.0 (70% to 100% of original size)
|
|
||||||
- **Method**: Random crop with white background preservation
|
|
||||||
|
|
||||||
### 3. Random Noise
|
|
||||||
- **Purpose**: Simulates worn-out cards
|
|
||||||
- **Mean Range**: 0.0 to 0.7
|
|
||||||
- **Variance Range**: 0.0 to 0.1
|
|
||||||
- **Method**: Gaussian noise addition
|
|
||||||
|
|
||||||
### 4. Horizontal Blockage
|
|
||||||
- **Purpose**: Simulates occluded card details
|
|
||||||
- **Lines**: 1 to 100 horizontal lines
|
|
||||||
- **Coverage**: 0% to 25% of image area
|
|
||||||
- **Colors**: Multiple colors to simulate various objects
|
|
||||||
|
|
||||||
### 5. Grayscale Transformation
|
|
||||||
- **Purpose**: Simulates Xerox/scan copies
|
|
||||||
- **Method**: OpenCV `cv2.cvtColor()` function
|
|
||||||
- **Output**: 3-channel grayscale image
|
|
||||||
|
|
||||||
### 6. Blurring
|
|
||||||
- **Purpose**: Simulates blurred but readable cards
|
|
||||||
- **Kernel Ratio**: 0.0 to 0.0084
|
|
||||||
- **Method**: OpenCV `cv2.filter2D()` with Gaussian kernel
|
|
||||||
|
|
||||||
### 7. Brightness & Contrast
|
|
||||||
- **Purpose**: Simulates different lighting conditions
|
|
||||||
- **Alpha Range**: 0.4 to 3.0 (contrast)
|
|
||||||
- **Beta Range**: 1 to 100 (brightness)
|
|
||||||
- **Method**: OpenCV `cv2.convertScaleAbs()`
|
|
||||||
|
|
||||||
## 🛠️ Development
|
|
||||||
|
|
||||||
### Adding New Augmentation Methods
|
|
||||||
|
|
||||||
1. Add the method to `src/data_augmentation.py`
|
|
||||||
2. Update configuration in `config/config.yaml`
|
|
||||||
3. Update default config in `src/config_manager.py`
|
|
||||||
4. Test with preview mode
|
|
||||||
|
|
||||||
### Code Structure
|
|
||||||
|
|
||||||
- **`main.py`**: Entry point and command-line interface
|
|
||||||
- **`src/data_augmentation.py`**: Core augmentation logic
|
|
||||||
- **`src/config_manager.py`**: Configuration management
|
|
||||||
- **`src/image_processor.py`**: Image processing utilities
|
|
||||||
- **`src/utils.py`**: Utility functions
|
|
||||||
|
|
||||||
## 📝 Logging
|
## 📝 Logging
|
||||||
|
|
||||||
The tool provides comprehensive logging:
|
The system provides comprehensive logging:
|
||||||
|
- **File**: `logs/data_augmentation.log`
|
||||||
|
- **Console**: Real-time progress updates
|
||||||
|
- **Summary**: JSON files with processing statistics
|
||||||
|
|
||||||
- **File logging**: `logs/data_augmentation.log`
|
### **Log Levels**
|
||||||
- **Console logging**: Real-time progress updates
|
- `INFO`: General processing information
|
||||||
- **Log levels**: DEBUG, INFO, WARNING, ERROR
|
- `WARNING`: Non-critical issues (e.g., no cards detected)
|
||||||
|
- `ERROR`: Critical errors
|
||||||
|
|
||||||
|
## 🔧 Troubleshooting
|
||||||
|
|
||||||
|
### **Common Issues**
|
||||||
|
|
||||||
|
1. **No images detected**
|
||||||
|
- Check input directory path
|
||||||
|
- Verify image formats (jpg, png, bmp, tiff)
|
||||||
|
- Ensure images are not corrupted
|
||||||
|
|
||||||
|
2. **YOLO model not found**
|
||||||
|
- Place model file at `data/weights/id_cards_yolov8n.pt`
|
||||||
|
- Or specify custom path with `--model-path`
|
||||||
|
|
||||||
|
3. **Memory issues**
|
||||||
|
- Reduce `num_augmentations`
|
||||||
|
- Use smaller `target_size`
|
||||||
|
- Enable GPU if available
|
||||||
|
|
||||||
|
### **Performance Tips**
|
||||||
|
|
||||||
|
- **GPU Acceleration**: Set `use_gpu: true` in config
|
||||||
|
- **Batch Processing**: Use multiple workers for large datasets
|
||||||
|
- **Memory Management**: Process in smaller batches
|
||||||
|
|
||||||
## 🤝 Contributing
|
## 🤝 Contributing
|
||||||
|
|
||||||
1. Fork the repository
|
1. Fork the repository
|
||||||
2. Create a feature branch
|
2. Create a feature branch
|
||||||
3. Make your changes
|
3. Make your changes
|
||||||
4. Test thoroughly
|
4. Add tests if applicable
|
||||||
5. Submit a pull request
|
5. Submit a pull request
|
||||||
|
|
||||||
## 📄 License
|
## 📄 License
|
||||||
@@ -292,18 +291,10 @@ This project is licensed under the MIT License - see the LICENSE file for detail
|
|||||||
|
|
||||||
## 🙏 Acknowledgments
|
## 🙏 Acknowledgments
|
||||||
|
|
||||||
- OpenCV for image processing capabilities
|
- **YOLOv8**: Ultralytics for the detection framework
|
||||||
- NumPy for numerical operations
|
- **OpenCV**: Computer vision operations
|
||||||
- PyYAML for configuration management
|
- **NumPy**: Numerical computations
|
||||||
|
|
||||||
## 📞 Support
|
|
||||||
|
|
||||||
For issues and questions:
|
|
||||||
1. Check the logs in `logs/data_augmentation.log`
|
|
||||||
2. Review the configuration in `config/config.yaml`
|
|
||||||
3. Test with preview mode first
|
|
||||||
4. Create an issue with detailed information
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
**Note**: This tool is specifically designed for ID card augmentation and may need adjustments for other image types.
|
**For questions and support, please open an issue on GitHub.**
|
@@ -7,6 +7,17 @@ paths:
|
|||||||
output_dir: "out1"
|
output_dir: "out1"
|
||||||
log_file: "logs/data_augmentation.log"
|
log_file: "logs/data_augmentation.log"
|
||||||
|
|
||||||
|
# ID Card Detection configuration
|
||||||
|
id_card_detection:
|
||||||
|
enabled: false # Enable/disable ID card detection and cropping
|
||||||
|
model_path: "data/weights/id_cards_yolov8n.pt" # Path to the YOLO model
|
||||||
|
confidence_threshold: 0.25 # Confidence threshold for detection
|
||||||
|
iou_threshold: 0.45 # IoU threshold for NMS
|
||||||
|
padding: 10 # Extra padding around the bbox
|
||||||
|
crop_mode: "bbox" # Crop mode: bbox, square, aspect_ratio
|
||||||
|
target_size: null # Target size (width, height) or null
|
||||||
|
save_original_crops: true # Whether to save the original cropped images
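As background for `iou_threshold`: during non-maximum suppression, two detections whose intersection-over-union exceeds the threshold are treated as the same object and the lower-scoring one is dropped. A minimal sketch of the IoU computation itself, independent of how the YOLO library implements it; the function name is illustrative.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two cards overlapping by half their width: IoU is 1/3, below 0.45,
# so both detections would survive NMS at the configured threshold.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```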
|
||||||
|
|
||||||
# Data augmentation parameters - ROTATION and RANDOM CROPPING
|
# Data augmentation parameters - ROTATION and RANDOM CROPPING
|
||||||
augmentation:
|
augmentation:
|
||||||
# Geometric transformations
|
# Geometric transformations
|
||||||
@@ -36,11 +47,6 @@ augmentation:
|
|||||||
variance_range: [0.0, 0.1] # Line thickness variance (min, max)
|
variance_range: [0.0, 0.1] # Line thickness variance (min, max)
|
||||||
probability: 1.0 # Always apply blockage
|
probability: 1.0 # Always apply blockage
|
||||||
|
|
||||||
# Grayscale transformation to mimic Xerox/scan copies
|
|
||||||
grayscale:
|
|
||||||
enabled: true
|
|
||||||
probability: 1.0 # Always apply grayscale
|
|
||||||
|
|
||||||
# Blurring to simulate blurred card images that are still readable
|
# Blurring to simulate blurred card images that are still readable
|
||||||
blurring:
|
blurring:
|
||||||
enabled: true
|
enabled: true
|
||||||
@@ -53,6 +59,11 @@ augmentation:
|
|||||||
alpha_range: [0.4, 3.0] # Contrast range (min, max)
|
alpha_range: [0.4, 3.0] # Contrast range (min, max)
|
||||||
beta_range: [1, 100] # Brightness range (min, max)
|
beta_range: [1, 100] # Brightness range (min, max)
|
||||||
probability: 1.0 # Always apply brightness/contrast adjustment
|
probability: 1.0 # Always apply brightness/contrast adjustment
|
||||||
|
|
||||||
|
# Grayscale transformation as final step (applied to all augmented images)
|
||||||
|
grayscale:
|
||||||
|
enabled: true
|
||||||
|
probability: 1.0 # Always apply grayscale as final step
|
||||||
|
|
||||||
# Processing configuration
|
# Processing configuration
|
||||||
processing:
|
processing:
|
||||||
|
BIN
docs/images/yolov8_pipeline.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 580 KiB |
170
main.py
@@ -12,6 +12,7 @@ sys.path.append(str(Path(__file__).parent / "src"))
|
|||||||
from src.config_manager import ConfigManager
|
from src.config_manager import ConfigManager
|
||||||
from src.data_augmentation import DataAugmentation
|
from src.data_augmentation import DataAugmentation
|
||||||
from src.image_processor import ImageProcessor
|
from src.image_processor import ImageProcessor
|
||||||
|
from src.id_card_detector import IDCardDetector
|
||||||
from src.utils import setup_logging, get_image_files, print_progress
|
from src.utils import setup_logging, get_image_files, print_progress
|
||||||
|
|
||||||
def parse_arguments():
|
def parse_arguments():
|
||||||
@@ -83,6 +84,38 @@ def parse_arguments():
|
|||||||
help="Logging level"
|
help="Logging level"
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# ID Card Detection arguments
|
||||||
|
parser.add_argument(
|
||||||
|
"--enable-id-detection",
|
||||||
|
action="store_true",
|
||||||
|
help="Enable ID card detection and cropping before augmentation"
|
||||||
|
)
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
"--model-path",
|
||||||
|
type=str,
|
||||||
|
help="Path to YOLO model for ID card detection (overrides config)"
|
||||||
|
)
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
"--confidence",
|
||||||
|
type=float,
|
||||||
|
help="Confidence threshold for ID card detection (overrides config)"
|
||||||
|
)
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
"--crop-mode",
|
||||||
|
type=str,
|
||||||
|
choices=["bbox", "square", "aspect_ratio"],
|
||||||
|
help="Crop mode for ID cards (overrides config)"
|
||||||
|
)
|
||||||
|
|
||||||
|
parser.add_argument(
|
||||||
|
"--crop-target-size",
|
||||||
|
type=str,
|
||||||
|
help="Target size for cropped ID cards (widthxheight) (overrides config)"
|
||||||
|
)
|
||||||
|
|
||||||
return parser.parse_args()
|
return parser.parse_args()
|
||||||
|
|
||||||
def parse_range(range_str: str) -> tuple:
|
def parse_range(range_str: str) -> tuple:
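Review note: `--crop-target-size` is documented as `widthxheight` and is later passed to `parse_size`, whose body is outside this diff. A minimal sketch of what such a helper could look like, assuming it should accept the documented form and also tolerate a comma separator. The name `parse_size_sketch` is hypothetical and this is not the repository's implementation.

```python
import re
from typing import Tuple


def parse_size_sketch(size_str: str) -> Tuple[int, int]:
    """Hypothetical stand-in for parse_size: '640x640' or '640,640' -> (640, 640)."""
    match = re.fullmatch(r"\s*(\d+)\s*[xX,]\s*(\d+)\s*", size_str)
    if match is None:
        raise ValueError(f"expected WIDTHxHEIGHT, got {size_str!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width == 0 or height == 0:
        raise ValueError("width and height must be positive")
    return width, height


print(parse_size_sketch("640x640"))
```

Rejecting malformed input with a clear message keeps a typo on the command line from surfacing later as an obscure resize error.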
|
||||||
@@ -134,7 +167,8 @@ def show_image_info(input_dir: Path):
|
|||||||
print(f"\nTotal file size: {total_size:.2f} MB")
|
print(f"\nTotal file size: {total_size:.2f} MB")
|
||||||
print(f"Average file size: {total_size/len(image_files):.2f} MB")
|
print(f"Average file size: {total_size/len(image_files):.2f} MB")
|
||||||
|
|
||||||
def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, Any]):
|
def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, Any],
|
||||||
|
id_detection_config: Dict[str, Any] = None):
|
||||||
"""Preview augmentation on first image"""
|
"""Preview augmentation on first image"""
|
||||||
image_files = get_image_files(input_dir)
|
image_files = get_image_files(input_dir)
|
||||||
|
|
||||||
@@ -147,7 +181,40 @@ def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, An
|
|||||||
# Create augmentation instance
|
# Create augmentation instance
|
||||||
augmenter = DataAugmentation(config)
|
augmenter = DataAugmentation(config)
|
||||||
|
|
||||||
# Augment first image
|
# Process with ID detection if enabled
|
||||||
|
if id_detection_config and id_detection_config.get('enabled', False):
|
||||||
|
print("🔍 ID Card Detection enabled - processing with YOLO model...")
|
||||||
|
|
||||||
|
# Initialize ID card detector
|
||||||
|
detector = IDCardDetector(
|
||||||
|
model_path=id_detection_config.get('model_path'),
|
||||||
|
config=config
|
||||||
|
)
|
||||||
|
|
||||||
|
if not detector.model:
|
||||||
|
print("❌ Failed to load YOLO model, proceeding with normal augmentation")
|
||||||
|
else:
|
||||||
|
# Process single image with ID detection
|
||||||
|
result = detector.process_single_image(
|
||||||
|
image_path=image_files[0],
|
||||||
|
output_dir=output_dir,
|
||||||
|
apply_augmentation=True,
|
||||||
|
save_original=id_detection_config.get('save_original_crops', True),
|
||||||
|
confidence=id_detection_config.get('confidence_threshold', 0.25),
|
||||||
|
iou_threshold=id_detection_config.get('iou_threshold', 0.45),
|
||||||
|
crop_mode=id_detection_config.get('crop_mode', 'bbox'),
|
||||||
|
target_size=id_detection_config.get('target_size'),
|
||||||
|
padding=id_detection_config.get('padding', 10)
|
||||||
|
)
|
||||||
|
|
||||||
|
if result and result.get('detections'):
|
||||||
|
print(f"✅ Detected {len(result['detections'])} ID cards")
|
||||||
|
print(f"💾 Saved {len(result['processed_cards'])} processed cards")
|
||||||
|
return
|
||||||
|
else:
|
||||||
|
print("⚠️ No ID cards detected, proceeding with normal augmentation")
|
||||||
|
|
||||||
|
# Normal augmentation (fallback)
|
||||||
augmented_paths = augmenter.augment_image_file(
|
augmented_paths = augmenter.augment_image_file(
|
||||||
image_files[0],
|
image_files[0],
|
||||||
output_dir,
|
output_dir,
|
||||||
@@ -225,9 +292,29 @@ def main():
|
|||||||
show_image_info(input_dir)
|
show_image_info(input_dir)
|
||||||
return
|
return
|
||||||
|
|
||||||
|
# Get ID detection config
|
||||||
|
id_detection_config = config.get('id_card_detection', {})
|
||||||
|
|
||||||
|
# Override ID detection config with command line arguments
|
||||||
|
if args.enable_id_detection:
|
||||||
|
id_detection_config['enabled'] = True
|
||||||
|
|
||||||
|
if args.model_path:
|
||||||
|
id_detection_config['model_path'] = args.model_path
|
||||||
|
|
||||||
|
if args.confidence is not None:
|
||||||
|
id_detection_config['confidence_threshold'] = args.confidence
|
||||||
|
|
||||||
|
if args.crop_mode:
|
||||||
|
id_detection_config['crop_mode'] = args.crop_mode
|
||||||
|
|
||||||
|
if args.crop_target_size:
|
||||||
|
target_size = parse_size(args.crop_target_size)
|
||||||
|
id_detection_config['target_size'] = list(target_size)
|
||||||
|
|
||||||
# Preview augmentation if requested
|
# Preview augmentation if requested
|
||||||
if args.preview:
|
if args.preview:
|
||||||
preview_augmentation(input_dir, output_dir, augmentation_config)
|
preview_augmentation(input_dir, output_dir, augmentation_config, id_detection_config)
|
||||||
return
|
return
|
||||||
|
|
||||||
# Get image files
|
# Get image files
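Review note: the override block in this hunk gives command-line flags precedence over `config.yaml`. A minimal sketch of the same precedence rule as a standalone function, using a subset of the flags added in this commit. The helper names `build_parser` and `apply_overrides` are hypothetical. The sketch tests optional values with `is not None`, because a plain truthiness check on a float would silently ignore an explicit `--confidence 0.0`.

```python
import argparse
from typing import Any, Dict


def build_parser() -> argparse.ArgumentParser:
    """Subset of the detection flags from this commit (hypothetical wrapper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--enable-id-detection", action="store_true")
    parser.add_argument("--model-path", type=str)
    parser.add_argument("--confidence", type=float)
    parser.add_argument("--crop-mode", choices=["bbox", "square", "aspect_ratio"])
    return parser


def apply_overrides(file_config: Dict[str, Any], args: argparse.Namespace) -> Dict[str, Any]:
    """Return a copy of the YAML section with CLI values layered on top."""
    merged = dict(file_config)  # leave the loaded config untouched
    if args.enable_id_detection:
        merged["enabled"] = True
    for arg_name, key in (("model_path", "model_path"),
                          ("confidence", "confidence_threshold"),
                          ("crop_mode", "crop_mode")):
        value = getattr(args, arg_name)
        if value is not None:  # flag was actually given on the command line
            merged[key] = value
    return merged


defaults = {"enabled": False, "confidence_threshold": 0.25, "crop_mode": "bbox"}
cli = build_parser().parse_args(["--enable-id-detection", "--confidence", "0.3"])
print(apply_overrides(defaults, cli))
```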
|
||||||
@@ -242,35 +329,56 @@ def main():
|
|||||||
logger.info(f"Number of augmentations per image: {processing_config.get('num_augmentations', 3)}")
|
logger.info(f"Number of augmentations per image: {processing_config.get('num_augmentations', 3)}")
|
||||||
logger.info(f"Target size: {processing_config.get('target_size', [224, 224])}")
|
logger.info(f"Target size: {processing_config.get('target_size', [224, 224])}")
|
||||||
|
|
||||||
# Create augmentation instance with new config
|
# Process with ID detection if enabled
|
||||||
augmenter = DataAugmentation(augmentation_config)
|
if id_detection_config.get('enabled', False):
|
||||||
|
logger.info("ID Card Detection enabled - processing with YOLO model...")
|
||||||
|
|
||||||
|
# Initialize ID card detector
|
||||||
|
detector = IDCardDetector(
|
||||||
|
model_path=id_detection_config.get('model_path'),
|
||||||
|
config=config
|
||||||
|
)
|
||||||
|
|
||||||
|
if not detector.model:
|
||||||
|
logger.error("Failed to load YOLO model")
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
logger.info(f"YOLO model loaded: {detector.model_path}")
|
||||||
|
logger.info(f"Confidence threshold: {id_detection_config.get('confidence_threshold', 0.25)}")
|
||||||
|
logger.info(f"Crop mode: {id_detection_config.get('crop_mode', 'bbox')}")
|
||||||
|
|
||||||
|
# Step 1: Detect and crop ID cards into the processed directory
|
||||||
|
processed_dir = output_dir / "processed"
|
||||||
|
processed_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
logger.info("Step 1: Detect and crop ID cards...")
|
||||||
|
detector.batch_process(
|
||||||
|
input_dir=input_dir,
|
||||||
|
output_dir=processed_dir,
|
||||||
|
confidence=id_detection_config.get('confidence_threshold', 0.25),
|
||||||
|
iou_threshold=id_detection_config.get('iou_threshold', 0.45),
|
||||||
|
crop_mode=id_detection_config.get('crop_mode', 'bbox'),
|
||||||
|
target_size=id_detection_config.get('target_size'),
|
||||||
|
padding=id_detection_config.get('padding', 10)
|
||||||
|
)
|
||||||
|
# Step 2: Augment the cropped cards
|
||||||
|
logger.info("Step 2: Augment cropped ID cards...")
|
||||||
|
augmenter = DataAugmentation(augmentation_config)
|
||||||
|
augmenter.batch_augment(
|
||||||
|
processed_dir,
|
||||||
|
output_dir,
|
||||||
|
num_augmentations=processing_config.get("num_augmentations", 3)
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
# Augment the original images directly
|
||||||
|
logger.info("Starting normal batch augmentation (direct augmentation)...")
|
||||||
|
augmenter = DataAugmentation(augmentation_config)
|
||||||
|
augmenter.batch_augment(
|
||||||
|
input_dir,
|
||||||
|
output_dir,
|
||||||
|
num_augmentations=processing_config.get("num_augmentations", 3)
|
||||||
|
)
|
||||||
|
|
||||||
# Update target size
|
logger.info("Data processing completed successfully")
|
||||||
target_size = tuple(processing_config.get("target_size", [224, 224]))
|
|
||||||
augmenter.image_processor.target_size = target_size
|
|
||||||
|
|
||||||
# Perform batch augmentation
|
|
||||||
logger.info("Starting batch augmentation...")
|
|
||||||
results = augmenter.batch_augment(
|
|
||||||
input_dir,
|
|
||||||
output_dir,
|
|
||||||
num_augmentations=processing_config.get("num_augmentations", 3)
|
|
||||||
)
|
|
||||||
|
|
||||||
# Get and display summary
|
|
||||||
summary = augmenter.get_augmentation_summary(results)
|
|
||||||
|
|
||||||
print("\n" + "="*50)
|
|
||||||
print("AUGMENTATION SUMMARY")
|
|
||||||
print("="*50)
|
|
||||||
print(f"Original images: {summary['total_original_images']}")
|
|
||||||
print(f"Augmented images: {summary['total_augmented_images']}")
|
|
||||||
print(f"Augmentation ratio: {summary['augmentation_ratio']:.2f}")
|
|
||||||
print(f"Successful augmentations: {summary['successful_augmentations']}")
|
|
||||||
print(f"Output directory: {output_dir}")
|
|
||||||
print("="*50)
|
|
||||||
|
|
||||||
logger.info("Data augmentation completed successfully")
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
main()
|
main()
|
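The detection branch above reads each setting through `dict.get` with an inline default, once for logging and once for the call to `batch_process`. A minimal sketch of how those defaults could be gathered in one place so the two cannot drift apart. The helper name `resolve_detection_settings` and the `DETECTION_DEFAULTS` table are hypothetical; the keys and default values are the ones visible in this hunk.

```python
from typing import Any, Dict

# Defaults copied from the .get(...) calls in the hunk above.
DETECTION_DEFAULTS: Dict[str, Any] = {
    "confidence_threshold": 0.25,
    "iou_threshold": 0.45,
    "crop_mode": "bbox",
    "target_size": None,
    "padding": 10,
}


def resolve_detection_settings(id_detection_config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: overlay user-supplied keys on the defaults."""
    settings = dict(DETECTION_DEFAULTS)
    # Only known keys are taken over; unrelated keys are ignored.
    for key in DETECTION_DEFAULTS:
        if key in id_detection_config:
            settings[key] = id_detection_config[key]
    return settings


settings = resolve_detection_settings({"crop_mode": "square", "padding": 0})
print(settings)
```

With a helper like this, the same `settings` dictionary could feed both the log lines and the detector call.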
@@ -1,133 +0,0 @@
|
|||||||
#!/usr/bin/env python3
|
|
||||||
"""
|
|
||||||
Simple ID Card Cropper using Roboflow API
|
|
||||||
Input: folder containing images
|
|
||||||
Output: folder with cropped ID cards
|
|
||||||
"""
|
|
||||||
import sys
|
|
||||||
import yaml
|
|
||||||
from pathlib import Path
|
|
||||||
import logging
|
|
||||||
import argparse
|
|
||||||
|
|
||||||
# Add src to path
|
|
||||||
sys.path.append(str(Path(__file__).parent / "src"))
|
|
||||||
|
|
||||||
from model.roboflow_id_detector import RoboflowIDDetector
|
|
||||||
|
|
||||||
def setup_logging():
|
|
||||||
"""Setup basic logging"""
|
|
||||||
logging.basicConfig(
|
|
||||||
level=logging.INFO,
|
|
||||||
format='%(asctime)s - %(levelname)s - %(message)s'
|
|
||||||
)
|
|
||||||
|
|
||||||
def crop_id_cards(input_folder: str, output_folder: str, api_key: str = "Pkz4puRA0Cy3xMOuNoNr"):
|
|
||||||
"""
|
|
||||||
Crop ID cards from all images in input folder
|
|
||||||
|
|
||||||
Args:
|
|
||||||
input_folder: Path to input folder containing images
|
|
||||||
output_folder: Path to output folder for cropped ID cards
|
|
||||||
api_key: Roboflow API key
|
|
||||||
"""
|
|
||||||
logger = logging.getLogger(__name__)
|
|
||||||
|
|
||||||
# Convert to Path objects
|
|
||||||
input_path = Path(input_folder)
|
|
||||||
output_path = Path(output_folder)
|
|
||||||
|
|
||||||
# Check if input folder exists
|
|
||||||
if not input_path.exists():
|
|
||||||
logger.error(f"Input folder not found: {input_folder}")
|
|
||||||
return False
|
|
||||||
|
|
||||||
# Create output folder
|
|
||||||
output_path.mkdir(parents=True, exist_ok=True)
|
|
||||||
|
|
||||||
# Initialize detector
|
|
||||||
detector = RoboflowIDDetector(
|
|
||||||
api_key=api_key,
|
|
||||||
model_id="french-card-id-detect",
|
|
||||||
version=3,
|
|
||||||
confidence=0.5
|
|
||||||
)
|
|
||||||
|
|
||||||
# Get all image files
|
|
||||||
image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}
|
|
||||||
image_files = []
|
|
||||||
|
|
||||||
for file_path in input_path.rglob('*'):
|
|
||||||
if file_path.is_file() and file_path.suffix.lower() in image_extensions:
|
|
||||||
image_files.append(file_path)
|
|
||||||
|
|
||||||
if not image_files:
|
|
||||||
logger.error(f"No images found in {input_folder}")
|
|
||||||
return False
|
|
||||||
|
|
||||||
logger.info(f"Found {len(image_files)} images to process")
|
|
||||||
|
|
||||||
# Process each image
|
|
||||||
total_cropped = 0
|
|
||||||
|
|
||||||
for i, image_path in enumerate(image_files, 1):
|
|
||||||
logger.info(f"Processing {i}/{len(image_files)}: {image_path.name}")
|
|
||||||
|
|
||||||
# Detect ID cards
|
|
||||||
detections = detector.detect_id_cards(image_path)
|
|
||||||
|
|
||||||
if not detections:
|
|
||||||
logger.warning(f"No ID cards detected in {image_path.name}")
|
|
||||||
continue
|
|
||||||
|
|
||||||
# Crop each detected ID card
|
|
||||||
for j, detection in enumerate(detections):
|
|
||||||
bbox = detection['bbox']
|
|
||||||
|
|
||||||
# Create output filename
|
|
||||||
stem = image_path.stem
|
|
||||||
suffix = f"_card_{j+1}.jpg"
|
|
||||||
output_file = output_path / f"{stem}{suffix}"
|
|
||||||
|
|
||||||
# Crop ID card
|
|
||||||
cropped = detector.crop_id_card(image_path, bbox, output_file)
|
|
||||||
|
|
||||||
if cropped is not None:
|
|
||||||
total_cropped += 1
|
|
||||||
logger.info(f" ✓ Cropped card {j+1} to {output_file.name}")
|
|
||||||
|
|
||||||
# Add delay between requests
|
|
||||||
if i < len(image_files):
|
|
||||||
import time
|
|
||||||
time.sleep(1.0)
|
|
||||||
|
|
||||||
logger.info(f"Processing completed! Total ID cards cropped: {total_cropped}")
|
|
||||||
return True
|
|
||||||
|
|
||||||
def main():
|
|
||||||
"""Main function"""
|
|
||||||
parser = argparse.ArgumentParser(description='Crop ID cards from images using Roboflow API')
|
|
||||||
parser.add_argument('input_folder', help='Input folder containing images')
|
|
||||||
parser.add_argument('output_folder', help='Output folder for cropped ID cards')
|
|
||||||
parser.add_argument('--api-key', default="Pkz4puRA0Cy3xMOuNoNr",
|
|
||||||
help='Roboflow API key (default: demo key)')
|
|
||||||
|
|
||||||
args = parser.parse_args()
|
|
||||||
|
|
||||||
# Setup logging
|
|
||||||
setup_logging()
|
|
||||||
|
|
||||||
# Process images
|
|
||||||
success = crop_id_cards(args.input_folder, args.output_folder, args.api_key)
|
|
||||||
|
|
||||||
if success:
|
|
||||||
print(f"\n✓ Successfully processed images from '{args.input_folder}'")
|
|
||||||
print(f"✓ Cropped ID cards saved to '{args.output_folder}'")
|
|
||||||
else:
|
|
||||||
print(f"\n✗ Failed to process images")
|
|
||||||
return 1
|
|
||||||
|
|
||||||
return 0
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
exit(main())
|
|
@@ -363,8 +363,6 @@ class DataAugmentation:
|
|||||||
|
|
||||||
return result
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
def augment_single_image(self, image: np.ndarray, num_augmentations: int = None) -> List[np.ndarray]:
|
def augment_single_image(self, image: np.ndarray, num_augmentations: int = None) -> List[np.ndarray]:
|
||||||
"""
|
"""
|
||||||
Apply each augmentation method separately to create independent augmented versions
|
Apply each augmentation method separately to create independent augmented versions
|
||||||
@@ -455,20 +453,7 @@ class DataAugmentation:
|
|||||||
|
|
||||||
augmented_images.append(augmented)
|
augmented_images.append(augmented)
|
||||||
|
|
||||||
# 5. Grayscale only
|
# 5. Blurring only
|
||||||
if grayscale_config.get("enabled", False):
|
|
||||||
for i in range(num_augmentations):
|
|
||||||
augmented = image.copy()
|
|
||||||
augmented = self.convert_to_grayscale_preserve_quality(augmented)
|
|
||||||
|
|
||||||
# Resize preserving aspect ratio
|
|
||||||
target_size = self.image_processor.target_size
|
|
||||||
if target_size:
|
|
||||||
augmented = self.resize_preserve_aspect(augmented, target_size)
|
|
||||||
|
|
||||||
augmented_images.append(augmented)
|
|
||||||
|
|
||||||
# 6. Blurring only
|
|
||||||
if blurring_config.get("enabled", False):
|
if blurring_config.get("enabled", False):
|
||||||
for i in range(num_augmentations):
|
for i in range(num_augmentations):
|
||||||
augmented = image.copy()
|
augmented = image.copy()
|
||||||
@@ -481,7 +466,7 @@ class DataAugmentation:
|
|||||||
|
|
||||||
augmented_images.append(augmented)
|
augmented_images.append(augmented)
|
||||||
|
|
||||||
# 7. Brightness and contrast only
|
# 6. Brightness/Contrast only
|
||||||
if brightness_contrast_config.get("enabled", False):
|
if brightness_contrast_config.get("enabled", False):
|
||||||
for i in range(num_augmentations):
|
for i in range(num_augmentations):
|
||||||
augmented = image.copy()
|
augmented = image.copy()
|
||||||
@@ -494,6 +479,11 @@ class DataAugmentation:
|
|||||||
|
|
||||||
augmented_images.append(augmented)
|
augmented_images.append(augmented)
|
||||||
|
|
||||||
|
# 7. Apply grayscale as final step to ALL augmented images
|
||||||
|
if grayscale_config.get("enabled", False):
|
||||||
|
for i in range(len(augmented_images)):
|
||||||
|
augmented_images[i] = self.convert_to_grayscale_preserve_quality(augmented_images[i])
|
||||||
|
|
||||||
return augmented_images
|
return augmented_images
|
||||||
|
|
||||||
def augment_image_file(self, image_path: Path, output_dir: Path, num_augmentations: int = None) -> List[Path]:
|
def augment_image_file(self, image_path: Path, output_dir: Path, num_augmentations: int = None) -> List[Path]:
|
||||||
@@ -518,7 +508,7 @@ class DataAugmentation:
|
|||||||
|
|
||||||
# Save augmented images with method names
|
# Save augmented images with method names
|
||||||
saved_paths = []
|
saved_paths = []
|
||||||
method_names = ["rotation", "cropping", "noise", "blockage", "grayscale", "blurring", "brightness_contrast"]
|
method_names = ["rotation", "cropping", "noise", "blockage", "blurring", "brightness_contrast", "grayscale"]
|
||||||
method_index = 0
|
method_index = 0
|
||||||
|
|
||||||
for i, aug_image in enumerate(augmented_images):
|
for i, aug_image in enumerate(augmented_images):
|
||||||
|
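The hunks above move grayscale from being one augmentation among seven to a post-processing pass over every augmented image. A rough sketch of the effect on output counts, using strings in place of images. `augment_all` and the `METHODS` list are illustrative stand-ins, not the repository's code.

```python
from typing import List

# Stand-ins for the six per-image methods that remain after this change.
METHODS = ["rotation", "cropping", "noise", "blockage", "blurring", "brightness_contrast"]


def augment_all(image: str, num_augmentations: int, grayscale: bool) -> List[str]:
    """Illustrative only: label each output instead of transforming pixels."""
    outputs = []
    for method in METHODS:
        for i in range(num_augmentations):
            outputs.append(f"{image}:{method}:{i + 1}")
    # Final step, as in the new code: grayscale is applied to every output
    # rather than producing images of its own.
    if grayscale:
        outputs = [f"{out}:gray" for out in outputs]
    return outputs


results = augment_all("card", num_augmentations=3, grayscale=True)
print(len(results))  # 6 methods x 3 augmentations = 18
```

One point that may be worth checking: `method_names` in `augment_image_file` still lists "grayscale" as a seventh entry, although grayscale no longer contributes images of its own. Whether that entry is ever reached depends on naming logic not shown in this diff.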
611
src/id_card_detector.py
Normal file
@@ -0,0 +1,611 @@
|
|||||||
|
"""
|
||||||
|
ID Card Detector Module
|
||||||
|
Uses YOLO to detect and crop ID cards from large images, combined with data augmentation
|
||||||
|
Integrates with the YOLOv8 French ID Card Detection model
|
||||||
|
"""
|
||||||
|
import cv2
|
||||||
|
import numpy as np
|
||||||
|
from pathlib import Path
|
||||||
|
from typing import List, Tuple, Optional, Dict, Any, Union
|
||||||
|
import torch
|
||||||
|
import torch.nn as nn
|
||||||
|
from ultralytics import YOLO
|
||||||
|
import logging
|
||||||
|
from data_augmentation import DataAugmentation
|
||||||
|
from utils import load_image, save_image, create_augmented_filename, print_progress
|
||||||
|
import os
|
||||||
|
import json
|
||||||
|
import yaml
|
||||||
|
|
||||||
|
class IDCardDetector:
|
||||||
|
"""Detects and crops ID cards from large images using YOLO."""
|
||||||
|
|
||||||
|
def __init__(self, model_path: str = None, config: Dict[str, Any] = None):
|
||||||
|
"""
|
||||||
|
Initialize ID Card Detector
|
||||||
|
|
||||||
|
Args:
|
||||||
|
model_path: Path to the trained YOLO model
|
||||||
|
config: Configuration dictionary
|
||||||
|
"""
|
||||||
|
self.config = config or {}
|
||||||
|
self.model_path = model_path
|
||||||
|
self.model = None
|
||||||
|
self.data_augmentation = DataAugmentation(config)
|
||||||
|
self.logger = self._setup_logger()
|
||||||
|
|
||||||
|
# Fall back to the default model path if none is provided
|
||||||
|
if not model_path:
|
||||||
|
default_model_path = "data/weights/id_cards_yolov8n.pt"
|
||||||
|
if os.path.exists(default_model_path):
|
||||||
|
model_path = default_model_path
|
||||||
|
self.model_path = model_path
|
||||||
|
|
||||||
|
# Load the YOLO model if the file exists
|
||||||
|
if model_path and os.path.exists(model_path):
|
||||||
|
self.load_model(model_path)
|
||||||
|
|
||||||
|
def _setup_logger(self) -> logging.Logger:
|
||||||
|
"""Set up the module logger."""
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
logger.setLevel(logging.INFO)
|
||||||
|
|
||||||
|
if not logger.handlers:
|
||||||
|
handler = logging.StreamHandler()
|
||||||
|
formatter = logging.Formatter(
|
||||||
|
'%(asctime)s - %(name)s - %(levelname)s - %(message)s'
|
||||||
|
)
|
||||||
|
handler.setFormatter(formatter)
|
||||||
|
logger.addHandler(handler)
|
||||||
|
|
||||||
|
return logger
|
||||||
|
|
||||||
|
def load_model(self, model_path: str) -> bool:
|
||||||
|
"""
|
||||||
|
Load the YOLO model from a file
|
||||||
|
|
||||||
|
Args:
|
||||||
|
model_path: Path to the model file
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if the model loaded successfully, False otherwise
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
self.model = YOLO(model_path)
|
||||||
|
self.logger.info(f"Loaded YOLO model from: {model_path}")
|
||||||
|
return True
|
||||||
|
except Exception as e:
|
||||||
|
self.logger.error(f"Failed to load model: {e}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
def detect_id_cards(self, image: np.ndarray, confidence: float = 0.5, iou_threshold: float = 0.45) -> List[Dict[str, Any]]:
|
||||||
|
"""
|
||||||
|
Detect ID cards in an image using YOLO
|
||||||
|
|
||||||
|
Args:
|
||||||
|
image: Input image
|
||||||
|
confidence: Confidence threshold
|
||||||
|
iou_threshold: IoU threshold for NMS
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of detection results, each in the format:
|
||||||
|
{
|
||||||
|
'bbox': [x1, y1, x2, y2],
|
||||||
|
'confidence': float,
|
||||||
|
'class_id': int,
|
||||||
|
'class_name': str
|
||||||
|
}
|
||||||
|
"""
|
||||||
|
if self.model is None:
|
||||||
|
self.logger.error("Model has not been loaded!")
|
||||||
|
return []
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Run inference
|
||||||
|
results = self.model(image, conf=confidence, iou=float(iou_threshold), verbose=False)
|
||||||
|
|
||||||
|
detections = []
|
||||||
|
for result in results:
|
||||||
|
boxes = result.boxes
|
||||||
|
if boxes is not None:
|
||||||
|
for box in boxes:
|
||||||
|
# Get bbox coordinates
|
||||||
|
x1, y1, x2, y2 = box.xyxy[0].cpu().numpy()
|
||||||
|
|
||||||
|
# Get confidence and class
|
||||||
|
confidence_score = float(box.conf[0].cpu().numpy())
|
||||||
|
class_id = int(box.cls[0].cpu().numpy())
|
||||||
|
class_name = self.model.names[class_id] if hasattr(self.model, 'names') else f"class_{class_id}"
|
||||||
|
|
||||||
|
detection = {
|
||||||
|
'bbox': [int(x1), int(y1), int(x2), int(y2)],
|
||||||
|
'confidence': confidence_score,
|
||||||
|
'class_id': class_id,
|
||||||
|
'class_name': class_name
|
||||||
|
}
|
||||||
|
detections.append(detection)
|
||||||
|
|
||||||
|
self.logger.info(f"Detected {len(detections)} ID cards")
|
||||||
|
return detections
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
self.logger.error(f"Error during detection: {e}")
|
||||||
|
return []
|
||||||
|
|
||||||
|
def crop_id_card(self, image: np.ndarray, bbox: List[int], padding: int = 10,
|
||||||
|
crop_mode: str = "bbox", target_size: Tuple[int, int] = None) -> np.ndarray:
|
||||||
|
"""
|
||||||
|
Crop an ID card from the original image based on its bbox, with several options
|
||||||
|
|
||||||
|
Args:
|
||||||
|
image: Input image
|
||||||
|
bbox: Bounding box [x1, y1, x2, y2]
|
||||||
|
padding: Extra padding around the bbox
|
||||||
|
crop_mode: Crop mode ("bbox", "square", "aspect_ratio")
|
||||||
|
target_size: Target size (width, height) if the crop should be resized
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Cropped ID card image
|
||||||
|
"""
|
||||||
|
x1, y1, x2, y2 = bbox
|
||||||
|
|
||||||
|
# Add padding, clamped to the image bounds
|
||||||
|
height, width = image.shape[:2]
|
||||||
|
x1 = max(0, x1 - padding)
|
||||||
|
y1 = max(0, y1 - padding)
|
||||||
|
x2 = min(width, x2 + padding)
|
||||||
|
y2 = min(height, y2 + padding)
|
||||||
|
|
||||||
|
# Crop according to the selected mode
|
||||||
|
if crop_mode == "bbox":
|
||||||
|
# Crop to the (padded) bbox
|
||||||
|
cropped = image[y1:y2, x1:x2]
|
||||||
|
elif crop_mode == "square":
|
||||||
|
# Crop to a square
|
||||||
|
center_x = (x1 + x2) // 2
|
||||||
|
center_y = (y1 + y2) // 2
|
||||||
|
size = max(x2 - x1, y2 - y1)
|
||||||
|
half_size = size // 2
|
||||||
|
|
||||||
|
x1 = max(0, center_x - half_size)
|
||||||
|
y1 = max(0, center_y - half_size)
|
||||||
|
x2 = min(width, center_x + half_size)
|
||||||
|
y2 = min(height, center_y + half_size)
|
||||||
|
|
||||||
|
cropped = image[y1:y2, x1:x2]
|
||||||
|
elif crop_mode == "aspect_ratio":
|
||||||
|
# Crop to a standard aspect ratio (3:4 for ID cards)
|
||||||
|
bbox_width = x2 - x1
|
||||||
|
bbox_height = y2 - y1
|
||||||
|
center_x = (x1 + x2) // 2
|
||||||
|
center_y = (y1 + y2) // 2
|
||||||
|
|
||||||
|
# 3:4 (width:height) ratio for ID cards
|
||||||
|
target_ratio = 3 / 4
|
||||||
|
current_ratio = bbox_width / bbox_height
|
||||||
|
|
||||||
|
if current_ratio > target_ratio:
|
||||||
|
# Bbox is too wide; keep the height
|
||||||
|
new_width = int(bbox_height * target_ratio)
|
||||||
|
half_width = new_width // 2
|
||||||
|
x1 = max(0, center_x - half_width)
|
||||||
|
x2 = min(width, center_x + half_width)
|
||||||
|
else:
|
||||||
|
# Bbox is too tall; keep the width
|
||||||
|
new_height = int(bbox_width / target_ratio)
|
||||||
|
half_height = new_height // 2
|
||||||
|
y1 = max(0, center_y - half_height)
|
||||||
|
y2 = min(height, center_y + half_height)
|
||||||
|
|
||||||
|
cropped = image[y1:y2, x1:x2]
|
||||||
|
else:
|
||||||
|
# Default: crop to the bbox
|
||||||
|
cropped = image[y1:y2, x1:x2]
|
||||||
|
|
||||||
|
# Resize if target_size is set
|
||||||
|
if target_size:
|
||||||
|
cropped = cv2.resize(cropped, target_size, interpolation=cv2.INTER_AREA)
|
||||||
|
|
||||||
|
return cropped
|
||||||
|
|
||||||
|
def process_single_image(self, image_path: Union[str, Path], output_dir: Path,
|
||||||
|
confidence: float = 0.5, iou_threshold: float = 0.45,
|
||||||
|
crop_mode: str = "bbox", target_size: Tuple[int, int] = None,
|
||||||
|
padding: int = 10, card_counter: int = 0) -> Dict[str, Any]:
|
||||||
|
"""
|
||||||
|
Process a single image: detect ID cards, crop them, and save the crops
|
||||||
|
|
||||||
|
Args:
|
||||||
|
image_path: Path to the input image
|
||||||
|
output_dir: Output directory
|
||||||
|
confidence: Confidence threshold
|
||||||
|
iou_threshold: IoU threshold
|
||||||
|
crop_mode: Crop mode ("bbox", "square", "aspect_ratio")
|
||||||
|
target_size: Target size (width, height), or None
|
||||||
|
padding: Extra padding around the bbox
|
||||||
|
card_counter: Number of cards already saved; used to number the output files
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dictionary containing the processing results
|
||||||
|
"""
|
||||||
|
image_path = Path(image_path)
|
||||||
|
if not image_path.exists():
|
||||||
|
self.logger.error(f"Image not found: {image_path}")
|
||||||
|
return {}
|
||||||
|
|
||||||
|
# Load the image
|
||||||
|
image = load_image(str(image_path))
|
||||||
|
if image is None:
|
||||||
|
self.logger.error(f"Failed to load image: {image_path}")
|
||||||
|
return {}
|
||||||
|
|
||||||
|
# Detect ID cards
|
||||||
|
detections = self.detect_id_cards(image, confidence, float(iou_threshold))
|
||||||
|
|
||||||
|
if not detections:
|
||||||
|
self.logger.warning(f"No ID cards detected in: {image_path}")
|
||||||
|
return {
|
||||||
|
'image_path': str(image_path),
|
||||||
|
'detections': [],
|
||||||
|
'processed_cards': []
|
||||||
|
}
|
||||||
|
|
||||||
|
# Create the output directory
|
||||||
|
output_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
processed_cards = []
|
||||||
|
current_card_counter = card_counter
|
||||||
|
|
||||||
|
for i, detection in enumerate(detections):
|
||||||
|
# Crop the ID card using the requested options
|
||||||
|
cropped_card = self.crop_id_card(
|
||||||
|
image,
|
||||||
|
detection['bbox'],
|
||||||
|
padding=padding,
|
||||||
|
crop_mode=crop_mode,
|
||||||
|
target_size=target_size
|
||||||
|
)
|
||||||
|
|
||||||
|
# Build a unique filename for each ID card
|
||||||
|
current_card_counter += 1
|
||||||
|
card_filename = f"id_card_{current_card_counter:03d}.jpg"
|
||||||
|
card_path = output_dir / card_filename
|
||||||
|
|
||||||
|
# Save the cropped card (unaugmented)
|
||||||
|
save_image(cropped_card, card_path)
|
||||||
|
processed_cards.append({
|
||||||
|
'original_path': str(card_path),
|
||||||
|
'detection_info': detection,
|
||||||
|
'crop_info': {
|
||||||
|
'mode': crop_mode,
|
||||||
|
'target_size': target_size,
|
||||||
|
'padding': padding
|
||||||
|
}
|
||||||
|
})
|
||||||
|
|
||||||
|
result = {
|
||||||
|
'image_path': str(image_path),
|
||||||
|
'detections': detections,
|
||||||
|
'processed_cards': processed_cards,
|
||||||
|
'total_cards': len(processed_cards),
|
||||||
|
'crop_settings': {
|
||||||
|
'mode': crop_mode,
|
||||||
|
'target_size': target_size,
|
||||||
|
'padding': padding
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
self.logger.info(f"Processed {len(processed_cards)} cards from {image_path.name}")
|
||||||
|
return result
|
||||||
|
|
||||||
|
def batch_process(self, input_dir: Union[str, Path], output_dir: Union[str, Path],
|
||||||
|
confidence: float = 0.5, iou_threshold: float = 0.45,
|
||||||
|
crop_mode: str = "bbox", target_size: Tuple[int, int] = None,
|
||||||
|
padding: int = 10) -> Dict[str, Any]:
|
||||||
|
"""
|
||||||
|
Process a batch of images
|
||||||
|
|
||||||
|
Args:
|
||||||
|
input_dir: Directory containing the input images
|
||||||
|
output_dir: Output directory
|
||||||
|
confidence: Confidence threshold
|
||||||
|
iou_threshold: IoU threshold
|
||||||
|
crop_mode: Crop mode ("bbox", "square", "aspect_ratio")
|
||||||
|
target_size: Target size (width, height), or None
|
||||||
|
padding: Extra padding around the bbox
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dictionary containing the batch processing results
|
||||||
|
"""
|
||||||
|
input_dir = Path(input_dir)
|
||||||
|
output_dir = Path(output_dir)
|
||||||
|
|
||||||
|
if not input_dir.exists():
|
||||||
|
self.logger.error(f"Input directory not found: {input_dir}")
|
||||||
|
return {}
|
||||||
|
|
||||||
|
# Create the output directory
|
||||||
|
output_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
# Find all images
|
||||||
|
supported_formats = self.config.get('supported_formats', ['.jpg', '.jpeg', '.png', '.bmp', '.tiff'])
|
||||||
|
image_files = []
|
||||||
|
for fmt in supported_formats:
|
||||||
|
image_files.extend(input_dir.glob(f"*{fmt}"))
|
||||||
|
image_files.extend(input_dir.glob(f"*{fmt.upper()}"))
|
||||||
|
|
||||||
|
if not image_files:
|
||||||
|
self.logger.warning(f"No supported images found in: {input_dir}")
|
||||||
|
return {}
|
||||||
|
|
||||||
|
self.logger.info(f"Found {len(image_files)} images to process")
|
||||||
|
|
||||||
|
results = {}
|
||||||
|
total_cards = 0
|
||||||
|
global_card_counter = 0  # Counter used to build unique filenames
|
||||||
|
|
||||||
|
for i, image_path in enumerate(image_files):
|
||||||
|
self.logger.info(f"Processing {i+1}/{len(image_files)}: {image_path.name}")
|
||||||
|
|
||||||
|
# Process the image: detect and crop only, no augmentation
|
||||||
|
result = self.process_single_image(
|
||||||
|
image_path,
|
||||||
|
output_dir,
|
||||||
|
confidence,
|
||||||
|
iou_threshold,
|
||||||
|
crop_mode,
|
||||||
|
target_size,
|
||||||
|
padding,
|
||||||
|
global_card_counter
|
||||||
|
)
|
||||||
|
|
||||||
|
# Update the counter
|
||||||
|
global_card_counter += len(result.get('detections', []))
|
||||||
|
|
||||||
|
results[image_path.name] = result
|
||||||
|
total_cards += len(result.get('detections', []))  # Number of ID cards actually detected
|
||||||
|
|
||||||
|
# Print progress
|
||||||
|
print_progress(i + 1, len(image_files), f"Processed {image_path.name}")
|
||||||
|
|
||||||
|
# Build the summary
|
||||||
|
summary = {
|
||||||
|
'total_images': len(image_files),
|
||||||
|
'total_cards_detected': total_cards,
|
||||||
|
'images_with_cards': len([r for r in results.values() if r.get('detections')]),
|
||||||
|
'images_without_cards': len([r for r in results.values() if not r.get('detections')]),
|
||||||
|
'output_directory': str(output_dir),
|
||||||
|
'crop_settings': {
|
||||||
|
'mode': crop_mode,
|
||||||
|
'target_size': target_size,
|
||||||
|
'padding': padding
|
||||||
|
},
|
||||||
|
'results': results
|
||||||
|
}
|
||||||
|
|
||||||
|
# Save the summary
|
||||||
|
summary_path = output_dir / "processing_summary.json"
|
||||||
|
with open(summary_path, 'w', encoding='utf-8') as f:
|
||||||
|
json.dump(summary, f, indent=2, ensure_ascii=False)
|
||||||
|
|
||||||
|
self.logger.info(f"Batch processing completed. Summary saved to: {summary_path}")
|
||||||
|
return summary
|
||||||
|
|
||||||
|
def get_detection_statistics(self, results: Dict[str, Any]) -> Dict[str, Any]:
|
||||||
|
"""
|
||||||
|
Compute statistics from the detection results
|
||||||
|
|
||||||
|
Args:
|
||||||
|
results: Summary returned by batch_process
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dictionary containing the statistics
|
||||||
|
"""
|
||||||
|
if not results:
|
||||||
|
return {}
|
||||||
|
|
||||||
|
total_images = results.get('total_images', 0)
|
||||||
|
total_cards = results.get('total_cards_detected', 0)
|
||||||
|
images_with_cards = results.get('images_with_cards', 0)
|
||||||
|
|
||||||
|
# Compute confidence statistics
|
||||||
|
all_confidences = []
|
||||||
|
for image_result in results.get('results', {}).values():
|
||||||
|
for detection in image_result.get('detections', []):
|
||||||
|
all_confidences.append(detection.get('confidence', 0))
|
||||||
|
|
||||||
|
stats = {
|
||||||
|
'total_images_processed': total_images,
|
||||||
|
'total_cards_detected': total_cards,
|
||||||
|
'images_with_cards': images_with_cards,
|
||||||
|
'images_without_cards': total_images - images_with_cards,
|
||||||
|
'average_cards_per_image': total_cards / total_images if total_images > 0 else 0,
|
||||||
|
'detection_rate': images_with_cards / total_images if total_images > 0 else 0,
|
||||||
|
'confidence_statistics': {
|
||||||
|
'min': min(all_confidences) if all_confidences else 0,
|
||||||
|
'max': max(all_confidences) if all_confidences else 0,
|
||||||
|
'mean': np.mean(all_confidences) if all_confidences else 0,
|
||||||
|
'std': np.std(all_confidences) if all_confidences else 0
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return stats
|
||||||
|
|
||||||
|
def augment_cropped_cards(self, input_dir: Union[str, Path], output_dir: Union[str, Path],
|
||||||
|
num_augmentations: int = 3) -> Dict[str, Any]:
|
||||||
|
"""
|
||||||
|
Augment all cropped ID cards in the input directory
|
||||||
|
|
||||||
|
Args:
|
||||||
|
input_dir: Directory containing the cropped ID cards
|
||||||
|
output_dir: Output directory for the augmented images
|
||||||
|
num_augmentations: Number of augmentations per card
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Dictionary containing the augmentation results
|
||||||
|
"""
|
||||||
|
input_dir = Path(input_dir)
|
||||||
|
output_dir = Path(output_dir)
|
||||||
|
|
||||||
|
if not input_dir.exists():
|
||||||
|
self.logger.error(f"Input directory not found: {input_dir}")
|
||||||
|
return {}
|
||||||
|
|
||||||
|
# Create the output directory
|
||||||
|
output_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
# Find all cropped ID cards
|
||||||
|
card_files = list(input_dir.glob("id_card_*.jpg"))
|
||||||
|
|
||||||
|
if not card_files:
|
||||||
|
self.logger.warning(f"No ID card files found in: {input_dir}")
|
||||||
|
return {}
|
||||||
|
|
||||||
|
self.logger.info(f"Found {len(card_files)} ID cards to augment")
|
||||||
|
|
||||||
|
results = {}
|
||||||
|
total_augmented = 0
|
||||||
|
|
||||||
|
for i, card_path in enumerate(card_files):
|
||||||
|
self.logger.info(f"Augmenting {i+1}/{len(card_files)}: {card_path.name}")
|
||||||
|
|
||||||
|
# Load ID card
|
||||||
|
card_image = load_image(str(card_path))
|
||||||
|
if card_image is None:
|
||||||
|
self.logger.error(f"Failed to load card: {card_path}")
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Augment card
|
||||||
|
try:
|
||||||
|
augmented_cards = self.data_augmentation.augment_single_image(
|
||||||
|
card_image,
|
||||||
|
num_augmentations=num_augmentations
|
||||||
|
)
|
||||||
|
|
||||||
|
# Debug: check the number of augmented cards
|
||||||
|
self.logger.info(f"Generated {len(augmented_cards)} augmented cards for {card_path.name}")
|
||||||
|
|
||||||
|
# Debug: check the config
|
||||||
|
self.logger.info(f"DataAugmentation config: {self.data_augmentation.config}")
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
self.logger.error(f"Error during augmentation: {e}")
|
||||||
|
augmented_cards = []
|
||||||
|
|
||||||
|
# Save augmented cards
|
||||||
|
card_results = []
|
||||||
|
for j, aug_card in enumerate(augmented_cards):
|
||||||
|
aug_filename = f"{card_path.stem}_aug_{j+1}.jpg"
|
||||||
|
aug_path = output_dir / aug_filename
|
||||||
|
save_image(aug_card, aug_path)
|
||||||
|
|
||||||
|
card_results.append({
|
||||||
|
'augmented_path': str(aug_path),
|
||||||
|
'augmentation_index': j+1
|
||||||
|
})
|
||||||
|
|
||||||
|
results[card_path.name] = {
|
||||||
|
'original_path': str(card_path),
|
||||||
|
'augmented_cards': card_results,
|
||||||
|
'total_augmented': len(card_results)
|
||||||
|
}
|
||||||
|
|
||||||
|
total_augmented += len(card_results)
|
||||||
|
|
||||||
|
# Print progress
|
||||||
|
print_progress(i + 1, len(card_files), f"Augmented {card_path.name}")
|
||||||
|
|
||||||
|
# Build the summary
|
||||||
|
summary = {
|
||||||
|
'total_cards': len(card_files),
|
||||||
|
'total_augmented': total_augmented,
|
||||||
|
'output_directory': str(output_dir),
|
||||||
|
'results': results
|
||||||
|
}
|
||||||
|
|
||||||
|
# Save the summary
|
||||||
|
summary_path = output_dir / "augmentation_summary.json"
|
||||||
|
with open(summary_path, 'w', encoding='utf-8') as f:
|
||||||
|
json.dump(summary, f, indent=2, ensure_ascii=False)
|
||||||
|
|
||||||
|
self.logger.info(f"Augmentation completed. Summary saved to: {summary_path}")
|
||||||
|
return summary
|
||||||
|
|
||||||
|
def load_yolo_config(self, config_path: str = None) -> Dict[str, Any]:
|
||||||
|
"""
|
||||||
|
Load the config from the YOLO detector
|
||||||
|
|
||||||
|
Args:
|
||||||
|
config_path: Path to the config file
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Config dictionary
|
||||||
|
"""
|
||||||
|
if config_path is None:
|
||||||
|
# Look for the default config
|
||||||
|
default_config_path = "src/model/ID_cards_detector/config.py"
|
||||||
|
if os.path.exists(default_config_path):
|
||||||
|
config_path = default_config_path
|
||||||
|
|
||||||
|
config = {}
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Import the config from the YOLO detector
|
||||||
|
import sys
|
||||||
|
sys.path.append(str(Path("src/model/ID_cards_detector")))
|
||||||
|
|
||||||
|
from config import DEFAULT_TRAINING_CONFIG, DEFAULT_INFERENCE_CONFIG
|
||||||
|
|
||||||
|
config.update({
|
||||||
|
'yolo_training_config': DEFAULT_TRAINING_CONFIG,
|
||||||
|
'yolo_inference_config': DEFAULT_INFERENCE_CONFIG,
|
||||||
|
'detection': {
|
||||||
|
'confidence_threshold': DEFAULT_INFERENCE_CONFIG.get('conf_threshold', 0.25),
|
||||||
|
'iou_threshold': DEFAULT_INFERENCE_CONFIG.get('iou_threshold', 0.45),
|
||||||
|
'padding': 10
|
||||||
|
},
|
||||||
|
'processing': {
|
||||||
|
'apply_augmentation': True,
|
||||||
|
'save_original': True,
|
||||||
|
'num_augmentations': 3,
|
||||||
|
'save_format': "jpg",
|
||||||
|
'quality': 95,
|
||||||
|
'target_size': [640, 640]
|
||||||
|
},
|
||||||
|
'crop_options': {
|
||||||
|
'crop_mode': 'bbox', # bbox, square, aspect_ratio
|
||||||
|
'target_size': None,  # (width, height) or None
|
||||||
|
'padding': 10
|
||||||
|
}
|
||||||
|
})
|
||||||
|
|
||||||
|
self.logger.info("Loaded YOLO config successfully")
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
self.logger.warning(f"Failed to load YOLO config: {e}")
|
||||||
|
# Fallback config
|
||||||
|
config = {
|
||||||
|
'detection': {
|
||||||
|
'confidence_threshold': 0.25,
|
||||||
|
'iou_threshold': 0.45,
|
||||||
|
'padding': 10
|
||||||
|
},
|
||||||
|
'processing': {
|
||||||
|
'apply_augmentation': True,
|
||||||
|
'save_original': True,
|
||||||
|
'num_augmentations': 3,
|
||||||
|
'save_format': "jpg",
|
||||||
|
'quality': 95,
|
||||||
|
'target_size': [640, 640]
|
||||||
|
},
|
||||||
|
'crop_options': {
|
||||||
|
'crop_mode': 'bbox',
|
||||||
|
'target_size': None,
|
||||||
|
'padding': 10
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return config
|
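`crop_id_card` pads the box, clamps it to the image, and then optionally reshapes it. A small sketch of the padding and square-mode arithmetic on coordinates alone, which makes one property visible: because the square is clamped after it is centered, a card near the image border yields a non-square crop. The function name `square_crop_box` is hypothetical; the arithmetic follows the `square` branch of the method in this file.

```python
from typing import List, Tuple


def square_crop_box(bbox: List[int], width: int, height: int,
                    padding: int = 10) -> Tuple[int, int, int, int]:
    """Hypothetical helper mirroring the 'square' branch of crop_id_card."""
    x1, y1, x2, y2 = bbox
    # Pad, clamped to the image bounds.
    x1, y1 = max(0, x1 - padding), max(0, y1 - padding)
    x2, y2 = min(width, x2 + padding), min(height, y2 + padding)
    # Center a square whose side is the longer edge of the padded box.
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    half = max(x2 - x1, y2 - y1) // 2
    return (max(0, cx - half), max(0, cy - half),
            min(width, cx + half), min(height, cy + half))


# Card in the middle of a 1000x1000 image: the result is square.
centered = square_crop_box([400, 450, 600, 550], 1000, 1000)
print(centered)  # (390, 390, 610, 610)

# Card touching the top edge: clamping makes the crop shorter than it is wide.
at_edge = square_crop_box([400, 0, 600, 100], 1000, 1000)
print(at_edge)  # (390, 0, 610, 165)
```

If strictly square output matters downstream, passing `target_size` forces the shape but stretches the clamped crop. Separately, the `aspect_ratio` branch uses width/height = 3/4, which describes a portrait box, while ISO/IEC 7810 ID-1 cards measure 85.60 mm by 53.98 mm (landscape), so that constant may deserve a second look for cards photographed in landscape orientation.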
@@ -41,14 +41,11 @@ def load_image(image_path: Path, target_size: Tuple[int, int] = None) -> Optiona
|
|||||||
image = cv2.imread(str(image_path))
|
image = cv2.imread(str(image_path))
|
||||||
if image is None:
|
if image is None:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
# Convert BGR to RGB
|
# Convert BGR to RGB
|
||||||
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
|
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
|
||||||
|
|
||||||
# Resize if target_size is provided
|
# Resize if target_size is provided
|
||||||
if target_size:
|
if target_size:
|
||||||
image = cv2.resize(image, target_size, interpolation=cv2.INTER_AREA)
|
image = cv2.resize(image, target_size, interpolation=cv2.INTER_AREA)
|
||||||
|
|
||||||
return image
|
return image
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
print(f"Error loading image {image_path}: {e}")
|
print(f"Error loading image {image_path}: {e}")
|
||||||
|