update augment + YOLO pipeline
9
.gitignore
vendored
@@ -16,4 +16,11 @@
|
||||
*.pt
|
||||
*.ipynb
|
||||
*.pyc
|
||||
*.log
|
||||
*.log
|
||||
|
||||
!docs/
|
||||
!docs/**/*.png
|
||||
!docs/**/*.jpg
|
||||
!docs/**/*.jpeg
|
||||
!docs/**/*.gif
|
||||
!docs/**/*.svg
|
423
README.md
@@ -1,132 +1,148 @@
|
||||
# ID Cards Data Augmentation Tool
|
||||
# ID Card Data Augmentation Pipeline
|
||||
|
||||
A comprehensive data augmentation tool specifically designed for ID card images, implementing 7 different augmentation techniques to simulate real-world scenarios.
|
||||
A comprehensive data augmentation pipeline for ID card images with YOLO-based detection and advanced augmentation techniques.
|
||||
|
||||
## 🎯 Overview
|
||||

|
||||
|
||||
This tool provides data augmentation capabilities for ID card images, implementing various transformation techniques that mimic real-world conditions such as worn-out cards, partial occlusion, different lighting conditions, and more.
|
||||
## 🚀 Features
|
||||
|
||||
## ✨ Features
|
||||
### **YOLO-based ID Card Detection**
|
||||
- Automatic detection and cropping of ID cards from large images
|
||||
- Configurable confidence and IoU thresholds
|
||||
- Multiple cropping modes (bbox, square, aspect_ratio)
|
||||
- Padding and target size customization
|
||||
|
||||
### 7 Augmentation Techniques
|
||||
### **Advanced Data Augmentation**
|
||||
- **Geometric Transformations**: Rotation with multiple angles
|
||||
- **Random Cropping**: Simulates partially visible cards
|
||||
- **Noise Addition**: Simulates worn-out cards
|
||||
- **Partial Blockage**: Simulates occluded card details
|
||||
- **Blurring**: Simulates blurred but readable images
|
||||
- **Brightness/Contrast**: Mimics different lighting conditions
|
||||
- **Grayscale Conversion**: Final preprocessing step for all images
|
||||
|
||||
1. **Rotation** - Simulates cards at different angles
|
||||
2. **Random Cropping** - Simulates partially visible cards
|
||||
3. **Random Noise** - Simulates worn-out cards
|
||||
4. **Horizontal Blockage** - Simulates occluded card details
|
||||
5. **Grayscale Transformation** - Simulates Xerox/scan copies
|
||||
6. **Blurring** - Simulates blurred but readable cards
|
||||
7. **Brightness & Contrast** - Simulates different lighting conditions
|
||||
### **Flexible Configuration**
|
||||
- YAML-based configuration system
|
||||
- Command-line argument overrides
|
||||
- Environment-specific settings
|
||||
- Comprehensive logging
|
||||
|
||||
### Key Features
|
||||
## 📋 Requirements
|
||||
|
||||
- **Separate Methods**: Each augmentation technique is applied independently
|
||||
- **Quality Preservation**: Maintains image quality with white background preservation
|
||||
- **OpenCV Integration**: Uses OpenCV functions for reliable image processing
|
||||
- **Configurable**: Easy configuration through YAML files
|
||||
- **Progress Tracking**: Real-time progress monitoring
|
||||
- **Batch Processing**: Process multiple images efficiently
|
||||
```bash
|
||||
# Python 3.8+
|
||||
conda create -n gpu python=3.8
|
||||
conda activate gpu
|
||||
|
||||
## 🚀 Installation
|
||||
# Install dependencies
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
### Prerequisites
|
||||
### Dependencies
|
||||
- `opencv-python>=4.5.0`
|
||||
- `numpy>=1.21.0`
|
||||
- `Pillow>=8.3.0`
|
||||
- `PyYAML>=5.4.0`
|
||||
- `ultralytics>=8.0.0` (for YOLO models)
|
||||
|
||||
- Python 3.7+
|
||||
- OpenCV
|
||||
- NumPy
|
||||
- PyYAML
|
||||
- PIL (Pillow)
|
||||
## 🛠️ Installation
|
||||
|
||||
### Setup
|
||||
|
||||
1. **Clone the repository**:
|
||||
1. **Clone the repository**
|
||||
```bash
|
||||
git clone <repository-url>
|
||||
cd IDcardsGenerator
|
||||
```
|
||||
|
||||
2. **Install dependencies**:
|
||||
2. **Install dependencies**
|
||||
```bash
|
||||
pip install opencv-python numpy pyyaml pillow
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
3. **Activate conda environment** (if using GPU):
|
||||
3. **Prepare YOLO model** (optional)
|
||||
```bash
|
||||
conda activate gpu
|
||||
# Place your trained YOLO model at:
|
||||
data/weights/id_cards_yolov8n.pt
|
||||
```
|
||||
|
||||
## 📁 Project Structure
|
||||
## 📖 Usage
|
||||
|
||||
```
|
||||
IDcardsGenerator/
|
||||
├── config/
|
||||
│ └── config.yaml # Main configuration file
|
||||
├── data/
|
||||
│ └── IDcards/
|
||||
│ └── processed/ # Input images directory
|
||||
├── src/
|
||||
│ ├── data_augmentation.py # Core augmentation logic
|
||||
│ ├── config_manager.py # Configuration management
|
||||
│ ├── image_processor.py # Image processing utilities
|
||||
│ └── utils.py # Utility functions
|
||||
├── logs/ # Log files
|
||||
├── out/ # Output directory
|
||||
└── main.py # Main script
|
||||
### **Basic Usage**
|
||||
|
||||
```bash
|
||||
# Run with default configuration
|
||||
python main.py
|
||||
|
||||
# Run with ID card detection enabled
|
||||
python main.py --enable-id-detection
|
||||
|
||||
# Run with custom input/output directories
|
||||
python main.py --input-dir "path/to/input" --output-dir "path/to/output"
|
||||
```
|
||||
|
||||
## ⚙️ Configuration
|
||||
### **Configuration Options**
|
||||
|
||||
### Main Configuration (`config/config.yaml`)
|
||||
#### **ID Card Detection**
|
||||
```bash
|
||||
# Enable detection with custom model
|
||||
python main.py --enable-id-detection --model-path "path/to/model.pt"
|
||||
|
||||
# Adjust detection parameters
|
||||
python main.py --enable-id-detection --confidence 0.3 --crop-mode square
|
||||
|
||||
# Set target size for cropped cards
|
||||
python main.py --enable-id-detection --crop-target-size 640x640
|
||||
```
|
||||
|
||||
#### **Data Augmentation**
|
||||
```bash
|
||||
# Customize augmentation parameters
|
||||
python main.py --num-augmentations 5 --target-size 512x512
|
||||
|
||||
# Preview augmentation results
|
||||
python main.py --preview
|
||||
```
|
||||
|
||||
### **Configuration File**
|
||||
|
||||
Edit `config/config.yaml` for persistent settings:
|
||||
|
||||
```yaml
|
||||
# Data augmentation parameters
|
||||
# ID Card Detection
|
||||
id_card_detection:
|
||||
enabled: false # Enable/disable YOLO detection
|
||||
model_path: "data/weights/id_cards_yolov8n.pt"
|
||||
confidence_threshold: 0.25
|
||||
iou_threshold: 0.45
|
||||
padding: 10
|
||||
crop_mode: "bbox"
|
||||
target_size: null
|
||||
|
||||
# Data Augmentation
|
||||
augmentation:
|
||||
# Rotation
|
||||
rotation:
|
||||
enabled: true
|
||||
angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
|
||||
probability: 1.0
|
||||
|
||||
# Random cropping
|
||||
random_cropping:
|
||||
enabled: true
|
||||
ratio_range: [0.7, 1.0]
|
||||
probability: 1.0
|
||||
|
||||
# Random noise
|
||||
random_noise:
|
||||
enabled: true
|
||||
mean_range: [0.0, 0.7]
|
||||
variance_range: [0.0, 0.1]
|
||||
probability: 1.0
|
||||
|
||||
# Partial blockage
|
||||
partial_blockage:
|
||||
enabled: true
|
||||
num_occlusions_range: [1, 100]
|
||||
coverage_range: [0.0, 0.25]
|
||||
variance_range: [0.0, 0.1]
|
||||
probability: 1.0
|
||||
|
||||
# Grayscale transformation
|
||||
grayscale:
|
||||
enabled: true
|
||||
probability: 1.0
|
||||
|
||||
# Blurring
|
||||
blurring:
|
||||
enabled: true
|
||||
kernel_ratio_range: [0.0, 0.0084]
|
||||
probability: 1.0
|
||||
|
||||
# Brightness and contrast
|
||||
brightness_contrast:
|
||||
enabled: true
|
||||
alpha_range: [0.4, 3.0]
|
||||
beta_range: [1, 100]
|
||||
probability: 1.0
|
||||
grayscale:
|
||||
enabled: true # Applied as final step
|
||||
|
||||
# Processing configuration
|
||||
# Processing
|
||||
processing:
|
||||
target_size: [640, 640]
|
||||
num_augmentations: 3
|
||||
@@ -134,156 +150,139 @@ processing:
|
||||
quality: 95
|
||||
```
|
||||
|
||||
## 🎮 Usage
|
||||
## 🔄 Workflow
|
||||
|
||||
### Basic Usage
|
||||
### **Two-Step Processing Pipeline**
|
||||
|
||||
#### **Step 1: ID Card Detection (Optional)**
|
||||
When `id_card_detection.enabled: true`:
|
||||
1. **Input**: Large images containing multiple ID cards
|
||||
2. **YOLO Detection**: Locate and detect ID cards
|
||||
3. **Cropping**: Extract individual ID cards with padding
|
||||
4. **Output**: Cropped ID cards saved to `out/processed/`
|
||||
|
||||
#### **Step 2: Data Augmentation**
|
||||
1. **Input**: Original images OR cropped ID cards
|
||||
2. **Augmentation**: Apply 6 augmentation methods:
|
||||
- Rotation (9 different angles)
|
||||
- Random cropping (70-100% ratio)
|
||||
- Random noise (simulate wear)
|
||||
- Partial blockage (simulate occlusion)
|
||||
- Blurring (simulate motion blur)
|
||||
- Brightness/Contrast adjustment
|
||||
3. **Grayscale**: Convert all images to grayscale (final step)
|
||||
4. **Output**: Augmented images in main output directory
|
||||
|
||||
### **Direct Augmentation Mode**
|
||||
When `id_card_detection.enabled: false`:
|
||||
- Skips YOLO detection
|
||||
- Applies augmentation directly to input images
|
||||
- All images are converted to grayscale
|
||||
|
||||
## 📊 Output Structure
|
||||
|
||||
```
|
||||
output_directory/
|
||||
├── processed/ # Cropped ID cards (if detection enabled)
|
||||
│ ├── id_card_001.jpg
|
||||
│ ├── id_card_002.jpg
|
||||
│ └── processing_summary.json
|
||||
├── im1__rotation_01.png # Augmented images
|
||||
├── im1__cropping_01.png
|
||||
├── im1__noise_01.png
|
||||
├── im1__blockage_01.png
|
||||
├── im1__blurring_01.png
|
||||
├── im1__brightness_contrast_01.png
|
||||
└── augmentation_summary.json
|
||||
```
|
||||
|
||||
## 🎯 Use Cases
|
||||
|
||||
### **Training Data Generation**
|
||||
```bash
|
||||
python main.py --input-dir data/IDcards/processed --output-dir out
|
||||
# Generate diverse training data
|
||||
python main.py --enable-id-detection --num-augmentations 10
|
||||
```
|
||||
|
||||
### Command Line Options
|
||||
|
||||
### **Quality Control**
|
||||
```bash
|
||||
python main.py [OPTIONS]
|
||||
|
||||
Options:
|
||||
--config CONFIG Path to configuration file (default: config/config.yaml)
|
||||
--input-dir INPUT_DIR Input directory containing images
|
||||
--output-dir OUTPUT_DIR Output directory for augmented images
|
||||
--num-augmentations N Number of augmented versions per image (default: 3)
|
||||
--target-size SIZE Target size for images (width x height)
|
||||
--preview Preview augmentation on first image only
|
||||
--info Show information about images in input directory
|
||||
--list-presets List available presets and exit
|
||||
--log-level LEVEL Logging level (DEBUG, INFO, WARNING, ERROR)
|
||||
# Preview results before processing
|
||||
python main.py --preview
|
||||
```
|
||||
|
||||
### Examples
|
||||
|
||||
1. **Preview augmentation**:
|
||||
### **Batch Processing**
|
||||
```bash
|
||||
python main.py --preview --input-dir data/IDcards/processed --output-dir test_output
|
||||
# Process large datasets
|
||||
python main.py --input-dir "large_dataset/" --output-dir "augmented_dataset/"
|
||||
```
|
||||
|
||||
2. **Show image information**:
|
||||
```bash
|
||||
python main.py --info --input-dir data/IDcards/processed
|
||||
## ⚙️ Advanced Configuration
|
||||
|
||||
### **Custom Augmentation Parameters**
|
||||
|
||||
```yaml
|
||||
augmentation:
|
||||
rotation:
|
||||
angles: [45, 90, 135, 180, 225, 270, 315] # Custom angles
|
||||
random_cropping:
|
||||
ratio_range: [0.8, 0.95] # Tighter cropping
|
||||
random_noise:
|
||||
mean_range: [0.1, 0.5] # More noise
|
||||
variance_range: [0.05, 0.15]
|
||||
```
|
||||
|
||||
3. **Custom number of augmentations**:
|
||||
```bash
|
||||
python main.py --input-dir data/IDcards/processed --output-dir out --num-augmentations 5
|
||||
### **Performance Optimization**
|
||||
|
||||
```yaml
|
||||
performance:
|
||||
num_workers: 4
|
||||
prefetch_factor: 2
|
||||
pin_memory: true
|
||||
use_gpu: false
|
||||
```
|
||||
|
||||
4. **Custom target size**:
|
||||
```bash
|
||||
python main.py --input-dir data/IDcards/processed --output-dir out --target-size 512x512
|
||||
```
|
||||
|
||||
## 📊 Output
|
||||
|
||||
### File Naming Convention
|
||||
|
||||
The tool creates separate files for each augmentation method:
|
||||
|
||||
```
|
||||
im1_rotation_01.png # Rotation method
|
||||
im1_cropping_01.png # Random cropping method
|
||||
im1_noise_01.png # Random noise method
|
||||
im1_blockage_01.png # Partial blockage method
|
||||
im1_grayscale_01.png # Grayscale method
|
||||
im1_blurring_01.png # Blurring method
|
||||
im1_brightness_contrast_01.png # Brightness/contrast method
|
||||
```
|
||||
|
||||
### Output Summary
|
||||
|
||||
After processing, you'll see a summary like:
|
||||
|
||||
```
|
||||
==================================================
|
||||
AUGMENTATION SUMMARY
|
||||
==================================================
|
||||
Original images: 106
|
||||
Augmented images: 2226
|
||||
Augmentation ratio: 21.00
|
||||
Successful augmentations: 106
|
||||
Output directory: out
|
||||
==================================================
|
||||
```
|
||||
|
||||
## 🔧 Augmentation Techniques Details
|
||||
|
||||
### 1. Rotation
|
||||
- **Purpose**: Simulates cards at different angles
|
||||
- **Angles**: 30°, 60°, 120°, 150°, 180°, 210°, 240°, 300°, 330°
|
||||
- **Method**: OpenCV rotation with white background preservation
|
||||
|
||||
### 2. Random Cropping
|
||||
- **Purpose**: Simulates partially visible ID cards
|
||||
- **Ratio Range**: 0.7 to 1.0 (70% to 100% of original size)
|
||||
- **Method**: Random crop with white background preservation
|
||||
|
||||
### 3. Random Noise
|
||||
- **Purpose**: Simulates worn-out cards
|
||||
- **Mean Range**: 0.0 to 0.7
|
||||
- **Variance Range**: 0.0 to 0.1
|
||||
- **Method**: Gaussian noise addition
|
||||
|
||||
### 4. Horizontal Blockage
|
||||
- **Purpose**: Simulates occluded card details
|
||||
- **Lines**: 1 to 100 horizontal lines
|
||||
- **Coverage**: 0% to 25% of image area
|
||||
- **Colors**: Multiple colors to simulate various objects
|
||||
|
||||
### 5. Grayscale Transformation
|
||||
- **Purpose**: Simulates Xerox/scan copies
|
||||
- **Method**: OpenCV `cv2.cvtColor()` function
|
||||
- **Output**: 3-channel grayscale image
|
||||
|
||||
### 6. Blurring
|
||||
- **Purpose**: Simulates blurred but readable cards
|
||||
- **Kernel Ratio**: 0.0 to 0.0084
|
||||
- **Method**: OpenCV `cv2.filter2D()` with Gaussian kernel
|
||||
|
||||
### 7. Brightness & Contrast
|
||||
- **Purpose**: Simulates different lighting conditions
|
||||
- **Alpha Range**: 0.4 to 3.0 (contrast)
|
||||
- **Beta Range**: 1 to 100 (brightness)
|
||||
- **Method**: OpenCV `cv2.convertScaleAbs()`
|
||||
|
||||
## 🛠️ Development
|
||||
|
||||
### Adding New Augmentation Methods
|
||||
|
||||
1. Add the method to `src/data_augmentation.py`
|
||||
2. Update configuration in `config/config.yaml`
|
||||
3. Update default config in `src/config_manager.py`
|
||||
4. Test with preview mode
|
||||
|
||||
### Code Structure
|
||||
|
||||
- **`main.py`**: Entry point and command-line interface
|
||||
- **`src/data_augmentation.py`**: Core augmentation logic
|
||||
- **`src/config_manager.py`**: Configuration management
|
||||
- **`src/image_processor.py`**: Image processing utilities
|
||||
- **`src/utils.py`**: Utility functions
|
||||
|
||||
## 📝 Logging
|
||||
|
||||
The tool provides comprehensive logging:
|
||||
The system provides comprehensive logging:
|
||||
- **File**: `logs/data_augmentation.log`
|
||||
- **Console**: Real-time progress updates
|
||||
- **Summary**: JSON files with processing statistics
|
||||
|
||||
- **File logging**: `logs/data_augmentation.log`
|
||||
- **Console logging**: Real-time progress updates
|
||||
- **Log levels**: DEBUG, INFO, WARNING, ERROR
|
||||
### **Log Levels**
|
||||
- `INFO`: General processing information
|
||||
- `WARNING`: Non-critical issues (e.g., no cards detected)
|
||||
- `ERROR`: Critical errors
|
||||
|
||||
## 🔧 Troubleshooting
|
||||
|
||||
### **Common Issues**
|
||||
|
||||
1. **No images detected**
|
||||
- Check input directory path
|
||||
- Verify image formats (jpg, png, bmp, tiff)
|
||||
- Ensure images are not corrupted
|
||||
|
||||
2. **YOLO model not found**
|
||||
- Place model file at `data/weights/id_cards_yolov8n.pt`
|
||||
- Or specify custom path with `--model-path`
|
||||
|
||||
3. **Memory issues**
|
||||
- Reduce `num_augmentations`
|
||||
- Use smaller `target_size`
|
||||
- Enable GPU if available
|
||||
|
||||
### **Performance Tips**
|
||||
|
||||
- **GPU Acceleration**: Set `use_gpu: true` in config
|
||||
- **Batch Processing**: Use multiple workers for large datasets
|
||||
- **Memory Management**: Process in smaller batches
|
||||
|
||||
## 🤝 Contributing
|
||||
|
||||
1. Fork the repository
|
||||
2. Create a feature branch
|
||||
3. Make your changes
|
||||
4. Test thoroughly
|
||||
4. Add tests if applicable
|
||||
5. Submit a pull request
|
||||
|
||||
## 📄 License
|
||||
@@ -292,18 +291,10 @@ This project is licensed under the MIT License - see the LICENSE file for detail
|
||||
|
||||
## 🙏 Acknowledgments
|
||||
|
||||
- OpenCV for image processing capabilities
|
||||
- NumPy for numerical operations
|
||||
- PyYAML for configuration management
|
||||
|
||||
## 📞 Support
|
||||
|
||||
For issues and questions:
|
||||
1. Check the logs in `logs/data_augmentation.log`
|
||||
2. Review the configuration in `config/config.yaml`
|
||||
3. Test with preview mode first
|
||||
4. Create an issue with detailed information
|
||||
- **YOLOv8**: Ultralytics for the detection framework
|
||||
- **OpenCV**: Computer vision operations
|
||||
- **NumPy**: Numerical computations
|
||||
|
||||
---
|
||||
|
||||
**Note**: This tool is specifically designed for ID card augmentation and may need adjustments for other image types.
|
||||
**For questions and support, please open an issue on GitHub.**
|
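The detection step in the README crops each detected card with `padding` and a `crop_mode`. The following is a minimal sketch of how such a crop rectangle could be computed, not the repository's implementation. `padded_crop_box` is a hypothetical helper, the `(x1, y1, x2, y2)` box layout is an assumption, and the meaning given to `square` (grow the shorter side around the box centre) is a guess. `aspect_ratio` is left out because the diff does not define it.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels (assumed layout)


def padded_crop_box(box: Box, image_size: Tuple[int, int],
                    padding: int = 10, crop_mode: str = "bbox") -> Box:
    """Return the crop rectangle for one detection (hypothetical helper)."""
    width, height = image_size
    x1, y1, x2, y2 = box

    if crop_mode == "square":
        # Assumption: grow the shorter side around the box centre.
        side = max(x2 - x1, y2 - y1)
        cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
        x1, x2 = cx - side // 2, cx - side // 2 + side
        y1, y2 = cy - side // 2, cy - side // 2 + side
    elif crop_mode != "bbox":
        raise ValueError(f"unsupported crop_mode: {crop_mode!r}")

    # Add the padding, then clamp the rectangle to the image bounds.
    return (max(0, x1 - padding), max(0, y1 - padding),
            min(width, x2 + padding), min(height, y2 + padding))


print(padded_crop_box((100, 50, 300, 150), (640, 480)))
```

Clamping after padding keeps the rectangle valid when a card touches the image border, at the cost of a slightly off-centre or non-square crop in that case.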
@@ -7,6 +7,17 @@ paths:
|
||||
output_dir: "out1"
|
||||
log_file: "logs/data_augmentation.log"
|
||||
|
||||
# ID Card Detection configuration
|
||||
id_card_detection:
|
||||
enabled: false # Enable/disable ID card detection and cropping
|
||||
model_path: "data/weights/id_cards_yolov8n.pt" # Path to the YOLO model
|
||||
confidence_threshold: 0.25 # Confidence threshold for detection
|
||||
iou_threshold: 0.45 # IoU threshold for NMS
|
||||
padding: 10 # Extra padding around the bbox
|
||||
crop_mode: "bbox" # Crop mode: bbox, square, aspect_ratio
|
||||
target_size: null # Target size (width, height) or null
|
||||
save_original_crops: true # Whether to save the original cropped images
|
||||
|
||||
# Data augmentation parameters - ROTATION and RANDOM CROPPING
|
||||
augmentation:
|
||||
# Geometric transformations
|
||||
@@ -36,11 +47,6 @@ augmentation:
|
||||
variance_range: [0.0, 0.1] # Line thickness variance (min, max)
|
||||
probability: 1.0 # Always apply blockage
|
||||
|
||||
# Grayscale transformation to mimic Xerox/scan copies
|
||||
grayscale:
|
||||
enabled: true
|
||||
probability: 1.0 # Always apply grayscale
|
||||
|
||||
# Blurring to simulate blurred card images that are still readable
|
||||
blurring:
|
||||
enabled: true
|
||||
@@ -53,6 +59,11 @@ augmentation:
|
||||
alpha_range: [0.4, 3.0] # Contrast range (min, max)
|
||||
beta_range: [1, 100] # Brightness range (min, max)
|
||||
probability: 1.0 # Always apply brightness/contrast adjustment
|
||||
|
||||
# Grayscale transformation as final step (applied to all augmented images)
|
||||
grayscale:
|
||||
enabled: true
|
||||
probability: 1.0 # Always apply grayscale as final step
|
||||
|
||||
# Processing configuration
|
||||
processing:
|
||||
|
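The config change above moves `grayscale` to the end of the `augmentation` block so that it runs once on every augmented image instead of producing its own set of outputs. The sketch below illustrates that ordering in plain Python. It is not the repository's code: `to_gray_3ch` and `finalize` are hypothetical names, pixels are modelled as `(B, G, R)` tuples, and the three-channel output is an assumption carried over from the older README text. The weights are the ITU-R BT.601 luma coefficients that OpenCV documents for its colour-to-gray conversion.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (B, G, R), the channel order OpenCV uses


def to_gray_3ch(pixels: List[Pixel]) -> List[Pixel]:
    """Convert BGR pixels to gray while keeping three equal channels."""
    out = []
    for b, g, r in pixels:
        # ITU-R BT.601 luma weights.
        y = int(round(0.299 * r + 0.587 * g + 0.114 * b))
        out.append((y, y, y))
    return out


def finalize(augmented: List[List[Pixel]],
             grayscale_enabled: bool = True) -> List[List[Pixel]]:
    """Apply grayscale once, after every other augmentation has run."""
    if not grayscale_enabled:
        return augmented
    return [to_gray_3ch(image) for image in augmented]


# Two tiny "images" standing in for the outputs of earlier augmentations.
batch = [[(0, 0, 255), (0, 255, 0)], [(255, 0, 0), (255, 255, 255)]]
print(finalize(batch))
```

Running the conversion as a final pass means each of the six remaining methods yields gray output without every method needing its own grayscale branch.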
BIN
docs/images/yolov8_pipeline.png
Normal file
Binary file not shown.
Size: 580 KiB
170
main.py
@@ -12,6 +12,7 @@ sys.path.append(str(Path(__file__).parent / "src"))
|
||||
from src.config_manager import ConfigManager
|
||||
from src.data_augmentation import DataAugmentation
|
||||
from src.image_processor import ImageProcessor
|
||||
from src.id_card_detector import IDCardDetector
|
||||
from src.utils import setup_logging, get_image_files, print_progress
|
||||
|
||||
def parse_arguments():
|
||||
@@ -83,6 +84,38 @@ def parse_arguments():
|
||||
help="Logging level"
|
||||
)
|
||||
|
||||
# ID Card Detection arguments
|
||||
parser.add_argument(
|
||||
"--enable-id-detection",
|
||||
action="store_true",
|
||||
help="Enable ID card detection and cropping before augmentation"
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--model-path",
|
||||
type=str,
|
||||
help="Path to YOLO model for ID card detection (overrides config)"
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--confidence",
|
||||
type=float,
|
||||
help="Confidence threshold for ID card detection (overrides config)"
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--crop-mode",
|
||||
type=str,
|
||||
choices=["bbox", "square", "aspect_ratio"],
|
||||
help="Crop mode for ID cards (overrides config)"
|
||||
)
|
||||
|
||||
parser.add_argument(
|
||||
"--crop-target-size",
|
||||
type=str,
|
||||
help="Target size for cropped ID cards (widthxheight) (overrides config)"
|
||||
)
|
||||
|
||||
return parser.parse_args()
|
||||
|
||||
def parse_range(range_str: str) -> tuple:
|
||||
@@ -134,7 +167,8 @@ def show_image_info(input_dir: Path):
|
||||
print(f"\nTotal file size: {total_size:.2f} MB")
|
||||
print(f"Average file size: {total_size/len(image_files):.2f} MB")
|
||||
|
||||
def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, Any]):
|
||||
def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, Any],
|
||||
id_detection_config: Dict[str, Any] = None):
|
||||
"""Preview augmentation on first image"""
|
||||
image_files = get_image_files(input_dir)
|
||||
|
||||
@@ -147,7 +181,40 @@ def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, An
|
||||
# Create augmentation instance
|
||||
augmenter = DataAugmentation(config)
|
||||
|
||||
# Augment first image
|
||||
# Process with ID detection if enabled
|
||||
if id_detection_config and id_detection_config.get('enabled', False):
|
||||
print("🔍 ID Card Detection enabled - processing with YOLO model...")
|
||||
|
||||
# Initialize ID card detector
|
||||
detector = IDCardDetector(
|
||||
model_path=id_detection_config.get('model_path'),
|
||||
config=config
|
||||
)
|
||||
|
||||
if not detector.model:
|
||||
print("❌ Failed to load YOLO model, proceeding with normal augmentation")
|
||||
else:
|
||||
# Process single image with ID detection
|
||||
result = detector.process_single_image(
|
||||
image_path=image_files[0],
|
||||
output_dir=output_dir,
|
||||
apply_augmentation=True,
|
||||
save_original=id_detection_config.get('save_original_crops', True),
|
||||
confidence=id_detection_config.get('confidence_threshold', 0.25),
|
||||
iou_threshold=id_detection_config.get('iou_threshold', 0.45),
|
||||
crop_mode=id_detection_config.get('crop_mode', 'bbox'),
|
||||
target_size=id_detection_config.get('target_size'),
|
||||
padding=id_detection_config.get('padding', 10)
|
||||
)
|
||||
|
||||
if result and result.get('detections'):
|
||||
print(f"✅ Detected {len(result['detections'])} ID cards")
|
||||
print(f"💾 Saved {len(result['processed_cards'])} processed cards")
|
||||
return
|
||||
else:
|
||||
print("⚠️ No ID cards detected, proceeding with normal augmentation")
|
||||
|
||||
# Normal augmentation (fallback)
|
||||
augmented_paths = augmenter.augment_image_file(
|
||||
image_files[0],
|
||||
output_dir,
|
||||
@@ -225,9 +292,29 @@ def main():
|
||||
show_image_info(input_dir)
|
||||
return
|
||||
|
||||
# Get ID detection config
|
||||
id_detection_config = config.get('id_card_detection', {})
|
||||
|
||||
# Override ID detection config with command line arguments
|
||||
if args.enable_id_detection:
|
||||
id_detection_config['enabled'] = True
|
||||
|
||||
if args.model_path:
|
||||
id_detection_config['model_path'] = args.model_path
|
||||
|
||||
if args.confidence is not None:
|
||||
id_detection_config['confidence_threshold'] = args.confidence
|
||||
|
||||
if args.crop_mode:
|
||||
id_detection_config['crop_mode'] = args.crop_mode
|
||||
|
||||
if args.crop_target_size:
|
||||
target_size = parse_size(args.crop_target_size)
|
||||
id_detection_config['target_size'] = list(target_size)
|
||||
|
||||
# Preview augmentation if requested
|
||||
if args.preview:
|
||||
preview_augmentation(input_dir, output_dir, augmentation_config)
|
||||
preview_augmentation(input_dir, output_dir, augmentation_config, id_detection_config)
|
||||
return
|
||||
|
||||
# Get image files
|
||||
@@ -242,35 +329,56 @@ def main():
|
||||
logger.info(f"Number of augmentations per image: {processing_config.get('num_augmentations', 3)}")
|
||||
logger.info(f"Target size: {processing_config.get('target_size', [224, 224])}")
|
||||
|
||||
# Create augmentation instance with new config
|
||||
augmenter = DataAugmentation(augmentation_config)
|
||||
# Process with ID detection if enabled
|
||||
if id_detection_config.get('enabled', False):
|
||||
logger.info("ID Card Detection enabled - processing with YOLO model...")
|
||||
|
||||
# Initialize ID card detector
|
||||
detector = IDCardDetector(
|
||||
model_path=id_detection_config.get('model_path'),
|
||||
config=config
|
||||
)
|
||||
|
||||
if not detector.model:
|
||||
logger.error("Failed to load YOLO model")
|
||||
sys.exit(1)
|
||||
|
||||
logger.info(f"YOLO model loaded: {detector.model_path}")
|
||||
logger.info(f"Confidence threshold: {id_detection_config.get('confidence_threshold', 0.25)}")
|
||||
logger.info(f"Crop mode: {id_detection_config.get('crop_mode', 'bbox')}")
|
||||
|
||||
# Step 1: Detect and crop ID cards into the processed directory
|
||||
processed_dir = output_dir / "processed"
|
||||
processed_dir.mkdir(parents=True, exist_ok=True)
|
||||
logger.info("Step 1: Detect and crop ID cards...")
|
||||
detector.batch_process(
|
||||
input_dir=input_dir,
|
||||
output_dir=processed_dir,
|
||||
confidence=id_detection_config.get('confidence_threshold', 0.25),
|
||||
iou_threshold=id_detection_config.get('iou_threshold', 0.45),
|
||||
crop_mode=id_detection_config.get('crop_mode', 'bbox'),
|
||||
target_size=id_detection_config.get('target_size'),
|
||||
padding=id_detection_config.get('padding', 10)
|
||||
)
|
||||
# Step 2: Augment the cropped cards
|
||||
logger.info("Step 2: Augment cropped ID cards...")
|
||||
augmenter = DataAugmentation(augmentation_config)
|
||||
augmenter.batch_augment(
|
||||
processed_dir,
|
||||
output_dir,
|
||||
num_augmentations=processing_config.get("num_augmentations", 3)
|
||||
)
|
||||
else:
|
||||
# Augment the original images directly
|
||||
logger.info("Starting normal batch augmentation (direct augmentation)...")
|
||||
augmenter = DataAugmentation(augmentation_config)
|
||||
augmenter.batch_augment(
|
||||
input_dir,
|
||||
output_dir,
|
||||
num_augmentations=processing_config.get("num_augmentations", 3)
|
||||
)
|
||||
|
||||
# Update target size
|
||||
target_size = tuple(processing_config.get("target_size", [224, 224]))
|
||||
augmenter.image_processor.target_size = target_size
|
||||
|
||||
# Perform batch augmentation
|
||||
logger.info("Starting batch augmentation...")
|
||||
results = augmenter.batch_augment(
|
||||
input_dir,
|
||||
output_dir,
|
||||
num_augmentations=processing_config.get("num_augmentations", 3)
|
||||
)
|
||||
|
||||
# Get and display summary
|
||||
summary = augmenter.get_augmentation_summary(results)
|
||||
|
||||
print("\n" + "="*50)
|
||||
print("AUGMENTATION SUMMARY")
|
||||
print("="*50)
|
||||
print(f"Original images: {summary['total_original_images']}")
|
||||
print(f"Augmented images: {summary['total_augmented_images']}")
|
||||
print(f"Augmentation ratio: {summary['augmentation_ratio']:.2f}")
|
||||
print(f"Successful augmentations: {summary['successful_augmentations']}")
|
||||
print(f"Output directory: {output_dir}")
|
||||
print("="*50)
|
||||
|
||||
logger.info("Data augmentation completed successfully")
|
||||
logger.info("Data processing completed successfully")
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
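The `main.py` changes layer command-line flags on top of the `id_card_detection` section of the YAML config. The sketch below shows one way to express that merge. It is an illustration under assumptions rather than the project's code: `apply_overrides` is a hypothetical function, only four of the flags are reproduced, and the config is a plain dict. Comparing each value against `None` rather than relying on truthiness means a value such as `--confidence 0.0` still overrides the config.

```python
import argparse
from typing import Any, Dict, List, Optional


def apply_overrides(config: Dict[str, Any],
                    argv: Optional[List[str]] = None) -> Dict[str, Any]:
    """Return a copy of config with command-line overrides applied."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--enable-id-detection", action="store_true")
    parser.add_argument("--model-path", type=str)
    parser.add_argument("--confidence", type=float)
    parser.add_argument("--crop-mode",
                        choices=["bbox", "square", "aspect_ratio"])
    args = parser.parse_args(argv)

    merged = dict(config)  # leave the caller's dict untouched
    if args.enable_id_detection:
        merged["enabled"] = True
    # Compare with None so that falsy values such as 0.0 still override.
    for key, value in (("model_path", args.model_path),
                       ("confidence_threshold", args.confidence),
                       ("crop_mode", args.crop_mode)):
        if value is not None:
            merged[key] = value
    return merged


base = {"enabled": False, "confidence_threshold": 0.25, "crop_mode": "bbox"}
print(apply_overrides(base, ["--enable-id-detection", "--confidence", "0.3"]))
```

Returning a merged copy keeps the loaded configuration available unchanged, which can help when logging both the file values and the effective values.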
@@ -1,133 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Simple ID Card Cropper using Roboflow API
|
||||
Input: folder containing images
|
||||
Output: folder with cropped ID cards
|
||||
"""
|
||||
import sys
|
||||
import yaml
|
||||
from pathlib import Path
|
||||
import logging
|
||||
import argparse
|
||||
|
||||
# Add src to path
|
||||
sys.path.append(str(Path(__file__).parent / "src"))
|
||||
|
||||
from model.roboflow_id_detector import RoboflowIDDetector
|
||||
|
||||
def setup_logging():
|
||||
"""Setup basic logging"""
|
||||
logging.basicConfig(
|
||||
level=logging.INFO,
|
||||
format='%(asctime)s - %(levelname)s - %(message)s'
|
||||
)
|
||||
|
||||
def crop_id_cards(input_folder: str, output_folder: str, api_key: str = "Pkz4puRA0Cy3xMOuNoNr"):
|
||||
"""
|
||||
Crop ID cards from all images in input folder
|
||||
|
||||
Args:
|
||||
input_folder: Path to input folder containing images
|
||||
output_folder: Path to output folder for cropped ID cards
|
||||
api_key: Roboflow API key
|
||||
"""
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Convert to Path objects
|
||||
input_path = Path(input_folder)
|
||||
output_path = Path(output_folder)
|
||||
|
||||
# Check if input folder exists
|
||||
if not input_path.exists():
|
||||
logger.error(f"Input folder not found: {input_folder}")
|
||||
return False
|
||||
|
||||
# Create output folder
|
||||
output_path.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Initialize detector
|
||||
detector = RoboflowIDDetector(
|
||||
api_key=api_key,
|
||||
model_id="french-card-id-detect",
|
||||
version=3,
|
||||
confidence=0.5
|
||||
)
|
||||
|
||||
# Get all image files
|
||||
image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}
|
||||
image_files = []
|
||||
|
||||
for file_path in input_path.rglob('*'):
|
||||
if file_path.is_file() and file_path.suffix.lower() in image_extensions:
|
||||
image_files.append(file_path)
|
||||
|
||||
if not image_files:
|
||||
logger.error(f"No images found in {input_folder}")
|
||||
return False
|
||||
|
||||
logger.info(f"Found {len(image_files)} images to process")
|
||||
|
||||
# Process each image
|
||||
total_cropped = 0
|
||||
|
||||
for i, image_path in enumerate(image_files, 1):
|
||||
logger.info(f"Processing {i}/{len(image_files)}: {image_path.name}")
|
||||
|
||||
# Detect ID cards
|
||||
detections = detector.detect_id_cards(image_path)
|
||||
|
||||
if not detections:
|
||||
logger.warning(f"No ID cards detected in {image_path.name}")
|
||||
continue
|
||||
|
||||
# Crop each detected ID card
|
||||
for j, detection in enumerate(detections):
|
||||
bbox = detection['bbox']
|
||||
|
||||
# Create output filename
|
||||
stem = image_path.stem
|
||||
suffix = f"_card_{j+1}.jpg"
|
||||
output_file = output_path / f"{stem}{suffix}"
|
||||
|
||||
# Crop ID card
|
||||
cropped = detector.crop_id_card(image_path, bbox, output_file)
|
||||
|
||||
if cropped is not None:
|
||||
total_cropped += 1
|
||||
logger.info(f" ✓ Cropped card {j+1} to {output_file.name}")
|
||||
|
||||
# Add delay between requests
|
||||
if i < len(image_files):
|
||||
import time
|
||||
time.sleep(1.0)
|
||||
|
||||
logger.info(f"Processing completed! Total ID cards cropped: {total_cropped}")
|
||||
return True
|
||||
|
||||
def main():
|
||||
"""Main function"""
|
||||
parser = argparse.ArgumentParser(description='Crop ID cards from images using Roboflow API')
|
||||
parser.add_argument('input_folder', help='Input folder containing images')
|
||||
parser.add_argument('output_folder', help='Output folder for cropped ID cards')
|
||||
parser.add_argument('--api-key', default="Pkz4puRA0Cy3xMOuNoNr",
|
||||
help='Roboflow API key (default: demo key)')
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# Setup logging
|
||||
setup_logging()
|
||||
|
||||
# Process images
|
||||
success = crop_id_cards(args.input_folder, args.output_folder, args.api_key)
|
||||
|
||||
if success:
|
||||
print(f"\n✓ Successfully processed images from '{args.input_folder}'")
|
||||
print(f"✓ Cropped ID cards saved to '{args.output_folder}'")
|
||||
else:
|
||||
print("\n✗ Failed to process images")
|
||||
return 1
|
||||
|
||||
return 0
|
||||
|
||||
if __name__ == "__main__":
|
||||
exit(main())
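The `--api-key` option above ships a key as its default value. A minimal sketch of an alternative, assuming the key is supplied through an environment variable instead: the variable name `ROBOFLOW_API_KEY` and the helpers `build_parser` and `resolve_api_key` are hypothetical and not part of this commit.

```python
import argparse
import os


def build_parser(env=None):
    """Build the CLI parser; the API key default is read from the environment."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(
        description="Crop ID cards from images using Roboflow API"
    )
    parser.add_argument("input_folder", help="Input folder containing images")
    parser.add_argument("output_folder", help="Output folder for cropped ID cards")
    # ROBOFLOW_API_KEY is a hypothetical variable name chosen for this sketch.
    parser.add_argument(
        "--api-key",
        default=env.get("ROBOFLOW_API_KEY"),
        help="Roboflow API key (default: $ROBOFLOW_API_KEY)",
    )
    return parser


def resolve_api_key(args):
    """Return the key, or exit if neither the flag nor the variable was set."""
    if not args.api_key:
        raise SystemExit("No API key: pass --api-key or set ROBOFLOW_API_KEY")
    return args.api_key
```

With this shape the key never lands in version control, and an explicit `--api-key` on the command line still overrides the environment.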
|
@@ -363,8 +363,6 @@ class DataAugmentation:
|
||||
|
||||
return result
|
||||
|
||||
|
||||
|
||||
def augment_single_image(self, image: np.ndarray, num_augmentations: int = None) -> List[np.ndarray]:
|
||||
"""
|
||||
Apply each augmentation method separately to create independent augmented versions
|
||||
@@ -455,20 +453,7 @@ class DataAugmentation:
|
||||
|
||||
augmented_images.append(augmented)
|
||||
|
||||
# 5. Grayscale only
|
||||
if grayscale_config.get("enabled", False):
|
||||
for i in range(num_augmentations):
|
||||
augmented = image.copy()
|
||||
augmented = self.convert_to_grayscale_preserve_quality(augmented)
|
||||
|
||||
# Resize preserving aspect ratio
|
||||
target_size = self.image_processor.target_size
|
||||
if target_size:
|
||||
augmented = self.resize_preserve_aspect(augmented, target_size)
|
||||
|
||||
augmented_images.append(augmented)
|
||||
|
||||
# 6. Blurring only
|
||||
# 5. Blurring only
|
||||
if blurring_config.get("enabled", False):
|
||||
for i in range(num_augmentations):
|
||||
augmented = image.copy()
|
||||
@@ -481,7 +466,7 @@ class DataAugmentation:
|
||||
|
||||
augmented_images.append(augmented)
|
||||
|
||||
# 7. Brightness and contrast only
|
||||
# 6. Brightness/Contrast only
|
||||
if brightness_contrast_config.get("enabled", False):
|
||||
for i in range(num_augmentations):
|
||||
augmented = image.copy()
|
||||
@@ -494,6 +479,11 @@ class DataAugmentation:
|
||||
|
||||
augmented_images.append(augmented)
|
||||
|
||||
# 7. Apply grayscale as final step to ALL augmented images
|
||||
if grayscale_config.get("enabled", False):
|
||||
for i in range(len(augmented_images)):
|
||||
augmented_images[i] = self.convert_to_grayscale_preserve_quality(augmented_images[i])
|
||||
|
||||
return augmented_images
|
||||
|
||||
def augment_image_file(self, image_path: Path, output_dir: Path, num_augmentations: int = None) -> List[Path]:
|
||||
@@ -518,7 +508,7 @@ class DataAugmentation:
|
||||
|
||||
# Save augmented images with method names
|
||||
saved_paths = []
|
||||
method_names = ["rotation", "cropping", "noise", "blockage", "grayscale", "blurring", "brightness_contrast"]
|
||||
method_names = ["rotation", "cropping", "noise", "blockage", "blurring", "brightness_contrast", "grayscale"]
|
||||
method_index = 0
|
||||
|
||||
for i, aug_image in enumerate(augmented_images):
|
||||
|
611
src/id_card_detector.py
Normal file
@@ -0,0 +1,611 @@
|
||||
"""
|
||||
ID Card Detector Module
|
||||
Uses YOLO to detect and crop ID cards from large images, combined with data augmentation
|
||||
Integrates with the YOLOv8 French ID Card Detection model
|
||||
"""
|
||||
import cv2
|
||||
import numpy as np
|
||||
from pathlib import Path
|
||||
from typing import List, Tuple, Optional, Dict, Any, Union
|
||||
import torch
|
||||
import torch.nn as nn
|
||||
from ultralytics import YOLO
|
||||
import logging
|
||||
from data_augmentation import DataAugmentation
|
||||
from utils import load_image, save_image, create_augmented_filename, print_progress
|
||||
import os
|
||||
import json
|
||||
import yaml
|
||||
|
||||
class IDCardDetector:
|
||||
"""Detect and crop ID cards from large images using YOLO."""
|
||||
|
||||
def __init__(self, model_path: str = None, config: Dict[str, Any] = None):
|
||||
"""
|
||||
Initialize ID Card Detector
|
||||
|
||||
Args:
|
||||
model_path: Path to the trained YOLO model
|
||||
config: Configuration dictionary
|
||||
"""
|
||||
self.config = config or {}
|
||||
self.model_path = model_path
|
||||
self.model = None
|
||||
self.data_augmentation = DataAugmentation(config)
|
||||
self.logger = self._setup_logger()
|
||||
|
||||
# Fall back to the default model path if none was provided
|
||||
if not model_path:
|
||||
default_model_path = "data/weights/id_cards_yolov8n.pt"
|
||||
if os.path.exists(default_model_path):
|
||||
model_path = default_model_path
|
||||
self.model_path = model_path
|
||||
|
||||
# Load the YOLO model if the file exists
|
||||
if model_path and os.path.exists(model_path):
|
||||
self.load_model(model_path)
|
||||
|
||||
def _setup_logger(self) -> logging.Logger:
|
||||
"""Set up the module logger."""
|
||||
logger = logging.getLogger(__name__)
|
||||
logger.setLevel(logging.INFO)
|
||||
|
||||
if not logger.handlers:
|
||||
handler = logging.StreamHandler()
|
||||
formatter = logging.Formatter(
|
||||
'%(asctime)s - %(name)s - %(levelname)s - %(message)s'
|
||||
)
|
||||
handler.setFormatter(formatter)
|
||||
logger.addHandler(handler)
|
||||
|
||||
return logger
|
||||
|
||||
def load_model(self, model_path: str) -> bool:
|
||||
"""
|
||||
Load a YOLO model from a file
|
||||
|
||||
Args:
|
||||
model_path: Path to the model file
|
||||
|
||||
Returns:
|
||||
True if the model loaded successfully, False otherwise
|
||||
"""
|
||||
try:
|
||||
self.model = YOLO(model_path)
|
||||
self.logger.info(f"Loaded YOLO model from: {model_path}")
|
||||
return True
|
||||
except Exception as e:
|
||||
self.logger.error(f"Failed to load model: {e}")
|
||||
return False
|
||||
|
||||
def detect_id_cards(self, image: np.ndarray, confidence: float = 0.5, iou_threshold: float = 0.45) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Detect ID cards in an image using YOLO
|
||||
|
||||
Args:
|
||||
image: Input image
|
||||
confidence: Confidence threshold
|
||||
iou_threshold: IoU threshold for non-maximum suppression (NMS)
|
||||
|
||||
Returns:
|
||||
List of detection results, each in the format:
|
||||
{
|
||||
'bbox': [x1, y1, x2, y2],
|
||||
'confidence': float,
|
||||
'class_id': int,
|
||||
'class_name': str
|
||||
}
|
||||
"""
|
||||
if self.model is None:
|
||||
self.logger.error("Model has not been loaded!")
|
||||
return []
|
||||
|
||||
try:
|
||||
# Run inference
|
||||
results = self.model(image, conf=confidence, iou=float(iou_threshold), verbose=False)
|
||||
|
||||
detections = []
|
||||
for result in results:
|
||||
boxes = result.boxes
|
||||
if boxes is not None:
|
||||
for box in boxes:
|
||||
# Get bbox coordinates
|
||||
x1, y1, x2, y2 = box.xyxy[0].cpu().numpy()
|
||||
|
||||
# Get confidence and class
|
||||
confidence_score = float(box.conf[0].cpu().numpy())
|
||||
class_id = int(box.cls[0].cpu().numpy())
|
||||
class_name = self.model.names[class_id] if hasattr(self.model, 'names') else f"class_{class_id}"
|
||||
|
||||
detection = {
|
||||
'bbox': [int(x1), int(y1), int(x2), int(y2)],
|
||||
'confidence': confidence_score,
|
||||
'class_id': class_id,
|
||||
'class_name': class_name
|
||||
}
|
||||
detections.append(detection)
|
||||
|
||||
self.logger.info(f"Detected {len(detections)} ID cards")
|
||||
return detections
|
||||
|
||||
except Exception as e:
|
||||
self.logger.error(f"Error during detection: {e}")
|
||||
return []
|
||||
|
||||
def crop_id_card(self, image: np.ndarray, bbox: List[int], padding: int = 10,
|
||||
crop_mode: str = "bbox", target_size: Tuple[int, int] = None) -> np.ndarray:
|
||||
"""
|
||||
Crop an ID card from the original image based on its bbox, with several options
|
||||
|
||||
Args:
|
||||
image: Input image
|
||||
bbox: Bounding box [x1, y1, x2, y2]
|
||||
padding: Extra padding (in pixels) around the bbox
|
||||
crop_mode: Crop mode ("bbox", "square", "aspect_ratio")
|
||||
target_size: Target size (width, height) to resize to, if given
|
||||
|
||||
Returns:
|
||||
Cropped ID card image
|
||||
"""
|
||||
x1, y1, x2, y2 = bbox
|
||||
|
||||
# Add padding, clamped to the image bounds
|
||||
height, width = image.shape[:2]
|
||||
x1 = max(0, x1 - padding)
|
||||
y1 = max(0, y1 - padding)
|
||||
x2 = min(width, x2 + padding)
|
||||
y2 = min(height, y2 + padding)
|
||||
|
||||
# Crop according to the selected mode
|
||||
if crop_mode == "bbox":
|
||||
# Crop to the (padded) bbox
|
||||
cropped = image[y1:y2, x1:x2]
|
||||
elif crop_mode == "square":
|
||||
# Crop to a square around the bbox center
|
||||
center_x = (x1 + x2) // 2
|
||||
center_y = (y1 + y2) // 2
|
||||
size = max(x2 - x1, y2 - y1)
|
||||
half_size = size // 2
|
||||
|
||||
x1 = max(0, center_x - half_size)
|
||||
y1 = max(0, center_y - half_size)
|
||||
x2 = min(width, center_x + half_size)
|
||||
y2 = min(height, center_y + half_size)
|
||||
|
||||
cropped = image[y1:y2, x1:x2]
|
||||
elif crop_mode == "aspect_ratio":
|
||||
# Crop to a standard aspect ratio (3:4 for ID cards)
|
||||
bbox_width = x2 - x1
|
||||
bbox_height = y2 - y1
|
||||
center_x = (x1 + x2) // 2
|
||||
center_y = (y1 + y2) // 2
|
||||
|
||||
# 3:4 (width:height) ratio for ID cards
|
||||
target_ratio = 3 / 4
|
||||
current_ratio = bbox_width / bbox_height
|
||||
|
||||
if current_ratio > target_ratio:
|
||||
# Bbox is too wide: keep the height
|
||||
new_width = int(bbox_height * target_ratio)
|
||||
half_width = new_width // 2
|
||||
x1 = max(0, center_x - half_width)
|
||||
x2 = min(width, center_x + half_width)
|
||||
else:
|
||||
# Bbox is too tall: keep the width
|
||||
new_height = int(bbox_width / target_ratio)
|
||||
half_height = new_height // 2
|
||||
y1 = max(0, center_y - half_height)
|
||||
y2 = min(height, center_y + half_height)
|
||||
|
||||
cropped = image[y1:y2, x1:x2]
|
||||
else:
|
||||
# Default: crop to the bbox
|
||||
cropped = image[y1:y2, x1:x2]
|
||||
|
||||
# Resize if target_size is given
|
||||
if target_size:
|
||||
cropped = cv2.resize(cropped, target_size, interpolation=cv2.INTER_AREA)
|
||||
|
||||
return cropped
|
||||
|
||||
def process_single_image(self, image_path: Union[str, Path], output_dir: Path,
|
||||
confidence: float = 0.5, iou_threshold: float = 0.45,
|
||||
crop_mode: str = "bbox", target_size: Tuple[int, int] = None,
|
||||
padding: int = 10, card_counter: int = 0) -> Dict[str, Any]:
|
||||
"""
|
||||
Process one image: detect ID cards, crop them, and save the crops
|
||||
|
||||
Args:
|
||||
image_path: Path to the input image
|
||||
output_dir: Output directory
|
||||
confidence: Confidence threshold
|
||||
iou_threshold: IoU threshold
|
||||
crop_mode: Crop mode ("bbox", "square", "aspect_ratio")
|
||||
target_size: Target size (width, height), or None
|
||||
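padding: Extra padding (in pixels) around the bbox
card_counter: Number of cards saved so far; output files are numbered from card_counter + 1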
|
||||
|
||||
Returns:
|
||||
Dictionary containing the processing results
|
||||
"""
|
||||
image_path = Path(image_path)
|
||||
if not image_path.exists():
|
||||
self.logger.error(f"Image not found: {image_path}")
|
||||
return {}
|
||||
|
||||
# Load the image
|
||||
image = load_image(str(image_path))
|
||||
if image is None:
|
||||
self.logger.error(f"Failed to load image: {image_path}")
|
||||
return {}
|
||||
|
||||
# Detect ID cards
|
||||
detections = self.detect_id_cards(image, confidence, float(iou_threshold))
|
||||
|
||||
if not detections:
|
||||
self.logger.warning(f"No ID cards detected in: {image_path}")
|
||||
return {
|
||||
'image_path': str(image_path),
|
||||
'detections': [],
|
||||
'processed_cards': []
|
||||
}
|
||||
|
||||
# Create the output directory
|
||||
output_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
processed_cards = []
|
||||
current_card_counter = card_counter
|
||||
|
||||
for i, detection in enumerate(detections):
|
||||
# Crop the ID card with the requested options
|
||||
cropped_card = self.crop_id_card(
|
||||
image,
|
||||
detection['bbox'],
|
||||
padding=padding,
|
||||
crop_mode=crop_mode,
|
||||
target_size=target_size
|
||||
)
|
||||
|
||||
# Build a unique filename for each ID card
|
||||
current_card_counter += 1
|
||||
card_filename = f"id_card_{current_card_counter:03d}.jpg"
|
||||
card_path = output_dir / card_filename
|
||||
|
||||
# Save the cropped card image
|
||||
save_image(cropped_card, card_path)
|
||||
processed_cards.append({
|
||||
'original_path': str(card_path),
|
||||
'detection_info': detection,
|
||||
'crop_info': {
|
||||
'mode': crop_mode,
|
||||
'target_size': target_size,
|
||||
'padding': padding
|
||||
}
|
||||
})
|
||||
|
||||
result = {
|
||||
'image_path': str(image_path),
|
||||
'detections': detections,
|
||||
'processed_cards': processed_cards,
|
||||
'total_cards': len(processed_cards),
|
||||
'crop_settings': {
|
||||
'mode': crop_mode,
|
||||
'target_size': target_size,
|
||||
'padding': padding
|
||||
}
|
||||
}
|
||||
|
||||
self.logger.info(f"Processed {len(processed_cards)} cards from {image_path.name}")
|
||||
return result
|
||||
|
||||
def batch_process(self, input_dir: Union[str, Path], output_dir: Union[str, Path],
|
||||
confidence: float = 0.5, iou_threshold: float = 0.45,
|
||||
crop_mode: str = "bbox", target_size: Tuple[int, int] = None,
|
||||
padding: int = 10) -> Dict[str, Any]:
|
||||
"""
|
||||
Process a batch of images
|
||||
|
||||
Args:
|
||||
input_dir: Directory containing the input images
|
||||
output_dir: Output directory
|
||||
confidence: Confidence threshold
|
||||
iou_threshold: IoU threshold
|
||||
crop_mode: Crop mode ("bbox", "square", "aspect_ratio")
|
||||
target_size: Target size (width, height), or None
|
||||
padding: Extra padding (in pixels) around the bbox
|
||||
|
||||
Returns:
|
||||
Dictionary containing the batch processing results
|
||||
"""
|
||||
input_dir = Path(input_dir)
|
||||
output_dir = Path(output_dir)
|
||||
|
||||
if not input_dir.exists():
|
||||
self.logger.error(f"Input directory not found: {input_dir}")
|
||||
return {}
|
||||
|
||||
# Create the output directory
|
||||
output_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Find all images
|
||||
supported_formats = self.config.get('supported_formats', ['.jpg', '.jpeg', '.png', '.bmp', '.tiff'])
|
||||
# Compare suffixes case-insensitively so each file is listed exactly once
supported_suffixes = {fmt.lower() for fmt in supported_formats}
image_files = sorted(p for p in input_dir.iterdir() if p.is_file() and p.suffix.lower() in supported_suffixes)
|
||||
|
||||
if not image_files:
|
||||
self.logger.warning(f"No supported images found in: {input_dir}")
|
||||
return {}
|
||||
|
||||
self.logger.info(f"Found {len(image_files)} images to process")
|
||||
|
||||
results = {}
|
||||
total_cards = 0
|
||||
global_card_counter = 0 # Counter used to build unique filenames
|
||||
|
||||
for i, image_path in enumerate(image_files):
|
||||
self.logger.info(f"Processing {i+1}/{len(image_files)}: {image_path.name}")
|
||||
|
||||
# Process the image: detect and crop only, no augmentation
|
||||
result = self.process_single_image(
|
||||
image_path,
|
||||
output_dir,
|
||||
confidence,
|
||||
iou_threshold,
|
||||
crop_mode,
|
||||
target_size,
|
||||
padding,
|
||||
global_card_counter
|
||||
)
|
||||
|
||||
# Update the counter
|
||||
global_card_counter += len(result.get('detections', []))
|
||||
|
||||
results[image_path.name] = result
|
||||
total_cards += len(result.get('detections', [])) # Number of ID cards actually detected
|
||||
|
||||
# Print progress
|
||||
print_progress(i + 1, len(image_files), f"Processed {image_path.name}")
|
||||
|
||||
# Build the summary
|
||||
summary = {
|
||||
'total_images': len(image_files),
|
||||
'total_cards_detected': total_cards,
|
||||
'images_with_cards': len([r for r in results.values() if r.get('detections')]),
|
||||
'images_without_cards': len([r for r in results.values() if not r.get('detections')]),
|
||||
'output_directory': str(output_dir),
|
||||
'crop_settings': {
|
||||
'mode': crop_mode,
|
||||
'target_size': target_size,
|
||||
'padding': padding
|
||||
},
|
||||
'results': results
|
||||
}
|
||||
|
||||
# Save the summary
|
||||
summary_path = output_dir / "processing_summary.json"
|
||||
with open(summary_path, 'w', encoding='utf-8') as f:
|
||||
json.dump(summary, f, indent=2, ensure_ascii=False)
|
||||
|
||||
self.logger.info(f"Batch processing completed. Summary saved to: {summary_path}")
|
||||
return summary
|
||||
|
||||
def get_detection_statistics(self, results: Dict[str, Any]) -> Dict[str, Any]:
|
||||
"""
|
||||
Compute statistics from the detection results
|
||||
|
||||
Args:
|
||||
results: Results returned by batch_process
|
||||
|
||||
Returns:
|
||||
Dictionary containing the statistics
|
||||
"""
|
||||
if not results:
|
||||
return {}
|
||||
|
||||
total_images = results.get('total_images', 0)
|
||||
total_cards = results.get('total_cards_detected', 0)
|
||||
images_with_cards = results.get('images_with_cards', 0)
|
||||
|
||||
# Compute confidence statistics
|
||||
all_confidences = []
|
||||
for image_result in results.get('results', {}).values():
|
||||
for detection in image_result.get('detections', []):
|
||||
all_confidences.append(detection.get('confidence', 0))
|
||||
|
||||
stats = {
|
||||
'total_images_processed': total_images,
|
||||
'total_cards_detected': total_cards,
|
||||
'images_with_cards': images_with_cards,
|
||||
'images_without_cards': total_images - images_with_cards,
|
||||
'average_cards_per_image': total_cards / total_images if total_images > 0 else 0,
|
||||
'detection_rate': images_with_cards / total_images if total_images > 0 else 0,
|
||||
'confidence_statistics': {
|
||||
'min': min(all_confidences) if all_confidences else 0,
|
||||
'max': max(all_confidences) if all_confidences else 0,
|
||||
'mean': np.mean(all_confidences) if all_confidences else 0,
|
||||
'std': np.std(all_confidences) if all_confidences else 0
|
||||
}
|
||||
}
|
||||
|
||||
return stats
|
||||
|
||||
def augment_cropped_cards(self, input_dir: Union[str, Path], output_dir: Union[str, Path],
|
||||
num_augmentations: int = 3) -> Dict[str, Any]:
|
||||
"""
|
||||
Augment all cropped ID cards in the input directory
|
||||
|
||||
Args:
|
||||
input_dir: Directory containing the cropped ID cards
|
||||
output_dir: Output directory for the augmented images
|
||||
num_augmentations: Number of augmentations per card
|
||||
|
||||
Returns:
|
||||
Dictionary containing the augmentation results
|
||||
"""
|
||||
input_dir = Path(input_dir)
|
||||
output_dir = Path(output_dir)
|
||||
|
||||
if not input_dir.exists():
|
||||
self.logger.error(f"Input directory not found: {input_dir}")
|
||||
return {}
|
||||
|
||||
# Create the output directory
|
||||
output_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Find all cropped ID cards
|
||||
card_files = list(input_dir.glob("id_card_*.jpg"))
|
||||
|
||||
if not card_files:
|
||||
self.logger.warning(f"No ID card files found in: {input_dir}")
|
||||
return {}
|
||||
|
||||
self.logger.info(f"Found {len(card_files)} ID cards to augment")
|
||||
|
||||
results = {}
|
||||
total_augmented = 0
|
||||
|
||||
for i, card_path in enumerate(card_files):
|
||||
self.logger.info(f"Augmenting {i+1}/{len(card_files)}: {card_path.name}")
|
||||
|
||||
# Load ID card
|
||||
card_image = load_image(str(card_path))
|
||||
if card_image is None:
|
||||
self.logger.error(f"Failed to load card: {card_path}")
|
||||
continue
|
||||
|
||||
# Augment card
|
||||
try:
|
||||
augmented_cards = self.data_augmentation.augment_single_image(
|
||||
card_image,
|
||||
num_augmentations=num_augmentations
|
||||
)
|
||||
|
||||
# Debug: log the number of augmented cards
|
||||
self.logger.info(f"Generated {len(augmented_cards)} augmented cards for {card_path.name}")
|
||||
|
||||
# Debug: log the config
|
||||
self.logger.info(f"DataAugmentation config: {self.data_augmentation.config}")
|
||||
|
||||
except Exception as e:
|
||||
self.logger.error(f"Error during augmentation: {e}")
|
||||
augmented_cards = []
|
||||
|
||||
# Save augmented cards
|
||||
card_results = []
|
||||
for j, aug_card in enumerate(augmented_cards):
|
||||
aug_filename = f"{card_path.stem}_aug_{j+1}.jpg"
|
||||
aug_path = output_dir / aug_filename
|
||||
save_image(aug_card, aug_path)
|
||||
|
||||
card_results.append({
|
||||
'augmented_path': str(aug_path),
|
||||
'augmentation_index': j+1
|
||||
})
|
||||
|
||||
results[card_path.name] = {
|
||||
'original_path': str(card_path),
|
||||
'augmented_cards': card_results,
|
||||
'total_augmented': len(card_results)
|
||||
}
|
||||
|
||||
total_augmented += len(card_results)
|
||||
|
||||
# Print progress
|
||||
print_progress(i + 1, len(card_files), f"Augmented {card_path.name}")
|
||||
|
||||
# Build the summary
|
||||
summary = {
|
||||
'total_cards': len(card_files),
|
||||
'total_augmented': total_augmented,
|
||||
'output_directory': str(output_dir),
|
||||
'results': results
|
||||
}
|
||||
|
||||
# Save the summary
|
||||
summary_path = output_dir / "augmentation_summary.json"
|
||||
with open(summary_path, 'w', encoding='utf-8') as f:
|
||||
json.dump(summary, f, indent=2, ensure_ascii=False)
|
||||
|
||||
self.logger.info(f"Augmentation completed. Summary saved to: {summary_path}")
|
||||
return summary
|
||||
|
||||
def load_yolo_config(self, config_path: str = None) -> Dict[str, Any]:
|
||||
"""
|
||||
Load the config from the YOLO detector
|
||||
|
||||
Args:
|
||||
config_path: Path to the config file
|
||||
|
||||
Returns:
|
||||
Config dictionary
|
||||
"""
|
||||
if config_path is None:
|
||||
# Look for the default config
|
||||
default_config_path = "src/model/ID_cards_detector/config.py"
|
||||
if os.path.exists(default_config_path):
|
||||
config_path = default_config_path
|
||||
|
||||
config = {}
|
||||
|
||||
try:
|
||||
# Import the config from the YOLO detector
|
||||
import sys
|
||||
sys.path.append(str(Path("src/model/ID_cards_detector")))
|
||||
|
||||
from config import DEFAULT_TRAINING_CONFIG, DEFAULT_INFERENCE_CONFIG
|
||||
|
||||
config.update({
|
||||
'yolo_training_config': DEFAULT_TRAINING_CONFIG,
|
||||
'yolo_inference_config': DEFAULT_INFERENCE_CONFIG,
|
||||
'detection': {
|
||||
'confidence_threshold': DEFAULT_INFERENCE_CONFIG.get('conf_threshold', 0.25),
|
||||
'iou_threshold': DEFAULT_INFERENCE_CONFIG.get('iou_threshold', 0.45),
|
||||
'padding': 10
|
||||
},
|
||||
'processing': {
|
||||
'apply_augmentation': True,
|
||||
'save_original': True,
|
||||
'num_augmentations': 3,
|
||||
'save_format': "jpg",
|
||||
'quality': 95,
|
||||
'target_size': [640, 640]
|
||||
},
|
||||
'crop_options': {
|
||||
'crop_mode': 'bbox', # bbox, square, aspect_ratio
|
||||
'target_size': None, # (width, height) or None
|
||||
'padding': 10
|
||||
}
|
||||
})
|
||||
|
||||
self.logger.info("Loaded YOLO config successfully")
|
||||
|
||||
except Exception as e:
|
||||
self.logger.warning(f"Failed to load YOLO config: {e}")
|
||||
# Fallback config
|
||||
config = {
|
||||
'detection': {
|
||||
'confidence_threshold': 0.25,
|
||||
'iou_threshold': 0.45,
|
||||
'padding': 10
|
||||
},
|
||||
'processing': {
|
||||
'apply_augmentation': True,
|
||||
'save_original': True,
|
||||
'num_augmentations': 3,
|
||||
'save_format': "jpg",
|
||||
'quality': 95,
|
||||
'target_size': [640, 640]
|
||||
},
|
||||
'crop_options': {
|
||||
'crop_mode': 'bbox',
|
||||
'target_size': None,
|
||||
'padding': 10
|
||||
}
|
||||
}
|
||||
|
||||
return config
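The three crop modes in `crop_id_card` are easier to compare when the box arithmetic is separated from the image slicing. The sketch below mirrors that arithmetic in pure Python so it can be checked without OpenCV; `compute_crop_box` is a hypothetical helper, not a function in this module, and it assumes a bbox with non-zero height.

```python
from typing import Sequence, Tuple


def compute_crop_box(bbox: Sequence[int], image_size: Tuple[int, int],
                     padding: int = 10, crop_mode: str = "bbox") -> Tuple[int, int, int, int]:
    """Return (x1, y1, x2, y2) for the given crop mode, clamped to the image."""
    x1, y1, x2, y2 = bbox
    width, height = image_size

    # Padding is applied first, for every mode, and clamped to the image.
    x1 = max(0, x1 - padding)
    y1 = max(0, y1 - padding)
    x2 = min(width, x2 + padding)
    y2 = min(height, y2 + padding)

    center_x = (x1 + x2) // 2
    center_y = (y1 + y2) // 2

    if crop_mode == "square":
        # Side length is the longer edge of the padded box.
        half = max(x2 - x1, y2 - y1) // 2
        x1, x2 = max(0, center_x - half), min(width, center_x + half)
        y1, y2 = max(0, center_y - half), min(height, center_y + half)
    elif crop_mode == "aspect_ratio":
        target_ratio = 3 / 4  # width / height, as in the module above
        box_w, box_h = x2 - x1, y2 - y1
        if box_w / box_h > target_ratio:
            # Too wide: keep the height and narrow the width.
            half_w = int(box_h * target_ratio) // 2
            x1, x2 = max(0, center_x - half_w), min(width, center_x + half_w)
        else:
            # Too tall: keep the width and shorten the height.
            half_h = int(box_w / target_ratio) // 2
            y1, y2 = max(0, center_y - half_h), min(height, center_y + half_h)
    # Any other mode falls back to the padded bbox.
    return x1, y1, x2, y2
```

One consequence of clamping after the square is computed: near an image border the result is no longer square. For example, a tall box touching the left edge of a narrow image comes back as wide as the image allows and no wider.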
|
@@ -41,14 +41,11 @@ def load_image(image_path: Path, target_size: Tuple[int, int] = None) -> Optiona
|
||||
image = cv2.imread(str(image_path))
|
||||
if image is None:
|
||||
return None
|
||||
|
||||
# Convert BGR to RGB
|
||||
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
|
||||
|
||||
# Resize if target_size is provided
|
||||
if target_size:
|
||||
image = cv2.resize(image, target_size, interpolation=cv2.INTER_AREA)
|
||||
|
||||
return image
|
||||
except Exception as e:
|
||||
print(f"Error loading image {image_path}: {e}")
|
||||
|