combine augment

README.md
@@ -1,10 +1,24 @@
# ID Card Data Augmentation Pipeline

A comprehensive data augmentation pipeline for ID card images with YOLO-based detection, smart sampling strategies, and advanced augmentation techniques.

## 🚀 New Features v2.0

### **Smart Data Strategy**
- **Sampling Mode** (`factor < 1.0`): Process only a percentage of the input data
- **Multiplication Mode** (`factor >= 1.0`): Multiply the total dataset size
- **Balanced Output**: Includes both raw and augmented images
- **Configurable Sampling**: Random, stratified, or uniform selection

### **Enhanced Augmentation**
- **Random Method Combination**: Mix and match augmentation techniques
- **Method Probability Weights**: Control how often each augmentation is used
- **Raw Image Preservation**: Always includes the original processed images
- **Flexible Processing Modes**: Individual, sequential, or random combination

## 🎯 Key Features

### **YOLO-based ID Card Detection**
- Automatic detection and cropping of ID cards from large images
@@ -17,15 +31,17 @@ A comprehensive data augmentation pipeline for ID card images with YOLO-based de
- **Random Cropping**: Simulates partially visible cards
- **Noise Addition**: Simulates worn-out cards
- **Partial Blockage**: Simulates occluded card details
- **Blurring**: Simulates motion blur while keeping readability
- **Brightness/Contrast**: Mimics different lighting conditions
- **Color Jittering**: HSV adjustments for color variations
- **Perspective Transform**: Simulates viewing-angle changes
- **Grayscale Conversion**: Final preprocessing step for all images

### **Flexible Configuration**
- YAML-based configuration system
- Command-line argument overrides
- Smart data strategy configuration
- Comprehensive logging and statistics

## 📋 Requirements
@@ -44,6 +60,7 @@ pip install -r requirements.txt
- `Pillow>=8.3.0`
- `PyYAML>=5.4.0`
- `ultralytics>=8.0.0` (for YOLO models)
- `torch>=1.12.0` (for GPU acceleration)

## 🛠️ Installation
@@ -69,115 +86,80 @@ data/weights/id_cards_yolov8n.pt
### **Basic Usage**

```bash
# Run with default configuration (3x multiplication)
python main.py

# Run with sampling mode (30% of the input data)
python main.py   # set multiplication_factor: 0.3 in the config

# Run with ID card detection enabled
python main.py --enable-id-detection

# Run with custom input/output directories
python main.py --input-dir "path/to/input" --output-dir "path/to/output"
```

### **Data Strategy Examples**

#### **Sampling Mode** (factor < 1.0)

```yaml
data_strategy:
  multiplication_factor: 0.3    # Process 30% of input images
  sampling:
    method: "random"            # random, stratified, uniform
    preserve_distribution: true
```

- Input: 100 images → select 30 → output: 100 images total
- Each selected image generates ~3-4 versions (including the raw one)

#### **Multiplication Mode** (factor >= 1.0)

```yaml
data_strategy:
  multiplication_factor: 3.0    # 3x dataset size
```

- Input: 100 images → process all → output: 300 images total
- Each image generates 3 versions (1 raw + 2 augmented)
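The arithmetic behind the two modes can be summarized in a few lines. `plan_outputs` is a hypothetical helper for illustration, not part of the pipeline's API:

```python
def plan_outputs(num_input: int, factor: float) -> tuple:
    """Return (images selected for processing, target output total).

    Sampling mode (factor < 1.0): process only a fraction of the inputs,
    but keep the original dataset size as the output target.
    Multiplication mode (factor >= 1.0): process everything and grow the
    dataset by the given factor.
    """
    if factor < 1.0:
        selected = max(1, round(num_input * factor))
        target_total = num_input
    else:
        selected = num_input
        target_total = round(num_input * factor)
    return selected, target_total

print(plan_outputs(100, 0.3))  # sampling: (30, 100)
print(plan_outputs(100, 3.0))  # multiplication: (100, 300)
```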
### **Augmentation Strategy**

```yaml
augmentation:
  strategy:
    mode: "random_combine"   # random_combine, sequential, individual
    min_methods: 2           # Min augmentation methods per image
    max_methods: 4           # Max augmentation methods per image

  methods:
    rotation:
      enabled: true
      probability: 0.8       # 80% chance to be selected
      angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]

    random_cropping:
      enabled: true
      probability: 0.7
      ratio_range: [0.7, 1.0]

    # ... other methods with probabilities
```
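A minimal sketch of what `random_combine` implies (the real selection logic lives in the `DataAugmentation` class; this helper and its signature are illustrative): draw between `min_methods` and `max_methods` distinct methods, weighting each draw by the method's `probability`.

```python
import random

def pick_methods(methods: dict, min_methods: int = 2, max_methods: int = 4,
                 rng: random.Random = None) -> list:
    """Weighted sampling without replacement over the enabled methods."""
    rng = rng or random.Random()
    pool = {name: cfg.get("probability", 1.0)
            for name, cfg in methods.items() if cfg.get("enabled", True)}
    k = min(rng.randint(min_methods, max_methods), len(pool))
    chosen = []
    for _ in range(k):
        names = list(pool)
        weights = [pool[n] for n in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.append(pick)
        del pool[pick]  # no repeats unless allow_duplicates is set
    return chosen

methods = {
    "rotation": {"enabled": True, "probability": 0.8},
    "random_cropping": {"enabled": True, "probability": 0.7},
    "random_noise": {"enabled": True, "probability": 0.6},
    "blurring": {"enabled": False, "probability": 0.6},  # disabled: never picked
}
print(pick_methods(methods, rng=random.Random(42)))
```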
## 🔄 Workflow

### **Smart Processing Pipeline**

#### **Step 1: Data Selection**
- **Sampling Mode**: Randomly select a subset of the input images
- **Multiplication Mode**: Process all input images
- **Stratified Sampling**: Preserve the file type distribution

#### **Step 2: ID Card Detection** (Optional)
When `id_card_detection.enabled: true`:
1. **YOLO Detection**: Locate ID cards in large images
2. **Cropping**: Extract individual ID cards with padding
3. **Output**: Cropped ID cards saved to `out/processed/`

#### **Step 3: Smart Augmentation**
1. **Raw Processing**: Always include the original (resized + grayscale)
2. **Random Combination**: Select 2-4 augmentation methods at random
3. **Method Application**: Apply the selected methods with probability weights
4. **Final Processing**: Grayscale conversion for all outputs
## 📊 Output Structure
@@ -187,103 +169,144 @@ output_directory/
│   ├── id_card_001.jpg
│   ├── id_card_002.jpg
│   └── processing_summary.json
├── im1_raw_001.jpg              # Raw processed images
├── im1_aug_001.jpg              # Augmented images (random combinations)
├── im1_aug_002.jpg
├── im2_raw_001.jpg
├── im2_aug_001.jpg
└── processing_summary.json
```

### **File Naming Convention**
- `{basename}_raw_001.jpg`: the original image (resized + grayscale)
- `{basename}_aug_001.jpg`: augmented version 1 (random methods)
- `{basename}_aug_002.jpg`: augmented version 2 (different methods)
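The convention above is mechanical to generate; a sketch (the project's own `create_augmented_filename` in `utils.py` may differ in detail):

```python
def output_name(basename: str, kind: str, index: int, ext: str = "jpg") -> str:
    """Build names like im1_raw_001.jpg or im1_aug_002.jpg."""
    return f"{basename}_{kind}_{index:03d}.{ext}"

print(output_name("im1", "raw", 1))  # im1_raw_001.jpg
print(output_name("im1", "aug", 2))  # im1_aug_002.jpg
```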
## 🎯 Use Cases

### **Dataset Expansion**

```yaml
# Triple your dataset size with balanced augmentation
data_strategy:
  multiplication_factor: 3.0
```

### **Smart Sampling for Large Datasets**

```yaml
# Process only 20% of the inputs while keeping the original dataset size
data_strategy:
  multiplication_factor: 0.2
  sampling:
    method: "stratified"   # Preserve the file type distribution
```
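A sketch of what `method: "stratified"` with `preserve_distribution: true` implies: group the inputs (here by file extension, one plausible grouping key) and sample every group at the same rate. The function name and grouping are illustrative, not the pipeline's actual implementation.

```python
import random
from collections import defaultdict

def stratified_sample(paths, factor, seed=None):
    """Sample ~factor of the files while preserving the per-extension mix."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for p in paths:
        groups[p.rsplit(".", 1)[-1].lower()].append(p)
    sample = []
    for files in groups.values():
        k = min(len(files), max(1, round(len(files) * factor)))
        sample.extend(rng.sample(files, k))
    return sample

files = [f"card_{i}.jpg" for i in range(80)] + [f"scan_{i}.png" for i in range(20)]
picked = stratified_sample(files, 0.2, seed=0)
print(len(picked))  # 20 files: 16 jpg + 4 png, same 80/20 mix as the input
```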
### **Quality Control**

```bash
# Preview results before full processing
python main.py --preview
```
## ⚙️ Advanced Configuration

### **Augmentation Strategy Modes**

#### **Random Combination** (Recommended)

```yaml
augmentation:
  strategy:
    mode: "random_combine"
    min_methods: 2
    max_methods: 4
```

Each image gets 2-4 randomly selected augmentation methods.

#### **Sequential Application**

```yaml
augmentation:
  strategy:
    mode: "sequential"
```

All enabled methods are applied to each image in sequence.

#### **Individual Methods**

```yaml
augmentation:
  strategy:
    mode: "individual"
```

Legacy mode - each method creates separate output images.

### **Method Probability Tuning**

```yaml
methods:
  rotation:
    probability: 0.9          # High chance - common transformation
  perspective:
    probability: 0.2          # Low chance - subtle effect
  partial_blockage:
    probability: 0.3          # Medium chance - specific use case
```
## 📊 Performance Statistics

The system provides detailed statistics in its JSON summary:

```json
{
  "input_images": 100,
  "selected_images": 30,
  "target_total": 100,
  "actual_generated": 98,
  "multiplication_factor": 0.3,
  "mode": "sampling",
  "efficiency": 0.98
}
```

In sampling mode, `selected_images` is the number of inputs chosen for processing; `efficiency` is the share of the target total actually generated (98% here).
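The `efficiency` field is simply generated over target. A hedged reconstruction of how these numbers relate (field names taken from the JSON above; the computation itself is an assumption):

```python
def efficiency(stats: dict) -> float:
    """Fraction of the target output count actually produced."""
    return stats["actual_generated"] / stats["target_total"]

stats = {"input_images": 100, "selected_images": 30,
         "target_total": 100, "actual_generated": 98}
print(f"{efficiency(stats):.0%}")  # 98%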
## 🔧 Troubleshooting

### **Common Issues**

1. **Low efficiency in sampling mode**
   - Increase `min_methods` or adjust `target_size`
   - Check which augmentation methods are enabled

2. **Memory issues with large datasets**
   - Use sampling mode with a lower factor
   - Reduce the `target_size` resolution
   - Enable `memory_efficient` mode

3. **Inconsistent augmentation results**
   - Set `random_seed` for reproducibility
   - Adjust method probabilities
   - Check the `min_methods`/`max_methods` balance

### **Performance Tips**

- **Sampling Mode**: Use for large datasets (>1000 images)
- **GPU Acceleration**: Enable for YOLO detection
- **Batch Processing**: Process in chunks for memory efficiency
- **Probability Tuning**: Use higher probabilities for stable methods
## 📈 Benchmarks

### **Processing Speed**
- **Direct Mode**: ~2-3 images/second
- **YOLO + Augmentation**: ~1-2 images/second
- **Memory Usage**: ~2-4 GB for 1000 images

### **Output Quality**
- **Raw Images**: 100% preserved quality
- **Augmented Images**: Balanced realism vs. diversity
- **Grayscale Conversion**: Consistent preprocessing
## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License
@@ -294,6 +317,7 @@ This project is licensed under the MIT License - see the LICENSE file for detail
- **YOLOv8**: Ultralytics for the detection framework
- **OpenCV**: Computer vision operations
- **NumPy**: Numerical computations
- **PyTorch**: Deep learning backend

---

config/config.yaml
@@ -1,5 +1,5 @@
# ID Card Data Augmentation Configuration v2.0
# Enhanced configuration with smart sampling, multiplication, and random method combination

# Paths configuration
paths:
@@ -7,72 +7,123 @@ paths:
  output_dir: "out1"
  log_file: "logs/data_augmentation.log"

# Data sampling and multiplication strategy
data_strategy:
  # Multiplication/sampling factor:
  # - If < 1.0 (e.g. 0.3): randomly sample 30% of the input data to augment
  # - If >= 1.0 (e.g. 2.0, 3.0): multiply the dataset size by 2x, 3x, etc.
  multiplication_factor: 0.3

  # Random seed for reproducibility (null = random each run)
  random_seed: null

  # Sampling strategy for factor < 1.0
  sampling:
    method: "random"              # random, stratified, uniform
    preserve_distribution: true   # Maintain the file type distribution

# ID card detection configuration
id_card_detection:
  enabled: false                                  # Enable/disable YOLO detection and cropping
  model_path: "data/weights/id_cards_yolov8n.pt"  # Path to the YOLO model
  confidence_threshold: 0.25                      # Detection confidence threshold
  iou_threshold: 0.45                             # IoU threshold for NMS
  padding: 10                                     # Extra padding around the bounding box
  crop_mode: "bbox"                               # Cropping mode: bbox, square, aspect_ratio
  target_size: null                               # Target size (width, height) or null
  save_original_crops: true                       # Save the original cropped images

# Augmentation strategy - random combination of methods
augmentation:
  # Strategy for combining augmentation methods
  strategy:
    mode: "random_combine"    # random_combine, sequential, individual
    min_methods: 2            # Minimum methods applied per image
    max_methods: 4            # Maximum methods applied per image
    allow_duplicates: false   # Allow the same method multiple times with different params

  # Available augmentation methods with selection probabilities
  methods:
    # Geometric transformations
    rotation:
      enabled: true
      probability: 0.8        # Selection probability for this method
      angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]

    # Random cropping to simulate partially visible ID cards
    random_cropping:
      enabled: true
      probability: 0.7
      ratio_range: [0.7, 1.0]

    # Random noise to simulate worn-out ID cards
    random_noise:
      enabled: true
      probability: 0.6
      mean_range: [0.0, 0.7]
      variance_range: [0.0, 0.1]

    # Partial blockage to simulate occluded card details
    partial_blockage:
      enabled: true
      probability: 0.5
      num_occlusions_range: [1, 100]
      coverage_range: [0.0, 0.25]
      variance_range: [0.0, 0.1]

    # Blurring to simulate motion blur while keeping readability
    blurring:
      enabled: true
      probability: 0.6
      kernel_ratio_range: [0.0, 0.0084]

    # Brightness and contrast adjustment for lighting variations
    brightness_contrast:
      enabled: true
      probability: 0.7
      alpha_range: [0.4, 3.0]
      beta_range: [1, 100]

    # Color space transformations
    color_jitter:
      enabled: true
      probability: 0.4
      brightness_range: [0.8, 1.2]
      contrast_range: [0.8, 1.2]
      saturation_range: [0.8, 1.2]
      hue_range: [-0.1, 0.1]

    # Perspective transformation for viewing-angle simulation
    perspective:
      enabled: false
      probability: 0.3
      distortion_scale: 0.2

  # Final processing (always applied to all outputs)
  final_processing:
    # Grayscale transformation as the final preprocessing step
    grayscale:
      enabled: true
      probability: 1.0   # Always applied to ensure consistency

    # Quality enhancement (future feature)
    quality_enhancement:
      enabled: false
      sharpen: 0.1
      denoise: false
# Processing configuration
processing:
  target_size: [640, 640]    # [width, height] - target resolution
  batch_size: 32
  save_format: "jpg"
  quality: 95

  # Advanced processing options
  preserve_original: false   # Whether to save original images
  parallel_processing: true  # Enable parallel processing
  memory_efficient: true     # Optimize memory usage
# Supported image formats
supported_formats:
  - ".jpg"
@@ -83,7 +134,7 @@ supported_formats:
# Logging configuration
logging:
  level: "INFO"   # Available levels: DEBUG, INFO, WARNING, ERROR
  format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
  handlers:
    - type: "file"
@@ -92,7 +143,7 @@ logging:
# Performance settings
performance:
  num_workers: 4       # Number of parallel workers
  prefetch_factor: 2   # Data prefetching factor
  pin_memory: true     # Pin memory for GPU transfer
  use_gpu: false       # Enable GPU acceleration
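Command-line arguments take precedence over these file values. A sketch of the merge, assuming a hypothetical `merge_overrides` helper (the project's actual ConfigManager may implement this differently):

```python
def merge_overrides(config: dict, overrides: dict) -> dict:
    """Shallow-merge non-None CLI overrides over config file values."""
    merged = dict(config)
    merged.update({k: v for k, v in overrides.items() if v is not None})
    return merged

cfg = {"target_size": [640, 640], "num_workers": 4, "use_gpu": False}
cli = {"num_workers": 8, "use_gpu": None}  # --num-workers 8; --use-gpu not passed
print(merge_overrides(cfg, cli))
# {'target_size': [640, 640], 'num_workers': 8, 'use_gpu': False}
```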
main.py
@@ -214,11 +214,11 @@ def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, An
    else:
        print("⚠️ No ID cards detected, proceeding with normal augmentation")

    # Normal augmentation (fallback) with the new logic
    augmented_paths = augmenter.augment_image_file(
        image_files[0],
        output_dir,
        num_target_images=3
    )

    if augmented_paths:
@@ -270,6 +270,7 @@ def main():
    processing_config = config_manager.get_processing_config()
    augmentation_config = config_manager.get_augmentation_config()
    logging_config = config_manager.get_logging_config()
    data_strategy_config = config.get("data_strategy", {})

    # Setup logging
    logger = setup_logging(logging_config.get("level", "INFO"))
@@ -324,10 +325,20 @@ def main():
        logger.error(f"No images found in {input_dir}")
        sys.exit(1)

    # Get data strategy parameters
    multiplication_factor = data_strategy_config.get("multiplication_factor", 3.0)
    random_seed = data_strategy_config.get("random_seed")

    logger.info(f"Found {len(image_files)} images to process")
    logger.info(f"Output directory: {output_dir}")
    logger.info(f"Data strategy: multiplication_factor = {multiplication_factor}")
    if multiplication_factor < 1.0:
        logger.info(f"SAMPLING MODE: Will process {multiplication_factor*100:.1f}% of input images")
    else:
        logger.info(f"MULTIPLICATION MODE: Target {multiplication_factor}x dataset size")
    logger.info(f"Target size: {processing_config.get('target_size', [224, 224])}")
    if random_seed is not None:  # explicit None check so a seed of 0 is still logged
        logger.info(f"Random seed: {random_seed}")

    # Process with ID detection if enabled
    if id_detection_config.get('enabled', False):
@@ -360,24 +371,52 @@ def main():
            target_size=id_detection_config.get('target_size'),
            padding=id_detection_config.get('padding', 10)
        )

        # Step 2: augment the cropped cards with the new strategy
        logger.info("Step 2: Augment cropped ID cards with smart strategy...")
        augmenter = DataAugmentation(augmentation_config)

        # Pass the full config so the augmenter can access data_strategy
        augmenter.config.update({"data_strategy": data_strategy_config})

        augment_results = augmenter.batch_augment(
            processed_dir,
            output_dir,
            multiplication_factor=multiplication_factor,
            random_seed=random_seed
        )
    else:
        # Augment the original images directly with the new strategy
        logger.info("Starting smart batch augmentation (direct augmentation)...")
        augmenter = DataAugmentation(augmentation_config)

        # Pass the full config so the augmenter can access data_strategy
        augmenter.config.update({"data_strategy": data_strategy_config})

        augment_results = augmenter.batch_augment(
            input_dir,
            output_dir,
            multiplication_factor=multiplication_factor,
            random_seed=random_seed
        )

    # Log results (identical for both branches)
    if augment_results:
        logger.info("Augmentation Summary:")
        logger.info(f"  Input images: {augment_results.get('input_images', 0)}")
        logger.info(f"  Selected for processing: {augment_results.get('selected_images', 0)}")
        logger.info(f"  Target total: {augment_results.get('target_total', 0)}")
        logger.info(f"  Actually generated: {augment_results.get('actual_generated', 0)}")
        logger.info(f"  Efficiency: {augment_results.get('efficiency', 0):.1%}")

    logger.info("Data processing completed successfully")

if __name__ == "__main__":
@@ -7,6 +7,7 @@ from pathlib import Path
from typing import List, Tuple, Optional, Dict, Any
import random
import math
import logging
from image_processor import ImageProcessor
from utils import load_image, save_image, create_augmented_filename, print_progress

@@ -22,6 +23,7 @@ class DataAugmentation:
        """
        self.config = config or {}
        self.image_processor = ImageProcessor()
        self.logger = logging.getLogger(__name__)

    def random_crop_preserve_quality(self, image: np.ndarray, crop_ratio_range: Tuple[float, float] = (0.7, 1.0)) -> np.ndarray:
        """
@@ -363,21 +365,306 @@ class DataAugmentation:

        return result

    def augment_single_image(self, image: np.ndarray, num_target_images: int = None) -> List[np.ndarray]:
        """
        Apply a random combination of augmentation methods to create diverse augmented versions

        Args:
            image: Input image
            num_target_images: Number of target augmented images to generate

        Returns:
            List of augmented images with random method combinations
        """
        num_target_images = num_target_images or 3  # Default value

        # Get strategy config
        strategy_config = self.config.get("strategy", {})
        methods_config = self.config.get("methods", {})
        final_config = self.config.get("final_processing", {})

        mode = strategy_config.get("mode", "random_combine")
        min_methods = strategy_config.get("min_methods", 2)
        max_methods = strategy_config.get("max_methods", 4)

        if mode == "random_combine":
            return self._augment_random_combine(image, num_target_images, methods_config, final_config, min_methods, max_methods)
        elif mode == "sequential":
            return self._augment_sequential(image, num_target_images, methods_config, final_config)
        elif mode == "individual":
            return self._augment_individual_legacy(image, num_target_images)
        else:
            # Fall back to the legacy method
            return self._augment_individual_legacy(image, num_target_images)
    def _augment_random_combine(self, image: np.ndarray, num_target_images: int,
                                methods_config: dict, final_config: dict,
                                min_methods: int, max_methods: int) -> List[np.ndarray]:
        """Apply a random combination of methods"""
        augmented_images = []

        # Get enabled methods with their probabilities
        available_methods = []
        for method_name, method_config in methods_config.items():
            if method_config.get("enabled", False):
                available_methods.append((method_name, method_config))

        if not available_methods:
            self.logger.warning("No augmentation methods enabled!")
            return [image.copy() for _ in range(num_target_images)]

        for i in range(num_target_images):
            # Decide the number of methods for this image
            # (clamp both bounds to the number of available methods so randint never fails)
            num_methods = random.randint(
                min(min_methods, len(available_methods)),
                min(max_methods, len(available_methods))
            )

            # Select methods based on probability
            selected_methods = self._select_methods_by_probability(available_methods, num_methods)

            # Apply selected methods in sequence
            augmented = image.copy()
            method_names = []

            for method_name, method_config in selected_methods:
                if random.random() < method_config.get("probability", 0.5):
                    augmented = self._apply_single_method(augmented, method_name, method_config)
                    method_names.append(method_name)

            # Apply final processing
            augmented = self._apply_final_processing(augmented, final_config)

            # Resize preserving aspect ratio
            target_size = self.image_processor.target_size
            if target_size:
                augmented = self.resize_preserve_aspect(augmented, target_size)

            augmented_images.append(augmented)

        return augmented_images
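The per-image loop above composes a random subset of augmentation functions. A minimal standalone sketch of that idea, using hypothetical toy "methods" on integers instead of image transforms:

```python
import random

def random_combine(value, methods, min_methods, max_methods):
    """Apply a random subset of transforms in sequence, mirroring _augment_random_combine."""
    names = list(methods)
    # Clamp both bounds to the number of available methods
    k = random.randint(min(min_methods, len(names)), min(max_methods, len(names)))
    for name in random.sample(names, k):
        value = methods[name](value)
    return value

random.seed(1)
toy = {"double": lambda v: v * 2, "inc": lambda v: v + 1, "neg": lambda v: -v}
out = random_combine(3, toy, min_methods=1, max_methods=2)
```

Because the subset and its order are random, each run can yield a different composition; only the bounds on the number of methods are deterministic.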
    def _select_methods_by_probability(self, available_methods: List[Tuple], num_methods: int) -> List[Tuple]:
        """Select methods based on their probability weights"""
        # Create weighted list
        weighted_methods = []
        for method_name, method_config in available_methods:
            probability = method_config.get("probability", 0.5)
            weighted_methods.append((method_name, method_config, probability))

        # Sort by probability (highest first) and select top candidates
        weighted_methods.sort(key=lambda x: x[2], reverse=True)

        # Use weighted random selection
        selected = []
        remaining_methods = weighted_methods.copy()

        for _ in range(num_methods):
            if not remaining_methods:
                break

            # Calculate cumulative probabilities
            total_prob = sum(method[2] for method in remaining_methods)
            if total_prob == 0:
                # If all probabilities are 0, select randomly
                selected_method = random.choice(remaining_methods)
            else:
                rand_val = random.uniform(0, total_prob)
                cumulative_prob = 0
                selected_method = None

                for method in remaining_methods:
                    cumulative_prob += method[2]
                    if rand_val <= cumulative_prob:
                        selected_method = method
                        break

                if selected_method is None:
                    selected_method = remaining_methods[-1]

            selected.append((selected_method[0], selected_method[1]))
            remaining_methods.remove(selected_method)

        return selected
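The cumulative-probability loop above is weighted sampling without replacement: each pick's chance is proportional to its weight, and picked items leave the pool. A self-contained sketch of the same algorithm (function name and toy data are illustrative, not from the project):

```python
import random

def weighted_sample_without_replacement(items, weights, k):
    """Pick up to k distinct items; selection chance is proportional to weight."""
    pool = list(zip(items, weights))
    chosen = []
    for _ in range(min(k, len(pool))):
        total = sum(w for _, w in pool)
        if total == 0:
            pick = random.choice(pool)   # all weights zero: uniform fallback
        else:
            r = random.uniform(0, total)
            cum = 0.0
            pick = pool[-1]              # guard against float round-off
            for item in pool:
                cum += item[1]
                if r <= cum:
                    pick = item
                    break
        chosen.append(pick[0])
        pool.remove(pick)                # without replacement
    return chosen

random.seed(0)
picked = weighted_sample_without_replacement(["rotate", "crop", "noise"], [0.9, 0.5, 0.1], 2)
```

Note the class method's initial sort by probability does not change the outcome distribution; the cumulative scan alone determines the weighting.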
    def _apply_single_method(self, image: np.ndarray, method_name: str, method_config: dict) -> np.ndarray:
        """Apply a single augmentation method"""
        try:
            if method_name == "rotation":
                angles = method_config.get("angles", [30, 60, 90, 120, 150, 180, 210, 240, 300, 330])
                angle = random.choice(angles)
                return self.rotate_image_preserve_quality(image, angle)

            elif method_name == "random_cropping":
                ratio_range = method_config.get("ratio_range", (0.7, 1.0))
                return self.random_crop_preserve_quality(image, ratio_range)

            elif method_name == "random_noise":
                mean_range = method_config.get("mean_range", (0.0, 0.7))
                variance_range = method_config.get("variance_range", (0.0, 0.1))
                return self.add_random_noise_preserve_quality(image, mean_range, variance_range)

            elif method_name == "partial_blockage":
                num_range = method_config.get("num_occlusions_range", (1, 100))
                coverage_range = method_config.get("coverage_range", (0.0, 0.25))
                variance_range = method_config.get("variance_range", (0.0, 0.1))
                return self.add_partial_blockage_preserve_quality(image, num_range, coverage_range, variance_range)

            elif method_name == "blurring":
                kernel_range = method_config.get("kernel_ratio_range", (0.0, 0.0084))
                return self.apply_blurring_preserve_quality(image, kernel_range)

            elif method_name == "brightness_contrast":
                alpha_range = method_config.get("alpha_range", (0.4, 3.0))
                beta_range = method_config.get("beta_range", (1, 100))
                return self.adjust_brightness_contrast_preserve_quality(image, alpha_range, beta_range)

            elif method_name == "color_jitter":
                return self.apply_color_jitter(image, method_config)

            elif method_name == "perspective":
                distortion_scale = method_config.get("distortion_scale", 0.2)
                return self.apply_perspective_transform(image, distortion_scale)

            else:
                return image

        except Exception as e:
            self.logger.error(f"Error applying method {method_name}: {e}")
            return image
    def _apply_final_processing(self, image: np.ndarray, final_config: dict) -> np.ndarray:
        """Apply final processing steps - ALWAYS applied to all outputs"""
        # Grayscale conversion - ALWAYS applied if enabled
        grayscale_config = final_config.get("grayscale", {})
        if grayscale_config.get("enabled", False):
            # Always apply grayscale, no random check
            image = self.convert_to_grayscale_preserve_quality(image)

        # Quality enhancement (future feature)
        quality_config = final_config.get("quality_enhancement", {})
        if quality_config.get("enabled", False):
            # TODO: Implement quality enhancement
            pass

        return image
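`convert_to_grayscale_preserve_quality` itself is not shown in this hunk. A common implementation — an assumption here, not the project's code — is the ITU-R BT.601 luma weighting that OpenCV's `cvtColor` also uses:

```python
import numpy as np

def to_grayscale_bt601(rgb):
    """Weighted sum of R, G, B channels using ITU-R BT.601 luma coefficients."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights   # per-pixel dot product over the channel axis
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

white = np.full((2, 2, 3), 255, dtype=np.uint8)   # pure white patch
gray = to_grayscale_bt601(white)
```

The rounding before the `uint8` cast matters: truncation would map a near-255 float to 254.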
    def apply_color_jitter(self, image: np.ndarray, config: dict) -> np.ndarray:
        """
        Apply color jittering (brightness, contrast, saturation, hue adjustments)

        Args:
            image: Input image
            config: Color jitter configuration

        Returns:
            Color-jittered image
        """
        # Get parameters
        brightness_range = config.get("brightness_range", [0.8, 1.2])
        contrast_range = config.get("contrast_range", [0.8, 1.2])
        saturation_range = config.get("saturation_range", [0.8, 1.2])
        hue_range = config.get("hue_range", [-0.1, 0.1])

        # Convert to HSV for saturation and hue adjustments
        hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV).astype(np.float32)

        # Apply brightness (adjust V channel)
        brightness_factor = random.uniform(brightness_range[0], brightness_range[1])
        hsv[:, :, 2] = np.clip(hsv[:, :, 2] * brightness_factor, 0, 255)

        # Apply saturation (adjust S channel)
        saturation_factor = random.uniform(saturation_range[0], saturation_range[1])
        hsv[:, :, 1] = np.clip(hsv[:, :, 1] * saturation_factor, 0, 255)

        # Apply hue shift (adjust H channel)
        hue_shift = random.uniform(hue_range[0], hue_range[1]) * 179  # OpenCV hue range is 0-179
        hsv[:, :, 0] = (hsv[:, :, 0] + hue_shift) % 180

        # Convert back to RGB
        result = cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2RGB)

        # Apply contrast (after converting back to RGB)
        contrast_factor = random.uniform(contrast_range[0], contrast_range[1])
        result = cv2.convertScaleAbs(result, alpha=contrast_factor, beta=0)

        return result
    def apply_perspective_transform(self, image: np.ndarray, distortion_scale: float = 0.2) -> np.ndarray:
        """
        Apply perspective transformation to simulate viewing angle changes

        Args:
            image: Input image
            distortion_scale: Scale of perspective distortion (0.0 to 1.0)

        Returns:
            Perspective-transformed image
        """
        height, width = image.shape[:2]

        # Define source points (corners of original image)
        src_points = np.float32([
            [0, 0],
            [width-1, 0],
            [width-1, height-1],
            [0, height-1]
        ])

        # Add random distortion to destination points
        max_distortion = min(width, height) * distortion_scale

        dst_points = np.float32([
            [random.uniform(0, max_distortion), random.uniform(0, max_distortion)],
            [width-1-random.uniform(0, max_distortion), random.uniform(0, max_distortion)],
            [width-1-random.uniform(0, max_distortion), height-1-random.uniform(0, max_distortion)],
            [random.uniform(0, max_distortion), height-1-random.uniform(0, max_distortion)]
        ])

        # Calculate perspective transformation matrix
        matrix = cv2.getPerspectiveTransform(src_points, dst_points)

        # Apply transformation
        result = cv2.warpPerspective(image, matrix, (width, height),
                                     borderMode=cv2.BORDER_CONSTANT,
                                     borderValue=(255, 255, 255))

        return result
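The destination-corner jitter above pulls each corner inward by at most `min(width, height) * distortion_scale`, so every corner stays inside the frame. A cv2-free sketch of just that geometry (function name is illustrative):

```python
import random

def jittered_corners(width, height, distortion_scale):
    """Pull each image corner inward by a random amount bounded by distortion_scale."""
    m = min(width, height) * distortion_scale
    r = lambda: random.uniform(0, m)
    return [
        (r(), r()),                            # top-left
        (width - 1 - r(), r()),                # top-right
        (width - 1 - r(), height - 1 - r()),   # bottom-right
        (r(), height - 1 - r()),               # bottom-left
    ]

random.seed(0)
corners = jittered_corners(200, 100, 0.2)
```

Feeding these four points as `dst_points` to `cv2.getPerspectiveTransform` yields a homography that always maps the card into the visible area.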
    def _augment_sequential(self, image: np.ndarray, num_target_images: int,
                            methods_config: dict, final_config: dict) -> List[np.ndarray]:
        """Apply methods in sequence (pipeline style)"""
        augmented_images = []

        # Get enabled methods
        enabled_methods = [
            (name, config) for name, config in methods_config.items()
            if config.get("enabled", False)
        ]

        for i in range(num_target_images):
            augmented = image.copy()

            # Apply all enabled methods in sequence
            for method_name, method_config in enabled_methods:
                if random.random() < method_config.get("probability", 0.5):
                    augmented = self._apply_single_method(augmented, method_name, method_config)

            # Apply final processing
            augmented = self._apply_final_processing(augmented, final_config)

            # Resize preserving aspect ratio
            target_size = self.image_processor.target_size
            if target_size:
                augmented = self.resize_preserve_aspect(augmented, target_size)

            augmented_images.append(augmented)

        return augmented_images
    def _augment_individual_legacy(self, image: np.ndarray, num_target_images: int) -> List[np.ndarray]:
        """Legacy individual method application (backward compatibility)"""
        # This is the old implementation, kept for backward compatibility
        augmented_images = []

        # Get old-style configuration
        rotation_config = self.config.get("rotation", {})
        cropping_config = self.config.get("random_cropping", {})
        noise_config = self.config.get("random_noise", {})
@@ -386,177 +673,272 @@ class DataAugmentation:
        blurring_config = self.config.get("blurring", {})
        brightness_contrast_config = self.config.get("brightness_contrast", {})

        # Apply individual methods (old logic)
        methods = [
            ("rotation", rotation_config, self.rotate_image_preserve_quality),
            ("cropping", cropping_config, self.random_crop_preserve_quality),
            ("noise", noise_config, self.add_random_noise_preserve_quality),
            ("blockage", blockage_config, self.add_partial_blockage_preserve_quality),
            ("blurring", blurring_config, self.apply_blurring_preserve_quality),
            ("brightness_contrast", brightness_contrast_config, self.adjust_brightness_contrast_preserve_quality)
        ]

        for method_name, method_config, method_func in methods:
            if method_config.get("enabled", False):
                for i in range(num_target_images):
                    augmented = image.copy()
                    # Apply single method with appropriate parameters
                    if method_name == "rotation":
                        angles = method_config.get("angles", [30, 60, 90, 120, 150, 180, 210, 240, 300, 330])
                        angle = random.choice(angles)
                        augmented = method_func(augmented, angle)
                    elif method_name == "cropping":
                        ratio_range = method_config.get("ratio_range", (0.7, 1.0))
                        augmented = method_func(augmented, ratio_range)
                    # Add other method parameter handling as needed

                    # Resize preserving aspect ratio
                    target_size = self.image_processor.target_size
                    if target_size:
                        augmented = self.resize_preserve_aspect(augmented, target_size)

                    augmented_images.append(augmented)

        # Apply grayscale to all images
        if grayscale_config.get("enabled", False):
            for i in range(len(augmented_images)):
                augmented_images[i] = self.convert_to_grayscale_preserve_quality(augmented_images[i])

        return augmented_images
    def augment_image_file(self, image_path: Path, output_dir: Path, num_target_images: int = None) -> List[Path]:
        """
        Augment a single image file and save results with quality preservation

        Args:
            image_path: Path to input image
            output_dir: Output directory for augmented images
            num_target_images: Number of target augmented images to generate

        Returns:
            List of paths to saved augmented images
        """
        # Load image without resizing to preserve original quality
        image = load_image(image_path, None)
        if image is None:
            return []

        # Apply augmentations
        augmented_images = self.augment_single_image(image, num_target_images)

        # Save augmented images
        saved_paths = []

        for i, aug_image in enumerate(augmented_images):
            base_name = image_path.stem
            output_filename = f"{base_name}_aug_{i+1:03d}.jpg"
            output_path = output_dir / output_filename

            if save_image(aug_image, output_path):
                saved_paths.append(output_path)

        return saved_paths
    def augment_image_file_with_raw(self, image_path: Path, output_dir: Path,
                                    num_total_versions: int = None) -> List[Path]:
        """
        Augment a single image file, including the raw/original version

        Args:
            image_path: Path to input image
            output_dir: Output directory for all image versions
            num_total_versions: Total number of versions (including raw)

        Returns:
            List of paths to saved images (raw + augmented)
        """
        # Load original image
        image = load_image(image_path, None)
        if image is None:
            return []

        if num_total_versions is None:
            num_total_versions = 1  # Default: the raw version only

        saved_paths = []
        base_name = image_path.stem

        # Always save the raw version first (resized but not augmented)
        if num_total_versions > 0:
            raw_image = image.copy()

            # Apply final processing (grayscale) but no augmentation
            final_config = self.config.get("final_processing", {})
            raw_image = self._apply_final_processing(raw_image, final_config)

            # Resize to target size
            target_size = self.image_processor.target_size
            if target_size:
                raw_image = self.resize_preserve_aspect(raw_image, target_size)

            # Save raw version
            raw_filename = f"{base_name}_raw_001.jpg"
            raw_path = output_dir / raw_filename
            if save_image(raw_image, raw_path):
                saved_paths.append(raw_path)

        # Generate augmented versions for the remaining slots
        num_augmented = max(0, num_total_versions - 1)
        if num_augmented > 0:
            augmented_images = self.augment_single_image(image, num_augmented)

            for i, aug_image in enumerate(augmented_images):
                aug_filename = f"{base_name}_aug_{i+1:03d}.jpg"
                aug_path = output_dir / aug_filename

                if save_image(aug_image, aug_path):
                    saved_paths.append(aug_path)

        return saved_paths
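The output naming convention above — one `_raw_001` copy plus zero-padded `_aug_NNN` files per input — can be sketched in isolation (helper name is hypothetical):

```python
from pathlib import Path

def planned_filenames(image_path, num_total_versions):
    """Names the pipeline would write: one raw copy, then augmented versions."""
    base = image_path.stem
    names = [f"{base}_raw_001.jpg"]
    names += [f"{base}_aug_{i+1:03d}.jpg" for i in range(max(0, num_total_versions - 1))]
    return names

names = planned_filenames(Path("cards/id_042.png"), 3)
# → ["id_042_raw_001.jpg", "id_042_aug_001.jpg", "id_042_aug_002.jpg"]
```

The `:03d` padding keeps files in generation order under lexicographic sorting for up to 999 versions per input.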
def batch_augment(self, input_dir: Path, output_dir: Path, num_augmentations: int = None) -> Dict[str, List[Path]]:
|
def batch_augment(self, input_dir: Path, output_dir: Path,
|
||||||
|
multiplication_factor: float = None, random_seed: int = None) -> Dict[str, List[Path]]:
|
||||||
"""
|
"""
|
||||||
Augment all images in a directory
|
Augment images in a directory with smart sampling and multiplication strategy
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
input_dir: Input directory containing images
|
input_dir: Input directory containing images
|
||||||
output_dir: Output directory for augmented images
|
output_dir: Output directory for augmented images
|
||||||
num_augmentations: Number of augmented versions per image
|
multiplication_factor:
|
||||||
|
- If < 1.0: Sample percentage of input data to augment
|
||||||
|
- If >= 1.0: Target multiplication factor for output data size
|
||||||
|
random_seed: Random seed for reproducibility
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
Dictionary mapping original images to their augmented versions
|
Dictionary containing results and statistics
|
||||||
"""
|
"""
|
||||||
from utils import get_image_files
|
from utils import get_image_files
|
||||||
|
|
||||||
image_files = get_image_files(input_dir)
|
# Set random seed for reproducibility
|
||||||
|
if random_seed is not None:
|
||||||
|
random.seed(random_seed)
|
||||||
|
np.random.seed(random_seed)
|
||||||
|
|
||||||
|
# Get all input images
|
||||||
|
all_image_files = get_image_files(input_dir)
|
||||||
|
if not all_image_files:
|
||||||
|
print("No images found in input directory")
|
||||||
|
return {}
|
||||||
|
|
||||||
|
# Get multiplication factor from config if not provided
|
||||||
|
if multiplication_factor is None:
|
||||||
|
data_strategy = self.config.get("data_strategy", {})
|
||||||
|
multiplication_factor = data_strategy.get("multiplication_factor", 3.0)
|
||||||
|
|
||||||
|
print(f"Found {len(all_image_files)} total images")
|
||||||
|
print(f"Multiplication factor: {multiplication_factor}")
|
||||||
|
|
||||||
|
# Determine sampling strategy
|
||||||
|
if multiplication_factor < 1.0:
|
||||||
|
# Sampling mode: Take a percentage of input data
|
||||||
|
num_selected = int(len(all_image_files) * multiplication_factor)
|
||||||
|
selected_images = self._sample_images(all_image_files, num_selected)
|
||||||
|
target_total_images = len(all_image_files) # Keep original dataset size
|
||||||
|
images_per_input = max(1, target_total_images // len(selected_images))
|
||||||
|
print(f"SAMPLING MODE: Selected {len(selected_images)} images ({multiplication_factor*100:.1f}%)")
|
||||||
|
print(f"Target: {target_total_images} total images, {images_per_input} per selected image")
|
||||||
|
else:
|
||||||
|
# Multiplication mode: Multiply dataset size
|
||||||
|
selected_images = all_image_files
|
||||||
|
target_total_images = int(len(all_image_files) * multiplication_factor)
|
||||||
|
images_per_input = max(1, target_total_images // len(selected_images))
|
||||||
|
print(f"MULTIPLICATION MODE: Processing all {len(selected_images)} images")
|
||||||
|
print(f"Target: {target_total_images} total images ({multiplication_factor}x original), {images_per_input} per image")
|
||||||
|
|
||||||
|
        # Process selected images
        results = {}
        total_generated = 0

        for i, image_path in enumerate(selected_images):
            print_progress(i + 1, len(selected_images), f"Processing {image_path.name}")

            # Calculate number of versions for this image (including raw)
            remaining_images = target_total_images - total_generated
            remaining_inputs = len(selected_images) - i
            total_versions_needed = min(images_per_input, remaining_images)

            # Always include the raw image, then augmented ones
            augmented_paths = self.augment_image_file_with_raw(
                image_path, output_dir, total_versions_needed
            )

            if augmented_paths:
                results[str(image_path)] = augmented_paths
                total_generated += len(augmented_paths)
        # Generate summary
        summary = {
            "input_images": len(all_image_files),
            "selected_images": len(selected_images),
            "target_total": target_total_images,
            "actual_generated": total_generated,
            "multiplication_factor": multiplication_factor,
            "mode": "sampling" if multiplication_factor < 1.0 else "multiplication",
            "results": results,
            "efficiency": total_generated / target_total_images if target_total_images > 0 else 0,
        }

        print(f"\n✅ Augmentation completed!")
        print(f"Generated {total_generated} images from {len(selected_images)} selected images")
        print(f"Target vs Actual: {target_total_images} → {total_generated} ({summary['efficiency']:.1%} efficiency)")

        return summary
    def _sample_images(self, image_files: List[Path], num_selected: int) -> List[Path]:
        """Sample images from the input list based on the configured strategy"""
        data_strategy = self.config.get("data_strategy", {})
        sampling_config = data_strategy.get("sampling", {})

        method = sampling_config.get("method", "random")
        preserve_distribution = sampling_config.get("preserve_distribution", True)

        if method == "random":
            # Simple random sampling
            return random.sample(image_files, min(num_selected, len(image_files)))

        elif method == "stratified" and preserve_distribution:
            # Stratified sampling by file extension
            extension_groups = {}
            for img_file in image_files:
                ext = img_file.suffix.lower()
                if ext not in extension_groups:
                    extension_groups[ext] = []
                extension_groups[ext].append(img_file)

            selected = []
            for ext, files in extension_groups.items():
                # Sample proportionally from each extension group
                group_size = max(1, int(num_selected * len(files) / len(image_files)))
                group_selected = random.sample(files, min(group_size, len(files)))
                selected.extend(group_selected)

            # If we have too few, top up with random picks from the remainder
            if len(selected) < num_selected:
                remaining = [f for f in image_files if f not in selected]
                additional = random.sample(
                    remaining, min(num_selected - len(selected), len(remaining))
                )
                selected.extend(additional)

            return selected[:num_selected]

        elif method == "uniform":
            # Uniform sampling - evenly spaced across the dataset
            if num_selected >= len(image_files):
                return image_files

            step = len(image_files) / num_selected
            indices = [int(i * step) for i in range(num_selected)]
            return [image_files[i] for i in indices]

        else:
            # Unknown method: fall back to random sampling
            return random.sample(image_files, min(num_selected, len(image_files)))
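The `uniform` and `stratified` branches each come down to a line or two of index/quota math. A sketch with hypothetical helper names (`uniform_indices`, `stratified_counts` are illustrations, not pipeline APIs):

```python
def uniform_indices(total: int, num_selected: int) -> list:
    """Evenly spaced indices, as in the 'uniform' branch."""
    step = total / num_selected
    return [int(i * step) for i in range(num_selected)]

def stratified_counts(group_sizes: dict, num_selected: int) -> dict:
    """Per-group quota, as in the 'stratified' branch (at least 1 per group)."""
    total = sum(group_sizes.values())
    return {k: max(1, int(num_selected * n / total)) for k, n in group_sizes.items()}

print(uniform_indices(10, 4))                           # [0, 2, 5, 7]
print(stratified_counts({".jpg": 80, ".png": 20}, 10))  # {'.jpg': 8, '.png': 2}
```

Because each quota is floored (and clamped to at least 1), stratified quotas can undershoot `num_selected`, which is why `_sample_images` tops up with random picks afterwards.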
    def get_augmentation_summary(self, results: Dict[str, List[Path]]) -> Dict[str, Any]:
        """