update augment + YOLO pipeline

2025-08-06 20:52:39 +07:00
parent 4ee14f17d3
commit 51d3a66cc4
9 changed files with 989 additions and 407 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -16,4 +16,11 @@
 *.pt
 *.ipynb
 *.pyc
-*.log
+*.log
 !docs/
 !docs/**/*.png
 !docs/**/*.jpg
 !docs/**/*.jpeg
 !docs/**/*.gif
 !docs/**/*.svg
--- a/README.md
+++ b/README.md
@@ -1,132 +1,148 @@
-# ID Cards Data Augmentation Tool
+# ID Card Data Augmentation Pipeline
-A comprehensive data augmentation tool specifically designed for ID card images, implementing 7 different augmentation techniques to simulate real-world scenarios.
+A comprehensive data augmentation pipeline for ID card images with YOLO-based detection and advanced augmentation techniques.
-## 🎯 Overview
+![Pipeline Overview](docs/images/yolov8_pipeline.png)
-This tool provides data augmentation capabilities for ID card images, implementing various transformation techniques that mimic real-world conditions such as worn-out cards, partial occlusion, different lighting conditions, and more.
+## 🚀 Features
-## ✨ Features
+### **YOLO-based ID Card Detection**
 - Automatic detection and cropping of ID cards from large images
 - Configurable confidence and IoU thresholds
 - Multiple cropping modes (bbox, square, aspect_ratio)
 - Padding and target size customization
-### 7 Augmentation Techniques
+### **Advanced Data Augmentation**
 - **Geometric Transformations**: Rotation with multiple angles
 - **Random Cropping**: Simulates partially visible cards
 - **Noise Addition**: Simulates worn-out cards
 - **Partial Blockage**: Simulates occluded card details
 - **Blurring**: Simulates blurred but readable images
 - **Brightness/Contrast**: Mimics different lighting conditions
 - **Grayscale Conversion**: Final preprocessing step for all images
-1. **Rotation** - Simulates cards at different angles
+### **Flexible Configuration**
-2. **Random Cropping** - Simulates partially visible cards
+- YAML-based configuration system
-3. **Random Noise** - Simulates worn-out cards
+- Command-line argument overrides
-4. **Horizontal Blockage** - Simulates occluded card details
+- Environment-specific settings
-5. **Grayscale Transformation** - Simulates Xerox/scan copies
+- Comprehensive logging
 6. **Blurring** - Simulates blurred but readable cards
 7. **Brightness & Contrast** - Simulates different lighting conditions
-### Key Features
+## 📋 Requirements
- **Separate Methods**: Each augmentation technique is applied independently
+```bash
- **Quality Preservation**: Maintains image quality with white background preservation
+# Python 3.8+
- **OpenCV Integration**: Uses OpenCV functions for reliable image processing
+conda create -n gpu python=3.8
- **Configurable**: Easy configuration through YAML files
+conda activate gpu
 - **Progress Tracking**: Real-time progress monitoring
 - **Batch Processing**: Process multiple images efficiently
-## 🚀 Installation
+# Install dependencies
 pip install -r requirements.txt
 ```
-### Prerequisites
+### Dependencies
 - `opencv-python>=4.5.0`
 - `numpy>=1.21.0`
 - `Pillow>=8.3.0`
 - `PyYAML>=5.4.0`
 - `ultralytics>=8.0.0` (for YOLO models)
- Python 3.7+
+## 🛠️ Installation
 - OpenCV
 - NumPy
 - PyYAML
 - PIL (Pillow)
-### Setup
+1. **Clone the repository**
 1. **Clone the repository**:
 ```bash
 git clone <repository-url>
 cd IDcardsGenerator
 ```
-2. **Install dependencies**:
+2. **Install dependencies**
 ```bash
-pip install opencv-python numpy pyyaml pillow
+pip install -r requirements.txt
 ```
-3. **Activate conda environment** (if using GPU):
+3. **Prepare YOLO model** (optional)
 ```bash
-conda activate gpu
+# Place your trained YOLO model at:
 data/weights/id_cards_yolov8n.pt
 ```
-## 📁 Project Structure
+## 📖 Usage
-```
+### **Basic Usage**
-IDcardsGenerator/
+
-├── config/
+```bash
-│   └── config.yaml          # Main configuration file
+# Run with default configuration
-├── data/
+python main.py
-│   └── IDcards/
+
-│       └── processed/       # Input images directory
+# Run with ID card detection enabled
-├── src/
+python main.py --enable-id-detection
-│   ├── data_augmentation.py # Core augmentation logic
+
-│   ├── config_manager.py    # Configuration management
+# Run with custom input/output directories
-│   ├── image_processor.py   # Image processing utilities
+python main.py --input-dir "path/to/input" --output-dir "path/to/output"
 │   └── utils.py             # Utility functions
 ├── logs/                    # Log files
 ├── out/                     # Output directory
 └── main.py                  # Main script
 ```
-## ⚙️ Configuration
+### **Configuration Options**
-### Main Configuration (`config/config.yaml`)
+#### **ID Card Detection**
 ```bash
 # Enable detection with custom model
 python main.py --enable-id-detection --model-path "path/to/model.pt"
 # Adjust detection parameters
 python main.py --enable-id-detection --confidence 0.3 --crop-mode square
 # Set target size for cropped cards
 python main.py --enable-id-detection --crop-target-size "640,640"
 ```
 #### **Data Augmentation**
 ```bash
 # Customize augmentation parameters
 python main.py --num-augmentations 5 --target-size "512,512"
 # Preview augmentation results
 python main.py --preview
 ```
 ### **Configuration File**
 Edit `config/config.yaml` for persistent settings:
 ```yaml
-# Data augmentation parameters
+# ID Card Detection
 id_card_detection:
  enabled: false  # Enable/disable YOLO detection
  model_path: "data/weights/id_cards_yolov8n.pt"
  confidence_threshold: 0.25
  iou_threshold: 0.45
  padding: 10
  crop_mode: "bbox"
  target_size: null
 # Data Augmentation
 augmentation:
  # Rotation
  rotation:
    enabled: true
    angles: [30, 60, 120, 150, 180, 210, 240, 300, 330]
    probability: 1.0
  # Random cropping
  random_cropping:
    enabled: true
    ratio_range: [0.7, 1.0]
    probability: 1.0
  # Random noise
  random_noise:
    enabled: true
    mean_range: [0.0, 0.7]
    variance_range: [0.0, 0.1]
    probability: 1.0
  # Partial blockage
  partial_blockage:
    enabled: true
    num_occlusions_range: [1, 100]
    coverage_range: [0.0, 0.25]
    variance_range: [0.0, 0.1]
    probability: 1.0
  # Grayscale transformation
  grayscale:
    enabled: true
    probability: 1.0
  # Blurring
  blurring:
    enabled: true
    kernel_ratio_range: [0.0, 0.0084]
    probability: 1.0
  # Brightness and contrast
  brightness_contrast:
    enabled: true
    alpha_range: [0.4, 3.0]
    beta_range: [1, 100]
-    probability: 1.0
+  grayscale:
    enabled: true  # Applied as final step
-# Processing configuration
+# Processing
 processing:
  target_size: [640, 640]
  num_augmentations: 3
@@ -134,156 +150,139 @@ processing:
  quality: 95
 ```
-## 🎮 Usage
+## 🔄 Workflow
-### Basic Usage
+### **Two-Step Processing Pipeline**
 #### **Step 1: ID Card Detection (Optional)**
 When `id_card_detection.enabled: true`:
 1. **Input**: Large images containing multiple ID cards
 2. **YOLO Detection**: Locate and detect ID cards
 3. **Cropping**: Extract individual ID cards with padding
 4. **Output**: Cropped ID cards saved to `out/processed/`
 #### **Step 2: Data Augmentation**
 1. **Input**: Original images OR cropped ID cards
 2. **Augmentation**: Apply 6 augmentation methods:
   - Rotation (9 different angles)
   - Random cropping (70-100% ratio)
   - Random noise (simulate wear)
   - Partial blockage (simulate occlusion)
   - Blurring (simulate motion blur)
   - Brightness/Contrast adjustment
 3. **Grayscale**: Convert all images to grayscale (final step)
 4. **Output**: Augmented images in main output directory
 ### **Direct Augmentation Mode**
 When `id_card_detection.enabled: false`:
 - Skips YOLO detection
 - Applies augmentation directly to input images
 - All images are converted to grayscale
 ## 📊 Output Structure
 ```
 output_directory/
 ├── processed/                    # Cropped ID cards (if detection enabled)
 │   ├── id_card_001.jpg
 │   ├── id_card_002.jpg
 │   └── processing_summary.json
 ├── im1__rotation_01.png         # Augmented images
 ├── im1__cropping_01.png
 ├── im1__noise_01.png
 ├── im1__blockage_01.png
 ├── im1__blurring_01.png
 ├── im1__brightness_contrast_01.png
 └── augmentation_summary.json
 ```
 ## 🎯 Use Cases
 ### **Training Data Generation**
 ```bash
-python main.py --input-dir data/IDcards/processed --output-dir out
+# Generate diverse training data
 python main.py --enable-id-detection --num-augmentations 10
 ```
-### Command Line Options
+### **Quality Control**
 ```bash
-python main.py [OPTIONS]
+# Preview results before processing
-
+python main.py --preview
 Options:
  --config CONFIG           Path to configuration file (default: config/config.yaml)
  --input-dir INPUT_DIR     Input directory containing images
  --output-dir OUTPUT_DIR   Output directory for augmented images
  --num-augmentations N     Number of augmented versions per image (default: 3)
  --target-size SIZE        Target size for images (width x height)
  --preview                 Preview augmentation on first image only
  --info                    Show information about images in input directory
  --list-presets           List available presets and exit
  --log-level LEVEL        Logging level (DEBUG, INFO, WARNING, ERROR)
 ```
-### Examples
+### **Batch Processing**
 1. **Preview augmentation**:
 ```bash
-python main.py --preview --input-dir data/IDcards/processed --output-dir test_output
+# Process large datasets
 python main.py --input-dir "large_dataset/" --output-dir "augmented_dataset/"
 ```
-2. **Show image information**:
+## ⚙️ Advanced Configuration
-```bash
+
-python main.py --info --input-dir data/IDcards/processed
+### **Custom Augmentation Parameters**
 ```yaml
 augmentation:
  rotation:
    angles: [45, 90, 135, 180, 225, 270, 315]  # Custom angles
  random_cropping:
    ratio_range: [0.8, 0.95]  # Tighter cropping
  random_noise:
    mean_range: [0.1, 0.5]    # More noise
    variance_range: [0.05, 0.15]
 ```
-3. **Custom number of augmentations**:
+### **Performance Optimization**
-```bash
+
-python main.py --input-dir data/IDcards/processed --output-dir out --num-augmentations 5
+```yaml
 performance:
  num_workers: 4
  prefetch_factor: 2
  pin_memory: true
  use_gpu: false
 ```
 4. **Custom target size**:
 ```bash
 python main.py --input-dir data/IDcards/processed --output-dir out --target-size 512x512
 ```
 ## 📊 Output
 ### File Naming Convention
 The tool creates separate files for each augmentation method:
 ```
 im1_rotation_01.png          # Rotation method
 im1_cropping_01.png          # Random cropping method
 im1_noise_01.png             # Random noise method
 im1_blockage_01.png          # Partial blockage method
 im1_grayscale_01.png         # Grayscale method
 im1_blurring_01.png          # Blurring method
 im1_brightness_contrast_01.png  # Brightness/contrast method
 ```
 ### Output Summary
 After processing, you'll see a summary like:
 ```
 ==================================================
 AUGMENTATION SUMMARY
 ==================================================
 Original images: 106
 Augmented images: 2226
 Augmentation ratio: 21.00
 Successful augmentations: 106
 Output directory: out
 ==================================================
 ```
 ## 🔧 Augmentation Techniques Details
 ### 1. Rotation
 - **Purpose**: Simulates cards at different angles
 - **Angles**: 30°, 60°, 120°, 150°, 180°, 210°, 240°, 300°, 330°
 - **Method**: OpenCV rotation with white background preservation
 ### 2. Random Cropping
 - **Purpose**: Simulates partially visible ID cards
 - **Ratio Range**: 0.7 to 1.0 (70% to 100% of original size)
 - **Method**: Random crop with white background preservation
 ### 3. Random Noise
 - **Purpose**: Simulates worn-out cards
 - **Mean Range**: 0.0 to 0.7
 - **Variance Range**: 0.0 to 0.1
 - **Method**: Gaussian noise addition
 ### 4. Horizontal Blockage
 - **Purpose**: Simulates occluded card details
 - **Lines**: 1 to 100 horizontal lines
 - **Coverage**: 0% to 25% of image area
 - **Colors**: Multiple colors to simulate various objects
 ### 5. Grayscale Transformation
 - **Purpose**: Simulates Xerox/scan copies
 - **Method**: OpenCV `cv2.cvtColor()` function
 - **Output**: 3-channel grayscale image
 ### 6. Blurring
 - **Purpose**: Simulates blurred but readable cards
 - **Kernel Ratio**: 0.0 to 0.0084
 - **Method**: OpenCV `cv2.filter2D()` with Gaussian kernel
 ### 7. Brightness & Contrast
 - **Purpose**: Simulates different lighting conditions
 - **Alpha Range**: 0.4 to 3.0 (contrast)
 - **Beta Range**: 1 to 100 (brightness)
 - **Method**: OpenCV `cv2.convertScaleAbs()`
 ## 🛠️ Development
 ### Adding New Augmentation Methods
 1. Add the method to `src/data_augmentation.py`
 2. Update configuration in `config/config.yaml`
 3. Update default config in `src/config_manager.py`
 4. Test with preview mode
 ### Code Structure
 - **`main.py`**: Entry point and command-line interface
 - **`src/data_augmentation.py`**: Core augmentation logic
 - **`src/config_manager.py`**: Configuration management
 - **`src/image_processor.py`**: Image processing utilities
 - **`src/utils.py`**: Utility functions
 ## 📝 Logging
-The tool provides comprehensive logging:
+The system provides comprehensive logging:
 - **File**: `logs/data_augmentation.log`
 - **Console**: Real-time progress updates
 - **Summary**: JSON files with processing statistics
- **File logging**: `logs/data_augmentation.log`
+### **Log Levels**
- **Console logging**: Real-time progress updates
+- `INFO`: General processing information
- **Log levels**: DEBUG, INFO, WARNING, ERROR
+- `WARNING`: Non-critical issues (e.g., no cards detected)
 - `ERROR`: Critical errors
 ## 🔧 Troubleshooting
 ### **Common Issues**
 1. **No images detected**
   - Check input directory path
   - Verify image formats (jpg, png, bmp, tiff)
   - Ensure images are not corrupted
 2. **YOLO model not found**
   - Place model file at `data/weights/id_cards_yolov8n.pt`
   - Or specify custom path with `--model-path`
 3. **Memory issues**
   - Reduce `num_augmentations`
   - Use smaller `target_size`
   - Enable GPU if available
 ### **Performance Tips**
 - **GPU Acceleration**: Set `use_gpu: true` in config
 - **Batch Processing**: Use multiple workers for large datasets
 - **Memory Management**: Process in smaller batches
 ## 🤝 Contributing
 1. Fork the repository
 2. Create a feature branch
 3. Make your changes
-4. Test thoroughly
+4. Add tests if applicable
 5. Submit a pull request
 ## 📄 License
@@ -292,18 +291,10 @@ This project is licensed under the MIT License - see the LICENSE file for detail
 ## 🙏 Acknowledgments
- OpenCV for image processing capabilities
+- **YOLOv8**: Ultralytics for the detection framework
- NumPy for numerical operations
+- **OpenCV**: Computer vision operations
- PyYAML for configuration management
+- **NumPy**: Numerical computations
 ## 📞 Support
 For issues and questions:
 1. Check the logs in `logs/data_augmentation.log`
 2. Review the configuration in `config/config.yaml`
 3. Test with preview mode first
 4. Create an issue with detailed information
 ---
-**Note**: This tool is specifically designed for ID card augmentation and may need adjustments for other image types. 
+**For questions and support, please open an issue on GitHub.** 
--- a/config/config.yaml
+++ b/config/config.yaml
@@ -7,6 +7,17 @@ paths:
  output_dir: "out1"
  log_file: "logs/data_augmentation.log"
 # ID Card Detection configuration
 id_card_detection:
  enabled: false  # Bật/tắt tính năng detect và crop ID cards
  model_path: "data/weights/id_cards_yolov8n.pt"  # Đường dẫn đến YOLO model
  confidence_threshold: 0.25  # Confidence threshold cho detection
  iou_threshold: 0.45  # IoU threshold cho NMS
  padding: 10  # Padding thêm xung quanh bbox
  crop_mode: "bbox"  # Mode cắt: bbox, square, aspect_ratio
  target_size: null  # Kích thước target (width, height) hoặc null
  save_original_crops: true  # Có lưu ảnh gốc đã crop không
 # Data augmentation parameters - ROTATION and RANDOM CROPPING
 augmentation:
  # Geometric transformations
@@ -36,11 +47,6 @@ augmentation:
    variance_range: [0.0, 0.1]  # Line thickness variance (min, max)
    probability: 1.0  # Always apply blockage
  # Grayscale transformation to mimic Xerox/scan copies
  grayscale:
    enabled: true
    probability: 1.0  # Always apply grayscale
  # Blurring to simulate blurred card images that are still readable
  blurring:
    enabled: true
@@ -53,6 +59,11 @@ augmentation:
    alpha_range: [0.4, 3.0]  # Contrast range (min, max)
    beta_range: [1, 100]  # Brightness range (min, max)
    probability: 1.0  # Always apply brightness/contrast adjustment
  # Grayscale transformation as final step (applied to all augmented images)
  grayscale:
    enabled: true
    probability: 1.0  # Always apply grayscale as final step
 # Processing configuration
 processing:
--- a/docs/images/yolov8_pipeline.png
+++ b/docs/images/yolov8_pipeline.png
--- a/main.py
+++ b/main.py
@@ -12,6 +12,7 @@ sys.path.append(str(Path(__file__).parent / "src"))
 from src.config_manager import ConfigManager
 from src.data_augmentation import DataAugmentation
 from src.image_processor import ImageProcessor
 from src.id_card_detector import IDCardDetector
 from src.utils import setup_logging, get_image_files, print_progress
 def parse_arguments():
@@ -83,6 +84,38 @@ def parse_arguments():
        help="Logging level"
    )
    # ID Card Detection arguments
    parser.add_argument(
        "--enable-id-detection",
        action="store_true",
        help="Enable ID card detection and cropping before augmentation"
    )
    parser.add_argument(
        "--model-path",
        type=str,
        help="Path to YOLO model for ID card detection (overrides config)"
    )
    parser.add_argument(
        "--confidence",
        type=float,
        help="Confidence threshold for ID card detection (overrides config)"
    )
    parser.add_argument(
        "--crop-mode",
        type=str,
        choices=["bbox", "square", "aspect_ratio"],
        help="Crop mode for ID cards (overrides config)"
    )
    parser.add_argument(
        "--crop-target-size",
        type=str,
        help="Target size for cropped ID cards (widthxheight) (overrides config)"
    )
    return parser.parse_args()
 def parse_range(range_str: str) -> tuple:
@@ -134,7 +167,8 @@ def show_image_info(input_dir: Path):
    print(f"\nTotal file size: {total_size:.2f} MB")
    print(f"Average file size: {total_size/len(image_files):.2f} MB")
-def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, Any]):
+def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, Any], 
                        id_detection_config: Dict[str, Any] = None):
    """Preview augmentation on first image"""
    image_files = get_image_files(input_dir)
@@ -147,7 +181,40 @@ def preview_augmentation(input_dir: Path, output_dir: Path, config: Dict[str, An
    # Create augmentation instance
    augmenter = DataAugmentation(config)
-    # Augment first image
+    # Process with ID detection if enabled
    if id_detection_config and id_detection_config.get('enabled', False):
        print("🔍 ID Card Detection enabled - processing with YOLO model...")
        # Initialize ID card detector
        detector = IDCardDetector(
            model_path=id_detection_config.get('model_path'),
            config=config
        )
        if not detector.model:
            print("❌ Failed to load YOLO model, proceeding with normal augmentation")
        else:
            # Process single image with ID detection
            result = detector.process_single_image(
                image_path=image_files[0],
                output_dir=output_dir,
                apply_augmentation=True,
                save_original=id_detection_config.get('save_original_crops', True),
                confidence=id_detection_config.get('confidence_threshold', 0.25),
                iou_threshold=id_detection_config.get('iou_threshold', 0.45),
                crop_mode=id_detection_config.get('crop_mode', 'bbox'),
                target_size=id_detection_config.get('target_size'),
                padding=id_detection_config.get('padding', 10)
            )
            if result and result.get('detections'):
                print(f"✅ Detected {len(result['detections'])} ID cards")
                print(f"💾 Saved {len(result['processed_cards'])} processed cards")
                return
            else:
                print("⚠️  No ID cards detected, proceeding with normal augmentation")
    # Normal augmentation (fallback)
    augmented_paths = augmenter.augment_image_file(
        image_files[0], 
        output_dir, 
@@ -225,9 +292,29 @@ def main():
        show_image_info(input_dir)
        return
    # Get ID detection config
    id_detection_config = config.get('id_card_detection', {})
    # Override ID detection config with command line arguments
    if args.enable_id_detection:
        id_detection_config['enabled'] = True
    if args.model_path:
        id_detection_config['model_path'] = args.model_path
    if args.confidence:
        id_detection_config['confidence_threshold'] = args.confidence
    if args.crop_mode:
        id_detection_config['crop_mode'] = args.crop_mode
    if args.crop_target_size:
        target_size = parse_size(args.crop_target_size)
        id_detection_config['target_size'] = list(target_size)
    # Preview augmentation if requested
    if args.preview:
-        preview_augmentation(input_dir, output_dir, augmentation_config)
+        preview_augmentation(input_dir, output_dir, augmentation_config, id_detection_config)
        return
    # Get image files
@@ -242,35 +329,56 @@ def main():
    logger.info(f"Number of augmentations per image: {processing_config.get('num_augmentations', 3)}")
    logger.info(f"Target size: {processing_config.get('target_size', [224, 224])}")
-    # Create augmentation instance with new config
+    # Process with ID detection if enabled
-    augmenter = DataAugmentation(augmentation_config)
+    if id_detection_config.get('enabled', False):
        logger.info("ID Card Detection enabled - processing with YOLO model...")
        # Initialize ID card detector
        detector = IDCardDetector(
            model_path=id_detection_config.get('model_path'),
            config=config
        )
        if not detector.model:
            logger.error("Failed to load YOLO model")
            sys.exit(1)
        logger.info(f"YOLO model loaded: {detector.model_path}")
        logger.info(f"Confidence threshold: {id_detection_config.get('confidence_threshold', 0.25)}")
        logger.info(f"Crop mode: {id_detection_config.get('crop_mode', 'bbox')}")
        # Bước 1: Detect và crop ID cards vào thư mục processed
        processed_dir = output_dir / "processed"
        processed_dir.mkdir(parents=True, exist_ok=True)
        logger.info("Step 1: Detect and crop ID cards...")
        detector.batch_process(
            input_dir=input_dir,
            output_dir=processed_dir,
            confidence=id_detection_config.get('confidence_threshold', 0.25),
            iou_threshold=id_detection_config.get('iou_threshold', 0.45),
            crop_mode=id_detection_config.get('crop_mode', 'bbox'),
            target_size=id_detection_config.get('target_size'),
            padding=id_detection_config.get('padding', 10)
        )
        # Bước 2: Augment các card đã crop
        logger.info("Step 2: Augment cropped ID cards...")
        augmenter = DataAugmentation(augmentation_config)
        augmenter.batch_augment(
            processed_dir,
            output_dir,
            num_augmentations=processing_config.get("num_augmentations", 3)
        )
    else:
        # Augment trực tiếp ảnh gốc
        logger.info("Starting normal batch augmentation (direct augmentation)...")
        augmenter = DataAugmentation(augmentation_config)
        augmenter.batch_augment(
            input_dir,
            output_dir,
            num_augmentations=processing_config.get("num_augmentations", 3)
        )
-    # Update target size
+    logger.info("Data processing completed successfully")
    target_size = tuple(processing_config.get("target_size", [224, 224]))
    augmenter.image_processor.target_size = target_size
    # Perform batch augmentation
    logger.info("Starting batch augmentation...")
    results = augmenter.batch_augment(
        input_dir, 
        output_dir, 
        num_augmentations=processing_config.get("num_augmentations", 3)
    )
    # Get and display summary
    summary = augmenter.get_augmentation_summary(results)
    print("\n" + "="*50)
    print("AUGMENTATION SUMMARY")
    print("="*50)
    print(f"Original images: {summary['total_original_images']}")
    print(f"Augmented images: {summary['total_augmented_images']}")
    print(f"Augmentation ratio: {summary['augmentation_ratio']:.2f}")
    print(f"Successful augmentations: {summary['successful_augmentations']}")
    print(f"Output directory: {output_dir}")
    print("="*50)
    logger.info("Data augmentation completed successfully")
 if __name__ == "__main__":
    main() 
--- a/script/id_card_cropper.py
+++ b/script/id_card_cropper.py
@@ -1,133 +0,0 @@
 #!/usr/bin/env python3
 """
 Simple ID Card Cropper using Roboflow API
 Input: folder containing images
 Output: folder with cropped ID cards
 """
 import sys
 import yaml
 from pathlib import Path
 import logging
 import argparse
 # Add src to path
 sys.path.append(str(Path(__file__).parent / "src"))
 from model.roboflow_id_detector import RoboflowIDDetector
 def setup_logging():
    """Setup basic logging"""
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(levelname)s - %(message)s'
    )
 def crop_id_cards(input_folder: str, output_folder: str, api_key: str = "Pkz4puRA0Cy3xMOuNoNr"):
    """
    Crop ID cards from all images in input folder
    Args:
        input_folder: Path to input folder containing images
        output_folder: Path to output folder for cropped ID cards
        api_key: Roboflow API key
    """
    logger = logging.getLogger(__name__)
    # Convert to Path objects
    input_path = Path(input_folder)
    output_path = Path(output_folder)
    # Check if input folder exists
    if not input_path.exists():
        logger.error(f"Input folder not found: {input_folder}")
        return False
    # Create output folder
    output_path.mkdir(parents=True, exist_ok=True)
    # Initialize detector
    detector = RoboflowIDDetector(
        api_key=api_key,
        model_id="french-card-id-detect",
        version=3,
        confidence=0.5
    )
    # Get all image files
    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}
    image_files = []
    for file_path in input_path.rglob('*'):
        if file_path.is_file() and file_path.suffix.lower() in image_extensions:
            image_files.append(file_path)
    if not image_files:
        logger.error(f"No images found in {input_folder}")
        return False
    logger.info(f"Found {len(image_files)} images to process")
    # Process each image
    total_cropped = 0
    for i, image_path in enumerate(image_files, 1):
        logger.info(f"Processing {i}/{len(image_files)}: {image_path.name}")
        # Detect ID cards
        detections = detector.detect_id_cards(image_path)
        if not detections:
            logger.warning(f"No ID cards detected in {image_path.name}")
            continue
        # Crop each detected ID card
        for j, detection in enumerate(detections):
            bbox = detection['bbox']
            # Create output filename
            stem = image_path.stem
            suffix = f"_card_{j+1}.jpg"
            output_file = output_path / f"{stem}{suffix}"
            # Crop ID card
            cropped = detector.crop_id_card(image_path, bbox, output_file)
            if cropped is not None:
                total_cropped += 1
                logger.info(f"  ✓ Cropped card {j+1} to {output_file.name}")
        # Add delay between requests
        if i < len(image_files):
            import time
            time.sleep(1.0)
    logger.info(f"Processing completed! Total ID cards cropped: {total_cropped}")
    return True
 def main():
    """Main function"""
    parser = argparse.ArgumentParser(description='Crop ID cards from images using Roboflow API')
    parser.add_argument('input_folder', help='Input folder containing images')
    parser.add_argument('output_folder', help='Output folder for cropped ID cards')
    parser.add_argument('--api-key', default="Pkz4puRA0Cy3xMOuNoNr", 
                       help='Roboflow API key (default: demo key)')
    args = parser.parse_args()
    # Setup logging
    setup_logging()
    # Process images
    success = crop_id_cards(args.input_folder, args.output_folder, args.api_key)
    if success:
        print(f"\n✓ Successfully processed images from '{args.input_folder}'")
        print(f"✓ Cropped ID cards saved to '{args.output_folder}'")
    else:
        print(f"\n✗ Failed to process images")
        return 1
    return 0
 if __name__ == "__main__":
    exit(main()) 
--- a/src/data_augmentation.py
+++ b/src/data_augmentation.py
@@ -363,8 +363,6 @@ class DataAugmentation:
        return result
    def augment_single_image(self, image: np.ndarray, num_augmentations: int = None) -> List[np.ndarray]:
        """
        Apply each augmentation method separately to create independent augmented versions
@@ -455,20 +453,7 @@ class DataAugmentation:
                augmented_images.append(augmented)
-        # 5. Grayscale only
+        # 5. Blurring only
        if grayscale_config.get("enabled", False):
            for i in range(num_augmentations):
                augmented = image.copy()
                augmented = self.convert_to_grayscale_preserve_quality(augmented)
                # Resize preserving aspect ratio
                target_size = self.image_processor.target_size
                if target_size:
                    augmented = self.resize_preserve_aspect(augmented, target_size)
                augmented_images.append(augmented)
        # 6. Blurring only
        if blurring_config.get("enabled", False):
            for i in range(num_augmentations):
                augmented = image.copy()
@@ -481,7 +466,7 @@ class DataAugmentation:
                augmented_images.append(augmented)
-        # 7. Brightness and contrast only
+        # 6. Brightness/Contrast only
        if brightness_contrast_config.get("enabled", False):
            for i in range(num_augmentations):
                augmented = image.copy()
@@ -494,6 +479,11 @@ class DataAugmentation:
                augmented_images.append(augmented)
        # 7. Apply grayscale as final step to ALL augmented images
        if grayscale_config.get("enabled", False):
            for i in range(len(augmented_images)):
                augmented_images[i] = self.convert_to_grayscale_preserve_quality(augmented_images[i])
        return augmented_images
    def augment_image_file(self, image_path: Path, output_dir: Path, num_augmentations: int = None) -> List[Path]:
@@ -518,7 +508,7 @@ class DataAugmentation:
        # Save augmented images with method names
        saved_paths = []
-        method_names = ["rotation", "cropping", "noise", "blockage", "grayscale", "blurring", "brightness_contrast"]
+        method_names = ["rotation", "cropping", "noise", "blockage", "blurring", "brightness_contrast", "grayscale"]
        method_index = 0
        for i, aug_image in enumerate(augmented_images):
--- a/src/id_card_detector.py
+++ b/src/id_card_detector.py
@@ -0,0 +1,611 @@
 """
 ID Card Detector Module
 Sử dụng YOLO để detect và cắt ID cards từ ảnh lớn, kết hợp với data augmentation
 Tích hợp với YOLOv8 French ID Card Detection model
 """
 import cv2
 import numpy as np
 from pathlib import Path
 from typing import List, Tuple, Optional, Dict, Any, Union
 import torch
 import torch.nn as nn
 from ultralytics import YOLO
 import logging
 from data_augmentation import DataAugmentation
 from utils import load_image, save_image, create_augmented_filename, print_progress
 import os
 import json
 import yaml
 class IDCardDetector:
    """Class để detect và cắt ID cards từ ảnh lớn sử dụng YOLO"""
    def __init__(self, model_path: str = None, config: Dict[str, Any] = None):
        """
        Initialize ID Card Detector
        Args:
            model_path: Đường dẫn đến model YOLO đã train
            config: Configuration dictionary
        """
        self.config = config or {}
        self.model_path = model_path
        self.model = None
        self.data_augmentation = DataAugmentation(config)
        self.logger = self._setup_logger()
        # Default model path nếu không được cung cấp
        if not model_path:
            default_model_path = "data/weights/id_cards_yolov8n.pt"
            if os.path.exists(default_model_path):
                model_path = default_model_path
                self.model_path = model_path
        # Load YOLO model nếu có
        if model_path and os.path.exists(model_path):
            self.load_model(model_path)
    def _setup_logger(self) -> logging.Logger:
        """Setup logger cho module"""
        logger = logging.getLogger(__name__)
        logger.setLevel(logging.INFO)
        if not logger.handlers:
            handler = logging.StreamHandler()
            formatter = logging.Formatter(
                '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
            )
            handler.setFormatter(formatter)
            logger.addHandler(handler)
        return logger
    def load_model(self, model_path: str) -> bool:
        """
        Load YOLO model từ file
        Args:
            model_path: Đường dẫn đến model file
        Returns:
            True nếu load thành công, False nếu thất bại
        """
        try:
            self.model = YOLO(model_path)
            self.logger.info(f"Loaded YOLO model from: {model_path}")
            return True
        except Exception as e:
            self.logger.error(f"Failed to load model: {e}")
            return False
    def detect_id_cards(self, image: np.ndarray, confidence: float = 0.5, iou_threshold: float = 0.45) -> List[Dict[str, Any]]:
        """
        Detect ID cards trong ảnh sử dụng YOLO
        Args:
            image: Input image
            confidence: Confidence threshold
            iou_threshold: IoU threshold cho NMS
        Returns:
            List các detection results với format:
            {
                'bbox': [x1, y1, x2, y2],
                'confidence': float,
                'class_id': int,
                'class_name': str
            }
        """
        if self.model is None:
            self.logger.error("Model chưa được load!")
            return []
        try:
            # Run inference
            results = self.model(image, conf=confidence, iou=float(iou_threshold), verbose=False)
            detections = []
            for result in results:
                boxes = result.boxes
                if boxes is not None:
                    for box in boxes:
                        # Get bbox coordinates
                        x1, y1, x2, y2 = box.xyxy[0].cpu().numpy()
                        # Get confidence and class
                        confidence_score = float(box.conf[0].cpu().numpy())
                        class_id = int(box.cls[0].cpu().numpy())
                        class_name = self.model.names[class_id] if hasattr(self.model, 'names') else f"class_{class_id}"
                        detection = {
                            'bbox': [int(x1), int(y1), int(x2), int(y2)],
                            'confidence': confidence_score,
                            'class_id': class_id,
                            'class_name': class_name
                        }
                        detections.append(detection)
            self.logger.info(f"Detected {len(detections)} ID cards")
            return detections
        except Exception as e:
            self.logger.error(f"Error during detection: {e}")
            return []
    def crop_id_card(self, image: np.ndarray, bbox: List[int], padding: int = 10, 
                     crop_mode: str = "bbox", target_size: Tuple[int, int] = None) -> np.ndarray:
        """
        Cắt ID card từ ảnh gốc dựa trên bbox với nhiều options
        Args:
            image: Input image
            bbox: Bounding box [x1, y1, x2, y2]
            padding: Padding thêm xung quanh bbox
            crop_mode: Mode cắt ("bbox", "square", "aspect_ratio")
            target_size: Kích thước target (width, height) nếu muốn resize
        Returns:
            Cropped ID card image
        """
        x1, y1, x2, y2 = bbox
        # Thêm padding
        height, width = image.shape[:2]
        x1 = max(0, x1 - padding)
        y1 = max(0, y1 - padding)
        x2 = min(width, x2 + padding)
        y2 = min(height, y2 + padding)
        # Cắt ảnh theo mode
        if crop_mode == "bbox":
            # Cắt theo bbox gốc
            cropped = image[y1:y2, x1:x2]
        elif crop_mode == "square":
            # Cắt thành hình vuông
            center_x = (x1 + x2) // 2
            center_y = (y1 + y2) // 2
            size = max(x2 - x1, y2 - y1)
            half_size = size // 2
            x1 = max(0, center_x - half_size)
            y1 = max(0, center_y - half_size)
            x2 = min(width, center_x + half_size)
            y2 = min(height, center_y + half_size)
            cropped = image[y1:y2, x1:x2]
        elif crop_mode == "aspect_ratio":
            # Cắt theo tỷ lệ khung hình chuẩn (3:4 cho ID card)
            bbox_width = x2 - x1
            bbox_height = y2 - y1
            center_x = (x1 + x2) // 2
            center_y = (y1 + y2) // 2
            # Tỷ lệ 3:4 cho ID card
            target_ratio = 3 / 4
            current_ratio = bbox_width / bbox_height
            if current_ratio > target_ratio:
                # Bbox quá rộng, giữ chiều cao
                new_width = int(bbox_height * target_ratio)
                half_width = new_width // 2
                x1 = max(0, center_x - half_width)
                x2 = min(width, center_x + half_width)
            else:
                # Bbox quá cao, giữ chiều rộng
                new_height = int(bbox_width / target_ratio)
                half_height = new_height // 2
                y1 = max(0, center_y - half_height)
                y2 = min(height, center_y + half_height)
            cropped = image[y1:y2, x1:x2]
        else:
            # Default: cắt theo bbox
            cropped = image[y1:y2, x1:x2]
        # Resize nếu có target_size
        if target_size:
            cropped = cv2.resize(cropped, target_size, interpolation=cv2.INTER_AREA)
        return cropped
    def process_single_image(self, image_path: Union[str, Path], output_dir: Path, 
                           confidence: float = 0.5, iou_threshold: float = 0.45,
                           crop_mode: str = "bbox", target_size: Tuple[int, int] = None,
                           padding: int = 10, card_counter: int = 0) -> Dict[str, Any]:
        """
        Xử lý một ảnh: detect ID cards, cắt và áp dụng augmentation
        Args:
            image_path: Đường dẫn đến ảnh input
            output_dir: Thư mục output
            apply_augmentation: Có áp dụng data augmentation không
            save_original: Có lưu ảnh gốc không
            confidence: Confidence threshold
            iou_threshold: IoU threshold
            crop_mode: Mode cắt ("bbox", "square", "aspect_ratio")
            target_size: Kích thước target (width, height) hoặc None
            padding: Padding thêm xung quanh bbox
        Returns:
            Dictionary chứa kết quả xử lý
        """
        image_path = Path(image_path)
        if not image_path.exists():
            self.logger.error(f"Image not found: {image_path}")
            return {}
        # Load ảnh
        image = load_image(str(image_path))
        if image is None:
            self.logger.error(f"Failed to load image: {image_path}")
            return {}
        # Detect ID cards
        detections = self.detect_id_cards(image, confidence, float(iou_threshold))
        if not detections:
            self.logger.warning(f"No ID cards detected in: {image_path}")
            return {
                'image_path': str(image_path),
                'detections': [],
                'processed_cards': []
            }
        # Tạo thư mục output
        output_dir.mkdir(parents=True, exist_ok=True)
        processed_cards = []
        current_card_counter = card_counter
        for i, detection in enumerate(detections):
            # Cắt ID card với options mới
            cropped_card = self.crop_id_card(
                image, 
                detection['bbox'], 
                padding=padding,
                crop_mode=crop_mode,
                target_size=target_size
            )
            # Tạo tên file unique cho mỗi ID card
            current_card_counter += 1
            card_filename = f"id_card_{current_card_counter:03d}.jpg"
            card_path = output_dir / card_filename
            # Lưu ảnh gốc
            save_image(cropped_card, card_path)
            processed_cards.append({
                'original_path': str(card_path),
                'detection_info': detection,
                'crop_info': {
                    'mode': crop_mode,
                    'target_size': target_size,
                    'padding': padding
                }
            })
        result = {
            'image_path': str(image_path),
            'detections': detections,
            'processed_cards': processed_cards,
            'total_cards': len(processed_cards),
            'crop_settings': {
                'mode': crop_mode,
                'target_size': target_size,
                'padding': padding
            }
        }
        self.logger.info(f"Processed {len(processed_cards)} cards from {image_path.name}")
        return result
    def batch_process(self, input_dir: Union[str, Path], output_dir: Union[str, Path],
                     confidence: float = 0.5, iou_threshold: float = 0.45,
                     crop_mode: str = "bbox", target_size: Tuple[int, int] = None,
                     padding: int = 10) -> Dict[str, Any]:
        """
        Xử lý batch nhiều ảnh
        Args:
            input_dir: Thư mục chứa ảnh input
            output_dir: Thư mục output
            apply_augmentation: Có áp dụng data augmentation không
            save_original: Có lưu ảnh gốc không
            confidence: Confidence threshold
            iou_threshold: IoU threshold
            crop_mode: Mode cắt ("bbox", "square", "aspect_ratio")
            target_size: Kích thước target (width, height) hoặc None
            padding: Padding thêm xung quanh bbox
        Returns:
            Dictionary chứa kết quả batch processing
        """
        input_dir = Path(input_dir)
        output_dir = Path(output_dir)
        if not input_dir.exists():
            self.logger.error(f"Input directory not found: {input_dir}")
            return {}
        # Tạo thư mục output
        output_dir.mkdir(parents=True, exist_ok=True)
        # Tìm tất cả ảnh
        supported_formats = self.config.get('supported_formats', ['.jpg', '.jpeg', '.png', '.bmp', '.tiff'])
        image_files = []
        for fmt in supported_formats:
            image_files.extend(input_dir.glob(f"*{fmt}"))
            image_files.extend(input_dir.glob(f"*{fmt.upper()}"))
        if not image_files:
            self.logger.warning(f"No supported images found in: {input_dir}")
            return {}
        self.logger.info(f"Found {len(image_files)} images to process")
        results = {}
        total_cards = 0
        global_card_counter = 0  # Counter để tạo tên file unique
        for i, image_path in enumerate(image_files):
            self.logger.info(f"Processing {i+1}/{len(image_files)}: {image_path.name}")
            # Xử lý ảnh - chỉ detect và crop, không augment
            result = self.process_single_image(
                image_path, 
                output_dir,
                confidence,
                iou_threshold,
                crop_mode,
                target_size,
                padding,
                global_card_counter
            )
            # Cập nhật counter
            global_card_counter += len(result.get('detections', []))
            results[image_path.name] = result
            total_cards += len(result.get('detections', []))  # Số lượng ID cards thực tế đã detect
            # Print progress
            print_progress(i + 1, len(image_files), f"Processed {image_path.name}")
        # Tạo summary
        summary = {
            'total_images': len(image_files),
            'total_cards_detected': total_cards,
            'images_with_cards': len([r for r in results.values() if r.get('detections')]),
            'images_without_cards': len([r for r in results.values() if not r.get('detections')]),
            'output_directory': str(output_dir),
            'crop_settings': {
                'mode': crop_mode,
                'target_size': target_size,
                'padding': padding
            },
            'results': results
        }
        # Lưu summary
        summary_path = output_dir / "processing_summary.json"
        with open(summary_path, 'w', encoding='utf-8') as f:
            json.dump(summary, f, indent=2, ensure_ascii=False)
        self.logger.info(f"Batch processing completed. Summary saved to: {summary_path}")
        return summary
    def get_detection_statistics(self, results: Dict[str, Any]) -> Dict[str, Any]:
        """
        Tính toán thống kê từ kết quả detection
        Args:
            results: Kết quả từ batch_process
        Returns:
            Dictionary chứa thống kê
        """
        if not results:
            return {}
        total_images = results.get('total_images', 0)
        total_cards = results.get('total_cards_detected', 0)
        images_with_cards = results.get('images_with_cards', 0)
        # Tính confidence statistics
        all_confidences = []
        for image_result in results.get('results', {}).values():
            for detection in image_result.get('detections', []):
                all_confidences.append(detection.get('confidence', 0))
        stats = {
            'total_images_processed': total_images,
            'total_cards_detected': total_cards,
            'images_with_cards': images_with_cards,
            'images_without_cards': total_images - images_with_cards,
            'average_cards_per_image': total_cards / total_images if total_images > 0 else 0,
            'detection_rate': images_with_cards / total_images if total_images > 0 else 0,
            'confidence_statistics': {
                'min': min(all_confidences) if all_confidences else 0,
                'max': max(all_confidences) if all_confidences else 0,
                'mean': np.mean(all_confidences) if all_confidences else 0,
                'std': np.std(all_confidences) if all_confidences else 0
            }
        }
        return stats
    def augment_cropped_cards(self, input_dir: Union[str, Path], output_dir: Union[str, Path],
                             num_augmentations: int = 3) -> Dict[str, Any]:
        """
        Augment tất cả ID cards đã crop trong thư mục input
        Args:
            input_dir: Thư mục chứa ID cards đã crop
            output_dir: Thư mục output cho augmented images
            num_augmentations: Số lượng augmentation cho mỗi card
        Returns:
            Dictionary chứa kết quả augmentation
        """
        input_dir = Path(input_dir)
        output_dir = Path(output_dir)
        if not input_dir.exists():
            self.logger.error(f"Input directory not found: {input_dir}")
            return {}
        # Tạo thư mục output
        output_dir.mkdir(parents=True, exist_ok=True)
        # Tìm tất cả ID cards đã crop
        card_files = list(input_dir.glob("id_card_*.jpg"))
        if not card_files:
            self.logger.warning(f"No ID card files found in: {input_dir}")
            return {}
        self.logger.info(f"Found {len(card_files)} ID cards to augment")
        results = {}
        total_augmented = 0
        for i, card_path in enumerate(card_files):
            self.logger.info(f"Augmenting {i+1}/{len(card_files)}: {card_path.name}")
            # Load ID card
            card_image = load_image(str(card_path))
            if card_image is None:
                self.logger.error(f"Failed to load card: {card_path}")
                continue
            # Augment card
            try:
                augmented_cards = self.data_augmentation.augment_single_image(
                    card_image, 
                    num_augmentations=num_augmentations
                )
                # Debug: Kiểm tra số lượng augmented cards
                self.logger.info(f"Generated {len(augmented_cards)} augmented cards for {card_path.name}")
                # Debug: Kiểm tra config
                self.logger.info(f"DataAugmentation config: {self.data_augmentation.config}")
            except Exception as e:
                self.logger.error(f"Error during augmentation: {e}")
                augmented_cards = []
            # Save augmented cards
            card_results = []
            for j, aug_card in enumerate(augmented_cards):
                aug_filename = f"{card_path.stem}_aug_{j+1}.jpg"
                aug_path = output_dir / aug_filename
                save_image(aug_card, aug_path)
                card_results.append({
                    'augmented_path': str(aug_path),
                    'augmentation_index': j+1
                })
            results[card_path.name] = {
                'original_path': str(card_path),
                'augmented_cards': card_results,
                'total_augmented': len(card_results)
            }
            total_augmented += len(card_results)
            # Print progress
            print_progress(i + 1, len(card_files), f"Augmented {card_path.name}")
        # Tạo summary
        summary = {
            'total_cards': len(card_files),
            'total_augmented': total_augmented,
            'output_directory': str(output_dir),
            'results': results
        }
        # Lưu summary
        summary_path = output_dir / "augmentation_summary.json"
        with open(summary_path, 'w', encoding='utf-8') as f:
            json.dump(summary, f, indent=2, ensure_ascii=False)
        self.logger.info(f"Augmentation completed. Summary saved to: {summary_path}")
        return summary
    def load_yolo_config(self, config_path: str = None) -> Dict[str, Any]:
        """
        Load config từ YOLO detector
        Args:
            config_path: Đường dẫn đến file config
        Returns:
            Config dictionary
        """
        if config_path is None:
            # Tìm config mặc định
            default_config_path = "src/model/ID_cards_detector/config.py"
            if os.path.exists(default_config_path):
                config_path = default_config_path
        config = {}
        try:
            # Import config từ YOLO detector
            import sys
            sys.path.append(str(Path("src/model/ID_cards_detector")))
            from config import DEFAULT_TRAINING_CONFIG, DEFAULT_INFERENCE_CONFIG
            config.update({
                'yolo_training_config': DEFAULT_TRAINING_CONFIG,
                'yolo_inference_config': DEFAULT_INFERENCE_CONFIG,
                'detection': {
                    'confidence_threshold': DEFAULT_INFERENCE_CONFIG.get('conf_threshold', 0.25),
                    'iou_threshold': DEFAULT_INFERENCE_CONFIG.get('iou_threshold', 0.45),
                    'padding': 10
                },
                'processing': {
                    'apply_augmentation': True,
                    'save_original': True,
                    'num_augmentations': 3,
                    'save_format': "jpg",
                    'quality': 95,
                    'target_size': [640, 640]
                },
                'crop_options': {
                    'crop_mode': 'bbox',  # bbox, square, aspect_ratio
                    'target_size': None,  # (width, height) hoặc None
                    'padding': 10
                }
            })
            self.logger.info("Loaded YOLO config successfully")
        except Exception as e:
            self.logger.warning(f"Failed to load YOLO config: {e}")
            # Fallback config
            config = {
                'detection': {
                    'confidence_threshold': 0.25,
                    'iou_threshold': 0.45,
                    'padding': 10
                },
                'processing': {
                    'apply_augmentation': True,
                    'save_original': True,
                    'num_augmentations': 3,
                    'save_format': "jpg",
                    'quality': 95,
                    'target_size': [640, 640]
                },
                'crop_options': {
                    'crop_mode': 'bbox',
                    'target_size': None,
                    'padding': 10
                }
            }
        return config 
--- a/src/utils.py
+++ b/src/utils.py
@@ -41,14 +41,11 @@ def load_image(image_path: Path, target_size: Tuple[int, int] = None) -> Optiona
        image = cv2.imread(str(image_path))
        if image is None:
            return None
        # Convert BGR to RGB
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        # Resize if target_size is provided
        if target_size:
            image = cv2.resize(image, target_size, interpolation=cv2.INTER_AREA)
        return image
    except Exception as e:
        print(f"Error loading image {image_path}: {e}")