diff --git a/README_ID_Card_Cropper.md b/README_ID_Card_Cropper.md
new file mode 100644
index 0000000..c34a2aa
--- /dev/null
+++ b/README_ID_Card_Cropper.md
@@ -0,0 +1,56 @@
+# ID Card Cropper
+
+A simple script that crops ID cards from images using the Roboflow API.
+
+## Usage
+
+```bash
+python id_card_cropper.py input_folder output_folder
+```
+
+### Example:
+
+```bash
+# Use the default API key
+python id_card_cropper.py data/IDcards/Archive output/cropped_cards
+
+# Use a custom API key
+python id_card_cropper.py data/IDcards/Archive output/cropped_cards --api-key YOUR_API_KEY
+```
+
+## Parameters
+
+- `input_folder`: Folder containing the images to process
+- `output_folder`: Folder where the cropped ID cards are saved
+- `--api-key`: Roboflow API key (default: demo key)
+
+## Supported image formats
+
+- JPG/JPEG
+- PNG
+- BMP
+- TIFF
+
+## Results
+
+The script will:
+1. Find all images in the input folder
+2. Detect ID cards in each image
+3. Crop the ID cards and save them to the output folder
+4. Name the files using the format `{original_image_name}_card_{number}.jpg`
+
+## Example output
+
+```
+output/cropped_cards/
+├── im1__card_1.jpg
+├── im5_card_1.jpg
+├── im11_card_1.jpg
+└── im11_card_2.jpg
+```
+
+## Notes
+
+- An internet connection is required to use the Roboflow API
+- There is a 1-second delay between requests to avoid rate limiting
+- Only the cropped ID cards are saved; the original images with bounding boxes are not
\ No newline at end of file
diff --git a/README_ID_Card_Processing.md b/README_ID_Card_Processing.md
deleted file mode 100644
index 3e922a4..0000000
--- a/README_ID_Card_Processing.md
+++ /dev/null
@@ -1,183 +0,0 @@
-# ID Card Processing with YOLO Detection
-
-An ID card processing system that uses YOLO to detect and crop cards, combined with preprocessing methods that clean the background and enhance image quality.
-
-## Key features
-
-- **YOLO Detection**: Detect and crop ID cards from the original images
-- **Background Removal**: 3 background removal methods (GrabCut, Threshold, Contour)
-- **Image Enhancement**: Improve image quality for OCR
-- **Batch Processing**: Process images in bulk
-- **Flexible Pipeline**: Each step can be run separately
-
-## Installation
-
-1. Install dependencies:
-```bash
-pip install -r requirements.txt
-```
-
-2. Directory structure:
-```
-OCR/
-├── src/
-│   ├── model/
-│   │   ├── __init__.py
-│   │   ├── yolo_detector.py
-│   │   └── id_card_processor.py
-│   └── ...
-├── data/
-│   ├── IDcards/              # Folder containing the original ID card images
-│   └── processed_id_cards/   # Output folder
-├── id_card_processor_main.py
-└── requirements.txt
-```
-
-## Usage
-
-### 1. Full Pipeline (Detect + Preprocess)
-
-```bash
-python id_card_processor_main.py \
-    --input-dir "data/IDcards" \
-    --output-dir "data/processed_id_cards" \
-    --confidence 0.5 \
-    --bg-removal grabcut \
-    --target-size 800x600 \
-    --save-annotated
-```
-
-### 2. Detect and Crop Only
-
-```bash
-python id_card_processor_main.py \
-    --input-dir "data/IDcards" \
-    --output-dir "data/processed_id_cards" \
-    --detect-only \
-    --save-annotated
-```
-
-### 3.
Preprocess Only (skip detection)
-
-```bash
-python id_card_processor_main.py \
-    --input-dir "data/IDcards" \
-    --output-dir "data/processed_id_cards" \
-    --preprocess-only \
-    --bg-removal threshold \
-    --target-size 800x600
-```
-
-## Parameters
-
-### Detection Parameters
-- `--model-path`: Path to a custom YOLO model (.pt file)
-- `--confidence`: Confidence threshold for detection (default: 0.5)
-
-### Preprocessing Parameters
-- `--bg-removal`: Background removal method
-  - `grabcut`: Use the GrabCut algorithm (recommended)
-  - `threshold`: Use thresholding
-  - `contour`: Use contour detection
-  - `none`: Do not remove the background
-- `--target-size`: Normalized size (width x height)
-
-### Output Options
-- `--save-annotated`: Save images with bounding boxes
-- `--detect-only`: Run detection only
-- `--preprocess-only`: Run preprocessing only
-
-## Output Structure
-
-```
-data/processed_id_cards/
-├── cropped/           # Images cropped by YOLO
-│   ├── image1_card_1.jpg
-│   ├── image1_card_2.jpg
-│   └── ...
-├── processed/         # Preprocessed images
-│   ├── image1_card_1_processed.jpg
-│   ├── image1_card_2_processed.jpg
-│   └── ...
-└── annotated/         # Images with bounding boxes (if enabled)
-    ├── image1_annotated.jpg
-    └── ...
-```
-
-## Usage examples
-
-### Example 1: Process the entire dataset
-```bash
-# Process all images in the IDcards folder
-python id_card_processor_main.py \
-    --input-dir "data/IDcards" \
-    --output-dir "data/processed_id_cards" \
-    --confidence 0.6 \
-    --bg-removal grabcut \
-    --target-size 1024x768 \
-    --save-annotated
-```
-
-### Example 2: Test with a few images
-```bash
-# Create a test folder with a few images
-mkdir -p data/test_images
-# Copy a few images into test_images
-
-# Run detection
-python id_card_processor_main.py \
-    --input-dir "data/test_images" \
-    --output-dir "data/test_output" \
-    --detect-only \
-    --save-annotated
-```
-
-### Example 3: Use a custom model
-```bash
-# If you have a trained custom YOLO model
-python id_card_processor_main.py \
-    --input-dir "data/IDcards" \
-    --output-dir "data/processed_id_cards" \
-    --model-path "models/custom_id_card_model.pt" \
-    --confidence 0.7
-```
-
-## Notes
-
-1. **YOLO Model**: The pre-trained YOLOv8n is used by default. If you have a better custom model, use `--model-path`
-
-2. **Background Removal**:
-   - `grabcut`: Best for ID cards with complex backgrounds
-   - `threshold`: Fast, suited to simple backgrounds
-   - `contour`: Suited to ID cards with clear borders
-
-3. **Performance**:
-   - Use a GPU if possible to speed up detection
-   - Adjust `--confidence` to balance precision and recall
-
-4. **Memory**: For large datasets, you may need more memory or smaller batches
-
-## Troubleshooting
-
-### Common errors
-
-1. **No detections found**:
-   - Lower `--confidence` to 0.3-0.4
-   - Check the quality of the input images
-
-2. **Memory error**:
-   - Reduce the batch size or process one image at a time
-   - Use the CPU instead of the GPU
-
-3.
**Poor background removal**:
-   - Try the different methods: `grabcut`, `threshold`, `contour`
-   - Adjust the parameters in the code
-
-### Debug mode
-
-```bash
-python id_card_processor_main.py \
-    --input-dir "data/IDcards" \
-    --output-dir "data/processed_id_cards" \
-    --log-level DEBUG
-```
\ No newline at end of file
diff --git a/config/roboflow_config.yaml b/config/roboflow_config.yaml
new file mode 100644
index 0000000..ab18e2c
--- /dev/null
+++ b/config/roboflow_config.yaml
@@ -0,0 +1,40 @@
+# Roboflow ID Card Detection Configuration
+
+# API Configuration
+api:
+  key: "Pkz4puRA0Cy3xMOuNoNr"  # Your Roboflow API key
+  model_id: "french-card-id-detect"
+  version: 3
+  confidence: 0.5
+  timeout: 30  # seconds
+
+# Processing Configuration
+processing:
+  input_dir: "data/IDcards"
+  output_dir: "output/roboflow_detections"
+  save_annotated: true
+  delay_between_requests: 1.0  # seconds
+  padding: 10  # pixels around detected cards
+
+# Supported image formats
+supported_formats:
+  - ".jpg"
+  - ".jpeg"
+  - ".png"
+  - ".bmp"
+  - ".tiff"
+
+# Logging configuration
+logging:
+  level: "INFO"  # DEBUG, INFO, WARNING, ERROR
+  format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
+  handlers:
+    - type: "file"
+      filename: "logs/roboflow_detector.log"
+    - type: "console"
+
+# Performance settings
+performance:
+  batch_size: 1  # Process one image at a time due to API limits
+  max_retries: 3
+  retry_delay: 2.0  # seconds
\ No newline at end of file
diff --git a/id_card_processor_main.py b/id_card_processor_main.py
deleted file mode 100644
index 5dabff9..0000000
--- a/id_card_processor_main.py
+++ /dev/null
@@ -1,234 +0,0 @@
-"""
-Main script for ID Card Processing with YOLO Detection
-"""
-import argparse
-import sys
-from pathlib import Path
-from typing import Dict, Any
-import logging
-
-# Add src to path for imports
-sys.path.append(str(Path(__file__).parent / "src"))
-
-from src.model.yolo_detector import YOLODetector
-from src.model.id_card_processor import
IDCardProcessor -from src.utils import setup_logging - -def parse_arguments(): - """Parse command line arguments""" - parser = argparse.ArgumentParser(description="ID Card Processing with YOLO Detection") - - parser.add_argument( - "--input-dir", - type=str, - required=True, - help="Input directory containing ID card images" - ) - - parser.add_argument( - "--output-dir", - type=str, - default="data/processed_id_cards", - help="Output directory for processed images" - ) - - parser.add_argument( - "--model-path", - type=str, - help="Path to custom YOLO model (.pt file)" - ) - - parser.add_argument( - "--confidence", - type=float, - default=0.5, - help="Confidence threshold for YOLO detection" - ) - - parser.add_argument( - "--detect-only", - action="store_true", - help="Only detect and crop ID cards, skip preprocessing" - ) - - parser.add_argument( - "--preprocess-only", - action="store_true", - help="Skip detection, directly preprocess images" - ) - - parser.add_argument( - "--bg-removal", - type=str, - default="grabcut", - choices=["grabcut", "threshold", "contour", "none"], - help="Background removal method" - ) - - parser.add_argument( - "--target-size", - type=str, - default="800x600", - help="Target size for normalization (width x height)" - ) - - parser.add_argument( - "--save-annotated", - action="store_true", - help="Save annotated images with bounding boxes" - ) - - parser.add_argument( - "--log-level", - type=str, - default="INFO", - choices=["DEBUG", "INFO", "WARNING", "ERROR"], - help="Logging level" - ) - - return parser.parse_args() - -def parse_size(size_str: str) -> tuple: - """Parse size string like '800x600' to tuple (800, 600)""" - try: - width, height = map(int, size_str.split('x')) - return (width, height) - except ValueError: - print(f"Invalid size format: {size_str}. 
Expected format: widthxheight") - sys.exit(1) - -def main(): - """Main function""" - args = parse_arguments() - - # Setup logging - logging_config = {"level": args.log_level} - logger = setup_logging(logging_config.get("level", "INFO")) - logger.info("Starting ID Card Processing") - - # Parse paths - input_dir = Path(args.input_dir) - output_dir = Path(args.output_dir) - - # Check if input directory exists - if not input_dir.exists(): - logger.error(f"Input directory does not exist: {input_dir}") - sys.exit(1) - - # Create output directory - output_dir.mkdir(parents=True, exist_ok=True) - - # Parse target size - target_size = parse_size(args.target_size) - - # Initialize YOLO detector - logger.info("Initializing YOLO detector...") - yolo_detector = YOLODetector( - model_path=args.model_path, - confidence=args.confidence - ) - - # Initialize ID card processor - logger.info("Initializing ID card processor...") - id_processor = IDCardProcessor(yolo_detector) - - if args.detect_only: - # Only detect and crop ID cards - logger.info("Running YOLO detection only...") - results = yolo_detector.batch_process( - input_dir, - output_dir / "cropped", - save_annotated=args.save_annotated - ) - - print("\n" + "="*50) - print("YOLO DETECTION RESULTS") - print("="*50) - print(f"Total images: {results['total_images']}") - print(f"Processed images: {results['processed_images']}") - print(f"Total detections: {results['total_detections']}") - print(f"Total cropped: {results['total_cropped']}") - print(f"Output directory: {output_dir / 'cropped'}") - print("="*50) - - elif args.preprocess_only: - # Skip detection, directly preprocess - logger.info("Running preprocessing only...") - results = id_processor.batch_process_id_cards( - input_dir, - output_dir / "processed", - detect_first=False, - remove_bg=args.bg_removal != "none", - enhance=True, - normalize=True, - target_size=target_size - ) - - print("\n" + "="*50) - print("PREPROCESSING RESULTS") - print("="*50) - print(f"Total 
images: {results['total_images']}") - print(f"Processed images: {results['processed_images']}") - print(f"Output directory: {output_dir / 'processed'}") - print("="*50) - - else: - # Full pipeline: detect + preprocess - logger.info("Running full pipeline: detection + preprocessing...") - - # Step 1: Detect and crop ID cards - logger.info("Step 1: Detecting and cropping ID cards...") - detection_results = yolo_detector.batch_process( - input_dir, - output_dir / "cropped", - save_annotated=args.save_annotated - ) - - # Step 2: Preprocess cropped images - cropped_dir = output_dir / "cropped" - if cropped_dir.exists(): - logger.info("Step 2: Preprocessing cropped ID cards...") - preprocessing_results = id_processor.batch_process_id_cards( - cropped_dir, - output_dir / "processed", - detect_first=False, - remove_bg=args.bg_removal != "none", - enhance=True, - normalize=True, - target_size=target_size - ) - else: - logger.warning("No cropped images found, preprocessing original images") - preprocessing_results = id_processor.batch_process_id_cards( - input_dir, - output_dir / "processed", - detect_first=False, - remove_bg=args.bg_removal != "none", - enhance=True, - normalize=True, - target_size=target_size - ) - - # Print summary - print("\n" + "="*50) - print("FULL PIPELINE RESULTS") - print("="*50) - print("DETECTION PHASE:") - print(f" - Total images: {detection_results['total_images']}") - print(f" - Processed images: {detection_results['processed_images']}") - print(f" - Total detections: {detection_results['total_detections']}") - print(f" - Total cropped: {detection_results['total_cropped']}") - print("\nPREPROCESSING PHASE:") - print(f" - Total images: {preprocessing_results['total_images']}") - print(f" - Processed images: {preprocessing_results['processed_images']}") - print(f"\nOutput directories:") - print(f" - Cropped images: {output_dir / 'cropped'}") - print(f" - Processed images: {output_dir / 'processed'}") - if args.save_annotated: - print(f" - 
Annotated images: {output_dir / 'cropped'}") - print("="*50) - - logger.info("ID Card Processing completed successfully") - -if __name__ == "__main__": - main() \ No newline at end of file diff --git a/logs/roboflow_detector.log b/logs/roboflow_detector.log new file mode 100644 index 0000000..d53fb77 --- /dev/null +++ b/logs/roboflow_detector.log @@ -0,0 +1,66 @@ +2025-08-05 19:29:51,215 - model.roboflow_id_detector - INFO - Initialized Roboflow ID detector with model: french-card-id-detect/3 +2025-08-05 19:29:51,215 - __main__ - INFO - Starting batch processing... +2025-08-05 19:29:51,215 - __main__ - INFO - Input directory: data\IDcards\Archive +2025-08-05 19:29:51,215 - __main__ - INFO - Output directory: output\roboflow_detections +2025-08-05 19:29:51,217 - model.roboflow_id_detector - INFO - Processing 15 images from data\IDcards\Archive and subdirectories +2025-08-05 19:29:51,217 - model.roboflow_id_detector - INFO - Processing 1/15: im10.png +2025-08-05 19:29:53,563 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im10.png +2025-08-05 19:29:53,563 - model.roboflow_id_detector - WARNING - No ID cards detected in im10.png +2025-08-05 19:29:54,567 - model.roboflow_id_detector - INFO - Processing 2/15: im11.png +2025-08-05 19:29:56,871 - model.roboflow_id_detector - INFO - Found 2 ID card detections in im11.png +2025-08-05 19:29:56,897 - model.roboflow_id_detector - INFO - Saved cropped image to output\roboflow_detections\im11_card_1.jpg +2025-08-05 19:29:56,906 - model.roboflow_id_detector - INFO - Saved cropped image to output\roboflow_detections\im11_card_2.jpg +2025-08-05 19:29:56,920 - model.roboflow_id_detector - INFO - Processed im11.png: 2 cards cropped +2025-08-05 19:29:57,928 - model.roboflow_id_detector - INFO - Processing 3/15: im12.png +2025-08-05 19:30:00,037 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im12.png +2025-08-05 19:30:00,037 - model.roboflow_id_detector - WARNING - No ID cards detected in 
im12.png +2025-08-05 19:30:01,039 - model.roboflow_id_detector - INFO - Processing 4/15: im13.png +2025-08-05 19:30:04,856 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im13.png +2025-08-05 19:30:04,856 - model.roboflow_id_detector - WARNING - No ID cards detected in im13.png +2025-08-05 19:30:05,860 - model.roboflow_id_detector - INFO - Processing 5/15: im14.png +2025-08-05 19:30:08,314 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im14.png +2025-08-05 19:30:08,314 - model.roboflow_id_detector - WARNING - No ID cards detected in im14.png +2025-08-05 19:30:09,327 - model.roboflow_id_detector - INFO - Processing 6/15: im15.png +2025-08-05 19:30:11,459 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im15.png +2025-08-05 19:30:11,460 - model.roboflow_id_detector - WARNING - No ID cards detected in im15.png +2025-08-05 19:30:12,468 - model.roboflow_id_detector - INFO - Processing 7/15: im1_.png +2025-08-05 19:30:19,295 - model.roboflow_id_detector - INFO - Found 1 ID card detections in im1_.png +2025-08-05 19:30:19,535 - model.roboflow_id_detector - INFO - Saved cropped image to output\roboflow_detections\im1__card_1.jpg +2025-08-05 19:30:19,847 - model.roboflow_id_detector - INFO - Processed im1_.png: 1 cards cropped +2025-08-05 19:30:20,874 - model.roboflow_id_detector - INFO - Processing 8/15: im2.png +2025-08-05 19:30:23,435 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im2.png +2025-08-05 19:30:23,436 - model.roboflow_id_detector - WARNING - No ID cards detected in im2.png +2025-08-05 19:30:24,439 - model.roboflow_id_detector - INFO - Processing 9/15: im3.png +2025-08-05 19:30:26,301 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im3.png +2025-08-05 19:30:26,301 - model.roboflow_id_detector - WARNING - No ID cards detected in im3.png +2025-08-05 19:30:27,302 - model.roboflow_id_detector - INFO - Processing 10/15: im4.png +2025-08-05 19:30:29,921 - 
model.roboflow_id_detector - INFO - Found 0 ID card detections in im4.png +2025-08-05 19:30:29,921 - model.roboflow_id_detector - WARNING - No ID cards detected in im4.png +2025-08-05 19:30:30,931 - model.roboflow_id_detector - INFO - Processing 11/15: im5.png +2025-08-05 19:30:33,293 - model.roboflow_id_detector - INFO - Found 1 ID card detections in im5.png +2025-08-05 19:30:33,313 - model.roboflow_id_detector - INFO - Saved cropped image to output\roboflow_detections\im5_card_1.jpg +2025-08-05 19:30:33,335 - model.roboflow_id_detector - INFO - Processed im5.png: 1 cards cropped +2025-08-05 19:30:34,336 - model.roboflow_id_detector - INFO - Processing 12/15: im6.png +2025-08-05 19:30:37,885 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im6.png +2025-08-05 19:30:37,886 - model.roboflow_id_detector - WARNING - No ID cards detected in im6.png +2025-08-05 19:30:38,894 - model.roboflow_id_detector - INFO - Processing 13/15: im7.png +2025-08-05 19:30:40,956 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im7.png +2025-08-05 19:30:40,956 - model.roboflow_id_detector - WARNING - No ID cards detected in im7.png +2025-08-05 19:30:41,964 - model.roboflow_id_detector - INFO - Processing 14/15: im8.png +2025-08-05 19:30:45,927 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im8.png +2025-08-05 19:30:45,927 - model.roboflow_id_detector - WARNING - No ID cards detected in im8.png +2025-08-05 19:30:46,935 - model.roboflow_id_detector - INFO - Processing 15/15: im9.png +2025-08-05 19:30:49,393 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im9.png +2025-08-05 19:30:49,393 - model.roboflow_id_detector - WARNING - No ID cards detected in im9.png +2025-08-05 19:30:49,393 - model.roboflow_id_detector - INFO - Batch processing completed: +2025-08-05 19:30:49,394 - model.roboflow_id_detector - INFO - - Total images: 15 +2025-08-05 19:30:49,394 - model.roboflow_id_detector - INFO - - Processed: 3 
+2025-08-05 19:30:49,394 - model.roboflow_id_detector - INFO - - Total detections: 4
+2025-08-05 19:30:49,394 - model.roboflow_id_detector - INFO - - Total cropped: 4
+2025-08-05 19:30:49,395 - __main__ - INFO - Batch processing completed!
+2025-08-05 19:30:49,395 - __main__ - INFO - - Total images: 15
+2025-08-05 19:30:49,395 - __main__ - INFO - - Processed: 3
+2025-08-05 19:30:49,395 - __main__ - INFO - - Total detections: 4
+2025-08-05 19:30:49,395 - __main__ - INFO - - Total cropped: 4
+2025-08-05 19:30:49,396 - __main__ - INFO - Processing summary saved to: output\roboflow_detections\processing_summary.txt
+2025-08-05 19:30:49,398 - __main__ - INFO - Processing completed successfully!
diff --git a/script/id_card_cropper.py b/script/id_card_cropper.py
new file mode 100644
index 0000000..5dc78ef
--- /dev/null
+++ b/script/id_card_cropper.py
@@ -0,0 +1,133 @@
+#!/usr/bin/env python3
+"""
+Simple ID Card Cropper using Roboflow API
+Input: folder containing images
+Output: folder with cropped ID cards
+"""
+import sys
+import yaml
+from pathlib import Path
+import logging
+import argparse
+
+# Add src to path (src/ sits at the repository root, one level above script/)
+sys.path.append(str(Path(__file__).resolve().parent.parent / "src"))
+
+from model.roboflow_id_detector import RoboflowIDDetector
+
+def setup_logging():
+    """Setup basic logging"""
+    logging.basicConfig(
+        level=logging.INFO,
+        format='%(asctime)s - %(levelname)s - %(message)s'
+    )
+
+def crop_id_cards(input_folder: str, output_folder: str, api_key: str = "Pkz4puRA0Cy3xMOuNoNr"):
+    """
+    Crop ID cards from all images in input folder
+
+    Args:
+        input_folder: Path to input folder containing images
+        output_folder: Path to output folder for cropped ID cards
+        api_key: Roboflow API key
+    """
+    logger = logging.getLogger(__name__)
+
+    # Convert to Path objects
+    input_path = Path(input_folder)
+    output_path = Path(output_folder)
+
+    # Check if input folder exists
+    if not input_path.exists():
+        logger.error(f"Input folder not found: {input_folder}")
+        return False
+
+    # Create
output folder
+    output_path.mkdir(parents=True, exist_ok=True)
+
+    # Initialize detector
+    detector = RoboflowIDDetector(
+        api_key=api_key,
+        model_id="french-card-id-detect",
+        version=3,
+        confidence=0.5
+    )
+
+    # Get all image files
+    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}
+    image_files = []
+
+    for file_path in input_path.rglob('*'):
+        if file_path.is_file() and file_path.suffix.lower() in image_extensions:
+            image_files.append(file_path)
+
+    if not image_files:
+        logger.error(f"No images found in {input_folder}")
+        return False
+
+    logger.info(f"Found {len(image_files)} images to process")
+
+    # Process each image
+    total_cropped = 0
+
+    for i, image_path in enumerate(image_files, 1):
+        logger.info(f"Processing {i}/{len(image_files)}: {image_path.name}")
+
+        # Detect ID cards
+        detections = detector.detect_id_cards(image_path)
+
+        if not detections:
+            logger.warning(f"No ID cards detected in {image_path.name}")
+            # Fall through (no `continue`) so the delay below still runs
+
+        # Crop each detected ID card
+        for j, detection in enumerate(detections):
+            bbox = detection['bbox']
+
+            # Create output filename
+            stem = image_path.stem
+            suffix = f"_card_{j+1}.jpg"
+            output_file = output_path / f"{stem}{suffix}"
+
+            # Crop ID card
+            cropped = detector.crop_id_card(image_path, bbox, output_file)
+
+            if cropped is not None:
+                total_cropped += 1
+                logger.info(f" ✓ Cropped card {j+1} to {output_file.name}")
+
+        # Add delay between requests
+        if i < len(image_files):
+            import time
+            time.sleep(1.0)
+
+    logger.info(f"Processing completed!
Total ID cards cropped: {total_cropped}")
+    return True
+
+def main():
+    """Main function"""
+    parser = argparse.ArgumentParser(description='Crop ID cards from images using Roboflow API')
+    parser.add_argument('input_folder', help='Input folder containing images')
+    parser.add_argument('output_folder', help='Output folder for cropped ID cards')
+    parser.add_argument('--api-key', default="Pkz4puRA0Cy3xMOuNoNr",
+                        help='Roboflow API key (default: demo key)')
+
+    args = parser.parse_args()
+
+    # Setup logging
+    setup_logging()
+
+    # Process images
+    success = crop_id_cards(args.input_folder, args.output_folder, args.api_key)
+
+    if success:
+        print(f"\n✓ Successfully processed images from '{args.input_folder}'")
+        print(f"✓ Cropped ID cards saved to '{args.output_folder}'")
+    else:
+        print(f"\n✗ Failed to process images")
+        return 1
+
+    return 0
+
+if __name__ == "__main__":
+    sys.exit(main())
\ No newline at end of file
diff --git a/src/model/__init__.py b/src/model/__init__.py
index ecb9162..bdef87a 100644
--- a/src/model/__init__.py
+++ b/src/model/__init__.py
@@ -1,8 +1,7 @@
 """
-Model module for YOLO-based ID card detection and cropping
+Model module for Roboflow-based ID card detection and cropping
 """
 
-from .yolo_detector import YOLODetector
-from .id_card_processor import IDCardProcessor
+from .roboflow_id_detector import RoboflowIDDetector
 
-__all__ = ['YOLODetector', 'IDCardProcessor']
\ No newline at end of file
+__all__ = ['RoboflowIDDetector']
\ No newline at end of file
diff --git a/src/model/__pycache__/__init__.cpython-313.pyc b/src/model/__pycache__/__init__.cpython-313.pyc
new file mode 100644
index 0000000..d7e494e
Binary files /dev/null and b/src/model/__pycache__/__init__.cpython-313.pyc differ
diff --git a/src/model/__pycache__/__init__.cpython-39.pyc b/src/model/__pycache__/__init__.cpython-39.pyc
index 231b3dd..232e191 100644
Binary files a/src/model/__pycache__/__init__.cpython-39.pyc and b/src/model/__pycache__/__init__.cpython-39.pyc differ
diff
--git a/src/model/__pycache__/roboflow_id_detector.cpython-313.pyc b/src/model/__pycache__/roboflow_id_detector.cpython-313.pyc
new file mode 100644
index 0000000..78cb388
Binary files /dev/null and b/src/model/__pycache__/roboflow_id_detector.cpython-313.pyc differ
diff --git a/src/model/__pycache__/roboflow_id_detector.cpython-39.pyc b/src/model/__pycache__/roboflow_id_detector.cpython-39.pyc
new file mode 100644
index 0000000..3138a0d
Binary files /dev/null and b/src/model/__pycache__/roboflow_id_detector.cpython-39.pyc differ
diff --git a/src/model/__pycache__/yolo_detector.cpython-39.pyc b/src/model/__pycache__/yolo_detector.cpython-39.pyc
index a67f54a..300f3a9 100644
Binary files a/src/model/__pycache__/yolo_detector.cpython-39.pyc and b/src/model/__pycache__/yolo_detector.cpython-39.pyc differ
diff --git a/src/model/yolo_detector.py b/src/model/roboflow_id_detector.py
similarity index 62%
rename from src/model/yolo_detector.py
rename to src/model/roboflow_id_detector.py
index b1974bf..f703d4a 100644
--- a/src/model/yolo_detector.py
+++ b/src/model/roboflow_id_detector.py
@@ -1,46 +1,104 @@
 """
-YOLO Detector for ID Card Detection and Cropping
+Roboflow ID Card Detector using French Card ID Detection Model
 """
 import cv2
 import numpy as np
 from pathlib import Path
 from typing import List, Tuple, Optional, Dict, Any
 import logging
-from ultralytics import YOLO
-import torch
+import requests
+import base64
+import json
+import time
+from urllib.parse import quote
 
-class YOLODetector:
+class RoboflowIDDetector:
     """
-    YOLO-based detector for ID card detection and cropping
+    Roboflow-based detector for French ID card detection using the french-card-id-detect model
     """
 
-    def __init__(self, model_path: Optional[str] = None, confidence: float = 0.5):
+    def __init__(self, api_key: str, model_id: str = "french-card-id-detect",
+                 version: int = 3, confidence: float = 0.5):
         """
-        Initialize YOLO detector
+        Initialize Roboflow ID detector
 
         Args:
-            model_path: Path to YOLO
model file (.pt)
+            api_key: Roboflow API key
+            model_id: Model identifier (default: french-card-id-detect)
+            version: Model version (default: 3)
             confidence: Confidence threshold for detection
         """
+        self.api_key = api_key
+        self.model_id = model_id
+        self.version = version
         self.confidence = confidence
         self.logger = logging.getLogger(__name__)
 
-        # Initialize model
-        if model_path and Path(model_path).exists():
-            self.model = YOLO(model_path)
-            self.logger.info(f"Loaded custom YOLO model from {model_path}")
-        else:
-            # Use pre-trained YOLO model for general object detection
-            self.model = YOLO('yolov8n.pt')
-            self.logger.info("Using pre-trained YOLOv8n model")
+        # API endpoint
+        self.api_url = f"https://serverless.roboflow.com/{model_id}/{version}"
 
-        # Set device
-        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
-        self.logger.info(f"Using device: {self.device}")
+        self.logger.info(f"Initialized Roboflow ID detector with model: {model_id}/{version}")
+
+    def _encode_image(self, image_path: Path) -> Optional[str]:
+        """
+        Encode image to base64
+
+        Args:
+            image_path: Path to image file
+
+        Returns:
+            Base64 encoded image string, or None if the file could not be read
+        """
+        try:
+            with open(image_path, "rb") as image_file:
+                encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
+                return encoded_string
+        except Exception as e:
+            self.logger.error(f"Error encoding image {image_path}: {e}")
+            return None
+
+    def _make_api_request(self, image_data: str, image_name: str = "image.jpg") -> Optional[Dict]:
+        """
+        Make API request to Roboflow
+
+        Args:
+            image_data: Base64 encoded image data
+            image_name: Name of the image file
+
+        Returns:
+            API response as dictionary
+        """
+        try:
+            headers = {
+                'Content-Type': 'application/x-www-form-urlencoded'
+            }
+
+            params = {
+                'api_key': self.api_key,
+                'name': image_name
+            }
+
+            response = requests.post(
+                self.api_url,
+                params=params,
+                data=image_data,
+                headers=headers,
+                timeout=30
+            )
+
+            if response.status_code == 200:
+                return response.json()
+
else:
+                self.logger.error(f"API request failed with status {response.status_code}: {response.text}")
+                return None
+
+        except Exception as e:
+            self.logger.error(f"Error making API request: {e}")
+            return None
 
     def detect_id_cards(self, image_path: Path) -> List[Dict[str, Any]]:
         """
-        Detect ID cards in an image
+        Detect ID cards in an image using Roboflow API
 
         Args:
             image_path: Path to image file
@@ -49,39 +107,50 @@ class YOLODetector:
             List of detection results with bounding boxes
         """
         try:
-            # Load image
-            image = cv2.imread(str(image_path))
-            if image is None:
-                self.logger.error(f"Could not load image: {image_path}")
+            # Encode image
+            image_data = self._encode_image(image_path)
+            if not image_data:
                 return []
 
-            # Run detection
-            results = self.model(image, conf=self.confidence)
+            # Make API request
+            response = self._make_api_request(image_data, image_path.name)
+            if not response:
+                return []
 
             detections = []
-            for result in results:
-                boxes = result.boxes
-                if boxes is not None:
-                    for box in boxes:
-                        # Get coordinates
-                        x1, y1, x2, y2 = box.xyxy[0].cpu().numpy()
-                        confidence = float(box.conf[0])
-                        class_id = int(box.cls[0])
-                        class_name = self.model.names[class_id]
-
-                        detection = {
-                            'bbox': [int(x1), int(y1), int(x2), int(y2)],
-                            'confidence': confidence,
-                            'class_id': class_id,
-                            'class_name': class_name,
-                            'area': (x2 - x1) * (y2 - y1)
-                        }
-                        detections.append(detection)
 
-            # Sort by confidence and area (prefer larger, more confident detections)
+            # Parse predictions from response
+            if 'predictions' in response:
+                for prediction in response['predictions']:
+                    # Check confidence threshold
+                    if prediction.get('confidence', 0) < self.confidence:
+                        continue
+
+                    # Extract bounding box coordinates
+                    x = prediction.get('x', 0)
+                    y = prediction.get('y', 0)
+                    width = prediction.get('width', 0)
+                    height = prediction.get('height', 0)
+
+                    # Convert to [x1, y1, x2, y2] format
+                    x1 = int(x - width / 2)
+                    y1 = int(y - height / 2)
+                    x2 = int(x + width / 2)
+                    y2 = int(y + height
/ 2)
+
+                    detection = {
+                        'bbox': [x1, y1, x2, y2],
+                        'confidence': prediction.get('confidence', 0),
+                        'class_id': prediction.get('class_id', 0),
+                        'class_name': prediction.get('class', 'id_card'),
+                        'area': width * height
+                    }
+                    detections.append(detection)
+
+            # Sort by confidence and area
             detections.sort(key=lambda x: (x['confidence'], x['area']), reverse=True)
 
-            self.logger.info(f"Found {len(detections)} detections in {image_path.name}")
+            self.logger.info(f"Found {len(detections)} ID card detections in {image_path.name}")
             return detections
 
         except Exception as e:
@@ -201,7 +270,7 @@ class YOLODetector:
         return result
 
     def batch_process(self, input_dir: Path, output_dir: Path,
-                      save_annotated: bool = False) -> Dict[str, Any]:
+                      save_annotated: bool = False, delay: float = 1.0) -> Dict[str, Any]:
         """
         Process all images in a directory and subdirectories
 
@@ -209,6 +278,7 @@ class YOLODetector:
             input_dir: Input directory containing images
             output_dir: Output directory for cropped images
             save_annotated: Whether to save annotated images
+            delay: Delay between API requests (seconds)
 
         Returns:
             Batch processing results
@@ -216,11 +286,10 @@ class YOLODetector:
         # Create output directory
         output_dir.mkdir(parents=True, exist_ok=True)
 
-        # Get all image files recursively from input directory and subdirectories
+        # Get all image files recursively
         image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}
         image_files = []
 
-        # Recursively find all image files
         for file_path in input_dir.rglob('*'):
             if file_path.is_file() and file_path.suffix.lower() in image_extensions:
                 image_files.append(file_path)
@@ -255,6 +324,10 @@ class YOLODetector:
                 results['processed_images'] += 1
                 results['total_detections'] += len(result['detections'])
                 results['total_cropped'] += len(result['cropped_paths'])
+
+            # Add delay between requests to avoid rate limiting
+            if i < len(image_files) - 1:  # Don't delay after the last image
+                time.sleep(delay)
 
         # Summary
         self.logger.info(f"Batch processing completed:")
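
Note on the box conversion in `detect_id_cards`: the hunk above treats each prediction's `x` and `y` as the box center and derives `[x1, y1, x2, y2]` by subtracting and adding half of `width` and `height`. The sketch below isolates that arithmetic so it can be checked on its own. The helper name `prediction_to_bbox`, the `image_width`/`image_height` parameters, and the clamping step are hypothetical additions for illustration; the patch itself does not clamp, and the prediction field names are taken only from the diff.

```python
from typing import Dict, List


def prediction_to_bbox(prediction: Dict[str, float],
                       image_width: int,
                       image_height: int) -> List[int]:
    """Convert a center-based box (x, y, width, height) to [x1, y1, x2, y2].

    Hypothetical helper: it mirrors the arithmetic in the diff, then clamps
    the result to the image bounds (the clamping is an assumption, not part
    of the patch).
    """
    x = prediction.get("x", 0)
    y = prediction.get("y", 0)
    width = prediction.get("width", 0)
    height = prediction.get("height", 0)

    # Same arithmetic as the patch: center minus/plus half the size
    x1 = int(x - width / 2)
    y1 = int(y - height / 2)
    x2 = int(x + width / 2)
    y2 = int(y + height / 2)

    # Assumed extra step: keep the box inside the image so a later crop
    # never receives negative or out-of-range coordinates
    x1 = max(0, min(x1, image_width))
    y1 = max(0, min(y1, image_height))
    x2 = max(0, min(x2, image_width))
    y2 = max(0, min(y2, image_height))

    return [x1, y1, x2, y2]


if __name__ == "__main__":
    # A 40x20 box centered at (100, 50) in a 640x480 image
    print(prediction_to_bbox({"x": 100, "y": 50, "width": 40, "height": 20}, 640, 480))
```

Whether clamping belongs here or inside `crop_id_card` depends on how that method handles out-of-range coordinates and the `padding` setting, neither of which is visible in this diff.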