YOLO crop model

Nguyễn Phước Thành
2025-08-05 20:21:47 +07:00
parent 24060e4ce7
commit a4e7573dca
13 changed files with 420 additions and 470 deletions

56
README_ID_Card_Cropper.md Normal file

@@ -0,0 +1,56 @@
# ID Card Cropper
A simple script that crops ID cards out of images using the Roboflow API.
## Usage
```bash
python id_card_cropper.py input_folder output_folder
```
### Example:
```bash
# Use the default API key
python id_card_cropper.py data/IDcards/Archive output/cropped_cards
# Use a custom API key
python id_card_cropper.py data/IDcards/Archive output/cropped_cards --api-key YOUR_API_KEY
```
## Parameters
- `input_folder`: Folder containing the images to process (searched recursively)
- `output_folder`: Folder where the cropped ID cards are saved
- `--api-key`: Roboflow API key (default: the demo key)
## Supported image formats
- JPG/JPEG
- PNG
- BMP
- TIFF
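
The format filter can be sketched as follows. This is a minimal sketch, assuming files are matched by lowercase extension and searched recursively, as the script in this commit does; the helper name `find_images` is hypothetical and not part of the script.

```python
from pathlib import Path

# Extensions accepted by the cropper (compared case-insensitively).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff"}


def find_images(folder):
    """Return all image files under `folder`, searched recursively and sorted."""
    root = Path(folder)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

With this extension set, a file named `scan.tif` would be skipped, because only `.tiff` is listed.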
## Results
The script will:
1. Find all images in the input folder
2. Detect ID cards in each image
3. Crop the ID cards and save them to the output folder
4. Name each file using the format `{original_image_name}_card_{number}.jpg`
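
The naming rule in step 4 can be sketched as below. This is an illustrative sketch only: the function name `cropped_name` is hypothetical, and it assumes the card number starts at 1 and that the original extension is always replaced with `.jpg`.

```python
from pathlib import Path


def cropped_name(image_path, index):
    """Build the output file name for the index-th card (1-based) of an image."""
    return f"{Path(image_path).stem}_card_{index}.jpg"


print(cropped_name("data/IDcards/Archive/im11.png", 2))  # im11_card_2.jpg
print(cropped_name("data/IDcards/Archive/im1_.png", 1))  # im1__card_1.jpg
```

The second call shows why a source image named `im1_.png` produces a double underscore in the output name.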
## Example output
```
output/cropped_cards/
├── im1__card_1.jpg
├── im5_card_1.jpg
├── im11_card_1.jpg
└── im11_card_2.jpg
```
## Notes
- An internet connection is required to use the Roboflow API
- There is a 1-second delay between requests to avoid rate limiting
- Only the cropped ID cards are saved; the original images with bounding boxes are not
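
The delay between requests can be sketched roughly as follows. This is a hedged illustration, not the script's actual code: `process_with_delay` and its `handler` and `sleep` parameters are hypothetical names, and the injectable `sleep` exists only so the pause can be tested without waiting.

```python
import time


def process_with_delay(items, handler, delay=1.0, sleep=time.sleep):
    """Call `handler` on each item, pausing `delay` seconds between calls."""
    results = []
    for i, item in enumerate(items):
        if i > 0:
            sleep(delay)  # pause before every request except the first
        results.append(handler(item))
    return results
```

Sleeping before each request, rather than after it, means the pause cannot be skipped by an early exit inside the handler.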


@@ -1,183 +0,0 @@
# ID Card Processing with YOLO Detection
A system for processing ID cards that uses YOLO to detect and crop them, combined with preprocessing methods that clean the background and enhance image quality.
## Main features
- **YOLO Detection**: Detects and crops ID cards from the original images
- **Background Removal**: Three background removal methods (GrabCut, Threshold, Contour)
- **Image Enhancement**: Improves image quality for OCR
- **Batch Processing**: Processes images in bulk
- **Flexible Pipeline**: Each step can be run separately
## Installation
1. Install dependencies:
```bash
pip install -r requirements.txt
```
2. Directory structure:
```
OCR/
├── src/
│   ├── model/
│   │   ├── __init__.py
│   │   ├── yolo_detector.py
│   │   └── id_card_processor.py
│   └── ...
├── data/
│   ├── IDcards/              # Original ID card images
│   └── processed_id_cards/   # Output directory
├── id_card_processor_main.py
└── requirements.txt
```
## Usage
### 1. Full Pipeline (Detect + Preprocess)
```bash
python id_card_processor_main.py \
--input-dir "data/IDcards" \
--output-dir "data/processed_id_cards" \
--confidence 0.5 \
--bg-removal grabcut \
--target-size 800x600 \
--save-annotated
```
### 2. Detect and Crop Only
```bash
python id_card_processor_main.py \
--input-dir "data/IDcards" \
--output-dir "data/processed_id_cards" \
--detect-only \
--save-annotated
```
### 3. Preprocess Only (skip detection)
```bash
python id_card_processor_main.py \
--input-dir "data/IDcards" \
--output-dir "data/processed_id_cards" \
--preprocess-only \
--bg-removal threshold \
--target-size 800x600
```
## Parameters
### Detection Parameters
- `--model-path`: Path to a custom YOLO model (.pt file)
- `--confidence`: Confidence threshold for detection (default: 0.5)
### Preprocessing Parameters
- `--bg-removal`: Background removal method (default: `grabcut`)
  - `grabcut`: Uses the GrabCut algorithm (recommended)
  - `threshold`: Uses thresholding
  - `contour`: Uses contour detection
  - `none`: Does not remove the background
- `--target-size`: Normalized output size as `WIDTHxHEIGHT` (default: `800x600`)
### Output Options
- `--save-annotated`: Save images with bounding boxes drawn
- `--detect-only`: Run detection only
- `--preprocess-only`: Run preprocessing only
## Output Structure
```
data/processed_id_cards/
├── cropped/      # Images cropped by YOLO
│   ├── image1_card_1.jpg
│   ├── image1_card_2.jpg
│   └── ...
├── processed/    # Preprocessed images
│   ├── image1_card_1_processed.jpg
│   ├── image1_card_2_processed.jpg
│   └── ...
└── annotated/    # Images with bounding boxes (if enabled)
    ├── image1_annotated.jpg
    └── ...
```
## Usage examples
### Example 1: Process the entire dataset
```bash
# Process all images in the IDcards directory
python id_card_processor_main.py \
--input-dir "data/IDcards" \
--output-dir "data/processed_id_cards" \
--confidence 0.6 \
--bg-removal grabcut \
--target-size 1024x768 \
--save-annotated
```
### Example 2: Test with a few images
```bash
# Create a test directory for a few images
mkdir -p data/test_images
# Copy a few images into test_images
# Run detection
python id_card_processor_main.py \
--input-dir "data/test_images" \
--output-dir "data/test_output" \
--detect-only \
--save-annotated
```
### Example 3: Use a custom model
```bash
# If you have a custom-trained YOLO model
python id_card_processor_main.py \
--input-dir "data/IDcards" \
--output-dir "data/processed_id_cards" \
--model-path "models/custom_id_card_model.pt" \
--confidence 0.7
```
## Notes
1. **YOLO Model**: By default the pre-trained YOLOv8n model is used. If you have a better custom model, pass it with `--model-path`
2. **Background Removal**:
   - `grabcut`: Best for ID cards with complex backgrounds
   - `threshold`: Fast; suited to simple backgrounds
   - `contour`: Suited to ID cards with clear borders
3. **Performance**:
   - Use a GPU if available to speed up detection
   - Adjust `--confidence` to balance precision and recall
4. **Memory**: For large datasets, you may need more memory or smaller processing batches
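
The effect of `--confidence` on precision and recall can be illustrated with a small sketch. The detections below are made-up values for illustration only, and `filter_by_confidence` is a hypothetical helper; only the `bbox` and `confidence` keys follow the detection dictionaries used in this project.

```python
def filter_by_confidence(detections, threshold):
    """Keep detections whose confidence is at least `threshold`."""
    return [d for d in detections if d["confidence"] >= threshold]


# Made-up detections for illustration only.
detections = [
    {"bbox": [10, 10, 200, 120], "confidence": 0.82},
    {"bbox": [15, 300, 210, 410], "confidence": 0.55},
    {"bbox": [400, 50, 420, 60], "confidence": 0.31},
]

print(len(filter_by_confidence(detections, 0.3)))  # 3
print(len(filter_by_confidence(detections, 0.7)))  # 1
```

A lower threshold keeps more candidates (higher recall, more false positives); a higher threshold keeps fewer (higher precision, more misses).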
## Troubleshooting
### Common errors
1. **No detections found**:
   - Lower `--confidence` to 0.3-0.4
   - Check the quality of the input images
2. **Memory error**:
   - Reduce the batch size or process one image at a time
   - Use the CPU instead of the GPU
3. **Poor background removal**:
   - Try the different methods: `grabcut`, `threshold`, `contour`
   - Adjust the parameters in the code
### Debug mode
```bash
python id_card_processor_main.py \
--input-dir "data/IDcards" \
--output-dir "data/processed_id_cards" \
--log-level DEBUG
```


@@ -0,0 +1,40 @@
# Roboflow ID Card Detection Configuration

# API Configuration
api:
  key: "Pkz4puRA0Cy3xMOuNoNr"  # Your Roboflow API key
  model_id: "french-card-id-detect"
  version: 3
  confidence: 0.5
  timeout: 30  # seconds

# Processing Configuration
processing:
  input_dir: "data/IDcards"
  output_dir: "output/roboflow_detections"
  save_annotated: true
  delay_between_requests: 1.0  # seconds
  padding: 10  # pixels around detected cards

# Supported image formats
supported_formats:
  - ".jpg"
  - ".jpeg"
  - ".png"
  - ".bmp"
  - ".tiff"

# Logging configuration
logging:
  level: "INFO"  # DEBUG, INFO, WARNING, ERROR
  format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
  handlers:
    - type: "file"
      filename: "logs/roboflow_detector.log"
    - type: "console"

# Performance settings
performance:
  batch_size: 1  # Process one image at a time due to API limits
  max_retries: 3
  retry_delay: 2.0  # seconds
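
The `max_retries` and `retry_delay` settings suggest a retry loop around each API call. The following is a minimal sketch of one way such settings could be applied, not code from this commit: `call_with_retries` is a hypothetical helper, and it assumes `max_retries` counts retries after the first attempt.

```python
import time


def call_with_retries(func, max_retries=3, retry_delay=2.0, sleep=time.sleep):
    """Call `func`; on an exception, retry up to `max_retries` more times."""
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception as exc:  # broad on purpose for this sketch
            last_error = exc
            if attempt < max_retries:
                sleep(retry_delay)  # wait before the next attempt
    raise last_error
```

In real use, `func` would wrap a single detection request, and only transient failures (timeouts, HTTP 429 or 5xx responses) would normally be worth retrying.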


@@ -1,234 +0,0 @@
"""
Main script for ID Card Processing with YOLO Detection
"""
import argparse
import sys
from pathlib import Path
from typing import Dict, Any
import logging

# Add src to path for imports
sys.path.append(str(Path(__file__).parent / "src"))
from src.model.yolo_detector import YOLODetector
from src.model.id_card_processor import IDCardProcessor
from src.utils import setup_logging


def parse_arguments():
    """Parse command line arguments"""
    parser = argparse.ArgumentParser(description="ID Card Processing with YOLO Detection")
    parser.add_argument(
        "--input-dir",
        type=str,
        required=True,
        help="Input directory containing ID card images"
    )
    parser.add_argument(
        "--output-dir",
        type=str,
        default="data/processed_id_cards",
        help="Output directory for processed images"
    )
    parser.add_argument(
        "--model-path",
        type=str,
        help="Path to custom YOLO model (.pt file)"
    )
    parser.add_argument(
        "--confidence",
        type=float,
        default=0.5,
        help="Confidence threshold for YOLO detection"
    )
    parser.add_argument(
        "--detect-only",
        action="store_true",
        help="Only detect and crop ID cards, skip preprocessing"
    )
    parser.add_argument(
        "--preprocess-only",
        action="store_true",
        help="Skip detection, directly preprocess images"
    )
    parser.add_argument(
        "--bg-removal",
        type=str,
        default="grabcut",
        choices=["grabcut", "threshold", "contour", "none"],
        help="Background removal method"
    )
    parser.add_argument(
        "--target-size",
        type=str,
        default="800x600",
        help="Target size for normalization (width x height)"
    )
    parser.add_argument(
        "--save-annotated",
        action="store_true",
        help="Save annotated images with bounding boxes"
    )
    parser.add_argument(
        "--log-level",
        type=str,
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
        help="Logging level"
    )
    return parser.parse_args()


def parse_size(size_str: str) -> tuple:
    """Parse size string like '800x600' to tuple (800, 600)"""
    try:
        width, height = map(int, size_str.split('x'))
        return (width, height)
    except ValueError:
        print(f"Invalid size format: {size_str}. Expected format: widthxheight")
        sys.exit(1)


def main():
    """Main function"""
    args = parse_arguments()

    # Setup logging
    logging_config = {"level": args.log_level}
    logger = setup_logging(logging_config.get("level", "INFO"))
    logger.info("Starting ID Card Processing")

    # Parse paths
    input_dir = Path(args.input_dir)
    output_dir = Path(args.output_dir)

    # Check if input directory exists
    if not input_dir.exists():
        logger.error(f"Input directory does not exist: {input_dir}")
        sys.exit(1)

    # Create output directory
    output_dir.mkdir(parents=True, exist_ok=True)

    # Parse target size
    target_size = parse_size(args.target_size)

    # Initialize YOLO detector
    logger.info("Initializing YOLO detector...")
    yolo_detector = YOLODetector(
        model_path=args.model_path,
        confidence=args.confidence
    )

    # Initialize ID card processor
    logger.info("Initializing ID card processor...")
    id_processor = IDCardProcessor(yolo_detector)

    if args.detect_only:
        # Only detect and crop ID cards
        logger.info("Running YOLO detection only...")
        results = yolo_detector.batch_process(
            input_dir,
            output_dir / "cropped",
            save_annotated=args.save_annotated
        )
        print("\n" + "="*50)
        print("YOLO DETECTION RESULTS")
        print("="*50)
        print(f"Total images: {results['total_images']}")
        print(f"Processed images: {results['processed_images']}")
        print(f"Total detections: {results['total_detections']}")
        print(f"Total cropped: {results['total_cropped']}")
        print(f"Output directory: {output_dir / 'cropped'}")
        print("="*50)
    elif args.preprocess_only:
        # Skip detection, directly preprocess
        logger.info("Running preprocessing only...")
        results = id_processor.batch_process_id_cards(
            input_dir,
            output_dir / "processed",
            detect_first=False,
            remove_bg=args.bg_removal != "none",
            enhance=True,
            normalize=True,
            target_size=target_size
        )
        print("\n" + "="*50)
        print("PREPROCESSING RESULTS")
        print("="*50)
        print(f"Total images: {results['total_images']}")
        print(f"Processed images: {results['processed_images']}")
        print(f"Output directory: {output_dir / 'processed'}")
        print("="*50)
    else:
        # Full pipeline: detect + preprocess
        logger.info("Running full pipeline: detection + preprocessing...")

        # Step 1: Detect and crop ID cards
        logger.info("Step 1: Detecting and cropping ID cards...")
        detection_results = yolo_detector.batch_process(
            input_dir,
            output_dir / "cropped",
            save_annotated=args.save_annotated
        )

        # Step 2: Preprocess cropped images
        cropped_dir = output_dir / "cropped"
        if cropped_dir.exists():
            logger.info("Step 2: Preprocessing cropped ID cards...")
            preprocessing_results = id_processor.batch_process_id_cards(
                cropped_dir,
                output_dir / "processed",
                detect_first=False,
                remove_bg=args.bg_removal != "none",
                enhance=True,
                normalize=True,
                target_size=target_size
            )
        else:
            logger.warning("No cropped images found, preprocessing original images")
            preprocessing_results = id_processor.batch_process_id_cards(
                input_dir,
                output_dir / "processed",
                detect_first=False,
                remove_bg=args.bg_removal != "none",
                enhance=True,
                normalize=True,
                target_size=target_size
            )

        # Print summary
        print("\n" + "="*50)
        print("FULL PIPELINE RESULTS")
        print("="*50)
        print("DETECTION PHASE:")
        print(f" - Total images: {detection_results['total_images']}")
        print(f" - Processed images: {detection_results['processed_images']}")
        print(f" - Total detections: {detection_results['total_detections']}")
        print(f" - Total cropped: {detection_results['total_cropped']}")
        print("\nPREPROCESSING PHASE:")
        print(f" - Total images: {preprocessing_results['total_images']}")
        print(f" - Processed images: {preprocessing_results['processed_images']}")
        print(f"\nOutput directories:")
        print(f" - Cropped images: {output_dir / 'cropped'}")
        print(f" - Processed images: {output_dir / 'processed'}")
        if args.save_annotated:
            print(f" - Annotated images: {output_dir / 'cropped'}")
        print("="*50)

    logger.info("ID Card Processing completed successfully")


if __name__ == "__main__":
    main()


@@ -0,0 +1,66 @@
2025-08-05 19:29:51,215 - model.roboflow_id_detector - INFO - Initialized Roboflow ID detector with model: french-card-id-detect/3
2025-08-05 19:29:51,215 - __main__ - INFO - Starting batch processing...
2025-08-05 19:29:51,215 - __main__ - INFO - Input directory: data\IDcards\Archive
2025-08-05 19:29:51,215 - __main__ - INFO - Output directory: output\roboflow_detections
2025-08-05 19:29:51,217 - model.roboflow_id_detector - INFO - Processing 15 images from data\IDcards\Archive and subdirectories
2025-08-05 19:29:51,217 - model.roboflow_id_detector - INFO - Processing 1/15: im10.png
2025-08-05 19:29:53,563 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im10.png
2025-08-05 19:29:53,563 - model.roboflow_id_detector - WARNING - No ID cards detected in im10.png
2025-08-05 19:29:54,567 - model.roboflow_id_detector - INFO - Processing 2/15: im11.png
2025-08-05 19:29:56,871 - model.roboflow_id_detector - INFO - Found 2 ID card detections in im11.png
2025-08-05 19:29:56,897 - model.roboflow_id_detector - INFO - Saved cropped image to output\roboflow_detections\im11_card_1.jpg
2025-08-05 19:29:56,906 - model.roboflow_id_detector - INFO - Saved cropped image to output\roboflow_detections\im11_card_2.jpg
2025-08-05 19:29:56,920 - model.roboflow_id_detector - INFO - Processed im11.png: 2 cards cropped
2025-08-05 19:29:57,928 - model.roboflow_id_detector - INFO - Processing 3/15: im12.png
2025-08-05 19:30:00,037 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im12.png
2025-08-05 19:30:00,037 - model.roboflow_id_detector - WARNING - No ID cards detected in im12.png
2025-08-05 19:30:01,039 - model.roboflow_id_detector - INFO - Processing 4/15: im13.png
2025-08-05 19:30:04,856 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im13.png
2025-08-05 19:30:04,856 - model.roboflow_id_detector - WARNING - No ID cards detected in im13.png
2025-08-05 19:30:05,860 - model.roboflow_id_detector - INFO - Processing 5/15: im14.png
2025-08-05 19:30:08,314 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im14.png
2025-08-05 19:30:08,314 - model.roboflow_id_detector - WARNING - No ID cards detected in im14.png
2025-08-05 19:30:09,327 - model.roboflow_id_detector - INFO - Processing 6/15: im15.png
2025-08-05 19:30:11,459 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im15.png
2025-08-05 19:30:11,460 - model.roboflow_id_detector - WARNING - No ID cards detected in im15.png
2025-08-05 19:30:12,468 - model.roboflow_id_detector - INFO - Processing 7/15: im1_.png
2025-08-05 19:30:19,295 - model.roboflow_id_detector - INFO - Found 1 ID card detections in im1_.png
2025-08-05 19:30:19,535 - model.roboflow_id_detector - INFO - Saved cropped image to output\roboflow_detections\im1__card_1.jpg
2025-08-05 19:30:19,847 - model.roboflow_id_detector - INFO - Processed im1_.png: 1 cards cropped
2025-08-05 19:30:20,874 - model.roboflow_id_detector - INFO - Processing 8/15: im2.png
2025-08-05 19:30:23,435 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im2.png
2025-08-05 19:30:23,436 - model.roboflow_id_detector - WARNING - No ID cards detected in im2.png
2025-08-05 19:30:24,439 - model.roboflow_id_detector - INFO - Processing 9/15: im3.png
2025-08-05 19:30:26,301 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im3.png
2025-08-05 19:30:26,301 - model.roboflow_id_detector - WARNING - No ID cards detected in im3.png
2025-08-05 19:30:27,302 - model.roboflow_id_detector - INFO - Processing 10/15: im4.png
2025-08-05 19:30:29,921 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im4.png
2025-08-05 19:30:29,921 - model.roboflow_id_detector - WARNING - No ID cards detected in im4.png
2025-08-05 19:30:30,931 - model.roboflow_id_detector - INFO - Processing 11/15: im5.png
2025-08-05 19:30:33,293 - model.roboflow_id_detector - INFO - Found 1 ID card detections in im5.png
2025-08-05 19:30:33,313 - model.roboflow_id_detector - INFO - Saved cropped image to output\roboflow_detections\im5_card_1.jpg
2025-08-05 19:30:33,335 - model.roboflow_id_detector - INFO - Processed im5.png: 1 cards cropped
2025-08-05 19:30:34,336 - model.roboflow_id_detector - INFO - Processing 12/15: im6.png
2025-08-05 19:30:37,885 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im6.png
2025-08-05 19:30:37,886 - model.roboflow_id_detector - WARNING - No ID cards detected in im6.png
2025-08-05 19:30:38,894 - model.roboflow_id_detector - INFO - Processing 13/15: im7.png
2025-08-05 19:30:40,956 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im7.png
2025-08-05 19:30:40,956 - model.roboflow_id_detector - WARNING - No ID cards detected in im7.png
2025-08-05 19:30:41,964 - model.roboflow_id_detector - INFO - Processing 14/15: im8.png
2025-08-05 19:30:45,927 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im8.png
2025-08-05 19:30:45,927 - model.roboflow_id_detector - WARNING - No ID cards detected in im8.png
2025-08-05 19:30:46,935 - model.roboflow_id_detector - INFO - Processing 15/15: im9.png
2025-08-05 19:30:49,393 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im9.png
2025-08-05 19:30:49,393 - model.roboflow_id_detector - WARNING - No ID cards detected in im9.png
2025-08-05 19:30:49,393 - model.roboflow_id_detector - INFO - Batch processing completed:
2025-08-05 19:30:49,394 - model.roboflow_id_detector - INFO - - Total images: 15
2025-08-05 19:30:49,394 - model.roboflow_id_detector - INFO - - Processed: 3
2025-08-05 19:30:49,394 - model.roboflow_id_detector - INFO - - Total detections: 4
2025-08-05 19:30:49,394 - model.roboflow_id_detector - INFO - - Total cropped: 4
2025-08-05 19:30:49,395 - __main__ - INFO - Batch processing completed!
2025-08-05 19:30:49,395 - __main__ - INFO - - Total images: 15
2025-08-05 19:30:49,395 - __main__ - INFO - - Processed: 3
2025-08-05 19:30:49,395 - __main__ - INFO - - Total detections: 4
2025-08-05 19:30:49,395 - __main__ - INFO - - Total cropped: 4
2025-08-05 19:30:49,396 - __main__ - INFO - Processing summary saved to: output\roboflow_detections\processing_summary.txt
2025-08-05 19:30:49,398 - __main__ - INFO - Processing completed successfully!
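
The per-image counts in a log like the one above can be extracted with a short script. This is a minimal sketch, assuming the log lines keep the `Found N ID card detections in NAME` wording shown above; `detections_per_image` is a hypothetical helper, not part of the commit.

```python
import re

# Matches lines such as "... Found 2 ID card detections in im11.png".
FOUND_RE = re.compile(r"Found (\d+) ID card detections in (\S+)")


def detections_per_image(log_text):
    """Map image name -> number of detections reported in the log."""
    return {m.group(2): int(m.group(1)) for m in FOUND_RE.finditer(log_text)}


sample = """\
2025-08-05 19:29:53,563 - model.roboflow_id_detector - INFO - Found 0 ID card detections in im10.png
2025-08-05 19:29:56,871 - model.roboflow_id_detector - INFO - Found 2 ID card detections in im11.png
"""
counts = detections_per_image(sample)
print(counts)  # {'im10.png': 0, 'im11.png': 2}
```

Applied to the full log, the non-zero entries are `im11.png` (2), `im1_.png` (1), and `im5.png` (1), which sum to the 4 total detections reported in the summary.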

133
script/id_card_cropper.py Normal file

@@ -0,0 +1,133 @@
#!/usr/bin/env python3
"""
Simple ID Card Cropper using Roboflow API
Input: folder containing images
Output: folder with cropped ID cards
"""
import argparse
import logging
import sys
import time
from pathlib import Path

# Add src to path
sys.path.append(str(Path(__file__).parent / "src"))
from model.roboflow_id_detector import RoboflowIDDetector


def setup_logging():
    """Setup basic logging"""
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(levelname)s - %(message)s'
    )


def crop_id_cards(input_folder: str, output_folder: str, api_key: str = "Pkz4puRA0Cy3xMOuNoNr"):
    """
    Crop ID cards from all images in input folder

    Args:
        input_folder: Path to input folder containing images
        output_folder: Path to output folder for cropped ID cards
        api_key: Roboflow API key

    Returns:
        True if the folder was processed, False if it is missing or has no images
    """
    logger = logging.getLogger(__name__)

    # Convert to Path objects
    input_path = Path(input_folder)
    output_path = Path(output_folder)

    # Check if input folder exists
    if not input_path.exists():
        logger.error(f"Input folder not found: {input_folder}")
        return False

    # Create output folder
    output_path.mkdir(parents=True, exist_ok=True)

    # Initialize detector
    detector = RoboflowIDDetector(
        api_key=api_key,
        model_id="french-card-id-detect",
        version=3,
        confidence=0.5
    )

    # Get all image files (recursive, case-insensitive extension match)
    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}
    image_files = []
    for file_path in input_path.rglob('*'):
        if file_path.is_file() and file_path.suffix.lower() in image_extensions:
            image_files.append(file_path)

    if not image_files:
        logger.error(f"No images found in {input_folder}")
        return False

    logger.info(f"Found {len(image_files)} images to process")

    # Process each image
    total_cropped = 0
    for i, image_path in enumerate(image_files, 1):
        logger.info(f"Processing {i}/{len(image_files)}: {image_path.name}")

        # Detect ID cards (one API request per image)
        detections = detector.detect_id_cards(image_path)
        if not detections:
            logger.warning(f"No ID cards detected in {image_path.name}")

        # Crop each detected ID card; card numbers start at 1
        for j, detection in enumerate(detections, 1):
            bbox = detection['bbox']

            # Create output filename: {stem}_card_{n}.jpg
            output_file = output_path / f"{image_path.stem}_card_{j}.jpg"

            # Crop ID card
            cropped = detector.crop_id_card(image_path, bbox, output_file)
            if cropped is not None:
                total_cropped += 1
                logger.info(f"  ✓ Cropped card {j} to {output_file.name}")

        # Delay between requests, also after images with no detections,
        # but not after the last image
        if i < len(image_files):
            time.sleep(1.0)

    logger.info(f"Processing completed! Total ID cards cropped: {total_cropped}")
    return True


def main():
    """Main function"""
    parser = argparse.ArgumentParser(description='Crop ID cards from images using Roboflow API')
    parser.add_argument('input_folder', help='Input folder containing images')
    parser.add_argument('output_folder', help='Output folder for cropped ID cards')
    parser.add_argument('--api-key', default="Pkz4puRA0Cy3xMOuNoNr",
                        help='Roboflow API key (default: demo key)')
    args = parser.parse_args()

    # Setup logging
    setup_logging()

    # Process images
    success = crop_id_cards(args.input_folder, args.output_folder, args.api_key)
    if success:
        print(f"\n✓ Successfully processed images from '{args.input_folder}'")
        print(f"✓ Cropped ID cards saved to '{args.output_folder}'")
    else:
        print("\n✗ Failed to process images")
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())


@@ -1,8 +1,7 @@
 """
-Model module for YOLO-based ID card detection and cropping
+Model module for Roboflow-based ID card detection and cropping
 """
-from .yolo_detector import YOLODetector
-from .id_card_processor import IDCardProcessor
+from .roboflow_id_detector import RoboflowIDDetector
-__all__ = ['YOLODetector', 'IDCardProcessor']
+__all__ = ['RoboflowIDDetector']

Binary file not shown.


@@ -1,46 +1,104 @@
""" """
YOLO Detector for ID Card Detection and Cropping Roboflow ID Card Detector using French Card ID Detection Model
""" """
import cv2 import cv2
import numpy as np import numpy as np
from pathlib import Path from pathlib import Path
from typing import List, Tuple, Optional, Dict, Any from typing import List, Tuple, Optional, Dict, Any
import logging import logging
from ultralytics import YOLO import requests
import torch import base64
import json
import time
from urllib.parse import quote
class YOLODetector: class RoboflowIDDetector:
""" """
YOLO-based detector for ID card detection and cropping Roboflow-based detector for French ID card detection using the french-card-id-detect model
""" """
def __init__(self, model_path: Optional[str] = None, confidence: float = 0.5): def __init__(self, api_key: str, model_id: str = "french-card-id-detect",
version: int = 3, confidence: float = 0.5):
""" """
Initialize YOLO detector Initialize Roboflow ID detector
Args: Args:
model_path: Path to YOLO model file (.pt) api_key: Roboflow API key
model_id: Model identifier (default: french-card-id-detect)
version: Model version (default: 3)
confidence: Confidence threshold for detection confidence: Confidence threshold for detection
""" """
self.api_key = api_key
self.model_id = model_id
self.version = version
self.confidence = confidence self.confidence = confidence
self.logger = logging.getLogger(__name__) self.logger = logging.getLogger(__name__)
# Initialize model # API endpoint
if model_path and Path(model_path).exists(): self.api_url = f"https://serverless.roboflow.com/{model_id}/{version}"
self.model = YOLO(model_path)
self.logger.info(f"Loaded custom YOLO model from {model_path}")
else:
# Use pre-trained YOLO model for general object detection
self.model = YOLO('yolov8n.pt')
self.logger.info("Using pre-trained YOLOv8n model")
# Set device self.logger.info(f"Initialized Roboflow ID detector with model: {model_id}/{version}")
self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
self.logger.info(f"Using device: {self.device}") def _encode_image(self, image_path: Path) -> str:
"""
Encode image to base64
Args:
image_path: Path to image file
Returns:
Base64 encoded image string
"""
try:
with open(image_path, "rb") as image_file:
encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
return encoded_string
except Exception as e:
self.logger.error(f"Error encoding image {image_path}: {e}")
return None
def _make_api_request(self, image_data: str, image_name: str = "image.jpg") -> Optional[Dict]:
"""
Make API request to Roboflow
Args:
image_data: Base64 encoded image data
image_name: Name of the image file
Returns:
API response as dictionary
"""
try:
headers = {
'Content-Type': 'application/x-www-form-urlencoded'
}
params = {
'api_key': self.api_key,
'name': image_name
}
response = requests.post(
self.api_url,
params=params,
data=image_data,
headers=headers,
timeout=30
)
if response.status_code == 200:
return response.json()
else:
self.logger.error(f"API request failed with status {response.status_code}: {response.text}")
return None
except Exception as e:
self.logger.error(f"Error making API request: {e}")
return None
def detect_id_cards(self, image_path: Path) -> List[Dict[str, Any]]: def detect_id_cards(self, image_path: Path) -> List[Dict[str, Any]]:
""" """
Detect ID cards in an image Detect ID cards in an image using Roboflow API
Args: Args:
image_path: Path to image file image_path: Path to image file
@@ -49,39 +107,50 @@ class YOLODetector:
List of detection results with bounding boxes List of detection results with bounding boxes
""" """
try: try:
# Load image # Encode image
image = cv2.imread(str(image_path)) image_data = self._encode_image(image_path)
if image is None: if not image_data:
self.logger.error(f"Could not load image: {image_path}")
return [] return []
# Run detection # Make API request
results = self.model(image, conf=self.confidence) response = self._make_api_request(image_data, image_path.name)
if not response:
return []
detections = [] detections = []
for result in results:
boxes = result.boxes
if boxes is not None:
for box in boxes:
# Get coordinates
x1, y1, x2, y2 = box.xyxy[0].cpu().numpy()
confidence = float(box.conf[0])
class_id = int(box.cls[0])
class_name = self.model.names[class_id]
detection = {
'bbox': [int(x1), int(y1), int(x2), int(y2)],
'confidence': confidence,
'class_id': class_id,
'class_name': class_name,
'area': (x2 - x1) * (y2 - y1)
}
detections.append(detection)
# Sort by confidence and area (prefer larger, more confident detections) # Parse predictions from response
if 'predictions' in response:
for prediction in response['predictions']:
# Check confidence threshold
if prediction.get('confidence', 0) < self.confidence:
continue
# Extract bounding box coordinates
x = prediction.get('x', 0)
y = prediction.get('y', 0)
width = prediction.get('width', 0)
height = prediction.get('height', 0)
# Convert to [x1, y1, x2, y2] format
x1 = int(x - width / 2)
y1 = int(y - height / 2)
x2 = int(x + width / 2)
y2 = int(y + height / 2)
detection = {
'bbox': [x1, y1, x2, y2],
'confidence': prediction.get('confidence', 0),
'class_id': prediction.get('class_id', 0),
'class_name': prediction.get('class', 'id_card'),
'area': width * height
}
detections.append(detection)
# Sort by confidence and area
detections.sort(key=lambda x: (x['confidence'], x['area']), reverse=True) detections.sort(key=lambda x: (x['confidence'], x['area']), reverse=True)
self.logger.info(f"Found {len(detections)} detections in {image_path.name}") self.logger.info(f"Found {len(detections)} ID card detections in {image_path.name}")
return detections return detections
except Exception as e: except Exception as e:
@@ -201,7 +270,7 @@ class YOLODetector:
return result return result
def batch_process(self, input_dir: Path, output_dir: Path, def batch_process(self, input_dir: Path, output_dir: Path,
save_annotated: bool = False) -> Dict[str, Any]: save_annotated: bool = False, delay: float = 1.0) -> Dict[str, Any]:
""" """
Process all images in a directory and subdirectories Process all images in a directory and subdirectories
@@ -209,6 +278,7 @@ class YOLODetector:
input_dir: Input directory containing images input_dir: Input directory containing images
output_dir: Output directory for cropped images output_dir: Output directory for cropped images
save_annotated: Whether to save annotated images save_annotated: Whether to save annotated images
delay: Delay between API requests (seconds)
Returns: Returns:
Batch processing results Batch processing results
@@ -216,11 +286,10 @@ class YOLODetector:
# Create output directory # Create output directory
output_dir.mkdir(parents=True, exist_ok=True) output_dir.mkdir(parents=True, exist_ok=True)
# Get all image files recursively from input directory and subdirectories # Get all image files recursively
image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'} image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}
image_files = [] image_files = []
# Recursively find all image files
for file_path in input_dir.rglob('*'): for file_path in input_dir.rglob('*'):
if file_path.is_file() and file_path.suffix.lower() in image_extensions: if file_path.is_file() and file_path.suffix.lower() in image_extensions:
image_files.append(file_path) image_files.append(file_path)
@@ -255,6 +324,10 @@ class YOLODetector:
results['processed_images'] += 1 results['processed_images'] += 1
results['total_detections'] += len(result['detections']) results['total_detections'] += len(result['detections'])
results['total_cropped'] += len(result['cropped_paths']) results['total_cropped'] += len(result['cropped_paths'])
# Add delay between requests to avoid rate limiting
if i < len(image_files) - 1: # Don't delay after the last image
time.sleep(delay)
# Summary # Summary
self.logger.info(f"Batch processing completed:") self.logger.info(f"Batch processing completed:")