init structure

2025-08-26 09:35:24 +00:00
commit 42047598ae
18 changed files with 2023 additions and 0 deletions

1
.gitignore vendored Normal file

@@ -0,0 +1 @@
*.pyc

223
README.md Normal file

@@ -0,0 +1,223 @@
# IQA Metric Benchmark
A comprehensive framework for evaluating Image Quality Assessment (IQA) metrics against human quality annotations.
## Overview
This project evaluates various IQA metrics by comparing their scores with human quality judgments for identity document images. The goal is to identify which IQA metrics best correlate with human perceptions of image quality.
## Features
### 📊 **IQA Metrics Evaluation**
- **15 IQA Metrics** evaluated against human quality annotations
- **Correlation Analysis** - Both Pearson and Spearman correlations
- **Statistical Significance Testing** - p-value analysis
- **Performance Rankings** - Comprehensive metric comparisons
### 🔧 **Core Components**
- **IQA Score Processing** - Load and parse IQA metric scores
- **Human Label Analysis** - Process human quality annotations
- **Correlation Calculator** - Statistical correlation analysis
- **Results Generator** - Comprehensive reporting and visualization
### 📈 **Analysis Capabilities**
- **Batch Processing** - Evaluate multiple IQA metrics simultaneously
- **Statistical Analysis** - Correlation coefficients and significance testing
- **Visualization** - Comparison plots and rankings
- **Multiple Export Formats** - CSV, Markdown, and text reports
## Installation
### Prerequisites
- Python 3.8 or higher
- pip package manager
### Setup
1. Clone the repository:
```bash
git clone <repository-url>
cd IQA-Metric-Benchmark
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
## Usage
### Quick Start
Run the IQA evaluation analysis:
```bash
python main.py
```
### Command Line Options
```bash
# Basic evaluation
python main.py
# Custom image and output directories
python main.py --image-dir data/task/cni/images --output-dir results
# Different output formats
python main.py --output-format csv
python main.py --output-format txt
```
## Project Structure
```
IQA-Metric-Benchmark/
├── src/ # Source code
│   ├── __init__.py
│   ├── deqa_scorer.py # DeQA model wrapper
│   ├── iqa_analyzer.py # Main analysis engine
│   ├── logger_config.py # Logging configuration
│   └── metrics/ # Metric implementations
│       ├── __init__.py
│       ├── deqa_metric.py # DeQA metric wrapper
│       ├── metrics_manager.py # Unified metrics interface
│       ├── pyiqa_metrics.py # PyIQA framework integration
│       └── traditional_metrics.py # Classical OpenCV/NumPy metrics
├── scripts/ # Utility scripts
│ ├── env.sh # Environment setup
│ └── cleanup_logs.py # Log cleanup utility
├── data/ # Data files
│ └── task/
│ └── cni/
│ ├── images/ # Image files
│ └── human-label.csv # Human quality annotations
├── docs/ # Documentation
│ └── task/
│ └── cni/
│ └── evaluation_results.md # IQA evaluation results
├── logs/ # Log files (created automatically)
├── results/ # Output directory (created automatically)
├── main.py # Main execution script
├── requirements.txt # Python dependencies
└── README.md # This file
```
## IQA Metrics Evaluated
The framework evaluates the following IQA metrics:
### **No-Reference Metrics:**
- **DEQA** - Deep Quality Assessment
- **URanker** - Underwater Image Quality Ranker
- **DBCNN** - Deep Bilinear CNN
- **HyperIQA** - Hypernetwork-based IQA
- **MANIQA** - Multi-dimension Attention Network
- **MUSIQ** - Multi-scale Image Quality Transformer
- **NIMA** - Neural Image Assessment
- **BRISQUE** - Blind/Referenceless Image Spatial Quality Evaluator
- **NIQE** - Natural Image Quality Evaluator
- **PIQE** - Perception-based Image Quality Evaluator
- **NRQM** - No-Reference Quality Metric
- **Unique** - Unified Quality Evaluator
- **PaQ2PIQ** - Patch Quality to Picture Quality
- **CLIPIQA+** - CLIP-based IQA
- **TopIQ** - Top-down Approach to IQA
## Output and Results
### Results Directory Structure
```
results/
├── detailed_iqa_correlation_results.csv # Complete correlation analysis
├── iqa_ranking_table.csv # Performance rankings
├── detailed_data_*.csv # Individual metric data
├── iqa_correlation_comparison.png # Visualization plots
└── evaluation_summary.txt # Summary report
```
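The CSV outputs can be inspected directly with pandas. A minimal sketch (the column name used for sorting is an assumption; check the generated headers):
```python
import pandas as pd

# Load the correlation analysis and ranking tables produced by the evaluation
detailed = pd.read_csv("results/detailed_iqa_correlation_results.csv")
ranking = pd.read_csv("results/iqa_ranking_table.csv")

# Sort by an assumed "overall_score" column if present, otherwise show the table as-is
if "overall_score" in ranking.columns:
    print(ranking.sort_values("overall_score", ascending=False).head(10))
else:
    print(ranking.head(10))
```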
### 📊 IQA Metrics Evaluation Results
Evaluation of 15 IQA metrics against human quality annotations for 81 identity document images:
#### 🏆 **Top Performing Metrics**
1. **DEQA** (0.6185) - Best overall performer with strong positive correlation
2. **URanker** (0.3629) - Good positive correlation
3. **DBCNN** (0.3605) - Moderate negative correlation (lower scores track higher human ratings)
4. **NRQM** (0.3596) - Moderate negative correlation
5. **BRISQUE** (0.3509) - Moderate negative correlation
#### 📈 **Key Findings**
- **9/15 metrics** have statistically significant Pearson correlations (p < 0.05)
- **10/15 metrics** have statistically significant Spearman correlations (p < 0.05)
- **DEQA recommended** as primary metric for identity document quality assessment
- **Average absolute correlation:** 0.2713
#### 🔍 **Correlation Patterns**
- **Positive correlations** (higher IQA score = higher human quality): DEQA, URanker, NIMA
- **Negative correlations** (lower IQA score = higher human quality): DBCNN, NRQM, BRISQUE, HyperIQA. For lower-is-better metrics such as BRISQUE a negative correlation is the expected direction; for the others it suggests the metric's notion of quality diverges from the human annotations on this dataset
📋 **Detailed results:** See [docs/task/cni/evaluation_results.md](docs/task/cni/evaluation_results.md) for complete analysis.
## Methodology
### Data Sources
- **IQA Metrics**: 15 different IQA metrics computed for 81 identity document images
- **Human Labels**: Quality annotations (coherence scores 1-5) for the same 81 images
- **Correlation Analysis**: Both Pearson (linear) and Spearman (rank) correlations
### Statistical Analysis
- **Correlation Types**: Pearson (linear) and Spearman (rank-order)
- **Significance Threshold**: p < 0.05
- **Overall Score**: Average of absolute Pearson and Spearman correlations
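As a sketch of how these statistics can be computed with `scipy.stats` (a minimal illustration of the definitions above, not necessarily the exact implementation used here):
```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def correlation_summary(iqa_scores, human_labels, alpha=0.05):
    """Pearson/Spearman correlations, significance flags, and the overall score."""
    r_p, p_p = pearsonr(iqa_scores, human_labels)
    r_s, p_s = spearmanr(iqa_scores, human_labels)
    return {
        "pearson": r_p,
        "spearman": r_s,
        "pearson_significant": p_p < alpha,
        "spearman_significant": p_s < alpha,
        # Overall score = average of the absolute correlations
        "overall_score": (abs(r_p) + abs(r_s)) / 2,
    }

# Toy example with random data (the real evaluation uses 81 images)
print(correlation_summary(np.random.rand(81), np.random.randint(1, 6, size=81)))
```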
## Recommendations
### 🎯 **Primary Recommendation**
**Use DEQA as the primary IQA metric** for identity document quality assessment due to its strong positive correlation (0.6185) with human quality judgments.
### 🔄 **Robust Evaluation Strategy**
Combine multiple metrics for comprehensive assessment:
1. **DEQA** (primary) - Strong positive correlation
2. **URanker** (secondary) - Good positive correlation
3. **NIMA** (validation) - Moderate positive correlation
### ⚠️ **Important Notes**
- Some metrics show negative correlations, indicating different quality interpretations
- Consider dataset-specific calibration for better performance
- Results may vary with different image types or quality ranges
## Performance Considerations
- **Efficient Processing**: Optimized for batch analysis of multiple metrics
- **Memory Management**: Handles large datasets efficiently
- **Error Handling**: Robust error handling for missing or corrupted data
- **Scalability**: Designed to accommodate additional metrics
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Citation
If you use this framework in your research, please cite:
```bibtex
@software{iqa_metric_benchmark,
title={IQA Metric Benchmark: Evaluation of IQA Metrics Against Human Quality Annotations},
author={Your Name},
year={2024},
url={https://github.com/yourusername/IQA-Metric-Benchmark}
}
```
## Support
For questions, issues, or contributions:
- Open an issue on GitHub
- Contact the maintainers
- Check the documentation
---
**Happy IQA Evaluation! 📊✨**

12
data/.gitignore vendored Normal file

@@ -0,0 +1,12 @@
*.png
*.csv
*.jpg
*.jpeg
*.gif
*.bmp
*.tiff
*.webp
*.ico
*.heic
*.heif
*.txt

128
docs/task/cni/evaluation_results.md Normal file

@@ -0,0 +1,128 @@
# CNI Task - IQA Metrics Evaluation Results
## Overview
This document presents the comprehensive evaluation results of 15 Image Quality Assessment (IQA) metrics against human quality annotations for 81 identity document images.
**Evaluation Date:** 2025-08-26
**Total Images:** 81
**Total Metrics Evaluated:** 15
## Key Findings
- **9/15 metrics** have statistically significant Pearson correlations (p < 0.05)
- **10/15 metrics** have statistically significant Spearman correlations (p < 0.05)
- **Best performing metric:** DEQA with correlation 0.6185
- **Average absolute correlation:** 0.2713
## Top Performing Metrics
### 🏆 Best Overall Performer: DEQA
| Metric | Pearson Correlation | Spearman Correlation | Overall Score | Significance |
|--------|-------------------|---------------------|---------------|--------------|
| **deqa** | **0.6185** | **0.6059** | **0.6122** | Both |
| uranker | 0.3349 | 0.3909 | 0.3629 | Both |
| dbcnn | -0.3721 | -0.3489 | 0.3605 | Both |
| nrqm | -0.3493 | -0.3699 | 0.3596 | Both |
| brisque | -0.3159 | -0.3859 | 0.3509 | Both |
## Complete Rankings
| Rank | Metric | Pearson Corr | Spearman Corr | Overall Score | Significant |
|------|--------|--------------|---------------|---------------|-------------|
| 1 | **deqa** | **0.6185** | **0.6059** | **0.6122** | Both |
| 2 | uranker | 0.3349 | 0.3909 | 0.3629 | Both |
| 3 | dbcnn | -0.3721 | -0.3489 | 0.3605 | Both |
| 4 | nrqm | -0.3493 | -0.3699 | 0.3596 | Both |
| 5 | brisque | -0.3159 | -0.3859 | 0.3509 | Both |
| 6 | hyperiqa | -0.3271 | -0.3106 | 0.3189 | Both |
| 7 | nima | 0.2989 | 0.3321 | 0.3155 | Both |
| 8 | topiq_nr | -0.2244 | -0.2445 | 0.2345 | Both |
| 9 | maniqa | -0.2106 | -0.2420 | 0.2263 | Spearman only |
| 10 | musiq | -0.2013 | -0.2386 | 0.2200 | Spearman only |
| 11 | clipiqa+_vitL14_512 | -0.2259 | -0.1960 | 0.2109 | Pearson only |
| 12 | unique | -0.1875 | -0.1971 | 0.1923 | None |
| 13 | piqe | 0.1958 | 0.1763 | 0.1860 | None |
| 14 | paq2piq | -0.1445 | -0.1548 | 0.1497 | None |
| 15 | niqe | -0.0627 | 0.0745 | 0.0686 | None |
## Correlation Analysis
### Positive Correlations (Higher IQA Score = Higher Human Quality)
| Metric | Pearson | Spearman | Significance |
|--------|---------|----------|--------------|
| **deqa** | **0.6185** | **0.6059** | Both |
| uranker | 0.3349 | 0.3909 | Both |
| nima | 0.2989 | 0.3321 | Both |
| piqe | 0.1958 | 0.1763 | None |
### Negative Correlations (Lower IQA Score = Higher Human Quality)
| Metric | Pearson | Spearman | Significance |
|--------|---------|----------|--------------|
| dbcnn | -0.3721 | -0.3489 | Both |
| nrqm | -0.3493 | -0.3699 | Both |
| hyperiqa | -0.3271 | -0.3106 | Both |
| brisque | -0.3159 | -0.3859 | Both |
| clipiqa+_vitL14_512 | -0.2259 | -0.1960 | Pearson only |
| topiq_nr | -0.2244 | -0.2445 | Both |
| maniqa | -0.2106 | -0.2420 | Spearman only |
| musiq | -0.2013 | -0.2386 | Spearman only |
| unique | -0.1875 | -0.1971 | None |
| paq2piq | -0.1445 | -0.1548 | None |
| niqe | -0.0627 | 0.0745 | None |
## Statistical Significance Summary
### ✅ Highly Significant (Both Pearson and Spearman, p < 0.05)
- deqa, uranker, dbcnn, nrqm, brisque, hyperiqa, nima, topiq_nr
### ⚠️ Partially Significant (One correlation type, p < 0.05)
- maniqa, musiq, clipiqa+_vitL14_512
### ❌ Not Significant (Both correlations, p ≥ 0.05)
- unique, piqe, paq2piq, niqe
## Recommendations
### 🎯 Primary Recommendation
**Use DEQA as the primary IQA metric** for identity document quality assessment due to its strong positive correlation (0.6185) with human quality judgments.
### 🔄 Robust Evaluation Strategy
Combine multiple metrics for comprehensive assessment:
1. **deqa** (primary) - Strong positive correlation
2. **uranker** (secondary) - Good positive correlation
3. **nima** (validation) - Moderate positive correlation
### ⚠️ Important Notes
- Some metrics show negative correlations, indicating different quality interpretations
- Consider dataset-specific calibration for better performance
- Results may vary with different image types or quality ranges
## Methodology
### Data Sources
- **IQA Metrics**: 15 different IQA metrics computed for 81 identity document images
- **Human Labels**: Quality annotations (coherence scores 1-5) for the same 81 images
- **Correlation Analysis**: Both Pearson (linear) and Spearman (rank) correlations
### Statistical Analysis
- **Correlation Types**: Pearson (linear) and Spearman (rank-order)
- **Significance Threshold**: p < 0.05
- **Overall Score**: Average of the absolute Pearson and Spearman correlations, i.e. (|Pearson| + |Spearman|) / 2
### Files Generated
- `detailed_iqa_correlation_results.csv` - Complete analysis data
- `iqa_ranking_table.csv` - Performance rankings
- `detailed_data_*.csv` - Individual metric data (15 files)
- `iqa_correlation_comparison.png` - Visualization plots
## Conclusion
DEQA emerges as the top-performing IQA metric for identity document quality assessment, showing the strongest correlation with human quality judgments. The evaluation demonstrates that several metrics have statistically significant relationships with human assessments, providing a solid foundation for automated quality evaluation systems.
---
*Last updated: 2025-08-26*

Binary file not shown (657 KiB).

1
logs/.gitignore vendored Normal file

@@ -0,0 +1 @@
*.log

168
main.py Normal file

@@ -0,0 +1,168 @@
#!/usr/bin/env python3
"""
Main script for Image Quality Assessment using DeQA scoring.
"""
import argparse
import sys
from pathlib import Path
# Ensure the repository root is on the path so the src package is importable
sys.path.append(str(Path(__file__).parent))
from src.iqa_analyzer import IQAAnalyzer
from src.logger_config import setup_logging
def main():
"""Main function to run the IQA analysis."""
parser = argparse.ArgumentParser(
description="IQA Benchmark Runner - Analyze image quality using DeQA scoring",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Run analysis on default image directory
python main.py
# Run analysis on specific directory
python main.py --image-dir /path/to/images
# Run with verbose logging and save results as CSV
python main.py --verbose --output-format csv
# Run analysis and save to custom output directory
python main.py --output-dir custom_results
"""
)
parser.add_argument(
'--image-dir',
type=str,
default='data/task/cni/images',
help='Directory containing images to analyze (default: data/task/cni/images)'
)
parser.add_argument(
'--output-dir',
type=str,
default='results',
help='Directory to save results and reports (default: results)'
)
parser.add_argument(
'--log-dir',
type=str,
default='logs',
help='Directory to store log files (default: logs)'
)
parser.add_argument(
'--output-format',
choices=['json', 'csv', 'txt'],
default='txt',
help='Output format for results (default: txt)'
)
parser.add_argument(
'--verbose', '-v',
action='store_true',
help='Enable verbose logging'
)
parser.add_argument(
'--enable-deqa',
action='store_true',
default=True,
help='Enable DeQA metric (default: True)'
)
parser.add_argument(
'--enable-traditional',
action='store_true',
default=False,
help='Enable traditional metrics (default: False)'
)
parser.add_argument(
'--enable-pyiqa',
action='store_true',
default=True,
help='Enable PyIQA metrics (default: True)'
)
parser.add_argument(
'--pyiqa-top20',
action='store_true',
help='Use curated top 20 PyIQA metrics only'
)
parser.add_argument(
'--deqa-only',
action='store_true',
help='Use only DeQA metric (disable traditional metrics)'
)
args = parser.parse_args()
# Setup logging
log_level = "DEBUG" if args.verbose else "INFO"
logger = setup_logging(log_dir=args.log_dir, log_level=log_level)
# Validate image directory
image_dir = Path(args.image_dir)
if not image_dir.exists():
logger.error(f"Image directory does not exist: {image_dir}")
sys.exit(1)
if not image_dir.is_dir():
logger.error(f"Path is not a directory: {image_dir}")
sys.exit(1)
# Initialize and run IQA analysis
try:
logger.info("Starting IQA analysis...")
logger.info(f"Image directory: {image_dir}")
logger.info(f"Output directory: {args.output_dir}")
# Determine which metrics to enable
enable_deqa = args.enable_deqa or args.deqa_only
enable_traditional = args.enable_traditional and not args.deqa_only
# User-requested 20 metrics (NR + FR) from PyIQA
selected_top20 = [
# No-Reference (for OCR practical use)
'brisque', 'niqe', 'piqe', 'nrqm', 'nima', 'paq2piq', 'dbcnn', 'hyperiqa',
'musiq', 'topiq_nr', 'clipiqa+_vitL14_512', 'maniqa', 'ahiq', 'unique', 'uranker',
# Full-Reference (benchmark)
'ssim', 'ms_ssim', 'fsim', 'vif', 'dists'
]
analyzer = IQAAnalyzer(
str(image_dir),
args.output_dir,
enable_deqa=enable_deqa,
enable_traditional=enable_traditional,
enable_pyiqa=args.enable_pyiqa,
pyiqa_selected_metrics=(selected_top20 if args.pyiqa_top20 else None)
)
results, report = analyzer.run_analysis()
# Save results in the requested format (TXT is the default for this workflow)
analyzer.save_results(args.output_format)
# Optionally display report (keep for visibility)
print("\n" + report)
logger.info("IQA analysis completed successfully!")
except Exception as e:
logger.error(f"Error during IQA analysis: {e}")
sys.exit(1)
if __name__ == "__main__":
main()

12
requirements.txt Normal file

@@ -0,0 +1,12 @@
opencv-python==4.8.1.78
numpy==1.24.3
scikit-image==0.21.0
matplotlib==3.7.2
seaborn==0.12.2
pandas==2.0.3
Pillow==10.0.0
scipy==1.11.1
tqdm==4.65.0
torch
transformers
pyiqa

1
scripts/env.sh Normal file

@@ -0,0 +1 @@
source /mnt/disk/nvme0n1/venv/thanh-dev/bin/activate

14
src/__init__.py Normal file

@@ -0,0 +1,14 @@
"""
IQA Metric Benchmark - Image Quality Assessment Framework
"""
from .deqa_scorer import DeQAScorer
from .iqa_analyzer import IQAAnalyzer
from .logger_config import setup_logging, get_logger
from .metrics import MetricsManager, DeQAMetric, TraditionalMetrics, PyIQAMetrics
__version__ = "1.0.0"
__all__ = [
"DeQAScorer", "IQAAnalyzer", "setup_logging", "get_logger",
"MetricsManager", "DeQAMetric", "TraditionalMetrics", "PyIQAMetrics"
]

124
src/deqa_scorer.py Normal file

@@ -0,0 +1,124 @@
"""
DeQA Image Quality Scorer
Core module for scoring images using the DeQA model.
"""
import torch
from transformers import AutoModelForCausalLM
from PIL import Image
from typing import List, Union
from .logger_config import get_logger
logger = get_logger(__name__)
class DeQAScorer:
"""DeQA model wrapper for image quality scoring."""
def __init__(self):
"""Initialize the DeQA scorer."""
self.model = None
self._is_loaded = False
def load_model(self) -> None:
"""Load the DeQA scoring model."""
if self._is_loaded:
return
logger.info("Loading DeQA model...")
try:
self.model = AutoModelForCausalLM.from_pretrained(
"zhiyuanyou/DeQA-Score-Mix3",
trust_remote_code=True,
attn_implementation="eager",
torch_dtype=torch.float16,
device_map="auto",
)
self._is_loaded = True
logger.info("DeQA model loaded successfully!")
except Exception as e:
logger.error(f"Failed to load DeQA model: {e}")
raise
def score_single_image(self, image_path: str) -> Union[float, None]:
"""
Score a single image using DeQA model.
Args:
image_path: Path to the image file
Returns:
DeQA score (0-5 scale) or None if failed
"""
if not self._is_loaded:
self.load_model()
try:
image = Image.open(image_path)
scores = self.model.score([image])
# Convert tensor to float
if hasattr(scores, 'item'):
return float(scores.item())
elif hasattr(scores, 'tolist'):
return float(scores.tolist()[0])
else:
return float(scores[0])
except Exception as e:
logger.error(f"Error scoring image {image_path}: {e}")
return None
def score_multiple_images(self, image_paths: List[str]) -> List[Union[float, None]]:
"""
Score multiple images using DeQA model.
Args:
image_paths: List of image file paths
Returns:
List of DeQA scores (0-5 scale) or None for failed images
"""
if not self._is_loaded:
self.load_model()
try:
# Open all images
images = []
for path in image_paths:
try:
image = Image.open(path)
images.append(image)
except Exception as e:
logger.warning(f"Failed to open image {path}: {e}")
images.append(None)
# Score only the images that opened successfully (passing None to the model would fail)
valid_indices = [i for i, img in enumerate(images) if img is not None]
valid_images = [images[i] for i in valid_indices]
scores = self.model.score(valid_images) if valid_images else []
# Convert scores to floats and map them back to their original positions
result_scores = [None] * len(image_paths)
for idx, score in zip(valid_indices, scores):
try:
if hasattr(score, 'item'):
result_scores[idx] = float(score.item())
elif hasattr(score, 'tolist'):
result_scores[idx] = float(score.tolist())
else:
result_scores[idx] = float(score)
except Exception as e:
logger.warning(f"Failed to convert score for image {image_paths[idx]}: {e}")
return result_scores
except Exception as e:
logger.error(f"Error scoring images: {e}")
return [None] * len(image_paths)
def is_loaded(self) -> bool:
"""Check if the model is loaded."""
return self._is_loaded
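A minimal usage sketch for `DeQAScorer` (the image paths are placeholders; loading the model downloads `zhiyuanyou/DeQA-Score-Mix3` and benefits from a GPU):
```python
from src.deqa_scorer import DeQAScorer

scorer = DeQAScorer()
scorer.load_model()  # downloads/loads the DeQA model via transformers

# Placeholder paths — point these at real files under data/task/cni/images/
single = scorer.score_single_image("data/task/cni/images/example.jpg")
batch = scorer.score_multiple_images([
    "data/task/cni/images/a.jpg",
    "data/task/cni/images/b.jpg",
])
print(single, batch)
```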

271
src/iqa_analyzer.py Normal file

@@ -0,0 +1,271 @@
"""
Image Quality Assessment Analyzer
Main module for analyzing image quality using comprehensive metrics.
"""
import os
import json
import pandas as pd
from pathlib import Path
from typing import Dict, List, Optional, Tuple
from datetime import datetime
from .metrics import MetricsManager
from .logger_config import get_logger
logger = get_logger(__name__)
class IQAAnalyzer:
"""Main IQA analyzer using comprehensive metrics."""
def __init__(self, image_dir: str, output_dir: str = "results",
enable_deqa: bool = True, enable_traditional: bool = True,
enable_pyiqa: bool = True, pyiqa_selected_metrics: Optional[List[str]] = None):
"""
Initialize the IQA analyzer.
Args:
image_dir: Directory containing images to analyze
output_dir: Directory to save results
enable_deqa: Whether to enable DeQA metric
enable_traditional: Whether to enable traditional metrics
enable_pyiqa: Whether to enable PyIQA metrics
pyiqa_selected_metrics: Optional subset of PyIQA metric names to compute
"""
self.image_dir = Path(image_dir)
self.output_dir = Path(output_dir)
self.output_dir.mkdir(parents=True, exist_ok=True)
# Initialize metrics manager
self.metrics_manager = MetricsManager(
enable_deqa=enable_deqa,
enable_traditional=enable_traditional,
enable_pyiqa=enable_pyiqa,
pyiqa_selected_metrics=pyiqa_selected_metrics
)
# Results storage
self.results = {}
self.timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
logger.info(f"IQA Analyzer initialized for: {image_dir}")
logger.info(f"Available metrics: {self.metrics_manager.get_available_metrics()}")
def _get_image_files(self) -> List[Path]:
"""Get all image files from the directory."""
image_extensions = {'.png', '.jpg', '.jpeg', '.bmp', '.tiff', '.PNG', '.JPG', '.JPEG', '.BMP', '.TIFF'}
image_files = []
for ext in image_extensions:
image_files.extend(self.image_dir.glob(f"*{ext}"))
return sorted(image_files)
def analyze_images(self) -> Dict[str, Dict]:
"""
Analyze all images using comprehensive metrics.
Returns:
Dictionary containing analysis results
"""
logger.info("Starting image quality analysis...")
# Get all image files
image_files = self._get_image_files()
if not image_files:
logger.error("No image files found!")
return {}
logger.info(f"Found {len(image_files)} images to analyze")
# Prepare image paths for analysis
image_paths = [str(img_path) for img_path in image_files]
# Calculate all metrics for all images
logger.info("Calculating comprehensive metrics...")
self.results = self.metrics_manager.calculate_metrics_batch(image_paths)
logger.info(f"Analysis completed for {len(self.results)} images")
return self.results
def generate_report(self) -> str:
"""Generate comprehensive analysis report."""
if not self.results:
return "No results to report. Run analyze_images() first."
# Create summary statistics
valid_results = {k: v for k, v in self.results.items() if 'error' not in v}
if not valid_results:
return "No valid results to report."
# Extract scores
deqa_scores = [v.get('deqa_score', 0) for v in valid_results.values() if v.get('deqa_score') is not None]
musiq_scores = [v.get('musiq_score', 0) for v in valid_results.values() if v.get('musiq_score') is not None]
# Get available metrics info
metrics_info = self.metrics_manager.get_metric_info()
# Calculate statistics
stats = {
'total_images': len(self.results),
'valid_images': len(valid_results),
'deqa_score_stats': {
'mean': round(sum(deqa_scores) / len(deqa_scores), 3) if deqa_scores else 0,
'min': round(min(deqa_scores), 3) if deqa_scores else 0,
'max': round(max(deqa_scores), 3) if deqa_scores else 0,
'std': round(pd.Series(deqa_scores).std(), 3) if len(deqa_scores) > 1 else 0
}
}
# Generate report
report = f"""
Image Quality Assessment Report
Generated: {datetime.now().strftime("%Y-%m-%d %H:%M:%S")}
Image Directory: {self.image_dir}
Summary Statistics:
==================
Total Images: {stats['total_images']}
Valid Images: {stats['valid_images']}
Available Metrics: {', '.join(metrics_info.keys())}
DeQA Score Statistics (0-5 scale):
- Mean: {stats['deqa_score_stats']['mean']}
- Min: {stats['deqa_score_stats']['min']}
- Max: {stats['deqa_score_stats']['max']}
- Std Dev: {stats['deqa_score_stats']['std']}
MUSIQ Score Statistics (0-100 scale):
- Mean: {round(sum(musiq_scores) / len(musiq_scores), 3) if musiq_scores else 0}
- Min: {round(min(musiq_scores), 3) if musiq_scores else 0}
- Max: {round(max(musiq_scores), 3) if musiq_scores else 0}
Quality Distribution (DeQA):
- Excellent (4.5-5.0): {len([s for s in deqa_scores if s >= 4.5])}
- Good (3.5-4.4): {len([s for s in deqa_scores if 3.5 <= s < 4.5])}
- Fair (2.5-3.4): {len([s for s in deqa_scores if 2.5 <= s < 3.5])}
- Poor (1.5-2.4): {len([s for s in deqa_scores if 1.5 <= s < 2.5])}
- Very Poor (0.0-1.4): {len([s for s in deqa_scores if s < 1.5])}
Top 10 Highest Quality Images (DeQA):
"""
# Sort by DeQA score
sorted_results = sorted(
valid_results.items(),
key=lambda x: x[1].get('deqa_score', 0),
reverse=True
)
for i, (img_name, result) in enumerate(sorted_results[:10]):
deqa = result.get('deqa_score', 'N/A')
musiq = result.get('musiq_score', 'N/A')
size_mb = result.get('file_size_mb', 'N/A')
report += f"{i+1}. {img_name}: DeQA: {deqa}, MUSIQ: {musiq}, Size: {size_mb} MB\n"
report += "\nBottom 10 Lowest Quality Images (DeQA):\n"
for i, (img_name, result) in enumerate(sorted_results[-10:]):
deqa = result.get('deqa_score', 'N/A')
musiq = result.get('musiq_score', 'N/A')
size_mb = result.get('file_size_mb', 'N/A')
report += f"{i+1}. {img_name}: DeQA: {deqa}, MUSIQ: {musiq}, Size: {size_mb} MB\n"
# Add traditional metrics summary if available
if 'traditional' in metrics_info:
report += f"\nTraditional Metrics Available:\n"
for metric_name in metrics_info['traditional']['metrics'][:10]: # Show first 10
report += f"- {metric_name}\n"
if len(metrics_info['traditional']['metrics']) > 10:
report += f"... and {len(metrics_info['traditional']['metrics']) - 10} more\n"
return report
def save_results(self, format: str = 'json') -> str:
"""Save results to file."""
if not self.results:
return "No results to save. Run analyze_images() first."
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
if format.lower() == 'json':
output_file = self.output_dir / f"iqa_results_{timestamp}.json"
with open(output_file, 'w') as f:
json.dump(self.results, f, indent=2, default=str)
elif format.lower() == 'csv':
# Flatten results for CSV with all available metrics
flat_results = []
for img_name, result in self.results.items():
if 'error' not in result:
row = {'image_name': img_name}
# Add all available metrics
for key, value in result.items():
if key not in ['file_path']: # Skip file path for CSV
row[key] = value
flat_results.append(row)
df = pd.DataFrame(flat_results)
output_file = self.output_dir / f"iqa_results_{timestamp}.csv"
df.to_csv(output_file, index=False)
elif format.lower() == 'txt':
# Always save DeQA
output_file = self.output_dir / f"deqa-score-{timestamp}.txt"
with open(output_file, 'w') as f:
sorted_results = sorted(
self.results.items(),
key=lambda x: x[1].get('deqa_score', 0) if 'error' not in x[1] else 0,
reverse=True
)
for img_name, result in sorted_results:
if 'error' not in result and result.get('deqa_score') is not None:
f.write(f"{result['deqa_score']:.1f} - {img_name}\n")
# For PyIQA selected metrics, emit one TXT per metric ONLY if any value exists
selected = self.metrics_manager.get_selected_pyiqa_metrics() or []
# Precompute which metrics have at least one value
metrics_with_values = set()
for metric_name in selected:
key = f"pyiqa_{metric_name}"
for result in self.results.values():
if 'error' not in result and result.get(key) is not None:
metrics_with_values.add(metric_name)
break
for metric_name in selected:
if metric_name not in metrics_with_values:
# Skip creating a file when no values are available
continue
key = f"pyiqa_{metric_name}"
path = self.output_dir / f"{metric_name}-score-{timestamp}.txt"
with open(path, 'w') as f:
# Sort by this metric when available
sorted_items = sorted(
[item for item in self.results.items() if 'error' not in item[1] and item[1].get(key) is not None],
key=lambda x: x[1].get(key, float('-inf')),
reverse=True
)
for img_name, result in sorted_items:
val = result.get(key)
f.write(f"{float(val):.3f} - {img_name}\n")
else:
return f"Unsupported format: {format}"
return str(output_file)
def run_analysis(self) -> Tuple[Dict, str]:
"""
Run complete analysis and generate report.
Returns:
Tuple of (results_dict, report_string)
"""
# Analyze images
results = self.analyze_images()
# Generate report
report = self.generate_report()
# Do not auto-save JSON anymore; TXT will be saved via main's choice
return results, report

63
src/logger_config.py Normal file

@@ -0,0 +1,63 @@
"""
Simple logging configuration for the IQA framework.
Creates a log directory and stores log output files.
"""
import logging
from typing import Optional
from pathlib import Path
from datetime import datetime
def setup_logging(log_dir: str = "logs", log_level: str = "INFO") -> logging.Logger:
"""
Setup simple logging configuration with console and file output.
Args:
log_dir: Directory to store log files
log_level: Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
Returns:
Configured logger instance
"""
# Create log directory if it doesn't exist
log_path = Path(log_dir)
log_path.mkdir(parents=True, exist_ok=True)
# Generate log filename with timestamp
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
log_filename = f"iqa_benchmark_{timestamp}.log"
log_file_path = log_path / log_filename
# Setup basic logging configuration
logging.basicConfig(
level=getattr(logging, log_level.upper()),
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
datefmt='%Y-%m-%d %H:%M:%S',
handlers=[
logging.StreamHandler(), # Console output
logging.FileHandler(log_file_path, encoding='utf-8') # File output
]
)
# Get logger and log setup info
logger = logging.getLogger(__name__)
logger.info(f"Logging initialized - Level: {log_level}")
logger.info(f"Log file: {log_file_path}")
logger.info(f"Log directory: {log_path.absolute()}")
return logger
def get_logger(name: Optional[str] = None) -> logging.Logger:
"""
Get a logger instance with the specified name.
Args:
name: Logger name (usually __name__)
Returns:
Logger instance
"""
if name:
return logging.getLogger(name)
return logging.getLogger()
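A minimal sketch of how the logging helpers above are meant to be used (mirrors what `main.py` does):
```python
from src.logger_config import setup_logging, get_logger

setup_logging(log_dir="logs", log_level="DEBUG")  # creates logs/iqa_benchmark_<timestamp>.log
logger = get_logger(__name__)
logger.info("Logging configured")
```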

12
src/metrics/__init__.py Normal file

@@ -0,0 +1,12 @@
"""
Metrics module for Image Quality Assessment.
Provides both DeQA and traditional IQA metrics.
"""
from .deqa_metric import DeQAMetric
from .traditional_metrics import TraditionalMetrics
from .pyiqa_metrics import PyIQAMetrics
from .metrics_manager import MetricsManager
__version__ = "1.0.0"
__all__ = ["DeQAMetric", "TraditionalMetrics", "PyIQAMetrics", "MetricsManager"]

115
src/metrics/deqa_metric.py Normal file

@@ -0,0 +1,115 @@
"""
DeQA Metric Implementation
Deep Quality Assessment metric using the DeQA model.
"""
import torch
from transformers import AutoModelForCausalLM
from PIL import Image
from typing import List, Union, Optional
import logging
logger = logging.getLogger(__name__)
class DeQAMetric:
"""DeQA model wrapper for image quality scoring."""
def __init__(self, model_name: str = "zhiyuanyou/DeQA-Score-Mix3"):
"""
Initialize the DeQA metric.
Args:
model_name: HuggingFace model name for DeQA
"""
self.model_name = model_name
self.model = None
self._is_loaded = False
def load_model(self) -> None:
"""Load the DeQA scoring model."""
if self._is_loaded:
return
logger.info("Loading DeQA model...")
try:
self.model = AutoModelForCausalLM.from_pretrained(
self.model_name,
trust_remote_code=True,
attn_implementation="eager",
torch_dtype=torch.float16,
device_map="auto",
)
self._is_loaded = True
logger.info("DeQA model loaded successfully!")
except Exception as e:
logger.error(f"Failed to load DeQA model: {e}")
raise
def score_single_image(self, image_path: str) -> Optional[float]:
"""
Score a single image using DeQA model.
Args:
image_path: Path to the image file
Returns:
DeQA score (0-5 scale) or None if failed
"""
if not self._is_loaded:
self.load_model()
try:
image = Image.open(image_path)
scores = self.model.score([image])
# Convert tensor to float
if hasattr(scores, 'item'):
return float(scores.item())
elif hasattr(scores, 'tolist'):
return float(scores.tolist()[0])
else:
return float(scores[0])
except Exception as e:
logger.error(f"Error scoring image {image_path}: {e}")
return None
def score_multiple_images(self, image_paths: List[str]) -> List[Optional[float]]:
"""
Score multiple images using DeQA model.
Args:
image_paths: List of image file paths
Returns:
List of DeQA scores (0-5 scale) or None for failed images
"""
if not self._is_loaded:
self.load_model()
try:
images = [Image.open(path) for path in image_paths]
scores = self.model.score(images)
# Convert tensor to list of floats
if hasattr(scores, 'tolist'):
return [float(score) for score in scores.tolist()]
else:
return [float(score) for score in scores]
except Exception as e:
logger.error(f"Error scoring multiple images: {e}")
return [None] * len(image_paths)
def get_metric_name(self) -> str:
"""Get the name of this metric."""
return "DeQA"
def get_score_range(self) -> tuple:
"""Get the score range for this metric."""
return (0.0, 5.0)
def get_description(self) -> str:
"""Get description of this metric."""
return "Deep Quality Assessment using DeQA-Score-Mix3 model"

284
src/metrics/metrics_manager.py Normal file

@@ -0,0 +1,284 @@
"""
Metrics Manager
Coordinates all IQA metrics and provides a unified interface.
"""
import os
from typing import Dict, List, Optional, Any
from pathlib import Path
import logging
from .deqa_metric import DeQAMetric
from .traditional_metrics import TraditionalMetrics
from .pyiqa_metrics import PyIQAMetrics
logger = logging.getLogger(__name__)
class MetricsManager:
"""Manages all IQA metrics and provides unified scoring interface."""
def __init__(self, enable_deqa: bool = True, enable_traditional: bool = True,
enable_pyiqa: bool = True, pyiqa_selected_metrics: Optional[List[str]] = None):
"""
Initialize the metrics manager.
Args:
enable_deqa: Whether to enable DeQA metric
enable_traditional: Whether to enable traditional metrics
enable_pyiqa: Whether to enable PyIQA metrics
"""
self.metrics = {}
# Initialize DeQA metric
if enable_deqa:
try:
self.metrics['deqa'] = DeQAMetric()
logger.info("DeQA metric initialized successfully")
except Exception as e:
logger.warning(f"Failed to initialize DeQA metric: {e}")
self.metrics['deqa'] = None
# Initialize traditional metrics
if enable_traditional:
try:
self.metrics['traditional'] = TraditionalMetrics()
logger.info("Traditional metrics initialized successfully")
except Exception as e:
logger.warning(f"Failed to initialize traditional metrics: {e}")
self.metrics['traditional'] = None
# Initialize PyIQA metrics
if enable_pyiqa:
try:
self.metrics['pyiqa'] = PyIQAMetrics(selected_metrics=pyiqa_selected_metrics)
logger.info("PyIQA metrics initialized successfully")
# Preload priority metrics to avoid download delays
logger.info("Preloading priority PyIQA metrics...")
preload_results = self.metrics['pyiqa'].preload_priority_metrics()
logger.info(f"Preloaded {sum(preload_results.values())}/{len(preload_results)} priority metrics")
except Exception as e:
logger.warning(f"Failed to initialize PyIQA metrics: {e}")
self.metrics['pyiqa'] = None
def get_available_metrics(self) -> List[str]:
"""Get list of available metric types."""
return list(self.metrics.keys())
def calculate_all_metrics(self, image_path: str) -> Dict[str, Any]:
"""
Calculate all available metrics for an image.
Args:
image_path: Path to the image file
Returns:
Dictionary containing all calculated metrics
"""
results = {}
# Calculate DeQA score
if 'deqa' in self.metrics and self.metrics['deqa'] is not None:
try:
deqa_score = self.metrics['deqa'].score_single_image(image_path)
if deqa_score is not None:
results['deqa_score'] = deqa_score
except Exception as e:
logger.error(f"Error calculating DeQA score: {e}")
# Calculate traditional metrics
if 'traditional' in self.metrics and self.metrics['traditional'] is not None:
try:
traditional_metrics = self.metrics['traditional'].calculate_all_metrics(image_path)
results.update(traditional_metrics)
except Exception as e:
logger.error(f"Error calculating traditional metrics: {e}")
# Calculate PyIQA metrics (all available metrics)
if 'pyiqa' in self.metrics and self.metrics['pyiqa'] is not None:
try:
# Calculate all available PyIQA metrics
pyiqa_metrics = self.metrics['pyiqa'].calculate_all_pyiqa_metrics(image_path)
results.update(pyiqa_metrics)
# Also keep individual MUSIQ score for compatibility
if 'pyiqa_musiq' in pyiqa_metrics:
results['musiq_score'] = pyiqa_metrics['pyiqa_musiq']
except Exception as e:
logger.error(f"Error calculating PyIQA metrics: {e}")
# Add file information
try:
file_size = os.path.getsize(image_path)
results['file_size_bytes'] = file_size
results['file_size_mb'] = round(file_size / (1024 * 1024), 2)
results['file_path'] = image_path
except Exception as e:
logger.error(f"Error getting file information: {e}")
return results
def calculate_metrics_batch(self, image_paths: List[str]) -> Dict[str, Dict[str, Any]]:
"""
Calculate metrics for multiple images.
Args:
image_paths: List of image file paths
Returns:
Dictionary mapping image names to their metrics
"""
results = {}
for image_path in image_paths:
try:
image_name = Path(image_path).name
results[image_name] = self.calculate_all_metrics(image_path)
except Exception as e:
logger.error(f"Error processing {image_path}: {e}")
results[Path(image_path).name] = {'error': str(e)}
return results
def get_metric_info(self) -> Dict[str, Dict[str, Any]]:
"""Get information about all available metrics."""
info = {}
if 'deqa' in self.metrics and self.metrics['deqa'] is not None:
info['deqa'] = {
'name': self.metrics['deqa'].get_metric_name(),
'description': self.metrics['deqa'].get_description(),
'score_range': self.metrics['deqa'].get_score_range(),
'type': 'learning_based'
}
if 'traditional' in self.metrics and self.metrics['traditional'] is not None:
info['traditional'] = {
'name': 'Traditional IQA Metrics',
'description': 'Classical image quality assessment metrics',
'metrics': self.metrics['traditional'].get_metric_names(),
'descriptions': self.metrics['traditional'].get_metric_descriptions(),
'type': 'statistical'
}
if 'pyiqa' in self.metrics and self.metrics['pyiqa'] is not None:
info['pyiqa'] = {
'name': 'PyIQA Framework Metrics',
'description': 'Advanced IQA metrics from PyIQA framework',
'available_metrics': self.metrics['pyiqa'].get_available_metrics(),
'musiq_info': self.metrics['pyiqa'].get_musiq_info(),
'type': 'learning_based'
}
return info
def get_selected_pyiqa_metrics(self) -> Optional[List[str]]:
"""Return the explicitly selected PyIQA metrics, if any."""
if 'pyiqa' in self.metrics and self.metrics['pyiqa'] is not None:
try:
return getattr(self.metrics['pyiqa'], 'selected_metrics', None)
except Exception:
return None
return None
def get_deqa_score_only(self, image_path: str) -> Optional[float]:
"""Get only the DeQA score for an image."""
if 'deqa' in self.metrics and self.metrics['deqa'] is not None:
return self.metrics['deqa'].score_single_image(image_path)
return None
def get_deqa_scores_batch(self, image_paths: List[str]) -> List[Optional[float]]:
"""Get DeQA scores for multiple images."""
if 'deqa' in self.metrics and self.metrics['deqa'] is not None:
return self.metrics['deqa'].score_multiple_images(image_paths)
return [None] * len(image_paths)
def get_traditional_metrics_only(self, image_path: str) -> Dict[str, float]:
"""Get only traditional metrics for an image."""
if 'traditional' in self.metrics and self.metrics['traditional'] is not None:
return self.metrics['traditional'].calculate_all_metrics(image_path)
return {}
def get_musiq_score_only(self, image_path: str) -> Optional[float]:
"""Get only the MUSIQ score for an image."""
if 'pyiqa' in self.metrics and self.metrics['pyiqa'] is not None:
return self.metrics['pyiqa'].calculate_musiq_score(image_path)
return None
def get_musiq_scores_batch(self, image_paths: List[str]) -> List[Optional[float]]:
"""Get MUSIQ scores for multiple images."""
if 'pyiqa' in self.metrics and self.metrics['pyiqa'] is not None:
return self.metrics['pyiqa'].calculate_musiq_scores_batch(image_paths)
return [None] * len(image_paths)
def get_pyiqa_metric_score(self, metric_name: str, image_path: str,
reference_path: Optional[str] = None) -> Optional[float]:
"""Get score for a specific PyIQA metric."""
if 'pyiqa' in self.metrics and self.metrics['pyiqa'] is not None:
return self.metrics['pyiqa'].calculate_metric_score(metric_name, image_path, reference_path)
return None
def export_metrics_summary(self, results: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
"""
Generate a summary of all metrics across all images.
Args:
results: Results from calculate_metrics_batch
Returns:
Summary statistics for all metrics
"""
summary = {
'total_images': len(results),
'successful_images': 0,
'failed_images': 0,
'metrics_summary': {}
}
# Count successful and failed images
for image_name, image_results in results.items():
if 'error' in image_results:
summary['failed_images'] += 1
else:
summary['successful_images'] += 1
# Calculate summary statistics for each metric
if summary['successful_images'] > 0:
# Get all metric names from successful results
all_metrics = set()
for image_results in results.values():
if 'error' not in image_results:
all_metrics.update(image_results.keys())
# Remove file info metrics
file_info_metrics = {'file_size_bytes', 'file_size_mb', 'file_path'}
metrics_to_analyze = all_metrics - file_info_metrics
for metric in metrics_to_analyze:
values = []
for image_results in results.values():
if 'error' not in image_results and metric in image_results:
values.append(image_results[metric])
if values:
summary['metrics_summary'][metric] = {
'count': len(values),
'mean': sum(values) / len(values),
'min': min(values),
'max': max(values),
'std': self._calculate_std(values)
}
return summary
def _calculate_std(self, values: List[float]) -> float:
"""Calculate standard deviation."""
if len(values) < 2:
return 0.0
mean = sum(values) / len(values)
variance = sum((x - mean) ** 2 for x in values) / (len(values) - 1)
return variance ** 0.5
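A minimal usage sketch for `MetricsManager` (the image paths and the PyIQA subset are placeholders chosen for illustration):
```python
from src.metrics import MetricsManager

# DeQA plus a small PyIQA subset; traditional metrics disabled (matches the main.py defaults)
manager = MetricsManager(
    enable_deqa=True,
    enable_traditional=False,
    enable_pyiqa=True,
    pyiqa_selected_metrics=["brisque", "niqe", "musiq"],
)

results = manager.calculate_metrics_batch([
    "data/task/cni/images/a.jpg",  # placeholder paths
    "data/task/cni/images/b.jpg",
])
summary = manager.export_metrics_summary(results)
print(summary["successful_images"], sorted(summary["metrics_summary"].keys()))
```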

373
src/metrics/pyiqa_metrics.py Normal file

@@ -0,0 +1,373 @@
"""
PyIQA Metrics Implementation
Integrates with the PyIQA framework for various image quality assessment metrics.
"""
import torch
import pyiqa
from typing import Dict, List, Optional, Union, Any
import logging
from pathlib import Path
logger = logging.getLogger(__name__)
class PyIQAMetrics:
"""PyIQA framework integration for various IQA metrics."""
def __init__(self, device: Optional[str] = None, selected_metrics: Optional[List[str]] = None):
"""
Initialize PyIQA metrics.
Args:
device: Device to use ('cuda', 'cpu', or None for auto-detection)
"""
self.device = device or ('cuda' if torch.cuda.is_available() else 'cpu')
self.device_torch = torch.device(self.device)
self.metrics = {}
self.available_metrics = pyiqa.list_models()
# Optional user-selected subset; if provided, we will prefer/limit to these
self.selected_metrics = selected_metrics
logger.info(f"PyIQA initialized on device: {self.device}")
logger.info(f"Available metrics: {len(self.available_metrics)}")
def get_available_metrics(self) -> List[str]:
"""Get list of all available PyIQA metrics."""
return self.available_metrics
def create_metric(self, metric_name: str, **kwargs) -> Any:
"""
Create a specific PyIQA metric.
Args:
metric_name: Name of the metric to create
**kwargs: Additional arguments for metric creation
Returns:
PyIQA metric instance
"""
try:
if metric_name not in self.available_metrics:
logger.warning(f"Metric '{metric_name}' not available in PyIQA")
return None
# Create metric with device specification
metric = pyiqa.create_metric(metric_name, device=self.device, **kwargs)
logger.info(f"Created PyIQA metric: {metric_name}")
return metric
except Exception as e:
logger.error(f"Failed to create metric '{metric_name}': {e}")
return None
def calculate_musiq_score(self, image_path: str) -> Optional[float]:
"""
Calculate MUSIQ score for a single image.
Args:
image_path: Path to the image file
Returns:
MUSIQ score or None if failed
"""
try:
# Create MUSIQ metric
musiq_metric = self.create_metric('musiq')
if musiq_metric is None:
return None
# Calculate score (MUSIQ is no-reference, so only one image needed)
score = musiq_metric(image_path)
# Convert to float if it's a tensor
if hasattr(score, 'item'):
score = float(score.item())
elif hasattr(score, 'tolist'):
score = float(score.tolist()[0])
else:
score = float(score)
logger.debug(f"MUSIQ score for {image_path}: {score}")
return score
except Exception as e:
logger.error(f"Error calculating MUSIQ score for {image_path}: {e}")
return None
def calculate_musiq_scores_batch(self, image_paths: List[str]) -> List[Optional[float]]:
"""
Calculate MUSIQ scores for multiple images.
Args:
image_paths: List of image file paths
Returns:
List of MUSIQ scores or None for failed images
"""
try:
# Create MUSIQ metric once
musiq_metric = self.create_metric('musiq')
if musiq_metric is None:
return [None] * len(image_paths)
scores = []
for image_path in image_paths:
try:
score = musiq_metric(image_path)
# Convert to float if it's a tensor
if hasattr(score, 'item'):
score = float(score.item())
elif hasattr(score, 'tolist'):
score = float(score.tolist()[0])
else:
score = float(score)
scores.append(score)
logger.debug(f"MUSIQ score for {image_path}: {score}")
except Exception as e:
logger.error(f"Error calculating MUSIQ score for {image_path}: {e}")
scores.append(None)
return scores
except Exception as e:
logger.error(f"Error in batch MUSIQ calculation: {e}")
return [None] * len(image_paths)
def calculate_metric_score(self, metric_name: str, image_path: str,
reference_path: Optional[str] = None) -> Optional[float]:
"""
Calculate score for a specific metric.
Args:
metric_name: Name of the PyIQA metric
image_path: Path to the image file
reference_path: Path to reference image (for full-reference metrics)
Returns:
Metric score or None if failed
"""
try:
# Create metric
metric = self.create_metric(metric_name)
if metric is None:
return None
# Determine FR/NR with explicit lists first, then fallback to metric attribute
name_l = metric_name.lower()
explicit_fr = {
'ssim', 'ms_ssim', 'fsim', 'vif', 'dists', 'psnr', 'lpips', 'fid', 'vmaf'
}
explicit_nr = {
'brisque','niqe','piqe','nrqm','nima','paq2piq','dbcnn','hyperiqa',
'musiq','topiq_nr','clipiqa+_vitl14_512','maniqa','ahiq','unique','uranker'
}
if name_l in explicit_fr:
is_full_reference = True
elif name_l in explicit_nr:
is_full_reference = False
else:
try:
if hasattr(metric, 'metric_mode'):
is_full_reference = str(getattr(metric, 'metric_mode')).upper() == 'FR'
else:
is_full_reference = name_l in explicit_fr
except Exception:
is_full_reference = name_l in explicit_fr
if is_full_reference and reference_path is None:
logger.warning(f"Metric '{metric_name}' requires reference image")
return None
# Calculate score
if is_full_reference:
score = metric(image_path, reference_path)
else:
score = metric(image_path)
# Convert to float if it's a tensor
if hasattr(score, 'item'):
score = float(score.item())
elif hasattr(score, 'tolist'):
score = float(score.tolist()[0])
else:
score = float(score)
logger.debug(f"{metric_name} score for {image_path}: {score}")
return score
except Exception as e:
logger.error(f"Error calculating {metric_name} score for {image_path}: {e}")
return None
def get_metric_info(self, metric_name: str) -> Dict[str, Any]:
"""
Get information about a specific metric.
Args:
metric_name: Name of the metric
Returns:
Dictionary containing metric information
"""
try:
metric = self.create_metric(metric_name)
if metric is None:
return {}
info = {
'name': metric_name,
'available': True,
'device': self.device
}
# Get metric properties if available
if hasattr(metric, 'lower_better'):
info['lower_better'] = metric.lower_better
if hasattr(metric, 'metric_mode'):
info['mode'] = metric.metric_mode
# Determine if it's full-reference or no-reference
if metric_name.lower() in ['psnr', 'ssim', 'lpips', 'fid', 'vif', 'vmaf']:
info['type'] = 'full_reference'
info['description'] = f'Full-reference {metric_name.upper()} metric'
else:
info['type'] = 'no_reference'
info['description'] = f'No-reference {metric_name.upper()} metric'
return info
except Exception as e:
logger.error(f"Error getting info for metric '{metric_name}': {e}")
return {
'name': metric_name,
'available': False,
'error': str(e)
}
def get_musiq_info(self) -> Dict[str, Any]:
"""Get specific information about MUSIQ metric."""
return self.get_metric_info('musiq')
def get_metric_names_by_type(self) -> Dict[str, List[str]]:
"""Get metrics categorized by type."""
full_reference = []
no_reference = []
for metric_name in self.available_metrics:
if metric_name.lower() in ['psnr', 'ssim', 'lpips', 'fid', 'vif', 'vmaf']:
full_reference.append(metric_name)
else:
no_reference.append(metric_name)
return {
'full_reference': full_reference,
'no_reference': no_reference
}
def test_metric(self, metric_name: str, test_image_path: str,
reference_path: Optional[str] = None) -> bool:
"""
Test if a metric works with given images.
Args:
metric_name: Name of the metric to test
test_image_path: Path to test image
reference_path: Path to reference image (if needed)
Returns:
True if metric works, False otherwise
"""
try:
score = self.calculate_metric_score(metric_name, test_image_path, reference_path)
return score is not None
except Exception as e:
logger.error(f"Test failed for metric '{metric_name}': {e}")
return False
def calculate_all_pyiqa_metrics(self, image_path: str) -> Dict[str, float]:
"""
Calculate all available PyIQA metrics for an image.
Args:
image_path: Path to the image file
Returns:
Dictionary containing all PyIQA metric scores
"""
results = {}
# Define metrics to skip (require reference images or have download issues)
skip_metrics = {
'psnr', 'ssim', 'lpips', 'fid', 'vif', 'vmaf', 'brisque_matlab', 'ckdn',
'clipiqa', 'clipiqa_plus', 'clipiqa_plus_plus', 'clipiqa_plus_plus_plus'
}
# If the user provided an explicit subset, use that set (filtered for availability)
if self.selected_metrics:
metric_iter = [m for m in self.selected_metrics if m in self.available_metrics and m not in skip_metrics]
else:
# Define priority metrics that are known to work well
priority_metrics = [
'musiq', 'niqe', 'brisque', 'piqe', 'paq2piq', 'hyperiqa', 'dbcnn',
'nima', 'koncept512', 'koncept512_plus', 'koncept512_plus_plus'
]
# Start with priority, then others (excluding problematic ones)
remaining = [
m for m in self.available_metrics
if m not in skip_metrics and m not in priority_metrics
]
metric_iter = [m for m in priority_metrics if m in self.available_metrics] + remaining
# Calculate along the chosen iterator
for metric_name in metric_iter:
try:
score = self.calculate_metric_score(metric_name, image_path)
if score is not None:
results[f'pyiqa_{metric_name}'] = score
except Exception as e:
logger.debug(f"Failed to calculate {metric_name}: {e}")
continue
logger.info(f"Successfully calculated {len(results)} PyIQA metrics")
return results
def preload_priority_metrics(self) -> Dict[str, bool]:
"""
Preload priority metrics to avoid download delays during analysis.
Returns:
Dictionary mapping metric names to success status
"""
priority_metrics = [
'musiq', 'niqe', 'brisque', 'piqe', 'paq2piq', 'hyperiqa', 'dbcnn',
'nima', 'koncept512', 'koncept512_plus', 'koncept512_plus_plus'
]
results = {}
logger.info("Preloading priority PyIQA metrics...")
for metric_name in priority_metrics:
if metric_name in self.available_metrics:
try:
metric = self.create_metric(metric_name)
if metric is not None:
results[metric_name] = True
logger.info(f"✓ Preloaded {metric_name}")
else:
results[metric_name] = False
logger.warning(f"✗ Failed to preload {metric_name}")
except Exception as e:
results[metric_name] = False
logger.warning(f"✗ Failed to preload {metric_name}: {e}")
else:
results[metric_name] = False
logger.warning(f"{metric_name} not available")
success_count = sum(results.values())
logger.info(f"Preloaded {success_count}/{len(priority_metrics)} priority metrics")
return results
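A minimal usage sketch for `PyIQAMetrics` (metric names come from `pyiqa.list_models()`; the image paths are placeholders):
```python
from src.metrics.pyiqa_metrics import PyIQAMetrics

metrics = PyIQAMetrics(selected_metrics=["brisque", "niqe", "musiq"])

# No-reference score for a single image
print(metrics.calculate_metric_score("brisque", "data/task/cni/images/a.jpg"))

# Full-reference metrics such as ssim additionally need a reference image
print(metrics.calculate_metric_score(
    "ssim", "data/task/cni/images/a.jpg", reference_path="data/task/cni/images/reference.jpg"
))
```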

221
src/metrics/traditional_metrics.py Normal file

@@ -0,0 +1,221 @@
"""
Traditional IQA Metrics Implementation
Computes classical image statistics (sharpness, noise, color, edges) using OpenCV and NumPy.
"""
import cv2
import numpy as np
from typing import Dict, Optional, Tuple
import logging
logger = logging.getLogger(__name__)
class TraditionalMetrics:
"""Traditional IQA metrics using pyiqa and OpenCV."""
def __init__(self):
"""Initialize the traditional metrics calculator."""
self.metrics = {}
def calculate_all_metrics(self, image_path: str) -> Dict[str, float]:
"""
Calculate all traditional IQA metrics for an image.
Args:
image_path: Path to the image file
Returns:
Dictionary containing all calculated metrics
"""
try:
# Load image
image = cv2.imread(image_path)
if image is None:
logger.error(f"Failed to load image: {image_path}")
return {}
# Convert BGR to RGB
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Calculate metrics
metrics = {}
# Basic statistical metrics
metrics.update(self._calculate_statistical_metrics(image_rgb))
# Sharpness metrics
metrics.update(self._calculate_sharpness_metrics(image_rgb))
# Color metrics
metrics.update(self._calculate_color_metrics(image_rgb))
# Noise metrics
metrics.update(self._calculate_noise_metrics(image_rgb))
# Edge metrics
metrics.update(self._calculate_edge_metrics(image_rgb))
return metrics
except Exception as e:
logger.error(f"Error calculating metrics for {image_path}: {e}")
return {}
def _calculate_statistical_metrics(self, image: np.ndarray) -> Dict[str, float]:
"""Calculate basic statistical metrics."""
metrics = {}
# Convert to grayscale for some metrics
if len(image.shape) == 3:
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
else:
gray = image
# Brightness (mean intensity)
metrics['brightness'] = float(np.mean(gray))
# Contrast (standard deviation)
metrics['contrast'] = float(np.std(gray))
# Entropy (information content)
metrics['entropy'] = self._calculate_entropy(gray)
# Variance
metrics['variance'] = float(np.var(gray))
return metrics
def _calculate_sharpness_metrics(self, image: np.ndarray) -> Dict[str, float]:
"""Calculate sharpness-related metrics."""
metrics = {}
# Convert to grayscale
if len(image.shape) == 3:
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
else:
gray = image
# Laplacian variance (sharpness)
laplacian = cv2.Laplacian(gray, cv2.CV_64F)
metrics['sharpness_laplacian'] = float(np.var(laplacian))
# Sobel gradient magnitude
sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
gradient_magnitude = np.sqrt(sobelx**2 + sobely**2)
metrics['sharpness_sobel'] = float(np.mean(gradient_magnitude))
# Tenengrad (gradient-based sharpness)
metrics['sharpness_tenengrad'] = float(np.sum(gradient_magnitude))
return metrics
def _calculate_color_metrics(self, image: np.ndarray) -> Dict[str, float]:
"""Calculate color-related metrics."""
metrics = {}
if len(image.shape) != 3:
return metrics
# Colorfulness (saturation)
hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
metrics['colorfulness'] = float(np.mean(hsv[:, :, 1])) # Saturation channel
# Color variance
metrics['color_variance'] = float(np.var(image))
# RGB channel statistics
for i, channel in enumerate(['red', 'green', 'blue']):
metrics[f'{channel}_mean'] = float(np.mean(image[:, :, i]))
metrics[f'{channel}_std'] = float(np.std(image[:, :, i]))
return metrics
def _calculate_noise_metrics(self, image: np.ndarray) -> Dict[str, float]:
"""Calculate noise-related metrics."""
metrics = {}
# Convert to grayscale
if len(image.shape) == 3:
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
else:
gray = image
# Median absolute deviation (noise estimation)
median = np.median(gray)
mad = np.median(np.abs(gray - median))
metrics['noise_mad'] = float(mad)
# Noise level using high-frequency components
kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])
high_freq = cv2.filter2D(gray, cv2.CV_64F, kernel)
metrics['noise_high_freq'] = float(np.std(high_freq))
return metrics
def _calculate_edge_metrics(self, image: np.ndarray) -> Dict[str, float]:
"""Calculate edge-related metrics."""
metrics = {}
# Convert to grayscale
if len(image.shape) == 3:
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
else:
gray = image
# Canny edge detection
edges = cv2.Canny(gray, 100, 200)
edge_density = np.sum(edges > 0) / (edges.shape[0] * edges.shape[1])
metrics['edge_density'] = float(edge_density)
# Edge strength
sobelx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
sobely = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
edge_strength = np.sqrt(sobelx**2 + sobely**2)
metrics['edge_strength'] = float(np.mean(edge_strength))
return metrics
def _calculate_entropy(self, image: np.ndarray) -> float:
"""Calculate image entropy."""
hist = cv2.calcHist([image], [0], None, [256], [0, 256])
hist = hist / hist.sum()
hist = hist[hist > 0] # Remove zero probabilities
entropy = -np.sum(hist * np.log2(hist))
return float(entropy)
def get_metric_names(self) -> list:
"""Get list of all available metric names."""
return [
'brightness', 'contrast', 'entropy', 'variance',
'sharpness_laplacian', 'sharpness_sobel', 'sharpness_tenengrad',
'colorfulness', 'color_variance', 'red_mean', 'green_mean', 'blue_mean',
'red_std', 'green_std', 'blue_std',
'noise_mad', 'noise_high_freq',
'edge_density', 'edge_strength'
]
def get_metric_descriptions(self) -> Dict[str, str]:
"""Get descriptions for all metrics."""
return {
'brightness': 'Average pixel intensity (0-255)',
'contrast': 'Standard deviation of pixel values',
'entropy': 'Information content of the image',
'variance': 'Variance of pixel values',
'sharpness_laplacian': 'Sharpness using Laplacian operator',
'sharpness_sobel': 'Sharpness using Sobel gradient',
'sharpness_tenengrad': 'Sharpness using gradient magnitude',
'colorfulness': 'Color saturation level',
'color_variance': 'Overall color variance',
'red_mean': 'Red channel mean value',
'green_mean': 'Green channel mean value',
'blue_mean': 'Blue channel mean value',
'red_std': 'Red channel standard deviation',
'green_std': 'Green channel standard deviation',
'blue_std': 'Blue channel standard deviation',
'noise_mad': 'Noise level using median absolute deviation',
'noise_high_freq': 'Noise level using high-frequency components',
'edge_density': 'Percentage of edge pixels',
'edge_strength': 'Average edge strength'
}
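A minimal usage sketch for `TraditionalMetrics` (the image path is a placeholder):
```python
from src.metrics.traditional_metrics import TraditionalMetrics

traditional = TraditionalMetrics()
stats = traditional.calculate_all_metrics("data/task/cni/images/a.jpg")
for name in ("brightness", "contrast", "sharpness_laplacian", "noise_mad", "edge_density"):
    print(name, stats.get(name))
```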