Files
distillation/easydistill/mmkd/mllm_demo_distill.json

1306 lines
392 KiB
JSON

[
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie - 27518206_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"629309709\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"CLARISSE DEROLLEZ\", \"total_billed\": 50.0, \"bill_paid\": true, \"amount_paid\": 50.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Ang\\u00e9lique RINGARD\", \"beneficiary_dob\": null, \"invoice_date\": \"10-06-2023\", \"security_number\": null, \"invoice_issuer\": \"CLARISE DEROLEZ\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation de Psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": \"10-06-2023\", \"mandatory_coverage\": null, \"amount\": 50.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"629309709\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"CLARISSE DEROLLEZ\", \"total_billed\": 50.0, \"bill_paid\": true, \"amount_paid\": 50.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Ang\\u00e9lique RINGARD\", \"beneficiary_dob\": null, \"invoice_date\": \"10-06-2023\", \"security_number\": null, \"invoice_issuer\": \"CLARISSE DEROLLEZ\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation de Psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": \"10-06-2023\", \"mandatory_coverage\": null, \"amount\": 50.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/1498644-copie-de-l-acte-de-divorce-d-antonio-580x0-1_1_scale_1.0.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/quittance-de-loyer-exemple_1_scale_1.0.jpg"
},
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/quittance-de-loyer_1_scale_1.0.jpg"
},
{
"type": "text",
"text": "<image><image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/RF MO ODO 2018 2 8_1_scale_1.0.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/RIB Epitech-1_1_scale_1.0.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/lettre-justificatif-domicile-parents-L-w2s_7r_1_scale_1.0.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/RIB DU SIE SCEAUX-1_1_scale_1.0.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/mobilenew2_1_scale_1.0.jpg"
},
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/mobilenew3_1_scale_1.0.jpg"
},
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/mobilenew_1_scale_1.0.jpg"
},
{
"type": "text",
"text": "<image><image><image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Attestation de droits RO-25878827-1_1_scale_1.0.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/facture anonyme_1_scale_1.0.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": false, \"profession\": null, \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": null, \"total_billed\": null, \"bill_paid\": null, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": null, \"currency\": null, \"items\": [], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie - 27581557_1_scale_1.5.jpg"
},
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie - 27581557_2_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image><image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "[{\"is_bill\": true, \"profession\": \"Psychologue\", \"rpps_number\": \"10000895531\", \"doctor_name\": \"ALBERTO BLANQUER-LAGUARTA\", \"total_billed\": 78.5, \"bill_paid\": true, \"amount_paid\": 78.5, \"mandatory_coverage\": 42.5, \"complementary_coverage\": null, \"client_part\": 36.0, \"insured_name\": \"BOUZAT MARIE-CLAUDE\", \"insured_dob\": \"07-09-1960\", \"beneficiary_name\": \"BOUZAT MARIE-CLAUDE\", \"beneficiary_dob\": \"07-09-1960\", \"invoice_date\": \"23-06-2023\", \"security_number\": \"2600912208266\", \"invoice_issuer\": \"ALBERTO BLANQUER-LAGUARTA\", \"currency\": \"EUR\", \"items\": [{\"description\": \"CNP\", \"date_of_service\": \"23-06-2023\", \"mandatory_coverage\": 42.5, \"amount\": 78.5}], \"prescripteur_finess_number\": null}, {\"is_bill\": true, \"profession\": \"Psychiatrie\", \"rpps_number\": \"10000895531\", \"doctor_name\": \"ALBERTO BLANQUER-LAGUARTA\", \"total_billed\": 78.5, \"bill_paid\": true, \"amount_paid\": 78.5, \"mandatory_coverage\": 42.5, \"complementary_coverage\": null, \"client_part\": 36.0, \"insured_name\": \"BOUZAT MARIE-CLAUDE\", \"insured_dob\": \"07-09-1960\", \"beneficiary_name\": \"BOUZAT MARIE-CLAUDE\", \"beneficiary_dob\": \"07-09-1960\", \"invoice_date\": \"30-06-2023\", \"security_number\": \"2600912208266\", \"invoice_issuer\": \"ALBERTO BLANQUER-LAGUARTA\", \"currency\": \"EUR\", \"items\": [{\"description\": \"CNP\", \"date_of_service\": \"30-06-2023\", \"mandatory_coverage\": 42.5, \"amount\": 78.5}], \"prescripteur_finess_number\": null}]"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\n \"is_bill\": true,\n \"profession\": \"Psychiatrie\",\n \"adeli_number\": null,\n \"rpps_number\": \"10000895531\",\n \"finess_number\": null,\n \"doctor_name\": \"ALBERTO BLANQUER-LAGUARTA\",\n \"prescripteur_finess_number\": null,\n \"total_billed\": 78.5,\n \"bill_paid\": true,\n \"amount_paid\": 78.5,\n \"mandatory_coverage\": 42.5,\n \"complementary_coverage\": 0.0,\n \"client_part\": 36.0,\n \"remaining_payment\": null,\n \"insured_name\": \"BOUZAT MARIE-CLAUDE\",\n \"insured_dob\": \"07-09-1960\",\n \"beneficiary_name\": \"BOUZAT MARIE-CLAUDE\",\n \"beneficiary_dob\": \"07-09-1960\",\n \"invoice_date\": \"23-06-2023\",\n \"security_number\": \"2600912208266\",\n \"invoice_issuer\": \"ALBERTO BLANQUER-LAGUARTA\",\n \"currency\": \"EUR\",\n \"items\": [\n {\n \"description\": \"CNP\",\n \"quantity\": 1,\n \"date_of_service\": \"23-06-2023\",\n \"mandatory_coverage\": 42.5,\n \"amount\": 78.5\n }\n ]\n}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie - 27641339_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"789327913\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Marie-Lucie GORGA\", \"total_billed\": 65.0, \"bill_paid\": true, \"amount_paid\": 65.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Audrey LE MAUFF\", \"beneficiary_dob\": null, \"invoice_date\": \"07-07-2023\", \"security_number\": null, \"invoice_issuer\": \"Marie-Lucie GORGA\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Entretien th\\u00e9rapeutique\", \"quantity\": 1, \"date_of_service\": \"07-07-2023\", \"mandatory_coverage\": null, \"amount\": 65}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"789327913\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Marie-Lucie GORGA\", \"total_billed\": 65.0, \"bill_paid\": true, \"amount_paid\": 65.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Audrey LE MAUFF\", \"beneficiary_dob\": null, \"invoice_date\": \"07-07-2023\", \"security_number\": null, \"invoice_issuer\": \"Marie-Lucie GORGA\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Entretien th\\u00e9rapeutique\", \"quantity\": 1, \"date_of_service\": \"07-07-2023\", \"mandatory_coverage\": null, \"amount\": 65}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED A-B-2481575_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"8900001472\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Marie-Laure Caussanel\", \"total_billed\": 160, \"bill_paid\": true, \"amount_paid\": 160, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"BUYS Audrey\", \"beneficiary_dob\": \"01-03-1975\", \"invoice_date\": \"30-05-2023\", \"security_number\": null, \"invoice_issuer\": \"Marie-Laure Caussanel\", \"currency\": \"EUR\", \"items\": [{\"description\": \"s\\u00e9ance(s) de psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": \"28-01-2023\", \"mandatory_coverage\": null, \"amount\": 80}, {\"description\": \"s\\u00e9ance(s) de psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": \"11-02-2023\", \"mandatory_coverage\": null, \"amount\": 80}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"8900001472\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Marie-Laure Caussanel\", \"prescripteur_finess_number\": null, \"total_billed\": 160, \"bill_paid\": true, \"amount_paid\": 160, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": \"BUYS Audrey\", \"insured_dob\": \"01-03-1975\", \"beneficiary_name\": \"BUYS Audrey\", \"beneficiary_dob\": \"01-03-1975\", \"invoice_date\": \"30-05-2023\", \"security_number\": null, \"invoice_issuer\": \"Marie-Laure Caussanel\", \"currency\": \"EUR\", \"items\": [{\"description\": \"séance(s) de psychothérapie\", \"quantity\": 1, \"date_of_service\": \"28-01-2023\", \"mandatory_coverage\": null, \"amount\": 80}, {\"description\": \"séance(s) de psychothérapie\", \"quantity\": 1, \"date_of_service\": \"11-02-2023\", \"mandatory_coverage\": null, \"amount\": 80}]}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED A-B22-677661_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"029302213\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"LEF\\u00c8VRE H\\u00e9l\\u00e8ne\", \"total_billed\": 50, \"bill_paid\": true, \"amount_paid\": 50, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"BEAURAIN Gabriel\", \"beneficiary_dob\": \"30-06-2011\", \"invoice_date\": \"12-05-2023\", \"security_number\": null, \"invoice_issuer\": \"LEF\\u00c8VRE H\\u00e9l\\u00e8ne El\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"12-05-2023\", \"mandatory_coverage\": null, \"amount\": 50}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"029302213\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"LEFÈVRE Hélène\", \"total_billed\": 50, \"bill_paid\": true, \"amount_paid\": 50, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"BEAURAIN Gabriel\", \"beneficiary_dob\": \"30-06-2011\", \"invoice_date\": \"12-05-2023\", \"security_number\": null, \"invoice_issuer\": \"LEFÈVRE Hélène El\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"12-05-2023\", \"mandatory_coverage\": null, \"amount\": 50}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED A-B22-677662_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"029302213\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"LEF\\u00c8VRE H\\u00e9l\\u00e8ne\", \"total_billed\": 50, \"bill_paid\": true, \"amount_paid\": 50, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"BEAURAIN Th\\u00e9o\", \"beneficiary_dob\": \"10-12-2013\", \"invoice_date\": \"26-05-2023\", \"security_number\": null, \"invoice_issuer\": \"LEF\\u00c8VRE H\\u00e9l\\u00e8ne El\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"26-05-2023\", \"mandatory_coverage\": null, \"amount\": 50}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"029302213\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"LEFÈVRE Hélène\", \"total_billed\": 50, \"bill_paid\": true, \"amount_paid\": 50, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"BEAURAIN Théo\", \"beneficiary_dob\": \"10-12-2013\", \"invoice_date\": \"26-05-2023\", \"security_number\": null, \"invoice_issuer\": \"LEFÈVRE Hélène El\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"26-05-2023\", \"mandatory_coverage\": null, \"amount\": 50}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED K-M1-1228144_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"169300134\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Nathalie LABARCHEDE\", \"total_billed\": 120, \"bill_paid\": true, \"amount_paid\": 120, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"C\\u00e9line LE GUELLANFF\", \"beneficiary_dob\": null, \"invoice_date\": \"13-06-2023\", \"security_number\": null, \"invoice_issuer\": \"Nathalie LABARCHEDE\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation psychologique\", \"quantity\": 1, \"date_of_service\": \"13-06-2023\", \"mandatory_coverage\": null, \"amount\": 120}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"169300134\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Nathalie LABARCHEDE\", \"total_billed\": 120, \"bill_paid\": true, \"amount_paid\": 120, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Céline LE GUELLANFF\", \"beneficiary_dob\": null, \"invoice_date\": \"13-06-2023\", \"security_number\": null, \"invoice_issuer\": \"Nathalie LABARCHEDE\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation psychologique\", \"quantity\": 1, \"date_of_service\": \"13-06-2023\", \"mandatory_coverage\": null, \"amount\": 120}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED N-R-2391367_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"769315011\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"TABET Laarem\", \"total_billed\": 65, \"bill_paid\": true, \"amount_paid\": 65, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"PERRET Andr\\u00e9as\", \"beneficiary_dob\": null, \"invoice_date\": \"17-04-2023\", \"security_number\": null, \"invoice_issuer\": \"TABET Laarem\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"17-04-2023\", \"mandatory_coverage\": null, \"amount\": 65}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"769315011\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"TABET Laarem\", \"total_billed\": 65, \"bill_paid\": true, \"amount_paid\": 65, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"PERRET Andr\\u00e9as\", \"beneficiary_dob\": null, \"invoice_date\": \"17-04-2023\", \"security_number\": null, \"invoice_issuer\": \"TABET Laarem\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"17-04-2023\", \"mandatory_coverage\": null, \"amount\": 65}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED N-R-2391368_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"769315011\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"TABET Laarem\", \"total_billed\": 65, \"bill_paid\": true, \"amount_paid\": 65, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"PERRET Andr\\u00e9as\", \"beneficiary_dob\": null, \"invoice_date\": \"26-05-2023\", \"security_number\": null, \"invoice_issuer\": \"TABET Laarem\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"26-05-2023\", \"mandatory_coverage\": null, \"amount\": 65}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"769315011\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"TABET Laarem\", \"total_billed\": 65, \"bill_paid\": true, \"amount_paid\": 65, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"PERRET Andr\\u00e9as\", \"beneficiary_dob\": null, \"invoice_date\": \"26-05-2023\", \"security_number\": null, \"invoice_issuer\": \"TABET Laarem\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"26-05-2023\", \"mandatory_coverage\": null, \"amount\": 65}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED N-R-2391503_2_scale_1.5.jpg"
},
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED N-R-2391503_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image><image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"659301832\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Maud GREFFIER\", \"total_billed\": 50, \"bill_paid\": true, \"amount_paid\": 50, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Gabriel Piques\", \"beneficiary_dob\": null, \"invoice_date\": \"11-07-2023\", \"security_number\": null, \"invoice_issuer\": \"Maud GREFFIER\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"11-07-2023\", \"mandatory_coverage\": null, \"amount\": 50}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"659301832\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Maud GREFFIER\", \"total_billed\": 50, \"bill_paid\": true, \"amount_paid\": 50, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Gabriel Piques\", \"beneficiary_dob\": null, \"invoice_date\": \"11-07-2023\", \"security_number\": null, \"invoice_issuer\": \"Maud GREFFIER\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"11-07-2023\", \"mandatory_coverage\": null, \"amount\": 50}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED N-R1-941543_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"679300335\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"HEINTZ Yannick\", \"total_billed\": 120, \"bill_paid\": true, \"amount_paid\": 120, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"ROCHE Elisabeth\", \"beneficiary_dob\": null, \"invoice_date\": \"05-06-2023\", \"security_number\": null, \"invoice_issuer\": \"HEINTZ Yannick\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultations du 23 mai et 5 juin 2023\", \"quantity\": 2, \"date_of_service\": \"23-06-2023\", \"mandatory_coverage\": null, \"amount\": 120}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"679300335\", \"rpps_number\": \"79285035600025\", \"finess_number\": null, \"doctor_name\": \"HEINTZ Yannick\", \"total_billed\": 120, \"bill_paid\": true, \"amount_paid\": 120, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"ROCHE Elisabeth\", \"beneficiary_dob\": null, \"invoice_date\": \"05-06-2023\", \"security_number\": null, \"invoice_issuer\": \"HEINTZ Yannick\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultations du 23 mai et 5 juin 2023\", \"quantity\": 2, \"date_of_service\": \"23-05-2023\", \"mandatory_coverage\": null, \"amount\": 120}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED N-R1-946126_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"519302764\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Stephen Dehoul\", \"total_billed\": 60.0, \"bill_paid\": true, \"amount_paid\": 60.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0.0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Vincent Oberlin\", \"beneficiary_dob\": null, \"invoice_date\": \"26-06-2023\", \"security_number\": null, \"invoice_issuer\": \"Stephen Dehoul - Psychologue Reims\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"26-06-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"519302764\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Stephen Dehoul\", \"total_billed\": 60.0, \"bill_paid\": true, \"amount_paid\": 60.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0.0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Vincent Oberlin\", \"beneficiary_dob\": null, \"invoice_date\": \"26-06-2023\", \"security_number\": null, \"invoice_issuer\": \"Stephen Dehoul - Psychologue Reims\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"26-06-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED N-R1-948960_1_scale_1.5.jpg"
},
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED N-R1-948960_2_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image><image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"609312582\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Sophie NORMAND\", \"total_billed\": 60.0, \"bill_paid\": true, \"amount_paid\": 60.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Mickael PAYET\", \"beneficiary_dob\": null, \"invoice_date\": \"07-07-2023\", \"security_number\": null, \"invoice_issuer\": \"Sophie NORMAND\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychoth\\u00e9rapie du 07/07/2023\", \"quantity\": 1, \"date_of_service\": \"07-07-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"609312582\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Sophie NORMAND\", \"total_billed\": 60.0, \"bill_paid\": true, \"amount_paid\": 60.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Mickael PAYET\", \"beneficiary_dob\": null, \"invoice_date\": \"07-07-2023\", \"security_number\": null, \"invoice_issuer\": \"Sophie NORMAND\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychoth\\u00e9rapie du 07/07/2023\", \"quantity\": 1, \"date_of_service\": \"07-07-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED S-Z22-1003798_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"689309136\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"KLIMPEL GEOFFROY\", \"total_billed\": 50.0, \"bill_paid\": true, \"amount_paid\": 50.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"NATHANAEL SCHULLER\", \"beneficiary_dob\": null, \"invoice_date\": \"24-06-2023\", \"security_number\": null, \"invoice_issuer\": \"CABINET LIBERAL KLIMPEL GEOFFROY\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Entretien psychologique\", \"quantity\": 1, \"date_of_service\": \"24-06-2023\", \"mandatory_coverage\": null, \"amount\": 50.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"689309136\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"KLIMPEL GEOFFROY\", \"total_billed\": 50.0, \"bill_paid\": true, \"amount_paid\": 50.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"NATHANAEL SCHULLER\", \"beneficiary_dob\": null, \"invoice_date\": \"24-06-2023\", \"security_number\": null, \"invoice_issuer\": \"CABINET LIBERAL KLIMPEL GEOFFROY\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Entretien psychologique\", \"quantity\": 1, \"date_of_service\": \"24-06-2023\", \"mandatory_coverage\": null, \"amount\": 50.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED S-Z22-1003799_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"689309136\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"KLIMPEL GEOFFROY\", \"total_billed\": 50, \"bill_paid\": true, \"amount_paid\": 50, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"NATHANAEL SCHULLER\", \"beneficiary_dob\": null, \"invoice_date\": \"24-07-2023\", \"security_number\": null, \"invoice_issuer\": \"CABINET LIBERAL KLIMPEL GEOFFROY\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Entretien psychologique\", \"quantity\": 1, \"date_of_service\": \"24-07-2023\", \"mandatory_coverage\": null, \"amount\": 50}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"689309136\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"KLIMPEL GEOFFROY\", \"total_billed\": 50, \"bill_paid\": true, \"amount_paid\": 50, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"NATHANAEL SCHULLER\", \"beneficiary_dob\": null, \"invoice_date\": \"24-07-2023\", \"security_number\": null, \"invoice_issuer\": \"CABINET LIBERAL KLIMPEL GEOFFROY\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Entretien psychologique\", \"quantity\": 1, \"date_of_service\": \"24-07-2023\", \"mandatory_coverage\": null, \"amount\": 50}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED S-Z22-1003800_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"689309136\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"KLIMPEL GEOFFROY\", \"total_billed\": 50.0, \"bill_paid\": true, \"amount_paid\": 50.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"NATHANAEL SCHULLER\", \"beneficiary_dob\": null, \"invoice_date\": \"21-08-2023\", \"security_number\": null, \"invoice_issuer\": \"CABINET LIBERAL KLIMPEL GEOFFROY\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Entretien psychologique\", \"quantity\": 1, \"date_of_service\": \"21-08-2023\", \"mandatory_coverage\": null, \"amount\": 50.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"689309136\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"KLIMPEL GEOFFROY\", \"total_billed\": 50.0, \"bill_paid\": true, \"amount_paid\": 50.0, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"NATHANAEL SCHULLER\", \"beneficiary_dob\": null, \"invoice_date\": \"21-08-2023\", \"security_number\": null, \"invoice_issuer\": \"CABINET LIBERAL KLIMPEL GEOFFROY\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Entretien psychologique\", \"quantity\": 1, \"date_of_service\": \"21-08-2023\", \"mandatory_coverage\": null, \"amount\": 50.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture psychologie-GED S-Z22-996335_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"759350168\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Sandra Ferrara\", \"total_billed\": 65, \"bill_paid\": true, \"amount_paid\": 65, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Enora LE BARS TERRAZA\", \"beneficiary_dob\": null, \"invoice_date\": \"02-07-2023\", \"security_number\": null, \"invoice_issuer\": \"Sandra Ferrara\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"30-06-2023\", \"mandatory_coverage\": null, \"amount\": 65}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"759350168\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Sandra Ferrara\", \"total_billed\": 65, \"bill_paid\": true, \"amount_paid\": 65, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Enora LE BARS TERRAZA\", \"beneficiary_dob\": null, \"invoice_date\": \"02-07-2023\", \"security_number\": null, \"invoice_issuer\": \"Sandra Ferrara\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation\", \"quantity\": 1, \"date_of_service\": \"30-06-2023\", \"mandatory_coverage\": null, \"amount\": 65}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27103224_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"2B9300348\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"FRANCHI C\\u00e9line\", \"total_billed\": 60, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"CLEMENTE NEVIA\", \"beneficiary_dob\": \"02-07-1977\", \"invoice_date\": \"02-01-2023\", \"security_number\": null, \"invoice_issuer\": \"CENTRE PSYCHOTHERAPIE\", \"currency\": \"EUR\", \"items\": [{\"description\": \"1 s\\u00e9ance de Psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": \"02-01-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"2B9300348\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"FRANCHI C\\u00e9line\", \"total_billed\": 60, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"CLEMENTE NEVIA\", \"beneficiary_dob\": \"02-07-1977\", \"invoice_date\": \"02-01-2023\", \"security_number\": null, \"invoice_issuer\": \"CENTRE PSYCHOTHERAPIE\", \"currency\": \"EUR\", \"items\": [{\"description\": \"1 s\\u00e9ance de Psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": \"02-01-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27126402_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Marie-Laure CHOUKAIR\", \"total_billed\": 60, \"bill_paid\": true, \"amount_paid\": 60, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Laura Mycyszyn\", \"beneficiary_dob\": null, \"invoice_date\": \"13-01-2023\", \"security_number\": null, \"invoice_issuer\": \"Marie-Laure CHOUKAIR\", \"currency\": \"EUR\", \"items\": [{\"description\": \"consultation en psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": \"13-01-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Marie-Laure CHOUKAIR\", \"total_billed\": 60, \"bill_paid\": true, \"amount_paid\": 60, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Laura Mycyszyn\", \"beneficiary_dob\": null, \"invoice_date\": \"13-01-2023\", \"security_number\": null, \"invoice_issuer\": \"Marie-Laure CHOUKAIR\", \"currency\": \"EUR\", \"items\": [{\"description\": \"consultation en psychothérapie\", \"quantity\": 1, \"date_of_service\": \"13-01-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27146817_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"749309787\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Caroline SERALTA\", \"total_billed\": 60.0, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": \"Gonzalez Aur\\u00e9lie\", \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": \"27-01-2023\", \"security_number\": \"287029520305257\", \"invoice_issuer\": \"Caroline SERALTA\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation psychologique\", \"quantity\": 1, \"date_of_service\": \"27-01-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"749309787\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Caroline SERALTA\", \"total_billed\": 60.0, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": \"Gonzalez Aur\\u00e9lie\", \"insured_dob\": null, \"beneficiary_name\": null, \"beneficiary_dob\": null, \"invoice_date\": \"27-01-2023\", \"security_number\": \"287029520305257\", \"invoice_issuer\": \"Caroline SERALTA\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultation psychologique\", \"quantity\": 1, \"date_of_service\": \"27-01-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27148581_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"669302515\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"CATHERINE BANET-BOUAZIZ\", \"total_billed\": 60, \"bill_paid\": true, \"amount_paid\": 60, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Kais Zeggou-Cotte\", \"beneficiary_dob\": null, \"invoice_date\": \"08-02-2023\", \"security_number\": null, \"invoice_issuer\": \"CATHERINE BANET-BOUAZIZ\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychologie\", \"quantity\": 1, \"date_of_service\": \"08-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"669302515\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"CATHERINE BANET-BOUAZIZ\", \"total_billed\": 60, \"bill_paid\": true, \"amount_paid\": 60, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Kais Zeggou-Cotte\", \"beneficiary_dob\": null, \"invoice_date\": \"08-02-2023\", \"security_number\": null, \"invoice_issuer\": \"CATHERINE BANET-BOUAZIZ\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychologie\", \"quantity\": 1, \"date_of_service\": \"08-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27151190_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"929317485\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Cohen Nathalie\", \"total_billed\": 120, \"bill_paid\": true, \"amount_paid\": 120, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Nila Ramos\", \"beneficiary_dob\": null, \"invoice_date\": \"24-01-2023\", \"security_number\": null, \"invoice_issuer\": \"Cohen Nathalie\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychologie individuelle\", \"quantity\": 1, \"date_of_service\": \"02-01-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}, {\"description\": \"S\\u00e9ance de psychologie individuelle\", \"quantity\": 1, \"date_of_service\": \"24-01-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"929317485\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Cohen Nathalie\", \"total_billed\": 120, \"bill_paid\": true, \"amount_paid\": 120, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Nila Ramos\", \"beneficiary_dob\": null, \"invoice_date\": \"24-01-2023\", \"security_number\": null, \"invoice_issuer\": \"Cohen Nathalie\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychologie individuelle\", \"quantity\": 1, \"date_of_service\": \"02-01-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}, {\"description\": \"S\\u00e9ance de psychologie individuelle\", \"quantity\": 1, \"date_of_service\": \"24-01-2023\", \"mandatory_coverage\": null, \"amount\": 60.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27162326_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"261048854\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"VAUGAN VALERIE\", \"total_billed\": 105.0, \"bill_paid\": true, \"amount_paid\": 62.5, \"mandatory_coverage\": 42.5, \"complementary_coverage\": null, \"client_part\": 62.5, \"remaining_payment\": 0, \"insured_name\": \"SAMIRI BAHIJA\", \"insured_dob\": \"10-09-1973\", \"beneficiary_name\": \"SAMIRI BAHIJA\", \"beneficiary_dob\": \"10-09-1973\", \"invoice_date\": \"06-02-2023\", \"security_number\": \"2730999350481\", \"invoice_issuer\": \"Madame VAUGAN VALERIE\", \"currency\": \"EUR\", \"items\": [{\"description\": \"CNP 1\", \"quantity\": 1, \"date_of_service\": null, \"mandatory_coverage\": 42.5, \"amount\": 105}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"261048854\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"VAUGAN VALERIE\", \"total_billed\": 105.0, \"bill_paid\": true, \"amount_paid\": 62.5, \"mandatory_coverage\": 42.5, \"complementary_coverage\": null, \"client_part\": 62.5, \"remaining_payment\": 0, \"insured_name\": \"SAMIRI BAHIJA\", \"insured_dob\": \"10-09-1973\", \"beneficiary_name\": \"SAMIRI BAHIJA\", \"beneficiary_dob\": \"10-09-1973\", \"invoice_date\": \"06-02-2023\", \"security_number\": \"2730999350481\", \"invoice_issuer\": \"Madame VAUGAN VALERIE\", \"currency\": \"EUR\", \"items\": [{\"description\": \"CNP 1\", \"quantity\": 1, \"date_of_service\": null, \"mandatory_coverage\": 42.5, \"amount\": 105}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27163243_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"599333044\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Anne Flahaut\", \"total_billed\": 70.0, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"D\\u00e9borah Roose\", \"beneficiary_dob\": null, \"invoice_date\": \"30-01-2023\", \"security_number\": null, \"invoice_issuer\": \"Anne Flahaut El\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": null, \"mandatory_coverage\": null, \"amount\": 70.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"599333044\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Anne Flahaut\", \"total_billed\": 70.0, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"D\\u00e9borah Roose\", \"beneficiary_dob\": null, \"invoice_date\": \"30-01-2023\", \"security_number\": null, \"invoice_issuer\": \"Anne Flahaut El\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": null, \"mandatory_coverage\": null, \"amount\": 70.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27195007_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"919313718\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Claire LANGELIER\", \"total_billed\": 60, \"bill_paid\": true, \"amount_paid\": 60, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Emma Thomi\\u00e8re\", \"beneficiary_dob\": null, \"invoice_date\": \"31-01-2023\", \"security_number\": null, \"invoice_issuer\": \"Claire LANGELIER\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultations psychologiques\", \"quantity\": 1, \"date_of_service\": \"31-01-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"919313718\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Claire LANGELIER\", \"total_billed\": 60, \"bill_paid\": true, \"amount_paid\": 60, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Emma Thomi\\u00e8re\", \"beneficiary_dob\": null, \"invoice_date\": \"31-01-2023\", \"security_number\": null, \"invoice_issuer\": \"Claire LANGELIER\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Consultations psychologiques\", \"quantity\": 1, \"date_of_service\": \"31-01-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27195869_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"649309903\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Becerra Ang\\u00e9lique\", \"total_billed\": 147.0, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Campano Antoine\", \"beneficiary_dob\": null, \"invoice_date\": \"17-01-2023\", \"security_number\": null, \"invoice_issuer\": \"La Clinique E-sant\\u00e9\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychoth\\u00e9rapie \\u00e0 distance\", \"quantity\": 1, \"date_of_service\": \"17-01-2023\", \"mandatory_coverage\": null, \"amount\": 147.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"649309903\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Becerra Ang\\u00e9lique\", \"total_billed\": 147.0, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Campano Antoine\", \"beneficiary_dob\": null, \"invoice_date\": \"17-01-2023\", \"security_number\": null, \"invoice_issuer\": \"La Clinique E-sant\\u00e9\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychoth\\u00e9rapie \\u00e0 distance\", \"quantity\": 1, \"date_of_service\": \"17-01-2023\", \"mandatory_coverage\": null, \"amount\": 147.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27195883_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"939314530\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Benard Eliana\", \"total_billed\": 147.0, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Campano Antoine\", \"beneficiary_dob\": null, \"invoice_date\": \"17-02-2023\", \"security_number\": null, \"invoice_issuer\": \"La Clinique E-sant\\u00e9\", \"currency\": \"EUR\", \"items\": [{\"description\": \"S\\u00e9ance de psychoth\\u00e9rapie \\u00e0 distance\", \"quantity\": 1, \"date_of_service\": \"17-02-2023\", \"mandatory_coverage\": null, \"amount\": 147.0}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"939314530\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Benard Eliana\", \"total_billed\": 147.0, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Campano Antoine\", \"beneficiary_dob\": null, \"invoice_date\": \"17-02-2023\", \"security_number\": null, \"invoice_issuer\": \"La Clinique E-santé\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Séance de psychothérapie à distance\", \"quantity\": 1, \"date_of_service\": \"17-02-2023\", \"mandatory_coverage\": null, \"amount\": 147.0}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27196883_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"069317675\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Prune LEROY\", \"total_billed\": 60, \"bill_paid\": true, \"amount_paid\": 60, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Nicolas CARDONA\", \"beneficiary_dob\": null, \"invoice_date\": \"07-02-2023\", \"security_number\": null, \"invoice_issuer\": \"Prune LEROY\", \"currency\": \"EUR\", \"items\": [{\"description\": \"consultation psychologique\", \"quantity\": 1, \"date_of_service\": \"07-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"069317675\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Prune LEROY\", \"total_billed\": 60, \"bill_paid\": true, \"amount_paid\": 60, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Nicolas CARDONA\", \"beneficiary_dob\": null, \"invoice_date\": \"07-02-2023\", \"security_number\": null, \"invoice_issuer\": \"Prune LEROY\", \"currency\": \"EUR\", \"items\": [{\"description\": \"consultation psychologique\", \"quantity\": 1, \"date_of_service\": \"07-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27199895_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"599334174\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Cassandra Carrier\", \"total_billed\": 240, \"bill_paid\": true, \"amount_paid\": 240, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": 240, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Beaumont Julien\", \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": \"Cassandra Carrier\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"05-09-2022\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"12-09-2022\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"19-09-2022\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"26-09-2022\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"599334174\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Cassandra Carrier\", \"total_billed\": 240, \"bill_paid\": true, \"amount_paid\": 240, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": 240, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Beaumont Julien\", \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": \"Cassandra Carrier\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"05-09-2022\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"12-09-2022\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"19-09-2022\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"26-09-2022\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27199897_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": \"599334174\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Cassandra Carrier\", \"total_billed\": 240, \"bill_paid\": true, \"amount_paid\": 240, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": 240, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Beaumont Emilie\", \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": \"Cassandra Carrier\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"30-01-2023\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"06-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"13-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"20-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologie\", \"adeli_number\": \"599334174\", \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Cassandra Carrier\", \"total_billed\": 240, \"bill_paid\": true, \"amount_paid\": 240, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": 240, \"remaining_payment\": 0, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Beaumont Emilie\", \"beneficiary_dob\": null, \"invoice_date\": null, \"security_number\": null, \"invoice_issuer\": \"Cassandra Carrier\", \"currency\": \"EUR\", \"items\": [{\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"30-01-2023\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"06-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"13-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}, {\"description\": \"Couple - Consultation de psychologie\", \"quantity\": 1, \"date_of_service\": \"20-02-2023\", \"mandatory_coverage\": null, \"amount\": 60}], \"prescripteur_finess_number\": null}"
}
]
}
],
[
{
"role": "system",
"content": "You are an advanced AI agent created by Rizlum AI. You are designed to extract structured information from health invoices with high accuracy. Your task is to parse invoices and return only the requested fields in a strict JSON format.\n\n### **General Instructions**\n1. **Extract Only the Specified Fields**: Do not include extra information.\n2. **Do Not Guess or hallucinate if information is missing or represented by placeholders (e.g., dots, dashes).**\n3. **Ignore irrelevant fields (e.g., address, SIRET, membership numbers).**.\n4. **Ensure Strictly Valid JSON Output**: Do not return additional text or explanations.\n5. **Field Relationship Guidance**: Formula: total_bill = mandatory_coverage + complementary_coverage + client_part. Instruction: Prioritize extracting all values directly and only if they appear on the invoice. This formula is a guide to verify the consistency of extracted numbers, not a command to calculate a missing total_bill\n\n### **Handling Ambiguous Cases**\n- **Adeli Number**: If a 9-digit number appears without the keyword 'Adeli', check if it matches the Adeli number format and is associated with a recognized healthcare professional.\n- **Doctor Selection**:\n - If the invoice shows multiple doctors, exclude any doctor that is visibly crossed out.\n - Prioritize doctor information (e.g., name, Adeli, RPPS) within a stamp (identified by visual stamp features like borders or official markings) over unstamped doctor blocks. Exclude unstamped doctor information if a stamped block exists.\n- **Item Selection in Tables**:\n - If multiple items or acts are listed, extract only those that are highlighted (e.g., marked with color).\n - Ignore all other items that are not explicitly marked or priced.\n- **Date**:\n - Distinguish carefully between similar characters: treat '/1' as '1' (e.g., January), not '11' (e.g., November), by focusing on stroke separation and context rather than assuming a slash implies a specific number.\n"
},
{
"role": "user",
"content": [
{
"type": "image",
"image": "/home/nguyendc/model-factory/Finetuning-Automation/etc/data/media/docai_mgp_facture_v2_1/Facture-quittance toutes spécialités-27200192_1_scale_1.5.jpg"
},
{
"type": "text",
"text": "<image>Extract the following structured information from the provided invoice. Fill in only existing values.\nStrictly return a valid JSON following this schema:\n\n**Json schema**\n{\n \"type\": \"object \",\n \"properties\": {\n \"is_bill\": {\n \"type\": \"boolean\",\n \"description\": \"True if the document is an invoice, false otherwise.\"\n },\n \"profession\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Type of healthcare profession, if it is presented in the list [Optique, Kinésiologie, Kinésithérapie, Pharmacie, Biologie, Psychologie, Infirmier, Ostéopathie, Dentaire, Sage-femme, Sophrologie, Soins hospitaliers, Orthopédie, Podologie, Diététique, Radiologie, Orthophonie, Pédiatrie, Assurance Maladie, Pompes funèbres, Laboratoire, Gynécologie-obstétrique, Chiropractie, Psychomotricité, Ostéodensitométrie, Pneumologie, Vaccins, Sevrage tabagique, Contraception, Homéopathie, Acupunture], Unknown otherwise.\"\n },\n \"adeli_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Adeli number (9-digit identifier) associated with the healthcare provider\"\n },\n \"rpps_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"11 digits identifier, indicated after the term 'RPPS'\"\n },\n \"finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"9 digits identifier, indicated after one of the terms in list ['finess', 'identifiant CPAM']\"\n },\n \"doctor_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the doctor\"\n },\n \"prescripteur_finess_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Finess number of the prescriber in the invoice (9 digits identifier, indicated after the term 'finess')\"\n },\n \"total_billed\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The total amount billed on the invoice\"\n },\n \"bill_paid\": {\n \"type\": \"boolean\",\n \"description\": \"True if the invoice has been paid, false otherwise (Look for terms like: 'acquittée', 'payée', 'quittance', 'réglée', 'certifie avoir reçu le règlement')\"\n },\n \"amount_paid\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The amount paid for the invoice\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by compulsory health insurance (indicated after terms like 'AMO', 'Rbmt RO', 'CAISSE', 'Noemie', etc.)\"\n },\n \"complementary_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by complementary insurance (indicated after terms like 'AMC', 'RC', 'Mutuelle')\"\n },\n \"client_part\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount paid by client (indicated after terms like 'ASSURE', 'Part Client', 'Part Assuré')\"\n },\n \"remaining_payment\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"The remaining balance to be paid by the beneficiary if the invoice is unpaid.\"\n },\n \"insured_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the insured person (indicated after terms like 'Assure')\"\n },\n \"insured_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the insured person (format: dd-mm-yyyy)\"\n },\n \"beneficiary_name\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Full name of the invoice beneficiary\"\n },\n \"beneficiary_dob\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of birth of the beneficiary (format: dd-mm-yyyy)\"\n },\n \"invoice_date\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of the invoice (format: dd-mm-yyyy)\"\n },\n \"security_number\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Social Security number (13 or 15 digit identifier, indicated after terms like 'Sécurité Social' ou 'N° INSEE' ou 'N° SS')\"\n },\n \"invoice_issuer\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Name or organization issuing the invoice or providing the service\"\n },\n \"currency\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Currency used (e.g., EUR, USD)\"\n },\n \"items\": {\n \"type\": \"array\",\n \"description\": \"List of items or services included in the invoice.\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"description\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Description of the item or service.\"\n },\n \"quantity\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Quantity of the item or service.\"\n },\n \"date_of_service\": {\n \"type\": [\"string\", \"null\"],\n \"description\": \"Date of service (when the item was provided), in format dd-mm-yyyy.\"\n },\n \"mandatory_coverage\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Amount covered by mandatory health insurance for this item.\"\n },\n \"amount\": {\n \"type\": [\"number\", \"null\"],\n \"description\": \"Total amount for the item (unit price * quantity).\"\n }\n }\n }\n }\n }\n}\n\n"
}
]
},
{
"role": "assistant",
"content": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Dominique Brun\", \"total_billed\": 70, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Kevin Batilliot\", \"beneficiary_dob\": null, \"invoice_date\": \"22-02-2023\", \"security_number\": null, \"invoice_issuer\": \"M. Dominique BRUN\", \"currency\": \"EUR\", \"items\": [{\"description\": \"s\\u00e9ance de Psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": \"22-02-2023\", \"mandatory_coverage\": null, \"amount\": 70}], \"prescripteur_finess_number\": null}"
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "{\"is_bill\": true, \"profession\": \"Psychologue\", \"adeli_number\": null, \"rpps_number\": null, \"finess_number\": null, \"doctor_name\": \"Dominique Brun\", \"total_billed\": 70, \"bill_paid\": false, \"amount_paid\": null, \"mandatory_coverage\": null, \"complementary_coverage\": null, \"client_part\": null, \"remaining_payment\": null, \"insured_name\": null, \"insured_dob\": null, \"beneficiary_name\": \"Kevin Batilliot\", \"beneficiary_dob\": null, \"invoice_date\": \"22-02-2023\", \"security_number\": null, \"invoice_issuer\": \"M. Dominique BRUN\", \"currency\": \"EUR\", \"items\": [{\"description\": \"s\\u00e9ance de Psychoth\\u00e9rapie\", \"quantity\": 1, \"date_of_service\": \"22-02-2023\", \"mandatory_coverage\": null, \"amount\": 70}], \"prescripteur_finess_number\": null}"
}
]
}
]
]