Generative AI in medical documentation processing

The project was designed to transform the health claim processing workflow from a labor-intensive task into an automated, efficient, and reliable system. The initiative focused on digitizing physical documents, categorizing them, extracting key facts, and generating comprehensive medical summaries and reports.

Client

ICR Sp. z o.o.

Industry

Insurance

Market

Poland

Engagement

PoC

Scope

Generative AI workflow

Team Size

2 Developers, QA, PM

PoC Duration

2 months

Partnership

6 years (ongoing)
Project description

Medatex is a leading insurance claim management solution used by over 50 insurance companies in Europe. It supports claim handling based on legal, medical, and insurance standards, gives adjusters analytical tools for assessing the situation of the injured parties and the extent of their damages, and automatically generates documents, including templates for correspondence with the injured party or their representative and decision templates.

The business objective: automate the health claim dispatch process in a pilot project to assess the potential gains in process efficiency and time savings.

Project results
85%
Accuracy of document
digitization with OCR
> 50%
Potential reduction in
claims processing time
≈ 95%
Accuracy of document
digitization and fact retrieval
About the problem

In the healthcare sector, the dispatch of health claims involves processing an extensive array of documents, including medical reports, examination results, lab tests, medical procedures, and billing information. Traditionally, this process has been manual, time-consuming, and prone to errors, leading to delays in claims processing and increased operational costs.

The project aimed to:

leverage Generative AI and Optical Character Recognition (OCR) technologies

automate the health claim dispatch process

enhance efficiency, accuracy, and patient satisfaction.

Project scope
Step 1
Document Digitization

The first step converted scanned documents into digital text using OCR technology. This phase was crucial because the scanned documents varied widely in type and quality. The OCR solutions employed were capable of handling various text formats, handwriting, and even low-quality scans, which kept digitization accuracy high.

Step 2
Document Categorization

Once digitized, the documents were categorized into predefined classes such as medical reports, lab tests, and billing documents. This categorization was facilitated by a machine learning model trained on a large dataset of annotated healthcare documents. The model was fine-tuned to recognize and categorize documents accurately, even when the formats and templates varied significantly.

Step 3
Key Facts Retrieval

The extraction of key facts from the categorized documents was the next critical step. Using natural language processing (NLP) and machine learning algorithms, the system identified and extracted pertinent information such as patient names, birthdates, addresses, ICD codes, and details of medical procedures. The AI model was trained to understand the context and semantics of the healthcare domain, ensuring a high level of precision in fact retrieval.
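
As an illustration of the kind of check that can sit behind fact retrieval, the sketch below pulls ICD-like codes out of digitized text so they can be cross-checked against what the model returned. It is not the project's extraction code: the class name `IcdCodeFinder` is hypothetical, and the pattern assumes a simplified ICD-10 shape (one letter, two digits, optional decimal part), which does not cover every real code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not the project's actual extraction logic. */
final class IcdCodeFinder {
    // Assumption: simplified ICD-10 shape such as "M54" or "S52.5".
    private static final Pattern ICD10 =
            Pattern.compile("\\b[A-Z]\\d{2}(?:\\.\\d{1,2})?\\b");

    private IcdCodeFinder() {}

    /** Returns every substring that looks like an ICD-10 code, in order of appearance. */
    static List<String> findCodes(String text) {
        List<String> codes = new ArrayList<>();
        Matcher matcher = ICD10.matcher(text);
        while (matcher.find()) {
            codes.add(matcher.group());
        }
        return codes;
    }
}
```

A pattern check like this only confirms that a value has the right shape. Whether the code is the correct diagnosis still has to come from the model output and human review.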

Step 4
Medical Summary and Report Generation

The final step involved synthesizing the extracted information into coherent medical summaries and reports. Generative AI models, trained on a vast corpus of medical texts, were employed to generate summaries that were both accurate and easily comprehensible. These summaries provided a consolidated view of the patient's medical history and current claims, significantly aiding in the decision-making process.
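
One way to keep a generated summary tied to the retrieved facts is to pass those facts to the model explicitly. The sketch below only assembles such a prompt. The class name `SummaryPrompt`, the wording of the instructions, and the fact keys are assumptions made for illustration, and the prompts actually used in the project are not described in this case study.

```java
import java.util.Map;

/** Hypothetical prompt builder for illustration; the project's real prompts are not published. */
final class SummaryPrompt {
    private SummaryPrompt() {}

    /**
     * Builds a plain-text prompt that lists the extracted facts one per line.
     * Pass a map with a stable iteration order (for example LinkedHashMap)
     * if the order of the facts matters.
     */
    static String build(Map<String, String> facts) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("Summarize the claimant's medical history for a claims adjuster.\n");
        prompt.append("Use only the facts listed below; do not add information.\n\n");
        prompt.append("Facts:\n");
        for (Map.Entry<String, String> fact : facts.entrySet()) {
            prompt.append("- ").append(fact.getKey()).append(": ")
                  .append(fact.getValue()).append('\n');
        }
        return prompt.toString();
    }
}
```

The resulting string would then be sent to whichever generative AI engine is selected. That call is omitted here because it depends on the provider and model.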

Key features
Generative AI
base
OCR
Optical Character Recognition
API
workflow
HIPAA / GDPR
compliance
Project timeline
2 weeks
Ideation

During the initial phase, we performed a business analysis and established a clear problem definition, including success criteria for the customer. Our analysts documented the current business process of claim management, highlighting the predominance of manual tasks. We detailed each step of the process, specifying the input and output, and also created a set of test data.

2 months
Proof of Concept

In this phase, we deployed various prototype solutions to assess top generative AI engines, aiming to choose the one that aligns with both customer needs and process requirements. We developed precise automated test cases to investigate the limits of accuracy and efficiency. We also established an automated system for processing documents and extracting facts. A thorough analysis of the outcomes was conducted alongside detailed statistical evaluations.
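
An automated accuracy test of this kind can be sketched as a comparison between extracted fields and a hand-labelled reference. The example below is a minimal illustration, not the project's test suite: the class name `ExtractionAccuracy`, the field names, and the exact-match rule (after trimming and case-folding) are all assumptions, and the case study does not state how its accuracy figures were calculated.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical field-level accuracy metric for illustration. */
final class ExtractionAccuracy {
    private ExtractionAccuracy() {}

    /**
     * Fraction of expected fields whose extracted value matches after
     * trimming and case-folding. Fields missing from the extraction count as errors.
     */
    static double fieldAccuracy(Map<String, String> expected, Map<String, String> extracted) {
        if (expected.isEmpty()) {
            throw new IllegalArgumentException("expected must contain at least one field");
        }
        int correct = 0;
        for (Map.Entry<String, String> field : expected.entrySet()) {
            String actual = extracted.get(field.getKey());
            if (actual != null && normalize(actual).equals(normalize(field.getValue()))) {
                correct++;
            }
        }
        return (double) correct / expected.size();
    }

    private static String normalize(String value) {
        return value.trim().toLowerCase(Locale.ROOT);
    }
}
```

For example, if two of four labelled fields are extracted correctly for a document, `fieldAccuracy` returns 0.5. Averaging this value over the test set gives one possible accuracy figure per engine.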

ongoing
MVP

The objective of the upcoming phase is to implement the solution on a small scale with actual cases, while also focusing on refining the model and improving cost efficiency. All data will continue to be reviewed through a human-assisted process.

Tech stack
OpenAI
Nvidia
Java
JavaScript
Tesseract
Tool stack
Jira Cloud

Project Management

Confluence Cloud

Project Documentation

Slack

Project Communication

Technical description

The technological backbone of this project was a strategic combination of OpenAI and Nvidia AI tools, chosen for their efficiency and cost-effectiveness. To digitize the documents, we employed a suite of OCR technologies, with Tesseract playing a pivotal role due to its versatility and wide adoption.
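
For readers unfamiliar with Tesseract, the sketch below shows one simple way to call it from Java: invoking the `tesseract` command-line tool and reading the recognized text from standard output. This is an assumption about integration style, not a description of how the project wired Tesseract in. The class name `TesseractCli` is hypothetical, the `tesseract` binary must be installed and on the PATH, and the language code (for example `pol` for Polish) requires the matching language data to be installed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical wrapper around the tesseract command-line tool, for illustration only. */
final class TesseractCli {
    private TesseractCli() {}

    /** Builds the command: tesseract <image> stdout -l <language> */
    static List<String> command(Path image, String language) {
        return List.of("tesseract", image.toString(), "stdout", "-l", language);
    }

    /** Runs tesseract and returns the recognized text; fails if the exit code is non-zero. */
    static String recognize(Path image, String language) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(image, language))
                .redirectError(ProcessBuilder.Redirect.DISCARD) // ignore diagnostic output
                .start();
        String text = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return text;
    }
}
```

A call such as `TesseractCli.recognize(Path.of("scan.png"), "pol")` would return the raw text of one scanned page, which the later categorization and fact retrieval steps could then consume.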

Recognizing the diverse linguistic nuances present in medical documents, we also developed and deployed custom models specifically tailored to address language-specific challenges. This approach ensured not only the high fidelity of digitized text but also the nuanced understanding necessary for accurate categorization and information extraction in subsequent stages.