Developing Data Extraction API Software for Health Insurance Cards

Processing Health Insurance Cards Automatically and Securely with AI

The Client Introduction

Our US client from the health industry needed an AI solution for extracting and processing insurance details from various card layouts. The objective was to save time, reduce costs, and eliminate errors in the insurance domain. Although OCR technology effectively extracted printed information, it faced limitations when claim-payer IDs were not visible on the cards. The ultimate goal of this project was to empower healthcare providers by streamlining insurance-related tasks and allowing them to prioritise patient care. The project marked a significant advancement beyond OCR, improving the handling of healthcare insurance data.

Top Benefits

  • We automated the insurance data processing with unprecedented speed and accuracy using machine learning techniques.
  • We developed an end-to-end pipeline that completes the entire process in 2-2.5 seconds.
  • We eliminated the need for manual labelling on insurance cards.
  • We provided our clients with an API that allows seamless integration of the enhanced solution into their workflow.

Executive Summary

Industry Overview: About the US Health Insurance Industry

The individual health insurance market is experiencing significant growth in 2023, with over 3.6 million new consumers and a wide selection of 88 plans. This year marks the tenth anniversary of the US health insurance exchanges, and consumer participation has risen by 25% to approximately 16 million. This increase can be attributed to extended enrollment periods and improved subsidies. Insurer participation continues to grow, although growth has slowed slightly in 2023. National insurers have expanded their participation the most, while insurtech participation has declined as some have exited the market. Consumer choice has increased dramatically, with 87% of individuals having access to three or more insurers in 2023. Silver plan premiums have risen by a modest 4% this year, and premium rates have increased across all metal tiers following several years of ups and downs. Open enrollment has the potential to continue growing. In addition, individuals seeking more comprehensive coverage can choose from a wide range of insurance products, including HMOs and EPOs.

Business Challenge: Traditional OCR Systems

Traditional insurance processing poses significant challenges, demanding considerable time and resources. This is compounded by the staggering value of denied health insurance claims in the United States, which totals over $262 billion annually. On top of that, patient registration and insurance processing have a 27% error rate, costing an additional $71 billion. These erroneous transactions account for 1/60th of US healthcare spending and a third of hospital administrative costs. The contributing factors include the costly need for insurance expertise, unclear information on insurance cards, and the complexity of selecting the right payer. Moreover, error-prone traditional OCR systems add to the problem, with a 3% error rate on health insurance data.

The Akaike Edge

Built-in libraries and deep learning (DL) models with transfer learning capabilities

Impact Delivered

  • By reducing the processing time from 7 seconds to just 2 seconds, we have significantly improved the operational efficiency and user experience.
  • Replacing two KV extractor models increased our model accuracy by a remarkable 99%, ensuring more reliable outcomes for our clients.
  • Our system has successfully processed over 80,000 cards in one day, showcasing its extensive capability to manage large volumes effectively.
Blend of Vision AI and Deep Learning

Extracting Card from the Image

We used the YOLOv5 object detection model to locate the card in each image and crop it out, removing background clutter.
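The cropping step that follows detection can be sketched roughly as below. This is an illustration only, not the project's code: it assumes the detector has already returned boxes as (x1, y1, x2, y2, confidence) tuples in pixel coordinates, represents the image as plain rows of pixel values, and uses a hypothetical helper name (crop_best_card) and a hypothetical confidence threshold.

```python
from typing import List, Optional, Tuple

# Assumed detection format: (x1, y1, x2, y2, confidence) in pixel coordinates.
Detection = Tuple[int, int, int, int, float]
# Assumed image format: a non-empty grayscale image stored as rows of pixel values.
Image = List[List[int]]


def crop_best_card(image: Image, detections: List[Detection],
                   min_conf: float = 0.5) -> Optional[Image]:
    """Crop the highest-confidence card box, or return None if none qualifies."""
    candidates = [d for d in detections if d[4] >= min_conf]
    if not candidates:
        return None
    x1, y1, x2, y2, _ = max(candidates, key=lambda d: d[4])
    height, width = len(image), len(image[0])
    # Clamp the box to the image bounds before slicing.
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    # Keep only the rows and columns inside the box, discarding the background.
    return [row[x1:x2] for row in image[y1:y2]]
```

Returning None when no box clears the threshold lets a caller decide whether to fall back to the full image or reject the upload.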

Text Extraction via Azure OCR

Ensuring a comprehensive approach, we simultaneously processed the front and back card images through the Azure OCR API. This dynamic tool provided us with accurate and speedy information retrieval from the cards.
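One way to process both sides at the same time is to submit the two OCR requests to a small thread pool. The sketch below is an assumption about how this might look, not the actual integration: ocr_both_sides and fake_ocr are hypothetical names, and fake_ocr merely stands in for a real call to the OCR service so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def ocr_both_sides(front: bytes, back: bytes,
                   ocr_fn: Callable[[bytes], List[str]]) -> Dict[str, List[str]]:
    """Run ocr_fn on the front and back images concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        front_future = pool.submit(ocr_fn, front)
        back_future = pool.submit(ocr_fn, back)
        # result() blocks until each request finishes and re-raises its errors.
        return {"front": front_future.result(), "back": back_future.result()}


def fake_ocr(image_bytes: bytes) -> List[str]:
    """Stand-in for a real OCR request; it just decodes the bytes into lines."""
    return image_bytes.decode("utf-8").splitlines()


if __name__ == "__main__":
    print(ocr_both_sides(b"Member ID\n12345", b"Claims: 800-555-0100", fake_ocr))
```

Because an OCR request spends most of its time waiting on the network, running the two calls in threads should bring the total wait close to that of the slower side rather than the sum of both.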

KV Extractor Model (LiLT) for Efficient Insurance Card Extraction

Our commitment to precision led us to train the LiLT model on a robust dataset of around 80,000 insurance cards. This model excels in classifying essential words into their respective labels. The integration of 'Regular Expressions' further facilitated the extraction of vital information, such as phone numbers and website details, from insurance cards. To elevate the accuracy of our solution, we implemented the MobileNetV3 model to classify insurance cards. This strategic choice ensures a streamlined and efficient process, enhancing the overall performance of our system.
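The regular-expression step for phone numbers and websites might look something like the following. The patterns are illustrative assumptions (US-style phone numbers, and web addresses that start with www. or http(s)://), not the expressions used in the project, and extract_contacts is a hypothetical helper name.

```python
import re
from typing import Dict, List

# Assumed pattern: US-style phone numbers, with an optional leading country code 1.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
)
# Assumed pattern: web addresses that begin with http://, https://, or www.
WEBSITE_RE = re.compile(r"\b(?:https?://|www\.)[^\s,;]+")


def extract_contacts(ocr_text: str) -> Dict[str, List[str]]:
    """Pull phone numbers and website addresses out of raw OCR text."""
    return {
        "phones": PHONE_RE.findall(ocr_text),
        "websites": WEBSITE_RE.findall(ocr_text),
    }


if __name__ == "__main__":
    sample = ("Customer Service: 1-800-555-0199  "
              "Provider line (800) 555-0142 visit www.examplehealth.com")
    print(extract_contacts(sample))
```

Fields with a predictable shape, such as these, suit pattern matching well, which leaves the learned model to handle labels whose position and wording vary from card to card.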
Author: Shilpa Ramaswamy
Date: November 18, 2023