Automated accounting process using Machine Learning

Want to automate your bill reconciliation process? Accounting made easy and automatic with data science and computer vision.

OCR

Problem statement

Accounting is an integral part of business finance management. Keeping track of manual bills is exhausting, and errors are likely to creep in when large numbers of them are handled. Our client is an accounting company that wanted to automate parts of its bill reconciliation process so that the accounting could be done faster and with fewer errors. The client's customers submit their manual bills for accounting. The client wanted an OCR system to convert expense receipt stubs, stored as scanned documents and images, into text. To achieve this, we were required to extract fields from each expense receipt stub, such as date, total price, and tax.

Our solution

This project involved automatic OCR conversion of receipt stubs into textual CSV data. We gathered the client's dataset of receipt scans and performed preliminary data cleanup and grouping. Because all the documents used standard fonts and languages, no custom OCR model was needed, and we used Tesseract's pre-trained LSTM model. The next goal was to add intelligence to the system by automatically identifying specific text fields in the OCR output. We used a natural language processing framework to model the context of the text produced by the OCR engine. We then packaged the solution as a library and integrated it into the client's existing desktop application. The user loads a collection of scanned receipts, and the OCR engine produces a set of CSV files corresponding to the input files.
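The flow described above (OCR text in, named fields out, CSV as the result) can be illustrated with a minimal Python sketch. It is not the delivered library: the regular expressions stand in for the natural language processing step as a deliberate simplification, and the function names, the patterns, and the three-column layout are all assumptions made for illustration. The OCR step assumes the pytesseract wrapper and Pillow are installed, with `--oem 1` selecting Tesseract's LSTM engine.

```python
import csv
import io
import re

# Hypothetical patterns; real receipts vary far more than this.
DATE_RE = re.compile(r"\b(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})\b")
AMOUNT = r"(\d+(?:[.,]\d{2}))"
TOTAL_RE = re.compile(r"^\s*(?:grand\s+)?total\b[^\d\n]*" + AMOUNT,
                      re.IGNORECASE | re.MULTILINE)
TAX_RE = re.compile(r"^\s*(?:sales\s+)?tax\b[^\d\n]*" + AMOUNT,
                    re.IGNORECASE | re.MULTILINE)


def ocr_image(path):
    """Run Tesseract's pre-trained LSTM engine on one scanned receipt."""
    # Imported lazily so field extraction can be tried without Tesseract.
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(Image.open(path), config="--oem 1")


def extract_fields(text):
    """Pick the first date, total and tax found in the OCR text."""
    def first(pattern):
        match = pattern.search(text)
        return match.group(1) if match else ""
    return {"date": first(DATE_RE),
            "total": first(TOTAL_RE),
            "tax": first(TAX_RE)}


def to_csv(rows):
    """Serialise extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["date", "total", "tax"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

In use, `to_csv([extract_fields(ocr_image(path))])` would yield the CSV text for a single scan. Anchoring the total and tax patterns at the start of a line keeps a "Subtotal" line from being mistaken for the total.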

Key metrics

This project was developed over 15 weeks. The OCR engine removed the need for manual text conversion, and the OCR accuracy was above 95%.
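How the accuracy figure was measured is not stated here. One common convention is character-level accuracy: one minus the edit distance between the OCR output and a hand-checked reference, divided by the reference length. The sketch below follows that assumed definition, and the function names are illustrative only.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def char_accuracy(reference, ocr_output):
    """Character accuracy in [0, 1], relative to the reference text."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))
```

Under this definition, a single misread character in an 11-character line (for example the letter O in place of the final 0 in "Total 19.50") gives an accuracy of 10/11, or about 91%.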

Technology stack

Tesseract, OpenCV, TensorFlow, NLTK
