
Building a document processing pipeline (Part 1): Preface

I am pleased to have a side project running reliably in production that integrates local Large Language Models (LLMs): a document processing web app that my girlfriend uses for her work.

The user interface. I had to redact the subject lines.

Part of her work (as far as I understand) is receiving many letters and memos, then categorizing, routing, logging, and monitoring them. The letters arrive on physical paper. Most of the people in her office who handle these tasks do so with pen and paper. My clever girlfriend, on the other hand, took the initiative to keep digital records in a spreadsheet. But that still takes some effort: typing out and summarizing the documents. And there I found a solid domain-specific use case for the LLMs that are all the rage these days.

So I will be writing a series of posts about how I built the document processing pipeline. The objective of this pipeline is to extract specific information from the documents: the recipient, the subject, the body, the sender, and a summary.
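To make that target concrete, here is a minimal sketch of what one extracted record could look like. The class name `ExtractedDocument`, the `to_row` helper, and the sample values are all hypothetical and only for illustration; the actual pipeline may store these fields differently.

```python
from dataclasses import dataclass, fields


@dataclass
class ExtractedDocument:
    """Hypothetical record for the five fields the pipeline extracts."""

    recipient: str
    subject: str
    body: str
    sender: str
    summary: str

    def to_row(self) -> list[str]:
        # Flatten the record in field order, e.g. to append as one spreadsheet row.
        return [getattr(self, f.name) for f in fields(self)]


# Made-up sample values, not real data from the app.
doc = ExtractedDocument(
    recipient="Head of Department",
    subject="Example subject",
    body="Example body text.",
    sender="Example Office",
    summary="One-line example summary.",
)
row = doc.to_row()
```

Keeping the fields in a single typed record like this means every later stage (OCR, LLM extraction, spreadsheet export) can agree on one shape for the output.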

Overview of the document processing pipeline.

I will keep updating this Preface post with links to future posts as they are published.

Part 1: Preface (this post)

Part 2: OCR and working with HEIF files