This project contains the companion code for GPT-4o versus Azure Document Intelligence and Azure Computer Vision OCR
| Notebook Name | Description |
|---|---|
| PdfToTextPages.ipynb | Baseline, makes a text file for each page using pypdf to extract the text |
| PdfToPageImages.ipynb | Given a folder of PDF files, converts each page of each PDF file into JPEG images using a resolution of 300 dpi, and saves them in a structured directory format. |
| DocIntelligencePipeline.ipynb | C# Polyglot notebook with functions to convert an entire PDF to markdown and another that creates an OCR markdown file from each image created using PdfToPageImages.ipynb |
| turbo-2024-04-09.ipynb | Azure Open AI using GPT-4 with vision to create a markdown file for each image created using PdfToPageImages.ipynb |
| v4omni.ipynb | OpenAI using GPT-4o to create a markdown file for each image created using PdfToPageImages.ipynb |
| v4omni-image-plus-docIntelOcr.ipynb | OpenAI using GPT-4o to create a markdown file for each image created using PdfToPageImages.ipynb grounded with OCR text created using DocIntelligencePipeline.ipynb |
| visionWithOcr.ipynb | Azure Computer Vision GPT4-Vision OCR |
| visionWithOcrAndGrounding.ipynb | Azure Computer Vision GPT4-Vision OCR and grounding |