Improve OCR Results with Sparrow (running on Streamlit/Python and Ngrok)

PythonEssence
Views: 21
Upload date: 02.12.2023 16:35
Duration: 00:11:36
Category: Education

Description

OCR often returns fields in a different order from one document to the next. But to produce a dataset for fine-tuning a data extraction ML model (for example, Donut), the fields in every document must be ordered correctly. Sparrow, our open-source solution for data annotation/labeling, includes functionality for reordering OCRed fields. In this video, I explain and show how it works.
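
To make the reordering idea concrete, here is a minimal Python sketch. It is not Sparrow's implementation: the `FIELD_ORDER` list, the `reorder_fields` helper, and the sample field labels are all hypothetical, and it assumes each OCRed field is a dict with a `label` key. It only shows one simple way to bring fields into a fixed order before building a fine-tuning dataset.

```python
from typing import Dict, List

# Hypothetical canonical order of field labels; not taken from Sparrow.
FIELD_ORDER = ["invoice_no", "invoice_date", "seller", "total"]


def reorder_fields(
    fields: List[Dict[str, str]], order: List[str] = FIELD_ORDER
) -> List[Dict[str, str]]:
    """Return fields sorted by the position of their label in `order`.

    Labels missing from `order` go to the end, keeping their OCR order.
    """
    rank = {label: i for i, label in enumerate(order)}
    # sorted() is stable, so fields with equal rank keep their relative order.
    return sorted(fields, key=lambda f: rank.get(f["label"], len(order)))


# Sample OCR output in an arbitrary order (made-up values).
ocr_fields = [
    {"label": "total", "value": "120.00"},
    {"label": "invoice_no", "value": "INV-42"},
    {"label": "notes", "value": "paid"},
    {"label": "invoice_date", "value": "2023-12-02"},
]

reordered = reorder_fields(ocr_fields)
print([f["label"] for f in reordered])
```

Sorting against a fixed label list means every document ends up with the same field sequence, whatever order the OCR engine emitted.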

Sparrow - data extraction from documents with ML:
https://github.com/katanaml/sparrow

Sparrow UI running on Hugging Face Spaces:
https://katanaml-org-sparrow-ui.hf.space

0:00 Introduction
0:40 Sparrow
1:15 OCRed Results Reordering
4:50 Deployment with NGROK
5:40 Deployment with Hugging Face Spaces
8:15 Code
9:27 NGROK
10:30 Summary

CONNECT:
- Subscribe to this YouTube channel
- Twitter: https://twitter.com/andrejusb
- LinkedIn: https://www.linkedin.com/in/andrej-baranovskij/
- Medium: https://medium.com/@andrejusb

#machinelearning #python #data
