

#ocr


So if you’re using Mastodon on the web, you can press the ⚠️ALT button and then follow the “Detect text from picture” link.

On Mac/iOS, you can select text in images as if it were regular text by clicking/tapping and dragging, then copy it and paste that in (might be more accurate; that’s what I did).

PS. This was meant to be a reply to mastodon.social/@fatbrit/11421 but somehow didn’t get threaded correctly. (I was using the web client instead of Mona, and I somehow manage to do that there sometimes. It has happened before.) :)

Mistral #OCR is an Optical Character Recognition API that sets a new standard in document understanding. Unlike other models, Mistral OCR comprehends each element of a document (media, text, tables, equations) with unprecedented accuracy and cognition. It takes images and PDFs as input and extracts their content as ordered, interleaved text and images.
As a result, Mistral OCR is an ideal model to use in combination with a #RAG system taking multimodal documents (such as slides or complex PDFs) as input. #AI

mistral.ai/fr/news/mistral-ocr
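
A minimal sketch of how OCR output of this kind might feed a #RAG index, assuming each page arrives as a Markdown string with image references inline. The `pages` data, `chunk_pages`, and `retrieve` are hypothetical names made up for illustration, not Mistral's API, and plain word overlap stands in for a real embedding search.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    page: int   # 1-based page number the chunk came from
    text: str   # Markdown text, image references included


def chunk_pages(pages: list[str]) -> list[Chunk]:
    """Split each page's Markdown into paragraph-level chunks."""
    chunks = []
    for number, markdown in enumerate(pages, start=1):
        for block in re.split(r"\n\s*\n", markdown):
            block = block.strip()
            if block:
                chunks.append(Chunk(page=number, text=block))
    return chunks


def _words(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Rank chunks by word overlap with the query (a stand-in for embeddings)."""
    query_words = _words(query)
    scored = [(len(query_words & _words(c.text)), c) for c in chunks]
    scored = [(score, c) for score, c in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


# Made-up OCR output for a two-page slide deck (assumption, not real data).
pages = [
    "# Q3 results\n\nRevenue grew 12% year over year.\n\n![chart](img-0.jpeg)",
    "# Outlook\n\nWe expect revenue growth to slow in Q4.",
]
chunks = chunk_pages(pages)
top = retrieve("revenue growth in Q4", chunks, k=1)
```

Keeping the page number on every chunk means a retrieved passage can be traced back to its slide or PDF page, and image references stay next to the text they were interleaved with.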


I'm not convinced: "#LLM enhanced" #eScriptorium seems to amount to no more than

1. being able to use transformer models for OCR (extremely resource-intensive)
2. #OCR correction via #LLM (prompt-based)
3. #NER with #LLM (prompt-based)

In my naivety, none of this seems to justify the social, economic, and ecological impact of #LLM use. And I can't warm to this prompting approach either.
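
For anyone wondering what "prompt-based" means in item 2 of the list above, here is a minimal sketch, assuming the model is just a function from prompt text to answer text. `build_correction_prompt`, `correct_ocr`, and `fake_model` are hypothetical names for illustration; this is not how eScriptorium implements it.

```python
from typing import Callable


def build_correction_prompt(ocr_text: str) -> str:
    """Wrap raw OCR output in an instruction asking only for corrections."""
    return (
        "Correct obvious OCR errors in the text below. "
        "Do not rephrase, translate, or add content. "
        "Return only the corrected text.\n\n"
        f"---\n{ocr_text}\n---"
    )


def correct_ocr(ocr_text: str, complete: Callable[[str], str]) -> str:
    """Send the prompt to any text-completion callable and return its answer."""
    return complete(build_correction_prompt(ocr_text)).strip()


def fake_model(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline (assumption)."""
    # Pull out the text between the two '---' delimiter lines.
    body = prompt.split("---\n", 1)[1].rsplit("\n---", 1)[0]
    # Pretend the model fixes the classic 'rn' vs 'm' confusion.
    return body.replace("rn", "m")


corrected = correct_ocr("The rnodel reads the docurnent.", fake_model)
```

The whole "method" is the instruction string: nothing constrains the model to leave correct text alone, which is part of why the approach is hard to trust.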