Docling

Docling · 2026-05-07T15:46:50.743Z

Technology, Information and Internet

Get your documents ready for gen AI

About us

Docling unlocks the information trapped in your PDFs, Office files, images, and more, so you can automate document processing and build AI applications with ease and speed.

Website: https://docling.ai
Industry: Technology, Information and Internet
Company size: 2-10 employees
Type: Nonprofit

Updates

Docling

6,317 followers
14h
Make sure you don’t miss this talk if you are in the Atlanta area!
Michele Dolfi, PhD
18h

🚀 Docling is heading to #RedHatSummit 2026 in Atlanta, May 11–14!

Next week, the Docling team and friends from Red Hat will be at the Georgia World Congress Center for a packed lineup of talks, a hands-on lab, and lots of conversations about scalable and agentic document AI. If you're working on document processing, RAG, agentic AI over unstructured data, or large-scale ingestion pipelines, come find us. 🤝

📅 Sessions on the agenda:

🔹 Mon, May 11 · 4:15–4:35 PM · B401-B402, Level 4
Meet Docling: The Swiss Army knife for scalable document AI!
Lightning talk with Peter W. J. Staar & Michele Dolfi, PhD

🔹 Mon, May 11 · 4:40–5:00 PM · B401-B402, Level 4
From simple RHEL docs to intelligent answers: Building AI systems with Docling and Red Hat AI
Lightning talk with Ali Maredia, Máirín Duffy & Major Hayden

🔹 Tue, May 12 · 3:30–5:00 PM · A312, Level 3
Building intelligent apps with Python and RAG: From raw data to real-time insights
Hands-on lab with Christopher Nuland, Cedric Clyburn & Natale Vinto

🔹 Wed, May 13 · 12:15–12:35 PM · DevZone Theater
Large-scale data processing with RayData and Docling
Lightning talk with Jehlum Vitasta Pandit & Michele Dolfi, PhD

📍 Where to find us between sessions:
IBM Booth #160: live demos and use-case deep dives
Community Central: say hi, grab Docling stickers and swag

🔗 Up-to-date schedule (sessions, rooms): https://lnkd.in/eJF8Jw8r

Whether you're already building with Docling, evaluating it for your team, or just curious how open-source document AI fits into agentic workflows and the Red Hat ecosystem (OpenShift AI, AI Inference Server, Kubernetes-native tooling), we'd love to chat. DM me, stop by the booth, or catch us after a session. ☕

See you in Atlanta! 🍑

#RedHatSummit #Docling #OpenSource #DocumentAI #RAG #AgenticAI #GenAI #OpenShiftAI
Red Hat IBM LF AI & Data Foundation
Docling reposted this
Iprova

4,702 followers
16h Edited
On 19 May at IMD Business School, Lausanne, Peter W. J. Staar (IBM), Tarek R. Besold (SonyAI), Sebastien Massart (Dassault Systèmes) and Dr Harry Cronin (Iprova) take to the stage at 09:30, with Maaike Van Velzen (Iprova Advisory Board) as moderator. The session brings together experts from IBM, SonyAI, Dassault Systèmes and Iprova to explore the technology behind next-gen product development, including AI-enabled knowledge discovery, scientific discovery, invention and simulation, and what this means for the way new products are created. Join us in Lausanne on 19 May. See the full programme and register at https://lnkd.in/eKq9p85D
Docling reposted this
PebbleRoad

3,590 followers
1d
🎁 Our second contribution to the AI community: solving the broken table problem.

If you're extracting documents for RAG pipelines, you know that paginated tables (tables split across PDF pages) are incredibly difficult. A page break severs the bottom half of a table from its headers, leaving the data completely orphaned and causing instant LLM hallucinations.

We figured out a clean way to solve this. Instead of relying on heavy Vision models or LLMs to guess the missing context, we fix the table at the source before it ever reaches your chunker.

Presenting table-stitcher 🧵
It automatically detects fragmented tables across document pages and seamlessly stitches them back into one single, continuous table. Every row is reunited with its rightful headers. No context lost, no complex workarounds needed. (Pro-tip: It pairs perfectly with our previous release, table2rules!)

🐍 Pure Python
🚫 Zero ML models (100% deterministic)
⚖️ MIT-licensed

Docling adapter included! Code and PyPI link in the comments. Enjoy! 👇 https://lnkd.in/gs_fy83e
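To make the idea of stitching paginated tables concrete, here is a minimal, deterministic sketch in pure Python. It is not table-stitcher's actual code or API. The `TableFragment` and `StitchedTable` types and the continuation rule are assumptions made for illustration. The rule treats a fragment as a continuation when it sits on the next page, has the same column count, and has either no header or a repeated one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableFragment:
    """One piece of a table extracted from a single page (hypothetical type)."""
    page: int
    header: Optional[List[str]]  # None when the page break cut off the header
    rows: List[List[str]]


@dataclass
class StitchedTable:
    """A table rebuilt from consecutive fragments (hypothetical type)."""
    header: Optional[List[str]]
    rows: List[List[str]]
    first_page: int
    last_page: int

    def records(self) -> List[dict]:
        """Pair every row with the header so that no row is left orphaned."""
        if self.header is None:
            return []
        return [dict(zip(self.header, row)) for row in self.rows]


def _width(header: Optional[List[str]], rows: List[List[str]]) -> int:
    """Column count, taken from the header if present, else from the first row."""
    if header is not None:
        return len(header)
    return len(rows[0]) if rows else 0


def is_continuation(prev: StitchedTable, frag: TableFragment) -> bool:
    """Assumed heuristic: next page, same width, missing or repeated header."""
    on_next_page = frag.page == prev.last_page + 1
    same_width = _width(prev.header, prev.rows) == _width(frag.header, frag.rows)
    header_ok = frag.header is None or frag.header == prev.header
    return on_next_page and same_width and header_ok


def stitch_tables(fragments: List[TableFragment]) -> List[StitchedTable]:
    """Merge fragments in page order into continuous tables."""
    stitched: List[StitchedTable] = []
    for frag in sorted(fragments, key=lambda f: f.page):
        if stitched and is_continuation(stitched[-1], frag):
            # Append the orphaned rows to the table that owns their header.
            stitched[-1].rows.extend(frag.rows)
            stitched[-1].last_page = frag.page
        else:
            stitched.append(
                StitchedTable(frag.header, list(frag.rows), frag.page, frag.page)
            )
    return stitched
```

A real implementation would likely need more signals, such as column positions or the table's location on the page. This sketch only shows why a rule-based approach needs no ML model.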
Docling reposted this
Eric Baquol
2w
You’re probably using Claude wrong. Not in a “you broke it” way. In a “leaving money on the table” way. Most people paste in a PDF or a Word doc and wonder why they burn through their session limit so fast. Here’s the thing nobody tells you up front: the file format you use matters way more than how long your prompt is. Convert your files to Markdown before you paste them in. I know, sounds like nerd homework. But the token savings are absurd and your session will last dramatically longer. I felt mildly ridiculous that I didn’t know this sooner. Two free tools that take seconds: Docling and AnythingMD. Think of it like packing for a trip. You could stuff everything loose into the trunk, or you could actually fold your clothes and fit twice as much in the same bag. Same destination. Way less chaos getting there.
Docling reposted this
Sonia Tabti, PhD
2w
Do you use Docling to analyze your documents but find it frustrating that you can't see what is happening under the hood? I have the solution for you! Docling is a great parsing system for PDFs and other file types that also performs OCR (Optical Character Recognition). It is an open-source package that can run on your computer even without a GPU! Personally, it is my favorite tool for analyzing PDFs 😍 But sometimes I want to visualize how Docling analyzes my document's layout to better understand how it works, especially when the parsing has gone wrong. Docling Studio is an open-source project developed by Scub that resolves this frustration! I show it in action in a video, with links in the comments so you can start using it yourself if you are already a Docling fan 🦆! Feel free to react to support me, and if you don't know me, I'm Sonia, PhD, with almost 14 years of experience in AI and Computer Vision 🧑💻

Docling reposted this
Alain AIROM (Ayrom)
2w
📑 Is your RAG ready for industrialization? 🏭 I’m sharing a new sample application that swaps out #Unstructured-io for #Docling to power DataStax #AstraDB Serverless implementations, for scalable document intelligence. See the full breakdown below! 👇 ➡️ https://lnkd.in/eEERss49 or ➡️ https://lnkd.in/ewgF84wt #IBMBob
Docling reposted this
IBM Developer

50,379 followers
3w Edited
A lot of knowledge is locked in video, audio, and documents that agents can’t use. Docling turns it into structured Markdown or JSON in under 40 lines of code. In this walkthrough, Tejas Kumar shows how to make it usable for RAG and agent workflows. 📺 Full video → https://ibm.biz/~hFQxzMxpd 💻 Code → https://ibm.biz/~sAzZRp2f6
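As a rough illustration of the document side of that claim, the sketch below converts one file with Docling's Python API and writes both a Markdown and a JSON export. It is not the code from the linked walkthrough. The helper names `convert_document`, `output_paths`, and `save_outputs`, and the `converted` output folder, are assumptions made for this example. Audio and video inputs are not covered here.

```python
import json
from pathlib import Path
from typing import Tuple


def convert_document(source: str) -> dict:
    """Convert one document and return its Markdown text and a JSON-ready dict."""
    # Imported lazily so this module still loads where docling is not installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)  # a local file path or a URL
    doc = result.document
    return {
        "markdown": doc.export_to_markdown(),
        "json": doc.export_to_dict(),
    }


def output_paths(source: str, out_dir: str = "converted") -> Tuple[Path, Path]:
    """Derive the .md and .json output paths from the source file name."""
    stem = Path(source).stem
    base = Path(out_dir)
    return base / f"{stem}.md", base / f"{stem}.json"


def save_outputs(source: str, out_dir: str = "converted") -> Tuple[Path, Path]:
    """Run the conversion and write both exports to disk."""
    outputs = convert_document(source)
    md_path, json_path = output_paths(source, out_dir)
    md_path.parent.mkdir(parents=True, exist_ok=True)
    md_path.write_text(outputs["markdown"], encoding="utf-8")
    json_path.write_text(json.dumps(outputs["json"], indent=2), encoding="utf-8")
    return md_path, json_path
```

Calling `save_outputs("report.pdf")` with a hypothetical file name would produce `converted/report.md` and `converted/report.json`. The Markdown output suits prompts and RAG chunking, while the JSON keeps the document structure for downstream tools.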

Docling reposted this
Theodoros Mastromanolis
3w
Greek legislation is publicly available, but it is not always easy to explore it in a way that is truly useful for an LLM. That is why I built a small side project: an LLM-friendly wiki for Greek legal documents, which converts PDFs from the National Printing Office (Εθνικό Τυπογραφείο) into structured, interconnected knowledge. For the PDF parsing I used Docling, converting the documents to Markdown, and on top of that I built a wiki-based knowledge structure with links and cross-references between the documents. The motivation and inspiration for the project was also the following project, which I found by chance on GitHub: https://lnkd.in/dsqUmhN2 I source the documents from here: https://lnkd.in/dabPMbEX If you want to take a look at how I applied Karpathy's LLM wiki: Repo: https://lnkd.in/dCw52ypp Article: https://lnkd.in/d8r4dB2C Feedback and ideas are always welcome.
Docling reposted this
Basil Owen
3w Edited
Check out the tutorial Talib ul haq and I just published on IBM Developer focusing on building an end-to-end document processing pipeline using Docling Serve (deployed on IBM Code Engine) and watsonx Orchestrate! 🤖📄

If you’re working with PDFs or unstructured documents, this shows how to extract structured text, tables, and layout-aware content that AI agents can actually act on 💡

🔑 𝐊𝐞𝐲 𝐛𝐞𝐧𝐞𝐟𝐢𝐭𝐬:
- Extraction of structured data from complex documents (tables, layouts, text) 📊
- API-first document processing with Docling Serve for easy integration 🔌
- Reusable, scalable service across workflows (no more one-off scripts) ♻️
- Faster, lightweight deployments with docling-serve-cpu (no GPU/CUDA needed) ⚡
- Direct integration with watsonx Orchestrate 🔗
- This tutorial presents a practical approach to move from document parsing to real production workflows 🏗️

📘 𝐓𝐮𝐭𝐨𝐫𝐢𝐚𝐥: https://lnkd.in/e-pm7mBn

Ahmed Azraq Moisés Domínguez Michele Dolfi, PhD Peter W. J. Staar

#AI #agents #IBM #watsonxorchestrate #Docling #watsonx

Similar pages

Langflow

Qdrant

Arke

Pydantic

LlamaIndex

n8n

Hugging Face

Ollama

AI Engineering

FastAPI