Reference Architecture

Automate invoice processing – End-to-End in 6 weeks

Incoming invoices from inbox to GoBD archive. Four entry channels, one data model, automatic posting to DATEV/Lexoffice/SAP/Odoo. Sickness-, vacation- and resignation-proof. Example scenario: ~8,050 € net effect per month, payback under 3 months.

Finance · Bookkeeping · OCR · GPT-4 Vision · DATEV · Lexoffice · ZUGFeRD · XRechnung · GoBD · Resilience
Industry: Bookkeeping / SMB, 100–500 invoices/mo
Implementation: 6 weeks
Year-1 ROI (example): ~440 %
First video on our channel

How the automated invoice intake works

About 10 minutes from incoming email to finished posting. Including the question rarely asked: what happens when your bookkeeper is sick, on vacation, or quits?

Picture this: Frau Müller, bookkeeping, thirty-person company. She handles invoice intake alone – three hundred PDFs a month, three hours a day, always the same clicks. If she gets sick next week, the whole place stalls. Cash discount deadlines gone, suppliers waiting, end-of-month bookings missing.

In the video we show how a pipeline solves this: four entry channels (email, scan, Peppol/XRechnung, supplier portals), format detection, OCR plus GPT-4 Vision for extraction, five validation checks, optional three-way-match against your ERP, role-based approval in Slack or Teams, automatic posting to DATEV, Lexoffice, SAP or Odoo, GoBD-compliant archive.

And: what that means in numbers. In the example scenario about eight thousand euros net effect per month, payback in under three months – even with conservative assumptions.

This is the kickoff. On the channel we cover automation from different angles — finance, sales, HR, operations. One task per video, no slides.

Real talk: nobody enjoys processing invoices. It's one of those jobs that exists because invoices keep coming — not because anyone signed up for it.

In a thirty-person company with three hundred invoices a month, it's one person's job. They open every PDF, type vendor names and amounts, hunt down purchase orders, chase managers for approval, post, file. Twelve minutes per invoice, sixty hours a month — time that produces nothing.

Then comes the day this exact person gets sick, quits, or goes on vacation for two weeks. Intake stalls. Cash discounts gone. Suppliers calling.

That's what we solve here — not in twelve months of consulting, but in six weeks of pipeline.

Before vs. After

Metric | Before | After
Time per invoice | 12–15 min manual | 1–2 min spot check
Hours / mo (300 invoices) | 60–75 h | 8–10 h exceptions + sampling
Total cost / mo | ~2,350 € | ~665 € (incl. software & maintenance)
Cash discount usage | 30–50 % | > 90 %
Error rate | 5–8 % | < 0.5 %
Throughput intake → booking | 8–12 days | < 2 days (median ~4 h)
Duplicate payments / yr | 2–5 cases @ ~1,500 € | 0 (hash duplicate lock)
Audit readiness (GoBD) | ad-hoc, 3–5 days prep | permanent, one-click export
Bus factor | 1 person holds the knowledge | Pipeline + caretaker (replaceable)

The Challenge

Example scenario: a business with about 30 employees, 35 €/h personnel cost, ~600,000 €/month purchase volume. A single person handles invoice intake — 300 invoices a month, 12 to 15 minutes manual work each.

Open PDF, type vendor, number, date, net, VAT, find purchase order, route for approval, wait, post, file. That's 60 to 75 hours a month producing nothing. Plus 5 to 8 % typing errors, occasional duplicate payments (avg. 1,500 € each), regularly missed cash discount deadlines.

The unaddressed question: what happens when this one person gets sick, quits, or takes two weeks off? Intake stalls, suppliers wait, cash discounts are gone. Bus factor of 1 — that's the real risk, not just the hours.

Our Solution

An end-to-end pipeline with four entry channels: dedicated email address, scan intake via SFTP, Peppol/XRechnung receipt via AS4 access point, and polling on supplier portals. Format detection separates classic PDFs, ZUGFeRD PDFs (CII XML embedded) and pure XRechnung XML (UBL or CII).
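
The format detection step could look roughly like the sketch below. It is a heuristic under stated assumptions, not the pipeline's actual code: the function name `detect_format`, the returned labels and the list of embedded attachment names are illustrative. The byte search for an attachment name can miss PDFs that store names in compressed object streams, so a production version would read embedded files with a proper PDF library.

```python
import xml.etree.ElementTree as ET

# Root namespaces of the two EN 16931 XML syntaxes.
UBL_INVOICE_NS = "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
CII_NS = "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100"

# Attachment names often used for XML embedded in hybrid PDFs (assumed list).
EMBEDDED_XML_NAMES = (b"factur-x.xml", b"zugferd-invoice.xml", b"xrechnung.xml")


def detect_format(data: bytes) -> str:
    """Return 'zugferd_pdf', 'classic_pdf', 'xml_ubl', 'xml_cii' or 'unknown'."""
    if data.lstrip().startswith(b"%PDF-"):
        lowered = data.lower()
        # Heuristic: a hybrid PDF usually mentions its embedded XML file name.
        if any(name in lowered for name in EMBEDDED_XML_NAMES):
            return "zugferd_pdf"
        return "classic_pdf"
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return "unknown"
    if root.tag == f"{{{UBL_INVOICE_NS}}}Invoice":
        return "xml_ubl"
    if root.tag == f"{{{CII_NS}}}CrossIndustryInvoice":
        return "xml_cii"
    return "unknown"
```

Classic PDFs would then go to the OCR path, everything else to a direct XML parse.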

Three extraction paths run in parallel: OCR plus LLM (GPT-4 Vision or Claude) for classic PDFs, direct XML parse for ZUGFeRD and XRechnung per EN 16931. Everything lands in a unified data model (invoice.json) with per-field confidence — humans only see what the machine isn't sure about.
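
To make the "per-field confidence" idea concrete, here is a minimal sketch of what such a unified model might look like. The field names, the `Extracted` wrapper and the 0.90 review threshold are assumptions for illustration; the real invoice.json schema is not shown in this document. Values parsed directly from XML could simply carry a confidence of 1.0.

```python
from dataclasses import dataclass, fields
from decimal import Decimal
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass
class Extracted(Generic[T]):
    """One extracted value plus the extractor's confidence (0.0 to 1.0)."""
    value: T
    confidence: float


@dataclass
class Invoice:
    vendor_name: Extracted[str]
    invoice_number: Extracted[str]
    invoice_date: Extracted[str]      # ISO 8601 date string
    net_amount: Extracted[Decimal]
    vat_amount: Extracted[Decimal]
    source_format: str                # e.g. "classic_pdf", "zugferd_pdf", "xml_ubl"


def fields_needing_review(invoice: Invoice, threshold: float = 0.90) -> list[str]:
    """Names of extracted fields whose confidence is below the threshold."""
    flagged = []
    for f in fields(invoice):
        value = getattr(invoice, f.name)
        if isinstance(value, Extracted) and value.confidence < threshold:
            flagged.append(f.name)
    return flagged
```

Only the fields returned by `fields_needing_review` would be shown to a human for the spot check.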

Five parallel validation checks: schema, VAT plausibility, duplicate hash against Postgres, master data lookup, EU sanctions list. Optional three-way-match against PO and goods receipt in SAP, Odoo or Business Central — deviations beyond tolerance (typically 2–5 %) feed a structured clarification workflow, not a lost email.
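
The tolerance logic of the three-way-match can be sketched as follows. This is a simplified illustration on net totals only; the function names and the 2 % default are assumptions, and a real match would compare line items and quantities pulled from the ERP.

```python
from decimal import Decimal


def within_tolerance(actual: Decimal, expected: Decimal, tolerance: Decimal) -> bool:
    """True if actual deviates from expected by at most the relative tolerance."""
    if expected == 0:
        return actual == 0
    return abs(actual - expected) / abs(expected) <= tolerance


def three_way_match(
    invoice_net: Decimal,
    po_net: Decimal,
    received_net: Decimal,
    tolerance: Decimal = Decimal("0.02"),
) -> list[str]:
    """Return the list of deviations; an empty list means the invoice matches."""
    deviations = []
    if not within_tolerance(invoice_net, po_net, tolerance):
        deviations.append("invoice_vs_po")
    if not within_tolerance(invoice_net, received_net, tolerance):
        deviations.append("invoice_vs_goods_receipt")
    return deviations
```

A non-empty result would open the structured clarification workflow instead of blocking silently.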

Role- and amount-based approval cascade: under 500 € auto, up to 5,000 € Slack/Teams 1-click, above that four-eyes with management. Delegate routing from Postgres + calendar sync, 24-hour escalation.
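
The amount thresholds and the 24-hour escalation above translate into very little code. The sketch below assumes that "up to 5,000 €" includes exactly 5,000 € and that 500 € itself already needs a click; the route labels and function names are illustrative.

```python
from datetime import datetime, timedelta
from decimal import Decimal

AUTO_LIMIT = Decimal("500")        # below this: auto-approve
ONE_CLICK_LIMIT = Decimal("5000")  # up to this: 1-click in Slack/Teams
ESCALATE_AFTER = timedelta(hours=24)


def approval_route(amount: Decimal) -> str:
    """Map an invoice amount to 'auto', 'one_click' or 'four_eyes'."""
    if amount < AUTO_LIMIT:
        return "auto"
    if amount <= ONE_CLICK_LIMIT:
        return "one_click"
    return "four_eyes"


def needs_escalation(requested_at: datetime, now: datetime) -> bool:
    """True once an approval request has been open for 24 hours or more."""
    return now - requested_at >= ESCALATE_AFTER
```

Delegate lookup (who receives the request when the approver is absent) would sit in front of this and is not shown here.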

After approval: automatic posting to DATEV Rechnungswesen, Lexoffice, sevDesk, SAP FI or Odoo. Original PDF archived tamper-evident with hash chain, 10 years GoBD-compliant, one-click audit export.

Key Features

Four entry channels in one pipeline

Email inbox, SFTP scan, Peppol/AS4 for XRechnung, polling on supplier portals. All entries land in the same data model — no parallel processes, no chaos.

OCR + GPT-4 Vision for extraction

Azure Document Intelligence or Google Cloud Vision for text recognition, GPT-4 Vision or Claude for field-level structuring. 95–99 % accuracy on standard invoices, even when every vendor writes 'invoice number' differently.

Five checks before every booking

Schema (EN 16931), VAT plausibility, SHA-256 duplicate hash, master data match against ERP and EU sanctions list, plus IBAN validation of the payment data. Duplicate payments are reliably blocked: an average of 1,500 € in prevented loss per case.
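
A duplicate lock based on SHA-256 might be sketched like this. Which fields go into the hash is an assumption here (vendor ID, invoice number, gross amount), and the in-memory set stands in for what would be a unique constraint in Postgres.

```python
import hashlib
from decimal import Decimal


def invoice_fingerprint(vendor_id: str, invoice_number: str, gross_amount: Decimal) -> str:
    """SHA-256 over normalized key fields, so cosmetic differences do not matter."""
    parts = (
        vendor_id.strip().upper(),
        invoice_number.strip().upper().replace(" ", ""),
        f"{gross_amount:.2f}",
    )
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


class DuplicateLock:
    """In-memory stand-in for a table with a unique index on the fingerprint."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def register(self, fingerprint: str) -> bool:
        """Return True if the invoice is new, False if it was seen before."""
        if fingerprint in self._seen:
            return False
        self._seen.add(fingerprint)
        return True
```

Hashing normalized fields rather than the raw file catches the same invoice arriving twice through different channels, for example once by email and once as a scan.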

Role-based approval cascade

< 500 € auto, up to 5,000 € Slack/Teams 1-click by department, above that four-eyes principle with management. Delegate from Postgres + calendar sync, escalation after 24 hours — nothing rots in an inbox.

Booking into DATEV, Lexoffice, SAP or Odoo

DATEVconnect online, Lexoffice REST, sevDesk REST, SAP FI BAPI, Odoo XML-RPC or Microsoft Business Central. Cost center suggested by AI from historical bookings, human confirms with one click.

Sickness-, vacation- and resignation-proof

The pipeline runs 24/7. Sickness, vacation, resignation — no bottleneck. Delegates routed automatically, new hires onboarded in under a day via plain-text docs. Bus factor solved for real.

Results

Example workflow, not a specific customer project

The figures shown are target values and expected magnitudes of a reference architecture, based on industry benchmarks, public studies of comparable setups, and our own tests on a real stack. They are not measured outcomes from a specific customer. Actual results in live deployment depend on company size, process maturity, and integration depth.

Throughput: 12 → 2 days
Net effect: +8,050 €/mo
Payback: ~2.2 months
Bus factor: 1 → Pipeline

Throughput 12 → 2 days, cash discount usage above 90 %, ~8,050 € net effect per month in the example scenario, bus factor from 1 to pipeline + caretaker.

Integrations

Seamless connection to your existing infrastructure

DATEV Rechnungswesen / Unternehmen Online

Bookkeeping

DATEVconnect online for direct receipt upload, EXTF-CSV as fallback for tax advisor handover.

Lexoffice / sevDesk

Bookkeeping (REST API)

Native integration via Lexoffice and sevDesk APIs. Voucher with original PDF, cost center and tax code in one step.

SAP S/4HANA / Business One

ERP

OData and BAPI interfaces for PO matching (three-way-match) and posting to SAP FI.

Odoo / Microsoft Business Central

ERP

REST/XML-RPC for Odoo, the Business Central REST API (OData v4) for Business Central. Full receipt flow, no manual re-entry.

Peppol Access Point

E-invoicing

AS4 receipt for XRechnung, relevant for the public sector and for the German B2B e-invoicing mandate (receipt obligatory since 2025, issuing phased in from 2027). Implemented via phase4 or providers like ecosio.

Slack / Microsoft Teams

Approval

Block Kit (Slack) or Adaptive Cards (Teams) for approval buttons. Delegate routing and escalation built in.

Azure Document Intelligence / Google Cloud Vision

OCR

Text recognition for classic PDF and scanned invoices.

OpenAI GPT-4 Vision / Anthropic Claude

AI structuring

Function calling / tool use for field-level extraction with confidence scores. EU endpoints, zero data retention.

Security & Compliance

Enterprise-ready with highest security standards

GoBD-compliant archive

Original PDF hashed at intake (SHA-256) and archived tamper-evident for 10 years. Compliant with the German Ministry of Finance directive of 28.11.2019.

GDPR & EU hosting

Processing in EU data centers, DPA per GDPR Art. 28. LLM API calls (OpenAI/Anthropic) via EU endpoints with zero data retention.

Append-only audit log

Every step — intake, extraction, correction, approval, booking — logged in an immutable hash chain. One-click export for audits.
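
The hash chain behind such a log can be illustrated in a few lines. This is a minimal in-memory sketch; the class name, the event fields and the all-zero genesis value are assumptions, and a real deployment would persist entries in append-only storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, event: dict) -> str:
        """Add an event and link it to the previous entry via its hash."""
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        entry_hash = _digest(prev, event)
        self.entries.append({"event": event, "prev_hash": prev, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev = GENESIS
        for entry in self.entries:
            if entry["prev_hash"] != prev or entry["hash"] != _digest(prev, entry["event"]):
                return False
            prev = entry["hash"]
        return True
```

Because each hash covers the previous one, changing an old entry invalidates every entry after it, which is what makes later tampering detectable during an audit export.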

IBAN and sanctions check

Before each booking, the IBAN is validated against format and check digits, and the vendor is checked against the EU consolidated sanctions list.
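
The check-digit part of the IBAN validation follows the standard mod-97 procedure and can be sketched as below. The function name is illustrative, and the sketch only checks the generic structure; country-specific lengths and the sanctions lookup are out of scope here.

```python
import re


def is_valid_iban(iban: str) -> bool:
    """Validate basic IBAN structure and the mod-97 check digits."""
    compact = iban.replace(" ", "").upper()
    # Two-letter country code, two check digits, 11 to 30 alphanumeric characters.
    if not re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}", compact):
        return False
    # Move the first four characters to the end, then map A..Z to 10..35.
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

A failed check would typically send the invoice to the clarification stack rather than reject it outright, since OCR misreads of a single digit are a common cause.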

Technology Stack

n8n · Make.com · OpenAI GPT-4 Vision · Anthropic Claude · Azure Document Intelligence · Google Cloud Vision · Lexoffice API · DATEV Connect · SAP FI · Odoo · Peppol · PostgreSQL · Slack

Frequently Asked Questions

Sweet spot: 10 to 100 employees, ~100 to 500 incoming invoices per month. The example scenario shown (300 invoices/mo, 30 staff, 600k € purchase volume) is the typical mid-point. Smaller setups — solo with under 30 invoices — are better served by tools like GetMyInvoices at 29 €/month. Large enterprises with their own procure-to-pay setup usually have the lever already.
Fixed price by stack. Pure Lexoffice/DATEV with OCR pipeline: 8,000 to 14,000 €. With SAP FI, three-way-match against PO/goods receipt, or Peppol AS4 access point: 15,000 to 25,000 €. Build time 6 weeks, 8–10 with deep SAP integration. Maintenance retainer 200–500 €/month covers updates, new vendor formats, emergency support. You get a concrete number before signing — no day rate, no open-ended scope.
Conservatively. 300 invoices × 12 min = 60 h personnel time, ~50 h saved (×35 €/h ≈ 1,750 €). Cash discount: 3 % on additional usable purchase volume gives ~6,000 €/mo at 600k € volume. Avoided duplicate payments: ~440 €/mo on average. Software/cloud (n8n, LLM API, DMS): ~350 €/mo. Maintenance: ~140 €/mo. Net: ~8,050 €/mo. Even removing the cash discount effect entirely (only personnel + duplicate avoidance minus software/maintenance), you still net 2,000 €+/mo with payback under a year.
95 to 99 % on standard invoices with readable PDFs. Edge cases (poor scans, handwritten corrections, unusual layouts) go into a short spot-check — you see the AI suggestion with confidence score and confirm or correct in one click. The system learns from each correction for similar cases from the same vendor.
That's exactly the point. The pipeline runs 24/7 without anyone present. Incoming invoices are extracted, validated, and — where possible — auto-approved and booked. Approvals route via Slack or Teams to the delegate (Postgres table, optional calendar sync). Cash discount deadlines are monitored automatically. When the bookkeeper returns, everything is booked or in a clearly marked clarification stack.
Yes, as long as GoBD requirements are met. Original PDF is hashed (SHA-256) at intake and archived tamper-evident for 10 years (S3 with Object Lock or a DMS like d.velop, DocuWare, or Nextcloud with DSP). Each change goes through a hash-chained audit log. A GoBD process documentation is part of the deliverable — once written, reviewed by your tax advisor, usable for every audit.
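
The ROI arithmetic from the FAQ above can be replayed with a few lines, which also makes it easy to plug in your own numbers. The function name is illustrative, and the 200,000 € of additional discount-eligible volume is inferred from the stated ~6,000 € at 3 %, not given explicitly.

```python
from decimal import Decimal


def monthly_net_effect(
    personnel_saving: Decimal,
    cash_discount_gain: Decimal,
    duplicate_avoidance: Decimal,
    software_cost: Decimal,
    maintenance_cost: Decimal,
) -> Decimal:
    """Monthly gains minus monthly running costs, all in EUR."""
    gains = personnel_saving + cash_discount_gain + duplicate_avoidance
    costs = software_cost + maintenance_cost
    return gains - costs


# Line items as listed in the ROI answer (EUR per month).
personnel_saving = Decimal(50) * Decimal(35)              # 50 h saved x 35 EUR/h
cash_discount_gain = Decimal("0.03") * Decimal(200_000)   # inferred eligible volume
duplicate_avoidance = Decimal(440)
software_cost = Decimal(350)
maintenance_cost = Decimal(140)

net = monthly_net_effect(
    personnel_saving, cash_discount_gain, duplicate_avoidance,
    software_cost, maintenance_cost,
)
net_without_discount = monthly_net_effect(
    personnel_saving, Decimal(0), duplicate_avoidance,
    software_cost, maintenance_cost,
)
```

With the line items exactly as listed, `net` comes to about 7,700 €/mo and `net_without_discount` to about 1,700 €/mo. The rounded ~8,050 € headline corresponds to the same sum before the ~350 € software line is deducted, so it is worth re-running the calculation with your own inputs.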

Automate invoice intake in 6 weeks?

30-minute call about your volume, your bookkeeping stack, and the question of what happens when your bookkeeper is sick next week. With a concrete ROI calc for your real case — even if the answer is: doesn't pay off for you.