Prilk </>Documentation
< Automation />

PDFs in.
ERPNext records out.

Supplier invoices, orders, and receipts arrive as PDFs and leave as ERPNext drafts. OCR plus AI does the typing. Your team approves and posts.
Book a free consultation
Example document: supplier-invoice-2026-05.pdf
Extracted: Supplier, Invoice #, Date, Total, Line items (5 fields, 4 line items)
What changes

Forty minutes per invoice, or thirty seconds.

Without us

Manual data entry, on every page.

Open the PDF in one window, ERPNext in another.
Retype the supplier, invoice number, date, total.
Type each line item, hope you didn’t miss a digit.
If the column on the PDF is rotated, start over.
End of week: backlog of unentered invoices.
With us

Drafts ready before you open them.

PDF lands in ERPNext via email, drop, or upload.
OCR + AI extracts supplier, fields, line items.
Validation rules check for missing or implausible values.
Draft Purchase Invoice ready for review and submit.
End of week: nothing in the queue.
Data entry is the slowest part of accounts payable.
Suppliers don’t send PEPPOL XML — they send PDFs in email. Someone retypes those into ERPNext. The cost isn’t one invoice; it’s 200 a week and the typos that ripple through the ledger. AI does the typing now. Your team does the approving.
What you get

OCR + AI, scoped to your documents.

OCR for any PDF

pytesseract reads the file, including scanned and photo-of-paper invoices. PDFs with embedded text are read directly.
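
A minimal sketch of that split between embedded text and OCR, assuming each page's embedded text has already been pulled out by a PDF library of your choice. The function names, the 20-character cutoff, and the injected `ocr_page` callable are illustrative assumptions, not the product's actual code.

```python
from typing import Callable, List

MIN_EMBEDDED_CHARS = 20  # assumed cutoff: shorter text is treated as a scanned page


def page_texts(embedded: List[str], ocr_page: Callable[[int], str]) -> List[str]:
    """Return text per page, calling OCR only where embedded text is missing."""
    texts = []
    for index, text in enumerate(embedded):
        if len(text.strip()) >= MIN_EMBEDDED_CHARS:
            texts.append(text)             # digital PDF page: read directly
        else:
            texts.append(ocr_page(index))  # scanned or photographed page: OCR
    return texts


def tesseract_ocr(pdf_path: str) -> Callable[[int], str]:
    """Build an OCR callable backed by pdf2image and pytesseract."""
    from pdf2image import convert_from_path  # imported lazily; needs Poppler
    import pytesseract                       # needs the Tesseract binary

    def ocr_page(index: int) -> str:
        # pdf2image page numbers are 1-based, so shift the 0-based index.
        images = convert_from_path(pdf_path, first_page=index + 1, last_page=index + 1)
        return pytesseract.image_to_string(images[0])

    return ocr_page
```

In use, `page_texts(embedded, tesseract_ocr("supplier-invoice-2026-05.pdf"))` would rasterise and OCR only the pages that need it.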

AI field extraction

An LLM extracts the supplier, invoice number, date, totals, and line items. Trained on the kind of documents your business actually receives.
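
As an illustration of what the extraction step hands on, the sketch below assumes the LLM is asked to reply with JSON and parses that reply into a typed record. The key names (`supplier`, `invoice_number`, `items`, and so on) and both class names are hypothetical; the real prompt and schema may differ.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineItem:
    description: str
    qty: float
    rate: float


@dataclass
class ExtractedInvoice:
    supplier: str
    invoice_number: str
    date: str    # kept as the string the model returned, e.g. "2026-05-14"
    total: float
    items: List[LineItem] = field(default_factory=list)


def parse_llm_reply(reply: str) -> ExtractedInvoice:
    """Turn a JSON reply into a typed record; raises KeyError if a header field is missing."""
    data = json.loads(reply)
    items = [
        LineItem(i["description"], float(i["qty"]), float(i["rate"]))
        for i in data.get("items", [])  # a reply without items yields an empty list
    ]
    return ExtractedInvoice(
        supplier=data["supplier"],
        invoice_number=data["invoice_number"],
        date=data["date"],
        total=float(data["total"]),
        items=items,
    )
```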

Document classification

Sales order, purchase order, invoice, receipt — classified automatically. The right draft DocType is created.
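
One way to picture the routing: a lookup from classifier label to draft DocType. The label strings are assumptions; the four DocTypes are the ones this page names as draft targets, and anything else falls through to a human.

```python
from typing import Optional

# Assumed label set; the values are the four draft DocTypes named on this page.
DOCTYPE_BY_LABEL = {
    "sales_order": "Sales Order",
    "purchase_order": "Purchase Order",
    "sales_invoice": "Sales Invoice",
    "purchase_invoice": "Purchase Invoice",
}


def target_doctype(label: str) -> Optional[str]:
    """Map a classifier label to a draft DocType, or None if it needs a human."""
    return DOCTYPE_BY_LABEL.get(label.strip().lower().replace(" ", "_"))
```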

Validation rules

Missing VAT number? Suspicious total? Misclassified document? Validation catches it before you waste time approving a bad draft.
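
A rough sketch of what two such rules might look like: one for a missing VAT number, one for a total that does not match the line items. The dictionary keys and the 0.01 tolerance are assumptions made for the example.

```python
from typing import Dict, List

TOLERANCE = 0.01  # assumed rounding tolerance, in invoice currency


def validate(invoice: Dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means it passed."""
    problems = []
    if not invoice.get("vat_number"):
        problems.append("missing VAT number")
    items = invoice.get("items", [])
    if not items:
        problems.append("no line items")
    line_sum = sum(i["qty"] * i["rate"] for i in items)
    net_total = invoice.get("net_total")
    if net_total is None:
        problems.append("missing total")
    elif items and abs(line_sum - net_total) > TOLERANCE:
        # A total that disagrees with the line items is the "suspicious total" case.
        problems.append(f"total {net_total} does not match line items {line_sum:.2f}")
    return problems
```

A draft would only be offered for approval once `validate` returns an empty list; otherwise the problems are shown next to the document.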

Field-level confidence

Each extracted field carries a confidence score. High-confidence fields auto-fill; low-confidence ones surface for review.
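
The routing can be sketched as a simple threshold split. The 0.90 cutoff and the `(value, confidence)` tuple shape are assumptions; in practice the threshold would be a setting.

```python
from typing import Dict, Tuple

REVIEW_THRESHOLD = 0.90  # assumed cutoff, not the product's default


def split_by_confidence(
    fields: Dict[str, Tuple[str, float]], threshold: float = REVIEW_THRESHOLD
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split {name: (value, confidence)} into auto-filled and needs-review fields."""
    auto_fill: Dict[str, str] = {}
    review: Dict[str, str] = {}
    for name, (value, confidence) in fields.items():
        target = auto_fill if confidence >= threshold else review
        target[name] = value
    return auto_fill, review
```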

Process queue + reprocessing

Re-run extraction with different prompts or models. Reprocess a batch when supplier formats change. No code, just settings.
How a PDF becomes a draft

Drop. Extract. Validate. Approve.

01
PDF arrives
Email, drop folder, or upload. ERPNext sees the file as a Document Processor record.
02
OCR + AI extract
OCR via pytesseract reads the page; an LLM extracts the fields and line items.
03
Validated
Validation rules flag missing or implausible values. Low-confidence fields surface for review.
04
Draft created
A draft Purchase Invoice (or other DocType) is created with all fields filled. You review, edit, submit.
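
For a sense of what the final step produces, here is a sketch that builds a dictionary shaped like an ERPNext Purchase Invoice. The helper name and the input keys are hypothetical, and a real site will usually require more fields (item codes, accounts, company) than this example fills in.

```python
from typing import Dict, List


def purchase_invoice_draft(supplier: str, bill_no: str, bill_date: str,
                           items: List[Dict]) -> Dict:
    """Build a dict shaped like an ERPNext Purchase Invoice in draft state."""
    return {
        "doctype": "Purchase Invoice",
        "docstatus": 0,          # 0 is the draft state; submitting sets it to 1
        "supplier": supplier,
        "bill_no": bill_no,      # the supplier's own invoice number
        "bill_date": bill_date,  # the date printed on the supplier invoice
        "items": [
            {"item_name": i["description"], "qty": i["qty"], "rate": i["rate"]}
            for i in items
        ],
    }
```

Inside a Frappe site, passing such a dictionary to `frappe.get_doc(...)` and calling `.insert()` would save it as a draft for review, assuming the site's mandatory fields are satisfied.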
Built for

Anyone who receives PDFs by default.

Accounts Payable
Supplier invoices that arrive by email or post. OCR + AI replaces the typist.
Inbound logistics
Delivery notes, packing slips, customs documents — all PDFs, all OCR-able.
Anywhere paper meets ERP
If your team retypes anything from a PDF into ERPNext, this product replaces that step.
Under the hood

Pytesseract + pdf2image + LLM. Open source.

OCR runs pytesseract on page images rendered by pdf2image and preprocessed with OpenCV. Field extraction goes through an LLM provider (OpenAI GPT-4, or bring your own; the same setup as the Pilot copilot). Extracted records are draft ERPNext documents (Sales Order, Purchase Order, Sales Invoice, Purchase Invoice). No parallel data model; documents land where they would have landed if a human had typed them. Open source.

Ready to fix your systems?

30-minute call. No pitch deck. Just an honest conversation about what you need.
Book a Call
Free consultation. No strings.