How to Scan Receipts to PDF With OCR for Expense Tracking
Turn paper receipts into searchable, organized PDFs for expense tracking, tax filing, and accounting — using free apps and tools.
AltoUnlockPDF Team
PDF Tools Expert
Paper receipts are the bane of expense reporting. They fade, wrinkle, get lost in pockets, and pile up in drawers. Scanning them to PDF — ideally with OCR so the data is searchable — solves all of that.
Why Scan Receipts to PDF?
- IRS/tax compliance: The IRS accepts digital copies of receipts for business expenses. IRS Revenue Procedure 97-22 covers electronic storage of scanned paper records: the images must be accurate, legible, retrievable, and securely stored.
- Expense reporting: Accounting software (QuickBooks, Xero, Expensify) can import PDFs and auto-extract data
- Receipt preservation: Thermal paper receipts fade within 2–5 years
- Searchability: With OCR, you can search “Marriott Hotel March 2024” instantly
Best Apps for Scanning Receipts to PDF
1. Adobe Scan (Free, iOS & Android)
Adobe Scan is the most polished receipt scanner:
- Auto-detects receipt edges
- Auto-adjusts lighting and color
- Saves to Adobe cloud or exports as PDF
- Basic OCR included (full OCR requires Adobe subscription)
2. Microsoft Lens (Free, iOS & Android)
Microsoft Lens (formerly Office Lens) is excellent and integrates with OneDrive/OneNote:
- Receipt mode crops and corrects perspective automatically
- Exports to PDF, Word, or OneNote
- OCR built-in with reasonable accuracy
3. Google PhotoScan / Google Drive (Free)
- Take a photo with your camera app
- Upload to Google Drive
- Right-click the uploaded image → Open with → Google Docs
- Google performs OCR — the receipt data is now searchable text
4. AltoUnlockPDF (Free, Online)
- Take a photo of your receipt
- Upload to our OCR tool
- Download as searchable PDF or text
Receipt Organization System
Having receipts as PDFs is only useful if they’re organized. Here’s a simple system:
Folder structure:
Receipts/
    2024/
        01-January/
            2024-01-15_Marriott-Hotel_$249.pdf
            2024-01-18_Delta-Airlines_$380.pdf
        02-February/
            ...
Naming convention: YYYY-MM-DD_Vendor_$Amount.pdf
This makes tax season trivial — receipts are organized by date and vendor, and the amounts are visible in the filename.
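The convention above can be generated rather than typed by hand. The following is a minimal sketch, not a definitive implementation: the function name `receipt_path`, its parameters, and the choice to replace spaces and punctuation in vendor names with hyphens are all assumptions made for illustration.

```python
import re
from datetime import date
from decimal import Decimal
from pathlib import PurePosixPath


def receipt_path(purchased: date, vendor: str, amount: Decimal,
                 root: str = "Receipts") -> PurePosixPath:
    """Build Receipts/YYYY/MM-Month/YYYY-MM-DD_Vendor_$Amount.pdf (hypothetical helper)."""
    # Collapse every run of non-alphanumeric characters into one hyphen
    safe_vendor = re.sub(r"[^A-Za-z0-9]+", "-", vendor).strip("-")

    # Two decimal places, but drop ".00" so whole amounts read as "$249"
    amount_text = f"{amount:.2f}"
    if amount_text.endswith(".00"):
        amount_text = amount_text[:-3]

    # %B gives the full month name; it follows the active locale
    # (English names under Python's default "C" locale)
    month_folder = purchased.strftime("%m-%B")

    filename = f"{purchased.isoformat()}_{safe_vendor}_${amount_text}.pdf"
    return PurePosixPath(root) / str(purchased.year) / month_folder / filename


if __name__ == "__main__":
    print(receipt_path(date(2024, 1, 15), "Marriott Hotel", Decimal("249")))
```

One caveat about the design: a `$` in a filename is legal on common filesystems, but it has to be quoted or escaped in most shells, so some people may prefer to leave it out.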
Automated OCR Receipt Processing
For high volumes, automate the OCR extraction:
import json
import os
import re

import pytesseract
from PIL import Image


def extract_receipt_data(image_path):
    """OCR one receipt image and pull out the total and date, if found."""
    image = Image.open(image_path)
    text = pytesseract.image_to_string(image)

    # Extract common receipt fields with regex
    data = {}

    # Total amount (matches "TOTAL 24.99", "Total: $1,024.99", "Amount Due $5").
    # The leading \b keeps "Subtotal" from being picked up as the total.
    total_match = re.search(
        r'\b(?:TOTAL|AMOUNT DUE)[:\s]+\$?(\d[\d,]*(?:\.\d+)?)', text, re.IGNORECASE)
    if total_match:
        data['total'] = float(total_match.group(1).replace(',', ''))

    # Date (numeric formats such as 01/15/2024 or 1-15-24)
    date_match = re.search(r'\b(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})\b', text)
    if date_match:
        data['date'] = date_match.group(1)

    return data


# Process a folder of receipt images.
# Pillow cannot open PDF files, so only image formats are handled here;
# scanned PDFs would need to be converted to images first.
receipts = []
for file in sorted(os.listdir('./receipts')):
    if file.lower().endswith(('.jpg', '.jpeg', '.png')):
        data = extract_receipt_data(os.path.join('./receipts', file))
        data['filename'] = file
        receipts.append(data)

# Export to JSON for import into accounting software
with open('receipts.json', 'w') as f:
    json.dump(receipts, f, indent=2)
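The `date` field captured above is stored exactly as printed on the receipt, so it may appear as `01/15/2024`, `1-15-24`, and so on. Before using it in the YYYY-MM-DD naming convention, it likely needs normalizing. The sketch below is one way to do that with the standard library only. The function name `normalize_receipt_date` and the list of formats are assumptions, and ambiguous dates are read as US-style month-first.

```python
from datetime import datetime
from typing import Optional

# Formats tried in order. Assumption: receipts print the month first (US style).
DATE_FORMATS = ("%m/%d/%Y", "%m-%d-%Y", "%m/%d/%y", "%m-%d-%y")


def normalize_receipt_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    cleaned = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # try the next format
    return None


if __name__ == "__main__":
    for sample in ("01/15/2024", "3-18-2024", "1/5/24", "13/01/2024"):
        print(sample, "->", normalize_receipt_date(sample))
```

Returning `None` instead of raising keeps a batch run going when OCR misreads a date. Those receipts can then be flagged for manual review.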
Integrating With Accounting Software
Most modern accounting platforms accept receipt PDFs directly:
- Expensify — forward receipt emails; auto-OCR extracts merchant, date, amount
- QuickBooks — receipt capture via mobile app with automatic categorization
- Xero — expense claims with PDF receipt attachment
- Wave (free) — receipt scanning via mobile app
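If a spreadsheet summary is wanted alongside the PDFs, the extracted records can be written to CSV with the standard library. This is a sketch under stated assumptions: the column names (`filename`, `date`, `total`) mirror the keys used in the extraction script and are not any platform's required import format, and `write_expense_csv` is a hypothetical helper name. Check each platform's own import documentation for the columns it expects.

```python
import csv
import io

# Assumed column order; adjust to whatever the target tool expects
FIELDS = ("filename", "date", "total")


def write_expense_csv(receipts, out):
    """Write receipt dicts to an open text stream as CSV (hypothetical helper)."""
    writer = csv.DictWriter(
        out,
        fieldnames=FIELDS,
        restval="",             # leave the cell blank when OCR found no value
        extrasaction="ignore",  # skip keys that are not in FIELDS
    )
    writer.writeheader()
    for record in receipts:
        writer.writerow(record)


if __name__ == "__main__":
    sample = [
        {"filename": "a.jpg", "date": "2024-01-15", "total": 249.0},
        {"filename": "b.jpg"},  # OCR found neither date nor total
    ]
    buffer = io.StringIO()
    write_expense_csv(sample, buffer)
    print(buffer.getvalue())
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, for example `open("receipts.csv", "w", newline="")`.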
Tips for Better Receipt Scans
- Lay flat on a dark surface — provides contrast for edge detection
- Good lighting — avoid shadows that cut across the text
- Scan while the receipt is fresh — thermal paper fades with heat and time
- Multiple angles for crumpled receipts — take 2–3 shots
- Consider keeping originals for purchases of $75 or more, the threshold at which the IRS generally requires documentary evidence for travel and similar business expenses
The combination of a good scanner app, systematic naming, and cloud backup makes expense tracking nearly effortless.