Extract Financial Data from PDF Statements Using Claude
Tools: Claude.ai Pro
Time: 15-20 minutes
Difficulty: Intermediate
What This Does
Claude Pro can read uploaded PDF tax returns, CPA-prepared financial statements, and accountant compilations, then extract the key figures you need for spreading into a structured table, with no manual transcription. For a 3-year income statement and balance sheet, this can cut spreading time from 30-45 minutes to 5-10 minutes.
Before You Start
- You have a Claude.ai Pro subscription (file upload requires Pro)
- You have the borrower's financial statements as PDF files
- You know which data you need to extract (income statement items, balance sheet items, schedule of debt)
Steps
1. Open Claude.ai and start a new conversation
2. Upload the PDF
Click the attachment icon (paperclip) in the message box. Upload the financial statement PDF. You can upload multiple files in one conversation.
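3. Paste the extraction prompt
Copy the text from The Prompt section, replace [Company Name] with the borrower's name, and send it.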
4. Copy the output to your spreading model
Verify at least five key line items against the PDF before using the figures in your credit model.
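If you would rather load the output into a script than retype it, the table can be parsed programmatically. The following is a minimal sketch, assuming Claude returns a pipe-delimited table with line items in the first column and one column per year; the actual layout can vary between conversations. The function names `parse_amount` and `parse_table` and the sample figures are hypothetical and exist only for illustration.

```python
from typing import Optional


def parse_amount(cell: str) -> Optional[float]:
    """Convert a cell such as '1,250,000', '(45,000)' or 'NF' to a number or None."""
    text = cell.strip().replace("$", "").replace(",", "")
    if text == "" or text.upper() == "NF":
        return None  # the line item was not found in the PDF
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1]  # accounting-style negative, e.g. (45,000)
    value = float(text)  # raises ValueError on unexpected text so it gets noticed
    return -value if negative else value


def parse_table(markdown: str) -> dict[str, dict[str, Optional[float]]]:
    """Parse a pipe-delimited table into {line item: {column header: amount}}."""
    rows = []
    for line in markdown.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore any prose around the table
        cells = [c.strip() for c in line.strip("|").split("|")]
        # Skip the separator row, e.g. |---|---:|
        if all(c and set(c) <= set("-: ") for c in cells):
            continue
        rows.append(cells)
    header, body = rows[0], rows[1:]
    return {r[0]: {h: parse_amount(v) for h, v in zip(header[1:], r[1:])} for r in body}


# Illustrative figures only, not taken from any real statement.
sample = """
| Line item | 2022 | 2023 |
|---|---:|---:|
| Revenue | 1,200,000 | 1,350,000 |
| Net income | (45,000) | 80,000 |
| Inventory | NF | 210,000 |
"""
table = parse_table(sample)
print(table["Net income"]["2022"])  # -45000.0
print(table["Inventory"]["2022"])   # None
```

Keeping "NF" as `None` rather than zero means a missing figure cannot be mistaken for a reported zero balance in the spreading model.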
The Prompt
Copy and paste this:
I've uploaded financial statements for [Company Name]. Please extract the following data and present it in a clean table format suitable for credit analysis.
**From the Income Statement (for each year available):**
- Revenue / Net sales
- Cost of goods sold / Cost of revenue
- Gross profit
- Operating expenses (total)
- EBITDA (calculate as operating income + D&A if not stated)
- Depreciation and amortization
- Operating income / EBIT
- Interest expense
- Net income before tax
- Net income
**From the Balance Sheet (most recent year-end):**
- Cash and equivalents
- Accounts receivable
- Inventory
- Total current assets
- Total assets
- Accounts payable
- Short-term debt / current portion of LTD
- Total current liabilities
- Total debt (short-term + long-term)
- Total liabilities
- Owner's equity / net worth
**From Schedule of Debt or Notes (if available):**
- Each debt obligation: lender name, original balance, current balance, annual payment, maturity date
Format as a table. If a line item is not present or cannot be found, note "NF" (not found). Apart from the EBITDA calculation noted above, do not calculate or interpolate; only extract what is explicitly stated.
Tips
- Verify before spreading: Always spot-check 5+ line items directly in the PDF before relying on the extraction. Claude occasionally misreads tables with unusual formatting.
- For tax returns (Form 1120, 1120-S, 1065), specify: "This is a federal tax return. Extract from Schedule L (balance sheet), Schedule M-1, and the main income section."
- If extraction has gaps: "The balance sheet items for 2022 are missing. Check pages 8-12 of the PDF again."
- Never upload borrower PDFs to public AI tools without confirming your bank's data governance policy.
Always verify extracted data against source documents before using in a credit model.
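Simple arithmetic checks can help you decide which line items to spot-check first. The sketch below assumes one year of extracted figures has already been keyed into a dictionary; the key names, the `failed_checks` function, the tolerance, and the sample figures are all hypothetical. It tests three relationships: gross profit against revenue less cost of goods sold, EBITDA against operating income plus D&A (the fallback definition used in the prompt), and the balance sheet identity. A passing check does not replace comparison against the source PDF.

```python
from typing import Optional

Items = dict[str, Optional[float]]

# (description, result key, [(sign, component key), ...])
CHECKS = [
    ("Gross profit = Revenue - COGS", "gross_profit",
     [(1, "revenue"), (-1, "cogs")]),
    ("EBITDA = Operating income + D&A", "ebitda",
     [(1, "operating_income"), (1, "d_and_a")]),
    ("Total assets = Total liabilities + Equity", "total_assets",
     [(1, "total_liabilities"), (1, "equity")]),
]


def failed_checks(items: Items, tolerance: float = 1.0) -> list[str]:
    """Return a description of every check whose figures do not tie out."""
    problems = []
    for label, result_key, components in CHECKS:
        keys = [result_key] + [k for _, k in components]
        if any(items.get(k) is None for k in keys):
            continue  # a needed figure was "NF", so there is nothing to compare
        expected = sum(sign * items[k] for sign, k in components)
        if abs(items[result_key] - expected) > tolerance:
            problems.append(
                f"{label}: extracted {items[result_key]:,.0f}, "
                f"components give {expected:,.0f}"
            )
    return problems


# Illustrative figures only; total assets is deliberately off by 10,000.
year = {
    "revenue": 1_350_000, "cogs": 810_000, "gross_profit": 540_000,
    "operating_income": 150_000, "d_and_a": 40_000, "ebitda": 190_000,
    "total_assets": 900_000, "total_liabilities": 600_000, "equity": 310_000,
}
problems = failed_checks(year)
for problem in problems:
    print(problem)
# Total assets = Total liabilities + Equity: extracted 900,000, components give 910,000
```

A failed check points to a line worth rereading in the PDF. It may reflect a misread figure, or a statement that presents the item differently (for example, EBITDA stated by the preparer on another basis).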