LINGYANG is designed for sensitive financial documents: conversion runs in your browser, OCR runs on your device, and statement contents are not sent to a conversion server in Local Mode.
Last updated: 2 July 2026
Statement contents are never transmitted in Local Mode. The website may make separate requests for page delivery, anonymous analytics, payment processing and license validation. These requests never contain statement contents.
Scanned PDFs and images are recognized with self-hosted Tesseract.js assets in the browser. No cloud OCR subprocessor is used today.
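As a rough illustration of what "self-hosted" means here, the sketch below builds Tesseract.js worker options that point only at same-origin paths. The helper name `selfHostedOcrOptions` and the `/vendor/tesseract` directory layout are assumptions for illustration, not the site's actual configuration; `workerPath`, `corePath` and `langPath` are the Tesseract.js option names.

```typescript
// Options understood by Tesseract.js createWorker for locating its assets.
interface SelfHostedOcrOptions {
  workerPath: string; // worker script
  corePath: string;   // directory holding the WASM core files
  langPath: string;   // directory holding *.traineddata files
}

// Hypothetical helper: accepts only same-origin absolute paths so that no
// OCR asset is ever fetched from a third-party CDN.
function selfHostedOcrOptions(assetBase: string): SelfHostedOcrOptions {
  // Must start with a single "/" ("//host" would be protocol-relative).
  if (!/^\/(?!\/)/.test(assetBase)) {
    throw new Error("OCR assets must be served from a same-origin path");
  }
  const base = assetBase.replace(/\/+$/, ""); // drop trailing slashes
  return {
    workerPath: `${base}/worker.min.js`, // assumed file layout
    corePath: `${base}/core`,            // assumed file layout
    langPath: `${base}/lang`,            // assumed file layout
  };
}
```

With Tesseract.js v5 or later, the result could be passed as the third argument, for example `createWorker("eng", 1, selfHostedOcrOptions("/vendor/tesseract"))`.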
When Pro is activated, the browser sends the license key to Gumroad for validation. Statement contents are not included.
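A minimal sketch of such a validation request is shown below. It assumes Gumroad's license verification endpoint with its `product_id` and `license_key` parameters; the helper names and the response handling are illustrative and are not taken from the LINGYANG code. The point is that the request body is built from two short strings and nothing else.

```typescript
// Assumption: Gumroad's license verification endpoint.
const GUMROAD_VERIFY_URL = "https://api.gumroad.com/v2/licenses/verify";

// Hypothetical helper: the body carries the product ID and license key only.
function buildLicenseRequestBody(productId: string, licenseKey: string): URLSearchParams {
  const key = licenseKey.trim();
  if (key === "") {
    throw new Error("license key is empty");
  }
  return new URLSearchParams({ product_id: productId, license_key: key });
}

// Hypothetical wrapper: treats any non-2xx response or missing
// `success: true` flag as an invalid license.
async function validateLicense(productId: string, licenseKey: string): Promise<boolean> {
  const response = await fetch(GUMROAD_VERIFY_URL, {
    method: "POST",
    body: buildLicenseRequestBody(productId, licenseKey),
  });
  if (!response.ok) {
    return false;
  }
  const result: { success?: boolean } = await response.json();
  return result.success === true;
}
```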
The public app is served as static assets, reducing server-side attack surface for conversion data.
The deployed site uses CSP, HSTS, frame protection and restricted permissions headers through Cloudflare Pages.
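One way to verify these headers in a release check is sketched below. The function name and the choice to accept either `X-Frame-Options` or a CSP `frame-ancestors` directive as frame protection are assumptions; the actual header values served by the site are not reproduced here.

```typescript
type HeaderMap = Record<string, string>;

// Hypothetical check: returns the names of expected security headers that
// are absent from a response's headers (matched case-insensitively).
function missingSecurityHeaders(raw: HeaderMap): string[] {
  const headers: HeaderMap = {};
  for (const [name, value] of Object.entries(raw)) {
    headers[name.toLowerCase()] = value;
  }

  const missing: string[] = [];
  const csp = headers["content-security-policy"] ?? "";
  if (csp === "") missing.push("content-security-policy");
  if (!("strict-transport-security" in headers)) missing.push("strict-transport-security");

  // Frame protection can come from X-Frame-Options or CSP frame-ancestors.
  const hasFrameProtection =
    "x-frame-options" in headers || /(^|;)\s*frame-ancestors\s/i.test(csp);
  if (!hasFrameProtection) missing.push("frame protection");

  if (!("permissions-policy" in headers)) missing.push("permissions-policy");
  return missing;
}
```

An empty array means all four protections were found; any other result lists what to investigate before release.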
Automated tests cover parsing, exports, SEO metadata and privacy wording. A dependency audit is run before release.
Cloud OCR is not enabled today. If it is added, the upload, OCR and download pipeline must be treated as a separate security product and must pass this launch gate before any production traffic is accepted. This follows the same defense-in-depth shape recommended by the OWASP File Upload Cheat Sheet: validate type, extension, size and storage location, and isolate uploaded files from the application.
| Control | Required implementation before cloud OCR launch |
|---|---|
| File identity | Validate allow-listed extensions and server-detected MIME/magic bytes. Do not trust client headers or extensions alone. |
| Size and structure limits | Set maximum file size, PDF page count, image dimensions, object count and decompressed size/ratio before parsing. |
| Compressed and container files | Reject abnormal archives, archive bombs, nested containers and malformed compressed streams. |
| Private storage | Store uploads outside any public web root using randomized object names that do not include the original file name. |
| Signed access | Use short-lived signed URLs for upload and download; no public bucket reads and no permanent file URLs. |
| Access control | Only authenticated or licensed users may upload; authorization must be checked again for status and download requests. |
| Malware handling | Run uploads through antivirus, sandboxing or content disarm and reconstruction where practical for PDFs, images and office files. |
| OCR isolation | Run OCR workers in containers with deny-by-default outbound network access and only the minimum internal services required. |
| Resource limits | Enforce per-job CPU, memory, wall-clock time, concurrency and queue limits, including PDF parser resource-exhaustion protections. |
| Deletion and evidence | Automatically delete uploads, derivatives and exports; deletion jobs must write verification records without storing contents. |
| Safe logging | Logs must not contain original file names, transaction text, statement contents, amounts or account numbers. |
| API abuse protection | Rate limit upload, OCR, status and download APIs by IP, license/user and job state. |
| Dependency maintenance | Regularly update PDF, Excel and image parsing dependencies, OCR images and container base images; run dependency audits. |
| Security testing | Test polyglot files, MIME spoofing, malformed PDFs, oversized PDFs, image bombs, archive bombs and timeout paths before release. |
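The "File identity" and size-limit rows above could be sketched as follows. This is a minimal illustration, not a complete upload validator: the allow-list (PDF, PNG, JPEG), the 20 MiB cap and the function names are assumptions, and a real gate would also apply the page-count, dimension and decompression limits listed in the table.

```typescript
type UploadKind = "pdf" | "png" | "jpeg";

// Leading bytes ("magic numbers") that identify each allow-listed format.
const MAGIC: Record<UploadKind, number[]> = {
  pdf: [0x25, 0x50, 0x44, 0x46, 0x2d], // "%PDF-"
  png: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a],
  jpeg: [0xff, 0xd8, 0xff],
};

// Allow-listed extensions and the format each one claims to be.
const EXTENSIONS = new Map<string, UploadKind>([
  ["pdf", "pdf"],
  ["png", "png"],
  ["jpg", "jpeg"],
  ["jpeg", "jpeg"],
]);

const MAX_BYTES = 20 * 1024 * 1024; // assumed limit, for illustration only

// Detects the format from file contents, ignoring any client-supplied type.
function sniffKind(bytes: Uint8Array): UploadKind | null {
  for (const kind of Object.keys(MAGIC) as UploadKind[]) {
    const signature = MAGIC[kind];
    if (bytes.length >= signature.length && signature.every((b, i) => bytes[i] === b)) {
      return kind;
    }
  }
  return null;
}

// Accepts an upload only when size is within bounds and the extension
// and the detected content type agree.
function acceptUpload(fileName: string, bytes: Uint8Array): UploadKind | null {
  if (bytes.length === 0 || bytes.length > MAX_BYTES) return null;
  const dot = fileName.lastIndexOf(".");
  const extension = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  const claimed = EXTENSIONS.get(extension);
  const detected = sniffKind(bytes);
  return claimed !== undefined && claimed === detected ? claimed : null;
}
```

Requiring the extension and the sniffed type to agree rejects simple MIME spoofing, such as a PNG renamed to `statement.pdf`. It does not by itself stop polyglot files, which is why the table also calls for malware handling and isolated OCR workers.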
Cloud OCR must ship as a separate upload security product with real MIME validation, file and page limits, randomized private storage, short-lived signed URLs, isolated OCR containers, resource limits, deletion evidence, safe logs, rate limiting and parser dependency updates.
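The "safe logs" requirement is easiest to enforce with an allow-list rather than by redacting known-sensitive fields. The sketch below assumes a hypothetical `OcrJob` shape; the field names are illustrative and do not describe an existing LINGYANG service.

```typescript
// Hypothetical job record held by an OCR worker.
interface OcrJob {
  jobId: string;            // random identifier, not derived from the file
  originalFileName: string; // sensitive: must never be logged
  extractedText: string;    // sensitive: must never be logged
  pageCount: number;
  durationMs: number;
  status: "done" | "failed" | "timeout";
}

// The only fields that may appear in logs.
interface SafeLogEntry {
  jobId: string;
  status: OcrJob["status"];
  pageCount: number;
  durationMs: number;
}

// Copies allow-listed fields one by one. Spreading `job` instead would
// silently leak any sensitive field added to OcrJob later.
function toSafeLogEntry(job: OcrJob): SafeLogEntry {
  return {
    jobId: job.jobId,
    status: job.status,
    pageCount: job.pageCount,
    durationMs: job.durationMs,
  };
}
```

Logging `JSON.stringify(toSafeLogEntry(job))` then records operational facts about a job without file names, transaction text or amounts.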
Statement files, OCR images, extracted rows and exported files are not used to train AI or machine-learning models.
Email support@antelope.tools with a clear description, affected URL and reproduction steps. Please do not access or modify other people's data.