Security Guide
PDF processing involves parsing complex, potentially untrusted binary data. This guide covers the security considerations and recommended practices for using @scaryterry/pdfium safely.
Content Security Policy (CSP)
When running PDFium in the browser via WASM, strict CSP headers are recommended.
Required Directives
PDFium WASM execution requires:

```http
Content-Security-Policy: script-src 'self' 'wasm-unsafe-eval';
```

- `'self'`: Allows loading the JS glue code and worker scripts from your domain.
- `'wasm-unsafe-eval'`: Required by V8 and other engines to compile and instantiate WASM modules.
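If the header is assembled in application code rather than in static server configuration, a small helper keeps the directives in one place. The following is a minimal sketch only; `buildPdfiumCsp` and its `wasmOrigin` parameter are hypothetical names and not part of the library. It covers the same-origin case and, optionally, a WASM binary fetched from another origin.

```typescript
// Hypothetical helper (not part of @scaryterry/pdfium): builds the CSP value.
function buildPdfiumCsp(wasmOrigin?: string): string {
  const directives: Record<string, string[]> = {
    'script-src': ["'self'", "'wasm-unsafe-eval'"],
  };
  if (wasmOrigin !== undefined) {
    // Normalise to scheme + host (+ port); throws on a malformed URL.
    directives['connect-src'] = ["'self'", new URL(wasmOrigin).origin];
  }
  return Object.entries(directives)
    .map(([name, values]) => `${name} ${values.join(' ')}`)
    .join('; ');
}

const sameOriginPolicy = buildPdfiumCsp();
const cdnPolicy = buildPdfiumCsp('https://cdn.example.com/pdfium.wasm');
```

The returned string would be sent as the value of the `Content-Security-Policy` response header.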
If you load the WASM binary from a CDN or different origin:
```http
Content-Security-Policy: script-src 'self' 'wasm-unsafe-eval'; connect-src 'self' https://cdn.example.com;
```

Worker Isolation
If using `WorkerProxy`, ensure your worker script is served with appropriate headers:

```http
Cross-Origin-Embedder-Policy: require-corp
Cross-Origin-Opener-Policy: same-origin
```

These headers are required if you use `SharedArrayBuffer` (which PDFium might use for optimisation in future versions, though currently it uses structured cloning).
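Whether the two headers took effect can be checked at runtime through the standard `crossOriginIsolated` global. The sketch below is illustrative; `ISOLATION_HEADERS` and `canUseSharedMemory` are hypothetical names, not library exports.

```typescript
// The two headers discussed above, as a reusable map for a server or dev proxy.
const ISOLATION_HEADERS: Readonly<Record<string, string>> = {
  'Cross-Origin-Embedder-Policy': 'require-corp',
  'Cross-Origin-Opener-Policy': 'same-origin',
};

// Hypothetical check: SharedArrayBuffer is only usable in a browser when the
// page is cross-origin isolated. The flag defaults to the runtime's global.
function canUseSharedMemory(
  isolated: boolean | undefined =
    (globalThis as unknown as { crossOriginIsolated?: boolean }).crossOriginIsolated,
): boolean {
  return isolated === true && typeof SharedArrayBuffer !== 'undefined';
}
```

In a browser, `canUseSharedMemory()` returning `false` usually means one of the headers is missing on the document or the worker script.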
Subresource Integrity (SRI)
When loading the WASM binary from a CDN, consider using SRI hashes to verify integrity:

```html
<script type="module">
  // Placeholder: replace with the known-good SHA-256 hex digest from your build output
  const expectedHash = '<SHA-256 hex digest from your build output>';

  // Verify the WASM binary hash after fetching
  const response = await fetch('/pdfium.wasm');
  const buffer = await response.arrayBuffer();
  const hashBuffer = await crypto.subtle.digest('SHA-256', buffer);
  const hashHex = Array.from(new Uint8Array(hashBuffer))
    .map(b => b.toString(16).padStart(2, '0'))
    .join('');

  // Compare against known hash (from build output)
  if (hashHex !== expectedHash) {
    throw new Error('WASM binary integrity check failed');
  }
</script>
```

The script is marked `type="module"` because top-level `await` is not available in classic scripts. The library exposes `__WASM_HASH__` at build time for this purpose.
WASM Isolation
WebAssembly runs in a sandboxed environment. It cannot access the DOM, cookies, or local storage directly. However, memory corruption within the WASM heap is possible (buffer overflows inside the virtual machine).
To mitigate impact:
- Process untrusted PDFs in a Worker: Even if the WASM module crashes or hangs, the main thread remains responsive.
- Use resource limits: Set `maxDocumentSize` and `maxRenderDimension` to prevent memory exhaustion attacks.
- Set timeouts: Use worker timeouts to prevent infinite loops from hanging your application.
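The timeout bullet can be approximated with a generic promise wrapper. This is a sketch under assumptions: `withTimeout` and its `onTimeout` callback are hypothetical and not provided by the library. The callback is where a caller would terminate the worker that is running the task.

```typescript
// Hypothetical helper: rejects if `task` does not settle within `ms` milliseconds.
function withTimeout<T>(task: Promise<T>, ms: number, onTimeout?: () => void): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      onTimeout?.(); // e.g. terminate the worker that is stuck
      reject(new Error(`Timed out after ${ms} ms`));
    }, ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([task, timeout]).finally(() => clearTimeout(timer));
}
```

Rejecting the promise does not stop code that is already running, so the timeout only protects the application if `onTimeout` actually tears the worker down. Resource limits, by contrast, are set when the library is initialised: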
```typescript
using pdfium = await PDFium.init({
  limits: {
    maxDocumentSize: 50 * 1024 * 1024, // 50 MB
    maxRenderDimension: 4096, // 4096×4096 max
  },
});
```

Input Validation
Always validate untrusted input before passing it to PDFium:
File Size
Reject files larger than your application’s limit before passing them to `openDocument`:

```typescript
const MAX_PDF_SIZE = 50 * 1024 * 1024; // 50 MB

function validatePDFInput(data: Uint8Array): void {
  if (data.byteLength === 0) {
    throw new Error('Empty file');
  }
  if (data.byteLength > MAX_PDF_SIZE) {
    throw new Error(`File too large: ${data.byteLength} bytes (max ${MAX_PDF_SIZE})`);
  }
}
```

File Type
Check the PDF magic bytes before processing:
```typescript
function isPDFHeader(data: Uint8Array): boolean {
  // PDF files start with the five bytes "%PDF-" (0x25 0x50 0x44 0x46 0x2D)
  const magic = [0x25, 0x50, 0x44, 0x46, 0x2d];
  return data.length >= magic.length && magic.every((byte, i) => data[i] === byte);
}
```

Render Dimensions
Validate scale factors and dimensions to prevent memory exhaustion:
```typescript
function safeRender(page: PDFiumPage, scale: number) {
  // Reject NaN, Infinity, zero, and negative scales before computing dimensions
  if (!Number.isFinite(scale) || scale <= 0) {
    throw new Error(`Invalid scale factor: ${scale}`);
  }

  const maxDimension = 8192;
  const width = Math.ceil(page.width * scale);
  const height = Math.ceil(page.height * scale);

  if (width > maxDimension || height > maxDimension) {
    throw new Error(`Render dimensions ${width}×${height} exceed limit of ${maxDimension}`);
  }

  return page.render({ scale });
}
```

Filename Sanitisation
When extracting attachments, sanitise filenames to prevent path traversal:
```typescript
import path from 'path';

function sanitiseFilename(name: string): string {
  // Remove path components
  name = path.basename(name);
  // Remove problematic characters
  name = name.replace(/[<>:"/\\|?*\x00-\x1F]/g, '_');
  // Prevent empty or dot-only names
  if (!name || name === '.' || name === '..') {
    name = 'attachment';
  }
  return name;
}
```

Password Handling
When working with password-protected PDFs:
- Pass passwords via the `password` option in `openDocument()`.
- The library’s internal helpers use `secure: true` to zero out password buffers after use.
- Never log or persist passwords.
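To illustrate what zeroing a password buffer means in practice, here is a minimal sketch. It is not the library's implementation; `withPasswordBytes` is a hypothetical helper that encodes the password, hands the bytes to a synchronous callback, and overwrites them afterwards.

```typescript
// Hypothetical helper: the byte buffer never outlives the callback un-zeroed.
function withPasswordBytes<T>(password: string, use: (bytes: Uint8Array) => T): T {
  const bytes = new TextEncoder().encode(password);
  try {
    return use(bytes); // `use` must finish with the bytes before returning
  } finally {
    bytes.fill(0); // overwrite the copy, even if `use` threw
  }
}
```

Note that the original JavaScript string cannot be wiped this way, which is one more reason to keep passwords short-lived and out of logs. With the library itself, the password is passed as an option and the zeroing happens internally: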
```typescript
using document = await pdfium.openDocument(data, {
  password: userProvidedPassword,
});
// Password buffer is zeroed after use internally
```

Document Permissions
PDFs can restrict operations (printing, copying, editing) via permission flags. The library respects these and throws `PermissionsError` when a restricted operation is attempted:
```typescript
import { PermissionsError } from '@scaryterry/pdfium';

try {
  const bytes = document.save();
} catch (error) {
  if (error instanceof PermissionsError) {
    console.error('This document does not allow modification');
  }
}
```

Server-Side Considerations
Resource Limits
In server environments processing user-uploaded PDFs, always set strict limits:
```typescript
using pdfium = await PDFium.init({
  limits: {
    maxDocumentSize: 20 * 1024 * 1024, // 20 MB for uploads
    maxRenderDimension: 4096,
    maxTextCharCount: 500_000,
  },
});
```

Global Configuration
For applications that process PDFs across multiple instances, use `configure()` to set limits globally rather than per-instance:
```typescript
import { configure, getConfig } from '@scaryterry/pdfium';

configure({
  limits: {
    maxDocumentSize: 20 * 1024 * 1024, // 20 MB
    maxRenderDimension: 4096,
    maxTextCharCount: 500_000,
  },
});

// Verify the current configuration
const config = getConfig();
console.log(config.limits.maxDocumentSize); // 20971520
```

Process Isolation
For high-security environments, consider running PDF processing in a separate process or container:
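- Use Node.js `worker_threads` to move processing off the main thread; workers still share the host process, so use `child_process` when true process-level isolation is required
- Run in a container with limited memory and CPU
- Set OS-level resource limits (`ulimit` on Linux)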
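The first option can be sketched with the Node.js `worker_threads` API. Everything named here (`runIsolated`, `WORKER_SOURCE`, the 256 MB figure) is a hypothetical illustration; a real worker body would initialise PDFium and process the bytes instead of echoing their length. Note that `resourceLimits` caps the worker's JavaScript heap and should not be assumed to bound WASM linear memory.

```typescript
import { Worker } from 'node:worker_threads';

// Placeholder worker body; a real one would open and process the PDF.
const WORKER_SOURCE = `
  const { parentPort, workerData } = require('node:worker_threads');
  parentPort.postMessage({ byteLength: workerData.byteLength });
`;

function runIsolated<T>(source: string, data: Uint8Array, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const worker = new Worker(source, {
      eval: true,
      workerData: data,
      resourceLimits: { maxOldGenerationSizeMb: 256 }, // JS heap cap only
    });
    // Kill the worker if it hangs, for example on a malicious PDF.
    const timer = setTimeout(() => {
      void worker.terminate();
      reject(new Error(`Worker timed out after ${timeoutMs} ms`));
    }, timeoutMs);
    worker.once('message', (result: T) => {
      clearTimeout(timer);
      resolve(result);
      void worker.terminate();
    });
    worker.once('error', (error) => {
      clearTimeout(timer);
      reject(error);
    });
    worker.once('exit', (code) => {
      clearTimeout(timer);
      // No effect if the promise has already settled.
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}
```

A call such as `runIsolated(WORKER_SOURCE, pdfBytes, 10_000)` keeps a hang or crash inside the worker from taking down the request handler, although the worker still shares the host process.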
Error Information Leakage
In production, error context is automatically sanitised to strip internal WASM details. Avoid exposing raw `PDFiumError.context` to end users:
```typescript
try {
  using document = await pdfium.openDocument(data);
} catch (error) {
  if (error instanceof PDFiumError) {
    // Log full details internally
    logger.error('PDF processing failed', { code: error.code, context: error.context });

    // Return sanitised message to user
    res.status(400).json({ error: 'Invalid PDF file' });
  } else {
    // Unrelated failures should not be swallowed
    throw error;
  }
}
```

Dependency Scanning
The library:
- Has zero runtime dependencies — PDFium is compiled to WASM and bundled
- Runs `npm audit` as part of CI
- Uses a `pnpm` lockfile for reproducible installs
See Also
- Error Handling: Error recovery patterns
- Error Reference: Complete error code listing
- Worker Mode: Off-main-thread processing
- Memory Management: WASM memory considerations