Architecture

Understanding the library architecture helps you choose the right backend (Native vs WASM) and debug issues. This guide explains how PDFium is wrapped for JavaScript/TypeScript.

The library employs a Hybrid Architecture that provides a unified API over two distinct backends:

  1. Native Backend (Node.js only): Uses N-API bindings to call the system’s compiled PDFium library directly. Fast, with no WASM marshalling overhead, but requires platform-specific binaries.
  2. WASM Backend (Universal): Uses a WebAssembly-compiled version of PDFium. Portable, works in browsers and Node.js, but has slight marshalling overhead.

    graph TD
        subgraph App[Your Application]
            Code[Your Code]
        end
        subgraph Wrapper[@scaryterry/pdfium]
            API[PDFium Class]
            Doc[PDFiumDocument]
            Page[PDFiumPage]
        end
        subgraph Backends[Backend Selection]
            Switch{Native or WASM?}
            Native[Native Backend]
            WASM[WASM Backend]
        end
        subgraph Implementation[Implementation Layer]
            NAPI[N-API Bindings]
            Emscripten[Emscripten Runtime]
        end
        subgraph Core[PDFium Core]
            Lib[libpdfium.so / .dll / .dylib]
            WasmBin[pdfium.wasm]
        end
        Code --> API
        API --> Switch
        Switch -- "useNative: true" --> Native
        Switch -- "Default" --> WASM
        Native --> NAPI
        WASM --> Emscripten
        NAPI --> Lib
        Emscripten --> WasmBin

    sequenceDiagram
        participant App
        participant PDFium
        participant Loader as Native Loader
        participant WASM as WASM Loader
        App->>PDFium: init({ useNative: true })
        PDFium->>Loader: loadNativeBinding()
        alt Native Module Found
            Loader-->>PDFium: NativeInstance
            PDFium-->>App: NativePDFiumInstance
        else Not Found
            PDFium->>WASM: loadWASM()
            WASM-->>PDFium: WasmModule
            PDFium-->>App: PDFium (WASM)
        end
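
The fallback in the sequence diagram can be sketched as follows. This is a minimal illustration, not the library's actual implementation: `initBackend`, the `Backend` type, and the `Loaders` callbacks are hypothetical names, and the loaders are injected so the sketch runs without PDFium installed.

```typescript
// Hypothetical sketch of "try native first, fall back to WASM".
// None of these names come from the library's real API.
type Backend = { kind: "native" | "wasm" };

interface Loaders {
  // Resolves to null when no native binding exists for this platform.
  loadNative: () => Promise<Backend | null>;
  loadWasm: () => Promise<Backend>;
}

async function initBackend(
  options: { useNative?: boolean },
  loaders: Loaders,
): Promise<Backend> {
  if (options.useNative) {
    try {
      const native = await loaders.loadNative();
      if (native !== null) return native;
    } catch {
      // A failed native load is treated like a missing binding.
    }
  }
  // Default path, and the fallback when the native load did not succeed.
  return loaders.loadWasm();
}
```

Injecting the loaders keeps the selection logic testable on machines where no native package is installed.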

Regardless of the backend, you interact with similar high-level interfaces:

  • PDFiumDocument (WASM) / NativePDFiumDocument (Native)
  • PDFiumPage (WASM) / NativePDFiumPage (Native)

This allows you to write code that is largely portable between Node.js (Native) and the Browser (WASM), though some advanced features (forms, creation) are currently WASM-only.

Native Backend

  • Implementation: src/native/
  • Dependencies: Platform-specific npm packages (e.g., @scaryterry/pdfium-darwin-arm64).
  • Performance: Direct C++ calls, with no memory copying for some operations.
  • Limitations: Only works in Node.js environments.

WASM Backend

  • Implementation: src/wasm/
  • Dependencies: None (WASM binary is bundled).
  • Performance: Near-native, but incurs overhead when copying image data between the WASM heap and JS memory.
  • Capabilities: Runs everywhere (Node.js, Browsers, Workers).
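
The copy overhead noted for the WASM backend comes from moving bytes out of linear memory into a buffer that JavaScript owns. The sketch below uses a plain `ArrayBuffer` as a stand-in for the WASM heap; `viewIntoHeap`, `copyOutOfHeap`, and the offsets are illustrative assumptions, not library code.

```typescript
// Stand-in for WASM linear memory: one flat byte array addressed by offset.
const heap = new Uint8Array(new ArrayBuffer(64 * 1024));

// Pretend the C++ side rendered a 2x2 bitmap (4 bytes per pixel) at "pointer" 1024.
const ptr = 1024;
const byteLength = 2 * 2 * 4;
heap.fill(0xff, ptr, ptr + byteLength);

// subarray() is a zero-copy view: it aliases the heap, so later writes
// to that region show through.
function viewIntoHeap(p: number, len: number): Uint8Array {
  return heap.subarray(p, p + len);
}

// slice() copies the bytes into a new buffer owned by JavaScript.
// For large images, this per-call copy is the marshalling cost.
function copyOutOfHeap(p: number, len: number): Uint8Array {
  return heap.slice(p, p + len);
}

const view = viewIntoHeap(ptr, byteLength);
const copy = copyOutOfHeap(ptr, byteLength);

heap[ptr] = 0; // simulate the heap region being overwritten
console.log(view[0], copy[0]); // prints "0 255"
```

The copy is the safe choice once the underlying allocation may be freed or reused; the view is cheaper but only valid while the heap region stays untouched.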

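The WASM backend needs explicit memory management: PDFium's objects live in WebAssembly linear memory, which the JavaScript garbage collector does not track, so anything allocated there must be freed explicitly. The wrapper takes care of this through the Disposable pattern.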

    graph LR
        JS[JavaScript Memory] <--> Bridge[Marshalling Layer] <--> Linear[WASM Linear Memory]
        Linear --> Heap[C++ Heap]
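
A minimal sketch of the Disposable idea, assuming a resource that exposes a `dispose()` method. `HasDispose`, `FakeHandle`, and `withResource` are hypothetical helpers written for this illustration; the library's own classes and method names may differ.

```typescript
// Hypothetical shape of a resource that owns memory outside the JS garbage collector.
interface HasDispose {
  dispose(): void;
}

class FakeHandle implements HasDispose {
  static live = 0; // number of handles not yet released
  private released = false;

  constructor() {
    FakeHandle.live += 1; // stands in for an allocation on the WASM heap
  }

  dispose(): void {
    if (this.released) return; // disposing twice is harmless
    this.released = true;
    FakeHandle.live -= 1; // stands in for freeing the allocation
  }
}

// Runs `fn` and always releases the resource, even if `fn` throws.
function withResource<T extends HasDispose, R>(resource: T, fn: (r: T) => R): R {
  try {
    return fn(resource);
  } finally {
    resource.dispose();
  }
}

const result = withResource(new FakeHandle(), () => "rendered");
console.log(result, FakeHandle.live); // prints "rendered 0"
```

The `try`/`finally` wrapper gives the same guarantee that a `using` declaration provides in TypeScript 5.2 and later for objects implementing `Symbol.dispose`.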

Both backends use a “Handle” system to track pointers to underlying C++ objects.

| Handle Type | C++ Type | Description |
| --- | --- | --- |
| DocumentHandle | FPDF_DOCUMENT | Pointer to an open document |
| PageHandle | FPDF_PAGE | Pointer to a loaded page |
| BitmapHandle | FPDF_BITMAP | Pointer to a pixel buffer |

Safety: These handles are “branded types” in TypeScript to prevent you from accidentally passing a Page handle to a function expecting a Document handle.
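
Branded types are a general TypeScript technique, sketched below. The `Brand` helper, the casting functions, and `describeDocument` are assumptions made for illustration; the library's real declarations may look different.

```typescript
// A brand is a compile-time-only tag; at runtime the handle is a plain number.
type Brand<T, Name extends string> = T & { readonly __brand: Name };

// Hypothetical declarations; the library's real definitions may differ.
type DocumentHandle = Brand<number, "DocumentHandle">;
type PageHandle = Brand<number, "PageHandle">;

// Casts like these would live in one trusted place, where raw pointers
// come back from PDFium.
const asDocument = (raw: number) => raw as DocumentHandle;
const asPage = (raw: number) => raw as PageHandle;

function describeDocument(doc: DocumentHandle): string {
  return `document@${doc}`;
}

const doc = asDocument(4096);
const page = asPage(8192);

console.log(describeDocument(doc)); // prints "document@4096"
// describeDocument(page); // compile error: PageHandle is not a DocumentHandle
// describeDocument(4096); // compile error: a bare number carries no brand
```

Because the brand exists only in the type system, it adds no runtime cost: both handles are still ordinary numbers when the code runs.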

| Feature | Native Backend | WASM Backend |
| --- | --- | --- |
| Environment | Node.js only | Browsers & Node.js |
| Setup | Install platform package | Zero config |
| Performance | Maximum | High |
| Worker Support | Node.js Worker Threads | Web Workers & Worker Threads |
| Memory Limit | System RAM | Browser/WASM limit (often 2GB/4GB) |