Document Lifecycle
This guide explains the lifecycle of documents, pages, and related resources in the library.
Lifecycle Overview

    PDFium.init()
    │
    ├── openDocument()
    │   │
    │   ├── getPage() / pages()
    │   │   │
    │   │   ├── render()
    │   │   ├── getText()
    │   │   ├── findText()
    │   │   ├── getObjects() / objects()
    │   │   └── getAnnotations()
    │   │   │
    │   │   └── page.dispose()
    │   │
    │   ├── getBookmarks()
    │   ├── getAttachments()
    │   └── save()
    │   │
    │   └── document.dispose()
    │
    ├── createDocument()
    │   │
    │   ├── addPage()
    │   │   │
    │   │   ├── addText()
    │   │   ├── addRect()
    │   │   └── finalize()
    │   │   │
    │   │   └── pageBuilder.dispose()
    │   │
    │   └── save()
    │   │
    │   └── builder.dispose()
    │
    └── pdfium.dispose()

States
PDFium Instance
| State | Description |
|---|---|
| Uninitialised | Before PDFium.init() |
| Ready | After init(), before dispose() |
| Disposed | After dispose() |
Document
| State | Description |
|---|---|
| Loading | During openDocument() |
| Open | Successfully loaded |
| Disposed | After dispose() |
Page
| State | Description |
|---|---|
| Loading | During getPage() |
| Open | Successfully loaded |
| Disposed | After dispose() |
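
The states in the tables above can be read as a small state machine. The sketch below is a hypothetical model, not the library's implementation: `TrackedResource`, `ResourceState`, `markOpen`, and `assertOpen` are illustrative names, and it assumes that using a resource outside the Open state should throw and that disposing twice is harmless.

```typescript
// Hypothetical model of the Loading -> Open -> Disposed states.
// None of these names come from the library.
type ResourceState = 'loading' | 'open' | 'disposed';

class TrackedResource {
  private state: ResourceState = 'loading';
  readonly name: string;

  constructor(name: string) {
    this.name = name;
  }

  // Called once loading has succeeded.
  markOpen(): void {
    if (this.state !== 'loading') {
      throw new Error(`${this.name}: cannot open from state "${this.state}"`);
    }
    this.state = 'open';
  }

  // Guard to call at the top of every method that touches the resource.
  assertOpen(): void {
    if (this.state !== 'open') {
      throw new Error(`${this.name} is ${this.state}, expected open`);
    }
  }

  // Assumption: disposing an already disposed resource is a no-op.
  dispose(): void {
    this.state = 'disposed';
  }

  get currentState(): ResourceState {
    return this.state;
  }
}

const trackedPage = new TrackedResource('page 0');
trackedPage.markOpen();
trackedPage.assertOpen(); // passes: the resource is open
trackedPage.dispose();

// Any use after dispose() is rejected by the guard.
let errorAfterDispose = '';
try {
  trackedPage.assertOpen();
} catch (error) {
  errorAfterDispose = (error as Error).message;
}
```

A guard of this kind turns a use-after-dispose mistake into an immediate, descriptive error instead of undefined behaviour further down the call chain.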
Ownership Rules
Parent-Child Relationships

    PDFium (owns)
    └── PDFiumDocument (owns)
        └── PDFiumPage

Rules:
- Parent must outlive children
- Disposing parent may invalidate children
- Always dispose children before parents

    // CORRECT: Dispose in reverse order
    using pdfium = await PDFium.init();
    using document = await pdfium.openDocument(data);
    using page = document.getPage(0);
    // Use page...
    // page disposed
    // document disposed
    // pdfium disposed

    // INCORRECT: Parent disposed first
    const pdfium = await PDFium.init();
    const document = await pdfium.openDocument(data);
    pdfium.dispose(); // BAD: document still open
    document.dispose(); // May crash or leak

Reference Invalidation
When a document is disposed, all its pages become invalid:

    const document = await pdfium.openDocument(data);
    const page = document.getPage(0); // Don't store!

    document.dispose();

    page.getText(); // ERROR: Document is closed

Using Keyword Flow
The using keyword disposes each resource automatically when its scope ends, in reverse order of declaration:

    async function processDocument(data: Uint8Array) {
      using pdfium = await PDFium.init();
      // pdfium is ready

      using document = await pdfium.openDocument(data);
      // document is open

      using page = document.getPage(0);
      // page is open

      const text = page.getText();

      // When the function returns:
      // 1. page is disposed
      // 2. document is disposed
      // 3. pdfium is disposed
      return text;
    }

Nested Scopes
Use block scopes to control exactly when each page is disposed:

    using pdfium = await PDFium.init();
    using document = await pdfium.openDocument(data);

    // Process pages in sequence
    for (let i = 0; i < document.pageCount; i++) {
      using page = document.getPage(i);
      await processPage(page);
      // page disposed at end of each iteration
    }

    // Or with explicit blocks
    {
      using page = document.getPage(0);
      // Use page...
    } // page disposed here

    {
      using page = document.getPage(1);
      // Use page...
    } // page disposed here

Generator Pattern
The pages() generator yields pages that the caller must dispose:

    // CORRECT: using inside loop
    for (const page of document.pages()) {
      using p = page; // Assign to using variable
      console.log(p.getText());
    } // Each page disposed after its iteration

    // INCORRECT: Storing generator results
    const allPages = [...document.pages()]; // Don't do this!
    // Pages won't be properly disposed

Builder Lifecycle
Document Builder

    using builder = pdfium.createDocument();

    // Add pages
    {
      using page = builder.addPage();
      page.addText('Hello', 72, 720, font, 24);
      page.finalize(); // REQUIRED before dispose
    } // page builder disposed

    // Save document
    const bytes = builder.save();
    // builder disposed when scope ends

Page Builder Rules
- Call finalize() before the scope ends
- Do not modify the page after finalize()
- The page builder is disposed when its block scope ends
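
One way to make the first rule hard to forget is to route page construction through a helper that always calls `finalize()`. The sketch below is an illustration under assumptions: `withFinalizedPage`, `PageBuilderLike`, and `FakePageBuilder` are hypothetical names, and the stand-in builder only records calls so the example runs without the library. It calls `dispose()` explicitly rather than relying on `using`, so it also runs on toolchains without explicit resource management support.

```typescript
// Minimal shape the helper relies on; the real page builder has more methods.
interface PageBuilderLike {
  finalize(): void;
  dispose(): void;
}

// Hypothetical helper: runs `draw`, then always finalizes and disposes the page.
function withFinalizedPage<P extends PageBuilderLike>(page: P, draw: (page: P) => void): void {
  try {
    try {
      draw(page);
    } finally {
      page.finalize(); // runs even if draw() throws
    }
  } finally {
    page.dispose(); // runs even if finalize() throws
  }
}

// Stand-in builder that records the order of calls (not the library's class).
class FakePageBuilder implements PageBuilderLike {
  readonly calls: string[] = [];

  addText(text: string): void {
    this.calls.push(`addText:${text}`);
  }

  finalize(): void {
    this.calls.push('finalize');
  }

  dispose(): void {
    this.calls.push('dispose');
  }
}

const fakePage = new FakePageBuilder();
withFinalizedPage(fakePage, (p) => p.addText('Title'));
// fakePage.calls is now ['addText:Title', 'finalize', 'dispose']
```

With the real builder, the call site would presumably look like `withFinalizedPage(builder.addPage(), (page) => { ... })`, keeping the drawing code free of cleanup logic.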

    // INCORRECT: finalize() is never called
    {
      using page = builder.addPage();
      page.addText('Title', 72, 720, font, 24);
      // page.finalize(); // MISSING - page will be incomplete!
    }

    // Better pattern
    {
      using page = builder.addPage();
      try {
        page.addText('Title', 72, 720, font, 24);
        page.addRect(72, 700, 200, 50, style);
      } finally {
        page.finalize(); // Always called
      }
    }

Document Events
PDFiumDocument exposes an events emitter for lifecycle hooks:

    using document = await pdfium.openDocument(data);

    document.events.on('pageLoaded', ({ pageIndex }) => {
      console.log(`Page ${pageIndex} loaded`);
    });

    document.events.on('willSave', () => {
      console.log('Document is about to be saved');
    });

Available Events
| Event | Payload | Triggered When |
|---|---|---|
| pageLoaded | { pageIndex: number } | A page is loaded via getPage() or pages() |
| willSave | undefined | Immediately before save() writes the document |
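
The table above can be read as a mapping from event name to payload type. The sketch below expresses that mapping in TypeScript and pairs it with a tiny emitter so that it runs on its own. `DocumentEventMap` and `TinyEmitter` are hypothetical names, only `on()` appears in this guide (`emit()` is an assumption added to drive the example), and the library's actual emitter type may differ.

```typescript
// Event names and payloads copied from the table above.
type DocumentEventMap = {
  pageLoaded: { pageIndex: number };
  willSave: undefined;
};

type Listener<T> = (payload: T) => void;

// Hypothetical stand-in emitter, not the library's class.
class TinyEmitter<Events extends Record<string, unknown>> {
  private listeners = new Map<keyof Events, Listener<any>[]>();

  // Registers a listener whose payload type is derived from the event name.
  on<K extends keyof Events>(event: K, listener: Listener<Events[K]>): void {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
  }

  // Assumed method: delivers a payload to every listener for the event.
  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    for (const listener of this.listeners.get(event) ?? []) {
      listener(payload);
    }
  }
}

const events = new TinyEmitter<DocumentEventMap>();
const log: string[] = [];

events.on('pageLoaded', ({ pageIndex }) => log.push(`loaded ${pageIndex}`));
events.on('willSave', () => log.push('saving'));

events.emit('pageLoaded', { pageIndex: 0 });
events.emit('willSave', undefined);
// log is now ['loaded 0', 'saving']
```

Typing the map this way lets the compiler reject a listener that destructures `pageIndex` from a `willSave` payload.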
Error Handling and Cleanup
Resources are cleaned up even when errors occur:

    async function safeProcess(data: Uint8Array) {
      using pdfium = await PDFium.init();

      try {
        using document = await pdfium.openDocument(data);
        using page = document.getPage(999); // May throw

        return page.getText();
      } catch (error) {
        console.error('Error:', error);
        throw error;
      }
      // All resources disposed even on error
    }

Anti-Patterns
Storing References

    // BAD: Stored reference becomes invalid
    class PDFViewer {
      private page?: PDFiumPage;

      async loadPage(document: PDFiumDocument, index: number) {
        this.page = document.getPage(index); // Don't store!
      }
    }

    // GOOD: Process immediately
    class PDFViewer {
      async renderPage(document: PDFiumDocument, index: number) {
        using page = document.getPage(index);
        return page.render({ scale: 2 });
        // page disposed, result data persists
      }
    }

Returning Undisposed Resources

    // BAD: Caller must remember to dispose
    function getFirstPage(document: PDFiumDocument) {
      return document.getPage(0); // Who disposes this?
    }

    // GOOD: Return processed data
    function getFirstPageText(document: PDFiumDocument): string {
      using page = document.getPage(0);
      return page.getText();
    }

    // OK: Document return (caller owns it)
    async function openDocument(data: Uint8Array): Promise<PDFiumDocument> {
      using pdfium = await PDFium.init();
      return await pdfium.openDocument(data);
      // Note: pdfium is disposed on return while the document is still open.
      // This conflicts with the parent-outlives-child rule and requires careful handling.
    }

Best Practices
- Use the using keyword for all disposable resources
- Process, don’t store, pages and temporary resources
- Dispose in reverse order of creation
- Keep scopes tight for memory efficiency
- Always call finalize() on page builders
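
Several of the practices above can be combined in a small helper that opens each page, extracts plain data, and disposes the page before moving on. This is a sketch under assumptions: `mapPages` is a hypothetical helper, the `DocumentLike` and `PageLike` interfaces cover only members that appear in this guide (`pageCount`, `getPage()`, `getText()`, `dispose()`), and the fake document exists solely so the example runs without the library.

```typescript
interface PageLike {
  getText(): string;
  dispose(): void;
}

interface DocumentLike {
  readonly pageCount: number;
  getPage(index: number): PageLike;
}

// Hypothetical helper: returns plain data, never page objects.
function mapPages<T>(document: DocumentLike, fn: (page: PageLike, index: number) => T): T[] {
  const results: T[] = [];
  for (let i = 0; i < document.pageCount; i++) {
    const page = document.getPage(i);
    try {
      results.push(fn(page, i));
    } finally {
      page.dispose(); // the helper never holds more than one page at a time
    }
  }
  return results;
}

// Stand-in document that counts open pages, used only to exercise the helper.
let openPages = 0;
const fakeDocument: DocumentLike = {
  pageCount: 3,
  getPage(index) {
    openPages++;
    return {
      getText: () => `text of page ${index}`,
      dispose: () => {
        openPages--;
      },
    };
  },
};

const texts = mapPages(fakeDocument, (page) => page.getText());
// texts holds one string per page and openPages is back to 0
```

Because the callback returns data rather than the page itself, callers have no page reference left to store or forget to dispose.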
See Also
- Resource Management: Disposal patterns
- Error Handling: Error recovery
- Memory Management: Memory considerations