How PageTurner Works
PageTurner uses a sophisticated 5-phase AI pipeline to achieve professional-grade translations while preserving React components, technical terminology, and document structure.
This isn't just Google Translate for docs - it's a purpose-built system for technical documentation translation.
Why Not Just Use Google Translate?
The Problem with Generic Translation:
Generic translation tools process sentences in isolation, leading to:
- ❌ Inconsistent terminology - "repository" might be "repositorio" on page 1 and "repo" on page 50
- ❌ Lost context - Technical concepts misunderstood without document context
- ❌ Broken structure - React components, code blocks, and formatting destroyed
- ❌ Poor quality - Generic tools average 65-70/100 for technical content
PageTurner's Approach:
PageTurner treats documentation as a structured, interconnected system:
- ✅ Perfect term consistency - "authentication" translates identically across all 200 pages
- ✅ Context-aware - Understands your documentation's domain and terminology
- ✅ Structure preservation - React components, MDX, and formatting stay intact
- ✅ High quality - Average 91.3/100 quality score
The 5-Phase Pipeline
Phase 1: Parallel Intelligence Extraction
What happens:
- Content Analysis - Scans entire documentation to understand structure and domain
- Keyterm Extraction - Identifies critical technical terms requiring consistent translation
- API names (e.g., "useEffect", "useState")
- Technical concepts (e.g., "authentication", "middleware")
- Product-specific terms (e.g., "Webhook", "OAuth")
- Initial Translation - Performs first-pass translation with full document context
Why this matters:
- Identifies terms that must be translated consistently across all pages
- Understands your documentation's domain (database, web framework, cloud platform, etc.)
- Provides context for better translation decisions
Example:
For WatermelonDB documentation:
Extracted keyterms:
- "WatermelonDB" β Keep untranslated (product name)
- "database" β Must be consistent
- "query" β Technical term, needs consistency
- "reactive" β Core concept, critical consistency
- "synchronization" β Feature name, must match everywhere
Phase 2: Term Relationship Analysis
What happens:
- Semantic Clustering - Groups related terms using AI algorithms
- Relationship Mapping - Identifies terms that must maintain consistency
- Synonyms (e.g., "repo" and "repository")
- Hierarchical terms (e.g., "database", "database query", "database migration")
- Related concepts (e.g., "authenticate" and "authentication")
Why this matters:
- Ensures "database query" and "query" use the same translation for "query"
- Prevents inconsistencies when synonyms are used (repo vs repository)
- Maintains semantic relationships in target language
Example:
Relationship clusters identified:
Cluster 1: ["repository", "repo", "git repository"]
→ All must use consistent translation
Cluster 2: ["authenticate", "authentication", "auth"]
→ Related terms, translations must align
Cluster 3: ["synchronize", "sync", "synchronization"]
→ Ensure verb/noun consistency
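Once related terms are identified, turning pairwise relationships into clusters is simple bookkeeping. The sketch below uses a plain union-find pass over hypothetical term pairs; the real phase relies on AI-driven semantic analysis, and the function name is made up:

```ts
// Minimal stand-in: group terms into clusters from proposed "related" pairs.
function clusterTerms(relatedPairs: [string, string][]): string[][] {
  const parent = new Map<string, string>();

  function find(term: string): string {
    if (!parent.has(term)) parent.set(term, term);
    const p = parent.get(term)!;
    if (p === term) return term;
    const root = find(p);
    parent.set(term, root); // path compression
    return root;
  }

  for (const [a, b] of relatedPairs) parent.set(find(a), find(b));

  const clusters = new Map<string, string[]>();
  for (const term of parent.keys()) {
    const root = find(term);
    clusters.set(root, [...(clusters.get(root) ?? []), term]);
  }
  return [...clusters.values()];
}

// Reproduces the first two clusters above:
clusterTerms([
  ["repository", "repo"],
  ["repository", "git repository"],
  ["authenticate", "authentication"],
  ["authentication", "auth"],
]);
// -> [["repository", "repo", "git repository"], ["authenticate", "authentication", "auth"]]
```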
Phase 3: Term Parsing & Validation
What happens:
- Translation Extraction - Parses term translations from Phase 1
- Quality Validation - Ensures term translations are appropriate
- Checks if translation matches term meaning
- Validates grammatical correctness
- Verifies cultural appropriateness
Why this matters:
- Catches translation errors early before they propagate
- Ensures technical accuracy
- Prevents awkward or incorrect terminology
Example:
Term validation:
✅ "database" → "base de datos" (Spanish) - Validated
✅ "query" → "consulta" (Spanish) - Validated
❌ "sync" → "sincronizar" (verb form) - Corrected to "sincronización" (noun)
✅ "authentication" → "autenticación" (Spanish) - Validated
Phase 4: Consistency Resolution
What happens:
- Global Dictionary Creation - Builds consistent translations for all keyterms
- Conflict Resolution - Resolves any translation inconsistencies
- Chooses best translation when multiple options exist
- Ensures consistency across entire documentation
- Term Locking - Finalizes terminology dictionary for Phase 5
Why this matters:
- This is where perfect term consistency is guaranteed
- One source of truth for all technical terms
- Eliminates the "translated differently on different pages" problem
Example:
Global terminology dictionary (English β Spanish):
{
"database": "base de datos",
"query": "consulta",
"synchronization": "sincronizaciΓ³n",
"authentication": "autenticaciΓ³n",
"WatermelonDB": "WatermelonDB" // Product name, kept as-is
}
This dictionary is applied to ALL translations in Phase 5.
Phase 5: Translation Refinement
What happens:
- Second Pass Translation - Retranslates with enforced terminology consistency
- Quality Assurance - Final validation and error correction
- Applies global dictionary from Phase 4
- Validates MDX/React component preservation
- Checks link integrity
- Verifies formatting preservation
- Output Generation - Creates final translated files
Why this matters:
- Guarantees perfect term consistency across entire site
- Final quality check before deployment
- Ensures production-ready output
Example:
Page 1: "The database query system..."
↓
Page 1: "El sistema de consulta de base de datos..."
Page 50: "Execute a database query..."
↓
Page 50: "Ejecutar una consulta de base de datos..."
✅ "database" → "base de datos" (consistent)
✅ "query" → "consulta" (consistent)
Translation Memory: The Secret Weapon
PageTurner includes a powerful translation memory system that learns and improves over time.
How It Works
SHA256-Based Change Detection:
Original content: "Install PageTurner with npm install pageturner"
Content hash: a3f5b8c9d2e1...
Spanish translation: "Instala PageTurner con npm install pageturner"
Stored with hash: a3f5b8c9d2e1...
When content changes:
Updated content: "Install PageTurner with npm install pageturner-cli"
New hash: b7e2c4f1a8d9...
→ Only this changed segment gets retranslated
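A minimal sketch of this hash-and-reuse loop, using Node's built-in crypto module; the in-memory Map and function names stand in for PageTurner's persistent translation memory store:

```ts
import { createHash } from "node:crypto";

// Hash-and-reuse sketch: unchanged segments hit the cache, changed ones retranslate.
const memory = new Map<string, string>(); // content hash -> cached translation

function sha256(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

async function translateSegment(
  segment: string,
  translate: (text: string) => Promise<string>
): Promise<string> {
  const hash = sha256(segment);
  const cached = memory.get(hash);
  if (cached !== undefined) return cached; // unchanged segment: no API call
  const result = await translate(segment); // new or changed segment: retranslate
  memory.set(hash, result);
  return result;
}
```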
Why This Matters
Cost Savings on Updates:
- First translation: 100 pages × 3 languages = 300 translation requests ($30)
- Update 5 pages: Only 5 pages × 3 languages = 15 requests ($1.50)
- Savings: 95% cost reduction on updates
Cross-Project Learning:
- Translation memory is shared across all your projects
- Translating a second Docusaurus site reuses 40-60% of translations
- Team collaboration: Shared terminology across distributed teams
MDX & React Component Preservationβ
PageTurner was built specifically for Docusaurus, which uses MDX (Markdown + JSX).
What Gets Preservedβ
React Components:
<Tabs>
<TabItem value="js" label="JavaScript">
{/* code block with: const db = new Database(); */}
</TabItem>
</Tabs>
Result: Component structure preserved and code untouched; in this case the output is identical because the only label, "JavaScript", is a proper noun and stays as-is:
<Tabs>
<TabItem value="js" label="JavaScript">
{/* code block with: const db = new Database(); */}
</TabItem>
</Tabs>
Code Blocks:
## Installation
(bash code block: npm install watermelondb)
Result: Code never translated, only surrounding text:
## Instalación
(bash code block: npm install watermelondb)
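One common way to guarantee that code survives translation untouched is to swap fenced blocks for placeholders before the text reaches the model, then restore them afterwards. The sketch below shows that pattern in simplified form; it is not necessarily PageTurner's exact mechanism:

```ts
// Swap fenced code blocks for placeholders before translation, restore after.
function protectCodeBlocks(mdx: string): { text: string; blocks: string[] } {
  const blocks: string[] = [];
  const text = mdx.replace(/```[\s\S]*?```/g, (block) => {
    blocks.push(block);
    return `@@CODE_BLOCK_${blocks.length - 1}@@`;
  });
  return { text, blocks };
}

function restoreCodeBlocks(translated: string, blocks: string[]): string {
  return translated.replace(/@@CODE_BLOCK_(\d+)@@/g, (_, i) => blocks[Number(i)]);
}
```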
Quality Metrics
Average Translation Quality: 91.3/100
Based on 22 production deployments across 13+ languages:
| Metric | PageTurner | Generic Tools |
|---|---|---|
| Term Consistency | 99.2% | 73% |
| Technical Accuracy | 94.1% | 68% |
| Natural Flow | 89.7% | 81% |
| Structure Preservation | 100% | 65% |
| Overall Quality | 91.3/100 | 65-70/100 |
Real-World Examples
WatermelonDB (62 pages, 3 languages):
- Quality score: 92.4/100
- Translation time: 20 minutes
- Components preserved: 100%
- Term consistency: 99.8%
Prettier (180 pages, 5 languages):
- Quality score: 90.8/100
- Translation time: 45 minutes
- Components preserved: 100%
- Term consistency: 99.1%
Performance & Scalabilityβ
Parallel Processingβ
- Up to 100 concurrent translation tasks
- Intelligent rate limiting (default: 1000 requests/minute)
- Smart chunking for LLM context limits
- Token optimization reduces costs by 30-40%
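Roughly, the concurrency cap and rate limit above combine like the sketch below (assumed defaults and helper names; the real scheduler is more sophisticated):

```ts
// Rough sketch: cap in-flight tasks and space out request starts.
async function runWithLimits<T>(
  tasks: Array<() => Promise<T>>,
  maxConcurrent = 100,
  requestsPerMinute = 1000
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  const minGapMs = 60_000 / requestsPerMinute; // time between request starts
  let nextIndex = 0;
  let lastStart = 0;

  async function worker(): Promise<void> {
    while (nextIndex < tasks.length) {
      const i = nextIndex++;
      const wait = Math.max(0, lastStart + minGapMs - Date.now());
      lastStart = Date.now() + wait; // reserve the next start slot
      if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
      results[i] = await tasks[i]();
    }
  }

  const workerCount = Math.min(maxConcurrent, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```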
Typical Translation Times
| Documentation Size | Languages | Time |
|---|---|---|
| 50 pages | 3 | 10-15 min |
| 100 pages | 3 | 20-30 min |
| 200 pages | 3 | 40-60 min |
| 100 pages | 10 | 60-90 min |
Updates (with translation memory): 2-5 minutes for typical changes
Multi-LLM Provider Strategy
PageTurner uses different AI models for different tasks:
| Task | Model | Why |
|---|---|---|
| Translation | Claude Sonnet 4 | Best quality, context awareness |
| Term Extraction | GPT-4 | Excellent at identifying key concepts |
| Validation | Claude Opus | Highest quality checks |
| Cost-Effective | DeepSeek V3 | 90% of quality at 20% cost |
Smart provider selection optimizes both quality and cost.
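Expressed as configuration, that routing table might look like the sketch below. The task keys and model identifiers are illustrative assumptions, not PageTurner's real config schema:

```ts
// Illustrative routing table; keys and model names are assumptions.
type PipelineTask = "translation" | "termExtraction" | "validation";

const providerRouting: Record<PipelineTask, { model: string; budgetModel?: string }> = {
  translation:    { model: "claude-sonnet-4", budgetModel: "deepseek-v3" },
  termExtraction: { model: "gpt-4" },
  validation:     { model: "claude-opus" },
};

function modelFor(task: PipelineTask, costSensitive = false): string {
  const entry = providerRouting[task];
  return costSensitive && entry.budgetModel ? entry.budgetModel : entry.model;
}
```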
What Makes PageTurner Different
| Feature | PageTurner | Generic Translation | Human Translation |
|---|---|---|---|
| Term Consistency | ✅ Perfect (99%+) | ❌ Inconsistent (70%) | ✅ Perfect (100%) |
| Context Awareness | ✅ Full document | ❌ Sentence-level | ✅ Full document |
| MDX/React Preservation | ✅ Native support | ❌ Breaks components | ⚠️ Manual effort |
| Translation Memory | ✅ Automatic, 60-80% savings | ❌ None | ⚠️ Manual CAT tools |
| Deployment Automation | ✅ GitHub + Vercel | ❌ Manual | ❌ Manual |
| Time to Deploy | ✅ Minutes | ❌ Hours | ❌ Weeks |
| Cost (100 pages, 3 languages) | ✅ $30 | ✅ $20 | ❌ $3,000+ |
| Quality | ✅ 91/100 | ❌ 65-70/100 | ✅ 95-98/100 |
PageTurner sweet spot: Near-human quality at 1% of the cost, 100× faster than human translation.
Next Steps
Now that you understand how PageTurner works:
- Try it yourself - Get your first translation running
- Configure advanced options - Customize the pipeline
- Join the Beta Program - Access early features and pricing
Questions? Check our FAQ or contact us.