Done

Dutch synonym expansion

Area: Search

Context

Problem: Dutch Bible text has many morphological variants and related word forms. Searching for "liefde" (love) does not find "liefhebben" (to love) or "bemind" (beloved), because FTS5's built-in stemming (the porter tokenizer) targets English and does not handle Dutch. Users miss relevant results.

Solution: A hand-curated dictionary of 14 key Bible terms, each mapped to prefix patterns covering its morphological variants and close synonyms. When a user searches in Dutch, known terms are automatically expanded into FTS5 OR groups covering all related word forms.

Not included: Automatic stemming, fuzzy matching, or typo correction. Only the 14 predefined term families are expanded.

Functional

When searching in Dutch, the system automatically expands known Bible terms to include related word forms.

Example: A search for "liefde" is sent to FTS5 as: (liefde* OR liefheb* OR bemin* OR lieflijk*)

14 covered term families: liefde (love), geloof (faith), zonde (sin), gebed (prayer), genade (grace), hoop (hope), vrede (peace), waarheid (truth), rechtvaardig (righteous), verlossing (redemption), heilig (holy), dood (death), opstanding (resurrection), koninkrijk (kingdom)

Edge cases:

  • Unknown words get simple prefix matching: word*
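  • Matching uses bidirectional startsWith: key.startsWith(word) || word.startsWith(key). A partial word such as "lief" and a longer form such as "liefdevol" therefore both match the key "liefde".
  • Short prefixes can cause false positives: any input that is a prefix of a key triggers that key's expansion (e.g., a search for "kon" is expanded to the "koninkrijk" family)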
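
The matching rule and its false positive can be reproduced in a few lines. This is a minimal sketch, not the project's code: findFamilyKey is a hypothetical helper, and the "first matching key wins" behaviour is an assumption made here for illustration.

```typescript
// The 14 family keys listed above, in the order given there.
const FAMILY_KEYS: string[] = [
  "liefde", "geloof", "zonde", "gebed", "genade", "hoop", "vrede",
  "waarheid", "rechtvaardig", "verlossing", "heilig", "dood",
  "opstanding", "koninkrijk",
];

// Hypothetical helper: returns the first key that matches the word
// in either direction, or undefined when no key matches.
function findFamilyKey(word: string): string | undefined {
  return FAMILY_KEYS.find(
    (key) => key.startsWith(word) || word.startsWith(key),
  );
}

findFamilyKey("liefdevol"); // "liefde"     (word starts with key)
findFamilyKey("kon");       // "koninkrijk" (key starts with word: false positive)
findFamilyKey("ge");        // "geloof"     (also a prefix of "gebed" and "genade")
findFamilyKey("mozes");     // undefined    (falls back to prefix matching)
```

The "ge" case shows that a short input can be a prefix of several keys at once; under the first-match assumption only one family would be expanded.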

UX & Design

Invisible to the user — expansion happens server-side. Users just type their query and get better results.

TODO: Show "Also searching for: [expanded terms]" in the UI when expansion occurs.
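
One way the TODO could be approached is to derive the label from the same patterns that were sent to FTS5. The sketch below is only an illustration: alsoSearchingFor is a hypothetical helper, and showing the bare stems (without the trailing *) is an assumption about what the UI would want.

```typescript
// Hypothetical helper: builds the UI label from the typed word and the
// prefix patterns it was expanded to. Returns null when nothing was added.
function alsoSearchingFor(typed: string, patterns: string[]): string | null {
  const stems = patterns
    .map((pattern) => pattern.replace(/\*$/, "")) // drop the FTS5 prefix marker
    .filter((stem) => stem !== typed);            // skip what the user already typed
  return stems.length > 0 ? `Also searching for: ${stems.join(", ")}` : null;
}

alsoSearchingFor("liefde", ["liefde*", "liefheb*", "bemin*", "lieflijk*"]);
// "Also searching for: liefheb, bemin, lieflijk"

alsoSearchingFor("mozes", ["mozes*"]);
// null (no expansion happened, so nothing to show)
```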

Technical

Dictionary (dutch-synonyms.ts):

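// Maps each family key to the FTS5 prefix patterns it expands to.
const DUTCH_SYNONYMS: Record<string, string[]> = {
  liefde: ['liefde*', 'liefheb*', 'bemin*', 'lieflijk*'],
  geloof: ['geloof*', 'gelov*', 'vertrouw*'],
  // ... 12 more
};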

expandDutchQuery(query):

  1. Splits query on whitespace
  2. Checks each word against dictionary (bidirectional startsWith)
  3. Matches → (syn1 OR syn2 OR ...)
  4. No match → word* (prefix)
  5. Joins all terms with spaces (implicit AND in FTS5)
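
The five steps above can be sketched as follows. This is a minimal reconstruction from the description, not the contents of dutch-synonyms.ts: the dictionary holds only the two families shown in the excerpt, and lowercasing the input and taking the first matching key are assumptions.

```typescript
// Only the two families shown in the excerpt; the remaining 12 are elided.
const DUTCH_SYNONYMS: Record<string, string[]> = {
  liefde: ["liefde*", "liefheb*", "bemin*", "lieflijk*"],
  geloof: ["geloof*", "gelov*", "vertrouw*"],
};

function expandDutchQuery(query: string): string {
  return query
    .trim()
    .toLowerCase() // assumption: dictionary keys are lowercase
    .split(/\s+/) // step 1: split on whitespace
    .filter((word) => word.length > 0)
    .map((word) => {
      // step 2: bidirectional startsWith against every key (first match wins)
      const key = Object.keys(DUTCH_SYNONYMS).find(
        (k) => k.startsWith(word) || word.startsWith(k),
      );
      const synonyms = key ? DUTCH_SYNONYMS[key] : undefined;
      // step 3: match becomes an OR group; step 4: no match becomes word*
      return synonyms ? `(${synonyms.join(" OR ")})` : `${word}*`;
    })
    .join(" "); // step 5: space-separated terms are an implicit AND in FTS5
}

expandDutchQuery("liefde god");
// "(liefde* OR liefheb* OR bemin* OR lieflijk*) god*"
```

The sketch does not escape FTS5 syntax characters in user input; it only mirrors the steps listed.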

Integration: Called from searchVerses() when lang === 'nl'.
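
A rough sketch of how that language check might sit in front of the FTS5 query. Everything here is hypothetical: buildVerseSearch, the verses_fts table name, and passing non-Dutch queries through unchanged are assumptions, not the actual searchVerses() implementation.

```typescript
type Expander = (query: string) => string;

// Hypothetical: builds a parameterized FTS5 query. The table name is a
// placeholder, not the project's schema.
function buildVerseSearch(
  query: string,
  lang: string,
  expandDutch: Expander,
): { sql: string; params: string[] } {
  // Only Dutch queries are expanded; other languages pass through as typed.
  const match = lang === "nl" ? expandDutch(query) : query;
  return {
    sql: "SELECT rowid FROM verses_fts WHERE verses_fts MATCH ?",
    params: [match],
  };
}
```

Passing the expander in as a parameter keeps the sketch self-contained; in the project, search.ts calls expandDutchQuery() directly.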

Files:

  • apps/web/src/lib/server/queries/dutch-synonyms.ts — dictionary + expansion function
  • apps/web/src/lib/server/queries/search.ts — calls expandDutchQuery() for Dutch searches

Status

Current: IN_PROGRESS
Milestone: Foundation
Priority: Medium (significantly improves Dutch search quality)

What's done:

  • 14 term families curated and implemented
  • Integrated into search pipeline
  • Working for all covered terms

What remains:

  • Only 14 terms covered — dictionary is sparse
  • No UI indication that expansion occurred
  • No fuzzy/typo matching
  • Short prefix false positives not addressed

Dependencies:

  • Requires: full-text search (DONE)