Referenced from ideas.md → Personal Translation Builder
Three bugs in the original ShortGloss matching were identified and fixed:
The entry "far (passed, spent)" was being split into 3 items instead of 1, which pushed "many" for G4183 past the 8-item cap.
Fix: Paren-aware comma splitter; cap raised from 8 to 12.
γάρ candidates ["and","as","because","but","even","for"...]: "and" came first in the list and also appeared in the sentence (at position 12), so it was returned instead of "for" (the correct positional match).
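A minimal sketch of what a paren-aware comma splitter could look like. The function name `split_glosses` and the sample input string are hypothetical; only the behavior (commas inside parentheses do not split, cap of 12) comes from the note above.

```python
def split_glosses(text: str, cap: int = 12) -> list[str]:
    """Split on commas that sit outside parentheses, then apply the item cap."""
    items: list[str] = []
    current: list[str] = []
    depth = 0  # current parenthesis nesting level
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray closing paren
        if ch == "," and depth == 0:
            # Top-level comma: close the current item.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())
    # Drop empty items (e.g. from a trailing comma) and enforce the cap.
    return [item for item in items if item][:cap]


# Hypothetical input: the parenthesized entry stays in one piece.
glosses = split_glosses("far (passed, spent), many")
```

With a naive `text.split(",")` the same input would yield three fragments, which is the behavior the fix removes.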
Fix: Position-aware second-pass refinement assigns each ShortGloss to the candidate closest to its expected English position.
Fix: Safety-net insert added so the highlighted meaning is always visible.
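The position-aware refinement can be sketched roughly as follows. This is an illustration, not the actual implementation: `pick_gloss` is a hypothetical name, and the way the "expected English position" is computed here (the Greek word's relative index scaled to the English word count) is an assumption. The sample sentence is modeled on Matt 24:5, and the Greek index and word count are illustrative values.

```python
import re


def pick_gloss(candidates, sentence, greek_index, greek_count):
    """Return the candidate whose occurrence in the sentence lies closest
    to the position where the Greek word is expected to land."""
    words = re.findall(r"[a-z]+", sentence.lower())
    if not words or greek_count <= 0:
        return None
    # Assumption: expected position is proportional to the Greek word index.
    expected = greek_index / greek_count * len(words)
    best = None  # (distance, candidate)
    for cand in candidates:
        for pos, word in enumerate(words):
            if word == cand.lower():
                dist = abs(pos - expected)
                if best is None or dist < best[0]:
                    best = (dist, cand)
    return best[1] if best else None


sentence = "For many will come in My name, claiming, 'I am the Christ,' and will deceive many."
candidates = ["and", "as", "because", "but", "even", "for"]
# γάρ is the second Greek word (index 1) of an assumed 15-word verse.
choice = pick_gloss(candidates, sentence, greek_index=1, greek_count=15)
```

A first-match strategy would return "and" here because it comes first in the candidate list and occurs in the sentence; comparing distances favors "for", which sits at the start of the sentence, near where γάρ is expected.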
BSB/KJV translation divergence: λέγοντες (G3004) is "claiming" in BSB but "say" in KJV usage. No amount of matching can bridge this without BSB-specific word-level alignment data.
Fix: bereanbible.com/bsb_tables.tsv (CC0/public domain) was imported via 10_import_bsb_alignment.py. 112,747 NT Greek words now carry an authoritative BSB gloss in greek_words.bsb_gloss. PopupBuilder uses this as the primary ShortGloss, falling back to the KJV heuristic only when the gloss is null (TR-only/untranslated words).
Result: λέγοντες (Matt 24:5) → "claiming" ✓, γάρ → "For" ✓.
Investigated why some Greek words appear in the interlinear with no visible English counterpart. Two root causes identified:
Greek externalizes relationships as standalone words that English handles implicitly:
E.g. αὐτῷ ("to him", indirect object of ἐπιδεῖξαι) has bsb_gloss = NULL because BSB renders "came up to Him to point out its buildings" — the "to him" is already implied by the verb phrase and not given its own token.
When this happens, the KJV heuristic fires and produces wrong results (e.g. "the other" for αὐτῷ — a valid but rare KJV rendering of αὐτός that happens to score best against the sentence).
PopupBuilder now detects "absorbed" words: if the verse has BSB data but a word's gloss is null → the word was deliberately not tokenized by BSB.
Visual treatment:
Absorbed words: dim (Theme.AbsorbedGloss)
KJV guesses: muted amber (Theme.GuessGloss)
Color key: gold = authoritative BSB | muted amber = KJV guess | dim = absorbed/alternatives
Impact: Fixes "the other" for αὐτῷ (Matt 24:1) and ~28,973 other null-gloss words.
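The absorbed-word rule can be illustrated with a small sketch. The names `Word`, `classify`, and `kjv_guess` are hypothetical stand-ins, and the real PopupBuilder may be written in a different language; only the decision order (BSB gloss first, then "absorbed" if the verse has any BSB data, otherwise the KJV heuristic) is taken from the notes above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Word:
    greek: str
    bsb_gloss: Optional[str]  # None when BSB gives this word no token


def classify(word: Word, verse_words: list[Word],
             kjv_guess: Callable[[Word], str]) -> tuple[str, Optional[str]]:
    """Return (kind, gloss), where kind drives the visual treatment."""
    if word.bsb_gloss:
        return ("authoritative", word.bsb_gloss)  # gold
    # The verse has BSB data but this word has none: treat it as absorbed.
    if any(w.bsb_gloss for w in verse_words):
        return ("absorbed", None)  # dim
    # No BSB data for the verse at all: fall back to the KJV heuristic.
    return ("guess", kjv_guess(word))  # muted amber


# Illustrative verse fragment; the glosses shown are sample values.
verse = [Word("ἐπιδεῖξαι", "to point out"), Word("αὐτῷ", None)]
kind, gloss = classify(verse[1], verse, kjv_guess=lambda w: "the other")
```

Checking the verse before calling the heuristic is what keeps a low-quality guess such as "the other" from being shown for a word that BSB folded into a neighboring phrase.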
Two issues discovered when the "Your translation" footer was first tested:
Hebrew lacks BSB word-level alignment; ShortGloss comes from Strong's dictionary definitions (grammatical tags like "(Qal)", generic entries like "day"). These are not verse-specific enough to compose sentences.
Fix: Hebrew footer shows disclaimer instead of composed sentence.
Greek clause order differs from English (e.g. Mark 3:2: Greek puts ἵνα κατηγορήσωσιν "In order to accuse" at the end, while BSB moves it to the front).
Fix: Reorder composed parts by finding each BsbGloss's character position in the English sentence text. Uses closest-match heuristic for words appearing multiple times (e.g. "the"). Falls back to proportional position estimate for words without a match.
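A rough sketch of the reordering step, under stated assumptions: `reorder_parts` and `find_all` are hypothetical names, matching is a plain case-insensitive substring search, and the "proportional position estimate" is taken to be the part's relative index scaled to the sentence length. The example sentence is made up for illustration and is not a verse quotation.

```python
def find_all(haystack: str, needle: str) -> list[int]:
    """Return every start index of needle in haystack."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


def reorder_parts(parts: list[str], sentence: str) -> list[str]:
    """Sort glosses (given in Greek order) by where they occur in the English text."""
    text = sentence.lower()
    keyed = []
    for i, part in enumerate(parts):
        # Proportional estimate: used to break ties and as the fallback position.
        expected = i / max(len(parts), 1) * len(text)
        hits = find_all(text, part.lower()) if part else []
        # Repeated words: take the occurrence closest to the estimate.
        pos = min(hits, key=lambda h: abs(h - expected)) if hits else expected
        keyed.append((pos, i, part))
    keyed.sort()  # by position, then by original index
    return [part for _, _, part in keyed]


greek_order = ["they were watching", "Him", "In order to accuse"]
english = "In order to accuse Jesus, they were watching Him."
ordered = reorder_parts(greek_order, english)
```

Substring search can match inside longer words (for example "him" inside "himself"), so a word-boundary match may be a safer choice in practice.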