Frequently Asked at Google — This problem has been reported multiple times in recent Google L4 interviews.

Word Break I and II (LC 139 / 140) ​

Problem Statement ​

Word Break I (LC 139, Medium): Given a string s and a list of words wordDict, return true if s can be segmented into a space-separated sequence of dictionary words.

Input:  s = "leetcode", wordDict = ["leet","code"]
Output: true

Word Break II (LC 140, Hard): Return all possible sentences — every way s can be broken into dictionary words.

Input:  s = "catsanddog", wordDict = ["cat","cats","and","sand","dog"]
Output: ["cats and dog", "cat sand dog"]

Constraints:

  • 1 <= s.length <= 300
  • 1 <= wordDict.length <= 1000
  • Words are unique.

Difficulty: Medium / Hard


Pattern Recognition ​

  • Input structure: a string and a dictionary.
  • What are we computing? existence (I) or enumeration (II) of segmentations.
  • Pattern:
    • LC 139 → 1D DP on the string (dp[i] = true if s[0..i] is segmentable).
    • LC 140 → DFS with memoization keyed by suffix; the memo stores all sentences generated from that suffix.
  • Why: the decomposition has overlapping subproblems (many positions want to know "can s[i..] be split?"). DP collapses these. For enumeration, you must actually return the trees of splits, hence memoized recursion.

Approach ​

Brute Force — Recursion ​

Try every prefix of s. If it's in the dictionary, recurse on the suffix. Exponential worst case because the same suffix is explored many times.
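The brute force described above can be sketched as follows — a minimal recursion, not the final solution (the helper name `canBreakNaive` is illustrative, not from the original):

```typescript
// Naive recursion for LC 139: try every dictionary-word prefix, recurse on the rest.
// Exponential in the worst case because the same suffix is re-explored.
function canBreakNaive(s: string, wordDict: string[]): boolean {
    const dict = new Set(wordDict);
    const go = (i: number): boolean => {
        if (i === s.length) return true;                 // consumed the whole string
        for (let end = i + 1; end <= s.length; end++) {
            // If s[i..end) is a word and the suffix splits, the whole suffix splits.
            if (dict.has(s.substring(i, end)) && go(end)) return true;
        }
        return false;                                     // no prefix worked: dead end
    };
    return go(0);
}
```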

Optimal Word Break I — 1D DP ​

Key Insight: dp[i] is true iff there is some j < i where dp[j] is true and s[j..i] is in the dictionary.

  1. Put dictionary into a hash set for O(1) lookup.
  2. dp[0] = true (empty prefix is segmentable).
  3. For each i from 1 to n:
    • For each j from 0 to i-1, if dp[j] and s[j..i] ∈ dict, set dp[i] = true and break.
  4. Return dp[n].

Time: O(n^2) (i, j) pairs, each paying O(L) to hash the substring (L = max word length); trimming the inner loop to the last L positions cuts this to O(n · L) checks, which is effectively O(n^2) or better for these constraints. Space: O(n).

Optimal Word Break II — Memoized DFS ​

Key Insight: the answer for suffix s[i..] is independent of how you got to i. Cache sentences by start index.

solve(i):
  if i == n: return [""]            // one empty sentence
  if i in memo: return memo[i]
  sentences = []
  for each word w in dict starting at i:
      for tail in solve(i + len(w)):
          sentences.push(w + (tail.empty ? "" : " " + tail))
  memo[i] = sentences
  return sentences

Time: O(n^2 * 2^n) worst case (exponential number of sentences) — but memoization keeps it practical for the given constraints. Space: O(n * total answers).

To avoid wasted work on unbreakable strings, run LC 139 first and early-exit if false.
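One way that precheck could look — a sketch combining the two passes in a single function (the names `wordBreakAllGuarded` and the inline dp pass are assumptions, not canonical):

```typescript
// Sketch: gate the LC 140 enumeration behind an LC 139 feasibility pass.
function wordBreakAllGuarded(s: string, wordDict: string[]): string[] {
    const dict = new Set(wordDict);
    const n = s.length;

    // LC 139 pass: dp[i] = true iff s[0..i) is segmentable.
    const dp: boolean[] = new Array(n + 1).fill(false);
    dp[0] = true;
    for (let i = 1; i <= n; i++) {
        for (let j = 0; j < i && !dp[i]; j++) {
            if (dp[j] && dict.has(s.substring(j, i))) dp[i] = true;
        }
    }
    if (!dp[n]) return [];  // early exit: no sentence exists, skip the DFS entirely

    // LC 140 pass: memoized DFS over suffixes, as in the pseudocode above.
    const memo = new Map<number, string[]>();
    const solve = (start: number): string[] => {
        if (start === n) return [""];
        if (memo.has(start)) return memo.get(start)!;
        const sentences: string[] = [];
        for (let end = start + 1; end <= n; end++) {
            const word = s.substring(start, end);
            if (!dict.has(word)) continue;
            for (const tail of solve(end)) {
                sentences.push(tail === "" ? word : word + " " + tail);
            }
        }
        memo.set(start, sentences);
        return sentences;
    };
    return solve(0);
}
```

On unbreakable inputs like `"aaa...b"` the guard returns `[]` without touching the enumeration at all.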


Walkthrough ​

LC 139 on s = "applepenapple", dict = {"apple", "pen"}.

| i  | Check                               | dp[i] |
|----|-------------------------------------|-------|
| 0  | base case                           | true  |
| 5  | s[0..5] = "apple" ∈ dict ∧ dp[0]    | true  |
| 8  | s[5..8] = "pen" ∈ dict ∧ dp[5]      | true  |
| 13 | s[8..13] = "apple" ∈ dict ∧ dp[8]   | true  |

Return dp[13] = true.
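The table above can be reproduced by printing the dp array directly — a minimal sketch (the helper name `dpTable` is hypothetical):

```typescript
// Compute the LC 139 dp array so the walkthrough indices can be inspected.
function dpTable(s: string, wordDict: string[]): boolean[] {
    const dict = new Set(wordDict);
    const dp: boolean[] = new Array(s.length + 1).fill(false);
    dp[0] = true;                                   // empty prefix is segmentable
    for (let i = 1; i <= s.length; i++) {
        for (let j = 0; j < i; j++) {
            if (dp[j] && dict.has(s.substring(j, i))) { dp[i] = true; break; }
        }
    }
    return dp;
}

const dp = dpTable("applepenapple", ["apple", "pen"]);
// dp is true exactly at indices 0, 5, 8, 13 — the rows of the table above.
```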

LC 140 on "catsanddog": the DFS from index 0 tries "cat" (go to 3) and "cats" (go to 4). From 3, "sand" leads to 7; from 4, "and" leads to 7. From 7, only "dog" matches. So both branches produce "dog" as the tail, giving the two sentences.


Implementation ​

```typescript
// LC 139
function wordBreak(s: string, wordDict: string[]): boolean {
    const dict = new Set(wordDict);
    const n = s.length;
    const dp: boolean[] = new Array(n + 1).fill(false);
    dp[0] = true;
    const maxWord = Math.max(...wordDict.map(w => w.length));
    for (let i = 1; i <= n; i++) {
        for (let j = Math.max(0, i - maxWord); j < i; j++) {
            if (dp[j] && dict.has(s.substring(j, i))) {
                dp[i] = true;
                break;
            }
        }
    }
    return dp[n];
}

// LC 140
function wordBreakAll(s: string, wordDict: string[]): string[] {
    const dict = new Set(wordDict);
    const memo = new Map<number, string[]>();

    const solve = (start: number): string[] => {
        if (start === s.length) return [""];
        if (memo.has(start)) return memo.get(start)!;

        const sentences: string[] = [];
        for (let end = start + 1; end <= s.length; end++) {
            const word = s.substring(start, end);
            if (!dict.has(word)) continue;
            for (const tail of solve(end)) {
                sentences.push(tail === "" ? word : word + " " + tail);
            }
        }
        memo.set(start, sentences);
        return sentences;
    };

    return solve(0);
}
```

```cpp
// LC 139
bool wordBreak(string s, vector<string>& wordDict) {
    unordered_set<string> dict(wordDict.begin(), wordDict.end());
    int n = s.size();
    vector<bool> dp(n + 1, false);
    dp[0] = true;
    int maxWord = 0;
    for (auto& w : wordDict) maxWord = max(maxWord, (int)w.size());
    for (int i = 1; i <= n; i++) {
        for (int j = max(0, i - maxWord); j < i; j++) {
            if (dp[j] && dict.count(s.substr(j, i - j))) {
                dp[i] = true;
                break;
            }
        }
    }
    return dp[n];
}

// LC 140
class WordBreakII {
    unordered_set<string> dict;
    unordered_map<int, vector<string>> memo;
    string src;

    vector<string> solve(int start) {
        if (start == (int)src.size()) return {""};
        auto it = memo.find(start);
        if (it != memo.end()) return it->second;

        vector<string> result;
        for (int end = start + 1; end <= (int)src.size(); end++) {
            string word = src.substr(start, end - start);
            if (!dict.count(word)) continue;
            for (auto& tail : solve(end)) {
                result.push_back(tail.empty() ? word : word + " " + tail);
            }
        }
        return memo[start] = result;
    }

public:
    vector<string> wordBreak(string s, vector<string>& wordDict) {
        dict = unordered_set<string>(wordDict.begin(), wordDict.end());
        src = s;
        return solve(0);
    }
};
```

Watch out:

  1. In LC 140, forgetting the memo makes adversarial inputs like "aaaaa...b" blow up exponentially.
  2. Concatenating the leading space when tail is empty creates trailing whitespace — guard against it.
  3. substring(j, i) copies the substring each call; on very long strings, a rolling hash or trie cuts that cost.

Complexity Analysis ​

| Approach             | Time                               | Space            |
|----------------------|------------------------------------|------------------|
| LC 139 brute         | O(2^n)                             | O(n)             |
| LC 139 DP            | O(n * L) (L = max word length)     | O(n)             |
| LC 140 memoized DFS  | O(n^2 * answers)                   | O(n * answers)   |

Bottleneck: LC 140 is exponential in the number of valid sentences — no way around that because you must produce them all.


Edge Cases ​

  1. Empty string — dp[0] = true by definition; the constraints guarantee s is nonempty, but the base case still matters: solve(n) in LC 140 must return [""] (one empty sentence), not [], or every sentence is lost.
  2. Word appears multiple times — both solutions handle it because dict is a set.
  3. Dictionary words longer than s — the end <= n bound protects you.
  4. Adversarial inputs for LC 140 — "aaaaaaaab" with dict of all "a" prefixes but no b — memoization stops after detecting the dead end at the b.
  5. Duplicate sentences — cannot occur because dict has unique entries.
  6. Very long single word — capping the inner loop at the maximum dictionary-word length is an important optimization.

Related Problems ​

  • LC 472 — Concatenated Words — Runs Word Break for each word in a list; same DP.
  • LC 91 — Decode Ways — Number of segmentations into valid codes; same DP style.
  • LC 132 — Palindrome Partitioning II — dp[i] = min cuts; different objective, identical shape.
  • LC 212 — Word Search II — Dictionary + grid; uses a trie, great follow-up if the interviewer pushes harder on dictionary indexing.

What Is Expected at Each Level ​

Junior: Solve LC 139 with the standard DP, articulate the recurrence, and explain why a set is used.

Mid-level: Cap the inner loop at maxWord, handle LC 140 with memoized DFS, hit both correctness and complexity.

Senior (L4 target): Use a trie for prefix lookups (crucial when wordDict is huge), discuss the exponential-output reality of LC 140, short-circuit with a Word Break I precheck, and articulate why the memo key is the start index and nothing else.
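One way the trie idea could look — a sketch, assuming a plain child-map trie (`TrieNode` and `wordBreakTrie` are illustrative names, not from the solutions above):

```typescript
// Trie-backed LC 139: from each reachable index i, walk the trie along s[i..]
// character by character instead of hashing a fresh substring for every (j, i).
class TrieNode {
    children = new Map<string, TrieNode>();
    isWord = false;
}

function wordBreakTrie(s: string, wordDict: string[]): boolean {
    const root = new TrieNode();
    for (const w of wordDict) {             // build the trie once
        let node = root;
        for (const ch of w) {
            if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
            node = node.children.get(ch)!;
        }
        node.isWord = true;
    }

    const n = s.length;
    const dp: boolean[] = new Array(n + 1).fill(false);
    dp[0] = true;
    for (let i = 0; i < n; i++) {
        if (!dp[i]) continue;               // only extend reachable positions
        let node: TrieNode | undefined = root;
        for (let k = i; k < n; k++) {
            node = node.children.get(s[k]);
            if (!node) break;               // no dictionary word has this prefix
            if (node.isWord) dp[k + 1] = true;
        }
    }
    return dp[n];
}
```

The walk stops as soon as no dictionary word shares the current prefix, which is where the savings over per-substring hashing come from when wordDict is large.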

Frontend interview preparation reference.