Frequently Asked at Google — This problem has been reported multiple times in recent Google L4 interviews.
# Word Break I and II (LC 139 / 140)

## Problem Statement
**Word Break I (LC 139, Medium):** Given a string `s` and a list of words `wordDict`, return `true` if `s` can be segmented into a space-separated sequence of dictionary words.

```
Input: s = "leetcode", wordDict = ["leet","code"]
Output: true
```

**Word Break II (LC 140, Hard):** Return all possible sentences — every way `s` can be broken into dictionary words.

```
Input: s = "catsanddog", wordDict = ["cat","cats","and","sand","dog"]
Output: ["cats and dog", "cat sand dog"]
```

Constraints:
- 1 <= s.length <= 300
- 1 <= wordDict.length <= 1000
- Words are unique.
Difficulty: Medium / Hard
## Pattern Recognition
- Input structure: a string and a dictionary.
- What are we computing? existence (I) or enumeration (II) of segmentations.
- Pattern:
  - LC 139 → 1D DP on the string (`dp[i] = true` if `s[0..i]` is segmentable).
  - LC 140 → DFS with memoization keyed by suffix; the memo stores all sentences generated from that suffix.
- Why: the decomposition has overlapping subproblems (many positions want to know "can `s[i..]` be split?"). DP collapses these. For enumeration, you must actually return the trees of splits, hence memoized recursion.
## Approach

### Brute Force — Recursion
Try every prefix of s. If it's in the dictionary, recurse on the suffix. Exponential worst case because the same suffix is explored many times.
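A minimal sketch of this brute force (the helper name `canBreak` is illustrative, not from the solutions below):

```typescript
// Brute force: try every dictionary word as a prefix and recurse on the
// rest. The same suffix is re-explored many times, so the worst case is
// exponential.
function canBreak(s: string, wordDict: string[]): boolean {
  if (s.length === 0) return true; // consumed the whole string: success
  for (const w of wordDict) {
    if (s.startsWith(w) && canBreak(s.slice(w.length), wordDict)) {
      return true;
    }
  }
  return false;
}
```

`canBreak("leetcode", ["leet", "code"])` returns true, while a near-miss like `"catsandog"` forces backtracking through every prefix combination before returning false.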
### Optimal Word Break I — 1D DP
Key Insight: `dp[i]` is true iff there is some `j < i` where `dp[j]` is true and `s[j..i]` is in the dictionary.

- Put the dictionary into a hash set for O(1) lookup.
- `dp[0] = true` (the empty prefix is segmentable).
- For each `i` from 1 to `n`:
  - For each `j` from 0 to `i-1`: if `dp[j]` and `s[j..i]` ∈ dict, set `dp[i] = true` and break.
- Return `dp[n]`.
Time: O(n^2) (i, j) pairs, each paying O(L) to hash the substring (L = max word length); capping the inner loop at the max word length cuts this to O(n · L) substring checks. Space: O(n).
### Optimal Word Break II — Memoized DFS
Key Insight: the answer for suffix `s[i..]` is independent of how you got to `i`. Cache sentences by start index.
```
solve(i):
    if i == n: return [""]          // one empty sentence
    if i in memo: return memo[i]
    sentences = []
    for each word w in dict starting at i:
        for tail in solve(i + len(w)):
            sentences.push(w + (tail.empty ? "" : " " + tail))
    memo[i] = sentences
    return sentences
```

Time: O(n^2 * 2^n) worst case (the number of sentences can be exponential) — but memoization keeps it practical for the given constraints. Space: O(n * total answers).
To avoid wasted work on unbreakable strings, run LC 139 first and early-exit if false.
## Walkthrough
LC 139 on s = "applepenapple", dict = {"apple", "pen"}.
| i | Checks | dp[i] |
|---|---|---|
| 0 | base | true |
| 5 | s[0..5]="apple" ∈ dict ∧ dp[0] | true |
| 8 | s[5..8]="pen" ∈ dict ∧ dp[5] | true |
| 13 | s[8..13]="apple" ∈ dict ∧ dp[8] | true |
Return dp[13] = true.
LC 140 on "catsanddog": the DFS from index 0 tries "cat" (go to 3) and "cats" (go to 4). From 3, "sand" leads to 7; from 4, "and" leads to 7. From 7, only "dog" matches. So both branches produce "dog" as the tail, giving the two sentences.
## Implementation
```typescript
// LC 139
function wordBreak(s: string, wordDict: string[]): boolean {
  const dict = new Set(wordDict);
  const n = s.length;
  const dp: boolean[] = new Array(n + 1).fill(false);
  dp[0] = true; // empty prefix is segmentable
  const maxWord = Math.max(...wordDict.map(w => w.length));
  for (let i = 1; i <= n; i++) {
    // only look back at most maxWord characters
    for (let j = Math.max(0, i - maxWord); j < i; j++) {
      if (dp[j] && dict.has(s.substring(j, i))) {
        dp[i] = true;
        break;
      }
    }
  }
  return dp[n];
}

// LC 140
function wordBreakAll(s: string, wordDict: string[]): string[] {
  const dict = new Set(wordDict);
  const memo = new Map<number, string[]>();
  const solve = (start: number): string[] => {
    if (start === s.length) return [""];
    if (memo.has(start)) return memo.get(start)!;
    const sentences: string[] = [];
    for (let end = start + 1; end <= s.length; end++) {
      const word = s.substring(start, end);
      if (!dict.has(word)) continue;
      for (const tail of solve(end)) {
        sentences.push(tail === "" ? word : word + " " + tail);
      }
    }
    memo.set(start, sentences);
    return sentences;
  };
  return solve(0);
}
```

```cpp
// LC 139
bool wordBreak(string s, vector<string>& wordDict) {
    unordered_set<string> dict(wordDict.begin(), wordDict.end());
    int n = s.size();
    vector<bool> dp(n + 1, false);
    dp[0] = true;
    int maxWord = 0;
    for (auto& w : wordDict) maxWord = max(maxWord, (int)w.size());
    for (int i = 1; i <= n; i++) {
        for (int j = max(0, i - maxWord); j < i; j++) {
            if (dp[j] && dict.count(s.substr(j, i - j))) {
                dp[i] = true;
                break;
            }
        }
    }
    return dp[n];
}

// LC 140
class WordBreakII {
    unordered_set<string> dict;
    unordered_map<int, vector<string>> memo;
    string src;

    vector<string> solve(int start) {
        if (start == (int)src.size()) return {""};
        auto it = memo.find(start);
        if (it != memo.end()) return it->second;
        vector<string> result;
        for (int end = start + 1; end <= (int)src.size(); end++) {
            string word = src.substr(start, end - start);
            if (!dict.count(word)) continue;
            for (auto& tail : solve(end)) {
                result.push_back(tail.empty() ? word : word + " " + tail);
            }
        }
        return memo[start] = result;
    }

public:
    vector<string> wordBreak(string s, vector<string>& wordDict) {
        dict = unordered_set<string>(wordDict.begin(), wordDict.end());
        src = s;
        return solve(0);
    }
};
```

Watch out:
- In LC 140, forgetting the memo makes adversarial inputs like `"aaaaa...b"` blow up exponentially.
- Concatenating the joining space when `tail` is empty leaves a trailing space — guard against it with the ternary.
- `substring(j, i)` copies the substring on each call; on very long strings, a rolling hash or a trie cuts that cost.
## Complexity Analysis
| Approach | Time | Space |
|---|---|---|
| LC 139 brute | O(2^n) | O(n) |
| LC 139 DP | O(n * L) (L = max word length) | O(n) |
| LC 140 memoized DFS | O(n^2 * answers) | O(n * answers) |
Bottleneck: LC 140 is exponential in the number of valid sentences — no way around that because you must produce them all.
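To make the exponential-output claim concrete, here is a small counting variant (not part of the solutions above; `countBreaks` is an illustrative name) that tallies segmentations without materializing them — for `s` made of repeated `"a"` with dict `{"a", "aa"}`, the count follows the Fibonacci recurrence:

```typescript
// Count segmentations with the same DP shape as LC 139, but summing
// instead of OR-ing: dp[i] = number of ways to segment s[0..i).
function countBreaks(s: string, wordDict: string[]): number {
  const dict = new Set(wordDict);
  const n = s.length;
  const dp: number[] = new Array(n + 1).fill(0);
  dp[0] = 1; // one way to segment the empty prefix
  for (let i = 1; i <= n; i++) {
    for (let j = 0; j < i; j++) {
      if (dict.has(s.substring(j, i))) dp[i] += dp[j];
    }
  }
  return dp[n];
}
```

`countBreaks("aaaa", ["a", "aa"])` is 5 (the Fibonacci pattern), and `countBreaks("catsanddog", ["cat", "cats", "and", "sand", "dog"])` is 2, matching the two sentences in the walkthrough.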
## Edge Cases
- Empty string — `dp[0] = true`; LC 140 returns `[""]` or `[]` depending on spec (LeetCode returns `[""]` for some variants, empty for others).
- Word appears multiple times in `wordDict` — both solutions handle it because the dict is a set.
- Dictionary words longer than `s` — the `end <= n` bound protects you.
- Adversarial inputs for LC 140 — `"aaaaaaaab"` with a dict of all-`"a"` prefixes but no `"b"`: memoization stops the blow-up after detecting the dead end at the `b`.
- Duplicate sentences — cannot occur because the dict has unique entries.
- Very long single word — capping the inner loop at the maximum word length is an important optimization.
## Related Problems
- LC 472 — Concatenated Words — Runs Word Break for each word in a list; same DP.
- LC 91 — Decode Ways — Number of segmentations into valid codes; same DP style.
- LC 132 — Palindrome Partitioning II — `dp[i]` = min cuts; different objective, identical shape.
- LC 212 — Word Search II — dictionary + grid; uses a trie, a great follow-up if the interviewer pushes harder on dictionary indexing.
## What Is Expected at Each Level
Junior: Solve LC 139 with the standard DP, articulate the recurrence, and explain why a set is used.
Mid-level: Cap the inner loop at maxWord, handle LC 140 with memoized DFS, hit both correctness and complexity.
Senior (L4 target): Use a trie for prefix lookups (crucial when wordDict is huge), discuss the exponential-output reality of LC 140, short-circuit with a Word Break I precheck, and articulate why the memo key is the start index and nothing else.
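A sketch of the trie idea mentioned above (the names `TrieNode` and `wordBreakTrie` are illustrative, and this is one possible shape, not the canonical implementation): instead of hashing a fresh substring for every `(j, i)` pair, walk the trie forward from each reachable DP position, so each one-character extension costs O(1) and no substrings are copied.

```typescript
class TrieNode {
  children = new Map<string, TrieNode>();
  isWord = false;
}

function wordBreakTrie(s: string, wordDict: string[]): boolean {
  // Build the trie over the dictionary.
  const root = new TrieNode();
  for (const w of wordDict) {
    let node = root;
    for (const ch of w) {
      if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
      node = node.children.get(ch)!;
    }
    node.isWord = true;
  }
  const n = s.length;
  const dp: boolean[] = new Array(n + 1).fill(false);
  dp[0] = true;
  for (let i = 0; i < n; i++) {
    if (!dp[i]) continue; // only extend from reachable positions
    let node = root;
    for (let j = i; j < n; j++) {
      const next = node.children.get(s[j]);
      if (!next) break; // no dictionary word continues this prefix
      node = next;
      if (node.isWord) dp[j + 1] = true; // a word ends at j
    }
  }
  return dp[n];
}
```

The win over the hash-set DP is that a failed lookup stops as soon as no dictionary word shares the current prefix, which matters when `wordDict` is huge.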