Skip to content

Word Break I & II ​

Problem Statement ​

Word Break I (LC 139) ​

Given a string s and a dictionary of words wordDict, return true if s can be segmented into a space-separated sequence of one or more dictionary words.

Example:

Input: s = "leetcode", wordDict = ["leet", "code"]
Output: true
Explanation: "leetcode" = "leet" + "code"

Word Break II (LC 140) ​

Given the same input, return all possible sentences formed by segmenting s into dictionary words.

Example:

Input: s = "catsanddog", wordDict = ["cat","cats","and","sand","dog"]
Output: ["cats and dog", "cat sand dog"]

Constraints:

  • 1 <= s.length <= 300 (Word Break I), 1 <= s.length <= 20 (Word Break II)
  • 1 <= wordDict.length <= 1000
  • 1 <= wordDict[i].length <= 20
  • All strings consist of lowercase English letters.

Difficulty: Medium (I), Hard (II)


Pattern Recognition ​

  • What is the input structure? A string to partition, with a set of valid words.
  • What are we optimizing? Part I: can the string be segmented? (decision problem). Part II: enumerate all valid segmentations.
  • What pattern does this fit? Dynamic Programming (Part I) and Backtracking with Memoization (Part II).
  • Why does this pattern fit? Whether s[0..i] can be segmented depends on whether some prefix s[0..j] can be segmented AND s[j..i] is a word. This is textbook overlapping subproblems.

Approach ​

Part I — Word Break ​

Brute Force — Recursion ​

Try every possible first word, recursively check the rest.

function wordBreak(s, wordDict):
    if s is empty: return true
    for word in wordDict:
        if s starts with word:
            if wordBreak(s[len(word):], wordDict): return true
    return false

Time: O(2^n) — exponential branching. Space: O(n) recursion stack. Why insufficient: Repeated subproblems. "aaaa" with dict ["a", "aa"] explores the same suffixes many times.

Optimal — Bottom-Up DP ​

Key Insight: Define dp[i] = true if s[0..i-1] (the first i characters) can be segmented. For each position i, check all possible last words: if dp[j] is true and s[j..i] is in the dictionary, then dp[i] is true.

Step-by-step:

  1. Create dp array of size n+1, all false. Set dp[0] = True (empty string).
  2. Convert wordDict to a set for O(1) lookup.
  3. For each i from 1 to n:
    • For each j from 0 to i-1:
      • If dp[j] is true and s[j:i] is in the word set, set dp[i] = True and break.
  4. Return dp[n].

Time: O(n^2 * m) where m is the average word length (for substring creation and hashing). Space: O(n).

Part II — Word Break II ​

Optimal — Backtracking with Memoization ​

Key Insight: First check if the string can be segmented (use Part I as a pruning step). Then use backtracking to enumerate all valid segmentations, memoizing the results for each suffix to avoid recomputation.

Step-by-step:

  1. Build a memoization cache: memo[i] = list of all valid sentences for s[i:].
  2. For each position i, try every word that matches starting at i.
  3. Recursively get all sentences for the remainder, and prepend the current word.
  4. Cache and return.

Time: O(2^n * n) in the worst case (exponential number of valid segmentations). Space: O(2^n * n) for storing all sentences.


Walkthrough ​

Part I — s = "applepenapple", wordDict = ["apple", "pen"] ​

ij checkeds[j:i]dp[j]?In dict?dp[i]
1j=0"a"dp[0]=TNoF
2j=0,1"ap","p"—NoF
3j=0..2"app","pp","p"—NoF
4j=0..3"appl","ppl","pl","l"—NoF
5j=0"apple"dp[0]=TYesT
6j=0..5—No match—F
7j=0..6—No match—F
8j=5"pen"dp[5]=TYesT
9-12————F
13j=8"apple"dp[8]=TYesT

dp[13] = True — "apple" + "pen" + "apple".

Part II — s = "catsanddog", wordDict = ["cat","cats","and","sand","dog"] ​

Backtracking from index 0:

  • "cat" matches → recurse on "sanddog"
    • "sand" matches → recurse on "dog"
      • "dog" matches → recurse on "" → base case → sentence: "cat sand dog"
  • "cats" matches → recurse on "anddog"
    • "and" matches → recurse on "dog"
      • "dog" matches → sentence: "cats and dog"

Output: ["cat sand dog", "cats and dog"]


Implementation ​

Part I — Word Break ​

typescript
function wordBreak(s: string, wordDict: string[]): boolean {
    const wordSet = new Set(wordDict);
    const n = s.length;

    // dp[i] = true if s[0..i-1] can be segmented
    const dp = new Array<boolean>(n + 1).fill(false);
    dp[0] = true; // Empty string is trivially segmentable

    for (let i = 1; i <= n; i++) {
        for (let j = 0; j < i; j++) {
            if (dp[j] && wordSet.has(s.slice(j, i))) {
                dp[i] = true;
                break; // No need to check more j values
            }
        }
    }

    return dp[n];
}
cpp
bool wordBreak(string s, vector<string>& wordDict) {
    unordered_set<string> wordSet(wordDict.begin(), wordDict.end());
    int n = s.size();

    // dp[i] = true if s[0..i-1] can be segmented
    vector<bool> dp(n + 1, false);
    dp[0] = true; // Empty string is trivially segmentable

    for (int i = 1; i <= n; i++) {
        for (int j = 0; j < i; j++) {
            if (dp[j] && wordSet.count(s.substr(j, i - j))) {
                dp[i] = true;
                break; // No need to check more j values
            }
        }
    }

    return dp[n];
}

Optimized — limit j range by max word length:

typescript
function wordBreakOptimized(s: string, wordDict: string[]): boolean {
    const wordSet = new Set(wordDict);
    const maxWordLen = Math.max(...wordDict.map(w => w.length));
    const n = s.length;

    const dp = new Array<boolean>(n + 1).fill(false);
    dp[0] = true;

    for (let i = 1; i <= n; i++) {
        // Only check j values where the word length is valid
        for (let j = Math.max(0, i - maxWordLen); j < i; j++) {
            if (dp[j] && wordSet.has(s.slice(j, i))) {
                dp[i] = true;
                break;
            }
        }
    }

    return dp[n];
}
cpp
bool wordBreakOptimized(string s, vector<string>& wordDict) {
    unordered_set<string> wordSet(wordDict.begin(), wordDict.end());
    int maxWordLen = 0;
    for (auto& w : wordDict) maxWordLen = max(maxWordLen, (int)w.size());
    int n = s.size();

    vector<bool> dp(n + 1, false);
    dp[0] = true;

    for (int i = 1; i <= n; i++) {
        // Only check j values where the word length is valid
        for (int j = max(0, i - maxWordLen); j < i; j++) {
            if (dp[j] && wordSet.count(s.substr(j, i - j))) {
                dp[i] = true;
                break;
            }
        }
    }

    return dp[n];
}

Part II — Word Break II ​

typescript
function wordBreakII(s: string, wordDict: string[]): string[] {
    const wordSet = new Set(wordDict);

    // First check if segmentation is even possible (prune)
    if (!wordBreak(s, wordDict)) return [];

    const memo = new Map<number, string[]>();

    function backtrack(start: number): string[] {
        if (memo.has(start)) return memo.get(start)!;

        if (start === s.length) return [""]; // Base case: empty tail

        const sentences: string[] = [];
        for (let end = start + 1; end <= s.length; end++) {
            const word = s.slice(start, end);
            if (wordSet.has(word)) {
                const restSentences = backtrack(end);
                for (const rest of restSentences) {
                    sentences.push(rest ? word + " " + rest : word);
                }
            }
        }

        memo.set(start, sentences);
        return sentences;
    }

    return backtrack(0);
}
cpp
class Solution {
    unordered_set<string> wordSet;
    unordered_map<int, vector<string>> memo;

    vector<string> backtrack(const string& s, int start) {
        if (memo.count(start)) return memo[start];

        if (start == (int)s.size()) return {""}; // Base case: empty tail

        vector<string> sentences;
        for (int end = start + 1; end <= (int)s.size(); end++) {
            string word = s.substr(start, end - start);
            if (wordSet.count(word)) {
                auto restSentences = backtrack(s, end);
                for (auto& rest : restSentences) {
                    sentences.push_back(rest.empty() ? word : word + " " + rest);
                }
            }
        }

        memo[start] = sentences;
        return sentences;
    }

public:
    vector<string> wordBreakII(string s, vector<string>& wordDict) {
        wordSet = unordered_set<string>(wordDict.begin(), wordDict.end());
        memo.clear();

        // First check if segmentation is even possible (prune)
        // using the DP approach from Part I
        int n = s.size();
        vector<bool> dp(n + 1, false);
        dp[0] = true;
        for (int i = 1; i <= n; i++) {
            for (int j = 0; j < i; j++) {
                if (dp[j] && wordSet.count(s.substr(j, i - j))) {
                    dp[i] = true;
                    break;
                }
            }
        }
        if (!dp[n]) return {};

        return backtrack(s, 0);
    }
};

Watch out:

  1. Part I: not breaking early. Once dp[i] is set to True, stop checking more j values. Forgetting the break does not affect correctness but degrades performance.
  2. Part II: not pruning with Part I check. For inputs like s = "aaaa...a" (n=300) with wordDict = ["a", "aa", ..., "aaa...a"], there are exponentially many segmentations. The Part I check can detect impossible cases in O(n^2), avoiding wasted backtracking.
  3. Part II: string concatenation in recursion. Building sentences by prepending/appending strings in each recursive call can be inefficient. Using a list of words and joining at the end is more efficient.

Complexity Analysis ​

TimeSpace
Part I — Brute ForceO(2^n)O(n)
Part I — DPO(n^2 * m)O(n)
Part II — Backtracking + MemoO(2^n * n) worst caseO(2^n * n)

Bottleneck: Part I is bounded by O(n^2) iterations with O(m) substring hashing, giving O(n^2 * m). Part II can have exponentially many valid segmentations (e.g., "aaa...a" with dict ["a", "aa"]), so the output itself is exponential. The memoization avoids recomputing subproblems but cannot reduce the output size.


Edge Cases ​

  1. Single character string — Check if it is in the dictionary.
  2. Word dictionary contains the entire string — dp[n] is set directly from dp[0].
  3. Empty dictionary — No word can match. Return false / empty list.
  4. String with repeated characters, dict with various lengths — e.g., s = "aaa", dict = ["a", "aa"]. Part I returns true. Part II returns ["a a a", "aa a", "a aa"].
  5. No valid segmentation — Part I returns false. Part II should return []. The pruning step prevents wasted computation.
  6. Very long string for Part I (n=300) — DP handles this. The O(n^2) loop with up to 300 iterations is fast.


What is Expected at Each Level ​

Junior: Solve Part I with the brute-force recursive approach. Recognize overlapping subproblems and convert to DP with guidance. May not attempt Part II.

Mid-level: Write the DP solution for Part I cleanly. Optimize with max word length. For Part II, implement backtracking with memoization. Discuss the exponential nature of Part II output and why memoization helps.

Senior: Solve both parts quickly. Discuss Trie-based optimization for word lookup (avoids repeated substring creation). Explain why Part II is inherently exponential and connect to NP-hard problems. Discuss BFS approach for Part I as an alternative. Mention practical applications: NLP word segmentation, autocomplete.

Frontend interview preparation reference.