Given an input string and a dictionary of words, find out if the input string can be segmented into a space-separated sequence of dictionary words.
For example, Given a dictionary { i, like, ili, kesa, msun, sam, sung, mobile} then
text “ilikemobile” is breakable into [“i”,”like”,”mobile”] text “ilikesamobile” is breakable into [“ili”,”kesa”,”mobile”] text “ilikesammobile” is breakable into [“i”,”like”,”sam”,”mobile”] text “likeimobile” is breakable into [“like”,”i”,”mobile”]
but
text “ilikesa” is not brekable
text “likeis” is not brekable
text “ilikesamsun” is not brekable
We can solve it recursively by splitting at each prefix and search it in dictionary. If the prefix is present in dictionary then we recur for the suffix (rest of the string). If the recursive call for suffix returns true then we return true, otherwise we try next prefix. If we have tried all prefixes and none of them resulted in a solution, we return false.
public static boolean wordBreak(Set<String> dictionary, String text){ //base case if(text.isEmpty()){ return true; } //break the string at i+1 such that prefix text[...i] is in dict and suffix text[i+1...] is breakable for(int i = 0; i<text.length(); i++){ if(dictionary.contains(text.substring(0, i+1)) && wordBreak(dictionary, text.substring(i+1))){ return true; } } return false; }
Repetitive Subproblems
The recursive solution has many subproblems that are repeating at various stage of the recursion. For example, for string abcd the recursion tree will look like
abcd / \ \ bcd cd d / \ | cd d d
We can see that subproblems for “cd” is repeating twice. So we ca use DP to cache the solution
Dynamic Programming Solution
Straight forward dp solution in Java. We use dp[i] table to store results computed for each prefix string[0..i]. A prefix string[0..i] is breakable if and only if either the whole prefix is contained in dictionary ot there is a smaller prefix at index j < i such that prefix string[0..j] is breakable and substring(j+1..i) is contained in the dictionary. Below is the implementation of the above using DP that runs in O(n^2) time.
public static boolean wordBreakDP(Set<String> dictionary, String text){ int n = text.length(); if(n == 0){ return true; } //dp[i] = true if there is a solution in prefix text[0..i] boolean[] dp = new boolean[n]; //try all possible prefixes for(int i = 0; i< n; i++){ if(dictionary.contains(text.substring(0, i+1))){ dp[i] = true; } else{ //Try to break in each position j < i for(int j = 0; j < i; j++){ if(dp[j] && dictionary.contains(text.subSequence(j+1, i+1))){ dp[i] = true; break; } } } } return dp[n-1]; }
Find All Possible Valid Sentences by Breaking
We do the similar recursion but with a DP map to cache the subproblem solutions.
public static HashMap<String, ArrayList<String>> wordBreakMap = new HashMap<String, ArrayList<String>>(); public static ArrayList<String> wordBreakAll(Set<String> dictionary, String text){ //if already computed the current substring text then return from map if(wordBreakMap.containsKey(text)){ return wordBreakMap.get(text); } ArrayList<String> result = new ArrayList<String>(); //if the whole word is in the dictionary then we add this to final result if(dictionary.contains(text)){ result.add(text); } //try each prefix and extend for(int i = 0; i< text.length(); i++){ String prefix = text.substring(0, i+1); if(dictionary.contains(prefix)){ //extend String suffix = text.substring(i+1); ArrayList<String> subRes = wordBreakAll(dictionary, suffix); for(String word : subRes){ result.add(prefix+" "+word); } } } wordBreakMap.put(text, result); return result; }