Question
Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n).
For example,
S = "ADOBECODEBANC"
T = "ABC"
Minimum window is "BANC"
.
Note:
If there is no such window in S that covers all characters in T, return the emtpy string ""
.
If there are multiple such windows, you are guaranteed that there will always be only one unique minimum window in S.
Stats
Frequency | 2 |
Difficulty | 4 |
Adjusted Difficulty | 5 |
Time to use | ---------- |
Ratings/Color = 1(white) 2(lime) 3(yellow) 4/5(red)
Analysis
This is a very difficult string matching question.
The sliding window solution is very well explained in this post (the best solution).
To help illustrate this approach, I use a different example: S = “acbbaca” and T = “aba“. The idea is mainly based on the help of two pointers (begin and end position of the window) and two tables (needToFind and hasFound) while traversing S. needToFind stores the total count of a character in T and hasFound stores the total count of a character met so far. We also use a count variable to store the total characters in T that’s met so far (not counting characters where hasFound[x] exceeds needToFind[x]). When count equals T‘s length, we know a valid window is found.
Each time we advance the end pointer (pointing to an element x), we increment hasFound[x] by one. We also increment count by one if hasFound[x] is less than or equal to needToFind[x]. Why? When the constraint is met (that is, count equals to T‘s size), we immediately advance begin pointer as far right as possible while maintaining the constraint.
How do we check if it is maintaining the constraint? Assume that begin points to an element x, we check if hasFound[x] is greater than needToFind[x]. If it is, we can decrement hasFound[x] by one and advancing begin pointer without breaking the constraint. On the other hand, if it is not, we stop immediately as advancing begin pointer breaks the window constraint.
Finally, we check if the minimum window length is less than the current minimum. Update the current minimum if a new minimum is found.
Essentially, the algorithm finds the first window that satisfies the constraint, then continue maintaining the constraint throughout.
i) S = “acbbaca” and T = “aba“.ii) The first minimum window is found. Notice that we cannot advance begin pointer as hasFound['a'] == needToFind['a'] == 2. Advancing would mean breaking the constraint.iii) The second window is found. begin pointer still points to the first element ‘a’. hasFound['a'] (3) is greater than needToFind['a'] (2). We decrement hasFound['a'] by one and advance begin pointer to the right.iv) We skip ‘c’ since it is not found in T. Begin pointer now points to ‘b’. hasFound['b'] (2) is greater than needToFind['b'] (1). We decrement hasFound['b'] by one and advance begin pointer to the right.v) Begin pointer now points to the next ‘b’. hasFound['b'] (1) is equal to needToFind['b'] (1). We stop immediately and this is our newly found minimum window.Both the begin and end pointers can advance at most N steps (where N is S‘s size) in the worst case, adding to a total of 2N times. Therefore, the run time complexity must be in O(N).
Solution
Best code is from this post.
First of all, keep a HashMap to store all letters and occurrance. Then declare a ‘count’ variable. This is an important varaible. It helps us to check whether we have successfully achieve at least 1 window. After we found the first window, the ‘count’ variable shall always equals to total number of letters in ’T'.
Now the looping part. Basically we assume that the window end at one point, and we find the correct starting position and calculate corresponding length.
The coding part is very difficult. Try to practise more.
Code
Updated on July 7th, code:
public String minWindow(String S, String T) {
HashMap<Character, Integer> map = new HashMap<Character, Integer>();
HashMap<Character, Integer> map2 = new HashMap<Character, Integer>();
for (Character ch: T.toCharArray()) {
if (map.containsKey(ch)) {
map.put(ch, map.get(ch) + 1);
} else {
map.put(ch, 1);
map2.put(ch, 0);
}
}
int count = 0;
int start = 0;
int end = 0;
String result = "";
while (end < S.length()) {
char cur = S.charAt(end);
if (!map.containsKey(cur)) {
end++;
continue;
}
map2.put(cur, map2.get(cur) + 1);
if (map2.get(cur) <= map.get(cur)) {
count++;
}
if (count == T.length()) {
// locate start point
while(true) {
char ll = S.charAt(start);
if (!map.containsKey(ll)) {
start++;
continue;
}
if (map2.get(ll) > map.get(ll)) {
map2.put(ll, map2.get(ll) - 1);
start++;
continue;
} else {
break;
}
}
if (result.equals("") || result.length() > end - start + 1) {
result = S.substring(start, end + 1);
}
}
end++;
}
return result;
}
Updated on July 19th, rewrite the code changing all while-loop to for-loop:
public String minWindow(String S, String T) {
if (S.length() < T.length()) {
return "";
}
HashMap<Character, Integer> map = new HashMap<Character, Integer>();
HashMap<Character, Integer> found = new HashMap<Character, Integer>();
for (Character ch: T.toCharArray()) {
if (map.containsKey(ch)) {
map.put(ch, map.get(ch) + 1);
} else {
map.put(ch, 1);
found.put(ch, 0);
}
}
String window = S;
int count = 0;
int start = 0;
char[] letters = S.toCharArray();
for (int i = 0; i < letters.length; i++) {
char ch = letters[i];
if (!map.containsKey(ch)) {
continue;
}
if (found.get(ch) < map.get(ch)) {
count++;
}
found.put(ch, found.get(ch) + 1);
if (count == T.length()) {
// update the start pointer
for (; start <= i; start++) {
char sChar = letters[start];
if (!map.containsKey(sChar)) {
continue;
}
if (found.get(sChar) <= map.get(sChar)) {
break;
} else {
found.put(sChar, found.get(sChar) - 1);
}
}
if (window.length() > i - start + 1) {
window = S.substring(start, i + 1);
}
}
}
return count == T.length() ? window : "";
}