792. Number of Matching Subsequences
Medium, Dynamic Programming
Given a string s and an array of strings words, return the number of words[i] that is a subsequence of s.
A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.
- For example, "ace" is a subsequence of "abcde".
Example 1:
Input: s = "abcde", words = ["a","bb","acd","ace"]
Output: 3
Explanation: There are three strings in words that are a subsequence of s: "a", "acd", "ace".
Example 2:
Input: s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"]
Output: 2
Constraints:
- 1 <= s.length <= 5 * 104
- 1 <= words.length <= 5000
- 1 <= words[i].length <= 50
- s and words[i] consist of only lowercase English letters.
SOLUTION:
To solve this problem, we need to efficiently check how many strings in the words
array are subsequences of the string s
. A subsequence is defined as a sequence of characters that appear in the same order as in the original string s
, but not necessarily consecutively.
Plan:
-
Optimize the Matching Process: Instead of checking each word in
words
againsts
character by character (which would be inefficient), we can use a bucket approach:- Group the words based on their first character.
- For each character in
s
, process all words waiting for that character.
-
Iterate Through
s
:- As we iterate through the characters of
s
, we advance the matching process for words waiting on that character.
- As we iterate through the characters of
-
Use Buckets:
- Maintain a dictionary where each key is a character, and the value is a list of iterators. Each iterator represents a word and its current position in the matching process.
-
Count Matches:
- When a word's iterator reaches the end, it means the word is a subsequence of
s
.
- When a word's iterator reaches the end, it means the word is a subsequence of
Implementation:
using System;
using System.Collections.Generic;
public class Solution {
public int NumMatchingSubseq(string s, string[] words) {
// Dictionary to hold buckets of words grouped by the current character they are waiting for
var waiting = new Dictionary<char, Queue<IEnumerator<char>>>();
// Initialize the dictionary with all lowercase letters
for (char c = 'a'; c <= 'z'; c++) {
waiting[c] = new Queue<IEnumerator<char>>();
}
// Populate the buckets based on the first character of each word
foreach (var word in words) {
var iterator = word.GetEnumerator();
if (iterator.MoveNext()) {
waiting[iterator.Current].Enqueue(iterator);
}
}
int count = 0;
// Iterate through characters of the string `s`
foreach (char c in s) {
var bucket = waiting[c]; // Get the bucket for the current character
int size = bucket.Count;
for (int i = 0; i < size; i++) {
var iterator = bucket.Dequeue(); // Get the next word's iterator
if (iterator.MoveNext()) {
waiting[iterator.Current].Enqueue(iterator); // Move to the next character
} else {
count++; // Word is a subsequence
}
}
}
return count;
}
public static void Main(string[] args) {
var solution = new Solution();
Console.WriteLine(solution.NumMatchingSubseq("abcde", new string[] { "a", "bb", "acd", "ace" })); // Output: 3
Console.WriteLine(solution.NumMatchingSubseq("dsahjpjauf", new string[] { "ahjpjau", "ja", "ahbwzgqnuk", "tnmlanowax" })); // Output: 2
}
}
Explanation of Code:
-
Dictionary for Buckets:
waiting
is a dictionary where keys are characters ('a'
to'z'
), and values are queues of iterators for words waiting on that character.
-
Iterating Over
s
:- For each character in
s
, process all words waiting for it. - If the iterator can move to the next character, enqueue it into the bucket for the next character.
- If the iterator is exhausted, increment the match count.
- For each character in
-
Efficiency:
- Each word is processed exactly once, and each character in
s
is processed once. - Total complexity: , where is the total length of all words in
words
, and is the length ofs
.
- Each word is processed exactly once, and each character in
Example Outputs:
Example 1:
s = "abcde"
,words = ["a", "bb", "acd", "ace"]
- Words
"a"
,"acd"
, and"ace"
are subsequences of"abcde"
. - Output:
3
Example 2:
s = "dsahjpjauf"
,words = ["ahjpjau", "ja", "ahbwzgqnuk", "tnmlanowax"]
- Words
"ahjpjau"
and"ja"
are subsequences of"dsahjpjauf"
. - Output:
2