LeetCode 792. Number of Matching Subsequences

   792. Number of Matching Subsequences

Medium, Dynamic Programming

Given a string s and an array of strings words, return the number of words[i] that is a subsequence of s.

subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.

  • For example, "ace" is a subsequence of "abcde".

 

Example 1:

Input: s = "abcde", words = ["a","bb","acd","ace"]

Output: 3

Explanation: There are three strings in words that are a subsequence of s: "a", "acd", "ace".

Example 2:

Input: s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"]

Output: 2

 

Constraints:

  • 1 <= s.length <= 5 * 104
  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 50
  • s and words[i] consist of only lowercase English letters.

 

SOLUTION:

To solve this problem, we need to efficiently check how many strings in the words array are subsequences of the string s. A subsequence is defined as a sequence of characters that appear in the same order as in the original string s, but not necessarily consecutively.

Plan:

  1. Optimize the Matching Process: Instead of checking each word in words against s character by character (which would be inefficient), we can use a bucket approach:

    • Group the words based on their first character.
    • For each character in s, process all words waiting for that character.
  2. Iterate Through s:

    • As we iterate through the characters of s, we advance the matching process for words waiting on that character.
  3. Use Buckets:

    • Maintain a dictionary where each key is a character, and the value is a list of iterators. Each iterator represents a word and its current position in the matching process.
  4. Count Matches:

    • When a word's iterator reaches the end, it means the word is a subsequence of s.

Implementation:

using System;
using System.Collections.Generic;

public class Solution {
    public int NumMatchingSubseq(string s, string[] words) {
        // Dictionary to hold buckets of words grouped by the current character they are waiting for
        var waiting = new Dictionary<char, Queue<IEnumerator<char>>>();
        
        // Initialize the dictionary with all lowercase letters
        for (char c = 'a'; c <= 'z'; c++) {
            waiting[c] = new Queue<IEnumerator<char>>();
        }

        // Populate the buckets based on the first character of each word
        foreach (var word in words) {
            var iterator = word.GetEnumerator();
            if (iterator.MoveNext()) {
                waiting[iterator.Current].Enqueue(iterator);
            }
        }

        int count = 0;

        // Iterate through characters of the string `s`
        foreach (char c in s) {
            var bucket = waiting[c]; // Get the bucket for the current character
            int size = bucket.Count;

            for (int i = 0; i < size; i++) {
                var iterator = bucket.Dequeue(); // Get the next word's iterator
                if (iterator.MoveNext()) {
                    waiting[iterator.Current].Enqueue(iterator); // Move to the next character
                } else {
                    count++; // Word is a subsequence
                }
            }
        }

        return count;
    }

    public static void Main(string[] args) {
        var solution = new Solution();
        Console.WriteLine(solution.NumMatchingSubseq("abcde", new string[] { "a", "bb", "acd", "ace" })); // Output: 3
        Console.WriteLine(solution.NumMatchingSubseq("dsahjpjauf", new string[] { "ahjpjau", "ja", "ahbwzgqnuk", "tnmlanowax" })); // Output: 2
    }
}

Explanation of Code:

  1. Dictionary for Buckets:

    • waiting is a dictionary where keys are characters ('a' to 'z'), and values are queues of iterators for words waiting on that character.
  2. Iterating Over s:

    • For each character in s, process all words waiting for it.
    • If the iterator can move to the next character, enqueue it into the bucket for the next character.
    • If the iterator is exhausted, increment the match count.
  3. Efficiency:

    • Each word is processed exactly once, and each character in s is processed once.
    • Total complexity: O(L+M)O(L + M), where LL is the total length of all words in words, and MM is the length of s.

Example Outputs:

Example 1:

  • s = "abcde", words = ["a", "bb", "acd", "ace"]
  • Words "a", "acd", and "ace" are subsequences of "abcde".
  • Output: 3

Example 2:

  • s = "dsahjpjauf", words = ["ahjpjau", "ja", "ahbwzgqnuk", "tnmlanowax"]
  • Words "ahjpjau" and "ja" are subsequences of "dsahjpjauf".
  • Output: 2

Featured Posts

Leetcode 4. Median of Two Sorted Arrays

  4. Median of Two Sorted Arrays Hard Given two sorted arrays  nums1  and  nums2  of size  m  and  n  respectively, return  the median  of t...

Popular Posts