当前位置：首页 > article >正文

2.7 滑动窗口专题：串联所有单词的子串

article 2025/3/17 15:01:09

LeetCode 30. 串联所有单词的子串算法对比分析

1. 题目链接

LeetCode 30. 串联所有单词的子串

2. 题目描述

给定一个字符串 s 和一个字符串数组 words，words 中所有单词长度相同。要求找到 s 中所有起始索引，使得从该位置开始的连续子串包含 words 中所有单词的某种排列（不限制顺序）。
示例：
输入：s = "barfoothefoobarman", words = ["foo","bar"]
输出：[0,9]（子串 "barfoo" 和 "foobar" 符合条件）。

3. 算法思路

滑动窗口法：

问题转化：将 words 的排列匹配问题转化为固定窗口长度的滑动窗口问题。
哈希表统计：用 hash1 记录 words 中单词的出现次数，hash2 记录当前窗口内单词的出现次数。
多起点遍历：由于单词长度固定为 nwSub，需遍历 nwSub 种可能的起始偏移（0 ≤ i < nwSub）。
窗口动态调整：
- 右指针扩展：每次截取一个单词加入窗口，更新哈希表。
- 左指针收缩：当窗口内单词数量超过 nw 时，移动左指针。
结果判断：当窗口内单词数量等于 nw 且所有单词频率匹配时，记录起始索引。

暴力枚举法：

遍历所有子串：枚举所有长度为 nw * nwSub 的子串。
分割统计：将子串分割为 nw 个单词，统计频率是否与 words 一致。

4. 示例分析

输入：s = "barfoothefoobarman", words = ["foo","bar"]

暴力枚举法：
- 枚举所有长度为 6 的子串，例如 "barfoo", "arfoot", "rfooth" 等。
- 对每个子串分割为 ["bar","foo"] 或 ["arf","oot"]，检查是否与 words 匹配。
滑动窗口法：
- 当 i=0 时，窗口从 left=0 开始，截取 "bar" 和 "foo"，count=2，记录索引 0。
- 当 i=9 时，窗口从 left=9 开始，截取 "foo" 和 "bar"，count=2，记录索引 9。

5. 边界条件与注意事项

单词长度相同：words 中所有单词长度必须一致。
空输入处理：若 words 为空或 s 长度不足，直接返回空。
哈希表更新：需在收缩窗口时及时减少 hash2 的计数，避免无效单词干扰。

6. 代码实现

class Solution 
{
public:
    vector<int> findSubstring(string s, vector<string>& words) 
    {
        vector<int> ret;
        int ns = s.size(), nw = words.size(), nwSub = words[0].size();
        if (ns < nwSub * nw) return ret;

        unordered_map<string, int> hash1;
        for (auto& word : words) hash1[word]++;

        for (int i = 0; i < nwSub; i++) 
        { // 遍历所有可能的起始偏移
            unordered_map<string, int> hash2;
            int left = i, count = 0; // left为窗口左边界
            for (int right = i; right + nwSub <= ns; right += nwSub) 
            {
                // 截取当前单词
                string in = s.substr(right, nwSub);
                hash2[in]++;
                // 更新有效计数：仅在当前单词属于hash1且未超过次数时增加count
                if (hash1.count(in) && hash2[in] <= hash1[in]) count++;

                // 当窗口内的单词数量超过nw时，收缩左边界
                while ((right - left) / nwSub + 1 > nw) 
                {
                    string out = s.substr(left, nwSub);
                    if (hash1.count(out) && hash2[out] <= hash1[out]) count--;
                    hash2[out]--;
                    left += nwSub; // 左指针移动一个单词长度
                }

                // 若有效计数等于nw，记录起始索引
                if (count == nw) ret.push_back(left);
            }
        }
        return ret;
    }
};

在这里插入图片描述

7.暴力枚举法与滑动窗口法对比图表

对比维度	暴力枚举法	滑动窗口法
核心思想	枚举所有长度为 `nw * nwSub` 的子串，分割后比较单词频率。	维护固定窗口长度，动态调整窗口内的单词频率。
时间复杂度	O(ns * nw * nwSub)（每个子串需分割并统计频率）。	O(ns * nwSub)（每个单词被处理一次）。
空间复杂度	O(nw)（存储 `words` 的哈希表）。	O(nw)（存储两个哈希表）。
实现方式	双重循环遍历子串，内层循环分割并统计。	单层循环扩展右指针，动态调整左指针。
适用场景	小规模数据（`ns ≤ 1e3`, `nw ≤ 10`）。	大规模数据（`ns ≤ 1e5`）。
优点	逻辑简单，直接穷举所有可能性。	时间复杂度低，适用于大规模数据。
缺点	数据规模大时性能极差（例如 `ns=1e4` 时需 1e8 次操作）。	需处理哈希表的动态更新和边界条件。