力扣labuladong一刷day21天滑动哈希算法共2题
力扣labuladong一刷day21天滑动哈希算法共2题
文章目录
- 力扣labuladong一刷day21天滑动哈希算法共2题
- 一、187. 重复的DNA序列
- 二、28. 找出字符串中第一个匹配项的下标
一、187. 重复的DNA序列
题目链接:https://leetcode.cn/problems/repeated-dna-sequences/description/
思路:字符串序列里找重复出现的子串就是子串匹配问题,正常的思路是截取所有长为L的子串,看set里是否重复,可是截取的过程是O(n)每个都截取就成了O(x*n)。采用另外一种思路,也是滑动窗口,只不过把滑动窗口中的字符串映射成数字,这种映射的方式就是right右移尾部加数,left右移头部减数,复杂度是O(1)。这样只需要比较set中的映射值即可。
class Solution {
public List<String> findRepeatedDnaSequences(String s) {
Set<String> list = new HashSet<>();
Set<Integer> set = new HashSet<>();
int[] nums = new int[s.length()];
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if (c == 'A') {
nums[i] = 0;
} else if (c == 'C') {
nums[i] = 1;
} else if (c == 'G') {
nums[i] = 2;
}else {
nums[i] = 3;
}
}
int hashcode = 0, left = 0, right = 0, r = 4;
int rl = (int) Math.pow(r,9);
while (right < nums.length) {
hashcode = hashcode * r + nums[right];
right++;
if (right - left == 10) {
if (set.contains(hashcode)) {
list.add(s.substring(left, right));
}else {
set.add(hashcode);
}
hashcode = hashcode - nums[left] * rl;
left++;
}
}
return new ArrayList<>(list);
}
}
二、28. 找出字符串中第一个匹配项的下标
题目链接:https://leetcode.cn/problems/find-the-index-of-the-first-occurrence-in-a-string/
思路:滑动窗口使用needmap和windowmap要求需要匹配的是子串的排列,如果不是排列是无法区分ab与ba的。本题是完全匹配就只能用滑动哈希算法了。
选择一个较大的素数Q,这样会减少哈希碰撞。
class Solution {
public int strStr(String haystack, String needle) {
long Q = 1658598167, L = needle.length(), R = 256;
long hashCode = 0, patCode = 0;
long RL = 1;
for (int i = 0; i < L - 1; i++) {
RL = (RL*R) % Q;
}
for (int i = 0; i < L; i++) {
patCode = (patCode * R+ needle.charAt(i)) % Q;
}
int left = 0, right = 0;
while (right < haystack.length()) {
hashCode = ((hashCode * R) % Q + haystack.charAt(right)) % Q;
right++;
if (right - left == L) {
if (hashCode == patCode) {
if (needle.equals(haystack.substring(left, right))) return left;
}
hashCode = (hashCode - (haystack.charAt(left)*RL) % Q + Q) % Q;
left++;
}
}
return -1;
}
}