16. Rabin-Karp Pattern Matching
What if we use hashing for pattern matching? Welp, Rabin and Karp did that.
public int RabinKarpFind(String text, String pattern) {
int M = pattern.length();
int N = text.length();
long polBase = 31;
long firstCoef = (long) Math.pow(polBase, M-1);
long patHash = hash(pattern, M, polBase);
if (N < M) {
return -1;
}
long txtHash = hash(text, M, polBase);
if ((patHash == txtHash) && check(text, pattern, 0)) {
return 0;
}
for (int i = M; i < N; i++) {
txtHash = (txtHash - firstCoef * text.charAt(i - M)) * polBase + text.charAt(i);
int startInd = i - M + 1;
if ((patHash == txtHash) && check(text, pattern, startInd)) {
return startInd;
}
}
return -1;
}
private long hash(String key, int m, long polBase) {
long h = 0;
for (int j = 0; j < m; j++) {
h = (polBase * h + key.charAt(j));
}
return h;
}
private boolean check(String text, String pattern, int i) {
for (int j = 0; j < pattern.length(); j++) {
if (pattern.charAt(j) != text.charAt(i + j)) {
return false;
}
}
return true;
}
Time complexity of $\Theta(n)$ on the average and $O(mn)$ with the worst case (Las Vegas version, very unlikely).