Skip to content

16. Rabin-Karp Pattern Matching

What if we use hashing for pattern matching? Welp, Rabin and Karp did that.

public int RabinKarpFind(String text, String pattern) {
    int M = pattern.length();
    int N = text.length();

    long polBase = 31; 
    long firstCoef = (long) Math.pow(polBase, M-1);
    long patHash = hash(pattern, M, polBase);

    if (N < M) {
        return -1;
    }

    long txtHash = hash(text, M, polBase);

    if ((patHash == txtHash) && check(text, pattern, 0)) {
        return 0;
    }

    for (int i = M; i < N; i++) {
        txtHash = (txtHash - firstCoef * text.charAt(i - M)) * polBase + text.charAt(i);

        int startInd = i - M + 1;
        if ((patHash == txtHash) && check(text, pattern, startInd)) {
            return startInd;
        }
    }

    return -1;
}

private long hash(String key, int m, long polBase) {
    long h = 0;
    for (int j = 0; j < m; j++) {
        h = (polBase * h + key.charAt(j));
    }
    return h;
}

private boolean check(String text, String pattern, int i) {
    for (int j = 0; j < pattern.length(); j++) {
        if (pattern.charAt(j) != text.charAt(i + j)) {
            return false;
        }
    }
    return true;
}

Time complexity of $\Theta(n)$ on the average and $O(mn)$ with the worst case (Las Vegas version, very unlikely).