KMP algorithm needs no buffer to record the data. It can be used to find the substring on-the-fly. It is a linear time algorithm. It uses a DFA to achieve linear time search. The construction of DFA takes O(R*W) time and space where R is the number of different characters used and M is the length of the pattern to be search in a text of length N.
Source:
Output:
Substring "needle" found at index 13
Substring "jatin" found at index -1
Substring "ack" found at index 5
Substring "a" found at index 1
Substring "z" found at index -1
Substring "lein" found at index 17
Source:
public class KMPSubstringMatch { private int W, R; private int[][] dfa; private String s; private String sub; public KMPSubstringMatch(String s, String sub, int R){ this.W = sub.length(); this.R = R; this.sub = sub; this.s = s; dfa = new int[R][W]; } public void constructDFA(){ int X=0; dfa[sub.charAt(0)][0] = 1; for(int i=1;i<W;++i){ for(int j=0;j<R;++j){ dfa[j][i] = dfa[j][X]; } dfa[sub.charAt(i)][i] = i+1; X = dfa[sub.charAt(i)][X]; } } public int find(){ int i, j; for(i=0, j=0;i<s.length() && j!=W;++i){ j=dfa[s.charAt(i)][j]; } if(j==W)return i-sub.length(); else return -1; } public static void find(String s, String sub){ KMPSubstringMatch kmp = new KMPSubstringMatch(s, sub, 257); kmp.constructDFA(); System.out.println("Substring \""+sub+"\" found at index "+kmp.find()); } public static void main(String[] args){ String s = "haystackintheneedleinthehaystack"; find(s, "needle"); find(s, "jatin"); find(s, "ack"); find(s, "a"); find(s, "z"); find(s, "lein"); } }
Output:
Substring "needle" found at index 13
Substring "jatin" found at index -1
Substring "ack" found at index 5
Substring "a" found at index 1
Substring "z" found at index -1
Substring "lein" found at index 17
No comments:
Post a Comment