
CS4311 Design and Analysis of Algorithms. Tutorial: KMP Algorithm

About this tutorial
Introduce the String Matching problem
Knuth-Morris-Pratt (KMP) algorithm

String Matching
Let T[0..n-1] be a text of length n.
Let P[0..p-1] be a pattern of length p.
Can we find all locations in T where P occurs?
E.g., T = babababababb [some characters of T were lost in extraction], P = ababa
Here, P occurs at positions 4 and 6 in T.

Brute Force Approach
The easiest way to find the locations where P occurs in T is as follows:
    For each position of T
        Check if P occurs at that position
Running time: worst-case O(np)

In the simple algorithm, when we decide that P does not occur at a position x, we start over and match P at position x+1. However, even if P does not occur at position x, we may learn some information from this unsuccessful match that helps to speed up later checking.

E.g., suppose when we check if P occurs at position x, we get the following scenario:
[Figure: P aligned with T at position x; a character mismatch follows some matched characters]
Can P occur at position x+1?

How about this case?
[Figure: a second alignment of P with T at position x, again ending in a character mismatch]
Can P occur at positions x+1, x+2, or x+3?

Lemma: Key Observation
Suppose P has matched k chars with T[x..], but has a mismatch at the (k+1)th char. That is,
    P[0..k-1] = T[x..x+k-1], but P[k] ≠ T[x+k].
Then, for any 0 < r < k, if T[x+r..x+k-1] is not a prefix of P, P cannot occur at position x+r.

Checking Which Position Next?
So, when T[x..] gets a first mismatch after matching k chars with P, so that P[0..k-1] = T[x..x+k-1], we can restart the next checking at the leftmost position x+r such that T[x+r..x+k-1] is a prefix of P.
Note: leftmost x+r means smallest r.

Key Observation
In our first example, the next checking can restart at position x+2.
In our second example, the next checking can restart at position x+3.

Finding Desired r
We observe that T[x+r..x+k-1] = P[r..k-1].
So to find the desired r, we need the smallest r such that P[r..k-1] is a prefix of P.
What does that mean?

Finding Desired r (Example 1)
When k = 3, we ask:
    P[1..2] prefix of P? No
    P[2..2] prefix of P? Yes! (r = 2)

Finding Desired r (Example 2)
When k = 5 (what does that mean??), we ask:
    P[1..4] prefix of P? No
    P[2..4] prefix of P? No
    P[3..4] prefix of P? Yes! (r = 3)

Finding Desired r
For each k, the smallest r such that P[r..k-1] is a prefix of P implies that P[r..k-1] is the longest such prefix. (If no r < k works, then r = k, and P[r..k-1] is the empty string.)
Let us define a function π, called the prefix function, such that
    π(k) = length of such P[r..k-1]

KMP Algorithm
The KMP algorithm relies on the prefix function to locate all occurrences of P in O(n) time, which is optimal.
Next, we assume that the prefix function is already computed.
We first describe a simplified version and then the actual KMP.
Finally, we show how to get the prefix function.

Simplified Version
    Set x = 0;
    while (x < n-p+1) {
        1. Match T with P at position x;
        2. Let k = #matched chars;
        3. if (k == p) output match at x;
        4. Update x = x + max(1, k - π(k));
    }
For k ≥ 1 we have π(k) < k, so x always advances. Taking π(0) = 0, the max with 1 covers the case k = 0, where the first char already mismatches and x simply moves to x+1.
What is the worst-case running time?

How can we improve?
In the simplified version, inside the while loop, Line 1 restarts matching (every char of) T with P from position x.
In fact, if the previous step of the while loop has matched k chars, we know that in this round the first π(k) chars are already matched.
What if we take advantage of this?

KMP Algorithm
    Set x = 0; k = 0;
    while (x < n-p+1) {
        1. Match T with P at position x, but starting from the (k+1)th position of P;
        2. Update k = #matched chars;
        3. if (k == p) output match at x;
        4. Update x = x + max(1, k - π(k));
        5. Update k = π(k);
    }
k keeps track of #matched chars.

Running Time
The running time comes from four parts:
    1. Mis/matching a char of T with P (Line 1)
    2. Updating the position x (Line 4)
    3. Output match (Line 3)
    4. Updating k (Line 2, Line 5)
Since each char of T is matched once, and x increases for each mismatch, the total is O(n) time.

Computing Prefix Function
It remains to compute the prefix function.
In fact, it can be computed incrementally (finding π(1), then π(2), then π(3), and so on).
For instance, suppose we have obtained π(1), π(2), ..., π(k) already.
How can we get π(k+1)?

Key Observation
We know that the prefix of length π(k), P[0..π(k)-1], is the longest prefix of P matching a proper suffix of P[0..k-1].
[Figure: a second copy of P aligned below P[0..k-1], shifted so that its first π(k) chars match that suffix]

What if the next corresponding chars, P[π(k)] and P[k], are the same?
If they are the same, π(k+1) = π(k) + 1 (prove by contradiction).

However, if P[π(k)] and P[k] are different, we should move the lower copy of P rightwards to search for the next longest prefix of P matching a suffix of P[0..k-1]. That prefix has length π(π(k)).

What if the next corresponding chars, P[π(π(k))] and P[k], are the same?
If they are the same, π(k+1) = π(π(k)) + 1 (prove by contradiction).

However, if P[π(π(k))] and P[k] are different, we can repeat the procedure and obtain π(k+1) when we find the longest prefix of P matching a suffix of P[0..k-1], with its next char = P[k]. If no such prefix exists, π(k+1) = 0.
This is exactly the same as in string matching.
Total time: O(p), since (1) there are at most p matches, and (2) the lower copy of P moves rightwards for each mismatch.
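
The two procedures in this tutorial, computing the prefix function incrementally and then running the KMP loop with the variables x and k, can be sketched in Python as follows. This is a minimal sketch, not the course's reference code. The function names `prefix_function` and `kmp_search`, the list name `pi` (standing for the prefix function), the convention pi[0] = 0, and the `max(1, ...)` guard that advances x by one position when no character matched are all assumptions made here so that the loop always makes progress. The sample text is hypothetical, chosen so that the pattern ababa occurs at positions 4 and 6.

```python
def prefix_function(P):
    """Return a list pi with pi[k], for k = 1..len(P), equal to the length of
    the longest proper suffix of P[0..k-1] that is also a prefix of P.
    pi[0] = 0 is a convention assumed in this sketch."""
    p = len(P)
    pi = [0] * (p + 1)
    for k in range(1, p):
        # Derive pi[k+1] from the already known pi[1..k].
        j = pi[k]
        while j > 0 and P[j] != P[k]:
            j = pi[j]              # fall back to the next longest candidate
        if P[j] == P[k]:
            j += 1                 # candidate can be extended by one char
        pi[k + 1] = j
    return pi


def kmp_search(T, P):
    """Return all positions x such that P occurs in T starting at x."""
    n, p = len(T), len(P)
    if p == 0 or p > n:
        return []                  # assumption: an empty pattern yields no matches
    pi = prefix_function(P)
    matches = []
    x = k = 0                      # x: current position in T, k: #matched chars
    while x < n - p + 1:
        # Line 1: continue matching from the (k+1)th char of P.
        while k < p and T[x + k] == P[k]:
            k += 1
        # Line 3: report a full match.
        if k == p:
            matches.append(x)
        # Line 4: shift the pattern; the guard handles k == 0 (assumption).
        x += max(1, k - pi[k])
        # Line 5: the first pi[k] chars are known to match already.
        k = pi[k]
    return matches


# Hypothetical sample text, not taken from the slides.
print(prefix_function("ababa"))              # → [0, 0, 0, 1, 2, 3]
print(kmp_search("bacbabababacb", "ababa"))  # → [4, 6]
```

After each shift, x + k is unchanged whenever k ≥ 1, so matching resumes at the same character of T and no character of T is matched successfully more than once; this is the source of the O(n) bound.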
