Z algorithm

This algorithm finds all occurrences of a pattern in a text in linear time. Let length of text be n and of pattern be m, then total time taken is O(m + n) with linear space complexity. Z algorithm works by maintaining an auxiliary array called the Z array. This Z array stores the length of the longest substring, starting from the current index, that also it’s prefix. 

What is Z Array? 

For a string str[0..n-1], Z array is of same length as string. An element Z[i] of Z array stores length of the longest substring starting from str[i] which is also a prefix of str[0..n-1]. The first entry of Z array is meaning less as complete string is always prefix of itself.

Example:

Index            0   1   2   3   4   5   6   7   8   9  10  11 
Text             a   a   b   c   a   a   b   x   a   a   a   z
Z values         X   1   0   0   3   1   0   0   2   2   1   0 

How to construct Z array?

A Simple Solution is to run two nested loops, the outer loop goes to every index and the inner loop finds length of the longest prefix that matches the substring starting at current index. The time complexity of this solution is O(n2).

We can construct Z array in linear time. The idea is to maintain an interval [L, R] which is the interval with max R
such that [L, R] is prefix substring (substring which is also a prefix. 

Steps for maintaining this interval are as follows – 

  1. If i > R then there is no prefix substring that starts before i and ends after i, so we reset L and R and compute new [L, R] by comparing str[0..] to str[i..] and get Z[i] (= R-L+1).
  2. If i <= R then let K = i-L,  now Z[i] >= min(Z[K], R-i+1)  because str[i..] matches with str[K..] for atleast R-i+1 characters (they are in[L, R] interval which we know is a prefix substring). 
    Now two sub cases arise:
    • If Z[K] < R-i+1  then there is no prefix substring starting at str[i] (otherwise Z[K] would be larger)  so  Z[i] = Z[K]and interval [L, R] remains same.
    • If Z[K] >= R-i+1 then it is possible to extend the [L, R] interval thus we will set L as i and start matching from str[R] onwards  and get new R then we will update interval [L, R] and calculate Z[i] (=R-L+1).

Construction of Z array

Below is the implementation of the Z algorithm:

C++




// A C++ program that implements Z algorithm for pattern
// searching
#include <iostream>
using namespace std;
  
void getZarr(string str, int Z[]);
  
// prints all occurrences of pattern in text using Z algo
void search(string text, string pattern)
{
    // Create concatenated string "P$T"
    string concat = pattern + "$" + text;
    int l = concat.length();
  
    // Construct Z array
    int Z[l];
    getZarr(concat, Z);
  
    // now looping through Z array for matching condition
    for (int i = 0; i < l; ++i) {
        // if Z[i] (matched region) is equal to pattern
        // length we got the pattern
        if (Z[i] == pattern.length())
            cout << "Pattern found at index "
                 << i - pattern.length() - 1 << endl;
    }
}
  
// Fills Z array for given string str[]
void getZarr(string str, int Z[])
{
    int n = str.length();
    int L, R, k;
  
    // [L, R] make a window which matches with prefix of s
    L = R = 0;
    for (int i = 1; i < n; ++i) {
        // if i>R nothing matches so we will calculate.
        // Z[i] using naive way.
        if (i > R) {
            L = R = i;
  
            // R-L = 0 in starting, so it will start
            // checking from 0'th index. For example,
            // for "ababab" and i = 1, the value of R
            // remains 0 and Z[i] becomes 0. For string
            // "aaaaaa" and i = 1, Z[i] and R become 5
            while (R < n && str[R - L] == str[R])
                R++;
            Z[i] = R - L;
            R--;
        }
        else {
            // k = i-L so k corresponds to number which
            // matches in [L, R] interval.
            k = i - L;
  
            // if Z[k] is less than remaining interval
            // then Z[i] will be equal to Z[k].
            // For example, str = "ababab", i = 3, R = 5
            // and L = 2
            if (Z[k] < R - i + 1)
                Z[i] = Z[k];
  
            // For example str = "aaaaaa" and i = 2, R is 5,
            // L is 0
            else {
                // else start from R and check manually
                L = i;
                while (R < n && str[R - L] == str[R])
                    R++;
                Z[i] = R - L;
                R--;
            }
        }
    }
}
  
// Driver program
int main()
{
    string text = "GEEKS FOR GEEKS";
    string pattern = "GEEK";
    search(text, pattern);
    return 0;
}


Java




// A Java program that implements Z algorithm for pattern
// searching
import java.io.*;
  
class GFG 
{
  
  // prints all occurrences of pattern in text using Z
  // algo
  static void search(String text, String pattern)
  {
  
    // Create concatenated string "P$T"
    String concat = pattern + "$" + text;
    int l = concat.length();
  
    // Construct Z array
    int[] Z = new int[l];
    getZarr(concat, Z);
  
    // now looping through Z array for matching
    // condition
    for (int i = 0; i < l; i++) {
      // if Z[i] (matched region) is equal to pattern
      // length we got the pattern
      if (Z[i] == pattern.length()) {
        System.out.println(
          "Pattern found at index "
          + (i - pattern.length() - 1));
      }
    }
  }
  
  // Fills Z array for given string str[]
  static void getZarr(String str, int[] Z)
  {
    int n = str.length();
    // [L, R] make a window which matches with prefix of
    // s
    int L = 0, R = 0, k;
  
    for (int i = 1; i < n; ++i) {
      // if i>R nothing matches so we will calculate.
      // Z[i] using naive way.
      if (i > R) {
        L = R = i;
        // R-L = 0 in starting, so it will start
        // checking from 0'th index. For example,
        // for "ababab" and i = 1, the value of R
        // remains 0 and Z[i] becomes 0. For string
        // "aaaaaa" and i = 1, Z[i] and R become 5
        while (R < n
               && str.charAt(R - L)
               == str.charAt(R)) {
          R++;
        }
        Z[i] = R - L;
        R--;
      }
      else {
        // k = i-L so k corresponds to number which
        // matches in [L, R] interval.
        k = i - L;
  
        // if Z[k] is less than remaining interval
        // then Z[i] will be equal to Z[k].
        // For example, str = "ababab", i = 3, R = 5
        // and L = 2
        if (Z[k] < R - i + 1)
          Z[i] = Z[k];
  
        // For example str = "aaaaaa" and i = 2, R
        // is 5, L is 0
        else {
          // else start from R and check manually
          L = i;
          while (R < n
                 && str.charAt(R - L)
                 == str.charAt(R)) {
            R++;
          }
          Z[i] = R - L;
          R--;
        }
      }
    }
  }
  
  public static void main(String[] args)
  {
    String text = "GEEKS FOR GEEKS";
    String pattern = "GEEK";
    search(text, pattern);
  }
}
  
// This code is contributed by lokeshmvs21.


Python3




# A Python program that implements Z algorithm for pattern
# searching
# Fills Z array for given string str[]
def getZarr(string, Z):
    n = len(string)
      
    # [L, R] make a window which matches with prefix of s
    L, R, k = 0, 0, 0
    Z[0] = n
  
    for i in range(1, n):
        
      # if i>R nothing matches so we will calculate.
        # Z[i] using naive way.
        if i > R:
            L, R = i, i
              
            # R-L = 0 in starting, so it will start
            # checking from 0'th index. For example,
            # for "ababab" and i = 1, the value of R
            # remains 0 and Z[i] becomes 0. For string
            # "aaaaaa" and i = 1, Z[i] and R become 5
            while R < n and string[R - L] == string[R]:
                R += 1
            Z[i] = R - L
            R -= 1
        else:
            
          # k = i-L so k corresponds to number which
            # matches in [L, R] interval.
            k = i - L
              
            # if Z[k] is less than remaining interval
            # then Z[i] will be equal to Z[k].
            # For example, str = "ababab", i = 3, R = 5
            # and L = 2
            if Z[k] < R - i + 1:
                Z[i] = Z[k]
                  
            # For example str = "aaaaaa" and i = 2, R is 5,
            # L is 0
            else:
                
              # else start from R and check manually
                L = i
                while R < n and string[R - L] == string[R]:
                    R += 1
                Z[i] = R - L
                R -= 1
                  
# prints all occurrences of pattern in text using Z algo
def search(text, pattern):
    
  # Create concatenated string "P$T"
    concat = pattern + "$" + text
    l = len(concat)
  
    # Construct Z array
    Z = [0] * l
    getZarr(concat, Z)
  
    # now looping through Z array for matching condition
    for i in range(l):
        
      # if Z[i] (matched region) is equal to pattern
        # length we got the pattern
        if Z[i] == len(pattern):
            print("Pattern found at index", i - len(pattern) - 1)
  
# Driver program
if __name__ == "__main__":
    text = "GEEKS FOR GEEKS"
    pattern = "GEEK"
    search(text, pattern)
      
# This code is contributed by akashish__


C#




using System;
using System.Linq;
  
public class GFG {
  
  // prints all occurrences of pattern in text using Z
  // algo
  static void search(string text, string pattern)
  {
    // Create concatenated string "P$T"
    string concat = pattern + "$" + text;
    int l = concat.Length;
  
    // Construct Z array
    int[] Z = new int[l];
    GetZarr(concat, Z);
  
    // now looping through Z array for matching
    // condition
    for (int i = 0; i < l; i++) {
      // if Z[i] (matched region) is equal to
      // pattern length we got the pattern
      if (Z[i] == pattern.Length) {
        Console.WriteLine(
          "Pattern found at index "
          + (i - pattern.Length - 1));
      }
    }
  }
  
  // Fills Z array for given string str[]
  static void GetZarr(string str, int[] Z)
  {
    int n = str.Length;
    // [L, R] make a window which matches with
    // prefix of
    // s
    int L = 0, R = 0, k;
  
    for (int i = 1; i < n; ++i) {
      // if i>R nothing matches so we will
      // calculate. Z[i] using naive way.
      if (i > R) {
        L = R = i;
        // R-L = 0 in starting, so it will start
        // checking from 0'th index. For
        // example, for "ababab" and i = 1, the
        // value of R remains 0 and Z[i] becomes
        // 0. For string "aaaaaa" and i = 1,
        // Z[i] and R become 5
        while (R < n && str[R - L] == str[R]) {
          R++;
        }
        Z[i] = R - L;
        R--;
      }
      else {
        // k = i-L so k corresponds to number
        // which matches in [L, R] interval.
        k = i - L;
  
        // if Z[k] is less than remaining
        // interval then Z[i] will be equal to
        // Z[k]. For example, str = "ababab", i
        // = 3, R = 5 and L = 2
        if (Z[k] < R - i + 1)
          Z[i] = Z[k];
  
        // For example str = "aaaaaa" and i = 2,
        // R is 5, L is 0
        else {
          // else start from R and check
          // manually
          L = i;
          while (R < n && str[R - L] == str[R]) {
            R++;
          }
          Z[i] = R - L;
          R--;
        }
      }
    }
  }
  
  static public void Main()
  {
    string text = "GEEKS FOR GEEKS";
    string pattern = "GEEK";
    search(text, pattern);
  }
}
// This code is contributed by akashish__


Javascript




function search(text, pattern) {
  // Create concatenated string "P$T"
  let concat = pattern + "$" + text;
  let l = concat.length;
  
  // Construct Z array
  let Z = [];
  getZarr(concat, Z);
  
  // now looping through Z array for matching condition
  for (let i = 0; i < l; i++) {
    // if Z[i] (matched region) is equal to pattern
    // length we got the pattern
    if (Z[i] == pattern.length) {
      console.log(`Pattern found at index ${i - pattern.length - 1}`);
    }
  }
}
  
// Fills Z array for given string str[]
function getZarr(str, Z) {
  let n = str.length;
  let L, R, k;
  
  // [L, R] make a window which matches with prefix of s
  L = R = 0;
  for (let i = 1; i < n; i++) {
    // if i>R nothing matches so we will calculate.
    // Z[i] using naive way.
    if (i > R) {
      L = R = i;
  
      // R-L = 0 in starting, so it will start
      // checking from 0'th index. For example,
      // for "ababab" and i = 1, the value of R
      // remains 0 and Z[i] becomes 0. For string
      // "aaaaaa" and i = 1, Z[i] and R become 5
      while (R < n && str[R - L] == str[R]) {
        R++;
      }
      Z[i] = R - L;
      R--;
    } else {
      // k = i-L so k corresponds to number which
      // matches in [L, R] interval.
      k = i - L;
  
      // if Z[k] is less than remaining interval
      // then Z[i] will be equal to Z[k].
      // For example, str = "ababab", i = 3, R = 5
      // and L = 2
      if (Z[k] < R - i + 1) {
        Z[i] = Z[k];
      }
  
      // For example str = "aaaaaa" and i = 2, R is 5,
      // L is 0
      else {
        // else start from R and check manually
        L = i;
        while (R < n && str[R - L] == str[R]) {
          R++;
        }
        Z[i] = R - L;
        R--;
      }
    }
  }
}
  
// Driver program
let text = "GEEKS FOR GEEKS";
let pattern = "GEEK";
search(text, pattern);
  
// This code is contributed by akashish__


Output

Pattern found at index 0
Pattern found at index 10

Time Complexity: O(m+n), where m is length of pattern and n is length of text.
Auxiliary Space: O(m+n)

Introduction to Pattern Searching – Data Structure and Algorithm Tutorial

Pattern searching is an algorithm that involves searching for patterns such as strings, words, images, etc.

We use certain algorithms to do the search process. The complexity of pattern searching varies from algorithm to algorithm. They are very useful when performing a search in a database. The Pattern Searching algorithm is useful for finding patterns in substrings of larger strings. This process can be accomplished using a variety of algorithms that we are going to discuss in this blog. 

Introduction to Pattern Searching – Data Structure and Algorithm Tutorial

Similar Reads

Features of Pattern Searching Algorithm:

Pattern searching algorithms should recognize familiar patterns quickly and accurately. Recognize and classify unfamiliar patterns. Identify patterns even when partly hidden. Recognize patterns quickly with ease, and with automaticity....

Naive Pattern Searching algorithm

Naive pattern searching is the simplest method among other pattern-searching algorithms. It checks for all characters of the main string to the pattern. This algorithm is helpful for smaller texts. It does not need any pre-processing phases. We can find the substring by checking once for the string. It also does not occupy extra space to perform the operation....

KMP algorithm

...

Rabin Karp algorithm:

...

Z algorithm:

...

Aho-Corasick algorithm:

...

Contact Us