CSES Solutions – String Matching
Given a string S and a pattern P, your task is to count the number of positions where the pattern occurs in the string.
Examples:
Input: S = “saippuakauppias”, P = “pp”
Output: 2
Explanation: “pp” appears 2 times in S.
Input: S = “aaaa”, P = “aa”
Output: 3
Explanation: “aa” appears 3 times in S, starting at indices 0, 1 and 2 (occurrences may overlap).
Approach: To solve the problem, follow the below idea:
To find all occurrences of a pattern in a text we can use various String-Matching algorithms. The Knuth-Morris-Pratt (KMP) algorithm is a suitable choice for this problem. KMP is an efficient string-matching algorithm that can find all occurrences of a pattern in a string in linear time.
Concatenate the Pattern and Text: The first step is to build the string P + # + S, where # is a separator character that occurs in neither the pattern nor the text. Because no prefix of the combined string that contains # can reappear later, the separator caps every prefix-function value at the pattern length, so a match can never spill across the boundary between pattern and text.
Compute the Prefix Function: The computePrefix function computes the prefix function of the concatenated string. For a position i, the prefix function value pi[i] is the length of the longest proper prefix of the substring ending at position i that is also a suffix of that substring (a proper prefix is any prefix shorter than the whole substring). This function is the core of the KMP algorithm.
Count the Occurrences: Once the prefix function is computed, iterate over the prefix array and count how many entries equal the pattern length. An entry equal to the pattern length at position i means the whole pattern is a suffix of the combined string up to i, i.e. an occurrence of the pattern ends at that position of the text. Overlapping occurrences are counted separately.
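To make the definition concrete, here is a deliberately slow sketch that evaluates the prefix function straight from its definition, by comparing prefixes and suffixes directly. The name naive_prefix_function is hypothetical and is not part of the implementations in this article; it is only meant for checking small examples by hand, not for the real problem sizes.

```python
def naive_prefix_function(s):
    """Prefix function computed directly from the definition (cubic time)."""
    n = len(s)
    pi = [0] * n
    for i in range(n):
        sub = s[: i + 1]  # substring ending at position i
        # Try proper prefix lengths from longest to shortest.
        for k in range(i, 0, -1):
            if sub[:k] == sub[-k:]:
                pi[i] = k
                break
    return pi


print(naive_prefix_function("aabaaab"))  # [0, 1, 0, 1, 2, 2, 3]
print(naive_prefix_function("aa#aaaa"))  # [0, 1, 0, 1, 2, 2, 2]
```

In the second call the pattern "aa" is joined to the text "aaaa", and the value 2 (the pattern length) appears three times, once for each occurrence.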
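The same scan can also report where each occurrence starts, which helps when verifying the count by hand. The following is a minimal sketch, assuming a non-empty pattern and a separator # that appears in neither string; occurrence_starts is a hypothetical helper name introduced here for illustration.

```python
def prefix_function(s):
    """Linear-time prefix function, same logic as the implementations in this article."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def occurrence_starts(text, pattern):
    """Return the 0-based start index in text of every occurrence of pattern."""
    m = len(pattern)
    pi = prefix_function(pattern + "#" + text)
    # pi[i] == m means an occurrence ends at combined index i.
    # The text begins at combined index m + 1, so the occurrence
    # ends at text index i - (m + 1) and starts at i - 2 * m.
    return [i - 2 * m for i in range(m + 1, len(pi)) if pi[i] == m]


print(occurrence_starts("saippuakauppias", "pp"))  # [3, 10]
print(occurrence_starts("aaaa", "aa"))  # [0, 1, 2]
```

The length of each returned list matches the expected outputs 2 and 3 from the examples.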
Step-by-step algorithm:
- Concatenate the pattern, a separator character (#) and the text into one string, combined = P + # + S.
- Declare the prefix function array pi[] with one entry per character of combined, all initialised to 0.
- For each position i from 1 onward, start from j = pi[i - 1]. While j > 0 and combined[i] differs from combined[j], fall back to j = pi[j - 1]. If combined[i] equals combined[j], increment j. Store j in pi[i].
- Initialise the count of occurrences to 0 and iterate over pi[]. Each time an entry equals the length of the pattern, increment the count.
- After the entire array has been scanned, print the count of occurrences.
Below is the implementation of the algorithm:
C++:
#include <bits/stdc++.h>
using namespace std;

// Function to compute the prefix function of a string for
// KMP algorithm
vector<int> computePrefix(const string& S)
{
    int N = S.length();
    vector<int> pi(N);
    for (int i = 1; i < N; i++) {
        int j = pi[i - 1];
        // Find the longest proper prefix which is also a
        // suffix
        while (j > 0 && S[i] != S[j])
            j = pi[j - 1];
        if (S[i] == S[j])
            j++;
        pi[i] = j;
    }
    return pi;
}

// Function to count the number of occurrences of a pattern
// in a text using KMP algorithm
int countOccurrences(const string& S, const string& P)
{
    // Concatenate pattern and text with a special character
    // in between
    string combined = P + "#" + S;
    // Compute the prefix function
    vector<int> prefixArray = computePrefix(combined);
    int count = 0;
    // Pattern length as int, to avoid comparing signed and
    // unsigned values below
    int M = P.size();
    // Count the number of times the pattern appears in the
    // text
    for (size_t i = 0; i < prefixArray.size(); i++) {
        if (prefixArray[i] == M)
            count++;
    }
    return count;
}

// Driver code
int main()
{
    string S = "saippuakauppias";
    string P = "pp";
    cout << countOccurrences(S, P) << "\n";
    return 0;
}
Java:
import java.util.*;

public class KMPAlgorithm {
    // Function to compute the prefix function of a string for KMP algorithm
    static List<Integer> computePrefix(String S) {
        int N = S.length();
        List<Integer> pi = new ArrayList<>(Collections.nCopies(N, 0));
        for (int i = 1; i < N; i++) {
            int j = pi.get(i - 1);
            // Find the longest proper prefix which is also a suffix
            while (j > 0 && S.charAt(i) != S.charAt(j))
                j = pi.get(j - 1);
            if (S.charAt(i) == S.charAt(j))
                j++;
            pi.set(i, j);
        }
        return pi;
    }

    // Function to count the number of occurrences of a pattern in a text using KMP algorithm
    static int countOccurrences(String S, String P) {
        // Concatenate pattern and text with a special character in between
        String combined = P + "#" + S;
        // Compute the prefix function
        List<Integer> prefixArray = computePrefix(combined);
        int count = 0;
        // Count the number of times the pattern appears in the text
        for (int i = 0; i < prefixArray.size(); i++) {
            if (prefixArray.get(i) == P.length())
                count++;
        }
        return count;
    }

    // Driver code
    public static void main(String[] args) {
        String S = "saippuakauppias";
        String P = "pp";
        System.out.println(countOccurrences(S, P));
    }
}
Python:
# Function to compute the prefix function of a string for
# KMP algorithm
def compute_prefix(s):
    n = len(s)
    pi = [0] * n
    j = 0
    for i in range(1, n):
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


# Function to count the number of occurrences of a pattern
# in a text using KMP algorithm
def count_occurrences(s, p):
    # Concatenate pattern and text with a special character
    # in between
    combined = p + "#" + s
    # Compute the prefix function
    prefix_array = compute_prefix(combined)
    count = 0
    # Count the number of times the pattern appears in the
    # text
    for pi in prefix_array:
        if pi == len(p):
            count += 1
    return count


# Driver code
if __name__ == "__main__":
    S = "saippuakauppias"
    P = "pp"
    print(count_occurrences(S, P))
C#:
using System;
using System.Collections.Generic;

public class KMPAlgorithm
{
    // Function to compute the prefix function of a string for KMP algorithm
    static List<int> ComputePrefix(string S)
    {
        int N = S.Length;
        List<int> pi = new List<int>(new int[N]);
        for (int i = 1; i < N; i++)
        {
            int j = pi[i - 1];
            // Find the longest proper prefix which is also a suffix
            while (j > 0 && S[i] != S[j])
                j = pi[j - 1];
            if (S[i] == S[j])
                j++;
            pi[i] = j;
        }
        return pi;
    }

    // Function to count the number of occurrences of a pattern in a text using KMP algorithm
    static int CountOccurrences(string S, string P)
    {
        // Concatenate pattern and text with a special character in between
        string combined = P + "#" + S;
        // Compute the prefix function
        List<int> prefixArray = ComputePrefix(combined);
        int count = 0;
        // Count the number of times the pattern appears in the text
        for (int i = 0; i < prefixArray.Count; i++)
        {
            if (prefixArray[i] == P.Length)
                count++;
        }
        return count;
    }

    // Driver code
    public static void Main(string[] args)
    {
        string S = "saippuakauppias";
        string P = "pp";
        Console.WriteLine(CountOccurrences(S, P));
    }
}
JavaScript:
// Function to compute the prefix function of a string for
// KMP algorithm
function computePrefix(S) {
    let N = S.length;
    let pi = new Array(N).fill(0);
    for (let i = 1; i < N; i++) {
        let j = pi[i - 1];
        // Find the longest proper prefix which is also a
        // suffix
        while (j > 0 && S[i] !== S[j])
            j = pi[j - 1];
        if (S[i] === S[j])
            j++;
        pi[i] = j;
    }
    return pi;
}

// Function to count the number of occurrences of a pattern
// in a text using KMP algorithm
function countOccurrences(S, P) {
    // Concatenate pattern and text with a special character
    // in between
    let combined = P + "#" + S;
    // Compute the prefix function
    let prefixArray = computePrefix(combined);
    let count = 0;
    // Count the number of times the pattern appears in the
    // text
    for (let i = 0; i < prefixArray.length; i++) {
        if (prefixArray[i] === P.length)
            count++;
    }
    return count;
}

// Driver code
let S = "saippuakauppias";
let P = "pp";
console.log(countOccurrences(S, P));
Output
2
Time Complexity: O(N+M) where N is the length of the text and M is the length of the pattern to be found.
Auxiliary Space: O(N + M), for the concatenated string and its prefix array.
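The concatenated string and its prefix array both grow with the length of the text. If that matters, one possible variant computes the prefix function of the pattern alone and then scans the text character by character, so the extra memory depends only on M. This is a sketch of the classic KMP scan rather than the concatenation approach used above; count_occurrences_streaming is a hypothetical name, and returning 0 for an empty pattern is an assumption made here.

```python
def prefix_function(s):
    """Linear-time prefix function of s."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def count_occurrences_streaming(text, pattern):
    """Count occurrences of pattern in text using O(M) auxiliary space."""
    m = len(pattern)
    if m == 0:
        return 0  # assumption: an empty pattern is treated as no match
    pi = prefix_function(pattern)  # only the pattern is preprocessed
    count = 0
    j = 0  # number of pattern characters currently matched
    for ch in text:
        while j > 0 and ch != pattern[j]:
            j = pi[j - 1]
        if ch == pattern[j]:
            j += 1
        if j == m:
            count += 1
            j = pi[j - 1]  # fall back so overlapping matches are found
    return count


print(count_occurrences_streaming("saippuakauppias", "pp"))  # 2
print(count_occurrences_streaming("aaaa", "aa"))  # 3
```

The running time stays O(N + M); the trade-off is slightly more bookkeeping in the scan in exchange for not copying the text.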