Java Program to Implement Suffix Array

A Suffix Array is a fundamental data structure used in string processing and computational biology. It represents an array of all suffixes of a given string, sorted lexicographically. Suffix arrays find applications in pattern matching, substring search, text compression, and bioinformatics, among others.

Organization of Suffix Array

A suffix array is essentially an array of integers, each representing the starting index of a suffix in the original string. The array is sorted in lexicographical order, so it provides an efficient way to search for substrings within the string. Here’s an illustration:

Let’s take an example string “banana”:


The corresponding suffix array for “banana” would be: [ 5 , 3 , 1 , 0 , 4 , 2 ]

Implementation of Suffix Array

The implementation of a suffix array can be done efficiently using various algorithms such as the Manber-Myers algorithm, the Skew algorithm, or the SA-IS algorithm. Here’s a simplified explanation of how it can be implemented:

  • Construct Suffixes : Generate all suffixes of the given string.
  • Sort Suffixes : Sort the generated suffixes lexicographically.
  • Construct Suffix Array : Create an array that stores the indices of the sorted suffixes.

Basic Operations and Complexity

Operation

Description

Complexity

Search

Suffix arrays enable efficient substring search using techniques like binary search.

Time complexity: O(m * log n), where m is the length of the query string and n is the length of the original string.

Space Complexity: O(n), where n is the length of the original string.

Program to Implement Suffix Array

Below is the Program to Implement Suffix Array:

Java
// Java Program to Implement Suffix Array
import java.util.*;

// Driver Class
class SuffixArray {
      public static int[] constructSuffixArray(String text) {
        // Get the length of the input text
        int n = text.length();
    
        // Create an array of Integer objects to store indices
        Integer[] helper = new Integer[n];
    
        // Initialize the array with indices from 0 to n-1
        for (int i = 0; i < n; i++)
            helper[i] = i;

        // Sort the array of indices based on the suffixes
        Arrays.sort(sa, (a, b) -> text.substring(a)
                    .compareTo(text.substring(b)));
    
        // Convert the Integer array to an int array
        int[] suffixArray = new int[n];
        for (int i = 0; i < n; i++)
              suffixArray[i] = helper[i];

        return suffixArray;
    }

      public static void main(String[] args) {
        // Example input text
        String text = "banana$";

        // Construct the suffix array
        int[] suffixArray = constructSuffixArray(text);

        // Print the suffix array
        System.out.println("Suffix Array for \"" + text
                           + "\": " + Arrays.toString(suffixArray));
    }
}

Output


Applications of Suffix Array

  • Pattern Matching: Efficiently find occurrences of substrings within a text.
  • Substring Search: Quickly locate substrings within large texts.
  • Bioinformatics: Analyze DNA sequences for patterns and mutations.
  • Text Compression: Used in algorithms like Burrows-Wheeler Transform.