Page 32 - 2024F
P. 32

UEC Int’l Mini-Conference No.53                                                               25







            common prefix of the strings str(v 1 ) and str(v 2 ).  have that S[a + l − 1] ̸= S[c + l − 1] for any
            For each suffix S[i..N] of S, there is a node v   l > L, so for any assignment b := a + l − 1 and
            such that S[i..N] = str(v) and by the previous    d := c + l − 1 the property S[a..b] = S[c..d]
            property, the node v is unique, thus we denote    does not hold. For other values of b and d, the
            it by ℓ(i). We term ℓ(i) the associated node of   strings S[a..b] and S[c..d] have different length.
            the suffix S[i..N]. If S is of the form S $ where  Thus, the number of tuples (a, b, c, d) with the
                                                 ′
            $ is a character that does not appear anywhere    desired property is |{1..L}|, which, in turn, is
            in S , then the associated node ℓ(i) of a suffix  equal to |lcp(S[a..N], S[c..N])|.
                ′
            S[i..N] of S is a leaf. We denote the root of the
            tree by v r .                                       Let T be the suffix tree built on the string S$
              We denote by Tree(v) the set of all nodes       of length N + 1, where the character $ does not
            in the subtree of v, i.e.  v and all its de-      appear in the string S.
            scendant nodes.   By the property of the lcp      Then pairs of suffixes with a certain longest com-
            function we have that lcp(S[i..N], S[j..N]) =     mon prefix can be found using Remark 3:
            str(lca(ℓ(i), ℓ(j))).
              Assuming the alphabet of a string is of a       Remark 3      For two suffixes S[i..N] and
            constant size, the suffix tree can be constructed  S[j..N], we have that lca(ℓ(i), ℓ(j)) = u is equiv-
            in O(N) time using Ukkonen’s algorithm [4].       alent to the existence of two nodes v 1 and v 2
            Given a suffix tree of a string, Gusfield [2] shows  children of the node u such that ℓ(i) ∈ Tree(v 1 )
            how to compute lcp(S[i..N], S[j..N]) in O(1)      and ℓ(j) ∈ Tree(v 2 ).
            time.                                             Since we have that lcp(str(ℓ(i)), str(ℓ(j))) =
                                                              str(lca(ℓ(i), ℓ(j))), any such pair i, j indicates
              We define an integer p ≥ 1 to be a period       suffixes S[i..N], S[j..N] with the same longest
            of string S if S[i] = S[i + p] for all 1 ≤ i ≤    common prefix.
            |S|−p. We also call a triple r = (i, j, p) a run of
            string S if the smallest period p of S[i..j] both  Algorithm 1 The Algorithm 1 is comprised
            satisfies |S[i..j]| ≥ 2p and the periodicity cannot  of the procedure iterate() that is ran with the
            be extended to the left or right, i.e. either i = 1  root node v r of T as the argument, traversing the
            or S[i − 1] ̸= S[i + p − 1], and either j = N or  suffix tree in a depth-first manner and counting
            S[j + 1] ̸= S[j + p + 1]. The rational number     excluding the condition b < c:
            j−i+1  is called the exponent of r. Finally, let
              p
            Runs(S) denote the set of runs of string S.        Procedure iterate(v)
                                                                Data: ans 1 ← 0;
                                                                Sf [u] - the number of suffixes in the
            3    Methodology
                                                                subtree of the node u
                                                              1 sf ← 0;
            Problem 1     Let S be a given string of length
            N. Count how many tuples (a, b, c, d) exist such  2 cnt ← 0;
            that 1 ≤ a ≤ b < c ≤ d ≤ N and S[a..b] =          3 if v is leaf then
            S[c..d]. We denote the set of all such tuples by  4    sf ← 1;
                                                              5 else
            Q.
                                                              6    for ch ← child of n do
                                                              7        iterate(ch);
            Lemma 2     For fixed values a and c, the num-
                                                              8        cnt ← cnt + Sf [ch] · sf ;
            ber of tuples (a, b, c, d) such that S[a..b] = S[c..d]
                                                              9        sf ← sf + Sf [ch];
            is |lcp(S[a..N], S[c..N])|.
                                                             10    end
            Proof.    Let L     =   |lcp(S[a..N], S[c..N])|.
            For   any   l   ∈    {1..L}  we   have  that     11 end
            S[a..a + l − 1] = S[c..c + l − 1], thus as-      12 ans 1 ← ans 1 + cnt · |str(v)|;
            signing b := a + l − 1 and d := c + l − 1        13 Sf [v] ← sf ;
            the property S[a..b] = S[c..d] holds. Also we
   27   28   29   30   31   32   33   34   35   36   37