Theoretical Mathematics & Applications

Distribution of Frequencies of the Word Occurrence in a Random String in the Vicinity of the Word’s Critical Length

  • Abstract 
     
    This work explores the probability distribution of the number of occurrences of a given word in a random string. A recurrence formula for the distribution function, which depends on the word length, the string length, and the positions of the word's self-overlaps, is derived from set-theoretic properties and presented in a previously unknown form. Asymptotic formulas are obtained for the minimum and maximum probabilities that the word occurs at least once in a random string. Critical parameters of the distribution are determined: the critical word length, at which the probability of at least one occurrence is close to 0.5, and the critical interval of lengths, over which the probability of at least one occurrence falls from a value close to one to a value close to zero. It is shown that, for long strings, the width of the critical interval depends on neither the word length nor the string length, while the critical word length depends linearly on the logarithm of the string length. Examples tabulate the frequency distribution for different overlap cases and different string lengths. The attached C-language software application tabulates the frequency distribution function for any word and string lengths.
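
    The dependence of the distribution on the word's self-overlaps can be illustrated with a short tabulation sketch. The Python code below is not the attached C application and does not use the paper's recurrence; it is a standard dynamic program over a Knuth-Morris-Pratt matching automaton, and it assumes a uniformly random string with overlapping occurrences counted separately. The names kmp_failure and occurrence_distribution are hypothetical and introduced here only for illustration.

```python
from fractions import Fraction


def kmp_failure(word):
    """fail[i] = length of the longest proper border of word[:i + 1]."""
    fail = [0] * len(word)
    k = 0
    for i in range(1, len(word)):
        while k > 0 and word[i] != word[k]:
            k = fail[k - 1]
        if word[i] == word[k]:
            k += 1
        fail[i] = k
    return fail


def occurrence_distribution(word, alphabet, n):
    """Return [P(exactly k occurrences)] for k = 0 .. max(n - len(word) + 1, 0).

    Assumptions (not taken from the paper): the string has length n, its
    characters are independent and uniform over `alphabet` (distinct symbols),
    `word` is non-empty, and overlapping occurrences are all counted.
    """
    m = len(word)
    fail = kmp_failure(word)
    p = Fraction(1, len(alphabet))

    def step(state, ch):
        # state = number of word characters currently matched (0 .. m - 1)
        while state > 0 and word[state] != ch:
            state = fail[state - 1]
        if word[state] == ch:
            state += 1
        return state

    # dist maps (automaton state, occurrences so far) -> exact probability
    dist = {(0, 0): Fraction(1)}
    for _ in range(n):
        nxt = {}
        for (state, count), prob in dist.items():
            for ch in alphabet:
                s = step(state, ch)
                c = count
                if s == m:  # a full occurrence ends at this position
                    c += 1
                    s = fail[m - 1]  # keep the longest self-overlap
                nxt[(s, c)] = nxt.get((s, c), 0) + prob * p
        dist = nxt

    result = [Fraction(0)] * (max(n - m + 1, 0) + 1)
    for (_, count), prob in dist.items():
        result[count] += prob
    return result


if __name__ == "__main__":
    # Same word length, different overlap structure, binary strings of length 3.
    print(occurrence_distribution("aa", "ab", 3))  # probabilities 5/8, 1/4, 1/8
    print(occurrence_distribution("ab", "ab", 3))  # probabilities 1/2, 1/2, 0
```

    In this small case the self-overlapping word "aa" occurs at least once with probability 3/8, while "ab", which has no self-overlap, does so with probability 1/2. Calling the same function with a fixed n and increasing word lengths gives a rough view of how the probability of at least one occurrence falls off near the critical length.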
     
     
    Mathematics Subject Classification: 60C05, 05-04, 60-04.
    Keywords: Combinatorics on words, Overlaps, Probability extremes.