Stringology


In formal languages, which are used in mathematical logic and theoretical computer science, stringology deals with algorithms and data structures used for string processing. The name was coined in 1984 by computer scientist Zvi Galil.^[1]

Formal theory

Topology

Strings admit the following interpretation as nodes on a graph:

Fixed-length strings can be viewed as nodes on a hypercube.
Variable-length strings (of finite length) can be viewed as nodes of the k-ary tree, where k is the number of symbols in the alphabet Σ.
Infinite strings can be viewed as infinite paths on the k-ary tree.
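
The graph views above can be sketched in a few lines of Python. This is an illustrative sketch rather than a standard API: the function names hypercube_neighbors, tree_parent and tree_children are hypothetical, and the default alphabet is assumed to be binary so that the fixed-length case is an ordinary hypercube.

```python
def hypercube_neighbors(s, alphabet="01"):
    """Return the strings that differ from s in exactly one position.

    For a binary alphabet these are the neighbours of s on the
    len(s)-dimensional hypercube (assumption: every symbol of s
    belongs to the alphabet).
    """
    neighbors = []
    for i, ch in enumerate(s):
        for other in alphabet:
            if other != ch:
                neighbors.append(s[:i] + other + s[i + 1:])
    return neighbors


def tree_parent(s):
    """Parent of a finite string in the k-ary tree: drop the last symbol.

    The empty string is the root and has no parent.
    """
    return s[:-1] if s else None


def tree_children(s, alphabet="01"):
    """Children of s in the k-ary tree, one per symbol of the alphabet."""
    return [s + ch for ch in alphabet]
```

In this picture an infinite string corresponds to the path through the root, its first symbol, its first two symbols, and so on; each step of the path is one call to tree_children followed by a choice of child.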

The natural topology on the set of fixed-length strings or variable-length strings is the discrete topology, but the natural topology on the set of infinite strings is the limit topology, which views the set of infinite strings as the inverse limit of the sets of finite strings. The same construction is used for the p-adic numbers and for some constructions of the Cantor set, and it yields the same topology.
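
One common way to make the limit topology concrete is the prefix metric, under which two infinite strings are close when they share a long common prefix. The following is a minimal sketch under stated assumptions: an infinite string is represented by a hypothetical Python callable mapping an index to a symbol, and the comparison stops at a finite max_depth so that the function terminates (strings that agree up to that depth are reported as distance 0, which is only an approximation).

```python
def common_prefix_length(x, y, max_depth=64):
    """Length of the common prefix of x and y, capped at max_depth.

    x and y are callables: x(i) is the i-th symbol of the string.
    """
    n = 0
    while n < max_depth and x(n) == y(n):
        n += 1
    return n


def prefix_distance(x, y, max_depth=64):
    """Prefix metric d(x, y) = 2**(-n), n = common prefix length.

    Assumption: strings indistinguishable within max_depth symbols
    are treated as equal and given distance 0.
    """
    n = common_prefix_length(x, y, max_depth)
    return 0.0 if n == max_depth else 2.0 ** -n


# Example strings (hypothetical): all zeros, all ones, and a string
# that agrees with the all-zeros string on its first three symbols.
zeros = lambda i: "0"
ones = lambda i: "1"
late_split = lambda i: "0" if i < 3 else "1"
```

With these definitions, prefix_distance(zeros, ones) is 1.0 and prefix_distance(zeros, late_split) is 0.125, reflecting that the second pair agrees on a longer prefix.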

String processing algorithms

There are many algorithms for processing strings, each with various trade-offs. Some categories of algorithms include:

String searching algorithms for finding a given substring or pattern
String manipulation algorithms
Sorting algorithms
Regular expression algorithms
Parsing a string
Sequence mining

Advanced string algorithms often rely on specialized data structures and computational models, among them suffix trees and finite-state machines.
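
As one illustration of the string searching category, the sketch below implements the Knuth-Morris-Pratt algorithm, which can be read as running a small finite-state machine over the text. The algorithm is not named in the text above and is chosen here only as a representative example; the function names failure_table and find_all are hypothetical.

```python
def failure_table(pattern):
    """fail[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find_all(text, pattern):
    """Return the start index of every occurrence of pattern in text,
    including overlapping occurrences."""
    if not pattern:
        # Convention assumed here: the empty pattern matches everywhere.
        return list(range(len(text) + 1))
    fail = failure_table(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far (the "state")
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]  # fall back to a shorter matched prefix
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return matches
```

For example, find_all("abababa", "aba") returns [0, 2, 4]. The text is scanned once, and the failure table avoids re-examining characters that are already known to match.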

Character string functions

String functions create, modify, or query strings: some change or edit the contents of a string, while others return information about it. They are usually provided by a programming language or its standard library.

The most basic example of a string function is the length(string) function, which returns the length of a string (not counting any terminator characters or any of the string's internal structural information) and does not modify the string. For example, length("hello world") returns 11.
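
A minimal Python sketch of the idea follows. Python's built-in len plays the role of length(string); the helper terminated_length is hypothetical and only illustrates what "not counting any terminator characters" means for a NUL-terminated byte buffer.

```python
def terminated_length(buffer):
    """Count the bytes before the first NUL (0) byte in buffer.

    The buffer itself is not modified. If there is no terminator,
    the whole buffer is counted (an assumption of this sketch).
    """
    n = 0
    while n < len(buffer) and buffer[n] != 0:
        n += 1
    return n


greeting = "hello world"
greeting_length = len(greeting)  # 11

# The same text stored with a trailing terminator byte.
raw = b"hello world\x00"
raw_length = terminated_length(raw)  # 11; the terminator is not counted
```

Both calls return 11 and leave their argument unchanged.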

Many string functions exist in several languages with similar or identical syntax and parameters. For example, in many languages the length function is written len(string). Programmers should nevertheless keep in mind that a string function in one language may behave differently in another, or may have a different name, parameters, syntax, or results.
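
One way such differences show up is in what a length function counts. The sketch below, written in Python, reports three plausible notions of length for the same string: Unicode code points (what Python's len returns for str), UTF-16 code units, and UTF-8 bytes. The helper name length_report is hypothetical, and which notion a given language uses is an assumption to check against that language's documentation.

```python
def length_report(s):
    """Return three different 'lengths' of the same string."""
    return {
        "code_points": len(s),
        "utf16_code_units": len(s.encode("utf-16-le")) // 2,
        "utf8_bytes": len(s.encode("utf-8")),
    }


ascii_report = length_report("hello world")
accent_report = length_report("caf\u00e9")      # "café"
emoji_report = length_report("\U0001F600")      # one emoji character
```

For plain ASCII text all three counts agree (11 for "hello world"), but for "café" the UTF-8 byte count is 5 rather than 4, and for the single emoji the counts are 1, 2 and 4 respectively.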

References

[1] http://www.stringology.org/

This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.