From Wikipedia, the free encyclopedia
"String functions" redirects here. For string functions in formal language theory, see String operations.
String functions are used in computer programming languages to manipulate a string or query information about a string (some do both).
Most computer programming languages that have a string datatype have some string functions, although each language may also offer other, lower-level ways to handle strings directly. In object-oriented languages, string functions are often implemented as properties and methods of string objects. In both Prolog and Erlang, a string is represented as a list (of character codes), so all list-manipulation procedures are applicable, though Erlang also implements a set of such procedures that are string-specific.
The most basic example of a string function is the length(string) function. This function returns the length of a string.
For example, length("hello world") would return 11.
Other languages may have string functions with similar or identical syntax, parameters, or results. For example, in many languages the length function is written len(string). The list of common functions below aims to help limit this confusion.
Common String Functions (multi language reference)
The following is a list of common string functions found in many languages, together with the equivalent functions used by other languages, to help programmers find the equivalent function in a given language. Note that string concatenation and regular expressions are handled in separate pages.
CharAt
Definition |
charAt(string,integer) returns character. |
Description |
Returns character at index in the string. |
Equivalent |
See substring of length 1 character. |
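As an illustration of the equivalence noted above, the following is a minimal Python sketch comparing direct indexing with a substring of length 1. The variable names are chosen here for illustration only, and 0-based indexes are assumed.

```python
text = "hello"

by_index = text[1]      # direct indexing
by_slice = text[1:2]    # substring of length 1 starting at index 1

print(by_index, by_slice)  # e e

# The two forms differ past the end of the string:
past_end_slice = text[10:11]  # slicing clamps to the string and yields ""
try:
    text[10]
    index_error_raised = False
except IndexError:
    index_error_raised = True  # indexing raises instead of returning ""
```

Whether an out-of-range position yields an empty string or an error varies between languages, so the behaviour shown here should not be assumed elsewhere.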
Compare (integer result)
Definition |
compare(string1,string2) returns integer. |
Description |
Compares two strings to each other. If they are equivalent, zero is returned. Otherwise, most of these routines return a positive or negative result according to whether string1 is lexicographically greater than or less than string2, respectively. The exceptions are the Scheme and REXX routines, which return the index of the first mismatch. |
Format |
Languages |
cmp(string1, string2) |
Python (2.x only; cmp was removed in Python 3) |
strcmp(string1, string2) |
C, PHP |
StrComp(string1, string2) |
VB |
string1 cmp string2 |
Perl |
string1 <=> string2 |
Ruby |
string1.compare(string2) |
C++ |
compare(string1, string2) |
REXX |
CompareStr(string1, string2) |
Pascal, Object Pascal (Delphi) |
string1.compareTo(string2) |
Java |
string1.CompareTo(string2) |
VB .NET, C# |
(string-compare string1 string2 p< p= p>) |
Scheme |
compare string1 string2 |
OCaml |
compare string1 string2 |
Haskell (returns LT, EQ, or GT) |
[string]::Compare(string1, string2) |
Windows PowerShell |
# Example in Python
cmp("hello", "world") # returns: -1
/** Example in REXX */
compare("hello", "world") /* returns index of mismatch: 1 */
; Example in Scheme
(use-modules (srfi srfi-13))
; returns index of mismatch: 0
(string-compare "hello" "world" values values values)
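Python 3 no longer provides a built-in cmp, so a three-way comparison has to be built from the relational operators. The following is a minimal sketch; the function name compare is chosen here to match the definition above and is not a standard library function.

```python
def compare(string1: str, string2: str) -> int:
    """Return -1, 0, or 1 as string1 sorts before, equal to, or after string2."""
    # Booleans are integers in Python, so this difference is -1, 0, or 1.
    return (string1 > string2) - (string1 < string2)


print(compare("hello", "world"))  # -1
print(compare("hello", "hello"))  # 0
print(compare("world", "hello"))  # 1
```

The ordering is by Unicode code point, so it is lexicographic only in that sense and does not follow any locale-specific collation.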
Compare (integer result, fast/non-human ordering)
Definition |
compare(string1,string2) returns integer. |
Description |
Compares two strings to each other. If they are equivalent, zero is returned. Otherwise, most of these routines return a positive or negative result according to whether string1 is greater than or less than string2, respectively. Unlike #Compare (integer result), no ordering guarantees are given, which leaves the implementation free to use whatever ordering is fastest. A common implementation orders by length and then by byte value, which can be significantly faster but is much less useful to humans. |
Format |
Languages |
if (!(ret = (string1_len - string2_len))) ret = memcmp(string1, string2, string1_len) |
C |
ustr_cmp_fast(string1, string2) |
Ustr string library |
(length(string1) <=> length(string2) || string1 cmp string2) |
Perl |
cmp(len(string1), len(string2)) or cmp(string1, string2) |
Python |
(Length(string1) <> Length(string2)) or CompareStr(string1, string2) |
Pascal, Object Pascal (Delphi) |
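The length-then-bytes ordering described above can be sketched in Python 3 as follows. This is only an illustration of the idea: compare_fast is a hypothetical name, and byte strings are assumed so that the second step really is a byte-value comparison.

```python
from functools import cmp_to_key


def compare_fast(string1: bytes, string2: bytes) -> int:
    """Order by length first, then by byte value (not a human-friendly order)."""
    by_length = len(string1) - len(string2)
    if by_length != 0:
        return by_length  # lengths differ, so the contents are never examined
    return (string1 > string2) - (string1 < string2)


# A shorter string sorts first even if it would come later alphabetically.
print(compare_fast(b"zz", b"aaa") < 0)  # True

words = [b"pear", b"fig", b"apple"]
print(sorted(words, key=cmp_to_key(compare_fast)))  # [b'fig', b'pear', b'apple']
```

Any speed benefit comes from skipping the content comparison when the lengths differ; in Python this mainly demonstrates the ordering rather than a measurable gain.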
Compare (relational operator-based, Boolean result)
Definition |
string1 op string2 OR (compare string1 string2) returns Boolean. |
Description |
Lexicographically compares two strings using a relational operator or function. Boolean result returned. |
Format |
Languages |
string1 op string2, where op can be any of =, <>, <, >, <= and >= |
Pascal, Object Pascal (Delphi), OCaml, VB .NET |
(stringX? string1 string2), where X can be any of =, -ci=, <>, -ci<>, <, -ci<, >, -ci>, <=, -ci<=, >= and -ci>= (operators starting with '-ci' are case-insensitive) |
Scheme |
(stringX string1 string2), where X can be any of =, -equal, /=, -not-equal, <, -lessp, >, -greaterp, <=, -not-greaterp, >= and -not-lessp (the verbal operators are case-insensitive) |
Common Lisp |
string1 op string2, where op can be any of =, \=, <, >, <= and >= |
REXX |
string1 op string2, where op can be any of =, /=, <, >, <= and >= |
Ada |
string1 op string2, where op can be any of ==, /=, <, >, =< and >= |
Erlang |
string1 op string2, where op can be any of ==, /=, <, >, <= and >= |
Haskell |
string1 op string2, where op can be any of eq, ne, lt, gt, le and ge |
Perl |
string1 op string2, where op can be any of ==, !=, <, >, <= and >= |
C++ (std::string only), C#, JavaScript, Python |
string1 op string2, where op can be any of -eq, -ceq, -ne, -cne, -lt, -clt, -gt, -cgt, -le, -cle, -ge, and -cge (operators starting with 'c' are case-sensitive) |
Windows PowerShell |
% Example in Erlang
"hello" > "world". % returns: false
Concatenation
Main article: Concatenation
Definition |
concatenate(string1,string2) returns string. |
Description |
Concatenates (joins) two strings, returning the combined string. Note that in languages with mutable strings, such as C, the second string is actually appended to the first string, and the mutated first string is returned. |
Format |
Languages |
string1 & string2 |
Ada, VB, VB .NET |
strcat(string1, string2) |
C, C++ |
string1 . string2 |
Perl, PHP |
string1 + string2 |
C++ (std::string only), C#, Pascal, Object Pascal (Delphi), Java, JavaScript, Windows PowerShell, Python |
(string-append string1 string2) |
Scheme |
(concatenate 'string string1 string2) |
Common Lisp |
string1 || string2 |
REXX |
string1 // string2 |
Fortran |
string1 ++ string2 |
Erlang |
string1 ^ string2 |
OCaml |
string1 ++ string2 |
Haskell |
Equality
Tests if two strings are equal. See also #Compare (integer result) and #Compare (relational operator-based, Boolean result). Note that checking equality via a generic Compare with an integer result is not only confusing for the programmer but is often a significantly more expensive operation; this is especially true when using "C-strings".
Format |
Languages |
string1 == string2 |
Python, C++ (std::string only), C#, JavaScript, PHP, Ruby, Erlang, Haskell |
strcmp(string1, string2) == 0 |
C |
(string=? string1 string2) |
Scheme |
(string= string1 string2) |
Common Lisp |
string1 = string2 |
Ada, Object Pascal (Delphi), OCaml, Pascal, REXX, VB, VB .NET |
test string1 = string2, or
[ string1 = string2 ] |
Bourne Shell |
string1 eq string2 |
Perl |
string1.equals(string2) |
Java |
string1 -eq string2, or
[string]::Equals(string1, string2) |
Windows PowerShell |
Find
Definition |
find(string,substring) returns integer |
Description |
Returns the position of the start of the first occurrence of substring in string. If the substring is not found, most of these routines return an invalid index value (-1 where indexes are 0-based, 0 where they are 1-based) or some value to be interpreted as Boolean FALSE. |
Related |
instrrev |
Format |
Languages |
If not found |
InStr([startpos,]string,substring) |
VB (startpos optional) (positions start at 1) |
returns 0 |
index(string,substring) |
AWK |
returns 0 |
index(string,substring[,startpos]) |
Perl |
returns -1 |
strpos(string,substring[,startpos]) |
PHP |
returns FALSE |
locate(string, substring) |
Ingres |
returns string length + 1 |
strstr(string, substring) |
C, C++ (returns pointer to first character) |
returns NULL |
pos(substring, string) |
Pascal, Object Pascal (Delphi) |
returns 0 |
pos(substring, string[,startpos]) |
REXX |
returns 0 |
string.find(substring[,startpos]) |
C++ (std::string only) |
returns std::string::npos |
string.find(substring[,startpos[,endpos]]) |
Python |
returns -1 |
string.index(substring[,startpos]) |
Ruby |
returns nil |
string.indexOf(substring[,startpos]) |
Java, JavaScript |
returns -1 |
string.IndexOf(substring[,startpos[, charcount]]) |
VB .NET, C#, Windows PowerShell |
returns -1 |
string:str(string, substring) |
Erlang |
returns 0 |
(string-contains string substring) |
Scheme |
returns #f |
(search substring string) |
Common Lisp |
returns NIL |
' Example in Visual Basic
InStr("Hello mate", "ell") ' returns: 2
InStr(5, "Hello mate", "e") ' returns: 10
InStr("word", "z") ' returns: 0
; Example in Scheme
(use-modules (srfi srfi-13))
(string-contains "Hello mate" "ell") ; returns: 1
(string-contains "word" "z") ; returns: #f
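For comparison with the 1-based Visual Basic example, the following minimal Python sketch shows the 0-based positions and the -1 "not found" value listed in the table above. The variable names are illustrative only.

```python
text = "Hello mate"

first = text.find("ell")     # 0-based position of the first match
later = text.find("e", 4)    # search starts at index 4
missing = text.find("z")     # -1 signals "not found"

print(first, later, missing)  # 1 9 -1

# A match at the very start returns 0, which is falsy, so the result
# should be compared with -1 rather than tested as a Boolean.
at_start = text.find("Hello")
print(at_start != -1)  # True
```

Python also offers str.index, which raises ValueError instead of returning -1.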
Format
Definition |
format(formatstring, items) returns string |
Description |
Returns the formatted string representation of one or more items. See sprintf for more information. |
Format |
Languages |
Format(item, formatstring) |
VB |
sprintf(formatstring, items) |
Perl, PHP, Ruby |
sprintf(outputstring, formatstring, items) |
C, C++ |
printf -v outputstring formatstring items |
Unix shell (Bash) |
formatstring % (items) |
Python |
Printf.sprintf formatstring items |
OCaml |
Text.Printf.printf formatstring items |
Haskell (GHC) |
String.format(formatstring, items) |
Java |
String.Format(formatstring, items) |
VB .NET, C#, Windows PowerShell |
(format formatstring items) |
Scheme |
(format nil formatstring items) |
Common Lisp |
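As a small illustration of the Python row above, the sketch below formats three items with the % operator and, for comparison, with str.format. The values and variable names are made up for the example.

```python
name, count, ratio = "widgets", 3, 0.5

# printf-style formatting: the items are passed as a tuple.
with_percent = "%d %s at %.1f%%" % (count, name, ratio * 100)

# str.format uses {} placeholders with an optional format specification.
with_format = "{} {} at {:.1f}%".format(count, name, ratio * 100)

print(with_percent)  # 3 widgets at 50.0%
print(with_format)   # 3 widgets at 50.0%
```

Note that a literal percent sign is written %% in the printf-style form but needs no escaping in str.format.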
Inequality
Tests if two strings are not equal. See also #Equality.
Format |
Languages |
string1 <> string2 |
VB, VB .NET, Pascal, Object Pascal (Delphi), OCaml |
string1 ne string2 |
Perl |
(string<>? string1 string2) |
Scheme |
(string/= string1 string2) |
Common Lisp |
string1 != string2 |
C++ (std::string only), C#, JavaScript, Python |
string1 \= string2 |
REXX |
string1 /= string2 |
Ada, Erlang, Haskell |
test string1 != string2, or
[ string1 != string2 ] |
Bourne Shell |
string1 -ne string2, or
![string]::Equals(string1, string2) |
Windows PowerShell |
index
see #Find
indexof
see #Find
instr
see #Find
instrrev
see #rfind
join
Definition |
join(separator, list_of_strings) joins a list of strings with a separator |
Description |
Joins the list of strings into a new string, with the separator string between each of the substrings. Opposite of split. |
Format |
Languages |
join(separator, list_of_strings) |
Perl |
implode(separator, array_of_strings) |
PHP |
separator.join(sequence_of_strings) |
Python |
array_of_strings.join(separator) |
Ruby, JavaScript |
(string-join array_of_strings separator) |
Scheme |
(format nil "~{~a~^separator~}" array_of_strings) |
Common Lisp |
String.concat separator list_of_strings |
OCaml |
Data.List.intercalate separator list_of_strings |
Haskell (GHC) |
Join(array_of_strings, separator) |
VB |
String.Join(separator, array_of_strings) |
VB .NET, C# |
&{$OFS=$separator; "$array_of_strings"}, or
[string]::Join(separator, array_of_strings) |
Windows PowerShell |
# Example in Python
mystr = "-".join(["a", "b", "c"]) # 'a-b-c'
; Example in Scheme
(use-modules (srfi srfi-13))
(string-join '("a" "b" "c") "-") ; "a-b-c"
lastindexof
see #rfind
left
Definition |
left(string,n) returns string |
Description |
Returns the left n part of a string. If n is greater than the length of the string then most implementations return the whole string (exceptions exist - see code examples). |
Format |
Languages |
string (string'First .. string'First + n - 1) |
Ada |
Left(string,n) |
VB |
left(string,n) |
Ingres |
left(string,n [,padchar]) |
REXX, Erlang |
substr(string, 0, n) |
AWK (changes string), Perl, PHP |
string[:n] |
Python |
string[0..n - 1] |
Ruby |
string.substr(0,n) |
C++ |
string.Substring(0,n) |
VB .NET, C#, Windows PowerShell |
leftstr(string, n) |
Pascal, Object Pascal (Delphi) |
string.substring(0,n) |
Java, JavaScript |
(string-take string n) |
Scheme |
take n string |
Haskell |
' Example in Visual Basic
Left("sandroguidi", 3) ' returns: "san"
Left("sandroguidi", 100) ' returns: "sandroguidi"
/* Example in REXX */
left("abcde", 3) /* returns: "abc" */
left("abcde", 8) /* returns: "abcde " */
left("abcde", 8, "*") /* returns: "abcde***" */
; Example in Scheme
(use-modules (srfi srfi-13))
(string-take "abcde" 3) ; returns: "abc"
(string-take "abcde" 8) ; returns: error
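The Python slice form from the table above behaves like the Visual Basic example when n exceeds the length of the string. The following is a minimal sketch; the helper name left is hypothetical, and a non-negative n is assumed.

```python
def left(string: str, n: int) -> str:
    """Return the first n characters, or the whole string if it is shorter."""
    return string[:n]


print(left("abcde", 3))  # abc
print(left("abcde", 8))  # abcde

# Padding, as in the REXX example, is a separate step in Python.
padded = left("abcde", 8).ljust(8, "*")
print(padded)  # abcde***
```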
len
see #length
length
Definition |
length(string) returns an integer number |
Description |
Returns the length of a string (not counting the null terminator or any other of the string's internal structural information). An empty string returns a length of 0. |
Format |
Languages |
string'Length |
Ada |
length(string) |
Perl, Ingres, Pascal, Object Pascal (Delphi), REXX |
len(string) |
Python, Erlang |
Len(string) |
VB |
string.Length |
VB .NET, C# |
string.size OR string.length |
Ruby, Windows PowerShell |
strlen(string) |
C, C++, PHP |
string.length() |
C++ (std::string only), Java |
string.length |
JavaScript |
(string-length string) |
Scheme |
(length string) |
Common Lisp |
String.length string |
OCaml |
length string |
Haskell |
# Examples in Perl
length("hello") # returns: 5
length("") # returns: 0
% Examples in Erlang
string:len("hello"). % returns: 5
string:len(""). % returns: 0
locate
see #Find
Lowercase
see also #Uppercase
mid
see #substring
partition
Definition |
<string>.partition(separator) returns the sub-string before the separator; the separator; then the sub-string after the separator. |
Description |
Splits the given string by the separator and returns the three substrings that together make the original. |
Format |
Languages |
string.partition(separator) |
Python, Ruby |
# Examples in Python
"Spam eggs spam spam and ham".partition('spam') # ('Spam eggs ', 'spam', ' spam and ham')
"Spam eggs spam spam and ham".partition('X') # ('Spam eggs spam spam and ham', "", "")
replace
Definition |
replace(string, find, replace) returns string |
Description |
Returns a string with all occurrences of find changed to replace. |
Format |
Languages |
changestr(find, string, replace) |
REXX |
Replace(string, find, replace) |
VB |
string.Replace(find, replace) |
VB .NET, C# |
str_replace(find, replace, string) |
PHP |
string.replace(find, replace) |
Python, Java (1.5+) |
string.replaceAll(find_regex, replace)[1] |
Java |
string.gsub(find, replace) |
Ruby |
string =~ s/find_regex/replace/g[1] |
Perl |
string.replace(find, replace, "g") or
string.replace(/find_regex/g, replace)[1] |
JavaScript |
echo "string" | sed 's/find_regex/replace/g'[1] |
Unix |
string.replace(find, replace), or
string -replace find_regex, replace[1] |
Windows PowerShell |
' Examples in Visual Basic
Replace("effffff", "f", "jump") ' returns "ejumpjumpjumpjumpjumpjump"
Replace("blah", "z", "y") ' returns "blah".
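Several of the forms in the table above treat the "find" argument as a regular expression rather than as literal text. The minimal Python sketch below contrasts the two behaviours using str.replace, re.sub and re.escape from the standard library; the sample text is made up for the example.

```python
import re

text = "a.b.c"

# Literal replacement: "." means a full stop and nothing else.
literal = text.replace(".", "-")

# Regular-expression replacement: an unescaped "." matches any character.
as_regex = re.sub(".", "-", text)

# Escaping the pattern restores the literal meaning.
escaped = re.sub(re.escape("."), "-", text)

print(literal)   # a-b-c
print(as_regex)  # -----
print(escaped)   # a-b-c
```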
reverse
Definition |
reverse(string) |
Description |
Reverses the order of the characters in the string. |
Format |
Languages |
reverse string |
Perl, Haskell |
strrev(string) |
PHP |
string[::-1] |
Python |
(string-reverse string) |
Scheme |
(reverse string) |
Common Lisp |
string.reverse |
Ruby |
new StringBuilder(string).reverse().toString() |
Java |
std::reverse(string.begin(), string.end()); |
C++ (std::string only) |
StrReverse(string) |
VB |
New String(Array.Reverse(string.ToCharArray())) |
VB .NET |
new string(Array.Reverse(string.ToCharArray())) |
C# |
rfind
Definition |
rfind(string,substring) returns integer |
Description |
Returns the position of the start of the last occurrence of substring in string. If the substring is not found, most of these routines return an invalid index value (-1 where indexes are 0-based, 0 where they are 1-based) or some value to be interpreted as Boolean FALSE. |
Related |
instr |
Format |
Languages |
If not found |
InStrRev([startpos, ]string,substring) |
VB |
returns 0 |
rindex(string,substring[,startpos]) |
Perl |
returns -1 |
strrpos(string,substring[,startpos]) |
PHP |
returns FALSE |
string.rfind(substring[,startpos]) |
C++ (std::string only) |
returns std::string::npos |
string.rfind(substring[,startpos]) |
Python |
returns -1 |
string.lastIndexOf(substring[,startpos]) |
Java, JavaScript |
returns -1 |
string.LastIndexOf(substring[,startpos[, charcount]]) |
VB .NET, C#, Windows PowerShell |
returns -1 |
(search substring string :from-end t) |
Common Lisp |
returns NIL |
right
Definition |
right(string,n) returns string |
Description |
Returns the right n part of a string. If n is greater than the length of the string then most implementations return the whole string (exceptions exist - see code examples). |
Format |
Languages |
string (string'Last - n + 1 .. string'Last) |
Ada |
Right(string,n) |
VB |
right(string,n) |
Ingres |
right(string,n [,padchar]) |
REXX, Erlang |
substr(string,-n) |
Perl, PHP |
string[-n:] |
Python |
(string-take-right string n) |
Scheme |
' Example in Visual Basic
Right("sandroguidi", 3) ' returns: "idi"
Right("sandroguidi", 100) ' returns: "sandroguidi"
/* Example in REXX */
right("abcde", 3) /* returns: "cde" */
right("abcde", 8) /* returns: " abcde" */
right("abcde", 8, "*") /* returns: "***abcde" */
; Example in Scheme
(use-modules (srfi srfi-13))
(string-take-right "abcde" 3) ; returns: "cde"
(string-take-right "abcde" 8) ; returns: error
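The Python slice string[-n:] from the table above has one edge case worth noting: with n equal to 0 it returns the whole string, because -0 is simply 0. The following minimal sketch guards against that; the helper name right is hypothetical, and a non-negative n is assumed.

```python
def right(string: str, n: int) -> str:
    """Return the last n characters, or the whole string if it is shorter."""
    # string[-0:] is the whole string, so n == 0 needs its own branch.
    return string[-n:] if n > 0 else ""


print(right("abcde", 3))  # cde
print(right("abcde", 8))  # abcde
print(repr(right("abcde", 0)))  # ''

# Padding, as in the REXX example, is a separate step in Python.
padded = right("abcde", 8).rjust(8, "*")
print(padded)  # ***abcde
```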
rpartition
Definition |
<string>.rpartition(separator) Searches for the separator from right-to-left within the string then returns the sub-string before the separator; the separator; then the sub-string after the separator. |
Description |
Splits the given string by the right-most separator and returns the three substrings that together make the original. |
Format |
Languages |
string.rpartition(separator) |
Python, Ruby |
# Examples in Python
"Spam eggs spam spam and ham".rpartition('spam') ### ('Spam eggs spam ', 'spam', ' and ham')
"Spam eggs spam spam and ham".rpartition('X') ### ("", "", 'Spam eggs spam spam and ham')
slice
see #substring
split
Definition |
<string>.split(separator[, limit]) splits a string on separator, optionally only up to a limited number of substrings |
Description |
Splits the given string by occurrences of the separator (itself a string) and returns a list (or array) of the substrings. If limit is given, after limit - 1 separators have been read, the rest of the string is made into the last substring, regardless of whether it has any separators in it. The Scheme and Erlang implementations are similar but differ in several ways. Opposite of join. |
Format |
Languages |
split(/separator/, string[, limit]) |
Perl |
explode(separator, string[, limit]) |
PHP |
string.split(separator[, limit]) |
JavaScript, Java, Python, Ruby |
tokens(string, sepchars) |
Erlang |
(string-tokenize string[ charset[ start[ end]]]) |
Scheme |
Split(string, sepchars[, limit]) |
VB |
string.Split(sepchars[, limit[, options]]) |
VB .NET, C# |
string.split(separator) |
Windows PowerShell |
# Example in Python
"Spam eggs spam spam and ham".split('spam') ### ['Spam eggs ', ' ', ' and ham']
"Spam eggs spam spam and ham".split('X') ### ['Spam eggs spam spam and ham']
% Example in Erlang
string:tokens("abc;defgh;ijk", ";"). % ["abc", "defgh", "ijk"]
; Example in Scheme
(use-modules (srfi srfi-13))
(string-tokenize "abc,defgh,ijk" char-set:letter) ; ("abc" "defgh" "ijk")
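The meaning of the optional second argument is not uniform across languages. In Python it is the maximum number of splits (maxsplit), not the maximum number of substrings described above. The sketch below adapts one to the other; split_limit is a hypothetical helper, and limit is assumed to be at least 1 when given.

```python
def split_limit(string: str, separator: str, limit=None) -> list:
    """Split into at most `limit` substrings (limit >= 1 assumed)."""
    if limit is None:
        return string.split(separator)
    # Python counts splits, so `limit` substrings means limit - 1 splits.
    return string.split(separator, limit - 1)


print(split_limit("a,b,c,d", ","))     # ['a', 'b', 'c', 'd']
print(split_limit("a,b,c,d", ",", 2))  # ['a', 'b,c,d']

# Joining with the same separator restores the original string.
print(",".join(split_limit("a,b,c,d", ",", 2)))  # a,b,c,d
```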
sprintf
see #Format
strip
see #trim
strcmp
see #Compare (integer result)
substring
Definition |
substr(string, startpos, numChars) returns string |
Description |
Returns a substring of string starting at startpos of length numChars. The resulting string is truncated if there are fewer than numChars characters beyond the starting point. |
Format |
Languages |
string (startpos .. startpos + numChars - 1) |
Ada |
Mid(string, startpos, numChars) |
VB |
substr(string, startpos, numChars) |
AWK (changes string), Perl, PHP |
substr(string, startpos [,numChars, padChar]) |
REXX |
string[startpos:startpos + numChars] |
Python |
string[startpos, numChars] |
Ruby |
string.slice(startpos, endpos) |
JavaScript |
string.substr(startpos, numChars) |
C++ (std::string only) |
string.Substring(startpos, numChars) |
VB .NET, C#, Windows PowerShell |
string.substring(startpos, endpos) |
Java, JavaScript |
copy(string, startpos, numChars) |
Delphi |
(string-copy string startpos endpos) |
Scheme |
(subseq string startpos endpos) |
Common Lisp |
String.sub string startpos numChars |
OCaml |
substr(string, startpos, numChars) |
Erlang |
char result[numChars+1] = "";
strncat(result, string + startpos, numChars); |
C |
take numChars $ drop startpos string |
Haskell |
# Example in AWK
substr("abc", 2, 1) # returns: "b"
substr("abc", 2, 6) # returns: "bc"
/* Example in REXX */
substr("abc", 2, 1) /* returns: "b" */
substr("abc", 2) /* returns: "bc" */
substr("abc", 2, 6) /* returns: "bc " */
substr("abc", 2, 6, "*") /* returns: "bc****" */
% Example in Erlang
string:substr("abc", 2, 1). % returns: "b"
string:substr("abc", 2). % returns: "bc"
' Example in Visual Basic .NET
' -> Left(input, len) and input.Substring(0, len) are equivalent!
Dim len As Integer = 3
Dim input As String = "abcdef"
Dim expected = "abc"
'Gets the first 3 characters using Left and String.Substring
Dim withLeft As String = Left(input, len)
Dim withSubstring As String = input.Substring(0, len)
'Verify the two methods returned the expected value
Debug.Assert(withSubstring = expected AndAlso withLeft = expected)
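In Python, the definition above maps onto a slice whose end index is exclusive. The following is a minimal sketch; the helper name substr is hypothetical, and a 0-based, non-negative startpos is assumed.

```python
def substr(string: str, startpos: int, num_chars: int) -> str:
    """Return up to num_chars characters starting at the 0-based startpos."""
    # The slice end is exclusive, so no "- 1" adjustment is needed.
    return string[startpos:startpos + num_chars]


print(substr("abcdef", 1, 3))  # bcd
print(substr("abc", 1, 6))     # bc

# Taking a substring from position 0 is the same as the "left" operation.
print(substr("abcdef", 0, 3) == "abcdef"[:3])  # True
```

The second call shows the truncation described above: fewer than num_chars characters remain, so the result is simply shorter and is not padded.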
Uppercase
Definition |
uppercase(string) returns string |
Description |
Returns the string in upper case. |
Format |
Languages |
UCase(string) |
VB |
toupper(string) |
AWK (changes string) |
uc(string) |
Perl |
toupper(char) |
C (operates on a single character) |
transform(string.begin(), string.end(), result.begin(), toupper) |
C++ (std::string only) (result is stored in string result which is at least as long as string, and may or may not be string itself) |
uppercase(string) |
Delphi |
strtoupper(string) |
PHP |
echo "string" | tr 'a-z' 'A-Z' |
Unix |
translate(string) |
REXX |
string.upper() |
Python |
string.upcase |
Ruby |
(string-upcase string) |
Scheme, Common Lisp |
String.uppercase string |
OCaml |
map Char.toUpper string |
Haskell |
string.toUpperCase() |
Java, JavaScript |
to_upper(string) |
Erlang |
string.ToUpper() |
VB .NET, C#, Windows PowerShell |
' Example in Visual Basic
UCase("Wiki means fast?") ' "WIKI MEANS FAST?"
/* Example in REXX */
translate("Wiki means fast?") /* "WIKI MEANS FAST?" */
; Example in Scheme
(use-modules (srfi srfi-13))
(string-upcase "Wiki means fast?") ; "WIKI MEANS FAST?"
trim
trim or strip is used to remove whitespace from the beginning, the end, or both the beginning and end of a string.
[1] The "find" string in this construct is interpreted as a regular expression. Certain characters have special meaning in regular expressions. If you want to find a string literally, you need to quote the special characters.
External links