Trim (programming)
From Wikipedia, the free encyclopedia
In programming, trim or strip is a string manipulation function or algorithm which removes leading and trailing whitespace from a string.
For example, in Python:
' this is a test '.strip()
will return the string:
'this is a test'
Variants
The most popular variants of the trim function strip only the beginning or only the end of the string. They are typically named ltrim and rtrim respectively, or, in Python, lstrip and rstrip. C# uses TrimStart and TrimEnd, and Common Lisp uses string-left-trim and string-right-trim. Pascal does not have these variants built in, and Java lacked them until Java 11 added stripLeading and stripTrailing.
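As a small illustration of the one-sided variants, the following uses Python's built-in lstrip, rstrip and strip methods; the sample string is just an assumption for demonstration.

```python
s = "  this is a test  "

left_trimmed = s.lstrip()    # only leading whitespace removed
right_trimmed = s.rstrip()   # only trailing whitespace removed
both_trimmed = s.strip()     # whitespace removed from both ends

print(repr(left_trimmed))    # 'this is a test  '
print(repr(right_trimmed))   # '  this is a test'
print(repr(both_trimmed))    # 'this is a test'
```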
Many trim functions have an optional parameter to specify a list of characters to trim, instead of the default whitespace characters. For example, PHP and Python allow this optional parameter, while Pascal and Java do not. Common Lisp's string-trim function names the parameter character-bag and requires it, offering no default. The C++ Boost library defines space characters according to locale, and also offers variants with a predicate parameter (a functor) to select which characters are trimmed.
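The two styles of parameter can be sketched in Python. The first call uses the real optional argument of str.strip, which is treated as a set of characters rather than a prefix or suffix. The second part is a hypothetical helper, trim_if, written here only to illustrate a predicate-based variant in the spirit of the Boost functions; it is not part of Python or Boost.

```python
# Character-set parameter: every leading/trailing character found in the
# argument is removed, regardless of order.
host = "www.example.com".strip("cmowz.")
print(host)  # example


def trim_if(s, pred):
    """Hypothetical helper: trim characters for which pred(ch) is true."""
    start, end = 0, len(s)
    while start < end and pred(s[start]):
        start += 1
    while end > start and pred(s[end - 1]):
        end -= 1
    return s[start:end]


print(trim_if("123abc456", str.isdigit))  # abc
```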
Rarely, a variant of trim returns a special result if no characters are left after the trim operation. For example, StringUtils in Apache Commons Lang (formerly Jakarta Commons) has a function stripToNull which returns null in place of an empty string.
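A minimal Python sketch of the same idea follows. The name strip_to_none is hypothetical, and None stands in for Java's null; this is an approximation, not a port of the library function.

```python
def strip_to_none(s):
    """Hypothetical helper: trim s, returning None if nothing is left."""
    if s is None:
        return None
    stripped = s.strip()
    return stripped if stripped else None


print(strip_to_none("  abc  "))  # abc
print(strip_to_none("     "))    # None
```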
Space normalization is an alternative to trim: in addition to removing surrounding whitespace, it replaces any sequence of whitespace characters within the string with a single space. This is done by the normalize-space() function in XSLT and XPath.
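Space normalization can be approximated in Python by splitting on whitespace and rejoining with single spaces. The function name normalize_space is chosen here for illustration. Note that str.split() with no argument splits on all Unicode whitespace, which is a broader set than the space, tab, carriage return and line feed recognized by XPath.

```python
def normalize_space(s):
    """Trim s and collapse each internal whitespace run to one space."""
    return " ".join(s.split())


print(repr(normalize_space("  this   is\ta  test \n")))  # 'this is a test'
```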
While most algorithms return a new (trimmed) string, some alter the original string in place. Notably, the Boost library offers both in-place trimming and functions that return a trimmed copy.
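The distinction can be sketched in Python with a mutable bytearray, since Python strings themselves are immutable. The helper trim_in_place is hypothetical and handles only four ASCII whitespace bytes; bytes.strip is the built-in copying counterpart.

```python
ASCII_WS = b" \t\n\r"  # assumption: only these four bytes count as whitespace


def trim_in_place(buf: bytearray) -> None:
    """Hypothetical helper: remove surrounding whitespace from buf itself."""
    end = len(buf)
    while end > 0 and buf[end - 1] in ASCII_WS:
        end -= 1
    del buf[end:]          # drop the trailing whitespace
    start = 0
    while start < len(buf) and buf[start] in ASCII_WS:
        start += 1
    del buf[:start]        # drop the leading whitespace


buf = bytearray(b"  hi \n")
trim_in_place(buf)         # buf is modified; nothing is returned
print(buf)                 # bytearray(b'hi')

original = b"  hi \n"
copy = original.strip()    # returns a trimmed copy; original is unchanged
print(copy, original)      # b'hi' b'  hi \n'
```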
Meaning of whitespace
What is considered whitespace varies between languages and implementations. For example, C's isspace function in the default "C" locale counts space, tab, newline, vertical tab, form feed and carriage return, while languages supporting Unicode may count all Unicode space characters. Some implementations also count all ASCII control codes (non-printing characters) as whitespace.
For example, Java's trim method treats the space character and all control codes below it (code points up to U+0020) as whitespace. However, unlike Java's Character.isWhitespace() method, it does not consider special Unicode space characters to be whitespace.
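The effect of these differing definitions can be seen in a rough Python sketch. Here java_like_trim is a hypothetical helper that imitates the rule described for Java's trim (dropping characters at or below U+0020), while str.strip uses Python's own Unicode-aware definition of whitespace.

```python
def java_like_trim(s):
    """Hypothetical helper: trim characters with code point <= U+0020."""
    start, end = 0, len(s)
    while start < end and s[start] <= "\u0020":
        start += 1
    while end > start and s[end - 1] <= "\u0020":
        end -= 1
    return s[start:end]


controls = "\x01 text \x00"        # surrounded by ASCII control codes
em_spaces = "\u2003text\u2003"     # surrounded by Unicode EM SPACE

print(repr(java_like_trim(controls)))   # 'text'
print(controls.strip() == controls)     # True: strip() keeps \x01 and \x00
print(repr(em_spaces.strip()))          # 'text'
print(java_like_trim(em_spaces) == em_spaces)  # True: U+2003 is kept
```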
Usage
The following examples show how to trim a string in languages with built-in trim functions. They all return a new string and do not alter the original variable.
Example usage | Languages |
---|---|
String.Trim(); | C# |
(string-trim '(#\Space #\Tab #\Newline) string) | Common Lisp |
string.trim(); | Java |
Trim(String) | Pascal [1] |
string.strip() | Python |
trim(string) | REXX |
string.strip | Ruby |
trim($string) | PHP |
Trim(String) | QBasic, Visual Basic, VB.NET, Delphi |
Other languages
In languages without a built-in trim function, a custom function may need to be written, or a library found.
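One common way to write such a custom function is with a regular expression that matches whitespace anchored at either end of the string. The sketch below shows the idea in Python purely for illustration (Python already has str.strip); the name regex_trim is hypothetical.

```python
import re

# Leading whitespace anchored at the start, or trailing whitespace
# anchored at the end of the string.
_EDGE_WHITESPACE = re.compile(r"^\s+|\s+$")


def regex_trim(s):
    """Hypothetical helper: trim s using a regular expression."""
    return _EDGE_WHITESPACE.sub("", s)


print(repr(regex_trim("  a b  \n")))  # 'a b'
```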
AWK
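AWK uses regular expressions to trim [2]:
# ltrim: remove leading spaces and tabs from v (gsub modifies v in place)
gsub(/^[ \t]+/, "", v)
# rtrim: remove trailing spaces and tabs from v
gsub(/[ \t]+$/, "", v)
# trim: remove both
gsub(/^[ \t]+|[ \t]+$/, "", v)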
or:
function ltrim(s) { sub(/^ */, "", s); return s }
function rtrim(s) { sub(/ *$/, "", s); return s }
function trim(s) { return rtrim(ltrim(s)); }
C/C++
There is no standard trim function in C or C++. In non-standard C libraries, the equivalent function has often been called EatWhitespace.
The open-source C++ library Boost has several trim variants, including a basic one that returns a trimmed copy [3]:
trimmed = boost::algorithm::trim_copy(string);
Note that Boost's function named simply trim modifies the input sequence in place [4] and does not return a result.
Since version 2.6.18-rc1, the Linux kernel has also included a strip function, strstrip(), which trims the string in place.
The open source and portable C and C++ library "The Better String Library" has support for trimming as well:
btrimws (b = bfromcstr (" string "));
Haskell
A trim algorithm in Haskell was described as follows:
- We trim-left the string, then reverse it, then trim-left the reversed string (at this point all the trimming is done), then finally reverse the reversed and trimmed string. [5]
The function definition:
trim :: [Char] -> [Char]
trim = applyTwice (reverse . trim1)
  where trim1 = dropWhile (`elem` delim)
        delim = [' ', '\t', '\n', '\r']
        applyTwice f = f . f
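The same trim-left, reverse, trim-left, reverse sequence can be sketched in Python for readers less familiar with Haskell. The names DELIMS, trim_left and trim are chosen for this illustration and mirror delim, trim1 and trim in the definition above.

```python
DELIMS = " \t\n\r"  # same delimiter set as the Haskell definition


def trim_left(s):
    """Drop leading characters that are in DELIMS."""
    i = 0
    while i < len(s) and s[i] in DELIMS:
        i += 1
    return s[i:]


def trim(s):
    """Trim-left, reverse, trim-left again, then reverse back."""
    once = trim_left(s)[::-1]       # left side trimmed, string reversed
    return trim_left(once)[::-1]    # right side trimmed, order restored


print(repr(trim("  \thello world\r\n")))  # 'hello world'
```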
JavaScript
Before ECMAScript 5 added String.prototype.trim, JavaScript had no built-in trim function, but one could be added to the String prototype, making it available on all strings [6]:
String.prototype.trim = function() {
  return this.replace(/^\s+|\s+$/g, "");
};
This allows the same call syntax as Java, string.trim(), to be used in JavaScript.
Perl
Perl has no built-in trim function, and trimming is usually achieved through regular expressions.
Example:
$string =~ s/^\s+//; # remove leading whitespace
$string =~ s/\s+$//; # remove trailing whitespace
or:
$string =~ s/^\s+|\s+$//g ; # remove both leading and trailing whitespace
These examples modify the value of the original variable $string.
Also available for Perl is StripLTSpace in String::Strip from CPAN.
XSLT
XSLT has the function normalize-space(string), which strips leading and trailing whitespace and also replaces any sequence of whitespace characters with a single space.
Example:
<xsl:variable name='trimmed'>
  <xsl:value-of select='normalize-space(string)'/>
</xsl:variable>
XSLT 2.0 also includes regular expressions, providing another mechanism to perform trimming.