Trim (programming)
From Wikipedia, the free encyclopedia
In programming, trim or strip is a common string manipulation function which removes leading and trailing whitespace from a string.
For example, in Python:
' this is a test '.strip()
returns the string:
'this is a test'
Variants
The most popular variants of the trim function strip only the beginning or only the end of the string. They are typically named ltrim and rtrim respectively, or, in the case of Python, lstrip and rstrip. C# uses TrimStart and TrimEnd, and Common Lisp uses string-left-trim and string-right-trim. Pascal and Java (before version 11, which added stripLeading and stripTrailing) do not have these variants built-in, although Delphi (Borland's object-oriented derivative of Pascal) has TrimLeft and TrimRight functions[1].
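As a brief illustration using the Python methods named above, the one-sided variants differ only in which end of the string they touch. The variable names are chosen for this sketch only:

```python
# One-sided trimming with Python's built-in str methods.
text = "  this is a test  "

left_trimmed = text.lstrip()   # removes leading whitespace only
right_trimmed = text.rstrip()  # removes trailing whitespace only
both_trimmed = text.strip()    # removes whitespace at both ends

print(repr(left_trimmed))
print(repr(right_trimmed))
print(repr(both_trimmed))
```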
Many trim functions have an optional parameter to specify a list of characters to trim, instead of the default whitespace characters. For example, PHP and Python allow this optional parameter, while Pascal and Java do not. With Common Lisp's string-trim function, the parameter (called character-bag) is required. The C++ Boost library defines space characters according to locale, and also offers variants with a predicate parameter (a functor) to select which characters are trimmed.
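A short Python sketch of the optional character parameter follows. The sample strings are invented for illustration. Note that the argument is treated as a set of characters rather than as a prefix or suffix, which the second example makes visible:

```python
# The argument names a set of characters to remove from the ends.
decorated = "--==config.ini==--"
trimmed = decorated.strip("-=")      # both ends
left_only = decorated.lstrip("-=")   # leading end only

# Each character is considered individually, so the order of the
# characters in the argument does not matter.
host = "www.example.com"
core = host.strip("w.moc")

print(trimmed, left_only, core)
```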
An uncommon variant of trim returns a special result if no characters remain after the trim operation. For example, Apache Jakarta's StringUtils has a function called stripToNull which returns null in place of an empty string.
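The idea can be sketched in Python, with None standing in for null. The helper strip_to_none is hypothetical and only loosely modelled on the behaviour described above; it is not part of any standard library:

```python
from typing import Optional

def strip_to_none(text: Optional[str]) -> Optional[str]:
    """Hypothetical helper: trim, then map an empty result to None."""
    if text is None:
        return None
    stripped = text.strip()
    # An empty string is falsy, so `or` substitutes None for it.
    return stripped or None
```

A caller can then test the result against None instead of checking for both None and the empty string.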
An alternative to trimming a string is space normalization, where in addition to removing surrounding whitespace, any sequence of whitespace characters within the string is replaced with a single space. Space normalization is done by Trim() in spreadsheet applications (including Excel, Calc, Gnumeric, and Google Docs), and by the normalize-space() function in XSLT and XPath.
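Space normalization can be sketched in Python as a split followed by a join. The function name normalize_space is borrowed from XPath for readability, but this is only an approximation: Python splits on any Unicode whitespace, so the set of characters it collapses may differ from that of the spreadsheet or XPath functions.

```python
def normalize_space(text: str) -> str:
    # str.split() with no arguments splits on runs of whitespace and
    # drops empty fields at both ends, so joining the pieces with a
    # single space both trims and collapses internal runs.
    return " ".join(text.split())
```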
While most algorithms return a new (trimmed) string, some alter the original string in-place. Notably, the Boost library allows either in-place trimming or a trimmed copy to be returned.
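The distinction can be illustrated in Python with a mutable bytearray, since Python's str type is immutable and can only be trimmed into a copy. The helper trim_in_place is hypothetical and written for this sketch:

```python
def trim_in_place(buffer: bytearray) -> None:
    """Hypothetical helper: overwrite the buffer with its trimmed contents."""
    # Slice assignment mutates the existing object instead of rebinding it.
    buffer[:] = buffer.strip()

original = bytearray(b"  data  ")

trimmed_copy = original.strip()        # new object; original is untouched
snapshot_after_copy = bytes(original)  # still padded at this point

trim_in_place(original)                # same object, now trimmed
```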
Definition of whitespace
The characters which are considered whitespace vary between programming languages and implementations. For example, C traditionally only counts space, tab, line feed, and carriage return characters, while languages which support Unicode typically include all Unicode space characters. Some implementations also include ASCII control codes (non-printing characters) along with whitespace characters.
Java's trim method considers ASCII spaces and control codes as whitespace, while Java's Character.isWhitespace() method recognizes Unicode space characters.
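The effect of these differing definitions can be seen in a small Python sketch. Python's default strip() is Unicode-aware, while passing string.whitespace restricts it to ASCII whitespace. The helper trim_control_and_space is hypothetical and assumes a rule of "every code point up to U+0020", used here to imitate an implementation that also strips control codes:

```python
import string

# NBSP and EM SPACE surround the value, with ASCII spaces outside them.
sample = " \u00a0value\u2003 "

unicode_trim = sample.strip()                 # default: Unicode whitespace
ascii_trim = sample.strip(string.whitespace)  # ASCII whitespace only

# Assumption: treat every code point from U+0000 through U+0020
# as trimmable, which covers the space and the ASCII control codes.
CONTROL_AND_SPACE = "".join(chr(code) for code in range(0x21))

def trim_control_and_space(text: str) -> str:
    """Hypothetical helper, not a standard library function."""
    return text.strip(CONTROL_AND_SPACE)
```

With these definitions the same input yields different results: the Unicode-aware trim removes the no-break space and em space, while the two ASCII-based trims stop at them.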
Usage
The following are examples of trimming a string in several programming languages. All of the implementations shown return a new string and do not alter the original variable.
Example usage | Languages |
---|---|
String.Trim([chars]) | C#, VB.NET, Windows PowerShell |
(string-trim '(#\Space #\Tab #\Newline) string) | Common Lisp |
(string-trim string) | Scheme |
string.trim() | Java |
Trim(String) | Pascal [2] |
string.strip() | Python |
strip(string [,option , char]) | REXX |
string:strip(string [,option , char]) | Erlang |
string.strip | Ruby |
trim($string) | PHP |
Trim(String) | QBasic, Visual Basic, Delphi |
string trim $string | Tcl |
Other languages
In languages without a built-in trim function, it is usually simple to create a custom function which accomplishes the same task.
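The general shape of such a custom function is sketched below in Python (which does have a built-in, so this is purely illustrative). It scans inward from both ends and slices once. The name trim and the is_trimmable parameter are assumptions made for this example:

```python
from typing import Callable

def trim(text: str, is_trimmable: Callable[[str], bool] = str.isspace) -> str:
    """Illustrative hand-written trim; returns a new string."""
    start, end = 0, len(text)
    # Advance past trimmable characters at the front.
    while start < end and is_trimmable(text[start]):
        start += 1
    # Retreat past trimmable characters at the back.
    while end > start and is_trimmable(text[end - 1]):
        end -= 1
    return text[start:end]
```

Passing a different predicate, for example `lambda ch: ch == "x"`, trims a custom character class in the same way that predicate-based library variants do.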
AWK
In AWK, one can use regular expressions to trim:
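gsub(/^[ \t]+/, "", v)            # ltrim: strip leading spaces and tabs from v
gsub(/[ \t]+$/, "", v)            # rtrim: strip trailing spaces and tabs from v
gsub(/^[ \t]+|[ \t]+$/, "", v)    # trim: strip both ends of v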
or:
function ltrim(s) { sub(/^ +/, "", s); return s }
function rtrim(s) { sub(/ +$/, "", s); return s }
function trim(s)  { return rtrim(ltrim(s)); }
C/C++
There is no standard trim function in C or C++. Most of the available string libraries[3] for C contain code which implements trimming, or functions that significantly ease an efficient implementation. In some non-standard C libraries the function has also been called EatWhitespace.
The open source C++ library Boost has several trim variants, including a standard one: [4]
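#include <string>
#include <boost/algorithm/string/trim.hpp>

// trim_copy returns a trimmed copy and leaves its argument unchanged
std::string trimmed = boost::algorithm::trim_copy(std::string(" string "));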
Note that Boost's function named simply trim modifies the input sequence in-place[5] and does not return a result.
The Linux kernel also includes a strip function, strstrip(), since 2.6.18-rc1, which trims the string "in place".
Haskell
A trim algorithm in Haskell:
import Data.Char (isSpace)
trim :: String -> String
trim = f . f
  where f = reverse . dropWhile isSpace
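This may be interpreted as follows: f drops the leading whitespace and then reverses the string. f is then applied again to its own output, which removes what was originally the trailing whitespace and restores the original character order. Note that the type signature (the second line) is optional.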
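The same drop-then-reverse idea can be transcribed into Python as a rough sketch. The helper drop_leading_and_reverse plays the role of f. Both names are invented for this example, and the approach is shown for its structure rather than its efficiency:

```python
from itertools import dropwhile

def drop_leading_and_reverse(chars):
    # Counterpart of `reverse . dropWhile isSpace`:
    # skip leading whitespace, then reverse what remains.
    return list(dropwhile(str.isspace, chars))[::-1]

def trim(text: str) -> str:
    # Applying the helper twice trims both ends and undoes the reversal.
    once = drop_leading_and_reverse(text)
    twice = drop_leading_and_reverse(once)
    return "".join(twice)
```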
JavaScript
Before ECMAScript 5 added String.prototype.trim, JavaScript had no built-in trim function, but one could be added to the String object's prototype to give all strings a trim method:
String.prototype.trim = function() {
  return this.replace(/^\s+|\s+$/g, "");
};
Perl
Perl has no built-in trim function. However, the functionality is commonly achieved using regular expressions.
Example:
$string =~ s/^\s+//;  # remove leading whitespace
$string =~ s/\s+$//;  # remove trailing whitespace
or:
$string =~ s/^\s+|\s+$//g ; # remove both leading and trailing whitespace
These examples modify the value of the original variable $string.
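For comparison, the same regular-expression approach can be sketched in Python. Unlike the substitutions above, this version returns a new string and leaves its argument unchanged. The function name regex_trim is an assumption made for this example, and the anchors \A and \Z are used so that the pattern matches only at the absolute start and end of the string:

```python
import re

# Leading whitespace at the very start, or trailing whitespace at the very end.
_EDGE_WHITESPACE = re.compile(r"\A\s+|\s+\Z")

def regex_trim(text: str) -> str:
    """Illustrative regex-based trim; returns a new string."""
    return _EDGE_WHITESPACE.sub("", text)
```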
Also available for Perl is StripLTSpace in String::Strip from CPAN.
There are, however, two functions that are commonly used to strip whitespace from the end of strings, chomp and chop:
- chop removes the last character from a string and returns it.
- chomp removes the trailing newline from a string if present.
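The difference between the two can be sketched with rough Python analogues. These are approximations written for this example: they return new strings rather than modifying their argument, and chomp here assumes that the line terminator is a single "\n".

```python
def chop(text: str) -> str:
    """Rough analogue of chop: drop the last character, whatever it is."""
    return text[:-1]

def chomp(text: str) -> str:
    """Rough analogue of chomp: drop one trailing newline, if present."""
    return text[:-1] if text.endswith("\n") else text
```

Applied to a string without a trailing newline, chop still removes a character while chomp leaves the string alone, which is why chomp is generally the safer choice for line input.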
Tcl
The Tcl string command has three relevant subcommands: trim, trimright and trimleft. For each of those commands, an additional argument may be specified: a string that represents a set of characters to remove. The default is whitespace (space, tab, newline, carriage return).
Example of trimming vowels:
set string onomatopoeia
set trimmed [string trim $string aeiou]         ;# result is nomatop
set r_trimmed [string trimright $string aeiou]  ;# result is onomatop
set l_trimmed [string trimleft $string aeiou]   ;# result is nomatopoeia
XSLT
XSLT includes the function normalize-space(string), which strips leading and trailing whitespace, in addition to replacing any whitespace sequence (including line breaks) with a single space.
Example:
<xsl:variable name='trimmed'>
  <xsl:value-of select='normalize-space(string)'/>
</xsl:variable>
XSLT 2.0 includes regular expressions, providing another mechanism to perform string trimming.
Another XSLT technique for trimming is to utilize the XPath 2.0 substring() function.
External links
- Tcl: string trim
- Faster JavaScript Trim - compares various JavaScript trim implementations