Trim (programming)

In programming, trim or strip is a common string manipulation function which removes leading and trailing whitespace from a string.

For example, in Python:

'  this  is a test  '.strip()

returns the string:

'this  is a test'

Variants

The most popular variants of the trim function strip only the beginning or only the end of the string. These are typically named ltrim and rtrim respectively, or, in the case of Python, lstrip and rstrip. C# uses TrimStart and TrimEnd, and Common Lisp uses string-left-trim and string-right-trim. Pascal and Java do not have these variants built in, although Delphi (Borland's object-oriented derivative of Pascal) has TrimLeft and TrimRight functions[1].
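As a brief illustration of the one-sided variants, the following Python sketch applies lstrip and rstrip to the sample string from the introduction. The variable names are chosen here purely for illustration.

```python
# One-sided trimming with Python's built-in string methods.
text = "  this  is a test  "

left_trimmed = text.lstrip()    # removes leading whitespace only
right_trimmed = text.rstrip()   # removes trailing whitespace only

print(repr(left_trimmed))    # 'this  is a test  '
print(repr(right_trimmed))   # '  this  is a test'
```

In both cases the whitespace inside the string is left untouched.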

Many trim functions have an optional parameter to specify a list of characters to trim, instead of the default whitespace characters. For example, PHP and Python allow this optional parameter, while Pascal and Java do not. With Common Lisp's string-trim function, the parameter (called character-bag) is required. The C++ Boost library defines space characters according to locale, as well as offering variants with a predicate parameter (a functor) to select which characters are trimmed.
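A short Python sketch of the optional character parameter follows. In Python the argument is treated as a set of characters, not as a prefix or suffix to match; the sample strings are illustrative only.

```python
# Passing a set of characters to strip instead of the default whitespace.
padded = "xxhixx".strip("x")
decorated = "--==title==--".strip("=-")     # order within the set is irrelevant
host = "www.example.com".strip("w.")        # stops at the first character not in the set

print(padded)      # hi
print(decorated)   # title
print(host)        # example.com
```

Because the argument is a set, trimming stops on each side as soon as a character outside the set is reached, which is why the trailing "com" above is kept.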

An uncommon variant of trim returns a special result if no characters remain after the trim operation. For example, Apache Jakarta's StringUtils has a function called stripToNull which returns null in place of an empty string.
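A comparable helper could be sketched in Python as shown below. The name strip_to_none is hypothetical, and passing None through unchanged is an assumption made for this sketch, not a description of the StringUtils implementation.

```python
from typing import Optional


def strip_to_none(s: Optional[str]) -> Optional[str]:
    """Hypothetical helper: trim s and return None if nothing remains."""
    if s is None:
        return None            # assumption: a missing input stays missing
    trimmed = s.strip()
    return trimmed if trimmed else None


print(strip_to_none("  abc  "))   # abc
print(strip_to_none("     "))     # None
```

Returning a distinct "no value" result lets callers treat blank input and absent input the same way.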

An alternative to trimming a string is space normalization, where in addition to removing surrounding whitespace, any sequence of whitespace characters within the string is replaced with a single space. Space normalization is done by Trim() in spreadsheet applications (including Excel, Calc, Gnumeric, and Google Docs), and by the normalize-space() function in XSLT and XPath.
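Space normalization can be sketched in Python with split and join. The function name normalize_space is borrowed from the XPath function for readability only; this sketch uses Python's own definition of whitespace, which may differ from that of the spreadsheet and XPath functions mentioned above.

```python
def normalize_space(s: str) -> str:
    """Trim s and collapse each internal run of whitespace to one space."""
    # split() with no arguments splits on runs of whitespace and
    # discards the empty pieces at either end.
    return " ".join(s.split())


print(repr(normalize_space("  this  is\ta \n test  ")))   # 'this is a test'
```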

While most algorithms return a new (trimmed) string, some alter the original string in place. Notably, the Boost library offers both in-place trimming and functions that return a trimmed copy.
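The difference between the two approaches can be sketched in Python. Python strings are immutable, so the in-place variant below operates on a mutable bytearray instead; the function names and the four-byte whitespace set are assumptions made for this illustration.

```python
ASCII_WHITESPACE = b" \t\n\r"   # assumed whitespace set for this sketch


def trim_in_place(buf: bytearray) -> None:
    """Remove leading and trailing whitespace bytes from buf itself."""
    end = len(buf)
    while end > 0 and buf[end - 1] in ASCII_WHITESPACE:
        end -= 1
    del buf[end:]                       # drop the trailing run
    start = 0
    while start < len(buf) and buf[start] in ASCII_WHITESPACE:
        start += 1
    del buf[:start]                     # drop the leading run


def trim_copy(buf: bytearray) -> bytearray:
    """Return a trimmed copy and leave buf unchanged."""
    result = bytearray(buf)
    trim_in_place(result)
    return result


data = bytearray(b"  hi \n")
copy = trim_copy(data)      # data is still b"  hi \n"
trim_in_place(data)         # data is now b"hi"
```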

Definition of whitespace

The characters which are considered whitespace vary between programming languages and implementations. For example, C traditionally only counts space, tab, line feed, and carriage return characters, while languages which support Unicode typically include all Unicode space characters. Some implementations also include ASCII control codes (non-printing characters) along with whitespace characters.

Java's trim method considers ASCII spaces and control codes as whitespace, while Java's isWhitespace() method recognizes Unicode space characters.
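The effect of these differing definitions can be seen in a small Python sketch. The sample strings are invented for illustration, and the last part only approximates the Java trim behaviour described above by stripping every code point from 0 through 32; it is not Java's implementation.

```python
sample = "\u00a0\u2003 text \t\n"   # no-break space, em space, then ASCII whitespace

# With no argument, strip() removes every character for which
# str.isspace() is true, which includes Unicode space characters.
unicode_trimmed = sample.strip()

# Restricting the set to four ASCII characters leaves the Unicode spaces alone.
ascii_trimmed = sample.strip(" \t\n\r")

# Approximation of a "space and ASCII control codes" definition.
CONTROL_AND_SPACE = "".join(chr(c) for c in range(0x21))
control_trimmed = "\x00\x01 text\u00a0".strip(CONTROL_AND_SPACE)

print(repr(unicode_trimmed))   # 'text'
print(repr(ascii_trimmed))     # '\xa0\u2003 text'
print(repr(control_trimmed))   # 'text\xa0'
```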

Usage

The following are examples of trimming a string in several programming languages. All of the implementations shown return a new string and do not alter the original variable.

Example usage                                     Languages
String.Trim([chars])                              C#, VB.NET, Windows PowerShell
(string-trim '(#\Space #\Tab #\Newline) string)   Common Lisp
(string-trim string)                              Scheme
string.trim()                                     Java
Trim(String)                                      Pascal[2]
string.strip()                                    Python
strip(string [,option , char])                    REXX
string:strip(string [,option , char])             Erlang
string.strip                                      Ruby
trim($string)                                     PHP
Trim(String)                                      QBasic, Visual Basic, Delphi
string trim $string                               Tcl

Other languages

In languages without a built-in trim function, it is usually simple to create a custom function which accomplishes the same task.
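To show how little is involved, here is a minimal sketch of such a custom function, written in Python for consistency with the example at the top of the article even though Python has strip built in. The default whitespace set is an assumption; a real implementation would pick the set appropriate to its language.

```python
WHITESPACE = " \t\n\r"   # assumed default set for this sketch


def trim(s: str, chars: str = WHITESPACE) -> str:
    """Return s without leading and trailing characters found in chars."""
    start = 0
    end = len(s)
    # Advance past the leading run.
    while start < end and s[start] in chars:
        start += 1
    # Back up over the trailing run.
    while end > start and s[end - 1] in chars:
        end -= 1
    return s[start:end]


print(repr(trim("  this  is a test  ")))   # 'this  is a test'
```

Scanning inward from both ends and slicing once avoids building intermediate strings.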

AWK

In AWK, one can use regular expressions to trim:

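function ltrim(v) { gsub(/^[ \t]+/, "", v); return v }   # strip leading spaces and tabs
function rtrim(v) { gsub(/[ \t]+$/, "", v); return v }   # strip trailing spaces and tabs
function trim(v)  { return rtrim(ltrim(v)) }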

or:

function ltrim(s) { sub(/^ +/, "", s); return s }
function rtrim(s) { sub(/ +$/, "", s); return s }
function trim(s)  { return rtrim(ltrim(s)); }

C/C++

There is no standard trim function in C or C++. Most of the available string libraries[3] for C contain code which implements trimming, or functions that significantly ease an efficient implementation. In some non-standard C libraries the function has also been called EatWhitespace.

The open source C++ library Boost has several trim variants, including a standard one: [4]

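#include <boost/algorithm/string/trim.hpp>
#include <string>
// trim_copy takes a string object and returns a trimmed copy of it
std::string trimmed = boost::algorithm::trim_copy(std::string("  string  "));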

Note that Boost's function named simply trim modifies the input sequence in place[5] and does not return a result.

Since 2.6.18-rc1, the Linux kernel has also included a strip function, strstrip(), which trims the string in place.

Haskell

A trim algorithm in Haskell:

import Data.Char (isSpace)
trim      :: String -> String
trim      = f . f
   where f = reverse . dropWhile isSpace

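This may be interpreted as follows: f drops the leading whitespace and then reverses the string. f is then applied again to its own output, which drops what was originally the trailing whitespace and reverses the string back into its original order. Note that the type signature (the second line) is optional.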
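For readers less familiar with Haskell, the same drop-and-reverse idea can be transcribed into Python. This is a sketch for comparison only; the function names are invented here, and Python's str.isspace may not classify exactly the same characters as Haskell's isSpace.

```python
from itertools import dropwhile


def drop_leading_and_reverse(s: str) -> str:
    """Counterpart of f: drop leading whitespace, then reverse."""
    return "".join(dropwhile(str.isspace, s))[::-1]


def trim_by_reversal(s: str) -> str:
    """Counterpart of trim = f . f."""
    return drop_leading_and_reverse(drop_leading_and_reverse(s))


print(repr(drop_leading_and_reverse("  a b  ")))   # '  b a'
print(repr(trim_by_reversal("  a b  ")))           # 'a b'
```

The intermediate result shows why the second application is needed: after the first pass the original trailing whitespace sits at the front of the reversed string.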

JavaScript

There is no built-in trim function, but a trim method can be made available on all strings by adding it to the String object's prototype:

String.prototype.trim = function() {
  // remove whitespace at the start or at the end of the string
  return this.replace(/^\s+|\s+$/g, "");
};

Perl

Perl has no built-in trim function. However, the functionality is commonly achieved using regular expressions.

Example:

$string =~ s/^\s+//;            # remove leading whitespace
$string =~ s/\s+$//;            # remove trailing whitespace

or:

$string =~ s/^\s+|\s+$//g ;     # remove both leading and trailing whitespace

These examples modify the value of the original variable $string.
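For comparison, the same regular-expression approach can be sketched in Python with the standard re module. Because Python strings are immutable, this version returns a new string and leaves the original untouched; the names regex_trim and EDGE_WHITESPACE are chosen for this illustration.

```python
import re

# \A and \Z anchor at the absolute start and end of the string.
EDGE_WHITESPACE = re.compile(r"\A\s+|\s+\Z")


def regex_trim(s: str) -> str:
    """Return s with leading and trailing whitespace removed."""
    return EDGE_WHITESPACE.sub("", s)


print(repr(regex_trim("  this  is a test \n")))   # 'this  is a test'
```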

Also available for Perl is StripLTSpace in String::Strip from CPAN.

There are, however, two functions that are commonly used to remove characters from the end of strings, chomp and chop:

  • chop removes the last character from a string and returns it.
  • chomp removes the trailing newline from a string if present.
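The difference between the two can be sketched with rough Python analogues. These are illustrations, not ports: the Perl functions modify their argument, and chop returns the removed character, whereas the hypothetical helpers below simply return the shortened string.

```python
def chop(s: str) -> str:
    """Return s without its last character, whatever that character is."""
    return s[:-1]


def chomp(s: str) -> str:
    """Return s without one trailing newline, if there is one."""
    return s[:-1] if s.endswith("\n") else s


print(repr(chop("hello")))       # 'hell'
print(repr(chomp("hello")))      # 'hello'
print(repr(chomp("hello\n")))    # 'hello'
```

Neither helper is a general trim: chop removes a character unconditionally, and chomp ignores spaces and tabs.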

Tcl

The Tcl string command has three relevant subcommands: trim, trimright and trimleft. Each of them accepts an additional argument: a string that represents the set of characters to remove. The default is whitespace (space, tab, newline, carriage return).

Example of trimming vowels:

set string onomatopoeia
set trimmed [string trim $string aeiou]         ;# result is nomatop
set r_trimmed [string trimright $string aeiou]  ;# result is onomatop
set l_trimmed [string trimleft $string aeiou]   ;# result is nomatopoeia

XSLT

XSLT includes the function normalize-space(string) which strips leading and trailing whitespace, in addition to replacing any whitespace sequence (including line breaks) with a single space.

Example:

<xsl:variable name='trimmed'>
   <xsl:value-of select='normalize-space(string)'/>
</xsl:variable>

XSLT 2.0 includes regular expressions, providing another mechanism to perform string trimming.

Another XSLT technique for trimming is to use the XPath 2.0 substring() function.
