Comparison of regular expression engines

From Wikipedia, the free encyclopedia

Libraries

List of regular expression libraries

| Library | Official website | Programming language | Software license |
|---|---|---|---|
| Boost.Regex | Boost C++ Libraries | C++ | Boost Software License |
| Boost.Xpressive | Boost C++ Libraries | C++ | Boost Software License |
| CL-PPCRE | Edi Weitz | Common Lisp | BSD |
| GLib/GRegex | Marco Barisione | C | ? |
| GRETA | Microsoft Research | C++ | ? |
| ICU | International Components for Unicode | C/C++/Java | ICU license |
| Jakarta/Regexp | The Apache Jakarta Project | Java | Apache License |
| Oniguruma | Kosako | C | BSD |
| PCRE | Philip Hazel | C/C++ | BSD |
| Qt/QRegExp | Trolltech | C++ | GPLv2 / commercial / QPL |
| TRE | Ville Laurikari | C | LGPL |

  • Boost.Regex was formerly called Regex++.
  • GRegex has been included in GLib since version 2.13.0.
  • The C++ bindings for PCRE were developed by Google and became officially part of PCRE in 2006.

Languages

List of languages that come with regular expression support

| Language | Official website | Software license |
|---|---|---|
| Haskell | Haskell.org | BSD3 |
| .NET | MSDN | ? |
| Perl | Perl.com | Artistic License or the GNU General Public License |
| PHP | PHP.net | ? |
| Python | python.org | Python Software Foundation License |
| Ruby | ruby-doc.org | ? |
| Tcl 8.4 | tcl.tk | Tcl/Tk License (permissive, similar to BSD) |
| D | D | ? |
| Java | Java | ? |

Language features

NOTE: An application that uses a library for regular expression support does not necessarily expose the library's full feature set. For example, GNU Grep, which uses PCRE, does not offer lookahead support, though PCRE does.

Part 1

Language feature comparison (part 1)

| Engine | "+" quantifier | Negated character classes | Non-greedy quantifiers | Shy groups | Lookahead | Lookbehind | Backreferences | >9 indexable captures |
|---|---|---|---|---|---|---|---|---|
| Boost.Regex | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
| Boost.Xpressive | Yes | Yes | Yes | Yes | Yes | Yes | Yes | ? |
| CL-PPCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| EmEditor | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
| GLib/GRegex | ? | ? | ? | ? | ? | ? | ? | ? |
| GNU Grep | Yes | Yes | No | No | No | No | Yes | ? |
| Haskell | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| ICU Regex | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| JGsoft | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| .NET | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| OmniOutliner 3.6.2 | Yes | Yes | Yes | No | No | No | ? | ? |
| PCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Perl | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| PHP | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Python | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Qt/QRegExp | Yes | Yes | No | Yes | Yes | No | Yes | Yes |
| Ruby | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
| TRE | Yes | Yes | Yes | Yes | No | No | Yes | No |
| Vim 7.1.314 (2008-06-09) | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
  • Non-greedy quantifiers match as few characters as possible, instead of as many as possible (the default). Note that many older, pre-POSIX engines were non-greedy and did not have greedy quantifiers at all.
  • Shy groups, also called non-capturing groups, cannot be referred to with backreferences; they are used to speed up matching where the group's content need not be accessed later.
  • Backreferences enable referring to previously matched groups in later parts of the regex and/or the replacement string (where applicable). For instance, ([ab]+)\1 matches the whole string "abab" but not "abaab".
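The three features described in these notes can be illustrated with a short sketch. It uses Python's built-in re module purely as one example engine from the table; the sample strings and variable names are illustrative assumptions, and other engines may spell the same constructs differently.

```python
import re

# Greedy vs. non-greedy: the greedy form takes as much as possible,
# the non-greedy form (note the trailing "?") as little as possible.
text = "<b>bold</b>"
greedy = re.match(r"<.+>", text).group(0)
lazy = re.match(r"<.+?>", text).group(0)

# Shy (non-capturing) group: (?:...) groups without creating a capture,
# so only the second, capturing group is reported.
shy = re.match(r"(?:ab)+(c)", "ababc")
shy_groups = shy.groups()

# Backreference: \1 must repeat exactly what group 1 matched.
# fullmatch() requires the pattern to cover the whole string.
backref = re.compile(r"([ab]+)\1")
matches_abab = backref.fullmatch("abab") is not None
matches_abaab = backref.fullmatch("abaab") is not None

print(greedy, lazy, shy_groups, matches_abab, matches_abaab)
```

The backreference check uses a whole-string match on purpose: an unanchored search would still find the substring "aa" inside "abaab".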

Part 2

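Language feature comparison (part 2)

| Engine | Directives | Conditionals | Atomic groups | Named capture | Comments | Embedded code | Partial matching | Fuzzy matching | Unicode property support [1] |
|---|---|---|---|---|---|---|---|---|---|
| Boost.Regex | Yes | Yes | ? | ? | Yes | No | Yes | No | Yes |
| Boost.Xpressive | No | No | Yes | No | No | No | Yes | No | No |
| CL-PPCRE | Yes | Yes | Yes | Yes | Yes | Yes | ? | No | No |
| EmEditor | Yes | Yes | ? | ? | Yes | No | Yes | No | ? |
| GLib/GRegex | ? | ? | ? | ? | ? | No | Yes | No | Yes |
| GNU Grep | ? | ? | ? | ? | ? | No | ? | No | No |
| Haskell | ? | ? | ? | ? | ? | No | ? | No | No |
| ICU Regex | Yes | Yes | Yes | No | Yes | No | No | No | Yes |
| JGsoft | Yes | Yes | Yes | Yes | Yes | No | Yes | ? | Yes |
| .NET | Yes | Yes | Yes | Yes | Yes | No | ? | No | Yes |
| OmniOutliner 3.6.2 | ? | ? | ? | ? | No | No | ? | No | ? |
| PCRE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
| Perl | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes |
| PHP | Yes | Yes | Yes | ? | Yes | No | No | No | No |
| Python | Yes | Yes | No | Yes | Yes | No | No | No | No |
| Qt/QRegExp | No | No | No | No | No | No | Yes | No | Yes |
| Ruby | Yes | No | No | No | Yes | No | No | No | No |
| TRE | No | No | No | No | No | No | No | Yes | ? |
| Vim 7.1.314 (2008-06-09) | Yes | ? | Yes | ? | ? | No | ? | No | ? |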
  • Directives are also known as flag modifiers or option letters. Example pattern: "(?i:test)"
  • Atomic groups are also called independent sub-expressions.
  • Named captures are similar to backreferences, but use names instead of indices.
  • Named capture is available as of PCRE 7.0 (as of PCRE 4.0 with the Python-like syntax (?P<name>...)).
  • Available as of Perl 5.9.5.
  • Requires optional Unicode support to be enabled.
  • As of Ruby 1.8. The current development version, Ruby 1.9, has additional features.
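As a rough illustration of the directive and named-capture notes, the following sketch again uses Python's re module as one example engine. It assumes Python 3.6 or later, where the scoped form (?i:...) is accepted; the patterns and sample strings are illustrative assumptions.

```python
import re

# Directive (scoped flag): (?i:...) makes only the enclosed part
# case-insensitive; the trailing "ing" stays case-sensitive.
scoped = re.compile(r"(?i:test)ing")
scoped_hits = [s for s in ("TESTing", "testing", "testING") if scoped.fullmatch(s)]

# Named capture: (?P<name>...) is the Python-style syntax mentioned above.
date = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2008-06")
year = date.group("year")
parts = date.groupdict()

# Named backreference: (?P=word) repeats whatever the group "word" matched.
doubled = re.search(r"\b(?P<word>\w+) (?P=word)\b", "this is is a test")
repeated = doubled.group("word")

print(scoped_hits, year, parts, repeated)
```

Named groups remain reachable by index as well, so date.group(1) returns the same text as date.group("year").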

API features

API feature comparison

| Engine | Native UTF-16 support | Native UTF-8 support | Non-linear input support | Dot-matches-newline option | Anchor-matches-newline option |
|---|---|---|---|---|---|
| Boost.Regex | No | No | Yes | Yes | Yes |
| Boost.Xpressive | ? | ? | ? | ? | ? |
| GLib/GRegex | No | Yes | No | Yes | Yes |
| ICU Regex | Yes | No | No | Yes | ? |
| .NET | Yes | No | Yes | Yes | ? |
| PCRE | No | Yes | No | Yes | Yes |
| Qt/QRegExp | Yes | No | No | No | No |
| TRE | No | ? | Yes | Yes | Yes |
  • Native support means that no conversion between UTF-16 and UTF-8 is required, that Unicode properties are supported, and that the encoding type is always available (the platform-dependent wchar_t does not count).
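The dot-matches-newline and anchor-matches-newline options compared in this table can be sketched as follows. Python's re module is not listed in the table; it is used here only as a convenient example, with re.DOTALL and re.MULTILINE standing in for the two options, and the sample text is an illustrative assumption.

```python
import re

text = "first line\nsecond line"

# By default "." does not match a newline, so the match stops at the line end.
default_dot = re.match(r".+", text).group(0)
# With DOTALL, "." also matches "\n" and the match runs across both lines.
dotall = re.match(r".+", text, re.DOTALL).group(0)

# By default "^" anchors only at the start of the whole input.
default_anchor = re.findall(r"^\w+", text)
# With MULTILINE, "^" also matches right after each newline.
multiline_anchor = re.findall(r"^\w+", text, re.MULTILINE)

print(repr(default_dot), repr(dotall), default_anchor, multiline_anchor)
```

Engines marked "No" in these two columns would need a workaround, such as a character class that explicitly includes the newline, to get the same behaviour.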
