The Cusp of Helix

Intact Case

Differences in Regular Expressions Between Programming Languages

While building the Intact Case libraries, I used regular expressions to convert between camelCase and snake_case. I had to write separate conversion patterns for each programming language, because PHP, Ruby, and Javascript each implement a different regular expression specification. In this article, Javascript means ECMAScript 5 (current as of July 2015).

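Pattern  Kind                 PHP  Ruby  Javascript
(?=)     lookahead            yes  yes   yes
(?!)     lookahead            yes  yes   yes
(?<=)    lookbehind           yes  yes   none
(?<!)    lookbehind           yes  yes   none
(?>)     atomic grouping      yes  yes   none
??       lazy matching        yes  yes   yes
*?       lazy matching        yes  yes   yes
+?       lazy matching        yes  yes   yes
?+       possessive matching  yes  yes   none
*+       possessive matching  yes  yes   none
++       possessive matching  yes  yes   none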

Javascript has no lookbehind, atomic grouping, or possessive matching (?+, *+, ++), so complicated match patterns need workarounds.
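As a rough illustration of such a workaround, the sketch below converts between camelCase and snake_case in ES5 without lookbehind. The function names camelToSnake and snakeToCamel are hypothetical and are not taken from the Intact Case source. The idea is to capture the character that a lookbehind would only inspect, then write it back in the replacement.

```javascript
// Hypothetical helpers for illustration; not the Intact Case implementation.

// camelCase -> snake_case.
// With lookbehind one could match the empty position /(?<=[a-z0-9])(?=[A-Z])/,
// but ES5 rejects that syntax. Instead, capture the preceding character
// and put it back with $1.
function camelToSnake(str) {
  return str.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

// snake_case -> camelCase.
// The underscore is consumed by the match and dropped; the captured
// character is upper-cased by the replacer function.
function snakeToCamel(str) {
  return str.replace(/_([a-z0-9])/g, function (match, ch) {
    return ch.toUpperCase();
  });
}

console.log(camelToSnake("camelCaseString"));   // camel_case_string
console.log(snakeToCamel("snake_case_string")); // snakeCaseString
```

One caveat of this approach: the captured character is consumed by the match, so it cannot take part in the next match the way a lookbehind position could. For simple identifiers like the ones above this makes no difference.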

Bug of Regular Expression

While creating the Intact Case libraries, I hit cases where the match results differed from what I expected in some programming languages. After investigating, I found the cause: the regular expression specifications differ in how they handle "or" conditions (alternation with |).

Consider what result you would expect when replacing a string with the match pattern below. Javascript has no (?<=), so the pattern does not use it.

Add @ at the following locations:
  1. Top of the string
  2. After each "Abc"

Match pattern    /^|(Abc)/g
Replace string   "$1@"
Haystack string  AbcAbcAbc
Expected result  @Abc@Abc@Abc@

Code

PHP         preg_replace('/^|(Abc)/', '$1@', 'AbcAbcAbc');
Javascript  "AbcAbcAbc".replace(/^|(Abc)/g, "$1@");
Ruby        'AbcAbcAbc'.gsub(/^|(Abc)/) { "#{$1}@" }

Here are the results for each programming language. Javascript and Ruby did not produce the expected result.

PHP         @Abc@Abc@Abc@
Javascript  @AbcAbc@Abc@
Ruby        @AbcAbc@Abc@
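To see where the Javascript result comes from, it can help to list the individual matches. The sketch below uses a hypothetical helper, collectMatches, that walks the string with exec() and steps forward one character after an empty match, which is how a global replace proceeds in ES5. It then shows one possible rewrite of the pattern, addMarkers (also a hypothetical name), that yields the expected result in Javascript. This is an illustration under those assumptions, not necessarily the fix used in Intact Case.

```javascript
// Hypothetical helper: list every match as "index:text",
// in the order a global replace visits them.
function collectMatches(re, str) {
  var found = [];
  var m;
  re.lastIndex = 0;
  while ((m = re.exec(str)) !== null) {
    found.push(m.index + ":" + JSON.stringify(m[0]));
    // After an empty match, step forward one character, as replace() does;
    // otherwise exec() would return the same empty match forever.
    if (m[0] === "") {
      re.lastIndex += 1;
    }
  }
  return found;
}

console.log(collectMatches(/^|(Abc)/g, "AbcAbcAbc").join(", "));
// 0:"", 3:"Abc", 6:"Abc"
// The empty match at index 0 pushes the search past the first "Abc".

// One possible workaround: let the first alternative consume an "Abc"
// at the start of the string, so it is not skipped.
function addMarkers(str) {
  return str.replace(/^(Abc)?|(Abc)/g, function (match, atStart, later) {
    if (later) {
      return later + "@";          // "Abc" anywhere after the start
    }
    return "@" + (atStart ? atStart + "@" : ""); // start of string
  });
}

console.log(addMarkers("AbcAbcAbc")); // @Abc@Abc@Abc@
```

For this particular goal, a two-step version such as "@" + str.replace(/Abc/g, "$&@") gives the same output and avoids the empty alternative altogether.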