Understanding $0 and $1 in JavaScript Regular Expressions

A prototype method to extend the exec method

Including parentheses in a regular expression pattern causes the corresponding submatches to be remembered. For example, /a(b)c/ matches the characters 'abc' and remembers 'b'; the whole match 'abc' is referred to here as $0 and the group 'b' as $1. /a(b)c(d)e/ matches the characters 'abcde' and remembers 'b' in $1 and 'd' in $2. There may be more groups: one for each pair of capturing parentheses in the regular expression. In most cases, though, we deal with only one group, so understanding $0 and $1 is a useful first step.
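
As a minimal sketch of this numbering (the variable names are illustrative, not part of the original), the array returned by exec holds the whole match at index 0 and the groups at index 1, 2, and so on. Note that in a replacement string JavaScript writes the whole match as $&, not $0.

```javascript
var result = /a(b)c(d)e/.exec("xxabcdexx");
var whole = result[0];   // "abcde": the whole match, called $0 in this article
var first = result[1];   // "b": group 1
var second = result[2];  // "d": group 2

// In a replacement string, $1 and $2 refer to the groups and $& to the whole match.
var swapped = "abcde".replace(/a(b)c(d)e/, "$2$1 from $&"); // "db from abcde"
```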

A pattern with the g flag can match several times, so there may be more than one submatch for $1. The built-in methods remember only one of them in the legacy RegExp.$1 property, and they behave differently: after exec (a RegExp method) $1 holds the first submatch, while after match (a String method) it holds the last submatch.

For example, /=(\w+)/g captures the parameter values in the following URL:

    var str ='http://www.pagecolumn.com/tool/regtest.htm?par1=value1&par2=value2&par3=value3';
exec method
    var re = new RegExp("=(\\w+)", "g"); // the backslash must be doubled inside a string
    var myArray = re.exec(str);
    
return: value1 in $1
match method
    var myArray = str.match(/\=(\w+)/g);
    
return: value3 in $1
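
To make the difference concrete, here is a small sketch of what the two calls actually return for this URL. The variable names url, execResult and matchResult are illustrative, not part of the original.

```javascript
var url = 'http://www.pagecolumn.com/tool/regtest.htm?par1=value1&par2=value2&par3=value3';

// A single exec call stops at the first match and returns it with its groups.
var execResult = new RegExp("=(\\w+)", "g").exec(url);
// execResult[0] is "=value1", execResult[1] is "value1"

// match with the g flag returns every whole match, but none of the groups.
var matchResult = url.match(/=(\w+)/g);
// matchResult is ["=value1", "=value2", "=value3"]
```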

But this is of little use in most cases, because a single call to exec or match leaves us with only one submatch. We need a way to retrieve all submatches. One popular way is the replace method.

    var str = "border-top-width";
    str = str.replace(/-(\w)/gi, function($0,$1) {
        return $1.toUpperCase(); // drop the hyphen, keep the upper-cased letter
    });
        
return: "borderTopWidth"

There are many similar applications. Each $1 value is visited inside the replace callback, which inspired me to write a function that retrieves all the $1 elements.

    String.prototype.$1elements = function(vregex) {
        var elm = [];
        var str = this;
        // Build the pattern from the string argument; "g" visits every match.
        var re = new RegExp(vregex, "g");
        str.replace(re, function($0, $1) {
            elm.push($1); // collect the submatch
            return $0;    // leave the string unchanged
        });
        return elm;
    };
    var str = "border-top-width".$1elements("-(\\w)");
 
return: t,w
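
For comparison, the same result can be sketched without touching String.prototype, by calling exec in a loop. This is only one possible alternative, and the function name firstGroups is hypothetical.

```javascript
// Hypothetical helper: collect the first capture group of every match.
function firstGroups(str, source) {
    var re = new RegExp(source, "g");
    var found = [];
    var m;
    // With the g flag, each exec call resumes from re.lastIndex
    // and returns null once there are no more matches.
    while ((m = re.exec(str)) !== null) {
        found.push(m[1]);
        // Guard against zero-length matches, which would otherwise loop forever.
        if (m[0] === "") re.lastIndex++;
    }
    return found;
}

var letters = firstGroups("border-top-width", "-(\\w)"); // ["t", "w"]
```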

Note that the regular expression is passed to the function as a parameter of type string. Every backslash in the pattern must therefore be escaped with a second backslash in the string literal.
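
A short sketch of why the doubled backslash matters (the variable names are illustrative): inside a string literal, "\w" is just "w", so the RegExp constructor never sees the backslash.

```javascript
var wrong = new RegExp("-(\w)", "g");   // the string is "-(w)", same as /-(w)/g
var right = new RegExp("-(\\w)", "g");  // the string is "-(\w)", same as /-(\w)/g

var wrongSource = wrong.source;  // "-(w)"
var rightSource = right.source;  // "-(\w)"

var wrongHits = "border-top-width".match(wrong);  // ["-w"]: only a literal w matches
var rightHits = "border-top-width".match(right);  // ["-t", "-w"]
```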

In this function we don't care about the value of str, only the array elm. Someone may ask whether we can simply pass $1 as the only argument, since we don't actually use $0 in the function.

    str = str.replace(re, function($1) {
        elm.push($1);
        return $1;
    });

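No, because the above example will return -t,-w. Here $1 holds the whole match, not the submatch: the callback's arguments are assigned by position, so the first parameter always receives the match, whatever it is called. To retrieve the submatches, the function needs a second parameter after it. You can also change the replace statement to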

    str = str.replace(re, function($1, $2) {
        elm.push($2);
        return $1;
    });

This also returns the array of submatches. So you see JavaScript doesn't care which number (0, 1, or 2) follows the $ in a parameter name. The first parameter receives the matched string, and the second receives the first submatch.
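
The positional order can be sketched with descriptive parameter names instead of $ names. The names match, p1, offset and whole are arbitrary choices for this illustration; only their positions matter.

```javascript
var seen = [];
"border-top-width".replace(/-(\w)/g, function (match, p1, offset, whole) {
    // 1st argument: the matched text; 2nd: group 1;
    // then the match position and the full input string.
    seen.push([match, p1, offset]);
    return match; // leave the string unchanged
});
// seen is [["-t", "t", 6], ["-w", "w", 10]]
```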

Let's try another experiment and remove the parentheses from the regular expression. The regexp becomes /-\w/, and we pass it to the function above (don't forget to escape the backslash).

    var str = "border-top-width".$1elements("-\\w");
 
return: 6,10

A surprising result: without parentheses, $1 receives the index of the matched string. With no groups to pass, the next positional argument of the callback is the offset of the match.
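
To see where the numbers come from, this sketch records every argument the callback receives when the pattern has no groups (the variable name args is illustrative).

```javascript
var args = [];
"border-top-width".replace(/-\w/g, function () {
    // Copy the full argument list of this call into a plain array.
    args.push(Array.prototype.slice.call(arguments));
    return arguments[0]; // leave the string unchanged
});
// args is [["-t", 6, "border-top-width"], ["-w", 10, "border-top-width"]]
```

With no capture groups, the offset moves up into the second position, which is exactly the slot the function above reads as $1.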
