All about HTML tags

9 Regular Expressions to strip HTML tags

Quick syntax reference

Flags
  • g - global match
  • i - ignore case
  • m - multiline: ^ and $ match at the start and end of each line
Escaping
  • \ - turns a special character into a literal (\. matches a dot) and some literal characters into special ones (\s, \b)
Quantifiers
  • ? - matches zero or one times
  • * - matches zero or more times
  • + - matches one or more times
  • {n} - matches exactly n times
  • {n,m} - matches at least n times, but not more than m times (no space after the comma)
Anchors
  • ^ - matches at the start of the input (of each line with the m flag)
  • $ - matches at the end of the input (of each line with the m flag)
  • \b - matches at the beginning or the end of a word
Groups and lookahead
  • (?:x) - matches x but does not remember the match (non-capturing group)
  • x(?=y) - matches x only if x is followed by y
  • x(?!y) - matches x only if x is not followed by y
Character Escapes
  • \s - matches whitespace
  • \S - matches any non-whitespace character
  • \f - matches a form-feed
  • \n - matches a linefeed
  • \r - matches a carriage return
  • \t - matches a horizontal tab
  • \v - matches vertical tab
  • \w - matches any alphanumeric character including the underscore. Equivalent to [A-Za-z0-9_]
  • \W - matches any non-word character. Equivalent to [^A-Za-z0-9_]
Others
  • . - matches any character except a newline
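
The difference between the greedy and lazy quantifiers, and the effect of the i flag, matters for every pattern further down. A minimal sketch; the sample string is made up for illustration:

```javascript
// Hypothetical sample string, not taken from the examples in this page.
const sample = "<b>one</b><B>two</B>";

// Greedy: .* expands as far as it can, so the whole string is one match.
const greedy = sample.match(/<.*>/g);

// Lazy: .*? stops at the first ">", so every tag is matched on its own.
const lazy = sample.match(/<.*?>/g);

// Without i only the lowercase tags match; with i the uppercase ones do too.
const caseSensitive = sample.match(/<\/?b>/g);
const ignoreCase = sample.match(/<\/?b>/gi);

console.log(greedy.length, lazy.length, caseSensitive.length, ignoreCase.length);
// 1 4 2 4
```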

Parsing the HTML of a whole page with regular expressions is not an easy job. But if you are dealing with a fragment of HTML and handle it as a string, the following regular expressions may help.

1
matches a specific tag pair (here h4) and captures the content between the tags
RegEx Expression:
/<\s*h4[^>]*>(.*?)<\s*\/\s*h4>/g
Method:
exec, match
Testing String
<h4 class="sds">And more ...</h4>
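
A short sketch of how the two listed methods differ for this pattern, with the slash in the closing tag escaped so the literal parses. The variable names are only illustrative:

```javascript
const h4Pair = /<\s*h4[^>]*>(.*?)<\s*\/\s*h4>/g;
const html = '<h4 class="sds">And more ...</h4>';

// match() with the g flag returns only the full matches.
const whole = html.match(h4Pair);

// exec() exposes the captured content in index 1; call it in a loop
// because a g-flagged regex resumes from its lastIndex each time.
const contents = [];
let m;
while ((m = h4Pair.exec(html)) !== null) {
  contents.push(m[1]);
}

console.log(whole);    // [ '<h4 class="sds">And more ...</h4>' ]
console.log(contents); // [ 'And more ...' ]
```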
2
matches all HTML tags, opening and closing, including any attributes in the tags
RegEx Expression:
/<(.|\n)*?>/g
Method:
match
Testing String
<div class="tab0">CSS code formatter</div><div class="tab2">CSS code compressor</div>
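
Since this pattern matches every tag, it is the natural one for actually stripping tags. The sketch below uses replace, which is not among the methods listed above, and the whitespace clean-up is an added assumption about the desired output:

```javascript
const anyTag = /<(.|\n)*?>/g;
const html = '<div class="tab0">CSS code formatter</div><div class="tab2">CSS code compressor</div>';

// List every tag found in the fragment.
const tags = html.match(anyTag);

// Swap each tag for a space, then collapse runs of whitespace.
const text = html.replace(anyTag, " ").replace(/\s+/g, " ").trim();

console.log(tags.length); // 4
console.log(text);        // CSS code formatter CSS code compressor
```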
3
matches all start tags including attributes in the tags
RegEx Expression:
/<\s*\w.*?>/g
Method:
match
Testing String
<div class="box">5 px radius of round corner</div><div class="box">7 px radius of round corner</div><div style="color:#6699cc">color</div>
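
As a rough illustration, the start tags can be collected with match and the element name pulled out of each one. The second step is an addition for illustration, not part of the original recipe:

```javascript
const startTag = /<\s*\w.*?>/g;
const html = '<div class="box">5 px radius of round corner</div>' +
  '<div class="box">7 px radius of round corner</div>' +
  '<div style="color:#6699cc">color</div>';

// match() returns null when nothing is found, so fall back to an empty array.
const startTags = html.match(startTag) || [];

// The first run of word characters in each start tag is the element name.
const names = startTags.map((tag) => tag.match(/\w+/)[0]);

console.log(startTags.length); // 3
console.log(names);            // [ 'div', 'div', 'div' ]
```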
4
matches all closing tags, plus <br>
RegEx Expression:
/<\s*\/\s*\w\s*.*?>|<\s*br\s*>/g
Method:
match
Testing String
<div class="sds">not sure where it can be used</div></br>
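
A small sketch of what each alternative in the pattern picks up. The second and third input strings are made up to show the `<br>` branch and a self-closing form that neither branch covers:

```javascript
const closeTag = /<\s*\/\s*\w\s*.*?>|<\s*br\s*>/g;
const html = '<div class="sds">not sure where it can be used</div></br>';

// First alternative: anything that starts with "</".
const closers = html.match(closeTag);

// Second alternative: a bare <br>, with optional whitespace inside.
const lineBreaks = "one<br>two< br >three".match(closeTag);

// A self-closing <br/> is matched by neither alternative.
const selfClosing = "a<br/>b".match(closeTag);

console.log(closers);     // [ '</div>', '</br>' ]
console.log(lineBreaks);  // [ '<br>', '< br >' ]
console.log(selfClosing); // null
```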
5
matches the start tag of a specific element (here div) including attributes
RegEx Expression:
/<\s*div.*?>/g
Method:
match
Testing String
<div class="tab1">tabs generator</div>
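
One thing to watch for: because nothing follows `div` in the pattern, it also accepts any element whose name merely begins with "div". The sketch below shows this with a made-up `<divider>` tag and a possible tightening using `\b`; the tightened variant is a suggestion, not part of the original pattern:

```javascript
const openDiv = /<\s*div.*?>/g;
// Suggested variant (an assumption): \b requires the name to end after "div".
const openDivStrict = /<\s*div\b.*?>/g;

const html = '<div class="tab1">tabs generator</div>';
const found = html.match(openDiv);

// Hypothetical custom element, used only to show the over-match.
const loose = "<divider>".match(openDiv);
const strict = "<divider>".match(openDivStrict);

console.log(found);  // [ '<div class="tab1">' ]
console.log(loose);  // [ '<divider>' ]
console.log(strict); // null
```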
6
matches the closing tag of a specific element (here div)
RegEx Expression:
/<\s*\/\s*div\s*.*?>/g
Method:
match
Testing String
<div class="sds">javascript + CSS ...</div>
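
One possible use, sketched here as an assumption rather than something the original suggests, is to pair this pattern with the start-tag pattern from example 5 and compare counts. The helper name `isBalanced` is hypothetical, and it compares counts only, not nesting order:

```javascript
const openDiv = /<\s*div.*?>/g;
const closeDiv = /<\s*\/\s*div\s*.*?>/g;

// Hypothetical helper: true when the fragment has as many </div> as <div>.
function isBalanced(html) {
  const opens = (html.match(openDiv) || []).length;
  const closes = (html.match(closeDiv) || []).length;
  return opens === closes;
}

console.log(isBalanced('<div class="sds">javascript + CSS ...</div>')); // true
console.log(isBalanced("<div><div>unclosed</div>"));                    // false
```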
7
matches both the start and the closing tag of a specific element (here span), including attributes in the tags
RegEx Expression:
/<\s*\/?\s*span\s*.*?>/g
Method:
match
Testing String
<span class="csc">Regex examples</span>
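
Because the optional `\/?` lets the pattern cover both tags, it can be used to unwrap an element while keeping its content. A minimal sketch using replace, which is not among the methods listed above:

```javascript
const spanTag = /<\s*\/?\s*span\s*.*?>/g;
const html = '<span class="csc">Regex examples</span>';

// Both the start tag and the closing tag are returned.
const tags = html.match(spanTag);

// Removing them leaves only the text that was inside the span.
const unwrapped = html.replace(spanTag, "");

console.log(tags);      // [ '<span class="csc">', '</span>' ]
console.log(unwrapped); // Regex examples
```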
8
matches a start tag with a specific attribute (here style, as the first attribute)
RegEx Expression:
/<\s*\w*\s*style.*?>/g
Method:
match
Testing String
<div style="color:#6699cc">round corner</div>
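
The pattern only allows whitespace between the tag name and `style`, so it misses tags where another attribute comes first. The sketch below shows that, together with one possible looser variant; the variant and the second test string are assumptions added for illustration, and the variant would still break on attribute values that contain ">":

```javascript
const styleFirst = /<\s*\w*\s*style.*?>/g;
// Assumed variant: allow any non-">" characters before the style attribute.
const styleAnywhere = /<\s*\w+[^>]*\sstyle\s*=[^>]*>/g;

const html = '<div style="color:#6699cc">round corner</div>';
// Hypothetical input where style is not the first attribute.
const classFirst = '<div class="a" style="x">text</div>';

console.log(html.match(styleFirst));          // [ '<div style="color:#6699cc">' ]
console.log(classFirst.match(styleFirst));    // null
console.log(classFirst.match(styleAnywhere)); // [ '<div class="a" style="x">' ]
```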
9
matches a start tag with an href attribute and captures the URL
RegEx Expression:
/<\s*\w*\s*href\s*=\s*"?\s*([\w\s%#\/\.;:_-]*)\s*"?.*?>/g
Method:
exec, match
Testing String
<span ><a href="http://www.pagecolumn.com/"> 3 Column Layout Generator </a></span> <span > <a href="http://www.pagecolumn.com/2_col_generator.htm">2 Column Layout Generator</a></span>
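
To collect the captured URLs, exec can be called in a loop as in example 1. The sketch below uses `String.prototype.matchAll` instead, an alternative to the listed methods that needs the g flag and an ES2020 runtime:

```javascript
const hrefTag = /<\s*\w*\s*href\s*=\s*"?\s*([\w\s%#\/\.;:_-]*)\s*"?.*?>/g;
const html = '<span ><a href="http://www.pagecolumn.com/"> 3 Column Layout Generator </a></span> ' +
  '<span > <a href="http://www.pagecolumn.com/2_col_generator.htm">2 Column Layout Generator</a></span>';

// Each match object carries the full tag at index 0 and the URL at index 1.
// trim() guards against whitespace that the character class also accepts.
const urls = Array.from(html.matchAll(hrefTag), (m) => m[1].trim());

console.log(urls);
// [ 'http://www.pagecolumn.com/',
//   'http://www.pagecolumn.com/2_col_generator.htm' ]
```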