A Regular Expression to match all HTML tags

Sharing a Regular Expression to match all HTML tags along with the surrounding white space.

A regular expression to match all HTML tags, including surrounding whitespace, as well as runs of adjacent (clustered) tags:

/(\s*<[^>]+>\s*)+/
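To see what the expression treats as a single match, here is a minimal sketch that collects every match from a small HTML fragment. The class name TagRunDemo, the helper tagRuns, and the sample input are made up for illustration. It also assumes that no tag contains a literal ">" inside an attribute value, since the pattern would end the tag there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class TagRunDemo {
    // Same expression as above, with backslashes doubled for a Java string literal.
    static final Pattern TAG_RUN = Pattern.compile("(\\s*<[^>]+>\\s*)+");

    // Returns each run of adjacent tags (plus surrounding whitespace) as one string.
    static List<String> tagRuns(String html) {
        List<String> runs = new ArrayList<>();
        Matcher m = TAG_RUN.matcher(html);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        // Prints each match in brackets so the whitespace is visible.
        for (String run : tagRuns("<tr><td>a</td> <td>b</td></tr>")) {
            System.out.println("[" + run + "]");
        }
    }
}
```

For this input the matches are "<tr><td>", "</td> <td>" and "</td></tr>": adjacent tags and the whitespace between them are consumed as one match.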

For example, this can be used in Java (or Android) code to split an HTML table and return the text contained in each cell as an array of strings:

 String[] tablecontents = htmltable.split("(\\s*<[^>]+>\\s*)+");

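This uses String.split(String regex), which splits a string around matches of the given regular expression. The fragments are returned as an array, which is quite handy. Note that when the string starts with a tag, as an HTML table does, the first element of the array is an empty string; trailing empty strings are dropped.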
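Here is a small self-contained sketch of the split in action. The class name HtmlTableSplit, the helper cellTexts, and the sample table are assumptions made for illustration. The helper also filters out empty fragments, which is an addition to the one-liner above, so the result contains only the cell text.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of any library.
class HtmlTableSplit {
    // One or more tags, each with optional surrounding whitespace.
    static final String TAG_RUN = "(\\s*<[^>]+>\\s*)+";

    // Splits on tag runs and drops empty fragments, such as the
    // leading one produced when the input starts with a tag.
    static String[] cellTexts(String html) {
        return Arrays.stream(html.split(TAG_RUN))
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String htmltable =
                "<table>\n"
              + "  <tr><td>Alice</td> <td>42</td></tr>\n"
              + "  <tr><td>Bob</td><td>17</td></tr>\n"
              + "</table>";
        // Prints: [Alice, 42, Bob, 17]
        System.out.println(Arrays.toString(cellTexts(htmltable)));
    }
}
```

Without the filter, the same input gives an array whose first element is the empty string, followed by the four cell values.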
