Parsing HTML with preg_match

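preg_match_all('#<b>.+?</b>#i', $html, $matches);
// The ? after .+ makes the match non-greedy and is absolutely critical

$matches[0] holds the results: $matches[0][0] is the first match, $matches[0][1] the second, and so on, each including its <b> tags.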

Non-greedy matching stops the pattern from gobbling up everything between the first opening bold tag and the very last closing one. I’ve never quite understood why greedy is the default behavior in regular expressions (RegExp), but I guess there’s a good reason for it.
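
To make the difference concrete, here is a small sketch that runs the greedy and non-greedy versions of the pattern against the same string. The sample string and the variable names $greedy and $lazy are made up for illustration, and the parentheses add a capture group (not in the pattern above) so that index 1 of each result holds just the text between the tags.

```php
<?php
// Hypothetical sample input, not from the original post.
$html = 'One <b>first</b> two <b>second</b> three';

// Greedy: .+ runs on to the last </b> in the string.
preg_match_all('#<b>(.+)</b>#i', $html, $greedy);

// Non-greedy: .+? stops at the nearest </b>.
preg_match_all('#<b>(.+?)</b>#i', $html, $lazy);

print_r($greedy[1]); // one element: "first</b> two <b>second"
print_r($lazy[1]);   // two elements: "first" and "second"
```

The greedy version finds a single match that swallows both bold sections, while the non-greedy version finds each one separately.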

PHP also has the very useful strip_tags function, which removes HTML and PHP tags from a string.
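
As a rough sketch of how strip_tags might be used, the snippet below strips all tags from a string and then strips everything except bold tags using the optional second argument. The sample string and the variable names $text and $boldOnly are assumptions for illustration only.

```php
<?php
// Hypothetical sample input, not from the original post.
$html = '<p>Hello <b>bold</b> world</p>';

// Remove every tag, keeping only the text content.
$text = strip_tags($html);

// Keep <b> tags and strip everything else; the second argument lists allowed tags.
$boldOnly = strip_tags($html, '<b>');

echo $text, "\n";     // Hello bold world
echo $boldOnly, "\n"; // Hello <b>bold</b> world
```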

