Posts Tagged ‘regex’

Perl: Regexp to filter out/match just the price/floating point number, i.e. digits and dots

April 15th, 2010

There are two ways to do it. Both strip every character that is not a digit or a dot:

$price=~tr/0-9.//cd; # delete anything but 0-9 and a real dot

$row_array[6]=~s/[^0-9.]//g; # clean up price, remove pound/dollar sign etc

Categories: Uncategorized

Parsing HTML with preg_match

April 8th, 2010

preg_match_all('#<b>.+?</b>#i', $html, $matches);
// ? means non-greedy and is absolutely critical

$matches[0] holds the results: $matches[0][0], $matches[0][1] and so on are the matched strings, tags included.
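If only the text inside the tags is wanted, one option is to add a capture group. The sketch below assumes that variation (the pattern in the post has no group) and uses a made-up $html string; with the default PREG_PATTERN_ORDER, $matches[0] lists the full matches and $matches[1] lists what the first group captured.

```php
<?php
// Hypothetical sample markup, not taken from the original post.
$html = 'Total: <b>9.99</b> plus <B>1.50</B> postage';

// The parentheses capture only the text between the tags;
// the i flag lets <B> match as well as <b>.
preg_match_all('#<b>(.+?)</b>#i', $html, $matches);

$full  = $matches[0]; // full matches, tags included
$inner = $matches[1]; // captured text only

print_r($full);
print_r($inner);
```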

Non-greedy matching stops the pattern gobbling everything between the first opening bold tag and the very last closing one. I’ve never quite understood why greedy is the default behavior of regular expressions (RegExp), but I guess there’s a good reason for it.
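A quick way to see the difference is to run both versions of the pattern over the same input. This is a minimal sketch; the sample string and the variable names $greedy and $lazy are made up for illustration.

```php
<?php
// Hypothetical sample markup with two separate bold tags.
$html = 'Buy <b>apples</b> and <b>pears</b> today';

// Greedy: .+ runs on to the last </b>, so both tags land in a single match.
preg_match_all('#<b>.+</b>#i', $html, $greedy);

// Non-greedy: .+? stops at the first </b> it reaches, one match per tag.
preg_match_all('#<b>.+?</b>#i', $html, $lazy);

print_r($greedy[0]); // 1 element: "<b>apples</b> and <b>pears</b>"
print_r($lazy[0]);   // 2 elements: "<b>apples</b>" and "<b>pears</b>"
```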

PHP also has the very useful strip_tags function.
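For comparison, here is a small sketch of strip_tags on a made-up string. The optional second argument lists tags to keep.

```php
<?php
// Hypothetical sample markup.
$html = 'Price: <b>9.99</b> <i>each</i>';

// Remove every tag, leaving only the text.
$plain = strip_tags($html);

// Remove every tag except <b>.
$boldOnly = strip_tags($html, '<b>');

echo $plain, "\n";    // Price: 9.99 each
echo $boldOnly, "\n"; // Price: <b>9.99</b> each
```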

Categories: Coding Tips, PHP