I need to find every ' and " in a string and replace them with the corresponding HTML entities
…but ONLY if they’re not part of an HTML tag (because that would screw up the HTML big-time).
I tried this with a Java regexp, but it seems I broke the regexp engine. At least, as far as I can understand the cryptic error message, it seems to be saying “the point of what you’re trying to achieve here … is something I’m programmed not to even attempt”:
“Exception: Look-behind group does not have an obvious maximum length near index 13”
on regexp: “(?i)(?<!<[^>]*)'(?![^>]*>)”
Is there another way of achieving this? The HTML will probably be poorly formed, so running it through an XML parser isn’t really an option.
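
For what it’s worth, one possible way around the look-behind limit is to avoid look-behind entirely: scan the string with a single pattern that matches either a whole tag or a lone quote character, copy tags through untouched, and swap only the quotes that turn up outside them. The sketch below is only an illustration under assumptions. The class name, method name and the replacement strings passed in are made up for the example, and “tag” here just means anything from < up to the next >, which is crude but tolerant of badly formed HTML.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from any library: replaces quotes outside tag-like runs.
class QuoteReplacer {
    // Alternation order matters: a tag-like run (<...>) is tried first, so any
    // quotes inside it are consumed as part of the tag and never match alone.
    private static final Pattern TAG_OR_QUOTE = Pattern.compile("<[^>]*>|['\"]");

    static String replaceQuotesOutsideTags(String html, String singleRepl, String doubleRepl) {
        Matcher m = TAG_OR_QUOTE.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String token = m.group();
            String repl;
            if (token.startsWith("<")) {
                repl = token;          // a tag: copy through unchanged
            } else if (token.equals("'")) {
                repl = singleRepl;     // single quote in text
            } else {
                repl = doubleRepl;     // double quote in text
            }
            // quoteReplacement stops $ and \ in the replacement being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(repl));
        }
        m.appendTail(out);             // copy whatever follows the last match
        return out.toString();
    }
}
```

Called as QuoteReplacer.replaceQuotesOutsideTags(input, "&rsquo;", "&quot;"), this leaves attribute quotes such as those in <a href="x"> alone and rewrites only the quotes in the text between tags. Known gaps in this sketch: an attribute value that itself contains > ends the “tag” early, a < that is never closed is treated as ordinary text, and the contents of script blocks and comments are not given any special handling.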
