regex - Java: Parsing a multiline text with regular expressions
I want to parse a multiline text, so I wrote this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "[timestamp1] info - message1 \r\n"
        + "[timestamp2] error - message2 \r\n"
        + "[timestamp3] info - message3 \r\n"
        + "message3_details1......... \r\n"
        + "message3_details2 ......... \r\n";

// Greedy quantifiers: this is the version that does not work as intended.
String regex = "\\[(.*)\\] (.*) - (.*)";

// DOTALL lets '.' match line terminators as well.
Pattern p = Pattern.compile(regex, Pattern.DOTALL);
Matcher m = p.matcher(text);

while (m.find()) {
    System.out.println("g1: " + m.group(1));
    System.out.println("g2: " + m.group(2));
    System.out.println("g3: " + m.group(3));
    System.out.println();
}
What I want is this:

g1: timestamp1
g2: info
g3: message1

g1: timestamp2
g2: error
g3: message2

g1: timestamp3
g2: info
g3: message3
message_details1....
message_details2...

But I get this:

g1: timestamp1] info - message1
[timestamp2] error - message2
[timestamp3
g2: info
g3: message3
message3_details1........
message3_details2........

I have not been able to solve this, even with Google's help.
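You have used greedy quantifiers in your regex. Because of that, the .* inside \[(.*)\] consumes everything up to the last ] it can find. You need a reluctant quantifier instead: add a ? after the .* so that it becomes .*? and stops at the first ].

For the last .* you also need a look-ahead, so that it stops before the next [.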
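To see the difference between the two quantifiers in isolation, here is a minimal sketch. The class name GreedyVsReluctant, the helper firstGroup, and the shortened input are made up for illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsReluctant {
    // Hypothetical helper: returns group 1 of the first match, or null if there is none.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "[t1] info - a\r\n[t2] error - b\r\n";

        // Greedy: .* runs to the end of the input, then backtracks to the LAST ']',
        // so group 1 spans from "t1" all the way to "t2".
        System.out.println("greedy:    " + firstGroup("\\[(.*)\\]", input));

        // Reluctant: .*? grows one character at a time and stops at the FIRST ']',
        // so group 1 is just "t1".
        System.out.println("reluctant: " + firstGroup("\\[(.*?)\\]", input));
    }
}
```

With DOTALL enabled the greedy version can cross line breaks, which is why group 1 in the question swallowed the first two log lines.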
The following code works:
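import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "[timestamp1] info - message1 \r\n"
        + "[timestamp2] error - message2 \r\n"
        + "[timestamp3] info - message3 \r\n"
        + "message3_details1......... \r\n"
        + "message3_details2 ......... \r\n";

// Reluctant quantifiers, plus a look-ahead that stops group 3
// before the next '[' or at the end of the input.
String regex = "\\[(.*?)\\] (.*?) - (.*?)(?=\\[|$)";

// DOTALL lets '.' match line terminators, so group 3 can span several lines.
Pattern p = Pattern.compile(regex, Pattern.DOTALL);
Matcher m = p.matcher(text);

while (m.find()) {
    System.out.println("g1: " + m.group(1));
    System.out.println("g2: " + m.group(2));
    System.out.println("g3: " + m.group(3));
    System.out.println();
}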
The last part of the regex, (.*?)(?=\\[|$), matches up to the [ at the start of the next entry, or up to the end of the input ($). The $ is required so that the last two lines are captured in group 3 of the last match.
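The effect of the $ alternative can be checked with a small sketch. The class name LookaheadEnd, the helper messages, and the shortened log text are assumptions made for this example; the call to trim() is also an addition, used only to drop the trailing space and line break that group 3 otherwise keeps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadEnd {
    // Hypothetical helper: collects the trimmed group 3 of every match.
    static List<String> messages(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(text);
        while (m.find()) {
            out.add(m.group(3).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "[t1] info - a\r\n" + "[t2] error - b\r\n" + "more b\r\n";

        // Look-ahead for '[' only: the last entry has no following '[',
        // so it never matches and is silently dropped.
        List<String> withoutEnd = messages("\\[(.*?)\\] (.*?) - (.*?)(?=\\[)", text);
        System.out.println(withoutEnd.size() + " match(es) without $");

        // Look-ahead for '[' or '$': the last entry matches up to the end of the input.
        List<String> withEnd = messages("\\[(.*?)\\] (.*?) - (.*?)(?=\\[|$)", text);
        System.out.println(withEnd.size() + " match(es) with $");
    }
}
```

Note that the look-ahead stops at any [ in the text, so a message body that itself contains a [ would be cut short; this approach assumes brackets only appear around timestamps.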
Output:

g1: timestamp1
g2: info
g3: message1

g1: timestamp2
g2: error
g3: message2

g1: timestamp3
g2: info
g3: message3
message3_details1.........
message3_details2 .........