regex - Java: Parsing a multiline text with regular expressions
I want to parse a multiline text, so I wrote this:

    String text = "[timestamp1] info - message1 \r\n"
            + "[timestamp2] error - message2 \r\n"
            + "[timestamp3] info - message3 \r\n"
            + "message3_details1......... \r\n"
            + "message3_details2 ......... \r\n";
    String regex = "\\[(.*)\\] (.*) - (.*)";
    Pattern p = Pattern.compile(regex, Pattern.DOTALL);
    Matcher m = p.matcher(text);
    while (m.find()) {
        System.out.println("g1: " + m.group(1));
        System.out.println("g2: " + m.group(2));
        System.out.println("g3: " + m.group(3));
        System.out.println();
    }

What I want is this:

    g1: timestamp1
    g2: info
    g3: message1

    g1: timestamp2
    g2: error
    g3: message2

    g1: timestamp3
    g2: info
    g3: message3
    message_details1....
    message_details2...

But what I get is this:

    g1: timestamp1] info - message1
    [timestamp2] error - message2
    [timestamp3
    g2: info
    g3: message3
    message3_details1........
    message3_details2........

I have not been able to solve this, even with Google's help.
You have used a greedy quantifier in your regex. Because of that, the .* in \\[(.*)\\] consumes everything up to the last ] it can find. You need a reluctant quantifier instead: add ? after .*.
For the last .*, you also need a look-ahead to make it stop before the next [.
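To make the difference concrete, here is a minimal sketch that runs the greedy and the reluctant form against a short two-line input. The class name `QuantifierDemo`, the helper `firstGroup`, and the sample input are made up for illustration; only `Pattern`, `Matcher`, and `Pattern.DOTALL` come from the code in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class QuantifierDemo {

    // Returns group 1 of the first match, or null if the regex does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "[a] x - 1\r\n[b] y - 2";

        // Greedy: .* runs to the end of the input, then backtracks to the LAST "]".
        System.out.println("greedy:    " + firstGroup("\\[(.*)\\]", input));

        // Reluctant: .*? grows one character at a time and stops at the FIRST "]".
        System.out.println("reluctant: " + firstGroup("\\[(.*?)\\]", input));

        // Reluctant plus look-ahead: stops right before the next "[" without consuming it.
        System.out.println("lookahead: " + firstGroup("- (.*?)(?=\\[|$)", input));
    }
}
```

Note that in the look-ahead case the captured group still contains the line break that precedes the next [, so you may want to trim it before printing.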
The following code works:

    String text = "[timestamp1] info - message1 \r\n"
            + "[timestamp2] error - message2 \r\n"
            + "[timestamp3] info - message3 \r\n"
            + "message3_details1......... \r\n"
            + "message3_details2 ......... \r\n";
    String regex = "\\[(.*?)\\] (.*?) - (.*?)(?=\\[|$)";
    Pattern p = Pattern.compile(regex, Pattern.DOTALL);
    Matcher m = p.matcher(text);
    while (m.find()) {
        System.out.println("g1: " + m.group(1));
        System.out.println("g2: " + m.group(2));
        System.out.println("g3: " + m.group(3));
        System.out.println();
    }

The last part of the regex, (.*?)(?=\\[|$), matches up to the [ on the next line, or up to the end of the input ($). The $ is required so that the last two lines are captured in group 3 of the last match.
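The role of $ is easiest to see by counting matches with and without it. The sketch below is only an illustration: `EndAnchorDemo`, `countMatches`, and the shortened sample input are hypothetical names and data, not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing why the "|$" alternative is needed.
class EndAnchorDemo {

    // Counts how many times the regex matches in the input (DOTALL, as in the answer).
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Two entries; the second one has a continuation line and no "[" after it.
        String input = "[t1] info - a\r\n[t2] info - b\r\nmore\r\n";

        // Without "|$" the last entry has no following "[", so the look-ahead never succeeds.
        System.out.println(countMatches("\\[(.*?)\\] (.*?) - (.*?)(?=\\[)", input));

        // With "|$" the last entry is allowed to run to the end of the input.
        System.out.println(countMatches("\\[(.*?)\\] (.*?) - (.*?)(?=\\[|$)", input));
    }
}
```

With this input the first call should find one match and the second call two, because only the second pattern lets the final entry end at the end of the text.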
Output:

    g1: timestamp1
    g2: info
    g3: message1

    g1: timestamp2
    g2: error
    g3: message2

    g1: timestamp3
    g2: info
    g3: message3
    message3_details1.........
    message3_details2 .........
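One caveat: the look-ahead (?=\\[|$) stops at any [, so a message that itself contains a bracket would be cut short. A possible variation, offered as a sketch and not as part of the original answer, is to stop only at a line break followed by [, or at the very end of the input (\z), and to trim the captured message. The class `LogEntries` and the method `parse` are hypothetical names, and the sketch assumes every entry starts at the beginning of a line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects {timestamp, level, message} triples from a log text.
class LogEntries {

    // Assumption: a new entry always begins with "[" directly after a line break.
    private static final Pattern ENTRY = Pattern.compile(
            "\\[(.*?)\\] (.*?) - (.*?)(?=\\r?\\n\\[|\\z)", Pattern.DOTALL);

    static List<String[]> parse(String text) {
        List<String[]> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(text);
        while (m.find()) {
            // trim() drops the trailing space and line break left on the message.
            entries.add(new String[] { m.group(1), m.group(2), m.group(3).trim() });
        }
        return entries;
    }

    public static void main(String[] args) {
        String text = "[timestamp1] info - message1 \r\n"
                + "[timestamp2] error - message2 [with brackets] \r\n"
                + "[timestamp3] info - message3 \r\n"
                + "message3_details1......... \r\n";
        for (String[] e : parse(text)) {
            System.out.println("g1: " + e[0]);
            System.out.println("g2: " + e[1]);
            System.out.println("g3: " + e[2]);
            System.out.println();
        }
    }
}
```

Returning the groups from a method instead of printing them inside the loop also makes the parsing easier to reuse and to test.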