Sunday 15 February 2015

Confused about Matcher group in Java regex




I have the following line,

String typename = "abc:xxxxx;";

from which I need to fetch the word abc.

I wrote the following code snippet,

import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern4 = Pattern.compile("(.*):");
Matcher matcher = pattern4.matcher(typename);
String namestr = "";
if (matcher.find()) {
    namestr = matcher.group(1);
}

So if I use group(0) I get abc:, and if I use group(1) I get abc. I want to know:

What do 0 and 1 mean? It would be better if someone could explain this with examples.

The regex pattern contains a : in it, so why does the group(1) result omit it? Does group 1 detect the words inside the parentheses?

So, if I add two more pairs of parentheses, such as \s*(\d*)(.*): then will there be two groups? Will group(1) return the (\d*) part and group(2) return the (.*) part?

The code snippet is given only to clear up my confusion. It is not the code I am dealing with. What the code above does can be done with String.split() in a much easier way.

Capturing and grouping

A capturing group (pattern) creates a group that has the capturing property.

A related one that you might see (and use) is (?:pattern), which creates a group without the capturing property, hence the name non-capturing group.

A group is usually used when you need to repeat a sequence of patterns, e.g. (\.\w+)+, or to specify where alternation should take effect, e.g. ^(0*1|1*0)$ (^, then 0*1 or 1*0, then $) versus ^0*1|1*0$ (^0*1 or 1*0$).
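
To make the difference in alternation scope concrete, here is a minimal sketch. The class name, the helper method, and the sample inputs are made up for illustration; only the two patterns come from the text above.

```java
import java.util.regex.Pattern;

class AlternationScopeDemo {
    // True if the regex matches anywhere in the input (Matcher.find semantics).
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String grouped = "^(0*1|1*0)$";   // anchors apply to both alternatives
        String ungrouped = "^0*1|1*0$";   // ^ binds to 0*1 only, $ binds to 1*0 only

        for (String input : new String[] {"001", "0011", "x110"}) {
            System.out.println(input
                    + " grouped=" + found(grouped, input)
                    + " ungrouped=" + found(ungrouped, input));
        }
    }
}
```

With the group, the whole input has to be 0*1 or 1*0. Without it, an input that merely starts with 0*1 or merely ends with 1*0 is enough, which is why 0011 and x110 are only found by the ungrouped pattern.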

A capturing group, apart from grouping, also records the text matched by the pattern inside the capturing group (pattern). Using your example, (.*):, .* matches abc and : matches :, and since .* is inside the capturing group (.*), the text abc is recorded for capturing group 1.
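
The question's own pattern and input can be used to check this. The following is only a sketch; the class name and the helper method are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupZeroVsOneDemo {
    // Returns the text of the requested group for the first match of (.*):
    // in the input, or null if the pattern does not match at all.
    static String groupOf(String input, int group) {
        Matcher matcher = Pattern.compile("(.*):").matcher(input);
        return matcher.find() ? matcher.group(group) : null;
    }

    public static void main(String[] args) {
        String typename = "abc:xxxxx;";
        System.out.println(groupOf(typename, 0)); // whole match, colon included: abc:
        System.out.println(groupOf(typename, 1)); // only what (.*) matched: abc
    }
}
```

The colon is part of the overall match, so it shows up in group 0, but it sits outside the parentheses, so it is not recorded in group 1.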

Group number

The whole pattern is defined to be group number 0.

Any capturing group in the pattern starts indexing from 1. The indices are defined by the order of the opening parentheses of the capturing groups. As an example, here are all 5 capturing groups in the pattern below:

(group)(?:non-capturing-group)(g(?:ro|u)p( (nested)inside)(another)group)(?=assertion)
|     |                       |          | |      |      ||       |     |
1-----1                       |          | 4------4      |5-------5     |
                              |          3---------------3              |
                              2-----------------------------------------2
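
The numbering can be verified by running the pattern against an input that it matches. This is a sketch: the input string, the class name, and the helper methods are invented for illustration, and only the pattern is taken from above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingDemo {
    static final Pattern PATTERN = Pattern.compile(
            "(group)(?:non-capturing-group)(g(?:ro|u)p( (nested)inside)(another)group)(?=assertion)");

    // Hypothetical input, built by hand so that the pattern matches it.
    static final String INPUT =
            "groupnon-capturing-groupgrop nestedinsideanothergroupassertion";

    static Matcher match() {
        Matcher matcher = PATTERN.matcher(INPUT);
        if (!matcher.find()) {
            throw new IllegalStateException("sample input should match");
        }
        return matcher;
    }

    // Text recorded for group n (0 is the whole match).
    static String group(int n) {
        return match().group(n);
    }

    // Number of capturing groups in the pattern; group 0 is not counted.
    static int count() {
        return match().groupCount();
    }

    public static void main(String[] args) {
        for (int i = 0; i <= count(); i++) {
            System.out.println(i + ": \"" + group(i) + "\"");
        }
    }
}
```

The non-capturing groups and the lookahead assertion get no number, and the text consumed only by the lookahead is not part of group 0.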

The group numbers are used in the back-reference \n in a pattern and in $n in a replacement string.
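
As a small illustration of both uses, the sketch below relies on String.replaceAll. The class name, method names, and sample strings are made up for this example.

```java
class NumberedReferenceDemo {
    // \1 inside the pattern must match the same text that group 1 captured,
    // so this finds a word that is immediately repeated and keeps one copy.
    static String collapseRepeatedWord(String text) {
        return text.replaceAll("\\b(\\w+) \\1\\b", "$1");
    }

    // $1 and $2 in the replacement string stand for the text of groups 1 and 2.
    static String swapAroundAt(String text) {
        return text.replaceAll("(\\w+)@(\\w+)", "$2@$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseRepeatedWord("hello hello world")); // hello world
        System.out.println(swapAroundAt("john@example"));              // example@john
    }
}
```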

In other regex flavors (PCRE, Perl), they can also be used in sub-routine calls.

You can access the text matched by a certain group with Matcher.group(int group). The group numbers can be identified with the rule stated above.
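
Applied to the second pattern from the question, \s*(\d*)(.*):, this gives two capturing groups, numbered in the order of their opening parentheses. The sketch below assumes a sample input with leading spaces and digits, which is not in the question; the class and method names are also hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoGroupsDemo {
    // Returns groups 0..groupCount() of the first match, or an empty array
    // if the pattern does not match.
    static String[] groups(String input) {
        Matcher matcher = Pattern.compile("\\s*(\\d*)(.*):").matcher(input);
        if (!matcher.find()) {
            return new String[0];
        }
        String[] result = new String[matcher.groupCount() + 1];
        for (int i = 0; i < result.length; i++) {
            result[i] = matcher.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] groups = groups("  123abc:xxxxx;");
        System.out.println("0: \"" + groups[0] + "\""); // "  123abc:"
        System.out.println("1: \"" + groups[1] + "\""); // "123"
        System.out.println("2: \"" + groups[2] + "\""); // "abc"
    }
}
```

For the original input abc:xxxxx; there are no digits, so group 1 is the empty string and group 2 is abc.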

In some regex flavors (PCRE, Perl), there is a branch reset feature which allows you to use the same number for capturing groups in different branches of an alternation.

Group name

From Java 7, you can define a named capturing group (?<name>pattern), and you can access the content matched with Matcher.group(String name). The regex is longer, but the code is more meaningful, since it indicates what you are trying to match or extract with the regex.

The group names are used in the back-reference \k<name> in a pattern and in ${name} in a replacement string.

Named capturing groups are still numbered with the same numbering scheme, so they can also be accessed via Matcher.group(int group).
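
A short sketch of the named-group features, reusing the input from the question. The group names prefix and word, the class name, and the helper methods are chosen here only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    static final Pattern PREFIX = Pattern.compile("(?<prefix>.*):");

    // Access by name.
    static String prefixByName(String input) {
        Matcher matcher = PREFIX.matcher(input);
        return matcher.find() ? matcher.group("prefix") : null;
    }

    // The same group is still group number 1.
    static String prefixByNumber(String input) {
        Matcher matcher = PREFIX.matcher(input);
        return matcher.find() ? matcher.group(1) : null;
    }

    // ${prefix} in the replacement string refers to the named group.
    static String bracketPrefix(String input) {
        return input.replaceAll("(?<prefix>.*):", "[${prefix}]");
    }

    // \k<word> in the pattern is a back-reference to the named group.
    static boolean hasRepeatedWord(String input) {
        return Pattern.compile("\\b(?<word>\\w+) \\k<word>\\b").matcher(input).find();
    }

    public static void main(String[] args) {
        String typename = "abc:xxxxx;";
        System.out.println(prefixByName(typename));   // abc
        System.out.println(prefixByNumber(typename)); // abc
        System.out.println(bracketPrefix(typename));  // [abc]xxxxx;
        System.out.println(hasRepeatedWord("hello hello world")); // true
    }
}
```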

Internally, Java's implementation just maps from the name to the group number. Therefore, you cannot use the same name for 2 different capturing groups.
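
The restriction shows up at compile time of the pattern: Pattern.compile rejects a regex that defines the same group name twice with a PatternSyntaxException. In the sketch below the group names and the class name are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class DuplicateGroupNameDemo {
    // True if the regex compiles, false if Pattern.compile rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles("(?<first>\\d+)-(?<second>\\d+)")); // distinct names
        System.out.println(compiles("(?<id>\\d+)-(?<id>\\d+)"));        // same name twice
    }
}
```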

