Saturday, 15 January 2011

regex - Tokenization of a string using C#




I need to tokenize a string that contains the definition of a function prototype, using C# code (i.e. the tokenization is implemented in C#).

A generic function prototype looks like:

[string var1, string var2, integer var3], double[] var4, double[,] var5 = myfunction(dates[] dates, double[,] prices, double upperboundweight)

I am relatively new to C#. My effort so far splits the sentence into right and left parts based on the "=" character and tokenizes them separately. The code looks like:

string[] words = examplestring.Split(new string[] { ", " }, StringSplitOptions.None);
foreach (string word in words)
{
    Console.WriteLine(word);
}
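
The snippet above shows only the comma split. The preceding left/right split on "=" could look like the following minimal sketch. `PrototypeSplitter` and `SplitOnEquals` are hypothetical names, and the sketch assumes the prototype contains a single "=" separating outputs from the call.

```csharp
using System;

static class PrototypeSplitter
{
    // Hypothetical helper: returns { left-hand side, right-hand side },
    // splitting at the first '=' only and trimming surrounding whitespace.
    public static string[] SplitOnEquals(string prototype)
    {
        string[] sides = prototype.Split(new[] { '=' }, 2);
        if (sides.Length != 2)
            throw new FormatException("Expected '=' in the prototype.");

        return new[] { sides[0].Trim(), sides[1].Trim() };
    }
}
```

Trimming both sides means the result does not depend on whether the user typed spaces around the "=".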

Notice the whitespace after the ",", which forces users to put a space in their input definition. Looking at the string on the LHS of the input, I get

[string var1
string var2
integer var3]
double[] var4
double[,] var5

which is close to what I am looking for, except for the grouping of var1, var2 and var3 (i.e. it should work with nested expressions one level deep).

The code is fragile: missing whitespace, etc. can cause it to break.
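
One way to address both the grouping and the whitespace fragility is to scan the string character by character and split only on commas that sit outside any brackets. The following is a minimal sketch, not a standard library API: `PrototypeTokenizer`, `SplitTopLevel` and `ExpandGroup` are hypothetical names, and the sketch assumes brackets in the input are balanced.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class PrototypeTokenizer
{
    // Splits on commas that are outside any [...] or (...) pair.
    // Each token is trimmed, so "a,b" and "a, b" give the same result.
    public static List<string> SplitTopLevel(string input)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        int depth = 0;

        foreach (char c in input)
        {
            if (c == '[' || c == '(') depth++;
            else if (c == ']' || c == ')') depth--;

            if (c == ',' && depth == 0)
            {
                tokens.Add(current.ToString().Trim());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        string last = current.ToString().Trim();
        if (last.Length > 0) tokens.Add(last);
        return tokens;
    }

    // A token fully wrapped in [...] is treated as a group and split once more;
    // any other token (e.g. "double[,] var5") is returned unchanged.
    public static List<string> ExpandGroup(string token)
    {
        if (token.Length >= 2 && token[0] == '[' && token[token.Length - 1] == ']')
            return SplitTopLevel(token.Substring(1, token.Length - 2));

        return new List<string> { token };
    }
}
```

Calling `SplitTopLevel` on the left-hand side keeps the bracketed group as a single token, and `ExpandGroup` then recovers var1, var2 and var3 from it. The comma inside `double[,]` is never a split point because it sits at depth 1.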

Surely there must be a nicer way. I tried Regex.Split, but it got hairy real quick, so before I charged at the windmill, I figured I would ask here.
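
For reference, a Regex.Split pattern that stays manageable for the left-hand side is one that splits on a comma only when the next bracket character ahead of it is not a closing "]". This is a sketch under the assumption that square brackets nest at most one level deep; `RegexTokenizer` and `SplitLhs` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexTokenizer
{
    // Matches a comma with optional surrounding whitespace, but only if the
    // text after it does NOT reach a ']' before any other bracket character.
    // That excludes commas inside "[a, b, c]" and inside "double[,]".
    private static readonly Regex TopLevelComma =
        new Regex(@"\s*,\s*(?![^\[\]]*\])");

    public static string[] SplitLhs(string lhs)
    {
        return TopLevelComma.Split(lhs.Trim());
    }
}
```

The negative lookahead only understands one level of square brackets, so deeper nesting or the parenthesised argument list on the right-hand side would need a different approach.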

What is a nice way to tokenize the given string? Are there standard libraries or modules that can help with this?

c# regex tokenize
