I am trying to get all the words in a string that are at least 4 characters long and fewer than 10 characters. When I use the following regular expression, it just returns the whole string as one word. Could you please look at the following example and tell me how I should write this regular expression?
string result = "Overfishing, erosion and warmer waters are feeding jellyfish blooms in coastal regions worldwide. And they're causing damage";
string[] words = Regex.Split(result, @"[\W]{4,10}");
foreach (string line in words)
{
    Console.WriteLine(line);
}
Pravesh Singh
30-Jan-2014
Your code isn't working because the pattern [\W]{4,10} only matches a run of 4 to 10 consecutive non-word characters, and no such run appears in the string. With nothing to split on, Regex.Split just returns an array containing the original string.
Try using this pattern:
\b\w{4,10}\b
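This will match any sequence of 4 to 10 consecutive word characters, surrounded by word boundaries. Because the pattern now describes the words themselves rather than the separators between them, use it with Regex.Matches instead of Regex.Split. If the upper limit is meant to be exclusive (fewer than 10 characters), use {4,9} instead of {4,10}.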
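As a minimal sketch of how the pattern could be applied to the string from the question: the WordFilter class and its FindWords method are hypothetical names introduced here for illustration, and the default limits of 4 and 10 simply mirror the pattern above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordFilter
{
    // Hypothetical helper (not part of the original answer): collects every
    // run of minLength..maxLength word characters delimited by word boundaries.
    public static List<string> FindWords(string text, int minLength = 4, int maxLength = 10)
    {
        // Builds e.g. \b\w{4,10}\b from the two limits.
        string pattern = @"\b\w{" + minLength + "," + maxLength + @"}\b";

        var words = new List<string>();
        // Regex.Matches returns the matched words; Regex.Split would return
        // the text between the matches instead.
        foreach (Match match in Regex.Matches(text, pattern))
        {
            words.Add(match.Value);
        }
        return words;
    }
}

public static class Program
{
    public static void Main()
    {
        string result = "Overfishing, erosion and warmer waters are feeding jellyfish blooms in coastal regions worldwide. And they're causing damage";

        foreach (string word in WordFilter.FindWords(result))
        {
            Console.WriteLine(word);
        }
    }
}
```

With this input, "Overfishing" is skipped because it has 11 characters, and "they're" contributes only "they", since the apostrophe is not a word character and so acts as a word boundary. Calling FindWords(result, 4, 9) would apply the stricter "fewer than 10" limit.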