werwin

Posted: Visual C# General, string regex
I'm not familiar with regular expressions, so I was wondering if someone could help me with this. I'm trying to get information from inside an HTML tag. I can do ones like <body>, but I can't do meta tags. Here is the tag from the BBC:
<meta name="description" content="bbc.co.uk offers a varied range of sites including news, sport, community, education, children's, and lifestyle sites, with TV programme support, radio on demand, and easy to use web search from the BBC." />
What I want is only "bbc.co.uk offers a varied range of sites including news, sport, community, education, children's, and lifestyle sites, with TV programme support, radio on demand, and easy to use web search from the BBC." How would I discard everything else?
David L

The following example may be of some guidance to you:
string html = "<lots of html...><meta name=\"description\" content=\"bbc.co.uk offers a varied range of sites including news, sport, community, education, children's, and lifestyle sites, with TV programme support, radio on demand, and easy to use web search from the BBC.\" /><lots of html...>";
string pattern = "<meta name=\"description\" content=\"(.*?)\" />"; // lazy (.*?) stops at the first closing "\" />"
MatchCollection mc = Regex.Matches(html, pattern); // gets a collection of all matches (needs using System.Text.RegularExpressions;)
GroupCollection gc = mc[0].Groups; // groups of the first match; check mc.Count > 0 first if the tag may be missing
string myString = gc[1].Value; // group 1 is the captured content (group 0 is the full pattern match)
Basically, the (.*?) in the pattern matches any run of characters other than newlines, taking as few as possible, so the capture ends at the first "\" />" after the content attribute. A greedy (.*) would instead run on to the last "\" />" on the same line.
The captured group is what you're looking for.
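If the page might not contain the tag, or might write it in a different case or with a plain ">" ending, a slightly more tolerant version could look like the sketch below. MetaExtractor and GetDescription are hypothetical names made up for this example, and the pattern still assumes that name comes before content and that both use double quotes.

```csharp
using System;
using System.Text.RegularExpressions;

static class MetaExtractor
{
    // Hypothetical helper (not from the thread): returns the content of the
    // description meta tag, or null if no such tag is found.
    public static string GetDescription(string html)
    {
        // \s+ tolerates extra whitespace between attributes,
        // [^"]* cannot run past the closing quote of content,
        // \s*/?> accepts both "/>" and ">" endings.
        const string pattern =
            "<meta\\s+name=\"description\"\\s+content=\"([^\"]*)\"\\s*/?>";
        Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }
}

static class Program
{
    static void Main()
    {
        string html = "<head><META NAME=\"description\" CONTENT=\"First tag.\">"
                    + "<meta name=\"keywords\" content=\"a, b\" /></head>";
        // Prints: First tag.
        Console.WriteLine(MetaExtractor.GetDescription(html) ?? "(no description)");
        // Prints: (no description)
        Console.WriteLine(MetaExtractor.GetDescription("<p>no meta here</p>") ?? "(no description)");
    }
}
```

Checking m.Success avoids the exception that indexing an empty match collection would throw. For arbitrary real-world pages, where attributes can appear in any order or use single quotes, an HTML parser is likely to be more reliable than a regex.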
werwin

Thanks a lot, that does it for me.