C# - How do I remove all HTML tags from a string without knowing which tags are in it?


Is there an easy way to remove HTML tags or HTML-related strings from a string?

For example:

string title = "<b> hulk hogan's celebrity championship wrestling &nbsp;&nbsp;&nbsp;<font color=\"#228b22\">[proj # 206010]</font></b>&nbsp;&nbsp;&nbsp; (reality series, &nbsp;)";

The above should become:

"hulk hogan's celebrity championship wrestling [proj # 206010] (reality series)"

You can use a simple regex like this:

using System.Text.RegularExpressions;

public static string StripHtml(string input)
{
    return Regex.Replace(input, "<.*?>", string.Empty);
}

Be aware that this solution has its own flaws. See "Remove HTML tags in String" for more information (especially the comments by @mehaase).
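Note that the regex only removes tags, so entities such as &nbsp; are left in the output. Here is a rough sketch of one way to handle them as well, assuming it is acceptable to decode entities with WebUtility.HtmlDecode and collapse runs of whitespace. The class and method names (HtmlText, StripHtmlAndEntities) are made up for illustration and are not from the original answer.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Hypothetical helper: strips tags, decodes entities, tidies whitespace.
    public static string StripHtmlAndEntities(string input)
    {
        // 1. Remove anything that looks like a tag. Same caveat as above:
        //    plain text such as "1 < 2 and 3 > 2" loses the "< 2 and 3 >" part.
        string noTags = Regex.Replace(input, "<.*?>", string.Empty);

        // 2. Decode entities (&nbsp; becomes U+00A0, &amp; becomes &).
        string decoded = WebUtility.HtmlDecode(noTags);

        // 3. Turn non-breaking spaces into normal spaces, then collapse
        //    whitespace runs to a single space and trim the ends.
        string spaced = decoded.Replace('\u00A0', ' ');
        return Regex.Replace(spaced, @"\s+", " ").Trim();
    }
}
```

For the title in the question this still leaves the stray ", " before the closing parenthesis, because that comma is ordinary text rather than HTML. Cleaning it up would need a separate rule specific to your data.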

Another solution is to use the HTML Agility Pack.
You can find an example using the library here: "HTML Agility Pack - removing unwanted tags without removing content?"
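As a minimal sketch of the HTML Agility Pack route, assuming the HtmlAgilityPack NuGet package is referenced in your project: load the string into an HtmlDocument and read the InnerText of the root node. The wrapper names (HapText, ToPlainText) are made up for illustration; this is not the code from the linked answer.

```csharp
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

public static class HapText
{
    // Hypothetical helper: parses the markup and returns its text content.
    public static string ToPlainText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // InnerText gathers the text of the parsed nodes without the tags;
        // DeEntitize then converts entities such as &amp; to characters.
        string text = HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);
        return text.Trim();
    }
}
```

Because the library actually parses the markup, it should cope better with malformed or unusual HTML than the regex does, at the cost of an extra dependency.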

