C# - How do I remove all HTML tags from a string without knowing which tags are in it?
Is there an easy way to remove all HTML tags, or anything HTML-related, from a string?
For example:
string title = "<b> hulk hogan's celebrity championship wrestling <font color=\"#228b22\">[proj # 206010]</font></b> (reality series, )";
The above should become:
"hulk hogan's celebrity championship wrestling [proj # 206010] (reality series)"
You can use a simple regex like this:
public static string StripHTML(string input)
{
    // Requires: using System.Text.RegularExpressions;
    return Regex.Replace(input, "<.*?>", string.Empty);
}
Be aware that this solution has its own flaws. See "Remove HTML tags in String" for more information (especially the comments by @mehaase).
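To make the caveat concrete, here is a minimal sketch that runs the regex against the title from the question and against two inputs it handles badly. The class name HtmlStripper and the two extra sample strings are made up for illustration; they are not from the linked question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HtmlStripper
{
    // Removes everything from a '<' up to the next '>' (non-greedy match).
    public static string StripHTML(string input)
    {
        return Regex.Replace(input, "<.*?>", string.Empty);
    }
}

public static class Program
{
    public static void Main()
    {
        string title = "<b> hulk hogan's celebrity championship wrestling <font color=\"#228b22\">[proj # 206010]</font></b> (reality series, )";

        // Tags are gone, but the leading space and the trailing ", " are left untouched:
        // [ hulk hogan's celebrity championship wrestling [proj # 206010] (reality series, )]
        Console.WriteLine("[" + HtmlStripper.StripHTML(title) + "]");

        // Flaw 1: a '>' inside an attribute value ends the match too early.
        // Prints: [ b">link]
        Console.WriteLine("[" + HtmlStripper.StripHTML("<a title=\"a > b\">link</a>") + "]");

        // Flaw 2: plain text that merely contains '<' and '>' is treated as a tag.
        // Prints: [1  2]
        Console.WriteLine("[" + HtmlStripper.StripHTML("1 < 2 and 3 > 2") + "]");
    }
}
```

Note that the regex only removes tags. Whitespace cleanup (for example Trim()) and leftovers such as the ", " in the example have to be handled separately. The pattern also does not match tags that span several lines unless RegexOptions.Singleline is passed.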
Another solution is to use the HTML Agility Pack.
You can find an example using the library here: HTML Agility Pack - removing unwanted tags without removing content?
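For reference, a rough sketch of the parser-based approach could look like the following. It assumes the third-party HtmlAgilityPack NuGet package is installed. The class and method names (HtmlTextExtractor, StripHtmlWithParser) are hypothetical, and this is not the code from the linked answer.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, not part of the .NET base library

public static class HtmlTextExtractor
{
    // Hypothetical helper: parses the markup and returns only its text content.
    public static string StripHtmlWithParser(string input)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(input);

        // InnerText collects the text of the parsed document without the tags;
        // DeEntitize then decodes entities such as &amp; into plain characters.
        string text = HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);

        // Trim is an extra step added here to drop leading/trailing whitespace.
        return text.Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        string title = "<b> hulk hogan's celebrity championship wrestling <font color=\"#228b22\">[proj # 206010]</font></b> (reality series, )";
        Console.WriteLine(HtmlTextExtractor.StripHtmlWithParser(title));
    }
}
```

Because the input is parsed as HTML rather than pattern-matched, quoted attribute values that contain '>' should not confuse it the way they confuse the regex. As with the regex, the ", " inside the parentheses is ordinary text and stays in the result.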