How To Strip Out One Common Attribute From Every Form Element On The Page?
I have a string variable that contains an HTML page's response. It contains hundreds of tags, including the following three HTML tags: <
Solution 1:
Look at Html Agility Pack.
Using regex:
(?<=<[^<>]*)\sprefix\w+="[^"]*"(?=[^<>]*>)
var result = Regex.Replace(s,
    @"(?<=<[^<>]*)\sprefix\w+=""[^""]*""(?=[^<>]*>)", string.Empty);
Solution 2:
Regex is not the right tool here: HTML is not a regular language, so it shouldn't be parsed with regular expressions. I've heard good things about Html Agility Pack for parsing and working with HTML. Check it out.
Solution 3:
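// Requires: using System.Linq;
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(/* your html here */);
foreach (var item in doc.DocumentNode.Descendants()) {
    // ToArray() takes a snapshot so attributes can be removed while iterating.
    foreach (var attr in item.Attributes.Where(x => x.Name.StartsWith("prefix")).ToArray()) {
        item.Attributes.Remove(attr);
    }
}
var result = doc.DocumentNode.OuterHtml; // the cleaned markup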
Solution 4:
html = Regex.Replace(html, @"(?<=<\w+[^>]*)\s" + Regex.Escape(prefix) + @"\w+\s?=\s?""[^""]*""(?=[^>]*>)", "");
The lookbehind and lookahead make sure the match sits inside a tag (after an opening <tag and before its closing >); between them is a matcher for the attribute itself, prefix#####="?????".
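To see the lookbehind and lookahead at work, here is a minimal sketch that wraps the same kind of pattern in a helper and runs it on a made-up input. The class name AttributeStripper, the method Strip and the sample markup are assumptions for illustration, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AttributeStripper
{
    // Hypothetical helper: removes every attribute whose name starts with `prefix`
    // followed by at least one more word character, e.g. prefix131403013654="2".
    public static string Strip(string html, string prefix)
    {
        string pattern =
            @"(?<=<\w+[^<>]*)"                                         // lookbehind: inside an opening tag
            + @"\s" + Regex.Escape(prefix) + @"\w+\s?=\s?""[^""]*"""   // the attribute and its quoted value
            + @"(?=[^<>]*>)";                                          // lookahead: the tag still closes with >
        return Regex.Replace(html, pattern, "");
    }

    public static void Main()
    {
        // The attribute inside <input ...> is removed; the look-alike text inside <p> is left alone.
        string html = "<input prefix131403013654=\"2\" name=\"a\"><p>text prefix1=\"x\" stays</p>";
        Console.WriteLine(Strip(html, "prefix"));
    }
}
```

The text inside the paragraph survives because a > sits between it and the nearest preceding <, so the lookbehind cannot match there. This still treats HTML as flat text, so the caveats from Solution 2 apply.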
Solution 5:
Here's the heavy-handed method of doing it.
String str = "<tag1 prefix131403013654=\"2\">";
int point = str.IndexOf("prefix131403013654=\"");
while (point != -1) // At least one still exists...
{
    int length = "prefix131403013654=\"".Length;
    // Need to grab the last part now. We know a leading and an ending double quote surround the value, so we find the second quote.
    int secondQuote = str.IndexOf("\"", point + length); // second argument is the start position
    if (point > 0 && secondQuote > point && str.Substring(point - 1, 1) == " ")
    {
        str = str.Remove(point, secondQuote - point + 1);
    }
    else
    {
        point += length; // not an attribute; skip past it so the loop cannot spin forever
    }
    point = str.IndexOf("prefix131403013654=\"", point);
}
Edited for better code. Edited again after testing: added +1 to the length being removed so the closing quote is counted. It works. You could wrap this in a loop that goes through a list of all the attribute names you want removed.
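The loop over a list of names could look roughly like the sketch below. It follows the same IndexOf-based approach; the class name HeavyHandedStripper, the method RemoveAll and the sample attribute names are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class HeavyHandedStripper
{
    // Hypothetical helper: removes name="value" for each exact attribute name in `names`.
    public static string RemoveAll(string str, IEnumerable<string> names)
    {
        foreach (string name in names)
        {
            string opener = name + "=\"";
            int point = str.IndexOf(opener, StringComparison.Ordinal);
            while (point != -1)
            {
                // The closing quote is the first quote after the opening one.
                int secondQuote = str.IndexOf('"', point + opener.Length);
                if (point > 0 && secondQuote != -1 && str[point - 1] == ' ')
                {
                    // +1 so the closing quote is removed as well.
                    str = str.Remove(point, secondQuote - point + 1);
                }
                else
                {
                    // Not preceded by a space, or unterminated: skip this occurrence.
                    point += opener.Length;
                }
                point = str.IndexOf(opener, point, StringComparison.Ordinal);
            }
        }
        return str;
    }

    public static void Main()
    {
        var names = new List<string> { "prefixA", "prefixB" };
        Console.WriteLine(RemoveAll("<tag1 prefixA=\"2\" keep=\"x\" prefixB=\"3\">", names));
    }
}
```

As in the answer's own code, the space that preceded each removed attribute is left in place, so the output can contain doubled or trailing spaces inside the tag.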
If you don't know the attribute's full name, you can change it up like so:
String str = "<tag1 prefix131403013654=\"2\">";
int point = str.IndexOf("prefix");
while (point != -1) // At least one still exists...
{
    int firstQuote = str.IndexOf("\"", point);
    int length = firstQuote - point + 1;
    // Need to grab the last part now. We know a leading and an ending double quote surround the value, so we find the second quote.
    int secondQuote = str.IndexOf("\"", point + length); // second argument is the start position
    if (point > 0 && secondQuote > point && str.Substring(point - 1, 1) == " ") // checking that it's actually an attribute
    {
        str = str.Remove(point, secondQuote - point + 1);
    }
    else
    {
        point += "prefix".Length; // not an attribute; skip past it so the loop cannot spin forever
    }
    point = str.IndexOf("prefix", point);
    // Like I said, a very heavy way of doing it.
}
That will catch all of them that start with prefix.