C#: String Extension to Replace Accented Characters

Have you ever wanted to replace “accented” characters in a string with their unaccented base letters? Here’s a string extension that strips these diacritics from a string. It needs C# 3.0 or later, since that is the version that introduced extension methods and var:

 
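using System.Globalization;
using System.Text;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class StringExtensions
{
    public static string ReplaceDiacritics(this string source)
    {
        // Form D splits each accented letter into a base letter plus combining marks.
        string sourceInFormD = source.Normalize(NormalizationForm.FormD);

        var output = new StringBuilder();
        foreach (char c in sourceInFormD)
        {
            // Keep everything except the combining marks (the accents).
            UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(c);
            if (uc != UnicodeCategory.NonSpacingMark)
                output.Append(c);
        }

        // Recompose whatever remains into Form C.
        return output.ToString().Normalize(NormalizationForm.FormC);
    }
}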

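The extension replaces characters like “ö” with “o”, “è” with “e” and “ñ” with “n”. Letters such as “ø”, “ß” or “æ” pass through unchanged, because Unicode does not decompose them into a base letter plus a combining mark. This is handy for building readable URLs, or for auto-complete / type-ahead search boxes where you want a query to match both the accented and the unaccented spelling.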
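
To make those two use cases concrete, here is a minimal sketch of how the extension might be used. The helper names ToSlug and StartsWithIgnoringAccents are made up for this example, and the slug rules (lower-case, hyphens between words) are just one reasonable assumption, not a standard.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public static class StringExtensions
{
    // Same technique as the extension in this post:
    // decompose, drop the combining marks, recompose.
    public static string ReplaceDiacritics(this string source)
    {
        var output = new StringBuilder();
        foreach (char c in source.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                output.Append(c);
        }
        return output.ToString().Normalize(NormalizationForm.FormC);
    }

    // Hypothetical helper: strip accents, lower-case, and turn every run of
    // non-letter/non-digit characters into a single hyphen.
    // Note: char.IsLetterOrDigit also accepts non-ASCII letters such as "ø",
    // so this sketch does not guarantee an ASCII-only result.
    public static string ToSlug(this string title)
    {
        var slug = new StringBuilder();
        foreach (char c in title.ReplaceDiacritics().ToLowerInvariant())
        {
            if (char.IsLetterOrDigit(c))
                slug.Append(c);
            else if (slug.Length > 0 && slug[slug.Length - 1] != '-')
                slug.Append('-');
        }
        return slug.ToString().Trim('-');
    }

    // Hypothetical helper: prefix match that ignores both accents and case,
    // as a type-ahead box might do.
    public static bool StartsWithIgnoringAccents(this string candidate, string typed)
    {
        return candidate.ReplaceDiacritics()
            .StartsWith(typed.ReplaceDiacritics(), StringComparison.OrdinalIgnoreCase);
    }
}

public static class Program
{
    public static void Main()
    {
        // prints "creme-brulee-a-la-nino"
        Console.WriteLine("Crème Brûlée à la Niño".ToSlug());

        // Typing "zo" matches "Zoë" and "Zoltán" but not "José".
        var names = new List<string> { "Zoë", "Zoltán", "José" };
        foreach (string match in names.Where(n => n.StartsWithIgnoringAccents("zo")))
            Console.WriteLine(match);
    }
}
```

Stripping accents from both the stored value and the typed text means it doesn't matter which side has them. For a large list you would probably want to store the stripped form once rather than recompute it on every keystroke.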

