C#: String Extension to Replace Accented Characters

Have you ever wanted to replace “accented” characters in a string with their unaccented equivalents? Here’s a string extension that strips these diacritics from a string. It is written as an extension method, so it needs C# 3.0 or later:

 
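using System.Globalization;
using System.Text;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class StringExtensions
{
    public static string ReplaceDiacritics(this string source)
    {
        // Decompose each accented character into a base character plus combining marks.
        string sourceInFormD = source.Normalize(NormalizationForm.FormD);

        var output = new StringBuilder();
        foreach (char c in sourceInFormD)
        {
            // Keep everything except the combining (non-spacing) marks.
            UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(c);
            if (uc != UnicodeCategory.NonSpacingMark)
                output.Append(c);
        }

        // Recompose whatever is left into the usual precomposed form.
        return output.ToString().Normalize(NormalizationForm.FormC);
    }
}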

The extension replaces characters like “ö” with “o”, “è” with “e” and “ñ” with “n”. This is great for getting acceptable URLs or for auto-complete / type-ahead search boxes where you want to match on both the accented and non-accented characters.
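To make the two use cases concrete, here is a minimal sketch of how the extension might be applied. The helper names Suggest and ToSlug, and the class name DiacriticsDemo, are hypothetical and not part of the original post; the extension itself is repeated only so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;

public static class DiacriticsDemo
{
    // Same approach as the extension in the post, repeated so this sketch is self-contained.
    public static string ReplaceDiacritics(this string source)
    {
        var output = new StringBuilder();
        foreach (char c in source.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                output.Append(c);
        }
        return output.ToString().Normalize(NormalizationForm.FormC);
    }

    // Hypothetical helper: accent-insensitive, case-insensitive prefix match
    // of the kind a type-ahead search box might use.
    public static IEnumerable<string> Suggest(IEnumerable<string> candidates, string typed)
    {
        string key = typed.ReplaceDiacritics();
        return candidates.Where(
            c => c.ReplaceDiacritics().StartsWith(key, StringComparison.OrdinalIgnoreCase));
    }

    // Hypothetical helper: build a URL slug from a title.
    // Assumption: only ASCII letters and digits are kept; any other run
    // of characters collapses into a single hyphen.
    public static string ToSlug(string title)
    {
        string plain = title.ReplaceDiacritics().ToLowerInvariant();
        var slug = new StringBuilder();
        foreach (char c in plain)
        {
            if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
                slug.Append(c);
            else if (slug.Length > 0 && slug[slug.Length - 1] != '-')
                slug.Append('-');
        }
        return slug.ToString().Trim('-');
    }
}
```

With these helpers, Suggest(new[] { "Señor", "Sender", "Zoë" }, "sen") yields “Señor” and “Sender”, and ToSlug("Crème Brûlée à la Zoë") yields “creme-brulee-a-la-zoe”. Note that letters such as “Ł” or “ø” have no canonical decomposition, so ReplaceDiacritics leaves them unchanged; if you need them mapped to “L” or “o”, you would have to add an explicit substitution step.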


4 Responses to C#: String Extension to Replace Accented Characters

  1. Chris says:

    Excellent. Just what I was looking for, thanks for sharing.

  2. Fausto David says:

    Thx!

  3. DaveMan says:

    You can use this:

    var myNewString = Encoding.UTF8.GetString(Encoding.GetEncoding("ISO-8859-8").GetBytes(myStringWithAccents));

  4. Fabio Cometti says:

    The proposed method fails for me on the character ‘Ł’; DaveMan’s method works better for my purposes.
