Since writing this post, I have had many great feature suggestions for the Profanity Detector. I have implemented all the suggestions I have received and written another blog post about it. You can read the 2nd post here.

On several projects that I have worked on, we have had a requirement to detect profanity in user input. This includes general swear words, sexual acts, racial slurs, and sexist slurs. Over the years, I have built a pretty comprehensive list of these profanities for the detection process. The list was built by combining lists I found on the internet. Those lists are allegedly used by a lot of the large social networks in their profanity detection, although I can’t verify that.

Profanity detector on GitHub by Stephen Haunts

My profanity detector is on GitHub, and released under an MIT license, so it is free for anyone to use and modify. The main list of profanities can be found in the ProfanityList.cs file. If you are easily offended and a bit sensitive to language then I recommend you DO NOT open that file. It contains some pretty gross language, but to detect the language, you need to be able to define it.

The library is built to .NET Standard 2.1, so it can be used from .NET Core 3.0 or later and from .NET 5 or later. .NET Framework does not support .NET Standard 2.1, so for an older project you can lower the .NET Standard version number; it will still work perfectly well.

There are three methods in the ProfanityFilter class.

IsProfanity() will return true if the passed-in string is considered a profanity and false otherwise.

CensorString() will return a version of an input string with any profanities censored out with the ‘*’ character.

DetectAllProfanities() will return a list of all profanities detected in a string.

Here are some example usages.

// Return true if a bad word
var filter = new ProfanityFilter();
Assert.IsTrue(filter.IsProfanity("arsehole"));

// Return false if NOT a naughty word
var filter = new ProfanityFilter();
Assert.IsFalse(filter.IsProfanity("fluffy"));

In this example, we pass CensorString a sentence that contains two profanities. The method returns the input string with each profanity replaced by ‘*’ characters, one per letter.

var filter = new ProfanityFilter();

var censored = filter.CensorString("Mary had a little [email protected] lamb who was a little [email protected]");
var result = "Mary had a little **** lamb who was a little ******.";

// Expected value first, actual value second
Assert.AreEqual(result, censored);
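As a rough illustration of the idea (this is a hypothetical sketch, not the library's actual code), whole-word censoring can be written with a regular expression and a word set. The class name CensorSketch, the method Censor, and the two harmless placeholder words are all assumptions made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sketch; NOT the library's ProfanityFilter implementation.
public static class CensorSketch
{
    // Tiny stand-in word list (assumption); the real list lives in ProfanityList.cs.
    private static readonly HashSet<string> Words =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "darn", "heck" };

    // Replace each listed word with mask characters of the same length.
    public static string Censor(string input, char mask = '*')
    {
        // \w+ matches each run of word characters, so only whole words are
        // compared against the list ("heckle" is left alone).
        return Regex.Replace(input, @"\w+", m =>
            Words.Contains(m.Value) ? new string(mask, m.Length) : m.Value);
    }
}
```

Calling CensorSketch.Censor("Mary had a darn lamb.") gives "Mary had a **** lamb." because only the listed word is masked and the punctuation is untouched.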

In this final example we pass the string “2 girls 1 cup is my favourite twatting video” into the method DetectAllProfanities. This will return a ReadOnlyCollection of profanities. In this case, it returns the words “twatting” and “2 girls 1 cup” (if you don’t know what this is, DON’T look it up).
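To show how a single pass can pick up both single words and multi-word phrases, here is a minimal sketch under stated assumptions. DetectSketch, DetectAll, and the placeholder terms are hypothetical names invented for the illustration, and the real DetectAllProfanities method may well work differently.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical sketch; NOT the library's DetectAllProfanities implementation.
public static class DetectSketch
{
    // Stand-in terms (assumption), including one multi-word phrase.
    private static readonly string[] Terms = { "darn", "holy moly" };

    // Return every listed term that appears in the input as a whole word or phrase.
    public static ReadOnlyCollection<string> DetectAll(string input)
    {
        var found = new List<string>();
        foreach (var term in Terms)
        {
            int index = input.IndexOf(term, StringComparison.OrdinalIgnoreCase);
            while (index >= 0)
            {
                if (IsWholeTerm(input, index, term.Length))
                {
                    found.Add(term); // report each term once
                    break;
                }
                index = input.IndexOf(term, index + 1, StringComparison.OrdinalIgnoreCase);
            }
        }
        return found.AsReadOnly();
    }

    // True when the match is not glued to a letter or digit on either side.
    private static bool IsWholeTerm(string text, int start, int length)
    {
        int end = start + length;
        bool leftOk = start == 0 || !char.IsLetterOrDigit(text[start - 1]);
        bool rightOk = end == text.Length || !char.IsLetterOrDigit(text[end]);
        return leftOk && rightOk;
    }
}
```

The boundary check is a deliberate design choice in this sketch: it stops a term from matching in the middle of a longer, innocent word.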

For future upgrades I want to be able to classify each profanity into different groups, such as swear words, sexual acts, racial slurs, and sexist remarks. Implementation-wise this is very easy to do; the difficulty comes from the fact that I recognise fewer than a quarter of the terms in this list. To classify them, I would need to look them all up, and I am not sure I want to do that. For example, an Alabama HotPocket sounds like some kind of cake. It’s NOT! Maybe some debauched developers can help with this classification. We’ll see.
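One possible shape for that classification, purely as a sketch, is a dictionary from term to category. The enum ProfanityCategory, the class CategorySketch, and the placeholder entries are hypothetical; none of this exists in the library as described here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical categories, taken from the groups named in the post.
public enum ProfanityCategory { SwearWord, SexualAct, RacialSlur, SexistRemark }

// Hypothetical sketch; the library has no such class in this post.
public static class CategorySketch
{
    // Placeholder entries (assumption); a real table would cover the whole list.
    private static readonly Dictionary<string, ProfanityCategory> Categories =
        new Dictionary<string, ProfanityCategory>(StringComparer.OrdinalIgnoreCase)
        {
            ["darn"] = ProfanityCategory.SwearWord,
            ["heck"] = ProfanityCategory.SwearWord,
        };

    // Return the category for a term, or null when the term is not classified.
    public static ProfanityCategory? Classify(string term) =>
        Categories.TryGetValue(term, out var category)
            ? category
            : (ProfanityCategory?)null;
}
```

Returning null for unknown terms keeps the lookup honest about the terms nobody has classified yet.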

If you need to detect bad language in user input, then hopefully this library will be of use to you. If I am honest, there are probably more efficient ways of implementing it, but this has worked out perfectly well for me in the past. I hope you find it useful.

7 comments

    1. Absolutely, and I am in the process of adding this feature along with a way to be a little more proactive about not blocking words that contain profanities in the middle of them.

      1. Since when was ‘menstruate’ a profanity? I think you’re probably going to violate all sorts of regulations by blocking this.

      2. I agree. I have been going through the list removing words that don’t seem to make sense. I just haven’t checked in the file yet. I removed that one, ‘gay’, and lots of others that don’t really make much sense to keep in there. I have also added support to clear this default list and import a company’s own curated list, which is something that people have asked me for. I found this list online, so I didn’t write it, but I agree it needs amending.

  1. I would advise against proactively chasing every rarely used slang term (i.e. a normal word used once as a profanity in its 27th meaning somewhere in East Guam in 1957) and constantly extending the list; it will do more harm than good.
    With this logic we could also add the Spanish profane words, and then why not include all 6,500 human languages spoken by anyone who might get offended? Considering most profanities are 3-4 letter words in most languages, we would pretty much have to filter out all pronounceable 3-4 letter combinations, rendering all human communication offensive.
    Just an example: the other day I could not enter Pina Colada somewhere on an international website hosted in Europe. They were so PC as to include a blacklist for all 27 languages spoken in the EU countries, including ‘pina’, which is the Hungarian word for female genitals.

    1. Yeah, that’s a good point and something I have been thinking about. What I am doing this week is adding the ability to add your own words, remove words or completely clear out the default list and add in a custom list of approved profanities at an organization, which several people have asked for.

      I didn’t write the list I am using; I found it online, so I admit I don’t really know what a lot of the words mean. The approach I will take is that it is the default list, but you can alter the list or replace it if necessary.
