The Science of Quackometrics

Thursday, June 7, 2007

So, how does the Quackometer work?

The quackometer counts words in web pages that quacks tend to use. The more quack words, the more quackery is suspected. That is Quackometrics.

The basic problem is that spotting the suspect words that many sites use, such as ‘vibrations’ or ‘energy’, is just not good enough, because ‘good science’ sites are quite at liberty to use them too. Even spotting these words in close conjunction with health terms, such as ‘healing’ or ‘nutrients’, is not quite good enough. My own background was research within a nuclear medicine group, and the researchers had lots of legitimate reasons to mention ‘magnets’ and ‘health’ in (almost) the same breath.

So – the site uses an algorithm roughly like this:
  1. Keep a number of different dictionaries for use in tallying words in a web site
  2. Load the suspect web page and strip out as much as possible: HTML tags, scripts, punctuation, etc.
  3. Count the number of words in each of the following dictionaries:
    a) altmed terms: such as ‘homeopathic’, ‘herbal’, ‘naturopath’
    b) pseudoscientific: clearly suspect terms that scientists rarely use such as ‘toxins’, ‘superfoods’.
    c) domain specific words from biomed, physics or chemistry such as ‘energy’, ‘vibration’, ‘organic’.
    d) skeptical words: words that no sincere homeopath would ever use, such as ‘placebo’, ‘flawed’, ‘crank’ or ‘prosecution’.
    e) commerce terms that would indicate that something is for sale, such as ‘products’, ‘shipping’, or ‘p&p’.
    f) pomo terms and religious terms: a few other checks are run on these, although not much is done with them.
  4. Compare the ratios of how often these various types of terms are used against preset thresholds. If a threshold is exceeded, append that test’s associated sentence to the response. The tweaking I have been doing to the site has consisted of adding words to the dictionaries and varying the thresholds for matches.
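
The steps above can be sketched in a few lines of Python. This is only an illustration, not the site's actual code: the dictionaries are cut down to the example words given above, and the function names, threshold values and response sentences are all invented for the sketch. It also only matches single alphabetic words, so a term like ‘p&p’ would need special handling.

```python
import re
from html.parser import HTMLParser

# Step 1: hypothetical, tiny dictionaries built from the example words above.
# The real word lists are much longer and are not reproduced here.
DICTIONARIES = {
    "altmed": {"homeopathic", "herbal", "naturopath"},
    "pseudo": {"toxins", "superfoods"},
    "domain": {"energy", "vibration", "organic"},
    "skeptic": {"placebo", "flawed", "crank", "prosecution"},
    "commerce": {"products", "shipping"},
}


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def tokenize(html):
    """Step 2: strip tags, scripts and punctuation; return lowercase words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z]+", " ".join(parser.parts).lower())


def tally(words):
    """Step 3: count how many words fall in each dictionary."""
    return {
        name: sum(1 for w in words if w in vocab)
        for name, vocab in DICTIONARIES.items()
    }


def assess(html):
    """Step 4: compare ratios to (made-up) thresholds and collect sentences."""
    words = tokenize(html)
    counts = tally(words)
    total = max(len(words), 1)  # avoid dividing by zero on empty pages
    quack = counts["altmed"] + counts["pseudo"]
    verdict = []
    if quack / total > 0.05 and counts["skeptic"] == 0:
        verdict.append("This page uses a lot of quack language.")
    if quack and counts["commerce"] / total > 0.02:
        verdict.append("It also appears to be selling something.")
    if counts["skeptic"] > quack:
        verdict.append("This page reads as skeptical.")
    return counts, verdict
```

Calling `assess(page_html)` returns the per-dictionary counts together with the list of sentences whose thresholds were exceeded. Checking the skeptical words before condemning a page is what keeps a debunking article about homeopathy from being scored the same as a homeopath's shop.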

This does not always work. Some quacks are very clever and avoid the obvious quack words. Nonetheless, they still have completely hatstand ideas.

So, if anyone else has suggestions, then I would be very grateful. I just need to give up my real job to concentrate on this now.


3 Comments:

Blogger Dougal said...

I'm curious: does this work in the manual way described by the FAQ or is it more like a quack-term-specific bayesian classifier in the style of akismet?

November 14, 2007 8:13 AM  
Anonymous Dean Morrison said...

Could I suggest the word 'scholar' would be a good one to add to your list of naughty search words LBD? It seems to be beloved of various quacks , creationists and other assorted religious nutters??

Power to you and Positive Internet by the way!!!

February 28, 2008 3:45 PM  
Blogger Sedgewick Demetrius said...

I see a flaw (only because I've just used the quackometer and was concerned by its result).

The problem occurs in this situation:
A reputable person (in this case Ursula James - Visiting Teaching Fellow at Oxford University Medical School) is listed on an index site, but that index site has a long list of therapies offered by other members including some quackery, all down one side.

Because the quackometer sees the names of all the other therapies it associates them with the subject of the search (Ursula James) and decides that she is a quack.

This is clearly a flaw so I pass it over to you guys to investigate.

July 17, 2009 3:11 AM  


About Me

The Quackometer has been developed by Andy Lewis. If you wish to get in contact then please read the FAQ and then email me. Details in the About section.
