Friday, July 16, 2010

"I Write Like" Developer reveals criteria

A friend of mine sent me a link to commentary on the "I Write Like" analyzer, and in those comments I found another link to the blog of the analyzer's creator, "Coding Robots". As someone who has been interested in artificial intelligence since the 90s, I found the creator's comments on his blog quite enlightening:

"Currently it analyzes vocabulary (use of words), number of words, commas, and semicolons in sentences, number of sentences with quotation marks and dashes (direct speech)."
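To make that list of criteria concrete, here is a rough sketch of how such features might be pulled out of a passage of text. This is only my guess at the general idea, not the creator's actual code: the function name `extract_features`, the naive sentence splitting, and the choice to report per-sentence averages are all assumptions of mine.

```python
import re
from collections import Counter


def extract_features(text):
    """Hypothetical feature extractor loosely based on the quoted criteria."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    # This is an assumption; a real tool may split sentences differently.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Lowercased words (letters and apostrophes only) for the vocabulary count.
    words = re.findall(r"[A-Za-z']+", text.lower())
    n = len(sentences) or 1  # avoid dividing by zero on empty input
    return {
        "vocabulary": Counter(words),
        "words_per_sentence": len(words) / n,
        "commas_per_sentence": text.count(",") / n,
        "semicolons_per_sentence": text.count(";") / n,
        "sentences_with_quotes": sum(1 for s in sentences if '"' in s),
        # Count a double hyphen or a Unicode em dash as a dash.
        "sentences_with_dashes": sum(
            1 for s in sentences if "--" in s or "\u2014" in s
        ),
    }


sample = 'He paused, then spoke; nobody moved. "Run," she said -- and they ran.'
features = extract_features(sample)
print(features["words_per_sentence"], features["sentences_with_quotes"])
```

Numbers like these could then be compared against the same measurements taken from each author in the database, although how the actual tool makes that comparison is not described in the quote.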

I would like to know the rationale for choosing these criteria. I'm always curious about the choices a programmer makes when developing a program for a particular purpose.

I also enjoyed reading the user comments. As I suspected, subject matter, which dictates many of a writer's word choices, is a key factor in the analysis. Since I often write about archaeological excavations that yield skeletons or other types of forensic remains, it is no surprise that I got more hits on Lovecraft, a classic horror writer, than on other authors.

I also have a tendency to write fairly lengthy sentences and often include asides in parentheses. In blog posts I often quote from the original article that sparked my comment or from other works written by someone involved in the activity I am writing about. So an analysis of this type of work may reflect the material I am quoting more than my own writing.

One user fed the analyzer snippets of actual work by famous authors in its database, and it identified many of them correctly. It did miss some, but I'm sure even famous authors are influenced by other authors.

Of course, there are always the flamers who are quick to criticize someone else's creative efforts. When I developed my virtual Julius Caesar, people would try to ask him about things like the Colosseum that were not built until after his death. I explained in my introduction to the project that I was limiting his knowledgebase to things and events that predated his death. But many people jump into interactions without reading any introductory material, and it is often these people who quickly dismiss (and loudly criticize) such a project because they ignore the programmer's parameters when attempting to interact with the bot.

A number of the users of the "I Write Like" tool complained that the knowledgebase did not include enough women authors and people of color. It will be interesting to see how the tool evolves as the knowledgebase is expanded.

