Difference between revisions of "Chance News 41"


Revision as of 14:52, 18 October 2008

Quotation

Forsooth

http://www.pmean.com/images/ForgottenMissingValue.jpg

This graphical forsooth was submitted by Steve Simon.

Is the Bradley effect real?

Do polls lie about race? Kate Zernike, The New York Times, October 12, 2008.

Much has been written about the "Bradley effect." This phenomenon was first noted in the 1982 race for governor of California, where the Los Angeles mayor, Tom Bradley, polled far ahead of his competition but lost by a small margin. It was also noted in elections involving Harold Washington, David Dinkins, and Douglas Wilder. All of these candidates were black men, and in each of these elections the polls were more favorable to the black candidate than the election results were. The belief is that people who are polled don't want to appear bigoted to the pollster by opposing the black candidate, but feel no such social pressure when casting their ballots.

In recent days, nervous Obama supporters have traded worry about a survey — widely disputed by pollsters yet voraciously consumed by the politically obsessed — that concluded racial bias would cost Mr. Obama six percentage points in the final outcome.

Is that true? Perhaps there is a Bradley effect, but perhaps not.

But pollsters and political scientists say concern about a Bradley effect — some call it a Wilder effect or a Dinkins effect, and plenty call it a theory in search of data — is misplaced. It obscures what they argue is the more important point: there are plenty of ways that race complicates polling. Considered alone or in combination, these factors could produce an unforeseen Obama landslide with surprise victories in the South, a stunningly large Obama loss, or a recount-thin margin. In a year that has already turned expectations upside down, it is hard to completely reassure the fretters.

The article notes situations where there may be a reverse Bradley effect. This occurs when

polls understate support for a black candidate, particularly in regions where it is socially acceptable to express distrust of blacks.

More critical than social expectations, perhaps, is an even more fundamental issue about polling.

Research shows that those who refuse to participate in surveys tend to be less likely to vote for a black candidate.

One survey researcher, Andrew Kohut, got at this indirectly by comparing people who responded immediately with those who responded only after some extra effort.

Mr. Kohut conducted a study in 1997 looking at differences between people who readily agreed to be polled and those who agreed only after one or more callbacks. Reluctant participants were significantly more likely to have negative attitudes toward blacks — 15 percent said they had a “very favorable” attitude toward them, as opposed to 24 percent of the ready respondents. “The kinds of people suspicious of surveys are also more intolerant,” Mr. Kohut said.

The article discusses some of the issues involving the race of the person conducting the survey interview.

A further complication is the race of the person who asks the questions. Talking to a white interviewer, blacks or whites are more likely to say that they are supporting the white candidate; talking to a black interviewer, people are more likely to support the black candidate. This holds true whether the surveys are in person, or on the phone.

It is unclear, however, which type of interviewer is more likely to produce an accurate response.

Submitted by Steve Simon

Questions

1. If there is indeed a Bradley effect, is there any statistical adjustment that could be made to produce more accurate election polling results?

2. Does the study by Andrew Kohut produce a valid conclusion about the racial attitudes of non-respondents?


SMOG (Simple Measure of Gobbledygook)

Strange as it may seem to the general public, even statisticians want to write properly in order to communicate with the reader. A previous wiki (Ghost Writers) mentioned several websites that calculate readability and grade level using regression analysis; as you will see, there are others. According to Wikipedia (http://en.wikipedia.org/wiki/SMOG), SMOG (Simple Measure of Gobbledygook) "is a readability formula that estimates the years of education needed to completely understand a piece of writing. SMOG is widely used, particularly for checking health messages. The precise SMOG formula yields an outstandingly high 0.985 correlation with the grades of readers who had 100% comprehension of test materials. SMOG was published by G. Harry McLaughlin in 1969 as a more accurate and more easily calculated substitute for the Gunning-Fog Index."

To calculate SMOG:

1. Count a number of sentences (at least 10 from the start of a text, 10 from the middle, and 10 from the end).
2. In those sentences, count the polysyllables (words of 3 or more syllables).
3. Calculate using

$$\text{grade} = 1.0430 \sqrt{\text{number of polysyllables} \times \frac{30}{\text{number of sentences}}} + 3.1291$$
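
The three steps can be sketched in Python. This is only a sketch: it assumes the precise formula given in Wikipedia's SMOG article, and it assumes that the "complex words" count the website reports for Chance News 40 (425 complex words in 136 sentences, quoted further down) is the same thing as the polysyllable count. The function names are made up for illustration.

```python
import math


def smog_grade(polysyllables: int, sentences: int) -> float:
    """Precise SMOG grade (assumed: formula from Wikipedia's SMOG article)."""
    if sentences <= 0:
        raise ValueError("need at least one sentence")
    # Scale the polysyllable count to a 30-sentence sample.
    scaled = polysyllables * 30 / sentences
    return 1.0430 * math.sqrt(scaled) + 3.1291


def smog_simplified(polysyllables: int, sentences: int) -> float:
    """Simplified SMOG: square root of the scaled count, plus 3."""
    if sentences <= 0:
        raise ValueError("need at least one sentence")
    return math.sqrt(polysyllables * 30 / sentences) + 3


# Counts quoted later in this item for Chance News 40 (assumed polysyllables = complex words).
print(round(smog_grade(425, 136), 1))       # 13.2
print(round(smog_simplified(425, 136), 1))  # 12.7
```

The simplified version happens to agree with the 12.7 the website reports, which suggests, but does not confirm, that the site uses the simplified form rather than the precise one.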

For the Gunning-Fog Index, go to http://en.wikipedia.org/wiki/Gunning-Fog_Index, where you will find:

The Gunning fog index can be calculated with the following algorithm:

1. Take a full passage that is around 100 words (do not omit any sentences).
2. Find the average sentence length (divide the number of words by the number of sentences).
3. Count words with three or more syllables (complex words), not including proper nouns (for example, Djibouti), compound words, or familiar jargon, and not counting common suffixes such as -es, -ed, or -ing as a syllable.
4. Add the average sentence length and the percentage of complex words (e.g., +13.37, not simply +0.1337).
5. Multiply the result by 0.4.

The complete formula is as follows:

$$0.4 \left[ \left( \frac{\text{words}}{\text{sentences}} \right) + 100 \left( \frac{\text{complex words}}{\text{words}} \right) \right]$$
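
Steps 2, 4, and 5 can be written as a short Python sketch. Counting the complex words (step 3) is left to the reader or to the website; the function name and argument names are illustrative only. The check uses the counts the website reports for Chance News 40 further down (2464 words, 136 sentences, 425 complex words).

```python
def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning fog index from raw counts, following the five steps above."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    avg_sentence_length = words / sentences        # step 2
    percent_complex = 100 * complex_words / words  # a percentage, e.g. 13.37, not 0.1337
    return 0.4 * (avg_sentence_length + percent_complex)  # steps 4 and 5


print(round(gunning_fog(2464, 136, 425), 1))  # 14.1
```

This agrees with the Gunning fog index of 14.1 that the website reports for the same text.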


The wonderful website http://www.editcentral.com/gwt/com.editcentral.EC/EC.html "is an interactive web page for checking a sample of writing. It is modeled after the ancient Unix utilities style and diction." One can "enter or copy text into the first box below. The scores to the right give the readability of the text according to various formulas" including all the ones mentioned thus far. "Words of three or more syllables are underlined. You should check the words or phrases in red to see if they should be re-written according to the suggestion in the brackets."

"Click the 'Demo' button one, two, or three times to see different samples of text. Check the scores for each sample; do you think the scores match the abilities of students in those grades? The different formulas give different estimates of grade level [required to understand the text]. Which formula is the most accurate? Click the 'Submit' button to look for problems and to see the more complex words underlined."

For example, enter the entire contents of Chance News Wiki #40 to obtain the following:

Flesch reading ease score: 62.1
Automated readability index: 11.1
Flesch-Kincaid grade level: 9.1
Coleman-Liau index: 11.9
Gunning fog index: 14.1
SMOG index: 12.7

15658 characters, 12913 non-space characters, 12291 letters/numbers, 2464 words, 425 complex words, 3681 syllables, 136 sentences; 4.99 chars per word, 1.49 syllables per word, 18.12 words per sentence.
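
The two Flesch scores are not defined anywhere in this item. As a rough check, the sketch below assumes the standard published Flesch reading ease and Flesch-Kincaid formulas (an assumption; the website's own code is not shown here) and applies them to the word, sentence, and syllable counts above. The function names are illustrative.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch reading ease (assumed: the standard published formula)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level (assumed: the standard published formula)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Counts reported by the website for Chance News 40.
WORDS, SENTENCES, SYLLABLES = 2464, 136, 3681

print(round(flesch_reading_ease(WORDS, SENTENCES, SYLLABLES), 1))   # 62.1
print(round(flesch_kincaid_grade(WORDS, SENTENCES, SYLLABLES), 1))  # 9.1
```

Both values agree with the scores reported above, so the website appears to be using these formulas, at least for this text.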

Oh, I forgot to mention this, which explains the Coleman-Liau Index.

To calculate the Coleman-Liau Index:

1. Divide the number of characters by the number of words, and multiply by 5.89. Call this A.
2. Take the number of sentences in a fragment of 100 words, and multiply by 0.3. Call this B.
3. Subtract B from A and subtract 15.8.
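
A minimal Python sketch of these three steps follows. It assumes that "characters" means the letters/numbers count (12291 for Chance News 40), since that is the count that gives the 4.99 chars per word reported above, and it takes "sentences in a fragment of 100 words" to be the average number of sentences per 100 words. The function name is illustrative.

```python
def coleman_liau(characters: int, words: int, sentences: int) -> float:
    """Coleman-Liau index from raw counts, following the three steps above."""
    if words <= 0:
        raise ValueError("need at least one word")
    a = 5.89 * characters / words          # step 1
    b = 0.3 * (100 * sentences / words)    # step 2: sentences per 100 words
    return a - b - 15.8                    # step 3


# Assumed: characters = letters/numbers count reported for Chance News 40.
print(round(coleman_liau(12291, 2464, 136), 1))  # 11.9
```

This agrees with the Coleman-Liau index of 11.9 reported by the website.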


And from http://en.wikipedia.org/wiki/Automated_Readability_Index:

To calculate the Automated Readability Index:

1. Divide the number of characters by the number of words, and multiply by 4.71.
2. Divide the number of words by the number of sentences, and multiply by 0.5.
3. Add #1 and #2 together, and subtract 21.43.
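
The same calculation as a Python sketch, again assuming that "characters" means the letters/numbers count (12291) reported for Chance News 40; the function name is illustrative.

```python
def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """Automated Readability Index from raw counts, following the three steps above."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    chars_term = 4.71 * characters / words   # step 1
    words_term = 0.5 * words / sentences     # step 2
    return chars_term + words_term - 21.43   # step 3


# Assumed: characters = letters/numbers count reported for Chance News 40.
print(round(automated_readability_index(12291, 2464, 136), 1))  # 11.1
```

This agrees with the automated readability index of 11.1 reported by the website.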


Discussion

1. If possible, randomly sample material you have written and use http://www.editcentral.com/gwt/com.editcentral.EC/EC.html to see how your writing has changed over the years, thus obtaining a longitudinal view. Which index shows the most change in absolute value and/or relative value?

2. Do the same for Chance News to see how it has changed over the years.

3. Ask some teachers of English what they think of these figures of merit.

Submitted by Paul Alper
