Saturday, December 2, 2006

comprehension

Here I look at the issue of the relationship between the voice of verbs and effective communication. I consider some findings about the parts of a document in which information might be placed so that it is memorable. I also look at the relationship between the frequency of chosen vocabulary and comprehensibility.

Verb-voice

The voice of verbs is mentioned in several printed style-guides and in material about style-software. StyleWriter software’s online product information lists passives among what it calls faults. The Plain English Campaign says that passives can be confusing. Orwell’s fourth rule (of six) says: “Never use the passive where you can use the active.” Strunk and White say that active is “usually more direct and vigorous” than passive. Williams is more nuanced and liberal; a writer may choose not to say who is responsible for an action because he does not know, does not care or would prefer not to. It is possible, says Williams, for the passive sometimes to be “the natural and correct choice.” He says that passives can “improve cohesion and emphasis”.

We thus have a range of views among style-authorities on the use of passives and, as far as I can tell from the material I have seen, those recommendations are not substantiated by evidence of testing performed on readers or listeners. Such testing may have been done, but it is not explicitly quoted or even cited.

Spyridakis and Isakson’s review of academic research on how well readers understand technical documents found that studies conflicted on whether voice affected comprehension. Some work found that active was better than passive while other findings suggested that voice did not matter.

Edwards worked with relative clause sentences among high school pupils and found that comprehension was better when:

  • relative clauses followed independent clauses
  • independent clauses were active rather than passive
  • agents were mentioned first and verbs were passive.

The abstract to her article says: “Comprehension difficulty related less closely to the use of the passive verb form than to the word arrangement in particular passive constructions.”

Further examination of Edwards’s and others’ work may lead one to conclude that there are more important comprehension-related factors at work when writers use passives than the actual selection of that voice in preference to the active one.

I would suggest that the evidence does not conclusively support treating passives as a stylistic taboo. Given the variety of opinions and findings on verb-voice, this could be an area for further study, or it may be an aspect of language which has minimal bearing on communicative effectiveness.

Other findings on making information memorable

As well as reviewing others’ research, Spyridakis and Isakson conducted their own study. Although this was inconclusive about verb-voice, it suggested that information was more memorable if it was in:

  • clauses
  • independent clauses
  • a document’s first paragraph.

Now, here we have some substantiated findings. These contrast starkly with any subjective opinion about style which might appear in a guide or be part of some software’s heuristics. At this stage I want to ask:

  • have we sufficient confidence in Spyridakis and Isakson’s findings?
  • if we have such confidence, may those maxims not be added to this project’s definitive canon of what makes for good English?

The use of frequency in avoiding ambiguity

Gibson and Pearlmutter, relying partly on others’ work which they cite, contrast the following sentences:

  • Amanda believed the senator during the speech.
  • Amanda believed the senator was lying to the committee.

In each case, they say, there is a temporary ambiguity after “senator” in terms of what function “believed” is going to perform. In the first case it licenses a noun-phrase and in the second an embedded sentence. They point out that “understood” behaves similarly and, thus, also causes such temporary ambiguity as readers or listeners are decoding sentences which contain that word.

I would like to suggest that temporary ambiguity will at least slow down the rate of comprehension if not actually prevent it. Therefore, it is desirable that the English we write avoids such ambiguity. It seems to me unrealistic that one should not use “believed” or “understood” at all because of this potential ambiguity. Thankfully, Gibson and Pearlmutter also provide useful information about those two words from which style-rules might be formed which would permit those words’ use under certain conditions.

Gibson and Pearlmutter point out that “believed” is at least three times more likely to be used with an embedded sentence than with a noun-phrase. With “understood”, it is the other way around, only more so because “understood” licenses a noun-phrase in eight instances out of every nine.

Gibson and Pearlmutter seem to be assuming, as do I, that frequency and comprehensibility vary together, not least because one is logically more likely to understand a familiar expression than an unfamiliar one. This suggests to me that style-rules might usefully be formulated from observations such as those about what types of phrase “believed” and “understood” are each likely to license.

I can see problems trying to explain rules about those two verbs in a style-guide whose content would need to be comprehensible to (and memorable by) a lay readership. However, such rules may be incorporated in style-software.
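By way of illustration only, a rule of this kind might be sketched in software as follows. The proportions come from the figures quoted above; treating "at least three times more likely" as exactly three-to-one is an assumption, as are the function names and the idea that the caller already knows which kind of complement follows the verb (a real tool would need a parser to work that out).

```python
from fractions import Fraction

# Complement preferences, taken from the proportions quoted in the post.
# "believed": an embedded sentence is at least three times as likely as a
# noun-phrase; 3/4 versus 1/4 is a lower bound, used here as an assumption.
# "understood": a noun-phrase in eight instances out of every nine.
PREFERENCES = {
    "believed": {"sentence": Fraction(3, 4), "noun-phrase": Fraction(1, 4)},
    "understood": {"sentence": Fraction(1, 9), "noun-phrase": Fraction(8, 9)},
}


def expected_complement(verb):
    """Return the complement type a reader is most likely to expect."""
    prefs = PREFERENCES[verb]
    return max(prefs, key=prefs.get)


def check_usage(verb, complement):
    """Return a warning if the verb takes its less frequent complement.

    `complement` is supplied by the caller (hypothetical: nothing here
    parses a sentence). Returns None when usage matches expectation.
    """
    expected = expected_complement(verb)
    if complement == expected:
        return None
    return (
        f'"{verb}" is more often followed by a {expected}; '
        f"a {complement} here may cause temporary ambiguity"
    )


warning = check_usage("understood", "sentence")
```

Here `check_usage("believed", "sentence")` returns nothing, because that is the more frequent pattern, while the call on the last line produces a warning.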

Frequency and comprehensibility

The findings of Spyridakis and Isakson’s literature review also suggest that frequency and comprehensibility are linked. They say: “High-frequency words (words that occur frequently in our language), short words, and structurally simple words are much easier for readers to recognize and comprehend than their low-frequency or longer counterparts. In fact, high-frequency words tend to be shorter and simpler.”

This finding does actually square with many of the style-guides’ prescriptions. Online promotional material for WhiteSmoke, a computer-based system which claims to help people write better, says: “A powerful engine [presumably used by the software] simulates the human mind by reading millions of carefully selected texts, classifying and storing them, and ultimately providing the user with precise choices.” It could be that this software is performing corpus-analysis which is similar to that used to produce the observations about “believed” and “understood”.

This leads me to consider what one might describe as a usage-democracy basis for the formulation of style-rules. After all, whether or not one likes an expression – be it slang, a neologism, a split infinitive, a dangling participle or a passive verb – if many people are using it, then many people will be seeing it and hearing it and, one assumes, understanding it, not least because of its widespread use – its frequency.
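A frequency-detecting corpus-basis could be sketched, very roughly, like this. It is not a description of how WhiteSmoke or any other product works; the tiny corpus, the threshold and the function names are all invented for illustration, and a real corpus would contain millions of words.

```python
import re
from collections import Counter


def word_frequencies(corpus_text):
    """Count how often each word occurs in a corpus (case-insensitive)."""
    words = re.findall(r"[a-z']+", corpus_text.lower())
    return Counter(words)


def rare_words(draft, frequencies, threshold=2):
    """List words in a draft that occur fewer than `threshold` times
    in the corpus, in order of first appearance."""
    flagged = []
    for word in re.findall(r"[a-z']+", draft.lower()):
        if frequencies[word] < threshold and word not in flagged:
            flagged.append(word)
    return flagged


# Toy corpus, standing in for a large body of real usage.
corpus = "the cat sat on the mat and the dog sat on the rug"
freqs = word_frequencies(corpus)

# Words the corpus has seldom or never seen are flagged for the writer.
flagged = rare_words("the cat sat on the boondoggle", freqs)
```

With this toy corpus, "cat" (seen once) and "boondoggle" (never seen) fall below the threshold, whereas "the", "sat" and "on" do not.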

Observations

This brief survey has produced some tentative points, which I summarise as:

  • passives may not matter that much
  • arrangement of texts in clauses aids recall
  • rules can be written for “believed” and “understood”
  • usage-democracy may well rule good style, suggesting a frequency-detecting corpus-basis for style prescriptions.

Sources

Edwards, Audrey Toan (1969) The Comprehension of Written Sentences Containing Relative Clauses. Ann Arbor: University Microfilms (abstract)

Gibson, Edward, and Neal J Pearlmutter (1998) Constraints on sentence comprehension. Trends in Cognitive Sciences 2: 7: 262-268. Elsevier

Orwell, George (1946) Politics and the English language

Plain English Campaign (undated) How to write in plain English. http://www.plainenglish.co.uk/howto.pdf (19 November 2006)

Spyridakis, Jan H, and Carol S Isakson (1998) The Influence of Text Factors on Readers. Annual Conference - Society for Technical Communication, 45, 259-262 USA (sic): Society for Technical Communication

Strunk, William, Jr., and E B White (2000) The elements of style. Boston: Longman.

StyleWriter (undated) Product Information. http://www.stylewriter-usa.com/productinfo.html (19 November 2006)

WhiteSmoke (undated) About us. http://www.whitesmoke.com/about.html (2 December 2006)

Williams, Joseph M, with Gregory G Colomb (1995) Style: toward clarity and grace. Chicago: University of Chicago Press

Posted by Paul Danon at 22:08:30 | Permalink | No Comments »

Thursday, November 23, 2006

comprehension

One might expect that complex sentences take longer to process. However: “… experiments … which use reading rates as the dependent measure suggest that it is the interaction of … types of deep structure representations with their surface structure forms that accounts for fluctuations in readers’ on-line processing of … sentence types.” Many style-guides advise against using passives. However: “Comprehension difficulty related less closely to the use of the passive verb form than to the word arrangement in particular passive constructions.”

On the comprehension of active and passive sentences, DR Olson, N Filby - Cognitive Psychology, 1972

The comprehension of active and passive sentences as a function of pragmatic expectations, P Herriot - Journal of Verbal Learning and Verbal Behavior, 1969

Focus of attention 

One problem for the generation of natural language text is determining when to use a sequence of simple sentences and when a single complex one is more appropriate. In this paper, we show how focus of attention is one factor that influences this decision and describe its implementation in a system that generates explanations for a student advisor expert system. The implementation uses tests on functional information such as focus of attention within the Prolog definite clause grammar formalism to determine when to use complex sentences, resulting in an efficient generator that has the same benefits as a functional grammar system.

Posted by Paul Danon at 12:59:39 | Permalink | Comments (1) »

Tuesday, November 21, 2006

pronouncing non-English words in English

One does what one can with the sounds one has. American has a perfectly respectable and widely (more than in British English) used /a/ sound. What beats me is why they don’t use it in Italian latte but, instead, say /lɑtei/. One would be forgiven for thinking, BTW, that latte meant milk — you know, all the other stuff about lactation and lactic acid. But no. It’s a type of coffee.

I had an argument with an African acquaintance called Mr Ngakane about how English-speakers might pronounce words like his name which began with /ŋ/. I can just about do it but my point was that normal (!) speakers would be better off using /j/ which is the closest English-language sound. As it was, they, untutored by me, used the rather uncomfortable-making /nagə/.

At Leeds-university in the 1970s there was a Mr Woodhead in the linguistics-department who, annually, would give a very witty talk and demonstration of click-consonants. He would amusingly imply that, the rest of the time, he was kept in a cupboard and only “wheeled out” (his term) every year to entertain the second-year English-language students with his Zulu voiceless affricated velar plosives. It came as something of a pleasant surprise to undergraduates in a long-term emotional relationship to discover that they had been practising bilabial ingressives in the back-row of the cinema without even knowing it. The question is, which closest sound in English does one choose to do for clicks? /k/ or /t/, I suppose, but how boring.

Suddenly bursting, mid-sentence, into a foreign pronunciation of just one or two words, only to lapse back into one’s estuaryese can disturb one’s audience and, even, in my case, sometimes wake them. Imagine you’re pontificating one night at a bus-shelter in Poplar about French politics and decide, perhaps a paragraph ahead of time, that you’re going to have a shot at an authentic rendition (without a safety-net) of Ségolène Royal. You’re all limbering-up for the delicious /ʁ/ at the start of her surname yet, when you do it, people think you’ve been slightly sick and offer a tissue. Try an authentic Dutch pronunciation and they give you a cough-sweet, German and they back off, Italian and they think you’re drunk.

Never try to order Grolsch lager while affecting an authentic Dutch accent. Barmen will check if you’re actually after a White Horse whisky, or maybe you need a hot toddy for that nasty cold you’ve got. If you don’t want to die of thirst, give in, go with the grain and pronounce it like it was German. Same problem with Oranjeboom lager. You’ll never get your laughing-gear around a frothy pint of that stuff in London if you don’t ask for an orange boom (which is like white noise only louder).

No, best to stick to English’s reliable old set of some 40 phonemes and, while you’re about it, really relish the final /s/ or (better still) /z/ in the well-established Parisss, Marseillezzz and Lyonzzz.

This brings me to the time-honoured British tradition of Irritating the French, which includes scrupulously pronouncing otherwise silent terminal letters. Dijon mustard must be /di’ʒɔn/ (where getting the stress wrong too really rubs it in). If a gentleman is a doyen, be sure to describe him as a doyenne. The first elements of en suite and en route are an absolute gift, and none can forget the exquisite habit of Mr John Major, former British prime minister, of pronouncing the French president’s first name as though it was something you used to change a flat tyre on a lorry.

Of course, such phonemic warfare is waged in the opposite direction. Across the Fifth Republic and her former dominions, schoolchildren are, according to a Napoleonic timetable, drilled in ensuring that their pronunciations of beat and bit are identical. This is unfortunate if, like one TV-chef I heard, you end up telling people to put items for cooking into the oven on what sounded unnervingly like a baking-shit.

Posted by Paul Danon at 20:15:21 | Permalink | No Comments »

rookie

A BBC webpage uses “rookie” and I write to the BBC: “‘Rookie’ isn’t standard English and won’t be understood by many people in Britain. Also, some would call it slang. A lot of slang is country-specific. Similar words are ‘boomer’, ‘boondoggle’ and ‘home run’, as well as ‘sticky wicket’ and ‘googly’. Best to avoid all slang and to use words that are understood by all English speakers.”

Jonathan Amos, Assistant Editor, Science and Nature, replies: “I would largely agree with you - but I don’t think you can be too prescriptive. Some words that might be regarded as slang have very wide currency (eg movie); other supposedly non-slang words can be meaningless to many, or used wrongly (eg enormity). One has to be aware of words or phrases that have very specific meanings in different cultures (eg pants/trousers). On balance, I am happy with ‘rookie’, not least because of the context in which it is being used.”

I reply that “rookie” isn’t understood by many English speakers.

Posted by Paul Danon at 18:52:00 | Permalink | No Comments »

word-formation

One may Google for something and/or Google someone, i.e. look for it/them on the web using a popular search-engine.

The other day I was upbraided by a colleague for using cookie as a verb. I was explaining (as his eyelids drooped) a web-based system which we use in our work. I said that, the second time you log in, the system will probably have cookied your computer so that you won’t need to enter all your details again.

I note that, whereas MSN is the abbreviation for a range of Microsoft online services, it is now sometimes used to refer only to the instant-messaging system which is but one of those services. One may MSN someone, as one may message them. I feel tempted to say that one may messenge someone since at least one of those services is called a messenger (rather than a messager).

Webcam has meant a camera whose output appears continuously on a webpage, perhaps like live TV or perhaps in the form of a picture taken every few seconds or minutes which replaces its predecessor. It can now be used to refer to the use of a computer-connected camera as part of an online dialogue (such as an MSN session or a Skype phone call). The camera is, thus, called a webcam but is not on the web.

Internet is often used to mean the same as the worldwide web. The techies will tell you that the web is just part of the net, other components including email, messaging and file-transfer.

For many years now the IT community have tried to maintain a distinction between memory and storage. The former disappears when you switch off the machine while the latter persists. No wonder people confuse the two since, when you switch on your computer in the morning, you expect it to remember the work you did the day before in the shape of the files you saved. What’s needed is a term (and accompanying swearword) for when you’ve written eight pages of text without saving it to disk and someone in the street puts a pneumatic drill through a main power-cable outside your house.

Refresh used to be about attaining that nice, awake feeling you get after a cold shower or a strong cup of tea. Now it also means requesting a new version of a webpage. The user can do this and/or the website may determine that new versions of pages will be sent to users. Such auto-refreshing is used on pages which tell you how late your train is running. Understanding refreshing (as opposed to refreshment) is important for web-users because many assume that, when they look at a webpage, they are seeing a live picture as they would if they were watching television. They are actually seeing a copy of the page in the form that it was when they requested it, which could have been some time before; unless it’s refreshed, that is.

Computer-folks make an arcane distinction between key and button. The former is physical and on a keyboard. The latter is a representation of a physical key but on a screen.

The phrasal verb mouse-over involves one’s moving one’s on-screen pointer over something on the screen, in many cases a graphic which changes when you do that. This is an interesting use of mouse (the physical pointing-device) to mean pointer (the virtual thing on the screen). Some computers have neither mouses nor mice, but you can still move the pointer on the screen and, thus, mouse-over without a mouse.

My newest acquisition is “cloud” to mean the three-dimensional area in which one may receive wireless internet, thus a wifi cloud. Wouldn’t it be nice if you could see them and they were different colours like pink and baby-blue? Better still, if you could sit on them after a busy day’s work.

Posted by Paul Danon at 18:51:24 | Permalink | No Comments »

hyphenation

I use hyphens to join what I call compound-nouns (like I did just then). I used to talk in terms of adjectival nouns but maybe this is wrong. I like hyphenating my noun-compounds! I’m not sure why, but it feels neat to be flagging-up the fact that one knows that these two contiguous words are a unit. It may help resolve ambiguity. At the golf-course, you can have green wellies which are rubber boots of the same colour as grass, and green-wellies which are special boots for wearing on the parts of the links around the holes. Is that convincing? BTW, with “green wellies” the stress is even but in “green-wellies” the first word is prominent. One problem I have explaining this is that not all folks know what a noun is, let alone a compound-/adjectival one.

Posted by Paul Danon at 18:50:43 | Permalink | Comments (1) »

capital-offences

Ask anyone (!) and they’ll say that initial capitalisation is used to start a sentence and to identify names of people, countries, places and institutions which you’d find in the phone-book. Thus, one has John Smith who lives in England and works at Falmer-station for Southern Railway. However, in the staff-newsletter, John may worryingly be described as a Senior Ticketman, which isn’t his name, just his paygrade. At the head of John’s nation we have Elizabeth II, the reigning female monarch, also known as a queen. In Britain at least, however, she’s not the queen but the Queen. I can’t think why, any more than I can understand why we have the Second World War or the National Health Service (which, as far as I can tell, doesn’t exist as a single institution).

Ah, say the capitalists (those who support the initial capitalisation of non-proper nouns), when we say the Queen, we mean a particular queen. Sure, I say, but when I talk about my dog (which is a particular dog) I don’t call him the Dog. Same goes for prime ministers. There are lots of them but, just because, when you write prime minister, you mean the one who runs the Canadian government, that doesn’t make it a proper noun. Same goes for the pope. Sure, you might write Pope Benedict, but he’s a pope, not a Pope or the Pope.

Ah, say the capitalists, the conventions are somewhat fluid. They accommodate certain roles and not others. But if they’re that fluid, they’re surely not rules and we have anarchy. Or maybe we have Anarchy. After all, some people write about Communism, so why can’t any political expression have an initial capital? You see, once you let certain vulgar nouns have initial caps, any Noun can.

I’m a radical, me (in case you hadn’t noticed). Indeed, I might even be a Radical. I suggest that we refer to the linguistics and English language department at the University of Sussex, where there is a professor of linguistics who is, of course, Dr Max Wheeler with all those capitals. The department (not the Department) is benignly ruled over by its head (not Head) of department. Much nicer, easier on the eye and not needing so many wasteful pressings of the shift-key.

Posted by Paul Danon at 18:46:08 | Permalink | Comments (1) »

Tuesday, November 14, 2006

supervision 20061114

Supervision report to Dr Max Wheeler, L&EL, from Paul Danon, research student

We met today (14 November 2006) in your office from 16:00 till 17:15. This was my first supervision since I began the best part of two years’ intermission. Let me take this opportunity to say how pleased I am to be back at Sussex and, particularly, in L&EL.

I presented at yesterday’s Research on Languages and Linguistics Seminar and the audience was very gracious and appreciative. I got two pieces of fan-mail! Ms Carol O’Neal wrote: “Many congratulations on your excellent and most entertaining seminar.” Mr Nicholas Padmore wrote: “I attended your fantastic presentation for ROLLS yesterday evening, and wanted first to congratulate you for that.” Sorry to blow my own trumpet but, as I understand it, part of this undertaking is to win the confidence of research-based colleagues.

I spoke at the meeting about work done to date and, whereas I previously presented to colleagues about my review of paper-based style-guides, this time I also included material about software which offers to improve style. Although this review-work has been useful, my confidence has not been inspired by either printed or computer-based guides. Evidence of these tools’ effectiveness is sparse. Their precepts could well be valid but, for me at least, empirical proof of such validity is needed before I publish to that effect.

Interesting input from yesterday’s meeting included the observation that the ethos of at least some of the books on style came from a can-do attitude to writing which is found in the USA. Whereas in other cultures the ability to write well may be regarded as innate, in America it is seen as a skill to be acquired. First-year American undergraduates are taught to present their ideas in writing and Dr Murphy testified to the usefulness of that. Dr Murphy suggested I contact Dr Allison (sic) Smith of Middle Tennessee State University.

My initial wish at today’s supervision was to get down straight away to practical research, testing the style-guides’ and styleware’s prescriptions by using real language with real people. We discussed the formulation by me of a research-outline or -protocol.

However, psychology-literature includes results of tests on comprehension. Insofar as my new approach is to look for the so-to-speak laws of nature about English style, such work should be examined. Imagine that researchers had established that active verbs really did communicate better than passives. Such a fact could join the list of real-life laws about how language worked when it worked well.

My next lines of enquiry should therefore be:

  • PsycINFO
  • Modern Language Association
  • Google Scholar.

My keyword will be comprehension and I shall particularly look out for review-articles which sum up the state of the art on an aspect of the subject.

I shall ask Sussex faculty (including informatics personnel and/or Professor Sampson and/or Dr Bill Keller) about what rules, if any, in natural language processing apply to style. I shall also ask the companies which produce the style-software about the rules they incorporate into that software, including what those rules are and how they are identified as being valid. Also of interest will be reviews of style-software.

Today we also spoke about problems with telling writers to enforce a rule which preferred English words of Germanic origin, since many would not know about many words’ origins. We mentioned how morpheme-frequency might determine readers’ familiarity and, consequently, understandability.

Please may I report on progress in a couple of weeks’ time and might we then decide if another face-to-face meeting is desirable this term, or if we can continue to communicate by email and/or phone? Daytime telephone appointments may be feasible for me.

Posted by Paul Danon at 22:37:27 | Permalink | Comments (3)