Research Process: Questionnaire Survey and Analysis

The research article “A Comparison of the Gatehouse Bullying Scale and the Peer Relations Questionnaire for Students in Secondary School”, authored by Lyndal Bond, Sarah Wolfe, et al., evaluates the accuracy and ease of use of an alternative to the test previously in use. Given that “bullying is common in all schools”, the authors stress the importance of an alternative test because “there is neither a universally accepted operational definition of bullying nor a commonly accepted method to gather the information” (Bond et al., p. 76). There is further disagreement over whether a definition of bullying should be supplied to student respondents or whether they should simply be asked if certain acts had occurred. The Peer Relations Questionnaire (PRQ) in current use does provide the student with a definition of bullying; however, the test “has 7 sections with 33 items specifically about bullying, and takes about 35 minutes to complete” (p. 76).

Given the importance of determining the extent of bullying, the authors constructed a brief twelve-question survey, without a definition, that can be completed in much less time. They titled it the Gatehouse Bullying Scale (GBS) and designed a study in which students were given both the PRQ and the GBS to determine (1) the accuracy of the GBS in comparison to the PRQ and (2) the “test-retest reliability of the GBS” (p. 77).

The rationale of the study was to provide an accurate and efficient method of assessing the level of bullying by “self-report” in lieu of “direct observation” (p. 76). Ostensibly only one study design, the GBS, was implemented in the survey; however, the complete PRQ was not used, only a “subset of questions” (p. 76). It can be inferred that the PRQ subset was specific to bullying and included the bullying definition; however, the authors are silent as to whether using a “subset” instead of the entire test would have any effect on reliability. The GBS asked respondents specific questions describing certain behavior and whether the respondent had been subjected to that behavior “recently”, as opposed to within a determinate time period.

Study variables were minimized to a great extent. There was no selection of the student participants beyond their being in the eighth grade. Reference was made to the schools being of “medium to low socioeconomic backgrounds”, but no effort was made to differentiate students according to any criteria whatsoever. One variable was acknowledged in the time framing of the two tests, with the GBS using “recently” and the PRQ referring to “this year” (p. 78). The greatest variable, therefore, is the degree of variance between the format and actual questions presented by the GBS and those presented by the PRQ. The authors stress that “it was necessary to combine some of the items in the PRQ” for appropriate comparison with the GBS (p. 78). Both tests supplied operational definitions (have you been called names, have you been teased, have you been threatened), although the frequency scale for the GBS was specific (most days, once a week, less than once a week) in contrast to that of the PRQ (never, sometimes, often).

The survey was administered twice, “about three weeks apart”, and the answers to the GBS were compared with the answers given to the PRQ (p. 77). A statistical analysis of the answers, charted by prevalence and frequency, showed agreement between the two instruments ranging from a low of 75.6% on “bullied” to a high of 90.1% on “left out of things frequently” (p. 77). The authors considered the test-retest comparison to show “moderate to good agreement for the GBS measures over time”, which suggests the GBS is reasonably reliable (p. 78).

The validity of the GBS is entirely dependent on the validity of its “yardstick”, the PRQ. Appropriately, the authors discuss the shortcomings of the PRQ, including “problems and inconsistencies associated with providing definitions of bullying” and the length of the survey being “a burden” to respondents and “problematic” in some studies (p. 76). The obvious dilemma is that researchers cannot have it both ways. On the one hand, the authors seem to claim the PRQ is not a particularly valid instrument and needs to be replaced; on the other, they use it as the measure of their own instrument. The validity of the GBS is therefore questionable.

The accuracy of the tables of comparison is undisputed. However, several issues with Table 2 are worthy of note. The first comparison refers to “bullied”, yet no definition of bullying is given in the GBS. Additionally, it appears all the data are reduced to a single quantifier, “frequently”, which is problematic unless a student equates “often” with “most days”. Kappa values indicate fair to moderate agreement across the board, with a range of .42 to .58 and a cluster of .50 to .54. Kappa values in Table 3 are spread much more widely, from .36 to .63. Both tables support the authors’ conclusion “that the GBS is a short, reliable tool to measure the occurrence of bullying in schools” (p. 78). However, that appears to be faint praise at best, self-deluding at worst.
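The gap between percentage agreement in the high 70s to 90s and kappa values near .50 is easier to judge with a worked example. The sketch below assumes the reported statistic is Cohen's kappa (the review does not say which kappa the article uses) and applies it to a hypothetical 2x2 table; the function name and the counts are illustrative only and are not taken from the article.

```python
from typing import Sequence, Tuple


def agreement_and_kappa(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for a square table.

    table[i][j] counts respondents placed in category i by the first
    instrument and in category j by the second instrument.
    """
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("table is empty")
    k = len(table)
    # Proportion of respondents classified identically by both instruments.
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance alone, given the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts, NOT data from the article.
# Rows: instrument A (yes, no); columns: instrument B (yes, no).
example = [[20, 5], [10, 15]]
observed, kappa = agreement_and_kappa(example)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

In this illustration the two instruments agree on 70% of respondents, yet kappa is only .40, because half of that agreement would be expected by chance from the marginal totals. Raw agreement percentages therefore tend to look stronger than the corresponding kappa values.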

One significant limitation of the study appears to be the provision of a definition. The authors state the inherent problems in supplying a definition and explain that “because the PRQ defined bullying, the PRQ questions were placed after the GBS questions on the questionnaire” (p. 77). Citing D. Salin’s work (which was directed towards measurement of bullying in the adult workplace), they note that “higher prevalence rates were obtained when participants were provided with a list of negative acts than when they were provided with a definition of bullying” (p. 78). They reached a different result, which they attributed to the “different time frames” mentioned above (p. 78). However, the study would have been more informative, if not more reliable, had the questionnaires been given out both ways: half with the GBS first, the other half with the PRQ first. Any great discrepancy between the two orderings would immediately set off alarms.

Other limitations include the language used: “was upset ‘a lot’ by bullying” leaves a great deal to be desired. The students in this study were young teenagers in the eighth grade, capable of some degree of differentiation beyond “a lot”. Similarly, the language used is open to interpretation, even by children. What separates the normal rough and tumble of life on the playground (cliques, teasing, name calling and rumors) from the constant harassment and intimidation most adults equate with “bullying”?

Contrary to the authors’ position that the GBS is a “reliable tool to measure the occurrence of bullying”, it is more a loose measure of the kind of activity adults perceive as youth bullying. This is not merely a matter of semantics; nowhere is there any allusion to what the children would define as “bullying”, and there is no indication whether it would be similar to the adult definition. However, given the horrific amount of deadly school violence, any quick questionnaire that could possibly shed some light on a “bullying” situation before it becomes violent or is met with a violent reprisal by a bullied victim is worthwhile. This is a study that clearly calls for additional research and clarification of an extremely important issue. There will never be a “perfect measure” for qualitative perceptions, but the more educators and counselors know about their charges, the better off society will be.


Bond, Lyndal, Sarah Wolfe, Michelle Tollit, Helen Butler, and George Patton. “A Comparison of the Gatehouse Bullying Scale and the Peer Relations Questionnaire for Students in Secondary School.” The Journal of School Health, vol. 77, no. 2, Feb. 2007, p. 75. ProQuest Medical Library.