Ofsted - Worlds Apart? (1996)

Prepared for Ofsted by David Reynolds and Shaun Farrell, this review looked at the major international surveys of educational achievement between 1964 and 1990.

The text of Worlds Apart? was prepared by Derek Gillard and uploaded on 15 March 2022.

Worlds Apart? (1996)
A Review of International Surveys of Educational Achievement involving England

London: Her Majesty's Stationery Office 1996
© Crown copyright material is reproduced with the permission of the Controller of HMSO and the Queen's Printer for Scotland.

[cover]

[title page]

OFFICE FOR STANDARDS IN EDUCATION

OFSTED REVIEWS OF RESEARCH

Worlds Apart?

A Review of International Surveys
of Educational Achievement
involving England

David Reynolds
UNIVERSITY OF NEWCASTLE UPON TYNE

Shaun Farrell
UNIVERSITY OF NEWCASTLE UPON TYNE
UNIVERSITY OF WALES, CARDIFF

LONDON: HMSO

[page ii]

ISBN 0 11 350085 8

Office for Standards in Education
Alexandra House
29 - 33 Kingsway
London WC2B 6SE

Telephone: 0171 421 6800
Facsimile: 0171 421 6707

[page iii]

CONTENTS

OUTLINE	1
1 INTRODUCTION: International Comparisons of Educational Achievement: The Political, Social, Economic and Educational Contexts	3
Economic Factors	3
Educational Factors	4
Conclusions	6
2 THE EVIDENCE REVIEWED: International Comparisons of Educational Achievement	8
The Problems of International Comparisons	8
Some Further Methodological and Practical Problems	11
Research Design	11
Sampling	12
An Absence of Needed Data	12
An Absence of Needed Analyses	13
2.1 Mathematics Studies	16
1964 - The IEA First International Mathematics Study (FIMS)	16
1982-83 - The IEA Second International Mathematics Study (SIMS)	18
Changes in Mathematics Achievement between 1964 (FIMS) and 1982-3 (SIMS)	24
1988 - The IAEP First International Assessment of Mathematics (IAEPM 1)	27
1990 - The IAEP Second International Assessment of Mathematics (IAEPM 2)	31
2.2 Science Studies	34
1970-72 - The IEA First International Science Study (FISS)	34
1983-85 - The IEA Second International Science Study (SISS)	38
1988 - The IAEP First International Assessment of Science (IAEPS 1)	44
1990 - The IAEP Second International Assessment of Science (IAEPS 2)	47
Smaller Scale Studies	49
Secondary Analyses	51
3 CONCLUSIONS: WORLDS APART?	52
The Case so Far	52
Why Worlds Apart?	53
Some Hypotheses	54
1 The Pacific Rim	54
2 European Societies	56
Growing Apart? Some Speculations	57
The Next Steps	58
Appendix A: Further Reading	60
Appendix B: Recommendations for Future Research	61
Appendix C: Bibliography	62

[page iv]

LIST OF TABLES IN THE TEXT

1 Factors Associated with Effectiveness in Different Countries	7
2 Proficiency Test Scores in Mathematics from the International Assessment of Educational Progress 2 (IAEPM 2) 1990	9
3 Vocational Qualifications of the Workforce in Various European Societies, 1988 - 91	10
4 The IEA and IAEP International Studies of Educational Achievement	14
5 Achievement Results from the IEA First International Mathematics Study 1964	17
6 Achievement Results from the IEA Second International Mathematics Study 1982-83	19
7 Retentivity in Mathematics Classrooms (Pre-university year) 1982-83	20
8 Variation in Opportunity to Learn Between Systems 1982-83 for students of 13 years	22
9 Variation in Opportunity to Learn, 13 year-old population, Within Systems (1982-83)	23
10 General Patterns of Change in Mathematics Achievement	25
11 Specific Changes in Mathematics Achievement by Curricula Areas (13 year old grade)	26
12 Achievement Results from the IAEP First International Assessment of Mathematics (IAEPM 1) 1988	28
13 Percent Correct by Curriculum Area and Opportunity to Learn (IAEPM 1) 1988	29
14 Percentage Correct Items for Various Countries (IAEPM 2) 1990	32
15 Achievement Results from the IEA First International Science Study 1970-72	35
16 Retentivity in Science Classrooms (Pre-university year) 1970-72	36
17 Achievement Results from the IEA Second International Science Study 1983-85	39
18 Achievement Results from the IEA Second International Science Study (pre-university Science specialists) 1983-85	40

[page v]

19 Average Percent Opportunity to Learn by Subject and Country (14 year olds) 1983-85	41
20 Total raw achievement means by country and adjusted means for opportunity to learn (14 year olds), 1983-85	42
21 Average Science Proficiency (IAEPS 1) 1988	44
22 Percent Correct by Curricular Area and Opportunity to Learn Ratings for Science (IAEPS 1) 1988	45
23 Overall Average Percentage Correct of All Participants (IAEPS 2) 1990	48

[page vi]

ACKNOWLEDGEMENTS

We are grateful to numerous people for their comments in responding to early drafts of this review of literature:

Bert Creemers, Dean Fink, John Gray, John Haslett, David Hopkins, Wendy Keys, Neil Macintosh, John McBeath, Margaret Maden, Peter Mortimore, Kate Myers, Jean Ruddock, Pam Sammons, Jaap Scheerens, George Smith, Sally Thomas, Peter Tymms.

We are also grateful to colleagues in The International School Effectiveness Research Project (ISERP) for permission to make use of some collectively gathered data in advance of publication of the full study.

Errors of fact and interpretation we, as all authors, claim exclusively for ourselves.

David Reynolds
Shaun Farrell
Newcastle, March 1996

[page 1]

OUTLINE

Our remit in this study was to review the internationally comparative studies of educational achievement that have involved England. We interpreted our task as requiring us to collect the following types of literature:

Large scale international achievement surveys of the International Association for the Evaluation of Educational Achievement (IEA) and the International Assessment of Educational Progress (IAEP) (the latter conducted by the Educational Testing Service of the United States);
Smaller scale, often bilateral, studies that compare England with other countries, often without achievement data, but with rich data on educational processes;
Comparative studies that include case study data about the interaction between education, society and culture;
Studies on the processes and effectiveness of the English system itself, from the literatures on educational policy and school effectiveness.

It will be clear from the above that we amassed a considerable volume of literature; we have, therefore, concentrated upon what appeared to us to be the most important studies. For the reader wanting to investigate further this educational minefield of debate, assertion and evidence, we list at the end some sources of further reading that will illuminate the complexities of the field. We will, in any case, make clear where we have been selective in our choice of studies, and why.

We have organised our material for this report into three parts.

The Introduction describes the increasing internationalisation of the field of educational research, and looks at the reasons for this. It outlines why the issue of how different countries perform educationally is now at 'centre stage' in many societies.

Section Two looks at the findings of the major data sources in the field and the international achievement surveys of the IEA and the IAEP; it also attempts an assessment of their limitations as valid or true measures of country differences, and looks briefly at other smaller scale studies.

The Conclusion uses a wide range of published and unpublished studies in an attempt to explain the pattern of findings shown in Section Two, a pattern which does not portray English levels of achievement in a particularly favourable light. It represents an attempt not to test hypotheses about English performance, but rather to generate them.

We have concentrated exclusively upon Science and Mathematics as the curriculum areas in which to undertake comparison between England and other countries, for a number of reasons:

Mathematics and Science are universally recognised as the key skills needed in a modern industrial society, and particularly in new 'information age' economies.

[page 2]

English participation in the important and large scale IEA surveys has only been consistent in the Mathematics and Science curriculum areas.
Mathematics and Science are the two areas of the curriculum where the effects of the educational system outweigh the effects of home background in determining achievement. This means that these subjects offer the greatest chance of establishing the 'real' educational effectiveness of the educational systems of different countries (Coleman, 1975; Fogelman, 1978).
The similarity of Mathematics and Science in all countries means that effectiveness in these subjects can be more validly compared across national boundaries than effectiveness in other subjects that are defined differently in different countries.
Mathematics and Science have been relatively stable curriculum areas over time, permitting comparisons of findings at different time points.

The only other curriculum area in addition to Mathematics and Science where England took part over the last two decades was a one-off study of Written Composition, in which technical problems and the task of assessment within a common frame of reference proved very difficult to overcome.

We have referred throughout to 'England' as the home sampling unit, but of course this is not always an accurate description of the origin of the sample of children. In some studies, 'England' means purely English samples are used - in others, Welsh children are also included in the survey sample. In two studies, the 'United Kingdom' is the sampling unit, although England naturally provides the highest proportion of children taking part. We make clear when we discuss each study, which countries within the United Kingdom we are concerned with; information about the make-up of the 'English' sample and about all the countries taking part in the studies will be found in Table Four, at the beginning of Section Two. It is also important to note that Scotland is included in many of the surveys in its own right: whilst we do not explicitly point up Scottish performance in every study, we return at the end of this review to the issue of variation in performance between England and Scotland.

[page 3]

1 INTRODUCTION

International Comparisons of Educational Achievement: The Political, Social, Economic and Educational Contexts

We live in a world that is becoming 'smaller' all the time. The spread of mass communications, and particularly of satellite broadcasting, makes ideas that were formerly found only in isolated cultural niches globally available. The enhanced interactions between citizens of different countries through visits, vacations, migration and electronic contact are clearly both breaking down cultural barriers and yet, at the same time, also leading to a reassertion of cultural distinctiveness.

The educational world is also becoming 'smaller' all the time. In the United Kingdom, references to the superior achievements of 'Pacific Rim' economies or the 'Tiger' economies now pepper the speeches of Government and Opposition spokespersons (see Tony Blair's Platform column in The Times Educational Supplement of 23rd June 1995), as does acknowledgement of the educational reasons for their success. Only two decades ago, there was little reference in discussion of educational policies within the United Kingdom to 'overseas' evidence, save for occasional acknowledgements of the apparent success of Scandinavian comprehensive schools from the 'liberal' or 'left' wings (Reynolds et al, 1987) and of the success of German training and education-for-work provision (Prais and Wagner, 1985). In the debate about the necessity of educational reform in the mid 1980s, in fact, comparisons were usually made with Britain's own past, rather than with other contemporary countries (see Hargreaves and Reynolds, 1989; Ball, 1990).

Now all this is changing. The recent report Learning to Succeed (National Commission on Education, 1993) makes frequent reference to non-British societies. Indeed, one could not find better evidence for the changes taking place than the attention given by every 'quality' daily newspaper and the front page of The Times Educational Supplement to a recent report comparing teaching methods in Switzerland and England (Bierhoff, 1996). Comparable research on the relative educational performance of England and Germany in the mid-1980s barely rated a mention in the press of the time, although it attracted attention from within the ranks of educators and politicians. What factors are responsible for the increased internationalisation of educational discussion and debate?

Economic Factors

Firstly, it is clear that economic pressures on traditional industrial societies have intensified in recent years with the emergence of what has been called the 'Asian economic miracle'.

[page 4]

Societies such as Korea, China, Japan and Taiwan have been regularly achieving growth rates of 6-7% per year, as against rates of only half that amount in the traditional economies of the OECD, including the United Kingdom. Thailand will, on present trends, exceed the United Kingdom in its Gross Domestic Product by the year 2010. Indeed, what has been called a 'recession' in the economies of North America and Europe since the late 1980s is mostly the result of a global redistribution of wealth towards the newly emerging economies of the East.

Explanations for the success of these economies have ranged widely and have included their strong family and community networks, their cohesive social structures, the pervasiveness of 'pro-social' attitudes and values encouraged by religious traditions and, frequently, their apparently very high levels of educational achievement.

Educational Factors

The second set of reasons for the internationalisation of educational debate are connected with changes in the educational research communities in North America and Europe. Specifically:

(i) The emergence of organisations such as the European Educational Research Association and the International Congress for School Effectiveness and Improvement, and of journals such as School Effectiveness and School Improvement. This has led researchers in such areas as school effectiveness, educational evaluation, teacher effectiveness and educational policy to become increasingly aware both of intellectual traditions, and of educational processes and outcomes, that are different to those of their 'home' country. An example of this is the emergence of Dutch empirical research in the field of school effectiveness (Creemers, 1994; Creemers and Scheerens, 1989), which was largely unknown in the United Kingdom and United States as late as 1988, and which now is central to the effectiveness movement. The same inclusion of societies seems to be happening currently with Pacific Rim societies (see for example the work of Cheng, 1993).

(ii) Educational researchers have realised that they need to understand more clearly why some school effectiveness variables 'travel' across countries, whilst others do not. 'Top down' leadership from Principals is perhaps one of the most well supported of all the American school effectiveness 'correlates' within the 'five factor' theory originally formulated by Edmonds (1979). This was subsequently developed into a 'seven factor' theory by Lezotte (1989); yet in spite of the massive empirical support for the importance of this factor in the United States (e.g. Levine and Lezotte, 1990), Dutch researchers (van de Grift, 1990) have been generally unable to validate its importance in the Dutch educational and social climate (apparently Dutch Principals are often promoted out of the existing school staff and have a much more facilitative, group-based and perhaps 'democratic' attitude to their colleagues).

What explains why a factor is important in one country but not in another? Educational researchers have become interested in finding out.

Interestingly, it is the secondary analysis of existing international studies of educational achievement that has been a further spur to this trend. Scheerens et al (1989), for example, used a re-analysis of data from an IEA study to generate interesting material upon the

[page 5]

'context specificity' of factors associated with educational achievement in different countries, as shown in Table One at the end of this section. Whilst home background factors are associated across most countries with Mathematics achievement, and whilst expectations and 'opportunity to learn' appear to be consistently important factors across most countries, some factors appear to be associated with achievement only within certain national contexts. In the case of Belgium, class size is positively linked with achievement in the French speaking part of the country, but negatively associated with achievement in the Flemish speaking part.

Findings such as these at international level, and recent American research showing that different school effectiveness features prevail in different socio-economic contexts (Teddlie and Stringfield, 1993; Hallinger and Murphy, 1986), have propelled educational researchers increasingly in the direction of 'context specific' theories, in which effectiveness is contingent upon national contexts.

The problem is, though, that just as the educational world is aware that effectiveness factors may not 'travel' across countries, the political world is increasingly inclined to transfer features from one context to another; some American states have adopted shorter school holidays and lengthened the time pupils spend in school, simply because 'time to learn' is higher in educationally successful countries like Japan (see Business Week, 14th September, 1992).

(iii) Educational researchers have also increasingly come to realise that only international research can tap the full range in school and classroom quality, and therefore in potential school and classroom effects. The range of school factors concerning 'quantity' (such as size, financial resources, quality of buildings), and 'quality' (such as setting standards of achievement) is likely to be much smaller within countries, than between countries.

The existing estimates of the size of educational influences (schools and classrooms together) on students have settled into a range of 8% to 14% of variance in virtually all empirical studies (see Reynolds and Cuttance, 1992). It is likely that these estimates are merely artefacts of the studies' lack of school and classroom variation. The true power of education, researchers have begun to think, may only be shown by internationally based research.

(iv) The fourth reason why educational researchers have been increasingly keen to internationalise their field is that they have become aware that international study is likely to generate more sensitive theories than those at present on offer, since both intellectually interesting and practical middle-range theories are connected with the ways in which school and classroom factors travel cross-culturally to a very varied degree. Why is it that the presence of 'an assertive Principal' is not a key factor in school effectiveness in the Netherlands - what is it in the local, regional or national ecology that might explain this finding? Answering this question inevitably involves the generation of more complex explanations that operate on a number of levels, and is likely to generate more complex theoretical explanations than those generated by the simple within country research findings.

Indeed, it is a review of what 'travels' and what 'does not travel', in terms of types of school and instructional factors, that has generated what is one of the most interesting new ideas within school effectiveness, since it is apparently the instructional variables - such as time on

[page 6]

task, opportunity to learn, structured teaching and high expectations - that 'travel' much more than do the standard school factors customarily used. The possibility that one needs different school processes in different countries to generate the same effective classroom processes is a potentially devastatingly important insight (see Reynolds et al, 1994).

(v) The fifth reason for the enhanced internationalisation of education, and educational research, is simply that people have become aware that other societies have educational features that may be of interest to their own society. For educationists in societies like the United Kingdom, where education is often criticised for its less than exceptional quality, the realisation that other societies may provide 'new' factors or processes that are not tarred with the 'tried but found wanting' label has been of considerable interest.

It is clear that educational researchers are following some degree behind the general internationalisation of educational discussions that has been a feature of the last decade (see for example the work of Lynn (1988) and Howson (1991)). Nevertheless, the adoption of an international frame of reference is now proceeding very rapidly indeed, as we have seen.

Conclusions

There is no doubt that educational debate has become increasingly internationalised over the last decade. The professional community of educational researchers has developed internationally inclusive organisations, and economic pressures have propelled debate about the educational achievements of different countries to centre stage. This phenomenon is likely to continue when the Third International Mathematics and Science Study (TIMSS) and the ISERP study are published later this year (see Reynolds et al, 1994 and 'Time for the West to go East', The Times Educational Supplement, September 29th, 1995, for details of the latter).

Such comparisons of different countries have potentially much use for researchers in England: they offer a chance to see factors in systems which do not exist in our own culture, and the possibility of developing an answer to the question of 'what factors travel, and why'.

In the next section, we move on to look at the comparisons that have been undertaken of educational achievement in England and in other societies. The directions in which our understanding has been taken by these and other studies form the contents of Conclusions.

[page 7]

Table One

Factors Associated with Effectiveness in Different Countries (from Scheerens et al, 1989)

Predictor Variables with significant Positive (p) or Negative (n) Associations with Mathematics Achievement

[page 8]

SECTION TWO

The Evidence Reviewed: International Comparisons of Educational Achievement

The Problems of International Comparisons

A number of different studies have been carried out over the last two decades that have attempted to describe and analyse the causes of international variation in educational achievement. Whether large scale (like those of the IEA or IAEP variety), or small scale (like the England/Germany comparisons of Prais and Wagner, 1985), and whether quantitative or qualitative, all investigations face the same basic problem: how does one measure the influence of the educational system and its responsibility for variation in educational achievements?

Table Two shows, as an example, the Mathematics performance of 13 year old pupils in eighteen different countries in 1992, together with the 'rank' obtained by the country and the percentage of answers correct on the Mathematics test used.

England appears to do relatively poorly in comparison to Pacific Rim societies such as China, Korea and Taiwan, a finding that will often be repeated as we progress through the various studies reviewed in this section. But how do we explain English low achievement? Is it that English pupils lack the strong respect for school learning that characterises pupils in Pacific Rim societies? Has the 'selective' educational history of the British system left its pupils with low expectations; or are there other factors at work?

[page 9]

Table Two

Proficiency Test Scores in Mathematics from the International Assessment of Educational Progress 2 (IAEPM 2) 1990 (from Foxman 1992)

[page 10]

A second major problem also presents itself. Table Three, as an example, shows the distribution of educational qualifications within the workforces of various European societies.

Table Three

Vocational Qualifications of the Workforce in Various European Societies, 1988-91 (from Prais 1994)

Again, the performance of Britain is poor by comparison with other European countries. But to what extent are we comparing 'like' with 'like'? 'Intermediate Vocational Qualifications' in Britain are defined as BTEC, HNC, HND and the like - do the German or French qualifications 'mean the same thing' as the British?

All studies in this field have, then, two basic problems - societies must be compared in their performance on the same skills, bodies of knowledge or tests, and attempts must be made to ensure that the educational causes of any differences are isolated from any other possible causes.

The majority of the studies referred to in this section have attempted to solve the first of these problems by using Mathematics and Science tests where the same definition of what is the 'right' answer applies across different cultures. 'Cohort' designs have also been used to control out non-educational influences. Cross-sectional studies of the kind that generated Table Two obtain measurements at a point in time, and then disentangle the various educational, social, economic and cultural influences that were responsible for these measurements over historic time. By contrast, cohort studies that look at children over time can study the relative gains children make, after allowing for the various different starting points in different countries. The 'true' effect of education in these countries can then be calculated and compared by 'controlling out' the background factors that are non-educational. The expense and complicated nature of cohort studies means, though, that they have rarely been used; given the problems involved in the longitudinal element of the Second International Mathematics Study (SIMS) (Garden, 1987), including limited participation, limited objectives and use of a very short time period, one can see why. The absence of 'cohort' studies leaves unsolved, however, the problem of the extent to which country differences reflect educational or non-educational influences.
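
The contrast between the two designs can be illustrated with a minimal sketch. The country names and all scores below are invented for illustration only and are not drawn from any of the surveys reviewed; a simple gain score is also only a crude stand-in for the statistical adjustments a real cohort study would make for background factors.

```python
# Hypothetical mean test scores for the same pupils at two time points.
# Country names and all figures are invented for illustration only.
cohorts = {
    "Country A": {"start": 40.0, "end": 60.0},
    "Country B": {"start": 55.0, "end": 65.0},
}

def cross_sectional_rank(data):
    """Rank countries by their score at the final time point only."""
    return sorted(data, key=lambda c: data[c]["end"], reverse=True)

def gain_rank(data):
    """Rank countries by the gain made between the two time points."""
    return sorted(
        data,
        key=lambda c: data[c]["end"] - data[c]["start"],
        reverse=True,
    )

print(cross_sectional_rank(cohorts))  # ['Country B', 'Country A']
print(gain_rank(cohorts))             # ['Country A', 'Country B']
```

In this invented case the country that looks weaker at a single point in time is the one whose pupils gained more, which is the kind of reversal a cross-sectional design cannot detect.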

[page 11]

Some Further Methodological and Practical Problems

Although the two problems noted above place limitations on the capacity of existing international surveys of achievement to definitively address the issue, 'Which countries are more effective educationally, and what is the educational contribution?', it is important to note that there are further limitations upon the IEA and IAEP achievement studies of the last twenty years (see for further discussion Goldstein, 1989, 1993; and Reynolds et al, 1994).

Of course, these surveys face the kinds of problems that are present in all cross-national research - of accurate translation of material, of ensuring reliability in the 'meaning' of questionnaire items (such as social class or status indicators for example), of inconsistencies caused by Southern Hemisphere countries having school years that begin in January, and of the need for retrospective information.

In certain curriculum areas, the cross-national validity of the tests gives cause for grave concern and the IEA study of Written Composition failed, for example, in its attempt to compare the performance of groups of students in different national systems that used different languages. This study concluded that the 'construct that we call Written Composition must be seen in a cultural context and not considered a general cognitive capacity or activity' (Purves, 1992, p199). Even the administration of an achievement test in the varying cultural contexts of different countries may pose problems, particularly in the case of England, where the test mode in which closed questions are asked in an examination style format under time pressure may not be frequently experienced. By contrast, the use of this test mode within a society such as Taiwan, where these assessment methods are frequently experienced, may inflate Taiwanese scores, just as it may depress English achievement below its 'real' level.

In addition to these basic problems that affect all large scale international comparative research, there are also specific problems concerning the IEA and IAEP studies we are about to look at:

Research Design

The basic design of the IEA studies, which are concerned to explain country against country variation, may itself have been responsible for problems. Generally, a small number of schools, each possessing a large number of students, are selected, making valid comparisons between schools difficult once factors such as school type, socio-economic status of students and catchment areas are taken into account. Statistics may also be unstable because of the small numbers.
Curriculum subjects are studied separately, making it difficult to provide an integrated picture of schools in different countries.
There is considerable difficulty in designing tests which sample the curricula in all countries acceptably.
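
The instability that arises when a few schools each contribute many students can be sketched with the standard design effect approximation for equal-sized clusters. The surveys themselves do not report this calculation; the intra-school correlation of 0.2 and the sample sizes below are assumptions chosen purely for illustration.

```python
def effective_sample_size(n_schools, pupils_per_school, icc):
    """Approximate effective sample size for equal-sized clusters.

    Uses the design effect 1 + (m - 1) * icc, where m is the number of
    pupils per school and icc is the (assumed) intra-school correlation.
    """
    n = n_schools * pupils_per_school
    design_effect = 1 + (pupils_per_school - 1) * icc
    return n / design_effect

# Two hypothetical designs, each testing 2,000 pupils in total.
few_large = effective_sample_size(n_schools=20, pupils_per_school=100, icc=0.2)
many_small = effective_sample_size(n_schools=100, pupils_per_school=20, icc=0.2)

print(round(few_large))   # roughly 96 independent observations
print(round(many_small))  # roughly 417 independent observations
```

Under these assumptions, the same total number of pupils yields far less statistical information when concentrated in a small number of schools.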

[page 12]

Sampling

Very large variations in the response rates make interpretation of scores difficult. In the IAEP, for example, school participation rates of those schools originally approached varied from 70% to 100% across countries, and student participation rates varied similarly from 73% to 98%. In the IEA Second Science Study, the student response rate varied from 99.05% (in Japan) to 61.97% (in England). Although all IEA and IAEP studies have been weighted to take account of differential response between strata, and although a comparison of responding and non-responding schools on public examinations in the above study showed little difference, the potential biases caused by variation in response rates are a matter of considerable concern.
Sometimes the samples of students used are not representative of the country as a whole (e.g. one of the IAEP studies used one area of Italy as a surrogate for the whole country).
Sometimes there are variations in the timing of school year tests, resulting in students of different mean ages in different countries.
Choice of certain 'grades' or ages for study may not generate similar sampling populations for different countries. In the IEA Second Science Study, the mean ages of the country samples for the 14 year old students actually ranged from 13.9 to 15.1 years. Given the known relationships between age, length of time in school and achievement, this variation may have been responsible for some of the country differences.
Policies in different countries concerning 'keeping children down' or 'putting children up a year' may generate difficulties of comparison.
Variations in the proportion of children taking part in the studies in each country make it difficult to assess their differences. Mislevy (1995) notes that 98% of American children were in the sampling frame for the IAEP2 study, while the restriction of the Israeli sample to Hebrew-speaking public schools meant that only 71% of its children were included.
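
The weighting for differential response between strata mentioned in the first point above can be sketched as follows. The strata, population shares, respondent counts and mean scores are all hypothetical, and the actual weighting procedures of the IEA and IAEP studies may well have differed in detail.

```python
# Hypothetical strata: share of the pupil population, number of
# respondents, and mean score among respondents. All figures invented.
strata = [
    {"name": "stratum 1", "pop_share": 0.5, "respondents": 90, "mean": 60.0},
    {"name": "stratum 2", "pop_share": 0.5, "respondents": 30, "mean": 40.0},
]

def unweighted_mean(rows):
    """Mean over respondents, ignoring how strata are represented."""
    total = sum(r["respondents"] for r in rows)
    return sum(r["respondents"] * r["mean"] for r in rows) / total

def stratum_weighted_mean(rows):
    """Mean with each stratum weighted by its population share."""
    return sum(r["pop_share"] * r["mean"] for r in rows)

print(unweighted_mean(strata))        # 55.0
print(stratum_weighted_mean(strata))  # 50.0
```

Weighting of this kind corrects for strata that respond at different rates, but it cannot correct for differences between responders and non-responders within a stratum, which is why low response rates remain a concern.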

An Absence of Needed Data

Many studies lack information on the non-school areas of children's lives (family and home environment) that might be associated with achievement scores. Surrogates for social class, such as looking at the 'number of books in the home', are inadequate.
Outcome data has been collected mostly on the academic outcomes of schooling, yet social outcomes may be just as interesting. It is clear from the Stevenson (1992) studies that the superiority of Japanese students over those in other societies may extend to areas such as children's perception of their control over their lives (locus of control). Yet these 'affective' outcomes have been neglected. The explanation for this is clear - the problems of cross-cultural validity and reliability. But it is not clear why routinely available non-cognitive data (such as that on student attendance for example) has not been used (TIMSS is using this for the first time).
The factors used to describe schools have often been overly resource-based (no doubt

[page 13]

Some of the most important educational effects are likely to lie in the areas of the 'moral messages' that humanities subjects, like History, Geography and Civics, possess and which are integrated with their curriculum content. These differences, which formed the basis of the fascinating analysis by Bronfenbrenner (1972) of variations between the United States and the former Soviet Union in the propensity of their students to be pro- or anti-social, would probably repay study; we have chosen not to include them in this review because of the clear difficulties of investigation.

An Absence of Needed Analyses

Only limited attempts have been made to analyse student groups differentially, by achievement, say, or by social class, with the exception of a limited amount of analysis by gender.
Only limited attempts have been made to disaggregate overall scores for geographical units into their component parts, like individual states in the USA or countries in the UK, in spite of the hints that 'within geographical unit' variability may be large (Bracey, 1996).

It will be clear, from all these points, that the large scale surveys of achievement of the IAEP and the IEA need to be viewed with some caution. Further information on the methodological adequacy of the studies can be found in the reading list in Appendix A.

It is with such caveats borne clearly in mind that we approach our review of the studies, the whole body of which is laid out in Table Four.

We begin with the subject area of Mathematics [we have eliminated from our discussion the IEA Pilot Study of Foshay (1962), that involved Mathematics (and Science) because of the small sample size (9,000 pupils in 12 countries)].

[page 14]

Table Four

The IEA and IAEP International Studies of Educational Achievement

[page 15]

Table Four continued

[page 16]

Table Four continued

2.1 Mathematics Studies

1964 - The IEA First International Mathematics Study (FIMS)

The IEA First International Study of Mathematics Achievement involved twelve countries. The target populations for the study were as follows:

1a. All pupils who were aged 13 at the date of testing.
1b. All pupils at the grade level where the majority of pupils aged 13 were found.
3. All pupils in full-time study in schools (not colleges) who were in the grade, or year, from which students are recruited for higher education. In the case of England, this was the final A level year (FY). Because education at this level varies considerably between countries, this population was divided into specialists (3a) and non-specialists (3b).

Sampling procedures in this study were quite rigorous, and the populations quite large (the sample for England alone was 12,740 pupils in 684 schools). In England, post compulsory schooling was (and remains) characterised by selection and subject specialism, and this fact should be borne in mind when comparing achievement scores for England with countries which offered either universal education to this stage (such as the United States), or countries which had a multi-disciplinary curriculum at this level (the baccalauréat system of several European countries). An additional concern is the fact that the tests had more items from certain Mathematics topics (e.g. arithmetic), which may have disadvantaged certain countries like England.

[page 17]

It is also important to note that students in England and Scotland began school at least one full year earlier than pupils in most other countries in the 1964 study, and in all the studies in this section.

Table Five

Achievement Results from the IEA First International Mathematics Study 1964

Overall, the study showed that:

In terms of Mathematics achievement in 1964, English 13 year olds were about average within the context of the participating countries.
The variation in Mathematics achievement was a key feature of the findings in England (see the standard deviations in Table Five).
Pre-university pupils in England achieved considerably more than in other countries, but differences in retention rate in the post-compulsory system in England were likely to have exaggerated differences in favour of England.
'Opportunity to learn' (or the extent to which students had covered the test items) was a significant factor in achievement in England, but not in all other countries.

[page 18]

Family characteristics, such as socio-economic status and parents' education, were influential factors in England and several other countries for 13 year olds.
Students' attitudes and aspirations were influential in all countries.
England had 2-3 times as many low achievers as several other countries including France, Germany, The Netherlands, and Japan, in the 13 year old sample.

1982-83 - The IEA Second International Mathematics Study (SIMS)

The Second International Study of Mathematics Achievement involved twenty countries. The target populations for the study were as follows:

Population A was defined as: All students in the grade in which the modal number of students had attained the age of 13.0 - 13.11 years by the middle of the school year (the comparable group in FIMS is Population 1b).

Population B was defined as: All students who were in the normally accepted terminal grade of the secondary education system, and who were studying Mathematics as a substantial part (approximately 5 hours per week) of their academic programme. (Population B in England was taken from sixth forms only).

In England the intervening period between FIMS (1964) and SIMS (1982-83) had seen the transformation from a selective to a predominantly comprehensive system. Retention rates at the pre-university level had also risen slightly, but continued to be among the lowest in comparative terms, with the disparity increasing as several countries' retention rates had risen more during the same period.

For the 13 year old population, Japan again had the highest overall achievement. England was once more about average for the study and continued to lag behind the Netherlands, Belgium and others, while France had caught up and overtaken England.

Examination of Table Six suggests that achievement in England was again very close to the overall study mean. However, this second study included several low scoring developing countries, which will have lowered the overall mean; remaining close to that mean therefore suggests that achievement levels in England had not remained stable over time, but had declined in relative terms. Evidence supporting this interpretation is found by comparing England to France, which had actually moved ahead by some distance. Scotland, which trailed England in the earlier study, had also moved slightly ahead.

For students studying Mathematics at the pre-university level, Table Six demonstrates that England continued to occupy a high position. However, comparing the pre-university Mathematics specialists in FIMS and SIMS, it is notable that Japan had overtaken England and was joined by Hong Kong (who did not take part in the first study) as the highest achieving countries.

[page 19]

Table Six

Achievement Results from the IEA Second International Mathematics Study 1982-83

Curricula characteristics of tests

[page 20]

If retention rates for Mathematics at the pre-university level are taken account of, Table Seven shows that the achievement means for England were produced by a much smaller 'elite' population than in any of the other participating countries, with the exception of Hong Kong and Israel. As with FIMS, it would appear that the achievement of England may have reflected the selective nature of the English system.

Table Seven

Retentivity in Mathematics Classrooms (Pre-university year) 1982-83

Other findings in SIMS reflect upon many of the current debates and concerns about education in England today. For example, it was found that class size at the 13 year old level across countries did not consistently relate to achievement (Japan being among the highest in terms of mean class size). SIMS also found that a high retention rate did not necessarily produce a detrimental effect on achievement. In fact, the highest performing students in all systems gained very similar standards of achievement regardless of varying retention rates.

[page 21]

In terms of the Mathematics curriculum, the major differences between systems occurred at the pre-university level, although the apparent consistency at 13 year old level in curriculum coverage left many unanswered questions when juxtaposed with achievement differences between countries. For example, the English curriculum was "very similar" to that in Japan and New Zealand, but pupil achievements between these countries were very different.

Finally, the issue of 'opportunity to learn' (or the extent to which pupils had covered the test items), which was raised in FIMS, received much more attention in the second study. Specifically, SIMS sought to identify patterns of variation between countries in terms of both 'intended curriculum' (as defined nationally), and 'implemented curriculum' (as reported by the Mathematics teachers). Furthermore, and of particular significance in relation to the large variance in achievement within the English system, SIMS also looked at the extent of variation within countries. The findings are reproduced below to demonstrate the strength of the relationship between achievement and 'opportunity to learn', and demonstrate how the characteristic variation in achievement within England reflected a worrying variance in students' exposure to Mathematics curricula.

Table Eight shows the data concerned with 'opportunity to learn' the Mathematics curriculum for students in the 13 year grade. It suggests that there is little difference between systems in terms of their intended curricula at this level (all had reasonably high matches with the curriculum as defined by the SIMS test items). However, when consideration is given to what knowledge students actually had access to, it is evident that there were wide discrepancies between countries, and, importantly, that countries also vary in terms of the fit between 'intention' and 'implementation'.

In England, it can be seen that the 'intended curriculum' was a good match for the items covered by the SIMS tests. However, when consideration is given to the 'implemented curriculum', practice did not follow intention. Across each domain covered by SIMS, the 'implemented curriculum' fell short of what was intended.

Table Nine presents the variation in curriculum implementation within systems reported by the SIMS. Essentially, the darker the shading, the more within system variation which occurred in terms of pupils' access to the Mathematics curriculum.

[page 22]

Table Eight

Variation in Opportunity to Learn Systems 1982-83 for students of 13 years

[page 23]

Table Nine

Variation in Opportunity to Learn, 13 year old population, Within Systems 1982-83

[page 24]

Table Nine clearly demonstrates that, in terms of the delivery of the Mathematics curriculum at the 13 year grade level, England had one of the highest levels of within system variation (Japan, France and the Netherlands had among the lowest). Perhaps the most worrying feature was that England appeared to have one of the highest levels of within system variance in opportunity to learn for arithmetic, the foundation of Mathematics learning. It is therefore unsurprising that this pattern was mirrored by large variations between pupils within England. This finding, linked to the comparatively large numbers of low achievers, explains why the achievement of our 13 year olds in Mathematics is only average in an international context.

Overall, the study showed that:

Average 13 year olds in England were nearly two years behind average 13 year olds in Japan, and about a year behind those in the Netherlands, France and Belgium (in spite of an extra year of formal education).
England continued to be characterised by very large variation in achievement between high and low achievers when compared with other countries.
England's 'intended' curriculum in 1982-3 had a good fit with the items used in the study, although the 'implemented' curriculum in England fell far short of the 'intended' curriculum.
Close analysis of 'implemented' curriculum in relation to opportunity to learn suggested that within England there was considerable variation in access to essential mathematical knowledge.
Retention rates in England continued to be among the lowest in the study, and the pre-university population continued to have comparatively high achievements.
Family background characteristics continued to be an influential determinant of educational achievement in England (there remained little or no relationship in many other countries).

Changes in Mathematics Achievement between 1964 (FIMS) and 1982-3 (SIMS)

Assessing the extent of changes over time in achievement is clearly difficult, since different samples of children were used at different time points. The most accurate means of assessing changes between the two studies is to examine achievement on the 'bridge' test items - that is, those items which featured in the tests at both points. Table Ten shows the mean achievements for all countries who participated in both studies, for both the 13 year old grade and the pre-university Mathematics specialist populations.
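The 'bridge' item comparison described above can be illustrated with a minimal sketch. It assumes that each study is summarised as a percent-correct value per test item; the item labels and values are hypothetical and are not taken from Table Ten, and the function name bridge_change is introduced here purely for illustration.

```python
from statistics import mean


def bridge_change(first, second):
    """Compare mean percent correct on the items common to both studies.

    `first` and `second` map item labels to percent-correct values.
    Returns (mean in first study, mean in second study, change).
    """
    common = sorted(set(first) & set(second))  # the 'bridge' items
    if not common:
        raise ValueError("the two studies share no items")
    m1 = mean(first[item] for item in common)
    m2 = mean(second[item] for item in common)
    return m1, m2, m2 - m1


# Hypothetical percent-correct values (illustrative only, not study data).
study_1964 = {"A1": 60.0, "A2": 40.0, "G1": 50.0, "X9": 70.0}
study_1982 = {"A1": 50.0, "A2": 40.0, "G1": 45.0, "Y3": 80.0}

m1, m2, change = bridge_change(study_1964, study_1982)
print(f"{m1:.1f} -> {m2:.1f} (change {change:+.1f})")
```

Items that appear in only one study (X9 and Y3 in this sketch) are ignored, so the comparison rests only on questions put to both cohorts.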

[page 25]

Table Ten

General Patterns of Change in Mathematics Achievement (Based on 'bridge' Items Common To Both Studies)

The results demonstrate that in England, the mean achievement for all the 'bridge' items combined had fallen dramatically for the 13 year olds. It is also noticeable that the same can be said for several other countries at this age (exceptions are France and the Netherlands). There was a similar pattern for the pre-university population. Although here the achievement of students in England continued to be comparatively high, it is worth observing that Japan had overtaken England by a considerable margin, and that Finland and Sweden had closed the gap on England. The extent to which this may be due to changes in the English Mathematics curriculum between 1964 and 1982-3 is unclear.

Table Eleven demonstrates that only in England did achievement fall in all four main curricular areas. Although all countries' mean scores were reduced over time in arithmetic, England had among the highest reductions, which is surprising, since this was from an already low base. For algebra, England was the only country that had declining achievement.

[page 26]

Table Eleven

Specific Changes in Mathematics Achievement by Curricula Areas (13 yr grade) (Based on 'Bridge' Items Common to Both Studies)

We present here only achievement percentages of the 'bridge' items (test items common to both studies) for those countries involved in both studies.

Note: the number of bridge items for each subset is as follows: Arithmetic = 14 items, Algebra = 9 items, Geometry and Measurement = 7 items, Descriptive Statistics = 5 items.

[page 27]

Overall, the changes over time were as follows:

While countries such as France and The Netherlands had improved their relative achievement for 13 year olds, England had either stood still or declined in most Mathematics curricular areas.
Some (small) reduction in the range of student achievement was found, although variation continued to be high (about 50% above France and Japan).
Based on the 'bridge' items common to both FIMS and SIMS for 13 year olds, England was the only country to show declining achievement in all curricular areas (arithmetic, algebra, geometry and statistics).
In terms of the 'bridge' items for 13 year olds, students in England had significantly lower achievements for 29 out of the 37 items (the remaining items showed no significant change).

1988 - The IAEP First International Assessment of Mathematics (IAEPM 1)

In addition to the IEA studies discussed above, there have been two studies of Mathematics undertaken by the Educational Testing Service (ETS) of the United States, entitled the International Assessment of Educational Progress (IAEP). These studies have employed a similar methodology, but have been restricted in the scope of their analyses when compared with the IEA studies.

In the first IAEP study, participation was limited to six countries, with Canada divided into seven systems, giving a total of twelve education systems. The results are presented in terms of a Mathematics proficiency scale, with a range from 0 to 1000 and a mean of 500. As with the IEA studies, the target populations were defined as representative samples of 13 year olds from each system (the United Kingdom sample was taken from England, Wales, and Scotland, but not Northern Ireland). The Mathematics test consisted of 63 questions derived from the ETS item pool, which had been widely used in the United States.
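To show what a reporting scale with a mean of 500 and a range of 0 to 1000 implies, the following is a minimal sketch of a simple linear rescaling. It is not the procedure used by ETS, which is not described in this review; the standard deviation of 100, the raw scores and the function name to_proficiency_scale are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def to_proficiency_scale(raw_scores, target_mean=500.0, target_sd=100.0):
    """Linearly rescale raw scores to a 0-1000 scale centred on target_mean.

    target_sd is an assumed spread, not a figure taken from the review.
    """
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    if sigma == 0:
        raise ValueError("all raw scores are identical; cannot rescale")
    scaled = [target_mean + target_sd * (x - mu) / sigma for x in raw_scores]
    # Clip to the published range of the scale.
    return [min(1000.0, max(0.0, s)) for s in scaled]


# Hypothetical raw scores out of 63 questions (illustrative only).
raw = [20, 30, 40, 50, 60]
scaled = to_proficiency_scale(raw)
print([round(s) for s in scaled])
```

On such a scale a system's position is read relative to the mean of 500, which is how the results in Table Twelve are presented.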

[page 28]

Table Twelve

Achievement Results from the IAEP First International Assessment of Mathematics (IAEPM 1) 1988

The findings here are consistent with those of the IEA studies. The United Kingdom achievement scores were again below the average for all countries in the study, although ahead of the United States. Canada (British Columbia) continued to have high achievement levels (as was shown in the SIMS) with Canada (Ontario) having achievement scores very similar to the United Kingdom. In Europe, the number of participating countries is limited, with Ireland and Spain showing similar levels of achievement to the United Kingdom. Another consistent feature is the superiority of Korea, and the size of margin it achieves over all other countries. Further discussion of the United Kingdom findings can be found in Keys and Foxman (1989).

In order to examine the consistency of achievement patterns in relation to specific Mathematics curriculum areas and opportunity to learn, Table Thirteen presents the achievement breakdown by the six curriculum areas represented in the IAEP test items. It also shows the 'implemented' curriculum in terms of 'opportunity to learn' defined by the Mathematics teachers involved in the study.

[page 29]

Table Thirteen

Percent Correct by Curriculum Area and Opportunity to Learn, (IAEPM1) 1988

[page 30]

Table Thirteen continued

Table Thirteen is again consistent with the findings of the IEA studies. Students in the UK continue to demonstrate comparatively high levels of achievement in areas such as geometry, and logic and problem solving, yet in comparative terms they continue to achieve poorly in the basics of 'numbers and operations'. It is noteworthy that 'numbers and operations' items represent a third of the entire test battery of IAEP, which would tend to work against the UK. With regard to 'opportunity to learn', Table Thirteen suggests a clear relationship with achievement in the UK for 'numbers and operations' and 'logic and problem solving', with the other curricular areas giving a mixed picture. A similar relationship is evident in other systems such as Quebec (French) and Korea which are two of the highest achieving systems. Obvious exceptions to this relationship are British Columbia, where low opportunity to learn ratings are associated with consistently high achievements, and Spain, where the reverse pattern is evident.

[page 31]

Overall, the study showed that:

As with the IEA studies, the UK achievement scores were close to the mean of the total sample.
The Pacific Rim system's mean achievement was the highest by a considerable margin.
The UK students' achievements were not consistent between Mathematics curriculum areas.
There was a relationship between 'opportunity to learn' and achievement in several curriculum areas in the UK, although the relationship was not consistent in other countries.
Students in the UK reported spending less time on Mathematics homework than students in most other systems, although those reporting more homework in the UK demonstrated higher achievement scores.
The amount of time pupils reported watching television was negatively associated with achievement in all systems.

1990 - The IAEP Second International Assessment of Mathematics (IAEPM 2)

This study (Foxman, 1992) built on the first, but is different in that twenty countries took part, and that 9 year olds were added to the 13 year old group that formed the target of the first study. Additionally, a wide range of data was gathered on educational factors (class size, amount of instruction, availability of resources etc.) and background factors (use of television, reading material in the home, views of Mathematics etc.) that might have influenced results. In this study, Scotland is analysed separately, whereas in IAEPM 1 it formed part of the English sample.

The skill areas covered at ages 9 and 13 were as follows:

numbers and operations
measurement
geometry
data analysis, probability and statistics
algebra and functions

[page 32]

Table Fourteen

Percentage Correct Items for Various Countries (IAEPM 2) 1990

We do not report this study in full, since many of the detailed findings of the study parallel those reported already for the other Mathematics studies in this review. However, it is worth noting that the response rate for England (below 70% at both ages), which was the lowest of any participating country, limits the confidence that can be placed in the findings.

[page 33]

Overall, the study showed (see Foxman 1992):

English performance is particularly poor amongst those with lower scores; the relative performance tends to improve as we 'move up' from the 5th to 95th percentiles. As an example, the scores of England's students at the 5th percentile were four points below those of France. But at the 95th percentile, England's students had a one point advantage.
English performance at the top of the achievement range did not display any substantial advantage over that of other countries.
English performance tended to be stronger in Data Analysis, Probability and Statistics, and Geometry, and weaker in Algebra and Functions, and Numbers and Operations.
English pupils reported the least amount of homework time in Mathematics of any country in the study, with the exception of Scotland. Only 6% of 13 year olds reported spending 4 hours or more per week, by comparison with 17% in France, 15% in the United States and 37% in China.
Home factors, with the 'number of books in students' homes' used as a surrogate for social class, were correlated with achievement in virtually all countries. Television watching was negatively correlated with achievement in half the countries, including England.
England was low on the actual exposure of children to Mathematics, standing 15th out of 20 countries with only 190 minutes of Mathematics instruction per week.
England reported the highest proportion of schools with a textbook shortage (at age 13), one of the lowest proportions of schools not permitting calculators and some of the highest proportions of classes grouping by ability both within classes and particularly between classes, with a full 92% of schools reporting the latter at age 13. At age 9, England ranked second in the proportion grouping by ability between Mathematics classes. This no doubt reflects the popularity of 'differentiation' within classes, and of 'setting' across classes.
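The first finding in the list above compares two systems at the 5th and 95th percentiles of their score distributions. A minimal sketch of that kind of comparison follows; the two score lists are hypothetical and the function name percentile_gaps is introduced here for illustration, so the output says nothing about the actual IAEPM 2 results.

```python
from statistics import quantiles


def percentile_gaps(scores_a, scores_b, percentiles=(5, 50, 95)):
    """Return {percentile: score in A minus score in B} for each percentile."""
    # quantiles(..., n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts_a = quantiles(scores_a, n=100)
    cuts_b = quantiles(scores_b, n=100)
    return {p: cuts_a[p - 1] - cuts_b[p - 1] for p in percentiles}


# Hypothetical score distributions: system A is more widely spread than system B.
system_a = [10 + 0.8 * k for k in range(101)]
system_b = [20 + 0.6 * k for k in range(101)]

gaps = percentile_gaps(system_a, system_b)
for p, gap in gaps.items():
    print(f"{p}th percentile: A minus B = {gap:+.1f}")
```

A negative gap at the 5th percentile alongside a positive gap at the 95th is the pattern described for England relative to France: a wider spread of achievement, with the weakest students furthest behind.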

[page 34]

2.2 Science Studies

1970-72 - The IEA First International Science Study (FISS)

The first large scale international study of Science achievement was part of a wider six subject survey. The Science part of the study involved nineteen countries and the target populations were defined as follows:

I. All students aged 10.0 - 10.11 years at the time of testing. This age group was chosen because in most systems students were still taught predominantly by non specialist primary teachers.
II. All students aged 14.0 - 14.11 years at the time of testing. This was the last age in most systems in the IEA where 100% of the age group were still in compulsory education.
IV. All students who were in the pre-university year of full-time secondary education.
IVS. All students in population IV who were regarded within their own systems as specialising in Science.

The sample for England consisted of 9,227 students, selected to be representative of the population as a whole. As with all the IEA studies, the response rates for England were comparatively low.

It should also be remembered that the education system in England was at this time going through the transformation from a selective to a comprehensive structure and that the Science curriculum was being influenced by changes in both curriculum emphasis and in delivery methods. The timing of this study was therefore not ideal for England.

Table Fifteen shows that students in England at age 10 achieved below the mean for the study. There was also evidence of the same large variation which was a consistent feature of the Mathematics studies. The pattern of achievement at age 14 was once again markedly consistent, the mean achievement score being below the mean for the study and the variance in achievement within England being substantially larger than for most other countries. As with the Mathematics studies, it was noticeable that Japanese performance was consistently the best, although the performance of other countries did not appear to be consistent across curricular areas.

The achievement of students in the pre-university year was difficult to assess in comparative terms, since the results concerned the entire pre-university population and there was therefore some bias toward those countries which offered an undifferentiated curriculum at this level. This would have inevitably disadvantaged England, as well as other countries, whose pre-university education provisions were characterised by specialisation; although there remained the issue of retention rates which clearly favoured England. Table Fifteen shows that for the pre-university population in England, achievement was relatively better than that of the compulsory school age populations.

[page 35]

Table Fifteen

Achievement Results from the IEA First International Science Study 1970-72

[page 36]

In Table Sixteen it is apparent from the retention rates that the system remained more selective in England than in the majority of other participating countries. In addition to the generally low retention rates in England at the time of the first Science study, it was also evident that a lower percentage of those students who did remain in education at the pre-university level were likely to continue to study Science, when compared with most other participating countries.

Table Sixteen

Retentivity in Science Classrooms (Pre-university year) 1970-72

Returning to the 10 year old population, it is notable that the highest between-school variance occurred in those countries where the streaming of pupils was the norm, which include England. At this age, home background variables were highly associated with achievement variation between schools, and also with variation between pupils. There was a large negative relationship between the number of hours spent watching television and achievement in England and the US, a relationship which was found in almost all countries to some degree. In terms of Science teaching, the study found that an 'unstructured' approach was likely to be less successful than a 'structured' approach. Variation in achievement was related to opportunity to learn in several countries, including England. As with the first IEA Mathematics study, the influence of attitudes and motivation was unclear, with Japanese students again demonstrating the least favourable attitudes to Science, yet gaining the highest achievements!

[page 37]

At the 14 year old level, home background variables continued to be the most important factor explaining between-school variance in England, Scotland, and the United States, with father's occupation being the most prominent factor. In terms of both between-school and between-student achievement variance, the 'type of school' attended was found to be very influential in England. 'Opportunity to learn' was again a factor related to the variation in achievement between schools in those countries characterised by differentiated school types such as England. Contrary to the findings at age 10, the 14 year old population was found to benefit from a more 'flexible' approach to learning in most countries.

For the pre-university sector, home background variables had little or no influence in most of the countries (as with the Mathematics studies this may be attributed to the homogeneity of the populations), and it is not surprising that, for England and other selective systems at the pre-university level, learning conditions and 'opportunity to learn' were among the strongest factors explaining variance in achievement between schools, and between students in schools.

Overall, the study showed that:

Student motivation to succeed at the 10 year old level presented an unclear picture, with Hungarian and Finnish students expressing the strongest motivation and Japanese students having the least motivation, but the highest mean achievement.
For the pre-university sector, English students scored above average on the practical test items, which probably reflects their greater use of laboratory practicals.
The relationship between 'opportunity to learn' and achievement was not consistent between grade levels. For the 14 year olds, teachers reported low 'opportunity to learn' ratings for the applied test items, which contrasted with the above average achievement results. At the pre-university level, teachers reported high 'opportunity to learn' ratings for the applied test items, yet scores were below the average for the test.
In England, at compulsory school age, home background factors were responsible for some of the variation in achievement between schools and between students within schools.
The number of hours spent watching television was negatively associated with achievement in nearly all participating countries, including England.
Only in England was the type of school attended found to be an influential factor in explaining variation between students at 14, (which may reflect the transitions taking place between tri-partite and comprehensive structures of provision at the time of the study).
As with the Mathematics studies, the mean achievement of students in England could be described as, at best, average, with the exception of the high achievement of the pre-university population.
The recurring feature of large within-system variation at student and school levels identified in the Mathematics studies was found to be characteristic of the Science system in England as well.

[page 38]

1983-85 - The IEA Second International Science Study (SISS)

The second large scale international study of Science achievement was undertaken in the mid 1980s, and involved twenty-three countries. The target populations were defined as follows:

1. All students aged 10.0 - 10.11 years at the time of testing. (Or all students in the grade where most ten year olds were to be found on the specified date of testing).
2. All students aged 14.0 - 14.11 years at the time of testing. (Or all students in the grade where most 14 year olds were to be found on the specified date of testing).
3. All students who were in the pre-university year of full-time secondary education. This population was also sub-divided into specialists (3 Biology, 3 Chemistry and 3 Physics).

As was the case for Mathematics, the intervening period between FISS (1970/72) and SISS (1983/85) had seen England transform its educational provision towards a comprehensive system. There had also been considerable changes in curriculum emphases and modes of delivery since the early 1970s. The second Science study was designed to focus on representative samples in each of the populations, in each of the participating countries. It is worth stating that, as with most of the IEA studies reviewed, the response rates for England were extremely low in comparison with those achieved in other countries. In the SISS for example, English response rates were only about 70% for schools and students in population 1, about 60% for schools and nearer 50% for students in population 2, and less than 55% for schools in population 3.

Table Seventeen shows that the mean achievement of English 10 year olds was considerably below the mean for the study and that the achievement variation within England remained among the highest in the study, in spite of the move to a comprehensive system. A similar pattern was found at the 14 year old level.

In the case of the 14 year olds, the variation in achievement within England was consistently large, but was less than in several other countries, including Japan which was one of the highest achieving countries. It is interesting to note that Japan continued to feature among the highest achieving countries at all population levels, and that Hungary, which had comparatively high achievements in Mathematics for 13 year olds in the SIMS study, had the highest mean achievement in Science in the SISS for 14 year olds.

For pupils in the final year at school, achievement in England was among the highest, while the extent of variation of pupil achievement was comparatively low, although greater than in the two highest achieving countries, Hungary and Sweden.

[page 39]

Table Seventeen

Achievement Results from the IEA Second International Science Study 1983-85 (Excluding Four Developing Countries)

[page 40]

Table Eighteen

Achievement Results from the IEA Second International Science Study (pre-university Science specialists) 1983-85

Table Eighteen demonstrates that the Science specialists in England achieved consistently well across each of the Science curricular areas identified. This finding is consistent with the previous Mathematics studies and suggests that the influence of a comparatively low retention rate and the continued subject specialisation at the pre-university level continued to favour England. The amount of variation demonstrated within the pre-university Science specialists was consistently below the mean for the study.

Several findings from the SISS are worthy of note regarding England. At the 10 year old level, there were very large between school differences in England, as well as in a few other countries, which contrasts with very low between school differences in achievement in Japan, Finland and most countries. The authors of the report note that "this large variation of achievement is surprising in some countries which pride themselves on equality of 'educational opportunity' and is clearly a point to which those responsible for the national planning of education could give attention" (Postlethwaite and Wiley, 1992, p 78). The lack of a national curriculum at the time of the SISS should be noted.

[page 41]

SISS also found a strong relationship between levels of achievement at 10 year old and 14 year old levels, which would indicate the importance of a good grounding in Science. The fact that English students achieve comparatively poorly in all branches of Science at the 14 year old level would seem to support this interpretation, although the sample tested in England at age 14 was younger than the samples tested from other countries.

Some investigation was also carried out into the extent to which there was variation between countries in their 'opportunity to learn' across the four subjects that comprised Science for Population Two (the 14 year olds). Table Nineteen shows the variation between countries on 'opportunity to learn,' with England towards the lower end of the range of countries in the curriculum exposure of its pupils to Biology, Chemistry, Earth Science and Physics.

Table Nineteen

Average Percent Opportunity to Learn by Subject and Country (14 year olds) 1983-85

[page 42]

Although there was a moderate relationship between 'opportunity to learn' and achievement at between country level, stronger than the relationship with achievement within countries, adjusting country means to take account of this does little to change the overall ranking of countries, as Table Twenty shows.

Table Twenty

Total raw achievement means by country, and adjusted means for opportunity to learn (14 year olds) 1983-85

[page 43]

Overall, the study showed that:

10 year olds in England, Poland, Hong Kong and Singapore had the lowest levels of Science achievement. Those in Japan, Finland and Korea had the highest.
English 14 year olds were among the lowest achieving groups, while those in Hungary and Japan were the highest.
At the pre-university level, England was among the highest achieving countries.
The bottom 25% of 14 year olds in England, and several other countries, performed extremely poorly.
Differences of achievement between schools, within countries, were among the widest in England.
At the 10 year old level, 60% of schools in England scored lower than the lowest scoring school in Japan.
Home background factors were significantly related to Science achievement in many countries, including England.
A number of educational factors (the use of experiments, homework and attendance at schools with more Science facilities) had positive effects at the individual student level.

[page 44]

1988 - The IAEP First International Assessment of Science (IAEPS 1)

This study was linked to the IAEPM 1 study reported earlier, and took place within 12 different countries or regions (7 of the 12 settings were systems within Canada). The home sample used was a United Kingdom one, and the study followed the customary pattern of an achievement test (in this case administered to 13 year olds) and a set of further data on pupil background, pupil attitudes and the 'opportunity to learn' of different pupils rated by teachers. It is important to note that the United Kingdom response rate was 70%, compared to 89% or higher for other locations.

Table Twenty One shows the results, with the United Kingdom sample doing better in this study than in the other studies reported earlier. Countries were divided into three groups with Korea and British Columbia in the 'top' group, the United States in the 'bottom' group, and the United Kingdom in the 'middle' group.

Table Twenty One

Average Science Proficiency (IAEPS 1) 1988

[page 45]

Table Twenty Two

Percent correct by Curricular Area and Opportunity to Learn Ratings for Science IAEPS 1 1988

[page 46]

Table Twenty Two continued

Percent Correct by Curricular Area and Opportunity to Learn Ratings for Science, IAEPS 1 1988

Table Twenty Two shows the United Kingdom performance on the five sub-areas of scientific knowledge; it is clear that performance in Physics, and Earth and Space Sciences, was better than that in Life Sciences and Chemistry. This table also shows the 'opportunity to learn' scores and rankings. As has been the case in a number of the studies reported in this review, there does not appear to be a close relationship between exposure to a topic and achievement level across the countries concerned. As an example, the good United Kingdom performance on Earth and Space Sciences is surprising given the very low exposure to these topics in schools. (The report notes the possibility that the information required to answer questions on these topics may have been obtained from other subject disciplines, or from non-school sources like the media).

[page 47]

Overall, the study showed that:

United Kingdom pupils reported spending more time on experiments and practical activities involving exploration and investigation than those in any other system.
United Kingdom students reported only moderate levels of homework for Science, by comparison with other countries (only 5.6% reported spending 3 or more hours per week, a figure only surpassing Ontario (English population) with its score of 4%).
United Kingdom students also reported very high levels of television viewing by comparison with other systems; 27% viewed five or more hours per day, a figure only exceeded by the United States on 31%.
Within all countries, there was a negative relationship between television viewing and achievement, but the relationship between hours of homework and achievement was less clear.
United Kingdom performance was above average.
As an explanatory variable of differences between countries in achievement levels, 'opportunity to learn' presents a confusing picture.

1990 - The IAEP Second International Assessment of Science (IAEPS 2)

The slightly more favourable picture of English achievement in IAEPS 1 is also evident in the second Science Study (IAEPS 2). Table Twenty Three shows the results for students aged 9 and 13, with the English performance being slightly above the mean for all countries. Pacific Rim countries again do well at both ages, and Switzerland, together with Hungary, does particularly well in the European group of countries for the older age group.

At age 9, England does worse on Life Sciences relative to the average for all countries, and best on Earth and Space Sciences and Nature of Science. At age 13, the English performance is worst on Earth and Space Sciences, and best on the Nature of Science!

As with Mathematics, English pupils are given a relatively low amount of Science homework, with only 2% of 13 year olds undertaking four hours or more per week, a figure exceeded or equalled by all except two of the countries involved. England is also very high in the proportion of students undertaking Science experiments in school (practical Science).

Many of the attitudinal measures used in the study have perplexing results. For example, high scoring Pacific Rim societies, and (among the European countries) Switzerland, have students whose perceptions of their own ability are notably more modest than students in lower scoring European societies such as England.

The wider environmental variables used in the study generally confirm the impression given by other studies, with Science achievement in England related to parental interest, the number of books in the home (as a surrogate for social class), the amount of homework undertaken and the amount of listening reported in Science lessons.

[page 48]

Table Twenty Three

Overall Average Percentage Correct Of All Participants (IAEPS2) 1990

[page 49]

Smaller Scale Studies

There are a number of further studies of educational achievement involving England that merit attention, mostly involving bilateral comparisons of England and one other country. Interestingly, they tend to support the principal findings of the major surveys of achievement noted above.

(i) The Exeter Kassel Project (Burghes and Blum, 1995) is an ongoing comparison of Mathematics achievement in England and Germany, involving a cohort of pupils aged 13 and their performance on number, algebra, shape and space, and the handling of data.

Initial scores of the cohort in England were slightly below those in Germany (44.3 compared to 47.3, with the maximum being 150), but by the end of year 2, German scores were 61.8 by comparison with 54.5 for England. Although the study has not used elaborate statistical methods so far, it is notable that there is considerable variation in achievement gain in differing English schools, and that the German increases in achievement have been uniform across the ability range, in apparent contrast to England.

Further analysis, the expansion of the study to more countries, and an experimental phase in which methods found to be successful in certain countries (Hungary in particular) are introduced into English schools, are all in progress.
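The widening gap in the Exeter Kassel figures quoted above can be checked with simple arithmetic. The short sketch below uses only the four scores given in the text; the dictionary layout and the function names `gain` and `gap` are assumptions introduced here for illustration and are not part of the study.

```python
# Scores quoted in the review for the Exeter Kassel cohort (maximum 150).
# The data structure and helper names are illustrative assumptions.
scores = {
    "England": {"initial": 44.3, "end_year_2": 54.5},
    "Germany": {"initial": 47.3, "end_year_2": 61.8},
}


def gain(country):
    """Points gained by a country's cohort between the two test points."""
    s = scores[country]
    return round(s["end_year_2"] - s["initial"], 1)


def gap(stage):
    """German score minus English score at a given stage."""
    return round(scores["Germany"][stage] - scores["England"][stage], 1)


print("Gain, England:", gain("England"))
print("Gain, Germany:", gain("Germany"))
print("Gap at start:", gap("initial"))
print("Gap at end of year 2:", gap("end_year_2"))
```

On these figures the German cohort gained 14.5 points against 10.2 for England, so the gap grew from 3.0 to 7.3 points.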

(ii) The early National Institute for Economic and Social Research studies of Prais and Wagner (1985) looked at Mathematics in England, Holland and Germany. They found that some students in Dutch schools from the bottom third of the ability range were able to solve algebraic problems usually tackled only by first year A level students in England.

There were marked differences in the answers to arithmetic questions given by 15 year olds in the lower half of the ability range. Sixty-nine percent of German students, but only 13% of English students, answered some simple division correctly. In a separate study looking at the lower section of the academic attainment range in each country, average scores of 12.9 were recorded in Mathematics for secondary modern school students in England and 22.4 for comparable Hauptschule students in Germany (Prais and Wagner, 1985). In fact, the average German score of 22.4 exceeded the average score of all students in England of 20.1. Taking account of the presence of a higher proportion of high attaining children in England than in Germany (5% against 2% scoring 50 or more out of a possible 70), pupils in England had a 60% higher variability in attainment than pupils in Germany in terms of the 'coefficient of variation'.
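As an aside for readers unfamiliar with the measure, the coefficient of variation expresses the spread of scores relative to their mean. The notation below is an assumption introduced here for illustration: $s$ is the standard deviation of scores and $\bar{x}$ their mean. Read this way, the '60% higher variability' quoted above corresponds to a ratio of roughly 1.6 between the English and German coefficients. The review does not quote the underlying standard deviations, so the ratio cannot be recomputed from the text.

```latex
\[
\mathrm{CV} = \frac{s}{\bar{x}}, \qquad
\frac{\mathrm{CV}_{\text{England}}}{\mathrm{CV}_{\text{Germany}}} \approx 1.6
\]
```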

(iii) The Bierhoff and Prais (1995) Re-analysis of IAEP Mathematics 2 shows that the overall score obtained by the median or 'middle' Swiss student was only obtained by the top quartile of English students. This suggests that there are three times as many low achieving students in England as in Switzerland. Further analyses of Realschule students (those at schools catering for the bottom third of academic attainment) show them achieving scores above those attained by the average pupil in all schools in England (a score of 57% compared with 51%).

[page 50]

(iv) The Burghes (1995) Study shows other countries performing much better than England; the samples of 14 year olds scored as follows (maximum score is 50):

Whilst these differences may reflect the influence of a variety of social and cultural factors as well as educational ones, the tendency for the gaps between countries to widen as cohorts pass through and as children get older, suggests the influence of within-system factors of some kind. Because of the small samples used and problems of representativeness, it would be unwise to place too great a reliance on these studies.

(v) Prais (1995) suggests that 'organisational aspects' of the British educational system disadvantage school-leavers of average and below average ability. In particular:

The use of calculators at too early an age, the 'New Mathematics' movement and the discovery and investigative methods of learning have contributed to a concentration on abstract exercises at the expense of the everyday mathematical needs of pupils.
In practical subjects, the British system emphasises 'problem-solving skills' rather than the applied skills that form the basis of these subjects in Europe; as a result, British school leavers generally have less technical competence than their European counterparts when they begin work.
Evenness of attainment is regarded in Europe as a way of ensuring high average standards for the class as a whole; the English emphasis on meeting individual pupils' needs leads to greater variation in levels of attainment within classes; in particular, slow learners do not always receive the degree of attention they need.
European pupils spend less time than their British counterparts moving from one classroom to another between lessons; the lack of a fixed place is seen by Continental teachers as a particular disadvantage for pupils from less favoured backgrounds.

[page 51]

Form teachers in Europe tend to cover more than one subject and are therefore able to develop a better understanding of individual pupils.
European secondary schools are on the whole smaller than British schools, which makes it easier for them to deal with discipline problems and disaffection.
From the age of 14 European pupils are able to pursue distinct curricula (vocational, technical or academic) according to their interests and abilities.

Secondary Analyses

There are also a number of secondary analyses of IEA and IAEP data that should be noted, although they tend to confirm the findings of the initial analyses. The Coleman (1975) re-analysis of IEA data, for example, showed that England had the highest level of family background effects of all the societies looked at, a finding also confirmed by later IEA studies. Scheerens et al (1989) re-analysed SIMS data and, although England is not included in the sample, there are fascinating variations between countries in the factors associated with achievement (as outlined in the Introduction), and in the relative strength of the school level and the classroom level in explaining variation in student achievement scores (Scandinavian countries like Sweden have very high variation explained by classrooms rather than schools, perhaps reflecting considerable central intervention to ensure similarity of catchment areas).

There have been some further attempts to rate countries on aspects of the organisation of their educational systems, such as the extent to which decisions are decentralised. But in spite of the attractiveness of using factors that have been collected in IEA studies in creative, yet low cost ways, the resulting analyses have been inconclusive. As an example, the Meuret and Scheerens (1995) classification of educational systems according to the proportion of decisions in which the school was involved, pointed to New Zealand (> 70%) and Switzerland (20%) as the two 'extreme' countries. Yet both achieved the very highest reading test scores in the OECD/CERI Education at a Glance publication (1993).

[page 52]

CONCLUSION

Worlds Apart?

The Case So Far

We noted in Section Two that there were built in problems with the kind of studies that we have been looking at - the problem of comparing countries on a 'common currency' and the problem of separating out educational influences from the other significant factors. The studies we have reviewed solved the first problem by using standard achievement tests. The second problem was considerably more difficult to deal with, since 'at a point in time' comparisons reflect the influences of multiple factors. There is no doubt that social, cultural, economic and familial factors in different countries are of major importance in explaining performance - indeed, it would be utterly contrary to all the findings of school effectiveness research if this were not the case (see reviews in Reynolds and Cuttance, 1992).

Nevertheless, it must be clear that educational influences do stand out in the findings reported. No wider societal or cultural factors could be responsible for the huge difference in achievement in England between pupils in the years of compulsory schooling (5 - 16) and pupils aged 16-19. The marked differences between societies in the sub-areas of Mathematics have no known non-educational causes. And Mathematics and Science are subjects on which wider, cultural influences are least marked.

It is clear, then, that the educational systems of different societies are key factors in determining their educational achievement. These studies are remarkably consistent in that respect. For England, the studies suggest that:

Performance in Science is rather better than that in Mathematics, but not appreciably so.
Performance in Mathematics in England is relatively poor overall, but with some strength in data handling and geometry, and considerable weakness in arithmetic/number operations.
This performance deteriorated relative to other countries between the mid-1960s and the mid-1980s.
English children have a very wide range of achievements, and a greater proportion of low achieving pupils.
There is greater variation in 'opportunity to learn' in England.
The historic advantages of the English system, with its high achieving pupils, have disappeared.

[page 53]

Only at older ages in a highly selected system is English performance relatively good.
The English performance is even more surprising, since students have more years in compulsory schooling than in most other societies.
The English performance is similar to that of Scotland, although one might have expected the latter to show a degree of superiority.

It may be that the nature of the studies is in itself an explanation for the apparently poor performance of English pupils. Particularly:

The curriculum in England may have increasingly departed from that which is measured in the surveys because of curriculum innovation.
A small number of poor or zero scores may drop the English average disproportionately.
The high proportion of test questions in studies related to number work may have adversely affected overall English performance, given our relative weakness in this skill area.
English performance might have been higher on the social outcomes of education, had more authentic assessment tasks that related more closely to school experience been used, although given the link between academic and social outcomes this seems highly unlikely.

Other societies could claim as many, or more, 'mitigating factors' than England. And the very low response rate for England, which is a feature of all the studies reported, is likely to have led to English performance being overestimated rather than underestimated.

When all studies point in the same direction, it would in our view need rather more than the above list of caveats to persuade one that the English performance is anything other than poor.

Why Worlds Apart?

In this section, we assess the factors that explain country differences in general and the rather poor English performance in particular. This search for explanations is something that has received much less attention than the issue of the size of the differences in achievement between countries. Indeed, we have seen that the IEA and IAEP studies are notably opaque when it comes to explaining their findings - not surprising, given the data that it is possible to collect on educational processes in studies of this kind.

Our difficulty in explaining, rather than describing, the differences between countries is magnified by the frankly inept contribution which the comparative education discipline has made over time (see the review in Altbach, 1991). A range of theories have been put forward in this field, without any apparent empirical backing. There are a large number of descriptive case studies of individual schools or educational systems, which it is impossible to synthesise because there are no common measures of outcomes or processes; there are also a number of descriptions of the range of educational, political, economic and cultural phenomena within different countries, which make no attempt to assess the contribution of the educational system compared with other factors.

[page 54]

There are, however, a number of small-scale, usually bilateral, comparisons of England with other countries and we have added to them by means of a comprehensive review of the educational literature ourselves. We have also had access to much unpublished data from our ISERP study.

Some Hypotheses

What follows now are some hypotheses as to:

What are the reasons for the superior performance of Pacific Rim countries?
What are the reasons for the superior performance of certain European societies as against England?
What are the reasons for the poor English performance that relate to the nature of the educational system?

The Pacific Rim

It is widely agreed that there are a variety of factors responsible for the high achievement scores of Pacific Rim societies. Among the cultural factors suggested are (Stevenson and Stigler, 1992; Her Majesty's Inspectorate, 1992; Thomas and Postlethwaite, 1983):

The high status of teachers within Pacific Rim societies that, because of their religious and cultural traditions, place a high value upon learning and education;
The cultural emphasis, reflecting Confucian beliefs, on the role of effort, the importance of an individual's striving and working hard, and the use of the educational system as a planned and stably resourced instrument of nation building;
The high aspirations of parents for their children [Stevenson (1992) reports Taiwanese parents as being more dissatisfied with school than American parents];
The recruitment to teacher training of students who are the equal of other students in terms of achievement levels. The fact that teaching offers clever, rural born children and girls and boys from low socio-economic status homes a route to upward social mobility is thought to be particularly important;
High levels of commitment from children who are keen to do well at school.

Among the systemic factors thought to be important are:

The high quantities of school time in Pacific Rim societies, with Korea and Taiwan, for example, having longer and more school days (222 days in school per year, compared to 192 in England). A high proportion of students also attend 'cramming' institutions in the evenings and homework is set for children as young as 6;
The prevalent belief that all children are able to acquire certain core skills in core subjects, and that there is no need for a 'trailing edge' of low performing pupils. This contrasts with the belief in Western societies of the normal distribution, with an elongated tail and 'built in' failure of fifty percent of the distribution to acquire more than the average level of skill;

[page 55]

Concentration on a small number of attainable goals, mostly of an academic variety or concerned with the individual's relationship to society, rather than a spread of effort across many academic, social, affective and moral goals.

Important school factors are:

The use of mixed ability classes in the early years of school, with all children receiving basic skills in an egalitarian setting, and learning to value the importance of the group and of cooperation, and with the class being kept together as a group throughout the school year.
The use of specialist teachers.
The possibility of teachers working collaboratively with each other, facilitated by teachers having approximately one third of their time out of the classroom.
Frequent testing of students' skills in core subjects which obviously is likely to enhance student attainment on achievement tests, but which is also beneficial in that it provides high quality information on how students, teachers and schools are functioning. In particular, frequent monitoring makes it possible to operate short term feedback loops and take corrective action at the level of the child or teacher much more quickly than in the English system.
Direct quality monitoring by the Principal of the work of teachers, by means of random sampling once or twice a term of the homework books of all children in the school; this happens, for example, in Taiwanese elementary schools.

Key classroom factors include:

Mechanisms to ensure that things are taught properly the first time around, and that there is no 'trailing edge' of children who have to be returned to later (an example from Taiwan is that children have to repeat in their homework books any exercises they got wrong in their previous homework).
High quantities of whole-class interactive instruction, in which the teacher attempts to ensure the entire class have grasped the information being given.
The use of the same textbooks by all children, which permits teachers to channel their energy into classroom instruction and the marking of homework, rather than into the production of worksheets that is so much a feature of English teaching.
Mechanisms to ensure that the range of achievement is kept small (in Taiwan children who have fallen behind finish their work in lesson breaks, at break times and sometimes after school).
A well ordered rhythm to the school day, involving in elementary school the use of 40 minute lessons that permit children frequent breaks to 'let off steam', combined with well-managed lesson transitions that do not 'leak' time.

[page 56]

European Societies

Discussion has also centred on the possible systemic features responsible for the high achievement levels of certain European societies. In Germany and Holland (Smithers and Robinson, 1991; Prais and Wagner, 1985; Bierhoff, 1996):

(i) There are close links between school leaving awards and future job opportunities, with particularly the apprenticeships that are the gateway to skilled working class employment being dependent upon good performance in school;
(ii) Final marks are averaged across subjects, so students cannot give up on a subject (e.g. Mathematics) if they do not like it;
(iii) Teaching groups are more homogeneous, partly because the school system is selective (for most German States), thus reducing the range and making teaching less difficult;
(iv) Students can be 'kept down' in a grade, until they have acquired the levels of achievement in all subjects necessary for the next grade up. (This affected 10% of the age cohort in the Burghes and Blum (1995) study, for example);
(v) The use of textbooks as a resource prevents teachers from re-inventing the wheel and diverting their energies into the preparation of worksheets.

In Switzerland, the high mean scores of pupils and the low range of scores in Mathematics and Science are thought to be the product of (Bierhoff, 1996; Bierhoff and Prais, 1995):

(i) High proportions of lesson time (50 - 70%) being used for whole-class teaching. This is not simply of the 'lecture to the class' variety, but high quality interactive teaching in which the teacher starts with a problem and develops solutions and concepts through a series of graded questions addressed to the whole class. Pupils working on their own in groups are correspondingly much rarer than in England;
(ii) The use of textbooks which are drafted by groups of experienced teachers to cover the prescribed curriculum of their Canton. There is a textbook for each level of schooling, and they are notable in that they contain little self-instruction material (the content of English textbooks) but mostly exercises. The textbooks come with substantial teachers' manuals, which provide suggestions for teaching each page and master copies for overhead projector transparencies;
(iii) The coherent planning of work, so that teachers know how much material should have been covered at various different time points;
(iv) Concentration in primary schools on basic number work, the use of textbooks with clear goals which provide exercises appropriate for each learning step, the use of teacher-suggested methods of calculation in number work and the introduction of calculators at a later age than in England for fear of damaging students' capacity to do mental arithmetic and calculation. (Bierhoff, 1996).

[page 57]

Systemic factors in Hungary include (Burghes, 1995):

(i) More formal classroom teaching, with more teacher direction, more whole-class interactive instruction and more open discussion of students' mistakes.
(ii) The fact that students entering teacher training for primary teaching have high qualifications, including the equivalent of A level Mathematics.
(iii) High expectations of what children can achieve, with greater lesson 'pace' (itself aided by teacher control) and national guidelines that expect teachers to move to advanced topics quickly.
(iv) Selective systems of education at secondary age that reduce the range of achievement teachers have in school, and enable clear and distinct sets of goals to be drawn for children of different achievement levels.

Growing Apart? Some Speculations

So far we have noted a wide range of explanations as to how the educational systems of different societies may generate differences in levels of educational achievement. Cultures other than England may have a variety of factors in their educational systems which are not present in an English context.

But what are the processes that are present within an English context - not just the absence of effectiveness factors seen elsewhere - which may be responsible for the English performance? What does the English experience actually look like? We use data, much of it unpublished, from the British part of the ISERP study to speculate on this point (see Reynolds et al, 1994 for an outline of the study and Reynolds et al, 1996 in Appendix A for further details of the study findings).

The first thing that is clear from our data is that there is already a wide range of achievements at age 7 amongst English pupils when they begin junior school. England, in the ISERP data, has a standard deviation of 92 (on a mean of 504), whereas Taiwan, for example, has a standard deviation of 56 (on a higher mean of 629). The heterogeneity of English society and the variation in parental environments, pre-school experiences and in infant school quality are likely to be the key explanations for this.
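
The two spreads quoted above are measured around different means, so one rough way to compare them is to express each standard deviation as a fraction of its mean (the coefficient of variation). The sketch below uses only the four figures given in the text. Treating the ISERP test scale as suitable for this ratio is our assumption for illustration, not something the study itself reports.

```python
# Means and standard deviations quoted in the text for the ISERP age 7 data.
scores = {
    "England": {"mean": 504, "sd": 92},
    "Taiwan": {"mean": 629, "sd": 56},
}


def coefficient_of_variation(mean, sd):
    """Return the standard deviation as a fraction of the mean."""
    return sd / mean


# Relative spread for each country (assumes a ratio comparison is meaningful
# on this test scale, which is an illustrative assumption).
cv = {
    country: coefficient_of_variation(stats["mean"], stats["sd"])
    for country, stats in scores.items()
}

for country, value in cv.items():
    print(f"{country}: SD is {value:.1%} of the mean")
```

On these figures the English spread is roughly twice the Taiwanese spread relative to the mean, which is the contrast the paragraph draws.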

This wide range of achievement may well be magnified by the nature of English classroom and school processes, that is:

By the class having the 'constant' of the teacher for only perhaps 20% of lesson time, by comparison with 80% of lesson time in the Pacific Rim societies.
By children being thrown back, for much of their lessons, onto their own internal resources or those of their achievement differentiated group.

It is not surprising, therefore, that in our ISERP study, England was the only country in which the range of achievement in Mathematics (computation and problem solving) widened between ages 7 and 8 (years three and four of school).

[page 58]

A pupil population which has considerable variation in achievement interacts with an educational system that displays considerable variation in quality, reflecting:

The use of a complex pedagogy involving complex room and teacher changes between lessons, the use of groups that are sometimes achievement related and sometimes mixed ability and the poor management of parent helpers and special educational needs support.
The complicated nature of the teaching role, with the teacher involved in generating 'home made' resources for instruction, like worksheets, rather than using 'common' textbooks.
The multiplicity of goals - academic, social, behavioural, cultural - that can be pursued and that make a common 'mission' difficult to achieve.

In England, the complex pedagogy, lack of goal clarity and dissipation of teacher effort result in a wide variation between the levels of quality in schools. In the English ISERP schools, 12% of the variation between pupils is due to the influence of schools of clearly heterogeneous quality. In Taiwanese schools by contrast, the presence of simple pedagogies such as time, opportunity to learn and whole-class direct instruction, combined with goal certainty and clarity, creates a system of homogeneous quality - only 1% of the variation between pupils is due to schools.
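
To make concrete what 'variation between pupils due to schools' means, the sketch below splits the total variation in pupil scores into a part lying between school averages and a part lying within schools. It is a plain sum-of-squares decomposition on invented scores, offered as an illustration only. It is not the ISERP data, and it is not necessarily the method by which the 12% and 1% figures were obtained. The function name and school labels are hypothetical.

```python
from statistics import mean


def between_school_share(schools):
    """Fraction of total pupil-score variation that lies between school means.

    `schools` maps a school label to a list of pupil scores. This is a simple
    sum-of-squares decomposition, not a fitted multilevel model.
    """
    all_scores = [score for pupils in schools.values() for score in pupils]
    grand_mean = mean(all_scores)
    total_ss = sum((score - grand_mean) ** 2 for score in all_scores)
    between_ss = sum(
        len(pupils) * (mean(pupils) - grand_mean) ** 2
        for pupils in schools.values()
    )
    return between_ss / total_ss


# Hypothetical scores, invented purely for illustration.
similar_schools = {"A": [480, 520, 500, 540], "B": [490, 530, 510, 550]}
differing_schools = {"A": [430, 470, 450, 490], "B": [540, 580, 560, 600]}

print(f"Similar schools:   {between_school_share(similar_schools):.1%}")
print(f"Differing schools: {between_school_share(differing_schools):.1%}")
```

Where schools are of similar quality, nearly all the variation is found among pupils within each school. Where schools differ markedly, a large share of the variation is attributable to which school a pupil attends.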

The ISERP longitudinal data suggests that the English system may be responding to the range of initial achievement by differentiation practices that are a perfectly understandable method of handling range, but which may, in their recognition of range, make it even wider. Another way of handling range is of course to impact upon it directly by reducing it, for example by the direct targeting of the lower achieving children for extra attention, and by 'holding pupils down a year' and 'putting them up a year', both of which actions considerably truncate range. Many European societies indeed combine action on range with a reluctance to differentiate between and within classes. In Switzerland, for example, a differentiated secondary system at age 13 means that only 18% of pupils are grouped by ability in Mathematics between classes, and 19% of pupils within classes. By comparison, the English figures are 92% and 32% respectively.

Indeed, much of Europe now possesses a predominantly mixed ability and undifferentiated primary education system, where pupils repeat years if necessary, followed by a differentiated secondary system. This is not the English experience.

The Next Steps

Customarily, educationists have been highly ethnocentric in their knowledge bases of effective practices, although things are clearly changing as we noted at the beginning of this review. Our conclusion is that English educationists now need to look beyond their own geographical boundaries to see why it is that other countries, in particular those of the Pacific Rim and successful European countries such as Switzerland, may be doing better than we are.

[page 59]

To look at other, non-English contexts and assess which of their practices may be useful here is of course a slightly risky enterprise, intellectually and practically. Factors that work within one context may not work within another, or at least may not work as productively. Indeed, factors that are associated with success in a Pacific Rim culture which celebrates a very different view of the nature of humankind, and a very different view of the proper relationship between an individual and the collectivity, may need careful evaluation before they are adopted in schools of a different culture.

However, we would argue that the situation in which England finds itself is now so worrying, that the risk involved in looking outward and trying new practices is worth taking. Indeed, limited experimentation with non-British practice seems positively overdue. When such experiments have taken place within non-educational sectors of society - as with the British motor industry's use of a blend of British and Japanese practice - they have been productive for the professionals concerned and for the wider society. Variations in cultural context and traditions have never prevented management in any area from trying out ideas or reforms that have been introduced abroad, monitoring their effectiveness and then dispensing with them if they do not improve the situation.

We would suggest that educationists in England behave as we would urge our children to do: that is, to look beyond the immediate restriction of tradition and geography and use an open mind to see if other countries have ideas and practices which we can adapt to our own system. The way to cease being 'Worlds Apart' is surely to adopt an open mind.

[page 60]

APPENDIX A

Further Reading

1. International Comparisons and Education Reform (1989) by A. Purves and published by ASCD in the United States is an excellent guide to, and summary of, many of the studies reviewed here.

2. D. Reynolds et al (1994) Advances in School Effectiveness Research and Practice (Oxford: Pergamon) is an introduction to the field of educational effectiveness, and is international in scope.

3. H. Bierhoff and S. J. Prais (1995) Schooling As Preparation for Life and Work in Switzerland and Britain (London: NIESR) is a small scale study of England and of Europe's highest scoring country on Mathematics, Switzerland. It is available (price £3.00) from the National Institute for Economic and Social Research, 2 Dean Trench Street, Smith Square, London, SW1P 3HE.

4. The IAEPM 2 Study Learning Mathematics and Science (1992) by D. Foxman, is available (price £10.00) from the National Foundation for Educational Research, The Mere, Upton Park, Slough, Berkshire, SL1 2DQ. The NFER also publish a free leaflet on The Third International Mathematics and Science Study (TIMSS), available from the same address.

5. S. J. Prais (1995) Productivity, Education and Training: An International Perspective (Cambridge University Press) provides a comprehensive and comprehensible summary of the detailed work of Prais and associates over the last decade.

6. Findings from the ISERP study are available in D. Reynolds, B.P.M. Creemers, S. Stringfield and C. Teddlie (1996) The International School Effectiveness Research Project: Some Further Findings (a paper presented to the American Educational Research Association, April 1996), which is available free of charge from David Reynolds, Department of Education, St Thomas Street, Newcastle upon Tyne, NE1 7RU.

7. Reviews of the methodological adequacy of the material reviewed here can be found in a special issue of the Comparative Education Review, volume 31, number 1 (February 1987).

[page 61]

APPENDIX B

Recommendations for Future Research

A number of possible avenues of research have been suggested by our review of existing knowledge:

(i) England has, to our knowledge, undertaken no secondary analyses of international datasets, nor of the English/Welsh specific country data, in marked contrast to the situation in other countries (see for The Netherlands the re-analyses of IEA data by Scheerens et al, 1989). The imminent publication of the TIMSS results, and the availability of that data, creates an opportunity for such low cost work.
(ii) England has not participated in two major studies which have sought to find educational processes that might be linked to achievement outcomes, namely the Classroom Environment Study (Anderson, Ryan and Shapiro, 1989) and the Computers in Education Study (Pelgrum and Plomp, 1993). Participation in such process studies in future could be profitable.
(iii) Large scale surveys of educational achievement are often designed using complex matrix sampling so that wide topic coverage can be obtained. The potential exploitation of these data will be greatly enhanced by the use of multi-level modelling, as in TIMSS, which enables a detailed description of the variance structure at the student level (Goldstein, 1989).
(iv) England should end its continued non-participation in IEA cohort studies, which can give much better estimates of cross-cultural variation in achievement than cross-sectional studies.
(v) Future studies of the IEA, and similar organisations, should attempt to describe cross cultural differences rather less and analyse the reasons for their existence rather more.
(vi) The existence of substantial within country differences could usefully be explored by separate additional analyses that would look at the potentially different outcomes and processes of the four countries of England, Scotland, Northern Ireland and Wales.

[page 62]

APPENDIX C

Bibliography

Altbach, P. (1991) 'Trends in Comparative Education' in Comparative Education Review, August, pp 491-507.

Anderson, L. W., Ryan, D. W. and Shapiro, B. J. (1989) The IEA Classroom Environment Study. Oxford: Pergamon Press.

Ball, S. J. (1990) Politics and Policymaking in Education. London: Routledge and Kegan Paul.

Bierhoff, H. (1996) Laying the Foundation of Numeracy: A Comparison of Primary School Textbooks in Britain, Germany and Switzerland. London: National Institute for Economic and Social Research.

Bierhoff, H. J. and Prais, S. J. (1995) Schooling As Preparation for Life and Work in Switzerland and Britain. London: National Institute for Economic and Social Research.

Bronfenbrenner, U. (1972) 'Another World of Children' in New Society, 10th February, pp 278-286.

Burghes, D. and Blum, W. (1995) 'The Exeter Kassel Comparative Project: A Review of Year 1 and Year 2 Results' in Gatsby Foundation, Proceedings of a Seminar on Mathematics Education. London: Gatsby.

Burghes, D. (1995) 'Britain Gets a Minus in Maths', Sunday Times, 14th May, p 11.

Cheng, Y. C. (1993) 'The theory of the characteristics of school-based management' in International Journal of Educational Management, Vol. 7, No. 6, pp 6-17.

Coleman, J. S. (1975) 'Methods and Results in the IEA Studies of Effects of School on Learning' in Review of Educational Research, Vol. 45, No. 3, pp 335-386.

Comber, L. C. and Keeves, P. (1973) Science Education in Nineteen Countries. London: John Wiley.

Creemers, B. (1994) The Effective Classroom. London: Cassell.

Creemers, B. and Scheerens, J. (Eds) (1989) 'Developments in school effectiveness research', a special issue of International Journal of Educational Research, Vol. 13, No. 7, pp 685-825.

Edmonds, R. R. (1979) 'Effective schools for the urban poor' in Educational Leadership, Vol. 37, No. 15-18, pp 20-24.

Fogelman, K. (1978) 'The Effectiveness of Schooling' in Armytage, W. H. G. and Peel, J. (Eds). Perimeters of Social Repair. London: Academic Press.

Foshay, A. W. (1962) Educational Achievements of Thirteen Year Olds in Twelve Countries. Hamburg: UNESCO.

Foxman, D. (1992) Learning Mathematics and Science (The Second International Assessment of Educational Progress in England) Slough: National Foundation for Educational Research.

[page 63]

Fuller, B. (1987) 'School effects in the Third World' in Review of Educational Research, Vol. 57, No. 3, pp 255-292.

Garden, R. A. (1987) 'The Second IEA Mathematics Study' in Comparative Education Review, Vol. 31, No. 1, pp 47-68.

Goldstein, H. (1989) Multilevel Modelling in Large Scale Achievement Surveys. Paper presented to the American Educational Research Association, San Francisco, March.

Goldstein, H. (1993) Interpreting International Comparisons of Student Achievement. Paris: UNESCO.

Goodman, R. (1995) 'Chasing Illusions: the Real Lessons From Japanese Schools' in Demos Quarterly, Vol. 6, pp 37-8.

Hallinger, P. and Murphy, J. (1986) 'The social context of effective schools' in American Journal of Education, Vol. 94, pp 328-355.

Hargreaves, A. and Reynolds, D. (Eds) (1989) Educational Policy: Controversies and Critiques. Lewes: Falmer Press.

Her Majesty's Inspectorate (1992) Teaching and Learning in Japanese Elementary Schools. London: HMSO.

Howarth, M. (1991) Britain's Educational Reform: A Comparison with Japan. London: Routledge/Nissan Institute.

Husen, T. (Ed) (1967) International Study of Achievements in Mathematics, Volumes One and Two. Stockholm: Almquist and Wiksell.

Keeves, J. P. (1992) The IEA Study of Science III: Changes in Science Education and Achievement, 1970 to 1984. Oxford: Pergamon Press.

Keys, W. and Foxman, D. (1989) A World of Differences (A United Kingdom Perspective On An International Assessment of Mathematics and Science). Slough: National Foundation for Educational Research.

Lapointe, A. E., Mead, N. and Phillips, G. (1989) A World of Differences: An International Assessment of Mathematics and Science. New Jersey: Educational Testing Services.

Levine, D. and Lezotte, L. (1990) Unusually Effective Schools: A Review and Analysis of Research and Practice. Madison: NCESRD Publications.

Lezotte, L. (1989) 'School Improvement based on the Effective Schools Research' in International Journal of Educational Research, Vol. 13, No. 7, pp 815-25.

Lynn, R. (1988) Educational Achievement in Japan: Lessons for the West. London: Macmillan/Social Affairs Unit.

Meuret, D. and Scheerens, J. (1995) An International Comparison of Functional and Territorial Decentralisation of Public Educational Systems. Twente: University of Twente.

Mislevy, R. J. (1995) 'What can we learn from International Assessments?' Educational Evaluation and Policy Analysis, Vol. 17, No. 4, pp 419-437.

National Commission on Education (1993) Learning to Succeed. London: Heinemann.

Pelgrum, W. J. and Plomp, T. (1993) The IEA Study of Computers in Education: Implementation of An Innovation In 21 Education Systems. Oxford: Pergamon Press.

[page 64]

Postlethwaite, T. M. and Wiley, D. E. (1992) The IEA Study of Science II, Science Achievement in Twenty Three Countries. Oxford: Pergamon Press.

Prais, S. J. (1995) Productivity, Education and Training: An International Perspective. Cambridge: Cambridge University Press.

Prais, S. J. (1994) 'Economic Performance and Education: The Nature of Britain's Deficiencies' Proceedings of the British Academy, Vol. 84, pp 151-207.

Prais, S. J. and Wagner, K. (1965) 'Schooling standards in England and Germany: some summary comparisons based on economic performance' in Compare, Vol. 16, pp 5-36.

Purves, A. C. (1992) The IEA Study of Written Composition II: Education and Performance in Fourteen Countries. Oxford: Pergamon Press.

Reynolds, D. and Cuttance, P. (1992) School Effectiveness: Research, Policy and Practice. London: Cassell.

Reynolds, D., Sullivan, M. and Murgatroyd, S. J. (1987) The Comprehensive Experiment. Lewes: Falmer Press.

Reynolds, D., Creemers, B. P. M., Stringfield, S., Teddlie, C., Schaffer, E. and Nesselrodt, P. (1994) Advances in School Effectiveness Research and Practice. Oxford: Pergamon Press.

Robitaille, D. F. and Garden, R. A. (1989) The IEA Study of Mathematics II: Contexts and Outcomes of School Mathematics. Oxford: Pergamon Press.

Rosier, M. J. and Keeves, J. P. (1991) The IEA Study of Science I - Science Education and Curricula in Twenty Three Countries. Oxford: Pergamon Press.

Scheerens, J., Vermeulen, C. J. and Pelgrum, W. J. (1989) 'Generalisability of instructional and school effectiveness indicators across nations' in International Journal of Educational Research, Vol. 13, No. 7, pp 789-799.

Smithers, A. and Robinson, P. (1981) Beyond Compulsory Schooling: A Numerical Picture. London: Council for Industry and Higher Education.

Stevenson, H. (1992) 'Learning from Asian Schools' in Scientific American, December, pp 32-38.

Stevenson, H. W. and Stigler, J. W. (1992) The Learning Gap: Why Our Schools Are Failing and What We Can Learn from Japanese and Chinese Education. New York: Summit Books.

Teddlie, C. and Stringfield, S. (1993) Schools Make a Difference: Lessons learned from a ten year study of school effects. New York: Teachers College Press.

Thomas, R. and Postlethwaite, N. (1983) Schooling in East Asia: Forms of Change. Oxford: Pergamon Press.

Travers, K. J. and Westbury, I. (1989) The IEA Study of Mathematics I: Analysis of Mathematics Curricula. Oxford: Pergamon Press.

Van de Grift, W. (1990) 'Educational leadership and academic achievement in secondary education' in School Effectiveness and School Improvement, Vol. 1, No. 1, pp 26-41.

Walberg, H. J. (1991) 'Improving School Science in Advanced and Developing Countries' in Review of Educational Research, Vol. 61, No. 1, pp 25-69.