A History of Roget's Thesaurus: Origins, Development, and Design

Werner Hüllen

Print publication date: 2003

Print ISBN-13: 9780199254729

Published to Oxford Scholarship Online: January 2010

DOI: 10.1093/acprof:oso/9780199254729.001.0001

1 Introduction
Oxford University Press

Abstract and Keywords

Roget's Thesaurus is an outstanding work of English lexicography. When it appeared in 1852, it was the first of its kind. It has been on the bookshelves of almost every educated man and woman in Britain, the United States, and indeed the whole of the English-speaking world ever since, as the many reprints and new editions testify. This is true of both the countries with an indigenous English-speaking population and the countries whose inhabitants learn English as a second or foreign language. From these areas it wandered throughout the world to wherever people think they cannot afford to neglect something which enjoys such general acceptance. Moreover, Roget's Thesaurus gave rise to similar word collections, at least in many of the major European languages. This book traces the history of Roget's Thesaurus, focusing on its origins, development, and design.

Keywords: Peter Mark Roget, lexicography, education, English-speaking world, Roget's Thesaurus

1.1 The initial hypothesis

Roget’s Thesaurus is an outstanding work of English lexicography. When it appeared (Roget 1852), it was the first of its kind. It has been on the bookshelves of almost every educated man and woman in Britain, the United States, and indeed the whole of the English-speaking world ever since, as the many reprints and new editions testify. This is true of both the countries with an indigenous English-speaking population and the countries whose inhabitants learn English as a second or foreign language. From these areas it wandered throughout the world to wherever people think they cannot afford to neglect something which enjoys such general acceptance. Moreover, Roget’s Thesaurus gave rise to similar word collections, at least in many of the major European languages.

The vast distribution of this book (almost comparable to that of the Bible) is hard to explain. The Thesaurus is now widely regarded as being part of our heritage; so a ‘150th anniversary edition’, published in 2002, was quite natural (Roget 2002). But then the question arises of how this came about in 150 years. Was it because of its eminent usability as an up-to-date tool for conversation and composition, or was it because of its linguistic concept, which allowed a comprehensive and comprehensible presentation of the English language (and, later, of other languages, too)? In other words, is it the practical or the theoretical merits of the Thesaurus which account for its success?

There is no way of deciding this question in a straightforward manner. Some voices argue that Roget’s Thesaurus, just like other thesauri, is one of those volumes which sit on bookshelves and are hardly ever used (Püschel 1986; Wiegand 1994). Again, the question arises of what exactly the reason for this is. Even if a project to investigate present-day thesaurus use were to confirm this assumption, it would be difficult to transfer its findings to the past. Inferring from the present to the past is not a legitimate procedure, because there are so many more sources of linguistic information available now than in 1852 and the decades immediately following, so that the book’s early success might be understandable but not its unbroken, 150-year-long popularity. As for assessing the theoretical qualities of the Thesaurus, nobody has yet tried to do this.

But perhaps the ‘theory’ and ‘practice’ of the Thesaurus need not be juxtaposed in this way. Peter Mark Roget certainly did not do this himself. His professional fields were medicine and natural history, not linguistics. He sketched out an ideational framework of his project in the ‘Introduction’, but in doing so he showed his awareness of general philosophical ideas rather than any specific linguistic knowledge. He admitted in the ‘Preface to the First Edition’ that he had studied and compiled words in the fashion of the Thesaurus for many years, and that he had done so for the practical purposes of literary composition. It may very well be, however, that he did something intuitively, with his own linguistic needs and experiences as a scientist in mind, which very soon turned out to be a real achievement in linguistics-based lexicography.

This possibility ties in with the general observation that many (perhaps most) of our relevant linguistic ideas have a pre-theoretical (common) as well as a theoretical (academic) status. After all, the analytical system of descriptive linguistics, esoteric as it looks to many, deals with a common type of human behaviour. Here academic experts reflect on and analyse what is part and parcel of everyone’s competence and performance. A phenomenon like ellipsis, for example, was certainly realized in countless texts long before an academic gave it special attention (1), and it has been and still is used all the time parallel to and without any knowledge of the permanent stream of scientific literature devoted to the topic. We might assume that the wide appeal of Roget’s Thesaurus lies in a particularly felicitous fusing of some theoretical and some practical aspects, leaving open the question of whether the author really knew and understood what he was doing. This would place his book in a position precisely between the (merely) theoretical and the (merely) practical.

With the advantages of historiographical hindsight, we can indeed recognize that Roget’s Thesaurus combines various linguistic ideas with quite a long and influential tradition behind them and that this may be an additional covert source of its success (2). It is, first, a dictionary of synonyms. Such dictionaries had been popular for about two hundred years before 1852. Even then, they developed from much older linguistic reflections. Synonyms have been dealt with since the classical authors, though those authors rarely made them a point of theoretical discussion. Synonyms were used in the common handling of language—in dialogues, poems, dramas, sermons, etc.—without speakers and writers being aware that they were doing something which scientists would, at a certain date in history, make the starting point of their studies. We may call this an ‘autonomous tradition’ (Hüllen 1999a: 28–39), i.e. ‘a long established and generally accepted custom or method of procedure’ (OED, definition, 5b) which results from natural conditions, in this case of language use, and needs no further justification. The point in history when this ‘autonomous tradition’ became a ‘deliberate’ one (without ceasing to be autonomous) is worth our special attention.

Second, the Thesaurus is a topical dictionary with its entries arranged according to semantic affinity. Such dictionaries had also been popular for at least two hundred years prior to 1852. Again, we encounter the habit of collecting words according to certain domains of meaning as an autonomous tradition. It appears in classical (and even pre-classical) word-lists (e.g. onomastica, scalae) and, in post-classical culture, in countless glosses, glossaries, nomenclators, onomasiological dictionaries, etc. (Hüllen 1999a). Serving various needs, topical dictionaries came into their own when, in the era of humanism, works of this kind were compiled whose philosophical aim it was to mirror the structure of the world.

This book proceeds from the historiographical hypothesis that Peter Mark Roget’s possibly unintentional, but nevertheless unique achievement was to integrate these two types of dictionary by compiling a topical dictionary of synonyms. In order to confirm this hypothesis, we shall treat the Thesaurus as a kind of rescript (or palimpsest) of two rich and important linguistic traditions. The history of synonymy as well as the history of onomasiology are, metaphorically speaking, inscribed on it just as old texts are inscribed on a parchment which has been used for new texts time and time again.

But there is something more to this palimpsest quality of Roget’s book. If we take seriously the notion that linguistic works are the scientific presentation of common linguistic behaviour, we must also allow room for the idea that not only pre-Thesaurus but also post-Thesaurus linguistics is of importance for its comprehension. Later concepts such as, for example, semantic fields or prototypes are, avant la lettre, incorporated into it just as much as are earlier historical concepts of synonymy and onomasiology. These later concepts determine the thinking of present-day historiographers even more than historical concepts do. They can even be called the indispensable condition of speaking about the past at all, because every historiographer is bound to the language and the thinking of his or her time in his or her dealings with the past (Hüllen 1998).

Thus, a book like Roget’s Thesaurus has not only its past but also its future inscribed in it (3). Or, to change the metaphor, it appears as a pivotal point around which certain linguistic ideas before and after 1852 can be arranged. What suggests this position, comparable to the hub of a wheel, for the Thesaurus is, of course, its outstanding success. This judgement is nevertheless a historiographical artefact, a decision of the historiographer who finds the book interesting enough to study. There are plenty of reasons for making other books the pivot of linguistic developments in very much the same way.

Although towering high above other books of a similar kind, Roget’s Thesaurus is also a dictionary among other dictionaries, and thus a sample of a certain genre of book. As such, it is part of semantics, as all dictionaries are. At this more general level, Roget’s Thesaurus can therefore also be made the pivotal point of certain aspects of the somewhat hidden history of semantics, as it appears in lexicography before and after 1852, but in particular between (roughly) 1700 and 1900 (4). The frame of reference is the British tradition (i.e. the English language), but glances at the European tradition on the Continent will also be useful.

1.2 Outline of the book

The starting hypothesis, as explained above, streamlines the sequence of arguments to be unfolded in this book. Chapter 2 aims to shed light on the natural scientist Peter Mark Roget, on certain ideas about the ordering of knowledge which he encountered and favoured during his life, and on the publication history of his Thesaurus. This introduction to the author will deal with the facts of his biography, but also with the scientific and general ideas generated and disseminated in the course of his dealings with his contemporaries. We shall have to observe the difference in the status of facts on the one hand and of ideas on the other for the historiography of (certain areas of) linguistics.

The stage having been set, three tasks must be attended to. First, synonymy must be outlined as an experience of everyday linguistic behaviour and in its more recent scientific conceptualizations. Second, the historical path of English dictionaries of synonyms must be charted and, third, the same must be done for the topical tradition of English lexicography. In Chapter 3, terms such as synonymy, homonymy, and polysemy are assessed as denoting ordinary linguistic experience, and semantic fields, semantic features, and semantic models (e.g. prototypes) are discussed as the most recent and pertinent ways for determining word meanings. It will be argued that synonymy is the hard core of each of them and that they tend to lose precision and acceptance as soon as they reach out to wider and more general statements about language, such as culture, generativism, or cognitivism. Taken together, these discussions will support an explanation of what synonymy actually is, and they will provide the special terms needed for the structural analysis of the Thesaurus itself.

Of course, it is impossible to write an exhaustive history of synonymy—the linguistic phenomenon and its concomitant theory. Sketches of early practices and statements must do instead, as is shown in Chapter 4. They oscillate between the autonomous and the deliberate tradition. They extend from selected texts in antiquity via medieval and humanist practice to literary language, and they serve the argument as a backdrop serves the stage.

Chapter 5 presents the argument that practical synonymy as a part of interpretative lexicography was used in hard-word and general dictionaries which appeared from the seventeenth century onwards. They are one trail leading to Roget’s work. The role played by synonymy in their procedures of semanticizing will be defined and illustrated. The first peak of this development is to be found in Dr Johnson’s dictionary of the English language (1755). Its elaborate theory of ‘reciprocity’, as explained in the ‘Plan’ and the ‘Preface’, will be considered, with a reference to those Lockean ideas which preceded the dictionary and are the groundwork of Dr Johnson’s lexicographical practice. This will mean dealing with John Locke as a semanticist rather than as a philosopher.

Chapter 6 is the first of two essential historiographical sections of the book. After the backdrop mentioned above, the ‘play’, so to speak, starts with the beginnings of what are still pre-theoretical deliberations on synonymy as published by the Abbé Girard (1718). A brief sketch will be given of the extraordinary success those deliberations enjoyed on the Continent and also in England.

After Dr Johnson, there is a continuous chain of English synonym dictionaries. Together with the books published in the wake of Abbé Girard, they are a parallel trail leading to Roget’s work. They pave the way to 1852, the year in which the Thesaurus was published, and beyond. William Perry (1805), who explains word meanings simply by placing them in the context of synonyms, brings us methodically to Roget’s doorstep. Of course, all these dictionaries differ from Roget’s in being alphabetically ordered. A similar abundance of relevant publications appeared in the United States, though these works are beyond the scope of this monograph.

The long path from Johnson and Girard up to Roget is regarded as a pre-theoretical preparation of semantics, which came into its own only with Reisig’s lectures, delivered in 1825 and published in 1839 but influential in discussions only from the 1870s onwards (Schmitter 1996), and, later, with Saussure’s concept of valeur. This preparatory function applies to linguistic material (lexis) as well as to analytical method (semantics, lexicography).

Chapter 7 is the other essential historiographical section of this book. As a detailed history of the topical tradition in English and Continental lexicography is available (Hüllen 1999a), only a short epitome will be given. It ends with a conspectus of Wilkins’s ‘Tables’, i.e. the semantic part of his universal language scheme which, in fact, is a Thesaurus of words ordered according to philosophical principles. Around 1700, this topical tradition came to a halt, but later started anew within a different framework inspired by the innovative spirit of John Locke, whose main ideas will have already been outlined in our discussion of Johnson’s dictionary. Works to be discussed are the sketch of Pasigraphie by Jean de Maimieux (1797a, b), which is a new version of John Wilkins’s universal language scheme, and David Booth’s abandoned plan of an analytical dictionary (1835).

At this point, the historical prerequisites for Chapter 8, an analysis of Roget’s Thesaurus, the topical dictionary of synonyms, will have been made clear. This analysis makes use of the usual lexicographical concepts of macrostructure, microstructure, and pragmatic structures. The preface of the Thesaurus is read in the Lockean spirit, the ‘Plan of Classification’ (as macrostructure) is commented upon, and the structure of entry articles (as microstructure) is explained by various examples. It is here that the prefiguring function of the Thesaurus for the future development of semantics becomes most clear. It is also here that post-Thesaurus terms are most needed in order to explain why this is the case. The concepts and corresponding terms of fields, features, prototypes, frames, and scripts will be employed. In order to uncover pragmatic structures, content-dependent rules of order will be applied.


(1) It is Thomas Linacre (c. 1460–1524) who is usually credited with having done this.

(2) ‘Success’ does not refer only to the economic side of the matter, though it includes this. It means the general attention of experts and laypersons, which the Thesaurus managed to attract.

(3) I regret that there is no metaphor encompassing future inscriptions on a parchment as palimpsest does for the past.

(4) For more on this limitation, see the Preface.