Dictionary for Clinical Trials

Author:
Simon Day



Dictionary for Clinical Trials Author: Simon Day Copyright © 1999 John Wiley & Sons Ltd ISBNs: 0-471-98611-9 (Hardback); 0-471-98596-1 (Paperback); 0-470-84256-3 (Electronic)

To Nikki, Anya and Huw

Dictionary for Clinical Trials Simon Day Medical Department, Leo Pharmaceuticals, Princes Risborough, UK

JOHN WILEY & SONS, LTD Chichester · New York · Weinheim · Brisbane · Singapore · Toronto

Copyright © 1999 by John Wiley & Sons Ltd, Baffins Lane, Chichester, West Sussex PO19 1UD, England
National 01243 779777
International (+44) 1243 779777
e-mail (for orders and customer service enquiries): [email protected]
Visit our Home Page on http://www.wiley.co.uk or http://www.wiley.com

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency, 90 Tottenham Court Road, London, UK W1P 9HE, without the permission in writing of John Wiley & Sons Ltd, Baffins Lane, Chichester, West Sussex, UK PO19 1UD.

Other Wiley Editorial Offices
John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012, USA
WILEY-VCH Verlag GmbH, Pappelallee 3, D-69469 Weinheim, Germany
Jacaranda Wiley Ltd, 33 Park Road, Milton, Queensland 4064, Australia
John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop 02-01, Jin Xing Distripark, Singapore 129809
John Wiley & Sons (Canada) Ltd, 22 Worcester Road, Rexdale, Ontario M9W 1L1, Canada

Library of Congress Cataloging-in-Publication Data
Day, Simon.
Dictionary for clinical trials / Simon Day.
p. cm.
Includes bibliographical references.
ISBN 0-471-98611-9 (cased : alk. paper). ISBN 0-471-98596-1 (paper : alk. paper)
1. Clinical trials–Dictionaries. I. Title.
[DNLM: 1. Clinical Trials dictionaries. QV 13 D275d 1999]
R853.C55D39 1999
610.72–dc21
DNLM/DLC for Library of Congress 99-11151 CIP

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN 0-471-98611-9 (cased)
ISBN 0-471-98596-1 (paper)

Typeset in 9/10pt Times from the author’s disks by Vision Typesetting, Manchester
Printed and bound in Great Britain by Antony Rowe Ltd., Chippenham, Wiltshire
This book is printed on acid-free paper responsibly manufactured from sustainable forestry, in which at least two trees are planted for each one used for paper production


Preface

It is now fifty years since the British Medical Research Council published the results of a trial entitled ‘Streptomycin treatment of pulmonary tuberculosis’ (British Medical Journal, 30th October 1948, pages 769-782). That study is widely regarded as the first randomised clinical trial. Earlier examples of nonrandomised studies are cited, notably that of J Lind (A Treatise on the Scurvy, 1753). Despite such a history and the enormous numbers of trials conducted and published in the last twenty or so years, many people do not consider ‘clinical trials’ as a discipline in its own right and, as such, the breadth of terms that should be covered in a dictionary of this kind is not well defined. Ultimately, the choice of entries is a personal one, guided by experiences of what I have had to learn and what my colleagues in various specialities of the clinical trials spectrum have struggled to understand. Additionally I have trawled clinical trial protocols, reports, regulatory guidelines and published manuscripts to try to cover the majority of terms that are likely to be encountered.

A lot of the terminology of clinical trials is statistical: terms used for the design (blocks, randomisation, stratification) and for the analysis (confidence interval, P-value, survival analysis, t test, to list but a few). I make no apology for the high proportion of statistical terms: those are usually the ones that are least well understood. Overall though, the content is broad and it is very difficult to summarise what is covered. It is almost as difficult to summarise what isn’t covered. This is not a dictionary of medical terms, of statistical terms, of epidemiological, ethical or data management terms. It does, however, contain elements of all those disciplines, the first three in particular. Many of the epidemiological terms included would not ordinarily be found in a clinical trial protocol or report; however, in the discussion of whether a clinical trial is appropriate for answering a particular medical question, or in discussion of trial results alongside other sources of evidence, the issue of other approaches such as case-control studies and cohort studies is likely to be discussed. I have not included specific diseases (a medical dictionary would be more appropriate) or names of clinical rating scales but I have included a variety of medical terms that are frequently assumed to be understood (terms such as acute, chronic, subcutaneous, etc.). Abbreviations are not included, except in the few instances where a term is better known by its acronym than by its full name (COSTART and MedDRA are obvious examples). Nor are the names of professional or scientific societies, research institutions or regulatory authorities included.

The intended readership for this dictionary is all those people who work with clinical trial protocols and reports or who otherwise need to understand the use of language in this specialist area. Such a readership includes clinical trialists (those people who actually carry out the various administrative, clerical and scientific aspects), those who sit on ethics committees, those who work in regulatory departments or grant awarding bodies, doctors, nurses, pharmacists (and patients) reading clinical trial reports, and so on. Trials sponsored by the pharmaceutical industry, as well as those conducted by academic institutions or by small groups of enthusiasts, all fall within the scope of this work as do community-based intervention studies, vaccine trials, studies of medical practice and medical devices. Necessarily, many entries will be more relevant to some types of trials and trialists than to others. I hope the coverage is adequate without being too cumbersome.

The style of explanations and definitions is aimed at being pragmatic and readable rather than purist. Pre-existing definitions (often in regulatory guidelines) have not necessarily been faithfully reproduced, although care has been taken to incorporate the essential meaning from relevant guidelines. As an example, the term ‘adverse event’ has a very specific definition within the International Conference on Harmonisation although the explanation given here is a little more brief. Further examples of pragmatism abound in the explanations of some statistical terms. Many statisticians may challenge the correctness of my explanations of analysis of covariance, Bayesian statistics or P-value, for example: I apologise to them in advance but hope that the explanations I have given will help those readers who understand little or nothing of such terms to at least gain a rough and ready grasp of their meaning. Similarly, ‘ethics’ is covered in a mere two lines: there are other related entries but the aim is to get the essential meaning across. Full and complete explanations of all the terms included would mean this work taking on the scale of a series of text books and that is not the intention. I hope that the explanations given here, put in the context where the word or expression has arisen, will allow most readers to unravel most uncertainties.

In my defence over accuracy and quality control I can claim that every single entry has been reviewed by a variety of my colleagues; and in their defence I acknowledge that every single error, discrepancy and inconsistency remains my responsibility.


The Ground Rules

The following is a brief guide to what’s in and what’s not in, and rules for cross-referencing related or alternative terms. In general, study is used rather than trial except where the distinction is helpful (strictly speaking, study encompasses trial but many types of study will not be trials). Similarly, trial is taken to mean clinical trial. For example, acute study is listed, but not acute trial or acute clinical trial. Phrases may sometimes be abbreviated but, I hope, without causing any difficulty in finding them. For example, adaptive design should be taken to encompass adaptive trial design and adaptive clinical trial design.

Where alternative terms may be used interchangeably I have tried to pick the most common term to define and its synonyms will simply direct you there with the symbol ☞. For example, alpha error simply says ‘☞ type I error’ (where an explanation is given). The most important terms used within the definitions of other terms are emboldened, as are references to contrasting terms (⇔ . . .) and related terms (✧ . . .). I hope that sometimes giving indication of contrasting or related terms may help understanding. It is inevitable, however, that some definitions will be circular: active control contrasts with (⇔) placebo control; placebo control contrasts with (⇔) active control. Ultimately, just as with all dictionaries, all definitions must use the terms herein to explain other terms and the circularity becomes inevitable.


Bibliography

There is a variety of books written about clinical trials and several other dictionaries and glossaries that may prove helpful in defining terms and clarifying their use. The following titles have proved particularly helpful in compiling this dictionary and may serve as useful additional sources of reference:

Applied Clinical Trials (various issues)
Churchill’s Illustrated Medical Dictionary (1989) New York: Churchill Livingstone.
Boyd KM, Higgs R and Pinching AJ (1997) The New Dictionary of Medical Ethics. London: British Medical Journal.
Bull K and Spiegelhalter DJ (1997) Survival analysis in observational studies. Statistics in Medicine 16:1041-74.
Duncan AS, Dunstan GR and Welbourn RB (1981) Dictionary of Medical Ethics, revised edition. London: Darton, Longman and Todd.
Dupayrat J (1990) Dictionary of Biomedical Acronyms and Abbreviations, 2nd edition. Chichester: John Wiley & Sons.
Everitt BS (1995) The Cambridge Dictionary of Statistics in the Medical Sciences. Cambridge: Cambridge University Press.
Friedman LM, Furberg CD and DeMets DL (1985) Fundamentals of Clinical Trials, 2nd edition. Littleton: PSG Publishing Company.
Grieve AP (1998) FAQs of Statistics in Clinical Trials. Richmond: Brookwood Medical Publications.
Heister R (1989) Dictionary of Abbreviations in Medical Sciences. Berlin: Springer-Verlag.
Jadad A (1998) Randomised Controlled Trials. London: British Medical Journal.
Johnson FN and Johnson S (1977) Clinical Trials. Oxford: Blackwell Scientific Publications.
Last JM (1995) A Dictionary of Epidemiology, 3rd edition. New York: Oxford University Press.
Marriott FHC (1990) A Dictionary of Statistical Terms, 5th edition. Harlow: Longman Scientific and Technical.
Meinert CL (1986) Clinical Trials: Design, Conduct and Analysis. New York: Oxford University Press.
Meinert CL (1996) Clinical Trials Dictionary: Terminology and Usage Recommendations. Baltimore: The Johns Hopkins University.
Nahler G (1994) Dictionary of Pharmaceutical Medicine. New York: Springer-Verlag.
Pereira-Maxwell F (1998) A-Z of Medical Statistics. London: Arnold.
Po AL (1998) Dictionary of Evidence Based Medicine. Oxford: Radcliffe Medical Press.
Pocock SJ (1983) Clinical Trials: A Practical Approach. Chichester: John Wiley & Sons.
Rasch D, Tiku ML and Sumpf D (1994) Elsevier’s Dictionary of Biometry. Amsterdam: Elsevier Science.
Raven A (1993) Clinical Trials: An Introduction. Oxford: Radcliffe Medical Press.
Samson P (1975) Glossary of Bacteriological Terms. London: Butterworth and Co (Publishers) Ltd.
Schwartz D, Flamant R and Lellouch J (1980) Clinical Trials. London: Academic Press.
Senn S (1997) Statistical Issues in Drug Development. Chichester: John Wiley & Sons.
Spilker B (1991) Guide to Clinical Trials. New York: Raven Press.
Spriet A and Simon P (1985) Methodology of Clinical Drug Trials. Basel: Karger.
Steen EB (1978) Abbreviations in Medicine, 4th edition. London: Baillière Tindall.
Vogt WP (1993) Dictionary of Statistics and Methodology. London: Sage Publications.
Winslade J and Hutchinson DR (1993) Dictionary of Clinical Research. Brookwood: Brookwood Medical Publications.


A

a posteriori after the event; generally referring to decisions made or actions taken after data or results of a study have been seen. ⇔ a priori. ✧ Bayes’ theorem, posterior distribution
a priori before the event; generally referring to decisions made or beliefs held before data or results of a study have been seen. Such decisions or beliefs may be based on data from previous studies or subjective feeling based on informal clinical experience. ⇔ a posteriori. ✧ Bayes’ theorem, prior distribution
Abbé plot ☞ L’Abbé plot
abscissa ☞ x axis. ⇔ ordinate (or y axis)
absolute change the numerical difference between two numbers as in, for example, change from baseline. ⇔ relative change
absolute frequency the number of items or the number of occurrences of a specified event. Often abbreviated simply to frequency. ⇔ relative frequency
absolute risk the number of events (deaths, adverse reactions, etc.) divided by the number of individuals who could have experienced the event. ⇔ relative risk
absolute value a numerical value that ignores any positive or negative sign; for example, the absolute value of +3 is +3; the absolute value of −3 is also +3
absorption the process by which drug enters the blood stream. ⇔ clearance, elimination
absorption study a study that measures the time taken for drug to be absorbed into the blood stream
accelerated failure time model a statistical model used in survival analysis that assumes the effect of one treatment is to multiply the median survival time for patients randomised to one treatment group relative to that of patients randomised to another treatment group. ⇔ Cox’s proportional hazards model
acceptance error the error of accepting a statement (usually a null hypothesis) when that statement or hypothesis is false. ⇔ rejection error. ✧ producer’s risk, Type II error
acceptance region the values of a test statistic (for example, calculated values of t in a t test or of chi-squared in a chi-squared test) that lead to accepting the null hypothesis. ⇔ rejection region. ✧ critical value
accountability taking responsibility for one’s own actions
accrue to gather or accumulate (often with respect to patients, data or information)
accumulate to collect more and more (patients, data, information, etc.) over time
accumulating data when more and more data are available as time progresses. Usually used in the context of sequential analysis or group sequential analysis
accuracy nearness of an observed value to its true value (even if the true value may never be known). Also used with respect to a measurement process to describe how closely that process measures the true quantity. ⇔ precision
accurate close to the true value. ⇔ precise
active control a comparator group in a study that receives an active treatment. ⇔ placebo control
active control equivalence study (ACES) a study designed to show therapeutic equivalence between two active products
active ingredient the pharmacologically or biologically active parts of a product (the tablet, capsule, etc.). ✧ formulation, presentation
active treatment generally means a noninert pharmacological product or biological substance (not a placebo). The term is also sometimes used to describe the treatment of primary interest, rather than a comparator (but still active) treatment
actuarial method ☞ life table analysis
acute rapid onset and short lasting. A disease may be acute (for example chicken pox) as opposed to chronic (for example diabetes). Sometimes the term is used to describe part of a study that is used to treat the disease of interest, in contrast to a long term follow-up period looking for relapse or long term drug safety.
Such a short term part of a study is sometimes called the acute phase of the study. ⇔ chronic
acute episode short term appearance of symptoms of an underlying chronic (long lasting) illness. For example, bronchitis may be a chronic illness with acute episodes
acute phase see acute. ⇔ follow-up period
acute study short term study (usually of a long lasting disease). ⇔ chronic study
acute toxicity study a study to investigate the short term toxicity of a product, usually a single dose of a drug. ⇔ repeated dose toxicity study. ✧ reproductive and developmental toxicity study
ad hoc one off. Something unique to a particular problem
adaptive design study procedures that change as the study progresses. Most often refers to the details of the randomisation process changing as the study progresses and results become known. Such designs are used so that, if it appears that one treatment is emerging as superior to another, the allocation ratio can be biased in favour of the treatment that seems to be best. ✧ dynamic allocation
adaptive inference conclusions that can be made as data and information accumulate. Although this seems obvious, in many studies conclusions are drawn only once at the end of the study; adaptive inference may draw conclusions as the study progresses
adaptive randomisation ☞ adaptive design
adaptive treatment assignment ☞ adaptive design
additive model a statistical model where the combined effect of separate variables is the sum of their separate effects. ⇔ multiplicative model. ✧ interaction
adequate and well controlled a term describing a study that is sufficiently large, properly randomised, and blinded
adjust to modify (usually the estimate of a treatment effect) to account for differences in patient characteristics between treatment groups. ✧ adjusted estimate
adjusted estimate an estimate of a parameter as would have been observed at some specified value of another variable. For example, high blood pressure (and its treatment) may be related to age and so we may wish to estimate the effect of a drug on people of different ages. ✧ analysis of covariance
adjuvant therapy extra treatment given to enhance the effect of a monotherapy. For example, sensitising drugs to enhance the effect of radiotherapy
administer to give (in the sense of giving treatment)
administrative review a review of (usually accumulating) study data where the purpose is to monitor practical aspects of the study’s progress (such as recruitment rates, shipment of laboratory samples, etc.). ⇔ interim analysis
admission criteria ☞ inclusion criteria
adverse drug experience ☞ adverse event
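To make the additive model entry above more concrete, the following minimal Python sketch contrasts an additive combination of separate effects with a multiplicative one. The function names, the baseline value and the effect sizes are all illustrative assumptions and are not taken from the dictionary.

```python
def additive_response(baseline, effects):
    """Combined response when each variable adds its own separate effect."""
    return baseline + sum(effects)


def multiplicative_response(baseline, factors):
    """Combined response when each variable scales the response by a factor."""
    result = baseline
    for factor in factors:
        result *= factor
    return result


# Hypothetical figures: a baseline of 100 units, with a treatment effect of -10
# and an age effect of +5 (additive), or factors of 0.9 and 1.05 (multiplicative).
additive = additive_response(100.0, [-10.0, 5.0])
multiplicative = multiplicative_response(100.0, [0.9, 1.05])
print(additive)
print(round(multiplicative, 2))
```

Under the additive form the two effects simply sum, whereas under the multiplicative form the size of one effect depends on the level already reached through the other.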

adverse drug reaction ☞ adverse reaction
adverse drug reaction on-line information tracking (ADROIT) a database kept of adverse reactions to marketed products
adverse event any (usually) unwanted effect that a subject experiences whilst taking a drug. Note that causality is not implied. ⇔ adverse reaction
adverse experience ☞ adverse event
adverse reaction see adverse event but note that causality to a particular drug is implied
adverse treatment effect ☞ adverse reaction
advocate to support a given argument, opinion, or point of view
aetiology the cause of a disease or the study of disease causality
agency ☞ regulatory authority
aggregate to combine separate data values into groups of aggregate data
aggregate data data that have been grouped in categories. For example, all ages of patients in the range 0 to 5 put into one category, ages 6 to 12 in another category, etc.
agonist a drug that enhances or activates the effect of a natural body chemical or of another drug. ⇔ antagonist
algorithm a written description of a mathematical equation or decision rule. It is usually written partially in words (although not necessarily in complete and proper sentences) rather than just a set of mathematical expressions
all patients treated analysis ☞ intention-to-treat analysis
all patients treated population ☞ intention-to-treat population
all subsets regression a method of deciding which variables should be in a regression model. ✧ backward elimination, forward selection
allocate to assign (typically a treatment to a patient) either by randomisation or by some deterministic method
allocation ratio in a parallel group study the ratio of the number of patients allocated to one treatment group relative to the number allocated to another treatment group. Most often, the ratio is 1:1, or equal allocation
alpha (α) the probability of making a Type I error. ⇔ beta (β). ✧ significance test
alpha error ☞ Type I error
alpha spending function a method in sequential studies such that the times when interim analyses are performed do not need to be specified in advance. The number of, and timing of, interim analyses can be flexible
alphanumeric data that may be alphabetical (a, b, c, . . . , A, B, C, . . . , including special symbols such as +, £, %) or numeric (0, 1, 2, . . . , 9)
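The aggregate data entry above can be illustrated with a short Python sketch that groups individual ages into categories and counts the patients in each. The first two bands follow the entry's own example (0 to 5 and 6 to 12); the remaining bands, the function names and the sample ages are assumptions added purely for illustration, and ages are assumed to be whole years.

```python
from collections import Counter

# The first two bands come from the entry's example; the others are
# hypothetical and chosen only to complete the illustration.
AGE_BANDS = [(0, 5), (6, 12), (13, 17), (18, 64)]


def age_band(age):
    """Return a label such as '6-12' for the band containing a whole-year age."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    return "65+"


def aggregate_ages(ages):
    """Count how many patients fall into each age band."""
    return Counter(age_band(age) for age in ages)


ages = [3, 5, 7, 12, 12, 30, 70]  # hypothetical patient ages
counts = aggregate_ages(ages)
print(dict(counts))
```

Once aggregated, the individual ages can no longer be recovered from the counts, which is the practical difference between aggregate data and the original data values.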

alternate allocation a method of assigning treatments to patients whereby the first patient receives Treatment A, the second receives Treatment B, the third Treatment A, the fourth Treatment B and so on in a predictable (alternating) manner. ⇔ random allocation
alternative hypothesis (H1) this is usually the point of interest in a study. It is generally phrased in terms of the null hypothesis (of no treatment effect) not being true. If the objective of a study is to ‘compare Drug A with placebo’ then the null hypothesis would be that there is no difference between the two groups and the alternative hypothesis would be that there is a difference
alternative medicine approaches to medicine such as homeopathy, acupuncture, herbal medicines, etc., considered by many people to be nonconventional medicines
altruism putting the interests of the individual first; specifically in clinical trials, putting the interests of the individual before those of the research project. ✧ collective ethics, individual ethics
amendment ☞ protocol amendment
ampoule ☞ vial
analysis the process of summarising data or problems, describing them clearly (including plotting data) and drawing conclusions
analysis by administered treatment a strategy where data are summarised and conclusions drawn based on the treatment that patients were actually given. ⇔ analysis by randomised treatment
analysis by assigned treatment ☞ analysis by randomised treatment
analysis by randomised treatment a strategy where data are summarised and conclusions drawn based on the treatment that patients were supposed to be given (the treatment they were randomised to receive), regardless of what they actually took. It is very similar to the term intention-to-treat. ⇔ analysis by administered treatment
analysis of covariance (ANCOVA) a statistical analysis method that is an extension of analysis of variance. It allows estimates of treatment effects to be adjusted for possible covariates as well as factors
analysis of variance (ANOVA) a statistical analysis method that allows comparison of two or more treatment groups and estimates of treatment effects to be adjusted for other possible factors such as race, gender, treatment centre, etc. It is a very general method covering a very broad range of techniques and can be used in a great variety of situations. Because of this, to describe a method of analysis as being ‘analysis of variance’ is rarely sufficient to adequately describe what analysis has actually been carried out
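The contrast drawn in the alternate allocation entry above can be sketched in a few lines of Python. The function names and the use of simple (unrestricted) randomisation for the random case are assumptions made for illustration; the dictionary does not prescribe a particular random allocation method here.

```python
import random


def alternate_allocation(n_patients, treatments=("A", "B")):
    """Assign treatments in a fixed, predictable rotation: A, B, A, B, ..."""
    return [treatments[i % len(treatments)] for i in range(n_patients)]


def random_allocation(n_patients, treatments=("A", "B"), seed=None):
    """Assign each patient a treatment chosen independently at random."""
    rng = random.Random(seed)
    return [rng.choice(treatments) for _ in range(n_patients)]


alternating = alternate_allocation(6)
randomised = random_allocation(6, seed=2024)  # the seed value is arbitrary
print(alternating)  # the next assignment can always be predicted
print(randomised)   # the next assignment cannot be predicted from earlier ones
```

With alternate allocation anyone who knows the previous assignment knows the next one, which is why the entry describes it as predictable.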

analysis policy ☞ analysis strategy
analysis population the set (often subset) of patients recruited to a study who are subsequently included in the data analysis. Examples are the all patients treated population, per protocol population
analysis strategy this combines the decision whether to use an all patients treated analysis, an intention-to-treat analysis, a per protocol analysis, or some other policy and considerations such as whether to use, for example, parametric methods or nonparametric methods, Bayesian inference or frequentist inference
anatomical therapeutic chemical classification system (ATC) a drug coding system that codes according to a drug’s site of action and its indication
and/or a badly used term that often causes confusion, particularly over whether the word ‘or’ is considered as inclusive or exclusive. For example, if there are two events P and Q, one option is that both P and Q may occur; another option is that P or Q (but not both) may occur (this is the ‘exclusive or’); finally P or Q (or both) may occur (this is the ‘inclusive or’). If ‘or’ is considered inclusive then the term ‘and/or’ is redundant: ‘P or Q’ includes ‘P and Q’; if ‘or’ is considered exclusive then the term may have some use. It is probably better to use a few more words and explain what is intended
anecdotal evidence unsubstantiated evidence that cannot be strongly relied on. It is usually considered as more informed than mere opinion and often used as a means of generating ideas and research questions
aneugen a substance that causes toxic effects on DNA. ✧ clastogen
angular transformation a transformation applied to data that are of the form of proportions to allow use of statistical methods based on the Normal distribution. Where the proportion is p, the transformation is $y = \arcsin(\sqrt{p})$. ✧ logistic function, probit transformation
animal model results from experiments in (nonhuman) animals, used to extrapolate results to humans
animal study a study carried out in (nonhuman) animals. ✧ preclinical study
antagonist a drug that prevents or reverses the effect of a natural body chemical or of another drug. ⇔ agonist
antedependence model a statistical method for analysing a series of repeated measurements on the same individuals. The method describes the data based (partly) on earlier measurements
applicable regulatory requirements requirements of a regulatory authority that are either general to all studies or apply specifically to the experimental or geographical circumstances relevant to a particular study
approval the process of an individual or group of individuals with appropriate authority agreeing to a request. This may take the form of approving a protocol, a submission to a research ethics committee, a submission to a regulatory authority, etc.
approximate close to the true value. Note that the ‘true’ value may not be known and the interpretation of ‘close’ may vary from one situation to another, so this is a rather vague term
approximation a method of estimating a parameter that gives an approximate answer
archive to keep a historical record in secure conditions to confirm the data obtained and the procedures that were followed during the course of a study. ✧ backup
arcsin transformation ☞ angular transformation
area under the curve a summary measure of data that have been collected repeatedly over time. The data are plotted with time on the x axis and the measurement on the y axis. The area is that between the line connecting the data points and the x axis (Figure 1)
arithmetic mean ☞ mean
arm synonym for group (as in randomised group)
artefact an aspect of data that is not substantiated in other data sets and is not a real effect
ascending order data sorted so that the smallest value comes first, the larger values later and the largest value last. This can be applied to alphanumeric data (by sorting into alphabetic order with special rules for including numbers and special symbols) as well as numeric data. ⇔ descending order. ✧ ranked data
ascertainment bias bias caused by the manner in which data are collected.
For example, surveying the general incidence of health problems near a doctor’s surgery would probably lead to an unreasonably high proportion of respondents indicating less than perfect health; in contrast, surveying near a health club might lead to an unreasonably low proportion of respondents with impaired health
ASCII a standard set of alphanumeric characters that is widely transferable between different computers. It stands for American Standard Code for Information Interchange
assay a procedure to measure the quantity of a chemical (usually drug) in a sample (usually of blood or urine)
assent agreement to something in a passive way and not after thorough consideration of the advantages and disadvantages. Note that clinical trials usually need subjects to consent to take part, not just assent

Figure 1 Area under the curve. Plot of serum concentration of drug on ten occasions up to 10 hours after administration. The area under the curve is shaded. Other features to note are the peak concentration (Cmax) of 3.2 mg/ml, reached at 1.75 hours (Tmax)

assessment measurement of the state of disease. This may be a measurement of blood pressure, severity of depression, quality of life, etc.
assign ☞ allocate
assigned treatment the treatment that a patient is due to receive based on a randomisation (or other) method
associate an assistant (often in the sense of a subinvestigator)
associate investigator ☞ subinvestigator
association a means by which two items are linked. For example, there is a link (or association) between smoking and lung cancer. ✧ correlation
assumption a state (often a feature of data) that is taken as true although there may not be sufficient evidence to guarantee that state. A common assumption is that data come from a Normal distribution
asymmetric not symmetric, as in not evenly split around the middle. The term is often used about distributions of data that are skewed
asymptote a value that is never achieved but that is approached more and more closely. For example, repeatedly dividing a number by two will get closer and closer to zero but will never actually attain that value: in this case, zero is the asymptote
asymptotic method a statistical method that assumes there is a large sample of data and which may not be suitable with small samples
atopy indicates that an allergic disease such as asthma, eczema, etc. is hereditary rather than being a spontaneous new case
attenuation making extreme results or statements less extreme and more typical of the norm
attributable risk ☞ risk difference
attribute characteristic or feature (usually of a patient). All variables (age, sex, pulse, serum calcium, etc.) are attributes
attrition loss; often used to describe loss of patients’ data in long term studies due to patients withdrawing for reasons other than those of meeting the study’s primary endpoint
audit a systematic review of data and operational details or study procedures
audit certificate a certificate to confirm that a study has been audited
audit report a report (written or verbal) describing the findings of an audit.
Such findings are usually restricted to points that do not meet expected standards of quality or completeness (rather than all aspects that do meet the expected standards)
audit trail a list of reasons and justifications for all changes that are made to data or of all procedures that do not comply with agreed study procedures
auditor a person responsible for carrying out an audit
autocorrelation correlation between repeated measurements taken successively in time from the same subject
autoencoding an automatic (usually by computer) method of assigning codes to data, for example codes for drugs or adverse events
autoregressive a description of a process that produces data collected sequentially in time when each data point is potentially related (or correlated) with the previous one(s)
average informal term for the mean
average absolute deviation the average (mean) amount by which a set of data values differ from some reference value (usually that reference value being the mean). The differences ignore the sign (plus or minus). So, for example, the average absolute deviation of the numbers 1, 2 and 4 (whose mean is 2.33) is (|1 − 2.33| + |2 − 2.33| + |4 − 2.33|) ÷ 3 = 3.33 ÷ 3 = 1.11. G standard deviation
average deviation see average absolute deviation
axis scale (x axis or y axis) on a graph
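The arithmetic in the average absolute deviation entry can be checked with a short sketch. The function name and its optional `reference` argument are illustrative choices, not part of the dictionary; when no reference value is supplied, the mean of the data is assumed.

```python
def average_absolute_deviation(values, reference=None):
    """Mean of the absolute differences between each value and a reference.

    If no reference value is given, the mean of the values is used
    (an assumption that matches the usual case described in the entry).
    """
    if reference is None:
        reference = sum(values) / len(values)
    # The sign of each difference is ignored by taking the absolute value.
    return sum(abs(v - reference) for v in values) / len(values)


aad = average_absolute_deviation([1, 2, 4])
print(round(aad, 2))  # 1.11
```

Passing a different reference value, for example `average_absolute_deviation([1, 2, 4], reference=2)`, gives the deviation about that value instead of about the mean.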


B backup a reserve (often used in the sense of a reserve copy of data) kept under secure conditions in case of loss or corruption of the original. A more readily available and less permanent version of an archive backward elimination a method of ﬁnding which variables should be kept in a regression model by including all possible variables and then removing (‘eliminating’) those that are deemed not useful. Û forward selection, all subsets regression backward stepwise regression backward elimination bacterium single-celled microscopic organism; the cause of many diseases Balaam’s design a type of crossover design where patients are randomly assigned to receive treatments A and B in one of the treatment sequences AA, BB, AB or BA balance the state of being equal, usually with reference to the number of subjects in each treatment group. G balanced design balanced block part of an experiment (one block of it) such that within that block, the eﬀect of each treatment is estimated with equal precision balanced design an experiment in which the eﬀect of each of the treatments is assessed with equal precision, usually by having the same number of subjects in each treatment group. Note that, in crossover designs, balance refers to there being as many treatment sequences AB as there are BA balanced incomplete block design an experiment in which not all treatments being compared are represented in every block but where, overall, the occurrence of each treatment across all the blocks is the same (or balanced) balanced randomisation a randomisation method which ensures that the eﬀect of each treatment is estimated equally precisely, usually by assigning the same number of patients to each treatment group. 
Û unequal randomisation
balanced study a study that has used balanced randomisation
bandit design see adaptive design


Figure 2 Bar chart. The number of patients who fall into each of the ﬁve categories is represented by the height of each bar bar chart a graphical method of showing the number of subjects that fall into each of two or more categories. The height of each ‘bar’ is proportional to the number of subjects within that category (Figure 2). G histogram bar diagram bar chart Bartlett’s test a method of testing the null hypothesis that several variances, each estimated from diﬀerent groups of subjects, are all equal baseline the moment in time that subjects are randomised or otherwise assigned their study medication. It is also used to refer to periods of time after a study has started but before randomisation has occurred baseline characteristic a measurement taken on a subject at the beginning of a study. Note that ‘beginning’ is generally taken to be at, or as near as possible to, the time of randomisation. G demographic data baseline comparability the process of, and results of, deciding if groups of patients assigned to diﬀerent treatment groups (usually by randomisation) 12


are similar with respect to demographic data and severity of disease baseline data baseline characteristic baseline hazard function in survival analysis, the hazard function for a subject in the control group (or a group arbitrarily chosen to be a control group) baseline testing see baseline comparability baseline visit usually the very ﬁrst visit that subjects attend in a study. If randomisation does not occur at visit 1 then baseline visit may be used to refer to any visit before (and including) the randomisation visit BASIC a computer programming language. G C, C;;, Fortran, Visual Basic Baskerville design a method for ﬁnding the most preferred of several treatments. Each subject is randomly assigned to a sequence of treatments but the length of time each patient receives each treatment is dependent on their own personal choice. If a subject is completely satisﬁed with the ﬁrst treatment they receive then they would not change and would not receive any of the other treatments. In contrast, if a patient is not happy with any of the treatments being compared they would quickly pass through the entire possible set of treatments and ﬁnish the study batch process to work on a large number of documents all at once, rather than to handle each document as it arrives. This is a common term in data management but applies to computerised systems as well as manual systems batch validation to validate a large number of documents (usually data) as a batch process baud rate the speed at which data are transmitted electronically, measured as the number of binary digits sent per second. A baud rate of 32 000 means 32 000 binary digits sent per second Bayes factor the ratio of the posterior belief to the prior belief. This can be seen as a measure of how the strength of evidence in favour of a given hypothesis has increased, given new data, relative to the prior belief. 
G Bayes’ theorem Bayes’ rule the action one takes that gives the maximum utility Bayes’ theorem the process of making judgements about the outcome of a study before the data are analysed (assigning prior beliefs), then combining these with the observed data (in the form of the likelihood) to obtain new posterior beliefs Bayesian general statistical methods based around Bayes’ theorem Bayesian inference a method of statistical inference based on Bayes’ 13


theorem as opposed to being based on classical statistical inference or frequentist inference before–after design a study in which subjects are observed before treatment is given and their disease state and severity is recorded. These subjects are then given treatment and subsequently their disease state and severity are reassessed. G crossover design Behrens–Fisher problem the problem of using a statistical signiﬁcance test to compare two means when their variances are not equal. Behrens and Fisher originally discussed the problem. Note that the usual t test assumes that the variances in the two groups are equal. It is a long standing mathematical and philosophical issue; hence being referred to as the Behrens—Fisher ‘problem’ rather than the Behrens—Fisher ‘method’ bell shaped used to describe a distribution that when drawn as a histogram or density function, has the same proﬁle as a bell. The Normal distribution (see Figure 22) is the most common example but the term should not be used exclusively for that purpose benchmarking the process of comparing activities (usually performance measures) against a standard reference value or in the absence of a standard, then against other methods to achieve the same outcome. Examples commonly include the costs of running studies between diﬀerent companies; speed of recruitment into studies in diﬀerent therapeutic areas, etc. beneﬁcial eﬀect a therapeutic eﬀect of a drug that is considered to be advantageous to the patient. It is usually meant to imply alleviating symptoms of the disease under study but is not limited to that. If a topical treatment were intended to alleviate symptoms of rash on the scalp and it appeared to reverse the eﬀects of alopecia then the eﬀect on alopecia would be considered a beneﬁcial eﬀect. Û adverse event beneﬁt a nontechnical term referring to advantage (of one treatment or activity over another). 
It may be measured in a variety of ways including decreased cost, increased patient satisfaction, reduced length of hospital inpatient stay, extended life expectancy benign a condition that does not produce any harmful eﬀects Berkson’s fallacy drawing wrong conclusions (usually in case-control studies) because of selection bias Bernoulli distribution the probability distribution of a binary variable best case analysis the process of making assumptions, often about data that are missing (either inadvertently or because they could not be measured), when the implications of those assumptions are that a treatment may appear to give more beneﬁt than is truly justiﬁed. 14


Û worst case analysis. G sensitivity analysis
best fit used in the context of regression and fitting lines (straight or curved) to data on graphs. The best fitting line is generally the one that has the data points closest to the regression line (although various other criteria for ‘best’ may be specified)
best linear unbiased estimator an unbiased linear estimator that is better (usually in the sense of having a smaller variance) than any other possible unbiased linear estimator
beta (β) the probability of making a Type II error. Û alpha (α). G significance test
beta coefficient see regression coefficient
beta error see Type II error
beta level the probability of making a Type II error
between groups usually used in the sense of estimating the variation (strictly speaking the variance) of data where we are describing the variation between the means of two or more groups of subjects. Û between subjects, within groups
between groups sum of squares a measure of variability (by the method of sum of squares) between different groups (treatment groups, strata, etc.) in a study. Û within groups sum of squares
between groups variance see between groups
between groups variation an informal term for the between groups variance
between person see between subjects
between study in meta-analyses this is used to describe the variation that is due to differences between studies rather than differences between subjects within each study
between study variance see between study. This makes it clearer that it is the variance (or variation) that is being considered
between study variation a less formal term for between study variance
between subjects usually used in the sense of estimating the variation (strictly speaking the variance) of data where we are describing the variation between individual subjects.
Û between groups, within subjects between subjects comparison the types of analyses that are made in parallel group studies, that are unpaired comparisons, rather than paired comparisons between subjects comparison between subjects eﬀect between subjects study parallel group study between subjects sum of squares a measure of variability (by the method of sum of squares) between diﬀerent subjects in a study. Û within subjects sum of squares 15


between subjects variance see between subjects between subjects variation an informal term for the between subjects variance between treatments between groups bias a process which systematically overestimates or underestimates a parameter. Bias is sometimes, but not always, acceptable: for example, we routinely underestimate peoples’ ages by an average of 6 months if we record data only to the lowest whole year. G precision biased coin a method of randomisation that does not assign patients to treatments with equal probabilities. Û balanced design. G unequal allocation biased estimator a method of estimation of a parameter from data that gives a biased result bibliography a list of published books, manuscripts, etc. that discuss a particular topic bimodal having two modes bimodal distribution a distribution (either a probability distribution or a frequency distribution) that has two modes or peaks binary data data taking only one of two values: typical examples are data of the form Yes/No, Dead/Alive, Male/Female. Sometimes a third category of ‘not known’ or ‘missing’ is included but the data are still said to be binary. G categorical data binary outcome an outcome that can take only one of two values; one that yields binary data binary variable a variable that can take only one of two values; one that yields binary data binomial data binary data binomial distribution in data that are binary (yielding only ‘positive’ or ‘negative’ outcomes), the probability distribution of the number of positives is a binomial distribution. For example, the number of live births (as opposed to still births) out of the ﬁrst one hundred deliveries in a maternity unit follows a binomial distribution bioassay estimation of the potency of a drug by observing its eﬀect on a biological organism bioavailability at any time, the proportion of drug within the body that is available to give a therapeutic eﬀect biochemistry the study of chemistry in living things. 
Usually used in the context of laboratory data to refer to the amount of various chemicals (for example albumin, calcium, ethanol) in the blood. Û haematology bioequivalent two products that have the same bioavailability are said to be bioequivalent 16
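The binomial distribution entry above can be illustrated numerically. This is only a sketch: the function name is illustrative, and the probability of a live birth (0.99) is an assumed value chosen for the example, not a figure from the dictionary.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k positives among n independent binary outcomes,
    each with probability p of being positive."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical values: 100 deliveries, assumed probability of a live birth 0.99.
n, p = 100, 0.99
prob_all_live = binomial_pmf(100, n, p)
prob_at_least_98 = sum(binomial_pmf(k, n, p) for k in range(98, n + 1))
print(round(prob_all_live, 3), round(prob_at_least_98, 3))
```

Summing `binomial_pmf(k, n, p)` over every k from 0 to n gives 1, which is a quick check that the values form a probability distribution.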


biologic a drug derived from a biological product. G biotechnology. Û pharmaceutical, phytomedicine biological assay bioassay biological marker a nonclinical (often a laboratory) measurement that is an indicator (or ‘marker’) of a clinical condition biological plausibility a hypothesis that is justiﬁable from biological theory and not just based on observable data biometrician a person who specialises in biological (including medical, genetic, agricultural) applications of statistics biometry literally ‘measurement in biology’. More generally, the application of statistical theory and methods in the biological sciences biopharmaceutical the subset of biology related to pharmacology. Often the term is used synonymously with pharmaceutical biostatistician biometrician biostatistics the application of statistical theory and methods in the biological sciences biotechnology the process of developing drugs from biological products (such drugs are then called biologics) bivariate the joint measurement and consideration of two characteristics (for example a person’s ‘size’ would often be measured in terms of their height and weight). G multivariate bivariate analysis special methods of analysis suitable for bivariate data. These are usually simpliﬁcations of general methods of multivariate analysis bivariate data measurements that consist of two response variables. For example, a person’s blood pressure could be measured as both systolic pressure and diastolic pressure. More than two variables are always referred to as multivariate data bivariate distribution the joint distribution of two separate (but often correlated or related) measurements. Û univariate distribution. G multivariate distribution black box a process whose internal workings are unknown (at least to the user) but whose output is usually trusted. Computers, for example, are black boxes to most people blind not being able to see. 
Speciﬁcally, within clinical trials, where the investigator, subject (and possibly other people) are not able to distinguish diﬀerent treatments that are being compared (by sight, smell, taste, weight, etc.) G single blind, double blind, triple blind, quadruple blind blinding the process of keeping hidden certain information about data or 17


study procedures in order to help avoid bias. Most commonly this means keeping the treatment allocation hidden from the doctors and patients (and often data management staff) taking part in a study
block several packs of medication kept together and used sequentially, each block usually having the same number of treatments (although in random order) as each other block. The concept can be extended to cases when treatment ‘packs’ do not actually exist. Commonly, if a study is comparing two treatments each ‘block’ might contain medication for four patients, two on one treatment and two on the other. G block size
block effect any systematic difference in response that may exist between blocks of treatment medication. Such differences do not invalidate the study; the purpose of blocking is to ensure that if such differences exist, the treatment allocation is equal across blocks
block size the number of packs of treatment that form one complete block
blocked randomisation a randomisation scheme that uses blocks to help maintain balance. Û completely randomised design
blocking the act of using blocks of treatment
body-mass index see Quetelet’s index
Bonferroni correction an adjustment made when interpreting multiple significance tests that all address a similar basic question. If two endpoints have been assessed separately, instead of considering whether a P-value is less than (or greater than) 0.05, each calculated P-value should be compared to 0.025. In general, if k P-values have been calculated, the declaration of statistical significance should not be made unless one or more of those P-values is less than 0.05/k
Boolean logic rules for making decisions based on combining binary outcomes using the key words AND, OR and NOT.
For example, subjects may be eligible for a study ‘IF (they are male) OR ((they are female) AND (they are using adequate contraception))’ bootstrap a simulation method used for statistical signiﬁcance testing and estimation that takes as possible (simulated) sample data values, only those data values that have actually been observed. G Monte Carlo method box and whisker plot a diagram used to show a few key features of a frequency distribution, namely the minimum, lower quartile, median, upper quartile and maximum (Figure 3). G Exploratory Data Analysis box plot box and whisker plot
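The rule described under Bonferroni correction (compare each of k P-values with 0.05/k) can be sketched as follows. The function name and the two P-values are hypothetical, chosen only to illustrate the two-endpoint case mentioned in that entry.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Compare each P-value with alpha divided by the number of tests.

    Returns the adjusted threshold and a list of booleans saying which
    P-values fall below it.
    """
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]


# Two hypothetical endpoints, so each P-value is compared with 0.05 / 2.
threshold, flags = bonferroni_significant([0.030, 0.012])
print(threshold, flags)  # 0.025 [False, True]
```

In this sketch the first P-value (0.030) would have been declared significant against the unadjusted 0.05 but is not against the adjusted threshold.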


Figure 3 Box and whisker plot. Distribution of the number of years a group of 66 patients had suﬀered from eczema. The key features illustrated are the minimum (1 year), the lower quartile (4 years), the median (15 years), the upper quartile (28 years) and the maximum (55 years) Box–Cox transformation a very general equation used to transform a set of data so that it better resembles a Normal distribution. The method was developed by the two statisticians, Box and Cox; hence the name branch on a decision tree (Figure 6), any of the possible routes that can be followed. G node brand name trade name break point used to describe a regression line that is not a continuous smooth function across the whole range of data but is made up of diﬀerent lines (often only two). The point at which the two lines (with diﬀerent slopes) meet is called the break point bridging study a study designed to extend the applicability of a conﬁrmatory study, usually to broaden the population to which the results apply. Bridging studies are usually much smaller than other conﬁrmatory studies byte a single character (one letter or a single digit in a number) as stored by a computer
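The Box–Cox transformation entry above does not give the equation itself. A minimal sketch, assuming the commonly cited one-parameter form (y^λ − 1)/λ for λ ≠ 0 and log y for λ = 0, applied to positive data, is shown here; the function name and example values are illustrative.

```python
import math


def box_cox(y, lam):
    """One-parameter Box-Cox transform of a single positive value.

    Assumes the commonly cited form: (y**lam - 1) / lam when lam is not
    zero, and the natural logarithm of y when lam is zero.
    """
    if y <= 0:
        raise ValueError("the transform is defined here for positive values only")
    if lam == 0:
        return math.log(y)
    return (y**lam - 1) / lam


# Hypothetical skewed data transformed with lam = 0.5 (a square-root-like transform).
data = [1, 4, 9, 16]
print([box_cox(y, 0.5) for y in data])  # [0.0, 2.0, 4.0, 6.0]
```

In practice λ is usually chosen from the data (for example by maximum likelihood) rather than fixed in advance; that step is not shown.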


C
C a widely used, high level, computer programming language. There are other programming languages that are commonly used in clinical trials work such as BASIC, C++, Fortran, Visual Basic and there are specific statistical analysis programs, for example BMDP®, SAS®, SPSS®, STATA®
C++ a programming language developed as an extension of the C programming language
calibrate to check measurements against a known standard
capsule a dissolvable container (with an enteric coating) that contains a drug. G other delivery devices such as tablet, transdermal patch
carcinogen a chemical that causes any type of cancer
carcinogenicity the potential to cause any type of cancer
carcinogenicity study a study to determine if a chemical is a carcinogen
carryover a term used mostly in the context of crossover studies where the effect of a drug is still present after that drug has ceased to be given to a subject, and in particular when that subject is taking another drug
Cartesian coordinate the place (in terms of x axis and y axis) where a data point lies on a graph. For example, if a subject’s systolic blood pressure is 120 mmHg and their diastolic blood pressure is 85 mmHg the Cartesian coordinates would be (120, 85)
case a term used synonymously with patient, although often intended to mean one with a particular identified disease. Û control. G case-control study
case history the description (usually the medical history) of an individual case
case record form (CRF) the term used for the paper on which data are written. Often a CRF comes in the form of a book with many pages of forms to record a subject’s data
case report form (CRF) see case record form
case-control study a type of study used for evaluating the causes of a particular disease.
A group of patients with the disease (the cases) are compared with another group of subjects who do not have the disease


(the controls). Their lifestyles, previous exposure to potential hazards, demographics, etc., are compared to try to distinguish which of those features predispose someone to have the disease in question case-fatality rate the death rate amongst a group of cases catchment area the geographical area from which subjects may be included in a study. The area covered by a Health Authority or in which a survey was being carried out would be termed the ‘catchment area’ categorical data data that are not pure measurements but are in the form of labels assigned, such as ‘male’ and ‘female’. G ordered categorical data categorical scale the scale on which categorical data are measured categorical variable a characteristic of a subject that results in categorical data. For example a subject’s gender is a categorical variable: it falls into the categories ‘male’ or ‘female’ categorise the process of taking data that may take many distinct values and putting them into categories. For example all the people living in a group of post codes may be classed (or categorised) as being in one town category a group (but used when the term is applied to data). Categories of blood group are A, AB, O; categories of products might be ‘prescription only’ or ‘over the counter’ causal relationship a relationship that is observed when one variable is a consequence of another. For example, alcohol intake and impaired reaction times are causally related. G correlation causality the act of causing. G correlation cause and eﬀect a phrase used to imply causality, over and above correlation ceiling eﬀect a term to describe an asymptote that is an upper limit. Û ﬂoor eﬀect cell when referring to a tabulation of data, each of the individual categories or subcategories of patients (or other data) are referred to as cells cell frequency the number of subjects within a cell of a table. 
For an example, see contingency table cell mean the mean of the data for all subjects within a cell of a table (Table 1) censor to prevent something being observed. G censored data censored data when the time until an event (typically cure, recurrence of symptoms or death) is the data value to be recorded and that event has not yet been observed for a particular subject, that data value is said to be censored. G truncated data censored observation censored data centile if a (large) set of observations are placed in order, the 1st centile is the value below which 1% of the data lie, the 2nd centile the value below 21

Table 1 Cross-classification of mean systolic blood pressure (mmHg). The data are cross-classified by treatment group and by centre. Each of the means (127.4 mmHg, 135.1 mmHg, etc.) are referred to as cell means

Centre    Treatment A    Treatment B
UK.001    127.4          135.1
UK.002    131.2          135.9
UK.003    122.0          127.6
UK.004    129.5          130.3
UK.005    141.3          147.7

which 2% of the data lie, etc. G lower quartile, median, upper quartile central laboratory a single laboratory that is used by all centres in a multicentre study, though it may not necessarily be ‘central’ in any geographical sense. Û local laboratory central limit theorem a statistical phenomenon such that the mean of several data values tends to follow a Normal distribution, even if the distribution of the original data was not Normal central processing unit (CPU) the part of a computer that carries out calculations central randomisation in multicentre studies it is common to use a separate randomisation list in each centre so that we stratify the randomisation by centre. Alternatively we may have a single randomisation sequence, held at one site (a ‘central’ site) and investigators would telephone (or otherwise contact that site) to obtain the next randomisation code central range the range in which the central 90% of the data from a distribution lie. G interquartile range, standard deviation central tendency a nonspeciﬁc summary of data (usually of continuous data) that, for any particular purpose, is useful in describing where the bulk of the data lie. The mean, median and mode are the most common measures of central tendency certiﬁcate of destruction an oﬃcially recognised document conﬁrming that speciﬁc batches of drug product have been safely destroyed. This would often apply to unused medication from a study certiﬁed oﬃcially recognised (strictly speaking, with a certiﬁcate). This can refer to an individual, a machine, a blood sample, etc. challenge test the administration of a product speciﬁcally to see if it produces an adverse reaction chance luck (good or bad). Events that happen by chance are ones that 22

Table 2 Individual subjects’ systolic blood pressure (mmHg) before and after treatment

Subject identification number    Before treatment (baseline)    After treatment    Change from baseline
1                                137                            134                −3
2                                120                            120                0
3                                150                            163                13
4                                118                            126                8
5                                130                            135                5
6                                130                            122                −8

could not have been predicted with certainty. They occur with probability less than one change from baseline when a measurement (for example subjects’ blood pressure) is measured at the time of randomisation (the baseline) and again after treatment and the diﬀerence calculated (as in Table 2), this diﬀerence is often used as the measure of treatment beneﬁt. It is called the ‘change from baseline’. G analysis of covariance change score change from baseline changeover design crossover design changepoint model a statistical model that attempts to identify when a smooth course of events abruptly changes. For example, height of growing children may follow a smooth curve until puberty, when a sudden change in that curve would be expected. A model that allows for this would be called a changepoint model characteristic an alternative term for data or measurement. Often (but not necessarily) restricted to demographic data and baseline data chart a general term for any form of graph, histogram, etc. check to conﬁrm that something (often data) is correct check digit a number (usually between 0 and 9) that is used as a means of checking that other numbers are correct. The last digit of the ISBN of this book is the check digit chemotherapy the use of drugs to eradicate disease or to prevent existing disease from spreading by killing cells that are otherwise dividing and multiplying. G cytotoxic chi-squared (2) chi-squared statistic chi-squared distribution a probability distribution used in a wide variety of forms of data analysis. Most often in clinical trials it is used for 23


comparing the equality of proportions in contingency tables. However, its use is not restricted to this case chi-squared goodness of ﬁt test chi-squared test chi-squared statistic the calculated value of chi-squared from a set of data chi-squared test a statistical signiﬁcance test, the most simple of applications being for testing the null hypothesis that two (or more) proportions are equal chronic long term. Û acute chronic study a study of the long term treatment of a disease. Û acute study chronobiology the study of how biological features change with time. Biological example of time series methods chronotrophic eﬀect the eﬀect of a drug on the force of the heart beating. G inotropic eﬀect CIOMS form a standard template form for reporting adverse events to regulatory authorities. CIOMS stands for Council for International Organisations of Medical Sciences circadian rhythm a biological process that repeats itself in 24-hour cycles citation the reference in a published paper, book, etc., to another previously published piece of work class category class interval when continuous data are categorised into groups (for example, age groups) the class intervals are the number of years grouped into each category. They may, for example, be ten-year age groups of 0—9, 10—19, 20—29, etc. It is a term generally used when all classes have the same interval but this need not necessarily be the case class limits when continuous data (for example age) are categorised into groups (age groups) the class limits are the values that deﬁne at what values each group starts and ﬁnishes. The age groups may be 0—15 years, 16—64 years and 65—75 years; these values would be the class limits. Note that there is no need (explicitly or implicitly) for the class interval to be the same for every class classical statistical inference statistical methods that rely heavily on signiﬁcance testing and calculating conﬁdence intervals. 
Û Bayesian inference classiﬁcation variable a variable that is used to assign patients into groups (for example blood group, ethnic origin, or a continuous variable such as blood pressure that has been categorised) classify to assign a subject to a group (or category), based on data clastogen a substance causing damage to genetic material. G aneugen clean data data that contain no errors. Often data that are believed to 24


contain no errors are referred to as ‘clean’ but there is an assumption that may not be valid. Û dirty data clearance the rate of elimination of a drug from the body as a proportion of the amount of drug in the body. Û absorption clinic a medical centre where people are cared for clinical the branch of medicine dealing with patients. The practical application of medicine rather than medicine as a pure subject. Û medical clinical ethics ethical considerations and behaviour concerned with treating an individual subject. Û research ethics clinical investigation any form of investigating a patient or analysing a sample from a patient to help determine a diagnosis clinical investigational brochure a document describing the full extent of knowledge concerning an investigational product clinical practice what is generally accepted as the way patients are cared for. This includes all aspects of patient care including waiting in outpatient clinics, drugs received, palliative care, etc. clinical research that area of research carried out on humans (either patients or healthy volunteers). Û preclinical research clinical research associate someone employed to monitor the organisation and practical issues to do with running a study. Their duties may include collecting and collating study documentation, ensuring complete and clean data, ensuring that pharmacies or other dispensing centres have adequate supplies of materials clinical research coordinator clinical research associate clinical research organisation a company that provides staﬀ and/or facilities for carrying out clinical studies clinical signiﬁcance a ﬁnding or observation that is clinically signiﬁcant (for example a patient dying unexpectedly or a large treatment eﬀect). G clinically signiﬁcant diﬀerence. Û statistically signiﬁcant clinical study any systematic study that includes patients. 
This need not include studying any treatments, for which see clinical trial
clinical trial any systematic study of the effects of a treatment in human subjects. See also: Phase I, Phase II, Phase III, Phase IV. Note that although randomisation and blinding, for example, are considered as some of the essential features of good clinical trials, these are not requirements. See also: prevention study
clinical trial certificate (CTC) the certificate issued before the introduction of the system of the clinical trial exemption certificate (CTX). The amount of information needed for a CTC is more than that required for a CTX

Figure 4 Closed sequential design. The solid lines indicate stopping boundaries for declaring a statistically significant difference between treatments A and B. For example, if out of ten patients expressing a preference for one or other treatment, nine preferred treatment B and only one preferred A, then the study would stop, concluding that B is significantly better than A. If the broken boundary is crossed, then the study stops and the conclusion is drawn that no significant difference was found between the treatments

clinical trial exemption certificate (CTX) a certificate (issued by a regulatory authority) to a pharmaceutical company authorising use of an unlicensed product or use of a product outside its marketing authorisation for the purpose of carrying out a clinical study. Note that it is the product that is being exempted from otherwise stringent rules, not the study, so that one CTX may serve to cover several studies. See also: doctors and dentists exemption (DDX)
clinically important see clinically significant
clinically meaningful difference see clinically significant difference
clinically significant an effect (in an individual subject or an average effect in a group of subjects) that is sufficiently large to be of benefit (or harm) to a patient or of note to a treating physician
clinically significant difference a treatment effect that is sufficiently large to be useful for treating patients
closed sequential design a sequential study design that does not have a predetermined number of patients. An upper limit on the number of patients does exist (hence ‘closed’) but it is possible to draw conclusions and stop the study before that number of patients has been recruited (Figure 4). Compare: open sequential design
closed sequential study a study that is designed as a closed sequential design
cluster randomisation a case where individual subjects are not randomised to receive different interventions, but rather groups (‘clusters’) of subjects are randomised. Examples are most common in community intervention studies where, for example, some towns may have fluoride introduced to their water supply whilst other towns may not. Clearly each member of the community cannot be randomly assigned to have fluoride, or not, and the randomisation must be done in large groups of subjects (or clusters)
Cmax the maximum concentration of drug measured (usually) in a subject’s blood but it could also apply to that measured in urine. The term can also be used to refer to the mean of many subjects’ Cmax values; it is then used as a description of the product rather than of any particular subject. (See Figure 1, area under the curve.) See also: Tmax
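
As a minimal sketch of the Cmax entry above, the following reads Cmax, and the sampling time at which it was observed (commonly called Tmax), off one subject's concentration-time profile. The function name, the sampling times and the concentrations are hypothetical values chosen for illustration; they do not come from the dictionary.

```python
def cmax_tmax(times, concentrations):
    """Return (Cmax, Tmax): the largest observed concentration and its sampling time."""
    if len(times) != len(concentrations) or not times:
        raise ValueError("times and concentrations must be non-empty and of equal length")
    # max() returns the first maximum, so ties resolve to the earliest sampling time
    peak_index = max(range(len(concentrations)), key=lambda i: concentrations[i])
    return concentrations[peak_index], times[peak_index]


# Hypothetical profile: hours after dosing and plasma concentration (ng/mL)
hours = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
conc = [0.0, 12.1, 18.4, 15.2, 7.9, 2.3]

c_max, t_max = cmax_tmax(hours, conc)
print(f"Cmax = {c_max} ng/mL observed at {t_max} h")
```

When Cmax is used to describe a product rather than a subject, the same function could be applied to each subject and the resulting values averaged.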

coarse data data that are measured or subsequently recorded very approximately, for example in categories with large class intervals. Compare: fine data
code an indirect means of linking two or more pieces of information. For example, to identify a pack of medication that pack may be given a code number and, separately, a list be kept of which code numbers refer to which treatments. See also: randomisation code
coding dictionary a list of terms and associated codes. See, for example, COSTART, MedDRA, WHO-ART
coding system a set of rules for making up codes for data
coefficient an estimate of a parameter. The term is used when the parameters are being estimated in statistical models such as regression analysis, logistic regression
coefficient of concordance a measure of agreement between several people, each rating a group of items on some specific measure. See also: correlation
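
The entry above does not name a particular statistic. One widely used coefficient of concordance is Kendall's W, and the sketch below assumes that choice: m raters each rank the same n items without ties, and W runs from 0 (no agreement) to 1 (complete agreement). The function name and the example rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's W for m raters who each rank the same n items (ranks 1..n, no ties)."""
    m = len(rankings)        # number of raters
    n = len(rankings[0])     # number of items
    for ranks in rankings:
        if sorted(ranks) != list(range(1, n + 1)):
            raise ValueError("each rater must use the ranks 1..n exactly once")
    # Total rank given to each item across all raters
    rank_sums = [sum(ranks[i] for ranks in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    # Sum of squared deviations of the rank totals from their mean
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


perfect = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]   # raters agree completely
mixed = [[1, 2, 3, 4], [2, 1, 4, 3], [4, 3, 2, 1]]     # raters largely disagree

print(kendalls_w(perfect), kendalls_w(mixed))
```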

coefficient of determination the square of the correlation between two variables, denoted r²
coefficient of variation a measure of variation in data, relative to the mean of those data. It is calculated as 100 × (standard deviation/mean) and is expressed as a percentage
cohort a group of individuals with a common characteristic observed over a period of time. The feature they have in common may simply be the year of birth, or it may be the fact that they have all been exposed (for example) to a carcinogen or a novel educational programme
cohort effect any systematic difference between subjects recruited to a study at different times. For example, the first patients recruited to a study may have less (or possibly more) severe symptoms than those recruited later in the study
cohort study the study of a group of subjects over time. This includes clinical trials, but the term is usually restricted to observational studies
co-intervention more than one intervention being studied concurrently. Note that the interventions do not necessarily have to be given at the same moment but the period of study is coincidental, nor do both interventions need to be related to the same disease or be of the same type. For example, a drug treatment and a patient management strategy might both be studied concurrently
collapse used in the sense of reducing the number of categories of data. For example, age may be recorded as under 5 years, 5–15 years, 16–65 years, etc. Subsequently deciding to combine adjacent categories (for example the under 5s and the 5–15s) would be described as ‘collapsing’ these two categories into one
collective ethics ethical behaviour that is more concerned with benefiting other people than oneself. Being prepared to administer a placebo is unlikely to benefit the patient concerned but may benefit others by nature of the information gained.
Compare: individual ethics
column vector see vector
combination drug more than one drug being administered simultaneously (usually when all of the drugs are packaged in the same tablet or capsule, etc.) Compare: monotherapy
community intervention study a study carried out to investigate the effect of an intervention on an entire group of people, for example all those who live in a particular city. Public health studies and studies of screening programmes frequently are described as community intervention studies. See also: cluster randomisation
community study a study of large numbers of subjects in a community. It
could be some kind of survey or might be a community intervention study
comparability similarity. Often used in the sense of describing how similar two randomised groups are with respect to demographic data or disease severity
comparable the state of being similar
comparator drug see comparator treatment
comparator group the group of patients assigned to receive the comparator treatment
comparator study a study that makes comparisons (usually between treatments). Compare: observational study
comparator treatment usually the drug, placebo, or other intervention with which a new or experimental treatment is being compared
comparison a contrast (formal or informal) between two or more items or groups
comparison group see comparator group
comparisonwise error rate the probability of making a Type I error for each comparison in a study. Compare: experimentwise error rate. See also: multiple comparisons
compassionate use a regulatory term, meaning that an unlicensed product is allowed to be used for a limited number of patients for whom there is no alternative medication. Although the product may be ineffective (its efficacy has not been demonstrated), there may be no other effective therapies. See also: named patient use
compassionate use protocol a protocol that defines how a product will be used on a compassionate use basis
competitive enrolment the situation in multicentre studies where each centre is allowed to recruit as many subjects as they can until the overall recruitment target has been met, rather than each centre having their own recruitment target
complementary log-log transformation an equation applied to data that are proportions to allow use of statistical methods based on the Normal distribution. The transformation is y = log(−log(1 − p))
complete block a block of medication that contains all possible treatments (or combinations of treatment or treatment sequences) that are being studied. Compare: incomplete block
complete block design a study design that only uses complete blocks of treatment.
Compare: incomplete block design
complete cases analysis a strategy for analysing data where only subjects who provide complete data are included in the analysis; any subject with missing data is excluded. See also: intention-to-treat, per protocol analysis
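
The selection step of a complete cases analysis can be sketched as a simple filter. The record layout, field names and values below are hypothetical, and a missing value is assumed to be stored as `None`.

```python
def complete_cases(records, required_fields):
    """Keep only the subjects with a non-missing value for every required field."""
    return [
        rec for rec in records
        if all(rec.get(field) is not None for field in required_fields)
    ]


# Hypothetical systolic blood pressure data; None marks a missing measurement
subjects = [
    {"id": 1, "baseline_sbp": 152, "week12_sbp": 140},
    {"id": 2, "baseline_sbp": 160, "week12_sbp": None},
    {"id": 3, "baseline_sbp": None, "week12_sbp": 138},
    {"id": 4, "baseline_sbp": 148, "week12_sbp": 145},
]

analysed = complete_cases(subjects, ["baseline_sbp", "week12_sbp"])
print([s["id"] for s in analysed])
```

Subjects 2 and 3 each lack one measurement and so drop out of the analysis set entirely, which is what distinguishes this strategy from approaches that retain every randomised subject.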

complete response in cancer studies, this is generally regarded as complete disappearance of all tumours and no new tumours. See also: partial response, stable disease, progression
completely randomised design a study where subjects are allocated to receive treatments in a randomised manner with no constraints (such as equal numbers of patients per group, no blocks, no stratification)
compliance the measurement of how fully patients take their medication. This may be measured by weighing returned medication, counting returned tablets or simply asking how many doses of medication were (or were not) taken
compliant see fully compliant
component a part. This may be a chemical component of a drug (one of the chemicals that make it up), a part of a data file or of a case record form, etc.
components of variance a method of analysing data that assesses which features of an experiment account for the variation in those data. Typically, the sorts of features identified will be patients, treatment centres and different medications
composite hypothesis in a statistical significance test, an alternative hypothesis that does not specify a single value for a parameter, for example H : 0 Compare: simple hypothesis
composite outcome when an outcome measure for a study is a mix of several individual measurements. For example, the composite outcome ‘treatment success’ may be defined as a patient who is free of symptoms and has a quality of life score better than some specified value. Neither of those features is sufficient on its own to define a treatment success but together they are. See also: Guttman scale
composite score see Guttman scale
compound the bulk product of drug. Compare: product
compound symmetry a term used in assessing repeated measurements. The data are required to have the same variance at each time point and equal covariances between time points.
Generally, if compound symmetry can be assumed, the analysis of data is much simpler
computer a machine (originally mechanical but now electronic) used for numerical calculations and data processing. The current uses range from complex and fast calculations through to controlling machinery and word processing
computer assisted data collection a process by which a computer is used to help (in various possible ways) collection and/or recording of data. The help may simply be that it acts in the form of an electronic case
record form and that data are recorded into the computer instead of onto paper. It may be more sophisticated and the computer linked to a Holter monitor to directly record measurements of blood pressure without the need for human intervention
computer assisted new drug application (CANDA) a new drug application where some or all of the data, study report, program files, etc. are supplied to the regulatory authority in electronic form on a computer
computer package a computer program that does a variety of related tasks
computer program instructions given to control what a computer does. A variety of types of computer programs are used in clinical research including those for data processing, statistical analysis, drawing graphics and report writing
concentration the amount of a substance in a fixed volume of liquid. This may be the amount of active drug per unit of blood during absorption and distribution
conclusion the decision that is made based on data that have been collected and analysed. Note that results should generally be referred to in the past tense (‘Drug A was better than Drug B’) but conclusions should be referred to in the present tense, with future implications (‘we conclude that Drug A is better than Drug B’). Compare: discussion
concomitant medication drugs that are not being studied but which a patient is taking through all or part of a study. These may be other drugs for the same indication as the study or for other indications
concomitant variable a variable that may influence the results of a study but which is not a part of the study design. Most often, this term is used to refer to other (nonstudy) medications that a patient may be taking or other diseases that a patient may have. See also: concomitant medication, covariate
concordance agreement.
See also: coefficient of concordance
concordant pair in a study where subjects are assessed on two different occasions or by two different measuring devices and the variable measured is binary (for example, disease present or absent), the data may be summarised in a two-by-two table. The concordant pairs are those pairs of observations where the two measurements agree with each other. Compare: discordant pair
concurrent control control subjects who are observed and data recorded concurrently with the active subjects. This need not necessarily be done in a controlled experiment. Compare: historical control
concurrent medication see concomitant medication
conditional distribution the distribution of one variable at a fixed value of
another variable. For example, the distribution of age may be given for all subjects, but if it is given for males and females separately then these sex-specific distributions are said to be ‘conditional on sex’. Compare: joint distribution, marginal distribution
conditional odds the conditional distribution of the odds of an event occurring
conditional power the power of a study based on some prerequisite information. Usually it is meant as the power of the study as calculated (after the study has finished) using the observed difference between the treatments and the observed variance of that difference
conditional probability the probability of an event happening, given that another event has already been observed to happen
confidence an informal term used to describe how strong is one’s belief in the results of a study. See also: strength of evidence
confidence coefficient see confidence interval
confidence interval a range of values for a parameter (such as a mean or a proportion) that are all consistent with the observed data. The width of such an interval can vary, depending on how confident we wish to be that the range quoted will truly encompass the value of the parameter. Usually ‘95% confidence intervals’ are quoted. These intervals will, in 95% of repeated cases, include the true value of the parameter. In this case, the confidence coefficient (or confidence level) is said to be 95% (or 0.95). Confidence intervals are a preferred method of estimating parameters, whilst significance tests compare those parameters with arbitrary values. See also: posterior distribution
confidence level see confidence interval
confidence limit the values at the ends of a confidence interval.
If the 95% confidence interval for the difference in mean systolic blood pressure between two treatment groups is quoted as being from −3 mmHg to +8 mmHg, then −3 and +8 are the confidence limits
confidential private; not to be disclosed to a third party
confirmatory analysis the analysis of a confirmatory study
confirmatory study a study that is designed to answer a specific question without leaving any room for doubt. Whilst Phase I studies and Phase II studies give some information regarding efficacy and safety, Phase III studies are usually thought of as being confirmatory. Compare: exploratory study, pilot study. See also: definitive study
conflict of interest the situation where an individual or organisation may find it difficult to make unbiased statements. Examples are of investigators reviewing their own project proposals at an ethics committee meeting or a pharmaceutical company reporting results of a study involving one
of their own products. In such cases, bias is not being assumed but it is recognised that there is a clear reason why individuals may make biased statements or give biased opinions
confounded ‘cannot be distinguished from’. For example, if all males were given one treatment and all females given an alternative, the effects of treatment and gender would be indistinguishable from one another, or confounded with each other
confounder a term used in observational studies to describe a covariate that is related to the outcome measure and to a possible prognostic factor
confounding factor see confounder
consent positive agreement, particularly in the sense of informed consent. Compare: assent
conservative erring towards being safe; an estimate may be conservative if it is known to be less than the true parameter value (actually biased) and is intentionally quoted as such to avoid the risk of it being an overestimate. See also: safety margin
consistency check an edit check on data to ensure that two (or more) data values could happen in conjunction. Systolic blood pressure measurements must always be at least as great as diastolic measurements so, for any given patient, if the systolic pressure is greater than the diastolic, then the two measures are consistent with each other. It may be that neither is correct, but they are, at least, consistent. Compare: plausibility check
consistent reproducible without upward or downward trends over time. Also two items (often data points) that could both occur simultaneously. See also: consistency check
CONSORT a set of guidelines, adopted by many leading medical journals, describing the way in which clinical trials should be described. It stands for Consolidated Standards of Reporting Trials. See also: structured abstract
constant not changing between subjects or across time
consumer’s risk the probability of committing a Type I error. Compare: producer’s risk. See also: regulator’s risk
contingency table a cross-classification of subjects by two or more categorical variables.
The simplest form is the two-by-two table (Table 3), in which each subject is cross-classified by two binary variables. The table has four cells (totals are not usually counted) and the number of items within each cell is called the cell frequency
continual reassessment method a procedure for adjusting the dose given to successive subjects when the purpose of a study is to find the median dose that has some specified effect. See also: dose finding study, dose escalation study

Table 3 Contingency table showing the distribution of gender by treatment group

          Treatment A   Treatment B
Male           58            63
Female         29            28
Total          87            91
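
As an illustration using the four cell frequencies of Table 3, the sketch below computes the chi-squared statistic for a two-by-two table, with and without the 0.5 continuity correction (Yates' correction; see the continuity correction entry). The function name is hypothetical, and the choice of the Pearson chi-squared test is an assumption made for illustration rather than an analysis given in the dictionary.

```python
def chi_squared_2x2(table, continuity_correction=False):
    """Pearson chi-squared statistic for a two-by-two table of cell frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected frequency if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(table[i][j] - expected)
            if continuity_correction:
                # Shrink |observed - expected| by 0.5, never below zero
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    return statistic


# Rows: Male, Female; columns: Treatment A, Treatment B (cell frequencies from Table 3)
table3 = [[58, 63], [29, 28]]

uncorrected = chi_squared_2x2(table3)
corrected = chi_squared_2x2(table3, continuity_correction=True)
print(round(uncorrected, 4), round(corrected, 4))
```

The corrected statistic is always the smaller of the two, which is why the correction makes the test more conservative.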

continuity correction an adjustment made in the calculations for some significance tests on discrete data to make a better approximation to the test statistic that is continuous. It generally involves adding or subtracting 0.5 to the difference between the observed and expected frequencies of data. In two-by-two tables it is often referred to as Yates’ correction
continuous data data that are not restricted to particular values (as in categories) but that can take an infinite number of values. Examples of variables that result in continuous data are age, height, weight, pulse. Compare: categorical data, discrete data, ordinal data
continuous scale the scale on which continuous data are measured
continuous variable a characteristic of a subject that results in continuous data. For example age, height, weight. Compare: discrete variable
contour plot a graph that shows three dimensional data on a two dimensional surface. Two variables are depicted on the x axis and y axis; the third is depicted in the form of contours as would be seen on a map to show elevation (Figure 5)
contractor a temporary employee, usually of professional rather than clerical status, taken on to perform duties that would otherwise be carried out by full time employees. Such people are often used to cover peaks in workload or periods of absence of permanent employees
contraindication an indication for which a drug is specifically excluded
contrast a more formal term for comparison. In its simplest form it is the difference in the mean value of a variable between two groups or the difference in the proportion of subjects with some particular characteristic in each of two groups. In more complex forms it may be a weighted difference between several groups. For example, in a study with three groups of subjects, two groups (A and B) treated with active treatments and a third group (C) treated with placebo, a simple contrast would be that between the two active products which is simply mean(A) − mean(B).
We may wish to compare the active products together with the placebo group and so a more complex contrast would be mean(A) + mean(B) − mean(C)
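
The two contrasts in the entry above can be written as weighted sums of group means. The sketch below assumes a small helper, `contrast`, that is not part of the dictionary, and the response values for groups A, B and C are hypothetical.

```python
from statistics import mean


def contrast(groups, weights):
    """Weighted sum of group means: the sum over groups of weight * mean(group)."""
    if groups.keys() != weights.keys():
        raise ValueError("groups and weights must name the same groups")
    return sum(weights[name] * mean(values) for name, values in groups.items())


# Hypothetical responses: two active treatments (A, B) and placebo (C)
groups = {
    "A": [12.0, 14.0, 13.0],
    "B": [10.0, 11.0, 12.0],
    "C": [8.0, 9.0, 10.0],
}

simple = contrast(groups, {"A": 1, "B": -1, "C": 0})      # mean(A) - mean(B)
combined = contrast(groups, {"A": 1, "B": 1, "C": -1})    # mean(A) + mean(B) - mean(C)
print(simple, combined)
```

Other weightings, such as 0.5, 0.5 and -1, can be passed in the same way.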

Figure 5 Contour plot. The heights and weights of 100 patients with ischaemic heart disease are used to try to predict systolic blood pressure. In general there is a tendency for higher blood pressure in the bottom right-hand corner: that is, the heavier people who are rather short (and therefore those that are most overweight) have the highest blood pressure

control a term used in case-control studies specifically intended to mean someone who does not have any disease. Compare: case
control group the subjects assigned to receive the comparator treatment, or to receive no treatment
control treatment see comparator treatment
controlled clinical trial more formal term for clinical trial. It clearly emphasises the ‘controlled’ aspect of a trial, although that should be inherent in the definition of clinical trial
controlled experiment a term similar to controlled clinical trial, except that it could refer to any kind of experiment, not just a clinical (or even medical) one. The aspect of control (and therefore inclusion of one or
more control groups) is still emphasised
convenience sample a sample of subjects. Whether or not a subject is selected for the sample is not based on any random process but merely on which people are conveniently available. See also: haphazard sample
coordinate see xy coordinate. Also means to ensure that several activities happen together (as they should do) or in sequence (if that is how they are intended to occur)
correlate to assess how one variable changes as another changes. See also: correlation
correlated samples t test see paired t test
correlation the degree to which two variables are associated with each other. Positive correlation (see Figure 34, scatter plot) implies that as one variable increases so does the other; negative correlation implies that as one variable increases the other decreases. Note that no causality is implied
correlation coefficient the statistical measure of correlation, denoted r. See also: coefficient of determination
correlation matrix a square matrix whose values are the correlation coefficients between all pairs of several variables. An example of the correlation between five laboratory parameters is shown in Table 4
correlation table see correlation matrix
cost benefit ratio the relative weighting of the cost of a medication to the benefit of that medication. Benefit may be defined in arbitrary ways to suit the context. See also: cost effectiveness ratio, cost utility ratio
cost effective generally meaning good value for money; the benefit outweighs the cost
cost effectiveness ratio the relative weighting of the cost of a medication to the clinical effectiveness of that medication. See also: cost benefit ratio, cost utility ratio
cost function an equation that calculates the total cost of treating a patient. It will typically include positive values (drug costs, pharmacy costs, hospital costs, productivity lost from work, etc.) but sometimes also negative costs (reduction in number of days spent in hospital, increased productivity from early return to employment, etc.)
cost minimisation the approach of evaluating the optimum amount to spend in order to minimise the overall cost function
cost utility ratio the relative weighting of the cost of a medication to the utility of that medication. Utility is the overall benefit as assessed by any and all diverse measurement scales including medical, financial, quality of life, etc. See also: cost benefit ratio, cost effectiveness ratio
COSTART a dictionary of adverse event terms. COSTART stands for Coding Symbols for Thesaurus of Adverse Reaction Terms. See also: MedDRA, WHO-ART

Table 4 Correlation matrix of biochemistry parameters in 100 healthy subjects

                     Urinary      Urinary   Serum       Serum        Serum
                     creatinine   calcium   phosphate   creatinine   calcium
Urinary creatinine      1.0         0.41     −0.03        0.03        0.08
Urinary calcium         0.41        1.0      −0.06        0.00        0.11
Serum phosphate        −0.03       −0.06      1.0         0.07        0.15
Serum creatinine        0.03        0.00      0.07        1.0        −0.05
Serum calcium           0.08        0.11      0.15       −0.05        1.0

count to determine how many (of something) exist or how many times a certain type of event has occurred
covariance a statistical measure of how two variables vary together. See also: correlation, variance
covariate a variable that is not of primary interest but which may affect response to treatment. Common examples are subjects’ demographic data and baseline assessments of disease severity
Cox model see Cox’s proportional hazards model
Cox’s proportional hazards model a statistical method for comparing survival times between two or more groups of subjects that also allows adjustment for covariates. The model assumes proportional hazards. See also: Cox–Mantel test, accelerated failure time model
Cox–Mantel test a statistical method for comparing survival times between two groups. See also: Cox’s proportional hazards model
cream a mixture of ointment (such as paraffin, lanolin, etc.) and water used as a vehicle for delivering topical treatment. Compare: gel, lotion
credible interval a form of a confidence interval used in the context of Bayes’ theorem. See also: highest density region
critical appraisal the set of skills (and judgements) needed to evaluate evidence. See also: evidence-based medicine
critical data the most important data that will be used to draw conclusions from a study relating to the most important objectives
critical region the values of a test statistic (such as in the t test or chi-squared test) that lead to rejecting the null hypothesis at a given significance level
critical value the value of a test statistic (such as in the t test or chi-squared test) that is the boundary between where the null hypothesis is rejected and not rejected at a given significance level
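
The critical value and critical region entries above can be illustrated with a test statistic that follows the standard Normal distribution. That distribution is used here as a stand-in for the t and chi-squared tests named in the entries, because the Python standard library provides Normal quantiles directly; the function names are hypothetical.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha):
    """Critical value z such that P(|Z| > z) = alpha for a standard Normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def in_critical_region(test_statistic, alpha=0.05):
    """True when the statistic lies in the two-sided critical region (null rejected)."""
    return abs(test_statistic) > two_sided_critical_value(alpha)


z_crit = two_sided_critical_value(0.05)
print(round(z_crit, 2))
print(in_critical_region(2.3), in_critical_region(1.2))
```

At the 5% significance level the boundary is close to 1.96, so a statistic of 2.3 leads to rejection of the null hypothesis and a statistic of 1.2 does not.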

Cronbach’s alpha a measure of internal consistency in a psychological test
cross-classification see contingency table
crossed factors the opposite of nested factors. When every category of one variable also contains every category of another. See also: factorial design
crossover design a study where each subject receives (in a random sequence) each study medication. After receiving Treatment A, they are ‘crossed over’ to receive Treatment B (or vice versa). This is the simplest form of crossover design and is called the two period crossover design. Compare: parallel group design
crossover study a study that is designed as a crossover design
cross-product ratio see odds ratio
cross-sectional considering a single moment (or separate moments) in time without regard for any trend across time. Compare: longitudinal
cross-sectional analysis the analysis either of a cross-sectional study or of data as if they were collected in a cross-sectional study. Compare: longitudinal analysis
cross-sectional study a study that examines data at one particular point in time (either in the sense of ‘all ten-year-old children’ or ‘everybody on 1st January’) and does not consider within subjects effects
crude estimate any estimate of a parameter that is an unadjusted estimate
crude rate an unadjusted rate. Generally, simply the observed number of subjects experiencing a specific event divided by the total number of subjects exposed and potentially at risk of that event
cumulative frequency a running total. For example, if we count the number of deaths per day (the daily frequency), then the total number of deaths from the beginning of a study to any particular day is the cumulative frequency
cumulative frequency distribution the distribution of cumulative frequencies
cumulative hazard rate the accumulation of the hazard functions at all times from time zero up to a specified time point
cumulative meta-analysis a meta-analysis that shows continuing updated estimates of treatment effect after each of the studies was completed.
It does not simply show one overall result taking account of all studies, regardless of when they were carried out
curriculum vitae a person’s educational and employment history, usually including all other relevant experience and any publications to which they have contributed
curve a smooth line or surface drawn through a set of data points. The term can strictly be used to describe a line (in two dimensions) or a surface (in more than two dimensions) that is straight as well as one that bends
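
The running total described in the cumulative frequency entry above can be sketched in a few lines. The daily death counts are hypothetical values chosen for illustration.

```python
from itertools import accumulate

# Hypothetical daily frequencies: number of deaths on study days 1 to 5
daily_deaths = [0, 2, 1, 0, 3]

# Cumulative frequency: total deaths from the start of the study up to each day
cumulative_deaths = list(accumulate(daily_deaths))

for day, total in enumerate(cumulative_deaths, start=1):
    print(f"day {day}: {total} deaths so far")
```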

curvilinear regression a regression model that fits a curve to data. In this context, curve is generally taken to exclude a straight line. See also: linear regression
cutoff design a method of treatment assignment based on a baseline measurement. All subjects with values below some cutoff point (defined as those with good prognosis) are assigned to the control group; subjects with values of the baseline measurement in the middle of the range are not included in the study; and all subjects with values of the baseline measurement above another cutoff point (those with poor prognosis) are assigned to the experimental group. See also: regression discontinuity design
cutoff point a value on an ordered scale (possibly a continuous scale) where a change of decision is made. For example, patients with systolic blood pressure above 180 mmHg may be included in a study; those with values less than, or equal to, 180 mmHg are not included: 180 would be called the cutoff point
cutpoint a point along a line or on a surface where the slope changes abruptly rather than smoothly. See also: changepoint model, cutoff point
cyclic see cyclic variation
cyclic variation systematic variation over a course of time. A circadian rhythm is one type of cyclic variation
cytotoxic a drug that is poisonous to certain types of cells. Frequently used in cancer treatment

D

data information of any sort, whether it be numerical, alphabetical, judgements, estimates or precise measurements. See also: binary data, categorical data, continuous data, discrete data, ordinal data
data analysis the process of summarising data, either to draw conclusions or simply to describe a process
data and safety monitoring committee a group of people who regularly review accumulating data in a study with the possibility of stopping the study or modifying its progress. A study may be stopped, or changes made to it, if clear evidence of efficacy is seen or if adverse safety is observed in one or more treatment groups
data audit an audit of the quality, source and integrity of data
data centre the place where data are gathered and the data management tasks completed. It is a term particularly relevant to multicentre studies. Single centre studies may have the data centre at the same place as the patients are seen or somewhere different
data cleaning the process of finding errors or possible errors in data, checking them and, if appropriate, correcting them. See also: clean data, dirty data
data coding assigning data into categories. For example, classifying adverse events into groups according to which part of the body is affected or classifying concomitant medications into generic names rather than trade names
data collection form see case record form
data collection protocol specific, detailed instructions for how data are to be collected and recorded
data coordinating centre see data centre
data dependent stopping making the decision to stop recruitment to a study (or possibly follow-up in a study) based on data already observed. See also: interim analysis
data dredging analysing data without regard to accepted scientific and statistical principles in order to find some aspect that will be of interest.
Also referred to as ‘fishing expeditions’ because of the analogy of


dipping a ﬁshing rod into dark water and pulling out various items of rubbish, but rarely ﬁsh! data driven analysis making decisions on which analyses should be carried out based on the observed data. G post hoc analysis data editing data cleaning data entry typing data into a computer. This may be done directly by the subject, by the treating doctor or investigator or, more usually, copied from a case record form at a data centre. G single data entry, double data entry data ﬁeld an individual item of data. The term is most often used in referring to data on a case record form or on a computer data ﬁle a highly structured and well organised collection of related data. The term could be used about paper ﬁles (including a case record form or a large number of case record forms) but is generally reserved for data on a computer data item data ﬁeld data management the discipline of collecting and ﬁling data in an ordered fashion to facilitate subsequent retrieval and analysis. Although the term can refer to the management of paper ﬁles, most activity in data management usually revolves around storage of electronic versions of data on a computer data manager the person with the responsibility for ensuring data management is properly carried out data monitoring the process of reviewing data being collected to ensure it is of high quality and complete. 
G quality control data monitoring committee data and safety monitoring committee data monitoring report a report on the quality and completeness of data data processing the steps involved in computerisation and particularly data management in a computer system data query a question raised about the validity or correctness of an item of data data reduction the process of summarising data, particularly using summary measures or coding continuous data into categorical data data screening the process of looking at and reviewing data to check their plausibility and completeness database an electronic version of a set of data, held on a computer database management system any piece of software for handling data. This includes data entry as well as production of tables, listings, etc. dataset data file death rate the number of people dying in a specified time interval divided


by the number alive at the beginning of that time interval. G casefatality rate debug to ﬁnd errors in computer programs and correct them decile each of the tenth (10th, 20th, 30th, etc.) centiles decimal a number recorded in whole units and in tenths, hundredths, thousandths, etc. For example, weight measured in kilograms and grams is expressed as a decimal number but weight measured in pounds and ounces is not. Sometimes the word is used just to refer to the part of the number less than 1 (only the numbers that come after the decimal point) decision function a mathematical function that describes which decision to make, based on a given set of circumstances. G decision rule, decision tree decision rule this term can be either a synonym for a decision function or a less technical, written, description of what decision to make based on a given set of circumstances. The rule may sometimes be depicted as a decision tree decision theory the general theory of how to make optimal decisions decision tree a diagram, resembling that of a family tree, to guide which decision (or sometimes which conclusions) should be drawn from a set of criteria (Figure 6). G decision rule Declaration of Helsinki a set of ethical guidelines for the conduct of research on humans. It was ﬁrst agreed in 1964 by the World Medical Association and has been revised subsequently in Tokyo (1975), Venice (1983), Hong Kong (1989) and South Africa (1996) decrement to decrease in value. Û increment deduce to draw a conclusion of a speciﬁc result based on broader examples. Û induce deduction deduce deductive inference the process of drawing conclusions based on deduction deductive reasoning see deductive inference but note that ‘reasoning’ is a broader term than ‘inference’ default an assumed state unless a positive reason can be given to accept an alternative state. 
For example, in significance testing, by default the null hypothesis will be accepted unless evidence exists to refute it definitive study a study that is generally agreed to provide the answer to a question with no room for doubt. The term ‘definitive’ is usually used to describe a study that has already been completed. The term confirmatory study is more often used for a study that is still being planned degree of belief often used as an informal interpretation for a P-value. It is a measure (either on a probability scale or an informal, intuitive, scale) of the strength of evidence about a particular hypothesis


Figure 6 Decision tree. A way to make a choice of a simple statistical significance test for comparing groups of categorical data. Each of the boxes with rounded corners is called a ‘node’; each of the arrows is called a ‘branch’

degrees of freedom a statistical term to describe the number of independent pieces of information that there are for a statistic. In chi-squared tests of two-by-two tables, there is one degree of freedom; the sample mean of n data points has n − 1 degrees of freedom delivery device the medium used for getting active product into the body. Tablets, ointments, injections, etc. are all delivery devices. G vehicle delta (δ) usually used as the symbol to describe the ‘true’ size of an effect. In particular it is used in planning studies to describe the smallest clinically significant difference to detect. It is more often used to describe a difference in means but can also be used to describe a difference in rates or proportions. The symbol d is often used to describe the observed value of δ demographic data data on subjects’ age, height, weight, etc. The term can be used to describe any baseline characteristics of subjects including the


baseline measurements of the primary endpoint variable but is more often reserved for measurements that are not aspects of the disease. There is no clear distinction between which data are disease related and which are not; clearly in a study of weight loss, subjects’ weight would be both demographic data and important data describing the state of disease demographic variable any variable that is demographic data demographics demographic data demography the study of vital statistics of populations denominator in a fraction, such as 1/2 or 3/4, the denominator is the number on the bottom line of the fraction (in these cases 2 and 4, respectively). Û numerator density function the mathematical function that gives the probability that a random variable is equal to any given value. G distribution function dependent samples t test paired t test dependent variable in any sort of statistical model, but most commonly in regression models, the dependent variable is the one we are trying to predict from the independent variable(s). In most cases, the dependent variable is the efficacy variable derived variable data values that are calculated or formed from other data. For example, subjects’ age might be calculated (or derived) from the visit date minus the date of birth; age would then be called a derived variable descending order data sorted so that the largest value is written first, the smaller values later and the smallest value last. Most easily described in terms of numeric data but special rules can be applied to alphanumeric data. Û ascending order descriptive statistics summaries of data that do not try to draw conclusions but which just describe the data. Most often used for continuous data. Common descriptive statistics include the mean, standard deviation, minimum value, maximum value, mode, median, quartiles and confidence intervals. Û inferential statistics descriptive study one that aims to describe a phenomenon or a group of individuals. 
The analysis of data from such studies generally uses descriptive statistics rather than significance testing or inferential statistics design the plan for a study with particular reference to whether it is a parallel group design or crossover design. The term should, however, be thought of very broadly to encompass the number of subjects to be included, the number of visits, the number of investigators taking part, strata, blocking, methods of randomisation, etc. design effect the effect caused by a design variable in a study. Such effects


would, hopefully, be advantageous but they may be negative or neutral. In a study where the randomisation was stratiﬁed by gender, an observed diﬀerence in treatment eﬀect between males and females would be called a design eﬀect (because stratiﬁcation was part of the design of the study) design variable any variable that contributes to the design of a study, often because of stratiﬁcation according to values of the variable deterministic a process that is guaranteed to give the same result repeatedly, with no unexplainable (random or otherwise) variation deviance a statistical measure of how much a set of data diﬀers from a perfect ﬁt to a model. In the simplest case of a model with normally distributed residuals, the deviance is equal to the residual sum of squares. G variance deviate a variable that takes the values of the diﬀerence between another variable and a chosen reference value, such as the mean deviation a measure of how far values of a variable lie from a chosen reference point. G average absolute deviation, standard deviation device see delivery device, medical device diagnosis the decision that is reached regarding the disease a patient has diagnostic test a test (physical, mental or, more commonly, biological) that is used to deﬁnitively assess whether or not a subject has a particular disease. Û screening test diagram a line drawing, usually to show the relative positions (physically or in time) of a set of objects or activities diary card usually a paper system for subjects in a study to record symptoms, adverse events or other data on a daily basis, often at home and generally not under the direct supervision of any medical personnel dichotomous data binary data dichotomous outcome binary outcome dichotomous variable binary variable diﬀerence the value obtained by subtracting one value from another. 
This may be on an individual subject basis, for example calculating the difference between a subject’s pulse at baseline and the same subject’s pulse after treatment (change from baseline), or it may be on a group basis, for example calculating the difference between the mean of all subjects’ heart rates in one treatment group and the mean of all subjects’ heart rates in a control group difference study a term used rarely, except to differentiate from an equivalence study or a noninferiority study. A study where the null hypothesis is that there is no difference between treatments and the


alternative hypothesis states that there is a diﬀerence. The intention of the study (or objective) is usually to show that two (or more) treatments have diﬀerent eﬀects. G superiority study diﬀuse prior vague prior digit any numeral between zero and nine. For example, the number 57 contains two digits: 5 and 7 digit bias digit preference digit preference when recording numerical data, there is often a preference (intentional or unintentional) to round the last digit. For example, birth weight measured in grams will often be recorded to the nearest 10 grams; there is said to be a preference for zeros. Blood pressure measured in millimetres of mercury will often be recorded to the nearest 5 mm or the nearest 2 mm; values such as 73 mmHg (where the last digit is not a multiple of 2 or of 5) tend to be recorded less often than would be expected by chance dimension one of any number of variables that describe a subject. The term is most often used in connection with plotting data. When two variables are measured and plotted, there are two dimensions. When there are three variables plotted, three-dimensional graphs can be plotted (with some diﬃculty). More than three dimensions can be thought about but cannot easily be plotted. G multivariate data direct access in computing terms this refers to the method of accessing data on a physical storage device such as a disk. (Û sequential access.) The term also applies to source data veriﬁcation, where the person reviewing source data is allowed to see the source data for themselves, rather than indirect access where the values of source data have to be requested through a third party direct contact the contact of one person with another that potentially passes on an infectious disease. Û indirect contact direct cost actual (ﬁnancial) costs that are incurred in treating patients. These include the cost of drugs, the cost of occupying a hospital bed, etc. Û indirect cost. 
G pharmacoeconomics direct effect main effect direct relationship the case when the relationship between two variables is linear, so that plotting one variable against the other variable shows a rough fit to a straight line. The term is often further restricted to the case when the correlation is positive, such as in Figure 34 (scatter plot) directional hypothesis a hypothesis which specifies that one treatment is equal to, or better than, another treatment. In general, the alternative hypothesis states that one group is different from another, which


Table 5 Cross-classification of paired data to show the discordant pairs. The response to each treatment (in this example) is graded simply as ‘good’ or ‘bad’

                 Treatment B
Treatment A      Good     Bad
Good               55      47
Bad                13      24

could allow it to be either better or worse. G one sided hypothesis dirty data data that contain errors, or data that may contain errors and have not yet been fully reviewed and validated to find those possible errors. Û clean data discordant pair in a study where subjects are assessed on two different occasions or by two different measuring devices and the variable measured is binary (for example, disease present or absent), the data may be summarised in a two-by-two table. The discordant pairs are those pairs of observations where the two measurements do not agree with each other (Table 5). In this example, 47 patients and 13 patients represent the discordant pairs. Û concordant pair discrete data data that may take only a fixed set of values. This includes categorical data but also extends to data in the form of counts, for example where only whole numbers of items can be counted. Û continuous data discrete variable a variable that can result only in discrete data values. Û continuous variable discussion that part of a final report that addresses the validity of the results by considering the appropriateness of the study design, the success (or otherwise) of its implementation, quality of the data, consistency of results across different outcome variables and in the light of other studies. G conclusion disease profile the set of signs and symptoms (and their severity) that either characterise a disease (and therefore may help with diagnosis) or describe the severity of disease for an individual patient disk a device for storing data on a computer or in a computer readable form. Traditionally these have been magnetic devices but optical devices (compact discs, etc.) are becoming very common diskette virtually synonymous with disk. Some people use the term diskette to refer to ‘small’ floppy disks that can be carried around (usually for use with personal computers) rather than larger hard disks


that are kept permanently inside the computer dispersion a term used almost synonymously with variability (as in variation of data). G variance distributed data entry a system of entering data onto a variety of computers, possibly spread around the world, to form a distributed database. G remote data entry distributed database rather than all the data relating to a study being held on a single computer, a distributed database allows diﬀerent parts of the data to be held on diﬀerent computers. The diﬀerent computers are all linked together by a network so that it is not obvious to the user that the database is distributed distribution a general term covering either frequency distribution or probability distribution, depending on the context distribution free method nonparametric method distribution function the mathematical function that gives the probability that a random variable is less than any given value. G density function divisor denominator doctors and dentists exemption (DDX) an exemption similar to a clinical trial exemption certiﬁcate (CTX) but one that is issued to a doctor or dentist, not to a pharmaceutical company documentation written evidence to conﬁrm the activities that have been undertaken in a study and the standards to which a study has been managed dosage regimen the dose, timing and method of giving medication to a patient. G treatment regimen dose the amount of drug that is given dose eﬀect relationship dose response relationship dose escalation study a study in which successively higher doses of a drug are given to subjects. This may be done either by administering a dose to an individual and, if there are no adverse events, by increasing the dose for that individual until adverse events are seen or by giving a dose to a small number of subjects and, if no adverse events are seen, giving a subsequent group of subjects a higher dose, and so on, until adverse events are seen. 
Û dose ranging study dose finding study a study to find the best dose (‘best’ according to an agreed definition) of a drug dose ranging study a study of different doses of a drug but, in contrast to a dose escalation study, the doses being compared are not investigated in an escalating manner dose response dose response relationship


dose response relationship how the eﬀect of a drug changes with dose dose titration study dose escalation study dosing schedule dosage regimen dot chart scatter plot dot plot scatter plot double blind a study where the subjects and the investigators are blind to the treatment allocation. Û single blind, triple blind double data entry a strategy where data from case record forms is entered (typed) into a computer twice and the two typed ﬁles compared. This helps to reduce the number of typographical errors and errors of interpretation of poor handwriting. Û single data entry double dummy a method of blinding where both treatment groups may receive placebo. For example, one group may receive Treatment A and the placebo of Treatment B; the other group would receive Treatment B and the placebo of Treatment A double entry double data entry double mask double blind doubly censored data data that are both left censored and right censored. Right censored data is quite common; left censored data is less common; doubly censored data is rare doubly censored observation doubly censored data download copying ﬁles (data or programs) from a central computer to a local computer. Û upload dropin the opposite of dropout. Dropins to clinical trials are not common but, when they occur, may result in left censored observations dropout the case where a subject stops participating in a study before they are due to according to the study protocol. A more polite term is early withdrawal drug a pharmaceutical preparation. The term is often used very broadly and loosely to include placebo. G biologic, phytomedicine. Û product drug accountability the process of checking what has happened to all study medication. This includes checking stocks in a pharmacy, counting individual subjects’ tablets, weighing tubes of ointment, etc. 
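The comparison step described under double data entry above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `compare_entries`, the record identifiers and the field names (`sbp`, `sex`) are hypothetical, and each typed file is represented simply as a dictionary keyed by record identifier.

```python
def compare_entries(first_pass, second_pass):
    """List (record_id, field, first_value, second_value) for every mismatch."""
    discrepancies = []
    # Walk over every record and field seen in either typed file,
    # so that an item missing from one pass is also reported.
    for record_id in sorted(set(first_pass) | set(second_pass)):
        rec1 = first_pass.get(record_id, {})
        rec2 = second_pass.get(record_id, {})
        for field in sorted(set(rec1) | set(rec2)):
            v1, v2 = rec1.get(field), rec2.get(field)
            if v1 != v2:
                discrepancies.append((record_id, field, v1, v2))
    return discrepancies


# Two hypothetical typings of the same case record forms; the second
# contains a transposition error in one blood pressure value.
first = {"001": {"sbp": "182", "sex": "F"}, "002": {"sbp": "175", "sex": "M"}}
second = {"001": {"sbp": "182", "sex": "F"}, "002": {"sbp": "157", "sex": "M"}}
queries = compare_entries(first, second)
print(queries)
```

Each mismatch would normally be resolved by going back to the case record form rather than by trusting either typed value.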
drug company pharmaceutical company drug industry pharmaceutical industry drug interaction the effect sometimes produced when more than one product is used simultaneously. The effect is either more than or less than the sum of the individual effects. The term is most commonly used in connection with adverse reactions, caused by different products combining in the body, rather than with extra beneficial effects


drug metabolism metabolism drug reaction any response to a product, either beneﬁcial or unwanted, but usually reserved for unwanted eﬀects. G adverse event, adverse reaction drug trial clinical trial dry run similar to a pilot study. Trying a process under artiﬁcial conditions to determine if it will work properly in a real setting dummy loading a method of blinding treatments when they involve diﬀerent dosage regimens. G double dummy dummy report ghost report dummy table ghost table dummy variable indicator variable Duncan’s multiple range test a multiple comparison test for comparing the mean value of a variable between more than two groups duration of action the length of time that a treatment gives any beneﬁt duty of care the requirement that doctors must care for their patients and that this duty must take priority over such things as research projects dynamic allocation a randomisation method that changes the probability of assignment from one group to another as the study progresses. The probabilities are changed either as a consequence of eﬃcacy and adverse event data emerging or to maintain balance for prognostic factors across the groups. G minimisation
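One way to picture dynamic allocation used to maintain balance for prognostic factors is a simple form of minimisation. The sketch below is an assumption-laden illustration rather than a definitive algorithm: the function name `minimisation_assign`, the factor names and the rule of always choosing the arm with the smallest imbalance total (with a random choice only on ties) are choices made for the example. Practical schemes often assign to the favoured arm with a probability below one.

```python
import random


def minimisation_assign(new_subject, allocated, arms=("A", "B"), rng=None):
    """Pick the arm that currently holds fewest subjects sharing the new subject's factor levels."""
    rng = rng or random.Random()
    totals = {}
    for arm in arms:
        # Count, over all prognostic factors, the subjects already in this arm
        # who have the same level as the new subject.
        totals[arm] = sum(
            1
            for subject, subject_arm in allocated
            if subject_arm == arm
            for factor, level in new_subject.items()
            if subject.get(factor) == level
        )
    smallest = min(totals.values())
    candidates = [arm for arm in arms if totals[arm] == smallest]
    return rng.choice(candidates)  # random only when arms are tied


# Hypothetical subjects already in the study, as (factor levels, arm) pairs.
allocated = [
    ({"sex": "F", "age": "<50"}, "A"),
    ({"sex": "F", "age": ">=50"}, "A"),
    ({"sex": "M", "age": "<50"}, "B"),
]
next_arm = minimisation_assign({"sex": "F", "age": "<50"}, allocated)
print(next_arm)
```

Here arm A already has a total of 3 matching factor levels and arm B only 1, so the new subject goes to B.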


E early stopping the practice of stopping recruitment into a study before reaching the maximum target sample size. This may be in a sequential study, after a formal interim analysis or for purely practical reasons that are independent of eﬃcacy or safety results early stopping rule a statistical rule that allows a study to stop recruitment after an interim analysis. Unless such rules are used the P-value associated with testing the null hypothesis is generally biased (it is too small). Early stopping rules allow for this and help to calculate the correct P-value early withdrawal when a subject leaves a study earlier than is routinely allowed for in the protocol. Typical reasons include the onset of unacceptable adverse events and voluntary withdrawal. In studies where death is not the endpoint, a death might also be included as an early withdrawal edit the process of changing data or text in a dataﬁle or in a text document (usually one held on a computer) edit check a term that covers all types of checks that may be put on data, including consistency checks, plausibility checks, range checks edit query a question raised by an edit check. The relevant data would then be checked and appropriate corrective action taken if necessary eﬀect this term is often misinterpreted as being the change from baseline in some measurement (blood pressure, for example) during the period of an intervention. Strictly speaking, ‘eﬀect’ should always be a relative measure, such as the extra change in blood pressure over that produced by the comparator treatment. If the mean blood pressure in a treatment group falls by 15 mmHg and in a comparator group it falls by 5 mmHg, then the eﬀect is the diﬀerence between these two values—10 mmHg. Similarly, the eﬀect of gender is deﬁned as the diﬀerence in mean response between males and females; the eﬀect of study centre is deﬁned as the diﬀerence in mean response between participating study centres. 
When the outcome of interest is not a mean, the term ‘effect’ has the


same limitation in that it should always be the diﬀerence between two groups; this can be measured as the diﬀerence between two proportions, or the odds ratio, or the diﬀerence in median survival times, etc. eﬀect modiﬁer covariate eﬀect size strictly this should simply be the size of an eﬀect but conventionally it is taken to be the size of the eﬀect divided by the standard deviation of the measurements. An eﬀect size of one indicates a diﬀerence between two means equal to one standard deviation; this is generally considered to be quite a large eﬀect. Eﬀect sizes of about 0.5 are moderate and eﬀect sizes 0.1 or lower are considered very small eﬀective sample size the more variation there is in data then, generally, the larger the sample size required to show a treatment eﬀect. However, if, for a given sample size, there is more (or less) variability in the data than anticipated it is as if there is a reduced (or increased) sample size. The sample size that would have been required, had the variability in the data been correctly assessed, is called the eﬀective sample size. Variability in data can be increased due to missing data (G early withdrawal) and errors in the data; it can be reduced by modelling the data using extra covariates. G relative eﬃciency eﬀectiveness the extent to which a product works in the patients to whom it has been oﬀered. This is slightly diﬀerent from ‘eﬃcacy’, which can be measured in those who were actually treated. ‘Eﬃcacy’ relates to explanatory studies, ‘eﬀectiveness’ to pragmatic studies eﬃcacy the desirable eﬀect of an intervention. Û safety. G adverse event eﬃcacy data any data relating to the eﬃcacy of a treatment. Û safety data eﬃcacy population per protocol population eﬃcacy review an overview or meta-analysis of eﬃcacy data. Û safety review eﬃcacy sample per protocol population eﬃcacy study a study intended primarily to demonstrate eﬃcacy rather than safety. 
Often the same as a Phase III study efficacy variable a variable that is a measure of efficacy. Û safety variable efficient a process that makes good use of resources and is not wasteful. This is also a statistical term referring to methods of estimating parameters: in general it is desirable to have efficient estimators because these may require smaller sample sizes efficient estimator an estimate of a parameter that is efficient eighty–twenty rule an informal rule which suggests that most benefit (about 80%) can be achieved with minimal effort (about 20%); and, conversely, that the last 20% of benefit needs 80% of the effort
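The arithmetic behind the effect and effect size entries above can be made concrete with a short sketch. The blood pressure falls of 15 mmHg and 5 mmHg come from the effect entry; the standard deviation of 20 mmHg and the function names `effect` and `effect_size` are assumptions introduced purely for illustration.

```python
def effect(change_treatment, change_comparator):
    """Effect as a relative measure: the extra change over the comparator."""
    return change_treatment - change_comparator


def effect_size(effect_value, standard_deviation):
    """Conventional effect size: the effect divided by the standard deviation."""
    if standard_deviation <= 0:
        raise ValueError("standard_deviation must be positive")
    return effect_value / standard_deviation


# Falls in mean blood pressure (mmHg) taken from the 'effect' entry.
bp_effect = effect(15, 5)

# Assumed standard deviation of 20 mmHg (hypothetical, not from the dictionary).
standardised = effect_size(bp_effect, 20)
print(bp_effect, standardised)
```

With these numbers the effect is 10 mmHg and the effect size is 0.5, which the effect size entry would describe as moderate.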


elective treatment a treatment that a patient chooses to have rather than one that is assigned by randomisation or one that is mandatory on medical (or other) grounds electronic database database eligibility criteria inclusion criteria eligible a subject that meets all the eligibility criteria ( inclusion criteria) elimination the process by which a drug is excreted from the body or removed from the required site of action within the body. Û absorption, clearance elimination rate constant once a drug has been completely absorbed into the body, this is the rate of elimination (which, for many drugs, is approximately constant) empirical observed (particularly in relation to curves, distributions, etc.) Û ﬁtted value empirical Bayes Bayesian methods that require the prior distribution to be based on data. Û subjective Bayes empirical distribution the observed frequency distribution of data. Û probability distribution empirical result a result based on data (or facts) rather than one based on theory empty cell in a contingency table, a cell that contains no observations end of study end of treatment end of treatment the time at which subjects are either supposed to stop taking treatment (according to a protocol) or actually do stop taking treatment (if, for example, they were an early withdrawal) end of treatment value the value of a variable at the end of treatment visit end of treatment visit the visit at which subjects are supposed to stop taking treatment (according to the protocol), actually do stop taking treatment or withdraw from a study endemic a disease that is always present in a certain proportion of the population in a given geographical area. The term is usually used when considering the frequency of extra cases of the disease endpoint a variable that is one of the primary interests in a study. The variable may relate to eﬃcacy or safety. 
The term is used almost synonymously with efficacy variable or safety variable but not, for example, with demographic variable enrol to recruit a subject, or subjects, into a study enrolment the number of subjects that have been enrolled into a study enrolment period the time (often measured in months or years) during which subjects are enrolled into a study


enteric coating a coating (often made of gelatine) used on a tablet or capsule to prevent it being destroyed by acid in the stomach entry criteria inclusion criteria epidemiological study a study using the methods of epidemiology. This includes clinical trials but also case-control studies, cohort studies, natural experiments, surveys, etc. epidemiologist one who studies or practices epidemiology epidemiology the study of health and disease in populations, including aetiology, natural course and treatments. Clinical trials are considered by many to be one of the methods of epidemiology episode the occurrence of an event. In some studies the primary endpoint or primary eﬃcacy variable may be the number of times an event happens (the number of episodes of that event) equal allocation allocating the same number of subjects to each treatment. Û unequal allocation equal randomisation equal allocation equation a set of mathematical symbols and instructions for performing calculations equipoise the state of having an indiﬀerent opinion about the relative merits of two (or more) alternative treatments. Ethically, a subject should only be randomised into a study if the treating physician has no clear evidence that one treatment is superior to another. If such evidence does exist then it is considered unethical to randomly choose a treatment. If the physician is in a state of equipoise, then randomisation is considered ethical equipotent having equal potency and therefore having equal eﬀects (positive or negative). G equivalent equivalence the situation where two treatments show equal eﬀects equivalence study a study whose primary aim is to demonstrate that two treatments are equivalent with regard to certain speciﬁed parameters. Most studies are designed to show that one treatment is better than another; these are sometimes referred to as diﬀerence studies to emphasise the contrast with equivalence studies. 
G noninferiority study equivalent having equal effects (positive or negative). G equipotent erect standing. G prone, supine error a mistake. Sometimes the term is used to describe the discrepancy between an observed data value and the true value. In these situations, the term is used with reference to the variance, as in, for example, error term, error variance error band an informal term to describe an interval around an estimate


that semiquantitatively describes the uncertainty of the estimate of the parameter. G interval estimate error bar an informal term, similar to error band but where the interval is shown on a graph. There is no fixed convention for the length of these ‘bars’ but they are typically one standard error, one standard deviation, two standard errors or two standard deviations. If error bars are used, their precise definition should be given. G box and whisker plot error mean square residual variance error of the 1st kind Type I error error of the 2nd kind Type II error error of the 3rd kind Type III error error sum of squares residual sum of squares error term residual variance error variance residual variance errors in variables model in many situations it is assumed that, although a response variable may be measured with uncertainty (because it has some residual variance), the predictor variables, or covariates, do not have any uncertainty in their measurement. This may often not be the case, and if it is not, the estimated relationship between the covariates and the response will be biased: in the simplest case of a single covariate measured with random error, the estimated slope is attenuated (biased towards zero), so that positive and negative relationships alike appear weaker than they should. If the variances of the covariates can be estimated, then an adjustment can be made to the estimated relationship with the response variable. A model that makes this adjustment is called an errors in variables model essential documents a regulatory term describing the documentation that is required to support the data from clinical trials. It includes the protocol, case record form, names and affiliations of all staff involved, including their curricula vitae, the source and quality assurance statements of the products involved, etc. essential requirements essential documents estimable a parameter that can be estimated from a given experimental design. 
Some complex crossover studies and factorial studies may intentionally include some parameters of lesser importance that cannot be estimated, in order to more eﬃciently estimate those parameters that are of greater interest estimate the value of a parameter that is calculated using data. It should always be remembered that exact answers to questions are rarely attainable because of measurement error and random variation in the variable we are trying to measure. The ‘truth’ is rarely known, the best 55

estimated sample size

evidence based medicine

we can usually do is to get estimates of it estimated sample size the estimate of how many subjects must be enrolled into a study in order to meet the objectives of the study. G sample size estimation the process of obtaining estimates of parameters from data estimator a formula used to estimate a parameter ethical a process or study that conforms to accepted guidelines and rules on ethics. G Declaration of Helsinki ethical pharmaceutical a medicinal product that is available only with a doctor’s prescription. Û over-the-counter drug ethics the discipline of describing behaviour, practices, thinking and moral values generally agreed to be acceptable to society. G Declaration of Helsinki ethics committee research ethics committee ethnic origin a demographic variable encompassing place of birth, race, religion, and sometimes also native language. Often it is simply used to describe country or region of birth of a subject evaluable subject one who conforms to the study protocol suﬃciently well to be included in the per protocol population. Often this means a subject who meets all the inclusion criteria for a study and none of the exclusion criteria. Sometimes the requirements may be made less stringent and only certain major inclusion criteria need to be fulﬁlled. Sometimes the requirements may be more stringent and a certain minimum time in the study may be required. The precise deﬁnition of evaluable is likely to be study dependent and should be described in the protocol and study report event a binary variable that is an outcome that may or may not occur for each subject in a study. Some events, if they do occur, can occur more than once. Events are more often considered as negative ( adverse event) but they may be positive aspects of a treatment event rate the proportion of subjects who experience a particular event in a given time interval. 
Note that if the event can occur more than once for any given subject, as in adverse events, the event rate is still the proportion of subjects who experience that event; it is not a function of the number of events that occur evidence based medicine a recent approach to patient management that relies on using the most rigorous data available to guide decisions on what treatments should be used and how they should be used. The forms of evidence preferred are usually (although not always) from randomised and blinded clinical trials and meta-analyses 56
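The distinction drawn in the event rate entry above (subjects with an event, not events) can be illustrated with a minimal Python sketch. The records, the event names and the function `event_rate` are hypothetical and serve only as an illustration.

```python
# Hypothetical adverse event records: (subject identifier, event name).
# Subject 1 reports 'headache' twice.
records = [
    (1, "headache"),
    (1, "headache"),
    (2, "headache"),
    (3, "nausea"),
    (1, "nausea"),
]
n_subjects = 5  # assumed number of subjects observed over the interval


def event_rate(records, event, n_subjects):
    """Proportion of subjects who experience the event at least once."""
    subjects_with_event = {subject for subject, name in records if name == event}
    return len(subjects_with_event) / n_subjects


headache_rate = event_rate(records, "headache", n_subjects)
headache_count = sum(1 for _, name in records if name == "headache")
print(headache_rate)   # proportion of subjects with at least one headache
print(headache_count)  # number of headache events, which is a different quantity
```

Here two of the five subjects experience a headache, so the event rate is based on two subjects even though three headache events are recorded.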

exact statistical method: a statistical method for estimation and significance testing that does not make assumptions about the distribution of variables. Some exact methods are commonly referred to as nonparametric methods but the variety of exact methods currently being developed goes beyond what have traditionally been thought of as the nonparametric methods. See also parametric methods
exact test: a statistical significance test using an exact statistical method
examination: a series of observations, usually undertaken to determine a diagnosis or to measure the progress of disease
exchangeability: a term used in the context of bioequivalence to encompass equivalence of all aspects of two products
excipient: the constituents of a product that are not active but help with the formulation. See also vehicle
exclusion criteria: reasons why a subject should not be enrolled into a study. These are usually reasons of safety and should not simply be the opposites of inclusion criteria
excrete: to eliminate from the body, usually taken to mean via urine and faeces, but can also include sweat
excretion study: a study of the quantity, route, timing, etc. of drug being excreted from the body
executive committee: a small group of individuals representing a larger group, with the authority to make decisions regarding the design or conduct of a study. Data monitoring committees and research ethics committees could have a smaller group that meets more frequently than the main committee to pass through ‘simple’ decisions quickly or who meet on an infrequent basis to make ‘major’ decisions that have been discussed at length at a fuller committee
expectation: see expected value
expected frequency: the number of events that would be expected to occur within a set of constraints (usually the constraint is the null hypothesis). The term refers particularly to expected cell frequencies (as opposed to observed frequencies) in contingency tables
expected number: see expected frequency
expected outcome: in a statistical sense, see expected value. Otherwise the term is used in a general sense to refer to what outcome (or course of a disease) would generally be expected to occur. See also prognosis
expected value: the value of a parameter that an estimator predicts. For example, the expected value of the sample mean is the population mean, although the expected value of the sample variance is not quite the population variance; there is a small bias (which can be corrected)
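The example in the expected value entry can be checked exactly for a small case. The sketch below is only an illustration under stated assumptions: the population values are hypothetical, samples of size 3 are drawn with replacement (so every ordered sample is equally likely), and the "sample variance" with the small bias is taken to be the version that divides by n, with division by n - 1 as the correction.

```python
from fractions import Fraction
from itertools import product
from statistics import mean

# Hypothetical population; Fractions keep every calculation exact.
population = [Fraction(x) for x in (2, 4, 4, 6, 9)]
n = 3  # assumed sample size


def variance(values, divisor):
    """Sum of squared deviations from the mean, divided by `divisor`."""
    centre = mean(values)
    return sum((v - centre) ** 2 for v in values) / divisor


# Every ordered sample of size n drawn with replacement is equally likely,
# so a plain average over all of them is the expected value.
samples = list(product(population, repeat=n))

population_mean = mean(population)
population_variance = variance(population, len(population))

expected_sample_mean = mean(mean(s) for s in samples)
expected_variance_n = mean(variance(s, n) for s in samples)
expected_variance_n_minus_1 = mean(variance(s, n - 1) for s in samples)

print(expected_sample_mean == population_mean)             # no bias in the mean
print(expected_variance_n, population_variance)            # biased downwards
print(expected_variance_n_minus_1 == population_variance)  # bias corrected
```

Under these assumptions the divisor-n variance has expected value (n - 1)/n times the population variance, which is the small bias the entry refers to.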

expedited report: a report that must be made very quickly. It usually refers to reporting serious adverse events to regulatory authorities, sometimes within two or three days of the event occurring
experiment: a general term that encompasses preclinical studies, clinical trials, animal studies, etc. It covers almost any form of practical research that involves intervention. Contrast with observational study
experimental design: all aspects of the design of an experiment. Sometimes the term is restricted to certain specialised statistical aspects of the design such as blocking, replication and stratification
experimental drug: see experimental treatment
experimental error: see residual variance
experimental treatment: usually the product that is of primary interest and that is being compared with the comparator treatment
experimental unit: this usually means each subject but is best thought of as the smallest unit that could be randomised. Even in studies that do not involve randomisation, it is still helpful to think in these terms. In community intervention studies the experimental unit might be an entire town; in other situations it could be a hospital ward or a General Practitioner’s surgery. See also unit of analysis
experimenter effect: see Hawthorne effect
experimentwise error rate: the probability of making a Type I error when considering the overall result of a study. Note that, if a study has several endpoints to be analysed, even if one or more of those analyses may result in a Type I error, the overall conclusion from the study could still be correct. Contrast with comparisonwise error rate. See also multiple comparisons
expert report: a regulatory document that summarises the complete set of documents on the safety and efficacy of a product submitted for regulatory approval in a new indication
expert review: a review of documents, study results, etc. by an expert. See also peer review
expert system: a computerised method of making decisions that is more complex than a simple algorithm; it is a method that is capable of ‘learning’ by building upon past decisions and their outcomes
expiry date: the date after which a product should not be used because its quality cannot be assured. See also shelf life
explained variance: in a set of data there will usually be variation between data points. Some of this variation will be due to differences between subjects, differences between points in time, differences between treatments, etc. The variation that is due to such known causes is the explained variance; the variation that is due to unknown causes is called the residual variance, or simply the variance
explanatory study: a study that aims to find out if an intervention can work, given ideal circumstances, or to find the circumstances under which an intervention works. The analysis of such studies is usually by the per protocol approach. Contrast with pragmatic study
explanatory variable: see covariate
exploratory data analysis: methods of reviewing data to find potential errors and to gain simple impressions of patterns that may exist or effects that may be happening. The methods are usually graphical and include box and whisker plots, histograms, stem and leaf plots
exploratory study: a study that aims to generate hypotheses rather than to definitively test them
exponent: in a mathematical equation of the form y = x^z (‘x raised to the power z’) the parameter z is called the exponent
exponential: see exponential growth
exponential decay: a quantity that is diminishing at an ever-decreasing rate (Figure 7). Contrast with exponential growth
Figure 7 Exponential decay. For each equal sized change in the value of x, the value of y falls by the same proportion
exponential distribution: the probability distribution that describes the time interval between randomly occurring events. It is an important distribution in the analysis of survival data
exponential growth: growing at an ever-increasing rate; for example, the number of cancerous cells in a tumour may double every week, or may increase tenfold every week (Figure 8). Contrast with exponential decay
Figure 8 Exponential growth. Rate of growth of cancerous cells. The number of cells multiplies by the same factor after each additional day
exposed group: in a clinical trial this term is sometimes used to refer to the group receiving the experimental treatment. The term more naturally comes from case-control studies and refers to the cases
exposure: the extent (amount and length of time) for which a subject has received medication or other intervention (including possibly harmful interventions)
exposure variable: the variable that measures exposure
external consistency: a study whose results are applicable to, and match what is seen in, other studies and in clinical practice. All studies should obviously have this feature but many do not because they use inclusion criteria and exclusion criteria that are either different to those of other studies or are not reflected in clinical practice. Contrast with internal consistency. See also explanatory study, pragmatic study
external validity: see external consistency
extrapolate: either formally estimating, via a statistical model, or informally judging what results will occur outside the range of data actually collected and analysed. This may involve extrapolating to a wider patient population than has been studied, extrapolating from animal studies to judge what will occur in humans, etc. (Figure 9). Contrast with interpolate
Figure 9 Extrapolation. A simple regression model predicting patients’ systolic blood pressure from their weight has been used to predict what the blood pressure of a 150 kg person (the large dot) would be
extreme value: the largest or smallest value in a set of data. Sometimes the extreme values (plural) are taken as several of the largest or smallest values

F
F distribution: a probability distribution used extensively for significance testing in analysis of variance. It is used to test whether two variances are equal but this can be put to use to compare the means across several groups
F ratio: see F statistic
F statistic: the value of the test statistic calculated from an F test
F test: a statistical significance test based on the F distribution
F to enter: the value of an F test required as a decision rule to enter a variable into a regression model when using forward selection or stepwise regression methods. Contrast with F to remove
F to remove: the value of an F test required as a decision rule to remove a variable from a regression model when using backward elimination or stepwise regression methods. Contrast with F to enter
fabricated data: data that are not real and have been presented fraudulently. See also fraud
face validity: a term usually used with reference to questions on a questionnaire. Face validity refers to whether a question seems to make sense to an expert in the field. It stems from the expression ‘on the face of it’. See also external validity, internal consistency
factor: another name for a categorical variable, usually (but not exclusively) one that is a covariate or a stratification variable, rather than one that is an outcome variable
factorial design: a study that compares two (or more) different sets of interventions. The simplest design uses Drug A versus Placebo A and Drug B versus Placebo B. Subjects will be randomised to one of four groups: Placebo A + Placebo B, Drug A + Placebo B, Placebo A + Drug B or Drug A + Drug B. This is a very efficient type of study because it not only allows the assessments of Drug A and Drug B in one study instead of two but also allows us to investigate the question of whether drugs A and B show any interaction
factorial study: a study of two or more interventions carried out in a factorial design

failure: the term sometimes used in place of event in survival data. It comes from studies of the time it takes for machine parts to cease working (or ‘failing’) but the term has been carried over to medical examples where we are looking at the time until an event such as death or relapse
failure time: the time until an event occurs, where the term failure has been used instead of event
false negative: the case when a test of some sort does not detect what it is supposed to detect. This can be a diagnostic test that fails to identify a patient who has a particular disease. The term is also sometimes used in significance testing to describe a Type II error. Contrast with false positive
false positive: when a test incorrectly detects something that is not real. In a diagnostic test this is identifying a patient as having a particular disease when they do not. In significance testing it is the same as a Type I error. Contrast with false negative
falsificationism: the act of falsifying data or results. See also fabricated data, fraud
familywise error rate: see experimentwise error rate
fatal: that which causes death. Contrast with lethal
feasibility study: see pilot study
Fibonacci dose escalation scheme: a commonly used method of determining what doses of a drug should be used in a dose escalation study. The successively increasing doses follow a Fibonacci series
Fibonacci numbers: numbers that follow a Fibonacci series
Fibonacci series: a series of numbers that increase by successively adding the previous two numbers to get the next one. For example, 1, 1, 2 (= 1 + 1), 3 (= 2 + 1), 5 (= 3 + 2), 8 (= 5 + 3), 13 (= 8 + 5), 21 (= 13 + 8), 34 (= 21 + 13), . . .
fiducial inference: a method of statistical inference similar to significance testing. See also Bayesian inference, frequentist inference
field study: a term used to describe a study that is not conducted in a hospital or similar type of well controlled environment but rather one that is carried out in general practice with patients free to carry on their normal daily activities. The analogy of an agricultural study being carried out either in a greenhouse-type environment or in a field where the climate and other environmental factors cannot be controlled is a good one. See also experiment
figure: the term is used to refer to a number, or to a graph or diagram in a study report. The use can be confusing as different people assume it means different things
file: a physical or electronic (on a computer) place where documents and data are stored
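As an illustration of the Fibonacci series entry above, the rule of adding the previous two numbers to get the next one can be written as a short Python sketch. The function name `fibonacci_series` is hypothetical, and the sketch generates only the number series; it does not attempt to describe how any particular dose escalation scheme turns the series into doses.

```python
def fibonacci_series(count):
    """Return the first `count` numbers of the series 1, 1, 2, 3, 5, ..."""
    series = []
    previous, current = 1, 1
    for _ in range(count):
        series.append(previous)
        # The next number is the sum of the previous two.
        previous, current = current, previous + current
    return series


first_nine = fibonacci_series(9)
print(first_nine)  # [1, 1, 2, 3, 5, 8, 13, 21, 34]
```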

final data analysis: the final analyses of a study that are reported. These may be done after various forms of exploratory data analysis have been completed
final report: another term for study report but the use of this term can be useful to distinguish it from an interim report or a draft of a study report
fine data: data measured with great accuracy. Contrast with coarse data
finite: having real bounds. The term is sometimes overused because most of what we do is finite. The use of the word can only really be justified if it genuinely contrasts with the possibility of being infinite
finite population: for the purposes of most statistical analyses, it is assumed that there are an infinite number of subjects to which the study results apply. This assumption is partly justified on the grounds that the possible set of subjects having the target disease includes all those with the disease today and all those who will have the disease in the future. In some situations this is not a sensible assumption and it must be recognised that there is a finite number of subjects in the population to which our results can apply. Contrast with infinite population
first in man study: the first Phase I study undertaken with a new drug
first order interaction: see two factor interaction
first pass metabolism: the metabolism of a drug on its first passage through the liver, after absorption and before it reaches the general circulation
Fisher’s exact test: a statistical significance test that is used for comparing proportions in contingency tables. It is used in preference to the chi-squared test when the sample size is small (often less than 30)
fishing expedition: see data dredging
fit: to estimate the parameters of a model from data
fitted value: the estimated value of a parameter based on a model. Contrast with observed value. See also empirical result
fixed combination therapy: a mixture of two (or more) drugs in one formulation. Contrast with free combination therapy
fixed cost: in pharmacoeconomics this refers to a cost that will remain the same however many patients there may be or in whatever way they may be treated. One might argue that the pharmacy department in a hospital needs to be open 24 hours a day whether it stores drugs for a certain disease or not: there is, therefore, a certain minimum fixed cost for this facility. Contrast with marginal cost, per unit cost, variable cost
fixed disk: see hard disk
fixed effect: a categorical variable where the different levels of the factor are exactly the ones that we wish to draw conclusions about. See also fixed effects model. Contrast with random effect

fixed effects model: a statistical model that assumes we wish to make inferences about the particular levels of a factor used in the study, and no others. This is particularly relevant when including study centre as a factor in the analysis: do we wish our results to be applicable only to those centres that took part in the study, or do we wish to consider those centres to be a random selection of all the centres that might have taken part so that the results can be applied to all possible centres? The fixed effects approach assumes the first case. Contrast with random effects model
fixed sample size design: a design that determines the number of subjects to be recruited before the study starts and does not allow the number to be changed. This is the most common type of approach to determining how many subjects should be in a study. See also group sequential design, interim analysis, sequential design
flat file: a computer datafile that can be thought of as like a matrix, usually with each row representing one subject and each column representing one variable. Contrast with hierarchical database
floor effect: an asymptote that is a lower limit. Often zero will be that lower limit. Contrast with ceiling effect
floppy disk: a form of computer disk that is easily portable and is intended to be slotted in or out of a computer rather than being a permanent fixture (as in hard disk)
flow diagram: a diagram showing a series of activities occurring across time (Figure 10)
flowchart: see flow diagram
follow-up: the process of collecting data after some activity has taken place. This often simply means gathering data after subjects have been randomised, or it may mean collecting data after treatment has been stopped to monitor safety or relapse of symptoms
follow-up data: data that are collected as a result of follow-up
follow-up period: the time during which follow-up occurs. This may simply be the time that patients are in a study from randomisation until their last visit
follow-up visit: any visit during a follow-up period of a study
for cause audit: an audit that is carried out because of some suspicion of poor quality work or of fraud. Contrast with no cause audit
form: see case record form
formulation: the way in which a product is manufactured and presented. Examples include tablets, capsules, injections. See also product
FORTRAN: a very powerful but quite old computer programming language. See also BASIC, Visual Basic, C, C++

Figure 10 Flow diagram. The sequence of events to follow in Zelen’s randomised consent design, seeking consent in conjunction with randomisation
forward selection: a method of arriving at a regression model when several possible covariates might be included. The method begins by selecting the variable that makes the greatest contribution to reducing the residual variance (subject to some minimum criterion) and putting this in the model. Then the variable giving the next greatest reduction in variance (again, subject to a minimum criterion) is found and included in the model. The process continues until either all the variables are in the model or no more meet the minimum criterion for being included. The minimum criterion is referred to as F to enter. See also all subsets regression, backward elimination, stepwise regression
forward stepwise regression: see forward selection
fourfold table: see two-by-two table
fourth hurdle: some regulatory authorities, in addition to requiring demonstration of quality, safety and efficacy, require evidence of additional value for money of a new product. This is called the ‘fourth hurdle’. See also pharmacoeconomics
frailty model: a statistical model that assumes different individuals have different probabilities of being unobserved. The term is most often used with respect to survival times where it is expected that there will be some censored data. Survival models assume that the probability of censoring is the same for every subject but frailty does not make that assumption
frame: see sampling frame
fraud: the act of intentional and dishonest deception. See also fabricated data
fraudulent data: see fabricated data
free combination therapy: a mixture of two (or more) drugs that are intended to be taken together but which are not combined in one formulation. Contrast with fixed combination therapy
frequency: the number of times a particular event occurs or a particular data value is observed. Contrast with relative frequency
frequency distribution: the number of times each of several events occurs or the number of times each of many different data values occurs. See also frequency polygon, frequency table, histogram
frequency polygon: a diagram for representing a frequency distribution. Each of the data values is placed along the x axis and the number of times each occurs is plotted as a point on the y axis. These points are then joined to form a polygon (Figure 11). Contrast with histogram
Figure 11 Frequency polygon. Distribution of the number of years a group of 87 patients had suffered from eczema. Only the outline of the histogram is plotted
frequency table: a numerical summary of a frequency distribution showing the number of times each data value occurs. Sometimes this may be enhanced to also show the percentage of occurrences, the cumulative frequency and the cumulative percentage of occurrences. All of these features are shown in Table 6

Table 6 Frequency table of extent of body surface area affected by eczema in 157 patients

Body surface area affected   Frequency   Percentage   Cumulative frequency   Cumulative percentage
No involvement                      42         26.8                     42                    26.8
<10%                                48         30.6                     90                    57.3
10–29%                              25         15.9                    115                    73.2
30–49%                              14          8.9                    129                    82.2
50–69%                              16         10.2                    145                    92.4
70–100%                             12          7.6                    157                   100.0

frequentist inference: an approach to data analysis that produces estimates of parameters, confidence intervals and significance tests. See also Bayesian inference, fiducial inference
Friedman’s test: a nonparametric significance test for testing the null hypothesis that all of several treatments given to the same subjects have the same distribution of responses. Informally, this can be thought of as the nonparametric equivalent of repeated measurements analysis of variance
Friedman’s two way analysis of variance: see Friedman’s test
full analysis set: see intention-to-treat population
full model: see saturated model
fully compliant: a subject who takes or uses all medication exactly as prescribed in the study protocol
function: a mathematical equation
funnel plot: a type of graph for plotting summary results from many different studies. It is used in meta-analysis and in overviews to help try to detect publication bias (Figure 12)
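The columns described in the frequency table entry above can be rebuilt from the frequencies alone. The following Python sketch uses the six frequencies given in Table 6; the function name `frequency_table` is hypothetical, and the second category label is read here as "<10%", which is an assumption about the garbled original.

```python
# Category labels and frequencies as read from Table 6.
categories = ["No involvement", "<10%", "10-29%", "30-49%", "50-69%", "70-100%"]
frequencies = [42, 48, 25, 14, 16, 12]


def frequency_table(labels, counts):
    """Return rows of (label, frequency, %, cumulative frequency, cumulative %)."""
    total = sum(counts)
    rows = []
    running = 0
    for label, count in zip(labels, counts):
        running += count  # cumulative frequency so far
        rows.append((
            label,
            count,
            round(100 * count / total, 1),
            running,
            round(100 * running / total, 1),
        ))
    return rows


table = frequency_table(categories, frequencies)
for row in table:
    print(row)
```

Percentages are rounded to one decimal place, matching the precision shown in Table 6.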

Figure 12 Funnel plot. Summary odds ratios from 25 studies comparing the efficacy of a certain class of antidepressant with placebo. If no publication bias existed, we would expect to see a ‘funnel’ shape. There is some suggestion here that some small negative studies may have been missed because of the lack of studies in the bottom centre of the plot

G
Galbraith plot: see radial plot
Gaussian curve: see Normal distribution
Gaussian distribution: see Normal distribution
Gehan’s design: a design, typically in Phase II cancer studies, where no control group is used. The design initially recruits a small number of patients: if overwhelming evidence in favour or against efficacy is seen then the study stops. If the evidence is not conclusive either way, further patients are recruited in order to obtain a reasonable estimate of the treatment response rate
Gehan’s generalised Wilcoxon test: a nonparametric statistical significance test for comparing two survival distributions. See also Cox’s proportional hazards model, log rank test
gel: a vehicle for delivering topical treatments. Similar to cream but more solid. See also lotion, ointment
gender: synonym for sex
general linear model: see linear model
generalisability: the extent to which conclusions can be applied to a wide population. See also external validity
generaliseable: conclusions that have wide generalisability
generalised additive model: a method of producing models, similar to generalised linear models, that predict an outcome variable from several independent variables. In this case, the link function is a complex function of the data, rather than a theoretical link function, such as the Normal distribution or logistic function
generalised estimating equations: an extension to linear models particularly useful for modelling repeated measurements and further particularly suited to binary data and Poisson data
generalised linear model: an extension to linear models where a link function is introduced. This link function is a function of the response variable and, instead of modelling the response variable directly, the link function is modelled as a linear function of the independent variables

generic

ghost report

Figure 13 Ghost table. All of the row headings and column headings have been drafted out so it is clear what the table will look like when the data are available. In this example, the number of decimal places and signiﬁcant digits for each of the numerical values have also been indicated generic the fundamental, original, form. Often used to refer to drug names as generic name, in contrast to trade name generic name the name that the original manufacturer or developer gives to a drug. Û trade name genetics the study and description of genes and DNA Genie score a way of summarising multivariate data (usually used for laboratory data). The greater the score, the more deviation there is in a subject’s (laboratory) data from the relevant reference ranges geometric mean a measure of central tendency, particularly useful for highly skewed data. It is calculated as the nth root of the product of n numbers or, alternatively, as the antilog of the mean of the logarithms of all the numbers. G harmonic mean ghost report a draft of a report that contains no results but has all the section headings and some of the introductory text included. The 71

ghost table

grand total

intention of a ghost report is to be able to produce a ﬁnal report as quickly as possible after the data become available. A ghost report may also contain ghost tables ghost table the layout of a table indicating row and column headings but without any data (Figure 13). G ghost report Gini coeﬃcient a measure of variation most often used in describing income or salaries. Hence it has uses in pharmacoeconomics glossary a list of specialist terms referred to in a document, with their deﬁnitions goal a target such as the goal for the number of subjects to be recruited to a study gold standard a diagnostic test that is guaranteed to give the correct diagnosis. Also used to refer to a treatment that is widely recognised as the best available golden rule informal term for ‘most important rule’ Good Clinical Practice (GCP) a set of principles and guidelines to ensure high quality and high ethical standards in clinical research. G Good Laboratory Practice, Good Manufacturing Practice Good Distribution Practice (GDP) a set of guidelines to ensure high quality standards in warehouse storage and distribution work Good Laboratory Practice (GLP) a set of guidelines to ensure high quality standards in laboratory work. G Good Clinical Practice, Good Manufacturing Practice Good Manufacturing Practice (GMP) a set of guidelines to ensure high quality standards in manufacturing. G Good Clinical Practice, Good Laboratory Practice Good Regulatory Practice (GRP) a set of guidelines to ensure high quality standards in regulatory aﬀairs work Good Statistical Practice (GSP) a set of guidelines to ensure high quality standards in statistical work goodness of ﬁt a measure of agreement between a set of observed data and a model that has been ﬁtted to those data goodness of ﬁt test a statistical signiﬁcance test to compare whether one model ﬁts data better than an alternative model Graeco-Latin square a form of Latin square that balances for three sources of variation. 
See also Youden square
grand mean the mean of a set of numerical observations, regardless of which group (treatment group or other form of group) those data relate to. See also grand total
grand total the total of a set of numerical observations, regardless of which group (treatment group or other form of group) those data relate to. This equates to the grand mean multiplied by the number of observations
graph a pictorial representation of data plotted on an x axis and y axis, and sometimes on a z axis too. See also scatter plot
graphic a general term for diagrams, graphs, sketches, etc.
Greenhouse–Geisser correction an adjustment made to the degrees of freedom in an F test of within subjects effects in repeated measurements analysis of variance. It is assumed that the pattern of correlation is constant over time and this adjustment is required if the assumption is not valid. See also Huynh–Feldt correction
group one of the strata in stratified data. The term is frequently used to refer to subsets such as the treatment group or the placebo group (those treated with active treatment or those treated with placebo, respectively). It can be used to refer to other strata such as the males or females, the ‘high risk group’, ‘low risk group’, etc.
group data the subset of an entire set of data that relates to only one group. For example, all the data from subjects treated with placebo or all the data from female subjects
group ethics see collective ethics
group matching usually in matching we refer to matched pairs. However, with group matching we imply that, overall, two (or more) groups of subjects are typically quite similar in terms of their demographic data, disease profile, etc.
group randomisation see cluster randomisation
group sequential analysis special types of analysis that are appropriate for group sequential studies
group sequential design a form of sequential design where interim analyses are carried out after a number of subjects have been recruited into a study. Usually only two or three analyses would be planned into such a study, after either half the subjects or one third and two thirds of the subjects have completed the study. See also O’Brien and Fleming rule, Pocock rule
group sequential study a study designed as a group sequential design
group sequential test a statistical significance test carried out in group sequential designs
grouped data see categorical data
growth curve a graph that traditionally plots the progress of some feature of growth over time. Growth could be measured by height or weight. The term now has a broader use to include any variable that systematically changes (usually increases) over time
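The relationship stated in the grand mean and grand total entries above (the grand total equals the grand mean multiplied by the number of observations) can be checked with a short Python sketch. The group names and values are made up for illustration, and the function names are hypothetical.

```python
def pooled_values(groups):
    """Flatten all observations, ignoring which group they belong to."""
    return [x for values in groups.values() for x in values]


def grand_total(groups):
    """Total of all observations regardless of group."""
    return sum(pooled_values(groups))


def grand_mean(groups):
    """Mean of all observations regardless of group."""
    values = pooled_values(groups)
    return sum(values) / len(values)


# Hypothetical data: two treatment groups of unequal size.
groups = {"active": [4.0, 6.0, 8.0], "placebo": [2.0, 5.0]}
n_observations = len(pooled_values(groups))

# The grand total is the grand mean multiplied by the number of observations.
print(grand_total(groups), grand_mean(groups) * n_observations)
```

Note that the grand mean is not, in general, the mean of the group means when the groups differ in size.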

guardian see legal guardian
guesstimate an informal term to describe a result that is largely a guess but that supposedly has some data used to help form that guess. It is not an estimate in the formal sense of the word but it is supposed to be better than a guess based on no knowledge (or data) at all
guideline a set of suggested rules but ones that are not enforceable by any laws. In practice, one would be foolish to ignore many of the regulatory guidelines that have been written. See also Good Clinical Practice
guinea pig a subject who is part of a study may be referred to as a guinea pig. The term is used disparagingly by those who do not approve of the particular study involved (or who do not approve of research on humans generally); or it is used light-heartedly. Given these two extremes of meaning, it is a term best avoided
Guttman scale a method of combining answers to individual questions to arrive at an overall score (sometimes called a composite score). Each question may be weighted differently, so it is not simply the sum of the individual question responses


H

H0 symbol for null hypothesis
H1 usual symbol for alternative hypothesis
HA alternative (less often used) symbol for alternative hypothesis
haematology the study of the makeup of blood. Usually used in the context of laboratory data to refer to such parameters as platelets, red blood cell counts, etc. Contrast with biochemistry
half life the amount of time that a radioactive substance takes to decay to half its original quantity or a drug takes to halve its concentration in the body. In many situations, four half lives might be considered a reasonable time to reduce the original concentration to a minimal quantity (four half lives leaving 1/16 of the original amount)
halo effect an informal term to describe psychosomatic effects that often occur when patients believe that the doctor will be able to give them benefit. See also placebo effect
handbook a book of instructions for using a machine, for running a study or for general work practices
haphazard unpredictable but not in the highly controlled sense of random
haphazard sample a sample of people (or items). The members of the sample are not chosen for any particular reasons, just as they happen to present themselves. Haphazard samples often display various patterns that would not be seen in a truly random sample. See also convenience sample
haphazard treatment assignment a method of assigning treatments to subjects that is not controlled or predictable. Like haphazard samples, haphazard treatment assignment often displays various patterns that would not be seen in truly random assignment
hard data see objective data
hard disk a form of computer disk that usually resides inside the computer and is not intended to be moved between different computers. Hard disks have much larger capacity than floppy disks or diskettes
hard endpoint see objective endpoint
hard measurement see objective measurement
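The arithmetic in the half life entry above can be sketched in Python. The function names, the starting concentration and the six-hour half life are illustrative assumptions, not values from the dictionary; the sketch also assumes simple exponential decay.

```python
def fraction_remaining(n_half_lives):
    """Fraction of the original amount left after n half lives."""
    return 0.5 ** n_half_lives


def concentration_after(initial, half_life, elapsed):
    """Concentration after `elapsed` time, assuming simple exponential decay.

    `half_life` and `elapsed` must be in the same time units.
    """
    return initial * fraction_remaining(elapsed / half_life)


# After four half lives, 1/16 of the original amount remains.
print(fraction_remaining(4))

# Hypothetical drug: 100 units initially, half life of 6 hours, 24 hours later.
print(concentration_after(100.0, 6.0, 24.0))
```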

hard outcome a response to an intervention that can be measured using objective data. Contrast with soft outcome
hardware the mechanical, electrical and electronic components of a computer such as the screen, the processor, disk drives, keyboard, etc. See also software
harmonic mean a measure of central tendency used for skewed data. It is calculated using the reciprocals of the data, namely $H = n \big/ \sum_{i=1}^{n} (1/x_i)$. See also geometric mean
Hawthorne effect the response that is often seen in subjects taking part in a study and produced simply because they know that they are being observed; however, it is not a true effect of any intervention. The strict definition given for effect is particularly important to note. See also placebo effect
hazard see hazard function
hazard function in survival analysis the probability of a given event (such as death) occurring at each instant in time, given that the event has not already happened. See also Cox’s proportional hazards model
hazard rate the hazard function at any particular point in time
hazard ratio the ratio of two hazard rates or of two hazard functions, either at a particular point in time or averaged over a long period. See also Cox’s proportional hazards model
health the general state of wellbeing or lack of wellbeing in an individual or a group of individuals
health economics see pharmacoeconomics
health services research research into the provision of health care, including aspects of cost, need, resources, supply and outcome. Strongly linked with pharmacoeconomics
healthy subject see healthy volunteer
healthy volunteer a subject who volunteers to take part in a study but who does not have any significant disease. Such subjects often participate in Phase I studies. Note that all subjects who take part in clinical trials should do so voluntarily; for this reason, the term should not be abbreviated simply to ‘volunteer’ (although it often is). Contrast with patient
healthy worker effect a form of volunteer bias. Subjects who have employment (those that are workers) tend to be healthier, on average, than the general population (which includes those who do not work, through choice, old age, disability, etc.)
Heisenberg effect a term from physics that says that the act of observing and measuring a process affects that process so that absolute effects are impossible to measure. This is one of the reasons why we need comparison groups in studies. See also Hawthorne effect

Helmert contrasts a particular type of contrast where each level of a factor is compared with the mean of all other levels of that factor. For example, if three ethnic groups are represented in a study the response variable could be investigated to see if it is affected by ethnic group. The mean response in ethnic group 1 would be compared with the mean of the combined data from ethnic groups 2 and 3; ethnic group 2 would be compared with the mean of the combined data from ethnic groups 1 and 3; ethnic group 3 would be compared with the mean of the combined data from ethnic groups 1 and 2. See also analysis of variance, multiple comparisons
hepatic metabolism metabolism (of product) through the liver. See also renal metabolism, pharmacokinetics
heterogeneous a term used to mean that the variation of a measurement within a group is different from the variation of that same measurement within other groups. Contrast with heteroscedastic, homogeneous
heteroscedastic unequal variances of data values of the same variable. For example, the variation in the measurement of a person’s age usually changes with age; age of newborns may be measured in hours or days, age of infants in months, adults in years. Contrast with heterogeneous
heuristic using intuition and judgement
hierarchical nested, meaning built up in layers
hierarchical database a computer database that has several levels of data. For example, the highest level may be the subject level recording basic demographic data for each subject. For each subject, other levels may contain data on the diseases they have and, for each disease, the treatments they have been given. Contrast with flat file
hierarchical models two statistical models for the same data but one has extra covariates that are not included in the other
high level term a classification of signs, symptoms and diseases (particularly used in MedDRA) giving a coding that is less detailed than the preferred term but is more detailed than the system organ class
high order interaction a general term used to refer to an interaction (in the statistical sense) that is not a two factor interaction but involves at least three factors
highest density region in Bayesian statistics the middle region of a posterior distribution used for determining interval estimates. See also credible interval, confidence interval
highest posterior density a method in Bayesian statistics for determining a point estimate of a parameter
high–low graph a graph for plotting one continuous variable (usually on the y axis) against one categorical variable (usually on the x axis). It shows the mean and/or median and the minimum and maximum values of the continuous variable for each value of the categorical variable (Figure 14)
Figure 14 High–low graph. The mean pulse rate in 100 patients with ischaemic heart disease is plotted at each of five visits. Additionally, the minimum and maximum values at each visit are plotted. Note that the patient with highest pulse at visit 1 is not necessarily the one with the highest pulse at any other visit
hinge see quartile
Hippocratic oath a promise to act to certain high ethical and medical standards. Traditionally it is thought of as being sworn by all doctors when they qualify but this is not actually the case
histogram a graphical method for plotting a frequency distribution similar to a bar chart. Whilst a bar chart is typically used for categorical data, a histogram is more usually used for continuous data (Figure 15)
historical control a control group that has not been randomised but consists of patients treated in the past. Contrast with concurrent control
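The numbers that a high–low graph displays (a central value plus the minimum and maximum for each category) can be computed before any plotting. The Python sketch below uses made-up pulse rates for three visits; the function name and the data are illustrative only and are not the values behind Figure 14.

```python
from statistics import mean


def high_low_summary(values_by_category):
    """Return (minimum, mean, maximum) of the continuous variable per category."""
    return {
        category: (min(values), mean(values), max(values))
        for category, values in values_by_category.items()
    }


# Hypothetical pulse rates (beats per minute) recorded at each visit.
pulse_by_visit = {
    "visit 1": [60, 72, 90],
    "visit 2": [58, 70, 82],
    "visit 3": [62, 66, 70],
}

for visit, (low, centre, high) in high_low_summary(pulse_by_visit).items():
    print(visit, low, centre, high)
```

Because each visit is summarised separately, the subject who supplies the maximum at one visit need not be the one who supplies it at another.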

Figure 15 Histogram. Distribution of the number of years a group of 87 patients had suffered from eczema
historical control group a comparator group that has not been assigned by randomisation but which consists of patients treated in the past (sometimes patients who were not treated). This is a much less desirable method of making comparisons and is prone to many forms of unpredictable bias but it is a much easier source of comparisons than setting up a randomised study
hold constant when analysing data and producing adjusted estimates (by analysis of covariance or some other method) it is convenient to think of the result as what would have been observed had a particular covariate taken the same value for every subject. This is sometimes described as that covariate being ‘held constant’
home visit in many studies, subjects are seen at hospital, at their own general practice or at some other kind of health centre. Particularly in community studies, a home visit is when a nurse, doctor or other health professional assesses the subject in their own home
homeopath one who practises homeopathy

homeopathy a treatment regimen that involves exposing patients to trace amounts of a chemical that would, in large enough doses in healthy people, produce symptoms of the disease that is being treated
homogeneous the variation of a measurement within a group being similar to the variation of that same measurement within other groups. Contrast with heterogeneous, homoscedastic
homoscedastic equal variances of data values of the same variable. For example, the variation in the measurement of a person’s weight would not be expected to vary between different treatment centres (even though the mean might vary considerably). Contrast with heteroscedastic, homogeneous
hot deck a method of imputing for missing data based on other nonmissing data
Hotelling’s T² test a statistical significance test for comparing the means of two multivariate distributions. It could be used, for example, when a subject’s ‘size’ is measured on three variables: height, weight, and head circumference. Rather than three separate t tests, the Hotelling test compares ‘size’ rather than separately comparing height, weight, and head circumference
Huynh–Feldt correction an alternative to the Greenhouse–Geisser correction in repeated measurements analysis of variance
hypothesis a statement for which good evidence may not exist but which is to be the subject of an experiment. A common example in clinical trials would be that ‘Drug A shows an effect identical to that of placebo’. This is clearly a statement; it may be true or false; it can be tested in an appropriately designed experiment. See also alternative hypothesis, null hypothesis
hypothesis generating study a study that is not intended to answer specific questions but rather to produce data that can be looked at in various ways to suggest interesting questions (or hypotheses) to be researched in subsequent experiments. Many studies may be run with the intention of answering a small number of hypotheses and to generate further ideas. Contrast with definitive study
hypothesis test a statistical process to determine the strength of evidence in favour of, or against, a particular hypothesis. There are many types of hypothesis test for use in different situations and for addressing different types of question. See also nonparametric test, parametric test; and, for example, chi-squared test, F test, Mann–Whitney U test, t test, P-value
hypothesis testing the process of using a statistical hypothesis test to test a null hypothesis

hypothetical population a population that cannot be completely defined (it would not be possible to list the names of all the individuals in that population, for example) but that can be considered to exist for practical purposes. See also infinite. Contrast with finite population

I

iatrogenic describing a condition caused by the treatment given for another disease. Obvious examples are adverse reactions
id see subject id
identification number see subject identification number
ignorable missing data data values that, despite being missing, do not introduce any bias into the analysis and results of a study. See also missing completely at random, missing at random. Contrast with informative missing data, nonignorable missing data
ignorable missingness the process that produces ignorable missing data
ignorant prior in Bayesian statistics, a prior distribution that gives no (or very little) information. See also improper prior, reference prior
imbalance lack of balance or not balanced
immune not susceptible to a disease
immune system those parts of the body, particularly antibodies, that help to protect or fight against infection
immunise to make someone immune to a particular disease. This may occur either naturally or artificially by inoculation
impartial witness someone who observes an event (usually that of giving informed consent) but who has no involvement with the study
improper prior in Bayesian statistics, this is a prior distribution that is not a valid probability distribution but which can still be used as if it were. In general it states that our prior belief about a parameter is that it lies somewhere between minus infinity and plus infinity. As this does not tell us much about the parameter, it is sometimes called an ignorant prior. See also reference prior
imputation the process of imputing
impute to fill in data values (usually missing data) with values that are thought to be sensible. There are several ways of doing this; many make valid assumptions, many make very questionable assumptions. Some methods rely on calculations based on the remaining data, some rely on intuition and guesstimates. The most common example is probably the concept of last observation carried forward
in vitro in a test tube (or similar). Contrast with in vivo
in vivo in living tissue. Contrast with in vitro
inactive control a placebo. Use of the term ‘control’ indicates that some intervention (even if only placebo) is implied. The term would not generally be used to refer to a control group that received no treatment at all
incidence the number of new cases (of a disease) that occur in a specified period of time. Contrast with prevalence
incidence rate the number of new cases of a disease in a period of time, divided by the number of subjects at risk of the disease. Contrast with prevalence rate
incident see event
inclusion criteria the requirements that a subject must fulfil to be allowed to enter a study. These are usually devised to ensure that the subject has the appropriate disease and that he or she is the type of subject that the researchers wish to study. Inclusion criteria should not simply be the opposites of the exclusion criteria
incomplete block a block of treatment (or treatment sequences) that does not contain all of the possible treatments (or treatment sequences) to which subjects in the study may be randomised. Contrast with complete block
incomplete block design a study that uses incomplete blocks of treatment. Although each block will necessarily be unbalanced (which may not be desirable), the study as a whole can still be balanced, as in a balanced incomplete block design
incomplete crossover design a crossover design where not all subjects receive all of the possible treatments
incomplete crossover study a study that is designed as an incomplete crossover design
incomplete factorial design a factorial design where not all combinations of the possible treatments are used
incomplete factorial study a study that is designed as an incomplete factorial design
increment an increase in value (commonly the dose of a drug or the draft number of a protocol). Contrast with decrement
incubation period the time between exposure to an infection and the appearance of clinical signs. See also sojourn period
independent if knowledge of one event or variable gives us no information (or even clues) about another event then the two events are said to be independent of each other. See also correlation
independent contrasts two (or more) contrasts that are independent of each other. If we were to compare the mean responses in three treatment groups (A, B and C), there are several possible contrasts that we could make. The simplest would be to compare each pair of treatments: mean(A) − mean(B), mean(A) − mean(C), and mean(B) − mean(C); however, if we know that A is greater than B, and that B is greater than C, then we immediately know that A must be greater than C. So these three contrasts are not independent of each other
independent ethics committee see research ethics committee
independent groups groups of subjects that are independent of each other. For example, a parallel group design uses independent groups but a crossover design does not
independent identically distributed (iid) a term used to describe values of a random variable that are independent of each other but which all come from the same underlying probability distribution. In a random sample of women, shoe sizes might all be independent, and all from the same distribution; if the sample contained men and women then, although the shoe sizes may all be independent, there might be two underlying distributions (larger shoes for men than women)
independent random variable see independent variable
independent samples see independent groups
independent samples t test a statistical significance test for testing the null hypothesis that the means of two populations are equal. Contrast with paired t test
independent variable another term for a covariate in a regression model. Note, confusingly, that several so-called independent variables may not be independent of each other, nor of the response variable (or dependent variable). In a regression model the response variable may depend on the independent variables but the independent variables are not dependent on the response variable. For example, blood pressure may be partially predicted from knowing a subject’s age, height, weight, etc.: these variables would be said to be the independent variables, whilst blood pressure is the dependent variable
index case a case (as in case-control study)
index group all of the cases in a case-control study
indexed file a term that might be obvious in keeping paper files but is more relevant in computer databases. It is a collection (a file) of data that has an index which allows direct access to the required items
indication the reason for using a product or other intervention. Synonym for disease
indicator variable in computing terms a variable that is a binary variable. Often a set of indicator variables may exist to describe the values of one categorical variable. If a subject is randomised to receive one of three treatments, two indicator variables can be set up: the first takes the value 1 (and the second 0) if the subject is randomised to Treatment A; the first variable takes the value 0 and the second 1 if the subject is randomised to Treatment B; otherwise, both variables are set to 0, indicating that the subject must have been randomised to Treatment C
indirect contact the contact of one person with another through a third party. Particularly relevant with infectious diseases, where the infection may initially be passed to someone in direct contact with the source of infection but these people may then pass the infection on further
indirect cost in pharmacoeconomics a cost incurred because someone has a certain disease, but not the direct cost of treating the patient. Loss of earnings and social security payments are often considered indirect costs
individual relating to a particular item (often, but not necessarily, a person)
individual ethics ethical behaviour that focuses on benefit to an individual rather than benefit to society. Contrast with collective ethics
individual matching finding cases and controls that have similar demographic data and/or disease profiles. For each case, one or more similar controls is sought for comparison. Contrast with group matching
individual variation variation in measurements of individuals, rather than of groups. See also within subjects variation. Contrast with between subjects variation
induce to draw a conclusion or a generalisation from specific examples of data. Contrast with deduce
induction see induce
inductive inference the process of drawing conclusions by induction. Contrast with deductive inference
inductive reasoning a less strong term than inductive inference
inequality a statement which says that two things are not equal. Sometimes there may be sufficient information to know that one item or quantity is larger or smaller than another; otherwise ‘not equal’ is all that can be said
inert having no (biological) action. Placebos are often considered as being inert
infection the implantation and growth of an organism
infectious a disease that can be passed on via direct contact or indirect contact with other people
infer see deduce
inference a conclusion drawn based on data and reasoning
inferential statistics the branch of statistical methods concerned with drawing conclusions from data, typically by use of statistical significance testing. Contrast with descriptive statistics
infinite without bounds. In numerical terms, a number larger than any other number can be. Contrast with finite. See also minus infinity, plus infinity
infinite population a population (which must be a hypothetical population) that contains an infinite number of individuals. For the purposes of statistical methods used in clinical trials, most populations are assumed to be infinite. Contrast with finite population
influence to contribute substantially to a decision or conclusion
influential observation a data point that has a lot of influence on a statistical model. Some outliers can be very influential observations but this is not always the case
informatics the science of handling and processing information (usually in the form of data)
information a term encompassing data but rather broader. Some say that the value of analysing data is to turn it into information
informative censoring censored data where the process of censoring tells us something about the state of a subject. If censoring is random then we know only that data are censored; if subjects withdraw from a study because they are too unwell to attend the clinic, or because they are free of any symptoms, then we may have censored data but in both cases there is information (negative or positive) in the censoring. See also informative missing data. Contrast with noninformative censoring
informative missing data missing data where the reason that the data are missing tells us something about the state of a subject. See also informative censoring. Contrast with noninformative missing data
informative prior in Bayesian statistics any form of prior distribution that is not a reference prior. See also proper prior
informed consent the practice of explaining to subjects and informing them about the purpose of a study and seeking their agreement to participate on a voluntary basis. See also Declaration of Helsinki, ethics, research ethics committee
injection a method of delivering liquid medication into the body. See also subcutaneous, intramuscular, intravenous
inlier a data value that does not seem to be true, given all the other data values, usually because it is too typical or normal. This is an odd concept but can be applicable to multivariate data. Contrast with outlier
inotropic effect the effect a drug has on the contraction of the heart. See also chronotropic effect
inpatient a patient who is treated in hospital and usually stays in hospital overnight. Contrast with outpatient
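The coding scheme in the indicator variable entry above (three treatments described by two binary variables) can be written as a short Python sketch. The function name is hypothetical and the treatment labels follow the entry's own example, with Treatment C as the category coded by two zeros.

```python
def indicator_variables(treatment):
    """Code one of three treatments as two binary indicator variables.

    Returns (is_a, is_b): Treatment A is (1, 0), Treatment B is (0, 1),
    and Treatment C is (0, 0).
    """
    if treatment not in {"A", "B", "C"}:
        raise ValueError(f"unknown treatment: {treatment!r}")
    return (int(treatment == "A"), int(treatment == "B"))


for treatment in ["A", "B", "C"]:
    print(treatment, indicator_variables(treatment))
```

Only two indicators are needed for three categories because the third category is implied when both indicators are 0.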

input device a device for getting data into a computer (this may simply be the keyboard or it may be a sophisticated blood analyser that feeds results directly into the computer)
input variable another term for a covariate or independent variable
inspection review of data and work practices by an independent reviewer (usually from a regulatory authority). See also audit
instantaneous rate the number of subjects who experience an event at a particular (small) point in time divided by the number who were at risk at that time. See also hazard function
institution in the context of clinical trials, a place where a study is undertaken; usually a hospital or similar establishment
institutional review board see research ethics committee
integer a whole number (1, 2, 3, etc.), including negative numbers (−1000, −6, etc.) but excluding any fractions or decimal numbers (such as −6.75)
integrity honesty (when applied to a person); correctness (when applied to data). Contrast with fraud
intelligence broadness of understanding of, and the ability to solve, problems (practical or theoretical). It is a term that can refer both to humans and other animals. Note, therefore, that it is not the same as ‘general knowledge’
intelligence test a series of questions and problems to measure intelligence. Often referred to as IQ (intelligence quotient) tests
intention-to-treat a term very similar to analysis by randomised treatment. It is a strategy for analysing study data which (in its simplest form) says that any subject randomised to treatment must be included in the analysis. This is not always easy, particularly in the presence of missing data. Contrast with per protocol analysis
intention-to-treat analysis see intention-to-treat
intention-to-treat population the subset of subjects recruited into a study who are included in the intention-to-treat analysis
interaction the joint influence of two or more independent variables on a response variable that is not simply the sum of the individual influences
interaction effect the difference in the size of the effect caused by two (or more) variables jointly, compared with the sum of the individual effects. For example, it is known that smoking and exposure to asbestos increase the risk of bronchial cancer. However, for smokers who are exposed to asbestos the risk is substantially higher than the sum of the individual risks. There is said to be an interaction between smoking and exposure to asbestos
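The interaction effect entry above compares a joint effect with the sum of the individual effects. The Python sketch below works on an additive scale with entirely made-up risks (cases per 1000 people); the figures are not real epidemiological estimates and the function name is hypothetical.

```python
def interaction_effect(baseline, a_only, b_only, both):
    """Joint effect minus the sum of the individual effects (additive scale).

    Each effect is measured as the excess over the baseline response.
    """
    effect_a = a_only - baseline
    effect_b = b_only - baseline
    joint_effect = both - baseline
    return joint_effect - (effect_a + effect_b)


# Hypothetical risks per 1000: neither exposure, smoking only,
# asbestos only, and both exposures together.
print(interaction_effect(baseline=10, a_only=50, b_only=30, both=200))

# If the joint effect were exactly the sum of the two individual effects,
# the interaction effect would be zero.
print(interaction_effect(baseline=10, a_only=50, b_only=30, both=70))
```

Whether an interaction appears can depend on the scale chosen; this sketch uses differences in risk, not ratios.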

intercept in a regression model this is where the regression line crosses the y axis (which is the value of y when x = 0)
interim part way through; before the entire (study) is completed
interim analysis a formal statistical term indicating an analysis of data part way through a study, usually in the context of group sequential studies. See also sequential analysis
interim look a less formal term than interim analysis, used to describe a broader range of analyses of data part way through a study. These may include formal interim analyses or less formal summaries of data, without necessarily having broken the blind
interim report this term may either be used informally to refer to a preliminary report (that is, not a final report) or more formally to mean the report of an interim analysis
interim result the results of an interim analysis or interim look
interim review a review of data part way through a study, often to check on data quality and completeness rather than in the sense of a formal interim analysis
intermediate variable a variable that does not measure exactly what we want to know but which is a second-best alternative. See also surrogate
internal consistency in questionnaires this is used to describe the situation where different questions find the same information; a simple example is to record age and date of birth. Note that both responses may be consistent with each other but also that both may be wrong. A similar usage applies to results in a study report where, again, two sets of results may be based on incorrect data and so may be wrong, but if the two results agree with each other, then the report would be said to have internal consistency. Contrast with external consistency
internal pilot study a form of pilot study where the data collected also form part of the data for the main study
internal validity a statement or result that is valid, given a set of assumptions. If those assumptions are not correct then that statement or result may not be true
International Classification of Diseases a coding system developed by the World Health Organization. Virtually every disease, illness, injury, etc. is given an alphanumeric code
Internet a worldwide computerised communications network and source of information
interobserver agreement the extent to which two (or more) people agree with each other when recording measurements. This can be important in multicentre studies where several investigators (possibly in several countries) are supposed to be assessing the same quantity. It is most often referred to in the context of subjective data rather than objective data. Contrast with intraobserver agreement
interobserver disagreement the extent to which two (or more) people disagree with each other when recording measurements. More commonly referred to as interobserver agreement. Contrast with intraobserver disagreement
interobserver variation the variation that almost always exists when more than one person measures the same quantity. This variation leads to interobserver disagreement. Contrast with intraobserver variation
interpolate to calculate an unknown value between two known values. This is most often done in a linear way but more complex methods exist. The practice is often used when looking up conversion values in tables and finding that the exact value to be converted is not tabulated. The required result may be approximately determined by interpolation. Contrast with extrapolate
interquartile between the lower quartile and the upper quartile
interquartile range a measure of variability. The value of the upper quartile, minus the value of the lower quartile
interrater agreement see interobserver agreement
interrater disagreement see interobserver disagreement
interrater variation see interobserver variation
interrelate see correlate
intersect the point on a graph where two curves cross each other. This also includes one curve crossing the x axis or y axis (see intercept)
interval the range between two data values. See also class interval
interval censored observation data that are censored within a time interval. Generally, in censoring it is assumed that a subject’s status is known until a particular time; thereafter it is censored. In interval censoring a subject may be seen once a week or once every three months, etc. and all that is known is that the subject’s data became censored some time during that interval
interval data see continuous data
interval estimate a range of values a parameter is likely to take that reflect the uncertainty and variability in measurements. The most common types of interval estimate are standard errors, confidence intervals and credible intervals. Contrast with point estimate
interval estimation the process of determining an interval estimate. Contrast with point estimation
interval scale see continuous scale
interval variable see continuous variable
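The table lookup described in the interpolate entry above can be sketched in Python using the linear method, which the entry names as the most common. The function name and the tabulated values are hypothetical.

```python
def linear_interpolate(x, x0, y0, x1, y1):
    """Estimate y at x from two known points (x0, y0) and (x1, y1).

    Assumes the relationship is a straight line between the two points.
    """
    if x0 == x1:
        raise ValueError("the two known x values must differ")
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical conversion table: 10 maps to 100 and 20 maps to 140,
# but the value needed (15) is not tabulated.
print(linear_interpolate(15, 10, 100, 20, 140))
```

The same formula applied to an x outside the range x0 to x1 would be extrapolation, which the dictionary treats as the contrasting term.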


intervene to take action, rather than to do nothing but observe
intervention the action that is taken when one intervenes. In clinical trials the most common type of intervention is to give treatment (or placebo)
intervention study an alternative term for a clinical trial. Compare observational study
interview a series of questions that are asked of a subject. Interviews may be held face to face or as telephone interviews
interview study a study carried out by interviewing subjects
intraclass correlation correlation between two measurements of the same variables, in the same subjects, taken at two different times
intraclass correlation coefficient the statistical measure of intraclass correlation. It is denoted r, as is the more usual correlation coefficient
intramuscular into the muscle tissue. A method of delivering drugs by injection. Compare intravenous, subcutaneous
intranet a type of wide area network that resembles the Internet but does not have unlimited public access
intraobserver agreement the extent to which the same person can repeatedly make the same measurement. As with interobserver agreement, this is more relevant with subjective data than with objective data. See also reliability
intraobserver disagreement the extent of disagreement between repeated measurements of the same quantity taken by the same person. More usually referred to in the context of intraobserver agreement
intraobserver variation the variation in a person’s repeated measurements of the same quantity that results in intraobserver disagreement
intrarater agreement see intraobserver agreement
intrarater disagreement see intraobserver disagreement
intrarater variation see intraobserver variation
intravenous into the blood stream. A method of delivering drugs by injection. Compare intramuscular, subcutaneous
intuitive a decision reached by use of judgement and experience rather than based on data
invariant lacking variation. The term is most often applied to a result that is found through analysis of data when that result holds for a variety of different methods of analysis and a variety of different assumptions about the data. It is the result that lacks variation, not the data
invasive entering into the body, for example by needle to give an injection or to take blood or by an endoscope to take a biopsy
inventory a list of items, typically of study materials, paperwork, medication, etc.


inverse correlation this is usually used synonymously with negative correlation but more precisely means the correlation between one variable and the reciprocal of another variable
inverse logarithm the reverse function of taking logarithms. For logarithms to base e, the inverse logarithm is the function $e^x$
inverse relationship strictly, this term should be used when one variable changes in relationship to the reciprocal of another. However, it is often used when one variable increases as another (on average) decreases; this is more correctly called a negative relationship
investigate to systematically observe and take measurements. Note that this does not necessarily encompass the term experiment
investigation a particular variable or collection of variables that are observed
investigational centre see investigational site
investigational device see medical device
investigational device exemption (IDE) an exemption similar to a clinical trial exemption certificate, issued to allow a medical device to be used in trials
investigational device study see medical device study
investigational new drug (IND) application an application to the US Food and Drug Administration (FDA) for permission to test a new drug in humans
investigational product the product (usually) that is being researched. See also experimental treatment
investigational site the place where the clinical work for a study takes place
investigator the person who carries out the investigation. The term is very commonly used to refer to doctors who see subjects in a study and administer medication and record progress. Sometimes the investigator is not medically qualified; he or she may, for example, be a microbiologist in a study of an antibiotic
investigator initiated study a study that is proposed and usually run and managed by an investigator rather than one that is proposed and managed by a pharmaceutical company. Compare sponsor initiated study
investigator’s brochure a document prepared by a pharmaceutical company, for use by investigators, that summarises all the known relevant data (including safety, efficacy, pharmacodynamics, pharmacokinetics, etc.) regarding an investigational product
isometric graph a graph that attempts to plot three dimensional data in two dimensions (Figure 16). See also contour plot, x axis, y axis, and particularly z axis


Figure 16 Isometric graph. A graph showing the relationship between height, weight and systolic blood pressure. In such graphs, the axis apparently going ‘into’ the page (in this case weight) is often referred to as the z axis


J
J shaped curve a curve on a graph that resembles the shape of the letter J (for example exponential growth), or a mirror image of it (for example exponential decay). The essential elements are that the curve is quite flat and then rises steeply, or that (in reverse form) it falls quickly and is then flat
J shaped distribution a distribution that either has a large peak of values and then a long tail (a high degree of positive skew) or a long tail before a peak of values (high degree of negative skew)
jackknife a statistical method of estimating parameters that helps to reduce bias in certain circumstances. The method calculates an estimate of the parameter of interest based on all the data except one observation; it then re-estimates the same parameter based on all the data except one other observation. The process is repeated until separate estimates have been calculated, each with the exclusion of one data point. These separate estimates are then combined
jackknife estimator an estimator that uses the jackknife method
joint distribution the distribution (either frequency distribution or probability distribution) of two (or more) random variables. To fully understand this distribution, we need to know the distribution of each of the variables separately and the correlation between them. See also bivariate distribution, multivariate distribution. Compare marginal distribution
joint frequency distribution see joint distribution
joint probability function see joint distribution
Jonckheere–Terpstra test a nonparametric statistical significance test for testing the null hypothesis of no trend in ordered categorical data between two or more groups
journal a regularly published document containing academic research, reviews, etc.
judgement use of intuition and experience instead of (or possibly in conjunction with) data
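The jackknife entry can be sketched in a few lines of Python. The entry does not say how the separate estimates are combined; the combination used here, n times the full-sample estimate minus (n - 1) times the mean of the leave-one-out estimates, is the usual bias-corrected form and is an assumption of this sketch, as are the function names and the sample values.

```python
from statistics import mean


def leave_one_out_estimates(data, estimator):
    """Re-estimate the parameter once for each observation left out."""
    return [estimator(data[:i] + data[i + 1:]) for i in range(len(data))]


def jackknife(data, estimator):
    """Bias-corrected jackknife estimate (assumed combination rule)."""
    n = len(data)
    full_estimate = estimator(data)
    loo = leave_one_out_estimates(data, estimator)
    return n * full_estimate - (n - 1) * mean(loo)


def biased_variance(values):
    """Variance with divisor n, which underestimates the true variance."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


sample = [2.0, 4.0, 4.0, 5.0, 7.0]  # hypothetical data
print(biased_variance(sample))             # divisor n
print(jackknife(sample, biased_variance))  # bias reduced by the jackknife
```

For this particular estimator the jackknife removes the bias completely: the result equals the usual sample variance with divisor n - 1.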


K
Kaplan–Meier curve a graph showing the cumulative probability of survival. See also Kaplan–Meier estimate
Kaplan–Meier estimate a nonparametric estimate of the cumulative probability of survival for a set of data (that may include censored observations)
Kaplan–Meier product limit estimate see Kaplan–Meier estimate
kappa (κ) coefficient an index (ranging from 0 to 1) of interobserver agreement
Kendall’s tau (τ) a nonparametric correlation coefficient. See also Spearman’s rho (ρ)
keystroke error pressing the wrong key on a computer keyboard. The term is most often used in assessing quality of data entry, where the number of keystroke errors may be taken as a measure of the quality of the working practice. It is one of the major reasons for doing double data entry
kilobyte one thousand bytes of computer information. Desktop computers can typically store at least a million bytes (or 1000 kilobytes). See also megabyte
kinetics see pharmacokinetics
Kolmogorov–Smirnov test a nonparametric statistical significance test for testing the null hypothesis that the location parameters of two groups are equal. See also Kruskal–Wallis test, independent samples t test
Kruskal–Wallis test a nonparametric statistical significance test for testing the null hypothesis that the location parameters of two or more groups are equal. See also Kolmogorov–Smirnov test, one way analysis of variance
kurtosis a measure of how highly peaked a distribution is. Distributions that have steeper peaks than the Normal distribution are called leptokurtic; those that are flatter than the Normal distribution are called platykurtic. The Normal distribution is sometimes described as being mesokurtic (Figure 17)
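The Kaplan–Meier estimate entry can be illustrated with a minimal sketch of the product limit calculation. The function name, the data layout (one list of times and one list of event flags) and the example values are assumptions made for illustration. Subjects censored at the same time as an event are counted as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Return [(time, cumulative survival probability)] at each event time.

    events[i] is True if the event was observed at times[i] and False if
    the observation was censored at that time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored
        if observed:
            # Multiply by the proportion of those at risk who survive time t
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months); False marks a censored observation
times = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]
for time, probability in kaplan_meier(times, events):
    print(time, round(probability, 3))
```

Plotting these points as a step function would give a Kaplan–Meier curve.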


Figure 17 Kurtosis. As distributions become more and more peaked they are called leptokurtic; as they become less peaked they are called platykurtic


L
L’Abbé plot a type of graph useful for plotting the results of many studies to assess how consistent they are with each other (Figure 18). See also meta-analysis, overview

Figure 18 L’Abbé plot. Response rates from nine studies comparing an antipsychotic with placebo in obsessive-compulsive disorder. The diagonal is the line of equality: points above the line indicate that the response to treatment was better than that to placebo whilst points below the line (just one in this example) indicate a higher placebo response than treatment response


laboratory a place where investigations and/or experiments are carried out. Traditionally, the laboratory was where experiments in chemistry or physics were done but the term is now used more broadly and may include, for example ‘computer laboratory’ or ‘speech laboratory’
laboratory data strictly any data that come from a laboratory. However, the term is usually used to refer to biochemistry, haematology and urinalysis data
lag waiting behind. The term is used in computer programming and in statistical time series methods
landscape a page that is wider than it is high, as in how (most) landscape pictures would be viewed. Landscape A4 paper is 297 mm wide and 210 mm high. Compare portrait
large sample method see asymptotic method
large scale trial see megatrial
Lasagna’s law the situation where the number of subjects eligible for a study apparently decreases when the study starts and increases again as soon as it ends. See also Münch’s law
last observation carried forward a method sometimes used to analyse studies with missing data. Consider the situation where subjects are due to be seen at several visits (say, each month for six months), with the endpoint of the study being the six month assessment. If a subject withdraws from the study at month four, then we may use that month four data to replace the (missing) month six data. That is, we take the last actual observation and carry it forward to the end of the study. Various scenarios are illustrated in Table 7. See also intention-to-treat
last visit analysis see last observation carried forward
last visit carried forward see last observation carried forward
latent period see sojourn period

Table 7 Individual subjects’ heart rates (beats per minute) at four consecutive visits and each subject’s value for the ‘last observation carried forward’

Subject id   Visit 1      Visit 2     Visit 3     Visit 4      ‘Last observation’
             (baseline)   (4 weeks)   (8 weeks)   (12 weeks)
1            98           99          94          89           89
2            80           72          Missing     Missing      72
3            83           83          80          81           81
4            95           90          Missing     95           95
5            110          88          80          Missing      80
6            88           Missing     Missing     Missing      88
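The rule behind the ‘last observation’ column of Table 7 can be sketched as follows. The function name and the use of `None` to stand for a missing value are assumptions of this example; the heart rates are those shown in Table 7.

```python
def last_observation_carried_forward(visits):
    """Return the most recent non-missing value, or None if all are missing."""
    for value in reversed(visits):
        if value is not None:
            return value
    return None


# Heart rates (beats per minute) from Table 7; None marks a missing visit
heart_rates = {
    1: [98, 99, 94, 89],
    2: [80, 72, None, None],
    3: [83, 83, 80, 81],
    4: [95, 90, None, 95],
    5: [110, 88, 80, None],
    6: [88, None, None, None],
}

locf = {
    subject: last_observation_carried_forward(visits)
    for subject, visits in heart_rates.items()
}
print(locf)
```

Subject 4 shows that a missing value in the middle of the series does not matter when the final visit was attended; subject 6 shows that the baseline value itself is carried forward when nothing else was recorded.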


Latin square an experimental design that balances for two sources of variation. In clinical trials, the two sources are usually subject and time. The example in Table 8 shows how four treatments (A, B, C, and D) could be compared in four subjects in four time periods. The essential feature is that every treatment appears only once in every row (each subject) and once in every column (each time period). See also crossover study, Youden square

Table 8 Latin square showing the sequence of four treatments (A, B, C and D) for four subjects in four periods

             Period
             1   2   3   4
Subject 1    A   B   C   D
Subject 2    B   A   D   C
Subject 3    C   D   B   A
Subject 4    D   C   A   B
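The essential feature of a Latin square, that every treatment appears exactly once in every row and once in every column, can be checked mechanically. The sketch below is illustrative only; the function name is hypothetical and the square is the one in Table 8, with one string per subject and one character per period.

```python
def is_latin_square(rows):
    """Check that each symbol occurs once in every row and every column."""
    n = len(rows)
    if any(len(row) != n for row in rows):
        return False  # the design must be square
    symbols = set(rows[0])
    if len(symbols) != n:
        return False  # the first row must contain n different treatments
    rows_ok = all(set(row) == symbols for row in rows)
    columns_ok = all(set(column) == symbols for column in zip(*rows))
    return rows_ok and columns_ok


table_8 = ["ABCD", "BADC", "CDBA", "DCAB"]      # subjects 1 to 4
not_latin = ["ABCD", "BADC", "CDBA", "DCBA"]    # period 3 repeats treatment B

print(is_latin_square(table_8))
print(is_latin_square(not_latin))
```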

law a set of rules. A relationship between a set of events and an outcome
law of averages an informal term that reflects the fact that probability distributions exist and in particular reflects the belief that any particular outcome will eventually be observed if enough data are collected. It is distinctly different from the law of large numbers (or central limit theorem)
law of diminishing returns see eighty–twenty rule
law of large numbers see central limit theorem
lay person someone who is not specifically trained in the subject being discussed but is nevertheless involved in discussing it. Research ethics committees will include lay members
lead time bias a term often used in assessing survival times when the method of detecting cases improves with time. Patients apparently survive longer than they used to but this is not due to better treatment; rather it is due to earlier diagnosis. This could be an important problem in evaluating a screening programme. For example, even with no change in clinical practice, because cases may be detected earlier than without screening, the survival time from diagnosis will increase because diagnosis is occurring earlier in the life cycle of the disease
lead-in period see run in period
learning curve a graph (rarely plotted, but frequently imagined) that plots time on the x axis and ability in a particular subject on the y axis. Sometimes such curves are J shaped curves, starting very flat and then rising steeply (suggesting it takes a long time before you can do anything, but then it all becomes clear); sometimes they start very steeply and then flatten off (suggesting it is easy to get started but learning the last few techniques becomes more and more difficult)
least significance difference test see Tukey’s least significant difference test
least squares a method of estimating parameters from data. It is based on choosing the value for that parameter that minimises the squared distance of each of the data values from the estimate of the parameter. Compare maximum likelihood
least squares estimate an estimate of a parameter obtained by the method of least squares. Compare maximum likelihood estimate
least squares mean the estimated mean of a variable obtained from an analysis of variance model or analysis of covariance model. It is the adjusted mean after adjusting for any other factors and covariates in the model
least squares method any statistical method based on the principle of least squares. Compare maximum likelihood method
left censored when measuring when an event occurs, the events that occurred before the study follow-up period (and so were not observed) are left censored. Compare right censored
left censored data when the time of an event is known but the instant of exposure may be known only to be before a given time and the exact time is not known. Left censored data are much less common than right censored data. See also censored data
left censored observation see left censored data
left skew see negative skew
left tail the values in a distribution that are small (typically taken as meaning less than the mode)
legal guardian someone who either permanently or temporarily is legally responsible for someone else’s health and well being. Compare next of kin
lethal will kill or extinguish life. Compare fatal
lethal dose the dose of a drug that will kill an individual
lethal median dose (LD50) the dose of a drug that will kill half of the subjects exposed to it
level of a factor one of the different values that a factor (a categorical variable) can take. For example, the factor gender usually has two levels: male and female
level of blinding whether a study is open label, single blind, double blind, triple blind, etc.
level of measurement the degree of detail with which measurements are recorded. In general, the different levels (in descending order of detail) are continuous data, ordinal data, categorical data and binary data


Table 9 Example of a life table. In this instance, the radix (which is merely a baseline number taken for convenience) is 10000

Age (years)   Survivors   Deaths between   Probability of dying
x             at age x    x and x + 1      between x and x + 1
0             10000       2325             0.2325
1             7675        957              0.1247
2             6718        471              0.0701
3             6247        260              0.0416
4             5987        155              0.0259
5             :           :                :
:             :           :                :
:             :           :                :
50            2971        77               0.0259
:             :           :                :
:             :           :                :
90            119         27               0.227
100           2           0                -
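The columns of a life table such as Table 9 are linked by simple arithmetic, sketched below under the usual definitions: deaths between x and x + 1 are the difference between successive survivor counts, and the probability of dying is that number divided by the survivors at age x. The function name is hypothetical; the survivor counts are the first five rows of Table 9.

```python
def life_table_rows(survivors):
    """Return (age, survivors, deaths, probability of dying) for each age
    that has a following survivor count."""
    rows = []
    for age in range(len(survivors) - 1):
        deaths = survivors[age] - survivors[age + 1]
        rows.append((age, survivors[age], deaths, deaths / survivors[age]))
    return rows


survivors = [10000, 7675, 6718, 6247, 5987]  # ages 0 to 4; radix 10000
for age, alive, deaths, probability in life_table_rows(survivors):
    print(age, alive, deaths, round(probability, 4))
```

Only ages 0 to 3 are returned here, because the deaths at age 4 cannot be derived without the number of survivors at age 5.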

level of significance in statistical significance tests this is the P-value (strictly speaking, specified before the calculations are carried out) that will be needed in order to declare a result as statistically significant. The most common cutoff value is 0.05 but 0.01, 0.001, etc., may also be used. Note that the level of significance is not the calculated (or observed) P-value
life expectancy the length of time that an individual (or group of individuals) is expected to live
life table a tabulation used to summarise life expectancy and probabilities of survival or death at different ages (or at different times after exposure to an intervention). An example of a life table is shown in Table 9. See also survival analysis
life table analysis methods used to analyse life tables, and particularly to compare survival curves between different groups of individuals and to assess the importance of prognostic factors on the length of survival. One of the most common methods is Cox’s proportional hazards model
life table method see life table analysis
lifetime the time between birth and death
lifetime prevalence the prevalence of a particular event when the period within which the prevalence is measured is a person’s entire lifetime. See also period prevalence


likelihood the probability of a set of observed data values, assuming a particular hypothesis (which is generally that they come from a particular probability distribution with specified parameters). Note that this is not the same as the probability of a given probability distribution, given a set of data. A variety of statistical procedures for significance testing and estimation are based on methods that use likelihood
likelihood function see likelihood
likelihood principle methods of estimating parameters and significance testing, based on likelihood functions
likelihood ratio the ratio of the likelihoods of two different hypotheses based on the same set of data
likelihood ratio test statistic a general form of statistical significance test based on the likelihood ratio. Simplistically, the hypothesis with the greater likelihood is more likely (sic) to be correct
Likert scale an ordinal scale where scores are assigned to the different categories in the style of (for example) 1 = condition worse, 2 = no change, 3 = slight improvement, 4 = marked improvement, 5 = condition cleared
limit see asymptote
line extension an addition to a range of products or the range of uses of a product. This may include alternative forms of presentation or new indications for use
linear in a straight line. See also curve
linear combination a combination of values that is gained by simple addition and subtraction of multiples of those values. It does not involve any multiplication or other nonlinear functions of the values. For example, $x + y$ is a linear combination of x and y but $x \times y$ and $x^y$ are not linear combinations
linear correlation see correlation
linear estimator an estimator that involves only linear combinations of data values
linear kinetics describes the pharmacokinetics of a product when the rates of absorption, distribution and elimination are each proportional to the dose of drug
linear model a statistical model (such as a regression model) that only has a linear combination of parameters
linear regression a regression model that is a linear model
linear transformation see linear combination
linear trend a steadily increasing (or decreasing) response when a covariate increases (or decreases). The trend is linear if, for a fixed change in the covariate, there is a fixed size change in response (Figure 19). See also dose response relationship

Figure 19 Linear trend. Pupil diameter after administration of a new test compound. One subject had five measurements taken one hour after receiving each of four separate doses of drug. Within the dose range used, the effect on pupil size seems to be linear

link function a transformation of data values used to try to make a nonlinear curve be a linear one. Used extensively in generalised linear models
linkage see record linkage
literature review a review of published studies and data relating to a particular topic. It is often the starting point for a new piece of research (to review the current and recent publications to find out what is known about a subject). It is also one of the first activities carried out in meta-analysis and overviews
ln see $\log_e$
loading dose a high dose of a drug that is initially given to quickly achieve a required therapeutic level. Thereafter, smaller doses (maintenance doses) are often sufficient to keep the amount of drug in the body within the therapeutic range
local area network a set of computers linked to each other to allow sharing of data and documents. The term ‘local’ is relative but tends to mean restricted to one site or building within a site. Compare wide area network. See also intranet
local laboratory a laboratory that is geographically close to where subjects are being investigated. Compare central laboratory
local research ethics committee a research ethics committee that assesses studies to be carried out in local areas, typically with few centres. Compare multicentre research ethics committee
local server see server
location a nonspecific term to describe the central tendency in a set of data
location parameter the parameter used to describe location for any particular set of data. The most common location parameters are the mean, the median and the mode
lods see log odds
log a systematic record of activities and actions. Also an abbreviation for logarithm
log odds $\log_e$ of the odds of an event occurring
log odds ratio $\log_e$ of the odds ratio. Many calculations concerning odds ratios are, in fact, carried out on the logarithm of the odds ratio and then transformed back to the odds ratio scale
log rank test a statistical significance test for comparing the survival times of different groups of subjects
log transformation the transformation of data values that is made by taking the logarithm of those data
$\log_{10}$ abbreviation for logarithm in base 10 units. See also $\log_e$
logarithm a mathematical function; the opposite function to the exponential
logarithmic transformation see log transformation
$\log_e$ abbreviation for logarithm in base e (e is a natural constant, approximately equal to 2.718). See also $\log_{10}$
logistic curve a curve that is the graph of the logistic function (Figure 20)
logistic function a transformation of binary data that is used in logistic regression. Where the proportion of responses is denoted p, the transformation is $y = \log_e\{p/(1 - p)\}$
logistic regression regression where the response variable is binary and a logistic transformation has been used to help facilitate the mathematics in the statistical model. It is one form of generalised linear model
logistic transformation see logistic function


Figure 20 Logistic curve. The logistic function is defined for proportions (p) between 0 and 1. When p = 0, the logistic function equals minus infinity; when p = 1, the logistic function equals plus infinity
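The transformation described in the logistic function entry, the natural logarithm of p/(1 - p), can be sketched as below together with its inverse. The function names `logit` and `inverse_logit` are choices made for this example. Because the transformation tends to minus or plus infinity at p = 0 and p = 1, this sketch simply rejects those two values rather than returning infinities.

```python
import math


def logit(p):
    """Natural log of the odds p / (1 - p), for a proportion 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1 - p))


def inverse_logit(y):
    """Map a value on the logistic scale back to a proportion."""
    return 1 / (1 + math.exp(-y))


for p in (0.1, 0.5, 0.9):
    y = logit(p)
    print(p, round(y, 3), round(inverse_logit(y), 3))
```

A proportion of 0.5 corresponds to odds of 1 and so maps to zero; proportions equally far above and below 0.5 map to values of equal size and opposite sign.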

logit see logistic function
logit model see logistic regression
log-linear model a statistical model for analysing data that are in the form of a count of the number of observations that fall into each cell of a contingency table. It is one form of generalised linear model
lognormal distribution the probability distribution of a variable such that the logarithm of that variable follows a Normal distribution
long term follow-up usually restricted to observations on subjects after some intervention has taken place. The subjects may, or may not, be given medication during this time. ‘Long term’ is obviously open to interpretation but is generally considered to be at least six months. See also acute phase
longitudinal followed across time. Compare cross-sectional. See also cohort
longitudinal analysis the analysis of longitudinal data, usually with the specific intention of analysing changes with time. See also growth curve. Compare cross-sectional analysis
longitudinal data data that are repeatedly collected on the same subject across time. See also repeated measurements. Compare cross-sectional
longitudinal study a study that observes and measures the same subjects over a period of time. Compare cross-sectional study
loss any negative effects of an intervention. A loss may be measured in cash, in years of life, in excess pain, etc. Sometimes a negative loss is referred to, meaning a gain
loss function a function that combines several measures of loss (or possibly gain) to arrive at an overall figure for loss. The term ‘loss function’ is generally used when there is expected to be an overall loss (in the true negative sense); the term utility function is synonymous but tends to be used when there is expected to be a net gain (or negative loss)
loss to follow-up the case when a subject is lost to follow-up
lost to follow-up a subject who supplies some data for a study but for whom after a certain time no more data are available. The term usually also implies that there is no known reason why the subject supplies no more data. See also censored observation, missing data
lotion a liquid used as a vehicle for delivering topical treatments. See also cream, gel, ointment
lower quartile the 25th centile. See also upper quartile, median


M
main effect in factorial studies, the main effect of one factor is the size of the effect averaged over all levels of all other factors. Compare interaction effect
main study a term meaning study but useful to distinguish from pilot study
mainframe see mainframe computer
mainframe computer a large computer. As technology progresses, the processing power and storage of small desktop computers are reducing the need for mainframe computers. See also microcomputer, minicomputer, supercomputer
maintenance dose the amount of drug that needs to be given to keep within the required therapeutic range. Compare loading dose
majority more than 50% (but not necessarily the mode). Compare minority
Mann–Whitney U test a nonparametric significance test for testing the null hypothesis that the location parameter (usually the median) is the same in each of two groups. See also independent samples t test
Mantel–Haenszel estimate a method of estimating an odds ratio from a stratified sample
Mantel–Haenszel test a statistical significance test for testing the null hypothesis that the Mantel–Haenszel estimate of the odds ratio is equal to one
manual a set of instructions on how to use a machine or carry out a procedure
manuscript a written document sent to a publisher to be published (or to be considered for publication)
margin the edge. In multivariate data, the individual variables are sometimes referred to as the margins. See, for example, marginal distribution
margin of error see accuracy
margin of safety see safety margin
marginal see marginal distribution, marginal mean
marginal cost see per unit cost
marginal distribution in multivariate data, the distribution of each of the variables, regardless of the other variables. Table 10 shows an example of bivariate data. Compare conditional distribution, joint distribution

Table 10 Joint distribution and marginal distributions of patients’ and investigators’ assessment of severity of disease in 202 patients. The marginal distributions are the ‘Total’ row and the ‘Total’ column

               Scaliness
Redness        Absent   Mild   Moderate   Severe   Total
Absent         1        1      0          0        2
Slight         25       12     3          0        40
Moderate       24       86     25         2        137
Severe         2        10     8          1        21
Very severe    0        1      1          0        2
Total          52       110    37         3        202
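The marginal distributions in Table 10 are obtained by summing the joint distribution across each row and down each column. A minimal sketch, with variable names chosen for this example and the cell counts taken from the body of Table 10 (rows are redness, columns are scaliness):

```python
# Joint distribution from Table 10, without the 'Total' row and column
joint = [
    [1, 1, 0, 0],     # redness absent
    [25, 12, 3, 0],   # slight
    [24, 86, 25, 2],  # moderate
    [2, 10, 8, 1],    # severe
    [0, 1, 1, 0],     # very severe
]

redness_marginal = [sum(row) for row in joint]              # row totals
scaliness_marginal = [sum(column) for column in zip(*joint)]  # column totals
grand_total = sum(redness_marginal)

print(redness_marginal)
print(scaliness_marginal)
print(grand_total)
```

Each marginal distribution ignores the other variable entirely, which is why both sets of totals add up to the same 202 patients.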

marginal effect a loose term used to describe an effect that is quite small and that may not be real. See also marginally significant
marginal mean the mean of a marginal distribution
marginally significant a loose term used to imply that a calculated P-value is very close to some arbitrary criterion for being called statistically significant. P-values of about 0.07 to 0.04 are often described as being marginally significant. See also marginal effect
marker see surrogate
marketing authorisation the authorisation given by a regulatory authority to a pharmaceutical company to market a product
mask see blind
match to identify two (or more) subjects as having similar demographic data and/or disease severity (and other characteristics) such that one can serve as a control for another
matched control a subject who has not been exposed to the intervention under study but who has demographic data and other exposure characteristics similar to those of one that has been exposed and who will be compared with that subject. See also case-control study
matched design a study where matched pairs are used
matched pair two subjects who will be compared with each other or two measurements on the same subject that will be compared
matched pairs t test see paired t test
matched study a study that is designed as a matched design
matched subjects see matched pair


Table 11 Simple matrix of demographic data

Subject identification number   Age (years)   Gender   Race
1                               37            Male     British
2                               42            Male     British
3                               18            Female   British
4                               77            Male     Indian
5                               45            Male     British
6                               51            Female   American
7                               52            Female   British

mathematical model see model mathematics the science dealing with numbers, their uses and manipulation. G statistics matrix a rectangular (not necessarily square) array of mathematical elements. It could include numbers, regression coeﬃcients, parameter estimates, etc. or simple raw data as shown in Table 11. Strictly it should have at least two rows and at least two columns: if it has only one row or only one column, it is called a vector maximum the largest of a set of values. Û minimum maximum likelihood the largest value of the likelihood function. G maximum likelihood method maximum likelihood estimate the estimate of a parameter that is obtained by the maximum likelihood method maximum likelihood method a method of estimating parameters; the most likely value for the parameter (the best estimate) is the one that has the maximum likelihood. Û least squares method maximum tolerable dose the maximum dose of a drug that a subject can take before inducing unacceptable adverse reactions McNemar’s test a statistical signiﬁcance test for testing the null hypothesis of no change in the proportion of subjects experiencing an event when each subject is assessed twice (under diﬀerent conditions) and the data are in the form of matched pairs mean the sum of a set of numbers, divided by the number in the sample. A more formal term for the average mean absolute deviation average absolute deviation mean square the mean of a sum of squares mean square error the variance of an estimator. (Note that if the 108

meaningful difference

medical history

estimator is biased, then the mean square error is the sum of the variance of the estimator plus the square of the size of the bias) meaningful diﬀerence clinically signiﬁcant diﬀerence measure to determine the size or extent of a variable of interest measured value observed value measurement the assessment and recording of a data value. This does not have to be restricted to objective data; the term is also used with reference to subjective data measurement bias a bias caused by the process of taking measurements. Examples include digit preference or measuring only values of a variable that fall within the capacity of the measuring instrument. G Hawthorne eﬀect measurement error an error made in measuring the value of a variable. The error may be because of lack of care in the measurement process or because of diﬃculty or judgement needed to measure the variable. Blood pressure, for example, is prone to measurement error, as are most types of subjective data measurement scale the type of scale that is used to measure a variable. Examples include ordinal scale, continuous scale, categorical scale, etc. MedDRA a dictionary of adverse event terms. MedDRA stands for Medical Dictionary for Drug Regulatory Aﬀairs. G COSTART, WHO-ART median the 50th centile. When a set of numbers is sorted into ascending order, there are as many values greater than the median as there are values smaller than the median. G lower quartile, upper quartile median dose the dose of a drug that is estimated to produce a response in 50% of subjects median life expectancy the length of time that 50% of subjects are expected to live medical relating to medicine. Û clinical medical device a physical device used for medical treatment, such as a prosthesis or a heart pacemaker medical device study a study of the eﬃcacy and/or safety of a medical device. 
This can encompass the comparison of more than one device or the comparison of a device and a pharmaceutical product
medical ethics the branch of ethics that considers medicine, medical practice, medical care, etc. See also Declaration of Helsinki
medical history the course of the health (including ill health) of a patient over time. This information can be used to help determine a diagnosis and predict a prognosis
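
The note under mean square error (variance plus squared bias) can be checked numerically. The sketch below is only an illustration: the function name and the list of estimates are made up, and the variance is taken with divisor n so that the identity holds exactly for the sample at hand.

```python
from statistics import mean, pvariance


def mse_decomposition(estimates, true_value):
    """Return (mse, variance, bias) for repeated estimates of true_value."""
    # Mean of squared distances from the true value
    mse = mean((e - true_value) ** 2 for e in estimates)
    # Variance of the estimates about their own mean (divisor n)
    variance = pvariance(estimates)
    # Bias: how far the average estimate sits from the true value
    bias = mean(estimates) - true_value
    return mse, variance, bias


# Hypothetical estimates of a quantity whose true value is assumed to be 10
estimates = [9.0, 11.0, 10.0, 12.0, 13.0]
mse, variance, bias = mse_decomposition(estimates, true_value=10.0)
print(mse, variance, bias)
```

With an unbiased estimator the bias term vanishes and the mean square error reduces to the variance, which is the first sentence of the entry.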


medical judgement a judgement (about diagnosis, treatment, prognosis, etc.) made by a physician
medical record the notes and documents that describe a subject’s medical history
medical study a study of the efficacy and/or safety of one or more medicines. It is a more specific term than clinical trial
medical treatment treatment administered to a patient. The type of treatment can be very broad but generally excludes surgical treatment
medical trial see clinical trial
medically important difference see clinically significant difference
medicine the science and practice of prevention, diagnosis and treatment of disease
megabyte a unit of space for storing information on a computer. Equivalent to one million bytes. See also kilobyte
megatrial a very large trial. Usually considered to include several thousand subjects
meta-analysis an analysis of the summary results from two or more similar studies. (Strictly, analyses of analyses; see also metadata.) Such methods are becoming more common and are used as a way of synthesising data from a variety of studies to try to get better answers to specific medical questions. Compare overview
metabolise to change (when a drug changes in the body). See also pharmacokinetics
metabolism the set of changes that happen to a chemical (a drug) in the body. See also pharmacokinetics
metadata data about data. For example, a manufacturer’s data regarding accuracy of a peak flow meter might be considered as metadata
method a way of carrying out a procedure. The term applies equally to methods of treating patients, methods of measuring variables, methods of analysing data, etc.
methodologist one who studies and is an expert in methods.
The term is usually used to distinguish between applied research and theoretical research
methodology a set of methods
me-too a term used to describe a product for which a very similar alternative already exists
metric any measurement scale, but particularly one referring to metric data
metric data data measured in the SI system of units, which includes grams and metres. Also sometimes used to refer to continuous data
metric scale see continuous scale


metric variable see continuous variable
microcomputer a computer that is usually small enough to fit on a desk, in a briefcase, or even in a pocket. See also minicomputer, mainframe computer, supercomputer
microprocessor the processing unit that forms the basis of a computer
mid P-value an adjustment made to the calculation of P-values when working with ordinal data. With continuous data, the probability of observing any particular value is considered to be zero; so the probability that x is greater than y [Prob (x > y)] is the same as the probability that x is greater than or equal to y [Prob (x ≥ y)]. However, since with ordinal data any particular value can have a nonzero probability, the mid P-value is defined as Prob (x > y) + ½ Prob (x = y)
midpoint the middle of a class interval. It is simply the mean of the lower class limit and the upper class limit; it is not the median within the class interval
mid-quartile the mean of the lower quartile and upper quartile. Compare median
mid-range the mean of the minimum value and the maximum value
mid-spread see interquartile range
minicomputer a small computer that is larger and more powerful than a microcomputer but not as large or powerful as a mainframe computer or supercomputer
minimax rule a rule that calculates the maximum value of a parameter under different circumstances (often the maximum cost under different circumstances) and then chooses as ‘best’ the set of circumstances with the minimal cost. It is the minimum of all the possible maxima
minimisation a pseudorandom method of assigning treatments to subjects to try to balance the distribution of covariates across the treatment groups. See also randomisation, stratified randomisation
minimum the smallest of a set of values. Compare maximum
minority less than 50%. Compare majority
minus infinity a number smaller than any other number can be.
Compare infinite, plus infinity
misclassification with categorical data, misclassification is any form of measurement error that ultimately means that a subject is recorded as being in the wrong category. Examples include gross errors such as recording a subject as being male instead of female, or lesser errors such as recording ‘partial’ improvement of symptoms instead of ‘moderate’ improvement
misconduct see fraud
missing at random missing data where the probability of data missing


may depend on the values of some other measured data but does not depend on the missing values themselves. Compare missing completely at random
missing completely at random missing data, where the probability of data missing is independent of any observed or unobserved data. This is not very common since subjects may often withdraw from studies because their disease is completely cured or may default because their disease is extremely severe. Compare missing at random
missing data a data value that should have been recorded but, for some reason, was not
missing value see missing data
mixed effects model a statistical model that contains a mixture of different types of parameters. Specifically, it is one that contains both fixed effects and random effects
mixed model see mixed effects model
mock report see ghost report
mock table see ghost table
modal relating to the mode
modal class in data measured in categories, the most frequently occurring class. See also mode
modality the property of having a mode
mode the most frequently occurring value. Used as a measure of location. Compare mean, median
model an idealistic description of a real (often uncertain) situation. Models may take the form of physical imitations of medical devices, through to mathematical models that are equations or functions describing how a process behaves and on to statistical models that contain both deterministic elements (like mathematical models) and random elements. Statistical models are often thought of as being like regression models, logistic regression, log-linear models, etc. In fact, simple t tests are also models, just of a much simpler form. Models can be expressed in words: the model that an independent samples t test assumes is that the distribution of a variable is identical in each of two groups, except for a shift in location.
Such models can also be expressed algebraically as y_i = α + βx_i + ε_i
model equation the equation for a model
modified Fibonacci series a modification to a standard Fibonacci series
moment a series of statistics describing a probability distribution. The first moment is the mean; the second moment (often referred to as the ‘second moment about the mean’) is the mean of the squared distances of each value from the mean; the third moment is the mean of the cubed


distances of each value from the mean, etc.
monitor one who visits investigators to help with study management, ensure that all data are being recorded as they should be and that all supplies (drugs, materials, etc.) are available on site, and who often returns completed case record forms to the data management office. Also a term used for one of the output devices (the screen) of a computer
monitoring committee see data and safety monitoring committee
monitoring report a report (usually written) to describe the activities of a monitor at a study site and any positive or negative findings, any issues that need bringing to the attention of others, etc.
monotherapy a single drug. Compare combination drug
monotonically decreasing repeated measurements that only remain constant or decrease; they never increase. Compare monotonically increasing
monotonically increasing repeated measurements that only remain constant or increase; they never decrease. Compare monotonically decreasing
Monte Carlo method a method to solve a problem by simulation
Monte Carlo simulation either a single simulation (as in a Monte Carlo trial) or a complete set of simulations forming a Monte Carlo method. All Monte Carlo methods are simulations
Monte Carlo trial one (usually of many thousands) of the simulations in a Monte Carlo simulation
morbid prone to disease. Compare mortal
morbid event an event associated with illness
morbidity relating to ill health. Compare mortality
morbidity curve a graph of the cumulative occurrence of morbidity with time
morbidity rate the proportion of subjects with a morbid event at any given point in time
mortal prone to death. Compare morbid
mortality relating to death. Compare morbidity
mortality curve a graph of the cumulative occurrence of death with time. Compare survival curve
mortality rate the proportion of subjects who have died at any given point in time
most powerful test see uniformly most powerful test
moving average a term used most often with time series data.
It involves calculating the mean (or average) of observations 1 and 2 (for example); then the mean of observations 2 and 3; then of observations 3 and 4, and so on
multicentre involving more than one study centre
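
The moving average entry above describes the calculation in words; a minimal sketch follows. The function name, the `window` parameter and the data are illustrative assumptions. The entry's own example corresponds to a window of two consecutive observations.

```python
def moving_average(values, window=2):
    """Means of each run of `window` consecutive observations."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # Slide the window one observation at a time: (1,2), (2,3), (3,4), ...
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical time series of four observations
series = [4, 6, 8, 10]
smoothed = moving_average(series, window=2)
print(smoothed)
```

A series of n observations gives n - window + 1 averages, so the smoothed series is always shorter than the original.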


multicentre research ethics committee a research ethics committee that assesses studies that are planned to take place in many study centres. Compare local research ethics committee
multicentre study a study carried out at more than one study centre
multidisciplinary involving more than one scientific discipline (or speciality). This may include more than one medical discipline (such as oncology and gastroenterology) but also can include other disciplines such as biostatistics (for study design and analysis), mechanical engineering (if prostheses or other medical devices are being used), etc.
multidisciplinary study a study that involves more than one scientific discipline for its design, execution, analysis, and reporting
multilevel model a model that has a hierarchy to its parameters. For example, a study may be conducted in several countries (level 1); with several investigators (level 2) in each country; with many subjects (level 3) recruited by each investigator; and each subject observed on several occasions (level 4). See also mixed effects model
multimodal having more than one mode
multimodal distribution a distribution that has more than one peak (or ‘local maxima’). Note that the mode is the most frequently occurring value so the term multimodal is a tautology; hence more than one ‘peak’ is used in this context
multinomial data see categorical data
multiperiod crossover design a crossover study with more than two study periods
multiperiod crossover study a study designed as a multiperiod crossover design
multiple comparison method any statistical method for making multiple comparisons
multiple comparison test any form of statistical significance test for making multiple comparisons
multiple comparisons more than one comparison (usually in the form of statistical significance tests) within a single study.
The comparisons may be between more than two treatments, or between two treatments but with more than one response variable, or a mixture of both of these situations
multiple correlation coefficient (R²) the correlation in a multiple regression model. See also correlation coefficient
multiple dose design see repeated dose design
multiple dose study see repeated dose study
multiple endpoints more than one endpoint in a study. See also multiple


comparisons, multiple outcomes
multiple imputation a method of imputing several randomly different values for a missing value. The method may have no impact on any point estimate over and above that of simple imputation but it does better reflect variability of the missing value. See also last observation carried forward
multiple linear regression see multiple regression
multiple linear regression model see multiple regression model
multiple logistic regression logistic regression with more than one covariate
multiple logistic regression model a statistical model resulting from multiple logistic regression
multiple looks more than one analysis of accumulating data. See also group sequential study
multiple outcomes a study having more than one outcome variable. See also multiple endpoints, multiple comparisons
multiple regression linear regression with more than one covariate
multiple regression model a statistical regression model resulting from multiple regression
multiple significance tests the use of more than one statistical significance test in one study
multiplicative model a statistical model where separate variables contribute to the combined effect as the product of their separate effects. Compare additive model, linear model. See also interaction
multiplicity see multiple comparisons
multistage design a study that has more than one stage (or period), possibly including a run in period, a treatment period and a follow-up period
multi-univariate more than one univariate response variable where the interest lies with each variable in its own right, rather than a multivariate combination of them
multivariate relating to more than one variable (usually more than one response variable). Compare univariate. See also bivariate
multivariate analysis special methods of analysis suitable for multivariate data
multivariate data measurements that consist of more than one variable. For example, a person’s ‘size’ could be measured jointly by their femur length, tibia length and skull circumference.
More than two variables are always referred to as multivariate: two variables, whilst still multivariate, are often referred to as bivariate
multivariate distribution the joint distribution of more than one variable. Compare univariate distribution. See also bivariate distribution
Münch’s law a pessimistic rule which suggests that the number of


subjects expected to be available to enter a study should usually be divided by a factor of at least ten. See also Lasagna’s law
mutually exclusive not able to occur at the same time
mutually exclusive events two or more events that are not able to occur at the same time. This is not restricted to situations where events have (by chance) not been observed to occur at the same time, but events that are not capable of jointly occurring. An example would be that a subject is male and is pregnant


N

named patient use a way of allowing a doctor to offer an unlicensed product to a patient outside of a clinical trial. This is often allowed in treatment of life threatening diseases where no alternative treatment is available. There are strict guidelines under which such a supply may be offered. See also compassionate use
natural experiment a term used to describe a (usually major) event (usually some form of disaster). The resulting change in environment and its impact can be studied. It is not a true experiment as the intervention is not under our control. Examples include floods and chemical leaks
natural history the course of events over time. The term can be used on a massive scale to describe geological and climatic changes or to describe how an illness in an individual has developed and is likely to develop over time. Compare medical history
natural logarithm a logarithm to base e (log_e)
necessary and sufficient a term often used in a mathematical context but applicable elsewhere. It describes a set of circumstances that are required (‘necessary’) but also where no other circumstances are simultaneously required (‘sufficient’). For life to exist, oxygen must be present; but that is not all. Thus, oxygen is necessary, but not sufficient, for life to exist
negative control see inactive control, placebo control
negative correlation correlation between two variables such that as one variable increases the other tends to decrease. Compare positive correlation. See also inverse correlation
negative effect an effect that is undesirable. See also adverse reaction
negative gain a loss. Compare negative loss
negative loss a gain. The term is used when referring to several items that generally incur a loss (possibly a financial loss). To avoid switching between losses and gains, the term negative loss is sometimes used. Compare negative gain


negative predictive value in a diagnostic test, the probability that a person with a negative result does not have the disease (a correct result). See also positive predictive value, sensitivity, specificity
negative relationship an informal term for negative correlation. Compare positive relationship
negative response a poor response, or no response, to treatment. Compare positive response
negative result a result less than zero. The result of a negative study. Compare positive result
negative skew describes a distribution that has a long left hand tail so that the majority of observations are at the upper end of the scale. Compare positive skew
negative study a study that fails to reject the null hypothesis or otherwise fails to fulfil its objectives. Compare positive study
negative treatment effect an undesirable treatment effect. See also adverse reaction. Compare positive treatment effect
nested design an experimental design where some factors occur only as subsets of other factors. See also multilevel model
nested factor a factor that occurs in an experiment only as a subset of another factor
net change any change after removing the effect in a control group. For example, if we calculate the change in blood pressure in a group of treated patients and in another group of untreated patients, the net change in the treated group would be their change minus any change observed in the untreated group. See also treatment effect
net difference see net change
net effect any effect after removing some baseline or control effect.
See also net change
net treatment effect see net change, treatment effect
network a system of communication between computers or people to share information
new chemical entity (NCE) a new chemical that is being developed as a potential new drug
new drug application (NDA) an application to the Food and Drug Administration in the USA for a licence to market a new chemical entity
Newman–Keuls test a multiple comparison test for testing the null hypothesis of no difference between the means of more than two groups
next of kin a person’s nearest relation (through either blood or marriage). Compare legal guardian
nil effect no effect, or zero effect
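
The negative predictive value entry above can be made concrete with a small calculation: among people who test negative, it is the fraction who truly do not have the disease. The sketch below is only an illustration; the function name and the counts are hypothetical.

```python
def negative_predictive_value(true_negatives, false_negatives):
    """Proportion of negative test results that are correct."""
    total_negative_results = true_negatives + false_negatives
    if total_negative_results == 0:
        raise ValueError("no negative test results to assess")
    return true_negatives / total_negative_results


# Hypothetical diagnostic study: 100 negative results, of which
# 90 subjects are disease free and 10 actually have the disease
npv = negative_predictive_value(true_negatives=90, false_negatives=10)
print(npv)
```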


no cause audit an audit carried out as a matter of routine or because a study or a site has been selected at random for audit. Compare for cause audit
node on a decision tree (Figure 6), any point at which a choice of routes can be made
n-of-1 study a study carried out in a single patient to determine the best treatment for that patient (which may not necessarily be the best treatment for patients in general)
noise unwanted variation in data. See also signal to noise ratio
noisy data data that have a lot of noise, or a high variance
nomenclature the terminology (symbols and special language) used in any science or discipline
nominal data see categorical data
nominal scale a categorical scale whose possible values are simply in the forms of names: country of origin, concomitant medications, etc.
nominal variable a variable measured on a nominal scale
nomogram a type of graph used to depict the relationship between (usually) three variables (Figure 21)

Figure 21 Nomogram. For values of height and weight, Quetelet’s index (body mass index) can be read off
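
The quantity that the nomogram in Figure 21 reads off graphically can also be computed directly. The sketch below uses the standard definition of Quetelet's index (weight divided by the square of height); the units (kilograms and metres), the function name and the example subject are assumptions, since the figure itself is not reproduced here.

```python
def quetelet_index(weight_kg, height_m):
    """Body mass index: weight in kilograms over height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical subject: 72 kg and 1.80 m tall
bmi = quetelet_index(weight_kg=72.0, height_m=1.80)
print(round(bmi, 1))
```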


noncentral distribution a variation of the more standard probability distributions (t distribution, F distribution, etc.) useful for calculating power of significance tests
noncompliance the act of not fully complying with a protocol. Often the term is restricted to whether or not a subject takes the medication as and when they should but it can be interpreted more widely to any aspect of a protocol
noncompliant a subject who does not fully comply with a protocol
nonignorable missing data missing data that indicate something about the subject because of the fact that the data are missing. For example, in an antihypertension study, data may be missing because a patient died: a death caused by a road traffic accident may be considered ignorable because it is unlikely to be study related but a death caused by a cardiac arrest would not be ignorable. In this example the term is partially related to a contrast between adverse events and adverse reactions
nonignorable missingness a process that produces nonignorable missing data
noninferiority study a study whose objective is to show that one treatment ‘is not worse than another’. This is subtly different to showing that two treatments are equivalent (see equivalence study) and obviously different to trying to show that one treatment is different to another (see difference study). See also superiority study
noninformative censoring censoring in survival studies that is completely unrelated to treatment. Essentially the same meaning as nonignorable missing data in the context of survival studies and censoring
noninformative missing data data that are missing, and the fact that they are missing tells us nothing about what the data value should be. See also missing completely at random
noninformative prior see reference prior
noninvasive any medical procedure that is not invasive
nonlinear not in a straight line. Compare linear
nonlinear model a model that contains multiplicative terms, not simply additive terms.
Compare linear model
nonparametric a branch of statistics that makes few assumptions about the distributions of data
nonparametric data strictly, there is no such thing as nonparametric data. However, the term is quite commonly used to refer to data that come from distributions that do not obviously resemble any standard probability distribution and for which nonparametric methods of analysis need to be used. Compare parametric data
nonparametric method any statistical method for significance testing and


estimation that makes fewer assumptions about the distribution of the data than do parametric methods. It is widely believed that these methods make no assumptions at all about the distribution of the data but this is not the case
nonparametric test a nonparametric statistical significance test. Examples include the Mann–Whitney U test, the Wilcoxon matched pairs signed rank test, etc.
nonrandom not random; used to refer to nonrandom samples and nonrandom treatment allocation
nonrespondent a subject who does not answer a question, either because they refuse to or because they did not attend a study visit and so could not be asked
nonresponse similar meaning to nonrespondent but also used to describe subjects who do not respond to treatment
nonsense correlation an observed correlation that may be statistically significant but which does not make any biological or medical sense in terms of causality
nonsignificant risk study a study of a medical device that poses no important risk to the subjects who take part. Compare significant risk study
nonzero effect this is usually used to refer to an effect when it needs to be stressed that an effect does exist. This may be because the effect is very large or because, despite the effect being very small, it may still be medically or scientifically important
normal a rather dangerous term: it has an everyday use meaning typical or not unusual; it has a similar meaning in a technical sense of a normal range (see also reference range) for a variable; it also has a highly technical (statistical) use as in Normal distribution, one of the most basic ideas in statistics. Because of these diverse uses, it is important to either avoid its use altogether or to be highly specific.
In this book, an upper case ‘N’ is used for the statistical probability distribution, the Normal distribution
normal approximation an approximate procedure based on assuming data come from a Normal distribution
normal curve an informal term used to describe the shape of the curve of a Normal distribution
Normal distribution the probability distribution that is very commonly used (either directly or as a basis for further refinements) in statistical significance testing, estimation, model building, etc. (Figure 22)
normal limit the upper (or lower) limit of a normal range
normal plot see quantile–quantile plot


Figure 22 The classic ‘bell shape’ of a Normal distribution with mean 1 and standard deviation 1

normal range the usual range within which the values of a variable can be expected to lie. It usually implies that all subjects within that range will be healthy. Compare reference range
normality the degree to which a distribution is like a Normal distribution
normally distributed said of a set of data that come from an underlying Normal distribution
not significant either an effect that is of no clinical importance (see clinically significant) or one that, regardless of its size, is not statistically significant
notifiable disease a disease that must, by law, be notified to health authorities
nuisance parameter in a statistical model, parameters that may be very important as covariates but which are not of direct interest in the study. Usually the treatment effect is of most interest; if it turns out that subject’s age or previous history are predictive of outcome (but equally


predictive within each treatment group) then their parameters would be considered as nuisance parameters
nuisance variable any variable in a statistical model that is not of primary interest. See also nuisance parameter
null distribution the probability distribution of a variable if the null hypothesis is true
null hypothesis (H₀) the assumption, generally made in statistical significance testing, that there is no difference between groups (in whatever parameter is being compared). Evidence (in the form of data) is then sought to refute (or reject) this null hypothesis. Compare alternative hypothesis (H₁)
number needed to harm the number of patients that a physician would have to treat with a new treatment in order to harm (in some predefined sense) one extra subject who would otherwise not have been harmed. ‘Harm’ may be in the context of a treatment failure, an adverse reaction, a death, etc. More usually considered in the context of number needed to treat
number needed to treat the number of patients that a physician would have to treat with a new treatment in order to avoid one event that would otherwise have occurred with a standard treatment
numerator in a fraction, such as ½ or ¾, the numerator is the number on the top line of the fraction (in these cases 1 and 3, respectively). Compare denominator
numeric relating to numbers only. Compare alphanumeric
numeric variable a variable that is a number. This generally means it is a continuous variable and not, for example, a Likert scale
Nuremberg Code a set of ethical principles about research on humans that formed the basis of the Declaration of Helsinki
n-way a generalisation of 1-way, 2-way, 3-way, etc. meaning any number of ways. Used particularly in the sense of n-way analysis of variance, n-way classification, etc.
n-way analysis of variance a generalisation of analysis of variance indicating that many (n) factors are included in the model
n-way classification classification of a continuous variable (usually by a discrete variable) in several (n) subclasses
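
The number needed to treat entry earlier in this section is usually turned into a number by taking the reciprocal of the absolute risk reduction, which is the standard calculation rather than one spelled out in the entry. In the sketch below the function name and the event rates are hypothetical.

```python
def number_needed_to_treat(standard_event_rate, new_event_rate):
    """Reciprocal of the absolute risk reduction between two treatments."""
    absolute_risk_reduction = standard_event_rate - new_event_rate
    if absolute_risk_reduction <= 0:
        # The new treatment avoids no events; see number needed to harm
        raise ValueError("new treatment does not reduce the event rate")
    return 1 / absolute_risk_reduction


# Hypothetical rates: 25% of patients have the event on standard
# treatment and 12.5% on the new treatment
nnt = number_needed_to_treat(standard_event_rate=0.25, new_event_rate=0.125)
print(nnt)
```

In practice the result is usually quoted as a whole number of patients, rounded up.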


O

O’Brien and Fleming rule one of the most common stopping rules used in group sequential studies. See also Pocock rule
objective the purpose of a study. It may be described either in very precise and specific terms or in general terms such as ‘to assess the safety and efficacy of Drug A’. The term is also used to refer to clear facts rather than general impressions. For this interpretation compare subjective
objective data data that are usually considered to be measured with high accuracy and that have low (or negligible) intraobserver variation and interobserver variation. Compare subjective data
objective endpoint an endpoint to a study that is objective data. Compare subjective endpoint
objective measurement a measurement of objective data. Compare subjective measurement
objective outcome an outcome that is objective data. Compare subjective outcome
observation usually meaning the data relating to one of the subjects being studied. However, the term is mostly used in a computing context to mean the number of rows in a (rectangular) database. Usually this will consist of one observation (with many variables) per subject; sometimes, if there is a different number of records per subject, the database may be set out as one observation per record
observational study a study that has no experimental intervention but just observes what happens to a group of subjects. See also case-control study, cohort study. Compare intervention study
observed change the change in a variable that is seen to occur. This is in contrast to a fitted value from a statistical model
observed data strictly this is synonymous with data but use of the term ‘observed’ helps to contrast with fitted values from statistical models
observed difference the difference (in means, proportions, etc.) for a variable that is seen to occur.
This is in contrast to a fitted value from a statistical model
observed distribution see observed frequency distribution


observed effect usually the simple estimate of an effect (difference in means, difference in proportion, the odds ratio, etc.) that has not been adjusted to account for any possible covariates
observed frequency the frequency with which a specific variable is seen to occur. This is in contrast to a fitted value from a statistical model
observed frequency distribution strictly this is synonymous with frequency distribution but use of the term ‘observed’ helps to contrast with probability distribution
observed mean the sample mean. Compare population mean
observed outcome the observed value (usually of categorical data). This is in contrast to an expected outcome from a statistical model
observed rate the rate at which an event is seen to occur. This is in contrast to any fitted values from statistical models
observed relative frequency distribution the observed frequency distribution presented as a relative frequency distribution
observed result any kind of result that is seen to occur. This is in contrast to any fitted values from statistical models
observed sample size the sample size actually obtained, in contrast to what was planned
observed treatment difference see observed effect
observed treatment effect see observed effect
observed value either the value of a measurement in a single subject or the number of occurrences of an event that have been observed
observed variance the sample variance. Compare population variance
observer bias any bias in measurements introduced by an observer (for example digit preference) or caused by making observations (see Hawthorne effect)
observer error any error in measurements made by an observer. See also intraobserver agreement, interobserver agreement
observer variation see interobserver variation, intraobserver variation
Occam’s razor a philosophical stance which prefers simple explanations to more complex alternatives.
This is a general principle to adopt in formulating statistical models
Ockham’s razor see Occam’s razor
odds the probability of an event occurring divided by the probability of it not occurring. For example, if one in ten cancer patients are cured by a drug, then the odds of being cured are stated as 1:9. Compare rate, risk
odds ratio the ratio of two odds, often used as a summary of the size of a treatment effect in two-by-two tables. In Table 12, the odds ratio is calculated as (37 × 31)/(13 × 19) = 4.6. Compare risk ratio


Table 12 Contingency table showing the distribution of treatment response by treatment group

               Treatment success   Treatment failure   Total
Treatment A           37                  13             50
Treatment B           19                  31             50
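
The arithmetic behind the odds and odds ratio entries can be sketched as follows. This is only an illustrative sketch: the function names `odds` and `odds_ratio` are assumptions made for this example and do not come from the dictionary itself. The numbers are the 'one in ten' example and the counts in Table 12.

```python
def odds(p):
    """Odds of an event with probability p: p divided by (1 - p)."""
    return p / (1 - p)


def odds_ratio(success_a, failure_a, success_b, failure_b):
    """Odds ratio for a two-by-two table.

    The odds of success in group A are success_a / failure_a and in group B
    are success_b / failure_b; the ratio of these two odds simplifies to
    (success_a * failure_b) / (failure_a * success_b).
    """
    return (success_a * failure_b) / (failure_a * success_b)


# One in ten patients cured: probability 0.1, odds 1:9
cure_odds = odds(0.1)

# Table 12: Treatment A (37 successes, 13 failures), Treatment B (19, 31)
table12_or = odds_ratio(37, 13, 19, 31)

print(f"odds of cure = {cure_odds:.3f}")
print(f"odds ratio   = {table12_or:.1f}")
```

The second value reproduces the (37 × 31) ÷ (13 × 19) calculation given in the odds ratio entry.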

oﬀ label the use of a product to treat a disease for which it does not have a marketing authorisation oﬀ site away from the buildings or facilities where key activities occur. This may be with reference to study medication being stored at a location separate from where patients are treated or it may refer to an archive of data being kept at a location separate from where the main data-processing activities take place. Û on site oﬀ study refers to clinical activities that may occur concurrently with a study protocol but which are not included in the protocol, or to procedures which take place, or medication that is given, after a subject has completed the protocol. Û on study oﬀ treatment any time (during the course of a study or after a subject has completed a study) when a subject is not being given treatment (or placebo). This may be during a run in period or during a long term follow-up period. Û on treatment ogive a graph of a cumulative frequency distribution (Figure 23) ointment a vehicle for delivering topical treatment, usually paraﬃn or Vaseline based. G cream, gel, lotion omitted covariate a covariate that has not been included in a regression model or analysis of covariance model (either intentionally or inadvertently) omnibus test any statistical signiﬁcance test that involves comparing parameters (often means or proportions) from more than two groups. It may, for example, be a test that all the means are equal: in such a case, if the null hypothesis (of equal means) is rejected we cannot immediately say which means are diﬀerent to which others on site activities that take place (or the availability of study material) at the site where they are needed. This may relate to medication being on the site where patients are treated, or to completed case record forms being at the premises of the data management oﬃce. Û oﬀ site on study activities that take place as part of a study protocol. 
Û oﬀ study on treatment any time when a subject is being given a study treatment (or placebo). Û oﬀ treatment
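
The ogive entry above describes a graph of a cumulative frequency distribution. A minimal sketch of how the plotted values are obtained is given below. The interval counts are hypothetical, chosen only so that the first three running totals match the 9, 10 and 12 patients quoted in the caption of Figure 23; the function name is likewise an assumption for illustration.

```python
from itertools import accumulate


def cumulative_frequencies(counts):
    """Running totals of the per-interval counts; these are the heights plotted in an ogive."""
    return list(accumulate(counts))


# Hypothetical numbers of patients with eczema for <1, 1 to <2, 2 to <3 and 3 to <4 years
counts = [9, 1, 2, 4]
cumulative = cumulative_frequencies(counts)
print(cumulative)
```

Plotting each running total against the upper limit of its interval and joining the points gives the ogive.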


Figure 23 Ogive. A graph of the cumulative number of patients who have suffered from eczema for less than 1 year (9 patients), less than 2 years (10 patients), less than 3 years (12 patients), ... less than 55 years (all 66 patients)

one sided concerned with only one tail of a distribution. Û two sided one sided alternative the alternative hypothesis that is a one sided hypothesis. Û two sided alternative one sided hypothesis a hypothesis that allows for the possibility of a difference in only one direction (for example, Drug A must be better than Drug B). Such hypotheses are not as common as two sided hypotheses one sided test any statistical significance test that will accept a one sided hypothesis if the null hypothesis is rejected. Û two sided test one tailed one sided one tailed alternative one sided alternative one tailed hypothesis one sided hypothesis one tailed test one sided test one way analysis of variance the simplest form of analysis of variance,


used to compare the means of two (or more) groups in a parallel groups study but without including any other factors or covariates in the statistical model one way classiﬁcation data that are grouped by only one categorical variable. Note that the categorical variable may have several levels ( levels of a factor) but there is only one variable one way design a study design that involves only a response variable and one (categorical) covariate online a computing term meaning that work is being done directly onto a central computer rather than being temporarily held on a local computer before batch processing or being uploaded to the central computer online data entry electronic data entry that occurs online. Û distributed data entry. G remote data entry open class interval a class interval that either has no lower limit (it is all values below a certain value) or has no upper limit (it is all values above a certain value). It is often used with highly skewed data open label not blind open label study a study where the treatments are not blinded open sequential design a sequential study design that does not have any upper limit to the number of subjects that may be recruited (Figure 24). Û closed sequential design open sequential study a study that is designed as an open sequential design open study open label study open treatment assignment treatment assignment that is not blinded (although it may still be random). G open label study operation a surgical procedure or a mathematical function optimal design a study that is the best (‘optimal’) for some speciﬁc purpose. Note that it may not be optimal for all purposes. It may be optimal on statistical grounds or from practical study management grounds oral assent assent that is given orally. Û written assent. G consent oral consent consent that is given orally. Û written consent (which is more common). 
G assent order of magnitude a multiple of, or division by, 10 order statistic any one of the centiles ordered see ascending order, descending order ordered alternative hypothesis an alternative hypothesis that involves more than two groups. The simplest example is that of comparing the means of three groups. The null hypothesis is that ‘all the means are equal’ or, equivalently, ‘$\mu_1 = \mu_2 = \mu_3$’; the simplest alternative hypothesis might be that ‘not all of the means are equal’; an ordered alternative hypothesis would be that ‘$\mu_1 < \mu_2 < \mu_3$’

Figure 24 Open sequential design. The solid lines indicate stopping boundaries for declaring a statistically significant difference between treatments A and B. If the broken boundary is crossed, then the study stops, concluding that no significant difference was found between the treatments. Potentially, the number of preferences could continue indefinitely between the upper solid and broken lines or between the lower solid and broken lines; in such a case no conclusion would ever be reached

ordered categorical data data that are measured on a categorical scale but where the categories have a natural ordering, for example mild, moderate and severe. G Likert scale ordered categorical scale the scale on which ordered categorical data are measured ordered categorical variable a variable that yields ordered categorical data ordered data data that are measured on an ordered scale ordered logistic regression an extension of the methods of logistic


regression where the response variable is ordered categorical, instead of binary. G polytomous regression ordered scale a measurement scale that is ordered. This includes ordinal scales, ordered categorical scales, interval scales ordinal data data that are simply ordinal numbers ordinal number the numerical position (1st, 2nd, 3rd, etc.) in a set of ordered data ordinal scale the scale on which ordinal data are measured ordinal variable a variable that yields ordinal data ordinary least squares least squares ordinate y axis. Û abscissa (or x axis) orientation layout, generally of paper in the form of either landscape or portrait origin the point of zero on a graph. On a two-dimensional graph, the point where x = 0 and y = 0 original data source data original document the top copy (not photocopies, etc.) of a document. G source data original record source data orphan drug a product that has a limited market because it is used for a rare disease. Regulatory requirements for orphan drugs differ from those for non-orphan drugs orthogonal when two ideas, measurements, estimates, etc. are at right angles to each other, implying that they are also independent of each other orthogonal contrasts two (or more) contrasts that are independent of each other outcome usually the primary variable of a study. Although an outcome would generally be an event (see outcome event), the term is frequently used to refer to the primary variable whatever the measurement scale outcome event the primary variable of a study, specifically when that variable is binary outcome measure the primary variable of a study, usually restricted to the case when that variable is continuous outcome variable the variable that defines the outcome for a study outcomes research methods of trying to answer research questions that do not involve intervention studies but which analyse databases and attempt to control for all possible confounding factors by complex statistical modelling. 
It is a cheaper, quicker and relatively easier way to answer a question than doing a clinical trial and so presents clear advantages over trials (which are often expensive and time consuming).


However, much of the rigour and control of bias gained from clinical trials may be lost outlier a data value that does not seem to be true, given all the other data values, usually because it is very extreme (either too large or too small). Û inlier outpatient a patient who is not kept in hospital overnight. Note that treatment may still be given in the hospital. Û inpatient outpatient study a study of outpatients output device a method of getting data out of a computer (this may simply be the monitor or a printer) over represent when there is a higher proportion of some subgroup in a sample than there is in the population. Sometimes this may be desirable. The proportion of subjects with mild, moderate and severe symptoms may intentionally be kept equal in a sample, even though they are not similar in the population. Û under represent. G sample demographic fraction overmatch in matched studies, cases and controls might be matched for as many variables as is reasonably possible. If, however, the exposure variable is also matched between the groups then no diﬀerence between the groups will be found. This is called overmatching and is a risk in complex epidemiological studies where the variable causing the cases is not known over-the-counter drug products that can be purchased without needing a doctor’s prescription. Û prescription only medicine overview to look at data from various sources, considering them as a whole and making a conclusion. Û meta-analysis that involves more formal assessments of the completeness of the data and more formal statistical methods for combining them. Overviews and meta-analysis are very important methods for synthesising data


P

package see computer package, package insert package insert the information given to a patient with a pack of medication. It contains information similar to the summary of product characteristics but is written in a style appropriate for patients to understand page orientation orientation pair two items. Usually this means the same variable measured on two similar subjects or the same variable measured on one subject on two occasions. It can sometimes refer to two independent items that are brought together in some way (see pairwise comparisons for an example) pair matching pairwise matching paired comparison a comparison that is made on paired data (not on independent groups). Û pairwise comparisons paired data the same variable measured on two similar subjects or the same variable measured on one subject on two occasions paired design a study design that involves taking paired observations and usually makes treatment comparisons using paired comparisons, often (but not necessarily) in the form of a crossover design paired observations two observations that are related to each other, either as two observations from the same subject at different times (or on different sites on the body) or as one observation from each of two matched subjects in a paired design paired sample a sample of paired observations paired t test a statistical significance test testing the null hypothesis that the mean difference in a population (from which a sample of paired data has been taken) is equal to some particular value. Usually it is to compare the mean difference with zero. Û independent samples t test pairwise relating to pairs pairwise comparisons in a study where more than two groups are being compared, the term pairwise refers to each of the possible pairs of treatments that can be compared. 
For example, when there are three


groups (A, B, C) there are three possible pairwise comparisons: A vs. B, A vs. C, and B vs. C. G multiple comparison method pairwise matching matching a pair of subjects palliative care care for the whole patient, rather than treatment of specific symptoms. Examples include supportive care in the form of good communication, sympathy, understanding, empathy, etc. towards patients (and their relatives) pandemic occurring over a large geographic area. Û endemic paperless using no paper; as in ‘paperless case record form’ (where data are entered directly onto a computer without being transcribed onto paper first) parallel side by side; not crossing over parallel assay a dose finding study where the activity of a new product is compared with the activity of a standard drug parallel control the control group in a parallel group study parallel dose design a parallel group study where the different groups of subjects receive different doses of the same drug parallel group design the most common design for clinical trials, whereby subjects are allocated to receive one of several treatments (or treatment regimens). All subjects are independently allocated to one of the treatment groups. No subjects receive more than one of the treatments. Û crossover design parallel group study a study designed as a parallel group design parallel study parallel group study parallel track occurring at the same time but independently. For example, two studies that are being conducted at the same time but independently of each other parameter the true (but often unknown) value of some characteristic of a population. A simple example is the mean age of a population. Other examples include variances, minimum values and medians. Parameters are usually denoted by Greek letters (for example, σ² for the variance) and are estimated by sample statistics that are usually denoted by Roman letters (for example, s² for the variance). 
The most common parameter that we wish to estimate in clinical trials is the size of the treatment effect parameter estimate the estimate (based on data) of a parameter parametric data as with nonparametric data, this term has no real meaning but it is quite commonly used to refer to data that come from a recognisable probability distribution and for which parametric methods of analysis can be used
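
The pairwise comparisons entry above lists the three possible comparisons among groups A, B and C. As a hedged illustration (the function name is an assumption for this sketch, not a term from the dictionary), the full set of pairs for any number of groups can be enumerated with Python's standard `itertools.combinations`:

```python
from itertools import combinations


def pairwise_comparisons(groups):
    """Every unordered pair of groups, in the order the groups were supplied."""
    return list(combinations(groups, 2))


pairs = pairwise_comparisons(["A", "B", "C"])
for first, second in pairs:
    print(f"{first} vs. {second}")

# With k groups there are k * (k - 1) / 2 pairs, so four groups give six comparisons
four_group_pairs = pairwise_comparisons(["A", "B", "C", "D"])
print(len(four_group_pairs))
```

The number of pairs grows quickly with the number of groups, which is one reason the entry points to multiple comparison methods.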


parametric method statistical methods that make speciﬁc assumptions about the distributions of data. Examples include the t test, correlation and regression. Û nonparametric method parametric test any statistical signiﬁcance test that uses parametric methods. Examples include the t test and the F test parent this term is used in the obvious way referring to mothers and fathers of children. It is also sometimes used in decision trees and mathematical models. In decision trees, it refers to a node from which branches come; in mathematical models it sometimes refers to a broad set of models from which other, simpler, models can be formulated. These are the most common uses but the term is sometimes used in any context where a hierarchy exists parent drug the basic form of a drug from which various alternative modiﬁcations are available parsimony the concept of simplicity being preferred over complexity. With particular reference to statistical models, models with few parameters are generally preferred over those with many parameters. G Occam’s razor partial response in cancer studies, this is generally regarded as a decrease in tumour size of at least 50%. G complete response, stable disease, progression partially balanced block a block of treatments that is balanced for some comparisons but not for others. For example, a block containing two assignments to Treatment A, two to Treatment B and three to Treatment C is only partially balanced partially balanced design a study design that uses partially balanced blocks of treatments partially confounded the situation where two estimates are not completely confounded but where some information in one estimate is not independent of another. This is very common in unbalanced designs and when using analysis of covariance participant someone who takes part (usually in a study) partition to split up. 
The term is most often used when trying to decide if relationships (for example, dose-response relationships) are linear or quadratic. In this instance, we often refer to partitioning the sums of squares patent the process of registering, or the documents confirming, ownership of an invention (such as a new drug), thus protecting that invention from being copied pathogen a microbiological organism that is capable of causing disease pathogenesis the cause and subsequent development of a disease


pathology the science of the causes of disease patient a subject who has a disease or other illness. Note that the requirement to have a disease or other illness diﬀerentiates from the broader term ‘subject’. Note also that ‘volunteer’ is not a good choice of word when describing those who take part in studies because healthy subjects and diseased patients should all be taking part voluntarily patient accrual patient enrolment patient chart any kind of chart or graph on which a patient’s data are plotted patient compliance the degree to which an individual patient complies with the study protocol as a whole, or speciﬁcally complies with taking the appropriate medication patient contact any type of meeting between a patient and a health worker. The contact may be face to face, by telephone, by letter, etc. patient enrolment the process of recruiting patients into a study patient enrolment period the time period during which patients are enrolled into a study patient follow-up the process of observing a patient over time, after they have been given study medication. G follow-up data, follow-up period, follow-up visit patient home visit a visit (usually by a study nurse or an investigator) to a patient, in the patient’s home patient id subject id patient identiﬁcation number subject identiﬁcation number patient information booklet a small booklet given to subjects, before they agree to take part in a study, to give them information about the study to help them decide if they are prepared to volunteer patient information sheet a smaller form of a patient information booklet that is just a single sheet of paper patient monitoring observation of a patient to ensure safety (primarily) and sometimes to record eﬃcacy data patient population the entire (theoretical) population of patients that could be recruited into a study. The term is also used to refer to the diﬀerent analysis populations (intention-to-treat population, per protocol population, safety population, etc.) 
patient record the data referring to a single patient patient recruitment patient enrolment pay journal a journal for which the cost of publication has to be met (fully or partially) by the authors of the manuscripts. Peer review may, or may not, also be required. Û peer review journal


peak any area on a graph that shows a rise and subsequent fall peak value the maximum value from a set of related data. Usually it is from data that are all from one subject Pearson chi-squared statistic chi-squared statistic. The term is often preﬁxed with ‘Pearson’ to distinguish it from other forms of statistical signiﬁcance tests that also use the chi-squared distribution Pearson product-moment correlation coeﬃcient correlation coeﬃcient Pearson residual residuals in contingency tables (and in logistic regression models). Each residual is calculated as the diﬀerence between the observed value and expected value, divided by the square root of the expected value peer a colleague or other person who is considered an equal in scientiﬁc merit and experience peer review when an independent scientist of similar standing and experience to the ﬁrst reviews a manuscript or other documents or working practices and makes comments. G expert review peer review journal a journal that sends submitted manuscripts for peer review. It is usually assumed that there is no charge for publishing an accepted manuscript. Û pay journal per protocol analysis the analysis of study data that excludes data from subjects who did not adequately comply with the study protocol. Û intention-to-treat per protocol population the subset of subjects recruited into a study who are included in the per protocol analysis per unit cost the extra cost incurred (per person treated, per bottle manufactured, etc.) It does not include basic set up costs. Û ﬁxed cost percent of 100. For example, the phrase ‘37 percent of patients responded to treatment’ means that ‘of every 100 patients given treatment, 37 responded’ percent diﬀerence index the diﬀerence between two percentages. 
G percentage point percentage point the term is often used in a similar way to percent diﬀerence index: when considering a change from, for example, 20% to 30%, this can be described as a ‘50% increase’ or a ‘diﬀerence of 10 percentage points’ percentile centile percentile–percentile plot quantile–quantile plot per comparison error rate comparisonwise error rate per experiment error rate experimentwise error rate performance measure any measurement of how well a person or group of


people carried out a particular task; or a measurement of how well an experiment or measuring instrument does what it is intended to do performance monitoring the process of reviewing performance (that is, how well a task is being done), with a view to making improvements, if necessary. The term can equally well apply to reviewing the performance of people or machines period an interval of time. In the speciﬁc context of crossover studies, the term refers to the intervals of time when a subject is given the ﬁrst treatment (period 1), when they are given the second treatment (period 2), etc. period eﬀect any systematic diﬀerence in response between two periods. Most commonly used in the context of crossover studies period prevalence the prevalence (number of cases) of an event during a speciﬁed period of time. Û point prevalence periodic safety update report a regular report sent to a regulatory authority with details of all adverse events reported for a product peripheral of secondary importance. In computer terms, it refers to any additional piece of hardware that can be added to a computer (image scanners, printers, etc.) permutation any ordering of a given ﬁxed set of data values permutation test nonparametric test permute to rearrange a set of data values to form a new permutation. G randomise permuted block randomised block personal computer typically a small (although possibly quite powerful) computer. Such computers are suﬃciently small that they easily ﬁt onto a desk; some are small enough to ﬁt into a small briefcase. Û mainframe computer personal data data about individual people. Often it is restricted to data that may be considered as of a sensitive nature (sexual behaviour, illegal substance abuse, etc.) personal probability in Bayesian statistics, this is one person’s prior probability of an event. 
It is sometimes called a ‘personal probability’ to emphasise that diﬀerent people may legitimately have diﬀerent prior probabilities for the same event, so the prior probability is of a ‘personal’ nature person-time see person-year as an example. ‘Time’ can be any chosen units person-year when many people have been exposed to an intervention for varying lengths of time, the total time of exposure for all people can be calculated and expressed as if it were one individual exposed for this


total length of time. For example, two people each exposed for 6 months would equate to one person-year; one person exposed for 12 months and another with zero exposure would also equate to a total exposure of one person-year pessary a suppository inserted into the vagina pharmaceutical relating to drugs. Û biologic, phytomedicine pharmaceutical company a commercial organisation that researches, develops, manufactures and markets drugs pharmaceutical industry pharmaceutical companies and other support companies involved in the research, development, manufacture and marketing of drugs pharmacist a person qualiﬁed to prepare, safely store and dispense drugs pharmacodynamics broadly, the action of a drug on the physiology of the body. Û pharmacokinetics pharmacoeconomics the study of economic implications of drug usage. This can be used either to try to justify use of drugs as an economic beneﬁt or to evaluate the cost associated with a patient having a disease compared with the cost needed to treat the patient pharmacoepidemiology the study of drug usage and results (positive and negative) in broad populations with a view to a better understanding of beneﬁcial drug usage. G epidemiology, outcomes research, pharmacovigilance pharmacogenetics the study of how drugs aﬀect the genetic makeup of the body pharmacokinetics broadly, the action of the body on a drug. Pharmacokinetics includes the study of the rate of absorption and distribution of products into and around the blood stream, and the rate (and methods) of elimination of drug from the body. 
Û pharmacodynamics pharmacology the study of drugs (including uses, benefits, harmful effects and stability) pharmacovigilance the study of adverse events (presumed to be related to drug usage) in broad populations pharmacy a place where drugs are stored in secure conditions and under the control of a pharmacist phase different stages of drug development and testing (see Phase I, II, III, IV study) or, used on its own, to denote different stages within a study. In this latter case, see phase of study Phase I study the earliest types of studies that are carried out in humans. They are typically done using small numbers (often fewer than 20) of healthy subjects and are used to investigate pharmacodynamics, pharmacokinetics and toxicity Phase II study studies carried out in patients, usually to find the best dose of drug and to investigate safety. This term is sometimes split into the subgroups Phase IIa studies and Phase IIb studies Phase IIa study of a set of Phase II studies, the earlier ones (often on fewer patients) are sometimes referred to as Phase IIa. Û Phase IIb study Phase IIb study of a set of Phase II studies, the later ones are sometimes referred to as Phase IIb. Û Phase IIa study Phase III study generally these are major studies aimed at conclusively demonstrating efficacy. They are sometimes called confirmatory studies and (in the context of pharmaceutical companies) typically are the studies on which registration of a new product will be based. They are sometimes split into so-called Phase IIIa studies and Phase IIIb studies Phase IIIa study this term is not often used; the term Phase III study is usually adequate. Û Phase IIIb study Phase IIIb study when a product already has a marketing authorisation but the indication is being expanded, new Phase III studies are needed to demonstrate efficacy in the new indication. Since the Phase III studies in the drug’s development have already been completed, these new studies are sometimes referred to as Phase IIIb Phase IV study these are studies carried out after registration of a product. They are often for marketing purposes as well as to gain broader experience with using the new product. G post marketing surveillance study, seeding study phase of study denotes different stages within a study. For example, a study may have a washout period, treatment period, follow-up period, etc.; each of these could be referred to as a phase rather than a ‘period’ phlebotomy the taking of blood physician a medically qualified person who can treat patients. G investigator physiology the study of the functioning of the body and body systems phytomedicine drugs developed from plants. 
Û biologic, pharmaceutical pi (π) a mathematical constant. It is the ratio of the circumference of a circle to its diameter although it has many uses in mathematics and statistics beyond this pie chart a circular graph used for showing percentages. Schematically it resembles a pie (or a cake) with slices cut out; each slice being proportional (in area) to the proportion of data being represented (Figure 25). Û stacked bar chart pilot pilot study
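
The person-year entry earlier in this section gives two worked examples of total exposure. A minimal sketch of that arithmetic follows; the function name and the choice of months as the input unit are assumptions made for this illustration only.

```python
def person_years(exposure_months):
    """Total exposure across all subjects, expressed in person-years.

    exposure_months: one entry per subject, giving that subject's exposure in months.
    """
    return sum(exposure_months) / 12


# Two people each exposed for 6 months
two_half_years = person_years([6, 6])

# One person exposed for 12 months and another with zero exposure
one_full_year = person_years([12, 0])

print(two_half_years, one_full_year)
```

Both cases give the same total of one person-year, which is the point the entry makes: the total says nothing about how the exposure was spread across individuals.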


Figure 25 Pie chart. The proportion of patients recruited by each of four study centres is represented. In this example, the actual number of patients recruited at each centre is also indicated

pilot study a small study for helping to design a further, conﬁrmatory study. The main uses of pilot studies are to test practical arrangements (for example, how long do various activities take? is it possible to do all the things we want to?), to test questionnaires (do the subjects understand the questions in the way we intended?) and to investigate variability in data. G internal pilot study pilot test usually an informal type of pilot study pivotal something on which a major decision (possibly to continue or cease developing a compound) will depend pivotal study a study that is pivotal. It may be pivotal for internal company use or for regulatory use. In the latter case, G conﬁrmatory study placebo an inert substance usually prepared to look as similar to the active product being investigated in a study as possible. In some situations, the term vehicle is used instead. G blinding placebo control giving placebo to a control group of subjects placebo controlled placebo controlled study


placebo controlled study a description of a study that implies there is a control group who receive placebo
placebo effect a nonspecific term used to encompass any (usually beneficial) changes that occur within a group ‘treated’ with placebo. G effect, treatment effect
placebo group the group of subjects assigned to receive placebo. Û treatment group
placebo lead in period see placebo run in period
placebo period a period within a study where subjects are given placebo. G placebo run in period, placebo washout period
placebo run in period a run in period where all subjects are given placebo. G placebo washout period
placebo subject a subject who has been allocated to receive placebo
placebo treatment an alternative term simply for placebo. Strictly, placebo is not a ‘treatment’ but the term is still commonly used
placebo washout giving subjects placebo when the purpose is to allow any other medication that may be in the body to be eliminated. This may be at the beginning of a study (see placebo run in period) or between periods in a crossover study
placebo washout period the time during which placebo washout takes place
plagiarise to extensively copy someone else’s ideas or work without adequately acknowledging them
plagiarism the act of copying someone else’s ideas or work without adequately acknowledging them
plasma the liquid part of blood
platform often used to refer to types (and manufacturers) of different computers. See, for example, mainframe computer, personal computer
plausibility check an edit check to test if data items appear plausible. This may be based on a simple range check or may be a more complex consistency check. Note that data that pass a plausibility check may still not be correct
play-the-winner rule a method of assigning treatment to subjects. When the response is binary, the next subject will be given the same treatment as the last subject, if the last subject showed a positive response. However, if the last subject showed a negative response, then the next subject will be given the alternative treatment
play-the-winner treatment assignment a method of treatment assignment that uses a play-the-winner rule
plot a graph; or to draw a graph
plus and minus see plus or minus. Note the distinction between ‘and’ and ‘or’ is poorly used. G and/or
plus infinity the term infinity strictly means plus infinity but in some situations it is helpful to distinguish from minus infinity
plus or minus (
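
The play-the-winner rule defined above can be illustrated with a short sketch. It is not taken from the dictionary: the function name, the treatment labels "A" and "B", and the choice of the first subject's treatment are all assumptions made for illustration, and responses are supplied as a ready-made list rather than observed from real subjects.

```python
def play_the_winner(first_treatment, responses):
    """Return the treatment given to each subject under a play-the-winner rule.

    responses[i] is True if subject i showed a positive (binary) response.
    The labels "A" and "B" are hypothetical treatment names.
    """
    other = {"A": "B", "B": "A"}
    assignments = [first_treatment]
    # Each subject's response decides the treatment of the NEXT subject,
    # so the last response is not needed to build the list.
    for positive in responses[:-1]:
        last = assignments[-1]
        # Positive response: repeat the treatment; negative: switch.
        assignments.append(last if positive else other[last])
    return assignments


if __name__ == "__main__":
    print(play_the_winner("A", [True, True, False, True, False, False]))
```

Because each assignment depends only on the previous subject's outcome, the sequence is fully determined once the first treatment and the responses are known.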


To Nikki, Anya and Huw

Dictionary for Clinical Trials Simon Day Medical Department, Leo Pharmaceuticals, Princes Risborough, UK

JOHN WILEY & SONS, LTD Chichester · New York · Weinheim · Brisbane · Singapore · Toronto

Copyright © 1999 by John Wiley & Sons Ltd, Baﬃns Lane, Chichester, West Sussex PO19 1UD, England National 01243 779777 International (; 44) 1243 779777 e-mail (for orders and customer service enquiries): [email protected] Visit our Home Page on http://www.wiley.co.uk or http://www.wiley.com All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency, 90 Tottenham Court Road, London, UK W1P 9HE, without the permission in writing of John Wiley & Sons Ltd, Baﬃns Lane, Chichester, West Sussex, UK PO19 1UD. Other Wiley Editorial Oﬃces John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012, USA WILEY-VCH Verlag GmbH, Pappelallee 3, D-69469 Weinheim, Germany Jacaranda Wiley Ltd, 33 Park Road, Milton, Queensland 4064, Australia John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop 02-01, Jin Xing Distripark, Singapore 129809 John Wiley & Sons (Canada) Ltd, 22 Worcester Road, Rexdale, Ontario M9W 1L1, Canada Library of Congress Cataloging-in-Publication Data Day, Simon. Dictionary for clinical trials / Simon Day. p. cm. Includes bibliographical references. ISBN 0-471-98611-9 (cased : alk. paper).—ISBN 0-471-98596-1 (paper : alk. paper) 1. Clinical trials—Dictionaries. I. Title. [DNLM: 1. Clinical Trials dictionaries. 
QV 13 D275d 1999] R853.C55D39 1999 610.72–dc21 DNLM/DLC for Library of Congress 99-11151 CIP British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library ISBN 0-471-98611-9 (cased) ISBN 0-471-98596-1 (paper) Typeset in 9/10pt Times from the author’s disks by Vision Typesetting, Manchester. Printed and bound in Great Britain by Antony Rowe Ltd., Chippenham, Wiltshire. This book is printed on acid-free paper responsibly manufactured from sustainable forestry, in which at least two trees are planted for each one used for paper production


Preface It is now ﬁfty years since the British Medical Research Council published the results of a trial entitled ‘Streptomycin treatment of pulmonary tuberculosis’ (British Medical Journal, 30th October 1948, pages 769—782). That study is widely regarded as the ﬁrst randomised clinical trial. Earlier examples of nonrandomised studies are cited, notably that of J Lind (A Treatise on the Scurvy, 1753). Despite such a history and the enormous numbers of trials conducted and published in the last twenty or so years, many people do not consider ‘clinical trials’ as a discipline in its own right and, as such, the breadth of terms that should be covered in a dictionary of this kind is not well deﬁned. Ultimately, the choice of entries is a personal one, guided by experiences of what I have had to learn and what my colleagues in various specialities of the clinical trials spectrum have struggled to understand. Additionally I have trawled clinical trial protocols, reports, regulatory guidelines and published manuscripts to try to cover the majority of terms that are likely to be encountered. A lot of the terminology of clinical trials is statistical: terms used for the design (blocks, randomisation, stratiﬁcation) and for the analysis (conﬁdence interval, P-value, survival analysis, t test, to list but a few). I make no apology for the high proportion of statistical terms: those are usually the ones that are least well understood. Overall though, the content is broad and it is very diﬃcult to summarise what is covered. It is almost as diﬃcult to summarise what isn’t covered. This is not a dictionary of medical terms, of statistical terms, of epidemiological, ethical or data management terms. It does, however, contain elements of all those disciplines, the ﬁrst three in particular. 
Many of the epidemiological terms included would not ordinarily be found in a clinical trial protocol or report; however, in the discussion of whether a clinical trial is appropriate for answering a particular medical question, or in discussion of trial results alongside other sources of evidence, other approaches such as case-control studies and cohort studies are likely to be discussed. I have not included specific diseases (a medical dictionary would be more appropriate) or names of clinical rating scales but I have included a variety of medical terms that are frequently assumed to be understood

(terms such as acute, chronic, subcutaneous, etc.). Abbreviations are not included, except in the few instances where a term is better known by its acronym than by its full name (COSTART and MedDRA are obvious examples). Nor are the names of professional or scientific societies, research institutions or regulatory authorities included.

The intended readership for this dictionary is all those people who work with clinical trial protocols and reports or who otherwise need to understand the use of language in this specialist area. Such a readership includes clinical trialists (those people who actually carry out the various administrative, clerical and scientific aspects), those who sit on ethics committees, those who work in regulatory departments or grant awarding bodies, doctors, nurses, pharmacists (and patients) reading clinical trial reports, and so on. Trials sponsored by the pharmaceutical industry, as well as those conducted by academic institutions or by small groups of enthusiasts, all fall within the scope of this work as do community-based intervention studies, vaccine trials, studies of medical practice and medical devices. Necessarily, many entries will be more relevant to some types of trials and trialists than to others. I hope the coverage is adequate without being too cumbersome.

The style of explanations and definitions is aimed at being pragmatic and readable rather than purist. Pre-existing definitions (often in regulatory guidelines) have not necessarily been faithfully reproduced, although care has been taken to incorporate the essential meaning from relevant guidelines. As an example, the term ‘adverse event’ has a very specific definition within the International Conference on Harmonisation although the explanation given here is a little more brief. Further examples of pragmatism abound in the explanations of some statistical terms.
Many statisticians may challenge the correctness of my explanations of analysis of covariance, Bayesian statistics or P-value, for example: I apologise to them in advance but hope that the explanations I have given will help those readers who understand little or nothing of such terms to at least gain a rough and ready grasp of their meaning. Similarly, ‘ethics’ is covered in a mere two lines: there are other related entries but the aim is to get the essential meaning across. Full and complete explanations of all the terms included would mean this work taking on the scale of a series of text books and that is not the intention. I hope that the explanations given here, put in the context where the word or expression has arisen, will allow most readers to unravel most uncertainties.


In my defence over accuracy and quality control I can claim that every single entry has been reviewed by a variety of my colleagues; and in their defence I acknowledge that every single error, discrepancy and inconsistency remains my responsibility.


The Ground Rules The following is a brief guide to what’s in and what’s not in, and rules for cross-referencing related or alternative terms. In general, study is used rather than trial except where the distinction is helpful (strictly speaking, study encompasses trial but many types of study will not be trials). Similarly, trial is taken to mean clinical trial. For example, acute study is listed, but not acute trial or acute clinical trial. Phrases may sometimes be abbreviated but, I hope, without causing any diﬃculty in ﬁnding them. For example, adaptive design should be taken to encompass adaptive trial design and adaptive clinical trial design. Where alternative terms may be used interchangeably I have tried to pick the most common term to deﬁne and its synonyms will simply direct you there with the symbol . For example, alpha error simply says ‘ type I error’ (where an explanation is given). The most important terms used within the deﬁnitions of other terms are emboldened, as are references to contrasting terms (Û . . .) and related terms (G . . .). I hope that sometimes giving indication of contrasting or related terms may help understanding. It is inevitable, however, that some deﬁnitions will be circular: active control contrasts with (Û) placebo control; placebo control contrasts with (Û) active control. Ultimately, just as with all dictionaries, all deﬁnitions must use the terms herein to explain other terms and the circularity becomes inevitable.


Bibliography

There is a variety of books written about clinical trials and several other dictionaries and glossaries that may prove helpful in defining terms and clarifying their use. The following titles have proved particularly helpful in compiling this dictionary and may serve as useful additional sources of reference:

Applied Clinical Trials (various issues)
Churchill’s Illustrated Medical Dictionary (1989) New York: Churchill Livingstone.
Boyd KM, Higgs R and Pinching AJ (1997) The New Dictionary of Medical Ethics. London: British Medical Journal.
Bull K and Spiegelhalter DJ (1997) Survival analysis in observational studies. Statistics in Medicine 16:1041–74.
Duncan AS, Dunstan GR and Welbourn RB (1981) Dictionary of Medical Ethics, revised edition. London: Darton, Longman and Todd.
Dupayrat J (1990) Dictionary of Biomedical Acronyms and Abbreviations, 2nd edition. Chichester: John Wiley & Sons.
Everitt BS (1995) The Cambridge Dictionary of Statistics in the Medical Sciences. Cambridge: Cambridge University Press.
Friedman LM, Furberg CD and DeMets DL (1985) Fundamentals of Clinical Trials, 2nd edition. Littleton: PSG Publishing Company.
Grieve AP (1998) FAQs of Statistics in Clinical Trials. Richmond: Brookwood Medical Publications.
Heister R (1989) Dictionary of Abbreviations in Medical Sciences. Berlin: Springer-Verlag.
Jadad A (1998) Randomised Controlled Trials. London: British Medical Journal.
Johnson FN and Johnson S (1977) Clinical Trials. Oxford: Blackwell Scientific Publications.
Last JM (1995) A Dictionary of Epidemiology, 3rd edition. New York: Oxford University Press.
Marriott FHC (1990) A Dictionary of Statistical Terms, 5th edition. Harlow: Longman Scientific and Technical.
Meinert CL (1986) Clinical Trials: Design, Conduct and Analysis. New York: Oxford University Press.
Meinert CL (1996) Clinical Trials Dictionary: Terminology and Usage Recommendations. Baltimore: The Johns Hopkins University.
Nahler G (1994) Dictionary of Pharmaceutical Medicine. New York: Springer-Verlag.
Pereira-Maxwell F (1998) A–Z of Medical Statistics. London: Arnold.
Po AL (1998) Dictionary of Evidence Based Medicine. Oxford: Radcliffe Medical Press.
Pocock SJ (1983) Clinical Trials: A Practical Approach. Chichester: John Wiley & Sons.
Rasch D, Tiku ML and Sumpf D (1994) Elsevier’s Dictionary of Biometry. Amsterdam: Elsevier Science.
Raven A (1993) Clinical Trials: An Introduction. Oxford: Radcliffe Medical Press.
Samson P (1975) Glossary of Bacteriological Terms. London: Butterworth and Co (Publishers) Ltd.
Schwartz D, Flamant R and Lellouch J (1980) Clinical Trials. London: Academic Press.
Senn S (1997) Statistical Issues in Drug Development. Chichester: John Wiley & Sons.
Spilker B (1991) Guide to Clinical Trials. New York: Raven Press.
Spriet A and Simon P (1985) Methodology of Clinical Drug Trials. Basel: Karger.
Steen EB (1978) Abbreviations in Medicine, 4th edition. London: Baillière Tindall.
Vogt WP (1993) Dictionary of Statistics and Methodology. London: Sage Publications.
Winslade J and Hutchinson DR (1993) Dictionary of Clinical Research. Brookwood: Brookwood Medical Publications.


A

a posteriori after the event; generally referring to decisions made or actions taken after data or results of a study have been seen. Û a priori. G Bayes’ theorem, posterior distribution
a priori before the event; generally referring to decisions made or beliefs held before data or results of a study have been seen. Such decisions or beliefs may be based on data from previous studies or subjective feeling based on informal clinical experience. Û a posteriori. G Bayes’ theorem, prior distribution
Abbé plot see L’Abbé plot
abscissa x axis. Û ordinate (or y axis)
absolute change the numerical difference between two numbers as in, for example, change from baseline. Û relative change
absolute frequency the number of items or the number of occurrences of a specified event. Often abbreviated simply to frequency. Û relative frequency
absolute risk the number of events (deaths, adverse reactions, etc.) divided by the number of individuals who could have experienced the event. Û relative risk
absolute value a numerical value that ignores any positive or negative sign; for example, the absolute value of +3 is +3; the absolute value of −3 is also +3
absorption the process by which drug enters the blood stream. Û clearance, elimination
absorption study a study that measures the time taken for drug to be absorbed into the blood stream
accelerated failure time model a statistical model used in survival analysis that assumes the effect of one treatment is to multiply the median survival time for patients randomised to one treatment group relative to that of patients randomised to another treatment group. Û Cox’s proportional hazards model
acceptance error the error of accepting a statement (usually a null


hypothesis) when that statement or hypothesis is false. Û rejection error. G producer’s risk, Type II error
acceptance region the values of a test statistic (for example, calculated values of t in a t test or of chi-squared in a chi-squared test) that lead to accepting the null hypothesis. Û rejection region. G critical value
accountability taking responsibility for one’s own actions
accrue to gather or accumulate (often with respect to patients, data or information)
accumulate to collect more and more (patients, data, information, etc.) over time
accumulating data when more and more data are available as time progresses. Usually used in the context of sequential analysis or group sequential analysis
accuracy nearness of an observed value to its true value (even if the true value may never be known). Also used with respect to a measurement process to describe how closely that process measures the true quantity. Û precision
accurate close to the true value. Û precise
active control a comparator group in a study that receives an active treatment. Û placebo control
active control equivalence study (ACES) a study designed to show therapeutic equivalence between two active products
active ingredient the pharmacologically or biologically active parts of a product (the tablet, capsule, etc.). G formulation, presentation
active treatment generally means a noninert pharmacological product or biological substance (not a placebo). The term is also sometimes used to describe the treatment of primary interest, rather than a comparator (but still active) treatment
actuarial method see life table analysis
acute rapid onset and short lasting. A disease may be acute (for example chicken pox) as opposed to chronic (for example diabetes). Sometimes the term is used to describe part of a study that is used to treat the disease of interest, in contrast to a long term follow-up period looking for relapse or long term drug safety. Such a short term part of a study is sometimes called the acute phase of the study. Û chronic
acute episode short term appearance of symptoms of an underlying chronic (long lasting) illness. For example, bronchitis may be a chronic illness with acute episodes
acute phase see acute. Û follow-up period
acute study short term study (usually of a long lasting disease).


Û chronic study acute toxicity study a study to investigate the short term toxicity of a product, usually a single dose of a drug. Û repeated dose toxicity study. G reproductive and developmental toxicity study ad hoc one oﬀ. Something unique to a particular problem adaptive design study procedures that change as the study progresses. Most often refers to the details of the randomisation process changing as the study progresses and results become known. Such designs are used so that, if it appears that one treatment is emerging as superior to another, the allocation ratio can be biased in favour of the treatment that seems to be best. G dynamic allocation adaptive inference conclusions that can be made as data and information accumulate. Although this seems obvious, in many studies conclusions are drawn only once at the end of the study; adaptive inference may draw conclusions as the study progresses adaptive randomisation adaptive design adaptive treatment assignment adaptive design additive model a statistical model where the combined eﬀect of separate variables contribute as the sum of each of their separate eﬀects. Û multiplicative model. G interaction adequate and well controlled a term describing a study that is suﬃciently large, properly randomised, and blinded adjust to modify (usually the estimate of a treatment eﬀect) to account for diﬀerences in patient characteristics between treatment groups. G adjusted estimate adjusted estimate an estimate of a parameter as would have been observed at some speciﬁed value of another variable. For example, high blood pressure (and its treatment) may be related to age and so we may wish to estimate the eﬀect of a drug on people of diﬀerent ages. G analysis of covariance adjuvant therapy extra treatment given to enhance the eﬀect of a monotherapy. 
For example, sensitising drugs to enhance the effect of radiotherapy
administer to give (in the sense of giving treatment)
administrative review a review of (usually accumulating) study data where the purpose is to monitor practical aspects of the study’s progress (such as recruitment rates, shipment of laboratory samples, etc.). Û interim analysis
admission criteria see inclusion criteria
adverse drug experience see adverse event


adverse drug reaction see adverse reaction
adverse drug reaction on-line information tracking (ADROIT) a database kept of adverse reactions to marketed products
adverse event any (usually) unwanted effect that a subject experiences whilst taking a drug. Note that causality is not implied. Û adverse reaction
adverse experience see adverse event
adverse reaction see adverse event but note that causality to a particular drug is implied
adverse treatment effect see adverse reaction
advocate to support a given argument, opinion, or point of view
aetiology the cause of a disease or the study of disease causality
agency see regulatory authority
aggregate to combine separate data values into groups of aggregate data
aggregate data data that have been grouped in categories. For example, all ages of patients in the range 0 to 5 put into one category, ages 6 to 12 in another category, etc.
agonist a drug that enhances or activates the effect of a natural body chemical or of another drug. Û antagonist
algorithm a written description of a mathematical equation or decision rule. It is usually written partially in words (although not necessarily in complete and proper sentences) rather than just a set of mathematical expressions
all patients treated analysis see intention-to-treat analysis
all patients treated population see intention-to-treat population
all subsets regression a method of deciding which variables should be in a regression model. G backward elimination, forward selection
allocate to assign (typically a treatment to a patient) either by randomisation or by some deterministic method
allocation ratio in a parallel group study the ratio of the number of patients allocated to one treatment group relative to the number allocated to another treatment group. Most often, the ratio is 1:1, or equal allocation
alpha (α) the probability of making a Type I error. Û beta (β). G significance test
alpha error see Type I error
alpha spending function a method in sequential studies such that the times when interim analyses are performed do not need to be specified in advance. The number of, and timing of, interim analyses can be flexible
alphanumeric data that may be alphabetical (a, b, c, . . . , A, B, C, . . . , including special symbols such as +, £, %) or numeric (0, 1, 2, . . . 9)
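
As a rough illustration of the aggregate data entry above, the sketch below groups individual ages into categories and counts how many fall in each. Only the first two bands (0 to 5 and 6 to 12) come from the entry; the remaining bands, the function names and the sample ages are hypothetical choices made for this example.

```python
from collections import Counter

# Inclusive age bands. The first two follow the example in the entry;
# the others are hypothetical and would be chosen per study.
AGE_BANDS = [(0, 5), (6, 12), (13, 17), (18, 64), (65, 120)]


def age_band(age):
    """Return the label of the band that contains a single age."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError(f"age {age} falls outside every band")


def aggregate(ages):
    """Turn individual ages into aggregate data: a count per category."""
    return Counter(age_band(a) for a in ages)


if __name__ == "__main__":
    print(aggregate([3, 5, 6, 12, 40, 70]))
```

Once aggregated, the individual values can no longer be recovered from the counts, which is the practical difference between aggregate data and the original data values.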


alternate allocation a method of assigning treatments to patients whereby the first patient receives Treatment A, the second receives Treatment B, the third Treatment A, the fourth Treatment B and so on in a predictable (alternating) manner. Û random allocation
alternative hypothesis (H1) this is usually the point of interest in a study. It is generally phrased in terms of the null hypothesis (of no treatment effect) not being true. If the objective of a study is to ‘compare Drug A with placebo’ then the null hypothesis would be that there is no difference between the two groups and the alternative hypothesis would be that there is a difference
alternative medicine approaches to medicine such as homeopathy, acupuncture, herbal medicines, etc., considered by many people to be nonconventional medicines
altruism putting the interests of the individual first; specifically in clinical trials, putting the interests of the individual before those of the research project. G collective ethics, individual ethics
amendment see protocol amendment
ampoule see vial
analysis the process of summarising data or problems, describing them clearly (including plotting data) and drawing conclusions
analysis by administered treatment a strategy where data are summarised and conclusions drawn based on the treatment that patients were actually given. Û analysis by randomised treatment
analysis by assigned treatment see analysis by randomised treatment
analysis by randomised treatment a strategy where data are summarised and conclusions drawn based on the treatment that patients were supposed to be given (the treatment they were randomised to receive), regardless of what they actually took. It is very similar to the term intention-to-treat. Û analysis by administered treatment
analysis of covariance (ANCOVA) a statistical analysis method that is an extension of analysis of variance. It allows estimates of treatment effects to be adjusted for possible covariates as well as factors
analysis of variance (ANOVA) a statistical analysis method that allows comparison of two or more treatment groups and estimates of treatment effects to be adjusted for other possible factors such as race, gender, treatment centre, etc. It is a very general method covering a very broad range of techniques and can be used in a great variety of situations. Because of this, to describe a method of analysis as being ‘analysis of variance’ is rarely sufficient to adequately describe what analysis has actually been carried out
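
The alternate allocation entry above describes a fully predictable assignment sequence. A minimal sketch of that idea follows; the function name and the default labels "A" and "B" are assumptions for illustration only.

```python
from itertools import cycle, islice


def alternate_allocation(n_patients, treatments=("A", "B")):
    """Assign treatments in strict rotation: A, B, A, B, ...

    The treatment labels are hypothetical placeholders.
    """
    return list(islice(cycle(treatments), n_patients))


if __name__ == "__main__":
    print(alternate_allocation(5))
```

Because the next assignment can always be worked out from the patient's position in the sequence, this scheme contrasts with random allocation, as the entry notes.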


analysis policy see analysis strategy
analysis population the set (often subset) of patients recruited to a study who are subsequently included in the data analysis. Examples are the all patients treated population, per protocol population
analysis strategy this combines the decision whether to use an all patients treated analysis, an intention-to-treat analysis, a per protocol analysis, or some other policy and considerations such as whether to use, for example, parametric methods or nonparametric methods, Bayesian inference or frequentist inference
anatomical therapeutic chemical classification system (ATC) a drug coding system that codes according to a drug’s site of action and its indication
and/or a badly used term that often causes confusion, particularly over whether the word ‘or’ is considered as inclusive or exclusive. For example, if there are two events P and Q, one option is that both P and Q may occur; another option is that P or Q (but not both) may occur: this is the ‘exclusive or’; finally P or Q (or both) may occur: this is the ‘inclusive or’. If ‘or’ is considered inclusive then the term ‘and/or’ is redundant: ‘P or Q’ includes ‘P and Q’; if ‘or’ is considered exclusive then the term may have some use. It is probably better to use a few more words and explain what is intended
anecdotal evidence unsubstantiated evidence that cannot be strongly relied on. It is usually considered as more informed than mere opinion and often used as a means of generating ideas and research questions
aneugen a substance that causes toxic effects on DNA. G clastogen
angular transformation a transformation applied to data that are of the form of proportions to allow use of statistical methods based on the Normal distribution. Where the proportion is p, the transformation is y = arcsin(√p). G logistic function, probit transformation
animal model results from experiments in (nonhuman) animals, used to extrapolate results to humans
animal study a study carried out in (nonhuman) animals. G preclinical study
antagonist a drug that prevents or reverses the effect of a natural body chemical or of another drug. Û agonist
antedependence model a statistical method for analysing a series of repeated measurements on the same individuals. The method describes the data based (partly) on earlier measurements
applicable regulatory requirements requirements of a regulatory authority that are either general to all studies or apply specifically to the


experimental or geographical circumstances relevant to a particular study approval the process of an individual or group of individuals with appropriate authority agreeing to a request. This may take the form of approving a protocol, a submission to a research ethics committee, a submission to a regulatory authority, etc. approximate close to the true value. Note that the ‘true’ value may not be known and the interpretation of ‘close’ may vary from one situation to another, so this is a rather vague term approximation a method of estimating a parameter that gives an approximate answer archive to keep a historical record in secure conditions to conﬁrm the data obtained and the procedures that were followed during the course of a study. G backup arcsin transformation angular transformation area under the curve a summary measure of data that have been collected repeatedly over time. The data are plotted with time on the x axis and the measurement on the y axis. The area is that between the line connecting the data points and the x axis (Figure 1) mean arithmetic mean arm synonym for group (as in randomised group) artefact an aspect of data that is not substantiated in other data sets and is not a real eﬀect ascending order data sorted so that the smallest value comes ﬁrst, the larger values later and the largest value last. This can be applied to alphanumeric data (by sorting into alphabetic order with special rules for including numbers and special symbols) as well as numeric data. Û descending order. G ranked data ascertainment bias bias caused due to the manner in which data are collected. 
For example, surveying the general incidence of health problems near a doctor’s surgery would probably lead to an unreasonably high proportion of respondents indicating less than perfect health; in contrast, surveying near a health club might lead to an unreasonably low proportion of respondents with impaired health
ASCII a standard set of alphanumeric characters that is widely transferable between different computers. It stands for American Standard Code for Information Interchange
assay a procedure to measure the quantity of a chemical (usually drug) in a sample (usually of blood or urine)
assent agreement to something in a passive way and not after thorough consideration of the advantages and disadvantages. Note that clinical trials usually need subjects to consent to take part, not just assent

Figure 1 Area under the curve. Plot of serum concentration of drug on ten occasions up to 10 hours after administration. The area under the curve is shaded. Other features to note are C at 1.75 hours and T = 3.2 mg/ml

assessment measurement of the state of disease. This may be a measurement of blood pressure, severity of depression, quality of life, etc.
assign see allocate
assigned treatment the treatment that a patient is due to receive based on a randomisation (or other) method
associate an assistant (often in the sense of a subinvestigator)
associate investigator see subinvestigator
association a means by which two items are linked. For example, there is a link (or association) between smoking and lung cancer. G correlation
assumption a state (often a feature of data) that is taken as true although there may not be sufficient evidence to guarantee that state. A common assumption is that data come from a Normal distribution
asymmetric not symmetric, as in not evenly split around the middle. The


term is often used about distributions of data that are skewed asymptote a value that is never achieved but that is approached more and more closely. For example, repeatedly dividing a number by two will get closer and closer to zero but will never actually attain that value: in this case, zero is the asymptote asymptotic method a statistical method that assumes there is a large sample of data and which may not be suitable with small samples atopy indicates that an allergic disease such as asthma, eczema, etc. is hereditary rather than being a spontaneous new case attenuation making extreme results or statements less extreme and more typical of the norm risk diﬀerence attributable risk attribute characteristic or feature (usually of a patient). All variables (age, sex, pulse, serum calcium, etc.) are attributes attrition loss; often used to describe loss of patients’ data in long term studies due to patients withdrawing for reasons other than those of meeting the study’s primary endpoint audit a systematic review of data and operational details or study procedures audit certiﬁcate a certiﬁcate to conﬁrm that a study has been audited audit report a report (written or verbal) describing the ﬁndings of an audit. 
Such ﬁndings are usually restricted to points that do not meet expected standards of quality or completeness (rather than all aspects that do meet the expected standards) audit trail a list of reasons and justiﬁcations for all changes that are made to data or of all procedures that do not comply with agreed study procedures auditor a person responsible for carrying out an audit autocorrelation correlation between repeated measurements taken successively in time from the same subject autoencoding an automatic (usually by computer) method of assigning codes to data, for example codes for drugs or adverse events autoregressive a description of a process that produces data collected sequentially in time when each data point is potentially related (or correlated) with the previous one(s) average informal term for the mean average absolute deviation the average ( mean) amount by which a set of data values diﬀer from some reference value (usually that reference value being the mean). The diﬀerences ignore the sign (plus or minus). So, for example, the average absolute deviation of the numbers 1, 2 and 9


4 is {(2.33 − 1) + (2.33 − 2) + (4 − 2.33)} ÷ 3 = 3.33 ÷ 3 = 1.11, where 2.33 is the mean of the three numbers. See also standard deviation
average deviation see average absolute deviation
axis scale (x axis or y axis) on a graph
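The arithmetic in the average absolute deviation entry can be checked with a few lines of Python. This is only a minimal sketch: the function name `average_absolute_deviation` and its `reference` parameter are illustrative choices, not terms from the dictionary.

```python
from statistics import mean


def average_absolute_deviation(values, reference=None):
    """Mean of the absolute differences between each value and a reference.

    If no reference value is supplied, the mean of the values is used,
    which is the usual choice described in the entry.
    """
    if reference is None:
        reference = mean(values)
    # Ignore the sign of each difference, then average the results.
    return mean([abs(v - reference) for v in values])


# The numbers 1, 2 and 4 have mean 2.33 and average absolute deviation 1.11.
print(round(average_absolute_deviation([1, 2, 4]), 2))
```

Passing a different `reference` (for example the median) gives the deviation around that value instead of around the mean.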


B backup a reserve (often used in the sense of a reserve copy of data) kept under secure conditions in case of loss or corruption of the original. A more readily available and less permanent version of an archive backward elimination a method of ﬁnding which variables should be kept in a regression model by including all possible variables and then removing (‘eliminating’) those that are deemed not useful. Û forward selection, all subsets regression backward stepwise regression backward elimination bacterium single-celled microscopic organism; the cause of many diseases Balaam’s design a type of crossover design where patients are randomly assigned to receive treatments A and B in one of the treatment sequences AA, BB, AB or BA balance the state of being equal, usually with reference to the number of subjects in each treatment group. G balanced design balanced block part of an experiment (one block of it) such that within that block, the eﬀect of each treatment is estimated with equal precision balanced design an experiment in which the eﬀect of each of the treatments is assessed with equal precision, usually by having the same number of subjects in each treatment group. Note that, in crossover designs, balance refers to there being as many treatment sequences AB as there are BA balanced incomplete block design an experiment in which not all treatments being compared are represented in every block but where, overall, the occurrence of each treatment across all the blocks is the same (or balanced) balanced randomisation a randomisation method which ensures that the eﬀect of each treatment is estimated equally precisely, usually by assigning the same number of patients to each treatment group. 
Contrast with unequal randomisation
balanced study a study that has used balanced randomisation
bandit design see adaptive design
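One common way of obtaining the equal group sizes described under balanced randomisation is to randomise within small blocks. The sketch below assumes two treatments and blocks of four (two patients per treatment in each block). The function name, its parameters and the use of a fixed seed are illustrative assumptions rather than anything prescribed by the dictionary.

```python
import random


def blocked_allocation(n_blocks, treatments=("A", "B"), per_treatment=2, seed=None):
    """Return a treatment list built from randomly ordered, balanced blocks."""
    rng = random.Random(seed)  # a seed makes the list reproducible
    allocation = []
    for _ in range(n_blocks):
        # Each block holds every treatment the same number of times...
        block = list(treatments) * per_treatment
        # ...in a random order.
        rng.shuffle(block)
        allocation.extend(block)
    return allocation


schedule = blocked_allocation(n_blocks=3, seed=2024)
print(schedule)                                   # 12 allocations
print(schedule.count("A"), schedule.count("B"))   # equal numbers per treatment
```

Because every block is balanced, the treatment groups never differ in size by more than `per_treatment` at any point during recruitment.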


Figure 2 Bar chart. The number of patients who fall into each of the ﬁve categories is represented by the height of each bar bar chart a graphical method of showing the number of subjects that fall into each of two or more categories. The height of each ‘bar’ is proportional to the number of subjects within that category (Figure 2). G histogram bar diagram bar chart Bartlett’s test a method of testing the null hypothesis that several variances, each estimated from diﬀerent groups of subjects, are all equal baseline the moment in time that subjects are randomised or otherwise assigned their study medication. It is also used to refer to periods of time after a study has started but before randomisation has occurred baseline characteristic a measurement taken on a subject at the beginning of a study. Note that ‘beginning’ is generally taken to be at, or as near as possible to, the time of randomisation. G demographic data baseline comparability the process of, and results of, deciding if groups of patients assigned to diﬀerent treatment groups (usually by randomisation) 12


are similar with respect to demographic data and severity of disease baseline data baseline characteristic baseline hazard function in survival analysis, the hazard function for a subject in the control group (or a group arbitrarily chosen to be a control group) baseline testing see baseline comparability baseline visit usually the very ﬁrst visit that subjects attend in a study. If randomisation does not occur at visit 1 then baseline visit may be used to refer to any visit before (and including) the randomisation visit BASIC a computer programming language. G C, C;;, Fortran, Visual Basic Baskerville design a method for ﬁnding the most preferred of several treatments. Each subject is randomly assigned to a sequence of treatments but the length of time each patient receives each treatment is dependent on their own personal choice. If a subject is completely satisﬁed with the ﬁrst treatment they receive then they would not change and would not receive any of the other treatments. In contrast, if a patient is not happy with any of the treatments being compared they would quickly pass through the entire possible set of treatments and ﬁnish the study batch process to work on a large number of documents all at once, rather than to handle each document as it arrives. This is a common term in data management but applies to computerised systems as well as manual systems batch validation to validate a large number of documents (usually data) as a batch process baud rate the speed at which data are transmitted electronically, measured as the number of binary digits sent per second. A baud rate of 32 000 means 32 000 binary digits sent per second Bayes factor the ratio of the posterior belief to the prior belief. This can be seen as a measure of how the strength of evidence in favour of a given hypothesis has increased, given new data, relative to the prior belief. 
G Bayes’ theorem Bayes’ rule the action one takes that gives the maximum utility Bayes’ theorem the process of making judgements about the outcome of a study before the data are analysed (assigning prior beliefs), then combining these with the observed data (in the form of the likelihood) to obtain new posterior beliefs Bayesian general statistical methods based around Bayes’ theorem Bayesian inference a method of statistical inference based on Bayes’ 13


theorem as opposed to being based on classical statistical inference or frequentist inference before–after design a study in which subjects are observed before treatment is given and their disease state and severity is recorded. These subjects are then given treatment and subsequently their disease state and severity are reassessed. G crossover design Behrens–Fisher problem the problem of using a statistical signiﬁcance test to compare two means when their variances are not equal. Behrens and Fisher originally discussed the problem. Note that the usual t test assumes that the variances in the two groups are equal. It is a long standing mathematical and philosophical issue; hence being referred to as the Behrens—Fisher ‘problem’ rather than the Behrens—Fisher ‘method’ bell shaped used to describe a distribution that when drawn as a histogram or density function, has the same proﬁle as a bell. The Normal distribution (see Figure 22) is the most common example but the term should not be used exclusively for that purpose benchmarking the process of comparing activities (usually performance measures) against a standard reference value or in the absence of a standard, then against other methods to achieve the same outcome. Examples commonly include the costs of running studies between diﬀerent companies; speed of recruitment into studies in diﬀerent therapeutic areas, etc. beneﬁcial eﬀect a therapeutic eﬀect of a drug that is considered to be advantageous to the patient. It is usually meant to imply alleviating symptoms of the disease under study but is not limited to that. If a topical treatment were intended to alleviate symptoms of rash on the scalp and it appeared to reverse the eﬀects of alopecia then the eﬀect on alopecia would be considered a beneﬁcial eﬀect. Û adverse event beneﬁt a nontechnical term referring to advantage (of one treatment or activity over another). 
It may be measured in a variety of ways including decreased cost, increased patient satisfaction, reduced length of hospital inpatient stay, extended life expectancy benign a condition that does not produce any harmful eﬀects Berkson’s fallacy drawing wrong conclusions (usually in case-control studies) because of selection bias Bernoulli distribution the probability distribution of a binary variable best case analysis the process of making assumptions, often about data that are missing (either inadvertently or because they could not be measured), when the implications of those assumptions are that a treatment may appear to give more beneﬁt than is truly justiﬁed. 14


Contrast with worst case analysis. See also sensitivity analysis
best fit used in the context of regression and fitting lines (straight or curved) to data on graphs. The best fitting line is generally the one that has the data points closest to it (although various other criteria for ‘best’ may be specified)
best linear unbiased estimator a linear unbiased estimator that is better (usually in the sense of having a smaller variance) than any other possible linear unbiased estimator
beta (β) the probability of making a Type II error. Contrast with alpha (α). See also significance test
beta coefficient see regression coefficient
beta error see Type II error
beta level the probability of making a Type II error
between groups usually used in the sense of estimating the variation (strictly speaking the variance) of data where we are describing the variation between the means of two or more groups of subjects. Contrast with between subjects, within groups
between groups sum of squares a measure of variability (by the method of sum of squares) between different groups (treatment groups, strata, etc.) in a study. Contrast with within groups sum of squares
between groups variance see between groups
between groups variation an informal term for the between groups variance
between person see between subjects
between study in meta-analyses this is used to describe the variation that is due to differences between studies rather than differences between subjects within each study
between study variance see between study. This makes it clearer that it is the variance (or variation) that is being considered
between study variation a less formal term for between study variance
between subjects usually used in the sense of estimating the variation (strictly speaking the variance) of data where we are describing the variation between individual subjects.
Û between groups, within subjects between subjects comparison the types of analyses that are made in parallel group studies, that are unpaired comparisons, rather than paired comparisons between subjects comparison between subjects eﬀect between subjects study parallel group study between subjects sum of squares a measure of variability (by the method of sum of squares) between diﬀerent subjects in a study. Û within subjects sum of squares 15
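The between groups sum of squares and its counterpart, the within groups sum of squares, can be illustrated with a small calculation. The sketch below uses the standard definitions from one-way analysis of variance. The data are made up and the function name `sums_of_squares` is hypothetical.

```python
def sums_of_squares(groups):
    """Return (between groups, within groups, total) sums of squares.

    `groups` maps a group label to the list of values observed in that group.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)

    between = 0.0
    within = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # Variation of the group mean around the overall mean.
        between += len(values) * (group_mean - grand_mean) ** 2
        # Variation of individual values around their own group mean.
        within += sum((v - group_mean) ** 2 for v in values)

    total = sum((v - grand_mean) ** 2 for v in all_values)
    return between, within, total


# Made-up systolic blood pressures (mmHg) in two treatment groups.
data = {"A": [120, 130, 125], "B": [140, 150, 145]}
print(sums_of_squares(data))  # (600.0, 100.0, 700.0)
```

The between and within components add up to the total, which is what allows the variation in a study to be partitioned into its sources.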


between subjects variance see between subjects between subjects variation an informal term for the between subjects variance between treatments between groups bias a process which systematically overestimates or underestimates a parameter. Bias is sometimes, but not always, acceptable: for example, we routinely underestimate peoples’ ages by an average of 6 months if we record data only to the lowest whole year. G precision biased coin a method of randomisation that does not assign patients to treatments with equal probabilities. Û balanced design. G unequal allocation biased estimator a method of estimation of a parameter from data that gives a biased result bibliography a list of published books, manuscripts, etc. that discuss a particular topic bimodal having two modes bimodal distribution a distribution (either a probability distribution or a frequency distribution) that has two modes or peaks binary data data taking only one of two values: typical examples are data of the form Yes/No, Dead/Alive, Male/Female. Sometimes a third category of ‘not known’ or ‘missing’ is included but the data are still said to be binary. G categorical data binary outcome an outcome that can take only one of two values; one that yields binary data binary variable a variable that can take only one of two values; one that yields binary data binomial data binary data binomial distribution in data that are binary (yielding only ‘positive’ or ‘negative’ outcomes), the probability distribution of the number of positives is a binomial distribution. For example, the number of live births (as opposed to still births) out of the ﬁrst one hundred deliveries in a maternity unit follows a binomial distribution bioassay estimation of the potency of a drug by observing its eﬀect on a biological organism bioavailability at any time, the proportion of drug within the body that is available to give a therapeutic eﬀect biochemistry the study of chemistry in living things. 
Usually used in the context of laboratory data to refer to the amount of various chemicals (for example albumin, calcium, ethanol) in the blood. Û haematology bioequivalent two products that have the same bioavailability are said to be bioequivalent 16
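The binomial distribution entry above can be made concrete with a short calculation. The sketch below uses the standard binomial probability formula. The live-birth probability of 0.99 is an invented figure for illustration only, and the function name `binomial_probability` is hypothetical.

```python
from math import comb


def binomial_probability(k, n, p):
    """Probability of exactly k 'positive' outcomes among n independent
    binary outcomes, each positive with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Assumed (illustrative) probability that any one delivery is a live birth.
p_live = 0.99

# Probability that all of the first 100 deliveries are live births.
print(round(binomial_probability(100, 100, p_live), 3))

# Probability of exactly 98 live births out of 100.
print(round(binomial_probability(98, 100, p_live), 3))
```

Summing `binomial_probability(k, n, p)` over every k from 0 to n gives 1, as it must for a probability distribution.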


biologic a drug derived from a biological product. G biotechnology. Û pharmaceutical, phytomedicine biological assay bioassay biological marker a nonclinical (often a laboratory) measurement that is an indicator (or ‘marker’) of a clinical condition biological plausibility a hypothesis that is justiﬁable from biological theory and not just based on observable data biometrician a person who specialises in biological (including medical, genetic, agricultural) applications of statistics biometry literally ‘measurement in biology’. More generally, the application of statistical theory and methods in the biological sciences biopharmaceutical the subset of biology related to pharmacology. Often the term is used synonymously with pharmaceutical biostatistician biometrician biostatistics the application of statistical theory and methods in the biological sciences biotechnology the process of developing drugs from biological products (such drugs are then called biologics) bivariate the joint measurement and consideration of two characteristics (for example a person’s ‘size’ would often be measured in terms of their height and weight). G multivariate bivariate analysis special methods of analysis suitable for bivariate data. These are usually simpliﬁcations of general methods of multivariate analysis bivariate data measurements that consist of two response variables. For example, a person’s blood pressure could be measured as both systolic pressure and diastolic pressure. More than two variables are always referred to as multivariate data bivariate distribution the joint distribution of two separate (but often correlated or related) measurements. Û univariate distribution. G multivariate distribution black box a process whose internal workings are unknown (at least to the user) but whose output is usually trusted. Computers, for example, are black boxes to most people blind not being able to see. 
Speciﬁcally, within clinical trials, where the investigator, subject (and possibly other people) are not able to distinguish diﬀerent treatments that are being compared (by sight, smell, taste, weight, etc.) G single blind, double blind, triple blind, quadruple blind blinding the process of keeping hidden certain information about data or 17


study procedures in order to help avoid bias. Most commonly this means keeping the treatment allocation hidden from the doctors and patients (and often data management staff) taking part in a study
block several packs of medication kept together and used sequentially, each block usually having the same number of treatments (although in random order) as each other block. The concept can be extended to cases when treatment ‘packs’ do not actually exist. Commonly, if a study is comparing two treatments each ‘block’ might contain medication for four patients, two on one treatment and two on the other. See also block size
block effect any systematic difference in response that may exist between blocks of treatment medication. Such differences do not invalidate the study; the purpose of blocking is to ensure that if such differences exist, the treatment allocation is equal across blocks
block size the number of packs of treatment that form one complete block
blocked randomisation a randomisation scheme that uses blocks to help maintain balance. Contrast with completely randomised design
blocking the act of using blocks of treatment
body-mass index see Quetelet’s index
Bonferroni correction an adjustment made when interpreting multiple significance tests that all address a similar basic question. If two endpoints have been assessed separately, instead of considering whether a P-value is less than (or greater than) 0.05, each calculated P-value should be compared to 0.025. In general, if k P-values have been calculated, statistical significance should not be declared unless one or more of those P-values is less than 0.05/k
Boolean logic rules for making decisions based on combining binary outcomes using the key words AND, OR and NOT.
For example, subjects may be eligible for a study ‘IF (they are male) OR ((they are female) AND (they are using adequate contraception))’ bootstrap a simulation method used for statistical signiﬁcance testing and estimation that takes as possible (simulated) sample data values, only those data values that have actually been observed. G Monte Carlo method box and whisker plot a diagram used to show a few key features of a frequency distribution, namely the minimum, lower quartile, median, upper quartile and maximum (Figure 3). G Exploratory Data Analysis box plot box and whisker plot
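The rule given under Bonferroni correction (compare each of k P-values with 0.05/k) is simple enough to sketch in Python. The function name `bonferroni_significant` and the example P-values are illustrative assumptions.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """For each P-value, report whether it falls below alpha / k,
    where k is the number of P-values being interpreted together."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Two endpoints: each P-value is compared with 0.05 / 2 = 0.025.
print(bonferroni_significant([0.03, 0.01]))  # [False, True]
```

Here 0.03 would have counted as significant on its own at the 0.05 level, but not once the adjustment for two tests is made.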


Figure 3 Box and whisker plot. Distribution of the number of years a group of 66 patients had suﬀered from eczema. The key features illustrated are the minimum (1 year), the lower quartile (4 years), the median (15 years), the upper quartile (28 years) and the maximum (55 years) Box–Cox transformation a very general equation used to transform a set of data so that it better resembles a Normal distribution. The method was developed by the two statisticians, Box and Cox; hence the name branch on a decision tree (Figure 6), any of the possible routes that can be followed. G node brand name trade name break point used to describe a regression line that is not a continuous smooth function across the whole range of data but is made up of diﬀerent lines (often only two). The point at which the two lines (with diﬀerent slopes) meet is called the break point bridging study a study designed to extend the applicability of a conﬁrmatory study, usually to broaden the population to which the results apply. Bridging studies are usually much smaller than other conﬁrmatory studies byte a single character (one letter or a single digit in a number) as stored by a computer
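The Box–Cox transformation entry above does not state the equation itself. In its usual one-parameter form (given here from the general statistical literature, not from this dictionary), a positive data value y is transformed using a power parameter λ, which is normally chosen so that the transformed data look as close to Normal as possible:

```latex
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[2ex]
\ln y, & \lambda = 0.
\end{cases}
```

With λ = 1 the data are left essentially unchanged (apart from subtracting one), and λ = 0 gives the familiar log transformation.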


C
C a widely used, high level, computer programming language. There are other programming languages that are commonly used in clinical trials work, such as BASIC, C++, Fortran and Visual Basic, and there are specific statistical analysis programs, for example BMDP®, SAS®, SPSS®, STATA®
C++ a more advanced version of the C programming language
calibrate to check measurements against a known standard
capsule a dissolvable container (sometimes with an enteric coating) that contains a drug. See also other delivery devices such as tablet, transdermal patch
carcinogen a chemical that causes any type of cancer
carcinogenicity the potential to cause any type of cancer
carcinogenicity study a study to determine if a chemical is a carcinogen
carryover a term used mostly in the context of crossover studies where the effect of a drug is still present after that drug has ceased to be given to a subject, and in particular when that subject is taking another drug
Cartesian coordinate the place (in terms of x axis and y axis) where a data point lies on a graph. For example, if a subject’s systolic blood pressure is 120 mmHg and their diastolic blood pressure is 85 mmHg the Cartesian coordinates would be (120, 85)
case a term used synonymously with patient, although often intended to mean one with a particular identified disease. Contrast with control. See also case-control study
case history the description (usually the medical history) of an individual case
case record form (CRF) the term used for the paper on which data are written. Often a CRF comes in the form of a book with many pages of forms to record a subject’s data
case report form (CRF) see case record form
case-control study a type of study used for evaluating the causes of a particular disease.
A group of patients with the disease (the cases) are compared with another group of subjects who do not have the disease
(the controls). Their lifestyles, previous exposure to potential hazards, demographics, etc., are compared to try to distinguish which of those features predispose someone to have the disease in question case-fatality rate the death rate amongst a group of cases catchment area the geographical area from which subjects may be included in a study. The area covered by a Health Authority or in which a survey was being carried out would be termed the ‘catchment area’ categorical data data that are not pure measurements but are in the form of labels assigned, such as ‘male’ and ‘female’. G ordered categorical data categorical scale the scale on which categorical data are measured categorical variable a characteristic of a subject that results in categorical data. For example a subject’s gender is a categorical variable: it falls into the categories ‘male’ or ‘female’ categorise the process of taking data that may take many distinct values and putting them into categories. For example all the people living in a group of post codes may be classed (or categorised) as being in one town category a group (but used when the term is applied to data). Categories of blood group are A, AB, O; categories of products might be ‘prescription only’ or ‘over the counter’ causal relationship a relationship that is observed when one variable is a consequence of another. For example, alcohol intake and impaired reaction times are causally related. G correlation causality the act of causing. G correlation cause and eﬀect a phrase used to imply causality, over and above correlation ceiling eﬀect a term to describe an asymptote that is an upper limit. Û ﬂoor eﬀect cell when referring to a tabulation of data, each of the individual categories or subcategories of patients (or other data) are referred to as cells cell frequency the number of subjects within a cell of a table. 
For an example, see contingency table cell mean the mean of the data for all subjects within a cell of a table (Table 1) censor to prevent something being observed. G censored data censored data when the time until an event (typically cure, recurrence of symptoms or death) is the data value to be recorded and that event has not yet been observed for a particular subject, that data value is said to be censored. G truncated data censored observation censored data centile if a (large) set of observations are placed in order, the 1st centile is the value below which 1% of the data lie, the 2nd centile the value below 21

Table 1 Cross-classification of mean systolic blood pressure (mmHg). The data are cross-classified by treatment group and by centre. Each of the means (127.4 mmHg, 135.1 mmHg, etc.) are referred to as cell means

Centre    Treatment A    Treatment B
UK.001    127.4          135.1
UK.002    131.2          135.9
UK.003    122.0          127.6
UK.004    129.5          130.3
UK.005    141.3          147.7

which 2% of the data lie, etc. G lower quartile, median, upper quartile central laboratory a single laboratory that is used by all centres in a multicentre study, though it may not necessarily be ‘central’ in any geographical sense. Û local laboratory central limit theorem a statistical phenomenon such that the mean of several data values tends to follow a Normal distribution, even if the distribution of the original data was not Normal central processing unit (CPU) the part of a computer that carries out calculations central randomisation in multicentre studies it is common to use a separate randomisation list in each centre so that we stratify the randomisation by centre. Alternatively we may have a single randomisation sequence, held at one site (a ‘central’ site) and investigators would telephone (or otherwise contact that site) to obtain the next randomisation code central range the range in which the central 90% of the data from a distribution lie. G interquartile range, standard deviation central tendency a nonspeciﬁc summary of data (usually of continuous data) that, for any particular purpose, is useful in describing where the bulk of the data lie. The mean, median and mode are the most common measures of central tendency certiﬁcate of destruction an oﬃcially recognised document conﬁrming that speciﬁc batches of drug product have been safely destroyed. This would often apply to unused medication from a study certiﬁed oﬃcially recognised (strictly speaking, with a certiﬁcate). This can refer to an individual, a machine, a blood sample, etc. challenge test the administration of a product speciﬁcally to see if it produces an adverse reaction chance luck (good or bad). Events that happen by chance are ones that 22

Table 2 Individual subjects’ systolic blood pressure (mmHg) before and after treatment

Subject identification number    Before treatment (baseline)    After treatment    Change from baseline
1                                137                            134                −3
2                                120                            120                 0
3                                150                            163                13
4                                118                            126                 8
5                                130                            135                 5
6                                130                            122                −8

could not have been predicted with certainty. They occur with probability less than one
change from baseline when a measurement (for example subjects’ blood pressure) is taken at the time of randomisation (the baseline) and again after treatment, and the difference is calculated (as in Table 2), this difference is often used as the measure of treatment benefit. It is called the ‘change from baseline’. See also analysis of covariance
change score see change from baseline
changeover design see crossover design
changepoint model a statistical model that attempts to identify when a smooth course of events abruptly changes. For example, height of growing children may follow a smooth curve until puberty, when a sudden change in that curve would be expected. A model that allows for this would be called a changepoint model
characteristic an alternative term for data or measurement. Often (but not necessarily) restricted to demographic data and baseline data
chart a general term for any form of graph, histogram, etc.
check to confirm that something (often data) is correct
check digit a number (usually between 0 and 9) that is used as a means of checking that other numbers are correct. The last digit of the ISBN of this book is the check digit
chemotherapy the use of drugs to eradicate disease or to prevent existing disease from spreading by killing cells that are otherwise dividing and multiplying. See also cytotoxic
chi-squared (χ²) see chi-squared statistic
chi-squared distribution a probability distribution used in a wide variety of forms of data analysis. Most often in clinical trials it is used for


comparing the equality of proportions in contingency tables. However, its use is not restricted to this case
chi-squared goodness of fit test see chi-squared test
chi-squared statistic the calculated value of chi-squared from a set of data
chi-squared test a statistical significance test, the simplest application being for testing the null hypothesis that two (or more) proportions are equal
chronic long term. Contrast with acute
chronic study a study of the long term treatment of a disease. Contrast with acute study
chronobiology the study of how biological features change with time. A biological example of time series methods
chronotropic effect the effect of a drug on the rate at which the heart beats. See also inotropic effect
CIOMS form a standard template form for reporting adverse events to regulatory authorities. CIOMS stands for Council for International Organizations of Medical Sciences
circadian rhythm a biological process that repeats itself in 24-hour cycles
citation the reference in a published paper, book, etc., to another previously published piece of work
class see category
class interval when continuous data are categorised into groups (for example, age groups) the class intervals are the number of years grouped into each category. They may, for example, be ten-year age groups of 0–9, 10–19, 20–29, etc. It is a term generally used when all classes have the same interval but this need not necessarily be the case
class limits when continuous data (for example age) are categorised into groups (age groups) the class limits are the values that define where each group starts and finishes. The age groups may be 0–15 years, 16–64 years and 65–75 years; these values would be the class limits. Note that there is no need (explicitly or implicitly) for the class interval to be the same for every class
classical statistical inference statistical methods that rely heavily on significance testing and calculating confidence intervals.
Û Bayesian inference classiﬁcation variable a variable that is used to assign patients into groups (for example blood group, ethnic origin, or a continuous variable such as blood pressure that has been categorised) classify to assign a subject to a group (or category), based on data clastogen a substance causing damage to genetic material. G aneugen clean data data that contain no errors. Often data that are believed to 24


contain no errors are referred to as ‘clean’ but there is an assumption that may not be valid. Û dirty data clearance the rate of elimination of a drug from the body as a proportion of the amount of drug in the body. Û absorption clinic a medical centre where people are cared for clinical the branch of medicine dealing with patients. The practical application of medicine rather than medicine as a pure subject. Û medical clinical ethics ethical considerations and behaviour concerned with treating an individual subject. Û research ethics clinical investigation any form of investigating a patient or analysing a sample from a patient to help determine a diagnosis clinical investigational brochure a document describing the full extent of knowledge concerning an investigational product clinical practice what is generally accepted as the way patients are cared for. This includes all aspects of patient care including waiting in outpatient clinics, drugs received, palliative care, etc. clinical research that area of research carried out on humans (either patients or healthy volunteers). Û preclinical research clinical research associate someone employed to monitor the organisation and practical issues to do with running a study. Their duties may include collecting and collating study documentation, ensuring complete and clean data, ensuring that pharmacies or other dispensing centres have adequate supplies of materials clinical research coordinator clinical research associate clinical research organisation a company that provides staﬀ and/or facilities for carrying out clinical studies clinical signiﬁcance a ﬁnding or observation that is clinically signiﬁcant (for example a patient dying unexpectedly or a large treatment eﬀect). G clinically signiﬁcant diﬀerence. Û statistically signiﬁcant clinical study any systematic study that includes patients. 
This need not include studying any treatments, for which clinical trial clinical trial any systematic study of the eﬀects of a treatment in human subjects. G Phase I, Phase II, Phase III, Phase IV. Note that although randomisation and blinding, for example, are considered as some of the essential features of good clinical trials, these are not requirements. G prevention study clinical trial certiﬁcate (CTC) the certiﬁcate issued before the introduction of the system of the clinical trial exemption certiﬁcate (CTX). The amount of information needed for a CTC is more than that required for a CTX 25


Figure 4 Closed sequential design. The solid lines indicate stopping boundaries for declaring a statistically signiﬁcant diﬀerence between treatments A and B. For example, if out of ten patients expressing a preference for one or other treatment, nine preferred treatment B and only one preferred A, then the study would stop, concluding that B is signiﬁcantly better than A. If the broken boundary is crossed, then the study stops and the conclusion is drawn that no signiﬁcant diﬀerence was found between the treatments.

clinical trial exemption certiﬁcate (CTX) a certiﬁcate (issued by a regulatory authority) to a pharmaceutical company authorising use of an unlicensed product or use of a product outside its marketing authorisation for the purpose of carrying out a clinical study. Note that it is the product that is being exempted from otherwise stringent rules, not the study, so that one CTX may serve to cover several studies. G doctors and dentists exemption (DDX) clinically important clinically signiﬁcant clinically meaningful diﬀerence clinically signiﬁcant diﬀerence
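The example in the Figure 4 caption (nine of ten expressed preferences favouring treatment B) can be checked with a simple two-sided sign test. This is only a fixed-sample sketch: the stopping boundaries of a real closed sequential design are constructed differently, and the function name is a hypothetical one chosen for illustration.

```python
from math import comb

def sign_test_p_value(prefer_a: int, prefer_b: int) -> float:
    """Two-sided exact sign test p-value, assuming each preference is 50:50 under H0."""
    n = prefer_a + prefer_b
    k = max(prefer_a, prefer_b)
    # Probability of a split at least as lopsided as the one observed, in one direction
    one_sided = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double for a two-sided test, capped at 1
    return min(1.0, 2 * one_sided)

# One patient preferred A, nine preferred B
p = sign_test_p_value(prefer_a=1, prefer_b=9)
print(round(p, 4))
```

With a 9:1 split the p-value falls below the conventional 0.05 level, which is consistent with the caption's statement that the study would stop in favour of B.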


clinically signiﬁcant an eﬀect (in an individual subject or an average eﬀect in a group of subjects) that is suﬃciently large to be of beneﬁt (or harm) to a patient or of note to a treating physician clinically signiﬁcant diﬀerence a treatment eﬀect that is suﬃciently large to be useful for treating patients closed sequential design a sequential study design that does not have a predetermined number of patients. An upper limit on the number of patients does exist (hence ‘closed’) but it is possible to draw conclusions and stop the study before that number of patients has been recruited (Figure 4). Û open sequential design closed sequential study a study that is designed as a closed sequential design cluster randomisation a case where individual subjects are not randomised to receive diﬀerent interventions, but rather groups (‘clusters’) of subjects are randomised. Examples are most common in community intervention studies where, for example, some towns may have ﬂuoride introduced to their water supply whilst other towns may not. Clearly each member of the community cannot be randomly assigned to have ﬂuoride, or not, and the randomisation must be done in large groups of subjects (or clusters) Cmax the maximum concentration of drug measured (usually) in a subject’s blood but it could also apply to that measured in urine. The term can also be used to refer to the mean of many subjects’ Cmax values; it is then used as a description of the product rather than of any particular subject. (Figure 1, area under the curve) G Tmax
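As a minimal sketch of the Cmax entry, the snippet below picks the largest observed concentration from one subject's samples and reports the time at which it occurred (taken here to be Tmax). The sampling times, concentrations, units and function name are all hypothetical.

```python
def cmax_tmax(times, concentrations):
    """Return (Cmax, Tmax): the largest observed concentration and its sampling time."""
    if len(times) != len(concentrations) or not times:
        raise ValueError("times and concentrations must be non-empty and the same length")
    # Index of the largest concentration (the first one if there are ties)
    peak_index = max(range(len(concentrations)), key=concentrations.__getitem__)
    return concentrations[peak_index], times[peak_index]

# Hypothetical sampling times (hours) and blood concentrations (ng/mL) for one subject
times = [0, 0.5, 1, 2, 4, 8]
conc = [0.0, 12.1, 18.4, 15.2, 7.9, 2.3]
cmax, tmax = cmax_tmax(times, conc)
print(cmax, tmax)
```

Averaging the per-subject Cmax values over many subjects would give the product-level summary the entry mentions.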

coarse data data that are measured or subsequently recorded very approximately, for example in categories with large class intervals. Û ﬁne data code an indirect means of linking two or more pieces of information. For example, to identify a pack of medication that pack may be given a code number and, separately, a list be kept of which code numbers refer to which treatments. G randomisation code coding dictionary a list of terms and associated codes. See, for example, COSTART, MedDRA, WHO-ART coding system a set of rules for making up codes for data coeﬃcient an estimate of a parameter. The term is used when the parameters are being estimated in statistical models such as regression analysis, logistic regression coeﬃcient of concordance a measure of agreement between several people, each rating a group of items on some speciﬁc measure. G correlation 27


coeﬃcient of determination the square of the correlation between two variables, denoted r² coeﬃcient of variation a measure of variation in data, relative to the mean of those data. It is calculated as 100 × (standard deviation/mean) and is expressed as a percentage cohort a group of individuals with a common characteristic observed over a period of time. The feature they have in common may simply be the year of birth, or it may be the fact that they have all been exposed (for example) to a carcinogen or a novel educational programme cohort eﬀect any systematic diﬀerence between subjects recruited to a study at diﬀerent times. For example, the ﬁrst patients recruited to a study may have less (or possibly more) severe symptoms than those recruited later in the study cohort study the study of a group of subjects over time. This includes clinical trials, but the term is usually restricted to observational studies co-intervention more than one intervention being studied concurrently. Note that the interventions do not necessarily have to be given at the same moment but the period of study is coincidental, nor do both interventions need to be related to the same disease or be of the same type. For example, a drug treatment and a patient management strategy might both be studied concurrently collapse used in the sense of reducing the number of categories of data. For example, age may be recorded as under 5 years, 5–15 years, 16–65 years, etc. Subsequently deciding to combine adjacent categories (for example the under 5s and the 5–15s) would be described as ‘collapsing’ these two categories into one collective ethics ethical behaviour that is more concerned with beneﬁting other people than oneself. Being prepared to administer a placebo is unlikely to beneﬁt the patient concerned but may beneﬁt others by nature of the information gained.
Û individual ethics column vector see vector combination drug more than one drug being administered simultaneously (usually when all of the drugs are packaged in the same tablet or capsule, etc.) Û monotherapy community intervention study a study carried out to investigate the eﬀect of an intervention on an entire group of people, for example all those who live in a particular city. Public health studies and studies of screening programmes frequently are described as community intervention studies. G cluster randomisation community study a study of large numbers of subjects in a community. It 28


could be some kind of survey or might be a community intervention study comparability similarity. Often used in the sense of describing how similar two randomised groups are with respect to demographic data or disease severity comparable the state of being similar comparator drug comparator treatment comparator group the group of patients assigned to receive the comparator treatment comparator study a study that makes comparisons (usually between treatments). Û observational study comparator treatment usually the drug, placebo, or other intervention with which a new or experimental treatment is being compared comparison a contrast (formal or informal) between two or more items or groups comparison group comparator group comparisonwise error rate the probability of making a Type I error for each comparison in a study. Û experimentwise error rate. G multiple comparisons compassionate use a regulatory term, meaning that an unlicensed product is allowed to be used for a limited number of patients for whom there is no alternative medication. Although the product may be ineﬀective (its eﬃcacy has not been demonstrated), there may be no other eﬀective therapies. G named patient use compassionate use protocol a protocol that deﬁnes how a product will be used on a compassionate use basis competitive enrolment the situation in multicentre studies where each centre is allowed to recruit as many subjects as they can until the overall recruitment target has been met, rather than each centre having their own recruitment target complementary log−log transformation an equation applied to data that are proportions to allow use of statistical methods based on the Normal distribution. The transformation is y = log(−log(1 − p)) complete block a block of medication that contains all possible treatments (or combinations of treatment or treatment sequences) that are being studied. Û incomplete block complete block design a study design that only uses complete blocks of treatment.
Û incomplete block design complete cases analysis a strategy for analysing data where only subjects who provide complete data are included in the analysis; any subject with missing data is excluded. G intention-to-treat, per protocol analysis
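The complementary log-log transformation deﬁned above, y = log(−log(1 − p)), can be sketched as follows. The function names are hypothetical, and the inverse is included only to show that the transformation maps proportions strictly between 0 and 1 onto the whole real line and back.

```python
import math

def cloglog(p: float) -> float:
    """Complementary log-log transform: y = log(-log(1 - p)), for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(-math.log(1.0 - p))

def inverse_cloglog(y: float) -> float:
    """Map a transformed value back to a proportion: p = 1 - exp(-exp(y))."""
    return 1.0 - math.exp(-math.exp(y))

for proportion in (0.1, 0.5, 0.9):
    y = cloglog(proportion)
    print(proportion, round(y, 3), round(inverse_cloglog(y), 3))
```

Unlike the logit, this transformation is not symmetric about p = 0.5, which is one reason it is treated as a separate choice.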


complete response in cancer studies, this is generally regarded as complete disappearance of all tumours and no new tumours. G partial response, stable disease, progression completely randomised design a study where subjects are allocated to receive treatments in a randomised manner with no constraints (such as equal numbers of patients per group, no blocks, no stratiﬁcation) compliance the measurement of how fully patients take their medication. This may be measured by weighing returned medication, counting returned tablets or simply asking how many doses of medication were (or were not) taken compliant fully compliant component a part. This may be a chemical component of a drug (one of the chemicals that make it up), a part of a data ﬁle or of a case record form, etc. components of variance a method of analysing data that assesses which features of an experiment account for the variation in those data. Typically, the sorts of features identiﬁed will be patients, treatment centres and diﬀerent medications composite hypothesis in a statistical signiﬁcance test, an alternative hypothesis that does not specify a single value for a parameter, for example H : 0 Û simple hypothesis composite outcome when an outcome measure for a study is a mix of several individual measurements. For example, the composite outcome ‘treatment success’ may be deﬁned as a patient who is free of symptoms and has a quality of life score better than some speciﬁed value. Neither of those features is suﬃcient on their own to deﬁne a treatment success but together they are. G Guttman scale composite score Guttman scale compound the bulk product of drug. Û product compound symmetry a term used in assessing repeated measurements. The data are required to have the same variance at each time point and equal covariances between time points. 
Generally, if compound symmetry can be assumed, the analysis of data is much simpler computer a machine (originally mechanical but now electronic) used for numerical calculations and data processing. The current uses range from complex and fast calculations through to controlling machinery and word processing computer assisted data collection a process by which a computer is used to help (in various possible ways) collection and/or recording of data. The help may simply be that it acts in the form of an electronic case 30


record form and that data are recorded into the computer instead of onto paper. It may be more sophisticated and the computer linked to a holter monitor to directly record measurements of blood pressure without the need for human intervention computer assisted new drug application (CANDA) a new drug application where some or all of the data, study report, program ﬁles, etc. are supplied to the regulatory authority in electronic form on a computer computer package a computer program that does a variety of related tasks computer program instructions given to control what a computer does. A variety of types of computer programs are used in clinical research including those for data processing, statistical analysis, drawing graphics and report writing concentration the amount of a substance in a ﬁxed volume of liquid. This may be the amount of active drug per unit of blood during absorption and distribution conclusion the decision that is made based on data that have been collected and analysed. Note that results should generally be referred to in the past tense (‘Drug A was better than Drug B’) but conclusions should be referred to in the present tense, with future implications (‘we conclude that Drug A is better than Drug B’). Û discussion concomitant medication drugs that are not being studied but which a patient is taking through all or part of a study. These may be other drugs for the same indication as the study or for other indications concomitant variable a variable that may inﬂuence the results of a study but which is not a part of the study design. Most often, this term is used to refer to other (nonstudy) medications that a patient may be taking or other diseases that a patient may have. G concomitant medication, covariate concordance agreement. 
G coeﬃcient of concordance concordant pair in a study where subjects are assessed on two diﬀerent occasions or by two diﬀerent measuring devices and the variable measured is binary (for example, disease present or absent), the data may be summarised in a two-by-two table. The concordant pairs are those pairs of observations where the two measurements agree with each other. Û discordant pair concurrent control control subjects who are observed and data recorded concurrently with the active subjects. This need not necessarily be done in a controlled experiment. Û historical control concurrent medication concomitant medication conditional distribution the distribution of one variable at a ﬁxed value of 31


another variable. For example, the distribution of age may be given for all subjects, but if it is given for males and females separately then these sex-speciﬁc distributions are said to be ‘conditional on sex’. Û joint distribution, marginal distribution conditional odds the conditional distribution of the odds of an event occurring conditional power the power of a study based on some prerequisite information. Usually it is meant as the power of the study as calculated (after the study has ﬁnished) using the observed diﬀerence between the treatments and the observed variance of that diﬀerence conditional probability the probability of an event happening, given that another event has already been observed to happen conﬁdence an informal term used to describe how strong is one’s belief in the results of a study. G strength of evidence conﬁdence coeﬃcient see conﬁdence interval conﬁdence interval a range of values for a parameter (such as a mean or a proportion) that are all consistent with the observed data. The width of such an interval can vary, depending on how conﬁdent we wish to be that the range quoted will truly encompass the value of the parameter. Usually ‘95% conﬁdence intervals’ are quoted. These intervals will, in 95% of repeated cases, include the true value of the parameter. In this case, the conﬁdence coeﬃcient (or conﬁdence level) is said to be 95% (or 0.95). Conﬁdence intervals are a preferred method of estimating parameters, whilst signiﬁcance tests compare those parameters with arbitrary values. G posterior distribution conﬁdence level see conﬁdence interval conﬁdence limit the values at the end of a conﬁdence interval. 
If the 95% conﬁdence interval for the diﬀerence in mean systolic blood pressure between two treatment groups is quoted as being from −3 mmHg to +8 mmHg, then −3 and +8 are the conﬁdence limits conﬁdential private; not to be disclosed to a third party conﬁrmatory analysis the analysis of a conﬁrmatory study conﬁrmatory study a study that is designed to answer a speciﬁc question without leaving any room for doubt. Whilst Phase I studies and Phase II studies give some information regarding eﬃcacy and safety, Phase III studies are usually thought of as being conﬁrmatory. Û exploratory study, pilot study. G deﬁnitive study conﬂict of interest the situation where an individual or organisation may ﬁnd it diﬃcult to make unbiased statements. Examples are of investigators reviewing their own project proposals at an ethics committee meeting or a pharmaceutical company reporting results of a study involving one


of their own products. In such cases, bias is not being assumed but it is recognised that there is a clear reason why individuals may make biased statements or give biased opinions confounded ‘cannot be distinguished from’. For example, if all males were given one treatment and all females given an alternative, the eﬀects of treatment and gender would be indistinguishable from one another, or confounded with each other confounder a term used in observational studies to describe a covariate that is related to the outcome measure and to a possible prognostic factor confounding factor confounder consent positive agreement, particularly in the sense of informed consent. Û assent conservative erring towards being safe; an estimate may be conservative if it is known to be less than the true parameter value (actually biased) and is intentionally quoted as such to avoid the risk of it being an overestimate. G safety margin consistency check an edit check on data to ensure that two (or more) data values could happen in conjunction. Systolic blood pressure measurements must always be at least as great as diastolic measurements so, for any given patient, if the systolic pressure is greater than the diastolic, then the two measures are consistent with each other. It may be that neither is correct, but they are, at least, consistent. Û plausibility check consistent reproducible without upward or downward trends over time. Also two items (often data points) that could both occur simultaneously. G consistency check CONSORT a set of guidelines, adopted by many leading medical journals, describing the way in which clinical trials should be described. It stands for Consolidated Standards of Reporting Trials. G structured abstract constant not changing between subjects or across time consumer’s risk the probability of committing a Type I error. Û producer’s risk. G regulator’s risk contingency table a cross-classiﬁcation of subjects by two or more categorical variables.
The simplest form is the two-by-two table (Table 3), in which each subject is cross-classiﬁed by two binary variables. The table has four cells (totals are not usually counted) and the number of items within each cell is called the cell frequency continual reassessment method a procedure for adjusting the dose given to successive subjects when the purpose of a study is to ﬁnd the median dose that has some speciﬁed eﬀect. G dose ﬁnding study, dose escalation study


Table 3 Contingency table showing the distribution of gender by treatment group

          Treatment A   Treatment B
Male           58            63
Female         29            28
Total          87            91
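A small sketch of how the cell frequencies in Table 3 can be held and summarised. The margin and expected-frequency helpers are hypothetical names; the expected frequencies use the usual independence formula (row total × column total / grand total), which is an addition here rather than something Table 3 itself shows.

```python
# Cell frequencies from Table 3: rows are gender, columns are treatment group
table = {
    "Male": {"Treatment A": 58, "Treatment B": 63},
    "Female": {"Treatment A": 29, "Treatment B": 28},
}

def margins(table):
    """Return (row_totals, column_totals, grand_total) for a two-way table."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    column_totals = {}
    for cells in table.values():
        for column, count in cells.items():
            column_totals[column] = column_totals.get(column, 0) + count
    return row_totals, column_totals, sum(row_totals.values())

def expected_frequencies(table):
    """Expected cell frequencies if gender and treatment group were independent."""
    row_totals, column_totals, grand_total = margins(table)
    return {
        row: {col: row_totals[row] * column_totals[col] / grand_total for col in cells}
        for row, cells in table.items()
    }

row_totals, column_totals, grand_total = margins(table)
print(column_totals)  # should reproduce the Total row of Table 3
print(expected_frequencies(table))
```

Comparing the observed and expected frequencies is the starting point for tests on such tables, including those that apply a continuity correction.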

continuity correction an adjustment made in the calculations for some signiﬁcance tests on discrete data to make a better approximation to the test statistic that is continuous. It generally involves adding or subtracting 0.5 to the diﬀerence between the observed and expected frequencies of data. In two-by-two tables it is often referred to as Yates’ correction continuous data data that are not restricted to particular values (as in categories) but that can take an inﬁnite number of values. Examples of variables that result in continuous data are age, height, weight, pulse. Û categorical data, discrete data, ordinal data continuous scale the scale on which continuous data are measured continuous variable a characteristic of a subject that results in continuous data. For example age, height, weight. Û discrete variable contour plot a graph that shows three dimensional data on a two dimensional surface. Two variables are depicted on the x axis and y axis; the third is depicted in the form of contours as would be seen on a map to show elevation (Figure 5) contractor a temporary employee, usually of professional rather than clerical status, taken on to perform duties that would otherwise be carried out by full time employees. Such people are often used to cover peaks in workload or periods of absence of permanent employees contraindication an indication for which a drug is speciﬁcally excluded contrast a more formal term for comparison. In its simplest form it is the diﬀerence in the mean value of a variable between two groups or the diﬀerence in the proportion of subjects with some particular characteristic in each of two groups. In more complex forms it may be a weighted diﬀerence between several groups. For example, in a study with three groups of subjects, two groups (A and B) treated with active treatments and a third group (C) treated with placebo, a simple contrast would be that between the two active products which is simply mean(A) − mean(B).
We may wish to compare the active products together with the placebo group and so a more complex contrast would be mean(A) + mean(B) − mean(C)
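The two contrasts in the entry above can both be written as a weighted sum of group means. The sketch below assumes hypothetical mean responses for groups A, B and C; the function name and the numbers are illustrative only.

```python
def contrast(group_means, weights):
    """Weighted combination of group means: the sum of weight * mean over all groups."""
    if set(group_means) != set(weights):
        raise ValueError("each group needs exactly one weight")
    return sum(weights[group] * group_means[group] for group in group_means)

# Hypothetical mean responses for two active groups (A, B) and placebo (C)
means = {"A": 12.0, "B": 10.0, "C": 7.0}

simple = contrast(means, {"A": 1, "B": -1, "C": 0})    # mean(A) - mean(B)
combined = contrast(means, {"A": 1, "B": 1, "C": -1})  # mean(A) + mean(B) - mean(C)
print(simple, combined)
```

Expressing a comparison as a set of weights makes it easy to deﬁne other weighted diﬀerences between several groups without changing the code.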


Figure 5 Contour plot. The heights and weights of 100 patients with ischaemic heart disease are used to try to predict systolic blood pressure. In general there is a tendency for higher blood pressure in the bottom right-hand corner: that is, the heavier people who are rather short (and therefore those that are most overweight) have the highest blood pressure

control a term used in case-control studies speciﬁcally intended to mean someone who does not have any disease. Û case control group the subjects assigned to receive the comparator treatment, or to receive no treatment control treatment comparator treatment controlled clinical trial more formal term for clinical trial. It clearly emphasises the ‘controlled’ aspect of a trial, although that should be inherent in the deﬁnition of clinical trial controlled experiment a term similar to controlled clinical trial, except that it could refer to any kind of experiment, not just a clinical (or even medical) one. The aspect of control (and therefore inclusion of one or 35


more control groups) is still emphasised convenience sample a sample of subjects. Whether or not a subject is selected for the sample is not based on any random process but merely on which people are conveniently available. G haphazard sample coordinate xy coordinate. Also means to ensure that several activities happen together (as they should do) or in sequence (if that is how they are intended to occur) correlate to assess how one variable changes as another changes. G correlation correlated samples t test paired t test correlation the degree to which two variables are associated with each other. Positive correlation ( Figure 34, scatter plot) implies that as one variable increases so does the other; negative correlation implies that as one variable increases the other decreases. Note that no causality is implied correlation coeﬃcient the statistical measure of correlation, denoted r. G coeﬃcient of determination correlation matrix a square matrix whose values are the correlation coeﬃcients between all pairs of several variables. An example of the correlation between ﬁve laboratory parameters is shown in Table 4 correlation table correlation matrix cost beneﬁt ratio the relative weighting of the cost of a medication to the beneﬁt of that medication. Beneﬁt may be deﬁned in arbitrary ways to suit the context. G cost eﬀectiveness ratio, cost utility ratio cost eﬀective generally meaning good value for money; the beneﬁt outweighs the cost cost eﬀectiveness ratio the relative weighting of the cost of a medication to the clinical eﬀectiveness of that medication. G cost beneﬁt ratio, cost utility ratio cost function an equation that calculates the total cost of treating a patient. It will typically include positive values (drug costs, pharmacy costs, hospital costs, productivity lost from work, etc.) but sometimes also negative costs (reduction in number of days spent in hospital, increased productivity from early return to employment, etc.) 
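For the correlation coefficient entry above, a minimal sketch of the usual (Pearson) calculation of r, together with its square, the coefficient of determination. The data values and the function name are hypothetical, and the Pearson formula is assumed because the entry does not name a speciﬁc variant.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements on five subjects
height_cm = [160, 165, 170, 175, 180]
weight_kg = [55, 62, 66, 74, 80]
r = pearson_r(height_cm, weight_kg)
print(round(r, 3), round(r ** 2, 3))  # r and the coefficient of determination
```

A correlation matrix is simply this calculation repeated for every pair of variables.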
cost minimisation the approach of evaluating the optimum amount to spend in order to minimise the overall cost function cost utility ratio the relative weighting of the cost of a medication to the utility of that medication. Utility is the overall beneﬁt as assessed by any and all diverse measurement scales including medical, ﬁnancial, quality of life, etc. G cost beneﬁt ratio, cost eﬀectiveness ratio COSTART a dictionary of adverse event terms. COSTART stands for 36


Table 4 Correlation matrix of biochemistry parameters in 100 healthy subjects

                     Urinary      Urinary   Serum       Serum        Serum
                     creatinine   calcium   phosphate   creatinine   calcium
Urinary creatinine     1.0         0.41     −0.03        0.03         0.08
Urinary calcium        0.41        1.0      −0.06        0.00         0.11
Serum phosphate       −0.03       −0.06      1.0         0.07         0.15
Serum creatinine       0.03        0.00      0.07        1.0         −0.05
Serum calcium          0.08        0.11      0.15       −0.05         1.0

Coding Symbols for Thesaurus of Adverse Reaction Types. G MedDRA, WHO-ART count to determine how many (of something) exist or how many times a certain type of event has occurred covariance a statistical measure of how two variables vary together. G correlation, variance covariate a variable that is not of primary interest but which may aﬀect response to treatment. Common examples are subjects’ demographic data and baseline assessments of disease severity Cox model Cox’s proportional hazards model Cox’s proportional hazards model a statistical method for comparing survival times between two or more groups of subjects that also allows adjustment for covariates. The model assumes proportional hazards. G Cox–Mantel test, accelerated failure time model Cox–Mantel test a statistical method for comparing survival times between two groups. G Cox’s proportional hazards model cream a mixture of ointment (such as paraﬃn, lanolin, etc.) and water used as a vehicle for delivering topical treatment. Û gel, lotion credible interval a form of a conﬁdence interval used in the context of Bayes’ theorem. G highest density region critical appraisal the set of skills (and judgements) needed to evaluate evidence. G evidence-based medicine critical data the most important data that will be used to draw conclusions from a study relating to the most important objectives critical region the values of a test statistic (such as in the t test or chi-squared test) that lead to rejecting the null hypothesis at a given signiﬁcance level critical value the value of a test statistic (such as in the t test or chi-squared test) that is the boundary between where the null hypothesis is rejected and not rejected at a given signiﬁcance level 37
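To illustrate the critical value and critical region entries, the sketch below uses a two-sided test based on the standard Normal distribution. That choice is an assumption made to keep the example within the Python standard library; the entries mention the t test and chi-squared test, whose critical values come from their own distributions. The function names are hypothetical.

```python
from statistics import NormalDist

def two_sided_critical_value(significance_level: float) -> float:
    """Critical value of a standard Normal test statistic for a two-sided test."""
    if not 0.0 < significance_level < 1.0:
        raise ValueError("significance level must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - significance_level / 2.0)

def in_critical_region(test_statistic: float, significance_level: float = 0.05) -> bool:
    """True if the statistic is at or beyond the critical value in either tail."""
    return abs(test_statistic) >= two_sided_critical_value(significance_level)

print(round(two_sided_critical_value(0.05), 2))
print(in_critical_region(2.1), in_critical_region(1.5))
```

The critical value is the boundary; the critical region is every value of the statistic on the rejecting side of it.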


Cronbach’s alpha a measure of internal consistency in a psychological test cross-classiﬁcation contingency table crossed factors the opposite of nested factors. When every category of one variable also contains every category of another. G factorial design crossover design a study where each subject receives (in a random sequence) each study medication. After receiving Treatment A, they are ‘crossed over’ to receive Treatment B (or vice versa). This is the simplest form of crossover design and is called the two period crossover design. Û parallel group design crossover study a study that is designed as a crossover design cross-product ratio odds ratio cross-sectional considering a single moment (or separate moments) in time without regard for any trend across time. Û longitudinal cross-sectional analysis the analysis either of a cross-sectional study or of data as if they were collected in a cross-sectional study. Û longitudinal analysis cross-sectional study a study that examines data at one particular point in time (either in the sense of ‘all ten-year-old children’ or ‘everybody on 1st January’) and does not consider within subjects eﬀects crude estimate any estimate of a parameter that is an unadjusted estimate crude rate an unadjusted rate. Generally, simply the observed number of subjects experiencing a speciﬁc event divided by the total number of subjects exposed and potentially at risk of that event cumulative frequency a running total. For example, if we count the number of deaths per day (the daily frequency), then the total number of deaths from the beginning of a study to any particular day is the cumulative frequency cumulative frequency distribution the distribution of cumulative frequencies cumulative hazard rate the accumulation of the hazard functions at all times from time zero up to a speciﬁed time point cumulative meta-analysis a meta-analysis that shows continuing updated estimates of treatment eﬀect after each of the studies was completed. 
It does not simply show one overall result taking account of all studies, regardless of when they were carried out curriculum vitae a person’s educational and employment history, usually including all other relevant experience and any publications to which they have contributed curve a smooth line or surface drawn through a set of data points. The term can strictly be used to describe a line (in two dimensions) or a surface (in more than two dimensions) that is straight as well as one that bends
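The cumulative frequency entry above describes a running total of daily deaths. A minimal sketch, using hypothetical daily counts:

```python
from itertools import accumulate

# Hypothetical number of deaths on each of the first five days of a study
daily_deaths = [0, 2, 1, 0, 3]

# Running total from the start of the study up to and including each day
cumulative_deaths = list(accumulate(daily_deaths))

for day, (daily, total) in enumerate(zip(daily_deaths, cumulative_deaths), start=1):
    print(f"day {day}: daily frequency {daily}, cumulative frequency {total}")
```

The list of running totals, taken as a whole, is the cumulative frequency distribution.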


curvilinear regression a regression model that ﬁts a curve to data. In this context, curve is generally taken to exclude a straight line. G linear regression cutoﬀ design a method of treatment assignment based on a baseline measurement. All subjects with values below some cutoﬀ point (deﬁned as those with good prognosis) are assigned to the control group; subjects with values of the baseline measurement in the middle of the range are not included in the study; and all subjects with values of the baseline measurement above another cutoﬀ point (those with poor prognosis) are assigned to the experimental group. G regression discontinuity design cutoﬀ point a value on an ordered scale (possibly a continuous scale) where a change of decision is made. For example, patients with systolic blood pressure above 180 mmHg may be included in a study; those with values less than, or equal to, 180 mmHg are not included: 180 would be called the cutoﬀ point cutpoint a point along a line or on a surface where the slope changes abruptly rather than smoothly. G changepoint model, cutoﬀ point cyclic cyclic variation cyclic variation systematic variation over a course of time. A circadian rhythm is one type of cyclic variation cytotoxic a drug that is poisonous to certain types of cells. Frequently used in cancer treatment
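The cutoﬀ design entry above assigns treatment from a baseline measurement and two cutoﬀ points. The sketch below follows that description; the cutoﬀ values, the function name and the handling of values exactly equal to a cutoﬀ (treated here as falling in the excluded middle range) are assumptions.

```python
def assign_by_cutoff(baseline: float, lower_cutoff: float, upper_cutoff: float):
    """Return 'control', 'experimental', or None (not included) for one subject."""
    if lower_cutoff > upper_cutoff:
        raise ValueError("lower_cutoff must not exceed upper_cutoff")
    if baseline < lower_cutoff:
        return "control"        # good prognosis
    if baseline > upper_cutoff:
        return "experimental"   # poor prognosis
    return None                 # middle of the range: not included in the study

# Hypothetical baseline severity scores with cutoff points at 30 and 60
for score in (12, 45, 78):
    print(score, assign_by_cutoff(score, lower_cutoff=30, upper_cutoff=60))
```

Because assignment depends only on the baseline value, no randomisation is involved, which is the key diﬀerence from a randomised design.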


D data information of any sort, whether it be numerical, alphabetical, judgements, estimates or precise measurements. G binary data, categorical data, continuous data, discrete data, ordinal data data analysis the process of summarising data, either to draw conclusions or simply to describe a process data and safety monitoring committee a group of people who regularly review accumulating data in a study with the possibility of stopping the study or modifying its progress. A study may be stopped, or changes made to it, if clear evidence of eﬃcacy is seen or if adverse safety is observed in one or more treatment groups data audit an audit of the quality, source and integrity of data data centre the place where data are gathered and the data management tasks completed. It is a term particularly relevant to multicentre studies. Single centre studies may have the data centre at the same place as the patients are seen or somewhere diﬀerent data cleaning the process of ﬁnding errors or possible errors in data, checking them and, if appropriate, correcting them. G clean data, dirty data data coding assigning data into categories. For example, classifying adverse events into groups according to which part of the body is aﬀected or classifying concomitant medications into generic names rather than trade names data collection form case record form data collection protocol speciﬁc, detailed instructions for how data are to be collected and recorded data coordinating centre data centre data dependent stopping making the decision to stop recruitment to a study (or possibly follow-up in a study) based on data already observed. G interim analysis data dredging analysing data without regard to accepted scientiﬁc and statistical principles in order to ﬁnd some aspect that will be of interest. 
Also referred to as ‘fishing expeditions’ because of the analogy of
dipping a ﬁshing rod into dark water and pulling out various items of rubbish, but rarely ﬁsh! data driven analysis making decisions on which analyses should be carried out based on the observed data. G post hoc analysis data editing data cleaning data entry typing data into a computer. This may be done directly by the subject, by the treating doctor or investigator or, more usually, copied from a case record form at a data centre. G single data entry, double data entry data ﬁeld an individual item of data. The term is most often used in referring to data on a case record form or on a computer data ﬁle a highly structured and well organised collection of related data. The term could be used about paper ﬁles (including a case record form or a large number of case record forms) but is generally reserved for data on a computer data item data ﬁeld data management the discipline of collecting and ﬁling data in an ordered fashion to facilitate subsequent retrieval and analysis. Although the term can refer to the management of paper ﬁles, most activity in data management usually revolves around storage of electronic versions of data on a computer data manager the person with the responsibility for ensuring data management is properly carried out data monitoring the process of reviewing data being collected to ensure it is of high quality and complete. 
G quality control data monitoring committee data and safety monitoring committee data monitoring report a report on the quality and completeness of data data processing the steps involved in computerisation and particularly data management in a computer system data query a question raised about the validity or correctness of an item of data data reduction the process of summarising data, particularly using summary measures or coding continuous data into categorical data data screening the process of looking at and reviewing data to check their plausibility and completeness database an electronic version of a set of data, held on a computer database management system any piece of software for handling data. This includes data entry as well as production of tables, listings, etc. dataset data file death rate the number of people dying in a specified time interval divided by the number alive at the beginning of that time interval. G case fatality rate debug to find errors in computer programs and correct them decile each of the tenth (10th, 20th, 30th, etc.) centiles decimal a number recorded in whole units and in tenths, hundredths, thousandths, etc. For example, weight measured in kilograms and grams is expressed as a decimal number but weight measured in pounds and ounces is not. Sometimes the word is used just to refer to the part of the number less than 1 (only the numbers that come after the decimal point) decision function a mathematical function that describes which decision to make, based on a given set of circumstances. G decision rule, decision tree decision rule this term can be either a synonym for a decision function or a less technical, written, description of what decision to make based on a given set of circumstances. The rule may sometimes be depicted as a decision tree decision theory the general theory of how to make optimal decisions decision tree a diagram, resembling that of a family tree, to guide which decision (or sometimes which conclusions) should be drawn from a set of criteria (Figure 6). G decision rule Declaration of Helsinki a set of ethical guidelines for the conduct of research on humans. It was first agreed in 1964 by the World Medical Association and has been revised subsequently in Tokyo (1975), Venice (1983), Hong Kong (1989) and South Africa (1996) decrement to decrease in value. Û increment deduce to draw a conclusion of a specific result based on broader examples. Û induce deduction deduce deductive inference the process of drawing conclusions based on deduction deductive reasoning see deductive inference but note that ‘reasoning’ is a broader term than ‘inference’ default an assumed state unless a positive reason can be given to accept an alternative state. 
For example, in significance testing, by default the null hypothesis will be accepted unless evidence exists to refute it definitive study a study that is generally agreed to provide the answer to a question with no room for doubt. The term ‘definitive’ is usually used to describe a study that has already been completed. The term confirmatory study is more often used of a study that it is planned to undertake degree of belief often used as an informal interpretation for a P-value. It is a measure (either on a probability scale or an informal, intuitive, scale) of the strength of evidence about a particular hypothesis

Figure 6 Decision tree. A way to make a choice of a simple statistical significance test for comparing groups of categorical data. Each of the boxes with rounded corners is called a ‘node’; each of the arrows is called a ‘branch’

degrees of freedom a statistical term to describe the number of independent pieces of information that there are for a statistic. In chi-squared tests of two-by-two tables, there is one degree of freedom; the sample mean of n data points has n − 1 degrees of freedom delivery device the medium used for getting active product into the body. Tablets, ointments, injections, etc. are all delivery devices. G vehicle delta (δ) usually used as the symbol to describe the ‘true’ size of an effect. In particular it is used in planning studies to describe the smallest clinically significant difference to detect. It is more often used to describe a difference in means but can also be used to describe a difference in rates or proportions. The symbol d is often used to describe the observed value of δ demographic data data on subjects’ age, height, weight, etc. The term can be used to describe any baseline characteristics of subjects including the baseline measurements of the primary endpoint variable but is more often reserved for measurements that are not aspects of the disease. There is no clear distinction between which data are disease related and which are not; clearly in a study of weight loss, subjects’ weight would be both demographic data and important data describing the state of disease demographic variable any variable that is demographic data demographics demographic data demography the study of vital statistics of populations denominator in a fraction, such as 1/2 or 3/4, the denominator is the number on the bottom line of the fraction (in these cases 2 and 4, respectively). Û numerator density function the mathematical function that gives the probability that a random variable is equal to any given value. G distribution function dependent samples t test paired t test dependent variable in any sort of statistical model, but most commonly in regression models, the dependent variable is the one we are trying to predict from the independent variable(s). In most cases, the dependent variable is the efficacy variable derived variable data values that are calculated or formed from other data. For example, subjects’ age might be calculated (or derived) from the visit date minus the date of birth; age would then be called a derived variable descending order data sorted so that the largest value is written first, the smaller values later and the smallest value last. Most easily described in terms of numeric data but special rules can be applied to alphanumeric data. Û ascending order descriptive statistics summaries of data that do not try to draw conclusions but which just describe the data. Most often used for continuous data. Common descriptive statistics include the mean, standard deviation, minimum value, maximum value, mode, median, quartiles and confidence intervals. Û inferential statistics descriptive study one that aims to describe a phenomenon or a group of individuals. 
The analysis of data from such studies generally uses descriptive statistics rather than significance testing or inferential statistics design the plan for a study with particular reference to whether it is a parallel group design or crossover design. The term should, however, be thought of very broadly to encompass the number of subjects to be included, the number of visits, the number of investigators taking part, strata, blocking, methods of randomisation, etc. design effect the effect caused by a design variable in a study. Such effects

would, hopefully, be advantageous but they may be negative or neutral. In a study where the randomisation was stratiﬁed by gender, an observed diﬀerence in treatment eﬀect between males and females would be called a design eﬀect (because stratiﬁcation was part of the design of the study) design variable any variable that contributes to the design of a study, often because of stratiﬁcation according to values of the variable deterministic a process that is guaranteed to give the same result repeatedly, with no unexplainable (random or otherwise) variation deviance a statistical measure of how much a set of data diﬀers from a perfect ﬁt to a model. In the simplest case of a model with normally distributed residuals, the deviance is equal to the residual sum of squares. G variance deviate a variable that takes the values of the diﬀerence between another variable and a chosen reference value, such as the mean deviation a measure of how far values of a variable lie from a chosen reference point. G average absolute deviation, standard deviation device see delivery device, medical device diagnosis the decision that is reached regarding the disease a patient has diagnostic test a test (physical, mental or, more commonly, biological) that is used to deﬁnitively assess whether or not a subject has a particular disease. Û screening test diagram a line drawing, usually to show the relative positions (physically or in time) of a set of objects or activities diary card usually a paper system for subjects in a study to record symptoms, adverse events or other data on a daily basis, often at home and generally not under the direct supervision of any medical personnel dichotomous data binary data dichotomous outcome binary outcome dichotomous variable binary variable diﬀerence the value obtained by subtracting one value from another. 
This may be on an individual subject basis, for example calculating the difference between a subject’s pulse at baseline and the same subject’s pulse after treatment (change from baseline), or it may be on a group basis, for example calculating the difference between the mean of all subjects’ heart rates in one treatment group and the mean of all subjects’ heart rates in a control group difference study a term used rarely, except to differentiate from an equivalence study or a noninferiority study. A study where the null hypothesis is that there is no difference between treatments and the

alternative hypothesis states that there is a diﬀerence. The intention of the study (or objective) is usually to show that two (or more) treatments have diﬀerent eﬀects. G superiority study diﬀuse prior vague prior digit any numeral between zero and nine. For example, the number 57 contains two digits: 5 and 7 digit bias digit preference digit preference when recording numerical data, there is often a preference (intentional or unintentional) to round the last digit. For example, birth weight measured in grams will often be recorded to the nearest 10 grams; there is said to be a preference for zeros. Blood pressure measured in millimetres of mercury will often be recorded to the nearest 5 mm or the nearest 2 mm; values such as 73 mmHg (where the last digit is not a multiple of 2 or of 5) tend to be recorded less often than would be expected by chance dimension one of any number of variables that describe a subject. The term is most often used in connection with plotting data. When two variables are measured and plotted, there are two dimensions. When there are three variables plotted, three-dimensional graphs can be plotted (with some diﬃculty). More than three dimensions can be thought about but cannot easily be plotted. G multivariate data direct access in computing terms this refers to the method of accessing data on a physical storage device such as a disk. (Û sequential access.) The term also applies to source data veriﬁcation, where the person reviewing source data is allowed to see the source data for themselves, rather than indirect access where the values of source data have to be requested through a third party direct contact the contact of one person with another that potentially passes on an infectious disease. Û indirect contact direct cost actual (ﬁnancial) costs that are incurred in treating patients. These include the cost of drugs, the cost of occupying a hospital bed, etc. Û indirect cost. 
G pharmacoeconomics direct effect main effect direct relationship the case when the relationship between two variables is linear, so that plotting one variable against the other variable shows a rough fit to a straight line. The term is often further restricted to the case when the correlation is positive, such as in Figure 34 (scatter plot) directional hypothesis a hypothesis which specifies that one treatment is equal to, or better than, another treatment. In general, the alternative hypothesis is stated that one group is different from another, which

Table 5 Cross-classification of paired data to show the discordant pairs. The response to each treatment (in this example) is graded simply as ‘good’ or ‘bad’

                 Treatment B
Treatment A      Good    Bad
Good               55     13
Bad                47     24
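The counts in Table 5 can be tallied in a few lines of Python. This is only a sketch: the variable names are this example's own, and it relies solely on the statement in the discordant pair entry that the cells containing 47 and 13 are the discordant ones, so that the cells containing 55 and 24 are the concordant ones.

```python
# Cell counts taken from Table 5; the groupings follow the 'discordant pair' entry.
concordant_counts = (55, 24)   # pairs where the two responses agree
discordant_counts = (47, 13)   # pairs where the two responses disagree

n_pairs = sum(concordant_counts) + sum(discordant_counts)
n_discordant = sum(discordant_counts)

# Proportion of all pairs that are discordant
proportion_discordant = n_discordant / n_pairs
```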

could allow it to be either better or worse. G one sided hypothesis dirty data data that contain errors, or data that may contain errors and have not yet been fully reviewed and validated to find those possible errors. Û clean data discordant pair in a study where subjects are assessed on two different occasions or by two different measuring devices and the variable measured is binary (for example, disease present or absent), the data may be summarised in a two-by-two table. The discordant pairs are those pairs of observations where the two measurements do not agree with each other (Table 5). In this example, 47 patients and 13 patients represent the discordant pairs. Û concordant pair discrete data data that may take only a fixed set of values. This includes categorical data but also extends to data in the form of counts, for example where only whole numbers of items can be counted. Û continuous data discrete variable a variable that can result only in discrete data values. Û continuous variable discussion that part of a final report that addresses the validity of the results by considering the appropriateness of the study design, the success (or otherwise) of its implementation, quality of the data, consistency of results across different outcome variables and in the light of other studies. G conclusion disease profile the set of signs and symptoms (and their severity) that either characterise a disease (and therefore may help with diagnosis) or describe the severity of disease for an individual patient disk a device for storing data on a computer or in a computer readable form. Traditionally these have been magnetic devices but optical devices (compact discs, etc.) are becoming very common diskette virtually synonymous with disk. Some people use the term diskette to refer to ‘small’ floppy disks that can be carried around (usually for use with personal computers) rather than larger hard disks

that are kept permanently inside the computer dispersion a term used almost synonymously with variability (as in variation of data). G variance distributed data entry a system of entering data onto a variety of computers, possibly spread around the world, to form a distributed database. G remote data entry distributed database rather than all the data relating to a study being held on a single computer, a distributed database allows diﬀerent parts of the data to be held on diﬀerent computers. The diﬀerent computers are all linked together by a network so that it is not obvious to the user that the database is distributed distribution a general term covering either frequency distribution or probability distribution, depending on the context distribution free method nonparametric method distribution function the mathematical function that gives the probability that a random variable is less than any given value. G density function divisor denominator doctors and dentists exemption (DDX) an exemption similar to a clinical trial exemption certiﬁcate (CTX) but one that is issued to a doctor or dentist, not to a pharmaceutical company documentation written evidence to conﬁrm the activities that have been undertaken in a study and the standards to which a study has been managed dosage regimen the dose, timing and method of giving medication to a patient. G treatment regimen dose the amount of drug that is given dose eﬀect relationship dose response relationship dose escalation study a study in which successively higher doses of a drug are given to subjects. This may be done either by administering a dose to an individual and, if there are no adverse events, by increasing the dose for that individual until adverse events are seen or by giving a dose to a small number of subjects and, if no adverse events are seen, giving a subsequent group of subjects a higher dose, and so on, until adverse events are seen. 
Û dose ranging study dose finding study a study to find the best dose (‘best’ according to an agreed definition) of a drug dose ranging study a study of different doses of a drug but, in contrast to a dose escalation study, the doses being compared are not investigated in an escalating manner dose response dose response relationship

dose response relationship how the eﬀect of a drug changes with dose dose titration study dose escalation study dosing schedule dosage regimen dot chart scatter plot dot plot scatter plot double blind a study where the subjects and the investigators are blind to the treatment allocation. Û single blind, triple blind double data entry a strategy where data from case record forms is entered (typed) into a computer twice and the two typed ﬁles compared. This helps to reduce the number of typographical errors and errors of interpretation of poor handwriting. Û single data entry double dummy a method of blinding where both treatment groups may receive placebo. For example, one group may receive Treatment A and the placebo of Treatment B; the other group would receive Treatment B and the placebo of Treatment A double entry double data entry double mask double blind doubly censored data data that are both left censored and right censored. Right censored data is quite common; left censored data is less common; doubly censored data is rare doubly censored observation doubly censored data download copying ﬁles (data or programs) from a central computer to a local computer. Û upload dropin the opposite of dropout. Dropins to clinical trials are not common but, when they occur, may result in left censored observations dropout the case where a subject stops participating in a study before they are due to according to the study protocol. A more polite term is early withdrawal drug a pharmaceutical preparation. The term is often used very broadly and loosely to include placebo. G biologic, phytomedicine. Û product drug accountability the process of checking what has happened to all study medication. This includes checking stocks in a pharmacy, counting individual subjects’ tablets, weighing tubes of ointment, etc. 
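The comparison step described in the double data entry entry above can be sketched in Python as follows. The field names, the example values and the function `compare_entries` are hypothetical; a real data management system would have its own record layout and query process.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def compare_entries(first: Record, second: Record) -> List[Tuple[str, str, str]]:
    """Return (field, first value, second value) for every field that differs."""
    mismatches = []
    for field in sorted(set(first) | set(second)):
        a = first.get(field, "")   # a field missing from one entry counts as empty
        b = second.get(field, "")
        if a != b:
            mismatches.append((field, a, b))
    return mismatches


# Two independent typings of the same (hypothetical) case record form
entry_1 = {"subject_id": "001", "sbp": "182", "weight_kg": "71.5"}
entry_2 = {"subject_id": "001", "sbp": "128", "weight_kg": "71.5"}
queries = compare_entries(entry_1, entry_2)
```

Each mismatch would then be checked against the case record form; the sketch only flags the disagreement and does not decide which typing is correct.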
drug company pharmaceutical company drug industry pharmaceutical industry drug interaction the effect sometimes produced when more than one product is used simultaneously. The effect is either more than or less than the sum of the individual effects. The term is most commonly used in connection with adverse reactions, caused by different products combining in the body, rather than with extra beneficial effects

drug metabolism metabolism drug reaction any response to a product, either beneﬁcial or unwanted, but usually reserved for unwanted eﬀects. G adverse event, adverse reaction drug trial clinical trial dry run similar to a pilot study. Trying a process under artiﬁcial conditions to determine if it will work properly in a real setting dummy loading a method of blinding treatments when they involve diﬀerent dosage regimens. G double dummy dummy report ghost report dummy table ghost table dummy variable indicator variable Duncan’s multiple range test a multiple comparison test for comparing the mean value of a variable between more than two groups duration of action the length of time that a treatment gives any beneﬁt duty of care the requirement that doctors must care for their patients and that this duty must take priority over such things as research projects dynamic allocation a randomisation method that changes the probability of assignment from one group to another as the study progresses. The probabilities are changed either as a consequence of eﬃcacy and adverse event data emerging or to maintain balance for prognostic factors across the groups. G minimisation
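One form of dynamic allocation, maintaining balance for prognostic factors, can be sketched in Python as a much simplified version of minimisation. The class name, the arm labels and the factor names are hypothetical, and the tie-breaking rule is an assumption of this sketch; methods used in practice often add a random element even when the groups are not tied.

```python
import random
from collections import defaultdict
from typing import Dict, Optional, Tuple

ARMS = ("A", "B")  # hypothetical treatment labels


class Minimiser:
    """Simplified minimisation: choose the arm with the fewest earlier
    subjects sharing the new subject's prognostic factor levels."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self.counts: Dict[Tuple[str, str, str], int] = defaultdict(int)
        self.rng = random.Random(seed)

    def assign(self, factors: Dict[str, str]) -> str:
        # Imbalance score per arm: earlier subjects matching on each factor level
        totals = {
            arm: sum(self.counts[(arm, name, level)] for name, level in factors.items())
            for arm in ARMS
        }
        lowest = min(totals.values())
        candidates = [arm for arm in ARMS if totals[arm] == lowest]
        arm = self.rng.choice(candidates)  # random choice only between tied arms
        for name, level in factors.items():
            self.counts[(arm, name, level)] += 1
        return arm


allocator = Minimiser(seed=1)
first = allocator.assign({"sex": "F", "age": "<60"})
second = allocator.assign({"sex": "F", "age": "<60"})
```

Because the second subject has the same factor levels as the first, the sketch places them in the other arm, which restores balance on both factors.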


E early stopping the practice of stopping recruitment into a study before reaching the maximum target sample size. This may be in a sequential study, after a formal interim analysis or for purely practical reasons that are independent of eﬃcacy or safety results early stopping rule a statistical rule that allows a study to stop recruitment after an interim analysis. Unless such rules are used the P-value associated with testing the null hypothesis is generally biased (it is too small). Early stopping rules allow for this and help to calculate the correct P-value early withdrawal when a subject leaves a study earlier than is routinely allowed for in the protocol. Typical reasons include the onset of unacceptable adverse events and voluntary withdrawal. In studies where death is not the endpoint, a death might also be included as an early withdrawal edit the process of changing data or text in a dataﬁle or in a text document (usually one held on a computer) edit check a term that covers all types of checks that may be put on data, including consistency checks, plausibility checks, range checks edit query a question raised by an edit check. The relevant data would then be checked and appropriate corrective action taken if necessary eﬀect this term is often misinterpreted as being the change from baseline in some measurement (blood pressure, for example) during the period of an intervention. Strictly speaking, ‘eﬀect’ should always be a relative measure, such as the extra change in blood pressure over that produced by the comparator treatment. If the mean blood pressure in a treatment group falls by 15 mmHg and in a comparator group it falls by 5 mmHg, then the eﬀect is the diﬀerence between these two values—10 mmHg. Similarly, the eﬀect of gender is deﬁned as the diﬀerence in mean response between males and females; the eﬀect of study centre is deﬁned as the diﬀerence in mean response between participating study centres. 
When the outcome of interest is not a mean, the term ‘effect’ has the

same limitation in that it should always be the diﬀerence between two groups; this can be measured as the diﬀerence between two proportions, or the odds ratio, or the diﬀerence in median survival times, etc. eﬀect modiﬁer covariate eﬀect size strictly this should simply be the size of an eﬀect but conventionally it is taken to be the size of the eﬀect divided by the standard deviation of the measurements. An eﬀect size of one indicates a diﬀerence between two means equal to one standard deviation; this is generally considered to be quite a large eﬀect. Eﬀect sizes of about 0.5 are moderate and eﬀect sizes 0.1 or lower are considered very small eﬀective sample size the more variation there is in data then, generally, the larger the sample size required to show a treatment eﬀect. However, if, for a given sample size, there is more (or less) variability in the data than anticipated it is as if there is a reduced (or increased) sample size. The sample size that would have been required, had the variability in the data been correctly assessed, is called the eﬀective sample size. Variability in data can be increased due to missing data (G early withdrawal) and errors in the data; it can be reduced by modelling the data using extra covariates. G relative eﬃciency eﬀectiveness the extent to which a product works in the patients to whom it has been oﬀered. This is slightly diﬀerent from ‘eﬃcacy’, which can be measured in those who were actually treated. ‘Eﬃcacy’ relates to explanatory studies, ‘eﬀectiveness’ to pragmatic studies eﬃcacy the desirable eﬀect of an intervention. Û safety. G adverse event eﬃcacy data any data relating to the eﬃcacy of a treatment. Û safety data eﬃcacy population per protocol population eﬃcacy review an overview or meta-analysis of eﬃcacy data. Û safety review eﬃcacy sample per protocol population eﬃcacy study a study intended primarily to demonstrate eﬃcacy rather than safety. 
Often the same as a Phase III study efficacy variable a variable that is a measure of efficacy. Û safety variable efficient a process that makes good use of resources and is not wasteful. This is also a statistical term referring to methods of estimating parameters: in general it is desirable to have efficient estimators because these may require smaller sample sizes efficient estimator an estimate of a parameter that is efficient eighty–twenty rule an informal rule which suggests that most benefit (about 80%) can be achieved with minimal effort (about 20%); and, conversely, that the last 20% of benefit needs 80% of the effort
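The arithmetic in the effect and effect size entries above can be written out as a short Python sketch. The function names are this example's own, and the standard deviation of 20 mmHg is a hypothetical value chosen only to illustrate the calculation; the falls of 15 mmHg and 5 mmHg come from the effect entry.

```python
def effect(change_treatment: float, change_comparator: float) -> float:
    """Effect as the difference between the mean changes in two groups."""
    return change_treatment - change_comparator


def effect_size(effect_value: float, sd: float) -> float:
    """Effect divided by the standard deviation of the measurements."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return effect_value / sd


# Mean falls of 15 mmHg (treatment) and 5 mmHg (comparator), as in the 'effect' entry
bp_effect = effect(15.0, 5.0)

# The standard deviation of 20 mmHg is a hypothetical value
bp_effect_size = effect_size(bp_effect, sd=20.0)
```

With these inputs the effect is 10 mmHg and the effect size is 0.5, which the effect size entry would describe as moderate.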


elective treatment a treatment that a patient chooses to have rather than one that is assigned by randomisation or one that is mandatory on medical (or other) grounds electronic database database eligibility criteria inclusion criteria eligible a subject that meets all the eligibility criteria ( inclusion criteria) elimination the process by which a drug is excreted from the body or removed from the required site of action within the body. Û absorption, clearance elimination rate constant once a drug has been completely absorbed into the body, this is the rate of elimination (which, for many drugs, is approximately constant) empirical observed (particularly in relation to curves, distributions, etc.) Û ﬁtted value empirical Bayes Bayesian methods that require the prior distribution to be based on data. Û subjective Bayes empirical distribution the observed frequency distribution of data. Û probability distribution empirical result a result based on data (or facts) rather than one based on theory empty cell in a contingency table, a cell that contains no observations end of study end of treatment end of treatment the time at which subjects are either supposed to stop taking treatment (according to a protocol) or actually do stop taking treatment (if, for example, they were an early withdrawal) end of treatment value the value of a variable at the end of treatment visit end of treatment visit the visit at which subjects are supposed to stop taking treatment (according to the protocol), actually do stop taking treatment or withdraw from a study endemic a disease that is always present in a certain proportion of the population in a given geographical area. The term is usually used when considering the frequency of extra cases of the disease endpoint a variable that is one of the primary interests in a study. The variable may relate to eﬃcacy or safety. 
The term is used almost synonymously with efficacy variable or safety variable but not, for example, with demographic variable enrol to recruit a subject, or subjects, into a study enrolment the number of subjects that have been enrolled into a study enrolment period the time (often measured in months or years) during which subjects are enrolled into a study

enteric coating a coating (often made of gelatine) used on a tablet or capsule to prevent it being destroyed by acid in the stomach entry criteria inclusion criteria epidemiological study a study using the methods of epidemiology. This includes clinical trials but also case-control studies, cohort studies, natural experiments, surveys, etc. epidemiologist one who studies or practices epidemiology epidemiology the study of health and disease in populations, including aetiology, natural course and treatments. Clinical trials are considered by many to be one of the methods of epidemiology episode the occurrence of an event. In some studies the primary endpoint or primary eﬃcacy variable may be the number of times an event happens (the number of episodes of that event) equal allocation allocating the same number of subjects to each treatment. Û unequal allocation equal randomisation equal allocation equation a set of mathematical symbols and instructions for performing calculations equipoise the state of having an indiﬀerent opinion about the relative merits of two (or more) alternative treatments. Ethically, a subject should only be randomised into a study if the treating physician has no clear evidence that one treatment is superior to another. If such evidence does exist then it is considered unethical to randomly choose a treatment. If the physician is in a state of equipoise, then randomisation is considered ethical equipotent having equal potency and therefore having equal eﬀects (positive or negative). G equivalent equivalence the situation where two treatments show equal eﬀects equivalence study a study whose primary aim is to demonstrate that two treatments are equivalent with regard to certain speciﬁed parameters. Most studies are designed to show that one treatment is better than another; these are sometimes referred to as diﬀerence studies to emphasise the contrast with equivalence studies. 
G noninferiority study equivalent having equal effects (positive or negative). G equipotent erect standing. G prone, supine error a mistake. Sometimes the term is used to describe the discrepancy between an observed data value and the true value. In these situations, the term is used with reference to the variance, as in, for example, error term, error variance error band an informal term to describe an interval around an estimate that semiquantitatively describes the uncertainty of the estimate of the parameter. G interval estimate error bar an informal term, similar to error band but where the interval is shown on a graph. There is no fixed convention for the length of these ‘bars’ but they are typically one standard error, one standard deviation, two standard errors or two standard deviations. If error bars are used, their precise definition should be given. G box and whisker plot error mean square residual variance error of the 1st kind Type I error error of the 2nd kind Type II error error of the 3rd kind Type III error error sum of squares residual sum of squares error term residual variance error variance residual variance errors in variables model in many situations it is assumed that, although a response variable may be measured with uncertainty (because it has some residual variance), the predictor variables, or covariates, do not have any uncertainty in their measurement. This may often not be the case. If it is not, the estimated relationship between the covariates and the response will be biased; with a single covariate subject to random measurement error, the estimated relationship is typically weaker (closer to zero) than the true one. If the variances of the covariates can be estimated, then an adjustment can be made to the estimated relationship with the response variable. A model that makes this adjustment is called an errors in variables model essential documents a regulatory term describing the documentation that is required to support the data from clinical trials. It includes the protocol, case record form, names and affiliations of all staff involved, including their curricula vitae, the source and quality assurance statements of the products involved, etc. essential requirements essential documents estimable a parameter that can be estimated from a given experimental design. 
Some complex crossover studies and factorial studies may intentionally include some parameters of lesser importance that cannot be estimated, in order to more eﬃciently estimate those parameters that are of greater interest estimate the value of a parameter that is calculated using data. It should always be remembered that exact answers to questions are rarely attainable because of measurement error and random variation in the variable we are trying to measure. The ‘truth’ is rarely known, the best 55

estimated sample size

evidence based medicine

we can usually do is to get estimates of it estimated sample size the estimate of how many subjects must be enrolled into a study in order to meet the objectives of the study. G sample size estimation the process of obtaining estimates of parameters from data estimator a formula used to estimate a parameter ethical a process or study that conforms to accepted guidelines and rules on ethics. G Declaration of Helsinki ethical pharmaceutical a medicinal product that is available only with a doctor’s prescription. Û over-the-counter drug ethics the discipline of describing behaviour, practices, thinking and moral values generally agreed to be acceptable to society. G Declaration of Helsinki ethics committee research ethics committee ethnic origin a demographic variable encompassing place of birth, race, religion, and sometimes also native language. Often it is simply used to describe country or region of birth of a subject evaluable subject one who conforms to the study protocol suﬃciently well to be included in the per protocol population. Often this means a subject who meets all the inclusion criteria for a study and none of the exclusion criteria. Sometimes the requirements may be made less stringent and only certain major inclusion criteria need to be fulﬁlled. Sometimes the requirements may be more stringent and a certain minimum time in the study may be required. The precise deﬁnition of evaluable is likely to be study dependent and should be described in the protocol and study report event a binary variable that is an outcome that may or may not occur for each subject in a study. Some events, if they do occur, can occur more than once. Events are more often considered as negative ( adverse event) but they may be positive aspects of a treatment event rate the proportion of subjects who experience a particular event in a given time interval. 
Note that if the event can occur more than once for any given subject, as in adverse events, the event rate is still the proportion of subjects who experience that event; it is not a function of the number of events that occur evidence based medicine a recent approach to patient management that relies on using the most rigorous data available to guide decisions on what treatments should be used and how they should be used. The forms of evidence preferred are usually (although not always) from randomised and blinded clinical trials and meta-analyses 56
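The distinction drawn in the event rate entry above (subjects with the event, not the number of events) can be illustrated with a minimal Python sketch. The function name and the headache counts are hypothetical, chosen only for illustration.

```python
def event_rate(events_per_subject):
    """Proportion of subjects who experienced the event at least once."""
    if not events_per_subject:
        raise ValueError("need at least one subject")
    # A subject counts once, however many times the event occurred.
    affected = sum(1 for n in events_per_subject if n > 0)
    return affected / len(events_per_subject)


# Hypothetical number of headaches reported by five subjects in one interval.
headaches = [0, 3, 1, 0, 0]
rate = event_rate(headaches)      # 2 of the 5 subjects had the event
total_events = sum(headaches)     # 4 events in total; not used for the rate
```

Here two of five subjects had at least one headache, so the event rate is 0.4 even though four events were recorded.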

exact statistical method a statistical method for estimation and significance testing that does not make assumptions about the distribution of variables. Some exact methods are commonly referred to as nonparametric methods but the variety of exact methods currently being developed goes beyond what have traditionally been thought of as the nonparametric methods. See also parametric methods

exact test a statistical significance test using an exact statistical method

examination a series of observations, usually undertaken to determine a diagnosis or to measure the progress of disease

exchangeability a term used in the context of bioequivalence to encompass equivalence of all aspects of two products

excipient the constituents of a product that are not active but help with the formulation. See also vehicle

exclusion criteria reasons why a subject should not be enrolled into a study. These are usually reasons of safety and should not simply be the opposites of inclusion criteria

excrete to eliminate from the body, usually taken to mean via urine and faeces, but can also include sweat

excretion study a study of the quantity, route, timing, etc. of drug being excreted from the body

executive committee a small group of individuals representing a larger group, with the authority to make decisions regarding the design or conduct of a study. Data monitoring committees and research ethics committees could have a smaller group that meets more frequently than the main committee to pass through ‘simple’ decisions quickly or who meet on an infrequent basis to make ‘major’ decisions that have been discussed at length at a fuller committee

expectation see expected value

expected frequency the number of events that would be expected to occur within a set of constraints (usually the constraint is the null hypothesis). The term refers particularly to expected cell frequencies (as opposed to observed frequencies) in contingency tables

expected number see expected frequency

expected outcome in a statistical sense, see expected value. Otherwise the term is used in a general sense to refer to what outcome (or course of a disease) would generally be expected to occur. See also prognosis

expected value the value of a parameter that an estimator predicts. For example, the expected value of the sample mean is the population mean, although the expected value of the sample variance is not quite the population variance: there is a small bias (which can be corrected)
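The example in the expected value entry can be checked exactly for a small case. The sketch below assumes a hypothetical population of four values, samples of size two drawn with replacement, and that the "sample variance" in question divides by n; dividing by n - 1 is taken here as the correction. It averages each estimator over every possible sample, using exact fractions.

```python
from fractions import Fraction
from itertools import product


def variance(values, divisor):
    """Sum of squared deviations from the mean, divided by `divisor`."""
    mean = Fraction(sum(values), len(values))
    return sum((Fraction(v) - mean) ** 2 for v in values) / divisor


population = [1, 2, 3, 4]                           # hypothetical population
n = 2                                               # sample size
pop_mean = Fraction(sum(population), len(population))
pop_var = variance(population, len(population))

# Every equally likely ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

exp_mean = sum(Fraction(sum(s), n) for s in samples) / len(samples)
exp_var_n = sum(variance(s, n) for s in samples) / len(samples)       # divisor n
exp_var_n1 = sum(variance(s, n - 1) for s in samples) / len(samples)  # divisor n - 1
```

Under these assumptions the expected sample mean equals the population mean (5/2), the divisor-n variance averages 5/8 against a population variance of 5/4, and the divisor n - 1 version recovers 5/4.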

expedited report a report that must be made very quickly. It usually refers to reporting serious adverse events to regulatory authorities, sometimes within two or three days of the event occurring

experiment a general term that encompasses preclinical studies, clinical trials, animal studies, etc. It covers almost any form of practical research that involves intervention. Cf. observational study

experimental design all aspects of the design of an experiment. Sometimes the term is restricted to certain specialised statistical aspects of the design such as blocking, replication and stratification

experimental drug see experimental treatment

experimental error see residual variance

experimental treatment usually the product that is of primary interest and that is being compared with the comparator treatment

experimental unit this usually means each subject but is best thought of as the smallest unit that could be randomised. Even in studies that do not involve randomisation, it is still helpful to think in these terms. In community intervention studies the experimental unit might be an entire town; in other situations it could be a hospital ward or a General Practitioner’s surgery. See also unit of analysis

experimenter effect see Hawthorne effect

experimentwise error rate the probability of making a Type I error when considering the overall result of a study. Note that, if a study has several endpoints to be analysed, even if one or more of those analyses may result in a Type I error the overall conclusion from the study could still be correct. Cf. comparisonwise error rate. See also multiple comparisons

expert report a regulatory document that summarises the complete set of documents on the safety and efficacy of a product submitted for regulatory approval in a new indication

expert review a review of documents, study results, etc. by an expert. See also peer review

expert system a computerised method of making decisions that is more complex than a simple algorithm; it is a method that is capable of ‘learning’ by building upon past decisions and their outcomes

expiry date the date after which a product should not be used because its quality cannot be assured. See also shelf life

explained variance in a set of data there will usually be variation between data points. Some of this variation will be due to differences between subjects, differences between points in time, differences between treatments, etc. The variation that is due to such known causes is the explained variance; the variation that is due to unknown causes is called the residual variance, or simply the variance

explanatory study a study that aims to find out if an intervention can work, given ideal circumstances, or to find the circumstances under which an intervention works. The analysis of such studies is usually by the per protocol approach. Cf. pragmatic study

explanatory variable see covariate

exploratory data analysis methods of reviewing data to find potential errors and to gain simple impressions of patterns that may exist or effects that may be happening. The methods are usually graphical and include box and whisker plots, histograms, stem and leaf plots

exploratory study a study that aims to generate hypotheses rather than to definitively test them

exponent in a mathematical equation of the form y = x^z (‘x raised to the power z’) the parameter z is called the exponent

exponential see exponential growth

exponential decay a quantity that is diminishing at an ever-decreasing rate (Figure 7). Cf. exponential growth

Figure 7 Exponential decay. For each equal sized change in the value of x, the value of y falls by the same proportion

exponential distribution the probability distribution that describes the time interval between randomly occurring events. It is an important distribution in the analysis of survival data

exponential growth growing at an ever-increasing rate; for example, the number of cancerous cells in a tumour may double every week, or may increase tenfold every week (Figure 8). Cf. exponential decay

Figure 8 Exponential growth. Rate of growth of cancerous cells. The number of cells multiplies by the same factor after each additional day

exposed group in a clinical trial this term is sometimes used to refer to the group receiving the experimental treatment. The term more naturally comes from case-control studies and refers to the cases

exposure the extent (amount and length of time) for which a subject has received medication or other intervention (including possibly harmful interventions)

exposure variable the variable that measures exposure

external consistency a study whose results are applicable to, and match what is seen in, other studies and in clinical practice. All studies should obviously have this feature but many do not because they use inclusion criteria and exclusion criteria that are either different to those of other studies or are not reflected in clinical practice. Cf. internal consistency. See also explanatory study, pragmatic study

external validity see external consistency

extrapolate either formally estimating, via a statistical model, or informally judging what results will occur outside the range of data actually collected and analysed. This may involve extrapolating to a wider patient population than has been studied, extrapolating from animal studies to judge what will occur in humans, etc. (Figure 9). Cf. interpolate

Figure 9 Extrapolation. A simple regression model predicting patients’ systolic blood pressure from their weight has been used to predict what the blood pressure of a 150 kg person (the large dot) would be

extreme value the largest or smallest value in a set of data. Sometimes the extreme values (plural) are taken as several of the largest or smallest values

F

F distribution a probability distribution used extensively for significance testing in analysis of variance. It is used to test whether two variances are equal but this can be put to use to compare the means across several groups

F ratio see F statistic

F statistic the value of the test statistic calculated from an F test

F test a statistical significance test based on the F distribution

F to enter the value of an F test required as a decision rule to enter a variable into a regression model when using forward selection or stepwise regression methods. Cf. F to remove

F to remove the value of an F test required as a decision rule to remove a variable from a regression model when using backward elimination or stepwise regression methods. Cf. F to enter

fabricated data data that are not real and have been presented fraudulently. See also fraud

face validity a term usually used with reference to questions on a questionnaire. Face validity refers to whether a question seems to make sense to an expert in the field. It stems from the expression ‘on the face of it’. See also external validity, internal consistency

factor another name for a categorical variable, usually (but not exclusively) one that is a covariate or a stratification variable, rather than one that is an outcome variable

factorial design a study that compares two (or more) different sets of interventions. The simplest design uses Drug A versus Placebo A and Drug B versus Placebo B. Subjects will be randomised to one of four groups: Placebo A + Placebo B, Drug A + Placebo B, Placebo A + Drug B or Drug A + Drug B. This is a very efficient type of study because it not only allows the assessments of Drug A and Drug B in one study instead of two but also allows us to investigate the question of whether drugs A and B show any interaction

factorial study a study of two or more interventions carried out in a factorial design
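The four groups of the simplest factorial design described above are every combination of the two sets of interventions. A minimal Python sketch, using only the arm names given in the entry (the variable names are illustrative):

```python
from itertools import product

arms_a = ["Placebo A", "Drug A"]
arms_b = ["Placebo B", "Drug B"]

# Cross the two sets of interventions; iterating over B first reproduces
# the order in which the entry lists the groups.
groups = [f"{a} + {b}" for b, a in product(arms_b, arms_a)]
```

Adding a third intervention with its own placebo would, on the same logic, give eight groups.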

failure the term sometimes used in place of event in survival data. It comes from studies of the time it takes for machine parts to cease working (or ‘failing’) but the term has been carried over to medical examples where we are looking at the time until an event such as death or relapse

failure time the time until an event occurs, where the term failure has been used instead of event

false negative the case when a test of some sort does not detect what it is supposed to detect. This can be a diagnostic test that fails to identify a patient who has a particular disease. The term is also sometimes used in significance testing to describe a Type II error. Cf. false positive

false positive when a test incorrectly detects something that is not real. In a diagnostic test this is identifying a patient as having a particular disease when they do not. In significance testing it is the same as a Type I error. Cf. false negative

falsificationism the act of falsifying data or results. See also fabricated data, fraud

familywise error rate see experimentwise error rate

fatal that which causes death. Cf. lethal

feasibility study see pilot study

Fibonacci dose escalation scheme a commonly used method of determining what doses of a drug should be used in a dose escalation study. The successively increasing doses follow a Fibonacci series

Fibonacci numbers numbers that follow a Fibonacci series

Fibonacci series a series of numbers that increase by successively adding the previous two numbers to get the next one. For example, 1, 1, 2 (= 1 + 1), 3 (= 2 + 1), 5 (= 3 + 2), 8 (= 5 + 3), 13 (= 8 + 5), 21 (= 13 + 8), 34 (= 21 + 13), ...

fiducial inference a method of statistical inference similar to significance testing. See also Bayesian inference, frequentist inference

field study a term used to describe a study that is not conducted in a hospital or similar type of well controlled environment but rather one that is carried out in general practice with patients free to carry on their normal daily activities. The analogy of an agricultural study being carried out either in a greenhouse-type environment or in a field where the climate and other environmental factors cannot be controlled is a good one. See also experiment

figure the term is used to refer to a number, or to a graph or diagram in a study report. The use can be confusing as different people assume it means different things

file a physical or electronic (on a computer) place where documents and data are stored
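The rule in the Fibonacci series entry above (each number is the sum of the previous two) can be written as a short Python sketch. The function name is illustrative, and the sketch generates only the series itself; it does not attempt to represent how doses are derived from it in any particular escalation scheme.

```python
def fibonacci_series(count):
    """Return the first `count` numbers of the series 1, 1, 2, 3, 5, ..."""
    series = []
    a, b = 1, 1
    for _ in range(count):
        series.append(a)
        # The next number is the sum of the previous two.
        a, b = b, a + b
    return series


first_nine = fibonacci_series(9)
```

The first nine numbers reproduce the worked example in the entry, ending at 34.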

final data analysis the final analyses of a study that are reported. These may be done after various forms of exploratory data analysis have been completed

final report another term for study report but the use of this term can be useful to distinguish it from an interim report or a draft of a study report

fine data data measured with great accuracy. Cf. coarse data

finite having real bounds. The term is sometimes overused because most of what we do is finite. The use of the word can only really be justified if it genuinely contrasts with the possibility of being infinite

finite population for the purposes of most statistical analyses, it is assumed that there are an infinite number of subjects to which the study results apply. This assumption is partly justified on the grounds that the possible set of subjects having the target disease includes all those with the disease today and all those who will have the disease in the future. In some situations this is not a sensible assumption and it must be recognised that there is a finite number of subjects in the population to which our results can apply. Cf. infinite population

first in man study the first Phase I study undertaken with a new drug

first order interaction see two factor interaction

first pass metabolism the absorption of drugs into the body when they pass through the liver

Fisher’s exact test a statistical significance test that is used for comparing proportions in contingency tables. It is used in preference to the chi-squared test when the sample size is small (often less than 30)

fishing expedition see data dredging

fit to estimate the parameters of a model from data

fitted value the estimated value of a parameter based on a model. Cf. observed value. See also empirical result

fixed combination therapy a mixture of two (or more) drugs in one formulation. Cf. free combination therapy

fixed cost in pharmacoeconomics this refers to a cost that will remain the same however many patients there may be or in whatever way they may be treated. One might argue that the pharmacy department in a hospital needs to be open 24 hours a day whether it stores drugs for a certain disease or not: there is, therefore, a certain minimum fixed cost for this facility. Cf. marginal cost, per unit cost, variable cost

fixed disk see hard disk

fixed effect a categorical variable where the different levels of the factor are exactly the ones that we wish to draw conclusions about. See also fixed effects model. Cf. random effect
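For a two-by-two contingency table, Fisher’s exact test (see the entry above) can be sketched with the Python standard library alone. The sketch assumes one common two-sided convention, which is not specified in the entry: with the table margins held fixed, sum the probabilities of all tables that are no more probable than the one observed. The function name and the example counts are hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), denom)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Add up every possible table that is no more probable than the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_observed)


# Hypothetical trial: 3 of 4 responders on treatment, 1 of 4 on control.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

Exact fractions are used so that ties between equally probable tables are compared without rounding error; for this small table the p-value is 17/35, about 0.49.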

fixed effects model a statistical model that assumes we wish to make inferences about the particular levels of a factor used in the study, and no others. This is particularly relevant when including study centre as a factor in the analysis: do we wish our results to be applicable only to those centres that took part in the study, or do we wish to consider those centres to be a random selection of all the centres that might have taken part so that the results can be applied to all possible centres? The fixed effects approach assumes the first case. Cf. random effects model

fixed sample size design a design that determines the number of subjects to be recruited before the study starts and does not allow the number to be changed. This is the most common type of approach to determining how many subjects should be in a study. See also group sequential design, interim analysis, sequential design

flat file a computer datafile that can be thought of as like a matrix, usually with each row representing one subject and each column representing one variable. Cf. hierarchical database

floor effect an asymptote that is a lower limit. Often zero will be that lower limit. Cf. ceiling effect

floppy disk a form of computer disk that is easily portable and is intended to be slotted in or out of a computer rather than being a permanent fixture (as in hard disk)

flow diagram a diagram showing a series of activities occurring across time (Figure 10)

flowchart see flow diagram

follow-up the process of collecting data after some activity has taken place. This often simply means gathering data after subjects have been randomised, or it may mean collecting data after treatment has been stopped to monitor safety or relapse of symptoms

follow-up data data that are collected as a result of follow-up

follow-up period the time during which follow-up occurs. This may simply be the time that patients are in a study from randomisation until their last visit

follow-up visit any visit during a follow-up period of a study

for cause audit an audit that is carried out because of some suspicion of poor quality work or of fraud. Cf. no cause audit

form see case record form

formulation the way in which a product is manufactured and presented. Examples include tablets, capsules, injections. See also product

FORTRAN a very powerful but quite old computer programming language. See also BASIC, Visual Basic, C, C++

Figure 10 Flow diagram. The sequence of events to follow in Zelen’s randomised consent design, seeking consent in conjunction with randomisation

forward selection a method of arriving at a regression model when several possible covariates might be included. The method begins by selecting the variable that makes the greatest contribution to reducing the residual variance (subject to some minimum criterion) and putting this in the model. Then the variable giving the next greatest reduction in variance (again, subject to a minimum criterion) is found and included in the model. The process continues until either all the variables are in the model or no more meet the minimum criterion for being included. The minimum criterion is referred to as F to enter. See also all subsets regression, backward elimination, stepwise regression

forward stepwise regression see forward selection

fourfold table see two-by-two table

fourth hurdle some regulatory authorities, in addition to requiring demonstration of quality, safety and efficacy, require evidence of additional value for money of a new product. This is called the ‘fourth hurdle’. See also pharmacoeconomics

frailty model a statistical model that assumes different individuals have different probabilities of being unobserved. The term is most often used with respect to survival times where it is expected that there will be some censored data. Survival models assume that the probability of censoring is the same for every subject but frailty does not make that assumption

frame see sampling frame

fraud the act of intentional and dishonest deception. See also fabricated data

fraudulent data see fabricated data

free combination therapy a mixture of two (or more) drugs that are intended to be taken together but which are not combined in one formulation. Cf. fixed combination therapy

frequency the number of times a particular event occurs or a particular data value is observed. Cf. relative frequency

frequency distribution the number of times each of several events occurs or the number of times each of many different data values occurs. See also frequency polygon, frequency table, histogram

frequency polygon a diagram for representing a frequency distribution. Each of the data values is placed along the x axis and the number of times each occurs is plotted as a point on the y axis. These points are then joined to form a polygon (Figure 11). Cf. histogram

Figure 11 Frequency polygon. Distribution of the number of years a group of 87 patients had suffered from eczema. Only the outline of the histogram is plotted

frequency table a numerical summary of a frequency distribution showing the number of times each data value occurs. Sometimes this may be enhanced to also show the percentage of occurrences, the cumulative frequency and the cumulative percentage of occurrences. All of these features are shown in Table 6

Table 6 Frequency table of extent of body surface area affected by eczema in 157 patients

                  Frequency   Percentage   Cumulative frequency   Cumulative percentage
No involvement        42         26.8               42                    26.8
<10%                  48         30.6               90                    57.3
10–29%                25         15.9              115                    73.2
30–49%                14          8.9              129                    82.2
50–69%                16         10.2              145                    92.4
70–100%               12          7.6              157                   100.0

frequentist inference an approach to data analysis that produces estimates of parameters, confidence intervals and significance tests. See also Bayesian inference, fiducial inference

Friedman’s test a nonparametric significance test for testing the null hypothesis that all of several treatments given to the same subjects have the same distribution of responses. Informally, this can be thought of as the nonparametric equivalent of repeated measurements analysis of variance

Friedman’s two way analysis of variance see Friedman’s test

full analysis set see intention-to-treat population

full model see saturated model

fully compliant a subject who takes or uses all medication exactly as prescribed in the study protocol

function a mathematical equation

funnel plot a type of graph for plotting summary results from many different studies. It is used in meta-analysis and in overviews to help try to detect publication bias (Figure 12)
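The percentage and cumulative columns described in the frequency table entry follow directly from the frequencies. As a minimal sketch, the Python below recomputes them from the frequencies given in Table 6; the function name is illustrative, the category labels are as read from the table, and rounding to one decimal place is an assumption based on how the table is printed.

```python
def frequency_table(counts):
    """Return rows of (label, frequency, %, cumulative frequency, cumulative %)."""
    total = sum(counts.values())
    rows, running = [], 0
    for label, freq in counts.items():
        running += freq  # cumulative frequency up to and including this row
        rows.append((
            label,
            freq,
            round(100 * freq / total, 1),
            running,
            round(100 * running / total, 1),
        ))
    return rows


# Frequencies from Table 6 (157 patients in total).
table6 = {
    "No involvement": 42,
    "<10%": 48,
    "10-29%": 25,
    "30-49%": 14,
    "50-69%": 16,
    "70-100%": 12,
}
rows = frequency_table(table6)
```

Note that the cumulative percentage is calculated from the cumulative frequency rather than by adding up the rounded percentages, which avoids accumulating rounding error.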

Figure 12 Funnel plot. Summary odds ratios from 25 studies comparing the eﬃcacy of a certain class of antidepressant with placebo. If no publication bias existed, we would expect to see a ‘funnel’ shape. There is some suggestion here that some small negative studies may have been missed because of the lack of studies in the bottom centre of the plot

G

Galbraith plot see radial plot

Gaussian curve see Normal distribution

Gaussian distribution see Normal distribution

Gehan’s design a design, typically in Phase II cancer studies, where no control group is used. The design initially recruits a small number of patients: if overwhelming evidence in favour of or against efficacy is seen then the study stops. If the evidence is not conclusive either way, further patients are recruited in order to obtain a reasonable estimate of the treatment response rate

Gehan’s generalised Wilcoxon test a nonparametric statistical significance test for comparing two survival distributions. See also Cox’s proportional hazards model, log rank test

gel a vehicle for delivering topical treatments. Similar to cream but more solid. See also lotion, ointment

gender synonym for sex

general linear model see linear model

generalisability the extent to which conclusions can be applied to a wide population. See also external validity

generaliseable conclusions that have wide generalisability

generalised additive model a method of producing models, similar to generalised linear models, that predict an outcome variable from several independent variables. In this case, the link function is a complex function of the data, rather than a theoretical link function, such as the Normal distribution or logistic function

generalised estimating equations an extension to linear models particularly useful for modelling repeated measurements and further particularly suited to binary data and Poisson data

generalised linear model an extension to linear models where a link function is introduced. This link function is a function of the response variable and, instead of modelling the response variable directly, the link function is modelled as a linear function of the independent variables

Figure 13 Ghost table. All of the row headings and column headings have been drafted out so it is clear what the table will look like when the data are available. In this example, the number of decimal places and signiﬁcant digits for each of the numerical values have also been indicated generic the fundamental, original, form. Often used to refer to drug names as generic name, in contrast to trade name generic name the name that the original manufacturer or developer gives to a drug. Û trade name genetics the study and description of genes and DNA Genie score a way of summarising multivariate data (usually used for laboratory data). The greater the score, the more deviation there is in a subject’s (laboratory) data from the relevant reference ranges geometric mean a measure of central tendency, particularly useful for highly skewed data. It is calculated as the nth root of the product of n numbers or, alternatively, as the antilog of the mean of the logarithms of all the numbers. G harmonic mean ghost report a draft of a report that contains no results but has all the section headings and some of the introductory text included. The 71

ghost table

grand total

intention of a ghost report is to be able to produce a ﬁnal report as quickly as possible after the data become available. A ghost report may also contain ghost tables ghost table the layout of a table indicating row and column headings but without any data (Figure 13). G ghost report Gini coeﬃcient a measure of variation most often used in describing income or salaries. Hence it has uses in pharmacoeconomics glossary a list of specialist terms referred to in a document, with their deﬁnitions goal a target such as the goal for the number of subjects to be recruited to a study gold standard a diagnostic test that is guaranteed to give the correct diagnosis. Also used to refer to a treatment that is widely recognised as the best available golden rule informal term for ‘most important rule’ Good Clinical Practice (GCP) a set of principles and guidelines to ensure high quality and high ethical standards in clinical research. G Good Laboratory Practice, Good Manufacturing Practice Good Distribution Practice (GDP) a set of guidelines to ensure high quality standards in warehouse storage and distribution work Good Laboratory Practice (GLP) a set of guidelines to ensure high quality standards in laboratory work. G Good Clinical Practice, Good Manufacturing Practice Good Manufacturing Practice (GMP) a set of guidelines to ensure high quality standards in manufacturing. G Good Clinical Practice, Good Laboratory Practice Good Regulatory Practice (GRP) a set of guidelines to ensure high quality standards in regulatory aﬀairs work Good Statistical Practice (GSP) a set of guidelines to ensure high quality standards in statistical work goodness of ﬁt a measure of agreement between a set of observed data and a model that has been ﬁtted to those data goodness of ﬁt test a statistical signiﬁcance test to compare whether one model ﬁts data better than an alternative model Graeco-Latin square a form of Latin square that balances for three sources of variation. 
G Youden square grand mean the mean of a set of numerical observations, regardless of which group (treatment group or other form of group) those data relate to. G grand total grand total the total of a set of numerical observations regardless of which 72

graph

growth curve

group (treatment group or other form of group) those data relate to. This equates to the grand mean multiplied by the number of observations graph a pictorial representation of data plotted on an x axis and y axis, and sometimes on a z axis too. G scatter plot graphic a general term for diagrams, graphs, sketches, etc. Greenhouse–Geisser correction an adjustment made to the degrees of freedom in an F test of within subjects eﬀects in repeated measurements analysis of variance. It is assumed that the pattern of correlation is constant over time and this adjustment is required if the assumption is not valid. G Huynh–Feldt correction group one of the strata in stratiﬁed data. The term is frequently used to refer to subsets such as the treatment group or the placebo group (those treated with active treatment or those treated with placebo, respectively). It can be used to refer to other strata such as the males or females, the ‘high risk group’, ‘low risk group’, etc. group data the subset of an entire set of data that relates to only one group. For example, all the data from subjects treated with placebo or all the data from female subjects collective ethics group ethics group matching usually in matching we refer to matched pairs. However, with group matching we imply that overall, two (or more) groups of subjects are typically quite similar in terms of their demographic data, disease proﬁle, etc. group randomisation cluster randomisation group sequential analysis special types of analysis that are appropriate for group sequential studies group sequential design a form of sequential design where interim analyses are carried out after a number of subjects have been recruited into a study. Usually only two or three analyses would be planned into such a study after either half the subjects or one third and two thirds of the subjects have completed the study. 
G O’Brien and Fleming rule, Pocock rule group sequential study a study designed as a group sequential design group sequential test a statistical signiﬁcance test carried out in group sequential designs grouped data categorical data growth curve a graph that traditionally plots the progress of some feature of growth over time. Growth could be measured by height or weight. The term now has a broader use to include any variable that systematically changes (usually increases) over time

guardian legal guardian guesstimate an informal term to describe a result that is largely a guess but that supposedly has some data used to help form that guess. It is not an estimate in the formal sense of the word but it is supposed to be better than a guess based on no knowledge (or data) at all guideline a set of suggested rules but ones that are not enforceable by any laws. In practice, one would be foolish to ignore many of the regulatory guidelines that have been written. G Good Clinical Practice guinea pig a subject who is part of a study may be referred to as a guinea pig. The term is used disparagingly by those who do not approve of the particular study involved (or who do not approve of research on humans generally); or it is used light heartedly. Given these two extremes of meaning, it is a term best avoided Guttman scale a method of combining answers to individual questions to arrive at an overall score (sometimes called a composite score). Each question may be weighted diﬀerently, so it is not simply the sum of the individual question responses

H
$H_0$ symbol for null hypothesis $H_1$ usual symbol for alternative hypothesis $H_A$ alternative (less often used) symbol for alternative hypothesis haematology the study of the makeup of blood. Usually used in the context of laboratory data to refer to such parameters as platelets, red blood cell counts, etc. Û biochemistry half life the amount of time that a radioactive substance takes to decay to half its original quantity or a drug takes to halve its concentration in the body. In many situations, four half lives might be considered a reasonable time to reduce the original concentration to a minimal quantity (after four half lives, 1/16 of the original amount remains) halo eﬀect an informal term to describe psychosomatic eﬀects that often occur when patients believe that the doctor will be able to give them beneﬁt. G placebo eﬀect handbook a book of instructions for using a machine, for running a study or for general work practices haphazard unpredictable but not in the highly controlled sense of random haphazard sample a sample of people (or items). The members of the sample are not chosen for any particular reasons, just as they happen to present themselves. Haphazard samples often display various patterns that would not be seen in a truly random sample. G convenience sample haphazard treatment assignment a method of assigning treatments to subjects that is not controlled or predictable. Like haphazard samples, haphazard treatment assignment often displays various patterns that would not be seen in truly random assignment hard data objective data hard disk a form of computer disk that usually resides inside the computer and is not intended to be moved between diﬀerent computers. Hard disks have much larger capacity than ﬂoppy disks or diskettes hard endpoint objective endpoint hard measurement objective measurement
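The arithmetic behind the half life entry can be sketched in a few lines. This is a minimal illustration that assumes simple exponential decay; the function names and the example values are hypothetical and are not taken from the dictionary.

```python
def fraction_remaining(n_half_lives: float) -> float:
    """Fraction of the original quantity left after n half lives."""
    return 0.5 ** n_half_lives


def concentration_at(time: float, initial: float, half_life: float) -> float:
    """Concentration at `time`, assuming simple exponential decay.

    `time` and `half_life` must be in the same units (for example hours).
    """
    return initial * fraction_remaining(time / half_life)


if __name__ == "__main__":
    # Halving four times in a row leaves one sixteenth of the original amount.
    for n in range(1, 5):
        print(n, "half lives:", fraction_remaining(n))
```

Under this assumption, a hypothetical drug with a half life of 2 hours and a starting concentration of 100 units would be at 6.25 units after 8 hours, which is the 1/16 mentioned in the entry.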

hard outcome a response to an intervention that can be measured using objective data. Û soft outcome hardware the mechanical, electrical and electronic components of a computer such as the screen, the processor, disk drives, keyboard, etc. G software harmonic mean a measure of central tendency used for skewed data. It is calculated using the reciprocals of the data, namely $H = n / \sum_{i=1}^{n} (1/x_i)$, where $n$ is the number of observations $x_i$. G geometric mean Hawthorne eﬀect the response that is often seen in subjects taking part in a study and produced simply because they know that they are being observed; however, it is not a true eﬀect of any intervention. The strict deﬁnition given for eﬀect is particularly important to note. G placebo eﬀect hazard hazard function hazard function in survival analysis the probability of a given event (such as death) occurring at each instant in time, given that the event has not already happened. G Cox’s proportional hazards model hazard rate the hazard function at any particular point in time hazard ratio the ratio of two hazard rates or of two hazard functions, either at a particular point in time or averaged over a long period. G Cox’s proportional hazards model health the general state of wellbeing or lack of wellbeing in an individual or a group of individuals health economics pharmacoeconomics health services research research into the provision of health care, including aspects of cost, need, resources, supply and outcome. Strongly linked with pharmacoeconomics healthy subject healthy volunteer healthy volunteer a subject who volunteers to take part in a study but who does not have any signiﬁcant disease. Such subjects often participate in Phase I studies. Note that all subjects who take part in clinical trials should do so voluntarily; for this reason, the term should not be abbreviated simply to ‘volunteer’ (although it often is). Û patient healthy worker eﬀect a form of volunteer bias.
Subjects who have employment (those that are workers) tend to be healthier, on average, than the general population (which includes those who do not work, through choice, old age, disability, etc.) Heisenberg eﬀect a term from physics that says that the act of observing and measuring a process aﬀects that process so that absolute eﬀects are impossible to measure. This is one of the reasons why we need comparison groups in studies. G Hawthorne eﬀect
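The harmonic mean entry above describes a calculation based on reciprocals. A minimal sketch follows, assuming strictly positive data; the helper name `harmonic_mean` and the sample values are illustrative only. Python's standard `statistics.harmonic_mean` is used as a cross-check.

```python
import statistics


def harmonic_mean(values):
    """Number of observations divided by the sum of their reciprocals."""
    if not values:
        raise ValueError("at least one observation is required")
    if any(v <= 0 for v in values):
        raise ValueError("this sketch assumes strictly positive data")
    return len(values) / sum(1.0 / v for v in values)


if __name__ == "__main__":
    data = [1, 2, 4]  # hypothetical observations
    print(harmonic_mean(data))             # reciprocal-based calculation
    print(statistics.harmonic_mean(data))  # standard-library equivalent
    print(statistics.mean(data))           # arithmetic mean, for comparison
```

For positive data that are not all equal, the harmonic mean is smaller than the arithmetic mean, so it is pulled less by a few large values.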

Helmert contrasts a particular type of contrast where each level of a factor is compared with the mean of all other levels of that factor. For example, if three ethnic groups are represented in a study the response variable could be investigated to see if it is aﬀected by ethnic group. The mean response in ethnic group 1 would be compared with the mean of the combined data from ethnic groups 2 and 3; ethnic group 2 would be compared with the mean of the combined data from ethnic groups 1 and 3; ethnic group 3 would be compared with the mean of the combined data from ethnic groups 1 and 2. G analysis of variance, multiple comparisons hepatic metabolism metabolism (of product) through the liver. G renal metabolism, pharmacokinetics heterogeneous a term used to mean that the variation of a measurement within a group is diﬀerent from the variation of that same measurement within other groups. Û heteroscedastic, homogeneous heteroscedastic unequal variances of data values of the same variable. For example, the variation in the measurement of a person’s age usually changes with age; age of newborns may be measured in hours or days, age of infants in months, adults in years. Û heterogeneous heuristic using intuition and judgement hierarchical nested, meaning built up in layers hierarchical database a computer database that has several levels of data. For example, the highest level may be the subject level recording basic demographic data for each subject. For each subject, other levels may contain data on the diseases they have and, for each disease, the treatments they have been given. 
Û ﬂat ﬁle hierarchical models two statistical models for the same data but one has extra covariates that are not included in the other high level term a classiﬁcation of signs, symptoms and diseases (particularly used in MedDRA) giving a coding that is less detailed than the preferred term but is more detailed than the system organ class high order interaction a general term used to refer to an interaction (in the statistical sense) that is not a two factor interaction but involves at least three factors highest density region in Bayesian statistics the middle region of a posterior distribution used for determining interval estimates. G credible interval, conﬁdence interval highest posterior density a method in Bayesian statistics for determining a point estimate of a parameter high–low graph a graph for plotting one continuous variable (usually on the y axis) against one categorical variable (usually on the x axis). It shows the mean and/or median and the minimum and maximum values of the continuous variable for each value of the categorical variable (Figure 14)
Figure 14 High–low graph. The mean pulse rate in 100 patients with ischaemic heart disease is plotted at each of ﬁve visits. Additionally, the minimum and maximum values at each visit are plotted. Note that the patient with highest pulse at visit 1 is not necessarily the one with the highest pulse at any other visit
hinge quartile Hippocratic oath a promise to act to certain high ethical and medical standards. Traditionally it is thought of as being sworn by all doctors when they qualify but this is not actually the case histogram a graphical method for plotting a frequency distribution similar to a bar chart. Whilst a bar chart is typically used for categorical data, a histogram is more usually used for continuous data (Figure 15) historical control a control group that has not been randomised but consists of patients treated in the past. Û concurrent control

Figure 15 Histogram. Distribution of the number of years a group of 87 patients had suﬀered from eczema
historical control group a comparator group that has not been assigned by randomisation but which consists of patients treated in the past (sometimes patients who were not treated). This is a much less desirable method of making comparisons and is prone to many forms of unpredictable bias but it is a much easier source of comparisons than setting up a randomised study hold constant when analysing data and producing adjusted estimates (by analysis of covariance or some other method) it is convenient to think of the result as what would have been observed had a particular covariate taken the same value for every subject. This is sometimes described as that covariate being ‘held constant’ home visit in many studies, subjects are seen at hospital, at their own general practice or at some other kind of health centre. Particularly in community studies, a home visit is when a nurse, doctor or other health professional assesses the subject in their own home homeopath one who practises homeopathy

homeopathy a treatment regimen that involves exposing patients to trace amounts of a chemical that would, in large enough doses in healthy people, produce symptoms of the disease that is being treated homogeneous the variation of a measurement within a group being similar to the variation of that same measurement within other groups. Û heterogeneous, homoscedastic homoscedastic equal variances of data values of the same variable. For example, the variation in the measurement of a person’s weight would not be expected to vary between diﬀerent treatment centres (even though the mean might vary considerably). Û heteroscedastic, homogeneous hot deck a method of imputing for missing data based on other nonmissing data Hotelling’s T test a statistical signiﬁcance test for comparing the means of two multivariate distributions. It could be used, for example, when a subject’s ‘size’ is measured on three variables: height, weight, and head circumference. Rather than three separate t tests, the Hotelling test compares ‘size’ rather than separately comparing height, weight, and head circumference Huynh–Feldt correction an alternative to the Greenhouse–Geisser correction in repeated measurements analysis of variance hypothesis a statement for which good evidence may not exist but which is to be the subject of an experiment. A common example in clinical trials would be that ‘Drug A shows an eﬀect identical to that of placebo’. This is clearly a statement; it may be true or false; it can be tested in an appropriately designed experiment. G alternative hypothesis, null hypothesis hypothesis generating study a study that is not intended to answer speciﬁc questions but rather to produce data that can be looked at in various ways to suggest interesting questions (or hypotheses) to be researched in subsequent experiments. Û deﬁnitive study. 
Many studies may be run with the intention of answering a small number of hypotheses and to generate further ideas hypothesis test a statistical process to determine the strength of evidence in favour of, or against, a particular hypothesis. There are many types of hypothesis test for use in diﬀerent situations and for addressing diﬀerent types of question. G nonparametric test, parametric test; and, for example, chi-squared test, F test, Mann–Whitney U test, t test, P-value hypothesis testing the process of using a statistical hypothesis test to test a null hypothesis
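As one hedged illustration of the process described under hypothesis test, the sketch below tests the null hypothesis that two groups share the same mean. It uses an exact permutation test, chosen here only because it needs no distribution tables; the entry does not single out this method. The function name and the data are hypothetical.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation P-value for a difference in means.

    Every possible split of the pooled data into groups of the original
    sizes is enumerated, so this is only practical for small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Count splits at least as extreme as the one actually observed
        # (small tolerance guards against floating-point rounding).
        if diff >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


if __name__ == "__main__":
    placebo = [1, 2, 3]  # hypothetical responses
    active = [4, 5, 6]   # hypothetical responses
    print(permutation_p_value(placebo, active))
```

A small P-value is evidence against the null hypothesis. With only three subjects per group there are just 20 possible splits, so the smallest two-sided P-value that can be obtained is 2/20.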

hypothetical population a population that cannot be completely deﬁned (it would not be possible to list the names of all the individuals in that population, for example) but that can be considered to exist for practical purposes. G inﬁnite. Û ﬁnite population

I
iatrogenic describing a condition caused by the treatment given for another disease. Obvious examples are adverse reactions id subject id identiﬁcation number subject identiﬁcation number ignorable missing data data values that, despite being missing, do not introduce any bias into the analysis and results of a study. G missing completely at random, missing at random. Û informative missing data, nonignorable missing data ignorable missingness the process that produces ignorable missing data ignorant prior in Bayesian statistics, a prior distribution that gives no (or very little) information. G improper prior, reference prior imbalance lack of balance or not balanced immune not susceptible to a disease immune system those parts of the body, particularly antibodies, that help to protect or ﬁght against infection immunise to make someone immune to a particular disease. This may occur either naturally or artiﬁcially by inoculation impartial witness someone who observes an event (usually that of giving informed consent) but who has no involvement with the study improper prior in Bayesian statistics, this is a prior distribution that is not a valid probability distribution but which can still be used as if it were. In general it states that our prior belief about a parameter is that it lies somewhere between minus inﬁnity and plus inﬁnity. As this does not tell us much about the parameter, it is sometimes called an ignorant prior. G reference prior imputation the process of imputing impute to ﬁll in data values (usually missing data) with values that are thought to be sensible. There are several ways of doing this; many make valid assumptions, many make very questionable assumptions. Some methods rely on calculations based on the remaining data, some rely on intuition and guesstimates. The most common example is probably the
concept of last observation carried forward in vitro in a test tube (or similar). Û in vivo in vivo in living tissue. Û in vitro inactive control a placebo. Use of the term ‘control’ indicates that some intervention (even if only placebo) is implied. The term would not generally be used to refer to a control group that received no treatment at all incidence the number of new cases (of a disease) that occur in a speciﬁed period of time. Û prevalence incidence rate the number of new cases of a disease in a period of time, divided by the number of subjects at risk of the disease. Û prevalence rate event incident inclusion criteria the requirements that a subject must fulﬁl to be allowed to enter a study. These are usually devised to ensure that the subject has the appropriate disease and that he or she is the type of subject that the researchers wish to study. Inclusion criteria should not simply be the opposites of the exclusion criteria incomplete block a block of treatment (or treatment sequences) that does not contain all of the possible treatments (or treatment sequences) to which subjects in the study may be randomised. Û complete block incomplete block design a study that uses incomplete blocks of treatment. Although each block will necessarily be unbalanced (which may not be desirable), the study as a whole can still be balanced, as in a balanced incomplete block design incomplete crossover design a crossover design where not all subjects receive all of the possible treatments incomplete crossover study a study that is designed as an incomplete crossover design incomplete factorial design a factorial design where not all combinations of the possible treatments are used incomplete factorial study a study that is designed as an incomplete factorial design increment an increase in value (commonly the dose of a drug or the draft number of a protocol). Û decrement incubation period the time between exposure to an infection and the appearance of clinical signs. 
G sojourn period independent if knowledge of one event or variable gives us no information (or even clues) about another event then the two events are said to be independent of each other. G correlation independent contrasts two (or more) contrasts that are independent of
each other. If we were to compare the mean responses in three treatment groups (A, B and C), there are several possible contrasts that we could make. The simplest would be to compare each pair of treatments: mean(A) − mean(B), mean(A) − mean(C), and mean(B) − mean(C); however, if we know that A is greater than B, and that B is greater than C, then we immediately know that A must be greater than C. So these three contrasts are not independent of each other independent ethics committee research ethics committee independent groups groups of subjects that are independent of each other. For example, a parallel group design uses independent groups but a crossover design does not independent identically distributed (iid) a term used to describe values of a random variable that are independent of each other but which all come from the same underlying probability distribution. In a random sample of women, shoe sizes might all be independent, and all from the same distribution; if the sample contained men and women then, although the shoe sizes may all be independent, there might be two underlying distributions (larger shoes for men than women) independent random variable independent variable independent samples independent groups independent samples t test a statistical signiﬁcance test for testing the null hypothesis that the means of two populations are equal. Û paired t test independent variable another term for a covariate in a regression model. Note, confusingly, that several so-called independent variables may not be independent of each other, nor of the response variable (or dependent variable). In a regression model the response variable may depend on the independent variables but the independent variables are not dependent on the response variable.
For example, blood pressure may be partially predicted from knowing a subject’s age, height, weight, etc.: these variables would be said to be the independent variables, whilst blood pressure is the dependent variable index case a case (as in case-control study) index group all of the cases in a case-control study indexed ﬁle a term that might be obvious in keeping paper ﬁles but is more relevant in computer databases. It is a collection (a ﬁle) of data that has an index which allows direct access to the required items indication the reason for using a product or other intervention. Synonym for disease indicator variable in computing terms a variable that is a binary variable. Often a set of indicator variables may exist to describe the values of one
categorical variable. If a subject is randomised to receive one of three treatments, two indicator variables can be set up: the ﬁrst takes the value 1 (and the second 0) if the subject is randomised to Treatment A; the ﬁrst variable takes the value 0 and the second 1 if the subject is randomised to Treatment B; otherwise, both variables are set to 0, indicating that the subject must have been randomised to Treatment C indirect contact the contact of one person with another through a third party. Particularly relevant with infectious diseases, where the infection may initially be passed to someone in direct contact with the source of infection but these people may then pass the infection on further indirect cost in pharmacoeconomics a cost incurred because someone has a certain disease, but not the direct cost of treating the patient. Loss of earnings and social security payments are often considered indirect costs individual relating to a particular item (often, but not necessarily, a person) individual ethics ethical behaviour that focuses on beneﬁt to an individual rather than beneﬁt to society. Û collective ethics individual matching ﬁnding cases and controls that have similar demographic data and/or disease proﬁles. For each case, one or more similar controls is sought for comparison. Û group matching individual variation variation in measurements of individuals, rather than of groups. G within subjects variation. Û between subjects variation induce to draw a conclusion or a generalisation from speciﬁc examples of data. Û deduce induce induction inductive inference the process of drawing conclusions by induction. Û deductive inference inductive reasoning a less strong term than inductive inference inequality a statement which says that two things are not equal. Sometimes there may be suﬃcient information to know that one item or quantity is larger or smaller than another; otherwise ‘not equal’ is all that can be said inert having no (biological) action. 
Placebos are often considered as being inert infection the implantation and growth of an organism infectious a disease that can be passed on via direct contact or indirect contact with other people infer deduce inference a conclusion drawn based on data and reasoning inferential statistics the branch of statistical methods concerned with drawing conclusions from data, typically by use of statistical signiﬁcance
testing. Û descriptive statistics inﬁnite without bounds. In numerical terms, a number larger than any other can be. Û ﬁnite. G minus inﬁnity, plus inﬁnity inﬁnite population a population (which must be a hypothetical population) that contains an inﬁnite number of individuals. For the purposes of statistical methods used in clinical trials, most populations are assumed to be inﬁnite. Û ﬁnite population inﬂuence to contribute substantially to a decision or conclusion inﬂuential observation a data point that has a lot of inﬂuence on a statistical model. Some outliers can be very inﬂuential observations but this is not always the case informatics the science of handling and processing information (usually in the form of data) information a term encompassing data but rather broader. Some say that the value of analysing data is to turn it into information informative censoring censored data where the process of censoring tells us something about the state of a subject. If censoring is random then we know only that data are censored; if subjects withdraw from a study because they are too unwell to attend the clinic, or because they are free of any symptoms, then we may have censored data but in both cases there is information (negative or positive) in the censoring. G informative missing data. Û noninformative censoring informative missing data missing data where the reason that the data are missing tells us something about the state of a subject. G informative censoring. Û noninformative missing data informative prior in Bayesian statistics any form of prior distribution that is not a reference prior. G proper prior informed consent the practice of explaining to subjects and informing them about the purpose of a study and seeking their agreement to participate on a voluntary basis. G Declaration of Helsinki, ethics, research ethics committee injection a method of delivering liquid medication into the body. 
G subcutaneous, intramuscular, intravenous inlier a data value that does not seem to be true, given all the other data values, usually because it is too typical or normal. This is an odd concept but can be applicable to multivariate data. Û outlier inotropic eﬀect the eﬀect a drug has on the contraction of the heart. G chronotropic eﬀect inpatient a patient who is treated in hospital and usually stays in hospital overnight. Û outpatient

input device a device for getting data into a computer (this may simply be the keyboard or it may be a sophisticated blood analyser that feeds results directly into the computer) input variable another term for a covariate or independent variable inspection review of data and work practices by an independent reviewer (usually from a regulatory authority). G audit instantaneous rate the number of subjects who experience an event at a particular (small) point in time divided by the number who were at risk at that time. G hazard function institution in the context of clinical trials, a place where a study is undertaken; usually a hospital or similar establishment institutional review board research ethics committee integer a whole number (1, 2, 3, etc.), including negative numbers (−1000, −6, etc.) but excluding any fractions or decimal numbers (−6.75, etc.) integrity honesty (when applied to a person); correctness (when applied to data). Û fraud intelligence broadness of understanding of, and the ability to solve, problems (practical or theoretical). It is a term that can refer both to humans and other animals. Note, therefore, that it is not the same as ‘general knowledge’ intelligence test a series of questions and problems to measure intelligence. Often referred to as IQ (intelligence quotient) tests intention-to-treat a term very similar to analysis by randomised treatment. It is a strategy for analysing study data which (in its simplest form) says that any subject randomised to treatment must be included in the analysis. This is not always easy, particularly in the presence of missing data.
Û per protocol analysis intention-to-treat analysis intention-to-treat intention-to-treat population the subset of subjects recruited into a study who are included in the intention-to-treat analysis interaction the joint inﬂuence of two or more independent variables on a response variable that is not simply the sum of the individual inﬂuences interaction eﬀect the diﬀerence in the size of the eﬀect caused by two (or more) variables jointly, compared with the sum of the individual eﬀects. For example, it is known that smoking and exposure to asbestos increase the risk of bronchial cancer. However, for smokers who are exposed to asbestos the risk is substantially higher than the sum of the individual risks. There is said to be an interaction between smoking and exposure to asbestos
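The interaction eﬀect entry above can be turned into a small calculation on an additive scale. The sketch below is illustrative only: the function name is an assumption and the risk ﬁgures are invented round numbers, not real data on smoking or asbestos.

```python
def interaction_effect(baseline, with_a, with_b, with_both):
    """Joint effect of two factors minus the sum of their individual effects.

    All arguments are responses (for example cases per 1000 subjects) in
    the four combinations of exposure to factor A and factor B.
    """
    effect_a = with_a - baseline         # effect of A alone
    effect_b = with_b - baseline         # effect of B alone
    joint_effect = with_both - baseline  # effect of A and B together
    return joint_effect - (effect_a + effect_b)


if __name__ == "__main__":
    # Hypothetical cases per 1000: neither exposure, smoking only,
    # asbestos only, both exposures.
    print(interaction_effect(1, 10, 5, 50))
    # Purely additive effects give an interaction of zero.
    print(interaction_effect(1, 10, 5, 14))
```

A result of zero means the joint eﬀect is exactly the sum of the individual eﬀects; a positive result corresponds to the situation described in the entry, where the combined risk is higher than that sum.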

intercept in a regression model this is where the regression line crosses the y axis (which is the value of y when x = 0) interim part way through; before the entire (study) is completed interim analysis a formal statistical term indicating an analysis of data part way through a study, usually in the context of group sequential studies. G sequential analysis interim look a less formal term than interim analysis, used to describe a broader range of analyses of data part way through a study. These may include formal interim analyses or less formal summaries of data, without necessarily having broken the blind interim report this term may either be used informally to refer to a preliminary report (that is, not a ﬁnal report) or more formally to mean the report of an interim analysis interim result the results of an interim analysis or interim look interim review a review of data part way through a study, often to check on data quality and completeness rather than in the sense of a formal interim analysis intermediate variable a variable that does not measure exactly what we want to know but which is a second-best alternative. G surrogate internal consistency in questionnaires this is used to describe the situation where diﬀerent questions ﬁnd the same information; a simple example is to record age and date of birth. Note that both responses may be consistent with each other but also that both may be wrong. A similar usage applies to results in a study report where, again, two sets of results may be based on incorrect data and so may be wrong—but if the two results agree with each other, then the report would be said to have internal consistency. Û external consistency internal pilot study a form of pilot study where the data collected also form part of the data for the main study internal validity a statement or result that is valid, given a set of assumptions.
If those assumptions are not correct then that statement or result may not be true International Classiﬁcation of Diseases a coding system developed by the World Health Organization. Virtually every disease, illness, injury, etc. is given an alphanumeric code Internet a worldwide computerised communications network and source of information interobserver agreement the extent to which two (or more) people agree with each other when recording measurements. This can be important in multicentre studies where several investigators (possibly in several
countries) are supposed to be assessing the same quantity. It is most often referred to in the context of subjective data rather than objective data. Û intraobserver agreement interobserver disagreement the extent to which two (or more) people disagree with each other when recording measurements. More commonly referred to as interobserver agreement. Û intraobserver disagreement interobserver variation the variation that almost always exists when more than one person measures the same quantity. This variation leads to interobserver disagreement. Û intraobserver variation interpolate to calculate an unknown value between two known values. This is most often done in a linear way but more complex methods exist. The practice is often used when looking up conversion values in tables and ﬁnding that the exact value to be converted is not tabulated. The required result may be approximately determined by interpolation. Û extrapolate interquartile between the lower quartile and the upper quartile interquartile range a measure of variability. The value of the upper quartile, minus the value of the lower quartile interobserver agreement interrater agreement interrater disagreement interobserver disagreement interrater variation interobserver variation interrelate correlate intersect the point on a graph where two curves cross each other. This also includes one curve crossing the x axis or y axis ( intercept) interval the range between two data values. G class interval interval censored observation data that are censored within a time interval. Generally, in censoring it is assumed that a subject’s status is known until a particular time; thereafter it is censored. In interval censoring a subject may be seen once a week or once every three months, etc. 
and all that is known is that the subject’s data became censored some time during that interval interval data continuous data interval estimate a range of values a parameter is likely to take that reﬂect the uncertainty and variability in measurements. The most common types of interval estimate are standard errors, conﬁdence intervals and credible intervals. Û point estimate interval estimation the process of determining an interval estimate. Û point estimation interval scale continuous scale interval variable continuous variable
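The interpolate entry above mentions that interpolation is most often done in a linear way, for instance when a table does not list the exact value needed. A minimal sketch of linear interpolation follows; the function name and the tabulated numbers are hypothetical. Requests outside the two known points are rejected, because that would be extrapolation.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linearly interpolate the value at x between (x0, y0) and (x1, y1)."""
    if x0 == x1:
        raise ValueError("the two known points must have different x values")
    if not min(x0, x1) <= x <= max(x0, x1):
        raise ValueError("x lies outside the known points (extrapolation)")
    # Move along the straight line joining the two known points.
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


if __name__ == "__main__":
    # Hypothetical table: 2 maps to 10 and 3 maps to 20;
    # the value for 2.5 is not tabulated.
    print(interpolate(2.5, 2, 10, 3, 20))
```

The result is only approximate whenever the true relationship between the two tabulated points is not a straight line.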


intervene to take action, rather than to do nothing but observe intervention the action that is taken when one intervenes. In clinical trials the most common type of intervention is to give treatment (or placebo) intervention study an alternative term for a clinical trial. Û observational study interview a series of questions that are asked of a subject. Interviews may be held face to face or as telephone interviews interview study a study carried out by interviewing subjects intraclass correlation correlation between two measurements of the same variables, in the same subjects, taken at two diﬀerent times intraclass correlation coeﬃcient the statistical measure of intraclass correlation. It is denoted r, as is the more usual correlation coeﬃcient intramuscular into the muscle tissue. A method of delivering drugs by injection. Û intravenous, subcutaneous intranet a type of wide area network that resembles the Internet but does not have unlimited public access intraobserver agreement the extent to which the same person can repeatedly make the same measurement. As with interobserver agreement, this is more relevant with subjective data than with objective data. G reliability intraobserver disagreement the extent of disagreement between repeated measurements of the same quantity taken by the same person. More usually referred to in the context of intraobserver agreement intraobserver variation the variation in a person’s repeated measurements of the same quantity that results in intraobserver disagreement intrarater agreement intraobserver agreement intrarater disagreement intraobserver disagreement intrarater variation intraobserver variation intravenous into the blood stream. A method of delivering drugs by injection. Û intramuscular, subcutaneous intuitive a decision reached by use of judgement and experience rather than based on data invariant lacking variation. 
The term is most often applied to a result that is found through analysis of data when that result holds for a variety of diﬀerent methods of analysis and a variety of diﬀerent assumptions about the data. It is the result that lacks variation, not the data invasive entering into the body, for example by needle to give an injection or to take blood or by an endoscope to take a biopsy inventory a list of items, typically of study materials, paperwork, medication, etc. 90


inverse correlation this is usually used synonymously with negative correlation but more precisely means the correlation between one variable and the reciprocal of another variable inverse logarithm the reverse function of taking logarithms. For logarithms to base e, the inverse logarithm is the function e^x inverse relationship strictly, this term should be used when one variable changes in relationship to the reciprocal of another. However, it is often used when one variable increases as another (on average) decreases; this is more correctly called a negative relationship investigate to systematically observe and take measurements. Note that this does not necessarily encompass the term experiment investigation a particular variable or collection of variables that are observed investigational centre investigational site investigational device medical device investigational device exemption (IDE) an exemption similar to a clinical trial exemption certificate, issued to allow a medical device to be used in trials investigational device study medical device study investigational new drug (IND) application an application to the US Food and Drug Administration (FDA) for permission to test a new drug in humans investigational product the product (usually) that is being researched. G experimental treatment investigational site the place where the clinical work for a study takes place investigator the person who carries out the investigation. The term is very commonly used to refer to doctors who see subjects in a study and administer medication and record progress. Sometimes the investigator is not medically qualified; he or she may, for example, be a microbiologist in a study of an antibiotic investigator initiated study a study that is proposed and usually run and managed by an investigator rather than one that is proposed and managed by a pharmaceutical company.
Û sponsor initiated study investigator’s brochure a document prepared by a pharmaceutical company, for use by investigators, that summarises all the known relevant data (including safety, eﬃcacy, pharmacodynamics, pharmacokinetics, etc.) regarding an investigational product isometric graph a graph that attempts to plot three dimensional data in two dimensions (Figure 16). G contour plot, x axis, y axis, and particularly z axis 91


Figure 16 Isometric graph. A graph showing the relationship between height, weight and systolic blood pressure. In such graphs, the axis apparently going ‘into’ the page (in this case weight) is often referred to as the z axis


J J shaped curve a curve on a graph that resembles the shape of the letter J (for example exponential growth), or a reversal of it: J (exponential decay). The essential elements are that the curve is quite ﬂat and then rises steeply, or that (in reverse form) it falls quickly and is then ﬂat J shaped distribution a distribution that either has a large peak of values and then a long tail (a high degree of positive skew) or a long tail before a peak of values (high degree of negative skew) jackknife a statistical method of estimating parameters that helps to reduce bias in certain circumstances. The method calculates an estimate of the parameter of interest based on all the data except one observation; it then re-estimates the same parameter based on all the data except one other observation. The process is repeated until separate estimates have been calculated, each with the exclusion of one data point. These separate estimates are then combined jackknife estimator an estimator that uses the jackknife method joint distribution the distribution (either frequency distribution or probability distribution) of two (or more) random variables. To fully understand this distribution, we need to know the distribution of each of the variables separately and the correlation between them. G bivariate distribution, multivariate distribution. Û marginal distribution joint frequency distribution see joint distribution joint probability function see joint distribution Jonckheere–Terpstra test a nonparametric statistical signiﬁcance test for testing the null hypothesis of no trend in ordered categorical data between two or more groups journal a regularly published document containing academic research, reviews, etc. judgement use of intuition and experience instead of (or possibly in conjunction with) data
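The leave-one-out procedure described under jackknife can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, the data values are invented for the example, and the final combining step uses the usual bias-corrected form (n times the full-sample estimate minus (n - 1) times the mean of the leave-one-out estimates), which is only one common way of combining the separate estimates.

```python
def plug_in_variance(values):
    """Variance with divisor n (a biased estimator of the population variance)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def jackknife_estimate(values, estimator):
    """Bias-corrected jackknife estimate of the parameter targeted by `estimator`."""
    n = len(values)
    full_estimate = estimator(values)
    # Re-estimate the parameter n times, each time leaving out one observation.
    leave_one_out = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    mean_leave_one_out = sum(leave_one_out) / n
    # Combine the separate estimates (assumed, standard bias-corrected form).
    return n * full_estimate - (n - 1) * mean_leave_one_out


# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
biased = plug_in_variance(data)
corrected = jackknife_estimate(data, plug_in_variance)
```

For the divisor-n variance, the jackknife correction reproduces the familiar divisor-(n - 1) sample variance, which shows the bias reduction the entry refers to.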


K Kaplan–Meier curve a graph showing the cumulative probability of survival. G Kaplan–Meier estimate Kaplan–Meier estimate a nonparametric estimate of the cumulative probability of survival for a set of data (that may include censored observations) Kaplan–Meier product limit estimate Kaplan–Meier estimate kappa (κ) coefficient an index (ranging from 0 to 1) of interobserver agreement Kendall's tau (τ) a nonparametric correlation coefficient. G Spearman's rho (ρ) keystroke error pressing the wrong key on a computer keyboard. The term is most often used in assessing quality of data entry, where the number of keystroke errors may be taken as a measure of the quality of the working practice. It is one of the major reasons for doing double data entry kilobyte one thousand bytes of computer information. Desktop computers can typically store at least a million bytes (or 1000 kilobytes). G megabyte kinetics pharmacokinetics Kolmogorov–Smirnov test a nonparametric statistical significance test for testing the null hypothesis that the location parameters of two groups are equal. G Kruskal–Wallis test, independent samples t test Kruskal–Wallis test a nonparametric statistical significance test for testing the null hypothesis that the location parameters of two or more groups are equal. G Kolmogorov–Smirnov test, one way analysis of variance kurtosis a measure of how highly peaked a distribution is. Distributions that have steeper peaks than the Normal distribution are called leptokurtic; those that are flatter than the Normal distribution are called platykurtic. The Normal distribution is sometimes described as being mesokurtic (Figure 17)
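The Kaplan–Meier estimate can be illustrated with a short sketch. It assumes the usual product limit form, in which the survival probability is multiplied by (1 - deaths / number at risk) at each distinct event time; the function name, argument names and the five follow-up times are hypothetical and are not taken from the dictionary.

```python
def kaplan_meier(times, events):
    """Product limit estimate of survival.

    times  -- follow-up time for each subject
    events -- True if the event was observed at that time, False if censored
    Returns a list of (time, estimated survival probability) pairs,
    one for each distinct time at which an event was observed.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical data: five subjects, two of them censored (at times 3 and 8).
follow_up = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
km_curve = kaplan_meier(follow_up, observed)
```

Censored subjects contribute to the number at risk up to their censoring time but never cause a step down in the curve, which is how the estimate accommodates censored observations.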



Figure 17 Kurtosis. As distributions become more and more peaked they are called leptokurtic; as they become less peaked they are called platykurtic


L L'Abbé plot a type of graph useful for plotting the results of many studies to assess how consistent they are with each other (Figure 18). G meta-analysis, overview

Figure 18 L'Abbé plot. Response rates from nine studies comparing an antipsychotic with placebo in obsessive-compulsive disorder. The diagonal is the line of equality: points above the line indicate that the response to treatment was better than that to placebo whilst points below the line (just one in this example) indicate a higher placebo response than treatment response


laboratory a place where investigations and/or experiments are carried out. Traditionally, the laboratory was where experiments in chemistry or physics were done but the term is now used more broadly and may include, for example, 'computer laboratory' or 'speech laboratory' laboratory data strictly any data that come from a laboratory. However, the term is usually used to refer to biochemistry, haematology and urinalysis data lag waiting behind. The term is used in computer programming and in statistical time series methods landscape a page that is wider than it is high, as in how (most) landscape pictures would be viewed. Landscape A4 paper is 297 mm wide and 210 mm high. Û portrait large sample method asymptotic method large scale trial megatrial Lasagna's law the situation where the number of subjects eligible for a study apparently decreases when the study starts and increases again as soon as it ends. G Münch's law last observation carried forward a method sometimes used to analyse studies with missing data. Consider the situation where subjects are due to be seen at several visits (say, each month for six months), with the endpoint of the study being the six month assessment. If a subject withdraws from the study at month four, then we may use that month four data to replace the (missing) month six data. That is, we take the last actual observation and carry it forward to the end of the study. Various scenarios are illustrated in Table 7. G intention-to-treat last visit analysis last observation carried forward last visit carried forward last observation carried forward latent period sojourn period

Table 7 Individual subjects' heart rates (beats per minute) at four consecutive visits and each subject's value for the 'last observation carried forward'

Subject id   Visit 1      Visit 2     Visit 3     Visit 4      'Last observation'
             (baseline)   (4 weeks)   (8 weeks)   (12 weeks)
1            98           99          94          89           89
2            80           72          Missing     Missing      72
3            83           83          80          81           81
4            95           90          Missing     95           95
5            110          88          80          Missing      80
6            88           Missing     Missing     Missing      88
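The carrying-forward rule described under last observation carried forward can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name is hypothetical and None is used here to mark a missing visit. The heart rates are those shown in Table 7.

```python
from typing import Optional, Sequence


def last_observation_carried_forward(visits: Sequence[Optional[float]]) -> Optional[float]:
    """Return the most recent non-missing value, or None if every visit is missing."""
    for value in reversed(visits):
        if value is not None:
            return value
    return None


# Heart rates (beats per minute) from Table 7; None marks a missing visit.
heart_rates = {
    1: [98, 99, 94, 89],
    2: [80, 72, None, None],
    3: [83, 83, 80, 81],
    4: [95, 90, None, 95],
    5: [110, 88, 80, None],
    6: [88, None, None, None],
}

locf = {
    subject: last_observation_carried_forward(values)
    for subject, values in heart_rates.items()
}
```

Note that subject 4 keeps the observed visit 4 value even though visit 3 is missing, and subject 6 falls back to the baseline value because no later observation exists.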



Latin square an experimental design that balances for two sources of variation. In clinical trials, the two sources are usually subject and time. The example in Table 8 shows how four treatments (A, B, C, and D) could be compared in four subjects in four time periods. The essential feature is that every treatment appears only once in every row (each subject) and once in every column (each time period). G crossover study, Youden square

Table 8 Latin square showing the sequence of four treatments (A, B, C and D) for four subjects in four periods

             Period
             1    2    3    4
Subject 1    A    B    C    D
Subject 2    B    A    D    C
Subject 3    C    D    B    A
Subject 4    D    C    A    B
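The essential feature of a Latin square (each treatment once in every row and once in every column) is easy to check mechanically. The sketch below is illustrative only; the function name is hypothetical and the layout is the one shown in Table 8, with one inner list per subject and one position per period.

```python
def is_latin_square(square):
    """True if every symbol appears exactly once in each row and each column."""
    n = len(square)
    if n == 0:
        return False
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    columns_ok = all(set(column) == symbols for column in zip(*square))
    return rows_ok and columns_ok


# Table 8: rows are subjects 1 to 4, columns are periods 1 to 4.
table_8 = [
    ["A", "B", "C", "D"],
    ["B", "A", "D", "C"],
    ["C", "D", "B", "A"],
    ["D", "C", "A", "B"],
]

table_8_is_latin = is_latin_square(table_8)
```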

law a set of rules. A relationship between a set of events and an outcome law of averages an informal term that reﬂects the fact that probability distributions exist and in particular reﬂects the belief that any particular outcome will eventually be observed if enough data are collected. It is distinctly diﬀerent from the law of large numbers (or central limit theorem) law of diminishing returns eighty–twenty rule law of large numbers central limit theorem lay person someone who is not speciﬁcally trained in the subject being discussed but is nevertheless involved in discussing it. Research ethics committees will include lay members lead time bias a term often used in assessing survival times when the method of detecting cases improves with time. Patients apparently survive longer than they used to but this is not due to better treatment; rather it is due to earlier diagnosis. This could be an important problem in evaluating a screening programme. For example, even with no change in clinical practice, because cases may be detected earlier than without screening, the survival time from diagnosis will increase because diagnosis is occurring earlier in the life cycle of the disease run in period lead-in period learning curve a graph (rarely plotted, but frequently imagined) that plots time on the x axis and ability in a particular subject on the y axis. Sometimes such curves are J shaped curves, starting very ﬂat and then rising steeply (suggesting it takes a long time before you can do anything, but then it all becomes clear); sometimes they start very 98


steeply and then ﬂatten oﬀ (suggesting it is easy to get started but learning the last few techniques becomes more and more diﬃcult) least signiﬁcance diﬀerence test Tukey’s least signiﬁcant diﬀerence test least squares a method of estimating parameters from data. It is based on choosing the value for that parameter that minimises the squared distance of each of the data values from the estimate of the parameter. Û maximum likelihood least squares estimate an estimate of a parameter obtained by the method of least squares. Û maximum likelihood estimate least squares mean the estimated mean of a variable obtained from an analysis of variance model or analysis of covariance model. It is the adjusted mean after adjusting for any other factors and covariates in the model least squares method any statistical method based on the principle of least squares. Û maximum likelihood method left censored when measuring when an event occurs, the events that occurred before the study follow-up period (and so were not observed) are left censored. Û right censored left censored data when the time of an event is known but the instant of exposure may be known only to be before a given time and the exact time is not known. Left censored data are much less common than right censored data. G censored data left censored observation left censored data negative skew left skew left tail the values in a distribution that are small (typically taken as meaning less than the mode) legal guardian someone who either permanently or temporarily is legally responsible for someone else’s health and well being. Û next of kin lethal will kill or extinguish life. Û fatal lethal dose the dose of a drug that will kill an individual lethal median dose (LD50) the dose of a drug that will kill half of the subjects exposed to it level of a factor one of the diﬀerent values that a factor (a categorical variable) can take. 
For example, the factor gender usually has two levels: male and female level of blinding whether a study is open label, single blind, double blind, triple blind, etc. level of measurement the degree of detail with which measurements are recorded. In general, the diﬀerent levels (in descending order of detail) are continuous data, ordinal data, categorical data and binary data 99


Table 9 Example of a life table. In this instance, the radix (which is merely a baseline number taken for convenience) is 10000

Age (years)   Survivors   Deaths between   Probability of dying
x             at age x    x and x + 1      between x and x + 1
0             10000       2325             0.02325
1             7675        957              0.01247
2             6718        471              0.00701
3             6247        260              0.00416
4             5987        155              0.00259
5             :           :                :
:             :           :                :
:             :           :                :
50            2971        77               0.00259
:             :           :                :
:             :           :                :
90            119         27               0.0227
100           2           0                —

level of signiﬁcance in statistical signiﬁcance tests this is the P-value (strictly speaking, speciﬁed before the calculations are carried out) that will be needed in order to declare a result as statistically signiﬁcant. The most common cutoﬀ value is 0.05 but 0.01, 0.001, etc., may also be used. Note that the level of signiﬁcance is not the calculated (or observed) P-value life expectancy the length of time that an individual (or group of individuals) is expected to live life table a tabulation used to summarise life expectancy and probabilities of survival or death at diﬀerent ages (or at diﬀerent times after exposure to an intervention). An example of a life table is shown in Table 9. G survival analysis life table analysis methods used to analyse life tables, and particularly to compare survival curves between diﬀerent groups of individuals and to assess the importance of prognostic factors on the length of survival. One of the most common methods is Cox’s proportional hazards model life table method life table analysis lifetime the time between birth and death lifetime prevalence the prevalence of a particular event when the period within which the prevalence is measured is a person’s entire lifetime. G period prevalence 100


likelihood the probability of a set of observed data values, assuming a particular hypothesis (which is generally that they come from a particular probability distribution with specified parameters). Note that this is not the same as the probability of a given probability distribution, given a set of data. A variety of statistical procedures for significance testing and estimation are based on methods that use likelihood likelihood function likelihood likelihood principle methods of estimating parameters and significance testing, based on likelihood functions likelihood ratio the ratio of the likelihoods of two different hypotheses based on the same set of data likelihood ratio test statistic a general form of statistical significance test based on the likelihood ratio. Simplistically, the hypothesis with the greater likelihood is more likely (sic) to be correct Likert scale an ordinal scale where scores are assigned to the different categories in the style of (for example) 1 = condition worse, 2 = no change, 3 = slight improvement, 4 = marked improvement, 5 = condition cleared limit asymptote line extension an addition to a range of products or the range of uses of a product. This may include alternative forms of presentation or new indications for use linear in a straight line. G curve linear combination a combination of values that is gained by simple addition and subtraction of multiples of those values. It does not involve any multiplication or other nonlinear functions of the values. For example, x + y is a linear combination of x and y but x × y and x^y are not linear combinations linear correlation correlation linear estimator an estimator that involves only linear combinations of data values linear kinetics describes the pharmacokinetics of a product when the rates of absorption, distribution and elimination are each proportional to the dose of drug linear model a statistical model (such as a regression model) that only has a linear combination of parameters linear regression a regression model that is a linear model linear transformation linear combination linear trend a steadily increasing (or decreasing) response when a covariate increases (or decreases). The trend is linear if, for a fixed change in the covariate, there is a fixed size change in response (Figure 19). G dose response relationship

Figure 19 Linear trend. Pupil diameter after administration of a new test compound. One subject had five measurements taken one hour after receiving each of four separate doses of drug. Within the dose range used, the effect on pupil size seems to be linear

link function a transformation of data values used to try to make a nonlinear curve be a linear one. Used extensively in generalised linear models linkage record linkage literature review a review of published studies and data relating to a particular topic. It is often the starting point for a new piece of research (to review the current and recent publications to find out what is known about a subject). It is also one of the first activities carried out in meta-analysis and overviews ln logₑ loading dose a high dose of a drug that is initially given to quickly achieve a required therapeutic level. Thereafter, smaller doses (maintenance


doses) are often sufficient to keep the amount of drug in the body within the therapeutic range local area network a set of computers linked to each other to allow sharing of data and documents. The term ‘local’ is relative but tends to mean restricted to one site or building within a site. Û wide area network. G intranet local laboratory a laboratory that is geographically close to where subjects are being investigated. Û central laboratory local research ethics committee a research ethics committee that assesses studies to be carried out in local areas, typically with few centres. Û multicentre research ethics committee local server server location a nonspecific term to describe the central tendency in a set of data location parameter the parameter used to describe location for any particular set of data. The most common location parameters are the mean, the median and the mode lods log odds log a systematic record of activities and actions. Also an abbreviation for logarithm log odds logₑ of the odds of an event occurring log odds ratio logₑ of the odds ratio. Many calculations concerning odds ratios are, in fact, carried out on the logarithm of the odds ratio and then transformed back to the odds ratio scale log rank test a statistical significance test for comparing the survival times of different groups of subjects log transformation the transformation of data values that is made by taking the logarithm of those data log₁₀ abbreviation for logarithm in base 10 units. G logₑ logarithm a mathematical function; the opposite function to the exponential logarithmic transformation log transformation logₑ abbreviation for logarithm in base e (e is a natural constant, approximately equal to 2.718). G log₁₀ logistic curve a curve that is the graph of the logistic function (Figure 20) logistic function a transformation of binary data that is used in logistic regression. Where the proportion of responses is denoted p, the transformation is y = logₑ{p/(1 − p)} logistic regression regression where the response variable is binary and a logistic transformation has been used to help facilitate the mathematics in the statistical model. It is one form of generalised linear model logistic transformation logistic function


Figure 20 Logistic curve. The logistic function is defined for proportions (p) between 0 and 1. When p = 0, the logistic function equals minus infinity; when p = 1, the logistic function equals plus infinity
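The logistic function described above, the natural logarithm of p/(1 - p), and its inverse can be sketched as follows. The function names logit and inverse_logit are chosen for this illustration, and returning infinities at p = 0 and p = 1 is an assumption made to mirror the limits noted in the caption to Figure 20.

```python
import math


def logit(p: float) -> float:
    """Logistic transformation y = log_e(p / (1 - p)) for a proportion p in [0, 1]."""
    if p < 0.0 or p > 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p == 0.0:
        return -math.inf
    if p == 1.0:
        return math.inf
    return math.log(p / (1.0 - p))


def inverse_logit(y: float) -> float:
    """Back-transformation from the logit scale to a proportion."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)


y_half = logit(0.5)
round_trip = inverse_logit(logit(0.2))
```

A proportion of 0.5 maps to zero on the logit scale, and proportions either side of 0.5 map to values of equal size and opposite sign.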

logistic function logit logit model logistic regression log-linear model a statistical model for analysing data that are in the form of a count of the number of observations that fall into each cell of a contingency table. It is one form of generalised linear model lognormal distribution the probability distribution of a variable such that the logarithm of that variable follows a Normal distribution long term follow-up usually restricted to observations on subjects after some intervention has taken place. The subjects may, or may not, be given medication during this time. ‘Long term’ is obviously open to interpretation but is generally considered to be at least six months. G acute phase longitudinal followed across time. Û cross-sectional. G cohort longitudinal analysis the analysis of longitudinal data, usually with the 104


speciﬁc intention of analysing changes with time. G growth curve. Û cross-sectional analysis longitudinal data data that are repeatedly collected on the same subject across time. G repeated measurements. Û cross-sectional longitudinal study a study that observes and measures the same subjects over a period of time. Û cross-sectional study loss any negative eﬀects of an intervention. A loss may be measured in cash, in years of life, in excess pain, etc. Sometimes a negative loss is referred to, meaning a gain loss function a function that combines several measures of loss (or possibly gain) to arrive at an overall ﬁgure for loss. The term ‘loss function’ is generally used when there is expected to be an overall loss (in the true negative sense); the term utility function is synonymous but tends to be used when there is expected to be a net gain (or negative loss) loss to follow-up the case when a subject is lost to follow-up lost to follow-up a subject who supplies some data for a study but for whom after a certain time no more data are available. The term usually also implies that there is no known reason why the subject supplies no more data. G censored observation, missing data lotion a liquid used as a vehicle for delivering topical treatments. G cream, gel, ointment lower quartile the 25th centile. G upper quartile, median


M main effect in factorial studies, the main effect of one factor is the size of the effect averaged over all levels of all other factors. Û interaction effect main study a term meaning study but useful to distinguish from pilot study mainframe mainframe computer mainframe computer a large computer. As technology progresses, the processing power and storage of small desktop computers is making the need for mainframe computers less and less. G microcomputer, minicomputer, supercomputer maintenance dose the amount of drug that needs to be given to keep within the required therapeutic range. Û loading dose majority more than 50% (but not necessarily the mode). Û minority Mann–Whitney U test a nonparametric significance test for testing the null hypothesis that the location parameter (usually the median) is the same in each of two groups. G independent samples t test Mantel–Haenszel estimate a method of estimating an odds ratio from a stratified sample Mantel–Haenszel test a statistical significance test for testing the null hypothesis that the Mantel–Haenszel estimate of the odds ratio is equal to one manual a set of instructions on how to use a machine or carry out a procedure manuscript a written document sent to a publisher to be published (or to be considered for publication) margin the edge. In multivariate data, each of the individual variables are sometimes referred to as the margins. See, for example, marginal distribution margin of error accuracy margin of safety safety margin marginal see marginal distribution, marginal mean marginal cost per unit cost marginal distribution in multivariate data, the distribution of each of the


variables, regardless of the other variables. Table 10 shows an example of bivariate data. Û conditional distribution, joint distribution

Table 10 Joint distribution and marginal distributions of patients' and investigators' assessment of severity of disease in 202 patients. The marginal distributions are the row and column 'total' columns

                          Scaliness
Redness        Absent   Mild   Moderate   Severe   Total
Absent              1      1          0        0       2
Slight             25     12          3        0      40
Moderate           24     86         25        2     137
Severe              2     10          8        1      21
Very severe         0      1          1        0       2
Total              52    110         37        3     202
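The relationship between a joint distribution and its marginal distributions can be shown by recomputing the totals in Table 10 from the cell counts. This is a sketch only: it assumes that the rows are the five redness categories (absent to very severe) and the columns are the four scaliness categories (absent to severe), and the variable names are illustrative.

```python
# Cell counts from Table 10 (assumed layout: rows = redness, columns = scaliness).
joint_counts = [
    [1, 1, 0, 0],     # redness absent
    [25, 12, 3, 0],   # redness slight
    [24, 86, 25, 2],  # redness moderate
    [2, 10, 8, 1],    # redness severe
    [0, 1, 1, 0],     # redness very severe
]

# Marginal distribution of the row variable: sum across each row.
row_marginal = [sum(row) for row in joint_counts]

# Marginal distribution of the column variable: sum down each column.
column_marginal = [sum(column) for column in zip(*joint_counts)]

grand_total = sum(row_marginal)
```

Each marginal distribution ignores the other variable entirely, which is why the same two sets of totals could arise from many different joint distributions.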

marginal eﬀect a loose term used to describe an eﬀect that is quite small and that may not be real. G marginally signiﬁcant marginal mean the mean of a marginal distribution marginally signiﬁcant a loose term used to imply that a calculated P-value is very close to some arbitrary criterion for being called statistically signiﬁcant. P-values of about 0.07 to 0.04 are often described as being marginally signiﬁcant. G marginal eﬀect marker surrogate marketing authorisation the authorisation given by a regulatory authority to a pharmaceutical company to market a product mask blind match to identify two (or more) subjects as having similar demographic data and/or disease severity (and other characteristics) such that one can serve as a control for another matched control a subject who has not been exposed to the intervention under study but who has demographic data and other exposure characteristics similar to those of one that has been exposed and who will be compared with that subject. G case-control study matched design a study where matched pairs are used matched pair two subjects who will be compared with each other or two measurements on the same subject that will be compared matched pairs t test paired t test matched study a study that is designed as a matched design matched subjects see matched pair 107


Table 11 Simple matrix of demographic data

Subject identification number   Age (years)   Gender   Race
1                               37            Male     British
2                               42            Male     British
3                               18            Female   British
4                               77            Male     Indian
5                               45            Male     British
6                               51            Female   American
7                               52            Female   British

mathematical model see model mathematics the science dealing with numbers, their uses and manipulation. G statistics matrix a rectangular (not necessarily square) array of mathematical elements. It could include numbers, regression coeﬃcients, parameter estimates, etc. or simple raw data as shown in Table 11. Strictly it should have at least two rows and at least two columns: if it has only one row or only one column, it is called a vector maximum the largest of a set of values. Û minimum maximum likelihood the largest value of the likelihood function. G maximum likelihood method maximum likelihood estimate the estimate of a parameter that is obtained by the maximum likelihood method maximum likelihood method a method of estimating parameters; the most likely value for the parameter (the best estimate) is the one that has the maximum likelihood. Û least squares method maximum tolerable dose the maximum dose of a drug that a subject can take before inducing unacceptable adverse reactions McNemar’s test a statistical signiﬁcance test for testing the null hypothesis of no change in the proportion of subjects experiencing an event when each subject is assessed twice (under diﬀerent conditions) and the data are in the form of matched pairs mean the sum of a set of numbers, divided by the number in the sample. A more formal term for the average mean absolute deviation average absolute deviation mean square the mean of a sum of squares mean square error the variance of an estimator. (Note that if the 108


estimator is biased, then the mean square error is the sum of the variance of the estimator plus the square of the size of the bias) meaningful diﬀerence clinically signiﬁcant diﬀerence measure to determine the size or extent of a variable of interest measured value observed value measurement the assessment and recording of a data value. This does not have to be restricted to objective data; the term is also used with reference to subjective data measurement bias a bias caused by the process of taking measurements. Examples include digit preference or measuring only values of a variable that fall within the capacity of the measuring instrument. G Hawthorne eﬀect measurement error an error made in measuring the value of a variable. The error may be because of lack of care in the measurement process or because of diﬃculty or judgement needed to measure the variable. Blood pressure, for example, is prone to measurement error, as are most types of subjective data measurement scale the type of scale that is used to measure a variable. Examples include ordinal scale, continuous scale, categorical scale, etc. MedDRA a dictionary of adverse event terms. MedDRA stands for Medical Dictionary for Drug Regulatory Aﬀairs. G COSTART, WHO-ART median the 50th centile. When a set of numbers is sorted into ascending order, there are as many values greater than the median as there are values smaller than the median. G lower quartile, upper quartile median dose the dose of a drug that is estimated to produce a response in 50% of subjects median life expectancy the length of time that 50% of subjects are expected to live medical relating to medicine. Û clinical medical device a physical device used for medical treatment, such as a prosthesis or a heart pacemaker medical device study a study of the eﬃcacy and/or safety of a medical device. 
This can encompass the comparison of more than one device or the comparison of a device and a pharmaceutical product medical ethics the branch of ethics that considers medicine, medical practice, medical care, etc. G Declaration of Helsinki medical history the course of the health (including ill health) of a patient over time. This information can be used to help determine a diagnosis and predict a prognosis 109
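The note under mean square error above (mean square error equals variance plus squared bias when an estimator is biased) can be checked numerically. The following is a minimal sketch, not part of the original dictionary: the function name `mse_decomposition` and the four estimates are hypothetical, and the variance is taken in its population form (dividing by n) so that the identity holds exactly.

```python
from statistics import fmean


def mse_decomposition(estimates, true_value):
    """Return (mse, variance, bias) for repeated estimates of a known true value.

    Hypothetical helper for illustration only.
    """
    mean_estimate = fmean(estimates)
    # Bias: how far the estimates are, on average, from the true value
    bias = mean_estimate - true_value
    # Variance of the estimates about their own mean (dividing by n)
    variance = fmean([(e - mean_estimate) ** 2 for e in estimates])
    # Mean square error: mean squared distance from the true value
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return mse, variance, bias


# Hypothetical repeated estimates of a quantity whose true value is 10
mse, variance, bias = mse_decomposition([9.0, 11.0, 12.0, 12.0], true_value=10.0)
print(mse, variance, bias)  # 2.5 1.5 1.0
```

Here the variance (1.5) plus the squared bias (1.0) gives the mean square error (2.5); with an unbiased estimator the bias term vanishes and the mean square error is simply the variance.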


medical judgement a judgement (about diagnosis, treatment, prognosis, etc.) made by a physician
medical record the notes and documents that describe a subject’s medical history
medical study a study of the efficacy and/or safety of one or more medicines. It is a more specific term than clinical trial
medical treatment treatment administered to a patient. The type of treatment can be very broad but generally excludes surgical treatment
medical trial see clinical trial
medically important difference see clinically significant difference
medicine the science and practice of prevention, diagnosis and treatment of disease
megabyte a unit of space for storing information on a computer. Equivalent to one million bytes. See also kilobyte
megatrial a very large trial. Usually considered to include several thousand subjects
meta-analysis an analysis of the summary results from two or more similar studies. (Strictly, analyses of analyses; see also metadata.) Such methods are becoming more common and are used as a way of synthesising data from a variety of studies to try to get better answers to specific medical questions. Compare with overview
metabolise to change (when a drug changes in the body). See also pharmacokinetics
metabolism the set of changes that happen to a chemical (a drug) in the body. See also pharmacokinetics
metadata data about data. For example, a manufacturer’s data regarding accuracy of a peak flow meter might be considered as metadata
method a way of carrying out a procedure. The term applies equally to methods of treating patients, methods of measuring variables, methods of analysing data, etc.
methodologist one who studies and is an expert in methods. The term is usually used to distinguish between applied research and theoretical research
methodology a set of methods
me-too a term used to describe a product for which a very similar alternative already exists
metric any measurement scale, but particularly one referring to metric data
metric data data measured in the SI system of units, which includes grams and metres. Also sometimes used to refer to continuous data
metric scale see continuous scale


metric variable see continuous variable
microcomputer a computer that is usually small enough to fit on a desk, in a briefcase, or even in a pocket. See also minicomputer, mainframe computer, supercomputer
microprocessor the processing unit that forms the basis of a computer
mid P-value an adjustment made to the calculation of P-values when working with ordinal data. With continuous data, the probability of observing any particular value is considered to be zero; so the probability that x is greater than y [Prob(x > y)] is the same as the probability that x is greater than or equal to y [Prob(x ≥ y)]. However, since with ordinal data any particular value can have a nonzero probability, the mid P-value is defined as Prob(x > y) + ½ Prob(x = y)
midpoint the middle of a class interval. It is simply the mean of the lower class limit and the upper class limit; it is not the median within the class interval
mid-quartile the mean of the lower quartile and upper quartile. Compare with median
mid-range the mean of the minimum value and the maximum value
mid-spread see interquartile range
minicomputer a small computer that is larger and more powerful than a microcomputer but not as large or powerful as a mainframe computer or supercomputer
minimax rule a rule that calculates the maximum value of a parameter under different circumstances (often the maximum cost under different circumstances) and then chooses as ‘best’ the set of circumstances with the minimal cost. It is the minimum of all the possible maxima
minimisation a pseudorandom method of assigning treatments to subjects to try to balance the distribution of covariates across the treatment groups. See also randomisation, stratified randomisation
minimum the smallest of a set of values. Compare with maximum
minority less than 50%. Compare with majority
minus infinity a number smaller than any other number can be. Compare with infinite, plus infinity
misclassification with categorical data, misclassification is any form of measurement error that ultimately means that a subject is recorded as being in the wrong category. Examples include gross errors such as recording a subject as being male instead of female, or lesser errors such as recording ‘partial’ improvement of symptoms instead of ‘moderate’ improvement
misconduct see fraud
missing at random missing data where the probability of data missing may depend on the values of some other measured data but does not depend on the missing values themselves. Compare with missing completely at random
missing completely at random missing data, where the probability of data missing is independent of any observed or unobserved data. This is not very common since subjects may often withdraw from studies because their disease is completely cured or may default because their disease is extremely severe. Compare with missing at random
missing data a data value that should have been recorded but, for some reason, was not
missing value see missing data
mixed effects model a statistical model that contains a mixture of different types of parameters. Specifically, it is one that contains both fixed effects and random effects
mixed model see mixed effects model
mock report see ghost report
mock table see ghost table
modal relating to the mode
modal class in data measured in categories, the most frequently occurring class. See also mode
modality the property of having a mode
mode the most frequently occurring value. Used as a measure of location. Compare with mean, median
model an idealistic description of a real (often uncertain) situation. Models may take the form of physical imitations of medical devices, through to mathematical models that are equations or functions describing how a process behaves and on to statistical models that contain both deterministic elements (like mathematical models) and random elements. Statistical models are often thought of as being like regression models, logistic regression, log-linear models, etc. In fact, simple t tests are also models, just of a much simpler form. Models can be expressed in words: the model that an independent samples t test assumes is that the distribution of a variable is identical in each of two groups, except for a shift in location. Such models can also be expressed algebraically as y_i = α + βx_i + ε_i
model equation the equation for a model
modified Fibonacci series a modification to a standard Fibonacci series
moment a series of statistics describing a probability distribution. The first moment is the mean; the second moment (often referred to as the ‘second moment about the mean’) is the mean of the squared distances of each value from the mean; the third moment is the mean of the cubed distances of each value from the mean, etc.
monitor one who visits investigators to help with study management, ensure that all data are being recorded as they should be and that all supplies (drugs, materials, etc.) are available on site, and who often returns completed case record forms to the data management office. Also a term used for one of the output devices (the screen) of a computer
monitoring committee see data and safety monitoring committee
monitoring report a report (usually written) to describe the activities of a monitor at a study site and any positive or negative findings, any issues that need bringing to the attention of others, etc.
monotherapy a single drug. Compare with combination drug
monotonically decreasing repeated measurements that only remain constant or decrease; they never increase. Compare with monotonically increasing
monotonically increasing repeated measurements that only remain constant or increase; they never decrease. Compare with monotonically decreasing
Monte Carlo method a method to solve a problem by simulation
Monte Carlo simulation either a single simulation (as in a Monte Carlo trial) or a complete set of simulations forming a Monte Carlo method. All Monte Carlo methods are simulations
Monte Carlo trial one (usually of many thousands) of the simulations in a Monte Carlo simulation
morbid prone to disease. Compare with mortal
morbid event an event associated with illness
morbidity relating to ill health. Compare with mortality
morbidity curve a graph of the cumulative occurrence of morbidity with time
morbidity rate the proportion of subjects with a morbid event at any given point in time
mortal prone to death. Compare with morbid
mortality relating to death. Compare with morbidity
mortality curve a graph of the cumulative occurrence of death with time. Compare with survival curve
mortality rate the proportion of subjects who have died at any given point in time
most powerful test see uniformly most powerful test
moving average a term used most often with time series data. It involves calculating the mean (or average) of observations 1 and 2 (for example); then the mean of observations 2 and 3; then of observations 3 and 4, and so on
multicentre involving more than one study centre
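The moving average entry above describes averaging successive overlapping pairs of observations. A minimal sketch of that calculation follows; the function name `moving_average`, the `window` parameter and the example series are illustrative assumptions, with the window of 2 taken from the entry's own example.

```python
def moving_average(values, window=2):
    """Return the means of successive overlapping runs of `window` observations."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of observations")
    # Slide the window along the series one observation at a time
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


series = [4, 6, 5, 9]  # hypothetical time series
print(moving_average(series))  # [5.0, 5.5, 7.0]
```

A series of n observations gives n - window + 1 averages; a wider window smooths the series more but shortens the result.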


multicentre research ethics committee a research ethics committee that assesses studies that are planned to take place in many study centres. Compare with local research ethics committee
multicentre study a study carried out at more than one study centre
multidisciplinary involving more than one scientific discipline (or speciality). This may include more than one medical discipline (such as oncology and gastroenterology) but also can include other disciplines such as biostatistics (for study design and analysis), mechanical engineering (if prostheses or other medical devices are being used), etc.
multidisciplinary study a study that involves more than one scientific discipline for its design, execution, analysis, and reporting
multilevel model a model that has a hierarchy to its parameters. For example, a study may be conducted in several countries (level 1); with several investigators (level 2) in each country; with many subjects (level 3) recruited by each investigator; and each subject observed on several occasions (level 4). See also mixed effects model
multimodal having more than one mode
multimodal distribution a distribution that has more than one peak (or ‘local maxima’). Note that the mode is the most frequently occurring value so the term multimodal is a tautology; hence more than one ‘peak’ is used in this context
multinomial data see categorical data
multiperiod crossover design a crossover study with more than two study periods
multiperiod crossover study a study designed as a multiperiod crossover design
multiple comparison method any statistical method for making multiple comparisons
multiple comparison test any form of statistical significance test for making multiple comparisons
multiple comparisons more than one comparison (usually in the form of statistical significance tests) within a single study. The comparisons may be between more than two treatments, or between two treatments but with more than one response variable, or a mixture of both of these situations
multiple correlation coefficient (R²) the correlation in a multiple regression model. See also correlation coefficient
multiple dose design see repeated dose design
multiple dose study see repeated dose study
multiple endpoints more than one endpoint in a study. See also multiple comparisons, multiple outcomes
multiple imputation a method of imputing several randomly different values for a missing value. The method may have no impact on any point estimate over and above that of simple imputation but it does better reflect variability of the missing value. See also last observation carried forward
multiple linear regression see multiple regression
multiple linear regression model see multiple regression model
multiple logistic regression logistic regression with more than one covariate
multiple logistic regression model a statistical model resulting from multiple logistic regression
multiple looks more than one analysis of accumulating data. See also group sequential study
multiple outcomes a study having more than one outcome variable. See also multiple endpoints, multiple comparisons
multiple regression linear regression with more than one covariate
multiple regression model a statistical regression model resulting from multiple regression
multiple significance tests the use of more than one statistical significance test in one study
multiplicative model a statistical model where the combined effect of separate variables contributes as the product of each of their separate effects. Compare with additive model, linear model. See also interaction
multiplicity see multiple comparisons
multistage design a study that has more than one stage (or period), possibly including a run in period, a treatment period and a follow-up period
multi-univariate more than one univariate response variable where the interest lies with each variable in its own right, rather than a multivariate combination of them
multivariate relating to more than one variable (usually more than one response variable). Compare with univariate. See also bivariate
multivariate analysis special methods of analysis suitable for multivariate data
multivariate data measurements that consist of more than one variable. For example, a person’s ‘size’ could be measured jointly by their femur length, tibia length and skull circumference. More than two variables are always referred to as multivariate: two variables, whilst still multivariate, are often referred to as bivariate
multivariate distribution the joint distribution of more than one variable. Compare with univariate distribution. See also bivariate distribution
Münch’s law a pessimistic rule which suggests that the number of subjects expected to be available to enter a study should usually be divided by a factor of at least ten. See also Lasagna’s law
mutually exclusive not able to occur at the same time
mutually exclusive events two or more events that are not able to occur at the same time. This is not restricted to situations where events have (by chance) not been observed to occur at the same time, but events that are not capable of jointly occurring. An example would be that a subject is male and is pregnant


N

named patient use a way of allowing a doctor to offer an unlicensed product to a patient outside of a clinical trial. This is often allowed in treatment of life threatening diseases where no alternative treatment is available. There are strict guidelines under which such a supply may be offered. See also compassionate use
natural experiment a term used to describe a (usually major) event (usually some form of disaster). The resulting change in environment and its impact can be studied. It is not a true experiment as the intervention is not under our control. Examples include floods and chemical leaks
natural history the course of events over time. The term can be used on a massive scale to describe geological and climatic changes or to describe how an illness in an individual has developed and is likely to develop over time. Compare with medical history
natural logarithm a logarithm to base e. See logₑ
necessary and sufficient a term often used in a mathematical context but applicable elsewhere. It describes a set of circumstances that are required (‘necessary’) but also where no other circumstances are simultaneously required (‘sufficient’). For life to exist, oxygen must be present; but that is not all. Thus, oxygen is necessary, but not sufficient, for life to exist
negative control see inactive control, placebo control
negative correlation correlation between two variables such that as one variable increases the other tends to decrease. Compare with positive correlation. See also inverse correlation
negative effect an effect that is undesirable. See also adverse reaction
negative gain a loss. Compare with negative loss
negative loss a gain. The term is used when referring to several items that generally incur a loss (possibly a financial loss). To avoid switching between losses and gains, the term negative loss is sometimes used. Compare with negative gain


negative predictive value in a diagnostic test, the probability that a person with a negative result does not have the disease (a correct result). See also positive predictive value, sensitivity, specificity
negative relationship an informal term for negative correlation. Compare with positive relationship
negative response a poor response, or no response, to treatment. Compare with positive response
negative result a result less than zero. The result of a negative study. Compare with positive result
negative skew describes a distribution that has a long left hand tail so that the majority of observations are at the upper end of the scale. Compare with positive skew
negative study a study that fails to reject the null hypothesis or otherwise fails to fulfil its objectives. Compare with positive study
negative treatment effect an undesirable treatment effect. See also adverse reaction. Compare with positive treatment effect
nested design an experimental design where some factors occur only as subsets of other factors. See also multilevel model
nested factor a factor that occurs in an experiment only as a subset of another factor
net change any change after removing the effect in a control group. For example, if we calculate the change in blood pressure in a group of treated patients and in another group of untreated patients, the net change in the treated group would be their change minus any change observed in the untreated group. See also treatment effect
net difference see net change
net effect any effect after removing some baseline or control effect. See also net change
net treatment effect see net change, treatment effect
network a system of communication between computers or people to share information
new chemical entity (NCE) a new chemical that is being developed as a potential new drug
new drug application (NDA) an application to the Food and Drug Administration in the USA for a licence to market a new chemical entity
Newman–Keuls test a multiple comparison test for testing the null hypothesis of no difference between the means of more than two groups
next of kin a person’s nearest relation (through either blood or marriage). Compare with legal guardian
nil effect no effect, or zero effect
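The negative predictive value entry above can be illustrated with counts from a diagnostic test. This is a minimal sketch under stated assumptions: the function name and the counts are hypothetical, and the probability is estimated as the proportion of negative results that are true negatives.

```python
def negative_predictive_value(true_negatives, false_negatives):
    """Estimate the probability that a person with a negative result is disease free."""
    total_negative_results = true_negatives + false_negatives
    if total_negative_results == 0:
        raise ValueError("no negative test results to evaluate")
    # Proportion of all negative results that are correct
    return true_negatives / total_negative_results


# Hypothetical counts: 100 negative results, of which 90 are in disease-free people
npv = negative_predictive_value(true_negatives=90, false_negatives=10)
print(npv)  # 0.9
```

Unlike sensitivity and specificity, this proportion depends on how common the disease is in the group tested, so a value estimated in one population may not carry over to another.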


no cause audit an audit carried out as a matter of routine or because a study or a site has been selected at random for audit. Compare with for cause audit
node on a decision tree (Figure 6), any point at which a choice of routes can be made
n-of-1 study a study carried out in a single patient to determine the best treatment for that patient (which may not necessarily be the best treatment for patients in general)
noise unwanted variation in data. See also signal to noise ratio
noisy data data that have a lot of noise, or a high variance
nomenclature the terminology (symbols and special language) used in any science or discipline
nominal data see categorical data
nominal scale a categorical scale whose possible values are simply in the forms of names: country of origin, concomitant medications, etc.
nominal variable a variable measured on a nominal scale
nomogram a type of graph used to depict the relationship between (usually) three variables (Figure 21)

Figure 21 Nomogram. For values of height and weight, Quetelet’s index (body mass index) can be read off


noncentral distribution a variation of the more standard probability distributions (t distribution, F distribution, etc.) useful for calculating power of significance tests
noncompliance the act of not fully complying with a protocol. Often the term is restricted to whether or not a subject takes the medication as and when they should but it can be interpreted more widely to any aspect of a protocol
noncompliant a subject who does not fully comply with a protocol
nonignorable missing data missing data that indicate something about the subject because of the fact that the data are missing. For example, in an antihypertension study, data may be missing because a patient died: a death caused by a road traffic accident may be considered ignorable because it is unlikely to be study related but a death caused by a cardiac arrest would not be ignorable. In this example the term is partially related to a contrast between adverse events and adverse reactions
nonignorable missingness a process that produces nonignorable missing data
noninferiority study a study whose objective is to show that one treatment ‘is not worse than another’. This is subtly different to showing that two treatments are equivalent (see equivalence study) and obviously different to trying to show that one treatment is different to another (see difference study). See also superiority study
noninformative censoring censoring in survival studies that is completely unrelated to treatment. Essentially the same meaning as noninformative missing data in the context of survival studies and censoring
noninformative missing data data that are missing, and the fact that they are missing tells us nothing about what the data value should be. See also missing completely at random
noninformative prior see reference prior
noninvasive any medical procedure that is not invasive
nonlinear not in a straight line. Compare with linear
nonlinear model a model that contains multiplicative terms, not simply additive terms. Compare with linear model
nonparametric a branch of statistics that makes few assumptions about the distributions of data
nonparametric data strictly, there is no such thing as nonparametric data. However, the term is quite commonly used to refer to data that come from distributions that do not obviously resemble any standard probability distribution and for which nonparametric methods of analysis need to be used. Compare with parametric data
nonparametric method any statistical method for significance testing and estimation that makes fewer assumptions about the distribution of the data than do parametric methods. It is widely believed that these methods make no assumptions at all about the distribution of the data but this is not the case
nonparametric test a nonparametric statistical significance test. Examples include the Mann–Whitney U test, the Wilcoxon matched pairs signed rank test, etc.
nonrandom not random; used to refer to nonrandom samples and nonrandom treatment allocation
nonrespondent a subject who does not answer a question, either because they refuse to or because they did not attend a study visit and so could not be asked
nonresponse similar meaning to nonrespondent but also used to describe subjects who do not respond to treatment
nonsense correlation an observed correlation that may be statistically significant but which does not make any biological or medical sense in terms of causality
nonsignificant risk study a study of a medical device that poses no important risk to the subjects who take part. Compare with significant risk study
nonzero effect this is usually used to refer to an effect when it needs to be stressed that an effect does exist. This may be because the effect is very large or because, despite the effect being very small, it may still be medically or scientifically important
normal a rather dangerous term: it has an everyday use meaning typical or not unusual; it has a similar meaning in a technical sense of a normal range (see also reference range) for a variable; it also has a highly technical (statistical) use as in Normal distribution, one of the most basic ideas in statistics. Because of these diverse uses, it is important to either avoid its use altogether or to be highly specific. In this book, an upper case ‘N’ is used for the statistical probability distribution, the Normal distribution
normal approximation an approximate procedure based on assuming data come from a Normal distribution
normal curve an informal term used to describe the shape of the curve of a Normal distribution
Normal distribution the probability distribution that is very commonly used (either directly or as a basis for further refinements) in statistical significance testing, estimation, model building, etc. (Figure 22)
normal limit the upper (or lower) limit of a normal range
normal plot see quantile–quantile plot


Figure 22 The classic ‘bell shape’ of a Normal distribution with mean 1 and standard deviation 1

normal range the usual range within which the values of a variable can be expected to lie. It usually implies that all subjects within that range will be healthy. Compare with reference range
normality the degree to which a distribution is like a Normal distribution
normally distributed said of a set of data that come from an underlying Normal distribution
not significant either an effect that is of no clinical importance (see clinically significant) or one that, regardless of its size, is not statistically significant
notifiable disease a disease that must, by law, be notified to health authorities
nuisance parameter in a statistical model, parameters that may be very important as covariates but which are not of direct interest in the study. Usually the treatment effect is of most interest; if it turns out that subject’s age or previous history are predictive of outcome (but equally predictive within each treatment group) then their parameters would be considered as nuisance parameters
nuisance variable any variable in a statistical model that is not of primary interest. See also nuisance parameter
null distribution the probability distribution of a variable if the null hypothesis is true
null hypothesis (H₀) the assumption, generally made in statistical significance testing, that there is no difference between groups (in whatever parameter is being compared). Evidence (in the form of data) is then sought to refute (or reject) this null hypothesis. Compare with alternative hypothesis (H₁)
number needed to harm the number of patients that a physician would have to treat with a new treatment in order to harm (in some predefined sense) one extra subject who would otherwise not have been harmed. ‘Harm’ may be in the context of a treatment failure, an adverse reaction, a death, etc. More usually considered in the context of number needed to treat
number needed to treat the number of patients that a physician would have to treat with a new treatment in order to avoid one event that would otherwise have occurred with a standard treatment
numerator in a fraction, such as ½ or ¾, the numerator is the number on the top line of the fraction (in these cases 1 and 3, respectively). Compare with denominator
numeric relating to numbers only. Compare with alphanumeric
numeric variable a variable that is a number. This generally means it is a continuous variable and not, for example, a Likert scale
Nuremberg Code a set of ethical principles about research on humans that formed the basis of the Declaration of Helsinki
n-way a generalisation of 1-way, 2-way, 3-way, etc. meaning any number of ways. Used particularly in the sense of n-way analysis of variance, n-way classification, etc.
n-way analysis of variance a generalisation of analysis of variance indicating that many (n) factors are included in the model
n-way classification classification of a continuous variable (usually by a discrete variable) in several (n) subclasses
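The number needed to treat entry above is conventionally calculated as the reciprocal of the absolute risk reduction, a formula the entry itself does not state. The sketch below assumes that convention; the function name and the event counts are hypothetical, and exact fractions are used to avoid rounding artefacts.

```python
from fractions import Fraction


def number_needed_to_treat(events_standard, n_standard, events_new, n_new):
    """Return 1 / (risk on standard treatment - risk on new treatment)."""
    risk_standard = Fraction(events_standard, n_standard)
    risk_new = Fraction(events_new, n_new)
    # Absolute risk reduction: events avoided per patient given the new treatment
    absolute_risk_reduction = risk_standard - risk_new
    if absolute_risk_reduction <= 0:
        raise ValueError("the new treatment does not reduce the event rate")
    return 1 / absolute_risk_reduction


# Hypothetical trial: 20/100 events on standard treatment, 15/100 on new treatment
nnt = number_needed_to_treat(20, 100, 15, 100)
print(nnt)  # 20
```

With these counts, 20 patients would need to receive the new treatment to avoid one event. Applying the same arithmetic to an increase in a harmful event gives the number needed to harm.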


O

O’Brien and Fleming rule one of the most common stopping rules used in group sequential studies. See also Pocock rule
objective the purpose of a study. It may be described either in very precise and specific terms or in general terms such as ‘to assess the safety and efficacy of Drug A’. The term is also used to refer to clear facts rather than general impressions. For this interpretation, compare with subjective
objective data data that are usually considered to be measured with high accuracy and that have low (or negligible) intraobserver variation and interobserver variation. Compare with subjective data
objective endpoint an endpoint to a study that is objective data. Compare with subjective endpoint
objective measurement a measurement of objective data. Compare with subjective measurement
objective outcome an outcome that is objective data. Compare with subjective outcome
observation usually meaning the data relating to one of the subjects being studied. However, the term is mostly used in a computing context to mean the number of rows in a (rectangular) database. Usually this will consist of one observation (with many variables) per subject; sometimes, if there is a different number of records per subject, the database may be set out as one observation per record
observational study a study that has no experimental intervention but just observes what happens to a group of subjects. See also case-control study, cohort study. Compare with intervention study
observed change the change in a variable that is seen to occur. This is in contrast to a fitted value from a statistical model
observed data strictly this is synonymous with data but use of the term ‘observed’ helps to contrast with fitted values from statistical models
observed difference the difference (in means, proportions, etc.) for a variable that is seen to occur. This is in contrast to a fitted value from a statistical model
observed distribution see observed frequency distribution


observed effect usually the simple estimate of an effect (difference in means, difference in proportion, the odds ratio, etc.) that has not been adjusted to account for any possible covariates
observed frequency the frequency with which a specific variable is seen to occur. This is in contrast to a fitted value from a statistical model
observed frequency distribution strictly this is synonymous with frequency distribution but use of the term ‘observed’ helps to contrast with probability distribution
observed mean the sample mean. Compare with population mean
observed outcome the observed value (usually of categorical data). This is in contrast to an expected outcome from a statistical model
observed rate the rate at which an event is seen to occur. This is in contrast to any fitted values from statistical models
observed relative frequency distribution the observed frequency distribution presented as a relative frequency distribution
observed result any kind of result that is seen to occur. This is in contrast to any fitted values from statistical models
observed sample size the sample size actually obtained, in contrast to what was planned
observed treatment difference see observed effect
observed treatment effect see observed effect
observed value either the value of a measurement in a single subject or the number of occurrences of an event that have been observed
observed variance the sample variance. Compare with population variance
observer bias any bias in measurements introduced by an observer (for example digit preference) or caused by making observations (see Hawthorne effect)
observer error any error in measurements made by an observer. See also intraobserver agreement, interobserver agreement
observer variation see interobserver variation, intraobserver variation
Occam’s razor a philosophical stance which prefers simple explanations to more complex alternatives. This is a general principle to adopt in formulating statistical models
Ockham’s razor see Occam’s razor
odds the probability of an event occurring divided by the probability of it not occurring. For example, if one in ten cancer patients are cured by a drug, then the odds of being cured are stated as 1:9. Compare with rate, risk
odds ratio the ratio of two odds, often used as a summary of the size of a treatment effect in two-by-two tables. In Table 12, the odds ratio is calculated as (37 × 31) / (13 × 19) = 4.6. Compare with risk ratio

Table 12 Contingency table showing the distribution of treatment response by treatment group

               Treatment success   Treatment failure   Total
Treatment A           37                  13             50
Treatment B           19                  31             50
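The odds and odds ratio entries can be illustrated with the counts in Table 12. The following is a minimal sketch in Python; the function names `odds` and `odds_ratio` are illustrative choices and do not come from the dictionary.

```python
def odds(events: int, non_events: int) -> float:
    """Odds of an event: P(event) / P(no event), which reduces to events / non-events."""
    return events / non_events


def odds_ratio(a_success: int, a_failure: int, b_success: int, b_failure: int) -> float:
    """Odds of success on Treatment A divided by the odds of success on Treatment B."""
    return odds(a_success, a_failure) / odds(b_success, b_failure)


# One in ten patients cured: 1 cured for every 9 not cured, that is odds of 1:9
odds_cured = odds(1, 9)

# Counts taken from Table 12
odds_a = odds(37, 13)
odds_b = odds(19, 31)
or_ab = odds_ratio(37, 13, 19, 31)

print(f"odds of cure = {odds_cured:.3f}")
print(f"odds A = {odds_a:.2f}, odds B = {odds_b:.2f}, odds ratio = {or_ab:.1f}")
```

Dividing the two odds is algebraically the same as the cross-product (37 × 31) ÷ (13 × 19) quoted in the odds ratio entry, so the sketch reproduces the value of 4.6 to one decimal place.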

oﬀ label the use of a product to treat a disease for which it does not have a marketing authorisation oﬀ site away from the buildings or facilities where key activities occur. This may be with reference to study medication being stored at a location separate from where patients are treated or it may refer to an archive of data being kept at a location separate from where the main data-processing activities take place. Û on site oﬀ study refers to clinical activities that may occur concurrently with a study protocol but which are not included in the protocol, or to procedures which take place, or medication that is given, after a subject has completed the protocol. Û on study oﬀ treatment any time (during the course of a study or after a subject has completed a study) when a subject is not being given treatment (or placebo). This may be during a run in period or during a long term follow-up period. Û on treatment ogive a graph of a cumulative frequency distribution (Figure 23) ointment a vehicle for delivering topical treatment, usually paraﬃn or Vaseline based. G cream, gel, lotion omitted covariate a covariate that has not been included in a regression model or analysis of covariance model (either intentionally or inadvertently) omnibus test any statistical signiﬁcance test that involves comparing parameters (often means or proportions) from more than two groups. It may, for example, be a test that all the means are equal: in such a case, if the null hypothesis (of equal means) is rejected we cannot immediately say which means are diﬀerent to which others on site activities that take place (or the availability of study material) at the site where they are needed. This may relate to medication being on the site where patients are treated, or to completed case record forms being at the premises of the data management oﬃce. Û oﬀ site on study activities that take place as part of a study protocol. 
Û off study on treatment any time when a subject is being given a study treatment (or placebo). Û off treatment

Figure 23 Ogive. A graph of the cumulative number of patients who have suffered from eczema for less than 1 year (9 patients), less than 2 years (10 patients), less than 3 years (12 patients), . . . less than 55 years (all 66 patients)

one sided concerned with only one tail of a distribution. Û two sided one sided alternative the alternative hypothesis that is a one sided hypothesis. Û two sided alternative one sided hypothesis a hypothesis that allows for the possibility of a difference in only one direction (for example, Drug A must be better than Drug B). Such hypotheses are not as common as two sided hypotheses one sided test any statistical significance test that will accept a one sided hypothesis if the null hypothesis is rejected. Û two sided test one tailed one sided one tailed alternative one sided alternative one tailed hypothesis one sided hypothesis one tailed test one sided test one way analysis of variance the simplest form of analysis of variance,
used to compare the means of two (or more) groups in a parallel groups study but without including any other factors or covariates in the statistical model one way classiﬁcation data that are grouped by only one categorical variable. Note that the categorical variable may have several levels ( levels of a factor) but there is only one variable one way design a study design that involves only a response variable and one (categorical) covariate online a computing term meaning that work is being done directly onto a central computer rather than being temporarily held on a local computer before batch processing or being uploaded to the central computer online data entry electronic data entry that occurs online. Û distributed data entry. G remote data entry open class interval a class interval that either has no lower limit (it is all values below a certain value) or has no upper limit (it is all values above a certain value). It is often used with highly skewed data open label not blind open label study a study where the treatments are not blinded open sequential design a sequential study design that does not have any upper limit to the number of subjects that may be recruited (Figure 24). Û closed sequential design open sequential study a study that is designed as an open sequential design open study open label study open treatment assignment treatment assignment that is not blinded (although it may still be random). G open label study operation a surgical procedure or a mathematical function optimal design a study that is the best (‘optimal’) for some speciﬁc purpose. Note that it may not be optimal for all purposes. It may be optimal on statistical grounds or from practical study management grounds oral assent assent that is given orally. Û written assent. G consent oral consent consent that is given orally. Û written consent (which is more common). 
G assent order of magnitude a multiple of, or division by, 10 order statistic any one of the centiles ordered see ascending order, descending order ordered alternative hypothesis an alternative hypothesis that involves more than two groups. The simplest example is that of comparing the means of three groups. The null hypothesis is that ‘all the means are equal’ or, equivalently, ‘μ₁ = μ₂ = μ₃’; the simplest alternative hypothesis might be that ‘not all of the means are equal’; an ordered alternative hypothesis would be that ‘μ₁ < μ₂ < μ₃’

Figure 24 Open sequential design. The solid lines indicate stopping boundaries for declaring a statistically significant difference between treatments A and B. If the broken boundary is crossed, then the study stops, concluding that no significant difference was found between the treatments. Potentially, the number of preferences could continue indefinitely between the upper solid and broken lines or between the lower solid and broken lines; in such a case no conclusion would ever be reached

ordered categorical data data that are measured on a categorical scale but where the categories have a natural ordering, for example mild, moderate and severe. G Likert scale ordered categorical scale the scale on which ordered categorical data are measured ordered categorical variable a variable that yields ordered categorical data ordered data data that are measured on an ordered scale ordered logistic regression an extension of the methods of logistic
regression where the response variable is ordered categorical, instead of binary. G polytomous regression ordered scale a measurement scale that is ordered. This includes ordinal scales, ordered categorical scales, interval scales ordinal data data that are simply ordinal numbers ordinal number the numerical position (1st, 2nd, 3rd, etc.) in a set of ordered data ordinal scale the scale on which ordinal data are measured ordinal variable a variable that yields ordinal data ordinary least squares least squares ordinate y axis. Û abscissa (or x axis) orientation layout, generally of paper in the form of either landscape or portrait origin the point of zero on a graph. On a two-dimensional graph, where x = 0 and y = 0 original data source data original document the top copy (not photocopies, etc.) of a document. G source data original record source data orphan drug a product that has a limited market because it is used for a rare disease. Regulatory requirements are different for orphan drugs than for non-orphan drugs orthogonal when two ideas, measurements, estimates, etc. are at right angles to each other, implying that they are also independent of each other orthogonal contrasts two (or more) contrasts that are independent of each other outcome usually the primary variable of a study. Although an outcome would generally be an event (outcome event), the term is frequently used to refer to the primary variable whatever the measurement scale outcome event the primary variable of a study, specifically when that variable is binary outcome measure the primary variable of a study, usually restricted to the case when that variable is continuous outcome variable the variable that defines the outcome for a study outcomes research methods of trying to answer research questions that do not involve intervention studies but which analyse databases and attempt to control for all possible confounding factors by complex statistical modelling.
It is a cheaper, quicker and relatively easier way to answer a question than doing a clinical trial and so presents clear advantages over trials (which are often expensive and time consuming).
However, much of the rigour and control of bias gained from clinical trials may be lost outlier a data value that does not seem to be true, given all the other data values, usually because it is very extreme (either too large or too small). Û inlier outpatient a patient who is not kept in hospital overnight. Note that treatment may still be given in the hospital. Û inpatient outpatient study a study of outpatients output device a method of getting data out of a computer (this may simply be the monitor or a printer) over represent when there is a higher proportion of some subgroup in a sample than there is in the population. Sometimes this may be desirable. The proportion of subjects with mild, moderate and severe symptoms may intentionally be kept equal in a sample, even though they are not similar in the population. Û under represent. G sample demographic fraction overmatch in matched studies, cases and controls might be matched for as many variables as is reasonably possible. If, however, the exposure variable is also matched between the groups then no diﬀerence between the groups will be found. This is called overmatching and is a risk in complex epidemiological studies where the variable causing the cases is not known over-the-counter drug products that can be purchased without needing a doctor’s prescription. Û prescription only medicine overview to look at data from various sources, considering them as a whole and making a conclusion. Û meta-analysis that involves more formal assessments of the completeness of the data and more formal statistical methods for combining them. Overviews and meta-analysis are very important methods for synthesising data

P package see computer package, package insert package insert the information given to a patient with a pack of medication. It contains information similar to the summary of product characteristics but is written in a style appropriate for patients to understand page orientation orientation pair two items. Usually this means the same variable measured on two similar subjects or the same variable measured on one subject on two occasions. It can sometimes refer to two independent items that are brought together in some way ( pairwise comparisons for an example) pair matching pairwise matching paired comparison a comparison that is made on paired data (not on independent groups). Û pairwise comparisons paired data the same variable measured on two similar subjects or the same variable measured on one subject on two occasions paired design a study design that involves taking paired observations and usually makes treatment comparisons using paired comparisons, often (but not necessarily) in the form of a crossover design paired observations two observations that are related to each other, either as two observations from the same subject at diﬀerent times (or on diﬀerent sites on the body) or as one observation from each of two matched subjects in a paired design paired sample a sample of paired observations paired t test a statistical signiﬁcance test testing the null hypothesis that the mean diﬀerence in a population (from which a sample of paired data has been taken) is equal to some particular value. Usually it is to compare the mean diﬀerence with zero. Û independent samples t test pairwise relating to pairs pairwise comparisons in a study where more than two groups are being compared, the term pairwise refers to each of the possible pairs of treatments that can be compared. 
For example, when there are three
groups (A, B, C) there are three possible pairwise comparisons: A vs. B, A vs. C, and B vs. C. G multiple comparison method pairwise matching matching a pair of subjects palliative care care for the whole patient, rather than treatment of specific symptoms. Examples include supportive care in the form of good communication, sympathy, understanding, empathy, etc. towards patients (and their relatives) pandemic occurring over a large geographic area. Û endemic paperless using no paper; as in ‘paperless case record form’ (where data are entered directly onto a computer without being transcribed onto paper first) parallel side by side; not crossing over parallel assay a dose finding study where the activity of a new product is compared with the activity of a standard drug parallel control the control group in a parallel group study parallel dose design a parallel group study where the different groups of subjects receive different doses of the same drug parallel group design the most common design for clinical trials, whereby subjects are allocated to receive one of several treatments (or treatment regimens). All subjects are independently allocated to one of the treatment groups. No subjects receive more than one of the treatments. Û crossover design parallel group study a study designed as a parallel group design parallel study parallel group study parallel track occurring at the same time but independently. For example, two studies that are being conducted at the same time but independently of each other parameter the true (but often unknown) value of some characteristic of a population. A simple example is the mean age of a population. Other examples include variances, minimum values and medians. Parameters are usually denoted by Greek letters (for example, σ² for the variance) and are estimated by sample statistics that are usually denoted by Roman letters (for example, s² for the variance).
The most common parameter that we wish to estimate in clinical trials is the size of the treatment effect parameter estimate the estimate (based on data) of a parameter parametric data as with nonparametric data, this term has no real meaning but it is quite commonly used to refer to data that come from a recognisable probability distribution and for which parametric methods of analysis can be used
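As a hedged illustration of the parameter and parameter estimate entries, the sketch below computes a sample mean and a sample variance (s²) as estimates of the unknown population mean and population variance (σ²). The ages are invented purely for illustration and the variable names are assumptions, not terms from the dictionary.

```python
import statistics

# Hypothetical sample of ages in years (invented data, for illustration only)
ages = [34.0, 41.0, 29.0, 50.0, 46.0]

# Sample statistics that estimate the corresponding population parameters
sample_mean = statistics.mean(ages)          # estimates the population mean
sample_variance = statistics.variance(ages)  # s^2 with the n - 1 divisor; estimates sigma^2

print(f"sample mean = {sample_mean}, sample variance = {sample_variance}")
```

The population values themselves stay unknown; a different sample from the same population would generally give different estimates.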

parametric method statistical methods that make speciﬁc assumptions about the distributions of data. Examples include the t test, correlation and regression. Û nonparametric method parametric test any statistical signiﬁcance test that uses parametric methods. Examples include the t test and the F test parent this term is used in the obvious way referring to mothers and fathers of children. It is also sometimes used in decision trees and mathematical models. In decision trees, it refers to a node from which branches come; in mathematical models it sometimes refers to a broad set of models from which other, simpler, models can be formulated. These are the most common uses but the term is sometimes used in any context where a hierarchy exists parent drug the basic form of a drug from which various alternative modiﬁcations are available parsimony the concept of simplicity being preferred over complexity. With particular reference to statistical models, models with few parameters are generally preferred over those with many parameters. G Occam’s razor partial response in cancer studies, this is generally regarded as a decrease in tumour size of at least 50%. G complete response, stable disease, progression partially balanced block a block of treatments that is balanced for some comparisons but not for others. For example, a block containing two assignments to Treatment A, two to Treatment B and three to Treatment C is only partially balanced partially balanced design a study design that uses partially balanced blocks of treatments partially confounded the situation where two estimates are not completely confounded but where some information in one estimate is not independent of another. This is very common in unbalanced designs and when using analysis of covariance participant someone who takes part (usually in a study) partition to split up. 
The term is most often used when trying to decide if relationships (for example, dose-response relationships) are linear or quadratic. In this instance, we often refer to partitioning the sums of squares patent the process of registering, or the documents confirming, ownership of an invention (such as a new drug), thus protecting that invention from being copied pathogen a microbiological organism that is capable of causing disease pathogenesis the cause and subsequent development of a disease

pathology the science of the causes of disease patient a subject who has a disease or other illness. Note that the requirement to have a disease or other illness diﬀerentiates from the broader term ‘subject’. Note also that ‘volunteer’ is not a good choice of word when describing those who take part in studies because healthy subjects and diseased patients should all be taking part voluntarily patient accrual patient enrolment patient chart any kind of chart or graph on which a patient’s data are plotted patient compliance the degree to which an individual patient complies with the study protocol as a whole, or speciﬁcally complies with taking the appropriate medication patient contact any type of meeting between a patient and a health worker. The contact may be face to face, by telephone, by letter, etc. patient enrolment the process of recruiting patients into a study patient enrolment period the time period during which patients are enrolled into a study patient follow-up the process of observing a patient over time, after they have been given study medication. G follow-up data, follow-up period, follow-up visit patient home visit a visit (usually by a study nurse or an investigator) to a patient, in the patient’s home patient id subject id patient identiﬁcation number subject identiﬁcation number patient information booklet a small booklet given to subjects, before they agree to take part in a study, to give them information about the study to help them decide if they are prepared to volunteer patient information sheet a smaller form of a patient information booklet that is just a single sheet of paper patient monitoring observation of a patient to ensure safety (primarily) and sometimes to record eﬃcacy data patient population the entire (theoretical) population of patients that could be recruited into a study. The term is also used to refer to the diﬀerent analysis populations (intention-to-treat population, per protocol population, safety population, etc.) 
patient record the data referring to a single patient patient recruitment patient enrolment pay journal a journal for which the cost of publication has to be met (fully or partially) by the authors of the manuscripts. Peer review may, or may not, also be required. Û peer review journal

peak any area on a graph that shows a rise and subsequent fall peak value the maximum value from a set of related data. Usually it is from data that are all from one subject Pearson chi-squared statistic chi-squared statistic. The term is often preﬁxed with ‘Pearson’ to distinguish it from other forms of statistical signiﬁcance tests that also use the chi-squared distribution Pearson product-moment correlation coeﬃcient correlation coeﬃcient Pearson residual residuals in contingency tables (and in logistic regression models). Each residual is calculated as the diﬀerence between the observed value and expected value, divided by the square root of the expected value peer a colleague or other person who is considered an equal in scientiﬁc merit and experience peer review when an independent scientist of similar standing and experience to the ﬁrst reviews a manuscript or other documents or working practices and makes comments. G expert review peer review journal a journal that sends submitted manuscripts for peer review. It is usually assumed that there is no charge for publishing an accepted manuscript. Û pay journal per protocol analysis the analysis of study data that excludes data from subjects who did not adequately comply with the study protocol. Û intention-to-treat per protocol population the subset of subjects recruited into a study who are included in the per protocol analysis per unit cost the extra cost incurred (per person treated, per bottle manufactured, etc.) It does not include basic set up costs. Û ﬁxed cost percent of 100. For example, the phrase ‘37 percent of patients responded to treatment’ means that ‘of every 100 patients given treatment, 37 responded’ percent diﬀerence index the diﬀerence between two percentages. 
G percentage point percentage point the term is often used in a similar way to percent difference index: when considering a change from, for example, 20% to 30%, this can be described as a ‘50% increase’ or a ‘difference of 10 percentage points’ percentile centile percentile–percentile plot quantile–quantile plot per comparison error rate comparisonwise error rate per experiment error rate experimentwise error rate performance measure any measurement of how well a person or group of
people carried out a particular task; or a measurement of how well an experiment or measuring instrument does what it is intended to do performance monitoring the process of reviewing performance (that is, how well a task is being done), with a view to making improvements, if necessary. The term can equally well apply to reviewing the performance of people or machines period an interval of time. In the speciﬁc context of crossover studies, the term refers to the intervals of time when a subject is given the ﬁrst treatment (period 1), when they are given the second treatment (period 2), etc. period eﬀect any systematic diﬀerence in response between two periods. Most commonly used in the context of crossover studies period prevalence the prevalence (number of cases) of an event during a speciﬁed period of time. Û point prevalence periodic safety update report a regular report sent to a regulatory authority with details of all adverse events reported for a product peripheral of secondary importance. In computer terms, it refers to any additional piece of hardware that can be added to a computer (image scanners, printers, etc.) permutation any ordering of a given ﬁxed set of data values permutation test nonparametric test permute to rearrange a set of data values to form a new permutation. G randomise permuted block randomised block personal computer typically a small (although possibly quite powerful) computer. Such computers are suﬃciently small that they easily ﬁt onto a desk; some are small enough to ﬁt into a small briefcase. Û mainframe computer personal data data about individual people. Often it is restricted to data that may be considered as of a sensitive nature (sexual behaviour, illegal substance abuse, etc.) personal probability in Bayesian statistics, this is one person’s prior probability of an event. 
It is sometimes called a ‘personal probability’ to emphasise that different people may legitimately have different prior probabilities for the same event, so the prior probability is of a ‘personal’ nature person-time see person-year as an example. ‘Time’ can be any chosen units person-year when many people have been exposed to an intervention for varying lengths of time, the total time of exposure for all people can be calculated and expressed as if it were one individual exposed for this
total length of time. For example, two people each exposed for 6 months would equate to one person-year; one person exposed for 12 months and another with zero exposure would also equate to a total exposure of one person-year pessary a suppository inserted into the vagina pharmaceutical relating to drugs. Û biologic, phytomedicine pharmaceutical company a commercial organisation that researches, develops, manufactures and markets drugs pharmaceutical industry pharmaceutical companies and other support companies involved in the research, development, manufacture and marketing of drugs pharmacist a person qualiﬁed to prepare, safely store and dispense drugs pharmacodynamics broadly, the action of a drug on the physiology of the body. Û pharmacokinetics pharmacoeconomics the study of economic implications of drug usage. This can be used either to try to justify use of drugs as an economic beneﬁt or to evaluate the cost associated with a patient having a disease compared with the cost needed to treat the patient pharmacoepidemiology the study of drug usage and results (positive and negative) in broad populations with a view to a better understanding of beneﬁcial drug usage. G epidemiology, outcomes research, pharmacovigilance pharmacogenetics the study of how drugs aﬀect the genetic makeup of the body pharmacokinetics broadly, the action of the body on a drug. Pharmacokinetics includes the study of the rate of absorption and distribution of products into and around the blood stream, and the rate (and methods) of elimination of drug from the body. 
Û pharmacodynamics pharmacology the study of drugs (including uses, benefits, harmful effects and stability) pharmacovigilance the study of adverse events (presumed to be related to drug usage) in broad populations pharmacy a place where drugs are stored in secure conditions and under the control of a pharmacist phase different stages of drug development and testing (Phase I, II, III, IV study) or, used on its own, to denote different stages within a study. In this latter case, see phase of study Phase I study the earliest types of studies that are carried out in humans. They are typically done using small numbers (often fewer than 20) of healthy subjects and are to investigate pharmacodynamics, pharmacokinetics and toxicity Phase II study studies carried out in patients, usually to find the best dose of drug and to investigate safety. This term is sometimes split into the subgroups Phase IIa studies and Phase IIb studies Phase IIa study of a set of Phase II studies, the earlier ones (often on fewer patients) are sometimes referred to as Phase IIa. Û Phase IIb study Phase IIb study of a set of Phase II studies, the later ones are sometimes referred to as Phase IIb. Û Phase IIa study Phase III study generally these are major studies aimed at conclusively demonstrating efficacy. They are sometimes called confirmatory studies and (in the context of pharmaceutical companies) typically are the studies on which registration of a new product will be based. They are sometimes split into so-called Phase IIIa studies and Phase IIIb studies Phase IIIa study this term is not often used; the term Phase III study is usually adequate. Û Phase IIIb study Phase IIIb study when a product already has a marketing authorisation but the indication is being expanded, new Phase III studies are needed to demonstrate efficacy in the new indication. Since the Phase III studies in the drug’s development have already been completed, these new studies are sometimes referred to as Phase IIIb Phase IV study these are studies carried out after registration of a product. They are often for marketing purposes as well as to gain broader experience with using the new product. G post marketing surveillance study, seeding study phase of study denotes different stages within a study. For example, a study may have a washout period, treatment period, follow-up period, etc.; each of these could be referred to as a phase rather than a ‘period’ phlebotomy the taking of blood physician a medically qualified person who can treat patients. G investigator physiology the study of the functioning of the body and body systems phytomedicine drugs developed from plants.
Û biologic, pharmaceutical pi (π) a mathematical constant. It is the ratio of the circumference of a circle to its diameter although it has many uses in mathematics and statistics beyond this pie chart a circular graph used for showing percentages. Schematically it resembles a pie (or a cake) with slices cut out; each slice being proportional (in area) to the proportion of data being represented (Figure 25). Û stacked bar chart pilot pilot study

Figure 25 Pie chart. The proportion of patients recruited by each of four study centres is represented. In this example, the actual number of patients recruited at each centre is also indicated

pilot study a small study for helping to design a further, confirmatory study. The main uses of pilot studies are to test practical arrangements (for example, how long do various activities take? is it possible to do all the things we want to?), to test questionnaires (do the subjects understand the questions in the way we intended?) and to investigate variability in data. G internal pilot study pilot test usually an informal type of pilot study pivotal something on which a major decision (possibly to continue or cease developing a compound) will depend pivotal study a study that is pivotal. It may be pivotal for internal company use or for regulatory use. In the latter case, G confirmatory study placebo an inert substance usually prepared to look as similar to the active product being investigated in a study as possible. In some situations, the term vehicle is used instead. G blinding placebo control giving placebo to a control group of subjects placebo controlled placebo controlled study

placebo controlled study a description of a study that implies there is a control group who receive placebo placebo eﬀect a nonspeciﬁc term used to encompass any (usually beneﬁcial) changes that occur within a group ‘treated’ with placebo. G eﬀect, treatment eﬀect placebo group the group of subjects assigned to receive placebo. Û treatment group placebo lead in period placebo run in period placebo period a period within a study where subjects are given placebo. G placebo run in period, placebo washout period placebo run in period a run in period where all subjects are given placebo. G placebo washout period placebo subject a subject who has been allocated to receive placebo placebo treatment an alternative term simply for placebo. Strictly, placebo is not a ‘treatment’ but the term is still commonly used placebo washout giving subjects placebo when the purpose is to allow any other medication that may be in the body to be eliminated. This may be at the beginning of a study ( placebo run in period) or between periods in a crossover study placebo washout period the time during which placebo washout takes place plagiarise to extensively copy someone else’s ideas or work without adequately acknowledging them plagiarism the act of copying someone else’s ideas or work without adequately acknowledging them plasma the liquid part of blood platform often used to refer to types (and manufacturers) of diﬀerent computers. See, for example, mainframe computer, personal computer plausibility check an edit check to test if data items appear plausible. This may be based on a simple range check or may be a more complex consistency check. Note that data that pass a plausibility check may still not be correct play-the-winner rule a method of assigning treatment to subjects. When the response is binary, the next subject will be given the same treatment as the last subject, if the last subject showed a positive response. 
However, if the last subject showed a negative response, then the next subject will be given the alternative treatment play-the-winner treatment assignment a method of treatment assignment that uses a play-the-winner rule plot a graph; or to draw a graph plus and minus plus or minus. Note the distinction between ‘and’ and
‘or’ is poorly used. G and/or plus inﬁnity the term inﬁnity strictly means plus inﬁnity but in some situations it is helpful to distinguish from minus inﬁnity plus or minus (
