Developing new approaches to measuring NHS outputs and productivity

FINAL REPORT
8 September 2005 (revised 2 December)

Diane Dawson*, Hugh Gravelle**, Mary O’Mahony***, Andrew Street*, Martin Weale***, Adriana Castelli*, Rowena Jacobs*, Paul Kind*, Pete Loveridge***, Stephen Martin****, Philip Stevens***, Lucy Stokes***

* Centre for Health Economics, University of York.
** National Primary Care Research and Development Centre, Centre for Health Economics, University of York.
*** National Institute for Economic and Social Research.
**** Department of Economics and Related Studies, University of York.

Table of contents

1 Introduction
  1.1 The research remit
  1.2 Research delivered
  1.3 Quality
  1.4 Value for money and technical change
  1.5 Structure of the report
2 Productivity and output measurement
  2.1 Total factor productivity growth in private markets
  2.2 Total factor productivity growth and quality change
  2.3 Application of TFPG methods in the NHS
  2.4 Value weighted NHS output index
  2.5 Changes in marginal social values over time
  2.6 Outcomes and attribution
  2.7 Cost and value weights
3 Current practice
  3.1 The cost weighted activity index (CWAI)
  3.2 The ‘experimental’ NHS cost efficiency and service effectiveness indices
    3.2.1 The experimental cost efficiency index
    3.2.2 The service effectiveness growth measure
  3.3 Pharmaceuticals and prescribing
4 Quality adjustment with available data
  4.1 Activities or outputs as the unit of analysis
    4.1.1 Activities: institutional approach
    4.1.2 Outputs: patient-centred or disease-based approach
  4.2 Unit of hospital output
  4.3 Alternative sources for hospital activity
  4.4 General practitioner consultations
  4.5 Measures of marginal social value of outputs
    4.5.1 Unit costs
    4.5.2 Private sector prices
    4.5.3 International prices
    4.5.4 Value of health
    4.5.5 Value of waiting time
    4.5.6 Expert groups
  4.6 Quality adjustment for health effects of treatment
  4.7 Quality adjustment using long term survival
  4.8 Quality adjustment with short term survival
    4.8.1 Simple survival adjustment
    4.8.2 Incorporating estimates of health effects
    4.8.3 Life expectancy and health effects
    4.8.4 Cost of death adjustment
    4.8.5 In-hospital versus 30 day mortality
    4.8.6 Conclusions: survival based quality adjustment
  4.9 Readmissions
    4.9.1 Readmissions as health effects
    4.9.2 Readmissions (and clinical errors and MRSA) as a deadweight loss
  4.10 Waiting times
    4.10.1 Waiting time as a characteristic
    4.10.2 Waiting time as a scaling factor
      4.10.2.1 Discounting to start of wait
      4.10.2.2 Discounting to date of treatment with charge for waiting
    4.10.3 Optimal waiting times
    4.10.4 Distribution of waiting times
    4.10.5 Outpatient waits
    4.10.6 Waiting time adjustment: conclusion
  4.11 Patient satisfaction
  4.12 Discount rate on health
  4.13 Quality adjustment for general practice
  4.14 Atkinson principles and quality adjustment
5 Experimental indices of NHS output
  5.1 General trends, index form and data sources
  5.2 Index form
  5.3 Spells versus episodes
  5.4 Survival adjustments: hospital output
    5.4.1 Simple survival adjustment
    5.4.2 Survival and estimated health effects adjustment
    5.4.3 Survival adjustments with health effects and life expectancy
  5.5 Waiting time and survival adjustments: hospital output
    5.5.1 Effect of waiting time adjustments
    5.5.2 Outpatient waits
  5.6 Additional quality adjustments
    5.6.1 Adjusting for the costs of poor treatment: readmissions and MRSA
    5.6.2 Patient satisfaction
  5.7 Conclusions
6 Specimen output index
  6.1 Introduction
  6.2 Data
  6.3 Cost weighted output indices
  6.4 Health outcome weighted output indices
  6.5 Value weighted output index
  6.6 Conclusion
7 Effects of quality adjustments on hospital and NHS output indices: summary
8 Labour input
  8.1 Introduction
  8.2 Labour input in the NHS
    8.2.1 Volume of labour input
    8.2.2 Quality of labour input
    8.2.3 Data sources and volume trends
    8.2.4 Quality adjustments based on qualifications
    8.2.5 Quality adjustments: refinements
  8.3 Conclusion
9 Experimental productivity estimates
  9.1 Labour input and labour productivity growth
  9.2 Intermediate and capital inputs
  9.3 Total factor productivity growth
10 Improving the data
  10.1 Outcomes data
    10.1.1 Health outcomes
    10.1.2 Feasibility
    10.1.3 Cost
  10.2 Other outcome measures: patient satisfaction
  10.3 General practice data
    10.3.1 GP activity
    10.3.2 GP cost weights
    10.3.3 General practice staff
    10.3.4 Prescribing
  10.4 Other primary care data
  10.5 Inputs
11 Conclusions and recommendations
  11.1 Methods
    11.1.1 The preferred approach
    11.1.2 Methods using existing data
  11.2 Results
    11.2.1 Results for the hospital sector
    11.2.2 Other quality indicators
  11.3 Total factor productivity growth
  11.4 Recommendations
  11.5 Acknowledgements
References
Annex: How should NHS output be measured?
Table of Notation

1 Introduction

1.1 The research remit

In March 2004 the Department of Health commissioned a research team from the Centre for Health Economics at the University of York and the National Institute for Economic and Social Research to develop new approaches to measuring NHS outputs and productivity. The research objectives were development of:
• A comprehensive measure of NHS outputs and productivity
• Methods to facilitate regular in-year analysis of NHS productivity
• Output measures capable of measuring efficiency and productivity at subnational levels.
The research team was also asked to co-operate with The Atkinson Review on measurement of government output and productivity for the national accounts.

Three interim reports on this research were produced (July 2004, November 2004 and June 2005) as well as memoranda on data requirements (September 2004) and methodology (January 2005, August 2005). The work was presented for scrutiny at two workshops (7 July 2004 and 17 June 2005). The research team presented work in progress to four meetings of the NHS Outputs Steering Group (7 July 2004, 2 February 2005, 10 May 2005, 20 July 2005). This is the Final Report on the research project.

The background to the research remit referred to the Public Service Agreement (PSA) following the 2002 Spending Review that “set a ‘value for money’ (productivity) target of 2%”. The target required information on quality improvement that had not previously been measured for the NHS as a whole. While PSA targets have changed over time, it is likely that some measure of quality improvement will continue to be required in reporting performance. Quality adjusted measures of NHS output were also required for other Department of Health purposes such as monitoring the performance of Trusts and identifying the scope for efficiency gains.

It is important to appreciate that there are significant differences between the concepts of efficiency, value for money, productivity and productivity growth. These differences have implications both for methods of measurement and for the policy relevance of the resulting indices.
• Efficiency is measured as the ratio of the output produced with given inputs to the maximum feasible output.
• ‘Value for money’ reflects the value individuals/society place on output relative to the costs of production. This often corresponds to a cost-benefit analysis.
• Productivity is the ratio of a measure of total output to a measure of total inputs.
• Productivity growth is the change in output relative to the change in inputs.
It is often interpreted as reflecting the effect of technical change on production.

Robust measurement requires precise definition of the concept to be measured. Effective employment of these measures in pursuit of policy objectives requires selection of the appropriate measure for the issue at hand.

1.2 Research delivered

The research team has responded to the research remit by delivering the following outputs.

1. A methodology for producing a comprehensive quality adjusted index of NHS output. This is referred to as the “value weighted output index”. Data necessary to estimate this index are not currently available for all NHS activities but are feasible to collect. The DH has already planned or is considering collection of the relevant data.
2. Methodologies for calculating quality adjusted NHS output indices with existing data. These are cost weighted indices that incorporate varying combinations of changes in survival, health effects, waiting times, patient satisfaction, readmissions and MRSA. We present estimates of experimental indices which examine their sensitivity to different ways of measuring waiting times, survival, and to different assumptions about the health effect, discount rates and other parameters.
3. For the small set of hospital based treatments where there are some data on health outcomes before and after treatment, we have produced a “specimen” index that illustrates how the recommended value weighted index can be populated with data on health outcomes when they become more generally available.
4. We have suggested additional data that are feasible to collect that would not only improve future measurement of NHS output but would also be of value in managing the NHS.
5. We have constructed a new index of labour input in the NHS. It combines data from a range of sources to calculate a volume measure of total hours worked and includes an adjustment to take account of increases in the skills of the workforce.
6. Using the cost weighted quality adjusted index of outputs and inputs, we have produced provisional estimates of Labour Productivity Growth and Total Factor Productivity Growth for the period 1998/99-2003/04.
7. The methodology and data used in these indices can be applied to sub-national groups of institutions (e.g. NHS Trusts).
8. For many purposes, quality adjusted measures of output and productivity growth for particular diseases and across institutional settings will be of more value to the NHS than a comprehensive index. We indicate how, with planned changes to NHS data collection, it will be feasible to produce disease specific output and productivity indices with the methodology presented in this report.

Although key data used in our output indices, predominantly from the Hospital Episode Statistics (HES), are available on a quarterly basis, we would not recommend publication of within year estimates of output growth. The quarterly HES data are subject to significant revision and use of quarterly index numbers could be misleading.

1.3 Quality

Central to all the work reported is a method of defining and measuring “quality”. We define the quality of treatment as the level of the characteristics valued by patients and changes in quality are measured as the rate of change of these characteristics. Given that improving the health of patients is a primary objective of the NHS, improved health outcomes are likely to be the most important characteristic of treatment. In addition, the literature suggests the main impact of technical change in health care has been to improve expected health outcomes, e.g. the expected health outcomes from heart surgery or management of diabetes are better today than ten years ago.

There are few data on health outcomes in the NHS and hence it has not been possible to measure quality improvement, productivity growth and technical change. For the present the main available health outcomes data are for mortality or survival rates.
This is a severe limitation on any attempt to measure the quality of output or productivity since only 3% of NHS patients die soon after treatment. There are no routine data with which to measure the improvement in health following treatment for the 97% of patients who survive.

It appears that this situation may change and the NHS may start collecting data on health outcomes. In Section 4 of this report, we present the structure of output indices that should be used if and when data on health outcomes in the NHS become available. The equations could be used for a subset of patients if initially outcomes data are collected for only a limited set of procedures.

It follows from our definition of quality that the unit for measuring NHS output should be the patient treated. This makes it necessary to link the activities directed at treatment of a patient. For example, a patient undergoing treatment for heart disease would receive prescriptions for various drugs, attend outpatient clinics, undergo diagnostic tests, perhaps surgery and follow-up care from a GP. At present it is not possible to identify the set of activities delivered to an NHS patient with a particular condition. The Department of Health plans to introduce a patient identifier that in future will permit analysis of the care delivered to a patient across activities, institutions and over time. For the present it is necessary to continue to use counts of activities as proxies for output. However, the indices recommended could readily be adapted to a patient-based definition of output when linked data become available.

1.4 Value for money and technical change

Recent work in the US illustrates how, with data on outcomes and an ability to link activities/inputs to patients with particular conditions, it is possible to obtain approximate disease specific measures of value for money and technical change. Cutler et al. (2001), for example, examine improved survival rates for patients admitted with acute myocardial infarction (AMI). By placing a monetary value on quality adjusted additional years of life expectancy and dividing by the cost of inputs used for treating this group of patients, estimates can be produced of the growth in value for money. Similar work has been done for depression, schizophrenia and cataract surgery.

The DH requested the research team to produce formulae and estimates for a comprehensive index of NHS output and productivity growth. When data become available that identify the set of inputs used to treat particular conditions and the monetary value of output, the approach we outline in Section 6 can be applied to studies of individual conditions as in the US work.

1.5 Structure of the report

For any new method of measuring NHS output and productivity to be generally accepted, it is important that the methodology be well grounded in economic theory. In Section 2 we set out the theory behind measurement of Total Factor Productivity, the issues relevant to attribution of NHS activity to improvement in health outcomes and the choice of weights necessary to sum the many NHS outputs into a single index number.

Section 3 outlines recent and current DH practice for estimating output and productivity. This provides a baseline for comparison with the quality adjusted indices provided by the research team.

In Section 4 we set out our preferred approach to measuring quality adjusted output. This is a value weighted output index that attaches monetary values to the characteristics that measure quality. We discuss the appropriate units of output and available data. While it is feasible to collect the data necessary for a value weighted output index, the data are not currently available. In the remainder of the section we explore the possibilities for estimating a quality adjusted cost weighted output index with existing data.
Quality adjusting a cost weighted index is not straightforward and we set out the assumptions required. In the absence of data on health outcomes for most NHS activity, we focus on the possibility of quality adjusting for changes in long and short term survival and for changes in waiting times. In order to illustrate the impact of including some information on health outcomes, we examine the structure of an index that includes an indicative health gain for survivors. Ordinarily improvements in survival are considered an improvement in the quality of NHS care. However, for a number of conditions, the NHS provides terminal care. We examine adjustments to the quality indicator required to deal with this issue. The appropriate method for quality adjusting a cost weighted index for changes in waiting times is not obvious. We explore several alternative methods for doing this. We conclude Section 4 by comparing our approach to the recommendations of the Atkinson Review.

In Section 5 we present results for an experimental quality adjusted cost weighted output index. We show the sensitivity of the index to different ways of treating mortality rates, waiting times and choice of discount rate. We also examine the feasibility of augmenting the index with available data on patient satisfaction, readmission rates and incidence of MRSA.

The results reported in Section 5 reflect what is feasible at present when estimating a comprehensive index. However, there are a few conditions for which outcomes data are available. In Section 6 we use these data to estimate a “specimen” index. We present results for a value weighted output index and for variants of a cost weighted index that incorporate observed health gains in the survival adjustment and allow for changes in waiting times. We also use the specimen index to illustrate the effect of substituting health outcome weights for cost weights.
In Section 7 we draw on the results from Sections 5 and 6 and present two variants of the quality adjustments, one of which is our preferred variant. We show the effects for hospital sector and for overall NHS output indices.

Section 8 is devoted to measuring labour input in the NHS. It outlines methods for calculating labour volumes and quality adjusted labour input where the latter takes account of different productivities of workers, dividing the workforce by skill group. It combines data from the NHS employment census with the rich data on worker characteristics available in the Labour Force Survey.

Section 9 brings together the output measures reported in Section 5 with the labour input measures in Section 8 to derive labour productivity estimates. Using estimates for growth in intermediate inputs and capital from a range of sources, productivity estimates are shown that account additionally for these two inputs.

In Section 10 we summarise the lessons learned in the course of this research for the availability of relevant data. We stress the importance of making better use of existing data (e.g. by record linkage and diagnosis added to prescription forms), the scope for improving output measurement with data beginning to be collected (GP consultations) and the need for the NHS to routinely collect health outcomes data.

We conclude in Section 11 with recommendations on how the Department of Health can advance work on output and productivity measurement.

2 Productivity and output measurement

2.1 Total factor productivity growth in private markets

If private markets are complete and competitive, prices reflect marginal utilities of the services to consumers and the marginal costs of providers. With some additional assumptions, the measurement and interpretation of productivity growth is then straightforward.

Denote the vector of outputs from a firm at time t as x(t). We index the goods by j.
Let $z(t)$ be the vector of $n$ inputs (types of capital, labour and materials). $v(t)$ is a parameter which captures the state of technology at time $t$. The technology of the firm is described by the implicit production function

$$g(x(t), z(t), v(t)) = 0 \tag{1}$$

Assume that the technology exhibits constant returns to scale. Differentiating (1) with respect to time gives

$$\sum_j \frac{\partial g}{\partial x_j}\dot{x}_j + \sum_n \frac{\partial g}{\partial z_n}\dot{z}_n + \frac{\partial g}{\partial v}\dot{v} = 0 \tag{2}$$

A profit maximising firm in a competitive market will choose $x(t)$, $z(t)$ to satisfy $p_j = -\theta\,\partial g/\partial x_j$ and $w_n = \theta\,\partial g/\partial z_n$, where $\theta = -p_1/(\partial g/\partial x_1)$ is the Lagrange multiplier on the production constraint. We can rearrange (2) as

$$\sum_j \omega_j^y \frac{\dot{x}_j}{x_j} - \sum_n \omega_n^z \frac{\dot{z}_n}{z_n} = \frac{\theta}{px}\frac{\partial g}{\partial v}\dot{v} = \omega_1^y \left(\frac{\partial x_1}{\partial v}\frac{v}{x_1}\right)\frac{\dot{v}}{v} \tag{3}$$

where

$$\omega_j^y = \frac{p_j x_j}{\sum_j p_j x_j}, \qquad \omega_n^z = \frac{w_n z_n}{\sum_j p_j x_j}$$

and $y$ is the value of output from the firm, $y = px = \sum_j p_j x_j$.

The left-hand side of (3) is the rate of change of a Divisia quantity index of outputs, minus the rate of change of a Divisia quantity index of inputs. Since total factor productivity (TFP) is the ratio of an index of outputs to an index of inputs, the left hand side is also a measure of total factor productivity growth (TFPG). If production takes place with constant returns to scale, then the total value of the product is expended on the costs of the inputs and we can replace the second term on the left hand side with the rate of change of an input index based on the cost shares

$$\frac{w_n z_n}{\sum_n w_n z_n}$$

The middle and last terms in (3) are equivalent expressions for the rate of technical progress. In the last term the rate of technical progress is given as the increase in one output ($x_1$), holding all other outputs and inputs constant, made possible by the change in technology. Thus TFPG also measures the rate of technological progress. Technical progress increases welfare by relaxing the production constraint on the economy.
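As a rough numerical illustration of the left hand side of (3), the sketch below approximates the two Divisia indices in discrete time with a Törnqvist index (average value shares applied to log changes in quantities). The Törnqvist form is a standard discrete approximation chosen for this sketch, not a method taken from this report, and the function names and two-period figures are hypothetical.

```python
import math


def tornqvist_growth(prices0, qty0, prices1, qty1):
    """Log growth of a quantity index between two periods.

    Each quantity's log change is weighted by its value share,
    averaged over the two periods (Tornqvist approximation to a
    Divisia index).
    """
    value0 = [p * q for p, q in zip(prices0, qty0)]
    value1 = [p * q for p, q in zip(prices1, qty1)]
    total0, total1 = sum(value0), sum(value1)
    growth = 0.0
    for v0, v1, q0, q1 in zip(value0, value1, qty0, qty1):
        avg_share = 0.5 * (v0 / total0 + v1 / total1)
        growth += avg_share * math.log(q1 / q0)
    return growth


def tfp_growth(p0, x0, p1, x1, w0, z0, w1, z1):
    """Output index growth minus input index growth.

    Inputs are weighted by cost shares, as constant returns to scale allows.
    """
    output_growth = tornqvist_growth(p0, x0, p1, x1)
    input_growth = tornqvist_growth(w0, z0, w1, z1)
    return output_growth - input_growth


# Hypothetical data: both outputs grow by 10%, both inputs by 5%.
p0, x0 = [10.0, 20.0], [100.0, 50.0]
p1, x1 = [10.0, 20.0], [110.0, 55.0]
w0, z0 = [5.0, 8.0], [200.0, 125.0]
w1, z1 = [5.0, 8.0], [210.0, 131.25]

tfpg = tfp_growth(p0, x0, p1, x1, w0, z0, w1, z1)
print(f"TFPG (log points): {tfpg:.4f}")
```

With these figures every output grows at the same rate and every input grows at the same rate, so the weights do not matter and the result is simply ln(1.10) minus ln(1.05). With uneven growth rates the value and cost shares determine how much each output and input contributes.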
Under certain assumptions total factor productivity growth can be given a direct welfare interpretation. Thus suppose that the economy is characterised by the implicit production function $g(x, z, v) = 0$ and resources are allocated to maximise current period welfare $U(x, z)$ where $x$ and $z$ are vectors of outputs and inputs. The Lagrangean for the welfare problem is

$$L = U(x, z) + \lambda g(x, z, v) \tag{4}$$

and from the envelope theorem

$$dU/dv = dL/dv = \partial L/\partial v = \lambda g_v \tag{5}$$

Hence, if $U$ is derivable from an individualistic, non-paternal welfare function, the fact that the allocation in an economy with a complete set of competitive markets maximises some such welfare function means that TFPG is an increasing monotonic function of the change in welfare resulting from technological change.

The simple story above takes no account of changes in the stock of capital goods used to produce consumption goods. Since what is consumed no longer equals what is produced it is more complicated to give a welfare interpretation to changes in the output index, though it is possible to do so in some cases (Sefton and Weale, forthcoming).

2.2 Total factor productivity growth and quality change

A measure of TFPG in a market sector when there is quality change can be constructed in the following manner. Let the production function for a firm or sector which produces only one type ($j$) of output be

$$g_j(x_j, q_{1j}, \ldots, q_{Kj}, z_j, v_j) = 0 \tag{6}$$

Here $x_j$ is the volume or quantity of output $j$ (the number of units produced) and $q_{kj}$ is the amount of outcome or characteristic $k$ produced by consumption of one unit of output $j$. The vector $q_j$ determines the quality of the product. At the equilibrium of a market economy the price paid for a unit of output $j$ depends on the outcomes it produces, $p_j(q_j)$, and is also a measure of quality.
If the market for good $j$ is competitive a profit maximising firm’s choice of output, inputs, and outcomes will satisfy $p_j = -\theta\,\partial g_j/\partial x_j$, $x_j\,\partial p_j/\partial q_{kj} = -\theta\,\partial g_j/\partial q_{kj}$, and $w_n = \theta\,\partial g_j/\partial z_{jn}$. Totally differentiating the production function with respect to time gives

$$\frac{\partial g_j}{\partial x_j}\dot{x}_j + \sum_k \frac{\partial g_j}{\partial q_{kj}}\dot{q}_{kj} + \sum_n \frac{\partial g_j}{\partial z_{jn}}\dot{z}_{jn} + \frac{\partial g_j}{\partial v_j}\dot{v}_j = 0 \tag{7}$$

and after using the profit maximising conditions, assuming constant returns to scale to substitute total cost for the value of output in the weights on the inputs, and rearranging we get

$$\frac{\dot{x}_j}{x_j} + \sum_k \left(\frac{\partial p_j}{\partial q_{kj}}\frac{q_{kj}}{p_j}\right)\frac{\dot{q}_{kj}}{q_{kj}} - \sum_n \omega_n^z \frac{\dot{z}_{jn}}{z_{jn}} = \frac{\theta}{p_j x_j}\frac{\partial g_j}{\partial v_j}\dot{v}_j = \left(\frac{\partial x_j}{\partial v_j}\frac{v_j}{x_j}\right)\frac{\dot{v}_j}{v_j} \tag{8}$$

where

$$\omega_n^z = \frac{w_n z_{jn}}{\sum_n w_n z_{jn}}.$$

Thus if we do not take account of the change in quality (the middle term in the left hand side of (8)) and merely calculate the difference between the rate of growth of the output and input indices we will not be measuring the rate of technical progress (the second and last terms). Equivalently, if we define TFPG as the difference between the rates of growth of the value of output and the cost of inputs, we will typically underestimate TFPG if we do not allow for the changing value of outputs because of improvements in quality. Consequently we need to take account of the change in the mix of outcomes (characteristics) embodied in each unit of output.

Denoting the marginal effect of outcome or characteristic $k$ on the price of output $j$ as $\pi_{kj} \equiv \partial p_j/\partial q_{kj}$ we can write the rate of growth of the total value of output summed across all sectors ($y = \sum_j p_j x_j = \sum_j y_j$) as

$$\frac{\dot{y}}{y} = \sum_j \frac{p_j x_j}{y}\left[\sum_k \frac{\pi_{kj} q_{kj}}{p_j}\frac{\dot{q}_{kj}}{q_{kj}} + \frac{\dot{x}_j}{x_j}\right] = \sum_j \omega_j^y\left[\sum_k \omega_{kj}\frac{\dot{q}_{kj}}{q_{kj}} + \frac{\dot{x}_j}{x_j}\right] \tag{9}$$

where

$$\omega_{kj} = \frac{\pi_{kj} q_{kj}}{\sum_k \pi_{kj} q_{kj}}$$

In competitive equilibrium these prices represent social values as well as costs of production.
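To make the role of the quality term in (9) concrete, the following minimal sketch evaluates the right hand side of (9) with and without the characteristic growth terms. It assumes that the price of each output equals the sum of the marginal values times the characteristic levels (so that the weights on characteristics sum to one), and that growth rates are supplied directly as proportional rates. The function name, data layout and all figures are hypothetical.

```python
def output_value_growth(outputs, include_quality=True):
    """Growth rate of the total value of output, following equation (9).

    Each output is a dict with:
      "x"        : volume of the output
      "x_growth" : proportional growth rate of the volume
      "chars"    : list of (pi, q, q_growth) tuples, one per characteristic,
                   where pi is the marginal value of the characteristic,
                   q its level and q_growth its proportional growth rate.
    Assumption of this sketch: price p_j = sum_k pi_kj * q_kj.
    """
    terms = []
    total_value = 0.0
    for o in outputs:
        price = sum(pi * q for pi, q, _ in o["chars"])
        value = price * o["x"]
        # Share-weighted growth of characteristics (the quality term).
        quality = sum((pi * q / price) * gq for pi, q, gq in o["chars"])
        if not include_quality:
            quality = 0.0
        terms.append((value, quality + o["x_growth"]))
        total_value += value
    # Weight each output by its share in the total value of output.
    return sum(value / total_value * growth for value, growth in terms)


# Hypothetical two-output example.
outputs = [
    {"x": 100.0, "x_growth": 0.02,
     "chars": [(50.0, 0.8, 0.01), (10.0, 1.0, 0.00)]},
    {"x": 200.0, "x_growth": 0.03,
     "chars": [(20.0, 0.5, 0.04)]},
]

with_quality = output_value_growth(outputs)
volume_only = output_value_growth(outputs, include_quality=False)
print(f"quality adjusted: {with_quality:.4f}, volume only: {volume_only:.4f}")
```

In this made-up example the volume-only measure gives growth of roughly 2.3% while the quality adjusted measure gives 4%, which illustrates the direction of the understatement discussed above when quality is improving.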
Thus, in principle, the prices obtained in the competitive equilibrium enable us to calculate the rate of growth of the value of output and so derive the rate of technical progress via TFPG. We need to estimate the hedonic price functions $p_j(q_j)$ which relate prices to the quality of goods (Rosen, 2002). In practice there are considerable difficulties, even in market sectors, in allowing for quality changes.

The discussion shows that a measure of TFPG which relates only to the volume of outputs and ignores their outcome or quality characteristics is incomplete. It is equally important to capture any quality change in inputs. Thus if the NHS is employing more skilled labour, a measure that merely counts the number of workers, without taking account of differences in marginal productivities across skill types, will overestimate TFPG: the contribution from using better quality labour is incorrectly attributed to technical progress. Section 8 deals with quality adjusting labour input.

2.3 Application of TFPG methods in the NHS

The construction of an NHS productivity measure should capture the valuable things that the NHS produces. However, operationalising this simple idea is not straightforward because of the difficulties of defining NHS outputs, attaching values to the outputs, and obtaining the relevant data.

We distinguish activities (operative procedures, diagnostic tests, outpatient visits, consultations…), outputs (courses of treatment which may require a bundle of activities), and outcomes (the characteristics of output which affect utility). The focus in health economics has been on the change in health produced by a course of treatment, typically measured in quality adjusted life years (QALYs).
But other characteristics of treatment also affect utility: the length of time waited for treatment, the degree of uncertainty attached to the waiting time, distance and travel time to services, the interpersonal skills of GPs, the range of choice and quality of hospital food, the politeness of the practice receptionist, the degree to which patients feel involved in decisions about their treatment, etc. The aim is to measure the change in the volume of NHS outputs taking account of quality changes (changes in the volume of characteristics produced) but not of changes in the marginal social value of those characteristics. The distinction between outputs and outcomes is identical to that between goods and characteristics in consumption technology models (Deaton and Muellbauer, 1980, Ch. 10; Lancaster, 1971) where consumers value goods because of the bundle of characteristics that yield utility. The quality of the output is a function of the vector of outcomes it produces. In the measurement of private sector productivity growth the focus is on outputs rather than the characteristics they produce because of the assumption that the market price of the output measures the consumers’ marginal valuation of the bundle of characteristics from consuming the output. In measuring private sector productivity we also do not need to concern ourselves with counting activities because they are embodied in the outputs which are produced and sold. The direct application of the methods used to measure TFPG in the private sector is problematic in the NHS for two main reasons. First, there is no final market for NHS outputs which makes calculation of TFPG more difficult. Second, NHS production may not be optimal, which undermines the welfare interpretation of TFPG. One of the justifications for having the NHS in the first place is to eliminate a market in which patients buy outputs from producers. 
Even in the few cases where the NHS does sell its output to the final consumer, as for pharmaceuticals prescribed by general practitioners (GPs) and dispensed to patients who are not exempt from payment, the price does not equal marginal cost.

The absence of final markets has two major consequences for attempts to measure NHS productivity. The first is that some outputs are not counted at all or are poorly measured. Instead there may be data only on the activities and even these may be lacking in many areas of activity. We discuss the implications in section 4.

The second consequence is that, because there are no prices to reveal patients’ marginal valuations of NHS outputs, we have to find other means of estimating their value. We can do so in two equivalent ways: we can measure the outputs and attempt to estimate the marginal valuations attached to them or we can measure the outcomes produced by each unit of output and attempt to estimate marginal valuations of the outcomes.

The bundle of outcomes produced by a unit of output is likely to change over time in the NHS because of, among other things, changes in technology or treatment thresholds. In a private market the price of output would change to reflect this. But in the absence of market prices for NHS outputs it is likely to be easier to calculate the change in the marginal value of output by focusing on the change in the vector of outcomes. We show below how the changing mix of outcomes (quality change) may be allowed for in principle. We discuss how quality adjustments based on the currently available data can be incorporated into an output index in Section 4 and show the results of applying these methods to calculate experimental quality adjusted indices in Section 5. We have made suggestions as to how the quality adjustment can be improved by the collection of additional data in our Second Interim Report and discuss this further in Section 10.
The major problem in interpreting TFPG in the NHS is that it is by no means obvious that the NHS is producing optimally. It may be technically inefficient in the sense that it is possible to increase some type of output without increasing inputs or reducing some other output. It may also be producing the wrong mix of outputs. Figure 2.1 illustrates.

[Figure 2.1 Productivity, efficiency and welfare. Horizontal axis: input; vertical axis: output. The figure shows production frontiers P1 and P2, a social welfare indifference curve, and points A and B. Note to figure: A at year 1 has higher productivity than B at year 2 but lower welfare and is less efficient (further away from its period production frontier).]

Consider the simple single input, single output case in Figure 2.1. Point A in year 1 has higher productivity than point B in year 2 but welfare is lower at point A and, on any reasonable measure of technical efficiency, A has lower technical efficiency since it is further from its period production frontier. Technical progress has shifted the frontier upward from P1 to P2 but the productivity change does not even have the same sign as technical progress. The increase in welfare between period 1 and 2 is in part due to technical progress (B was not even feasible with the old technology) and in part to improvements in efficiency, perhaps because of changes in institutional structures and incentive mechanisms. Note also that both technologies in this example have diminishing returns to scale, so that increases in inputs along the frontier reduce productivity even though such a movement along the frontier can be welfare increasing.

These considerations suggest that there are problems in interpreting productivity growth as a welfare or efficiency measure. Nevertheless it can be a useful summary statistic to be used in conjunction with other data on the NHS. A further justification for attempting to measure productivity is that it will stimulate improvements in NHS information collection and processing which may lead to improved decision making within the NHS.
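The single input, single output case of Figure 2.1 can be mimicked numerically. The sketch below is purely illustrative: the square-root frontier, the scale parameters and the coordinates of A and B are assumptions chosen to reproduce the qualitative pattern described in the text, not values read from the figure. Welfare is not modelled.

```python
import math


def frontier(inp: float, scale: float) -> float:
    """Hypothetical frontier with diminishing returns: max output = scale * sqrt(input)."""
    return scale * math.sqrt(inp)


def productivity(output: float, inp: float) -> float:
    """Output per unit of input."""
    return output / inp


def technical_efficiency(output: float, inp: float, scale: float) -> float:
    """Observed output as a fraction of frontier output at the same input level."""
    return output / frontier(inp, scale)


# Assumed numbers for illustration only.
P1_SCALE, P2_SCALE = 1.0, 1.5      # technical progress shifts the frontier upward
A_INPUT, A_OUTPUT = 1.0, 0.80      # point A: year 1, frontier P1
B_INPUT, B_OUTPUT = 4.0, 2.90      # point B: year 2, frontier P2

prod_a = productivity(A_OUTPUT, A_INPUT)
prod_b = productivity(B_OUTPUT, B_INPUT)
eff_a = technical_efficiency(A_OUTPUT, A_INPUT, P1_SCALE)
eff_b = technical_efficiency(B_OUTPUT, B_INPUT, P2_SCALE)
# Could B have been produced with the year 1 technology?
b_feasible_in_year_1 = B_OUTPUT <= frontier(B_INPUT, P1_SCALE)

print(f"productivity  A={prod_a:.3f}  B={prod_b:.3f}")
print(f"efficiency    A={eff_a:.3f}  B={eff_b:.3f}")
print(f"B feasible with year 1 technology: {b_feasible_in_year_1}")
```

With these assumed numbers measured productivity falls between A and B even though the frontier has shifted up and B lies closer to its own frontier, which is the sense in which productivity change alone can mislead.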
2.4 Value weighted NHS output index

To measure NHS TFPG we need a measure of output growth which reflects the changes in quality. Let $y_{jt}$ be the social value of the volume of NHS output j ($x_{jt}$) measured at the marginal social value of output j at date t ($p_{jt}$). The marginal social value of output j depends on the mix of characteristics produced by a unit of output j:

\[ y_{jt} = x_{jt}\, p_{jt} = x_{jt}\left(\sum_k \pi_{kt}\, q_{kjt}\right) \tag{10} \]

where $q_{kjt}$ is the amount of characteristic k produced by a unit of j. Notice that we assume that the marginal social value function is linear:

\[ p_{jt} = \sum_k \pi_{kt}\, q_{kjt} \tag{11} \]

The assumptions that the marginal social value of a unit of output j is a linear function of its characteristics and that $\pi_{kt}$ is independent of j are strong. The latter, for example, requires that an improvement in the quality of hospital food (say) per day in hospital has the same effect on the value of treatment for throat cancer as on the value of a hip replacement.

The total value of NHS output is $y_t = \sum_j y_{jt}$. We want to measure the discrete time version of the growth rate of y. We could use the Törnqvist discrete time approximation to the continuous Divisia index (9) but for simplicity present the analysis in terms of a base weighted index.1 In practice there is little difference between a chained base weighted index and the Törnqvist index.

(Footnote 1: We compare the results obtained from calculating base weighted indices with those from current weighted indices and Fisher indices (the square root of the product of the current and base weighted indices).)

The base value weighted output index that we seek to measure is

\[ I^{xq}_{yt} = \frac{\sum_j x_{jt+1} \sum_k \pi_{kt}\, q_{kjt+1}}{\sum_j x_{jt} \sum_k \pi_{kt}\, q_{kjt}} \tag{12} \]

which allows for changes in the volume of outputs ($x_{jt}$) and of their characteristics ($q_{kjt}$) but holds the marginal value of the characteristics ($\pi_{kt}$) constant. We can also express the value weighted index as
\[ I^{xq}_{yt} = \sum_j \left(\frac{x_{jt+1}}{x_{jt}}\right)\left(\frac{\sum_k \pi_{kt}\, q_{kjt+1}}{\sum_k \pi_{kt}\, q_{kjt}}\right)\left(\frac{p_{jt}\, x_{jt}}{y_t}\right) = \sum_j (1 + g_{xjt})\left[\sum_k (1 + g_{qkjt})\,\omega^{kt}_{pjt}\right]\omega^{jt}_{yt} \tag{13} \]

where $g_{xjt}$ is the growth rate of output $x_j$, $g_{qkjt}$ is the growth rate of characteristic k produced by output j, $\omega^{kt}_{pjt}$ is the proportion of the marginal value of output j accounted for by the k’th characteristic, and $\omega^{jt}_{yt}$ is the share of the total value of period t output accounted for by output j.

Note that year to year changes in the marginal value of characteristics ($\pi_{kt}$) produced by an output j do not affect the year to year rate of growth of output j ($g_{xjt}$). They will however affect the weights for any index form with chained weights. Thus the overall, weighted average, rates of growth over a period of years will depend on the changes in the marginal values. This is precisely analogous to the effect of changing product prices in output indices for private sector goods and services.

2.5 Changes in marginal social values over time

In section 2.4 we specified the value of NHS output as $y = \sum_j p_j x_j = \sum_j \sum_k \pi_k\, q_{jk}\, x_j$. In the rate of growth of the value of NHS output we assumed that the marginal social values of output ($p_j$) or of outcomes ($\pi_k$) were constant over time. If instead we had allowed changes in marginal values over time then the rate of growth of the value of NHS output would have been

\[ \sum_j (1 + g_{xjt})\left[\sum_k (1 + g_{\pi kt})(1 + g_{qkjt})\,\omega^{kt}_{pjt}\right]\omega^{jt}_{yt} - 1 \tag{14} \]

where $g_{\pi kt}$ is the growth rate in the marginal value $\pi_{kt}$ of characteristic k. Expression (14) depends not only on changes in production conditions (the rates of growth of outcomes per unit of output and the rates of growth of outputs) but also on preferences (the rates of growth of the marginal social values of outcomes). We argue that changes in the marginal social values of outcomes between periods should not affect the growth rate between periods of the value of NHS output.
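The relationship between the ratio form (12), the growth-rate form (13) and the value growth expression (14) can be checked numerically. The following is a minimal sketch under stated assumptions: two outputs, two characteristics, and all quantities, characteristic levels and marginal values are invented for illustration. The function names are hypothetical, and every base period characteristic level is assumed to be non-zero so that growth rates are defined.

```python
def unit_values(q, pi):
    """Equation (11): p_j = sum_k pi_k * q_kj for each output j."""
    return [sum(pi_k * q_jk for pi_k, q_jk in zip(pi, q_j)) for q_j in q]


def index_ratio_form(x0, x1, q0, q1, pi0):
    """Equation (12): base value weighted output index."""
    p0, p1_at_base_values = unit_values(q0, pi0), unit_values(q1, pi0)
    numerator = sum(x * p for x, p in zip(x1, p1_at_base_values))
    denominator = sum(x * p for x, p in zip(x0, p0))
    return numerator / denominator


def growth_form(x0, x1, q0, q1, pi0, pi1):
    """Weighted sum in (13) and (14); setting pi1 == pi0 gives (13)."""
    p0 = unit_values(q0, pi0)
    y0 = sum(p * x for p, x in zip(p0, x0))
    total = 0.0
    for j in range(len(x0)):
        share_of_output_value = p0[j] * x0[j] / y0
        inner = sum(
            (pi1[k] / pi0[k]) * (q1[j][k] / q0[j][k]) * (pi0[k] * q0[j][k] / p0[j])
            for k in range(len(pi0))
        )
        total += (x1[j] / x0[j]) * inner * share_of_output_value
    return total


# Assumed data: rows are outputs j, columns are characteristics k.
x0, x1 = [100.0, 50.0], [105.0, 52.0]
q0 = [[1.00, 0.5], [2.0, 0.8]]
q1 = [[1.02, 0.5], [2.1, 0.8]]
pi0, pi1 = [10.0, 4.0], [10.5, 4.0]

index_12 = index_ratio_form(x0, x1, q0, q1, pi0)
index_13 = growth_form(x0, x1, q0, q1, pi0, pi0)
growth_14 = growth_form(x0, x1, q0, q1, pi0, pi1) - 1.0

print(f"(12) = {index_12:.6f}, (13) = {index_13:.6f}")
print(f"real growth = {index_12 - 1:.6f}, value growth (14) = {growth_14:.6f}")
```

In this sketch the first marginal value rises between periods, so (14) exceeds the growth implied by (12); holding the marginal values at their base period levels removes that component.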
The measure of output growth is intended to measure changes in the real value of output: the volume of outputs and the volumes of valuable characteristics they produce. Thus $g_{\pi kt}$ should not be included in the value weighted index of NHS output.

For example, under plausible assumptions the growth in the value of a QALY is determined by the rate of growth of income and the elasticity of marginal utility of income (Gravelle and Smith, 2001). But it is not affected by decisions within the NHS (except perhaps to a negligible extent because NHS decisions affect population health and thus the growth rate in income by improving worker productivity across the economy). We should not count changes in the marginal value of the QALY when calculating real NHS output growth.

This does not mean that changes in the value of QALYs and other outcomes have no relevance for decision making. Most decisions in the NHS have effects on outputs and outcomes over several periods – the health gain to a treated patient will typically accrue over several years. In evaluating these decisions the changing value of health should be taken into account: health changes accruing in different periods have different values. Changes in the value of health, and other characteristics, should affect decisions about the allocation of resources within and to the NHS. But they should not affect the calculation of changes in productivity between one period and the next, especially if the measure of productivity is intended to be used in part for monitoring the performance of the NHS.

Whilst we may want to exclude the growth in the marginal value of outcomes as contributing to TFPG we have to know whether and how the marginal values change over time in order to use the correct weights in calculating productivity growth. Note that

\[ \frac{\dot{p}_j}{p_j} = \sum_k \frac{\pi_k\, q_{jk}}{p_j}\left(\frac{\dot{\pi}_k}{\pi_k} + \frac{\dot{q}_{jk}}{q_{jk}}\right) \tag{15} \]

which again brings out the importance of the distinction between outcomes and outputs.
Even though we argue that the rate of growth of marginal social values should not be counted as part of productivity growth, this does not mean that we should remove all of the growth in the marginal social value of outputs, since part of $\dot{p}_j/p_j$ is due to changes in quality rather than to changing preferences.

2.6 Outcomes and attribution

The characteristics $q_{kjt}$ are the marginal effects of the NHS. Thus, for example, we wish to measure the marginal effect of output $x_{jt}$ on the health of individuals receiving this treatment, holding constant all other factors which affect health. Similarly the rate of growth of the effect of output j on characteristic k is the change in the marginal product from one period to the next due to changes in the technology (defined widely to include the way the NHS is organised). Since the other factors which affect the marginal product will also affect its rate of growth we should hold them constant in calculating the growth in the marginal product of NHS outputs from one year to the next. Although the effects of other factors (e.g. improvements in diet) on the marginal product should be excluded from the calculation of the growth rate for a particular year, they are not ignored because they affect the weights applied to the growth rates.

Parts of the national income accounting literature note that health depends on factors in addition to health service outputs (OECD, 2000; para 7.26 – 7.28). For example health depends on income, education, age and other factors exogenous to NHS activity. Hence, it is argued, one cannot use health outcomes to adjust outputs to take account of “quality” changes because changes in health outcome may not be attributable to health service outputs. But what we want is the marginal effect of output j on health.
If the health production function is additively separable in health service outputs and other factors, then the marginal effect of a health service output is well defined irrespective of the level of other variables affecting health. It is more plausible that the health production function is not additively separable, so that the marginal effect of $x_j$ on health q depends on the confounding factors. This does not present a fundamental argument against the use of outcomes.

The longstanding practice of standardising mortality rates to produce a measure of population health suggests a way round the difficulty. Standardisation produces a measure of population health from which the effects of population structure (age and gender strata) have been removed so that one can make comparisons of mortality across periods or areas without the confounding effects of demographic structure. Under certain circumstances direct standardisation can identify the true differences in mortality. The assumptions required are non trivial (age and gender specific mortality can be affected only proportionately by area or period (e.g. Yule, 1934)) but direct standardisation is still useful. (The more common method of indirect standardisation which produces SMRs requires even stronger assumptions.)

Consider a simple example where health depends on a single NHS output $x_1$ and another variable not controlled by the NHS, for example education or income, $x_2$. The health production function is $Q_t = a_{0t} + a_{1t} x_{1t} + a_{2t} x_{2t} + a_{3t} x_{1t} x_{2t}$ and the marginal product of health service output is

\[ q_{1t} = \partial Q_t/\partial x_{1t} = a_{1t} + a_{3t}\, x_{2t} \tag{16} \]

Generally we expect the effect of health service activity on health to depend on other factors ($a_{3t} \neq 0$).
Hence the growth in the marginal health effect of NHS output, which is crucial for quality adjusting NHS output indices, is affected by changes in the confounding factor:

\[ \frac{q_{t+1}}{q_t} = \frac{a_{1t+1} + a_{3t+1}\, x_{2t+1}}{a_{1t} + a_{3t}\, x_{2t}} \tag{17} \]

To remove the effect of the confounding factor we can choose an arbitrary fixed level of the confounding factor in (17). If we think that the changes in the coefficient $a_{3t}$ are not due to health service decisions then we should standardise with respect to it as well:

\[ \frac{q_{t+1}}{q_t} = \frac{a_{1t+1} + a_3\, x_2}{a_{1t} + a_3\, x_2} \tag{18} \]

Obvious choices for the fixed levels $a_3$ and $x_2$ are their base period values or an average of the base period and current period values.

The health gains from treatment may increase simply because patients live longer. Consider the example of an increase in life expectancy that is not due to developments in the NHS but reflects rising living standards, changes in diet etc. As a result an NHS treatment, such as a hip replacement, may produce a greater outcome (QALY gain) because the recipient of a hip replacement is on average alive for longer to enjoy the reduced pain and increased mobility resulting from the procedure. Thus the marginal product (the QALY gain) of the treatment is greater for reasons arising outside the health service. As far as possible, effects of changes to life expectancy which are quite independent of the procedures carried out should be kept out of the calculation of the year on year growth rate in quality adjusted output of hip replacements. They should, of course, be allowed for in decisions about efficient resource allocation in the NHS but this is not the purpose of constructing an index of NHS output.

There will be some cases where the QALY gain may be partly due to improvements in the procedure and partly due to patients being “better behaved”, e.g. circulatory treatments produce more QALYs if patients do not smoke.
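The effect of the standardisation in (17) and (18) can be seen in a small numerical sketch. All coefficient values and confounder levels below are assumptions made up for illustration; the fixed levels in (18) are set to base period values, which is one of the choices mentioned above.

```python
def marginal_product(a1: float, a3: float, x2: float) -> float:
    """Equation (16): marginal health effect of NHS output, q = a1 + a3 * x2."""
    return a1 + a3 * x2


# Assumed values: the NHS-related coefficient a1 improves slightly, while the
# confounding factor x2 (e.g. income or education) rises for outside reasons.
a1_t, a1_t1 = 0.50, 0.52
a3_t, a3_t1 = 0.10, 0.10
x2_t, x2_t1 = 2.0, 2.5

# Equation (17): unstandardised ratio, contaminated by the change in x2.
raw_ratio = marginal_product(a1_t1, a3_t1, x2_t1) / marginal_product(a1_t, a3_t, x2_t)

# Equation (18): a3 and x2 held fixed at their base period values in both periods.
standardised_ratio = marginal_product(a1_t1, a3_t, x2_t) / marginal_product(a1_t, a3_t, x2_t)

print(f"unstandardised growth in marginal product: {raw_ratio - 1:.4f}")
print(f"standardised growth in marginal product:   {standardised_ratio - 1:.4f}")
```

In this example most of the apparent growth in the marginal product comes from the confounding factor, and only the standardised ratio isolates the part attributable to the change in the NHS-related coefficient.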
In terms of (16) the production function is not separable and judgement will be needed about how to unravel the impacts of factors exogenous to the NHS.

2.7 Cost and value weights

By using costs to value outputs, a cost weighted output index can be calculated as a cost weighted sum of the growth rates of output:

\[ I^{x}_{ct} = \frac{\sum_j x_{jt+1}\, c_{jt}}{\sum_j x_{jt}\, c_{jt}} = \sum_j \left(\frac{x_{jt+1}}{x_{jt}}\right)\frac{x_{jt}\, c_{jt}}{\sum_k x_{kt}\, c_{kt}} = \sum_j (1 + g_{xjt})\,\omega^{jt}_{ct} \tag{19} \]

where $c_{jt}$ is the unit (average) cost of output j (see section 3). The cost weighted index $I^{x}_{ct}$ is equivalent to the value weighted quality adjusted index $I^{xq}_{yt}$ only if (a) quality change is zero for all characteristics of all outputs and (b) $c_{jt}$ is proportional to the marginal social value of output (Dawson et al., 2004a, section 2.11):

\[ c_{jt} = \lambda_t\, p_{jt} = \lambda_t \sum_k \pi_{kt}\, q_{kjt} \tag{20} \]

There is limited information on both characteristics and their marginal social value so that attempts to estimate $I^{xq}_{yt}$ are bound to involve compromises. Suppose that we could observe the changes in characteristics but not their marginal values ($\pi_{kt}$). How far does the assumption (20) that the output mix in the NHS maximises social value subject to the budget constraint take us in estimating $I^{xq}_{yt}$? Using (20) in (13) gives

\[ I^{xq}_{yt} = \sum_j (1 + g_{xjt})\left[\sum_k (1 + g_{qkjt})\frac{\pi_{kt}\, q_{kjt}}{c_{jt}}\right]\lambda_t\,\omega^{jt}_{ct} = \sum_j (1 + g_{xjt})\left[\sum_k (1 + g_{qkjt})\,\omega^{kt}_{pjt}\right]\omega^{jt}_{ct} \tag{21} \]

Knowledge of $c_{jt}$ and $\lambda_t$ is not sufficient to calculate the value weighted output index. We require knowledge of the relative importance of each characteristic, $\omega^{kt}_{pjt}$, in determining the marginal social value of output: we need to know the marginal social values of each characteristic ($\pi_{kt}$), and the amount of each characteristic produced by each output ($q_{kjt}$).
However if only one characteristic k is socially valuable then assumption (20) and knowledge of unit costs and the growth rate of the single valuable characteristic (k) is sufficient:

\[ I^{xq}_{yt} = \sum_j (1 + g_{xjt})(1 + g_{qkjt})\,\omega^{jt}_{ct} \tag{22} \]

In practice there is also imperfect information about the amount of the characteristics ($q_{kjt}$). Section 4 discusses the possibility of using the currently available data to quality adjust the cost weighted output index.

3 Current practice

The terminology employed by the Department of Health differs in some respects from that used in the economics literature. In our outline of current practice we use the Department of Health terminology but attempt to relate it to the economic concepts set out in Section 1.1 and used in this report.

3.1 The cost weighted activity index (CWAI)

Prior to 2004 the measure of annual NHS productivity change published by the Department of Health was based on estimating the change in a cost weighted activity index (CWAI) less the change in NHS expenditure deflated by the index of NHS costs and prices, to generate a cost weighted efficiency index (CWEI). CWAI was estimated using data on activity for twelve categories of Hospital and Community Health Service (HCHS) expenditure:

• Inpatient and day case episodes
• Outpatient, A&E and ward attenders
• Regular day patients
• Chiropody
• Family planning
• Screening
• District nursing
• Community psychiatric nursing
• Community learning disability nursing
• Dental episodes of care
• Ambulances

Each category of activity was weighted by its share in HCHS expenditure. There was no adjustment for improved health outcomes so that the only source of productivity improvement was an increase in the number of patients treated in hospital, ambulance trips, etc. per pound of real expenditure.
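A cost weighted index of the kind described here, in the form of equation (19), and its single-characteristic quality adjustment in equation (22) can be sketched as follows. The activity volumes, unit costs and quality growth rates are invented for illustration, and the function names are hypothetical; this is not the Department of Health's calculation.

```python
def cost_weighted_index(x0, x1, c0):
    """Equation (19): next period volumes over base volumes, both at base unit costs."""
    return sum(x * c for x, c in zip(x1, c0)) / sum(x * c for x, c in zip(x0, c0))


def quality_adjusted_cost_weighted_index(x0, x1, c0, quality_growth):
    """Equation (22): volume growth times growth of the single valued characteristic,
    weighted by base period cost shares."""
    total_cost = sum(x * c for x, c in zip(x0, c0))
    return sum(
        (x1_j / x0_j) * (1.0 + g_q) * (x0_j * c_j / total_cost)
        for x0_j, x1_j, c_j, g_q in zip(x0, x1, c0, quality_growth)
    )


# Assumed data for two activity categories.
x0 = [1000.0, 200.0]        # base period volumes
x1 = [1050.0, 190.0]        # next period volumes
c0 = [50.0, 400.0]          # base period unit costs
g_q = [0.00, 0.02]          # assumed growth in the single valued characteristic

unadjusted = cost_weighted_index(x0, x1, c0)
adjusted = quality_adjusted_cost_weighted_index(x0, x1, c0, g_q)
no_quality_change = quality_adjusted_cost_weighted_index(x0, x1, c0, [0.0, 0.0])

print(f"cost weighted index (19):          {unadjusted:.5f}")
print(f"quality adjusted index (22):       {adjusted:.5f}")
print(f"(22) with zero quality change:     {no_quality_change:.5f}")
```

When quality growth is zero for every category the two indices coincide, which corresponds to condition (a) in section 2.7.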
3.2 The ‘experimental’ NHS cost efficiency and service effectiveness indices

In 2004 the DH replaced the CWEI and developed two new ‘interim’ indices: an NHS cost efficiency index and a service effectiveness index. The approach was dictated by the need to respond to the Treasury’s view that ‘Value-for-money’ should be measured in ways that permitted assessment of performance against a target of 1% p.a. improvement in cost efficiency and 1% p.a. improvement in service effectiveness. The latter was generally understood to refer to return on expenditure to improve quality.

3.2.1 The experimental cost efficiency index

The experimental cost efficiency index incorporates a change to the measurement of outputs and a change to the measurement of inputs. The change to the measurement of outputs involved replacing CWAI with an Output Index, which includes significantly more activities than CWAI and uses Reference Costs to weight different activities. The Output Index now counts over 1,700 categories of NHS activity and includes activity in primary care.
The services covered are:

• Elective inpatients (over 500 activity categories)
• Non-elective inpatients (over 500 activity categories)
• Outpatients (around 300 activity categories)
• A&E (9 activity categories)
• Mental health services (30 activity categories)
• Primary care prescribing (almost 200 activity categories)
• Primary care consultations (5 activity categories)
• NHS Direct calls answered (1 activity category)
• NHS Direct online internet hits (1 activity category)
• Walk in centre visits (1 activity category)
• Ambulance journeys (1 activity category)
• General Ophthalmic Services (1 activity category)
• General Dental Services (1 activity category)
• Others including Critical care, Audiological Services, Pathology, Radiology, Chemotherapy, Renal dialysis, Community services, Bone marrow transplants & Rehabilitation (over 100 activity categories)

The coverage is not complete (Lee, 2004) and some of the omitted activities, such as the Prison Health Service, are not small, though others (Parentcraft Classes) seem unlikely to have a large impact on the index. But the extension of coverage is a very significant improvement.

Use of Reference Costs to weight Hospital and Community Health Services (HCHS) activity means that increases in more expensive treatments will have greater weight in the Output Index than increases in relatively low cost treatments. This is also true of primary care prescribing, which is measured as prescriptions issued and weighted by the cost of drugs prescribed. An increase in prescribing more expensive pharmaceuticals will have a greater effect on the Output Index than increased prescribing of less expensive drugs.

The Output Index is currently used by ONS to measure NHS output in the National Accounts. Table 3.1 shows the relative weights for each main type of activity in the Output Index.
Table 3.1 Components of the NHS output index

                                      Cost share      Growth in 2002/03
                                      2001/02 (%)     relative to 2001/02 (%)
Electives + day cases                    12.84              5.10
Non-electives                            20.64              4.92
Outpatients                              10.99              4.19
Mental Health                             9.56              3.62
GP & practice nurse consultations        12.44             10.27
Dentists                                  4.69             -0.61
Prescriptions                            16.48              7.85
Accidents & Emergency                     2.17              4.52
CCS                                       3.97             -0.28
Other                                     6.20              5.94
Total DH Output Index (Laspeyres)       100                 5.36

In comparison with the previous CWEI, the experimental cost efficiency index includes a revised index of inputs in addition to the revised measurement of outputs. Since there are no measures of quality associated with the activities included in the Output Index, the DH has attempted to estimate expenditure on inputs net of expenditure intended to improve quality. Total expenditure on inputs is reduced by estimated expenditure on:

• Increases in capital charges
• Increases in Private Finance Initiative revenue expenditure
• Increases in HCHS drugs expenditure
• Increases in Information Technology expenditure
• Increases in clinical supplies expenditure
• Increases in Family Health Services drugs expenditure
• Cost of occupational enrichment
• Cost of grade enrichment
• Cost of reduced waiting times

The remaining expenditure on NHS services is deflated by the public sector price deflator to obtain an index of changes in real NHS inputs. The resulting productivity measure has been published as an index of NHS unit costs (Department of Health, 2004a, 2004b).

3.2.2 The service effectiveness growth measure

Given the absence of data on quality improvement for all the activities included in the DH’s new Output Index, and the need to quantify quality change for the Treasury, the DH has identified some areas where it believes that aspects of quality change can be measured and valued in monetary terms.
Under consideration are:

• Reduced waiting times (outpatient, A&E, inpatient treatment)
• Reduced mortality rates for specific conditions (CHD and cancer)
• Improved patient experience

Discussion of how to value these quality improvements is still under way but possibilities include:

• Incorporating changes in mortality rates and estimates of the number of ‘lives saved’. Given the age and gender of lives saved, an estimate could be made of the Quality Adjusted Life Years (QALYs) produced and valued at £30,000 per QALY. An alternative is the £1m per road death avoided used by the Department of Transport.
• Placing a value on reduced waiting times and patient experience using data from discrete choice experiments.

(Source: personal communication.)

3.3 Pharmaceuticals and prescribing

Prescriptions issued in primary care are counted as activities and therefore as outputs in the DH’s new Output Index. The cost weight on this activity is total expenditure on the drugs prescribed. By contrast, in the hospital sector drugs are treated as inputs, not outputs. For hospital based activity, drugs prescribed only enter the output index as an element in the cost weight (Reference Cost) attached to an activity such as a bypass operation or dialysis: they are not counted as an activity.

The impact of the current treatment of prescribing in primary care as an output weighted by the cost of the drugs prescribed can be seen in Table 3.2. It is the movement toward prescribing more expensive drugs that contributes most to the growth in output.

Table 3.2 GP prescribing in the NHS output index (annual growth rates)

                                              2002/03   2001/02   2000/01
Number of prescriptions                          5.44      5.41      5.01
Cost weighted prescriptions                      7.85      7.52      6.28
Impact on overall index
DH output index                                  5.24      4.22      1.82
DH output index excluding prescriptions          4.74      3.53      0.66

When the new NHS output index is used in estimates of productivity growth, prescription drugs are also counted as an input.
ONS in their measure of productivity change present two variants for family health services drugs (net of receipts from prescription charges) employing deflators based on the average unit cost of all items and a Paasche price index for existing items. The latter is an attempt to adjust the deflator for the changing quality of drugs. These two variants lead to quite big differences, amounting to about half a percentage point per annum from 1995 to 2003 in real input growth (Lee, 2004, Hemingway, 2004). ONS rely on the Prescription Pricing Authority (PPA) and plan to consider the division by item in more detail in future revisions.

These problems would disappear in our “preferred” value weighted index of NHS output (12). Patients treated would be the unit of output weighted by health gain. Pharmaceuticals would be counted as an input. If prescribing more expensive drugs turned out to be cost effective in improving health outcomes, this would appear as a productivity increase. There is no doubt that GPs add value through the activity of prescribing; otherwise all licensed drugs would be available over the counter. If this value added is not reflected in the assumption that the wage rate approximates the marginal product of GPs, a measure of this value added would be the correct weight for the activity of prescribing in the short-term cost weighted activity index.

4 Quality adjustment with available data

In this section we discuss how we can use available data to calculate a quality adjusted index of NHS output which corresponds as closely as possible to the ideal value weighted output index

\[ I^{xq}_{yt} = \frac{\sum_j x_{jt+1} \sum_k \pi_{kt}\, q_{kjt+1}}{\sum_j x_{jt} \sum_k \pi_{kt}\, q_{kjt}} \tag{12} \]

Calculation of (12) requires information on the outputs ($x_{jt}$), the outcomes ($q_{kjt}$), and the marginal social values of the outcomes ($\pi_{kt}$). Previous NHS output indices have been derived from information on outputs and have implicitly assumed that unit costs measure marginal social values.
As we noted in section 2.7, even if this assumption is correct the resulting cost weighted index is not equivalent to the value weighted index unless there is no change in quality. With current information any outcome index will have to rely heavily on the assumption that unit costs measure marginal social value. Thus the main focus of the section is the extent to which it is possible to use additional existing data to calculate a quality adjusted cost weighted index.

Sections 4.1 to 4.4 examine issues in the measurement of outputs (x), section 4.5 discusses sources of information on marginal social values (π), and sections 4.6 to 4.13 consider how existing data on long and short term survival, readmissions, MRSA, waiting times and patient satisfaction can be used to proxy changes in outcomes as quality adjustments (q). The annex provides a flow chart showing the relationship of the various indices estimated in the report.

4.1 Activities or outputs as the unit of analysis

International guidance on the measurement of government output for national accounting purposes recommends distinguishing activities, outputs and outcomes. In the health service, activities would include operative procedures, diagnostic tests, outpatient visits, and consultations; outputs might comprise courses of treatment which may require a bundle of activities; and outcomes would be defined as the characteristics of output which affect utility.

4.1.1 Activities: institutional approach

NHS productivity measures have been based upon estimates of the number of particular types of activities (procedures, consultations etc) or the number of patients treated in various institutional settings (see section 3). There are advantages to continuing within this framework. In instances where care for a patient with a particular condition is provided entirely within one setting, aggregation within the setting is equivalent to aggregation by patient pathway or disease group.
It ensures compatibility with current NHS reporting systems and is likely to prove amenable to analysis at a disaggregated level. It can be a useful means for monitoring and managing lower level units within the NHS. Further, the approach would ensure consistency with other policy initiatives, most notably the Payment by Results reforms (Department of Health, 2002a).

The major disadvantage is that most patient cases pass through more than one institutional setting and their care requires several activities. For example, a patient who has a hip replacement will typically have been seen in general practice and in an outpatient department, treated as an inpatient in hospital, and given aftercare by her general practitioner and by personal social services. Such care patterns can lead to double counting and make it difficult to value the output of separate sectors that contribute to joint production across sectors. Current routine administrative data systems cannot track patients and their resource use as they move along care pathways across settings. Even within institutional settings data may not be appropriately linked. For example, whilst there are very detailed data on types and quantities of different drugs dispensed to the patients of individual general practitioners, they are not linked to the individual patient or even to diagnostic group, so it is not possible to say who got what prescriptions or for what condition.

4.1.2 Outputs: patient-centred or disease-based approach

The bulk of NHS activities or services are delivered to individual patients with the aim of improving their health. But a disease or patient pathway approach has demanding data requirements. The approach is being investigated by US researchers (Berndt et al., 2002; Berndt, Busch and Frank, 2001; Cutler and Huckman, 2003; Shapiro, Shapiro and Wilcox, 2001) and, in the UK, by the Office for National Statistics.
It is probably the best way forward in the long run but is not fully implementable with the types of data available in the NHS in the short to medium term. One key element required is linkage of patient records across activities, and this improvement in the data is planned by the DH. Another requirement is the use of clinical teams to identify procedures and tests relevant to specific conditions, with provision to update coding for procedures along clinical pathways as technology changes.

The relative advantages of the patient/disease group and institutional setting approaches depend on the degree of coverage, ease and timeliness of data collection; the dangers of double counting (for instance, where patients suffer multiple health problems); the ability to link to data on outcomes or prices; and the usefulness of the disaggregated measures (for instance, in changing behaviour). For the short to medium term the lack of linked routine data means that the measurement of NHS output will be based predominantly on the measurement of activities rather than patients.

4.2 Unit of hospital output

The main source of data on hospital output (excluding outpatient activity) is the Hospital Episode Statistics (HES), which is derived from the cleaned returns submitted by hospital trusts (footnote 2). There are four possible measures of hospital activity.

• Consultant episodes. The basic unit in HES is the consultant episode. Each observation records the treatments provided to a patient whilst they are under the care of a particular consultant. HES contains episodes which are unfinished at the start and end of each HES year.

• Finished consultant episodes (FCEs). A count of episodes means that an episode which spans two HES years would be counted in each year. FCEs are episodes which have finished by the end of the HES year, though they may have begun before the start of the HES year.
The DH’s new Output Index uses finished consultant episodes (FCEs) since unit costs are derived from the Reference Costs data and these are defined for FCEs.

• Provider spells (PS). Around 8% of patients have more than one FCE during a spell in a hospital. It is possible to link episodes in the same spell to count provider spells.

• Continuous inpatient spells (CIPS). Some patients (around 1%) are transferred to another provider at the end of an episode and it is possible to link episodes across providers to yield continuous inpatient spells.

The amount of HES activity by year for FCEs and CIPS is shown in Table 4.1, where, as they should, total FCEs exceed total CIPS. Both HES volume measures are also always larger than the volumes reported in the Reference Cost returns. The growth in activity (measured as the total numbers of Reference Cost hospital activities, and by HES based FCEs and CIPS) varies according to the measure employed, with all three showing a larger increase in activity between 2002/03 and 2003/04 than in the preceding year. Appendix B describes our use of HES in more detail, including the construction of unit costs for spells.

Footnote 2: From 2003/4 HES data include outpatient attendances and Accident and Emergency department activity, but these had not been included in released databases at the time of producing this report.

CIPS more nearly correspond to the patient journey. CIPS capture most comprehensively the full package of inpatient care and they are less vulnerable to being miscounted if transfers among providers vary over time or if there are changes in how “being under the care of a consultant” is defined. We have therefore calculated most of our indices using CIPS, though we also report comparisons of CIPS and FCE based indices (section 5). We recommend that future measures of hospital sector output use CIPS as the unit of output.
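The relationship between the linked counts described above can be illustrated with a minimal sketch. It assumes each episode record already carries a provider spell key and a continuous inpatient spell key; the field names (`patient_id`, `provider`, `spell_id`, `cips_id`, `finished`) are hypothetical and are not HES field names. The linkage step itself (matching dates across episodes and providers) is not shown, and counting spells from finished episodes only is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    patient_id: str   # hypothetical pseudonymised patient key
    provider: str     # hypothetical provider (trust) code
    spell_id: int     # hypothetical key: one stay within one provider
    cips_id: int      # hypothetical key: stays chained together by transfers
    finished: bool    # True if the episode ended within the HES year


def count_activity(episodes):
    """Count FCEs, provider spells (PS) and continuous inpatient spells (CIPS)."""
    fces = [e for e in episodes if e.finished]
    provider_spells = {(e.patient_id, e.provider, e.spell_id) for e in fces}
    cips = {(e.patient_id, e.cips_id) for e in fces}
    return {"FCE": len(fces), "PS": len(provider_spells), "CIPS": len(cips)}


# Illustrative records only, not real data.
episodes = [
    Episode("A", "X", 1, 1, True),   # patient A, first consultant at provider X
    Episode("A", "X", 1, 1, True),   # same spell, second consultant
    Episode("A", "Y", 2, 1, True),   # transferred to provider Y: same CIPS
    Episode("B", "X", 3, 2, True),   # patient B, single episode
    Episode("C", "X", 4, 3, False),  # unfinished at year end: not an FCE
]
counts = count_activity(episodes)  # {"FCE": 4, "PS": 3, "CIPS": 2}
```

By construction the FCE count is at least as large as the provider spell count, which in turn is at least as large as the CIPS count.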
Table 4.1 Number of episodes, CIP spells from HES and number of episodes from Reference Costs

HES data: episodes
                   1998/99     1999/00     2000/01     2001/02     2002/03     2003/04
Electives          5491046     5577523     5573942     5485256     5664968     5815929
Non-electives      6486405     6613290     6692208     6802085     7025377     7516611
Total             11977451    12190813    12266150    12287341    12690345    13332540

Growth         98/99-99/00 99/00-00/01 00/01-01/02 01/02-02/03 02/03-03/04
Electives            1.57%      -0.06%      -1.59%       3.28%       2.66%
Non-electives        1.96%       1.19%       1.64%       3.28%       6.99%
Total                1.78%       0.62%       0.17%       3.28%       5.06%

HES data: CIP spells
                   1998/99     1999/00     2000/01     2001/02     2002/03     2003/04
Electives          5427066     5487579     5479633     5386575     5578093     5736331
Non-electives      5783750     5618166     5595606     5607484     5963742     6411777
Total             11210816    11105745    11075239    10994059    11541835    12148108

Growth         98/99-99/00 99/00-00/01 00/01-01/02 01/02-02/03 02/03-03/04
Electives            1.12%      -0.15%      -1.69%       3.55%       2.84%
Non-electives       -2.86%      -0.40%       0.21%       6.35%       7.51%
Total               -0.94%      -0.27%      -0.73%       4.98%       5.25%

Reference Cost data
                   1998/99     1999/00     2000/01     2001/02     2002/03     2003/04
Electives          4730410     4805812     5166244     5171867     5360406     5467913
Non-electives      5051451     5220380     5350960     5604390     5684987     6021765
Total              9781861    10026192    10517204    10776257    11045393    11489678

Growth         98/99-99/00 99/00-00/01 00/01-01/02 01/02-02/03 02/03-03/04
Electives            1.59%       7.50%       0.11%       3.65%       2.01%
Non-electives        3.34%       2.50%       4.74%       1.44%       5.92%
Total                2.50%       4.90%       2.46%       2.50%       4.02%

4.3 Alternative sources for hospital activity

There are two alternative sources of information about hospital activity:
• the Reference Cost returns and
• the Hospital Episode Statistics (HES).

Table 4.1 compares Reference Cost activity volumes with those for HES FCEs and CIPS.

The Reference Cost returns have been compiled annually since 1998 and have become steadily more comprehensive. Hospital activity is summarised as aggregated counts separately for elective inpatients, elective daycases and non-electives by each Healthcare Resource Group. Based on version 3.1 HRGs, the Reference Cost returns include up to 3×565 HRG categories for hospital activity (excluding “unclassified” HRGs).
Hospital activity is also available from the Hospital Episode Statistics. HES returns have been submitted by NHS providers since the late 1980s. HES contains data on every admitted patient, and comprises individual patient records, with information extracted directly from each patient’s medical record. HES provides different counts of activity from those recorded in the Reference Cost returns, the main reasons being the following:

• First, the HES data undergo a more thorough process of validation than the Reference Cost returns. Among other things, this validation strips out duplicate records and ensures assignment to the correct Healthcare Resource Group. The estimates of activity submitted in the Reference Cost returns are not subject to the same validation process.

• Second, HES counts all FCEs, whereas there is variable practice in what is recorded in the Reference Cost returns: sometimes all FCEs are recorded, sometimes only first FCEs are recorded. The main discrepancies between activity counts in HES and Reference Cost returns relate to activities with very long lengths of stay, including rehabilitation, mental health, bone marrow transplants, cystic fibrosis, etc. These are in HES but stripped out of Reference Cost activity.

• Third, there may be differences in how activity is apportioned to each year. HES includes all FCEs that are completed within the financial year. It is not clear how patients are counted in the Reference Cost returns when their hospital stay crosses the end of the financial year.

As well as being more thoroughly validated, HES is to be preferred to the Reference Cost return for the following reasons:

• Ideally, as explained in the previous section, we should be capturing each individual’s journey through the health system. The best available measure of this is the Continuous Inpatient Spell (CIPS). CIPS cannot be derived from Reference Cost returns.
• Because HES comprises individual patient records, the data can be aggregated in various ways. We aggregated the HES data into Healthcare Resource Groups, so that there is an equivalent set of activity categories as for the Reference Costs. But it is perfectly feasible to aggregate the data to other groupings, such as specialty or OPCS procedure. Moreover, HES data can be allocated easily to different HRG classifications, as the classification system is periodically revised. This flexibility in deciding activity categories is lacking in the Reference Cost returns, because these data have already been aggregated.

• We argue that NHS activities should be quality adjusted. For hospital activity, the HES data include items by which it is possible to make these adjustments, notably the waiting time prior to admission and the discharge status of the patient (from which mortality rates are derived). This information is not available in the Reference Cost returns.

For these reasons, we use HES based activity estimates rather than the Reference Cost returns for all elective inpatient, daycase and non-elective hospital activity. We use the Reference Costs database for the other sources of activity.

There are two broad HRG groups in HES that are not included in the Reference Costs: those with code T (mental health) and code U (unclassified). In a spells calculation we need to include all HES activity. If we did not do so, patients whose spell included one of the omitted categories would be excluded. It is also important that unclassified groups are included in a count of activities, since it is likely that over time fewer and fewer activities are assigned to an unclassified category. It is necessary to impute unit costs to these activities. In the case of group T we used the average unit cost for other Mental Health activities and for group U we used the median cost across FCEs.
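The imputation rule for groups T and U can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce the report's figures: the HRG codes and costs are invented, the group T average is taken as an unweighted mean (the text does not say whether it was activity weighted), and the group U median is taken over a list holding one unit cost per FCE.

```python
from statistics import mean, median


def impute_unit_cost(hrg, reference_costs, mental_health_costs, costs_per_fce):
    """Return a unit cost for `hrg`, imputing one for groups T and U."""
    if hrg in reference_costs:
        return reference_costs[hrg]
    if hrg.startswith("T"):
        # Assumption: unweighted mean of other mental health unit costs.
        return mean(mental_health_costs)
    if hrg.startswith("U"):
        # Median of unit costs taken across individual FCEs.
        return median(costs_per_fce)
    raise KeyError(f"no unit cost and no imputation rule for {hrg}")


# Illustrative values only (pounds); codes A01 and B02 are hypothetical.
reference_costs = {"A01": 5000.0, "B02": 900.0}
mental_health_costs = [200.0, 400.0]
costs_per_fce = [900.0, 900.0, 5000.0]  # two FCEs in B02, one in A01

cost_t = impute_unit_cost("T01", reference_costs, mental_health_costs, costs_per_fce)
cost_u = impute_unit_cost("U01", reference_costs, mental_health_costs, costs_per_fce)
```

Because the median is taken across FCEs rather than across HRGs, high-volume, low-cost activities dominate the value imputed to group U.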
4.4 General practitioner consultations

Estimates of consultation activity are derived from the consultations reported by respondents in the General Household Survey (GHS) and are available by location (surgery, home, phone) and provider (GP, practice nurse (but only after 2000)). The estimate of the number of consultations per year is made by multiplying the number of reported consultations in the 14 days prior to interview by 26. No allowance is made for seasonal factors: the date of the consultation varies across respondents and has also varied between rounds of the GHS. There have been implausibly large changes in the numbers of consultations reported in the GHS for some age-gender groups from one year to the next. The GHS was also not undertaken in 1997/8 and 1999/2000, so that estimates for these years have to be interpolated. Data on consultations with practice nurses were not collected before 2000. These deficiencies of the GHS as a source of GP consultations information are widely recognised (Atkinson, 2005, pp 108-111).

The DH has been investigating the use of GP record systems as a source of more accurate and detailed data. We have previously made detailed suggestions on how such data should be collected (Dawson, et al., 2004b, 2004c). New data from the QRESEARCH database, derived from downloads from around 500 general practices, have recently become available but too late for inclusion in this report. We have agreed to undertake an analysis of general practice consultation rates data from QRESEARCH for the DH, which will examine what, if any, adjustments need to be made to QRESEARCH consultation counts to produce an estimate of consultation activity. This report will be delivered separately in the Spring of 2006.
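The annualisation of GHS responses described in this section is simple arithmetic, sketched below. The function names are hypothetical. The text says only that the missing survey years "have to be interpolated", so the linear interpolation shown here is an assumption rather than the method actually used.

```python
def annual_consultations(reported_in_14_days):
    """Scale a 14-day recall count to a yearly figure (26 fortnights per year)."""
    return reported_in_14_days * 26


def interpolate_missing_year(previous_year, following_year):
    """Assumed linear interpolation for a year in which the GHS was not run."""
    return (previous_year + following_year) / 2


# Illustrative numbers only.
yearly = annual_consultations(3)                # 78
filled = interpolate_missing_year(4.0, 5.0)     # 4.5
```

Scaling by 26 treats the fortnight before interview as representative of the whole year, which is why seasonal variation in interview dates matters.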
4.5 Measures of marginal social value of outputs

4.5.1 Unit costs

Current NHS practice, which follows the recommendation of Eurostat (2001), is to use production costs (such as the average costs as reported in the annually produced Schedule of Reference Costs) as weights in the calculation of output indices. This implies that costs reflect the value that society places upon these activities. So a cochlear implant (with a unit cost of £23,747 in the 2002/03 Reference Costs) is assumed to be more than 25 times as valuable as a normal delivery without complications (unit cost £921). The use of unit costs as weights reflecting the marginal social value of outputs has the support, albeit reluctant, of Hicks (1940) but as we have noted (section 2.7) it rests on the strong assumption that resources are allocated efficiently in the NHS so that unit costs are proportional to the marginal value of output produced. Even with this assumption the use of unit costs will not allow for quality changes (section 2.7) in the calculation of growth between one year and the next.

Reference Costs estimates of unit costs are based on allocations of fixed costs to HRGs with FCEs as the unit of measurement. We have investigated whether it would be possible to improve on this method of estimating activity costs by using regression analysis. The Second Interim Report (Dawson et al., 2004c; section 3.7.1) describes how we attempted to estimate cost functions using a provider Trust level panel of data on activities and costs and the problems we encountered. Our subsequent estimations were no more successful. We describe these attempts in Appendix D. Apart from difficulties in trying to back compute total provider costs from the Reference Cost data on unit costs and activities, the main problem is one of degrees of freedom.
There are more HRG activity types (approximately 550) than Trusts (approximately 180) so that even with observations over 6 years it is necessary to use quite high levels of aggregation of activities.

We feel that the unit costs in the Reference Costs are very unlikely to measure marginal costs, even long run marginal costs, because of the accounting procedures used to generate them. We understand that, as a result of the introduction of Practice Based Commissioning and Payment by Results, the DH is considering the production of a new set of unit costs for spells, rather than for FCEs. There is a danger that Payment by Results will encourage misreporting behaviour, with providers reporting their Reference Costs close to the tariff and being reluctant to divulge information about where their costs deviate from the tariff. If such behaviour is widespread, in future the Reference Cost database may not even approximate average, let alone marginal, costs.

There are Reference Cost data on unit costs from 1997/98. The data had patchy coverage in the early years: only 76% of activity currently recorded had unit costs assigned in 1997/98. Moreover there were some considerable fluctuations in unit costs for specific HRGs in the early years (Street and AbdulHussain, 2004). We therefore decided to use the unit cost estimates for 1999/00 for all previous years. When there were missing unit cost data we used the estimates from the previous or following year if these were available. Some activities measured in HES have no corresponding unit costs in the Reference Costs databases and for these activities we felt it was better to retain them in the index by applying the weighted average reference cost for all other activities for that year.

Having dealt with the question of duplicate entries we took great care to ensure that no other entries were dropped as a result of missing data.
The general principle was that it was important not to lose any patients merely on the grounds that the records were less than complete. To this end we replaced missing data for HRG unit costs by averages for the whole set of activities. Other missing variables were again replaced by suitable population averages. For example, in order to determine an individual’s age we use the variable STARTAGE (age at start of episode) from the first episode in the spell. However, this is missing for some individuals (e.g. in 2002/3 35,554 episodes did not have age recorded). These were replaced with the mean age for individuals of the same gender in the particular HRG and year. For those in sparsely-populated HRGs, missing values were replaced with the mean for the whole population.

4.5.2 Private sector prices

Under certain conditions the market prices for goods and services measure their marginal social value and hence can be aggregated for the construction of measures of the growth rate of output. One possible method of valuing NHS output might be to use prices from the private sector. Some NHS activities have close matches in the private sector. The main example is that some types of elective care are provided both in the private and public sectors. There are also a few private sector general practitioners. Non-emergency ambulance transport is similar to a taxi service. In principle it might be possible either to use the private sector prices of outputs to value NHS outputs or to estimate hedonic price functions to value characteristics or outcomes of NHS output such as waiting times and hotel services.

However there are problems with attempting to use private health sector prices as measures of marginal social value of NHS outputs or outcomes:

(a) the private sector produces very little emergency care and relatively little non-elective care, which make up roughly half of NHS activity.

(b) private sector outputs have a different mix of characteristics compared to the NHS.
The health effects of treatment are probably broadly similar, but waiting times are much shorter and the quality of hotel services higher. Thus it would be necessary to attempt to estimate hedonic price functions to derive the marginal value of characteristics ($\pi_{kjt}$) rather than use the market price of the output to weight NHS outputs. Time and resource constraints meant that we did not consider this to be a feasible option for this project, though it may be worthwhile for the DH to commission a scoping review to investigate the possibility.

(c) private patients are not a random sample of the population: they tend to be richer and better educated. Thus any estimated hedonic price function from the private sector may not predict the marginal valuations of characteristics for the general population.

(d) much private health care is purchased by insured individuals so that the market price of care will overstate its marginal value to the private patient.

(e) because the NHS is now encouraging commissioners (PCTs and general practices) to buy care from the private sector, prices for care to private patients will be increasingly influenced by the prices set by the NHS, which are based on Reference Costs.

Private sector prices are therefore unlikely to be useful as sources of marginal social values for most NHS outputs.

4.5.3 International prices

There is a precedent in cost benefit analysis for using world prices to value domestic output when domestic prices are absent or distorted. The rationale is that because trade could take place at world prices, they are legitimate measures of opportunity cost to the domestic economy. This option is not particularly useful in the valuation of UK health care outputs. There is not a significant world market in health care. In the countries that do have published prices for health treatment, these tend to be administered prices subject to stringent domestic regulation or negotiation.
It is highly unlikely that the relative prices observed in other countries will correspond to the relative value of NHS outputs.

These caveats notwithstanding, we did report in our Second Interim Report (Dawson, et al., 2004c; section 3.7.2) whether the valuations of activity would be sensitive to the use of price information from other countries. We concluded that international prices were not likely to be useful as sources of relative marginal social valuations of NHS outputs. There were major differences in definitions of outputs so that it was not clear that similar outputs could be compared. Even when we were reasonably confident that the outputs were similar there were marked differences in the relative costs of treatments between different countries. For example the 2001/2 ratio of the costs of bilateral primary and primary hip replacement to the cost of a percutaneous transluminal coronary angioplasty (PTCA) was 2.54 in Australia and 0.94 in Italy. (The ratio in the Reference Costs database was 1.89.) Other studies have also found marked differences in the input usage and hence costs for particular conditions (Baily and Garber, 1997).

4.5.4 Value of health

A value per QALY of £30,000 is believed to be compatible with the decisions made by the National Institute for Clinical Excellence, although they do not mention a value of life explicitly (Devlin and Parkin, 2004). The DH has commissioned research into the value of a QALY but its results are not yet available. We have taken £30,000 as our reference value, assuming it applies for the year 2002/3, and adjusted it by the rate of growth of money GDP for other years.

4.5.5 Value of waiting time

A value weighted output index requires that we weight changes in the various characteristics by an estimate of the monetary value of each characteristic. One source of data on willingness to pay to reduce waiting times is evidence from discrete choice experiments.
A recent review of the literature (Ryan, Odejar and Napper, 2004) reported that few studies addressed the issue of the monetary value of reducing waiting times for health care and contrasted this with the significantly greater volume of work on the value of time saving in transport. Of the six UK papers, only one sampled the English population. The other five were location or procedure specific. Ryan summarises the available evidence, converting to 2002/03 prices. Propper’s analysis of English data suggests estimated values between £36.25 and £94.19 for a one month reduction in waiting time. Hurst’s study of waiting time for non-urgent rheumatology estimated values between £11.95 and £23.68 per week. Ryan points out that the Propper and Hurst studies give similar values assuming a linear additive model. A major limitation of the data available is its age. Propper’s survey was undertaken in 1987. While it is possible to adjust prices for inflation, it is also likely that willingness to pay to reduce waiting time has changed over the last eighteen years.

We illustrate the effect of adopting a value weighted output index in place of a cost weighted index for a small subset of outputs where we have some health effects data (section 6). We have used the upper limit of the Propper evidence, £94.19 per month, which corresponds to £3.13 per day in 2002/03 prices. This was the willingness to pay of retired individuals with above average incomes in the original survey. To explore the sensitivity of the index to price, we also use £50 per day, which implies that a one month reduction in waiting is worth £1400. This is an arbitrary number which introspection suggests is likely to be at the high end of any willingness to pay for a reduction in waiting time for most elective care. If the DH wishes to make a value weighted output index a regular part of reporting NHS performance, we recommend that new research is undertaken on social willingness to pay to reduce waiting times.
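The two per-day valuations can be applied to a reduction in waiting time as in the sketch below, which assumes, as a simplification, that value is linear in days saved. The function and constant names are hypothetical. Note that £1400 at £50 per day corresponds to a 28 day month; that month length is inferred from the arithmetic and is not stated in the text.

```python
def value_of_waiting_reduction(days_saved, value_per_day):
    """Monetary value of a waiting time reduction, assumed linear in days saved."""
    return days_saved * value_per_day


VALUE_PER_DAY_LOW = 3.13    # pounds per day, 2002/03 prices (from the Propper upper limit)
VALUE_PER_DAY_HIGH = 50.0   # pounds per day, the arbitrary sensitivity value

four_weeks_high = value_of_waiting_reduction(28, VALUE_PER_DAY_HIGH)  # 1400.0
four_weeks_low = value_of_waiting_reduction(28, VALUE_PER_DAY_LOW)
```

The two values differ by a factor of roughly sixteen, so indices calculated with them bracket a wide range of possible willingness to pay.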
4.5.6 Expert groups

Clinical experts could provide estimates of the health effects of treatment without the need to deny cost-effective treatment to some patients for some treatments. They have been used in the UK for CABG (Williams, 1985), in the Netherlands to estimate burden of disease for 52 diagnostic groups accounting for 70% of health care costs, and in the US for producing quality adjusted price indices for depression treatment (Berndt et al., 2002). We discussed the use of expert groups in our Second Interim Report (Dawson et al., 2004c, section 3.1). We do not believe that they should be used to provide comprehensive annual updates of the estimated health effects. Such groups are costly to convene, organise and train. They would be useful for a limited set of major conditions, supplementing the regular annual snapshot before and after health data collected from patients that we recommend (section 10.1).

4.6 Quality adjustment for health effects of treatment

In the next four sections we consider how far it is possible to use existing data to quality adjust the output index for changes in the health effects of treatment, waiting times, and patient satisfaction with the process of care.

We consider first what we would like to measure in principle. We assume for the moment that the only valuable characteristic of NHS care is its effect on health status and examine how we might use data on post treatment mortality to produce a quality adjusted index of NHS output. In this and following subsections we consider various methods of using mortality information and combining it with other very limited data on the health effects of treatment, stressing the assumptions required.

As we are assuming in this section that health is the only relevant characteristic of health care we drop the subscript identifying the characteristic. Thus we use $q_{jt}$, $\pi_t$ instead of $q_{kjt}$, $\pi_{kt}$.
Denote the discounted sum of QALYs produced by the treatment if the patient survives treatment by

\[ q^*_{jt} = \sum_s \delta^s \sigma^*_{jt}(s) \sum_\theta \rho^*_{jt}(\theta,s)\, h(\theta) = \sum_s \delta^s \sigma^*_{jt}(s)\, h^*_{jt}(s) \tag{23} \]

$\delta$ is the discount factor on QALYs. $h^*_{jt}(s)$ is the expected level of health $s$ periods after treatment, conditional on being alive at that time:

\[ h^*_{jt}(s) = \sum_\theta \rho^*_{jt}(\theta,s)\, h(\theta) \tag{24} \]

$\sigma^*_{jt}(s)$ is the probability of surviving $s$ periods given that the patient survived treatment $j$ at date $t$, $h(\theta)$ is the health level from having health state $\theta$, where $\theta$ is a vector of mental and physical health characteristics, and $\rho^*_{jt}(\theta,s)$ is the probability of being in health state $\theta$ conditional on surviving $s$ periods after treatment $j$ at date $t$.

To reduce notational complexity in examining the properties of the various indices we ignore the effect of age and gender on mortality, survival and the probability distribution of health states. Some HRGs are already age specific. A more disaggregated analysis is analytically straightforward by defining the output type by finer age categories and gender as well as HRG.

If the patient had not been treated their discounted sum of expected quality adjusted life years would have been

\[ q^o_{jt} = \sum_s \delta^s \sigma^o_{jt}(s) \sum_\theta \rho^o_{jt}(\theta,s)\, h(\theta) = \sum_s \delta^s \sigma^o_{jt}(s)\, h^o_{jt}(s) \tag{25} \]

$h^o_{jt}(s)$ is expected health $s$ periods hence if the patient survives that long without receiving treatment $j$. It depends on the probabilities of health status $\theta$ at $s$ conditional on surviving without receiving treatment ($\rho^o_{jt}(\theta,s)$). $\sigma^o_{jt}(s)$ is the probability of surviving $s$ periods if not treated.

Setting health status when dead to zero, the expected increase in discounted QALYs from treatment $j$ at time $t$ is

\[ q_{jt} = (1 - m_{jt})\, q^*_{jt} - q^o_{jt} \tag{26} \]

where $m_{jt}$ is the probability of death within a short period of treatment $j$.
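Equations (23), (25) and (26) can be evaluated directly once survival and expected health profiles are available. The sketch below is illustrative only: the profiles are invented and chosen so the arithmetic is easy to check by hand, the discount factor of 0.5 is unrealistically low for the same reason, and starting the sum at s = 1 is an assumption, since the text does not state the first period.

```python
def discounted_qalys(survival, health, delta):
    """Sum over s of delta**s * survival(s) * expected health(s), as in (23) and (25).

    survival[i] and health[i] refer to period s = i + 1 (assumed first period).
    """
    return sum(
        delta ** s * sigma * h
        for s, (sigma, h) in enumerate(zip(survival, health), start=1)
    )


def health_gain(m, q_star, q_o):
    """Expected QALY gain per treatment, equation (26)."""
    return (1 - m) * q_star - q_o


delta = 0.5                                                   # illustrative discount factor
q_star = discounted_qalys([1.0, 0.5], [1.0, 1.0], delta)      # treated, survived: 0.625
q_o = discounted_qalys([0.5, 0.25], [0.5, 0.5], delta)        # untreated: 0.15625
q = health_gain(0.25, q_star, q_o)                            # 0.3125
```

The three arguments of `health_gain` correspond to the three components the text goes on to distinguish: short term mortality, outcomes conditional on survival, and outcomes without treatment.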
This expression for the health effect of treatment is useful because it distinguishes three components of $q_{jt}$ which are controllable by the NHS to different degrees and hence should be treated differently in calculating output growth rates attributable to the NHS. The amount of health outcome produced per unit of output can change over time because of changes in

• the short term post treatment mortality rate $m_{jt}$;

• survival probabilities and health status probabilities conditional on survival, $\sigma^*_{jt}(s)$, $\rho^*_{jt}(\theta,s)$;

• survival and health status probabilities conditional on not having treatment, $\sigma^o_{j}(s)$, $\rho^o_{j}(\theta,s)$.

The first is arguably the component most clearly attributable to the NHS for many treatments given current data and the third is unaffected by the NHS for all treatments. The effect of the NHS on the second will vary across treatments, from relatively little effect for, say, varicose vein stripping to a large effect for cancer treatments.

Figure 4.1 illustrates the effect of treatment $j$ at date $t$. The lower dashed line shows the expected time stream of health given treatment after allowing for the treatment mortality probability.
Figure 4.1 Expected time streams of health without treatment ($h^o_{j}(s)$), with treatment conditional on surviving treatment ($h^*_{jt}(s)$), and with treatment ($(1 - m_{jt})\, h^*_{jt}(s)$). [Figure not reproduced: health $h$ on the vertical axis, time $t$ on the horizontal axis.]

Let the marginal social value at time $t$ of a QALY be $\pi_t$ (£s per QALY) so that the marginal social value of a unit of output $j$ at time $t$ is

\[ p_{jt} = \pi_t q_{jt} = \pi_t \left[ (1 - m_{jt})\, q^*_{jt} - q^o_{jt} \right] \tag{27} \]

We wish to calculate the value weighted output index (12), which in the special case in which health is the only socially valuable characteristic is

\[ I^{xq}_{yt} = \frac{\sum_j x_{jt+1} \pi_t q_{jt+1}}{\sum_j x_{jt} \pi_t q_{jt}} = \sum_j \left( \frac{x_{jt+1}\, q_{jt+1}}{x_{jt}\, q_{jt}} \right) \frac{\pi_t q_{jt} x_{jt}}{\sum_j \pi_t q_{jt} x_{jt}} = \sum_j (1 + g^x_{jt})(1 + g^q_{jt})\, \omega^y_{jt} \tag{28} \]

where $g^x_{jt} = (x_{jt+1} - x_{jt}) / x_{jt}$ and $g^q_{jt} = (q_{jt+1} - q_{jt}) / q_{jt}$ are the discrete period rates of growth of the output $j$ and the health outcome per unit of output $j$, and $\omega^y_{jt}$ is the share of output $j$ in the total value of output at time $t$.

Consider the health adjusted cost weighted output index

\[ I^{xq}_{ct} = \sum_j \left( \frac{x_{jt+1}\, q_{jt+1}}{x_{jt}\, q_{jt}} \right) \frac{c_{jt} x_{jt}}{\sum_j c_{jt} x_{jt}} = \frac{\sum_j x_{jt+1} \left( \dfrac{q_{jt+1}}{q_{jt}} \right) c_{jt}}{\sum_j x_{jt} c_{jt}} \tag{29} \]

The assumption of efficient allocation of NHS resources with only one quality characteristic outcome takes the form

\[ c_{jt} = \lambda_t \pi_t q_{jt} = \lambda_t \pi_t \left[ (1 - m_{jt})\, q^*_{jt} - q^o_{jt} \right] \tag{30} \]

We can interpret $\lambda_t \pi_t$ as the cost per QALY used by the NHS in making its treatment decisions, and (30) is the optimality condition that at the margin all treatments have the same cost-effectiveness ratio.

If we make the assumption (30) the health adjusted cost weighted output index is also the value weighted output index: $I^{xq}_{ct} = I^{xq}_{yt}$. (This is just (22) with simpler notation.) If we do not assume efficient allocation then we can justify calculating $I^{xq}_{ct}$ by arguing that we are interested in a weighted average of the growth rates of the “real” parts (outputs, quality as measured by health gain per unit of output) of the value of NHS activity and that costs are convenient weights.
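The equivalence of the two indices under assumption (30) can be checked numerically. The sketch below uses two invented output types; all quantities, the QALY value of 30000 and the factor `lam` are illustrative assumptions, and the function names are hypothetical.

```python
from math import isclose


def value_weighted_index(x0, x1, q0, q1, pi):
    """Equation (28): ratio of the period t+1 to period t value of output at pi_t."""
    num = sum(pi * x1[j] * q1[j] for j in x0)
    den = sum(pi * x0[j] * q0[j] for j in x0)
    return num / den


def cost_weighted_index(x0, x1, q0, q1, c0):
    """Equation (29): health adjusted cost weighted output index."""
    num = sum(x1[j] * (q1[j] / q0[j]) * c0[j] for j in x0)
    den = sum(x0[j] * c0[j] for j in x0)
    return num / den


# Illustrative data: activity x and QALY gain per unit q in periods t and t+1.
x0 = {"a": 100, "b": 50}
x1 = {"a": 110, "b": 50}
q0 = {"a": 2.0, "b": 4.0}
q1 = {"a": 2.0, "b": 5.0}
pi = 30000.0   # pounds per QALY
lam = 0.5      # hypothetical proportionality factor lambda_t in (30)

# Unit costs generated under the efficient allocation assumption (30).
c0 = {j: lam * pi * q0[j] for j in q0}

i_value = value_weighted_index(x0, x1, q0, q1, pi)
i_cost = cost_weighted_index(x0, x1, q0, q1, c0)
```

Both indices equal 470/400 = 1.175 here. If the unit costs were not proportional to health gain, the two functions would generally return different values, which is the case the closing sentence of the paragraph addresses.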
We can write the term $q_{jt+1}/q_{jt}$ in (28) as

\[ \frac{q_{jt+1}}{q_{jt}} = 1 + g^q_{jt} = \frac{(1 - m_{jt+1})\, q^*_{jt+1} - q^o_{jt+1}}{(1 - m_{jt})\, q^*_{jt} - q^o_{jt}} = \frac{a_{jt+1}\, q^*_{jt+1}}{a_{jt}\, q^*_{jt}} \frac{a_{jt}\, q^*_{jt}}{q_{jt}} - \frac{q^o_{jt+1}}{q^o_{jt}} \frac{q^o_{jt}}{q_{jt}} \]

\[ = (1 + g^a_{jt})(1 + g^{q*}_{jt}) \frac{a_{jt}\, q^*_{jt}}{a_{jt}\, q^*_{jt} - q^o_{jt}} - (1 + g^{qo}_{jt}) \frac{q^o_{jt}}{a_{jt}\, q^*_{jt} - q^o_{jt}} \tag{31} \]

where $a_{jt} = (1 - m_{jt})$ is the survival rate (the proportion of patients getting treatment $j$ who are alive for at least a short period after treatment). $g^a_{jt}$ is the growth rate of survival and $g^{q*}_{jt}$, $g^{qo}_{jt}$ are the growth rates in $q^*_{jt}$, $q^o_{jt}$.

The NHS output growth rate in any year should not include changes in $q_{jt}$ which arise because of changes in health if not treated ($q^o_{jt}$). Hence in calculating the growth rate we should set $g^{qo}_{jt} = 0$ and the health effect adjustment should be not (31) but

\[ \frac{q_{jt+1}}{q_{jt}} = \left( \frac{a_{jt+1}}{a_{jt}} \right) \left( \frac{q^*_{jt+1}}{q^*_{jt}} \right) \frac{a_{jt}\, q^*_{jt}}{q_{jt}} - \frac{q^o_{jt}}{q_{jt}} = (1 + g^a_{jt})(1 + g^{q*}_{jt}) \frac{a_{jt}\, q^*_{jt}}{q_{jt}} - \frac{q^o_{jt}}{q_{jt}} \tag{32} \]

Notice that although we hold $q^o_{jt}$ constant in calculating the annual growth in $q_{jt}$ attributable to the NHS, changes in health without treatment will lead to changes in the weights.

The larger is the growth in survival ($g^a_{jt}$) and the larger the growth in health after treatment ($g^{q*}_{jt}$), both of which may be attributable to the NHS, the greater is the health effect adjustment factor ($q_{jt+1}/q_{jt}$) and the greater the index of NHS output.

In general we do not have data on health conditional on surviving treatment $q^*_{jt}$ or conditional on no treatment $q^o_{jt}$. We do have data on the probability of surviving treatment $a_{jt}$ for all hospital spells. It is also possible that in the near future we may have information on longer term survival $\sigma^*_{jt}(s)$ for a large number of NHS patients (footnote 3). We therefore consider in the next section how it will be possible to use information on treatment survival and longer term survival.
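The adjustment factor (32) can be written as a small function and checked against its algebraic simplification, $(a_{jt+1} q^*_{jt+1} - q^o_{jt}) / (a_{jt} q^*_{jt} - q^o_{jt})$, which follows from substituting $q_{jt} = a_{jt} q^*_{jt} - q^o_{jt}$. The numbers below are invented for illustration and the function name is hypothetical.

```python
from math import isclose


def health_adjustment(a0, a1, qs0, qs1, qo0):
    """Equation (32): q_{jt+1}/q_{jt} with untreated health held at its period t value.

    a0, a1   : survival rates a_{jt}, a_{jt+1}
    qs0, qs1 : QALYs conditional on surviving treatment, q*_{jt}, q*_{jt+1}
    qo0      : QALYs without treatment, q^o_{jt}
    """
    q0 = a0 * qs0 - qo0
    return (a1 / a0) * (qs1 / qs0) * (a0 * qs0 / q0) - qo0 / q0


# Illustrative values: survival improves from 96% to 97%, q* unchanged.
factor = health_adjustment(0.96, 0.97, 10.0, 10.0, 4.0)
direct = (0.97 * 10.0 - 4.0) / (0.96 * 10.0 - 4.0)

# With no health without treatment, only survival and q* growth remain.
no_alternative = health_adjustment(0.96, 0.97, 10.0, 10.0, 0.0)
```

In this example a one percentage point rise in survival raises the adjustment factor by more than the proportional rise in survival itself, because the gain is measured relative to the smaller base $q_{jt}$ that remains after deducting health without treatment.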
Footnote 3: Note that we make a distinction between short term survival ($a$) and long term survival conditional on short term survival ($\sigma^*$), whereas the little data currently available is couched in terms of unconditional survival probabilities, which are the product of $a$ and $\sigma^*$.

4.7 Quality adjustment using long term survival

We want to estimate the quality adjusted cost weighted index (29) where the quality adjustment factor is given by (32). We do not know health status conditional on treatment and surviving $s$ periods, $h^*_{jt}(s)$. One possibility is to assume that health status $s$ periods after treatment is proportional to the current health status $h_t(s)$ of an average person who is $s$ years older: $h^*_{jt}(s) = f^h_{jt}h_t(s)$. Such data are available, for example, from the 1996 Health Survey for England (see footnote 4). Then we can estimate the growth in discounted expected QALYs conditional on surviving treatment as

$$\frac{q^*_{jt+1}}{q^*_{jt}} = \frac{\sum_s \delta^s\sigma^*_{jt+1}(s)f^h_{jt+1}h_t(s)}{\sum_s \delta^s\sigma^*_{jt}(s)f^h_{jt}h_t(s)} = \sum_s\left(\frac{\sigma^*_{jt+1}(s)}{\sigma^*_{jt}(s)}\right)\left(\frac{f^h_{jt+1}}{f^h_{jt}}\right)\left(\frac{\delta^s\sigma^*_{jt}(s)f^h_{jt}h_t(s)}{q^*_{jt}}\right) \qquad (33)$$

Notice that we have used $h_t(s)$ rather than $h_{t+1}(s)$ in the numerator since changes in the general health of the population should not affect the rate of growth of health conditional on surviving treatment. We do allow for changes in the proportionality factor $f^h_{jt}$ to affect the health conditional on short term survival since this will reflect improvements in medical technology or in patient selection, both of which should be attributed to the NHS. In the current state of knowledge we cannot generally estimate the change in the proportionality factor from one period to the next and so assume that it is constant.
Hence the proportionality factors cancel from the numerator and denominator in the middle ratio in the final part of (33) and we have

$$\frac{q^*_{jt+1}}{q^*_{jt}} = \frac{\sum_s \delta^s\sigma^*_{jt+1}(s)f^h_{jt+1}h_t(s)}{\sum_s \delta^s\sigma^*_{jt}(s)f^h_{jt}h_t(s)} = \sum_s\left(\frac{\sigma^*_{jt+1}(s)}{\sigma^*_{jt}(s)}\right)\left(\frac{\delta^s\sigma^*_{jt}(s)f^h_{jt}h_t(s)}{q^*_{jt}}\right) \qquad (34)$$

Thus with information on $h_t(s)$, $\sigma^*_{jt}(s)$ and assumptions about $f^h_{jt}$ we can compute $q^*_{jt+1}/q^*_{jt}$. If it is also the case that the growth rate in survival ($g_{\sigma^*t} = \sigma^*_{jt+1}/\sigma^*_{jt} - 1$) is constant over $s$ then we do not even need to make assumptions about the magnitude of the proportionality factor: (34) simplifies to

$$\frac{q^*_{jt+1}}{q^*_{jt}} = (1+g_{q^*t}) = (1+g_{\sigma^*t}) \qquad (35)$$

Footnote 4: To keep the presentation simple we have assumed implicitly that all patients in an HRG have the same age and gender so that all have the same survival probabilities and expected health status. In practice, when long term survival data become available, it would be necessary to consider whether to calculate age and gender specific survival rates and expected health status.

We still need information on $h^o_{jt}(s)$, $\sigma^o_{jt}(s)$ to calculate $q^o_{jt}$ for (32), and estimates of $h_t(s)$ are of little help since we would have to specify proportionality factors which vary across the activity types and we have no information on survival without treatment for most types. One way to proceed, which is more plausible for treatment of cancer and CHD than for cataracts, is to assume that the alternative to treatment is death so that $q^o_{jt} = 0$ and (32) becomes

$$\frac{q_{jt+1}}{q_{jt}} = \frac{a_{jt+1}q^*_{jt+1}}{a_{jt}q^*_{jt}} = \frac{a_{jt+1}}{a_{jt}}\sum_{s}^{S}\left(\frac{\sigma^*_{jt+1}(s)}{\sigma^*_{jt}(s)}\right)\left(\frac{\delta^s\sigma^*_{jt}(s)f^h_{jt}h_t(s)}{q^*_{jt}}\right) \qquad (36)$$

Again, if we assume a constant growth in long term survival for all ages, the health effect adjustment (36) simplifies further to

$$\frac{q_{jt+1}}{q_{jt}} = \frac{a_{jt+1}q^*_{jt+1}}{a_{jt}q^*_{jt}} = (1+g_{at})(1+g_{\sigma^*t}) \qquad (37)$$

There are two practical issues to consider.
First, to which HRGs should the adjustment in respect of, say, cancer survival be applied? Cancer diagnoses appear in a number of HRGs. One possibility is to apply the adjustment to the HRGs which treat the highest proportions of patients with cancer diagnoses. Alternatively one could group cancer patients by the HRG of their activity.

Second, we must choose a time horizon $S$ for the adjustment. Lakhani et al. (2005) have recently presented estimates of 5 year cancer survival rates. The longer the horizon over which the summation in (36) takes place the more accurate the estimate of health effects. But a long time horizon has disadvantages. If the adjustment to a year is based on the survival experience of patients actually treated in that year then the output indices for $S$ previous years will have to be revised every year. The alternative is to adjust the index for a particular year using the survival experience of patients treated $S$ years previously. Thus the longer is $S$ the greater the extent of revisions or the more out of date the adjustment.

Rather than use general population estimates of $h_t(s)$ it may in some cases be reasonable to make an even cruder assumption: that survival after $S$ years is very low or that $h^*_{jt}(s)$ after $S$ years is very low. Setting $h^*_{jt}(s) = 0$ for $s > S$ and assuming that each year to $S$ has the same QALY score ($h^*_{jt}(s) = h^*_j$) we get

$$\frac{q_{jt+1}}{q_{jt}} = \frac{a_{jt+1}q^*_{jt+1}}{a_{jt}q^*_{jt}} = \frac{a_{jt+1}}{a_{jt}}\sum_{s=1}^{S}\left(\frac{\sigma^*_{jt+1}(s)}{\sigma^*_{jt}(s)}\right)\left(\frac{\sigma^*_{jt}(s)\delta^s}{\sum_{s'=1}^{S}\sigma^*_{jt}(s')\delta^{s'}}\right) \qquad (38)$$

If the trend improvement in survival was reasonably stable over long periods then the use of the lagged survival change data would give a reasonably accurate estimate of the adjustment based on actual survival experience, since one is interested in the growth rate in survival, not in actual levels.
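The crude long term survival adjustment in (38) can be sketched as a short calculation. This is an illustrative sketch only: the function name, the five-year survival profile and the discount factor $\delta = 1/1.035$ are assumptions introduced here, not values taken from the report. It assumes, as in the text, that $q^o_{jt} = 0$ and that health per surviving year is constant up to the horizon $S$.

```python
def long_term_survival_factor(a0, a1, sigma0, sigma1, delta):
    """Quality adjustment factor q_{jt+1}/q_{jt} as in equation (38).

    a0, a1         : short term (treatment) survival rates in t and t+1
    sigma0, sigma1 : sigma*(s) for s = 1..S, survival to year s conditional
                     on surviving treatment, for cohorts treated in t and t+1
    delta          : annual discount factor
    """
    horizon = len(sigma0)
    # Discounted survival in the base period, one term per year s = 1..S.
    discounted0 = [sigma0[s] * delta ** (s + 1) for s in range(horizon)]
    total0 = sum(discounted0)
    # Weighted average of year-by-year survival growth, with base period
    # discounted survival shares as weights.
    weighted_growth = sum(
        (sigma1[s] / sigma0[s]) * (discounted0[s] / total0)
        for s in range(horizon)
    )
    return (a1 / a0) * weighted_growth


# Hypothetical five-year survival profile (illustrative numbers only).
sigma_t = [0.90, 0.80, 0.72, 0.65, 0.60]
# Survival improves by 2% at every horizon, i.e. constant growth over s.
sigma_t1 = [1.02 * s for s in sigma_t]
delta = 1.0 / 1.035  # assumed discount factor

factor = long_term_survival_factor(0.97, 0.98, sigma_t, sigma_t1, delta)
print(f"adjustment factor: {factor:.5f}")
```

When survival growth is the same at every horizon, as in this made-up example, the factor collapses to $(1+g_{at})(1+g_{\sigma^*t})$, which is the simplification stated in (37).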
We believe that the use of longer term survival data is a promising way forward which will become feasible in the medium term (Lakhani et al., 2005). It will be especially promising if it is coupled with a programme to measure the health status of samples of NHS patients before and after treatment (see section 6; Appendix C).

4.8 Quality adjustment with short term survival

4.8.1 Simple survival adjustment

In the absence of longer term survival data we now consider what can be done to quality adjust the output index using the data which are currently available. In the absence of information on longer term survival $\sigma^*_{jt}(s)$ we cannot estimate $g_{q^*jt}$, the change in the discounted QALYs associated with treatment conditional on survival ($q^*_{jt}$). If we assume that $q^*_{jt} = q^*_{jt+1}$ does not change over time ($g_{q^*jt} = 0$), so that the only reason why $q_{jt}$ changes over time is that the post operative survival rate changes, the health effect adjustment becomes

$$\frac{q_{jt+1}}{q_{jt}} = \frac{a_{jt+1}q^*_{jt}-q^o_{jt}}{a_{jt}q^*_{jt}-q^o_{jt}} = \frac{a_{jt+1}-k_{jt}}{a_{jt}-k_{jt}} \qquad (39)$$

where $k_{jt} = q^o_{jt}/q^*_{jt}$. Clearly increases in $a_{jt+1}$, other things equal, lead to a higher quality adjustment factor. For activities with the same $a_{jt}$ and $k_{jt}$, the larger the survival in period $t+1$ the greater the health adjustment factor.

Comparisons of quality adjustment factors across activities with different $k_{jt} = q^o_{jt}/q^*_{jt}$ require a little more thought. Remember that we are interested in the effect on the quality adjustment factor $q_{jt+1}/q_{jt}$, which is the ratio of health effects, not in the effect of $k_{jt}$ on the level of health effects. Suppose for definiteness that survival has increased, so that the quality adjustment factor has increased. Differentiating (39) with respect to $k_{jt}$ gives

$$\frac{\partial(q_{jt+1}/q_{jt})}{\partial k_{jt}} = \frac{a_{jt+1}-a_{jt}}{(a_{jt}-k_{jt})^2} \qquad (40)$$

Thus outputs with a larger $k_{jt}$ have a larger health adjustment.
Higher $k_{jt} = q^o_{jt}/q^*_{jt}$ can arise from a smaller $q^*_{jt}$ for the same $q^o_{jt}$ or a larger $q^o_{jt}$ for the same $q^*_{jt}$. The marginal effects of $q^*_{jt}$ and $q^o_{jt}$ are

$$\frac{\partial(q_{jt+1}/q_{jt})}{\partial q^*_{jt}} = \frac{-(a_{jt+1}-a_{jt})\,q^o_{jt}}{(a_{jt}q^*_{jt}-q^o_{jt})^2} = \frac{-(a_{jt+1}-a_{jt})\,q^o_{jt}}{(q^*_{jt})^2(a_{jt}-k_{jt})^2} \qquad (41)$$

$$\frac{\partial(q_{jt+1}/q_{jt})}{\partial q^o_{jt}} = \frac{(a_{jt+1}-a_{jt})\,q^*_{jt}}{(a_{jt}q^*_{jt}-q^o_{jt})^2} = \frac{a_{jt+1}-a_{jt}}{q^*_{jt}(a_{jt}-k_{jt})^2} \qquad (42)$$

Thus, other things equal, activities with larger $q^*_{jt}$ have smaller health quality adjustment factors. This might appear paradoxical for two reasons. First, the larger is $q^*_{jt}$ the greater the health effect in each period, so that the health adjustment factor is smaller for more beneficial activities. Second, the larger is $q^*_{jt}$ the greater is the absolute increase $(a_{jt+1}-a_{jt})q^*_{jt}$ in the health effect between the two periods. But both "paradoxes" disappear when we remember that what we are interested in is the health adjustment factor, which is the ratio of the health effect in the two periods. An increase in $q^*_{jt}$ increases both the numerator health effect $q_{jt+1}$ and the denominator health effect $q_{jt}$ but has a smaller proportionate effect on $q_{jt+1}$ than on $q_{jt}$, and so the ratio $q_{jt+1}/q_{jt}$ gets smaller.

Even with the assumption that $q^*_{jt} = q^*_{jt+1}$ is constant over short time intervals we cannot measure (39) directly unless we know the magnitudes of $q^*_{jt}$, $q^o_{jt}$ in order to calculate $k_{jt} = q^o_{jt}/q^*_{jt}$. In future it may be possible to estimate $q^*_{jt}$, $q^o_{jt}$ using new data on longer term survival and on health status from surveys of patients before and after treatment and from the results of evaluations of different types of treatment. But for the moment, for the vast majority of activities, we have no data on $q^*_{jt}$, $q^o_{jt}$, though we do have information on survival $a_{jt}$.
We can therefore calculate the survival adjusted cost weighted output index

$$I^{xa}_{ct} = \frac{\sum_j c_{jt}x_{jt+1}\left(\dfrac{a_{jt+1}}{a_{jt}}\right)}{\sum_j c_{jt}x_{jt}} = \sum_j (1+g_{xjt})(1+g_{ajt})\,\omega^c_{jt} \qquad (43)$$

The effect of this simple survival adjustment (see footnote 5) on the rate of growth of NHS productivity will not be great, since the vast majority of NHS patients survive their treatment, so that the survival rate does not change rapidly. Thus, for example, the 30 day CIPS mortality rate for Phakoemulsification Cataract Extraction with Lens Implant (HRG B02) fell from 0.0017 in 1999/2000 to 0.0013 in 2002/3, an annual rate of decline in the mortality probability of 0.56%. The survival rate rose from 99.83% to 99.87%, an annual rate of increase in the survival rate of 0.013%. In terms of Figure 4.1 the effect of increased survival is shown in the shift upward in the dashed line plotting health post treatment, $(1-m_{jt+1})h^*_j - h^o_j$. The effect is small relative to initial health.

Footnote 5: An alternative, apparently simpler, adjustment is to apply the survival rates in each period to scale the output of that period: $\sum_j c_{jt}x_{jt+1}a_{jt+1} / \sum_j c_{jt}x_{jt}a_{jt}$. Unfortunately this index equals the value weighted index $\sum_j x_{jt+1}q_{jt+1}\pi_t / \sum_j x_{jt}q_{jt}\pi_t$ under the assumptions that $q^*_{jt+1} = q^*_{jt}$, $q^o_{jt+1} = q^o_{jt} = 0$, and $c_{jt} = \theta\pi_t q^*_{jt}$. The last assumption is perverse. It is not an efficiency assumption: it requires that decision makers ignore the possibility that a patient may not survive treatment when allocating resources across treatments. By contrast, the first two requirements and the efficiency assumption $c_{jt} = \lambda_t\pi_t(a_{jt}q^*_{jt} - q^o_{jt})$ imply that the survival adjusted cost weighted index (43) does equal the value weighted index.
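To make the size of the simple survival adjustment concrete, the sketch below computes the index in (43) and the annualised survival growth implied by the cataract mortality rates quoted above (0.0017 falling to 0.0013 over three years, so survival of 0.9983 rising to 0.9987). The function names and the single-activity inputs (volume 100, unit cost 500) are hypothetical and introduced only for illustration.

```python
def survival_adjusted_index(x0, x1, a0, a1, c0):
    """Equation (43): sum_j c_jt x_{jt+1} (a_{jt+1}/a_jt) / sum_j c_jt x_jt."""
    num = sum(c * xn * (an / ao) for c, xn, ao, an in zip(c0, x1, a0, a1))
    den = sum(c * xo for c, xo in zip(c0, x0))
    return num / den


def annual_growth(start, end, years):
    """Compound annual growth rate between two levels."""
    return (end / start) ** (1.0 / years) - 1.0


# Survival rates implied by the mortality rates quoted for HRG B02.
survival_start = 1.0 - 0.0017   # 1999/2000
survival_end = 1.0 - 0.0013     # 2002/3

g_survival = annual_growth(survival_start, survival_end, years=3)
print(f"annual growth in survival: {100 * g_survival:.3f}%")

# One activity with unchanged volume: the index moves only by a_{t+1}/a_t.
index = survival_adjusted_index(
    x0=[100.0], x1=[100.0],
    a0=[survival_start], a1=[survival_end],
    c0=[500.0],  # hypothetical unit cost
)
print(f"survival adjusted index:   {index:.5f}")
```

The annualised growth in survival comes out at roughly 0.013% a year, which illustrates why the adjustment has little effect when survival is already very high.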
The difference between what we would like to measure (the true health adjustment) and what we can measure using survival data only is

$$\frac{a_{jt+1}q^*_{jt+1}-q^o_{jt}}{a_{jt}q^*_{jt}-q^o_{jt}} - \frac{a_{jt+1}}{a_{jt}} = \frac{a_{jt}a_{jt+1}(q^*_{jt+1}-q^*_{jt}) + q^o_{jt}(a_{jt+1}-a_{jt})}{a_{jt}q_{jt}} \qquad (44)$$

In general we cannot say anything about the direction of the bias in using $a_{jt+1}/a_{jt}$ instead of (32). But if survival increases and health conditional on survival increases then (44) will be positive and the simple survival adjustment will underestimate the true adjustment. If there is little change in health conditional on surviving treatment, (44) becomes

$$\frac{a_{jt+1}q^*_{jt}-q^o_{jt}}{a_{jt}q^*_{jt}-q^o_{jt}} - \frac{a_{jt+1}}{a_{jt}} = \frac{q^o_{jt}(a_{jt+1}-a_{jt})}{a_{jt}q_{jt}} \qquad (45)$$

and the growth rate given by $a_{jt+1}/a_{jt}$ will always have the same sign as that given by (32) and will always be smaller in absolute value. $a_{jt+1}/a_{jt}$ is a conservative estimate of the true health adjustment (32) if health conditional on surviving treatment is constant or increasing.

Table 4.2 gives some indication of the underestimation of the growth rate of the true health effect ($(a_{jt+1}-k_{jt})/(a_{jt}-k_{jt}) - 1$) when it is calculated as the growth rate of survival ($(a_{jt+1}/a_{jt}) - 1$). The example has a rate of survival of 0.97 in the base year, which is approximately the average survival rate of patients. The greater the reduction in mortality the greater the increase in the survival rate and the greater the growth rate in the true health effect. Notice that because survival is initially high even quite large proportionate reductions in the mortality risk have small effects on the survival rate and on the true growth in the health effect. The true growth in the health effect is larger the larger is $k = q^o_t/q^*_t$. Thus, as we discussed above, the smaller the proportionate effect of treatment on the discounted sum of QALYs, the larger is the true growth in the health effect.
In the absence of information on the effect of NHS care on health and hence on the true growth rate it is impossible to say how large the underestimation of the overall growth rate of the effect of hospital care on health is. If our central guesstimate of the average value of $k = q^o_t/q^*_t = 0.8$ is correct, then the kind of increases in short term survival which are perhaps towards the upper end of what is plausible (from 0.970 to 0.971 or 0.972, corresponding to proportionate reductions in mortality of 3.33% or 6.67%) underestimate the true growth in the health effect by 0.5% to 1%. Of course, the calculation takes no account of changes in health effects arising from increases in health conditional on surviving treatment. If $q^*_{jt}$ grows then the simple survival adjustment would be more of an underestimate. Once again we see the importance of having improved estimates of health conditional on surviving treatment.

Table 4.2 Error in using survival growth rate as estimate of growth rate in health effect of treatment

| | Year t | Year t+1 | | | | |
|---|---|---|---|---|---|---|
| Survival | 0.97 | 0.971 | 0.972 | 0.975 | 0.980 | 0.985 |
| Mortality | 0.03 | 0.029 | 0.028 | 0.025 | 0.020 | 0.015 |
| Mortality % decrease | | -3.33% | -6.67% | -16.67% | -33.33% | -50.00% |
| Survival % growth | | 0.10% | 0.21% | 0.52% | 1.03% | 1.55% |
| True health growth if true k = 0.5 | | 0.21% | 0.43% | 1.06% | 2.13% | 3.19% |
| True health growth if true k = 0.8 | | 0.59% | 1.18% | 2.94% | 5.88% | 8.82% |
| True health growth if true k = 0.9 | | 1.43% | 2.86% | 7.14% | 14.29% | 21.43% |
| Error using survival growth if true k = 0.5 | | 0.11% | 0.22% | 0.55% | 1.10% | 1.65% |
| Error using survival growth if true k = 0.8 | | 0.49% | 0.97% | 2.43% | 4.85% | 7.28% |
| Error using survival growth if true k = 0.9 | | 1.33% | 2.65% | 6.63% | 13.25% | 19.88% |

If $q^*_{jt}$ is constant (so that (45) holds) the difference between the quality adjusted cost weighted and survival adjusted cost weighted indices is

$$I^{xq}_{ct} - I^{xa}_{ct} = \sum_j (1+g_{xjt})\,g_{ajt}\left(\frac{q^o_{jt}}{q_{jt}}\right)\frac{x_{jt}c_{jt}}{\sum_{j'}x_{j't}c_{j't}} \qquad (46)$$

Since we do not observe $q^*_{jt}$, $q^o_{jt}$ we cannot determine the magnitude of the absolute downward bias in using $a_{jt+1}/a_{jt}$ instead of $q_{jt+1}/q_{jt}$.
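The entries in Table 4.2 can be regenerated from two one-line formulas: the true growth in the health effect, $(a_{jt+1}-k)/(a_{jt}-k) - 1$, and the survival growth rate, $a_{jt+1}/a_{jt} - 1$. The sketch below does so; the function names are introduced here for illustration, while the survival rates and the values of $k$ are those used in the table.

```python
def true_health_growth(a0, a1, k):
    """Growth in the health effect when q* is constant: (a1 - k)/(a0 - k) - 1."""
    return (a1 - k) / (a0 - k) - 1.0


def survival_growth(a0, a1):
    """Growth rate of the short term survival rate: a1/a0 - 1."""
    return a1 / a0 - 1.0


def survival_adjustment_error(a0, a1, k):
    """Amount by which survival growth understates the true health growth."""
    return true_health_growth(a0, a1, k) - survival_growth(a0, a1)


base_survival = 0.97
for a1 in (0.971, 0.972, 0.975, 0.980, 0.985):
    row = [f"a(t+1)={a1:.3f}",
           f"survival growth={100 * survival_growth(base_survival, a1):.2f}%"]
    for k in (0.5, 0.8, 0.9):
        err = survival_adjustment_error(base_survival, a1, k)
        row.append(f"error(k={k})={100 * err:.2f}%")
    print("  ".join(row))
```

Printing the rows side by side shows how quickly the error grows as $k$ approaches the survival rate, because the denominator $a_{jt} - k$ shrinks.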
When there is efficient allocation (30), for conditions where the alternative to NHS treatment is very poor health ($q^o_j$ is small) or treatment has a large effect (so that $q^*_{jt}$ is large relative to $q^o_j$), the bias is small. But if $q_{jt}$ is very small (the treatment has a small effect on health) the bias is very large. Fortunately, the smaller the health gain from the treatments the smaller the weight of the treatment in the cost adjusted value weighted index (since costs are proportional to health gains by assumption (30)), and so the downward bias is bounded. But if (30) does not hold, so that unit cost is not proportional to marginal value, then the downward bias when $q_{jt}$ is small may not be offset by having a low cost weight attached to such outputs.

We also see from (43) that an increase in survival will have a smaller effect on the index the smaller is the cost weight $c_{jt}$. When (30) holds this is reasonable. The increase in the health effect from an increase in survival is proportional to $q^*_{jt}$, and the smaller is $c_{jt}$ the smaller is $q_{jt} = a_{jt}q^*_{jt} - q^o_{jt}$ and the more likely is $q^*_{jt}$ to be small. But if (30) does not hold, the fact that survival gains in low cost activities will have smaller effects on the index than survival gains in high cost activities is less appealing.

A further difficulty with the pure survival adjustment is that it takes no account of the age of the patients treated. A given post operative survival gain has the same effect on the output index if the treatments have the same cost and volume, even though one treats a much younger group of patients. Again this is reasonable if unit costs are proportional to health effects, since the cost weight adjusts for the effects of differences in average age at treatment on health effects. But if we do not believe that unit costs are proportional to health effects we may want to find another means of allowing for differences in age across HRGs.
We conclude that adjusting for survival is better than ignoring survival changes. We consider in the next two sections how it is possible to improve on a pure survival adjustment by combining a little more information (from a small sample of treatments where there is some information on health effects and from estimates of life expectancy) with further assumptions.

4.8.2 Incorporating estimates of health effects

To proceed further we must, in the current state of information about the health effects of treatment, replace knowledge with additional assumptions. We consider the implications of assuming

(a) health conditional on treatment is constant from one period to the next for all treatments

$$q^*_{jt} = q^*_{jt+1} \qquad (47)$$

(b) the ratio of health conditional on no treatment to health conditional on surviving treatment is constant over time and the same for all treatments

$$q^o_{jt}/q^*_{jt} = k \qquad (48)$$

The assumption of efficient allocation (30) by itself does not enable us to claim that a simple cost weighted index is what we want:

$$I^{xq}_{yt} = \frac{\sum_j x_{jt+1}q_{jt+1}}{\sum_j x_{jt}q_{jt}} \neq \frac{\sum_j x_{jt+1}c_{jt}}{\sum_j x_{jt}c_{jt}} \qquad (49)$$

unless the quality of care is constant.
But assumption (30) is useful since we can combine it with (47) and (48) to get

$$I^{xq}_{yt} = I^{xq}_{ct} = \frac{\sum_j x_{jt+1}\left(\dfrac{q_{jt+1}}{q_{jt}}\right)c_{jt}}{\sum_j x_{jt}c_{jt}} = \frac{\sum_j x_{jt+1}\left(\dfrac{a_{jt+1}-k}{a_{jt}-k}\right)c_{jt}}{\sum_j x_{jt}c_{jt}} \qquad (50)$$

Notice that we cannot make assumptions (30), (47) and (48) and then construct an index by applying the quality adjustment factors $a_{jt+1}-k$, $a_{jt}-k$ separately to the outputs in each year, since

$$\frac{\sum_j x_{jt+1}[a_{jt+1}-k]c_{jt}}{\sum_j x_{jt}[a_{jt}-k]c_{jt}} = \frac{\sum_j x_{jt+1}[a_{jt+1}-k]\lambda_t\pi_t q_{jt}}{\sum_j x_{jt}[a_{jt}-k]\lambda_t\pi_t q_{jt}} = \frac{\sum_j x_{jt+1}(q_{jt+1}/q^*_j)\,q_{jt}}{\sum_j x_{jt}(q_{jt}/q^*_j)\,q_{jt}} = \frac{\sum_j x_{jt+1}q_{jt+1}(q_{jt}/q^*_j)}{\sum_j x_{jt}q_{jt}(q_{jt}/q^*_j)} \qquad (51)$$

$$\neq \frac{\sum_j x_{jt+1}q_{jt+1}}{\sum_j x_{jt}q_{jt}} = I^{xq}_{yt} \qquad (52)$$

Only if $q_{jt}/q^*_j$ is the same across all treatments would separate application of the quality adjustment factors $a_{jt+1}-k$, $a_{jt}-k$ to $x_{jt+1}$, $x_{jt}$ produce the correct result. Our assumptions imply that $q^o_j/q^*_j$ is constant across treatments, not that $q_{jt}/q^*_j$ is constant across treatments. The difference is that $q^o_j/q^*_j$ does not involve the survival rate $a_{jt}$, whereas $q_{jt}/q^*_j = (a_{jt}q^*_j - q^o_j)/q^*_j = a_{jt} - q^o_j/q^*_j = a_{jt} - k$ does. To ensure that $q_{jt}/q^*_j$ is equal across all treatments we would have to make the additional (and patently false) assumption that all $a_{jt}$ are equal, which then means that $a_{jt+1}$ must be the same across all $j$, though possibly different from $a_{jt}$. Hence the separate application of the quality adjustment factors $a_{jt+1}-k$, $a_{jt}-k$ to $x_{jt+1}$, $x_{jt}$ as in (51) is valid only if we can apply the same adjustment factors to all treatments, which is equivalent to scaling a simple cost weighted index by $(a_{t+1}-k)/(a_t-k)$.

From our review of the EQ5D literature and analysis of data from BUPA and York NHS Trust (summarised in Appendix C), we have snapshot estimates of health status before ($h^b_l$) and after ($h^*_l$) for a limited set of treatments.
(See section 6 where we use these estimates in a specimen index for the set of treatments to illustrate, inter alia, the implications of calculating a value weighted index rather than a cost weighted index.) We also use these estimates in section 5 to get very rough estimates of the cost weighted quality adjusted output index for all activities $I^{xq}_{ct}$ by making assumptions (a) and (b) above and using $h^o/h^*$, the average value of $h^b_l/h^*_l$ for our limited sample of treatments, as an estimate of $q^o_j/q^*_j = k$ in (50). We use $k = 0.8$ as our base case but consider variants 0.7 and 0.9. Notice that we are estimating a ratio of sums of discounted QALYs by a ratio of health status snapshots. We discuss the implications further in section 4.8.3. Clearly the assumption that the set of treatments for which we happen to have data on $h^b_l/h^*_l$ is representative of the effects of all NHS treatment is very strong, but we make it to illustrate the importance of having information on the health effects of treatment.

Calculating the index (50) with $k$ set equal to the mean of the values in the sample of procedures for which there are health outcome data creates a problem with some activities which appear to have a negative health effect given the assumed value of $k$ and the observed value of $a_{jt}$. Some activities have high mortality rates so that the terms $a_{jt+1}-k$, $a_{jt}-k$ in the quality adjustment factor in (50) are close to zero or negative. Small changes in $a_{jt}$ can then lead to large changes in the index, and if both terms are negative the adjustment will indicate negative growth when there has been an improvement in output in the sense that $0 > a_{jt+1}-k > a_{jt}-k$. A negative value of $a_{jt}-k$ implies that the activity has a negative social value. This may be true for some treatments but is clearly incorrect for others, such as terminal care.
In the case of terminal care the problem arises from the factorisation of the health effects as $a_{jt}q^*_{jt} - q^o_{jt}$ where $q^*_{jt}$ is the post treatment health stream conditional on survival. This is not appropriate for terminal care since all treatment ends in death. The solution is to reformulate the health gain as total discounted QALYs from start of treatment minus $q^o_{jt}$. Terminal care can then have a positive health outcome: patients are better off with terminal care than without it.

In other cases the problem may be that our estimate of $k_j$ as the mean of our sample of procedures for which there are outcome data is too large: if we had health data specific to the treatment, $a_{jt}-k_j$ would be positive.

Finally, it is possible that there are treatments which have a negative health effect and no other valuable characteristics: they have a negative social value. These create problems for a cost weighted index because they clearly violate the underlying assumption that their unit cost measures their social value: unit costs cannot be negative.

If we had information on health effects and could use health effect weights (as in (28)) then activities with negative or very small social value would not lead to small changes in $a_{jt}$ having disproportionate effects on the index because their weight in the index would be negative or very small. In the absence of such information we have to make ad hoc adjustments to calculate an index which is not disproportionately sensitive to changes in $a_{jt}$ for activities with small or negative $a_{jt}-k$. We adopt a cut off rule: if either $a_{jt+1}-k$ or $a_{jt}-k$ is less than a threshold value, say 0.15, we use the pure survival adjustment $a_{jt+1}/a_{jt}$ and otherwise we use $(a_{jt+1}-k)/(a_{jt}-k)$. We calculate indices with various values of the cut off in section 5.4.

Table 4.3 shows the magnitude of the error in the calculated growth rate in the health effect for different size errors in the estimated value of $k = q^o_t/q^*_t$.
The illustration assumes that the survival rate is 0.97, which is not far from the average post-operative survival rate. The assumed proportionate increase in survival and reduction in the mortality rate are on the large side, as is the true growth rate in the health effect when the true $k$ exceeds 0.70. Notice again that because survival is high the survival growth rate (1.03%) is low, and is considerably less than the true growth in the health effect in the second column. The third column shows the error in adjusting purely by survival, i.e. by setting $k = 0$. A pure survival adjustment does worse than assuming a positive value of $k$ when the true value of $k$ is 0.71 or above: it does worse when the proportionate effect of treatment is smaller. Our central estimate of $k$ based on the small sample of HRGs where there is health effects data is 0.8, and for a true value of $k = 0.81$ we see that the pure survival adjustment is worse than setting $k$ at 0.7 or 0.8.

Table 4.3 Effect of error in estimated $k = q^o_t/q^*_t$ on error in calculated growth rate in health effect

Year t survival 0.97, mortality 0.03; year t+1 survival 0.98, mortality 0.02; % growth in survival 1.03%, % growth in mortality -33%.

| True k | True growth in health effect | Estimated minus true growth, estimated k = 0.0 (estimated growth 1.03%) | Estimated k = 0.7 (estimated growth 3.70%) | Estimated k = 0.8 (estimated growth 5.88%) | Estimated k = 0.9 (estimated growth 14.29%) |
|---|---|---|---|---|---|
| 0.91 | 16.67% | -15.64% | -12.96% | -10.78% | -2.38% |
| 0.85 | 8.33% | -7.30% | -4.63% | -2.45% | 5.95% |
| 0.81 | 6.25% | -5.22% | -2.55% | -0.37% | 8.04% |
| 0.71 | 3.85% | -2.82% | -0.14% | 2.04% | 10.44% |
| 0.51 | 2.17% | -1.14% | 1.53% | 3.71% | 12.11% |
| 0.31 | 1.52% | -0.48% | 2.19% | 4.37% | 12.77% |

The table suggests that even using the same fairly rough and ready estimate of $k$ for all treatments where survival exceeds $k$ by a reasonable margin (say 0.05 or more) will be better than just using a pure short term survival adjustment.
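The cut off rule and the sensitivity calculation behind Table 4.3 can be expressed in a few lines. The sketch below is illustrative only: the function names are introduced here, and the default threshold of 0.15 is the "say 0.15" value mentioned in the text rather than a recommended setting.

```python
def health_adjustment_factor(a0, a1, k, cutoff=0.15):
    """Quality adjustment factor with the cut off rule.

    Uses (a1 - k)/(a0 - k) unless either a1 - k or a0 - k falls below the
    cut off, in which case it falls back to the pure survival ratio a1/a0.
    """
    if (a1 - k) < cutoff or (a0 - k) < cutoff:
        return a1 / a0
    return (a1 - k) / (a0 - k)


def growth_error(a0, a1, k_estimated, k_true):
    """Estimated minus true growth in the health effect (no cut off applied)."""
    estimated = (a1 - k_estimated) / (a0 - k_estimated) - 1.0
    true = (a1 - k_true) / (a0 - k_true) - 1.0
    return estimated - true


# High survival activity: a - k is comfortably above the cut off.
print(health_adjustment_factor(0.97, 0.98, k=0.8))
# High mortality activity: a - k is below the cut off, so a1/a0 is used.
print(health_adjustment_factor(0.90, 0.91, k=0.8))

# One cell of Table 4.3: estimated k = 0.8 when the true k is 0.81.
print(f"{100 * growth_error(0.97, 0.98, 0.8, 0.81):.2f}%")
```

Setting `k_estimated=0.0` in `growth_error` gives the pure survival adjustment column of the table, which makes it easy to compare the two approaches for any assumed true value of $k$.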
4.8.3 Life expectancy and health effects

We now consider how it is possible to use information on the age of treated patients to modify the survival and health effects adjustments. Consider a simple example. Let $L_{jt}$ be the certain remaining length of life of patients who survive treatment $j$ in year $t$ and of those who are not treated. $h^*_j$ and $h^o_j$ are the levels of health status conditional on treatment and without treatment in all periods; these are constant over the remaining life of patients and are not affected by the period of treatment (there is no technological progress). The expected discounted health gain from treatment $j$ in period $t$ is

$$q_{jt} = a_{jt}\int_0^{L_{jt}}h^*_j e^{-rs}\,ds - \int_0^{L_{jt}}h^o_j e^{-rs}\,ds = (a_{jt}h^*_j - h^o_j)\left(\frac{1-e^{-rL_{jt}}}{r}\right) \qquad (53)$$

The quality adjustment factor to be applied between periods $t$ and $t+1$ is therefore

$$\frac{q_{jt+1}}{q_{jt}} = \frac{(a_{jt+1}h^*_j - h^o_j)}{(a_{jt}h^*_j - h^o_j)}\left(\frac{1-e^{-rL_{jt+1}}}{1-e^{-rL_{jt}}}\right) \qquad (54)$$

Replacing $h^o_j/h^*_j$ with the constant $k$, which we estimate from the mean of our sample of health effect studies as in the previous section, the quality adjusted cost weighted output index analogous to (50) but allowing for changing life expectancy due to changes in the mix of patient types is

$$I^{xa}_{ct} = \frac{\sum_j x_{jt+1}c_{jt}\,\dfrac{(a_{jt+1}-k)}{(a_{jt}-k)}\left(\dfrac{1-e^{-rL_{jt+1}}}{1-e^{-rL_{jt}}}\right)}{\sum_j x_{jt}c_{jt}} \qquad (55)$$

Notice that in section 4.8.2 we assumed that the mean ratio of snapshot health status for our small set of specimen HRGs for which such data exist was equal to the ratio of sums of discounted QALYs over the lifetime of patients ($q^o_j/q^*_j$). Here we make the possibly more plausible assumption that the ratio of snapshot health status values is equal to the ratio of snapshot health status without and with treatment ($h^o_j/h^*_j$). The adjustment rests on the implicit assumption that all patients in $t$ have the same life expectancy $L_{jt}$.
Since patients generally differ by age and often by gender it will matter whether we calculate the life expectancy adjustment using estimates based on the mean age and gender of patients, $e^{-rL_{jt}} = e^{-rE_iL_{ijt}}$, or whether we calculate $e^{-rL_{ijt}}$ for each age and gender group and then use $Ee^{-rL_{ijt}}$. The difference between $Ee^{-rL_{ijt}}$ and $e^{-rL_{jt}}$ is typically very small, less than 0.5% on average for electives. In section 5 we report results using both approaches to determine how sensitive the indices are to the use of grouped or individual calculations of the life expectancy adjustment. We use data on age specific health status from the 1996 Health Survey for England plus life tables to calculate healthy life expectancy, rather than actual life expectancy, for use in the output indices. See Appendix A.

Note that if life expectancy does not change between periods the life expectancy terms in (55) cancel out. Thus the index does not reflect cross treatment differences in age at treatment. The rationale is that we have assumed that costs are proportional to the marginal value of treatment so that any differences in average age at treatment which affect life expectancy and health gains are already allowed for. Since we have suggested that this assumption is not appealing we have investigated using life expectancy with the survival adjustment and estimated health effects in our specimen index where we have HRG specific information on the health effects.
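The life expectancy adjustment in (54) and the distinction between the grouped and individual calculations can be illustrated as follows. This is a sketch under stated assumptions: the function names, the discount rate $r = 0.035$, and the three age groups with their life expectancies and shares are all hypothetical and are not taken from the report or its appendices.

```python
import math


def life_expectancy_term(life_years, r):
    """The term 1 - exp(-r L) that appears in equations (53) to (55)."""
    return 1.0 - math.exp(-r * life_years)


def adjustment_with_life_expectancy(a0, a1, k, life0, life1, r):
    """Equation (54) with h^o/h^* replaced by the constant k."""
    survival_part = (a1 - k) / (a0 - k)
    life_part = life_expectancy_term(life1, r) / life_expectancy_term(life0, r)
    return survival_part * life_part


def grouped_and_individual_discount(life_by_group, shares, r):
    """Return (exp(-r E[L]), E[exp(-r L)]) for a mix of patient groups."""
    mean_life = sum(w * life for w, life in zip(shares, life_by_group))
    grouped = math.exp(-r * mean_life)
    individual = sum(w * math.exp(-r * life)
                     for w, life in zip(shares, life_by_group))
    return grouped, individual


r = 0.035  # assumed discount rate

# If life expectancy is unchanged, only the survival part remains.
print(adjustment_with_life_expectancy(0.97, 0.98, 0.8, 12.0, 12.0, r))
# A shift towards patients with longer remaining life raises the factor.
print(adjustment_with_life_expectancy(0.97, 0.98, 0.8, 12.0, 12.5, r))

# Hypothetical patient mix: remaining life in years and population shares.
grouped, individual = grouped_and_individual_discount(
    life_by_group=[5.0, 15.0, 30.0], shares=[0.2, 0.5, 0.3], r=r)
print(f"grouped: {grouped:.4f}  individual: {individual:.4f}")
```

Because $e^{-rL}$ is convex in $L$, the individual calculation is never smaller than the grouped one; how far apart they are depends on how dispersed life expectancy is within the activity.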
We report in section 6.4 the results from estimating

$$\frac{\sum_j x_{jt+1}(a_{jt+1}h^*_j - h^o_j)(1-e^{-rL_{jt+1}})}{\sum_j x_{jt}(a_{jt}h^*_j - h^o_j)(1-e^{-rL_{jt}})} \qquad (56)$$

4.8.4 Cost of death adjustment

The conclusion from sections 4.8.1 to 4.8.3 is that in the absence of much of the required data on the effects of NHS activity on health we have either to use a simple survival adjustment, which will have a small effect, or to make strong assumptions, bolstered by further ad hoc adjustments, in order to incorporate health effects into a cost weighted index. There is a third possibility, to which we now turn, which replaces the assumption of efficient allocation (30) with another assumption about the relationship between unit costs and the health effects of treatment.

Instead of making the assumption of efficient allocation (30) to use information on costs $c_{jt}$ to make inferences about the effect of treatment on health, suppose we assume that unit costs are equal to the value of output before making any allowance for death:

$$c_{jt} = \pi_t(q^*_{jt} - q^o_{jt}) = \pi_t(q_{jt} + m_{jt}q^*_{jt}) \qquad (57)$$

This implies that health care providers take no account of mortality risk when determining treatment and only consider the gain in health from successful treatment. Then we can use the assumption to estimate the value of the true health effects, which allow for mortality risk,

$$\pi_t q_{jt} = \pi_t\left[(1-m_{jt})q^*_{jt} - q^o_{jt}\right] = c_{jt} - m_{jt}\pi_t q^*_{jt} \qquad (58)$$

and the value of $q_{jt+1}$ at the period $t$ value of health is

$$\pi_t q_{jt+1} = \frac{q^*_{jt+1} - q^o_{jt+1}}{q^*_{jt} - q^o_{jt}}\,c_{jt} - \pi_t m_{jt+1}q^*_{jt+1} \qquad (59)$$

This gives an estimate of the value weighted index as

$$I^{\gamma}_x = \frac{\sum_j\left\{\dfrac{q^*_{jt+1} - q^o_{jt+1}}{q^*_{jt} - q^o_{jt}}\,c_{jt} - m_{jt+1}\pi_t q^*_{jt+1}\right\}x_{jt+1}}{\sum_j\left\{c_{jt} - m_{jt}\pi_t q^*_{jt}\right\}x_{jt}} \qquad (60)$$

If nothing is known about $\dfrac{q^*_{jt+1} - q^o_{jt+1}}{q^*_{jt} - q^o_{jt}}$ we can set $\dfrac{q^*_{jt+1} - q^o_{jt+1}}{q^*_{jt} - q^o_{jt}} = 1$. This is equivalent to assuming that $q^o_{jt} = k_jq^*_{jt}$ for all $j$ and $t$ and $q^*_{jt} = q^*_{jt+1}$.
The index then simplifies to

$$I^{\gamma}_x = \frac{\sum_j\left\{c_{jt} - m_{jt+1}\pi_t q^*_{jt+1}\right\}x_{jt+1}}{\sum_j\left\{c_{jt} - m_{jt}\pi_t q^*_{jt}\right\}x_{jt}} \qquad (61)$$

We can interpret $\pi_t q^*_{jt}$ as the cost of death for a patient who would have survived treatment. Thus the index in (61) is a cost weighted output index in which we deduct a cost of death from the unit cost of activity. We can use $\pi_t$ = £30,000, a value which is generally believed consistent with the approach used by NICE, in order to calculate $I^{\gamma}_x$.

However, we still need an estimate of the discounted sum of QALYs obtained by the average individual who receives treatment $j$ in periods $t$ and $t+1$. One possibility is to assume that $q^*_{jt}$ is proportional to the average expected discounted sum of QALYs for people with the same age and gender distribution as those treated ($\hat{q}_{jt}$). We can use general population estimates of QALYs from the Health Survey for England, combined with appropriate life tables. However, we are still left with the problem of estimating the proportionality factors $f_{jt} = q^*_{jt}/\hat{q}_{jt}$. We would expect the factor for patients receiving cataract treatment to differ from the factor for those undergoing heart surgery. The only source of information from which we can infer patient health post treatment is the death rate. One apparently simple and appealing possibility is to employ a proportionality factor $f_{jt}(m_{jt}) = 1 - m_{jt}$ with the aim of ensuring that we have an index which attaches a low cost to each death in those treatments which have a high mortality rate. With this adjustment we obtain the index

$$I^{\delta}_x = \frac{\sum_j\left\{c_{jt} - m_{jt+1}\pi_t f_{jt+1}(m_{jt+1})\hat{q}_{jt+1}\right\}x_{jt+1}}{\sum_j\left\{c_{jt} - m_{jt}\pi_t f_{jt}(m_{jt})\hat{q}_{jt}\right\}x_{jt}} = \frac{\sum_j\left\{c_{jt} - m_{jt+1}\pi_t(1-m_{jt+1})\hat{q}_{jt+1}\right\}x_{jt+1}}{\sum_j\left\{c_{jt} - m_{jt}\pi_t(1-m_{jt})\hat{q}_{jt}\right\}x_{jt}} \qquad (62)$$

where $\hat{q}_{jt}$ is estimated from life tables, HSE QALY estimates and the age-gender distribution of patients getting treatment $j$.
Since very few mortality rates exceed 0.5 we can avoid the difficulty that the cost of death is decreasing with the mortality rate when it exceeds 0.5 by setting $f_{jt} = 0$ when $m_{jt} > 0.5$.

We experimented with calculations of the index based on a value of £30,000 per QALY. We found that the simple proportionality factor $f_{jt} = 1 - m_{jt}$ points to the hospital service as a whole subtracting output. There is no practical resolution to this bizarre result except to use a scaling factor which is several orders of magnitude smaller than $1 - m$. This would be entirely arbitrary. In addition, the underlying assumption (57) about the relationship between unit costs and health effects on which the cost of death adjustment rests is less acceptable than the efficient allocation assumption made elsewhere in this report. We do not recommend this approach.

4.8.5 In-hospital versus 30 day mortality

Hospital Episode Statistics have a field indicating whether the patient was dead or alive on discharge from hospital. The data are available from 1988/89 onwards. It is also possible, though it requires considerable processing, to match HES records with ONS mortality records to count mortality within any required period after admission to hospital. HES has recently introduced a field recording the date of death if the date was between the start of the HES year (1 April) and 30 April of the following year (30 days after the end of the HES year). Thus it is possible to count deaths in hospital plus those within 30 days of discharge. We assign deaths to HRGs using the HRG of the first episode of the spell where a spell consists of more than one episode.

There are three obvious counts of deaths: in hospital deaths; deaths within 30 days of admission; and in hospital deaths plus deaths within 30 days of discharge. Measuring deaths within 30 days of discharge will represent a longer follow up than deaths within 30 days of admission. Some deaths (e.g.
road accidents) outside hospital will have nothing to do with the quality of NHS care. Moreover, the matching of HES to ONS is not perfect: around 10% of spells with a discharged dead code according to HES do not have a matching ONS death record (see Appendix B). This suggests that some patients discharged alive but dying within 30 days may not have a matching ONS death record. Hence using in hospital deaths from HES plus deaths within 30 days after discharge from ONS will understate the 30 day post discharge mortality rate. On the other hand, counting only deaths in hospital runs the risk of missing deaths which occur outside hospital and which are capable of being affected by the quality of care.

The correlations between in-hospital and 30 day post discharge survival rates for all HRGs in 2002/03 are 0.985 for electives and 0.991 for non-electives. (The survival rates are based on CIPS since this is our preferred unit of output for the NHS hospital sector.) The correlations between the growth rates of the two measures of survival rates (in-hospital and 30 day), taken between 2001/02 and 2002/03, are 0.944 for electives and 0.994 for non-electives. Figures 4.2 and 4.3 plot the growth rates of in-hospital and 30-day survival rates between 2001/02 and 2002/03 for elective HRGs and for non-elective HRGs. These growth rates have been Winsorised at the 5th and 95th percentile to remove outlier observations.
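Winsorising at the 5th and 95th percentiles replaces values beyond those percentiles with the percentile values themselves, rather than dropping them. A minimal sketch follows; the linear-interpolation percentile rule and the growth rates are assumptions made for illustration, since the report does not state which percentile definition or software was used.

```python
import math


def percentile(values, p):
    """Percentile by linear interpolation between order statistics (assumed rule)."""
    s = sorted(values)
    pos = p * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def winsorise(values, lower=0.05, upper=0.95):
    """Clamp every value to the [lower, upper] percentile range."""
    lo, hi = percentile(values, lower), percentile(values, upper)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical growth rates in HRG survival rates, with two outliers.
growth = [-0.30, -0.004, -0.002, 0.0, 0.001, 0.002, 0.003, 0.005, 0.008, 0.25]
clipped = winsorise(growth)
```

The two extreme observations are pulled in towards the rest of the distribution while the remaining values are left unchanged, so the number of HRGs in the comparison is preserved.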
Figure 4.2 Growth rates for in-hospital and 30 day CIPS based survival rates, 2001/02-2002/03, electives [scatter plot: Winsorised growth in in-hospital survival rates against Winsorised growth in 30 day survival rates, both axes from -.02 to .02]

Figure 4.3 Growth rates for in-hospital and 30 day survival rates, 2001/02-2002/03, non-electives [scatter plot: Winsorised growth in in-hospital survival rates against Winsorised growth in 30 day survival rates, both axes from -.02 to .02]

Some HRGs have small numbers of cases (see Appendix B) so that their death rates are subject to large random fluctuations. It would be possible to allow for small number randomness with various shrinkage estimators but we decided not to do so. Precisely because such HRGs have small amounts of activity they will have little influence on the survival adjusted indices, as they will account for a tiny proportion of activity.

We have estimated indices with both types of death rate in section 5. We feel that at the moment the choice between possibly more accurately recorded in hospital deaths and the possibly more useful but less well measured 30 day deaths is finely balanced, but we have a mild preference for 30 day mortality. The data on 30 day deaths should continue to improve. Moreover, there is some evidence that publication of in hospital mortality rates in the US led to reductions in reported in-hospital mortality for some conditions but an increase in reported 30 day deaths (Baker et al., 2002). Use of mortality data to quality adjust an output series does not necessitate its use as a performance indicator but it seems more prudent to use a mortality measure which is less susceptible to manipulation. We recommend that the DH continue to encourage the refinement of the record linkage and that the short term survival adjustment be based on 30 day deaths.
4.8.6 Conclusions: survival based quality adjustment

In this sub section we have
• constructed a set of quality adjustments to the cost weighted output index which attempt to allow for the changing health effects of treatment.
• shown how to use estimates on long term survival (say up to five years) when these become available (section 4.7).
• shown how to use existing measures of short term survival (section 4.8.1).
• demonstrated that the pure survival adjustment will almost certainly underestimate the true growth in the effect of treatment on health.
• shown how guesstimates of the proportional effect of treatment compared to no treatment (section 4.8.2) and data on life expectancy (section 4.8.3) can be incorporated into survival adjustment.
• shown that assuming that the unit costs are proportional to health effects when mortality risk is ignored, rather than making the standard assumption that allocation is efficient and unit costs proportional to health, leads to an adjustment which takes the form of a deduction of a cost of death from the output valued using the unit costs (section 4.8.4).

We defer judgement on recommending one of these adjustments until we have considered the results from calculating them on actual data, either on the whole of the hospital sector (section 5) or for our specimen set of HRGs for which we have some health effect data (section 6). Section 7 contains calculations based on our preferred variant. We wish to see if the resulting estimates of output growth are either implausible or overly sensitive to unverifiable assumptions about the parameters. However, in light of the dubious underlying assumptions about unit costs required to derive the cost of death adjustment (section 4.8.4) and some preliminary calculations, we do not recommend it and so do not report results using it.
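As a rough illustration of the survival based adjustments summarised above, the sketch below scales each HRG's cost weighted volume by the growth in its survival rate (the simple adjustment of section 4.8.1 when k = 0) or by the growth in the survival rate net of an assumed proportionate health effect k (section 4.8.2). The function name and every number are hypothetical; the sketch does not include the fallback needed when a survival rate is close to k.

```python
def survival_adjusted_index(c, x0, x1, a0, a1, k=0.0):
    """Cost weighted output index with each HRG scaled by (a1 - k) / (a0 - k).

    c: period t unit costs; x0, x1: volumes in t and t+1;
    a0, a1: survival rates in t and t+1; k: assumed ratio of untreated to
    treated health (k = 0 gives the pure survival adjustment).
    """
    num = sum(cj * xj1 * (aj1 - k) / (aj0 - k)
              for cj, xj1, aj0, aj1 in zip(c, x1, a0, a1))
    den = sum(cj * xj0 for cj, xj0 in zip(c, x0))
    return num / den


# Two hypothetical HRGs: volumes grow 10%; survival improves only in the second.
c = [1000.0, 5000.0]
x0, x1 = [100, 20], [110, 22]
a0, a1 = [0.99, 0.90], [0.99, 0.92]

unadjusted = survival_adjusted_index(c, x0, x1, a0, a0)
pure_survival = survival_adjusted_index(c, x0, x1, a0, a1)
with_health_effect = survival_adjusted_index(c, x0, x1, a0, a1, k=0.8)
```

Because the denominator a0 - k shrinks as k approaches the survival rate, the same improvement in survival has a much larger effect on the index with k = 0.8 than with k = 0, which is the source of the instability noted for HRGs with low survival rates.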
Table 4.4 Summary of output indices with survival based adjustments

Quality adjustments: Long term survival
Weights: Costs
Form: $$\frac{\sum_j x_{jt+1} c_{jt}\,\dfrac{a_{jt+1}}{a_{jt}} \sum_{s=1}^{S} \left(\dfrac{\sigma^*_{jt+1}(s)}{\sigma^*_{jt}(s)}\right)\left(\dfrac{\sigma^*_{jt}(s)\,\delta^s}{\sum_{s=1}^{5} \sigma^*_{jt}(s)\,\delta^s}\right)}{\sum_j x_{jt} c_{jt}}$$
Rationale in section: 4.7
Results in section: Not yet feasible
Assumptions: Efficient allocation. $h^*_{jt}(s)$ constant for $s = 1,\dots,5$; zero for $s > 5$. $h^o_{jt}(s) = 0$ for all $s$.
Comments: Survival increases in more costly HRGs have bigger impact.

Quality adjustments: Survival
Weights: Costs
Form: $$\frac{\sum_j c_{jt} x_{jt+1} \left(\dfrac{a_{jt+1}}{a_{jt}}\right)}{\sum_j c_{jt} x_{jt}}$$
Rationale in section: 4.8.1
Results in section: 5.4.1, 6.3
Assumptions: Efficient allocation.
Comments: Survival growth has larger effect if HRG more costly. Underestimates true growth. Impact of survival unaffected by age of those treated.

Quality adjustments: Survival. Health effect.
Weights: Costs
Form: $$\frac{\sum_j x_{jt+1} \left(\dfrac{a_{jt+1} - k_j}{a_{jt} - k_j}\right) c_{jt}}{\sum_j x_{jt} c_{jt}}$$
Rationale in section: 4.8.2
Results in section: 5.4.2, 6.4
Assumptions: Efficient allocation. $q^*_{jt}$, $q^o_{jt}$ constant. Same proportionate effect of treatment on QALYs with and without treatment for all HRGs ($q^o_j / q^*_j = k$) in sections 5 and 6. $k_j$ varies across $j$ in section 7.
Comments: Quality growth has larger effect if HRG more costly. Quality adjustment unaffected by age of those treated. Requires adjustment to be set to $a_{jt+1}/a_{jt}$ when the survival rate is low (close to $k$) to avoid instability in index.

Quality adjustments: Survival. Health effect. Life expectancy.
Weights: Costs
Form: $$\frac{\sum_j x_{jt+1} c_{jt} \left(\dfrac{a_{jt+1} - k}{a_{jt} - k}\right)\left(\dfrac{1 - e^{-rL_{jt+1}}}{1 - e^{-rL_{jt}}}\right)}{\sum_j x_{jt} c_{jt}}$$
Rationale in section: 4.8.3
Results in section: 5.4.3
Assumptions: Efficient allocation. $h^*_{jt}$, $h^o_{jt}$ constant; same proportionate effect of treatment on health status for all HRGs in sections 5 and 6. $k_j$ varies across $j$ in section 7. Certain life; same for treated and untreated; same for all patients in HRG.
Comments: Quality growth has larger effect if HRG more costly. Requires adjustment to be set to $a_{jt+1}/a_{jt}$ when the survival rate is low (close to $k$) to avoid instability in index.

Quality adjustments: Survival. Life expectancy.
Weights: Life expectancy
Form: $$\frac{\sum_j x_{jt+1}\,(a_{jt+1}h_j^* - h_j^o)\,(1 - e^{-rL_{jt+1}})}{\sum_j x_{jt}\,(a_{jt}h_j^* - h_j^o)\,(1 - e^{-rL_{jt}})}$$
Rationale in section: 4.8.3
Results in section: 6.4
Assumptions: $h^*_{jt}$, $h^o_{jt}$ constant. Certain life; same for treated and untreated; same for all patients in HRG.

Quality adjustments: Cost of death
Weights: Costs
Form: $$\frac{\sum_j \left\{c_{jt} - m_{jt+1}\pi_t f(m_{jt+1})\hat{q}_{jt+1}\right\} x_{jt+1}}{\sum_j \left\{c_{jt} - m_{jt}\pi_t f(m_{jt})\hat{q}_{jt}\right\} x_{jt}}$$
Rationale in section: 4.8.4
Results in section: Not reported
Assumptions: Cost proportional to health effect ignoring mortality risk.
Comments: Requires proportionality factor $f_{jt}$ to be very small to avoid negative output.

Notes: $c_{jt}$ unit cost; $x_{jt}$ volume; $a_{jt}$, $m_{jt}$ proportion of patients alive, dead on discharge (or after 30 days); $h^*_j$ constant health status conditional on surviving treatment; $h^o_j$ constant health status if not treated; $\sigma^*_t(s)$ probability of surviving $s$ years; $k$ estimate of proportionate effect of treatment on quality adjusted life years (QALYs); $L_{jt}$ life expectancy at mean age of patients treated; $r$ discount rate on QALYs; $\pi_t$ value of QALY (£s); $\hat{q}_{jt}$ QALYs lost by death of average patient; $f_{jt}(m_{jt})$ proportionality factor measuring seriousness of HRG.

4.9 Readmissions

4.9.1 Readmissions as health effects

A non-trivial proportion of patients are readmitted to hospital within 28 days of discharge as emergencies. Figure 4.4 shows a rise in these readmissions over time, with the data being described in more detail in the Appendix. Some of these readmissions will reflect poor quality care, in that with proper care the readmission would not have occurred (Hofer and Hayward, 1995; Ludke et al, 1993; Thomas, 1996). However, it is not possible from these data to distinguish whether some failure of treatment occurred during the first hospital stay because “readmissions” are defined as any emergency admission at all within 28 days of discharge, whether or not they were related to the earlier admission. Since 2001 readmission rates have been used by the Department of Health, CHI and the Healthcare Commission as performance indicators for NHS Trusts.
In this section we discuss whether and how readmissions should be used to quality adjust a cost weighted output index to reflect poor quality care on first admission. Since readmissions may also be a distressing experience for patients per se, irrespective of the consequences of poor care at first admission for their health, we also consider in section 4.9.2 how to adjust for readmissions as an aspect of the patient experience.

Figure 4.4 Emergency readmission to hospital within 28 days of discharge, by age band [chart: readmission rate (percent), 1998/99 to 2003/04, for age bands 0-15, 16-74 and 75+]. Source: Department of Health

For some procedures a proportion of patients will be readmitted because of complications arising from the treatment. Let $x^1_{jt}$ be the number of first admissions and $x^2_{jt}$ be the number of readmissions, so that the readmission rate is $R_{jt} = x^2_{jt}/x^1_{jt}$. The readmissions may be in a different HRG but this is allowed for in the notation since $x^2_{jt}$ is the number of readmissions to whatever is the appropriate HRG for readmissions for unsuccessful treatments from the initial HRG $j$.

If the first admission is successful the health gain to the patient is $q^{1*}_{jt} - q^{1o}_{jt}$ (assume for the moment that there is no mortality risk for admissions or readmissions). If the operation is unsuccessful (probability $F_{jt}$) and they are not readmitted, the effect of the first admission for this unsuccessful group is $q^{2o}_{jt} - q^{1o}_{jt}$. But if they are readmitted they get a health gain from readmission, compared to having no readmission, of $q^{2*}_{jt} - q^{2o}_{jt}$. We assume that the interval between admission and readmission is short enough to ignore any effect of an unsuccessful first admission on health for the time between admissions or that health change after an unsuccessful first admission is zero.
The value of total health gain is

$$\sum_j \pi_t \left[ x^1_{jt}\,(q^{1*}_{jt} - q^{1o}_{jt})(1 - F_{jt}) + x^1_{jt} F_{jt}\,(q^{2o}_{jt} - q^{1o}_{jt}) + x^2_{jt}\,(q^{2*}_{jt} - q^{2o}_{jt}) \right] \tag{63}$$

If the NHS maximises the value of health subject to the constraints that total cost does not exceed the NHS budget and that $F_{jt} x^1_{jt} - x^2_{jt} \ge 0$, then

$$c^1_{jt} = \lambda_t \pi_t \left[ (1 - F_{jt})(q^{1*}_{jt} - q^{1o}_{jt}) + F_{jt}\,(q^{2o}_{jt} - q^{1o}_{jt}) + F_{jt}\mu_{jt} \right] \tag{64}$$

$$c^2_{jt} = \lambda_t \pi_t \left[ (q^{2*}_{jt} - q^{2o}_{jt}) - \mu_{jt} \right] \tag{65}$$

where $\mu_{jt}$ is the Lagrange multiplier on $F_{jt} x^1_{jt} - x^2_{jt} \ge 0$ and $\lambda_t$ is the reciprocal of the multiplier on the budget constraint. Consider solutions in which all those for whom the first treatment is not successful are readmitted, $F_{jt} = R_{jt}$, so that the constraint binds and $\mu_{jt} > 0$. (This makes no essential difference to the conclusions.) We can substitute for $\mu_{jt}$ in (64) to get

$$c^1_{jt} = \lambda_t \pi_t \left[ (1 - R_{jt})(q^{1*}_{jt} - q^{1o}_{jt}) + R_{jt}\,(q^{2*}_{jt} - q^{1o}_{jt}) \right] - R_{jt} c^2_{jt} \tag{66}$$

In allocating resources to first admissions allowance must be made for the fact that with probability $F_{jt} = R_{jt}$ the patient will have an unsuccessful first admission and then be readmitted, getting a health gain $q^{2*}_{jt} - q^{1o}_{jt}$ but generating additional costs $c^2_{jt}$.
The health gain when $F_{jt} = R_{jt}$ is

$$\sum_j \pi_t \left[ x^1_{jt}\,(q^{1*}_{jt} - q^{1o}_{jt})(1 - R_{jt}) + x^2_{jt}\,(q^{2*}_{jt} - q^{1o}_{jt}) \right] \tag{67}$$

Current practice is to calculate a cost weighted output index (CWOI) which includes both first and subsequent admissions

$$\frac{\sum_j \left[ x^1_{jt+1} c^1_{jt} + x^2_{jt+1} c^2_{jt} \right]}{\sum_j \left[ x^1_{jt} c^1_{jt} + x^2_{jt} c^2_{jt} \right]} \tag{68}$$

If there is efficient allocation ((65) and (66) hold) and if there is no change in health effects or in the readmission rate then the CWOI (68) equals the value weighted output index

$$\frac{\sum_j \left[ x^1_{jt+1} c^1_{jt} + x^2_{jt+1} c^2_{jt} \right]}{\sum_j \left[ x^1_{jt} c^1_{jt} + x^2_{jt} c^2_{jt} \right]} = \frac{\sum_j \pi_t \left[ x^1_{jt+1}\,(q^{1*}_{jt+1} - q^{1o}_{jt+1})(1 - R_{jt+1}) + x^2_{jt+1}\,(q^{2*}_{jt+1} - q^{1o}_{jt+1}) \right]}{\sum_j \pi_t \left[ x^1_{jt}\,(q^{1*}_{jt} - q^{1o}_{jt})(1 - R_{jt}) + x^2_{jt}\,(q^{2*}_{jt} - q^{1o}_{jt}) \right]} \tag{69}$$

To show this use (64) with $F_{jt} = R_{jt}$ (which is equivalent to (66)) together with (65), and write the denominator in (68) as

$$\sum_j \lambda_t \pi_t \left\{ x^1_{jt} \left[ (1 - R_{jt})(q^{1*}_{jt} - q^{1o}_{jt}) + R_{jt}\,(q^{2o}_{jt} - q^{1o}_{jt}) + R_{jt}\mu_{jt} \right] + x^2_{jt} \left[ (q^{2*}_{jt} - q^{2o}_{jt}) - \mu_{jt} \right] \right\}$$
$$= \sum_j \lambda_t \pi_t \left\{ x^1_{jt}\,(1 - R_{jt})(q^{1*}_{jt} - q^{1o}_{jt}) + x^2_{jt}\,(q^{2*}_{jt} - q^{1o}_{jt}) \right\} \tag{70}$$

where the second line uses $x^2_{jt} = R_{jt} x^1_{jt}$. This is proportional to the value of period $t$ health output at the period $t$ price of health. Each term in the numerator of (68) can be written as

$$x^1_{jt+1}\left( c^1_{jt} + R_{jt+1} c^2_{jt} \right) = x^1_{jt+1} \lambda_t \pi_t \left\{ \left[ (1 - R_{jt})(q^{1*}_{jt} - q^{1o}_{jt}) + R_{jt}\,(q^{2o}_{jt} - q^{1o}_{jt}) + R_{jt}\mu_{jt} \right] + R_{jt+1} \left[ (q^{2*}_{jt} - q^{2o}_{jt}) - \mu_{jt} \right] \right\}$$

which, with the assumption of a constant readmission rate and constant health effects, is

$$\lambda_t \pi_t \left[ x^1_{jt+1}\,(1 - R_{jt+1})(q^{1*}_{jt+1} - q^{1o}_{jt+1}) + x^2_{jt+1}\,(q^{2*}_{jt+1} - q^{1o}_{jt+1}) \right] \tag{71}$$

Hence the numerator in (68) is $\lambda_t$ times the value of health output in period $t+1$ at the period $t$ price of health. Thus no adjustment need be made to the cost weighted output index for readmissions if there is efficient allocation and if there is no change in health effects or readmission rates. This is just a particular example of the result we noted in section 2.7.
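The equivalence can be checked numerically. The sketch below builds unit costs from the first order conditions (64) and (65) with $F_{jt} = R_{jt}$, holds readmission rates and health effects constant, and compares the cost weighted index (68) with the value weighted index in (69). All parameter values, including the multipliers, are invented for illustration, and the variable names are not the report's.

```python
def unit_costs(lam, pi, R, gain1, drop, gain2, mu):
    """Unit costs implied by (64) and (65) with F = R.

    gain1 = q1* - q1o, drop = q2o - q1o, gain2 = q2* - q2o.
    """
    c1 = lam * pi * ((1 - R) * gain1 + R * drop + R * mu)
    c2 = lam * pi * (gain2 - mu)
    return c1, c2


lam, pi = 0.8, 30000.0  # assumed multiplier term and value of a QALY
# Hypothetical HRGs: (R, gain1, drop, gain2, mu, first admissions t, t+1)
hrgs = [
    (0.05, 2.0, -0.2, 1.5, 0.3, 1000, 1100),
    (0.10, 0.5, -0.1, 0.4, 0.1, 4000, 3900),
]

num_cost = den_cost = num_value = den_value = 0.0
for R, gain1, drop, gain2, mu, x1_t, x1_t1 in hrgs:
    c1, c2 = unit_costs(lam, pi, R, gain1, drop, gain2, mu)
    # Value of health gain per first admission, as in (67): q2* - q1o = gain2 + drop.
    value = pi * ((1 - R) * gain1 + R * (gain2 + drop))
    den_cost += x1_t * c1 + R * x1_t * c2      # readmissions x2 = R * x1
    num_cost += x1_t1 * c1 + R * x1_t1 * c2
    den_value += x1_t * value
    num_value += x1_t1 * value

cwoi = num_cost / den_cost
value_index = num_value / den_value
```

The multiplier terms cancel within each HRG, so the two indices agree to rounding error whatever values are chosen for mu, provided the readmission rate and health effects are the same in both periods.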
The conclusion holds if we relax the assumption that the time period between first and second admission is very short so that the change in health over this time interval can be ignored. It also holds if we allow for a non-zero mortality risk. What matters is the assumption, underlying the use of the cost weighted output index, that allocation of resources is efficient. Notice that the conclusion does not mean that readmissions do not affect the quality of care, nor does it mean that performance indicators for Trusts based on readmission rates do not measure anything of interest. Increases in readmission rates, ceteris paribus, reduce welfare. But on the assumptions made above readmissions are already allowed for in the simple cost weighted index.

If health effects change but readmission rates do not, then the quality adjustment of the index raises exactly the same issues as discussed in earlier sections. Suppose that readmission rates change but that the health effects $q^{1*}_{jt} - q^{1o}_{jt}$, $q^{2*}_{jt} - q^{1o}_{jt}$ do not, and that the assumption of efficient allocation holds. Is there an adjustment to the CWOI which ensures that it also measures the change in the value of health effects? We require adjustment factors $\varphi$ which ensure that

$$\varphi_1(\cdot)\,c^1_{jt} + \varphi_2(\cdot)\,R_{jt+1} c^2_{jt} = \lambda_t \pi_t \left[ (q^{1*}_{jt+1} - q^{1o}_{jt+1})(1 - R_{jt+1}) + (q^{2*}_{jt+1} - q^{1o}_{jt+1})\,R_{jt+1} \right] \tag{72}$$

where the adjustment factors can depend only on variables we know (or assume). In general the adjustment factors will depend on the health effects and, if we are unwilling to make any assumptions about them, there is no set of adjustment factors depending on cost and readmission rates which will satisfy (72). But if we are willing to make assumptions about $q^{1*}_{jt} - q^{1o}_{jt}$, $q^{2*}_{jt} - q^{1o}_{jt}$ then we can make a little progress.
Suppose that:

$$(q^{2*}_{jt} - q^{1o}_{jt}) = \hat{k}_j\,(q^{1*}_{jt} - q^{1o}_{jt}), \qquad (q^{2*}_{jt+1} - q^{1o}_{jt+1}) = \hat{k}_j\,(q^{1*}_{jt+1} - q^{1o}_{jt+1}) \tag{73}$$

then by substituting (65) and (66) in the left hand side of (72) we can show that the required adjustment factors are

$$\varphi_1 = \frac{(1 - R_{jt+1}) + \hat{k}_j R_{jt+1}}{(1 - R_{jt}) + \hat{k}_j R_{jt}}, \qquad \varphi_2 = \varphi_1 \frac{R_{jt}}{R_{jt+1}} \tag{74}$$

Thus for example if $\hat{k}_j = 1$, so that readmission treatment has the same health effect as the successful first treatment,

$$\frac{\sum_j \left[ x^1_{jt+1} c^1_{jt} + \dfrac{R_{jt}}{R_{jt+1}}\,x^2_{jt+1} c^2_{jt} \right]}{\sum_j \left[ x^1_{jt} c^1_{jt} + x^2_{jt} c^2_{jt} \right]} = \frac{\sum_j \left[ x^1_{jt+1}\,(1 - R_{jt+1})(q^{1*}_{jt+1} - q^{1o}_{jt+1}) + x^2_{jt+1}\,(q^{2*}_{jt+1} - q^{1o}_{jt+1}) \right]}{\sum_j \left[ x^1_{jt}\,(1 - R_{jt})(q^{1*}_{jt} - q^{1o}_{jt}) + x^2_{jt}\,(q^{2*}_{jt} - q^{1o}_{jt}) \right]} \tag{75}$$

and the CWOI with readmission costs scaled by $R_{jt}/R_{jt+1}$ is identical to the value weighted health effect index. The intuition is that if the health gain from readmission is the same as the health gain from a successful first admission then the only effect of a change in the readmission rate is via the change in the total cost of readmissions. If $R_{jt+1} < R_{jt}$ then output, other things equal, must have increased: the cost saving on readmissions can be used to produce more health from a given budget.

If we were to take account of readmission rates in our comprehensive index, we would require readmission rates $R_{jt}$ for HRG $j$ in period $t$. These data are not routinely generated and at present are not readily usable in a productivity index in the way required for the adjustment in (75).

First, since HRGs are assigned at episode level and episodes are then linked into spells (CIPS), a spell may have more than one HRG within it. A readmission may or may not be related to any of the HRGs which occur in the index admission. There is as yet no agreed method for assigning an HRG to a CIPS where there are possibly multiple episodes and multiple diagnoses.
In the construction of CIPS in this report, we have assigned the HRG to the first episode of care in the admission, but as mentioned, a readmission may not be related to the first HRG, or any others in the CIP spell. A spell-based HRG assignment methodology has been developed for Payment by Results which looks for the 'dominant' procedure / diagnosis / HRG across episodes, but this is based on a provider spell rather than CIPS and again, the readmission may not be linked to the 'dominant' HRG.

Second, it is impossible, with available data, to distinguish between readmissions due to poor treatment (the element we wish to capture in the index) and readmissions due to patients being generally sick and prone to getting ill. One way of trying to make a judgement on this may be to examine whether a readmission is linked to a previous discharge, but this will be extremely complex and would need to be considered for each HRG. Indeed, the readmission could be related to any of the conditions in the spell, not necessarily the HRG coded in the CIP spell. It is possible for a readmission to be due to neglect or poor care and yet apparently entirely unrelated to the set of HRGs in the index admission.

We conclude that the data do not yet support the kind of adjustment to reflect the health effects of treatment considered in this section. A more fundamental objection may be that the method rests on the standard assumption that unit costs are proportional to the value of treatments. In the case of readmissions this assumption is more than usually questionable. In the next section we sketch a more ad hoc, less theoretically grounded method which might command more support, though it is also subject to the same data problems.
4.9.2 Readmissions (and clinical errors and MRSA) as a deadweight loss

An alternative view of readmissions, which can also be applied to clinical errors and MRSA, can in principle be constructed within the confines of the cost weighted index, if the extra costs associated with these can be identified. The general principle of the cost weighted index is that costs are proportional to benefits. It follows from this that if there are some components of cost which are unrelated to benefits, then these should be omitted from the index. In other words, if the costs of treating MRSA, or the costs directly associated with readmission as a consequence of plainly premature discharge or clinical errors, could be identified, these should be omitted from the quality-augmented cost weighted index.

The HRG costs include those elements of cost which are unrelated to benefit but which instead arise from poor provision of medical services. This means that in order to calculate the change in the value of output at constant prices we should deduct from both the numerator and denominator the costs which represent money wasted. If one wanted to represent not only the money wasted but also the disutility arising from such activity, then these deductions should be augmented.

In this exercise there is a risk of double-counting. If either MRSA or premature discharge affects mortality rates, then this is already taken into account in the quality adjusted index. Thus the index set out here can be defended only if one is sure that this is not the case.

If we take an output index, we then deduct from the value of output in constant prices the expenditures on bads arising from poor provision. We assume that bad $j$ has a cost $c^b_{jt}$ associated with it, and that there are $x^b_{jt}$ examples of bad $j$.
The index then becomes, in the example of the survival and life expectancy adjusted index

$$\frac{\sum_j x_{jt+1} c_{jt} \left(\dfrac{a_{jt+1} - k}{a_{jt} - k}\right)\left(\dfrac{1 - e^{-rL_{jt+1}}}{1 - e^{-rL_{jt}}}\right) - \sum_j x^b_{jt+1} c^b_{jt}}{\sum_j x_{jt} c_{jt} - \sum_j x^b_{jt} c^b_{jt}} \tag{76}$$

There is an obvious similarity with the cost of death index of section 4.8.4, but with the important difference that the cost of death has to be inferred as best one can from information about the value put on life, while the costs associated with poor service provision can in principle be identified through cost accounting.

We do not recommend this adjustment given the poor state of the data on readmissions, and on the costs associated with readmissions and MRSA. We do illustrate the effect of this type of adjustment in section 5 using data on readmissions and MRSA. We could find no usable data on clinical errors but in principle they could be incorporated in the same way.

4.10 Waiting times

Waits for diagnostic tests and treatment may affect individuals in two ways. First, they may dislike waiting per se, irrespective of the effect of treatment on the discounted sum of their quality adjusted life years ($q_{jt}$). Thus waiting time is regarded as a separate characteristic of health care, distinct from its effect on health. Second, longer waits can reduce the health gain from treatment, and the waiting adjustment is akin to a scaling factor multiplying the health effect. Although currently the data do not exist to provide satisfactory estimates of the first type of effect of waiting, section 4.10.1 sets out how such data can be used when they become available. We then turn in subsequent sections to consider how estimates of the second type of effect can be made with current data.
The total wait is the sum of the wait for a first outpatient appointment after referral from a GP, plus possible further waits for subsequent outpatient appointments for results of tests, plus the wait from the date the patient is placed on the waiting list for inpatient treatment. We ignore these distinct components of the total waiting time because currently data do not permit tracking of individual patients in order to calculate their total time waited. This limitation is likely to be rectified in future with data being collected in order to monitor achievement of the 18 week target for total waiting times. But given current limitations, in our empirical analysis we assess waiting time after a patient has been placed on the list for an inpatient admission. In section 4.10.5 we discuss how to include an outpatient waiting time adjustment.

4.10.1 Waiting time as a characteristic

The first way to quality adjust the output index to reflect changing waiting times is to use direct monetary valuation of reductions in waiting times. Let $\pi_{wjt}$ be the value of a reduction of one day in waiting time for treatment $j$ in year $t$. The value of the output of $j$ in year $t$ is the sum of the values of its characteristics, health gain ($q_{hjt}$) and waiting time ($q_{wjt} = w_{jt}$):

$$y_{jt} = x_{jt} p_{jt} = x_{jt}\left[\pi_{ht} q_{hjt} + \hat{\pi}_{wjt} q_{wjt}\right] = x_{jt}\left[\pi_{ht} q_{hjt} - \pi_{wjt} w_{jt}\right] \tag{77}$$

Notice that we assume that the value of a day’s wait depends on the treatment waited for, so that (77) differs from (10). The value weighted index (12) becomes

$$\frac{\sum_j x_{jt+1}\left[\pi_{ht} q_{hjt+1} - \pi_{wjt} w_{jt+1}\right]}{\sum_j x_{jt}\left[\pi_{ht} q_{hjt} - \pi_{wjt} w_{jt}\right]} \tag{78}$$

We report in section 6 calculations of (78) based on estimates of the health effect for a small subset of HRGs in a specimen index.
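A short sketch of the value weighted index (78) may help fix ideas. The function and all figures are hypothetical: two treatments, a £30,000 value per QALY, and an assumed value of £10 per day of waiting avoided. Health gains are held constant so that only the change in waiting time moves the index.

```python
def value_weighted_index(x0, x1, q0, q1, w0, w1, pi_h, pi_w):
    """Index (78): output valued at pi_h per QALY less pi_w per day waited.

    x0, x1: volumes in t and t+1; q0, q1: health gain per treatment (QALYs);
    w0, w1: waiting time in days; pi_w: per-treatment value of a day's wait.
    Period t values (pi_h, pi_w) are used in both numerator and denominator.
    """
    num = sum(x * (pi_h * q - pw * w) for x, q, w, pw in zip(x1, q1, w1, pi_w))
    den = sum(x * (pi_h * q - pw * w) for x, q, w, pw in zip(x0, q0, w0, pi_w))
    return num / den


# Hypothetical data: volumes and health gains unchanged, the wait for the
# first treatment falls from 120 to 90 days.
x = [1000, 200]
q = [0.5, 2.0]
pi_w = [10.0, 10.0]
index_shorter_wait = value_weighted_index(x, x, q, q, [120, 60], [90, 60], 30000.0, pi_w)
index_no_change = value_weighted_index(x, x, q, q, [120, 60], [120, 60], 30000.0, pi_w)
```

With these assumed values a 30 day reduction in the wait for one treatment raises the index by only about one per cent, because the value of the wait is small relative to the value of the health gain.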
If we make the assumption of efficient allocation we can derive the quality adjusted cost weighted index which takes account of both health and waiting time as characteristics as a special case of (21):

$$I^{xq}_{ct} = \sum_j \frac{x_{jt+1}\left[\pi_{ht} q_{hjt+1} - \pi_{wjt} w_{jt+1}\right]}{x_{jt}\left[\pi_{ht} q_{hjt} - \pi_{wjt} w_{jt}\right]}\,\frac{c_{jt} x_{jt}}{\sum_j c_{jt} x_{jt}}$$
$$= \sum_j \left\{ \frac{x_{jt+1} q_{hjt+1}}{x_{jt} q_{hjt}}\,\frac{\pi_{ht} q_{hjt}}{\left[\pi_{ht} q_{hjt} - \pi_{wjt} w_{jt}\right]} - \frac{x_{jt+1} w_{jt+1}}{x_{jt} w_{jt}}\,\frac{\pi_{wjt} w_{jt}}{\left[\pi_{ht} q_{hjt} - \pi_{wjt} w_{jt}\right]} \right\} \frac{c_{jt} x_{jt}}{\sum_j c_{jt} x_{jt}}$$
$$= \sum_j (1 + g_{xjt}) \left[ (1 + g_{qhjt})\,\frac{\pi_{ht} q_{hjt}}{p_{jt}} - (1 + g_{qwjt})\,\frac{\pi_{wjt} w_{jt}}{p_{jt}} \right] \omega^{ct}_{jt} \tag{79}$$

This index requires data on the amount of health gain $q_{hjt}$ for each treatment in order to calculate the value share $\pi_{ht} q_{hjt}/p_{jt}$ due to health. We do not have such data for all HRGs, and the survival adjustments considered in section 4.8 are inadequate since they are used to estimate growth rates, not levels, of health. Thus with available data we cannot calculate (79). The kind of data on health before and after treatment that we recommend in section 10 would enable such an index to be calculated.

We have calculated a specimen index for (79) for the small subset of treatments where we have snapshot estimates of the health effects (see section 6). We require the monetary value of health ($\pi_{ht}$) and of waiting times ($\pi_{wjt}$) to calculate this index. For $\pi_{ht}$ we use the monetary value of the QALY implied by the decisions of public bodies, such as the National Institute for Clinical Excellence and the Department for Transport (see Devlin and Parkin, 2004; Carthy et al., 1999). We estimate the value of waiting time from the studies reported in Ryan et al. (2004). Note that, even with $\pi_{wt}$ of the order of £10 per day, since waiting times are short compared with the horizon over which health gains are enjoyed, the effect of the waiting time adjustment may not be large.
It is possible to calculate (79) without data on the magnitude of the health effects if we are willing to make further assumptions about the relative importance of health and waiting time characteristics. Thus if we believe that the health effect of treatment is, say, 10 times as important as the waiting time then we can set $\pi_{ht} q_{hjt}/p_{jt} = 10/9$, which implies $\pi_{wt} w_{jt}/p_{jt} = 1/9$, since the health share less the waiting time share must equal 1 (the waiting time characteristic enters $p_{jt}$ with a negative value). We have not done so in this report because we can think of no sensible method of estimating the relative importance of health and waiting time characteristics in the absence of data on health effects and the relative marginal social values of health and waiting times.

4.10.2 Waiting time as a scaling factor

Delay may also lead to a reduced health effect $q_{jt}$ from treatment. This can arise because (a) the condition of the patient deteriorates whilst they wait; (b) the post treatment level of health status may be reduced; and (c) if treatment is delayed the time over which the benefits from treatment accrue may be reduced.

Figure 4.5 illustrates the effect of reduction in waiting times on the health effects of treatment. Initially we drop the subscripts for characteristics (since only health gain matters) and for type of treatment. $h^o(s)$ is the without treatment health status at time $s$ (time is measured from the date at which the patient begins to wait). The waiting time in year $t$ is $w_t$ and the time path of health with the treatment is $h^*(s; w_t)$. In the following year the waiting time falls to $w_{t+1}$ and the health time path is $h^*(s; w_{t+1})$. We assume that the change in the time path is due solely to the reduced wait for treatment, not to changes in technology. The dashed lines show the expected time streams given the treatment mortality probability and allow for a possible effect of reduced wait on the mortality probability.

In Figure 4.5 the fact that $h^o(s)$ is declining means that earlier treatment prevents a deterioration in health.
The fact that $h^*(s; w_{t+1}) > h^*(s; w_t)$ implies that earlier treatment leads to better post treatment health. The fact that the individual lives for longer post treatment means that they have longer to enjoy their improved health.

A recent survey of the literature (Hurst and Siciliani, 2003) found some evidence on deterioration and premature death associated with waiting for cardiology treatment but little for other procedures. Clinical reassessment of patients on a waiting list was thought to contribute to reduction in adverse outcomes of waiting but there is little data on the frequency or efficiency of re-classification of patients on waiting lists. If the NHS begins the routine collection of data on health related quality of life, the QALY improvement due to reduced waiting time should be captured by trend changes in QALYs.

Figure 4.5 Effect of reduced waiting times on time streams of health [diagram: health status $h$ against time since placed on waiting list, showing $h^o(s)$, $h^*(s; w_t)$, $h^*(s; w_{t+1})$, the expected paths $(1 - m_t)h^*(s; w_t)$ and $(1 - m_{t+1})h^*(s; w_{t+1})$, and the waiting times $w_{t+1} < w_t$]

There are two possible ways to measure the effect of treatment and hence the effect of changes in waiting times. The first is to value treatment at the time the patient is placed on the waiting list: health effects are therefore discounted to this date. The second is to value treatment at the date it is actually received: health effects are discounted to the date of treatment. If we do not wish to quality adjust with respect to waiting times or if we are dealing with non-electives we do not need to take account of the difference between the two approaches.

4.10.2.1 Discounting to start of wait

If the patient is placed on the waiting list at time 0 the expected health gain discounted to that date is

$$q_t = \int_0^{w_t} h^o_t(s)\,e^{-rs}\,ds + (1 - m_t)\int_{w_t}^{L^*_t(w_t) + w_t} h^*_t(s; w_t)\,e^{-rs}\,ds - \int_0^{L^o_t + w_t} h^o_t(s)\,e^{-rs}\,ds \tag{80}$$

where we subscript time paths to allow for the possibility of technological change across periods.
$L_t^o$, $L_t^*$ are life expectancies without and with treatment measured from the date of treatment. Thus $L_t^o + w_t$, $L_t^* + w_t$ are life expectancies measured from the date placed on the list. $r$ is the discount rate applied to health. Collecting terms

$$q_t = (1-m_t)\int_{w_t}^{L_t^*(w_t)+w_t} h_t^*(s; w_t)\,e^{-rs}\,ds - \int_{w_t}^{L_t^o+w_t} h_t^o(s)\,e^{-rs}\,ds = (1-m_t)\,q_t^*(w_t) - q_t^o(w_t) \tag{81}$$

which is similar in form to the expression (26) for the health effect in the discussion of quality adjustment by mortality.

In general we will be unsuccessful in seeking a simple method of quality adjusting for changes in survival and in waiting times which is equivalent to the quality adjustment for survival derived in section 3.1, even with the assumption about unchanging technologies employed to construct the survival quality adjustment. The reason is that the survival probability enters the expression for health gain linearly but this is not true for the waiting time, as inspection of (80) reveals. Only if further assumptions are made about the shape of the time paths of health status with and without treatment is it possible to derive a simple quality adjustment reflecting both survival and waiting times.

The non-linearity of the waiting time adjustment also raises the question of whether the adjustment should be made on the basis of average waiting times applied to all cases of a particular type, or whether we should make the adjustment for each case separately and sum over the cases within each treatment type. We discuss this point further in section 4.10.4.

To get a reasonably simple expression for the waiting time adjustment, suppose the time paths $h_t^o$, $h_t^*$ are constant with respect to elapsed time after the patient is placed on the waiting list, are unaffected by waiting time, do not change from one year to the next, and that total life expectancy is unaffected by treatment and waiting time: $L_t^o + w_t = L_t^*(w_t) + w_t$.
(Remember that $L_t^o$, $L_t^*$ are life expectancies without and with treatment measured from the date of treatment.) To simplify notation in this case, where life expectancy from date of treatment is unaffected by treatment, we use $L_t^o = L_t^* = L_t$. Figure 4.6 illustrates.

[Figure 4.6 Health gain and waiting time: simple case. Vertical axis: health $h$. Horizontal axis: time since placed on list, with $w_{t+1}$ and $w_t$ marked. Curves: $h^*(s; w_t)$, $h^*(s; w_{t+1})$, $(1-m)h^*(s; w_{t+1})$, $(1-m)h^*(s; w_t)$ and $h^o(s)$.]

The health gain from treatment in Figure 4.6 is

$$q_{jt}^1(w_{jt}) = \left(a_{jt}h_{jt}^* - h_{jt}^o\right)\int_{w_{jt}}^{w_{jt}+L_{jt}} e^{-rs}\,ds = \left(a_{jt}h_{jt}^* - h_{jt}^o\right)\frac{e^{-rw_{jt}} - e^{-r(L_{jt}+w_{jt})}}{r} = \left(a_{jt}h_{jt}^* - h_{jt}^o\right)\frac{e^{-rw_{jt}}\left(1 - e^{-rL_{jt}}\right)}{r} \tag{82}$$

Waiting has a cost, in units of health, which increases with the length of the wait but at a decreasing rate: an extra day after a long wait costs less than an extra day after a short wait. The cost of a positive wait measured in terms of QALYs can be defined as

$$\kappa_t^1(w_t) = \left[q_t^1(0) - q_t^1(w_t)\right] = \left(\frac{a_t h_t^* - h_t^o}{r}\right)\left(1 - e^{-rw_t}\right)\left(1 - e^{-rL_t}\right) \tag{83}$$

The discounted health effect (82) is convex in the waiting time: increases in the wait reduce the quality of care but do so at a decreasing rate. Thus the gain from a reduction in the waiting time from 40 weeks to 39 weeks is less than the gain from a reduction from 10 to 9 weeks. It has been suggested (Atkinson, 2005; 119) that the value of waiting time is more plausibly concave in the waiting time: the value of reductions is greater the longer the wait.

The convexity of the health effect in the waiting time implies that an increase in the dispersion of waiting times, holding the mean wait constant, increases the average value of treatment.
This seems counter-intuitive.⁶ The quality adjustment factor is

$$\frac{q_{t+1}^1}{q_t^1} = \frac{\left[(1-m_{t+1}) - h^o/h^*\right]}{\left[(1-m_t) - h^o/h^*\right]} \cdot \frac{\left[e^{-rw_{t+1}}\left(1 - e^{-rL_{t+1}}\right)\right]}{\left[e^{-rw_t}\left(1 - e^{-rL_t}\right)\right]} \tag{84}$$

and the cost weighted index with a waiting time adjustment is

$$I_{ct}^{xaw} = \sum_j \left(1 + g_{xjt}\right)\left(\frac{a_{jt+1} - h_j^o/h_j^*}{a_{jt} - h_j^o/h_j^*}\right)\left(\frac{e^{-rw_{jt+1}}\left(1 - e^{-rL_{jt+1}}\right)}{e^{-rw_{jt}}\left(1 - e^{-rL_{jt}}\right)}\right)\omega_{ct}^{jt} \tag{85}$$

The use of the adjustment factor (84) requires estimates of life expectancy for patients. If we do not wish to use such estimates because we feel they are unreliable we can instead use

$$\frac{q_{t+1}}{q_t} = \frac{\left[(1-m_{t+1}) - h^o/h^*\right]}{\left[(1-m_t) - h^o/h^*\right]}\,e^{-r(w_{t+1} - w_t)} \tag{86}$$

$e^{-rw_{t+1}}/e^{-rw_t}$ will have the same sign as $\left[e^{-rw_{t+1}}(1 - e^{-rL_{t+1}})\right]/\left[e^{-rw_t}(1 - e^{-rL_t})\right]$. If life expectancy increases from $t$ to $t+1$ then the adjustment using only $e^{-rw_{t+1}}/e^{-rw_t}$ will understate growth, and it will overstate growth if life expectancy falls. Using only $e^{-rw_{t+1}}/e^{-rw_t}$ and ignoring life expectancy is equivalent to assuming that life expectancy does not change.

Given the general lack of information on health with and without treatment we can use an estimate of the mean value of $h_j^o/h_j^*$ ($k$) from a small sample of treatments (see section 4.8.2). If we feel that this estimate is too unreliable we can set $h_j^o/h_j^* = 0$ in the index (85) and apply the adjustment for waiting with a pure survival adjustment.

⁶ Although the conclusion that a mean preserving increase in dispersion is welfare increasing is counter-intuitive, similar conclusions hold in other contexts. For example, a mean preserving spread in the distribution of prices of a good faced by a consumer who is risk neutral towards income will, if the consumer observes the price before buying the good, increase expected utility. It is essential to specify the decision context carefully and in particular whether uncertainty is resolved before or after decisions are taken. In the case of consumer price uncertainty, it can be shown that if the consumer decides on consumption before the price is revealed and if she is averse to income risk then she is made worse off by a mean preserving increase in the dispersion of prices.

Notice that we have applied the same discount rate to all types of treatment. The formulation of the health gain as having the same value whatever the type of treatment, which underpins $I_{ct}^{xaw}$, implies that the same discount rate should be applied to waits for heart bypass operations [HRG E04] as for varicose vein stripping [HRG Q11]. The difference in the health gains from the two treatments is already reflected in the index because it is in the cost weights, which we have assumed are identical to the value weights. If we think that the assumption of proportionality of unit costs and marginal social values (30) is incorrect we could argue for using a higher discount rate for heart bypass waits than for varicose vein waits. However, this would be an ad hoc adjustment to a problem ($c_{jt} \neq \lambda_t p_{jt}$) which is better tackled more directly by adjusting the weights rather than the discount rates.

Because we treat waiting times as delaying health improvement we have also applied the same discount rate to waiting times and life expectancy. It is possible that people may feel differently about future health whilst waiting for treatment and when they have been treated, and apply a discount rate of $r_w$ whilst waiting for treatment but a rate of $r_L < r_w$ after treatment.
The health effect would be

$$q_{jt}^{1\prime} = \int_0^{w_{jt}} h_{jt}^o\,e^{-r_w s}\,ds + \int_{w_{jt}}^{w_{jt}+L_{jt}} a_{jt}h_{jt}^*\,e^{-r_L s}\,ds - \int_0^{w_{jt}+L_{jt}} h_{jt}^o\,e^{-r_L s}\,ds$$

$$= h_{jt}^o\left[\frac{1 - e^{-r_w w_{jt}}}{r_w} - \frac{1 - e^{-r_L w_{jt}}}{r_L}\right] + \left(a_{jt}h_{jt}^* - h_{jt}^o\right)\frac{e^{-r_L w_{jt}}\left(1 - e^{-r_L L_{jt}}\right)}{r_L} \tag{87}$$

4.10.2.2 Discounting to date of treatment with charge for waiting

We measure activity when it takes place, which suggests that we should measure the benefit from treatment at the time it takes place. This is what we did in section 4.8.3 when discussing life expectancy and health effects. But with that assumption the discounted sum of QALYs is, continuing with the simple assumptions about the time streams of health used in the previous section and in section 4.8.3, just

$$q_{jt} = a_{jt}\int_0^{L_{jt}} h_j^*\,e^{-rs}\,ds - \int_0^{L_{jt}} h_j^o\,e^{-rs}\,ds = \left(a_{jt}h_j^* - h_j^o\right)\left(\frac{1 - e^{-rL_{jt}}}{r}\right) \tag{53}$$

To capture the welfare lost as a result of having to wait for treatment we use a charge for waiting which is offset against (53), which captures only the benefit over the life span from the date of treatment. As with any cumulating debt, interest is charged on the cost of waiting:

$$\kappa_{jt}^2 = \left(a_{jt}h_{jt}^* - h_{jt}^o\right)\int_0^{w_{jt}} e^{rs}\,ds = \frac{\left(a_{jt}h_{jt}^* - h_{jt}^o\right)}{r}\left(e^{rw_{jt}} - 1\right) \tag{88}$$

The effect of treatment after adjustment for waiting time is now

$$q_{jt}^2 = \left(a_{jt}h_{jt}^* - h_{jt}^o\right)\left[\int_0^{L_{jt}} e^{-rs}\,ds - \int_{-w_{jt}}^{0} e^{-rs}\,ds\right] = \frac{\left(a_{jt}h_{jt}^* - h_{jt}^o\right)}{r}\left[2 - e^{-rL_{jt}} - e^{rw_{jt}}\right] \tag{89}$$

$q_{jt}^2$ is decreasing and concave in the waiting time, so that an increase in an already long wait has a greater effect than the same increase in a short wait. Moreover an increase in the dispersion of waits reduces the expected value of treatment, and there is a non-trivial cost of waiting even for the very young.
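The contrast between the two discounting approaches can be checked numerically. The following is a minimal sketch, assuming the simple case of constant health paths: `q_list` follows the form of (82), `q_treated` follows the form of (89), and `gain` stands in for $a_{jt}h_{jt}^* - h_{jt}^o$. The function names, the 10-year life expectancy and the two sets of waits are illustrative assumptions, not values from the report.

```python
import math

def q_list(w, L, r=0.015, gain=1.0):
    """Health effect discounted to the date placed on the list, form of (82)."""
    return gain * math.exp(-r * w) * (1.0 - math.exp(-r * L)) / r

def q_treated(w, L, r=0.015, gain=1.0):
    """Health effect discounted to date of treatment less the waiting charge, form of (89)."""
    return gain * (2.0 - math.exp(-r * L) - math.exp(r * w)) / r

def average(values):
    return sum(values) / len(values)

# Hypothetical example: two patients, waits in years, same mean wait in both cases.
L = 10.0
equal_waits = [0.5, 0.5]
spread_waits = [0.1, 0.9]

list_equal = average([q_list(w, L) for w in equal_waits])
list_spread = average([q_list(w, L) for w in spread_waits])
treated_equal = average([q_treated(w, L) for w in equal_waits])
treated_spread = average([q_treated(w, L) for w in spread_waits])

# (82) is convex in w, so spreading the waits raises the average health effect;
# (89) is concave in w, so spreading the waits lowers it.
print(list_spread > list_equal, treated_spread < treated_equal)
```

With these illustrative numbers the differences are small, because the waits are short relative to life expectancy, but their signs differ between the two approaches.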
We can also allow for the possibility that the interest charge on waiting time differs from the interest rate on future QALYs:

$$q_{jt}^{2\prime} = \left(a_{jt}h_{jt}^* - h_{jt}^o\right)\left[\int_0^{L_{jt}} e^{-r_L s}\,ds - \int_{-w_{jt}}^{0} e^{-r_w s}\,ds\right] = \left(a_{jt}h_{jt}^* - h_{jt}^o\right)\left[\frac{1 - e^{-r_L L_{jt}}}{r_L} - \frac{e^{r_w w_{jt}} - 1}{r_w}\right] \tag{90}$$

A difficulty with (89) is that it is theoretically possible for $q_{jt}^2$ to be very small or even negative (if $e^{rw_{jt}} > 2 - e^{-rL_{jt}}$) even if $a_{jt}h_{jt}^* - h_{jt}^o$ is positive. The same possibility arises with (90). However, this will only be a problem when the waiting time is similar to life expectancy.

Table 4.5 has illustrative calculations for an example in which the health effects and life expectancy do not change from period $t$ to period $t+1$. Only in the three italicised cells, where life expectancy and the waiting times are similar, do we observe nonsensical results where a reduction in waiting time leads to a very large or a negative adjustment. Otherwise the results are in line with intuition: the effect of a reduction in waiting time is greater at higher waits and for shorter life expectancy. For all except the very shortest life expectancies a given percentage reduction in the waiting time implies a smaller percentage increase in quality adjusted output. The reason is that the waiting time is in general small relative to the length of time over which any health improvement will be enjoyed. The reductions are however in general not trivial. For example, a reduction in waiting time from 120 to 90 days increases the growth rate by between 1.8% for a life expectancy of 5 years and 0.35% for a life expectancy of 30 years.
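The illustrative calculations can be reproduced with a few lines of code. The sketch below assumes that health effects and life expectancy are unchanged between periods, so that the adjustment is the ratio of the bracketed terms in (90), and that waits in days are converted to years by dividing by 365. The function name and the conversion are assumptions made for illustration.

```python
import math

def waiting_adjustment(w_old_days, w_new_days, life_years, r_w=0.015, r_L=0.015):
    """Ratio q_{jt+1}/q_{jt} when only the wait changes, using the bracket in (90).

    With r_w == r_L this is the ratio of [2 - exp(-r L) - exp(r w)] terms from (89).
    """
    def bracket(w_days):
        w = w_days / 365.0  # assumed conversion from days to years
        benefit = (1.0 - math.exp(-r_L * life_years)) / r_L
        charge = (math.exp(r_w * w) - 1.0) / r_w
        return benefit - charge

    return bracket(w_new_days) / bracket(w_old_days)

# Wait falls from 120 to 90 days, life expectancy 5 years, r_w = r_L = 1.5%.
same_rate = waiting_adjustment(120, 90, 5)

# Wait falls from 10 to 0 days, life expectancy 0.5 years, r_w = 10%, r_L = 1.5%.
higher_wait_rate = waiting_adjustment(10, 0, 0.5, r_w=0.10, r_L=0.015)

print(f"{same_rate:.4f} {higher_wait_rate:.4f}")
```

A wait that is long relative to life expectancy makes the denominator small or negative, which is the source of the nonsensical values discussed in the text.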
Table 4.5 (a) Waiting time adjustment $q_{jt+1}/q_{jt}$: discounting to date treated with a charge for waiting ($r_w = r_L = 1.5\%$, no change in health effects and life expectancy between period $t$ and $t+1$)

| Reduction in wait (days) | Life expectancy (years): 0.5 | 1 | 2 | 5 | 10 | 20 | 30 |
|---|---|---|---|---|---|---|---|
| 10 to 0 | 1.0582 | 1.0284 | 1.0141 | 1.0057 | 1.0030 | 1.0016 | 1.0011 |
| 30 to 10 | 1.1319 | 1.0602 | 1.0290 | 1.0116 | 1.0060 | 1.0032 | 1.0023 |
| 60 to 30 | 1.2469 | 1.0995 | 1.0456 | 1.0177 | 1.0090 | 1.0048 | 1.0034 |
| 90 to 60 | 1.3283 | 1.1106 | 1.0478 | 1.0180 | 1.0091 | 1.0048 | 1.0034 |
| 120 to 90 | 1.4897 | 1.1245 | 1.0503 | 1.0184 | 1.0092 | 1.0049 | 1.0035 |
| 150 to 120 | 1.9621 | 1.1424 | 1.0530 | 1.0188 | 1.0093 | 1.0049 | 1.0035 |
| 180 to 150 | 27.2659* | 1.1663 | 1.0561 | 1.0191 | 1.0094 | 1.0049 | 1.0035 |
| 365 to 150 | -0.1686* | -38.6866* | 1.6183 | 1.1563 | 1.0719 | 1.0366 | 1.0257 |

Table 4.5 (b) Waiting time adjustment $q_{jt+1}/q_{jt}$: discounting to date treated with a charge for waiting ($r_w = 10\%$, $r_L = 1.5\%$, no change in health effects and life expectancy between period $t$ and $t+1$)

| Reduction in wait (days) | Life expectancy (years): 0.5 | 1 | 2 | 5 | 10 | 20 | 30 |
|---|---|---|---|---|---|---|---|
| 10 to 0 | 1.0583 | 1.0284 | 1.0141 | 1.0057 | 1.0030 | 1.0016 | 1.0011 |
| 30 to 10 | 1.1326 | 1.0605 | 1.0292 | 1.0116 | 1.0060 | 1.0032 | 1.0023 |
| 60 to 30 | 1.2503 | 1.1006 | 1.0461 | 1.0179 | 1.0091 | 1.0049 | 1.0035 |
| 90 to 60 | 1.3376 | 1.1129 | 1.0488 | 1.0184 | 1.0093 | 1.0049 | 1.0035 |
| 120 to 90 | 1.5161 | 1.1285 | 1.0517 | 1.0189 | 1.0094 | 1.0050 | 1.0036 |
| 150 to 120 | 2.0850 | 1.1488 | 1.0550 | 1.0194 | 1.0096 | 1.0051 | 1.0036 |
| 180 to 150 | -10.6470* | 1.1766 | 1.0587 | 1.0199 | 1.0098 | 1.0051 | 1.0036 |
| 365 to 150 | -0.1420* | -9.6840* | 1.6882 | 1.1679 | 1.0768 | 1.0390 | 1.0274 |

* Cells where the waiting time is similar to life expectancy (italicised in the original).

4.10.3 Optimal waiting times

All the waiting time adjustments considered above assume that a reduction in waiting time is always of value: shorter waiting times are better than longer. It has been suggested to us that the assumption may not be valid for very short waiting times because some patients find treatment at very short notice to be inconvenient.
It would be possible to conduct patient surveys or stated preference experiments to determine optimal waits and the costs associated with departures from them. But in the absence of such data we have to fall back on some crude assumptions which nevertheless enable us to determine how much impact such considerations might have. We have therefore investigated the implications of replacing the measure of waiting time $w_{jt}$ in the waiting time adjustments considered in previous sections with $\hat{w}_{jt} = \max(w_{jt} - w^*, 0)$. This has the effect of increasing the proportionate effect of a given reduction in $w_{jt}$ if $w_{jt}$ exceeds $w^*$ and reducing it to zero if $w_{jt} < w^*$. Another possibility would have been to define $\hat{w}_{jt} = (w_{jt} - w^*)^2$ but we felt that the implication that a reduction in $w_{jt}$ below $w^*$ reduced quality adjusted output was likely to prove difficult to justify.

4.10.4 Distribution of waiting times

We need to consider what waiting time measure should be used in the waiting time quality adjustments. Patients within an HRG do not have the same waiting times. This raises the question of how we take account of the variability of waiting times.

Variations in waiting times across patients for a given treatment are due to a combination of prioritisation of patients, so that different types have different expected waits, and random factors, so that patients of given types have uncertain waiting times. We will not be able to observe all the factors which are used in prioritisation, so that the distributions of realised waiting times conditional on variables, such as age and gender, which are observable for individual patients in routine data, will overstate the amount of uncertainty for individual patients.

Suppose that treatment has the effect on health shown in the simple case in Figure 4.6.
Then, assuming that all patients have the same life expectancy, the average health gain for a patient is, discounting to the date of treatment as in section 4.10.2.2,

$$Eq_t^2(w) = \left[a_t h_t^* - h_t^o\right]\left[2 - e^{-rL_t} - Ee^{rw_t}\right]r^{-1} \tag{91}$$

where $Ee^{rw}$ is the average of $e^{rw}$ over realised $w$. Since $Ee^{rw} > e^{rEw}$, using the mean wait in place of the average in the waiting time adjustment factor $\left[2 - e^{-rL_t} - Ee^{rw_t}\right]r^{-1}$ to quality adjust the output for a particular HRG will lead to an overestimate of the mean quality adjusted output in a particular year. But what matters for the calculation of the output index is the ratio of the quality adjustment factors for consecutive years, and in general

$$\frac{2 - e^{-rL_{t+1}} - e^{rEw_{t+1}}}{2 - e^{-rL_t} - e^{rEw_t}} \tag{92}$$

may be less than or greater than

$$\frac{2 - e^{-rL_{t+1}} - Ee^{rw_{t+1}}}{2 - e^{-rL_t} - Ee^{rw_t}} \tag{93}$$

There are two possible ways of allowing for the distribution of waiting times in the waiting time adjustment. We can write $Ee^{rw}$ as

$$Ee^{rw} = \int_0^\infty e^{rw} f(w)\,dw = 1 + rEw + \frac{r^2 E(w^2)}{2!} + \frac{r^3 E(w^3)}{3!} + \dots \tag{94}$$

or as the moment generating function for the distribution of waiting times. If the distribution is exponential (which it would be if a random selection of patients was drawn from the waiting list each day) then $E_{w_i} e^{rw_i} = \mu_i/(\mu_i - r)$ for $r < \mu_i$, where $1/\mu_i$ is the mean waiting time and the variance of waits is $1/\mu_i^2$. Similarly if the distribution is uniform on $[0, b_i]$ then $E_{w_i} e^{rw_i} = (e^{b_i r} - 1)/(b_i r)$. Thus if the distribution of waiting times has a tractable distribution and we know its parameters we can allow for the uncertainty in waiting times without requiring individual level data. After inspection of the empirical distributions we decided not to pursue this approach, as there is considerable disparity in the empirical distributions and few seem to be approximated by tractable theoretical distributions.
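The role of the moment generating function can be illustrated with a short sketch. It uses the standard results that an exponential distribution with rate $\mu$ has $Ee^{rw} = \mu/(\mu - r)$ for $r < \mu$ and that a uniform distribution on $[0, b]$ has $Ee^{rw} = (e^{br} - 1)/(br)$, and compares each with $e^{rEw}$. The function names and the example parameters are illustrative assumptions.

```python
import math

def mgf_exponential(r, mean_wait):
    """E[exp(r w)] for exponentially distributed waits with the given mean."""
    mu = 1.0 / mean_wait
    if r >= mu:
        raise ValueError("the expectation is finite only for r < 1/mean_wait")
    return mu / (mu - r)

def mgf_uniform(r, b):
    """E[exp(r w)] for waits uniformly distributed on [0, b]."""
    return (math.exp(b * r) - 1.0) / (b * r)

def mgf_empirical(r, waits):
    """Average of exp(r w) over individual realised waits."""
    return sum(math.exp(r * w) for w in waits) / len(waits)

r = 0.015
mean_wait = 0.5  # hypothetical mean wait in years

exp_case = mgf_exponential(r, mean_wait)
uni_case = mgf_uniform(r, 2 * mean_wait)  # uniform on [0, 1] also has mean 0.5
at_mean = math.exp(r * mean_wait)         # exp(r E[w])

# Jensen's inequality: exp is convex, so E[exp(r w)] exceeds exp(r E[w])
# whenever waits vary, and using the mean wait understates the charge.
print(exp_case > at_mean, uni_case > at_mean)
```

`mgf_empirical` shows the individual level alternative: it needs every realised wait, whereas the two parametric forms need only the parameters of the distribution.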
If individuals have uncertain waiting times and have a utility function which is decreasing and concave in waiting time then taking the mean of the actual waiting times will understate the costs of waiting. One way to allow for a cost of the risk of having very long waits is to use not the mean or median of the distribution of waits but the “certainty equivalent” wait $w_c$: the mean wait plus a waiting time “risk premium”. The certainty equivalent waiting time is defined by

$$EU(w) = u(w_c) = u(E(w) + RP) \tag{95}$$

where $RP$ is the risk premium and $u(\cdot)$ is a utility function.⁷ We take the specification of the benefits from treatment when there is a positive waiting time in section 4.10.2.2 as the utility function. The standard approximation to the risk premium (Pratt 1964) is

$$RP = -A\sigma_w^2/2 \tag{96}$$

where $A = -u''/u'$ is the coefficient of absolute risk aversion and $\sigma_w^2$ the variance of waiting times. With the specification in section 4.10.2.2, the coefficient of absolute risk aversion is $-r$ so that

$$RP = r\sigma_w^2/2 \tag{97}$$

Table 4.6 shows, for different rates of discount, the proportion of elective cases admitted in 2002/3 in HRGs where the distribution of waiting times implies a certainty equivalent wait greater than various percentiles of the distribution.

⁷ Since $u$ is decreasing in the waiting time we have defined the risk premium as an addition to the mean, by contrast to the usual case where $u$ is increasing in a risky variable such as income. Hence the minus sign in the definition used here.
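A minimal sketch of the certainty equivalent calculation follows, assuming the utility function is the bracketed term of (89), so that $u(w)$ depends on the wait only through $-e^{rw}$. The approximation uses (97); the exact value solves $e^{rw_c} = Ee^{rw}$. The sample of waits and the function names are hypothetical.

```python
import math
from statistics import mean, pvariance

def certainty_equivalent_approx(waits, r):
    """Mean wait plus the approximate risk premium r * variance / 2, as in (97)."""
    return mean(waits) + r * pvariance(waits) / 2.0

def certainty_equivalent_exact(waits, r):
    """Wait w_c satisfying exp(r * w_c) = E[exp(r * w)] for the utility in (89)."""
    expected = mean([math.exp(r * w) for w in waits])
    return math.log(expected) / r

# Hypothetical realised waits (in years) for patients in one HRG.
waits = [0.1, 0.3, 0.5, 1.5]
r = 0.015

approx_wc = certainty_equivalent_approx(waits, r)
exact_wc = certainty_equivalent_exact(waits, r)

# Both exceed the mean wait; at low discount rates the premium is small.
print(round(mean(waits), 4), round(approx_wc, 4), round(exact_wc, 4))
```

At low discount rates the risk premium is a small fraction of the mean wait, which is consistent with the table's suggestion that percentiles at the extreme top of the distribution would overstate the cost of risk.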
Table 4.6 Number of elective HRGs and proportion of total elective admissions in HRGs where the certainty equivalent waiting time $w_c$ exceeds particular percentiles of the distribution of waiting times in 2002/3

| Discount rate | Certainty equivalent wait greater than | 70th | 75th | 80th | 85th | 90th |
|---|---|---|---|---|---|---|
| r = 0.01 | Number HRGs | 214 | 152 | 152 | 95 | 32 |
| | Prop cases | 0.109 | 0.057 | 0.057 | 0.032 | 0.005 |
| r = 0.015 | Number HRGs | 222 | 156 | 156 | 97 | 32 |
| | Prop cases | 0.138 | 0.069 | 0.069 | 0.038 | 0.005 |
| r = 0.03 | Number HRGs | 241 | 170 | 170 | 107 | 34 |
| | Prop cases | 0.235 | 0.153 | 0.153 | 0.116 | 0.005 |
| r = 0.05 | Number HRGs | 258 | 185 | 185 | 112 | 38 |
| | Prop cases | 0.292 | 0.205 | 0.205 | 0.123 | 0.006 |
| r = 0.1 | Number HRGs | 292 | 211 | 211 | 127 | 42 |
| | Prop cases | 0.314 | 0.224 | 0.224 | 0.130 | 0.008 |
| r = 0.15 | Number HRGs | 320 | 233 | 233 | 144 | 52 |
| | Prop cases | 0.375 | 0.241 | 0.241 | 0.141 | 0.013 |
| r = 0.2 | Number HRGs | 340 | 250 | 250 | 154 | 56 |
| | Prop cases | 0.392 | 0.258 | 0.258 | 0.142 | 0.014 |

The table suggests that using the extreme top end of the distributions as certainty equivalent waiting times is likely to overstate the cost of risk arising from the distribution of waiting times. In our calculation of the indices with waiting time adjustments in sections 5 and 6 we compare the effects of using the 80th percentile as the certainty equivalent wait with those of using the mean wait, at rates of interest between 1.5% and 10%.

The alternative approach is to use individual level data from HES on the waits experienced by each elective patient. Given the number of electives this is fairly cumbersome, since each calculation of an index takes around four hours with individual data. We do however report in section 5 the results of some calculations of indices to show the sensitivity of indices to the use of individual rather than certainty equivalent waits.

4.10.5 Outpatient waits

Table 4.7 highlights the mean waiting times in weeks for a select number of key specialties. In most specialties waiting times have fallen.
Table 4.7 Outpatient waiting times in weeks, selected specialties, 1999/00-2003/04

| Code | Specialty | 1999/00 | 2000/01 | 2001/02 | 2002/03 | 2003/04 |
|---|---|---|---|---|---|---|
| 100 | General Surgery | 7.448 | 7.398 | 7.495 | 6.787 | 6.377 |
| 120 | ENT | 11.798 | 11.261 | 10.764 | 9.871 | 9.258 |
| 140 | Oral Surgery | 9.789 | 9.451 | 9.707 | 9.029 | 8.653 |
| 150 | Neurosurgery | 10.546 | 10.781 | 10.812 | 10.019 | 9.251 |
| 160 | Plastic Surgery | 12.370 | 12.352 | 13.023 | 10.300 | 9.233 |
| 170 | Cardiothoracic Surgery | 3.865 | 4.162 | 4.150 | 4.194 | 4.503 |
| 300 | General Medicine | 8.799 | 8.877 | 9.066 | 8.099 | 7.392 |
| 310 | Audiological Medicine | 13.815 | 13.175 | 13.813 | 10.986 | 9.351 |
| 340 | Respiratory Medicine | 6.580 | 6.986 | 7.415 | 6.953 | 6.369 |
| 350 | Infectious Diseases | 5.978 | 6.491 | 6.583 | 5.232 | 4.949 |
| 360 | Genito-Urinary Medicine | 2.342 | 2.316 | 2.483 | 2.214 | 2.048 |
| 370 | Medical Oncology | 3.535 | 3.597 | 3.208 | 3.559 | 3.851 |
| 400 | Neurology | 13.072 | 12.814 | 13.378 | 11.324 | 10.038 |
| 420 | Paediatrics | 6.680 | 6.758 | 6.756 | 6.639 | 6.582 |
| 430 | Geriatric Medicine | 4.975 | 5.170 | 5.344 | 5.340 | 5.232 |

Ideally, because of the non-linearity of the effect of waiting time on the value of activity, we require information on the total time that each individual waits for treatment. However, individual HES records of inpatient waiting times are not linked to individual outpatient data and indeed outpatient data is available only aggregated to specialty level, which cannot be linked satisfactorily to inpatient data aggregated to HRG level.

If we wanted to discount health effects to date of treatment (as in section 4.10.2) we require an estimate of $e^{rw_{lt}}$ where $w_{lt}$ is individual $l$'s total wait for treatment, which is the sum of her outpatient and inpatient waits: $w_{lt}^O + w_{lt}^I$. We do not have such linked individual level outpatient and inpatient data. If we could group outpatient data to the same HRGs or specialties as inpatient data we could use the average wait for outpatient treatment for each group to calculate $e^{r(w_t^O + w_t^I)}$ for a specialty or HRG. But this is not feasible.
In the future it will be possible to use HES to get total waiting times for individuals across outpatient and inpatient care. But for the moment we require a means of incorporating a quality adjustment for outpatient waits.

There is data on first outpatient visits, follow up visits, HRG outpatients, and maternity outpatients but only data on the wait for the first visit. The data on the HRG outpatients and maternity outpatients do not have specialty codes that we can match to the specialty codes for the first and follow up visits. We propose to use the waiting time adjustments from section 4.10.2.2 but to apply the waiting time for first visits to both first and follow up visits. First, it can be argued that a reduced first outpatient visit wait reduces the delay until the value of the whole course of visits is realised. Second, less plausibly, it could be argued that, regarding first and follow up visits as having different values, reductions in waits for first visits also indicate that waits for follow up visits have fallen and that this increases the value of follow up visits as well.

4.10.6 Waiting time adjustment: conclusion

In this sub section we have

• shown how data on waiting times can be used to quality adjust the cost weighted output index in addition to the survival based adjustments considered in sections 4.7 and 4.8, by
  - regarding waiting time as a characteristic separate from the health effect of treatment, yielding an adjustment which is additive to the health effects
  - regarding waiting time as delaying the health effect of treatment, yielding adjustments which act as scaling factors on the health effects
• concluded that calculation of an index based on the value of the health and waiting time characteristics, as in section 2.4 and section 4.10.1, is not currently possible because of the lack of health effects data. We have constructed waiting time quality adjustments which can be applied to all elective HRGs with existing data and which rest on different assumptions about whether one should discount to the date the patient is placed on the waiting list (section 4.10.2.1) or to the date of treatment (section 4.10.2.2).
• suggested that adjustments based on discounting to the date of treatment are preferable since they imply that increased dispersion of waiting times would reduce quality adjusted output, whereas discounting to the date placed on the list implies that patients would prefer increased dispersion.
• suggested that the dispersion of waiting times should be reflected in the waiting times adjustment by calculating the adjustment using, not the mean or median wait, but the 80th percentile waiting time, which can be interpreted as an estimate of the “certainty equivalent” wait.

As with the health effects adjustments, a final view on the preferred adjustment depends on the plausibility and lack of volatility of the adjustments when confronted with the data in sections 5 and 6.

Table 4.8 Summary of waiting time quality adjustments

Quality adjustment: Survival. Uniform health effect. Life expectancy. Discounting to date on list.
Weight: Cost
Form:
$$\frac{\sum_j c_{jt}\,x_{jt+1}\left(\dfrac{a_{jt+1} - k_j}{a_{jt} - k_j}\right)\left(\dfrac{e^{-rw_{jt+1}}\left(1 - e^{-rL_{jt+1}}\right)}{e^{-rw_{jt}}\left(1 - e^{-rL_{jt}}\right)}\right)}{\sum_j c_{jt}\,x_{jt}}$$
Rationale: 4.10.2.1
Results: Sec 5.5, Sec 6.3
Assumptions: Efficient allocation. Constant health status. No effect of treatment on life expectancy. $k_j$ equal for all $j$ in Sec 5.5. $k_j$ varying across $j$ in secs 6, 7.
Comments: Adjustments have more effect with more costly treatments. Also calculated with different discount rates for $w$ and $L$.

Quality adjustment: Survival. Uniform health effect. Life expectancy. Discounting to date treated, with charge for wait.
Weight: Cost
Form:
$$\frac{\sum_j c_{jt}\,x_{jt+1}\left(\dfrac{a_{jt+1} - k_j}{a_{jt} - k_j}\right)\left(\dfrac{2 - e^{rw_{jt+1}} - e^{-rL_{jt+1}}}{2 - e^{rw_{jt}} - e^{-rL_{jt}}}\right)}{\sum_j c_{jt}\,x_{jt}}$$
Rationale: 4.10.2.2
Results: Sec 5.5, Sec 6.3. With “optimal” wait adjustment Sec 5.5.
Assumptions: Efficient allocation. Constant health status. No effect of treatment on life expectancy. $k_j$ equal for all $j$ in Sec 5.5. $k_j$ varying across $j$ in secs 6, 7.
Comments: Adjustments have more effect with more costly treatments. Also calculated with different discount rates for $w$ and $L$.

Quality adjustment: Health and waiting time characteristics.
Weight: Value
Form:
$$\frac{\sum_j x_{jt+1}\left[\pi_{ht}\left(a_{jt+1}h_j^* - h_j^o\right)\left(1 - e^{-rL_{jt+1}}\right)r^{-1} - \pi_{wt}w_{jt+1}\right]}{\sum_j x_{jt}\left[\pi_{ht}\left(a_{jt}h_j^* - h_j^o\right)\left(1 - e^{-rL_{jt}}\right)r^{-1} - \pi_{wt}w_{jt}\right]}$$
Rationale: 2.7, 4.10.1
Results: Sec 6.5
Assumptions: Constant health status. No effect of treatment on life expectancy. Value additive in health and waiting time.

Notes: $c_{jt}$ unit cost; $x_{jt}$ volume; $a_{jt}$, $m_{jt}$ proportion of patients alive, dead on discharge (or after 30 days); $k$ estimate of proportionate effect of treatment on quality adjusted life years (QALYs); $L_{jt}$ life expectancy at mean age of patients treated; $r$ discount rate on QALYs/waits; $\pi_{ht}$ value of a QALY (£s); $\pi_{wt}$ value of a day's wait; $h^*$, $h^o$ estimates of health with and without treatment.

4.11 Patient satisfaction

The indices we have set out above are all variants on cost weighted aggregates built up from FCE data. There are however some attributes which the Department may wish to see reflected in an output index which are not covered by the data discussed so far. Prominent among these are patient views on cleanliness, food quality and satisfaction with non-curative aspects of treatment (such as the behaviour of nurses and doctors). Our analysis in this section is based on the assumption that these variables cause patient disutility because they are seen as indicative of poor treatment quality.

For this part of our work, then, we introduce these into our index by identifying indicators (such as aggregates constructed from responses to surveys of patient opinion).
We then calculate the growth rate of our comprehensive index, $I_t^{Comp}$, as the weighted sum of the growth rate of one of the quality adjusted output indices we have described earlier (call it $I_t^0$) and the growth rates of the other indicators. We denote the growth rate of indicator $k$ by $I_t^k$. If there are $n$ such indicators, and the relevant weights are denoted by $\omega_k$, then the overall index is given by

$$I_t^{Comp} = \sum_{k=0}^{n} \omega_k I_t^k \tag{98}$$

In some circumstances we may be able to deduce appropriate values for the $\omega_k$. For example we know the total value of expenditure on cleaning and this therefore gives us the basis for a weight on a cleanliness measure. In other cases the choice has to be purely arbitrary. We do not know what costs, if any, are involved in persuading doctors and nurses to be polite to patients, so there is little we can do except to invent a weight.

The NHS Patient Survey Programme covers Acute Trusts, Primary Care Trusts, Mental Health, Ambulance Trusts and others. In addition, there are plans for surveys focusing on the National Service Frameworks for coronary heart disease, mental health, older people, diabetes, etc. There are a number of issues that make the incorporation of patient satisfaction into our measure of NHS output difficult, some theoretical and some practical.

From a theoretical standpoint, there are at least three reasons why measures of patient satisfaction should be treated with caution. The first is that patient satisfaction is not independent of the level of output, $x_j$, or other characteristics $q_k$. For example, if we include a measure of waiting times in our set of characteristics, it is likely that patient satisfaction, not only with the time they had to wait, but also other aspects of the service they have received, will be correlated with this. Because of this, there is the likelihood of ‘double-counting’ output. Second, patients’ reports of satisfaction will depend on the levels of service they expect.
Certain sections of the population may have lower expectations than others, as noted in the First Interim Report (Dawson et al., 2004a). These expectations may vary systematically across the population, introducing bias to our measure of output. Third, satisfaction is a multidimensional concept. There may be a number of orthogonal aspects to satisfaction, beyond total satisfaction, that should be considered as separate characteristics in themselves and aggregated using their social valuations. For example, patients’ satisfaction with the hotel services aspect of an inpatient stay may be largely independent of their satisfaction with the medical aspects. Eliciting these different dimensions of satisfaction with single instruments is extremely difficult, if not practically impossible, which leads us to the empirical obstacles to the inclusion of a measure of patient satisfaction in the index.

The practical question arising from the use of patient satisfaction data is the construction of appropriate growth indicators, $I_t^k$. In our analysis we make use of qualitative surveys carried out for the Healthcare Commission. The results of these surveys show what proportion of patients gave answers in particular categories to each question. We constructed aggregates using the same approach as the Healthcare Commission. For each relevant question (i.e. those questions which were asked in more than one survey round without changes likely to influence results), we gave a weight of 100 to the most favourable answer, a weight of 0 to the least favourable answer and weights to intermediate categories based on the number of possible choices. Thus if there was one intermediate category it was given a weight of 50, if there were two the more favourable was given a weight of 67 and the less favourable a weight of 33, and so on. These weights were applied to the proportion of respondents in each category in order to give us an overall score.
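The scoring rule just described can be sketched as follows, assuming that the weights for intermediate categories are equally spaced between 100 and 0, which reproduces the 50 and 67/33 examples given above. The function names and the example proportions are hypothetical.

```python
def category_weights(n_categories):
    """Equally spaced weights from 100 (most favourable) down to 0 (least favourable)."""
    if n_categories < 2:
        raise ValueError("a question needs at least two answer categories")
    step = 100.0 / (n_categories - 1)
    return [100.0 - i * step for i in range(n_categories)]

def question_score(proportions):
    """Weighted score for one question.

    proportions: share of respondents in each category, ordered from the most
    favourable answer to the least favourable, summing to 1.
    """
    weights = category_weights(len(proportions))
    return sum(w * p for w, p in zip(weights, proportions))

# Hypothetical question with three categories: 50% most favourable,
# 30% intermediate, 20% least favourable.
example_score = question_score([0.5, 0.3, 0.2])

print([round(w) for w in category_weights(4)], example_score)
```

The score for a group of questions would then be the arithmetic average of the individual question scores.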
For a group of overall questions (cleanliness, food quality or other aspects of satisfaction) we took the arithmetic average of the individual scores in order to produce an overall indicator. This quantification appears arbitrary and one might be concerned that it does not really identify the latent variables which underlie the reported categorical responses. However the indices produced in this simple manner were very strongly correlated with indices derived from assumptions about the density function of the underlying latent variables. Thus we are reasonably happy with their use.

There are a number of questions we have not used in our analysis. Some of these collected information in both years but did not invite observations on service quality (e.g. What age group do you belong to?). Others we have rejected because there were small but possibly important changes to the questions between the years (e.g. in 2002 “Were you in a mixed-sex room or ward?” and in 2004 “Were you in a mixed-sex bay in a room or ward?”), or because they do not actually convey what is needed to assess patient quality. Thus the question about the time people have waited to see GPs does not identify patients who have been unable to see their GPs because the GPs only make appointments for the day that they will see the patients.

The outpatient questionnaire includes one question about the time the patient has had to wait for an appointment once they knew they needed it. We have counted waiting time elsewhere and therefore leave that question out of our assessment. A second question asks how long the patient has waited relative to the time of their appointment, inviting categorical answers. By making the assumption that the midpoint of each category is relevant to all patients in that category, and a plausible assumption for the final category (two and a half hours for a wait of more than two hours), we can estimate a mean waiting time and measure the percentage change in this.
This can be combined with other score variables for outpatient treatment.

These indicators are measured on a per-patient basis. To give an overall change in the “volume of quality”, it is necessary to add to each indicator the growth in the volume of treatment. Our own view is that this is not measured exactly by the cost weighted index, since non-medical quality matters whether a patient is given a cheap or an expensive treatment. The measure we use is the number of inpatient spells. For outpatients the appropriate volume indicator is the number of outpatient consultations and the same is true for primary care. We discuss this further in section 5. The percentage change in the volume component has to be added to the percentage change in quality to give the change in the overall volume of quality, i.e. the relevant Itk. The quantified data for the relevant questions are shown in tables A1 to A4 in Appendix A.

4.12 Discount rate on health

There is little agreement on the appropriate discount rate to use when health gains accrue over time. Cairns (1994, 1997) examines evidence on individuals’ time preferences for social health gains, such as saving lives of other people at different points in the future. Discount rates for saving lives fall from 41% where the delay is only two years to 16% where the delay is nineteen years. Gravelle and Smith (2001), using a social welfare framework, point out that if the value of health is increasing over time, estimates of the volume of health benefits must take this into account. The volume of health effects can be adjusted directly by the rate of growth in the value of health. An indirect method is to reduce the discount rate on health effects relative to the discount rate applied to costs. The size of the adjustment depends on estimates of the rate of growth of the value of health, which itself is a function of the rate of growth of the direct utility effect of health and the rate of growth of income.
If the indirect method of allowing for growth in the value of health by means of a lower discount rate on health effects is used, the discount rate on health effects will be 1-3% less than the discount rate on costs, depending on the assumptions made about key variables. The Department of Health currently discounts health benefits at 1.5% p.a. and costs at 3.5% p.a. (Department of Health, 1996; HM Treasury, 2003). We take 1.5% as our base case in discounting life expectancy but also consider higher rates to investigate the sensitivity of the index to the discount rate.

The formulations for the waiting time adjustment allow for the possibility that different discount rates could be applied to the period whilst waiting and the post treatment period. It could be argued that a higher rate of discount should be applied to the waiting period since there is a cost to waiting over and above the delay in getting treatment. Unlike the discount rate applied to health effects post treatment, there is no obvious higher rate to use for waits. We suggest that the direct disutility from waiting, over and above the delay in getting treatment, is best dealt with by treating it as a separate characteristic. Thus, when the data become available to support a value weighted index, waiting time could be included both as a determinant of the health effect and as a separate characteristic if patient preference studies suggest that this is appropriate.

We have included in Sections 5 and 6 illustrative calculations of the scaling effect of waiting time using different discount rates for health and waiting time. Since the rates applied to waiting time are arbitrary, and in the absence of any contrary indication, we prefer on balance to use the same rate for waits and health effects.
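To give a feel for how much the choice of discount rate matters, the sketch below computes the present value of a stream of one unit of health per year over a given life expectancy. It assumes continuous discounting, which is a simplification adopted here for illustration; the function name and the 20-year horizon are hypothetical.

```python
import math


def discounted_life_years(life_expectancy, rate):
    """Present value of one unit of health per year for `life_expectancy` years,
    under continuous discounting at annual rate `rate` (an assumed functional form)."""
    if rate == 0:
        return float(life_expectancy)
    return (1 - math.exp(-rate * life_expectancy)) / rate


# Hypothetical 20-year horizon, valued at the two rates mentioned in the text.
value_at_health_rate = discounted_life_years(20, 0.015)
value_at_cost_rate = discounted_life_years(20, 0.035)
```

Under these assumptions the same 20 years are worth noticeably fewer discounted years at 3.5% than at 1.5%, which is why the index is reported for more than one rate.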
4.13 Quality adjustment for general practice

Until recently routine data on the quality of general practice has been virtually nonexistent because information on general practice has been collected with the aim of paying GPs, and until April 2004 the General Medical Services contract paid little explicit attention to quality. GPs received bonus payments linked to the proportion of eligible women receiving cervical screens, and of children vaccinated. Cervical screening activities were included in the old Cost Weighted Activity Index as volume measures of activity, so that to this extent the CWAI had a small quality component.

From 1998/9 onwards GPs have been able to opt for Primary Medical Service contracts negotiated with their Primary Care Trusts under which they are meant to deliver an enhanced package of services. By April 2004 around 35% of practices had switched to PMS. Since PMS contracts are locally negotiated, the central reporting of the activities targeted under the GMS contract is patchy.

There are a number of validated measures of general practice quality, such as the proportion of patients with coronary heart disease whose blood pressure is controlled. The Department of Health is investigating the use of the QRESEARCH database extracted from a large (around 500) sample of GP electronic record systems to measure the quality of care in general practice using these types of indicators (Simkins, 2005).

The new General Medical Services contract of April 2004 contained financial incentives linked to achievement against a large basket of quality indicators in the Quality and Outcomes Framework (QOF) (Department of Health, 2003a; Roland, 2004). PMS practices were also required to join the QOF, and returns for all practices are centrally collected and available via the Quality, Prevalence and Indicator Database (QPID) run by the DH’s Health and Social Care Information Centre.
It is possible to calculate a subset of the quality indicators in the QOF for the practices in the QRESEARCH database over a number of years prior to the introduction of the QOF for all practices in April 2004. This series can then be used to quality adjust the past volume series of general practice consultations. In principle the quality indicator series can be calculated in future years for the practices in QRESEARCH and for all practices.

There are potential difficulties in constructing a quality adjusted series which covers the periods before and after 2004/5. GPs are partially altruistic: their actions are guided by a professional concern for their patients’ well-being and by a concern for their income and effort. The introduction of the QOF in April 2004 changed the relative importance of the professional and personal incentives. For aspects of quality covered by the QOF, GPs now have both financial and professional incentives. For aspects of quality which are outside the QOF they now only have professional incentives. The QOF may therefore lead to increased activity in the areas covered by the QOF and reduced activity in those outside it. Hence the changes in QOF activity may provide a misleading guide to changes in overall quality of care, since the unremunerated and therefore not centrally notified activities may have declined, or not increased to the same extent.

The data produced by the QOF is likely to be a fruitful source of quality adjustment of general practice activity in future years. Some of the indicators can be translated into health impacts, for example from studies of CHD event risk factors including blood pressure. But the interpretation of trends in quality indicators will require empirical modelling of the responses of practices to changing financial incentives and their impact on unremunerated quality indicators. Such modelling should be possible using QRESEARCH or a similar database.
The QOF also has an incentive for GPs to undertake surveys of their patients, though the reward is for carrying out the survey and acting on the results, and is not linked to the actual patient responses. If the QOF is adjusted to link payment to responses, for example on satisfaction with particular aspects of the practice, then this will be a potentially fruitful source of data on the patient experience in general practice. We discuss in section 4.11 how such data could be used as a quality adjustment to an output index.

It has been suggested that admission rates for ambulatory care sensitive conditions (ACSCs) can be used as a measure of the quality of general practice. ACSCs are conditions which can be controlled in a good quality general practice and which should not result in a hospital admission. The usual examples include asthma, diabetes and epilepsy, and ACSCs for these conditions have been used by the DH and the Commission for Health Improvement as primary care performance indicators. They have also been used extensively in the US, New Zealand and Australia (Giuffrida et al., 1999; Jackson and Tobias, 2001; Victorian Government, 2004). Even if ACSC admission rates are affected by the quality of general practice, they are also strongly influenced by factors outside the control of GPs, including the availability and admission criteria of hospitals (Giuffrida et al., 1999). ACSC admissions are an imperfect proxy for the health of the relevant population. We do not recommend their use as a means of quality adjusting measures of national general practice output.

4.14 Atkinson principles and quality adjustment

The Atkinson Review published its final report in January 2005 setting out recommendations for improving the measurement of government output and productivity (Atkinson, 2005). Atkinson paid particular attention to the Eurostat (2001) Handbook on price and volume measures in National Accounts.
The Handbook made important recommendations on the methods to be used to measure output and the implications for the measurement of non-market output. Eurostat distinguished between activities, outputs, and outcomes. For purposes of national accounting it is preferable to measure outputs (treatment received by a patient) rather than activities (number of operations or prescriptions). Outcomes should be used to quality adjust outputs. Eurostat “graded” the methods that governments use for measuring non-market outputs into A, B or C.

A. Preferred method:
i) use output indicators (rather than activities)
ii) all services should be covered, as detailed as possible
iii) outputs should be quality adjusted
iv) outputs should be cost weighted

B. Less satisfactory but acceptable method:
i) use output indicators but the detail needs to be improved
ii) no account is taken of quality change

C. Unacceptable method:
i) use of inputs instead of activities or outputs
ii) coverage of output not representative

In discussing Group A methods, counting inpatient activity by DRG is accepted as a way of measuring output, but this appears inconsistent as it is only a more disaggregated way of counting activities. Atkinson acknowledges the difficulty of quality adjusting by DRG activity. Atkinson says that ideally output should be measured as the whole course of treatment for an illness rather than just counting activities (by DRG or HRG). This would make it possible to take better account of the quality of care. The report notes that it “would be very helpful to be able to base quality adjustments for NHS output on a data set which measures the health outcome achieved as a result of treatment, collected annually by all or part of the NHS for most aspects of health care.” In the short term attention could be focused on a few disease groups for which data on outcomes is available.
In the general discussion of methodology, Atkinson points to the distortion in output indices that are weighted by average costs. Different outputs should be weighted by the marginal value of the outputs to individuals and there is no reason to believe this corresponds to marginal or average cost.

In major respects Atkinson (2005) recommended a methodology for measuring NHS output growth advocated in our earlier reports:

• Units of output should relate to patient journeys (courses of treatment) rather than activities (tests, procedures, consultations, drugs prescribed). Until a patient identifier permits linking the various services delivered to a single patient, output will still have to be measured as the sum of activities.

• The quality of outputs should be incorporated into output indices.

• Quality measures should reflect the attributes of output valued by individuals. Atkinson initially identifies relevant attributes as those recognised by current objectives of government health policy. We suggest, since individuals may value other attributes than those targeted by the policies of any particular government, and may value them differently, that further research is required on the attributes valued by individuals.

• In order to generate a single index of output, weights must be attached to the multitude of NHS activities. Weights should reflect the marginal social value of the activities. At present relevant data does not exist and ONS will continue to use cost weights. Atkinson acknowledges the distortions introduced by the use of cost weights: a relative increase in expensive treatments appears to increase NHS output while a relative increase in cost reducing treatments with the same or better health outcomes will appear to reduce NHS output. If we had routinely collected data on health outcomes and these data were used to weight the activities of the NHS, it would not be necessary to make a separate quality adjustment to an index of NHS output.
In the short term Atkinson suggests an alternative approach. Changes in quality which are currently measurable should be used to augment cost weighted output indices. We suggest that for health care, currently available data permit quality adjustment in respect of two attributes: waiting times and survival rates. Following the Atkinson (2005) approach requires that we find a basis for determining the relative value of changes in these two attributes of health care and use the resulting estimate of the rate of change in quality to quality adjust the cost weighted output index.

Atkinson (2005) stresses the need to measure the value added by public services. This is particularly important in health care where individuals may be healthier or living longer for reasons unrelated to the availability and quality of health care. What is required are measures of the marginal effect of the NHS in producing outputs of value to patients. Given the difficulties in estimating production functions, rough and ready judgements will be required when attributing improvement in health outcomes to health care.

Atkinson (2005) discusses the “complementarity” between public and private output. As economic growth leads to higher standards of living, the relative valuation of different goods and services will change. This will affect the weights to be attached to the various activities in an output index. If people are living longer the value of a hip replacement may increase as the present value of the benefits is estimated over a longer period of time. We argue that such changes in value arising from factors outside the control of the NHS should not be counted in measuring the quality adjusted growth rates of different types of NHS output. Instead they should be included in the weights to be applied in calculating the weighted average rate of growth of NHS output (the output index).
Changes in the value of a hip replacement arising from factors outside the control of the NHS should influence decisions on how to allocate NHS resources across different activities but not the assessment of the rate of growth of the output of hip replacement activities.

5 Experimental indices of NHS output

5.1 General trends, index form and data sources

This section demonstrates the impact on a cost weighted output index of adjustments for a number of quality dimensions. Prior to presenting results, however, it is useful to highlight some salient features of the data. Our starting point is the dataset used by the DH in its cost weighted output index (CWOI). For inpatient hospital care, the Hospital Episode Statistics (HES) are used (data on non-elective, elective and day case activity). The main quality adjustments discussed in this report are for survival and waiting times. Mortality rates are only available for HES activity (non-elective, elective and day case hospital activity). Waiting times are available for electives and day cases and some outpatient activities. In 2002/03 the data consisted of 1913 groups of activities. Table 5.1 gives a summary of the main divisions of this dataset.

In this chapter we focus on the quality adjustments to the HES data, which apply to 47% of the activity currently included in the CWOI. In chapter 6 we examine the effects of different quality adjustments to output indices for our small specimen set of HRGs for which we have health effects data. In the light of the results in this chapter for the HES set of hospital activities and the specimen set in chapter 6, we set out our preferred quality adjustment variant in chapter 7, for the set of HES hospital activities, for a broader set of hospital activities including outpatient treatments and accident and emergency, and for all NHS activity.
In chapter 9, the output indices from chapter 7 are combined with the input indices from chapter 8 to yield estimates of various types of productivity growth.

Table 5.1 Activities and cost shares 2002/3

| | Electives + day cases | Non-electives | Outpatients | Other activities (2) | Total |
|---|---|---|---|---|---|
| Number of activities (millions) | 5.58 | 5.96 | 53.43 | - | - |
| Cost shares (1) | 13.38 | 22.1 | 11.15 | 53.37 | 100 |

Notes: 1. Derived by multiplying activities by unit costs; 2. These activities are measured in non-comparable units so total numbers of activities are meaningless. A division of cost shares within this category is shown in Table 3.1.

In order to highlight the impact of quality adjustment for the activities where the data permit adjustment, in sections 5.4 and 5.5 we compare our quality adjusted indices to an unadjusted index restricted to the same set of activities. In the tables this truncated version of the CWOI is labelled “unadjusted”. In Section 7 we examine how quality adjusting for this subset of activities affects the value of the complete CWOI.

5.1 General trends and data

The HES data are grouped according to procedures comprising 574 Healthcare Resource Groups (HRGs), with an additional separation into electives and day cases and non-electives (Appendix B). Figure 5.1 graphs the number of episodes for each year from 1998/99 to 2003/04. It shows little change in electives up to 2001/02 with some growth thereafter. Non-electives show more significant growth, with very high growth in the final year.

[Figure 5.1 Number of FCEs, electives + day cases (elip) and non-electives (nelip), 1998/99 – 2003/04]

Within these broad categories there is considerable variation in number of procedures and in growth by HRG.
For example, comparing 2002/03 with 2001/02, the arithmetic mean growth in episodes for elective HRGs was 3.8% but with a standard deviation of 15.9% – growth across HRGs was even more variable for non-electives. Partly this reflects substitution across treatments but nevertheless the variation is large.

Unit costs from the Reference Costs database are employed to aggregate these diverse activities. The unit costs also show considerable variation across procedures, from over £20,000 for transplant procedures to under £500 for ophthalmic and ear procedures. In 2002/03 the mean unit cost across HRGs for electives was about £1,700 with a standard deviation of £2,220, with figures of £2,200 and £2,600, respectively, for non-electives.

The cost weighted output index combines activity growth by weighting by unit costs, equivalent to multiplying the ratio of activities by cost shares. Cost shares are concentrated in a few HRGs. Treating electives and non-electives as separate sets of activities, in 2002/03, 25% of expenditure was accounted for by only 20 HRGs with 50% accounted for by 74 HRGs, as illustrated by the cumulative expenditure share chart below. The top expenditure categories include HRGs where activity rates are very high, such as maternity care for normal deliveries or hip replacements, and which are not typically life threatening. But they also include HRGs where mortality rates are very high, such as heart procedures and complex procedures involving the elderly. A high cost share on this latter group turns out to be important in the adjustments for survival discussed below.

[Figure 5.2 Cumulative expenditure shares, FCEs (537 elective, 537 non-elective HRGs)]

5.2 Index form

Table 5.2 compares the Laspeyres (base period weighted) index with the Paasche (current period weights) index and the Fisher index which is the geometric mean of the Laspeyres and Paasche indices.
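The three index forms can be written out in a short sketch, given here for a volume index with unit costs as weights. The function names and the two-activity data are hypothetical and are not drawn from the HES or Reference Costs data.

```python
import math


def laspeyres(x0, x1, c0):
    """Volume index using base period unit costs c0 as weights."""
    return sum(x * c for x, c in zip(x1, c0)) / sum(x * c for x, c in zip(x0, c0))


def paasche(x0, x1, c1):
    """Volume index using current period unit costs c1 as weights."""
    return sum(x * c for x, c in zip(x1, c1)) / sum(x * c for x, c in zip(x0, c1))


def fisher(x0, x1, c0, c1):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(laspeyres(x0, x1, c0) * paasche(x0, x1, c1))


# Hypothetical volumes and unit costs for two activities in two periods.
x0, x1 = [100, 50], [110, 50]
c0, c1 = [10.0, 40.0], [12.0, 38.0]

growth_laspeyres = 100 * (laspeyres(x0, x1, c0) - 1)
growth_paasche = 100 * (paasche(x0, x1, c1) - 1)
growth_fisher = 100 * (fisher(x0, x1, c0, c1) - 1)
```

By construction the Fisher growth rate lies between the other two, which is the pattern visible in each column of Table 5.2.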
The index number formula used does have a small but significant impact on the indices, as shown for episodes for selected years. In this report we follow ONS in reporting Laspeyres indices.

Table 5.2 Impact of index number formula on CWOI index, FCEs based

| | 1999/00-2000/01 | 2000/01-2001/02 | 2001/02-2002/03 |
|---|---|---|---|
| Laspeyres | 0.90% | 0.93% | 4.41% |
| Paasche | 0.71% | 0.82% | 4.47% |
| Fisher | 0.81% | 0.87% | 4.44% |

Note: the reference costs for 1998/99 were considered unreliable so the index for the comparison between 1998/99 and 1999/00 uses 1999/00 unit costs in the numbers reported below. 2003/04 unit reference costs on a comparable basis were not available.

5.3 Spells versus episodes

In section 4.2 we discussed the choice between measuring hospital output in finished consultant episodes (FCEs) and continuous inpatient spells (CIPS), which consist of sets of consecutive FCEs (see also Appendix B). We argued that CIPS were a better approximation to the patient journey and therefore a more appropriate measure of output. We use CIPS for our calculation of the effects of quality adjustments.

Although there are around 8% fewer CIPS than FCEs this should have essentially no effect on the calculation of a cost weighted output index since we constructed our unit costs for CIPS from the underlying FCE unit costs. Table 5.3 compares FCE based and CIPS based CWOIs as a check on our calculations of unit costs of spells. The only reason for a divergence between the two indices is that some of the FCEs assigned to a particular year in the FCE index may be assigned to a different year in a CIPS index since a CIPS is assigned to a year only if its last FCE finished in that year. We would however expect to see differences in FCE and CIPS based indices once the outputs are adjusted for survival and mortality since these adjustments are applied to the different distributions of HRG types generated by the FCE and CIPS volume measures.
Table 5.3 Comparison of cost weighted output indices for hospitals based on finished consultant episodes and continuous inpatient spells

| CWOI index | Episodes | CIPS | pp diff |
|---|---|---|---|
| 1998/99-1999/00 | 1.84 | 1.87 | -0.03 |
| 1999/00-2000/01 | 0.90 | 0.91 | -0.01 |
| 2000/01-2001/02 | 0.93 | 0.95 | -0.02 |
| 2001/02-2002/03 | 4.41 | 4.44 | -0.03 |
| 2002/03-2003/04 | 5.75 | 5.81 | -0.06 |
| Average | 2.75 | 2.78 | -0.03 |

5.4 Survival adjustments: hospital output

5.4.1 Simple survival adjustment

This section considers the results of applying the survival adjustment formula in section 4.8.1 to HES data. There are two choices of death rates that can be used in the calculations: deaths that occur during the hospital stay (in-hospital deaths), or in-hospital deaths together with those occurring within some period following discharge from hospital. In-hospital deaths are those most directly attributable to the NHS but are likely to underestimate survival changes due to medical treatment since many patients die within a short time after discharge. However, using mortality rates after discharge runs the risk of attributing deaths from extraneous influences to the NHS. On average in the period under consideration 30 day mortality rates were about 25% higher than in-hospital deaths (Table 5.4). Both indicators show a downward trend with similar rates of decline.

Table 5.4 Mortality rates (deaths/CIPS), 1998/99-2003/04

| | 1998/99 | 1999/00 | 2000/01 | 2001/02 | 2002/03 | 2003/04 |
|---|---|---|---|---|---|---|
| In-hospital | 0.0239 | 0.0238 | 0.0229 | 0.0236 | 0.0228 | 0.0222 |
| 30 day | 0.0308 | 0.0306 | 0.0293 | 0.0299 | 0.0286 | 0.0276 |

Death rates vary enormously across procedures. Death rates are considerably higher for non-elective procedures than for electives. Although the rate of decline is greater in the latter (on average elective mortality rates declined by 5.9% from 1998/99 to 2003/04 against decreases of 2.1% for non-electives), aggregate trends are dominated by those for non-electives given their greater weight (Figure 5.3).
[Figure 5.3 30 day mortality rates, electives and non-electives, 1998/99 – 2003/04]

[Figure 5.4 Plot of mortality rates 2002/03, electives and non-electives]

Table 5.5 reports calculations of the pure short term survival adjusted cost weighted output index (section 4.8.1)

$$\frac{\sum_j x_{jt+1} \left( \dfrac{a_{jt+1}}{a_{jt}} \right) c_{jt}}{\sum_j x_{jt} c_{jt}} \tag{99}$$

where a is the survival rate. The first column of results is the unadjusted CWOI. The second and third columns are the survival adjusted indices calculated with 30 day and in-hospital death rates. The adjustments are non-trivial, though generally quite small in percentage point terms. These adjustments are generally larger in the final two years than in the beginning of the period. The use of 30 day mortality rates yields a higher adjustment than in-hospital deaths in all but the first year.

Table 5.5 Laspeyres CWOI index, CIPS, adjusted for survival

| Laspeyres CWOI | Unadjusted | Adjusted for survival (1-m), 30 day | Adjusted for survival (1-m), in-hospital |
|---|---|---|---|
| 1998/99-1999/00 | 1.87 | 1.27 | 1.37 |
| 1999/00-2000/01 | 0.91 | 1.16 | 1.08 |
| 2000/01-2001/02 | 0.95 | 0.89 | 0.86 |
| 2001/02-2002/03 | 4.44 | 5.37 | 5.14 |
| 2002/03-2003/04 | 5.81 | 6.37 | 6.22 |
| Average all years | 2.78 | 2.99 | 2.91 |

The impact of the survival adjustment depends on both the rate of change of survival across HRGs and their cost shares. The latter turn out to have a large impact since, as stated earlier, the majority of procedures show little change in survival but these tend to be concentrated in low cost procedures. To illustrate this point Figure 5.5 shows the change in average (unweighted) mortality rates (from Table 5.4) and the change in the CWOI adjusted for survival minus the unadjusted CWOI (the second column in Table 5.5 minus the first column in Table 5.5), both indexed at 1998/99 = 1.
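The role of cost shares in the simple survival adjustment can be seen in a small numerical sketch of the index. The function name and the two hypothetical HRGs below are illustrative assumptions: one is a low cost procedure with unchanged survival, the other a high cost procedure whose survival rate improves.

```python
def survival_adjusted_cwoi(x0, x1, c0, a0, a1):
    """Laspeyres cost weighted output index in which each activity's volume
    growth is scaled by the ratio of its survival rates, a1/a0."""
    numerator = sum(x1j * (a1j / a0j) * cj
                    for x1j, a1j, a0j, cj in zip(x1, a1, a0, c0))
    denominator = sum(x0j * cj for x0j, cj in zip(x0, c0))
    return numerator / denominator


# Hypothetical data: volumes, base period unit costs and survival rates.
x0, x1 = [100, 50], [105, 50]
c0 = [1000.0, 5000.0]
a0, a1 = [0.99, 0.90], [0.99, 0.92]

unadjusted = survival_adjusted_cwoi(x0, x1, c0, a0, a0)  # survival held fixed
adjusted = survival_adjusted_cwoi(x0, x1, c0, a0, a1)
```

In this illustration all of the difference between the two indices comes from the high cost activity, echoing the point that the aggregate adjustment is driven by survival changes where cost shares are large.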
The mortality rate shows a relatively smooth pattern, generally declining but with a small upward shift comparing 2000/01 and 2001/02. In contrast the impact on the CWOI is much more variable, and not always in the inverse direction to the change in the mortality rate. Figure 5.6 plots changes in survival rates against unit cost for one of the growth periods, 2001/02-2002/03. Most changes in survival are small, ranging around the value 1 on the y-axis, and the majority of these are in the lowest unit cost range. Year on year changes in the CWOI are driven largely by variations in survival rates in the relatively few procedures with very high unit costs, plus a few cases where changes in survival rates are very high in the low unit cost range.

[Figure 5.5 Mortality rates and the impact of the survival adjustment*, Index 1998/99=1]
* calculated as the difference between the 30 day survival adjusted CWOI and the unadjusted CWOI

[Figure 5.6 Growth in survival (ratio, 2002/03 over 2001/02) and unit costs, 2001/02-2002/03 (electives and non-electives)]

To understand the sensitivity of the results to cost shares we estimated the change in the index when survival was assumed unchanged for the top 25 high cost share HRGs, which represented just over 30% of total expenditure. The impact of this was to reduce the 30 day survival adjustments by about 60%. Thus the calculations depend heavily on the survival rates of a small number of HRGs. Within this high cost share group, comparing 1999/00 with 1998/99, 17 of the 25 HRGs showed reductions in survival rates and these are responsible to a large extent for the big negative impact of the survival adjustment on the CWOI in that growth period.
In contrast, in the final two growth periods the majority of high cost share HRGs witnessed increases in survival rates – 19 HRGs in 2001/02-2002/03 and 20 HRGs in 2002/03-2003/04.

Over time, both the number of HRGs with positive growth in survival rates and the share of expenditure accounted for by these procedures have increased, as shown in Table 5.6. If the percent of HRGs with increases in survival rates is lower than the cumulative expenditure share (in percent) of these procedures, then increased survival is concentrated in relatively high cost procedures. Table 5.6 shows that this is the case in each growth period except the first and that the discrepancy has increased through time.

Table 5.6 Changes in 30 day survival rates and expenditure shares

| | Percent of procedures* with change in survival rates >1 | Expenditure shares of procedures* with change in survival rates >1 |
|---|---|---|
| 1998/99-1999/00 | 42.8 | 37.8 |
| 1999/00-2000/01 | 55.5 | 62.0 |
| 2000/01-2001/02 | 49.5 | 50.7 |
| 2001/02-2002/03 | 62.9 | 75.4 |
| 2002/03-2003/04 | 63.0 | 77.7 |

* Total number of procedures = 1148, with electives and non-elective HRGs treated as separate procedures.

We suggested in section 4.8.1 that a simple survival estimate would be likely to be a conservative estimate of the effect of the growth in the health effect of treatment and we now consider how making strong assumptions about the health effect alters the results.

5.4.2 Survival and estimated health effects adjustment

We consider the survival and health effects adjusted index of section 4.8.2

$$\frac{\sum_j x_{jt+1} \left( \dfrac{a_{jt+1} - k_j}{a_{jt} - k_j} \right) c_{jt}}{\sum_j x_{jt} c_{jt}} \tag{100}$$

where $k_j = q^o_{jt} / q^*_{jt}$ is the ratio of health without treatment to health conditional on surviving treatment which, in the absence of data on actual health effects, we assume is constant over time. ($q^*_{jt}$ is the sum of discounted quality adjusted life years accruing to patients who survive treatment. $q^o_{jt}$ is the sum of quality adjusted life years for untreated patients.)
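The growth factor inside equation (100) can be sketched as below, together with the rule, explained in the paragraph that follows, of setting k to zero for HRGs with high mortality. The function name is hypothetical, and applying the cut-off to the higher of the two periods' mortality rates is an assumption made for this sketch; the report does not say which period's rate is used.

```python
def health_effect_ratio(a0, a1, k=0.8, cutoff=0.15):
    """Growth factor (a1 - k) / (a0 - k) for one HRG.

    When mortality (1 - survival) reaches the cut-off, k is set to zero and
    the factor reduces to the simple survival ratio a1 / a0.
    Assumption: the cut-off is tested against the higher mortality rate of
    the two periods, which keeps both a0 - k and a1 - k positive.
    """
    mortality = max(1 - a0, 1 - a1)
    if mortality >= cutoff:
        k = 0.0
    return (a1 - k) / (a0 - k)


# Hypothetical low mortality HRG: a small survival gain is magnified.
low_mortality = health_effect_ratio(0.980, 0.985)
# Hypothetical high mortality HRG: only the survival ratio is applied.
high_mortality = health_effect_ratio(0.70, 0.72)
```

For the low mortality case the factor is about 1.028, against a simple survival ratio of about 1.005, which is why a positive k tends to raise measured growth when survival is improving.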
With k = 0, which implies that the patient would have zero quality adjusted life years if not treated, we have the pure survival adjusted index. We examine the impact of assuming that k is positive. As noted in section 4.8.2, the rather sketchy available evidence suggests a value of around k = 0.8 for non life threatening procedures. When the treatment has a high mortality we set k = 0. If we used m = 0.2 for the cut-off mortality rate for setting k = 0, this would ensure that the term a – k is never negative, which would correspond to treatment having a negative effect on health. But as we noted in the simulations in section 4.8.2 this would make the index very sensitive to changes in mortality when the rate is close to 0.2. We therefore set the cut off value for mortality which leads to k = 0 so that a – k is never smaller than 0.05. Thus for HRGs with high mortality we adjust only by the survival rates, not by the survival rates and the assumed health effect.

Table 5.7 shows that including the crude health effects adjustment via k = qo/q* generally increases the growth rate compared with no adjustment (first column) and with a simple survival adjustment (Table 5.5). The greatest impact is in the third column, in particular for the final two periods. The results in general suggest that adjusting for survival adds about two percentage points to the growth rate in 2002/03 and 2003/04. Averaged across the five yearly growth rates, the impact ranges from adding about 1.0 to 0.4 percentage points to the growth rate. Contrast this with an average impact of 0.22 for the simple survival adjustment using 30 day survival rates. Thus a survival adjustment which incorporates crude but not implausible adjustments for health effects is capable of significantly adding to the growth rate of hospital output. Note, as with the simple survival adjustment, much of the impact is due to the behaviour of survival rates in the high cost share HRGs.
For example, in the case where k = 0.8 with cut-off = 0.10, nearly 70% of the adjustment can be attributed to the 25 HRGs with the highest cost shares.

Table 5.7 CWOI index, CIPS, adjusted for survival, 30 day mortality rates

                     Unadjusted   k=0.8       k=0.8       k=0.7       k=0.7       k=0.9
                                  if m<0.10   if m<0.15   if m<0.15   if m<0.10   if m<0.05
1998/99-1999/00      1.87         0.78        0.09        0.73        1.02        1.26
1999/00-2000/01      0.91         1.58        1.97        1.51        1.36        1.54
2000/01-2001/02      0.95         0.91        1.01        0.93        0.90        1.01
2001/02-2002/03      4.44         6.59        7.72        6.34        5.97        6.27
2002/03-2003/04      5.81         7.15        8.04        7.10        6.76        7.09
Average all years    2.78         3.36        3.77        3.28        3.20        3.43

Note: k = q0/q*; in each adjusted column k = 0 for HRGs with mortality at or above the stated cut-off.

5.4.3 Survival adjustments with health effects and life expectancy

In section 4.8.3 we suggested that including a term reflecting the life expectancy of patients treated would be a way of improving the crude adjustment for the health effect and proposed the index

\[
\frac{\sum_j x_{jt+1} c_{jt} \left( \dfrac{a_{jt+1} - k_j}{a_{jt} - k_j} \right) \left( \dfrac{1 - e^{-r L_{jt+1}}}{1 - e^{-r L_{jt}}} \right)}{\sum_j x_{jt} c_{jt}} \tag{101}
\]

where $L_{jt}$ is the life expectancy at the average age of patients getting treatment j and r is the discount rate on quality adjusted life years (the units in which health effects are measured). Table 5.8 reports the results of calculation of this index with a discount rate on remaining life equal to 1.5% for the simple survival adjustment and with our central case value of k = 0.8 with a mortality cut-off of either 0.15 or 0.10.
Table 5.8 CWOI index, CIPS, adjusted for survival, life expectancy, 30 day mortality rates, r = 1.5%

                     Unadjusted   k=0.8 if m<0.10   k=0.8 if m<0.15
1998/99-1999/00      1.87         1.12              0.74
1999/00-2000/01      0.91         1.37              1.76
2000/01-2001/02      0.95         0.76              0.89
2001/02-2002/03      4.44         6.31              7.44
2002/03-2003/04      5.81         7.13              8.03
Average all years    2.78         3.30              3.72

Note: k = q0/q*; k = 0 for HRGs with mortality at or above the stated cut-off.

Average growth is higher with either of the adjustments than without them, more markedly for the variant with the more generous cut-off, which leaves more HRGs being adjusted by the ratio of health effects than by the simple survival ratio. The effect of the life expectancy adjustment is to reduce growth compared with the corresponding case in Table 5.7 in all years except the first, which reflects the increasing age of patients treated by the NHS.

5.5 Waiting time and survival adjustments: hospital output

This section considers additional impacts on the indices from taking account of changes in waiting times. It shows results for a number of variants based on the formulae in section 4.10. We calculated the mean wait after truncating very long waits to four years, and the "certainty equivalent" wait, which was the mean plus a "risk premium" to reflect the disutility from the risk of a long wait relative to the mean. We used the rule of thumb that the certainty equivalent wait for a treatment was the 80th percentile wait for that treatment (see section 4.10.4). Table 5.9 shows mean waits across all patients and the mean 80th percentile wait across HRGs for electives in the period under study. This shows a decline in average waiting times using both measures since 1998/99. However, this is mainly due to a large drop between 1998/99 and 1999/2000. Starting in the latter year, mean waiting times increased up to 2002/03 but declined marginally in the final year.
Table 5.9 Trends in waiting time, days, averages across HRGs

           Mean                              Per cent HRGs with decline in waiting times
           Truncated mean   80th percentile  Truncated mean   80th percentile
1998/99    88.7             132.2
1999/00    80.8             117.7            62.3             55.1
2000/01    82.3             119.0            32.4             34.2
2001/02    85.2             124.4            31.9             33.9
2002/03    88.5             128.9            32.3             29.9
2003/04    85.9             126.8            63.4             51.5

The third and fourth columns of Table 5.9 summarise the variation across HRGs in waiting time experience by showing the percent of HRGs that show declines in average waiting times. Only in the first and last growth periods do the majority of HRGs show decreasing waits, with the majority showing increases in the intervening years. The overall mean measures of waiting times are affected by the extent to which activity moves between procedures. Although substantial proportions of HRGs record reductions in waits in any one year, this does not imply a substantial reduction in waiting times. Indeed, if yearly waiting times were symmetrically randomly distributed around an unchanging mean for each HRG, then 50% would have reductions in any given year but there would be no overall downward trend in waiting times. The first two columns of Table 5.9 suggest that adjustments for reductions in waiting times are likely to have a small effect because the measures of waiting time that we use did not change very much.

5.5.1 Effect of waiting time adjustments

The last point above is confirmed when we consider variations in quality adjustments to take account of changes in waiting times within the framework of a cost weighted output index. The tables below show the results of using various scaling factor waiting time formulae from section 4.10.2, with the two measures of waiting time and different discount rates. As a point of comparison, the first column of figures in each panel shows the base case: the survival adjustment variant with k = q0/q* = 0.8 and the mortality cut-off set to m = 0.10.
We found that other survival adjustments made little difference to the effects of the waiting time adjustments. Table 5.10 reports results from the adjustment with discounting to date of treatment with charge for wait (section 4.10.2.2)

\[
\frac{\sum_j x_{jt+1} c_{jt} \left( \dfrac{a_{jt+1} - k_j}{a_{jt} - k_j} \right)
\dfrac{\left[ \dfrac{1 - e^{-r_L L_{jt+1}}}{r_L} - \dfrac{e^{r_w w_{jt+1}} - 1}{r_w} \right]}
      {\left[ \dfrac{1 - e^{-r_L L_{jt}}}{r_L} - \dfrac{e^{r_w w_{jt}} - 1}{r_w} \right]}}
     {\sum_j x_{jt} c_{jt}} \tag{102}
\]

where $w_{jt}$ is the waiting time measure for HRG j, $r_w$ is the discount rate on waiting times and $r_L$ is the discount rate on QALYs. Note that this formula differs from that in Table 4.8 because it allows different discount rates on waits and QALYs; if the two discount rates are equal the formula reduces to that in Table 4.8. The panels differ in the measure of waiting time adopted (mean wait or 80th percentile).

Table 5.10 Laspeyres CWOI index, CIPS, adjustments for changes in waiting times

Discount to date of treatment with charge for wait (based on mean wait variable)

                     Survival          rw = rL    rw = rL    rw = 10%,    rw = 50%,
                     adjustment only   = 1.5%     = 5%       rL = 1.5%    rL = 1.5%
1998/99-1999/00      1.12              1.16       1.08       1.16         1.16
1999/00-2000/01      1.37              1.35       1.46       1.35         1.34
2000/01-2001/02      0.76              0.75       0.79       0.75         0.75
2001/02-2002/03      6.31              6.31       6.40       6.32         6.32
2002/03-2003/04      7.13              7.20       7.26       7.20         7.21
Average              3.30              3.32       3.36       3.32         3.32

Discount to date of treatment with charge for wait (based on 80th percentile wait variable)

                     Survival          rw = rL    rw = rL    rw = 10%,    rw = 50%,
                     adjustment only   = 1.5%     = 5%       rL = 1.5%    rL = 1.5%
1998/99-1999/00      1.12              1.18       1.13       1.19         1.21
1999/00-2000/01      1.37              1.34       1.44       1.34         1.33
2000/01-2001/02      0.76              0.75       0.78       0.75         0.75
2001/02-2002/03      6.31              6.34       6.44       6.35         6.38
2002/03-2003/04      7.13              7.24       7.31       7.25         7.30
Average              3.30              3.33       3.38       3.34         3.36

Note: All columns have the same survival adjustment: k = 0.8 if m < 0.10, 0 otherwise.

We also considered a number of variations in quality adjustments to take account of changes in waiting times, based on the use of additional formulae or different ways of measuring waiting times. The first reports the results for the waiting time adjustment with discounting to the date placed on the list (section 4.10.2.1):

\[
\frac{\sum_j c_{jt} x_{jt+1} \left( \dfrac{a_{jt+1} - k_j}{a_{jt} - k_j} \right)
\left( \dfrac{e^{-r_w w_{jt+1}}}{e^{-r_w w_{jt}}} \right)
\left( \dfrac{1 - e^{-r_L L_{jt+1}}}{1 - e^{-r_L L_{jt}}} \right)}
{\sum_j c_{jt} x_{jt}} \tag{103}
\]

where the waiting time adopted is the 80th percentile.

The second is based on the use of individual data to measure mean waiting times. Since the waiting time and life expectancy factors are non-linear, and there is variation in waiting times and in ages within an HRG in a given year, it is possible that our use of a single waiting time and life expectancy estimate for each HRG may lead to misleading results. We therefore computed the equivalent of the waiting time adjustment with discounting to date of treatment with a charge for waiting using individual level data.

Thirdly, we were asked to consider how an adjustment for waiting times could allow for optimal waiting times; it was suggested that some patients might find too short a wait inconvenient. In the absence of any information on what an optimal wait might be, we investigated the implications of assuming that the effect of an optimal waiting time w* was to replace the actual wait in our waiting time adjustments with ŵ = w - w* if w > w* and 0 otherwise. Thus reductions in waiting time below w* would have no effect, whereas the proportionate effect of reductions above w* would be increased. We experimented first with w* = 30 days but found that this resulted in a large number of HRGs where ŵ = 0; we therefore opted to use a value of w* = 15 days.

Table 5.11 shows the impact on the CWOI of these three variants, where the first column repeats the calculations in Table 5.10, discount to date of treatment with charge for wait (based on 80th percentile wait variable), for comparable discount rates. Discounting to date on list lowers the average growth rates, mainly through reductions in the first and last years.
The use of individual data has a greater effect in raising the growth rate, although this is concentrated in the first few years. The use of optimal waits has little impact on the average growth rates, with a discernible impact only in the final year.

Table 5.11 Laspeyres CWOI index, CIPS, adjustments for changes in waiting times, rL = rw = 1.5%

Columns:
(1) Discount to date of treatment with charge for wait, 80th percentile wait variable
(2) Discount to date on list, 80th percentile wait
(3) Using individual data, discounting to date of treatment with charge for wait, mean wait
(4) Optimal waits*, discounting to date of treatment with charge for wait

                     (1)     (2)     (3)     (4)
1998/99-1999/00      1.18    0.69    1.30    1.18
1999/00-2000/01      1.34    1.36    1.59    1.34
2000/01-2001/02      0.75    0.74    1.05    0.75
2001/02-2002/03      6.34    6.53    6.41    6.34
2002/03-2003/04      7.24    7.07    7.24    7.25
Average              3.33    3.24    3.48    3.33

* Based on 15 day optimal waiting time.
Note: All columns have the same survival adjustment: k = 0.8 if m < 0.10, 0 otherwise.

The small effects of the waiting time adjustments are largely driven by the lack of change in waiting times rather than by the methods used. To see this, suppose waiting times at the 80th percentile were reduced by 10% for all HRGs comparing 2003/04 with 2002/03. Then the discount to date of treatment adjustment, with low discount rates equal to 1.5%, would add 0.16 percentage points. With the same discount rates, reducing waits at the 80th percentile by 50% would add 1.12 percentage points. The results from the specimen index calculated with a much smaller set of HRGs are sensitive to the method of waiting time adjustment, which can make a difference to estimated growth rates. In addition, the impact of changes in waiting times depends on the cost share weights. In this case, however, large increases in waiting times tend to be concentrated in low unit cost procedures. This is illustrated for the final two growth periods in Figure 5.7 and Figure 5.8, but a similar pattern is also apparent for earlier years.
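The per-HRG waiting time scaling factors discussed in this section can be sketched as follows. This is a hedged illustration under one reading of the two adjustments (discounting to date of treatment with a charge for wait, and discounting to the date placed on the list). The function names are hypothetical. Waits and life expectancy are assumed to be measured in years and discount rates as annual rates; the optimal-wait helper works in days.

```python
import math


def value_at_treatment(life_exp: float, wait: float,
                       r_l: float, r_w: float) -> float:
    """Discounted life-years term less the charge for waiting."""
    return ((1 - math.exp(-r_l * life_exp)) / r_l
            - (math.exp(r_w * wait) - 1) / r_w)


def factor_date_of_treatment(l0: float, l1: float, w0: float, w1: float,
                             r_l: float = 0.015, r_w: float = 0.015) -> float:
    """Ratio of period t+1 to period t values, discounting to treatment date."""
    return (value_at_treatment(l1, w1, r_l, r_w)
            / value_at_treatment(l0, w0, r_l, r_w))


def factor_date_on_list(l0: float, l1: float, w0: float, w1: float,
                        r_l: float = 0.015, r_w: float = 0.015) -> float:
    """Ratio of period t+1 to period t values, discounting to date on list."""
    wait_ratio = math.exp(-r_w * w1) / math.exp(-r_w * w0)
    life_ratio = (1 - math.exp(-r_l * l1)) / (1 - math.exp(-r_l * l0))
    return wait_ratio * life_ratio


def excess_wait(wait_days: float, optimal_days: float = 15.0) -> float:
    """Wait in excess of an assumed optimal wait w*, floored at zero."""
    return max(wait_days - optimal_days, 0.0)


# Illustrative inputs only: 26 years of life expectancy, and an 80th
# percentile wait that falls by 10% from 126.8 days.
w0 = 126.8 / 365
w1 = 0.9 * w0
print(factor_date_of_treatment(26.0, 26.0, w0, w1))
print(factor_date_on_list(26.0, 26.0, w0, w1))
print(excess_wait(10.0), excess_wait(40.0))
```

With life expectancy in decades and waits in fractions of a year, the waiting term is small relative to the life expectancy term, which is one way of seeing why modest changes in waits move the factor only slightly.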
Figure 5.7 Percentage changes in waiting times (days) and unit costs, 2001/02-2002/03
[Scatter plot: percentage change in waiting times (mean wait and 80th percentile wait) on the vertical axis against unit costs on the horizontal axis.]

Figure 5.8 Percentage changes in waiting times (days) and unit costs, 2002/03-2003/04
[Scatter plot: percentage change in waiting times (mean wait and 80th percentile wait) on the vertical axis against unit costs on the horizontal axis.]

Again it is useful to summarise the relationship between cost and changes in waiting times by the number of HRGs that show reductions and their expenditure shares. Table 5.12 shows that for three of the five growth periods the majority of HRGs show increases in waiting times, with higher proportions showing reductions in the first and final periods. In general the percent of HRGs with reductions in waiting times is about equal to their expenditure share, so that reductions tend to be concentrated at the low unit cost end.

Table 5.12 Changes in waiting times and expenditure shares

                     Mean wait                              80th percentile wait
                     Per cent of        Expenditure         Per cent of        Expenditure
                     electives* with    shares of           electives* with    shares of
                     reduction in       electives* with     reduction in       electives* with
                     waiting time       reduction in        waiting time       reduction in
                                        waiting time                           waiting time
1998/99-1999/00      62.3               68.7                55.1               65.8
1999/00-2000/01      32.4               37.3                34.2               37.7
2000/01-2001/02      31.9               30.4                33.9               35.2
2001/02-2002/03      32.3               33.3                29.9               29.8
2002/03-2003/04      63.4               63.9                51.5               51.7

* Total number of electives = 563

We did not estimate the alternative characteristic adjustment set out in section 4.10.1 for all HRGs because of lack of data on health effects. However, we report in section 6 results from using this approach to waiting times with a small specimen set of HRGs for which we have better health data.

5.5.2 Outpatient waits

Finally we consider outpatient waits. Data on waiting times for first outpatient attendances are only available for four of the years considered in this report.
The average wait for outpatients was 64 days in 1999/00 and 2000/01 but then declined by about 10% in 2002/03 to 58 days and by a further 7% to 54 days in 2003/04. We used the discount to date of treatment formula as for electives above, assuming all outpatients had a remaining life expectancy of 26 years, the average across electives. The cost weight for changes in waiting times for outpatients was assumed to be the sum of the cost shares of first attenders and follow up appointments, to be consistent with the spells approach employed in previous calculations. The effect of this adjustment was to increase the growth of the cost weighted output index for outpatient first attenders from 4.47% to 4.59% in 2001/02 and from 6.48% to 6.56% in 2002/03. These adjustments become very small when all outpatients, including follow-ups, are included in the index.

5.6 Additional quality adjustments

We also considered the use of data in addition to survival and waiting time in order to quality adjust the output index. These additional adjustments are necessarily speculative because of the absence of crucial data, so we present them mainly to illustrate the application of the methods described in section 4 and to give a very rough indication of the crucial parameters on which information is required. Given current data availability we do not recommend that they be used to quality adjust the NHS output index. The adjustments are of two types: (a) we treat measures of readmissions and MRSA as indicators of unnecessary additional expenditure (section 4.9.1); and (b) we use measures of patient experience as summary indicators of characteristics that patients value (section 4.11).

5.6.1 Adjusting for the costs of poor treatment: readmissions and MRSA

We suggested in section 4.9.1 that one way of accounting for readmissions and MRSA was to argue that these led to lost output whose value, in accordance with the assumptions underlying the cost weighted index, was their additional cost to the NHS.
Thus, ignoring other quality adjustments for illustrative purposes, we calculate

\[
\frac{\sum_j x_{jt+1} c_{jt} - \sum_j x^b_{jt+1} c^b_{jt}}{\sum_j x_{jt} c_{jt} - \sum_j x^b_{jt} c^b_{jt}} \tag{104}
\]

where $x^b$ denotes the number of readmissions or cases of MRSA and $c^b$ their costs. As we noted in our discussion in section 4.9.1 (see also Appendix A), the current data on the $x^b$ are not sufficiently detailed to enable us to distinguish, say, between readmissions which are the result of poor initial treatment and those which result from pre-existing poor health of the patient. In our calculation we therefore use the total number of readmissions and the total number of MRSA cases. Since we have no data on MRSA cases or readmissions prior to 2001/02, we show only the effect from 2001/02 to 2003/04. There are also problems in estimating the costs $c^b$ if we do not know, for example, which readmissions are indicators of poor treatment. We therefore use notional costs of £500 for a readmission and of £1000 for an MRSA case at 2002/03 prices, with prices in other years estimated using money GDP per capita. With these data, and working from the cost weighted index with no adjustment for mortality or waiting time, we can see what effect these have on the estimated growth rate. We show three cases in addition to the basic cost weighted index.

Table 5.13 Effects of quality adjustment for readmissions and MRSA on cost weighted hospital output

                                                       Hospital    Adjusted indices
                                                       CWOI*
MRSA charge (£ per case)                               £0          £1,000    £0        £1,000
Readmission charge (£ per case)                        £0          £0        £500      £500

                  Readmissions    No. of MRSA cases
2001/2            476,556         17,933
2002/3            492,247         18,519               4.44%       4.44%     4.46%     4.46%
2003/4            536,005         19,311               5.81%       5.81%     5.75%     5.75%
Average growth                                         5.12%       5.12%     5.10%     5.11%

* This is the unadjusted CWOI for the inpatient hospital sector (see Table 5.3).

The impact of MRSA cases is negligible. The number of cases is very small compared to the number of patients treated in hospitals; a very much higher cost would be required for it to have an impact.
The effect of readmissions is slightly larger (see section 4.9). However, since the growth rate of readmissions, at about 6% p.a. over the two years, is not very different from the growth rate of the unadjusted index, the costs associated with readmissions (which are regarded as money wasted rather than contributing to output) would have to be very large for there to be a substantial effect on the overall index.

Table 5.14 shows the effect of adding the readmission and MRSA adjustments to the survival and waiting time adjusted hospital cost weighted output index. It also further clarifies the calculations, in the case where both adjustments are included, by showing the shares of total expenditure attributed to each. With such small shares, especially for MRSA, it is unlikely that the impact on the overall index would ever be of great significance.

Table 5.14 Effects of quality adjustment for readmissions and MRSA on cost weighted inpatient hospital output in addition to survival and waiting time adjustment

Columns:
(1) Index adjusted for waiting and mortality, growth rate
(2) Index additionally adjusted for readmissions and MRSA, growth rate
(3) Growth rate, MRSA
(4) Growth rate, readmissions
(5) Combined growth rate, MRSA/readmissions
(6) Assumed costs of MRSA as proportion of total costs
(7) Assumed costs of readmissions as proportion of total costs

                      (1)      (2)      (3)      (4)      (5)      (6)      (7)
2001/02                                                            0.14%    1.89%
2002/03               6.35%    6.41%    3.27%    3.29%             0.14%    1.81%
2003/04               7.25%    7.22%    4.28%    8.89%
Av 03/4 over 01/02    6.80%    6.82%    3.77%    6.05%    5.90%

Note: The index adjusted for waiting and mortality is the 80th percentile waiting variable with rw = 10% p.a. and rL = 1.5% p.a. in Table 5.10. The adjustments for readmissions and MRSA are incorporated in the index by adding to the growth in the index adjusted only for waiting and mortality the growth rates of MRSA and readmissions, weighted negatively by the cost shares for the previous year.
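The arithmetic behind Table 5.14 can be sketched by rewriting the ratio in equation (104) in growth-rate form. If the index being adjusted grows at rate g, and each category of poor outcome b has a previous-year cost share s_b and grows at rate g_b, the adjusted growth rate is (g - Σ s_b g_b) / (1 - Σ s_b). The function name below is hypothetical, and pairing the 2001/02 shares with 2002/03 growth and the 2002/03 shares with 2003/04 growth is an assumption based on the note to the table.

```python
from typing import Sequence, Tuple


def adjusted_growth(g_index: float,
                    poor_outcomes: Sequence[Tuple[float, float]]) -> float:
    """Growth of the index after netting off the cost of poor outcomes.

    g_index       : growth rate of the index being adjusted (0.0635 = 6.35%)
    poor_outcomes : (previous-year cost share, growth rate) for each category,
                    for example MRSA and readmissions
    """
    total_share = sum(share for share, _ in poor_outcomes)
    # Numerator and denominator of the ratio, both expressed relative to
    # base-period cost weighted output.
    numerator = (1 + g_index) - sum(share * (1 + g)
                                    for share, g in poor_outcomes)
    denominator = 1 - total_share
    return numerator / denominator - 1


# Figures as reported in Table 5.14 (MRSA first, then readmissions).
g_2002_03 = adjusted_growth(0.0635, [(0.0014, 0.0327), (0.0189, 0.0329)])
g_2003_04 = adjusted_growth(0.0725, [(0.0014, 0.0428), (0.0181, 0.0889)])
print(f"{100 * g_2002_03:.2f}%  {100 * g_2003_04:.2f}%")
```

Under these assumptions the sketch gives 6.41% and 7.22%, the growth rates reported in the table for the index additionally adjusted for readmissions and MRSA. It also shows why the adjustment can raise as well as lower the growth rate: it lowers growth only when the poor outcomes grow faster than the index itself.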
These speculative guesstimates suggest that adjusting for readmissions and MRSA in this way can have a non-trivial effect on the survival and waiting time adjusted hospital cost weighted output index. While we stress the illustrative nature of these figures, one important policy point does follow from them. Cases of MRSA are so rare that the costs associated with its treatment would have to be a substantial multiple of the £1000 we assumed before it could have an important impact on the index. Readmission, on the other hand, is of material importance. If the DH wishes to use our approach to adjust the CWOI we recommend that it should focus initially on quantifying the costs of MRSA cases and the proportion of readmissions which are avoidable or harmful.

5.6.2 Patient satisfaction

As we note in our discussion in section 4.11, there are theoretical arguments against using such data, not least that it may simply be double counting aspects of quality, such as health effects and waiting times, which can be captured by other more direct means. On the other hand, if one believes that satisfaction survey responses measure characteristics of NHS care which are of value to patients and are not already reflected in other quality measures, then we have suggested in section 4.11 a method of incorporating such data. The data are described in Appendix A. Since patient satisfaction data only permit comparison of 2004 or 2004/05 with 2003, we have illustrated our method by examining its impact on the average annual growth rate in hospital output between 2001/02 and 2003/04 (see footnote 8).

Our method has three main steps. First we quantify the ordered qualitative responses to various satisfaction questions by assigning them equally spaced numerical values between 0 and 100 for the least to the most satisfied categories.
We construct such numerical scores for three aspects of the patient experience: food, cleanliness and non-clinical experience (for example whether patients felt they were treated with respect and dignity). We have scores from A&E, outpatient, and inpatient surveys. There is a residual category of "other" hospital activity, taking up about 25% of costs. We have assumed that the surveys do not describe patient satisfaction with the services provided by this category.

Second, since these scores are in effect estimates on a per patient basis, we need to scale them by a suitable measure of the volume of such experiences. We use the number of patients for outpatients and A&E and the number of patient spells for inpatients. The reason for choosing patient spells for inpatients rather than the alternative of bed-days is that the analogy with hotel services can be taken only so far. One can argue that most patients would prefer short stays rather than long stays in hospital: staying in hospital is a necessary evil rather than a service consumed with the readiness of a stay in a hotel. Because we regard a stay in hospital as a route to better health rather than a consumption good per se, we make the calculation with the volume of treatment measured by numbers of consultant inpatient spells.

Third, the scores are incorporated into the output index in two ways. We use expenditure on food and cleaning to weight the food and cleanliness scores, taking account of the shares of A&E, outpatients, and inpatients in total hospital costs in 2001/02 (see footnote 9). We cannot, however, find from the expenditure data weights appropriate to non-clinical experience.

Footnote 8: We also calculated patient satisfaction adjustments to the volume of patient consultations for the same period, which had little effect because the patient experience scores changed little over the period.
We therefore present results making the assumption that non-clinical experience accounts for either 5% (case A) or 10% (case B) of total expenditure by hospital trusts and that these same proportions apply to the three activities covered. It can be doubted whether any form of accounting would identify the proportions, since the score includes measures of politeness and courtesy which cost nothing. Table 5.15 reports an illustrative calculation of growth rates in the satisfaction scores, and the growth in the total volume of quality taking account of the numbers experiencing these different aspects of care in the different sectors.

Footnote 9: These weights overstate the importance of the cost of providing the "hotel service" components of cleanliness and food quality since they also have medical consequences. But since we do not have data on the consequences of medical treatment for quality of life, we are not in fact double-counting. In any case, since the cost weights are small, double-counting is unlikely to be a major source of error.

Table 5.15 Illustrative indicators of patient satisfaction with hospital services, average annual growth rates, 2001/02 to 2003/04

A&E Overall Outpatients Inpatients Change Food Cleanliness 1.02% Superficial Attention 0.70% Quality Change (A) Quality Change (B) Volume Change 4.59% Total Change (A) Total Change (B) Weight 4.47% 0.64% 0.64% -2.27% -0.63% -1.05% -0.10% 0.56% 5.05% 5.12% 0.40% 0.14% 0.25% 5.08% 5.23% 5.49% 24.21% Weight in overall Trust Budget A B 0.85% 0.85% 1.44% 1.44% 5.00% 10.00% 7.29% 12.29% 71.32%

Notes:
1. The changes in the indicators of food, cleanliness and non-clinical care are the changes in the logarithms of the relevant variables. This is also true of the changes shown in tables A1 to A4.
2. The changes in each indicator for each category of treatment are weighted together using the weights in the last row so as to give the overall change in each quality attribute. The food indicator is used as it stands because all expenditure on food is associated with inpatients.
3. The overall quality indicators for each attribute are then weighted using weights (A) or (B) shown in the last two columns, to give the overall quality changes (A) and (B). These growth rates are then combined with the overall volume change (calculated as the weighted sum of the changes in the individual volumes) to give the total change using weights (A) and (B).
4. The quality changes for food, cleanliness and non-clinical care are calculated for whatever period the data happen to be available. The changes in volume relate to 2003/04 over 2001/02.

The overall impact of these adjustments on the hospital output index is shown in Table 5.16. The effect of the quality terms is further damped because total hospital output includes the "other" activities in addition to inpatients, outpatients and accident and emergency. We have no quality data on these. But in any case, with the MRSA/readmissions and quality indices both growing at rates not very different from the overall quality-adjusted CWOI, it is not surprising that these effects have little influence on the overall total.

Table 5.16 Illustrative calculations of hospital CWOI with adjustments for survival, waiting times, patient satisfaction as measured in patient surveys, readmissions and MRSA. Average annual growth rates 2001/02 to 2003/04

Average growth rates 2001/2 to 2003/4                                                  % p.a.
Unadjusted CWOI                                                                        4.34%
Quality Variant 1                                                                      5.74%
With adjustment for Patient Satisfaction
  (5% weight on non-clinical care satisfaction)                                        5.71%
With adjustment for MRSA, Readmissions and Patient Satisfaction
  (5% weight on non-clinical care satisfaction)                                        5.71%
With adjustment for Patient Satisfaction
  (10% weight on non-clinical care satisfaction)                                       5.69%
With adjustment for MRSA, Readmissions and Patient Quality and Satisfaction
  (10% weight on non-clinical care satisfaction)                                       5.69%

Note: these figures refer to the broader definition of the hospital sector; see Table 9.2 and preceding discussion.

5.7 Conclusions

This section has implemented a number of the quality adjustments to hospital sector output, based on the methods developed in section 4, to make use of currently available data. We found that
• the pure survival adjustment raises the average annual growth rate of the hospital sector between 1999/00 and 2003/04 from 2.78% to 2.99% when survival is measured at 30 days.
• the survival adjustment has a smaller effect when calculated using in-hospital survival (average growth 2.91%).
• combining the survival adjustment with an assumed uniform proportional health effect further increases the average growth rate by around 0.4 to 1.0 percentage points, depending on the assumed value of the health effect and the mortality rate cut-off criteria used to reduce the volatility of the index.
• adding a life expectancy adjustment to the survival and health effect adjustment had little additional effect on the growth rate.
• combining survival, assumed health effect, waiting time and life expectancy adjustments produced estimated growth rates which were very similar to estimates with only survival and assumed health effect adjustments.
• the effect of the waiting time adjustment was insensitive to very large variations in discount rates on waiting times, to the use of individual rather than HRG level data, to the form of the adjustment, and to the measure of waiting time (mean wait or 80th percentile wait).
• the small waiting time effects are due to the small changes in waiting times over the period rather than to the form of the waiting time adjustment and the particular parameter values used.
• a crude illustrative adjustment for readmissions and MRSA (all that is possible with current data), in addition to the survival, assumed health effect, life expectancy and waiting times adjustments, had no perceptible effect on the average annual growth rate (2001/2 to 2003/4).
• a similarly crude illustrative adjustment for patient satisfaction with food, cleanliness and non-clinical care reduced the growth rate (2001/2 to 2003/4) very slightly, by less than 0.1 percentage point.

6 Specimen output index

6.1 Introduction

In Section 2.4 we set out our preferred index of NHS output, a value weighted output index. It is not possible to estimate a comprehensive value weighted index because of a lack of data on the most important characteristic: improvement in health for patients who survive treatment. In Sections 2.7 and 4.6 we examined the stringent assumptions necessary to justify using cost weights in the output index. In Section 4.8.2 we looked at the sensitivity of the output index to an estimate of health gain based on the assumption that health gain was constant across all activities and did not vary over time.

In this section we examine the implications of having better health data which permit the calculation of our preferred value weighted output index. We have identified a few HRGs where data exist on health outcomes (Appendix C). The data are similar to what would be produced by sampling patients before and after treatment and are used to drop the restrictive assumption that health gain is constant across activities. While the conditions for which we have outcomes data are not representative of all NHS activities, we are able to compare a number of "specimen" indices with the equivalent cost weighted output index for the same sub-set of conditions.
We use the health data to examine:
• A value weighted output index assigning monetary weights to improvements in health and reductions in waiting times
• The impact on an index of substituting cost weights with value weights
• The effect on health effect adjusted cost weighted indices of allowing health gain to vary by treatment

In the next subsection we describe the data used in the construction of the specimen index. We then estimate the following indices:
• A Cost Weighted Output Index (CWOI)
• CWOI with a short-term survival adjustment
• CWOI incorporating health adjustment
• CWOI incorporating health and waiting times adjustment
• A health outcome weighted output index (HOWOI)
• A HOWOI incorporating waiting times adjustment
• A Value Weighted Output Index (VWOI) where health and waiting times are treated as characteristics

We also explore the sensitivity of results to
• In-hospital versus 30-day survival rates
• The measurement of waiting times
• Discount rate
• Monetary value of a QALY
• Monetary value applied to a day spent waiting

6.2 Data

The specimen index comprises the HRGs listed in Table 6.1 below. For all of these HRGs, data on the health outcomes before and after treatment were available, either from clinical trials that employed the EQ5D or from our analysis of SF36 data from BUPA and York District Trust. The data derived from these two instruments are converted to a common scale. These data are described in greater detail in Appendix C.
Table 6.1 Before and after health outcomes

HRG    h0j     h*j     HRG description                                                          Source
A07    0.41    0.57    Intermediate Pain Procedures                                             BUPA
B02    0.73    0.76    Phakoemulsification Cataract Extraction with Lens Implant                BUPA
B03    0.70    0.72    Other Cataract Extraction with Lens Implant                              BUPA
C14    0.87    0.95    Mouth or Throat Procedures - Category 2                                  BUPA
C22    0.83    0.91    Nose Procedures - Category 3                                             BUPA
C24    0.77    0.93    Mouth or Throat Procedures - Category 3                                  BUPA
E04    0.50    0.73    Coronary Bypass                                                          BUPA
E12    0.68    0.72    Acute Myocardial Infarction w/o cc                                       EQ5D
E15    0.54    0.79    Percutaneous Transluminal Coronary Angioplasty (PTCA)                    BUPA
E35    0.63    0.69    Chest Pain >69 or w cc                                                   EQ5D
F73    0.64    0.69    Inguinal Umbilical or Femoral Hernia Repairs >69 or w cc                 BUPA
F74    0.74    0.81    Inguinal Umbilical or Femoral Hernia Repairs <70 w/o cc                  BUPA
G01    0.53    0.59    Liver Transplant                                                         EQ5D
G13    0.63    0.66    Biliary Tract - Major Procedures >69 or w cc                             BUPA
G14    0.68    0.81    Biliary Tract - Major Procedures <70 w/o cc                              BUPA
H02    0.37    0.62    Primary Hip Replacement                                                  BUPA
H04    0.35    0.54    Primary Knee Replacement                                                 York
H23    0.77    0.84    Soft Tissue Disorders >69 or w cc                                        BUPA
H24    0.72    0.74    Soft Tissue Disorders <70 w/o cc                                         BUPA
H26    0.41    0.53    Inflammatory Spine, Joint or Connective Tissue Disorders <70 w/o cc      EQ5D
J01    0.93    0.96    Complex Breast Reconstruction using Flaps                                BUPA
L32    0.81    0.85    Non-Malignant Prostate Disorders                                         EQ5D
M07    0.70    0.80    Upper Genital Tract Major Procedures                                     BUPA
M09    0.72    0.83    Threatened or Spontaneous Abortion                                       BUPA
P18    0.36    0.41    Psychiatric Disorders                                                    EQ5D
Q11    0.77    1       Varicose Vein Procedures                                                 EQ5D
R02    0.37    0.67    Surgery for Degenerative Spinal Disorders                                BUPA
R03    0.36    0.62    Spinal Fusion or Decompression Excluding Trauma                          BUPA
R09    0.32    0.60    Revisional Spinal Procedures                                             BUPA

Annual data from 1998/99 to 2003/04 on the following are used in the construction of the indices:
• Activity, measured as continuous inpatient provider spells (CIPS), derived from HES.
• In-hospital and 30-day survival rates, derived from HES.
• Waiting times, measured as the mean waiting time and the wait at the 80th percentile, derived from HES.
• Life expectancy, derived from life tables and estimated according to the average age of those in each HRG.

Raw data for each of the variables used in the indices are provided for each year from 1998/99 to 2003/04. To give an intuitive sense of the change in these data over time, and hence of what the various indices will be capturing, some of the data are presented in the following tables and figures. Table 6.2 provides elective and non-elective activity, measured as CIPS, for each year. Figure 6.1 shows the amount of activity in each HRG in 1998/99, when the series begins, and 2003/04, when the series ends. Figures (a) and (b) show the amount of elective CIPS for, respectively, low and high volume HRGs. Figures (c) and (d) provide similar information for non-elective CIPS. For the majority of HRGs, the number of CIPS in 2003/04 (the darker bars) is greater than the number in 1998/99. All else equal, this would be expected to translate into a positive change in the index over the full period.
Table 6.2 Activity by year

Elective activity (CIPS)
HRG       1998/99   1999/00   2000/01   2001/02   2002/03   2003/04
A07        85,645    86,770    88,495    93,227    99,493   100,640
B02       155,929   177,551   212,176   224,247   250,377   282,486
B03        32,040    20,644    13,542    10,648     7,755     5,602
C14       157,225   152,462   143,192   140,591   144,338   145,078
C22        39,451    37,603    37,511    31,858    33,653    33,026
C24       128,004   114,666   101,809   100,901   106,716   101,381
E04        15,215    14,618    14,860    15,046    16,280    15,132
E12           330       212       228       187       328       332
E15        10,555    11,364    13,178    15,439    17,625    21,490
E35           446       412       506       482       534       530
F73        22,300    21,432    21,886    21,348    22,918    24,405
F74        57,034    54,838    55,927    54,517    58,334    59,413
G01            75        99        57        88       112       141
G13         6,935     6,825     7,439     7,600     8,318     8,910
G14        24,491    24,707    26,169    27,636    31,026    33,125
H02        34,122    34,355    36,100    37,530    41,630    46,126
H04        27,741    28,730    31,685    34,392    41,037    48,916
H23           896     1,153     1,364     1,189     1,487     1,637
H24         2,547     3,121     3,523     3,014     3,330     3,807
H26        14,060    14,281    13,495    15,591    21,146    23,992
J01         1,833     2,020     2,481     2,655     3,095     3,337
L32         2,820     2,603     2,497     2,184     2,382     2,219
M07        67,938    62,433    58,150    55,138    53,764    52,315
M09         4,467     4,864     4,913     5,039     5,629     5,998
P18         3,919     3,378     2,862     3,308     3,816     3,439
Q11        51,872    45,659    43,145    40,306    43,846    41,156
R02         8,921     8,368     8,394     8,150     8,808     9,161
R03         6,249     6,029     6,032     6,329     7,015     7,987
R09           951       924       940       931     1,142     1,135
Average    33,242    32,487    32,847    33,089    35,722    37,342

Non-elective activity (CIPS)
HRG       1998/99   1999/00   2000/01   2001/02   2002/03   2003/04
A07         1,444     1,048       780       679       679       836
B02           386       392       458       574       608       583
B03           172       110       101        66        34        42
C14         6,648     6,665     6,523     6,134     6,395     6,557
C22         7,639     7,680     6,899     6,773     6,919     6,542
C24         6,652     7,006     7,003     6,936     7,121     7,622
E04         2,267     1,141     1,133       950     1,935     2,074
E12        67,422    58,969    57,130    55,455    63,691    63,900
E15         4,672     4,918     5,066     4,995     8,327    10,488
E35        32,875    34,483    40,355    42,724    47,025    51,843
F73         3,322     3,189     3,105     2,957     3,097     3,217
F74         2,995     2,884     2,847     2,782     2,864     3,070
G01           324       386       302       327       343       298
G13         1,406     1,417     1,497     1,534     1,711     1,824
G14         1,879     1,962     2,196     2,264     2,822     3,157
H02         1,262     1,168     1,154     1,149     1,297     1,157
H04           166       159       182       218       205       255
H23         8,537     8,876    10,066    10,562    11,793    12,911
H24        11,156    11,877    12,887    13,136    13,740    14,527
H26         7,871     7,735     7,435     7,214     7,980     8,076
J01            21        13        30        22        31        14
L32         2,881     2,975     2,652     2,692     3,032     3,355
M07        11,069    10,239    10,225     9,736     9,742     9,889
M09        53,023    55,123    56,158    59,124    63,393    64,311
P18           219       206       231       192       136       192
Q11           158       150       161       150       104       128
R02         1,867     1,589     1,648     1,457     1,608     1,663
R03         1,022       870       874       792       901     1,044
R09           163       168       158       133       182       195
Average     8,259     8,048     8,250     8,335     9,232     9,647

Figure 6.1 Number of elective and non-elective CIPS by HRG, 1998/99 and 2003/04
(a) Low volume elective HRGs
(b) High volume elective HRGs
(c) Low volume non-elective HRGs
(d) High volume non-elective HRGs
[Bar charts of activity by HRG, with separate bars for 1998/99 and 2003/04.]

There are two available measures of the rate of survival for each HRG: in-hospital survival and 30-day survival. Data on survival are provided in Table 6.3. The average survival rate was upwards of 99% among electives and around 96% among non-electives.
Table 6.3 30-day and in-hospital survival rates, by year Elective survival rate 30 days inhospital HRG 1998/99 1999/00 2000/01 2001/02 2002/03 2003/04 1998/99 1999/00 2000/01 2001/02 2002/03 2003/04 A07 B02 B03 C14 C22 C24 E04 E12 E15 E35 F73 F74 G01 G13 G14 H02 H04 H23 H24 H26 J01 L32 M07 M09 P18 Q11 R02 R03 R09 99.83% 99.51% 99.52% 99.94% 99.93% 99.88% 98.07% 72.49% 99.52% 97.32% 99.47% 99.92% 94.52% 98.77% 99.94% 99.22% 99.30% 97.99% 99.96% 99.87% 100.00% 99.65% 99.82% 100.00% 99.77% 99.95% 99.88% 99.40% 99.68% 99.87% 99.57% 99.56% 99.95% 99.92% 99.87% 98.18% 74.59% 99.69% 98.04% 99.53% 99.95% 93.00% 99.13% 99.93% 99.21% 99.34% 98.78% 99.94% 99.88% 99.85% 99.50% 99.83% 99.98% 99.82% 99.95% 99.81% 99.50% 100.00% 99.85% 99.62% 99.59% 99.94% 99.94% 99.88% 98.27% 75.90% 99.59% 97.46% 99.55% 99.95% 84.75% 98.73% 99.92% 99.27% 99.26% 99.34% 99.91% 99.90% 99.96% 99.48% 99.81% 99.96% 99.79% 99.94% 99.81% 99.58% 99.89% 99.84% 99.67% 99.63% 99.93% 99.95% 99.90% 98.21% 72.68% 99.63% 98.96% 99.56% 99.93% 90.59% 99.13% 99.94% 99.31% 99.41% 99.58% 99.83% 99.94% 99.85% 99.50% 99.79% 100.00% 99.79% 99.95% 99.81% 99.60% 99.46% 99.87% 99.66% 99.68% 99.94% 99.92% 99.90% 98.19% 79.67% 99.67% 97.60% 99.61% 99.94% 92.79% 99.15% 99.94% 99.43% 99.36% 99.20% 99.88% 99.95% 99.94% 99.58% 99.81% 99.98% 99.84% 99.95% 99.79% 99.46% 99.65% 99.89% 99.69% 99.77% 99.95% 99.94% 99.91% 98.66% 87.42% 99.65% 98.91% 99.63% 99.95% 94.24% 99.20% 99.95% 99.33% 99.50% 99.33% 99.82% 99.93% 99.94% 99.73% 99.82% 99.98% 99.83% 99.94% 99.82% 99.64% 99.64% 100.00% 100.00% 99.99% 99.99% 99.99% 99.97% 98.28% 73.96% 99.77% 99.33% 99.87% 100.00% 94.52% 99.09% 99.98% 99.55% 99.61% 99.33% 99.96% 99.94% 100.00% 99.89% 99.90% 100.00% 99.82% 100.00% 99.93% 99.60% 99.68% 100.00% 99.99% 100.00% 99.99% 99.99% 99.96% 98.41% 77.70% 99.83% 99.26% 99.87% 99.99% 93.00% 99.37% 99.98% 99.54% 99.64% 99.39% 100.00% 99.96% 99.90% 99.81% 99.89% 100.00% 99.85% 100.00% 99.88% 99.62% 100.00% 100.00% 100.00% 99.97% 100.00% 99.99% 99.96% 
98.52% 77.19% 99.79% 98.44% 99.88% 100.00% 84.75% 99.01% 99.98% 99.53% 99.56% 99.49% 100.00% 99.94% 100.00% 99.84% 99.87% 100.00% 99.82% 100.00% 99.89% 99.75% 100.00% 100.00% 100.00% 100.00% 99.99% 99.99% 99.97% 98.53% 75.26% 99.82% 99.38% 99.86% 99.99% 90.59% 99.26% 99.98% 99.54% 99.62% 99.92% 99.93% 99.99% 99.92% 99.91% 99.86% 100.00% 99.82% 100.00% 99.89% 99.73% 99.67% 100.00% 100.00% 100.00% 100.00% 99.99% 99.97% 98.44% 82.17% 99.82% 99.08% 99.88% 100.00% 92.79% 99.41% 99.98% 99.66% 99.65% 99.66% 99.97% 99.98% 99.94% 99.83% 99.89% 99.98% 99.87% 99.99% 99.90% 99.80% 99.74% 100.00% 100.00% 100.00% 100.00% 100.00% 99.97% 98.86% 88.23% 99.84% 99.27% 99.90% 100.00% 94.24% 99.37% 99.99% 99.59% 99.67% 99.70% 99.95% 99.98% 99.97% 99.91% 99.88% 100.00% 99.94% 100.00% 99.90% 99.75% 99.91% 99.71% 99.74% 99.73% 99.75% 99.75% 99.77% 99.90% 99.91% 99.90% 99.91% 99.91% 99.92% Activity weighted average Non-elective survival rate 30 days inhospital HRG 1998/99 1999/00 2000/01 2001/02 2002/03 2003/04 1998/99 1999/00 2000/01 2001/02 2002/03 2003/04 A07 B02 B03 C14 C22 C24 E04 E12 E15 E35 F73 F74 G01 G13 G14 H02 H04 H23 H24 H26 J01 L32 M07 M09 P18 Q11 R02 R03 R09 98.66% 98.04% 98.88% 99.46% 98.86% 97.59% 93.52% 84.34% 96.52% 97.44% 95.65% 99.77% 84.85% 92.21% 99.31% 80.81% 96.25% 97.31% 99.73% 99.33% 100.00% 97.76% 99.48% 99.99% 99.54% 98.09% 99.11% 93.14% 99.42% 98.53% 98.79% 98.21% 99.26% 98.82% 97.41% 92.79% 84.18% 96.36% 97.29% 95.00% 99.76% 90.82% 90.20% 99.69% 78.26% 94.94% 97.06% 99.72% 99.39% 100.00% 98.22% 99.43% 99.99% 99.52% 100.00% 98.56% 93.52% 98.84% 98.65% 99.16% 98.15% 99.57% 98.83% 97.81% 94.41% 85.16% 97.21% 97.56% 95.82% 99.93% 91.91% 92.07% 99.59% 76.15% 96.65% 97.49% 99.63% 99.33% 96.55% 97.63% 99.44% 99.99% 100.00% 99.38% 99.24% 91.88% 99.39% 99.02% 98.68% 97.10% 99.47% 98.68% 97.41% 93.74% 85.43% 97.48% 97.43% 95.44% 99.86% 90.35% 92.23% 99.47% 78.98% 94.05% 97.44% 99.62% 99.38% 100.00% 98.04% 99.65% 99.99% 100.00% 100.00% 98.66% 91.89% 100.00% 99.01% 98.10% 
100.00% 99.50% 98.89% 97.92% 95.51% 87.74% 97.86% 97.97% 95.55% 99.76% 91.46% 93.52% 99.68% 78.28% 96.14% 97.71% 99.77% 99.57% 100.00% 98.35% 99.63% 100.00% 100.00% 99.04% 98.80% 95.02% 100.00% 99.08% 98.49% 100.00% 99.42% 99.19% 97.97% 96.50% 88.76% 98.09% 98.14% 95.69% 99.77% 92.52% 92.61% 99.78% 82.52% 95.54% 97.68% 99.82% 99.54% 100.00% 98.43% 99.59% 99.99% 97.93% 100.00% 99.32% 94.80% 98.52% 99.40% 98.53% 100.00% 99.71% 99.63% 98.35% 94.11% 85.69% 97.09% 98.62% 96.62% 99.93% 85.15% 93.42% 99.47% 83.70% 98.13% 98.51% 99.87% 99.50% 100.00% 98.51% 99.62% 100.00% 99.54% 99.36% 99.37% 95.86% 99.42% 99.35% 99.51% 100.00% 99.54% 99.51% 98.06% 93.64% 85.71% 96.96% 98.36% 96.32% 99.86% 91.07% 91.26% 99.80% 80.55% 96.84% 98.26% 99.83% 99.52% 100.00% 99.36% 99.56% 100.00% 99.52% 100.00% 99.16% 96.06% 99.42% 99.63% 99.37% 98.15% 99.71% 99.41% 98.42% 94.96% 86.55% 97.69% 98.61% 96.76% 99.93% 92.56% 92.75% 99.73% 80.38% 96.65% 98.40% 99.81% 99.47% 96.55% 98.49% 99.63% 100.00% 100.00% 100.00% 99.48% 95.58% 100.00% 99.30% 99.83% 97.10% 99.66% 99.35% 97.98% 94.22% 86.76% 97.95% 98.42% 96.05% 99.93% 90.35% 93.07% 99.51% 82.16% 94.93% 98.51% 99.82% 99.53% 100.00% 98.81% 99.75% 99.99% 100.00% 100.00% 98.91% 95.33% 100.00% 99.29% 98.89% 100.00% 99.75% 99.51% 98.42% 95.90% 88.77% 98.29% 98.77% 96.46% 99.90% 91.46% 93.99% 99.79% 80.47% 96.62% 98.61% 99.88% 99.67% 100.00% 98.75% 99.71% 100.00% 100.00% 99.04% 99.10% 95.83% 100.00% 99.66% 99.50% 100.00% 99.53% 99.57% 98.50% 96.62% 89.64% 98.47% 98.87% 96.47% 99.94% 92.52% 92.72% 99.81% 85.56% 96.75% 98.63% 99.90% 99.69% 100.00% 99.02% 99.66% 100.00% 97.93% 100.00% 99.54% 96.05% 99.01% 94.52% 94.85% 95.37% 95.54% 96.12% 96.52% 95.26% 95.60% 96.06% 96.19% 96.65% 96.99% Activity weighted average 143 Figure 6.2 shows the change in survival rates for elective CIPS between 1998/99 and 2003/04 for these alternative measures for low and high volume HRGs. Similar information is provided for non-elective CIPS in Figure 6.3. 
In general, survival rates have improved for both elective and non-elective patients. The most dramatic improvement in survival has been for non-elective acute myocardial infarction (E12), where the probability of 30-day survival increased from 85.69% in 1998/99 to 89.64% in 2003/04. As AMI is also a high volume HRG, this improvement would be expected to exert a high degree of leverage on the value of an index that included survival. In-hospital and 30-day survival rates track each other fairly closely, as comparison of figures (a) and (c) and of figures (b) and (d) shows. Consequently, it would not be expected that the index would be particularly sensitive to which measure is adopted.

Figure 6.2 Change in elective survival rates, 1998/99 - 2003/04
(a) In-hospital, low volume elective HRGs
(b) In-hospital, high volume elective HRGs
(c) 30-day, low volume elective HRGs
(d) 30-day, high volume elective HRGs
[Bar charts of the proportionate change in survival by HRG.]

Figure 6.3 Change in non-elective survival rates, 1998/99 - 2003/04
(a) In-hospital, low volume non-elective HRGs
(b) In-hospital, high volume non-elective HRGs
(c) 30-day, low volume non-elective HRGs
(d) 30-day, high volume non-elective HRGs
[Bar charts of the proportionate change in survival by HRG.]

As discussed in Appendix B, there are a variety of ways of summarising how long people have to wait for admission to hospital. In the specimen index we compare two summary measures: the mean waiting time and the waiting time at the 80th percentile of the distribution. Raw data for the waiting time at the mean and at the 80th percentile are provided in Table 6.4. On average, the mean wait fell from 163 days to 134 days over the period, while the wait at the 80th percentile fell from 262 days to 213 days.
145 Table 6.4 Mean and 80% percentile waiting time in days, by year Waiting time HRG A07 B02 B03 C14 C22 C24 E04 E12 E15 E35 F73 F74 G01 G13 G14 H02 H04 H23 H24 H26 J01 L32 M07 M09 P18 Q11 R02 R03 R09 Activity weighted average 1998/99 73 228 236 118 205 144 195 68 75 37 178 174 14 147 162 236 285 58 76 28 110 72 109 13 31 251 110 165 136 1999/00 66 204 209 97 184 126 199 40 72 37 159 151 13 147 157 238 285 59 76 26 109 65 106 12 19 225 107 162 130 163 148 mean wait 2000/01 2001/02 68 68 193 184 197 185 82 75 177 167 109 151 215 189 66 60 83 84 40 51 161 163 146 147 11 13 151 161 163 169 250 253 294 294 56 48 64 66 27 28 101 119 68 69 99 97 11 13 21 40 214 211 120 122 180 184 140 143 144 145 2002/03 73 180 165 81 183 128 154 95 89 77 161 148 19 161 169 247 282 67 68 30 125 79 100 11 22 216 127 179 140 2003/04 77 153 161 90 173 115 106 88 92 84 144 139 23 145 154 225 252 93 77 35 122 70 96 11 18 196 119 171 130 1998/99 100 357 371 185 356 245 350 68 112 37 316 303 14 252 277 374 437 65 96 28 168 86 173 13 31 408 173 304 227 1999/00 92 323 332 143 316 204 350 40 104 37 273 250 24 241 262 383 444 59 91 26 170 86 164 12 19 371 162 286 216 144 134 262 235 80% percentile wait 2000/01 2001/02 96 101 310 301 329 305 117 106 309 288 167 262 373 342 66 60 128 126 40 60 271 280 235 237 17 13 265 286 279 299 402 404 458 445 56 48 72 83 27 28 152 209 91 90 150 147 11 13 26 40 354 352 173 196 318 331 209 234 229 2002/03 106 295 293 118 319 211 256 102 140 83 280 246 19 290 302 378 406 71 70 33 268 122 155 11 23 348 228 330 258 2003/04 112 248 261 142 294 194 175 91 152 94 245 232 23 257 263 340 354 102 79 43 254 95 153 11 27 304 219 301 247 231 213 235 Figure 6.4 shows the change between 1998/99 and 2003/04 in the mean waiting time (Figure 6.4 (a) and (b)) and waiting time at the 80th percentile of the distribution (Figure 6.4 (c) and (d)). Although waiting times increased between 1998/99 and 2003/04 for some HRGs, these tend to be low volume activities. 
For the majority of high volume HRGs, waiting times fell. The net effect of including waiting times in the specimen index would therefore be expected to be an increase in the index over the period. Figures (a) and (c), and figures (b) and (d), are very similar, suggesting that the choice between the mean and the 80th percentile as a summary of waiting time is unlikely to have a dramatic effect on the index.

Figure 6.4 Change in waiting time, 1998/99 - 2003/04
(a) Mean wait, low volume HRGs
(b) Mean wait, high volume HRGs
(c) 80th percentile wait, low volume HRGs
(d) 80th percentile wait, high volume HRGs
[Bar charts of the proportionate change in waiting time by HRG.]

As noted in section 2.7, the use of cost weights presumes an efficient allocation of NHS resources. If efficient allocation cannot be assumed, an alternative basis for establishing the relative value of activity would be the health outcomes each activity produces. The before ($h_j^0$) and after ($h_j^*$) measures of the health effect for each of the HRGs included in the specimen index are provided in Table 6.1, while unit costs, calculated on the basis of CIPS, are presented in Table 6.5. Where relative costs are not proportionate to relative health outcomes, the assumption of efficient allocation is questionable.
147 Table 6.5 Unit costs based on CIPS, by year Elective unit cost HRG A07 B02 B03 C14 C22 C24 E04 E12 E15 E35 F73 F74 G01 G13 G14 H02 H04 H23 H24 H26 J01 L32 M07 M09 P18 Q11 R02 R03 R09 Activity weighted average Non-elective unit cost 1998/99 £384 £628 £661 £465 £717 £639 £5,024 £995 £2,385 £812 £898 £667 £11,595 £1,756 £1,304 £3,965 £4,454 £713 £663 £997 £3,027 £481 £1,839 £282 £641 £676 £2,311 £3,407 £2,497 1999/00 £384 £628 £661 £465 £718 £641 £5,121 £1,224 £2,388 £838 £901 £667 £11,839 £1,780 £1,306 £3,993 £4,471 £709 £660 £1,003 £3,032 £491 £1,845 £282 £644 £677 £2,332 £3,403 £2,498 2000/01 £395 £624 £637 £491 £741 £679 £5,654 £1,588 £2,421 £803 £991 £731 £14,149 £1,857 £1,358 £4,284 £4,661 £734 £625 £1,051 £3,346 £526 £1,912 £305 £732 £727 £2,569 £3,727 £3,061 2001/02 £422 £673 £737 £550 £849 £769 £6,480 £1,580 £2,457 £1,086 £1,093 £810 £18,505 £2,055 £1,500 £4,442 £4,859 £861 £614 £978 £3,735 £628 £2,109 £349 £784 £837 £2,706 £3,893 £3,178 2002/03 £466 £682 £734 £574 £914 £807 £6,507 £1,745 £2,815 £992 £1,148 £873 £18,961 £2,117 £1,564 £4,763 £5,294 £851 £639 £907 £3,965 £653 £2,298 £405 £783 £894 £2,899 £4,184 £3,641 2003/04 £466 £682 £736 £574 £913 £806 £6,388 £1,687 £2,820 £942 £1,149 £872 £19,020 £2,100 £1,560 £4,758 £5,278 £853 £635 £902 £3,971 £646 £2,294 £405 £783 £894 £2,896 £4,189 £3,649 1998/99 £1,705 £1,094 £1,299 £778 £979 £948 £5,264 £1,182 £2,587 £855 £1,496 £1,042 £14,569 £3,073 £1,947 £4,039 £4,655 £910 £654 £1,299 £2,947 £1,136 £1,858 £325 £778 £1,373 £2,854 £4,813 £3,007 1999/00 £1,746 £1,096 £1,285 £785 £988 £984 £5,426 £1,352 £2,622 £912 £1,521 £1,047 £14,612 £3,202 £1,961 £4,172 £4,772 £966 £667 £1,352 £3,057 £1,217 £1,865 £325 £774 £1,397 £2,909 £4,995 £3,075 2000/01 £747 £1,116 £1,294 £850 £1,171 £1,146 £5,794 £1,484 £2,824 £915 £1,795 £1,126 £18,696 £3,310 £2,081 £4,741 £4,408 £981 £643 £1,448 £3,480 £1,303 £1,818 £351 £589 £1,319 £3,269 £5,200 £3,122 2001/02 £993 £1,087 £1,580 £898 £1,206 £1,256 £6,401 £1,688 £3,040 £970 £1,925 
£1,327 £20,229 £3,894 £2,385 £5,191 £4,084 £961 £618 £1,521 £3,218 £1,465 £2,064 £400 £2,664 £1,487 £3,649 £5,764 £3,816 2002/03 £1,050 £994 £1,112 £943 £1,303 £1,251 £6,973 £1,546 £3,241 £893 £1,919 £1,368 £23,179 £3,802 £2,465 £5,658 £5,477 £947 £614 £1,488 £3,340 £1,303 £2,234 £399 £1,562 £1,234 £3,774 £5,961 £4,476 2003/04 £966 £1,000 £1,138 £930 £1,307 £1,233 £6,914 £1,625 £3,227 £868 £1,922 £1,360 £23,034 £3,750 £2,439 £5,669 £5,394 £936 £609 £1,473 £3,340 £1,285 £2,228 £399 £1,451 £1,237 £3,767 £5,954 £4,406 £1,069 £1,078 £1,150 £1,261 £1,355 £1,379 £1,077 £1,103 £1,159 £1,251 £1,274 £1,290 Figure 6.5 and Figure 6.6 show the relative weight given to each HRG if relative values are based on costs or health outcomes, for elective and non-elective HRGs respectively. The figures are sub-divided to show low and high volume HRGs separately and with cost weights calculated on 1998/99 Reference Costs and 2003/04 Reference Costs. The health outcome weights are time invariant. Many of the high volume HRGs appear relatively more “valuable” if value is based on health outcome rather than cost (e.g. C24, C22, H26, M09), with the stark exception of G01 (liver transplantation). All else equal, if a greater proportion of these activities were undertaken in 2003/04 compared to 1998/99, an index in which activity is valued according to health outcome would suggest greater output growth than an index where relative values are based on costs. 
Figure 6.5 Cost and health outcome weights, elective HRGs
(a) 1998/99 costs, low volume HRGs
(b) 2003/04 costs, low volume HRGs
(c) 1998/99 costs, high volume HRGs
(d) 2003/04 costs, high volume HRGs
[Bar charts comparing the cost share and the health outcome share of each HRG.]

Figure 6.6 Cost and health outcome weights, non-elective HRGs
(a) 1998/99 costs, low volume HRGs
(b) 2003/04 costs, low volume HRGs
(c) 1998/99 costs, high volume HRGs
(d) 2003/04 costs, high volume HRGs
[Bar charts comparing the cost share and the health outcome share of each HRG.]

In the following sections we compare estimates of output growth under various specifications of the specimen index, with CIPS measuring volume. All estimates are based on a Laspeyres index.

6.3 Cost weighted output indices

Column (i) of Table 6.6 contains estimates of output change for a cost weighted output index (CWOI) corresponding to equation (19), of the form:

$$I^{x}_{ct} = \frac{\sum_j x_{jt+1}\, c_{jt}}{\sum_j x_{jt}\, c_{jt}}$$

Table 6.6 Cost weighted output index, with adjustments for survival and health effects

                     CWOI     CWOI survival adjustment   CWOI health adjustment (30-day survival)
                              In-hospital   30-day       No threshold   Threshold <0.90   k=0.8
                     (i)      (ii)          (iii)        (iv)           (v)               (vi)
1998/99 - 1999/00    -1.19%   -1.18%        -1.17%       -1.33%         -1.20%            -1.19%
1999/00 - 2000/01     2.79%    2.88%         2.90%        2.66%          2.98%             3.16%
2000/01 - 2001/02     2.18%    2.20%         2.23%        1.78%          2.29%             2.57%
2001/02 - 2002/03     9.14%    9.35%         9.39%        4.76%          9.55%            12.37%
2002/03 - 2003/04     6.30%    6.42%         6.46%        6.37%          6.67%             7.49%
Average               3.84%    3.93%         3.96%        2.85%          4.06%             4.88%

This unadjusted CWOI suggests an average annual growth in output of 3.84%. There is annual variation in the estimated amount of growth. In particular, there is a large increase in the index of 9.14% between 2001/02 and 2002/03. This is driven by an increase in activity rather than a change in costs. The average (unweighted for volume or cost) increase in activity between 2001/02 and 2002/03 was 12%. Figure 6.7 below shows the number of elective and non-elective CIPS in each year for high volume HRGs.
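As a rough illustration of how the Laspeyres form of equation (19) is evaluated, the following Python sketch computes a cost weighted index for just two elective HRGs, using 1998/99 and 1999/00 activity from Table 6.2 and 1998/99 unit costs from Table 6.5. The function name `cwoi` and the two-HRG subset are assumptions made purely for illustration, so the result is not comparable with the full specimen index reported in Table 6.6.

```python
def cwoi(x_t, x_t1, c_t):
    """Laspeyres cost weighted output index.

    Computes sum_j x_{j,t+1} * c_{jt} / sum_j x_{jt} * c_{jt}, i.e. activity in
    both years is valued at base-year (period t) unit costs.
    """
    numerator = sum(x_t1[j] * c_t[j] for j in c_t)
    denominator = sum(x_t[j] * c_t[j] for j in c_t)
    return numerator / denominator


# Illustrative subset only: elective A07 and B02.
activity_1998 = {"A07": 85_645, "B02": 155_929}   # CIPS, 1998/99 (Table 6.2)
activity_1999 = {"A07": 86_770, "B02": 177_551}   # CIPS, 1999/00 (Table 6.2)
unit_cost_1998 = {"A07": 384, "B02": 628}          # GBP, 1998/99 (Table 6.5)

index = cwoi(activity_1998, activity_1999, unit_cost_1998)
print(f"two-HRG CWOI: {index:.4f} (growth {100 * (index - 1):.2f}%)")
```

Because base-year costs appear in both numerator and denominator, only changes in activity move the index, which is why the large 2001/02 to 2002/03 figure in the text can be attributed to activity rather than to costs.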
Figure 6.7 Activity change 2001/02 - 2002/03, high volume HRGs
[Bar chart of the number of CIPS in 2001/02 and 2002/03 for each high volume HRG.]

Columns (ii) and (iii) in Table 6.6 present results for a survival adjusted cost weighted output index, of the form presented in equation (40):

$$I^{xa}_{ct} = \frac{\sum_j x_{jt+1}\,(a_{jt+1}/a_{jt})\, c_{jt}}{\sum_j x_{jt}\, c_{jt}}$$

Compared to the unadjusted cost weighted output index, there is a slight increase in estimated output growth when survival is included in the index, reflecting the general improvement in survival over the period. The increase is slight, however, because survival rates are already high for these HRGs (around 97%). Inclusion of the survival effect increases estimated average annual output growth by 0.09 percentage points if in-hospital survival is considered and by 0.12 percentage points based on 30-day survival rates. Given the high correlation between these measures, their similar influence on the index is unsurprising. Subsequent estimations employ 30-day survival as the measure of $a_j$.

The figures in columns (iv) and (v) of Table 6.6 show the estimates including adjustment for before and after health status, corresponding to equation (48):

$$I^{xq}_{ct} = \frac{\sum_j x_{jt+1}\, \dfrac{a_{jt+1} h_j^* - h_j^0}{a_{jt} h_j^* - h_j^0}\, c_{jt}}{\sum_j x_{jt}\, c_{jt}}$$

As discussed in section 4.8.2, in estimating this equation it is necessary to introduce an arbitrary threshold for HRGs with poor survival rates. Failure to make this adjustment makes the index disproportionately sensitive to changes in $a_j$ for activities with small or negative $(a_{jt} h_j^* - h_j^0)$ or $(a_{jt+1} h_j^* - h_j^0)$. If survival rates are below a particular threshold, only the change in survival is taken into account. To illustrate, we estimate the equation without a threshold and with a threshold at the 90% survival rate.
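The threshold rule can be sketched in Python as a per-HRG adjustment factor that multiplies $x_{jt+1} c_{jt}$. This is a minimal sketch under stated assumptions: the function name `quality_factor` is hypothetical, and the choice to apply the threshold when survival in either year falls below it is an assumption, since the text does not say which year is tested. The example uses the E12 health outcomes from Table 6.1 and the two E12 survival rates quoted in the text (85.69% and 89.64%), purely for illustration.

```python
def quality_factor(a_t, a_t1, h0, h_star, threshold=None):
    """Adjustment factor applied to x_{j,t+1} * c_{jt} for one HRG.

    Without a threshold this is the equation (48) ratio
    (a_{t+1} h* - h0) / (a_t h* - h0). If a threshold is supplied and survival
    in either year is below it (an assumption of this sketch), only the change
    in survival, a_{t+1} / a_t, is used, as in equation (40).
    """
    if threshold is not None and min(a_t, a_t1) < threshold:
        return a_t1 / a_t
    return (a_t1 * h_star - h0) / (a_t * h_star - h0)


# E12: h0 = 0.68 and h* = 0.72 (Table 6.1); survival rates as quoted in the text.
no_threshold = quality_factor(0.8569, 0.8964, h0=0.68, h_star=0.72)
with_threshold = quality_factor(0.8569, 0.8964, h0=0.68, h_star=0.72, threshold=0.90)

print(f"no threshold:   {no_threshold:.4f}")    # below 1 although survival improved
print(f"with threshold: {with_threshold:.4f}")  # above 1, tracks survival only
```

With these inputs both $(a h^* - h^0)$ terms are negative, so the unthresholded ratio falls below one even though survival has risen, which is the distortion the threshold is meant to remove.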
As can be seen, omitting the threshold has a dramatic effect on the estimates, particularly in 2001/02-2002/03, where output growth was estimated as greater than 9% but now appears much lower (4.76%). The divergence stems predominantly (but not exclusively) from non-elective E12, which has a poor survival rate (85.69% in 1998/99). The influence of this HRG is felt particularly in the change between 2001/02 and 2002/03, when activity increased from 55,455 to 63,691. The term $(a_{jt+1} h_j^* - h_j^0)$ takes a negative value for E12, so the adjustment is the ratio of two negative numbers and registers an increase in $(a h^* - h^0)$ as a reduction. This pulls the index down dramatically in years where there was growth in this activity (2001/02-2002/03) and up in years where this activity declined (e.g. 1999/00-2000/01).

Hence, for HRGs with a survival rate below 90%, the before-and-after health adjustment is not taken into account. This threshold applies in most years to E12 (AMI) and H02 (non-elective primary hip replacement) and in occasional years to G01 (liver transplantation). Column (v) in Table 6.6 shows the estimates when the threshold is included. Inclusion of the health effects leads to an average annual increase in the estimates of output growth of 0.1 percentage points compared to the survival adjusted CWOI.

The final set of figures (column (vi)) in Table 6.6 assumes that health effects are constant across treatments, i.e. k=0.8. As can be seen, this makes a dramatic difference to the estimates of output growth, which change from an average of 4.06% when k varies by HRG to 4.88% when k is held constant. This sensitivity reflects both the difference in the average value of k for the HRGs included in the specimen index ($\bar{k}_j$ = 0.825) and, more particularly, the substantial variation in $k_j$ (standard deviation = 0.14).
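The summary statistics quoted for $k_j$ can be approximately reproduced from Table 6.1. The sketch below assumes that $k_j$ is the ratio of before to after health status, $h_j^0 / h_j^*$, which is the ratio that appears in equation (106); this section does not restate the definition, so treat that as an inference. The use of a population rather than sample standard deviation is likewise an assumption.

```python
from statistics import mean, pstdev

# (h0_j, h*_j) pairs transcribed from Table 6.1.
HEALTH = {
    "A07": (0.41, 0.57), "B02": (0.73, 0.76), "B03": (0.70, 0.72),
    "C14": (0.87, 0.95), "C22": (0.83, 0.91), "C24": (0.77, 0.93),
    "E04": (0.50, 0.73), "E12": (0.68, 0.72), "E15": (0.54, 0.79),
    "E35": (0.63, 0.69), "F73": (0.64, 0.69), "F74": (0.74, 0.81),
    "G01": (0.53, 0.59), "G13": (0.63, 0.66), "G14": (0.68, 0.81),
    "H02": (0.37, 0.62), "H04": (0.35, 0.54), "H23": (0.77, 0.84),
    "H24": (0.72, 0.74), "H26": (0.41, 0.53), "J01": (0.93, 0.96),
    "L32": (0.81, 0.85), "M07": (0.70, 0.80), "M09": (0.72, 0.83),
    "P18": (0.36, 0.41), "Q11": (0.77, 1.00), "R02": (0.37, 0.67),
    "R03": (0.36, 0.62), "R09": (0.32, 0.60),
}

# Assumed definition: k_j = h0_j / h*_j.
k = {hrg: h0 / h_star for hrg, (h0, h_star) in HEALTH.items()}

k_mean = mean(list(k.values()))
k_sd = pstdev(list(k.values()))  # population SD; the report does not say which was used
print(f"mean k = {k_mean:.3f}, sd = {k_sd:.3f}")
```

On this assumption the mean and spread come out close to the reported 0.825 and 0.14; small differences are to be expected because Table 6.1 is rounded to two decimal places.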
Of course, it is not possible to speculate about the direction of estimated output change from relaxing the assumption of a constant value of k when applied across the full range of NHS activities. However, this analysis does suggest the impact might be of substantial magnitude.

Table 6.7 presents estimates for a cost weighted output index where waiting times and life expectancy are taken into account, with waiting time discounted to the date placed on the list, as described in section 4.10.2:

$$
I^{xaw}_{ct} = \frac{\displaystyle\sum_j x_{jt+1}\, c_{jt} \left\{ \frac{ h_j^0 \left[ \dfrac{1 - e^{-r_w w_{jt+1}}}{r_w} - \dfrac{1 - e^{-r_L w_{jt+1}}}{r_L} \right] + \left( a_{jt+1} h_j^* - h_j^0 \right) \dfrac{ e^{-r_L w_{jt+1}} \left( 1 - e^{-r_L L_{jt+1}} \right) }{r_L} }{ h_j^0 \left[ \dfrac{1 - e^{-r_w w_{jt}}}{r_w} - \dfrac{1 - e^{-r_L w_{jt}}}{r_L} \right] + \left( a_{jt} h_j^* - h_j^0 \right) \dfrac{ e^{-r_L w_{jt}} \left( 1 - e^{-r_L L_{jt}} \right) }{r_L} } \right\} }{ \displaystyle\sum_j x_{jt}\, c_{jt} } \qquad (105)
$$

Table 6.7 reports the sensitivity of results to:
• Mean and 80th percentile waiting times
• Discounting life expectancy at 1.5% or 5%
• Discounting waiting time at 1.5%, 5% or 10%

The choice between the mean wait (top half of Table 6.7) and the wait experienced at the 80th percentile (bottom half of Table 6.7) makes little difference to the estimates, unsurprisingly given their close correlation. Use of the 80th percentile generates slightly higher estimates of output growth, reflecting the policy concentration on reducing long waits during this period. In subsequent estimations the 80th percentile wait is chosen to measure waiting time. As can be seen, the choice of a discount rate of 10% applied to waiting time has a significant effect on the estimates.
Table 6.7 Cost weighted output index, with adjustment for waiting times discounted to date placed on list

CWOI with health, waiting time and life expectancy adjustment; waiting discounted to date placed on list

Mean waiting time
Discount rate, life expectancy   1.5%     1.5%     1.5%     5%       5%       5%
Discount rate, waiting time      1.5%     5%       10%      1.5%     5%       10%
1998/99 - 1999/00                -1.36%   -1.35%   -1.35%   -1.20%   -1.19%   -1.18%
1999/00 - 2000/01                 2.40%    2.40%    2.39%    2.65%    2.65%    2.64%
2000/01 - 2001/02                 1.96%    1.97%    1.97%    2.03%    2.03%    2.04%
2001/02 - 2002/03                 9.88%    9.89%    9.90%    9.99%    9.99%   10.00%
2002/03 - 2003/04                 6.58%    6.59%    6.61%    6.75%    6.77%    6.79%
Average                           3.89%    3.90%    3.90%    4.04%    4.05%    4.06%

80th percentile waiting time
Discount rate, life expectancy   1.5%     1.5%     1.5%     5%       5%       5%
Discount rate, waiting time      1.5%     5%       10%      1.5%     5%       10%
1998/99 - 1999/00                -1.34%   -1.32%   -1.30%   -1.15%   -1.12%   -1.08%
1999/00 - 2000/01                 2.40%    2.39%    2.39%    2.65%    2.65%    2.65%
2000/01 - 2001/02                 1.96%    1.96%    1.97%    2.01%    2.01%    2.02%
2001/02 - 2002/03                 9.92%    9.94%    9.97%   10.08%   10.11%   10.15%
2002/03 - 2003/04                 6.62%    6.65%    6.70%    6.87%    6.92%    6.98%
Average                           3.91%    3.93%    3.95%    4.09%    4.11%    4.14%

An alternative approach to considering waiting times is to discount to the date of treatment and include a charge for waiting, as discussed in section 4.10.2.2, so that the index becomes:

$$
I^{xaw}_{ct} = \frac{\displaystyle\sum_j x_{jt+1}\, c_{jt} \left( \frac{a_{jt+1} - h_j^0 / h_j^*}{a_{jt} - h_j^0 / h_j^*} \right) \frac{ \left[ \dfrac{1 - e^{-r_L L_{jt+1}}}{r_L} - \dfrac{e^{r_w w_{jt+1}} - 1}{r_w} \right] }{ \left[ \dfrac{1 - e^{-r_L L_{jt}}}{r_L} - \dfrac{e^{r_w w_{jt}} - 1}{r_w} \right] } }{ \displaystyle\sum_j x_{jt}\, c_{jt} } \qquad (106)
$$

We estimate this formulation at different discount rates, with results presented in Table 6.8. This generates higher estimates of output growth than the formulation in which waits were discounted to the date placed on the list.
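To make the structure of equation (106) more concrete, the following Python sketch evaluates the per-HRG factor that multiplies $x_{jt+1} c_{jt}$. It is a sketch under several assumptions: the function name `waiting_adjusted_factor` is hypothetical, life expectancy and waiting time are both expressed in years so that annual discount rates apply, and the survival rate and life expectancy in the example are placeholder values rather than data from the report. Only the ratio $h_j^0/h_j^*$ for H04 (Table 6.1) and the H04 80th percentile waits for 1998/99 and 2003/04 (Table 6.4) are taken from the text.

```python
import math


def waiting_adjusted_factor(a_t, a_t1, k, life_t, life_t1, wait_t, wait_t1,
                            r_life=0.015, r_wait=0.015):
    """Per-HRG factor multiplying x_{j,t+1} * c_{jt} in equation (106).

    k is h0_j / h*_j. Life expectancy and waits are in years (an assumption of
    this sketch). Both discount rates must be non-zero.
    """
    def net_value(life, wait):
        # Discounted life years following treatment.
        discounted_life = (1 - math.exp(-r_life * life)) / r_life
        # Charge for waiting, compounded forward to the date of treatment.
        waiting_charge = (math.exp(r_wait * wait) - 1) / r_wait
        return discounted_life - waiting_charge

    survival_term = (a_t1 - k) / (a_t - k)
    return survival_term * net_value(life_t1, wait_t1) / net_value(life_t, wait_t)


# H04: k from Table 6.1, waits (days) from Table 6.4 converted to years.
# Survival of 0.99 and life expectancy of 15 years are hypothetical placeholders.
factor = waiting_adjusted_factor(
    a_t=0.99, a_t1=0.99, k=0.35 / 0.54,
    life_t=15.0, life_t1=15.0,
    wait_t=437 / 365, wait_t1=354 / 365,
)
print(f"adjustment factor: {factor:.4f}")
```

With survival and life expectancy held fixed, a shorter wait lowers the waiting charge and so raises the factor above one, which is the direction of effect the text describes for falling waiting times.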
Table 6.8 Cost weighted output index, with adjustment for waiting times discounted to date of treatment

80th percentile waiting time. CWOI with health, waiting time and life expectancy adjustment; waiting discounted to date of treatment, with charge for waiting

                                 k=0.8
                                 (i)      (ii)     (iii)    (iv)     (v)      (vi)     (vii)    (viii)
Discount rate, life expectancy   1.5%     1.5%     1.5%     1.5%     5%       5%       5%       1.5%
Discount rate, waiting time      1.5%     1.5%     5%       10%      1.5%     5%       10%      0%
1998/99 - 1999/00                -1.17%   -1.18%   -1.17%   -1.16%   -0.99%   -0.98%   -0.97%   -1.39%
1999/00 - 2000/01                 2.50%    2.32%    2.32%    2.31%    2.58%    2.57%    2.57%    2.40%
2000/01 - 2001/02                 2.33%    2.05%    2.05%    2.06%    2.11%    2.11%    2.12%    2.00%
2001/02 - 2002/03                12.46%    9.65%    9.67%    9.69%    9.81%    9.83%    9.86%    9.27%
2002/03 - 2003/04                 8.10%    7.29%    7.32%    7.35%    7.52%    7.55%    7.59%    6.54%
Average                           4.85%    4.03%    4.04%    4.05%    4.20%    4.22%    4.23%    3.76%

Column (viii) has the results of setting $r_w = 0$ so that there is no adjustment for waiting, only for life expectancy as in section 4.8.3. Compared to the CWOI with only a health effects adjustment (Table 6.6, column (v)), the average growth rate is reduced by about 0.3 percentage points.

6.4 Health outcome weighted output indices

This section presents the results from calculation of health outcome weighted output indices (HOWOI). For comparison with the CWOI, we first estimate a version of the HOWOI in which the impact of treatment on life expectancy is ignored. In effect, this amounts to comparing the use of costs with the use of survival-adjusted before and after health outcomes (not QALYs) as weights:

$$
\frac{\sum_j x_{jt+1} \left( a_{jt+1} h_j^* - h_j^0 \right)}{\sum_j x_{jt} \left( a_{jt} h_j^* - h_j^0 \right)} \qquad (107)
$$

This equation is estimated both with k varying by HRG (column (ii) of Table 6.9) and for a value of k=0.8 (column (iii) of Table 6.9). As can be seen, output growth appears lower in this formulation of a HOWOI than in the corresponding cost weighted output index (figures from Table 6.6 reproduced in column (i) of Table 6.9).
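Equation (107) can be sketched in Python in the same way as a cost weighted index, with the survival-adjusted health gain $(a_{jt} h_j^* - h_j^0)$ taking the place of unit cost. The function name `howoi` is hypothetical, and the example covers only two elective HRGs (A07 and B02), using activity from Table 6.2, health outcomes from Table 6.1 and the 1998/99 and 1999/00 survival rates as printed in the 30-day block of Table 6.3. It is an illustration of the mechanics, not a reproduction of Table 6.9.

```python
def howoi(x_t, x_t1, a_t, a_t1, h0, h_star):
    """Health outcome weighted output index, equation (107).

    Each unit of activity is weighted by a * h_star - h0, the survival-adjusted
    gain in health status, with survival taken from the same year as activity.
    """
    num = sum(x_t1[j] * (a_t1[j] * h_star[j] - h0[j]) for j in x_t)
    den = sum(x_t[j] * (a_t[j] * h_star[j] - h0[j]) for j in x_t)
    return num / den


# Illustrative subset only: elective A07 and B02.
x_1998 = {"A07": 85_645, "B02": 155_929}
x_1999 = {"A07": 86_770, "B02": 177_551}
a_1998 = {"A07": 0.9983, "B02": 0.9951}
a_1999 = {"A07": 0.9987, "B02": 0.9957}
h0 = {"A07": 0.41, "B02": 0.73}
h_star = {"A07": 0.57, "B02": 0.76}

index = howoi(x_1998, x_1999, a_1998, a_1999, h0, h_star)
print(f"two-HRG HOWOI: {index:.4f}")
```

In this subset B02 has a small health gain per spell ($h^* - h^0 = 0.03$), so its rapid activity growth carries little weight in the index.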
As can be seen, replacing cost weights with health outcome weights leads to a reduction in estimated output growth for this group of HRGs. The extent to which an index is sensitive to the choice between cost and health outcome weights depends on three factors:
• Whether cost weights are disproportionate to health outcome weights;
• The volume of activity in those HRGs where the relative weights are most disproportionate;
• The change in activity over time in those HRGs where the relative weights are most disproportionate.

For a handful of HRGs, cost weights are greater than health outcome weights. This is particularly evident for G01 liver transplants, which are costly (the non-elective cost was £23,000 in 2003/04) but whose estimated contribution to health outcome is about average for the sample of HRGs considered here. However, because this is a low volume HRG and there is little change in the amount of activity over time, changing the valuation basis for G01 exerts little influence on the overall index.

In contrast, elective activity categorised to A07 (intermediate pain procedures) contributes 7.4% of total 2003/04 activity, and elective activity in this HRG grew by 17.5% over the period captured by the index. Its cost share in 2003/04 is only 0.6% whereas its health outcome share is 4.43%. All else equal, the growth in activity in this high volume HRG would lead to an index based on health outcome shares having a higher value than one based on cost shares.

There is little difference between the cost and health outcome weights for B02 (cataract extractions), but they are accorded slightly less weight (-0.06%) when relative values are based on health outcomes. However, despite this minimal difference, B02 exerts considerable influence on the overall index, contributing 20.7% of total volume in 2003/04. There has also been a volume increase of 81% in this activity over the period captured by the index.
Thus, this HRG exerts a downward influence on the index.

Table 6.9 Health outcome weighted output index

                                 CWOI     HOWOI    HOWOI    HOWOI    HOWOI
                                          No LE    No LE    With LE  With LE
                                                   k=0.8
Discount rate, life expectancy                              1.50%    5%
                                 (i)      (ii)     (iii)    (iv)     (v)
1998/99 - 1999/00                -1.19%   -2.96%   -1.88%   -4.75%   -4.06%
1999/00 - 2000/01                2.79%    0.16%    1.87%    -3.78%   -2.20%
2000/01 - 2001/02                2.18%    1.41%    1.04%    0.27%    0.48%
2001/02 - 2002/03                9.14%    10.17%   9.11%    8.43%    8.85%
2002/03 - 2003/04                6.30%    4.62%    5.07%    1.95%    2.73%
Average                          3.84%    2.68%    3.04%    0.42%    1.16%

The previous adjustment makes the assumption that the health status snapshots $h^*_j$, $h^o_j$ measure the discounted sum of QALYs $q^*_j$, $q^o_j$. More properly, health outcome weights should incorporate the effect of treatment on life expectancy, so that they more nearly measure the discounted sum of QALYs:

$$
\frac{\sum_j x_{jt+1}\,(a_{jt+1} h^*_j - h^o_j)\,(1 - e^{-r_L L_{jt+1}})}{\sum_j x_{jt}\,(a_{jt} h^*_j - h^o_j)\,(1 - e^{-r_L L_{jt}})} \tag{108}
$$

Estimates from this index are presented in columns (iv) and (v) of Table 6.9, with life expectancy discounted at 1.5% and 5%. The impact of including life expectancy is a substantial reduction in estimated output growth. The reason is that life expectancy declined gradually over the period, probably because increasingly older people were receiving treatment. Table 6.10 provides evidence.
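The size of this life expectancy effect can be gauged with a short Python calculation of the factor $(1 - e^{-r_L L})$ from equation (108). The sketch uses only the first and last average life expectancy values reported in Table 6.10 (25.83 years in 1998/99 and 22.98 years in 2003/04); applying sample-wide averages instead of HRG-level values is a simplifying assumption made here for illustration, and the function names are hypothetical.

```python
import math


def le_factor(life_exp, r_life):
    """Life expectancy factor of equation (108): 1 - exp(-r_L * L)."""
    return 1.0 - math.exp(-r_life * life_exp)


def le_ratio(r_life, life_exp_start=25.83, life_exp_end=22.98):
    """Ratio of the factor at the end of the period to the factor at the start.

    Defaults are the 1998/99 and 2003/04 average life expectancies from Table 6.10.
    """
    return le_factor(life_exp_end, r_life) / le_factor(life_exp_start, r_life)


for rate in (0.015, 0.05):
    print(f"discount rate {rate:.3f}: factor ratio {le_ratio(rate):.4f}")
```

A ratio below one means the weight attached to a given health gain falls as life expectancy falls. The ratio is closer to one at the 5% discount rate than at 1.5%, which is in line with the smaller reduction in growth shown in column (v) of Table 6.9 compared with column (iv).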
Table 6.10 Average age and life expectancy

            Age      Life expectancy
1998/99     45.71    25.83
1999/00     45.76    25.34
2000/01     46.17    24.12
2001/02     46.3     23.91
2002/03     46.93    23.59
2003/04     47.9     22.98

The health outcome weighted output index is also estimated after incorporating waiting times discounted to date placed on list:

$$
\frac{\displaystyle\sum_j x_{jt+1} \left\{ h^o_j \left[ \frac{1 - e^{-r_w w_{jt+1}}}{r_w} - \frac{1 - e^{-r_L w_{jt+1}}}{r_L} \right] + (a_{jt+1} h^*_j - h^o_j) \left[ \frac{e^{-r_L w_{jt+1}} \left(1 - e^{-r_L L_{jt+1}}\right)}{r_L} \right] \right\}}{\displaystyle\sum_j x_{jt} \left\{ h^o_j \left[ \frac{1 - e^{-r_w w_{jt}}}{r_w} - \frac{1 - e^{-r_L w_{jt}}}{r_L} \right] + (a_{jt} h^*_j - h^o_j) \left[ \frac{e^{-r_L w_{jt}} \left(1 - e^{-r_L L_{jt}}\right)}{r_L} \right] \right\}} \tag{109}
$$

and discounted to date of treatment with a charge for waiting:

$$
\frac{\displaystyle\sum_j x_{jt+1}\,(a_{jt+1} h^*_j - h^o_j) \left[ \frac{1 - e^{-r_L L_{jt+1}}}{r_L} - \frac{e^{r_w w_{jt+1}} - 1}{r_w} \right]}{\displaystyle\sum_j x_{jt}\,(a_{jt} h^*_j - h^o_j) \left[ \frac{1 - e^{-r_L L_{jt}}}{r_L} - \frac{e^{r_w w_{jt}} - 1}{r_w} \right]} \tag{110}
$$

The sensitivity of these two variants of the HOWOI is assessed with respect to:
• Discounting life expectancy at 1.5% or 5%
• Discounting waiting time at 1.5%, 5% or 10%

Results are provided in Table 6.11. When waiting time is discounted to date placed on list, estimates of output growth are slightly higher than those when no waiting time adjustment is made, with the difference increasing at higher discount rates. Compared to discounting to date placed on list, discounting to date of treatment results in lower estimates of output growth, decreasing at higher discount rates.
Table 6.11 HOWOI, adjusted for waiting time

HOWOI with health, waiting time and life expectancy adjustment; waiting discounted to date placed on list; 80th percentile waiting time
Discount rate, life expectancy   1.50%    1.50%    1.50%    5%       5%       5%
Discount rate, waiting time      1.50%    5%       10%      1.50%    5%       10%
1998/99 - 1999/00                -4.64%   -4.60%   -4.54%   -3.79%   -3.71%   -3.61%
1999/00 - 2000/01                -3.74%   -3.74%   -3.73%   -2.08%   -2.07%   -2.06%
2000/01 - 2001/02                0.20%    0.19%    0.18%    0.29%    0.28%    0.26%
2001/02 - 2002/03                8.47%    8.48%    8.50%    8.98%    9.00%    9.03%
2002/03 - 2003/04                2.00%    2.03%    2.08%    2.86%    2.91%    2.99%
Average                          0.46%    0.47%    0.50%    1.25%    1.28%    1.32%

HOWOI with health, waiting time and life expectancy adjustment; waiting discounted to date of treatment, with charge for waiting; 80th percentile waiting time
Discount rate, life expectancy   1.50%    1.50%    1.50%    5%       5%       5%
Discount rate, waiting time      1.50%    5%       10%      1.50%    5%       10%
1998/99 - 1999/00                -4.52%   -4.52%   -4.51%   -3.66%   -3.65%   -3.63%
1999/00 - 2000/01                -3.81%   -3.81%   -3.82%   -2.18%   -2.18%   -2.18%
2000/01 - 2001/02                0.09%    0.09%    0.08%    0.21%    0.20%    0.19%
2001/02 - 2002/03                8.55%    8.55%    8.56%    9.06%    9.06%    9.07%
2002/03 - 2003/04                2.11%    2.11%    2.12%    3.01%    3.02%    3.04%
Average                          0.48%    0.48%    0.49%    1.29%    1.29%    1.30%

6.5 Value weighted output index

Our “ideal” index takes the form specified in equation (12), in which activities are valued according to their associated health outcomes and waiting times are considered a characteristic of health care. The index takes the form:

$$
I^{xq}_{yt} = \frac{\displaystyle\sum_j x_{jt+1} \left[ (a_{jt+1} h^*_j - h^o_j)\left(1 - e^{-r_L L_{jt+1}}\right)\pi_h / r_L - w_{jt+1}\,\pi_W \right]}{\displaystyle\sum_j x_{jt} \left[ (a_{jt} h^*_j - h^o_j)\left(1 - e^{-r_L L_{jt}}\right)\pi_h / r_L - w_{jt}\,\pi_W \right]} \tag{111}
$$

We estimate this index assuming that the monetary value of a QALY ($\pi_h$) in 2002/3 is £30,000 and applying growth rates in money GDP to calculate values for earlier years.
We explore the sensitivity of results to:
• the cost of a day spent waiting ($\pi_W$): either £3.13 or £50 in 2002/3 (adjusted by money GDP growth in earlier years)
• discounting life expectancy at 1.5% and 5%

Estimates of output growth are presented in Table 6.12. These imply lower rates of output growth than a CWOI for these HRGs, mainly because of the influence of treating an increasingly older population (leading to decreasing life expectancy). The effect of applying a higher value to the cost of a day spent waiting is to increase estimated output growth, but not substantially.

Table 6.12 Value weighted output index

Value weighted output index
Cost of day spent waiting        £3.13    £3.13    £50      £50
Discount rate, life expectancy   1.50%    5%       1.50%    5%
                                 (i)      (ii)     (iii)    (iv)
1998/99 - 1999/00                -4.71%   -4.00%   -3.39%   -1.54%
1999/00 - 2000/01                -3.86%   -2.32%   -4.22%   -2.53%
2000/01 - 2001/02                0.23%    0.41%    -0.29%   -0.42%
2001/02 - 2002/03                8.41%    8.82%    8.27%    8.70%
2002/03 - 2003/04                2.00%    2.82%    2.90%    4.59%
Average                          0.41%    1.15%    0.65%    1.76%

6.6 Conclusion

In this section we have applied various formulations of an output index to a limited set of HRGs for which data were available on health status before and after treatment. The main conclusions are that:
• Estimates of output growth are sensitive to whether k is assumed constant across treatments. In view of this, it would be advisable to ascertain before and after health status for a larger sample of NHS treatments.
• There is a high correlation between indices using the two mortality measures, in-hospital and 30-day survival.
• Although relative cost and health outcome weights differ to some extent for our specimen set of HRGs, the difference does not lead to dramatic changes in the estimates produced by the specimen index. It cannot be assumed, however, that there will not be greater divergence between indices using cost and health outcome weights for other NHS activities.
• Unable to estimate QALYs directly, we have had to rely upon life tables from the general population to generate estimates of life expectancy. With an increasingly older population being treated over time, this leads to decreasing life expectancy, which in turn implies declining output growth in indices where life expectancy is included. This is because our index formulations make the value judgement that an additional quality adjusted life year should have the same value whatever the age of the person it accrues to.
• Cost weighted indices with waiting time adjustments are sensitive to whether waiting time is discounted to the date placed on the list or to the date of treatment, and to the choice of discount rate.
• The health and waiting time outcomes index, for the HRGs considered here, is not particularly sensitive to which point in the distribution is chosen to measure waiting time (mean or 80th percentile) or to the cost applied to a day spent waiting (£3.13 or £50).

7 Effects of quality adjustments on hospital and NHS output indices: summary

In Section 4 we argued that it was important to include estimates of health effects in a quality adjusted output index. This should be done by regular collection of health outcomes data for a representative range of NHS activity. In the absence of these data, in Section 5 we used available information on outcomes for 29 HRGs and made the assumption that the average health gain observed could be applied uniformly to all hospital activity.

For the specimen quality adjusted output index discussed in Section 6, it was possible to test the sensitivity of results to the assumption of a uniform effect. For the specimen index we were able to estimate quality adjusted output using data for actual health effects and compare the result with estimates using a uniform health effect. As expected, the move from uniform to actual values does affect the result.
We recommend that wherever possible actual health effects data be used to estimate quality adjusted output indices. Over the next few years the number of HRGs for which actual data will be available should increase. This will gradually reduce the proportion of activity where it is necessary to make assumptions about health effects.

A consequence of this recommendation is that for the next few years a quality adjusted output index would have to be based on a mix of actual and assumed values. In this section we examine the impact of departing from the assumption of a uniform fixed health effect (k = 0.8) and instead using actual values where they exist and assumed values where data are absent.
• For the 29 elective procedures for which we have data, k varies by HRG as in the specimen index.
• For all other elective procedures we assume k = 0.8, as suggested by the mean of k for the elective HRGs where there are estimates.
• For non-elective HRGs we assume k = 0.4, on the grounds that non-elective patients may have worse health (qo) if not treated, so that the ratio of health if not treated to health if treated (k = qo/q*) is smaller. Given that non-elective activity is growing more rapidly than elective, the lack of knowledge of health state and health gain for non-elective patients is a serious problem.

We compare our recommended variant (Q2), based on a health effects adjustment which varies by HRG, with a variant (Q1) with the same health effect adjustment for all HRGs, elective and non-elective.

Quality variant 1 assumes k = qo/q* = 0.8 if a – k > 0.05 and k = 0 otherwise for all elective and non-elective HRGs. It discounts to date of treatment with a charge for waiting, with discount rates on waits and health equal to 1.5%, and the waiting time variable is the 80th percentile wait in each HRG. Quality variant 2 is our recommended quality variant.
This sets k = qo/q* = 0.8 for electives, k = 0.4 for non-electives, and k = actual k for those HRGs included in the specimen index where this is known, provided a – k > 0.10, and k = 0 otherwise. This quality variant discounts to date of treatment with a charge for waiting, with discount rates on waits and health equal to 1.5%, and uses 80th percentile waits.

We show the effects of these two quality adjustment variants on the HES hospital output in Table 7.1, on all hospital output including outpatients and accident and emergency in Table 7.2, and on all NHS output in Table 7.3.

Table 7.1 shows that the Q1 quality adjustment variant, with survival, health effects, life expectancy and waiting time adjustments, adds just under one percentage point to the HES hospital unadjusted index on average across the five growth periods. The recommended variant Q2 results in a smaller upward adjustment of just over 0.5%.

Table 7.1 HES hospital cost weighted output index with hospital sector quality adjustments

Column key (annual growth, %):
(a) Unadjusted
(b) Quality variant 1: survival and health effect only
(c) Quality variant 1: survival, health effect, life expectancy and waiting
(d) Quality variant 2: survival and health effect only
(e) Quality variant 2: survival, health effect, life expectancy and waiting

                    (a)     (b)     (c)     (d)     (e)
1998/99-1999/00     1.87    0.09    0.49    0.63    1.04
1999/00-2000/01     0.91    1.97    1.73    1.50    1.25
2000/01-2001/02     0.95    1.01    0.87    0.82    0.65
2001/02-2002/03     4.44    7.72    7.48    6.77    6.52
2002/03-2003/04     5.81    8.04    8.15    7.21    7.31
Average             2.80    3.77    3.74    3.38    3.35

Table 7.2 shows the effect of the two variants on a broader definition of hospital activity. All variants have faster growth than the narrower HES output indices in Table 7.1. The main reason for this is the faster growth in the activities not captured by HES, such as A&E and outpatients, which are excluded from Table 7.1. However, because fewer of the quality adjustments apply to these non-HES activities, the proportionate effect of Q1 and Q2 is smaller than in Table 7.1.
Thus Q1 increases average annual growth by 0.44% instead of nearly 1% and Q2 increases growth by 0.25% instead of 0.55% over the period.

Table 7.2 Hospital sector cost weighted output index with hospital sector quality adjustments

Column key (annual growth, %):
(a) Unadjusted
(b) Quality variant 1: survival and health effect only
(c) Quality variant 1: survival, health effect, life expectancy and waiting
(d) Quality variant 2: survival and health effect only
(e) Quality variant 2: survival, health effect, life expectancy and waiting

                    (a)     (b)     (c)     (d)     (e)
1998/99-1999/00     2.03    0.43    0.79    0.91    1.28
1999/00-2000/01     1.54    2.35    2.16    1.99    1.80
2000/01-2001/02     4.48    4.52    4.43    4.40    4.31
2001/02-2002/03     3.94    5.71    5.57    5.19    5.06
2002/03-2003/04     4.78    5.94    6.00    5.51    5.56
Average             3.35    3.79    3.79    3.60    3.60

Finally, Table 7.3 shows the impact of the quality adjustments to hospital sector output on the cost weighted output index for the NHS as a whole. We first show the CWOI without quality adjustments and then add variants of the adjustments for survival and waiting times for the hospital sector. Overall NHS output growth is higher than either of the hospital sector output growth rates because of the more rapid growth in some non-hospital activities, such as prescribing and consultation rates, and because of increasing coverage of NHS activity.

Quality adjustment variant Q1 increases average annual growth by 0.29% and Q2 increases it by 0.17%. Notice that for 2001/02 to 2002/03 both variants have a much larger effect (1.04% for Q1 and 0.71% for Q2), but they have rather small effects in the middle years and actually reduce growth from 1998/99 to 1999/00. This negative adjustment is due, as we noted in section 5.4.1, to the fall in survival for a small number of high activity, high cost HRGs. Notice also that from 1998/99 to 1999/00, when both Q1 and Q2 lead to downward adjustments, our recommended variant Q2 has a smaller negative effect. Thus in general variant Q2 has a smaller positive or negative effect than Q1 because it uses smaller assumed health effects for emergency activities.
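The assumptions about k that distinguish the two variants can be written down as a small Python function. This is one reading of the rule as stated in this section, not a definitive implementation: in particular, the thresholds (a – k > 0.05 for Q1 and a – k > 0.10 for Q2) are assumed here to apply to whichever value of k is selected, and the function and argument names are hypothetical.

```python
def assumed_k(is_elective, survival, actual_k=None, variant="Q2"):
    """Return the health effect ratio k = qo/q* used for one HRG.

    is_elective: True for elective HRGs, False for non-elective HRGs.
    survival:    survival rate a for the HRG.
    actual_k:    observed k for HRGs in the specimen index, or None if unknown.
    variant:     "Q1" (uniform k) or "Q2" (recommended variant).
    """
    if variant == "Q1":
        # Q1: the same assumed k for all elective and non-elective HRGs.
        k, threshold = 0.8, 0.05
    else:
        # Q2: use the observed k where known, otherwise 0.8 (elective) or 0.4 (non-elective).
        if actual_k is not None:
            k = actual_k
        elif is_elective:
            k = 0.8
        else:
            k = 0.4
        threshold = 0.10
    # Where survival does not exceed k by the threshold, k is set to zero.
    return k if survival - k > threshold else 0.0


print(assumed_k(True, 0.98))                  # elective HRG without observed k
print(assumed_k(False, 0.95))                 # non-elective HRG under Q2
print(assumed_k(False, 0.95, variant="Q1"))   # same HRG under Q1
print(assumed_k(True, 0.99, actual_k=0.65))   # HRG in the specimen index
```

Under this reading the two variants differ only for non-elective HRGs and for the HRGs with observed k, which is where the differences between the Q1 and Q2 columns of Tables 7.1 to 7.3 would originate.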
Table 7.3 Aggregate NHS cost weighted output index with hospital sector quality adjustments

Column key (annual growth, %):
(a) Unadjusted
(b) Quality variant 1: survival and health effect only
(c) Quality variant 1: survival, health effect, life expectancy and waiting
(d) Quality variant 2: survival and health effect only
(e) Quality variant 2: survival, health effect, life expectancy and waiting

                    (a)     (b)     (c)     (d)     (e)
1998/99-1999/00     2.61    1.77    1.96    2.03    2.22
1999/00-2000/01     2.11    2.57    2.46    2.36    2.26
2000/01-2001/02     3.85    3.88    3.82    3.80    3.74
2001/02-2002/03     5.07    6.20    6.11    5.87    5.78
2002/03-2003/04     4.43    5.17    5.20    4.89    4.93
Average             3.62    3.92    3.91    3.79    3.79

We next examine input changes over the period and then in section 9 combine our output indices and input indices to calculate productivity growth rates.

8 Labour input

8.1 Introduction

Current practice by DH and ONS calculates labour input by deflating payments to labour by a wage index. It is more usual to estimate labour inputs based on number of workers or hours worked, so it was considered useful to devote effort in the project to this alternative method of measuring labour input. In addition the Atkinson Report recommended that labour input should be adjusted to take account of variations in types of workers employed, in particular the changing use of skilled workers; this section also addresses this recommendation.

Labour is by far the most important input used in producing health services, accounting for about 75% of total hospital expenditures. These measures can then be combined with the aggregate and hospital output measures given in section 7 to calculate labour productivity growth rates and, combined with measures of payment to labour, can be used to calculate total factor productivity growth rates. Productivity estimates are presented in the next section.

8.2 Labour input in the NHS

This section summarises results on constructing measures of labour input.

8.2.1 Volume of labour input

The volume of labour input can be calculated using direct or indirect measures.
Direct measures include number of persons engaged (including self-employed), number of full-time equivalents or total hours worked. The indirect measure is expenditure on labour deflated by a wage index, employed by ONS for NHS labour input. Productivity analysts tend to prefer direct measures since reasonable data are generally available on numbers employed whereas wage indexes are seen as less reliable. This section follows this tradition of using direct measures. However it should be noted that in sectors where the self-employed account for a large share of employment, as is the case for the NHS, there may be more grounds for using an indirect measure; this is discussed further below.

The simplest direct measure is a headcount of number of persons employed. Since many persons in the NHS work part-time, a more reliable indicator is full-time equivalent workers. Such a measure is calculated by the DH, where part-time work is weighted by the normal weekly hours of these people. While full-time equivalents is undoubtedly a better measure than headcounts, it is only a half-way house to the measure recommended by the OECD productivity manual (OECD, 2001) of annual actual hours worked. Normal or usual hours worked do not take account of changes in time lost due to holidays, sickness etc. Over time, trends in time paid but not worked tend to dominate changes in usual weekly hours worked. Adjustments to an annual actual hours worked basis are discussed below.

8.2.2 Quality of labour input

“Because a worker’s contribution to the production process consists of his/her “raw” labour (or physical presence) and services from his/her human capital, one hour worked by one person does not constitute the same amount of labour input as one hour worked by another person”, OECD productivity manual (p. 41).

Volume measures of labour input hide considerable diversity across types of workers.
Obviously the productivity of highly skilled workers is greater than that of less skilled workers as set out in the quote above. Division by skill type is not the only quality dimension; other candidates are age or experience, gender or occupation. Nevertheless most research on measuring labour quality suggests skill is the most important dimension (Jorgenson, Ho and Stiroh, 2005).

The standard growth accounting formula for adjusting for skills divides labour hours by skill type and then weights the growth in hours of each type by their wage bill shares. This captures the fact that more highly skilled workers get paid more than the unskilled and, under competitive market conditions, the wage paid reflects the marginal productivity of workers of different types. Merely calculating growth in total hours worked is equivalent to weighting worker types by their share in employment. Hence if there is general upskilling of the workforce so that growth in hours is greater for skilled relative to unskilled workers, weighting by wage bill shares leads to higher aggregate labour input growth. Formally, quality adjusted labour input, with s types of skilled labour, can be calculated using a Törnqvist index (see section 3) by:

$$
\ln L_t - \ln L_{t-1} = \sum_s \varpi_s \left( \ln L_{st} - \ln L_{st-1} \right) \tag{112}
$$

where $L_s$ is the number of hours worked by workers of skill type s, and

$$
\varpi_s = 0.5 \left( \frac{w_{s,t-1} L_{s,t-1}}{w_{t-1} L_{t-1}} + \frac{w_{st} L_{st}}{w_t L_t} \right)
$$

is the wage bill share of type s workers in the total wage bill for all workers, averaged across periods t and t-1. The difference between the equation above and the growth in total hours worked gives the impact of skills on aggregate labour input growth.

8.2.3 Data sources and volume trends

This report reviewed available data sources relating to health sector labour input to assess their usefulness in constructing direct measures of labour input for the NHS. Two sources seemed particularly useful and formed the basis of the calculations presented in this section.
These were:
• NHS Workforce Census – An annual census conducted by the Department of Health. This source provides data on numbers employed in the NHS, both in headcount and full-time equivalent terms, by occupation and organisation.
• Labour Force Survey – This is a quarterly sample survey conducted by ONS. It contains data on numbers employed and annual hours worked by industry (SIC92) and occupation (SOC), distinguishing private and public sectors and whether the person is employed by an NHS Trust. It also contains data on wages, qualifications, region and nationality of employees and questions relating to on the job training.10

These sources were supplemented by information from the NHS staff earnings survey.

The NHS Workforce Census has an advantage over the Labour Force Survey in that, being a census, it captures all people whose employer is the NHS. The Labour Force Survey headcounts are derived indirectly through both the industry in which the individual states they are working (SIC 85.1 – human health activities) and whether they state that they work for an NHS Trust. Against this, the LFS includes agency workers whereas the NHS Census excludes these. The two sources were compared for consistency and were found to follow similar but not identical trends through time (shown in Figure 8.1).

10 Note since the LFS is an individual survey, where the person interviewed is frequently reporting for their spouse or partner, reporting errors can lead to situations where individuals state that they work for industries such as personal services but that they are employed by the NHS, e.g. contract cleaners.

Figure 8.1 Numbers employed in the NHS 1995-2003 [line chart: Census and LFS series, 1995 to 2003; vertical axis from 800,000 to 1,400,000]

The decision was made therefore to employ the NHS Census data for the headcount but to include an adjustment using the ratio of agency to other staff from the LFS to adjust the Census data.
The following table shows growth rates in persons engaged by broad occupational group. These show significant growth in total numbers employed, in particular since 2000. The growth rates are fairly uniform across broad categories. Within categories there is more variation. For example the number of managers, included in the infrastructure support group, increased by about 6.5% per annum from 1995 to 2003, with a very large increase of over 11% per annum since 2000. However it should be noted that managers represent a very small proportion of the NHS workforce, reaching only 2.7% by 2003 despite the high growth. In contrast nursing assistants and auxiliaries show growth of less than 2% p.a. since 1995 and 2.7% since 2000.

Table 8.1 Trends in the NHS workforce

                                                             1995-03   2000-03
Total                                                        2.49      4.58
Professionally qualified clinical staff                      2.77      4.53
  All doctors                                                3.30      4.03
  All qualified nurses (including practice nurses)           2.48      4.66
  Total qualified scientific, therapeutic & technical staff  3.60      4.73
NHS infrastructure support                                   1.26      4.66
Support to Clinical Staff                                    3.20      5.35
Other                                                        0.55      1.98

Rather than use full-time equivalents we employ LFS data on weekly hours to convert these headcounts to total annual hours. Weekly actual hours show a slight decrease over time from 28.9 in 1995 to 28.6 in 2003, with rises in between.

8.2.4 Quality adjustments based on qualifications

The LFS is used in this project to incorporate quality adjustments. Thus the NHS Census data are used as control totals, with proportions of workers in each skill group and their wage rates from the LFS used in the skills adjustment. We then refine these estimates in a number of ways to take account of on the job training, regional variations in wage rates and country of birth of workers. In addition we include an adjustment for doctors using data from the NHS Census and the Earnings survey, since the qualification division in the LFS is not fine enough to ensure we are picking up all skill variations.
Table 8.2 shows the proportion of workers by qualification group for selected years. Employment growth has been relatively strong among workers with higher degrees and primary degrees as well as those with A-levels and equivalents. The share of those with nursing qualifications or other NVQ4 has declined, reflecting the growth in degrees among nurses and health care professionals. There has also been a marked decline in the share of the workforce with no skills. While the figures confirm the well known high growth in university degrees among doctors, nurses and other health professionals, the changes in the lower end of the skill distribution suggest upskilling also among other NHS employees, illustrated by the chart for health care assistants (Figure 8.2). Interestingly, managers have also experienced pronounced changes in their skill distribution, with the percentage of managers having a higher degree (mostly masters degrees) rising from about 3% in 1995 to about 16% in 2003 and the percentage with degrees rising from 22% to 28% over the same time period (Figure 8.3).

Table 8.2 Skill proportions of the Workforce: NHS selected years

                                                                   1995    2000    2003
Higher degree (Masters, PhDs)                                      5.4     6.9     8.3
Degree                                                             13.2    17.4    18.2
Nursing qualification/other NVQ4 (higher education below degree)   36.4    31.3    29.2
NVQ3 (A-levels or equivalent)                                      6.1     8.6     9.8
NVQ1 and NVQ2 (GCSEs or equivalent, vocational qualifications)     22.8    23.8    23.2
Other                                                              6.8     5.9     5.9
None                                                               10.0    6.1     5.5

Figure 8.2 Healthcare assistants: proportions employed by highest qualification held [bar chart: 1995, 2000 and 2003; categories: Degree/nursing/other NVQ4, NVQ3, NVQ1 and NVQ2, Other or none]

Figure 8.3 Managers: proportions employed by highest qualification held [bar chart: 1995, 2000 and 2003; categories: PhD and Masters, First degree, Nursing qualification/other NVQ4, NVQ3, All other]
It is also worth comparing skill use in the NHS with that for the economy as a whole or other service sectors. Table 8.3 shows skill proportions in the total economy and private market services for the three years shown for the NHS above. Compared to the total economy, the NHS employs proportionally more workers at the high end of the skill distribution. These differences are more pronounced when the comparison is between the NHS and private market services. Over this time period growth in proportions of the workforce with the highest qualifications (degree and above) has been higher in the NHS than in the total economy or market services, and the reduction in the use of unskilled workers has been greater.

One interesting trend that can be considered from the comparison between the NHS and other sectors is the extent to which those with nursing qualifications are leaving the NHS to work in other sectors. In 2003 the LFS shows about 550,000 individuals whose highest stated qualification is a nursing qualification. Of these 42% work in NHS Trusts, whereas in 1995 nearly 47% worked in NHS Trusts. Some of this attrition can be traced to other areas of the public health sector or private health sectors, but much is to other non-health industries. Thus 42% of persons whose highest qualification was nursing worked outside the health sector in 1995 and this had risen to 45% by 2003. While this suggests some increase in attrition rates, it may well be an underestimate since many people leaving may have other qualifications higher than nursing.
Table 8.3 Skill proportions of the Workforce: NHS selected years

                                                                   1995    2000    2003
Total Economy
Higher degree (Masters, PhDs)                                      2.7     4.6     5.6
Degree                                                             11.6    13.1    13.8
Nursing qualification/other NVQ4 (higher education below degree)   9.5     9.7     9.9
NVQ3 (A-levels or equivalent)                                      12.1    16.3    17.8
NVQ1 and NVQ2 (GCSEs or equivalent, vocational qualifications)     40.0    36.5    34.6
Other                                                              7.9     8.0     7.7
None                                                               16.2    11.8    10.6

Market Services (1)
Higher degree (Masters, PhDs)                                      1.9     3.1     3.7
Degree                                                             10.9    13.1    13.5
Nursing qualification/other NVQ4 (higher education below degree)   5.9     6.6     6.7
NVQ3 (A-levels or equivalent)                                      13.8    17.6    19.5
NVQ1 and NVQ2 (GCSEs or equivalent, vocational qualifications)     42.7    38.9    36.7
Other                                                              8.6     8.6     8.5
None                                                               16.2    12.1    11.4

1. Comprising distribution, transport, communications, hotels & catering, business services, financial services and personal services.

The changes in the skill use pattern noted above change aggregate labour input only if there are significant differences in wage rates across skill groups. Table 8.4 shows wages relative to the lowest group for the qualifications given in the table above. In fact there are very large differences in the wages paid to workers in the NHS, with those with higher degrees earning on average about 4 times the average unskilled wage. Note however that these differentials tend to be smaller in the public sector than in the private sector. Also wage differentials have tended to stay reasonably constant through time. Therefore, based on these data, quality change is driven by greater employment growth among skilled workers rather than by any movement to employing more expensive, and under competitive assumptions more productive, workers within these skill groups.
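The skills adjustment of equation (112) can be sketched in Python as follows. The hours and relative wages below are hypothetical and chosen only to show the mechanism (a skilled group paid about four times the unskilled wage, loosely echoing the differential quoted above); they are not taken from the LFS or the NHS Census, and the function names are illustrative.

```python
import math


def tornqvist_labour_growth(hours_prev, hours_curr, wage_prev, wage_curr):
    """Quality adjusted labour input growth (log change), as in equation (112).

    Each argument is a list with one entry per skill type.
    """
    bill_prev = [w * h for w, h in zip(wage_prev, hours_prev)]
    bill_curr = [w * h for w, h in zip(wage_curr, hours_curr)]
    total_prev = sum(bill_prev)
    total_curr = sum(bill_curr)
    growth = 0.0
    for s in range(len(hours_prev)):
        # Wage bill share of skill type s, averaged across the two periods.
        weight = 0.5 * (bill_prev[s] / total_prev + bill_curr[s] / total_curr)
        growth += weight * (math.log(hours_curr[s]) - math.log(hours_prev[s]))
    return growth


def quality_effect(hours_prev, hours_curr, wage_prev, wage_curr):
    """Quality adjusted growth minus growth in total (unweighted) hours."""
    raw_growth = math.log(sum(hours_curr)) - math.log(sum(hours_prev))
    adjusted = tornqvist_labour_growth(hours_prev, hours_curr, wage_prev, wage_curr)
    return adjusted - raw_growth


# Hypothetical two-group example: [skilled, unskilled].
hours_1995 = [100.0, 100.0]
hours_2003 = [110.0, 100.0]     # only skilled hours grow
wages = [4.0, 1.0]              # relative wages, held constant

adjusted_growth = tornqvist_labour_growth(hours_1995, hours_2003, wages, wages)
effect = quality_effect(hours_1995, hours_2003, wages, wages)
print(adjusted_growth, effect)
```

When hours grow faster for the higher paid group the quality effect is positive, and when all groups grow at the same rate it is zero, which matches the statement that quality change here is driven by the skill mix of employment growth.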
Table 8.4 Wages relative to the unskilled, selected years

                                   1995    2000    2003
Higher degree                      3.88    4.07    3.81
Degree                             2.72    2.61    2.68
Nursing qualification/other NVQ4   1.88    1.87    1.87
NVQ3                               1.39    1.33    1.21
NVQ1 and NVQ2                      1.14    1.14    1.09
Other                              1.48    1.41    1.70
None                               1       1       1

Combining the information on skill use and wages gives growth in quality adjusted labour. Table 8.5 below shows growth in headcount, quality adjusted labour and the percentage point contribution of quality change for all NHS workers, and selected occupation groups.

Table 8.5 Growth in volume and quality adjusted labour input, Total NHS

                                   Numbers   Quality-adjusted   Difference
Total
  1995-03                          2.68      3.47               0.79
  2000-03                          4.50      5.19               0.69
Doctors
  1995-03                          3.30      3.33               0.03
  2000-03                          4.03      4.20               0.17
Nurses
  1995-03                          2.75      3.00               0.25
  2000-03                          4.24      4.32               0.08
Nursing aux.
  1995-03                          2.22      2.34               0.13
  2000-03                          2.71      3.72               1.01
Health associate profs
  1995-03                          3.30      3.39               0.09
  2000-03                          4.48      4.64               0.16
Healthcare assistants
  1995-03                          2.89      3.60               0.71
  2000-03                          5.00      5.64               0.64
Managers
  1995-03                          6.59      8.10               1.51
  2000-03                          11.18     12.90              1.72
Administrative
  1995-03                          2.41      2.93               0.52
  2000-03                          5.35      6.17               0.82

Again it is useful to compare these trends with other sectors of the economy. In addition we also show growth in labour input for the hospital sector alone, since calculations for other inputs discussed below are carried out mainly for the hospital sector. In this calculation we also include the adjustment for agency workers and convert to an annual hours basis. The sample sizes in the LFS were such that it was not possible to include occupation specific hours or agency adjustments in the previous table. Adjusting for agency workers raises the annual average growth rate over the entire period from 2.68% to 2.78% and since 2000 from 4.5% to 4.63%, small upward adjustments. Adjusting for trends in hours raises the growth rate further to 4.75% from 2000-2003 but lowers the growth rate marginally over the entire period.
Ideally, in the calculations we would like to incorporate trends in hours by qualification group. However when we attempted such a division in the LFS the resulting weekly hours turned out to show implausible variations from year to year and so this calculation was not attempted.

Table 8.6 shows growth rates over the six years considered in the output calculations. It shows much higher quality adjusted labour input growth in the total NHS and in hospitals than in the economy as a whole. Quality adjustments were similar in percentage points to those in the total economy and larger than in the more comparable market services sector, but were proportionally more important in the total economy and market services than in the health sector, both averaged across the entire time period and over the period since 2000. However the aggregate and private sector estimates have been more sensitive to the stage of the business cycle and hence to the starting year chosen. In the past few years the NHS adjustments have been greater.

Table 8.6 Growth in volume and quality adjusted labour input, comparison between the NHS and other sectors, annual average 1999/00-2003/04

                     Annual hours   Quality-adjusted   Difference
  Total NHS              3.37            4.24             0.87
  Hospitals              3.58            4.35             0.77
  Total Economy          0.92            1.74             0.82
  Market Services        0.71            1.36             0.65

8.2.5 Quality adjustments: refinements

So far the calculations assume that the only variation in type of worker is skill. We also calculated a number of variations which accounted for a further disaggregation of doctors, regional variations, on the job training and country of birth of workers. The rationale for the first is that since all doctors must have degrees or equivalents in order to practise, the LFS data are not capturing skills acquired through on the job learning. To take account of this we need to disaggregate doctors by type, e.g. consultants, registrars, junior doctors.
Data from the Census for the secondary care sector, coupled with snapshots of relative wages from the NHS earnings survey, were employed to achieve a crude adjustment based on a division of doctors into consultants and others. Over the period from 2000/01 to 2003/04 this resulted in growth in quality adjusted doctors about 30% above numbers employed. For the same period quality adjusting based on certified qualification led to an increase in growth of only 4%. Therefore, at least for doctors, using qualification data alone may not be sufficient to capture all quality change. However since doctors represent less than 1.5% of the NHS Trusts' wage bill this amounts to only a very small adjustment for the hospital sector. As we do not have comparable data to consider GPs, the adjustment for overall NHS activity is even smaller.

It is likely that workers have received some additional training beyond that associated with their certified qualifications. Using LFS data we can divide workers according to whether they received any job related training or education during the 13 weeks previous to the survey date. Since the data are averaged across quarters this variable picks up any job related training carried out over entire years. For 2003/04 just over 50% of NHS workers answered yes to the question of whether they received some job related training over the past year and the percentage is high across all occupation groups (Table 8.7). Nearly half the training takes place on employers' premises, with about 30% off site in educational institutions (Table 8.8). About half of the workers were engaged in training lasting one month or less, while about 20% of training was ongoing at the time of the survey. Nearly 30% of training was for more than 6 months.
Table 8.7 Percent of NHS workers receiving job related training, 2003/04

  Occupational group                                                    %
  Total                                                               50.2
  Managers                                                            56.5
  Medical practitioners                                               68.2
  Other health professionals                                          62.7
  Science & engineering professionals, technicians, IT, research      44.0
  Teaching & all other professionals etc, plus skilled trades         34.1
  Nurses, midwives                                                    64.0
  Other health associate professionals, therapists                    57.2
  Nursing auxiliaries and assistants                                  48.8
  Other healthcare and related personal services                      46.2
  All other occupations                                               25.5

Table 8.8 Location of job related training

  Location of job related training            %
  Employer/another employer's premises      49.1
  Training centres etc                      11.5
  Educational institution                   30.9
  None of these                              8.5

Job related training can vary from one day courses on health and safety to extensive formal learning from senior colleagues. If most training were concentrated in the former then this would have very little impact on overall productivity of workers, whereas the latter type of training is likely to have a significant impact. Consistent with the method employed in the remainder of this section, we look at relative wages as an indicator of relative productivity. In 2003/04 workers who gave a positive response to the on the job training question earned on average 28% more than those who gave a negative response. The differential was greatest among the highest skilled groups but was also large across all skill divisions.

In order to account for on the job training we compared the quality adjustments when labour is divided by skill level with those where we included the further division according to whether workers received job related training or not. The result of this calculation was to raise the quality adjusted labour input growth rate on average from 1999/00 by about 5%, a small but not insignificant impact. Therefore our estimates of quality adjusted labour were scaled up to reflect the impact of on the job training.
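The quality adjustment used throughout this section, in which employment by skill group is weighted by its share in the wage bill, can be illustrated with a minimal sketch. The Törnqvist form (log growth weighted by the average of start and end wage bill shares), the function names and the two-group workforce are all assumptions made here for illustration; they are not taken from the report and the figures are hypothetical.

```python
import math


def headcount_growth(emp_start, emp_end):
    """Log growth (in %) of the simple headcount summed over all skill groups."""
    return 100.0 * math.log(sum(emp_end.values()) / sum(emp_start.values()))


def quality_adjusted_growth(emp_start, emp_end, wage_start, wage_end):
    """Log growth (in %) of employment, weighting each skill group by its
    average share in the wage bill (a Tornqvist-style index; assumed form)."""
    bill_start = {g: emp_start[g] * wage_start[g] for g in emp_start}
    bill_end = {g: emp_end[g] * wage_end[g] for g in emp_end}
    total_start = sum(bill_start.values())
    total_end = sum(bill_end.values())
    growth = 0.0
    for g in emp_start:
        avg_share = 0.5 * (bill_start[g] / total_start + bill_end[g] / total_end)
        growth += avg_share * math.log(emp_end[g] / emp_start[g])
    return 100.0 * growth


# Hypothetical workforce: only the skilled group grows, and it earns twice
# the unskilled wage in both periods.
emp_start = {"skilled": 100.0, "unskilled": 100.0}
emp_end = {"skilled": 120.0, "unskilled": 100.0}
wages = {"skilled": 2.0, "unskilled": 1.0}

numbers = headcount_growth(emp_start, emp_end)
adjusted = quality_adjusted_growth(emp_start, emp_end, wages, wages)
quality_contribution = adjusted - numbers  # the "Difference" column in spirit
```

In this sketch the quality contribution is positive only because employment growth is concentrated in the better paid group; if every group grew at the same rate the two measures would coincide, whatever the wage differentials.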
Finally we tried two additional calculations using a division by region and by country of birth but neither significantly altered the results.

8.3 Conclusion

This section considered trends in the use of labour input in the NHS. It showed that labour input growth has been rising very rapidly in recent years, mainly due to growth in the numbers of workers employed, but there is also a significant contribution from upskilling of the workforce. The latter is important in understanding why expenditure on the NHS has been increasing rapidly in recent years. Thus a crude calculation suggests some 20% of payments to labour is due to paying for higher skilled workers.

9 Experimental productivity estimates

This section considers measurement of inputs and combining these with output estimates from previous sections to calculate productivity. Lee (2004) reported productivity estimates for the total NHS with details of calculations given in Hemingway (2004). The focus has been on refining the estimates of labour input to take account of the use of various types of skilled labour. Labour is by far the most important input used in producing health services, accounting for about 75% of total hospital expenditures. These measures can then be combined with the aggregate and hospital output measures given in Section 7 to calculate labour productivity growth rates. Total factor productivity (TFP) estimates are then calculated for the hospital sector and the NHS as a whole. In the hospital sector it is straightforward to combine the direct measures of labour input discussed in the next section with data on the use of intermediate inputs and capital from Trust Financial Returns to calculate TFP growth. ONS data are combined with the estimates for labour input from Section 8 to derive the aggregate TFP estimates.
9.1 Labour input and labour productivity growth

Using the calculations reported in section 8, annual estimates of the volume of labour input, and quality adjusted labour input, are shown in the following table for the aggregate NHS and the Hospital NHS. Note the hospital sector captures activities that are recorded in HES, which were the focus of Chapter 5, and non-HES activities such as outpatient, A&E, rehabilitation and mental health activities. This broader definition of outputs corresponds to the basis on which hospital inputs are measured. The overall NHS includes activities such as primary care, prescriptions, community care, dental and ophthalmic.

Table 9.1 Volume of labour input and quality adjusted labour input, annual estimates of growth rates

                     Labour input: volume     Labour input: quality adjusted
                     Total NHS   Hospital     Total NHS   Hospital
  1998/99-1999/00      1.58        1.49         2.53        3.47
  1999/00-2000/01      1.05        1.77         1.45        2.75
  2000/01-2001/02      5.42        5.05         5.31        3.77
  2001/02-2002/03      4.69        4.51         5.57        5.91
  2002/03-2003/04      4.48        5.47         4.94        5.83
  Average              3.43        3.64         3.95        4.34

Labour productivity growth rates are derived by taking the growth in output minus the growth in labour input. Here we show calculations for unadjusted cost weighted output and the quality variants chosen for illustrative purposes in Section 7, adjusting for survival and waiting times. Denote these two variants by Q1 and Q2. For comparison purposes these output growth rates are shown in Table 9.2. The first panel in Table 9.3 presents labour productivity estimates based on volume of labour and shows positive average labour productivity growth across the period for all output variants for the total NHS and for the Q1 variant for hospital output, but a small negative number for unadjusted hospital output.
When quality adjusted labour is used instead, average labour productivity growth becomes negative in the total NHS when output is not quality adjusted, but is approximately zero using the Q1 variant of quality adjusted output. Hospital labour productivity growth is negative for all output variants when quality adjusted labour is used. Note negative labour productivity growth in health services is not unusual in international comparisons. Data from the US national accounts suggest labour productivity growth, unadjusted for labour quality changes, was -0.31% on average from 1999 to 2002. Adjusting for labour quality would reduce this further.

Table 9.2 Output growth

                     Total NHS                   Hospital
                     Unadjusted   Q1     Q2      Unadjusted   Q1     Q2
  1998/99-1999/00      2.61      1.96   2.22       2.03      0.79   1.28
  1999/00-2000/01      2.11      2.46   2.26       1.54      2.16   1.80
  2000/01-2001/02      3.85      3.82   3.74       4.48      4.43   4.31
  2001/02-2002/03      5.07      6.11   5.78       3.94      5.57   5.06
  2002/03-2003/04      4.43      5.20   4.93       4.78      6.00   5.56
  Average              3.62      3.91   3.79       3.35      3.79   3.60

Notes: Q1 is the ‘high’ quality adjustment variant with k = q0/q* = 0.8 if a – k > 0.05, and k = 0 otherwise; it discounts to date of treatment with charge for waits, uses discount rates on waits and life expectancy equal to 1.5%, and takes as the waiting time variable the 80th percentile wait in each HRG. Q2 is our recommended quality variant and sets k = q0/q* = 0.8 for electives, k = 0.4 for non-electives, k = actual k for those HRGs where known, if a – k > 0.10, and k = 0 otherwise; it discounts to date of treatment with charge for waits, uses discount rates on waits and life expectancy equal to 1.5%, and uses 80th percentile waits.
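As a worked illustration of how the labour productivity figures follow from Tables 9.1 and 9.2, the sketch below combines the Total NHS unadjusted output growth rates with the Total NHS labour volume growth rates. It assumes the two rates are combined in ratio (compounded) form rather than by simple subtraction; that choice and the function name are assumptions of this sketch, not something the report states, although for these inputs the results sit within about 0.01 percentage points of the corresponding column of Table 9.3.

```python
def productivity_growth(output_growth_pct, input_growth_pct):
    """Productivity growth (in %) as the ratio of output to input growth
    factors, minus one. The ratio form is an assumption of this sketch."""
    ratio = (1 + output_growth_pct / 100) / (1 + input_growth_pct / 100)
    return (ratio - 1) * 100


periods = ["1998/99-1999/00", "1999/00-2000/01", "2000/01-2001/02",
           "2001/02-2002/03", "2002/03-2003/04"]

# Total NHS, labour input: volume (Table 9.1)
labour_volume_total_nhs = [1.58, 1.05, 5.42, 4.69, 4.48]
# Total NHS, unadjusted output growth (Table 9.2)
output_unadjusted_total_nhs = [2.61, 2.11, 3.85, 5.07, 4.43]

lp_total_nhs = [
    round(productivity_growth(y, l), 2)
    for y, l in zip(output_unadjusted_total_nhs, labour_volume_total_nhs)
]

for period, value in zip(periods, lp_total_nhs):
    print(f"{period}: {value:+.2f}")
```

Simple subtraction gives similar answers when growth rates are small, but the gap widens in years such as 2000/01-2001/02 when labour input grew by more than 5%.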
Table 9.3 Labour productivity growth

                     Total NHS                   Hospital
                     Unadjusted   Q1     Q2      Unadjusted   Q1     Q2
Labour volume
  1998/99-1999/00      1.02      0.38   0.63       0.53     -0.69  -0.20
  1999/00-2000/01      1.05      1.40   1.20      -0.23      0.38   0.03
  2000/01-2001/02     -1.49     -1.51  -1.59      -0.54     -0.58  -0.70
  2001/02-2002/03      0.35      1.35   1.04      -0.55      1.02   0.53
  2002/03-2003/04     -0.05      0.69   0.42      -0.65      0.51   0.09
  Average              0.17      0.46   0.34      -0.29      0.13  -0.05
Quality adjusted labour
  1998/99-1999/00      0.08     -0.56  -0.30      -1.40     -2.60  -2.12
  1999/00-2000/01      0.65      1.00   0.79      -1.18     -0.57  -0.92
  2000/01-2001/02     -1.38     -1.41  -1.48       0.68      0.64   0.52
  2001/02-2002/03     -0.48      0.51   0.20      -1.87     -0.32  -0.81
  2002/03-2003/04     -0.48      0.26  -0.01      -0.99      0.16  -0.25
  Average             -0.32     -0.04  -0.16      -0.95     -0.54  -0.72

Note: all productivity estimates use geometric means

9.2 Intermediate and capital inputs

Intermediate input for the hospital sector comes from the Trust Financial Returns (TFR) and is deflated by a modified version of the DH Health Services Cost Index (HSCI) to derive a volume measure. Intermediate input was defined as all current non pay expenditure items in the TFR, and hence excluded all purchases of capital equipment and capital maintenance expenditures as these items cannot be allocated to a particular year’s output. The list of items included and their shares in total intermediate expenditure in selected years is shown in Table 9.4. This shows the share of drugs increasing rapidly and a declining trend in the miscellaneous category, with no other intermediate category showing much change. Within the final category, external purchase of health care from non-NHS bodies has shown an increased share through time but remains small at about 6% of total intermediate in 2003/04.
Table 9.4 Share of intermediate expenditure by type

                                    1998/99  1999/00  2000/01  2001/02  2002/03  2003/04
  Drugs                              0.238    0.242    0.286    0.303    0.316    0.338
  Other Clinical Supplies            0.041    0.042    0.053    0.055    0.052    0.056
  General Supplies and Services1     0.139    0.134    0.150    0.144    0.128    0.122
  Establishment expenditure2         0.166    0.159    0.187    0.176    0.151    0.143
  Non-capital premises3              0.112    0.101    0.115    0.114    0.101    0.104
  Other4                             0.304    0.322    0.209    0.209    0.252    0.237

1. Hotel, catering and cleaning services; 2. Stationery, communications, advertising and transport costs; 3. Energy, rent and external services; 4. Miscellaneous services including external purchase of health care from non-NHS bodies and other external contract expenditures.

These numbers for intermediate input were deflated by an aggregate price index, derived as a chain linked index of corresponding HSCI items. This resulted in a very small upward adjustment in the intermediate input deflator relative to one using all items in the HSCI, as the prices of capital items have been growing more slowly than current items and in the case of computers have been falling.

To be consistent with the methodology employed by ONS, capital input could be measured by depreciation from the TFR. However this calculation ignores any capital services from capital purchases in the current year. An alternative is to assume a proportion of these expenditures are depreciated in the current year. Since much of the expenditure is on computers, software and medical equipment with low asset lives, we assume one third of these assets are depreciated. These capital services are deflated by a chain linked deflator for capital items in the HSCI while depreciation is deflated by the ONS capital consumption deflator for the total NHS. Adding a proportion of capital expenditures implies average annual real capital growth of 4.76% per annum from 1998/99 to 2003/04 compared to 4.69% using depreciation alone, a small effect.
However it does raise capital’s share from 0.041 to 0.080. Finally, consistent with the methodology employed in private services, the value of business rates is added to capital’s share.

Calculating input shares is more difficult for the total NHS since we attempt to combine data from different sources. For the aggregate NHS we use TFR data on payments to labour in the hospital sector and use ONS data for payments to labour in other parts of the NHS. Similarly, intermediate inputs are derived combining expenditures from TFR and PFR with ONS data on other parts of the NHS. Capital inputs are those employed by ONS in their measures of Health Sector Productivity. Family Health Drugs are deflated by the cost of all items rather than the ONS quality adjusted Paasche variant. The estimates for the NHS should be treated with considerable caution since data are being taken from a number of sources which may need further reconciling. Table 9.5 shows average period input shares and average growth in the three inputs where labour input is the quality adjusted variant.

Table 9.5 Average period input shares and average growth in inputs

                  NHS                      Hospital
                  Shares   Input growth    Shares   Input growth
  Labour           0.61       3.95          0.72       4.34
  Intermediate     0.33       8.58          0.20       4.93
  Capital          0.06       3.12          0.08       4.76

Labour represents a lower share in the total NHS than in the hospital sector, mainly due to the inclusion of family health prescribing in the former. Growth in intermediate input is very large in the total NHS, while all three inputs show similar average growth rates in the hospital sector.

9.3 Total factor productivity growth

Combining the input shares with growth in real inputs allows the calculation of total input growth, and subtracting this from output growth yields total factor productivity growth rates, shown in Table 9.6. Average TFP growth rates are strongly negative for the total NHS for all quality variants.
The numbers based on quality adjusted output are similar to those calculated by ONS reported in Lee (2004). The ONS estimates using the most comparable methods to measure inputs suggest average TFP declining by 1.34% per annum over the same period. However it should be noted that ONS output measures use reference cost activities which are not directly comparable with the HES based data employed in this report’s calculations. Average TFP growth is also negative for the hospital sector but less so than for the total NHS. The results are sensitive to the allocation of expenditures between the three broad categories of inputs. Lee (2004) highlighted the sensitivity of the results to the deflators used, in particular for drugs (see also the discussion in section 10.5). However since drugs are both an input and an output this sensitivity is surprising. Hence some further investigation is required and will be carried out following discussions with ONS.

Table 9.6 Total factor productivity growth

                     Total NHS                   Hospital
                     Unadjusted   Q1     Q2      Unadjusted   Q1     Q2
  1998/99-1999/00     -2.33     -2.95  -2.71      -2.82     -4.00  -3.53
  1999/00-2000/01      0.55      0.89   0.69       0.30      0.91   0.56
  2000/01-2001/02     -2.12     -2.15  -2.22       0.17      0.13   0.01
  2001/02-2002/03     -1.86     -0.88  -1.19      -2.01     -0.46  -0.95
  2002/03-2003/04     -2.97     -2.25  -2.51      -1.13      0.02  -0.39
  Average             -1.75     -1.48  -1.59      -1.11     -0.70  -0.87

The finding that TFP growth is negative is not unusual in the private sector. For example Basu et al. (2003) report negative gross output based annual average TFP growth rates for a number of sectors in the 1990s including insurance and business services. Similar results have frequently been reported by the US Bureau of Labor Statistics. Negative TFP growth is most likely to occur in service sectors where output is poorly measured and quality adjustment is minimal. TFP growth rates for the private sector using comparable measurement methods are not yet available for the period under consideration in this report.
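The TFP calculation described in this section, total input growth as a share weighted sum of input growth rates subtracted from output growth, can be sketched using the period averages in Tables 9.2 and 9.5. Applying the calculation to period averages is a simplification assumed here (the report works from annual figures and geometric means), so the results only approximate the averages in Table 9.6; the function names are hypothetical.

```python
def total_input_growth(shares, growth_rates):
    """Share-weighted sum of input growth rates (in %)."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("input shares should sum to one")
    return sum(shares[name] * growth_rates[name] for name in shares)


def tfp_growth(output_growth, shares, growth_rates):
    """Output growth minus share-weighted input growth (in %)."""
    return output_growth - total_input_growth(shares, growth_rates)


# Average shares and input growth, quality adjusted labour (Table 9.5)
nhs_shares = {"labour": 0.61, "intermediate": 0.33, "capital": 0.06}
nhs_growth = {"labour": 3.95, "intermediate": 8.58, "capital": 3.12}
hospital_shares = {"labour": 0.72, "intermediate": 0.20, "capital": 0.08}
hospital_growth = {"labour": 4.34, "intermediate": 4.93, "capital": 4.76}

# Average unadjusted output growth (Table 9.2)
tfp_nhs = tfp_growth(3.62, nhs_shares, nhs_growth)
tfp_hospital = tfp_growth(3.35, hospital_shares, hospital_growth)

print(f"Total NHS: {tfp_nhs:.2f}  Hospital: {tfp_hospital:.2f}")
```

On these averages the sketch gives roughly -1.8 for the total NHS and -1.1 for hospitals, against published averages of -1.75 and -1.11. It also makes visible that the large intermediate input term (0.33 × 8.58) accounts for about half of total input growth in the aggregate NHS.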
When inputs are measured correctly, with adjustments for quality change, then the TFP residual is close to a measure of pure technical change so long as output is also measured correctly. But as emphasised in many parts of this report, we are only capturing part of the improvement in quality of care via our proposed adjustments for survival, health effects and waiting times. Because of this incomplete adjustment for quality change we expect to underestimate TFP growth.

There are also reasons why in the short term at least we might expect negative growth rates. The literature on the impact of information technology on productivity in the private sector points to an important role of organisational changes in facilitating benefits from new technology. Basu et al. (2003) suggest that these changes can lead to declining TFP in the short run due to disruption of production processes. There is no doubt that the NHS is undergoing significant change.

Of more consequence for the health sector is the notion that there are diminishing returns as increased activity allows treatment of more complex and hence more costly cases. Activity rates have been increasing more rapidly in recent years. Some evidence in support of this is provided by the increased average age of patients treated in hospitals, from 48.6 years in 1999/00 to 50 years in 2003/04. In addition there has been some increase in the expenditure shares of HRG categories with the title ‘complex elderly’ from 3.4% of expenditures to 4.2% over the same period. Changes in the case mix are likely to be larger within than across HRGs but we lack the necessary data to examine this. Data that identified the characteristics of patients would also be useful in identifying the extent to which changes in NHS productivity are affected by diminishing returns.

10 Improving the data

We appreciate that the Department of Health is trying to reduce the amount of data collected from the NHS (Review of Central Returns Unit).
However, the data requirements outlined below are essential to development of robust measures of output, productivity and quality change in the NHS. These data will be required not only by the DH and Trusts to improve performance of the NHS but also by outside bodies such as the Treasury and ONS.

10.1 Outcomes data

10.1.1 Health outcomes

The main aim of the health system is the improvement of the health of the population. This being so, it would seem reasonable that any measure of health system performance, output and productivity should include measures of the effect of the system on health. The challenges associated with measuring the effect of interventions are discussed in Appendix C. The construction of a productivity index requires information about changes in health status attributable to interventions. Such information currently is not collected by the NHS. We suggest the systematic use of a standardised measure of health status to improve the effective management of the NHS and to provide the fundamental data needed to properly reflect changes in NHS productivity.

We have suggested that the NHS should collect data on the health of patients before and after treatment. An outcome measure based on the difference between snapshot measures of health status before treatment hb and after treatment ha is an imperfect measure of the change in the discounted sum of QALYs due to treatment. It does not measure health with and without care but health before and after care. It also replaces each time profile with a single snapshot. For some treatments and conditions the effect of treatment is merely to slow down the rate of decline in health status, so that ha − hb < 0 even though the treatment increases the sum of QALYs compared with no treatment. For these cases incorporating estimates of hb and ha in estimates of the level of output in any given year would reduce measured output.
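The case in which treatment only slows a decline in health can be made concrete with a small numerical sketch. All values are hypothetical: the starting health score, the annual rates of decline with and without treatment, the five year horizon and the choice to take the after-treatment snapshot one year on are assumptions of the sketch, as is reusing the 1.5% discount rate applied to life expectancy elsewhere in the report.

```python
def discounted_sum(profile, rate):
    """Discounted sum of a yearly health profile (years 1, 2, ...)."""
    return sum(h / (1 + rate) ** s for s, h in enumerate(profile, start=1))


h_before = 0.70            # hypothetical snapshot before treatment (hb)
years = range(1, 6)        # hypothetical five year horizon
discount_rate = 0.015

# Health declines by 0.10 a year without treatment, 0.04 a year with it.
untreated = [max(0.0, h_before - 0.10 * s) for s in years]
treated = [max(0.0, h_before - 0.04 * s) for s in years]

h_after = treated[0]       # snapshot one year after treatment (ha)
snapshot_change = h_after - h_before
qaly_gain = (discounted_sum(treated, discount_rate)
             - discounted_sum(untreated, discount_rate))

print(f"ha - hb = {snapshot_change:+.2f}, discounted QALY gain = {qaly_gain:+.2f}")
```

Here the snapshot difference is negative while the discounted gain relative to no treatment is positive, which is the sense in which before-and-after snapshots can understate the level of output for such treatments.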
However, since the aim is to measure the rate of growth of output and productivity, we are interested in whether the rate of growth of ∆h = ha – hb is a reasonable approximation to the rate of growth of the effect of treatment on the discounted sum of QALYs. The important issue is how well the rate of change in measures based on the snapshots hb, ha approximates the rate of change in the areas under the two time profiles of health streams with treatment h*(s) and without treatment ho(s). Both the level of health before treatment hb and the health of treated patients if not treated depend on the patient population selected for treatment and on the general health of the population. It is not unreasonable to suggest that the rates of change of hb and the discounted value of the no treatment health profile ho(s) over time will be similar. Both the snapshot level of health after treatment ha and the discounted value of the time profile h*(s) will be measured on the same population and hence are affected by the same factors including any technological change. Hence, despite the imperfections of the difference between snapshots of post and pre treatment health status for calculating the level of productivity, we suggest that rates of change of measures based on hb, ha will improve estimates of NHS output growth compared to estimates where such information is not used.

The following points need to be addressed in such a data collection exercise.

NHS patient sample. Since the scale of NHS activity is so broad and the potential volume of patients is so large, sampling seems a more sensible strategy than attempting to measure health effects for all NHS patients in a sector. Sample sizes are likely to vary across different types of patients. While a random sample of the NHS patient population would be preferable, in the first instance it may be satisfactory to undertake a pilot exercise at a handful of Trusts.
This might be adequate for national measures of output but would be inadequate if information on health outcomes is to be used to improve performance of the NHS.

Choice of instrument. A single generic, rather than condition-specific, instrument is required in order to facilitate aggregation across different types of NHS activity. Profiles such as SF-36 provide multiple measures of outcome but are unsuitable for most non-clinical purposes since they typically lack the capacity to form a single aggregate index. Derivatives of SF-36 such as the SF-6D do not suffer from this deficiency. The EQ-5D is designed to produce a single index, its five dimensions have been calibrated in terms of social preference weights of a UK population, and it is probably the primary candidate measure.

Timing. The timing of before and after health status measurement may depend on the type of activity (emergency or elective) and on diagnostic category or intervention type, since different treatments may have an effect over shorter or longer periods. Our analysis of data for two elective procedures (hip and knee replacement) shows most treatment gain after six months, but clinical advice could be used to determine the appropriate period for follow-up post treatment for other conditions.

Grouping of NHS activities. Given the enormous range of NHS activities it is necessary to group them for data analysis. The main grouping of secondary care activities is at present by HRG, which attempts to group activities by their costs. But a given HRG may contain a large number of procedures which have very different effects on health. The availability of patient-level health outcomes data by ICD will permit matching to other datasets (such as HES) and will make it possible to explore the extent to which health outcomes are related to other routinely collected patient characteristics, such as age, gender, diagnoses, procedure, survival rates and mortality.
This is essential if we are to develop disease specific studies of improvement in health care.

Frequency of data collection. If the pace of technological change in medicine was slow enough it could be argued that collecting data on health outcomes was an exercise that needed to be undertaken only at intervals of several years. But technological change in medicine and pharmaceuticals is rapid, and the NHS is subject to frequent organisational change which may affect the mix of patients receiving particular treatments and the speed at which new technology spreads. We believe that only a continuous sampling of the NHS patient population will be adequate to capture trends in the impact of NHS services on patients.

10.1.2 Feasibility

Outside clinical trials, experience of routine collection of health status data in the UK is patchy. Individual clinicians and clinical teams make use of a variety of standardised measures, but this is largely uncoordinated, its coverage remains undocumented and aggregation of such data is problematic given the use of different instruments. Although there are a limited number of examples of prospective health data collection, these examples demonstrate that such data collection would be feasible in the NHS.

The survey of acute inpatients conducted by Picker International showed that it was possible to collect EQ-5D data from a sample of patients recently discharged from all NHS Trust hospitals. The value of these data would have been enhanced if they had been linked with basic HES variables, such as diagnosis or procedure. Given current DH policy to regularly survey patient experience, inclusion of questions on health outcomes would involve very low marginal cost.

The Health Outcomes Data Repository (HODaR) operates a continuous survey of all inpatients and outpatients at a single large Welsh Trust. These are now linked to individual level primary and community care data.
Data for more than 30,000 patients have been collected, almost 10% of these having completed EQ-5D on more than one occasion. However, the data are predominantly based on post-discharge observations and this limits their value in measuring health outcomes. Since the advent of this project the HODaR survey has started to collect data on pre-admission health status.

As described in Appendix C, for a number of years BUPA has been routinely administering health status questionnaires to patients before and three months after treatment, with some 100,000 having now been surveyed. These data are restricted primarily to elective procedures. BUPA plan to extend them to four types of cancer.

10.1.3 Cost

The incremental costs of introducing systematic observation of health status via existing information systems are difficult to estimate. Currently BUPA estimates that it costs around £4 per patient to administer their manual system of health status measurement based on SF-36. The introduction of systematic health status measurement might be achieved under the aegis of the National Programme for Information Technology announced in December 2002 with a budget of £2.3 billion and which the Audit Commission suggested would provide the Department of Health with the opportunity to improve NHS data quality. It would seem sensible to consider an extension to the current HES-based data to provide maximum scope for exploitation through record linkage. Modification of this sort ought not to incur a significant cost. However, the data captured from patients will require additional organisational and administrative costs. Patient-centred reporting systems using traditional paper and pencil techniques require costly processing in order to link them to other NHS data. Computer-assisted interview methods have scope for more efficient data acquisition and transmission but would need more costly administration.
The use of handheld PDA recording systems is now becoming a feature of many clinical trials that record patient-reported health status and it can be expected that hardware costs will continue to fall.

10.2 Other outcome measures: patient satisfaction

As we emphasised in our First Interim Report the effect of the NHS on health status is obviously an extremely important dimension of NHS outcomes but other dimensions, especially process related outcomes, should also be taken into account. However, existing patient surveys rarely ask similar questions over time. The DH should decide on core aspects of the patient experience and ensure these questions appear each year in patient satisfaction surveys.

A problem we discuss in this report is that of identifying a relevant weight to place on patient experience in a quality adjusted output index. Ryan (2004) recommended that the DH undertake a series of discrete choice experiments to obtain evidence on the relative value patients attach to different aspects of process quality. We endorse this recommendation.

10.3 General practice data

10.3.1 GP activity

We have discussed the measures of GP consultations in section 4.4, noting the unsatisfactory nature of the current source (General Household Survey) and that the DH is now planning to obtain consultations directly from GP record systems via the QRESEARCH database. We have agreed to investigate these new data for the DH to determine what if any adjustments need to be made to improve their reliability.

Ideally NHS productivity measures should be based on numbers of patient journeys of different types, where journeys are likely to involve both primary and secondary care. In the absence of routine record linkage such measures are not currently feasible but it would still be worthwhile getting a finer breakdown of GP consultations to allow for the changing mix of providers and for the changing mix of types of consultations.
We have also discussed measures of general practice quality in section 4.13. Whilst QRESEARCH and similar databases of GP record systems and the central collection of the greatly enlarged set of quality indicators linked to GP pay will provide potentially useful quality adjustment, these will need to be based on careful empirical modelling of GPs' responses to the new financial incentives, especially possible diversion of effort from unremunerated quality efforts.

10.3.2 GP cost weights

The PSSRU estimates the unit costs of GP and nurse consultations (http://www.pssru.ac.uk/pdf/uc2003/uc2003.pdf) using a variety of official and unofficial sources. Several of the estimates rest on self reported GP activity from the 1992/3 GP Workload Survey undertaken for the DDRB. There does not appear to be a more recent survey of GP activity and we recommend that DH should consider undertaking such a survey at regular intervals.

10.3.3 General practice staff

Prior to April 2004, practice staff such as practice nurses, although practice employees, were partly paid for by the NHS and so a record was kept, though it was not a reliable source because not all practices claimed these subsidies. Under the new GP contract the subsidies have been abolished and there is no record of non-GP staff in practices. Now practices instead receive a sum of money based on their practice. We recommend that the DH make it a condition of the practice contract that a full return of employed staff is made.

10.3.4 Prescribing

The prescription activity measure in the recently revised NHS outputs index is derived from PPA data. The PPA data are collected in order to remunerate pharmacists (and dispensing GPs). It is therefore a comprehensive measure of prescriptions dispensed and can be disaggregated to product type if required. The data are reliable, comprehensive and readily available at national levels of aggregation.
They have been used to construct a number of indicators of practice prescribing quality as well as quantity. The usefulness of the data could be greatly improved with relatively simple changes. The most obvious is to improve the patient information on the prescription form. At the moment the only patient data on the form indicate whether the patient is entitled to free prescriptions and on what grounds. The information has been used by the Prescribing Support Unit to produce the Low Income Scheme Index (LISI), which measures the proportion of prescriptions which are dispensed without charge on grounds of low income. The LISI is the only variable measuring practice population socioeconomic status which relates directly to practice patients rather than being attributed from Census or Social Security data on the basis of patient postcode. Adding a field for diagnosis to the prescription form would greatly enhance the usefulness of routine prescribing data as a measure of prescribing quality. Adding gender and age fields would also improve the socioeconomic data and improve prescribing quality indicators. We recommend that the DH should add these fields to the prescription form.

10.4 Other primary care data

NHS Direct, NHS Direct Online and Walk-In Centres are recent innovations in the provision of first contact advice and information. They are likely to reduce the costs to patients of such first contacts, leading both to an increase in primary care activities and to a change in the mix of activities in general practice. The organisations are expected to play an increasing role in the NHS over the coming years and it is important that their presence is recognised in measures of NHS output and productivity. Aggregate data on use of NHS Direct and NHS Direct Online are available.
In order to measure the outputs of the services more accurately it would be helpful to have data on
• the breakdown of enquiries between the provision of health advice and information about the health service
• the type of conditions people seek health advice about
• actions that are recommended as a result of the request

It is possible that such data have been collected, for example via the website service for those who seek advice from a nurse, which involves self-completion of a detailed questionnaire on the nature of the symptoms and condition, as well as personal information. Presumably – although we have not been able to ascertain whether this is the case – enquiries that result in a self-care recommendation are logged also. The telephone service seems to be set up in a similar fashion, the difference being that the information is recorded by NHS Direct staff. It appears, therefore, that these two organisations routinely collect (or, at least, have the capacity to collect) detailed electronic information from every person making an enquiry about their (or their family member’s) health condition. We recommend that the DH should utilise more of the data collected by NHS Direct to improve measures of output.

10.5 Inputs

It is important to have a comprehensive coverage of inputs used in producing health services in order to explain changes in outputs and to measure productivity. Thus we require values of expenditures on inputs, volume measures and price deflators to convert values to volume measures when the latter are not available. We also require data on the extent to which the quality of inputs is changing through time. When volume measures are available, under the assumption that payments to labour equal marginal products, weighting diverse inputs by their shares in wage bills can be employed to adjust for quality change. Alternatively hedonic regressions may be employed to quality adjust price deflators.
Both methods depend on competitive market assumptions which are unlikely to characterise many of the markets in which the NHS operates.

The analysis of labour input showed that it is possible to derive reasonable volume measures and adjustments for quality change by linking readily available data sources. The quality adjustment employed was certified qualifications with an additional adjustment for job related training. While certified qualifications are useful they may not be sufficiently detailed to capture differences in the productive capacity of some employees. Many professionals within the NHS have similar qualifications since there are minimum requirements set by professional bodies. While the NHS census does include very detailed data on numbers of professionals by grade, there are no comparable data on earnings. The persons responsible for the Earnings survey within DH were unwilling to attempt to match earnings to the Census numbers by type on the valid grounds that the sample size was too small. A larger sample survey of wage rates and earnings within the NHS would be very useful, not only for the productivity calculations carried out in this report but also to calculate the extent to which increases in payments to labour are due to employing more highly skilled personnel as against mere wage inflation.

The productivity calculations in section 9 above highlight the importance of intermediate inputs in overall NHS activity and to a lesser extent in hospital activity. Reliable volume measures require reasonable estimates of intermediate input deflators. The main problem with intermediate inputs is the price deflator employed for drugs. As mentioned in section 3.3 above, ONS is currently attempting to refine its deflator for prescription drugs based on detailed data on prices. Data sources for prices of hospital drugs are not readily available. Even if such data could be collected, it is doubtful if they would be useful as a tool for quality adjustments.
Hedonic type adjustments are only valid in competitive markets. The market for hospital drugs in Britain is best characterised as a bilateral monopoly with a monopsony purchaser buying from powerful oligopoly drug producers. The standard textbook model of bilateral monopoly shows that the price will depend on the relative bargaining power of the purchasers and suppliers. It is well known that the NHS buys drugs at a discount and it may well be the case that discounting on new drugs may swamp any quality change, at least at the point of entry. The use of prescription drugs is in the control of independent GPs, so the monopsony element is less important, but there remain market imperfections on the producer side. These remarks suggest that using drugs prices to measure quality requires an understanding of how markets operate and should be based on a time profile of prices rather than comparisons between old and new drugs at the time new drugs enter the market. A more fruitful but also costly approach might be to examine patient outcomes and drug use from disease registers or to obtain opinions from panels of experts. This is simply another example of the need for outcomes data if we are to measure technical change and productivity in health care.

In order to quality adjust capital input it would be useful to have separate investment data on types of equipment that have seen rapid technological change and change in unit cost (e.g. MRI scanners). Alternatively it would be useful to have the value of the stock of these assets, numbers of items and age profiles of the stock. One problem that must be addressed is the gap in data created by PFI confidentiality. We understand that a significant proportion of new investment in equipment such as scanners is being undertaken under PFI contracts.
Unless it is possible to access information on stocks and value, it may not be possible to deal adequately with questions of productivity growth and technical change associated with investment in new equipment. It would also be useful to have information on investment by GPs.

11 Conclusions and recommendations

11.1 Methods

11.1.1 The preferred approach

Economic theory suggests that the preferred way of measuring NHS output is with a value weighted output index.
• The unit of output is the patient treated, the characteristics of output valued by individuals indicate quality and the weight attached to each characteristic reflects the marginal social value of the characteristic.
• The index overcomes the serious problem of a cost weighted index where movement to more cost-effective ways of treating patients appears as a reduction in output.
• Data necessary to estimate this index are not currently available but are feasible to collect.
• A condition specific value weighted index can be constructed as data on major diseases become available.

Not only is the value weighted index theoretically correct, it would allow measurement of improvements in delivery of services intended to raise both productivity and patient satisfaction.

11.1.2 Methods using existing data

It is not possible to calculate a value weighted index with current data. It is possible to quality adjust the hospital component of a cost weighted NHS output index using existing data combined with some assumptions. We have
• spelt out the methods for quality adjustment with existing data in some detail, taking care to emphasise the necessary assumptions and their implications, rather than merely presenting plausible ad hoc adjustments which may in fact contain dubious assumptions or value judgements.
• shown how it is possible to use routine data on short term survival and waiting times, coupled possibly with an explicit assumption about the proportionate effect of treatment on health, to calculate quality adjusted cost weighted output indices.
• presented experimental calculations of these indices to compare their effects on a simple cost weighted output index and to investigate the empirical implications of the assumptions about important parameters which are required.
• used data on the health effects of a limited set of treatments to illustrate the construction of a value weighted index for the set and to shed further light on the implications of making possibly inaccurate assumptions about the health effects when constructing an index for all hospital treatments.
• described how it is possible in principle to use data on other aspects of care (readmissions, MRSA, patient satisfaction with food, cleanliness and nonclinical care) to provide an additional quality adjustment and have produced some illustrative examples of such adjustments based on the current unsatisfactory data on these characteristics of care.
• described a method of quality adjustment using the information on longer term survival which will become available in the near future.

11.2 Results

11.2.1 Results for the hospital sector

Results are reported for a cost weighted quality adjusted output index over the period 1999/00 – 2003/04. It was only possible to quality adjust for hospital activity, 47% of expenditure covered in the DH cost weighted output index. In Section 5 we examine sensitivity to discount rates and key assumptions. Table 11.1 summarises the central results.

Table 11.1 Hospital sector cost weighted output index with recommended quality adjustments (growth rates, %)

Period             No adjustment   Survival and health       With survival, health effects
                                   effects adjustment [1]    and waiting time adjustments [2]
1998/99-1999/00    2.03            0.91                      1.28
1999/00-2000/01    1.54            1.99                      1.80
2000/01-2001/02    4.48            4.40                      4.31
2001/02-2002/03    3.94            5.19                      5.06
2002/03-2003/04    4.78            5.51                      5.56
Average p.a.       3.35            3.60                      3.60

[1] This sets k = q0/q* = 0.8 for electives, k = 0.4 for non-electives, k = actual k for those HRGs included in the specimen index where this is known, provided a – k > 0.10 and k = 0 otherwise.
[2] Recommended quality variant 2. As note 1 plus discounts to date of treatment with charge for wait, with discount rates on waits and life expectancy equal to 1.5% and uses 80th percentile waits.

Table 11.2 shows the impact of incorporating quality adjustments for the hospital sector into the overall NHS cost weighted output index.

Table 11.2 Aggregate NHS cost weighted output index with hospital sector quality adjustments (growth rates, %)

Period             Unadjusted CWOI   CWOI with hospital survival and
                                     waiting time adjustments [1]
1998/99-1999/00    2.61              2.22
1999/00-2000/01    2.11              2.26
2000/01-2001/02    3.85              3.74
2001/02-2002/03    5.07              5.78
2002/03-2003/04    4.43              4.93
Average p.a.       3.62              3.79

[1] Recommended quality variant 2. This sets k = q0/q* = 0.8 for electives, k = 0.4 for non-electives, k = actual k for those HRGs included in the specimen index where this is known, provided a – k > 0.10 and k = 0 otherwise; discounts to date of treatment with charge for wait, with discount rates on waits and life expectancy equal to 1.5% and uses 80th percentile waits.

Overall, our results show that quality adjustment with existing data can make an impact on measures of NHS output and, as more routinely collected data become available, the quality adjustment can be improved.

11.2.2 Other quality indicators

We were asked to explore the use of indicators such as patient satisfaction, readmission rates, clinical errors and incidence of MRSA. The data are not at present suitable for inclusion in an output index but some “illustrative” adjustments are reported in Table 11.3.
Table 11.3 Illustrative calculations of hospital CWOI with adjustments for survival, waiting times, patient satisfaction as measured in patient surveys, readmissions and MRSA. Average annual growth rates 2001/2 to 2003/4

Index                                                             Average growth rate
                                                                  2001/2 to 2003/4 (% p.a.)
Unadjusted cost weighted output index                             4.34
With adjustment for survival and waiting times [1]                5.74
With adjustment for satisfaction with food, cleanliness,
  and non clinical care [2]                                       5.71
With adjustment for MRSA, readmissions, satisfaction with food,
  cleanliness, and non clinical care [2]                          5.71

[1] Variant 1 (k = 0.8, mortality cut off 0.15, discounting to date of treatment, discount rate on waits 1.5%, on life expectancy 1.5%, 80th percentile waiting time measure).
[2] 5% weight on non-clinical care.

11.3 Total factor productivity growth

Provisional estimates of productivity growth, employing a new quality adjusted index of labour input in the hospital sector, are reported. Key results appear in Table 11.4. They highlight the importance of quality adjustments to output in evaluating NHS performance.

Table 11.4 Total factor productivity growth (%)

                   Total NHS                             Hospital
Period             Unadjusted   Recommended quality      Unadjusted   Recommended quality
                                variant                               variant
1998/99-1999/00    -2.33        -2.71                    -2.82        -3.53
1999/00-2000/01     0.55         0.69                     0.30         0.56
2000/01-2001/02    -2.12        -2.22                     0.17         0.01
2001/02-2002/03    -1.86        -1.19                    -2.01        -0.95
2002/03-2003/04    -2.97        -2.51                    -1.13        -0.39
Average p.a.       -1.75        -1.59                    -1.11        -0.87

11.4 Recommendations

Recommendations for improving quality adjustment were made throughout the report. We summarise the main ones here.

For the medium term improvement of the output index, improvements to the data are required. We recommend:
• Routine collection of outcomes data for a range of NHS treatments. The programme should start with a few high volume elective and medical conditions that would permit sampling rather than complete coverage.
The data would also be immensely useful for other purposes including monitoring of Trust performance and improved cost-effectiveness analysis of particular treatments.
• Collection of longer term survival data by linkage of HES and ONS records to produce estimates of patient life expectancy.
• A patient identifier that will permit grouping NHS activities across institutions and by disease. The DH has plans to implement this change.
• Stated preference studies of patients to establish their relative valuations of the characteristics of NHS output, from waiting times to being treated with courtesy and dignity by staff. The studies should also include a cost characteristic so that monetary valuations can be inferred and all characteristics can be valued in a common unit. The studies will enable the data from patient satisfaction studies to be utilised for quality adjustment as well as informing decision making in the NHS.

For the short run improvement of the output index with available data:
• We recommend the use of short term survival coupled with life expectancy to quality adjust hospital output.
• The short term survival adjustment will underestimate output growth. We recommend that it be coupled with an estimated health effect derived from an estimate of the proportionate effect of treatment:
  o As the data become available from surveys of patient health before and after treatment and elsewhere, treatment specific estimates of $k_j = q^o_{jt} / q^*_{jt}$ should be used.
  o Where there are no treatment specific estimates, $k_j$ should be estimated as the volume-weighted mean of existing treatment specific estimates for the relevant class (electives and non-electives).
  o In the absence of any estimates of treatment specific k for non-electives, the estimate for non-electives should be equal to half the volume-weighted mean k of the electives.
  o The health effects adjustment should be used only for treatments with a mortality rate of 0.10 or less.
• We recommend the use of a waiting time adjustment based on discounting to date of treatment, with a charge for waiting. Theoretical considerations suggest that the discount rate on waits should be the same as the discount rate on QALYs. We suggest 1.5%.
• We do not recommend quality adjustments based on patient satisfaction with food, cleanliness, and non-clinical care until there are data on the relative marginal values of these outcomes. If it is felt that estimates of the costs of cleaning and food derived from Trust accounts reasonably reflect marginal social values then it would be possible to include an adjustment just for these satisfaction indicators.
• We do not recommend quality adjustments based on readmission rates and MRSA because of data problems and because they may reflect aspects of care which are better captured in the other quality adjustments.
• We recommend the use of 30 day mortality, rather than in hospital mortality, as the measure of short term survival, since we believe its greater theoretical merits outweigh the difficulties in calculating it. As data linkage methods are improved the advantage of the 30 day mortality will increase.
• The waiting time measure should be a certainty equivalent wait, to avoid the need to calculate adjustments on individual data. The 80th percentile wait seems a reasonable value.
• Quality adjustments of hospital output should use CIPS rather than FCEs as the unit of output.
• HES rather than the Reference Cost data base should be the source of data on hospital outputs.

11.5 Acknowledgements

The outputs and productivity project is funded by the Department of Health. The views expressed here are not necessarily those of the Department of Health.
This report has benefited greatly from discussion with numerous experts, including the members of the Steering Group Committee, Jack Triplett (consultant to the project team), Sir Tony Atkinson, Barbara Fraumeni, Andrew Jackson, Azim Lakhani, Phillip Lee, Alan Maynard, Alistair McGuire and Alan Williams. A number of individuals and organisations helped to assemble the data used in this report. We are grateful to Kate Byram, Mike Fleming, Geoff Hardman, James Hemingway, Sue Hennessy, Sue Macran, Paula Monteith, Casey Quinn, Sarah Scobie, Bryn Shorney, Craig Spence, Karen Wagner, Chris Watson, BUPA, the Cardiff Research Consortium, Health Outcomes Group and York Hospitals NHS Trust. Any errors and omissions remain the sole responsibility of the authors.

References

Atkinson, T. (2005) ‘Measurement of government output and productivity for the national accounts’, Atkinson Review: Final Report, HMSO, 31 January 2005.
Baily, M. and Garber, A. (1997). ‘Health care productivity’, Brookings Papers on Economic Activity. Microeconomics, 143-215.
Baker, D., Einstadter, D., Thomas, C., Husak, S., Gordon, N., Cebul, R. (2002). ‘Mortality trends during a program that publicly reported hospital performance’, Medical Care, 40 (10): 879-890.
Basu, S., Fernald, J., Oulton, N. and Srinivasan, S. (2003) ‘The case of the missing productivity, or, does information technology explain why productivity accelerated in the United States but not in the United Kingdom’, NBER working paper no. 10010.
Berndt, E.R., Busch, S.H. and Frank, R.G. (2001) ‘Treatment price indexes for acute phase major depression’, in D.M. Cutler and E.R. Berndt, eds.: Medical Care Output and Productivity, University of Chicago Press, Chicago.
Berndt, E.R., Bir, A., Busch, S.H., Frank, R.G. and Normand, S.T. (2002) ‘The medical treatment of depression, 1991-1996: productive inefficiency, expected outcome variations, and price indexes’, Journal of Health Economics, 21: 373-396.
Bruce, J., Russell, E.M., Mollison, J. and Krukowski, Z.H. (2001) ‘The measurement and monitoring of surgical adverse events’, Health Technology Assessment, 5 (22). Available at: http://www.hta.nhsweb.nhs.uk/fullmono/mon522.pdf
Cairns, J.A. (1994) ‘Valuing future benefits’, Health Economics, 3: 221-229.
Cairns, J.A. and van der Pol, M.M. (1997) ‘Constant and decreasing timing aversion for saving lives’, Social Science and Medicine, 45 (11): 1653-1659.
Carthy, T., Chilton, S., Covey, J., Hopkins, L., Jones-Lee, M., Loomes, G., Pidgeon, N. and Spencer, A. (1999) ‘The contingent valuation of safety and the safety of contingent valuation, part 2: The CV/SG ‘chained’ approach’, Journal of Risk and Uncertainty, 17: 187-213.
CDRweekly (2002) The first year of the Department of Health’s mandatory MRSA bacteraemia surveillance scheme in acute NHS Trusts in England: April 2001 – March 2002, Health Protection Agency, Vol. 12, nr. 25, 20 June 2002.
CDRweekly (2003) The second year of the Department of Health’s mandatory MRSA bacteraemia surveillance scheme in acute NHS Trusts in England: April 2002 – March 2003, Health Protection Agency, Vol. 13, nr. 25, 19 June 2003.
CDRweekly (2004) The third year of regional and national analysis of the Department of Health’s mandatory MRSA bacteraemia surveillance scheme in acute NHS Trusts in England: April 2001 – March 2004, Health Protection Agency, Vol. 14, nr. 29, 15 July 2004.
Commission for Health Improvement (2003) NHS performance ratings acute trusts, specialist trusts, ambulance trusts 2002/2003, Commission for Health Improvement: London. http://www.chi.nhs.uk/ratings/
Crowcroft, N. S. and M. Catchpole (2002) ‘Mortality from methicillin resistant Staphylococcus aureus in England and Wales: analysis of death certificates’, British Medical Journal, 325: 1390-1391.
Cutler, D.M. and Huckman, R.S.
(2003) 'Technological development and medical productivity: the diffusion of angioplasty in New York state', Journal of Health Economics, 22: 187-217.
Cutler, D.M., McClellan, M., Newhouse, J.P. and Remler, D. (2001) 'Pricing Heart Attack Treatments', in D.M. Cutler and E.R. Berndt, eds.: Medical Care Output and Productivity, University of Chicago Press, Chicago.
Dawson, D., Gravelle, H., Kind, P., O’Mahony, M., Street, A. and Weale, M. (2004a) ‘Developing new approaches to measuring NHS outputs and productivity’, First Interim Report to Department of Health, CHE Technical Paper 31, July 2004. Available at: http://www.york.ac.uk/inst/che/tech.htm
Dawson, D., Gravelle, H., Kind, P., O’Mahony, M., Street, A. and Weale, M. (2004b) ‘Measurement of NHS outputs and productivity growth: memorandum to Department of Health on data requirements’, 30 September 2004.
Dawson, D., Gravelle, H., Kind, P., O’Mahony, M., Street, A. and Weale, M. (2004c) ‘Developing new approaches to measuring NHS outputs and productivity: Data for productivity estimates’, Second Interim Report to Department of Health, November 2004.
Deaton, A. and Muellbauer, J. (1980) Economics and Consumer Behaviour, Cambridge University Press, Cambridge.
Department of Health (1996) ‘Policy Appraisal and Health: a guide from the Department of Health’, London GO70/38 3901.
Department of Health (2001) NHS Performance Ratings: Acute Trusts 2000/01, Department of Health: London. http://www.doh.gov.uk/performanceratings/2001/index.html
Department of Health (2002a) ‘Reforming NHS Financial Flows: introducing payment by results’, Department of Health: London.
Department of Health (2002b) ‘Getting ahead of the Curve: a strategy for combating infectious diseases (including other aspects of health protection)’, Department of Health: London.
Available at: http://www.dh.gov.uk/assetRoot/04/06/08/75/04060875.pdf
Department of Health (2002c) NHS Performance Ratings and Indicators: Acute Trusts, Specialist Trusts, Ambulance Trusts, Mental Health Trusts 2001/02, Department of Health: London. http://www.doh.gov.uk/performanceratings/2002/index.html
Department of Health (2003a) Investing in General Practice, the new General Medical Services Contract, Department of Health: London. http://www.dh.gov.uk/assetRoot/04/07/86/58/04078658.pdf
Department of Health (2003b) ‘Surveillance of Healthcare Associated Infections’, Publication and Statistics, Letters and Circulars, Department of Health: London. Available at: http://www.dh.gov.uk/assetRoot/04/01/34/10/04013410.pdf
Department of Health (2004a), ‘The ‘Experimental’ NHS Cost Efficiency Growth measure’, Department of Health: London.
Department of Health (2004b), Departmental Report, Department of Health: London.
Department of Health (2004c) ‘Chief Executive's Report to the NHS’, Department of Health: London.
Department of Health (2004d), Bloodborne MRSA infection rates to be halved by 2008 – Reid, Publication and Statistics, Press Release, Department of Health: London. Available at: http://www.dh.gov.uk/PublicationsAndStatistics/PressReleases/PressReleasesNotices/fs/en?CONTENT_ID=4093533&chk=MY%2BkD/
Department of Health (2004e), Infection control training for over one million NHS staff, Publication and Statistics, Press Release, Department of Health: London. Available at: http://www.dh.gov.uk/PublicationsAndStatistics/PressReleases/PressReleasesNotices/fs/en?CONTENT_ID=4093432&chk=7bcARL
Department of Health (2004f), MRSA surveillance system – results, Health Protection Agency Communicable Disease Surveillance Centre for the Department of Health, Department of Health: London.
Available at: http://www.dh.gov.uk/PublicationsAndStatistics/Publications/PublicationsStatistics/PublicationsStatisticsArticle/fs/en?CONTENT_ID=4085951&chk=HBt2QD
Department of Health (2005) CMO Update, ‘New reporting system to improve patient safety’, March.
Devlin, N. and Parkin, D. (2004) ‘Does NICE have a cost-effectiveness threshold and what other factors influence its decisions? A binary choice analysis’, Health Economics, 13:437-452.
EuroQol Group (1990) ‘EuroQol – a new facility for the measurement of health-related quality of life’, Health Policy, 16: 199-208.
Eurostat (2001) Handbook on price and volume measures in national accounts. Luxembourg, Office for Official Publications of the European Communities.
Giuffrida A, Gravelle H, Roland M. (1999) ‘Measuring Quality of Care with Routine Data: Avoiding Confusion between Performance Indicators and Health Outcomes’, British Medical Journal, 319:94-98.
Gravelle, H. and Smith, D. (2001) ‘Discounting for Health Effects in Cost-Benefit and Cost-Effectiveness Analysis’, Health Economics, 10:587-599.
Healthcare Commission (2004) 2004 performance ratings, Healthcare Commission: London. http://ratings2004.healthcarecommission.org.uk/
Healthcare Commission (2005) 2005 performance ratings, Healthcare Commission: London. http://ratings2005.healthcarecommission.org.uk/
Health Protection Agency (2003), NINSS Partnership, Surveillance of Surgical Site Infection in English hospitals 1997 – 2002. Available at: http://www.hpa.org.uk/infections/topics_az/hai/SSIreport.pdf
Health Protection Agency and Office for National Statistics (2004), Trends in MRSA in England and Wales: analysis of morbidity and mortality data for 1993 – 2002, Health Statistics Quarterly 21 – Spring 2004. Available at: http://www.statistics.gov.uk/STATBASE/Product.asp?vlnk=6725
Health Protection Agency (2004), Protocol for Surveillance of Surgical Site Infection – Surgical Site infection Surveillance Service England.
http://www.hpa.org.uk/infections/topics_az/hai/SSI%20Protocol.pdf
Hemingway, J. (2004) ‘Sources and methods for public service productivity: health’, Economic Trends, 613: 82-90.
Hicks, J.R. (1940) 'The valuation of social income', Economica, 7: 105-124.
HM Treasury (2003) ‘The Green Book: Appraisal and Evaluation in Central Government’, London: TSO.
Hofer, T.P. and Hayward, R.A. (1995) ‘Can early re-admission rates accurately detect poor-quality hospitals?’, Medical Care, 33(3): 234-45.
Hurst, J. and Siciliani, L. (2003) ‘Tackling Excessive Waiting Times for Elective Surgery: A Comparison of Policies in Twelve OECD Countries’, OECD Health Working Paper 6, OECD, Paris. Available at: www.oecd.org/health
Jackson G, Tobias M. (2001) ‘Potentially avoidable hospitalizations in New Zealand, 1989-98’, Australian and New Zealand Journal of Public Health, 25(3):212-21.
Jorgenson D, Ho. M and Stiroh, K. (2005) ‘Labour Input and the Returns to Education’, Chapter 6 of Information Technology and the American Growth Resurgence, MIT Press.
Lakhani, A., Coles, J., Eayres, D., Spence, C. & Rachet, B. (2005) ‘Creative use of existing clinical and health outcomes data to assess NHS performance in England: Part 1-performance indicators closely linked to clinical care’, British Medical Journal, 330:1426-1431.
Lancaster, K. (1971) Consumer Demand: A New Approach, Columbia University Press, New York.
Lee, P. (2004) ‘Public service productivity: health’, Economic Trends, 613: 38-59.
Ludke, R.L., Booth, B.M. and Lewis-Beck, J.A. (1993) ‘Relationship between early readmission and hospital quality of care indicators’, Inquiry, 30: 95-103.
Mai, N.
(2004) ‘Measuring health care output in the UK: a diagnosis based approach.’
National Patient Safety Agency (2005) Building a memory: preventing harm, reducing risks and improving patient safety, July, on line resource available at http://www.npsa.nhs.uk/site/media/documents/1246_PSO_Report_FINAL.pdf
National Patient Safety Agency (2005) National Patient Safety Agency: helping to make the NHS safer, July 2005, on line resource available at: http://www.npsa.nhs.uk/site/media/documents/1260_NPSA_Information.pdf
OECD (2000) A System of Health Accounts. Version 1.0. OECD, Paris.
OECD (2001) Measuring Productivity: Measurement of Aggregate and Industry Level Productivity Growth, OECD, Paris. Available at: http://www.oecd.org/dataoecd/59/29/2352458.pdf
Pratt, J. (1964) ‘Risk aversion in the small and the large’, Econometrica, 32: 122-136.
Roland, M. (2004) ‘Linking physician pay to quality of care – a major experiment in the UK’, New England Journal of Medicine. 351, 1448-1454.
Rosen, S. (2002) ‘Markets and diversity’, American Economic Review, 92 (1): 1-15.
Ryan, M., Odejar, M. and Napper, M. (2004) ‘The Value of Reducing Waiting Time in the Provision of Health Care: A Review of the Evidence.’ Report to the Department of Health, Health Economics Research Unit, Aberdeen.
Sefton, J. and Weale, M. (to appear) ‘The concept of income in a general equilibrium’, Review of Economic Studies.
Shapiro, I., Shapiro, M.D. and Wilcox, D.W. (2001) 'Measuring the Value of Cataract Surgery', in D.M. Cutler and E.R. Berndt, eds.: Medical Care Output and Productivity, University of Chicago Press, Chicago.
Simkins, A. (2005) ‘Using the GP contract Quality and Outcomes Framework for quality adjustment’, mimeo, August 2005.
Street, A. and AbdulHassain, S. (2004), ‘Would Roman soldiers fight for the financial flows regime? The re-issue of Diocletian’s edict in the NHS’, Public Money and Management, 24 (5): 301-38.
Thomas, J.W.
(1996) ‘Does risk-adjusted readmission rate provide valid information on hospital quality?’ Inquiry 1996; 28: 258-70.
Victorian Government (2004), The Victorian Ambulatory Care Sensitive Conditions Study, 2001–02, Public Health, Rural and Regional Health and Aged Care Services, Victoria Government Department of Human Services Melbourne Victoria, July 2004.
Williams, A. (1985) 'The economics of coronary artery bypass grafting', British Medical Journal, 291: 326-9.
Yule, G.U. (1934) 'On some points relating to vital statistics, more especially statistics of occupational mortality', Journal of the Royal Statistical Society, 97: 1-84.

Annex: How should NHS output be measured?

Value weighted output index

The value weighted output index is our preferred way to measure NHS output:

$$I^{xq}_{yt} = \frac{\sum_j x_{jt+1} \sum_k \pi_{kt}\, q_{kjt+1}}{\sum_j x_{jt} \sum_k \pi_{kt}\, q_{kjt}}$$

where $x_{jt}$ is the volume of output j in period t, $q_{kjt}$ is the amount of outcome or characteristic k produced by a unit of j, and $\pi_{kt}$ is the marginal value of outcome k. [11]

The index requires data on both the characteristics produced and on their marginal social value. Since improving the health of patients is a primary objective of the NHS, improved health outcomes are one of the most important characteristics of treatment. But other characteristics of treatment also affect utility, e.g. the length of time waited for treatment, the degree of uncertainty attached to the waiting time, relationship with doctors, hospital food and safety. These can be incorporated in the value weighted output index when the necessary data are available.

Cost weighted output index

Unit of output: continuous inpatient spells (CIPS).

If the data needed to calculate the value weighted output index are not available, we can instead use unit costs to weight outputs, and make use of available data to quality adjust these cost weighted outputs.
The cost weighted output index is:

$$I_{ct}^{x} = \frac{\sum_j x_{jt+1}\, c_{jt}}{\sum_j x_{jt}\, c_{jt}}$$

Information on survival can be used to adjust the cost weighted output index.

[Side note: 30 day mortality rates]

Data on short term survival can be used to adjust the index as follows:

$$\frac{\sum_j c_{jt}\, x_{jt+1} \left( \dfrac{a_{jt+1}}{a_{jt}} \right)}{\sum_j c_{jt}\, x_{jt}}$$

Information on long term survival (not currently available for most treatments) could be used to adjust the index as follows:

$$\frac{\sum_j x_{jt+1}\, c_{jt}\, \dfrac{a_{jt+1}}{a_{jt}} \sum_{s=1}^{S} \left( \dfrac{\sigma^*_{jt+1}(s)}{\sigma^*_{jt}(s)} \right) \left( \dfrac{\sigma^*_{jt}(s)\, \delta^s}{\sum_{s=1}^{S} \sigma^*_{jt}(s)\, \delta^s} \right)}{\sum_j x_{jt}\, c_{jt}}$$

[11] Please refer to the table of notation at the end of this report for further details of the notation used here.

The simple survival adjustment above implies that the patient would have zero quality adjusted life years if not treated. It is possible to introduce an additional term into the formula to include a uniform estimate of the difference between health before and after treatment, giving the health effect survival index:

$$\frac{\sum_j x_{jt+1} \left( \dfrac{a_{jt+1} - k_j}{a_{jt} - k_j} \right) c_{jt}}{\sum_j x_{jt}\, c_{jt}}$$

Note that for HRGs where the mortality rate is high, we make no health effect adjustment and use only the change in survival.

Information on changes in the life expectancy of patients treated can also be included as follows:

$$\frac{\sum_j x_{jt+1}\, c_{jt} \left( \dfrac{a_{jt+1} - k_j}{a_{jt} - k_j} \right) \left( \dfrac{1 - e^{-rL_{jt+1}}}{1 - e^{-rL_{jt}}} \right)}{\sum_j x_{jt}\, c_{jt}}$$

[Side note: Certainty equivalent wait – measured as the waiting time for patients at the 80th percentile of the waiting time]

Information on waiting times can be used to quality adjust the cost weighted output index.
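As a rough illustration of the cost weighted index and its survival and health effect adjustments defined above, the following Python sketch evaluates them for two hypothetical outputs. The function names and all figures are invented for illustration and are not taken from the report; setting every $k_j$ to zero reproduces the simple short term survival adjustment.

```python
def cost_weighted_index(x_t, x_t1, c_t):
    """Unadjusted index: sum_j x_{jt+1} c_{jt} / sum_j x_{jt} c_{jt}."""
    numerator = sum(x1 * c for x1, c in zip(x_t1, c_t))
    denominator = sum(x0 * c for x0, c in zip(x_t, c_t))
    return numerator / denominator


def survival_adjusted_index(x_t, x_t1, c_t, a_t, a_t1, k=None):
    """Scale each output by (a_{jt+1} - k_j) / (a_{jt} - k_j).

    With k omitted (all k_j = 0) this is the simple survival adjustment;
    with k supplied it is the health effect survival index.
    """
    if k is None:
        k = [0.0] * len(x_t)
    numerator = sum(
        x1 * c * (a1 - kj) / (a0 - kj)
        for x1, c, a0, a1, kj in zip(x_t1, c_t, a_t, a_t1, k)
    )
    denominator = sum(x0 * c for x0, c in zip(x_t, c_t))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical data for two outputs (not from the report).
    x_t = [100, 200]      # volumes in period t
    x_t1 = [110, 200]     # volumes in period t+1
    c_t = [10.0, 5.0]     # unit costs in period t
    a_t = [0.90, 0.80]    # survival rates in period t
    a_t1 = [0.99, 0.80]   # survival rates in period t+1
    k = [0.80, 0.0]       # assumed health effect parameters k_j

    print(cost_weighted_index(x_t, x_t1, c_t))
    print(survival_adjusted_index(x_t, x_t1, c_t, a_t, a_t1))
    print(survival_adjusted_index(x_t, x_t1, c_t, a_t, a_t1, k))
```

In this made-up example the first output's survival rate rises, so the survival adjusted index exceeds the unadjusted one, and a positive $k_j$ magnifies the effect because the same change in survival is measured against a smaller base.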
This approach regards reductions in the wait for treatment as valuable because of their effect on the discounted value of the health gain from treatment. There are two main forms of waiting time adjustment.

Discount to date of treatment with charge for waiting

$$\frac{\sum_j x_{jt+1}\, c_{jt} \left( \dfrac{a_{jt+1} - k_j}{a_{jt} - k_j} \right) \left[ \dfrac{\dfrac{1 - e^{-r_L L_{jt+1}}}{r_L} - \dfrac{e^{r_w w_{jt+1}} - 1}{r_w}}{\dfrac{1 - e^{-r_L L_{jt}}}{r_L} - \dfrac{e^{r_w w_{jt}} - 1}{r_w}} \right]}{\sum_j x_{jt}\, c_{jt}}$$

This is the form of the CWOI that we recommend should be used in the interim, where:
• $r_L = r_w = 0.015$,
• $k_j$ = actual $k_j$ if known; = mean k for known electives if $k_j$ not known and elective; = ½ mean k for electives if non-elective,
• if $(a_{jt+1} - k_j)$ and $(a_{jt} - k_j) < 0.10$ then $k_j = 0$,
• $w_{jt}$, $w_{jt+1}$ are 80th percentile waits.

Discount to date placed on list

$$\frac{\sum_j c_{jt}\, x_{jt+1} \left( \dfrac{a_{jt+1} - k_j}{a_{jt} - k_j} \right) \left( \dfrac{e^{-rw_{jt+1}} \left( 1 - e^{-rL_{jt+1}} \right)}{e^{-rw_{jt}} \left( 1 - e^{-rL_{jt}} \right)} \right)}{\sum_j c_{jt}\, x_{jt}} \qquad \text{(assumes } r_w = r_L = r \text{)}$$

Additional quality adjustments can also be made to the CWOI. A lack of appropriate data means that only illustrative adjustments for these additional aspects of quality could be calculated at present.

A quality adjustment for readmissions and MRSA can be incorporated based on the assumption that their cost is a deadweight loss which reduces the value of treatment. Hence the CWOI (ignoring other quality adjustments for illustrative purposes) can be adjusted as follows:

$$\frac{\sum_j x_{jt+1}\, c_{jt} - \sum_j x^b_{jt+1}\, c^b_{jt}}{\sum_j x_{jt}\, c_{jt} - \sum_j x^b_{jt}\, c^b_{jt}}$$

where $x^b$ denotes the number of readmissions or cases of MRSA and $c^b$ their costs.

Patient satisfaction can also be incorporated in the CWOI, using measures of patient experience, derived from patient satisfaction surveys, as summary indicators of characteristics that patients value.
This can be incorporated as follows:

$$I_t^{Comp} = \sum_{k=0}^{n} \omega_k I_t^k$$

where $I_t^{Comp}$ is calculated as the weighted sum of the growth rate of one of the quality adjusted output indices and the growth rates of the other indicators. We denote the growth rate of indicator k by $I_t^k$. If there are n such indicators, and the relevant weights are denoted by $\omega_k$, then the overall index is given by the formula above.

Table of Notation

Notation | Interpretation
$x_{jt}$ | quantity of output j at time t (units of j)
$\pi_{kt}$ | marginal social value of characteristic k at time t
$q_{jkt}$ | quantity of characteristic k produced by one unit of output j at time t
$p_{jt} = \sum_k \pi_{kt} q_{jkt}$ | marginal social value of unit of output j at time t (£s per unit of j)
$y_{jt} = p_{jt} x_{jt}$ | value of output j at time t (£s)
$y_t = \sum_j y_{jt}$ | total value of NHS output
$I_{yt}^{xq}$ | value weighted output index
$g^{x}_{jt}$ | growth rate of output $x_j$
$g^{q}_{kjt}$ | growth rate of characteristic k produced by output j
$\omega^{p_t}_{kt}$ | proportion of marginal value of output j accounted for by characteristic k
$\omega^{y_t}_{jt}$ | proportion of the total value of period t output accounted for by output j
$I_{ct}^{x}$ | cost weighted output index (CWOI)
$c_{jt}$ | unit (average) cost of output j at time t (£s per unit of j)
$m_{jt}$ | mortality rate from NHS output j in period t
$a_{jt}$ | survival rate ($1 - m_{jt}$)
$\theta$ | vector of mental and physical health characteristics
$h(\theta)$ | health level from having health state $\theta$
$\sigma^*_{jt}(s)$ | probability of surviving s periods given that the patient survived treatment j at date t
$p^*_{jt}(\theta, s)$ | probability of being in health state $\theta$ conditional on surviving s periods after treatment j at date t
$h^*_{jt}(s) = \sum_\theta p^*_{jt}(\theta, s)\, h(\theta)$ | expected level of health conditional on surviving s periods
$\delta$ | discount factor
$q^*_{jt} = \sum_s \delta^{s-t} \sigma^*_{jt}(s)\, h^*_{jt}(s)$ | discounted sum of quality adjusted life years produced by the treatment if the patient survives treatment
$h^0_j(s)$ | expected health s periods hence if the patient does not receive treatment j, conditional on surviving s periods
$\sigma^0_j(s)$ | probability of surviving without treatment
$p^0_j(\theta, s)$ | probability of being in health state $\theta$ after s periods conditional on surviving without receiving treatment
$q^0_{jt} = \sum_s \delta^s \sigma^0_j(s)\, h^0_j(s)$ | discounted sum of quality adjusted life years if patient not treated
$q_{jt} = (1 - m_{jt})\, q^*_{jt} - q^0_{jt}$ | expected increase in discounted QALYs from treatment j at time t
$g^{a}_{jt}$ | growth rate of survival
$g^{q*}_{jt}$ | growth rate in $q^*_{jt}$
$g^{q0}_{jt}$ | growth rate in $q^0_{jt}$
$I_{ct}^{xa}$ | survival adjusted cost weighted output index
$\pi^{w}_{jt}$ | value of a reduction of one day in waiting time for treatment j in year t
$\pi^{h}_{t}$ | value of health gain
$r_w, r_L$ | discount rates on the wait for treatment, QALYs
$w_{jt}$ | waiting time for treatment j in year t
$h^*(s; w_t)$ | time path of health with treatment after wait w
$L^0_t$ | life expectancy without treatment
$L^*_t$ | life expectancy with treatment
$I_{ct}^{xaw}$ | survival and waiting time adjusted cost weighted index

Developing new approaches to measuring NHS outputs and productivity
FINAL REPORT - APPENDICES
7 September 2005 (revised 7 November)
Diane Dawson∗, Hugh Gravelle**, Mary O’Mahony***, Andrew Street*, Martin Weale***
Adriana Castelli*, Rowena Jacobs*, Paul Kind*, Pete Loveridge***, Stephen Martin****, Philip Stevens***, Lucy Stokes***
∗ Centre for Health Economics, University of York.
** National Primary Care Research and Development Centre, Centre for Health Economics, University of York.
*** National Institute for Economic and Social Research.
**** Department of Economics and Related Studies, University of York.

Table of contents
Appendices
A. Data sources other than HES
1. Data on the patient experience
1.1 Data
1.2 Indicators
2. Data on clinical errors
3. Data on cleanliness and infections
3.1 Cleanliness
3.2 Infection control
3.2.1 Wound infections
3.2.2 Infection control
3.2.3 MRSA
4. Outpatient waiting times
5. Readmission rates
6. Life expectancy
B. Hospital Episode Statistics (HES)
1. HES regular attender episodes
2. HES: duplicate records
3. HES: episodes and spells
4. Identifying continuous inpatient (CIP) spells of NHS care
5. HRG assignment
6. Unit costs of spells
7. Quarterly output measures
8. Identifying spells that end in death
9. Mortality summary statistics
10. Missing Observations: Maintaining Sample Size
C. Measuring the health outcomes secured by treatment
1. Introduction
2. Defining health effects
3. Measurement instruments
4. Clinical trial data
5. Observational data
5.1 Health Outcomes Data Repository
5.1.1 Analysis
5.1.2 Conclusions
5.2 York District Hospital
5.2.1 Results
5.2.2 Conclusions
5.3 BUPA
5.3.1 Temporal trends in health gain
5.3.2 SF36 data used in the specimen index
D. Comparing marginal and average cost
1. Introduction
2. Methods
3. Data
4. Discussion
5. Results

Appendices
A. Data sources other than HES
1.
Data on the patient experience
1.1 Data
The following surveys were carried out in 2004:
Acute Trusts Inpatient Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs/en?CONTENT_ID=4004319&chk=qlIfTU
Acute Trusts: Young Patients Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs/en?CONTENT_ID=4004320&chk=BOfOB4
Primary Health Care Trust Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs/en?CONTENT_ID=4004323&chk=Z%2BCM6e
Mental Health Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs/en?CONTENT_ID=4004322&chk=yWjJbN
Ambulance Patients Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs/en?CONTENT_ID=4004321&chk=uZoE2N
The Acute Trusts Inpatient Survey compared results with an earlier survey in 2002, while the Primary Health Care Trust Survey compared results with a survey in 2003. The other surveys did not compare results with earlier data.
In 2004/5 two relevant surveys were completed:
Emergency Department Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs/en?CONTENT_ID=4011238&chk=0bcNSV
Outpatient Department Survey
http://www.healthcarecommission.org.uk/NationalFindings/Surveys/PatientSurveys/fs/en?CONTENT_ID=4011237&chk=bn4nNE
These both compare results with figures for 2003.
1.2 Indicators
Tables A1 to A4 show the change in the logarithm of the quantified indicator.

Table A1 Adult Inpatients 2004 compared to 2002
Question | % change 2004 over 2002, annual rate
In your opinion, how clean was the hospital room or ward that you were in? | -0.20%
How clean were the toilets and bathrooms that you used in hospital? | -1.07%
Average cleanliness | -0.63%
How would you rate the hospital food? | 0.64%
Average food | 0.64%
How organised was the care you received at A&E? | 1.95%
Following arrival at the hospital how long did you wait before admission to a room or ward and bed | 0.31%
When you were told you would be going into hospital, were you given enough notice of your date of admission? | 0.00%
Was your admission date changed by the hospital? | -0.93%
From the time that you arrived at the hospital, did you feel that you had to wait a long time to get to a bed on a ward? | 4.97%
When you had important questions to ask a doctor, did you get answers that you could understand? | 0%
Did doctors talk in front of you as if you weren't there? | 0.55%
When you had important questions to ask a nurse, did you always get answers that you could understand? | 0.96%
Did nurses talk in front of you as if you weren't there? | 0.00%
Sometimes in a hospital, a member of staff will say one thing and another will say something quite different. Did this happen to you? | 0.30%
If your family or someone else close to you wanted to talk to a doctor, did they have enough opportunity to do so? | 2.18%
Were you given enough privacy when discussing your condition or treatment? | -0.28%
Were you given enough privacy when being examined or treated? | -0.92%
Were your scheduled tests, x-rays, or scans performed on time? | 2.79%
Do you think the hospital staff did everything they could do to help control your pain? | 0.00%
Did a member of staff explain the purpose of the medicines you were to take at home in a way you could understand? | -0.88%
Did a member of staff tell you about medication side effects to watch for when you went home? | 1.05%
Did a member of staff tell you about any danger signals you should watch for after you went home? | -1.49%
Did the doctors or nurses give your family or someone close to you all the information they need to help you recover? | 0.00%
Average Non-medical Attention | 0.56%
Overall, did you feel that you were treated with respect and dignity while you were in the hospital? | 0.00%
Overall, how would you rate the care you received? | 0.99%
Average overall care | 0.49%

Table A2 Accident and Emergency 2004/5 compared to 2003
Question | % change 2004/5 over 2003, annual rate
In your opinion, how clean was the emergency department? | -0.84%
How clean were the toilets in the emergency department? | -1.20%
Average cleanliness | -1.02%
While you were in the emergency department, how much information about your condition or treatment was given to you? | 0.76%
Were you given enough privacy when discussing your condition and treatment | 1.19%
Were you given enough privacy when being examined or treated? | 1.20%
Sometimes in a hospital, a member of staff will say one thing and another will say something quite different. Did this happen to you in the emergency department? | 0.00%
Were you involved as much as you wanted to be in decisions about your care and treatment? | 0.87%
Did a member of staff explain the results of the test in a way you could understand? | -0.89%
Do you think the hospital staff did everything they could to help control your pain | 1.44%
While you were in the emergency department, did you feel bothered or threatened by other patients? | 1.07%
Did a member of staff explain the purpose of the medications you were to take at home in a way you could understand? | -0.38%
Did a member of staff tell you about medication side effects to watch for? | 3.10%
Did a member of staff tell you about any danger signals regarding your illness or treatment to watch for after you went home? | -0.65%
Average Non-medical Attention | 0.70%
Overall, did you feel you were treated with respect and dignity while you were in the emergency department? | 1.15%
Overall, how would you rate the care you received in the emergency department? | 2.46%
Average overall care | 1.80%

Table A3 Outpatients 2004/5 compared to 2003
Question | % change 2004/5 over 2003, annual rate
In your opinion, how clean was the outpatient department? | -2.62%
How clean were the toilets at the outpatient department? | -1.92%
Average cleanliness | -2.27%
How long after the stated appointment time did the appointment start? (Change in Mean Waiting Time sign reversed) | 3.77%
Did you have enough time to discuss your health or medical problem with the doctor? | 0.78%
Did the doctor explain the reasons for any treatment or action in a way you could understand? | -0.20%
Did the doctor listen to what you had to say? | 0.00%
If you had important questions to ask the doctor, did you get answers that you could understand? | 0.00%
Did you have confidence and trust in the doctor examining and treating you? | -0.78%
Did the doctor seem aware of your medical history? | -0.76%
If you had important questions to ask him/her (non-doctor staff), did you get answers that you could understand? | 0.40%
Did you have confidence and trust in him/her (non-doctor staff)? | 0.00%
If you had important questions to ask him/her (nurse), did you get answers that you could understand? | 0.79%
Did you have confidence and trust in him/her (nurse)? | -0.37%
If you had important questions to ask him/her (physiotherapist), did you get answers that you could understand? | 0.38%
Did you have confidence and trust in him/her (physiotherapist)? | 0.37%
If you had important questions to ask him/her (radiographer), did you get answers that you could understand? | -0.42%
Did you have confidence and trust in him/her (radiographer)? | -0.73%
Did doctors and/or other staff talk in front of you as if you weren't there | 0.00%
In the outpatient department, how much information about your condition or treatment was given to you? | 0.00%
Sometimes, a member of staff will say one thing and another will say something quite different. Did this happen to you? | -0.72%
Were you involved as much as you wanted to be in decisions made about your care and treatment? | 0.00%
Did a member of staff explain why you needed the test(s) in a way you could understand? | -0.82%
Did a member of staff tell you how you would find out the results of your test(s)? | -1.73%
Did a member of staff explain the results of the test in a way you could understand? | -1.40%
Before the treatment, did a member of staff explain what would happen? | 0.39%
Before the treatment, did a member of staff explain any risks and/or benefits in a way you could understand? | 0.85%
Did a member of staff explain to you how to take the new medications? | -0.74%
Did a member of staff explain the purpose of the medications you were to take at home in a way you could understand? | 0.00%
Did a member of staff tell you about medication side effects to watch for? | 1.28%
Did a member of staff tell you about what danger signals regarding your illness or treatment to watch for after you went home? | -3.16%
Average Non-medical Attention | -0.10%
Overall, did you feel you were treated with respect and dignity while you were at the outpatient department? | 0.00%
Overall, how would you rate the care you received at the outpatient department? | 0.00%
Average overall care | 0.00%

Table A4 Primary Care 2004 compared to 2003
Question | % change 2004 over 2003
The last time you saw a doctor from your GP surgery did you have to wait for an appointment? | Not used
Did someone tell you how long you would have to wait? | Not used
Were you given enough time to discuss your health or medical problem with the doctor? | 0.00%
Were you involved as much as you wanted to be in decisions about your care and treatment? | -1.80%
Were you involved as much as you wanted to be in decisions about the best medicine for you? | 4.88%
Did someone explain the results of the tests in a way you could understand? | -3.70%
Did that person explain the reasons for any action or treatment in a way that you could understand? (Health professional other than GP) | 0.00%
In the last 12 months, have you ever been put off going to your GP surgery/health centre because the opening times are inconvenient for you? | -1.16%
Were you involved as much as you wanted to be in decisions about your dental care and treatment? | -4.20%
Did you have confidence and trust in the dentist? | -1.16%
Average Non-medical Attention | -0.89%

2. Data on clinical errors
The National Patient Safety Agency (NPSA) was set up in 2001 with the principal aims of identifying, understanding and reducing the risk of harm to patients caused by clinical errors. It has established a variety of means with which to pursue its goal, including guidance and training to help identify possible sources of error and reduce the risk of incidents recurring. More importantly, it started to collect reports of patient safety incidents directly, through the establishment of the National Reporting and Learning System (NRLS). Pilots were undertaken in June 2002 in 28 Trusts, and since December 2004 all 607 NHS organisations in England and Wales have been able to report through the system. NHS organisations covered by the NRLS include all healthcare settings, from primary care to acute care, and from learning disabilities to mental health and ambulance care. The system has been developed so that any reporting is confidential, with the implicit incentive of increasing the number of incidents reported. Health care professionals may also report any incident through their own Trust reporting systems, which then pass the report on to the NPSA via local risk management systems linked to the Agency. It is worth noting that the reporting of incidents affecting patients’ safety is not mandatory. The Agency expects, however, a high degree of compliance because of the confidentiality with which incidents are reported.
The NRLS collects data from nine health care areas:
• Acute hospitals
• Ambulance Service
• Community and general dentistry
• Community nursing, medical and therapy service (including community hospitals)
• Community optometrists/opticians
• Community pharmacy
• General Practice
• Learning disabilities
• Mental Health
The data set has been developed to allow the following information to be collected soon after a single patient safety incident has occurred:
1. Type of incident
2. Time and place of incident
3. Characteristics of the patient(s) involved (age, sex, ethnicity)
4. Outcome for the patient
5. Staff involved in the incident and/or making the report
6. Contributory factors
7. Factors that might have prevented harm
Space for free text is provided.
Data is collected at specialty level, as follows:
• Accident and Emergency
• Anaesthetics
• Diagnostic services
• Medical Specialties
• Obstetrics and gynaecology
• Surgical Specialties
• Other/unknown
Incident types that have been identified so far are:
• Patient accident
• Medication
• Treatment, procedure
• Infrastructure (including staffing, facilities, environment)
• Access, admission, transfer, discharge (including missing patient)
• Documentation (including records, identification)
• Clinical assessment (including diagnosis, scans, tests, assessments)
• Medical device/equipment
• Implementation of care and ongoing monitoring/review
• Other
Examples of contributory factors are: communication, education and training, patient factors, work and environment (all contributory factors, with their respective frequencies by incident type, can be found at http://www.npsa.nhs.uk/site/media/documents/1262_PSO_AdditionalInfo.pdf). Only 230 NHS organisations in England and Wales have reported incidents so far. As shown in the report ‘Building a memory: preventing harm, reducing risks and improving patient safety’ (2005), the number of reports has increased rapidly over time, especially during the last few months of 2005.
Currently, no readily available data exist. Further, because of the confidentiality with which health care professionals report these incidents, the data are very sensitive and access to them is not granted, as confirmed to us by the head of the Observatory.
3. Data on cleanliness and infections
One characteristic of hospital output that is gaining increasing importance is hospital cleanliness and the control of infections. A number of different data sources measure these attributes. Cleanliness is given a great deal of attention in the performance assessment of Trusts and is one of the key targets in the star ratings of NHS Trusts. The implicit assumption is that better cleanliness will lead to a reduction in hospital acquired and other infections, though infections are also now specifically being measured by various surveillance systems.
3.1 Cleanliness
Patient Environment Action Teams (PEAT) inspect hospitals and report on the patient environment (cleanliness and tidiness) and food services. PEAT visitors are volunteers drawn from Facilities and Estates staff, Hotel Services Managers, Domestic Managers, Caterers, Infection Control Nurses, and patient and patient representative organisations, along with members of the general public. PEAT teams assess cleanliness across 18 criteria. The PEAT scores are used in the star ratings system and are available from 2002/03 (Commission for Health Improvement, 2003; Healthcare Commission, 2004). For 2003/04 PEAT teams only visited individual hospitals which achieved a rating of 'acceptable' or less in the 2002/03 PEAT assessment. Hospitals which achieved a rating of 'good' in 2002/03 undertook a self-assessment.
The PEAT score at each Trust is calculated as the total number of PEAT points achieved at a Trust divided by the total number of PEAT points available, weighted to account for the relative size of each hospital within a Trust.
A weighting process is then applied which emphasises the areas that patients deem most important. These include:
• Cleanliness in wards
• Cleanliness in emergency departments
• Cleanliness in other departments and waiting areas
• Cleanliness of toilets in wards
• Cleanliness of toilets in emergency departments
• Cleanliness of toilets in other departments and waiting areas
• Cleanliness of bathrooms in wards
• Environment in toilets in wards
• Environment in toilets in emergency departments
• Environment in toilets in other departments and waiting areas
• Environment in bathrooms in wards
For the purposes of the star ratings the percentage data are then transformed into a categorical variable with five bands. The PEAT score (1-5) achieved by each rated site in a Trust is multiplied by the number of beds in that site, summed across all rated sites, and then divided by the total number of beds across the sites that have at least 10 overnight beds, to produce an indicator value between 1 and 5. Both the overall percentage PEAT score and the transformed categorical variable are available for each Trust.
3.2 Infection control
A number of different surveillance systems, both voluntary and mandatory, have been in place to monitor infections in English hospitals. These fall into three categories: wound infection, overall measures of infection control, and MRSA (hospital acquired infection).
In 1996, the Department of Health together with the Public Health Laboratory Service (PHLS), now incorporated in the Health Protection Agency, established and launched the Nosocomial Infection National Surveillance Scheme (NINSS). The scheme was a response to the need to identify and monitor infection in English hospitals. The NINSS scheme ran until November 2002, on a non-compulsory basis. Further, it was a selective scheme in that each participating hospital could select one or more categories of surgical procedure from the list of 12 designated procedures.
It was also an intermittent scheme, in that hospitals were required to collect data for a minimum of three consecutive months each year. Some hospitals decided to collect the data continuously.
In October 2000 the Minister of Health promised to introduce compulsory monitoring of hospital acquired infections (referred to as healthcare associated infections) in all NHS Trusts in England. This monitoring would extend to certain blood stream infections (including MRSA), as well as infections of wounds following orthopaedic surgery. The orthopaedic SSI surveillance has been compulsory since 1st April 2004, while the MRSA surveillance scheme has been in place since 2001. This information is used as an indicator in the balanced scorecard as part of the Star Ratings for acute hospital Trusts (Department of Health, 2003b).
We describe available data under each of these three areas in more detail:
3.2.1 Wound infections
Wound infections can occur as a consequence of many surgical procedures. Bruce et al (2001) state that “surgical wound infection, although it seldom causes mortality, leads to morbidity and prolonged hospital stay” and, hence, to increased medical costs. It also appears that 50 to 70 per cent of all wound infections occur after patients are discharged from hospital (Department of Health, 2002b). These types of infections are therefore of concern to secondary, primary and community care providers within the NHS.
A new surgical site infection (SSI) surveillance scheme was introduced in April 2004. It builds upon the previous NINSS experience. It is again a voluntary scheme, with the difference that data collection on wound infections occurring after orthopaedic procedures is mandatory for all NHS Trusts in England. The data collected will be kept as far as possible comparable with those collected by NINSS (Health Protection Agency, 2004).
To our knowledge, data on the volume of wound infections, and on their rate and change, are collected as part of the above-mentioned scheme. A summary of the data collected is reported in ‘Surveillance of Surgical Site Infections in English Hospitals 1997 – 2002’ (Health Protection Agency, 2003).
3.2.2 Infection control
Data on infection control have been collected by the Department of Health since 2002/03 and an aggregate variable is used in the star ratings of NHS Trusts. It gives an average of the percentage scores for each of the 15 components in the controls assurance infection control standard. The NHS Plan Implementation Programme makes it clear that, as a core requirement underpinning the targets in the Plan, all relevant NHS organisations must ensure that they have effective systems in place to prevent and control hospital acquired infection. This average self-assessment score is then transformed into a categorical variable for use by the Healthcare Commission in the star ratings (Commission for Health Improvement, 2003; Healthcare Commission, 2004).
3.2.3 MRSA
Methicillin resistant staphylococcus aureus, better known as MRSA, is a blood stream infection which has been increasingly affecting NHS patients in England (Crowcroft and Catchpole, 2002; Health Protection Agency and Office for National Statistics, 2004). The Health Secretary, John Reid, recently announced a new objective of halving MRSA bloodstream infections in hospitals by March 2008 (Department of Health, 2004d). Further, all NHS staff (nurses, healthcare assistants, porters and cleaners) will receive infection control training (under the Agenda for Change framework) with the aim of reducing MRSA and any other healthcare associated infections (Department of Health, 2004e).
A mandatory Department of Health Bacteraemia Surveillance Scheme for MRSA has been in place since 2001.
Data are available by NHS Trust for the first three years of its existence (Department of Health, 2004f). These data include the number of MRSA bacteraemia reports by Trust and the MRSA rate per 1000 bed-days by Trust from 2001/02 to 2003/04. Data are disaggregated by Trust type: General Acute, Single Specialty and Specialist Trusts. The total number of Staphylococcus aureus bacteraemias in England rose from 17,933 in the first year to 18,519 in the second and 19,311 in the third. The numbers of MRSA bacteraemias were 7,250, 7,384 and 7,647 respectively (CDRweekly, 2002; 2003; 2004). This represents an overall increase in reported MRSA cases of 5.5 per cent over the three years, and an increase of 3.6 per cent in the last year of surveillance.

Some caveats are noted regarding the interpretation of the data:

• The figures reflect the burden of serious infections associated with MRSA and not all MRSA infection.
• The MRSA infections reported by an acute Trust were not necessarily acquired in that Trust (due to patient transfers between hospitals).
• The bed occupancy figures used to derive the MRSA bacteraemia rate are from a period before the MRSA data. This disparity will have an effect on a Trust’s rates if there has been a significant change in activity in the Trust. This may occur when there has been a merger of Trusts.
• The bed occupancy figures apply only to overnight admissions. Consequently MRSA bacteraemias in patients who are not admitted overnight may make a Trust’s figures look falsely high, as these patients will feature in the numerator but not in the denominator. This may apply to certain types of patients, such as those on renal units.

4. Outpatient waiting times

Although there is extensive information at the individual patient level for those admitted to hospital, relatively little information exists about those seen as outpatients.
The only information that exists is some aggregate (all outpatient) information about referrals received, activity levels and waiting times, which is available from the quarterly (QM08) return submitted by NHS Trusts (see http://www.performance.doh.gov.uk/waitingtimes/index.htm). This quarterly return gathers data by over 60 specialties on:

1. the number of written referrals received from GPs during the quarter;
2. the number of other referral requests received (including those from A&E departments, a consultant in a department other than A&E, and a prosthetist);
3. the number of GP written referrals seen during the quarter who had waited:
   a) less than 4 weeks
   b) between 4 and less than 13 weeks
   c) between 13 and less than 17 weeks
   d) between 17 and less than 21 weeks
   e) between 21 and less than 26 weeks
   f) 26 weeks and over
4. the number of patients with a written referral from a GP who had not yet attended for a first appointment as at the census date and who had been waiting:
   a) between 13 and less than 17 weeks
   b) between 17 and less than 21 weeks
   c) between 21 and less than 26 weeks
   d) 26 weeks and over.

The mean outpatient waiting time for those seen during the quarter was calculated as a weighted average of the mid-points of each time band, with the number of patients seen employed as the weight for each band (for waits of 26 weeks and over we employed 32.5 weeks as the ‘mid-point’ for the band). To convert these quarterly data into annual waits, a weighted average of the quarterly waits was calculated with weights reflecting the number of outpatients seen in each quarter.

5. Readmission rates

Readmission rates at NHS Trust level have been published by the Department of Health for a number of years. Since 2001, readmission rates have been used by the Department of Health, CHI and the Healthcare Commission in the construction of the star ratings for NHS Trusts.
The rates have been calculated as the percentage of emergency readmissions within 28 days of discharge and are published from 1998/99 onwards. Figure A1 shows the England national average readmission rate. The data source is the Department of Health and the definition of readmissions is that used by the Healthcare Commission in the 2005 star ratings. The data are standardised by age, generating 3 age bands. The data are also standardised to a base year (2001/02). This definition classifies any emergency admission within 28 days of discharge as a readmission, whether or not the diagnosis on readmission had any connection with the previous diagnosis. Consequently there is no way of distinguishing in these data whether the readmission is associated with a failure in the prior hospital spell.

Figure A1 Emergency readmission to hospital within 28 days of discharge, by age band (series: Age 0-15, Age 16-74, Age 75+; vertical axis: readmission rate, per cent; horizontal axis: 1998/99 to 2003/04). Source: Department of Health.

Data are based on Hospital Episode Statistics (HES). Data use CIP spells, and therefore include transfers to other hospitals. Readmissions are attributed to the Trust from which the patient was discharged. The matching occurs via the HES identifier, which uniquely identifies a patient and exists in the dataset from 1997/98. It is generated by matching records for the same patient using a combination of their NHS number and local patient identifier, plus the patient’s postcode, sex and date of birth. The readmission indicators exclude day cases, maternities and spells with a cancer diagnosis or chemotherapy anywhere in the spell. Mental health specialties are also excluded.
The indicator is indirectly standardised by calculating the ratio of a Trust’s observed number of readmissions to the number of readmissions that would be expected if it had experienced the same readmission rates as those of patients in England, given the mix of age, sex, methods of admission, diagnoses and main procedures among its patients. This standardised ratio is then converted into a rate by multiplying it by the overall readmission rate of patients in England. Several algorithms have been set up to deal with the difficult problems of double counting and of readmissions which span calendar years.

Because the data are unable to distinguish readmissions due to poor treatment (the element we wish to capture in the index) from other admissions within 28 days that are unrelated to the earlier condition, we do not regard the data as useable for adjusting output indices and have limited ourselves to illustrative calculations.

6. Life expectancy

Estimates of quality-adjusted life expectancy are based on self-reported health states, while overall life expectancy estimates are compiled from mortality data. The most recent data we have on quality-adjusted life expectancy were compiled in 1996 (table A5). However, the use of these data as a basis for our work is less than completely satisfactory because it is well known that life expectancies have advanced considerably since then. We proceed instead by taking up to date estimates of life expectancy (Interim Life Tables 2000-2002 for the United Kingdom, Annual Abstract 2002, p. 71) and deducting from these the difference between total life expectancy and quality-adjusted life expectancy shown in the 1996 data (see Dawson et al., 2004c, section 3.1.1).
Thus our assumption is one of constant morbidity, though there is no clear view on whether healthy life expectancy is rising faster or slower than life expectancy as a whole, in other words on whether the expected period of poor health is getting longer or shorter.

Both life expectancy and quality-adjusted life expectancy tables suffer from the problem that they provide data only for specific ages in the case of the first, or age bands in the case of the second. It is necessary to interpolate and extrapolate in order to obtain the year-specific figures which we require. The method of interpolation which we use is the cubic spline, a one-line command (SPLINE) available in MATLAB. This provides a smooth curve which passes through all the points for which data are available. We use the spline to interpolate/extrapolate up to the age of 95. The life table provides clearly the reference years around which the spline interpolates, while for the data on quality-adjusted life expectancy, and thus on years of poor health, we assume that they relate to the mid-point of the range shown. For people over 90 the figure is assumed to relate to 94. The method is subject to inaccuracy, particularly outside the range of data and thus for the very elderly.

In table A6 we show (i) life expectancy interpolated/extrapolated from the Annual Abstract table, (ii) the interpolated/extrapolated difference between total life expectancy and quality-adjusted life expectancy from the 1996 table and (iii) the resulting estimate of quality-adjusted life expectancy calculated by subtraction. In the columns marked (i) for males and females the life expectancies at five-year intervals (0, 5 etc) match those of the table from the Annual Abstract. In the columns marked (ii) those for the mid-point years match those in the relevant columns of table A5. Until further results on quality-adjusted life expectancy become available, only the figures in the columns (i) for each sex can be updated.
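The interpolation step can be illustrated with a small sketch. It assumes a natural cubic spline (zero second derivative at both ends), which is not necessarily the end condition applied by the MATLAB SPLINE command used for the report, so interpolated values may differ slightly from those in table A6. The function name is hypothetical; the five knots are the male life expectancies at ages 0 to 20 from column (i) of table A6.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing. The natural end condition is an
    assumption of this sketch, not the report's stated method.
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Tridiagonal system for the second derivatives m[0..n] at the knots;
    # the first and last rows impose m[0] = m[n] = 0.
    lower = [0.0] * (n + 1)
    diag = [1.0] * (n + 1)
    upper = [0.0] * (n + 1)
    rhs = [0.0] * (n + 1)
    for i in range(1, n):
        lower[i] = h[i - 1]
        diag[i] = 2.0 * (h[i - 1] + h[i])
        upper[i] = h[i]
        rhs[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n + 1):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    m = [0.0] * (n + 1)
    m[n] = rhs[n] / diag[n]
    for i in range(n - 1, -1, -1):
        m[i] = (rhs[i] - upper[i] * m[i + 1]) / diag[i]

    def evaluate(x):
        # Points outside the data range reuse the first or last cubic piece.
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate


ages = [0, 5, 10, 15, 20]
male_life_expectancy = [75.70, 71.20, 66.30, 61.30, 56.50]
spline = natural_cubic_spline(ages, male_life_expectancy)
life_expectancy_at_12 = spline(12)  # single-year figure between the knots
```

The same function applied to the band mid-points of the quality adjustment would give column (ii), and column (iii) then follows by subtracting (ii) from (i) at each age.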
Since these calculations were done we became aware that the Government Actuary’s web site provides life expectancy figures for each year, avoiding the need for interpolation of these data.

Table A5 Life Expectancy and Quality-adjusted Life Expectancy: 1996

Columns: A (males) and C (females) = period life expectancy; Marg. = marginal period life expectancy; QA marg. = quality-adjusted marginal life expectancy; B (males) and D (females) = quality-adjusted life expectancy; A-B and C-D = life expectancy less quality-adjusted life expectancy.

            MALES                                     FEMALES
Age         A      Marg.  QA marg.  B      A-B        C      Marg.  QA marg.  D      C-D
<1          74.76                                     79.76
1 to 4      73.29                                     78.21
5 to 9      68.37                                     73.27
10 to 14    63.41                                     68.30
16 to 19    58.50  4.78   4.37      50.04  8.46       63.36  4.90   4.44      52.09  11.27
20 to 24    53.72  4.77   4.37      45.68  8.04       58.46  4.91   4.40      47.65  10.81
25 to 29    48.95  4.77   4.38      41.31  7.64       53.55  4.90   4.43      43.25  10.30
30 to 34    44.18  4.77   4.36      36.93  7.25       48.65  4.86   4.34      38.82  9.83
35 to 39    39.41  4.71   4.24      32.57  6.84       43.79  4.80   4.21      34.48  9.31
40 to 44    34.70  4.62   4.11      28.33  6.37       38.99  4.72   4.08      30.27  8.72
45 to 49    30.08  4.49   3.84      24.22  5.86       34.27  4.61   3.92      26.19  8.08
50 to 54    25.59  4.26   3.61      20.38  5.21       29.66  4.46   3.72      22.27  7.39
55 to 59    21.33  3.98   3.24      16.77  4.56       25.20  4.26   3.33      18.55  6.65
60 to 64    17.35  3.60   2.83      13.53  3.82       20.94  4.01   3.13      15.22  5.72
65 to 69    13.75  3.11   2.49      10.70  3.05       16.93  3.61   2.83      12.09  4.84
70 to 74    10.64  2.59   2.06      8.21   2.43       13.32  3.18   2.34      9.26   4.06
75 to 79    8.05   2.14   1.62      6.15   1.90       10.14  2.74   1.97      6.92   3.22
80 to 84    5.91   1.63   1.22      4.53   1.38       7.40   2.16   1.52      4.95   2.45
85 to 89    4.28   1.72   1.32      3.32   0.96       5.24   2.29   1.61      3.43   1.81
90 plus     2.56   2.56   2.00      2.00   0.56       2.95   2.95   1.81      1.81   1.14

Table A6 Calculation of Quality-adjusted Life Expectancy based on the 2000-2002 Life Tables

Columns: (i) = Life Expectancy; (ii) = Quality Adjustment; (iii) = QA Life Expectancy.

        MALES                    FEMALES
Age     (i)    (ii)   (iii)      (i)    (ii)   (iii)
0       75.70  11.66  64.04      80.40  11.95  68.45
1       74.84  11.38  63.47      79.57  11.99  67.57
2       73.96  11.11  62.85      78.70  12.03  66.67
3       73.06  10.86  62.21      77.79  12.05  65.75
4       72.14  10.61  61.53      76.86  12.05  64.81
5       71.20  10.39  60.81      75.90  12.05  63.85
6       70.24  10.17  60.07      74.92  12.03  62.89
7       69.27  9.97   59.30      73.93  12.01  61.92
8       68.29  9.78   58.51      72.92  11.97  60.95
9       67.30  9.61   57.69      71.91  11.93  59.98
10      66.30  9.44   56.86      70.90  11.87  59.03
11      65.30  9.28   56.01      69.89  11.81  58.08
12      64.29  9.13   55.16      68.89  11.74  57.14
13      63.29  8.99   54.29      67.89  11.67  56.22
14      62.29  8.86   53.43      66.89  11.59  55.30
15      61.30  8.74   52.56      65.90  11.50  54.40
16      60.33  8.62   51.70      64.91  11.41  53.50
17      59.36  8.51   50.85      63.93  11.32  52.62
18      58.41  8.41   50.00      62.95  11.22  51.73
19      57.45  8.31   49.14      61.98  11.12  50.86
20      56.50  8.22   48.28      61.00  11.02  49.98
21      55.55  8.13   47.42      60.02  10.91  49.11
22      54.59  8.04   46.55      59.04  10.81  48.23
23      53.63  7.96   45.67      58.06  10.71  47.36
24      52.66  7.88   44.79      57.08  10.60  46.48
25      51.70  7.80   43.90      56.10  10.50  45.60
26      50.73  7.72   43.02      55.12  10.40  44.72
27      49.77  7.64   42.13      54.14  10.30  43.84
28      48.81  7.56   41.24      53.16  10.20  42.96
29      47.85  7.48   40.37      52.18  10.11  42.07
30      46.90  7.41   39.49      51.20  10.02  41.18
31      45.96  7.33   38.63      50.22  9.93   40.29
32      45.02  7.25   37.77      49.23  9.83   39.40
33      44.09  7.17   36.91      48.25  9.73   38.52
34      43.15  7.09   36.06      47.27  9.63   37.64
35      42.20  7.01   35.19      46.30  9.53   36.77
36      41.24  6.93   34.32      45.34  9.42   35.92
37      40.28  6.84   33.44      44.38  9.31   35.07
38      39.31  6.75   32.56      43.42  9.20   34.22
39      38.35  6.66   31.70      42.46  9.08   33.38
40      37.40  6.56   30.84      41.50  8.96   32.54
41      36.46  6.47   30.00      40.53  8.84   31.69
42      35.53  6.37   29.16      39.57  8.72   30.85
43      34.62  6.27   28.34      38.60  8.60   30.01
44      33.71  6.18   27.53      37.65  8.47   29.18
45      32.80  6.08   26.72      36.70  8.34   28.36
46      31.90  5.97   25.92      35.77  8.21   27.56
47      31.00  5.86   25.14      34.85  8.08   26.77
48      30.10  5.74   24.36      33.93  7.94   25.99
49      29.20  5.61   23.59      33.02  7.81   25.21
50      28.30  5.48   22.82      32.10  7.67   24.43
51      27.40  5.34   22.06      31.17  7.53   23.65
52      26.51  5.21   21.30      30.25  7.39   22.86
53      25.63  5.08   20.55      29.32  7.25   22.07
54      24.76  4.95   19.81      28.40  7.11   21.29
55      23.90  4.82   19.08      27.50  6.97   20.53
56      23.06  4.69   18.36      26.62  6.81   19.80
57      22.23  4.56   17.67      25.75  6.65   19.10
58      21.41  4.42   16.99      24.89  6.48   18.42
59      20.61  4.28   16.33      24.05  6.29   17.75
60      19.80  4.13   15.67      23.20  6.10   17.10
61      19.00  3.98   15.02      22.35  5.91   16.44
62      18.20  3.82   14.38      21.51  5.72   15.79
63      17.42  3.66   13.75      20.66  5.53   15.13
64      16.65  3.50   13.14      19.83  5.35   14.48
65      15.90  3.35   12.55      19.00  5.18   13.82
66      15.18  3.20   11.98      18.19  5.00   13.18
67      14.48  3.05   11.43      17.39  4.84   12.55
68      13.81  2.91   10.89      16.60  4.68   11.92
69      13.15  2.78   10.36      15.84  4.53   11.31
70      12.50  2.66   9.84       15.10  4.37   10.73
71      11.86  2.54   9.32       14.38  4.22   10.16
72      11.24  2.43   8.81       13.68  4.06   9.62
73      10.64  2.32   8.32       13.01  3.90   9.11
74      10.06  2.21   7.84       12.35  3.73   8.62
75      9.50   2.11   7.39       11.70  3.56   8.14
76      8.97   2.01   6.97       11.07  3.39   7.68
77      8.48   1.90   6.58       10.45  3.22   7.23
78      8.00   1.79   6.21       9.85   3.06   6.79
79      7.54   1.69   5.86       9.26   2.90   6.37
80      7.10   1.58   5.52       8.70   2.74   5.96
81      6.67   1.48   5.19       8.16   2.59   5.56
82      6.25   1.38   4.87       7.63   2.45   5.18
83      5.85   1.29   4.56       7.13   2.31   4.82
84      5.46   1.20   4.27       6.65   2.18   4.48
85      5.10   1.11   3.99       6.20   2.05   4.15
86      4.76   1.03   3.73       5.77   1.93   3.84
87      4.45   0.96   3.49       5.36   1.81   3.55
88      4.17   0.89   3.28       4.98   1.70   3.28
89      3.92   0.82   3.09       4.63   1.59   3.04
90      3.70   0.76   2.94       4.30   1.49   2.81
91      3.52   0.71   2.82       4.00   1.39   2.60
92      3.39   0.65   2.73       3.73   1.30   2.42
93      3.29   0.61   2.69       3.48   1.22   2.26
94      3.24   0.56   2.68       3.27   1.14   2.13
95      3.24   0.52   2.73       3.08   1.07   2.01

B. Hospital Episode Statistics (HES)

Hospital Episode Statistics (HES) provide information on admitted patient care delivered by NHS hospitals in England from 1989 to the present time. In addition, the HES system also captures admitted patient care funded by the NHS but provided by the private sector. HES covers all medical and surgical specialities and includes private patients treated in NHS hospitals. HES comprises over 12 million records each year.
Each time a person is admitted to hospital certain details are collected (such as the patient's age, gender and date of admission), with further information (such as details of any diagnosis made and surgical procedures undertaken) added to the record in due course. This information is stored as a large collection of separate records, one for each period of care under the same consultant, in a secure database. Records are stored according to the financial year (1st April to 31st March) in which the period of care finished. For a small proportion of stays in hospital, there is more than one record. This happens when a patient is transferred from the care of one consultant to another. Each period of care under the same consultant is known as a ‘consultant episode’.

HES does not provide details of drugs used in hospitals, or information on outpatients. There are no records on patients treated in A&E departments who are discharged immediately. Outpatient activity also includes some day surgery which is not included in HES. However, from 2003/04, outpatient attendances and A&E activity are collected as part of HES, though these data are not yet available. HES will therefore in future cover most secondary care.

1. HES regular attender episodes

Since 1999/00 HES has gathered information on patients who regularly attend hospital as part of an on-going programme of treatment. Regular day and night attenders are patients who have regular appointments in hospital for treatment that may take longer than a day to complete. Previously, this activity was not included in HES. In 2003/04, there were about 690,000 of these regular attender episodes, spread across 39,000 patients. The vast majority of these episodes (over 490,000) were to address conditions associated with renal failure (renal dialysis). Most of the other episodes were for cancer treatment (e.g., for chemotherapy). Only 10% of these episodes involved an overnight stay in hospital.
It has been mandatory for trusts to submit data on regular attenders to HES since April 2003. However, there may still be low coverage. In previous years the default in HES was not to supply the regular attenders, but in 2003/04 the default changed to include regular attenders. Since regular attenders do not appear in HES in previous years, they have been excluded from 2003/04 to make the annual HES data comparable. Much of the activity, for instance renal dialysis and chemotherapy, is picked up elsewhere in the Reference Cost activity which is included in the indices.

Table B1 reports the annual number of regular attender episodes since 1999/00. The figures for 1999/00 to 2001/02 inclusive should be treated with great caution due to data reliability issues. The figure for 2002/03 is considered more reliable than those for previous years, but the figure for 2003/04 is considered to be the most accurate. Data reliability issues for years prior to 2003/04 may hinder the calculation of accurate growth rates.

Table B1 Regular attender episodes in HES

HES year    Number of regular attender episodes
1999/00     432,024
2000/01     513,434
2001/02     519,177
2002/03     613,389
2003/04     689,713

2. HES: duplicate records

HES is constructed from records submitted by Trusts on a quarterly basis to a central national clearing-house. Given the number of records involved it is perhaps inevitable that errors will occur. One problem is that occasionally Trusts will submit duplicate records (i.e. the same record (episode) is submitted more than once). The following method was employed to eliminate these duplicate records from HES (Lakhani et al., 2005). All episodes were sorted by the HESID, EPISTART, EPIORDER and EPIEND fields. If two or more consecutive episodes contained identical values in the fields listed in Table B2 below, then the record with a valid value in the ELECDUR field was retained.
If no record or more than one record had a valid ELECDUR, then the record with the most non-empty fields was selected. If this failed to select only a single copy of the duplicated episode then the first occurrence of the duplicated episode was retained.

Table B2 Fields used to identify duplicate HES records

ADMIDATE, ADMIMETH, ADMISORC, CLASSPAT, DIAG1-14, DISDATE, DISDEST, DISMETH, EPIEND, EPIORDER, EPISTART, EPISTAT, EPITYPE, HESID, MAINSPEF, OP_DTE_1-12, OPER_1-12, RESHA, RESLADST, STARTAGE, TRETSPEF

The number of duplicate records dropped from each year’s HES can be found in Table B3.

Table B3 Duplicate records (episodes) dropped from HES, 1997/98 - 2003/04

HES data year    Number of duplicate records dropped
1997/98          139,807
1998/99          32,103
1999/00          30,960
2000/01          18,447
2001/02          45,845
2002/03          58,258
2003/04          24,128

3. HES: episodes and spells

A HES record is generated for each episode of admitted patient care under a particular consultant within a single hospital provider; admitted patient care includes day surgery. The unit of analysis employed by HES is the finished consultant episode (FCE). Over 90% of all episodes involve a patient remaining under the care of the same consultant for the duration of their stay in hospital. In the other cases, however, the patient is discharged from the care of one consultant, but remains in hospital, and moves to the care of another consultant. This move, from the care of one consultant to another, terminates one HES episode and triggers the start of another. By grouping together all those episodes associated with a stay at a given provider (hospital) one identifies continuous inpatient provider spells of care. A patient who is admitted to one hospital and then transferred to another (perhaps because the first hospital is not equipped to provide the highly specialised care which the patient's condition demands) would also generate two HES episodes.
Rather than use episodes or provider spells as the unit of analysis, we employ a third unit of analysis: continuous inpatient spells of NHS care. These are defined as continuous periods of patient care anywhere within the NHS. An NHS spell might comprise multiple episodes, including the transfer from one hospital to another as well as the move from one consultant to another within a given hospital. There will therefore be more than one record in the HES database for such patients. The records within the HES database are not currently linked, but admission details and the patient identifier can be used to link continuous periods of treatment.

4. Identifying continuous inpatient (CIP) spells of NHS care

To identify the episodes associated with each spell of continuous inpatient care, all episodes were sorted by the HESID, EPISTART, EPIORDER and EPIEND fields in that order (Lakhani et al., 2005). This grouped together all episodes involving the same patient (assuming that episodes with the same HESID relate to the same patient). EPISTART was used to place the episodes in chronological order, with EPIORDER employed to place two episodes that start on the same day in the correct sequence. Finally, EPIEND puts those spells that involve a transfer from one provider to another into the correct chronological sequence.

The ordered episodes are linked to form continuous inpatient spells providing that HESID (the patient identifier) remains the same and there is neither a discharge episode (patient leaves hospital, dead or alive (DISMETH<7)) nor a transfer episode. If HESID changes then the record with the new HESID is the beginning of a new spell. If HESID remains unchanged but there is a discharge episode that is not a transfer to another provider then the next episode also relates to a new spell. Any one patient might record more than one spell in any given year. Following Lakhani et al.
(2005), an episode is defined as ending with a transfer if:

a) the discharge destination (DISDEST) is 51, 52 or 53; OR the admission source (ADMISORC) of the following episode is 51, 52 or 53; OR the admission method (ADMIMETH) of the following episode is 81; AND
b) the HESID field of the following episode is the same as that of the current episode; AND
c) the episode order (EPIORDER) of the following episode is 1; AND
d) the admission date of the following episode is valid (i.e. a positive number of days since 31 March 1990); AND
e) the discharge date of the current episode is valid (i.e. a positive number of days since 31 March 1990); AND
f) the admission date of the following episode minus the discharge date of the current episode is between zero and two days (inclusive).

Table B4 reports the number of episodes, provider spells and NHS spells available for analysis.

Table B4 Number of episodes, provider spells and NHS spells by HES data year

HES data year    FCE episodes (thousands)    Provider spells (thousands)    CIP spells (thousands)
1997/98          11,404                      10,893                         10,799
1998/99          12,001                      11,350                         11,211
1999/00          12,203                      11,285                         11,106
2000/01          12,273                      11,257                         11,075
2001/02          12,316                      11,183                         10,994
2002/03          12,719                      11,739                         11,542
2003/04          13,332                      12,148 [only one spell count survives in the source for this year; its column is unclear]

5. HRG assignment

HRGs are assigned at episode level and, once episodes have been linked into spells (CIPS), a spell may have more than one HRG within it. We assign HRGs to each CIP spell on the basis of the first HRG within the CIPS. An alternative HRG assignment methodology has been developed for Payment by Results which looks for the 'dominant' HRG across episodes, but this is based on a provider spell (PS).

HRG v3.1 was in use until 2002/03 and was then replaced by HRG v3.5 in 2003/04. Several categories of HRG are affected by this change. Reference Cost data are collected using HRG v3.5 in 2003/04. However, raw HES data are still generated in v3.1 in 2003/04.
In order for activity data to be comparable to 2002/03, HES data were not recoded into v3.5 for 2003/04. Since we calculate base weighted Laspeyres indices, lagged costs are used and we do not require unit costs from the Reference Cost data in 2003/04 under v3.5. Hence all our analysis could be undertaken using v3.1. If the output indices are to be updated for 2004/05, however, the HES data will need to be recoded to v3.5, if they are not already generated in that format.

6. Unit costs of spells

If CIPS or provider spells are to be used as the output measure, the unit cost weights from the Reference Costs are inappropriate as they relate to FCEs. To produce a cost weighted output index which is quality adjusted for survival and waiting times it is necessary to assign costs, survival and waiting times to each type of output, whether measured as numbers of CIPS of each type or numbers of FCEs of each type. We can assign waiting times to CIPS using the waiting time for the first FCE of the spell and can assign mortality by using the mortality indicator of the last FCE in the spell. But it is more complicated to compute a cost for CIPS.

The source for hospital unit costs is the Reference Cost database, which has the costs for FCEs. FCEs which are clinically similar and have similar resource use are grouped together as a Healthcare Resource Group (HRG). The Reference Cost database has unit costs for FCEs in the different HRGs. Thus it is straightforward to assign costs to HES outputs if these are measured as numbers of FCEs in each type of HRG. Matters are more complicated if one wishes to have a unit cost for CIPS, 8% of which contain more than one FCE. There are a number of ways of calculating a cost for each spell and for labelling spell types.

(a) Define the spell type by the set of FCEs it contains and, to simplify, ignore the order of FCEs in the spell. The unit cost of a spell type is the sum of the unit costs of the HRGs of its constituent FCEs.
The advantage of this approach is that output types are homogeneous: each contains spells with the same set of FCEs. The disadvantage is that the number of different types of spell is potentially very large. Even if spells consist of at most two FCEs there are ½ n(n+1) types of spell if there are n types of FCE. There are over 500 elective HRGs and the same number of non-electives. Restricting attention to 2002/03 spells whose first episode was elective we found over 17,000 different types of spell. Thus calculation of the indices is cumbersome because of the great increase in the number of outputs. The procedure satisfies adding up: total costs calculated as numbers of spells of each type times their unit costs will equal total cost calculated as numbers of FCEs of each type times their Reference Cost unit costs. Thus a pure cost weighted activity index with activity measured by spells defined in this way would be very nearly equal to the pure CWAI with activity based on FCEs.[2] Similarly average waits and mortality also satisfy adding up.

[2] There would be a slight difference because some spells which have a first FCE finishing in year t and a second FCE finishing in year t+1 would be assigned to year t+1, whereas the constituent FCEs would be assigned to different years with an FCE based index.

(b) Assign each spell to a type by using the HRG of its first episode and use the cost of the HRG of the first FCE in the spell as the unit cost of the spell. Thus the spells in an HRG category may contain disparate types of spell (defined by the set of HRGs of their constituent FCEs) but all will have the same first FCE type. This will underestimate the cost of multi-FCE spells. Multiplying the number of spells by the unit costs of their first FCEs will yield a total cost which is less than actual total cost (which is the number of FCEs of different types times their unit costs). Thus the average cost for each output type is not the “true” unit cost i.e.
the total cost of all spells assigned to the type divided by the number of spells.

(c) Define the spell type by the HRG type of the first episode. Calculate the unit cost of the HRG as the total cost of all spells assigned to it divided by the number of spells assigned. The cost of a spell is calculated as the sum of the unit costs of the FCEs it contains. Like (a) this satisfies adding up for costs, waits and deaths. It requires more HES processing than (b) but less than (a) since the number of output types would be smaller. Since it produces the same number of output types as (b) the subsequent calculation of the indices is simpler than (a).

Assume that all CIPS have at most two FCEs. Let $z_{j0t}$ be the number of single FCE spells of HRG type $j$ in year $t$ and $z_{jkt}$ the number of two FCE spells where the first FCE has HRG type $j$ and the second has HRG type $k$. With method (c) the total number of CIPS assigned to HRG type $j$ by the HRG of the first FCE in the CIPS is:

$$x^{Sc}_{jt} = z_{j0t} + \sum_{k \neq j} z_{jkt} \tag{1}$$

The unit cost of this activity type in period $t$ using method (c) is:

$$c^{Sc}_{jt} = \frac{z_{j0t}\, c^{F}_{jt} + \sum_{k \neq j} z_{jkt}\,(c^{F}_{jt} + c^{F}_{kt})}{z_{j0t} + \sum_{k \neq j} z_{jkt}} = \frac{z_{j0t}\, c^{F}_{jt} + \sum_{k \neq j} z_{jkt}\,(c^{F}_{jt} + c^{F}_{kt})}{x^{Sc}_{jt}} \tag{2}$$

where $c^{F}_{jt}$ is the unit cost of an FCE of HRG type $j$ in year $t$. The total cost of CIPS by method (c) is:

$$C^{Sc}_{t} = \sum_{j} x^{Sc}_{jt}\, c^{Sc}_{jt} = \sum_{j} z_{j0t}\, c^{F}_{jt} + \sum_{j} \sum_{k \neq j} z_{jkt}\, c^{F}_{jt} + \sum_{j} \sum_{k \neq j} z_{jkt}\, c^{F}_{kt} \tag{3}$$

which equals the total cost actually incurred in year $t$.
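A small numerical sketch may help to fix ideas for method (c). The function name, the two HRG labels and all counts and costs below are invented for illustration and are not taken from HES or the Reference Costs; the code simply evaluates the spell count, unit cost and total cost formulas for spells of at most two FCEs and confirms the adding-up property.

```python
def method_c_unit_costs(single, double, fce_cost):
    """Spell counts and unit costs by HRG of the first FCE (method (c)).

    single[j]      : number of single-FCE spells of HRG type j
    double[(j, k)] : number of two-FCE spells, first FCE type j, second type k
    fce_cost[j]    : unit cost of an FCE of HRG type j
    """
    spells = {}  # number of spells assigned to type j
    total = {}   # summed FCE costs of those spells
    for j, n in single.items():
        spells[j] = spells.get(j, 0) + n
        total[j] = total.get(j, 0.0) + n * fce_cost[j]
    for (j, k), n in double.items():
        spells[j] = spells.get(j, 0) + n
        total[j] = total.get(j, 0.0) + n * (fce_cost[j] + fce_cost[k])
    unit = {j: total[j] / spells[j] for j in spells}
    return spells, unit


# Hypothetical data: two HRG types, "A" and "B".
single = {"A": 80, "B": 50}
double = {("A", "B"): 20, ("B", "A"): 10}
fce_cost = {"A": 1000.0, "B": 500.0}

spells, unit = method_c_unit_costs(single, double, fce_cost)

# Adding up: spell-based total cost equals FCE-based total cost.
spell_total = sum(spells[j] * unit[j] for j in spells)
fce_counts = {"A": 80 + 20 + 10, "B": 50 + 20 + 10}  # every FCE counted once
fce_total = sum(fce_counts[j] * fce_cost[j] for j in fce_counts)
```

With these invented figures type A has 100 spells at a unit cost of 1,100 and type B has 60 spells at a unit cost of about 667, and both totals come to 150,000.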
To obtain a unit cost based on period $t$ actual FCE costs that can be applied to period $t+1$ CIPS activity, we calculate:

$$\hat{c}^{S_{t+1}c}_{jt} = \frac{z_{j0t+1}\, c^{F}_{jt} + \sum_{k \neq j} z_{jkt+1}\,(c^{F}_{jt} + c^{F}_{kt})}{z_{j0t+1} + \sum_{k \neq j} z_{jkt+1}} = \frac{z_{j0t+1}\, c^{F}_{jt} + \sum_{k \neq j} z_{jkt+1}\,(c^{F}_{jt} + c^{F}_{kt})}{x^{Sc}_{jt+1}} \tag{4}$$

Note that $\hat{c}^{S_{t+1}c}_{jt} \neq c^{Sc}_{jt}$: although the raw unit costs of FCEs $c^{F}_{jt}$ are the same in (4) and in (2), (4) uses the numbers of FCEs of different types and the total number of spells for period $t+1$, whereas (2) uses those from period $t$. The CWAI based on CIPS (method (c)) is then:

$$I^{x^{Sc}}_{c^{Sc}t} = \frac{\sum_{j} x^{Sc}_{jt+1}\, \hat{c}^{S_{t+1}c}_{jt}}{\sum_{j} x^{Sc}_{jt}\, c^{Sc}_{jt}} = \frac{\sum_{j} \left[ z_{j0t+1}\, c^{F}_{jt} + \sum_{k \neq j} z_{jkt+1}\,(c^{F}_{jt} + c^{F}_{kt}) \right]}{\sum_{j} x^{Sc}_{jt}\, c^{Sc}_{jt}} \tag{5}$$

which is identical to a CWAI based on FCEs. But notice that when we apply quality adjustment factors the equality will disappear since in general (a) the factors are applied to different distributions of HRGs (based on spells versus FCEs) and (b) for some of the types of adjustment the factors will differ for HRGs based on FCEs and those based on spells (e.g. the simple survival adjustment is the ratio of survival rates and will differ for a given HRG label j because survival rates for the spell are based on the survival rate for the last FCEs assigned to the spell and those for the FCE based HRG on the survival rate of that FCE).

Using CIPS with version (c) would give us a unit of output which was not homogeneous, in that different cases in each output group (by HRG of first FCE) may have different combinations of second, third etc. FCEs. But any output categorisation will have heterogeneous cases since different individuals wait different lengths of time for the same type of output. Using (c) involves averaging over waiting times, costs and mortalities but it does give an accurate answer to a meaningful question: what is the average cost, wait or mortality of a person whose first FCE was of this type.
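The claim that the CIPS-based index under method (c) coincides with the FCE-based CWAI can be checked numerically. The sketch below uses invented counts and costs for two hypothetical HRG types and two periods, assumes spells of at most two FCEs, and ignores the year-assignment issue for spells that straddle two years; the function names are illustrative only.

```python
def spell_totals(single, double, fce_cost):
    """Spell counts and summed FCE costs by HRG type of the first FCE."""
    x, total = {}, {}
    for j, n in single.items():
        x[j] = x.get(j, 0) + n
        total[j] = total.get(j, 0.0) + n * fce_cost[j]
    for (j, k), n in double.items():
        x[j] = x.get(j, 0) + n
        total[j] = total.get(j, 0.0) + n * (fce_cost[j] + fce_cost[k])
    return x, total


def cwai_cips(single_t, double_t, single_t1, double_t1, fce_cost_t):
    """Laspeyres index with spells as activity and period-t FCE costs."""
    x_t, total_t = spell_totals(single_t, double_t, fce_cost_t)
    x_t1, total_t1 = spell_totals(single_t1, double_t1, fce_cost_t)
    c_t = {j: total_t[j] / x_t[j] for j in x_t}        # period-t unit cost
    c_hat = {j: total_t1[j] / x_t1[j] for j in x_t1}   # lagged-cost unit cost for t+1
    numerator = sum(x_t1[j] * c_hat[j] for j in x_t1)
    denominator = sum(x_t[j] * c_t[j] for j in x_t)
    return numerator / denominator


def cwai_fces(single_t, double_t, single_t1, double_t1, fce_cost_t):
    """Laspeyres index with FCEs as activity and period-t FCE costs."""
    def cost_of_fces(single, double):
        counts = dict(single)
        for (j, k), n in double.items():
            counts[j] = counts.get(j, 0) + n
            counts[k] = counts.get(k, 0) + n
        return sum(n * fce_cost_t[j] for j, n in counts.items())
    return cost_of_fces(single_t1, double_t1) / cost_of_fces(single_t, double_t)


# Hypothetical data for periods t and t+1.
cost_t = {"A": 1000.0, "B": 500.0}
single_t, double_t = {"A": 80, "B": 50}, {("A", "B"): 20, ("B", "A"): 10}
single_t1, double_t1 = {"A": 90, "B": 40}, {("A", "B"): 30, ("B", "A"): 10}

index_cips = cwai_cips(single_t, double_t, single_t1, double_t1, cost_t)
index_fces = cwai_fces(single_t, double_t, single_t1, double_t1, cost_t)
```

Both indices equal 170,000 / 150,000 for these figures. The equality would not survive once quality adjustment factors defined on spells rather than FCEs are applied.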
We therefore base our calculation of indices with CIPS activity measures on method (c).

7. Quarterly output measures

We were asked to examine the construction of quarterly output indices to be used for within year estimates of productivity growth. Most of the required activity measures are available on a quarterly basis, or will shortly become so [3]. HES produces quarterly activity counts but there are difficulties in using these data for in year growth estimates. The quarterly data are not cleaned anywhere near as thoroughly as the annual data and the data from the later months in a quarter are likely to be incomplete. The quarterly releases are cumulative so that, for example, the third quarter release has data based on all records in the year to date and so overwrites the second and first quarter releases. This would increase the accuracy of estimates made with a one quarter lag. If spell based activity indices are required the lagged unit costs to be applied to current period activities would have to be recalculated each quarter, though we suspect that using the unit costs from the previous period would make little difference. The quarterly HES data do not contain the date of death field provided by ONS so that it would not be possible to calculate 30 day death adjustments with quarterly data.

[3] General practitioner consultations have previously been estimated from the General Household Survey which is conducted annually but these estimates are now likely to be replaced by those derived from downloads of a large database of practice record systems which could be used to produce quarterly data.

8. Identifying spells that end in death

To calculate death (or survival) rates we need to be able to identify which continuous inpatient spells of NHS care end in death. Fortunately, there is a field in HES (DISMETH) which defines the circumstances under which the patient left hospital. DISMETH takes a value of 4 (patient died) or a value of 5 (baby was stillborn) if the patient is not alive when discharged. However, the cause of death is not recorded and may be unrelated to the patient's diagnosis or treatment. Table B5 (column (a)) reports the number of ‘deaths on discharge’ for spells of NHS care by HES data year. These vary between 250,000 and 267,000 per annum.

One shortcoming of the HES data is that they cannot identify how long a patient survives after leaving hospital. Using information gathered from the registration of deaths, the Office for National Statistics (ONS) has attempted to link its date of death data to HES records by matching various fields including those recording a patient’s sex, date of birth and postcode. The outcome of this matching process is a file, for each HES data year, that contains the HESID field and the date of death for all deaths that occur between the 1st of April of the data year and the 30th of April of the following year. In principle, this data set can be merged with HES to identify those spells that end with death within 30 days of discharge from hospital.

Unfortunately, the linkage of the ONS deaths data to HES is not without its difficulties and work is ongoing to improve this matching process. These difficulties can generate some surprising outcomes including the result that: a) it is not uncommon for the ONS date of death to be before the HES date of discharge from hospital; and b) the ONS deaths data record no death while HES indicates that the patient was dead on discharge. One consequence of these linkage difficulties is that the ONS deaths data tend to underestimate the number of deaths. Typically, the number of deaths on discharge (according to HES) exceeds the number of deaths on the day of discharge (from ONS) by about 20,000 (compare columns (a) and (b) of Table B5). The number of deaths within 30 days of discharge from hospital is presented in column (c) of Table B5. This is based solely on the ONS deaths data.
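The record-level logic described in this section might be sketched as follows. This is an illustration under stated assumptions, not the processing actually applied to HES: the function name, the dictionary keys and the `grace_days` parameter are hypothetical. The one day allowance mirrors the note to Table B5, which counts ONS deaths dated one day before discharge; treating it as a parameter is our assumption.

```python
from datetime import date

# DISMETH codes indicating the patient was not alive at discharge:
# 4 = patient died, 5 = baby was stillborn (as described in the text).
DEAD_ON_DISCHARGE_CODES = {4, 5}

def classify_spell(dismeth, discharge_date, ons_death_date, grace_days=1):
    """Flag a spell using the HES discharge method and the linked ONS date of death.

    ons_death_date is None when no ONS death record was linked to the spell.
    grace_days (hypothetical parameter) admits ONS deaths dated shortly
    before the HES discharge date.
    """
    hes_dead = dismeth in DEAD_ON_DISCHARGE_CODES
    days_after = None
    if ons_death_date is not None:
        days_after = (ons_death_date - discharge_date).days
    return {
        "hes_dead_on_discharge": hes_dead,
        "ons_death_within_30_days": (
            days_after is not None and -grace_days <= days_after <= 30
        ),
        # Linkage anomaly (a): ONS death dated before HES discharge.
        "ons_death_before_discharge": days_after is not None and days_after < 0,
        # Linkage anomaly (b): HES records a death but ONS has no record.
        "hes_dead_but_no_ons_record": hes_dead and ons_death_date is None,
    }

# Example with invented records; DISMETH value 1 stands for any code other than 4 or 5.
print(classify_spell(4, date(2003, 1, 10), None))
print(classify_spell(1, date(2003, 1, 10), date(2003, 2, 5)))
```

Keeping the two anomaly flags separate from the death flags makes it possible to count how often each linkage problem occurs before deciding how to treat those records.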
An alternative ‘deaths within 30 days of discharge’ measure can be constructed by combining the deaths data from HES (for those dead on discharge) and adding those deaths that occur within 30 days of discharge from hospital (from ONS data). The number of deaths captured by this combined HES/ONS deaths within 30 days of discharge measure can be found in column (d) of Table B5. Our empirical work on death rates employs both this combined HES/ONS 30 day death rate as well as the dead on discharge (HES) death rate.

Table B5 Number of deaths from HES and ONS (thousands)

HES data year   (a) HES deaths    (b) ONS deaths       (c) ONS deaths     (d) HES/ONS deaths within
                on discharge      on discharge day*    within 30 days*    30 days of discharge*
1997/98         250               n/a                  n/a                n/a
1998/99         268               238                  306                335
1999/00         264               239                  305                329
2000/01         254               231                  294                315
2001/02         259               238                  300                319
2002/03         263               244                  304                322
2003/04         270               245                  305                327

* This category also includes those deaths where the ONS date of death is one day before the date of discharge.

9. Mortality summary statistics

This appendix presents some summary data on mortality and survival rates. Figure B1 is a histogram of in-hospital CIPS based survival rates for 548 elective and daycase HRGs in 2002/03. The mean CIPS based survival rate is 97.8% (98.4% FCE based). A few HRGs have zero or very low survival rates. They are listed in Table B6. Most have extremely low levels of activity.

[Figure B1 In-hospital CIPS based survival rates, 2002/03, electives and day cases. Histogram; x axis: in-hospital survival rates 2002/03 (0 to 1); y axis: density.]

Table B6 HRGs with the lowest in-hospital survival rates, 2002/03, electives

HRG   HRG Label                                                          Number of spells   Number of deaths
D09   Pulmonary Embolus - Died                                           20                 20
D15   Bronchopneumonia                                                   185                118
E28   Cardiac Arrest                                                     49                 22
A32   Head Injury without Significant Brain Injury with complications   3                  1
D29   Inhalation Lung Injury or Foreign Body Aspiration                  71                 20

Summary data for 555 non-elective HRGs are shown in Figure B2.
The mean CIPS based survival rate is 93% (94.9% FCE based) which is lower than for electives. HRGs with zero or very low survival rates are listed in Table B7.

[Figure B2 In-hospital CIPS based survival rates, 2002/03, non-electives. Histogram; x axis: in-hospital survival rates 2002/03 (0 to 1); y axis: density.]

Table B7 HRGs with the lowest in-hospital survival rates, 2002/03, non-electives

HRG   HRG Label                                                        Number of spells   Number of deaths
D09   Pulmonary Embolus - Died                                         733                733
N01   Neonates - Died <2 days old                                      3127               3127
E28   Cardiac Arrest                                                   2313               1610
J13   Major Burn >29% TBSA without Significant Graft Procedure >49    43                 28
D15   Bronchopneumonia                                                 9789               5065
Q01   Emergency Aortic Surgery                                         116                22

Figures B3 and B4 present the same information but for 30-day survival. The patterns are very similar to those for in-hospital survival. The unweighted mean CIPS based survival rate for electives and day cases is 97.3% (97.9% FCE based). The unweighted mean CIPS based survival rate for non-electives is 91.7% (94.0% FCE based).

[Figure B3 Thirty day CIPS based survival rates, 2002/03, electives. Histogram; x axis: 30 day survival rates 2002/03 (0 to 1); y axis: density.]

[Figure B4 Thirty day CIPS based survival rates, 2002/03, non-electives. Histogram; x axis: 30 day survival rates 2002/03 (0 to 1); y axis: density.]

Figures B5 and B6 are scatter plots of the Winsorised growth rates in CIPS based 30 day and in-hospital survival rates between 2001/02 and 2002/03 for electives and non-electives. The correlation between the two survival growth rates is high: 0.984 for electives and day cases and 0.944 for non-electives.
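For readers unfamiliar with the procedure, the calculation behind the scatter plots might look like the sketch below. The survival rates are invented, the helper names are hypothetical, and the Winsorising limits of plus or minus 0.02 are an assumption suggested only by the axis range of the figures; this section does not state the cut-offs actually used.

```python
def growth_rate(previous, current):
    """Proportionate growth in a survival rate between two years."""
    return (current - previous) / previous

def winsorise(values, lower, upper):
    """Pull values outside [lower, upper] back to the nearest limit."""
    return [min(max(v, lower), upper) for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series.
    Undefined (division by zero) if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Invented survival rates for four hypothetical HRGs in two consecutive years.
in_hospital_prev = [0.970, 0.985, 0.900, 0.995]
in_hospital_curr = [0.975, 0.980, 0.960, 0.996]
thirty_day_prev = [0.960, 0.975, 0.880, 0.990]
thirty_day_curr = [0.966, 0.971, 0.930, 0.992]

LIMITS = (-0.02, 0.02)  # assumed Winsorising limits, not taken from the report

g_hospital = winsorise(
    [growth_rate(p, c) for p, c in zip(in_hospital_prev, in_hospital_curr)], *LIMITS
)
g_thirty = winsorise(
    [growth_rate(p, c) for p, c in zip(thirty_day_prev, thirty_day_curr)], *LIMITS
)
print(pearson(g_hospital, g_thirty))
```

Winsorising limits the influence of HRGs with very few spells, where a single additional death can produce an extreme growth rate.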
[Figure B5 Winsorised growth in in-hospital and 30 day CIPS based survival rates, 2001/02-2002/03, electives. Scatter plot; x axis: Winsorised growth in 30 day survival rates; y axis: Winsorised growth in in-hospital survival rates; both axes run from -.02 to .02.]

[Figure B6 Winsorised growth in in-hospital and 30 day CIPS based survival rates, 2001/02-2002/03, non-electives. Scatter plot; x axis: Winsorised growth in 30 day survival rates; y axis: Winsorised growth in in-hospital survival rates; both axes run from -.02 to .02.]

The following table shows descriptive statistics for age and gender compositions of elective and daycase HRGs. Mean age for both in-hospital and 30 day deaths appears to have risen marginally. The proportion of males in both in-hospital and 30 day deaths appears to be constant. The table also highlights the proportion of electives which are day cases.

Table B8 Summary statistics for age, gender, daycase rates for HRGs, electives

Variable                                   Year      n HRG   mean     sd       min     max
Mean age of in-hospital deaths             1997/98   467     65.623   17.604   0.003   102
                                           1998/99   454     65.320   17.424   0.003   98
                                           1999/00   460     65.616   16.986   0.003   98
                                           2000/01   453     65.607   17.556   0.003   96
                                           2001/02   449     66.566   16.979   0.003   97
                                           2002/03   446     66.818   17.218   0.003   92
                                           2003/04   440     65.430   18.002   0.056   89
Mean age of deaths within 30 days          1998/99   491     64.780   17.189   0.003   92
                                           1999/00   501     64.673   17.179   0.003   89
                                           2000/01   499     64.574   17.642   0.003   96
                                           2001/02   492     65.249   16.774   0.003   97
                                           2002/03   503     65.290   16.861   0.026   93
                                           2003/04   493     64.120   18.335   0.050   93
Mean age of all cases                      1997/98   566     50.934   18.693   0.003   97
                                           1998/99   566     50.366   18.477   0.092   82
                                           1999/00   567     50.293   18.291   0.010   82
                                           2000/01   565     50.475   18.323   0.003   82
                                           2001/02   563     50.408   18.443   0.175   82
                                           2002/03   564     50.792   18.146   0.170   83
                                           2003/04   567     50.571   18.465   0.089   82
Percent male in-hospital deaths            1997/98   468     0.522    0.286    0       1
                                           1998/99   455     0.533    0.293    0       1
                                           1999/00   461     0.511    0.288    0       1
                                           2000/01   454     0.534    0.284    0       1
                                           2001/02   450     0.527    0.304    0       1
                                           2002/03   447     0.536    0.288    0       1
                                           2003/04   441     0.511    0.300    0       1
Percent male deaths within 30 days         1998/99   481     0.672    0.271    0       1
                                           1999/00   495     0.650    0.284    0       1
                                           2000/01   487     0.671    0.272    0       1
                                           2001/02   483     0.664    0.282    0       1
                                           2002/03   486     0.665    0.286    0       1
                                           2003/04   484     0.654    0.293    0       1
Percent male of all cases                  1997/98   567     0.492    0.200    0       1
                                           1998/99   567     0.496    0.200    0       1
                                           1999/00   568     0.501    0.203    0       1
                                           2000/01   566     0.498    0.198    0       1
                                           2001/02   564     0.500    0.199    0       1
                                           2002/03   565     0.500    0.197    0       1
                                           2003/04   568     0.501    0.203    0       1
Percent daycases of elective admissions    1997/98   567     0.326    0.278    0       1
                                           1998/99   567     0.339    0.288    0       0.981
                                           1999/00   568     0.362    0.295    0       1.000
                                           2000/01   566     0.366    0.294    0       0.984
                                           2001/02   564     0.371    0.295    0       1.000
                                           2002/03   565     0.374    0.292    0       0.983
                                           2003/04   568     0.374    0.294    0       0.984

The following table illustrates the age and gender composition for all non-elective HRGs over time. Mean age for both in-hospital and 30 day deaths appears to have increased marginally, whilst the pattern for gender is less clear.

Table B9 Summary statistics for age, gender, deaths for HRGs, non-electives

Variable                                   Year      n       mean     sd       min     max
Mean age of in-hospital deaths             1997/98   528     64.647   19.611   0.003   94
                                           1998/99   530     64.280   19.909   0.003   88
                                           1999/00   535     64.189   20.137   0.004   87
                                           2000/01   530     65.020   19.411   0.004   90
                                           2001/02   524     65.116   19.678   0.014   96
                                           2002/03   533     65.086   19.601   0.003   95
                                           2003/04   528     65.151   19.257   0.017   94
Mean age of deaths within 30 days          1998/99   546     63.891   19.888   0.005   86
                                           1999/00   540     64.206   19.974   0.004   87
                                           2000/01   543     64.444   19.587   0.004   90
                                           2001/02   532     64.438   20.013   0.018   94
                                           2002/03   540     64.666   19.882   0.003   95
                                           2003/04   534     64.566   19.554   0.003   91
Mean age of all cases                      1997/98   571     51.139   20.119   0.253   84
                                           1998/99   571     50.567   19.916   0.005   84
                                           1999/00   571     50.388   19.814   0.004   84
                                           2000/01   571     50.288   19.684   0.004   84
                                           2001/02   571     50.373   19.715   0.010   84
                                           2002/03   571     50.555   19.605   0.004   84
                                           2003/04   571     50.537   19.564   0.011   84
Percent male in-hospital deaths            1997/98   529     0.495    0.216    0       1
                                           1998/99   531     0.509    0.225    0       1
                                           1999/00   536     0.507    0.219    0       1
                                           2000/01   531     0.502    0.229    0       1
                                           2001/02   525     0.500    0.217    0       1
                                           2002/03   534     0.510    0.227    0       1
                                           2003/04   529     0.500    0.214    0       1
Percent male deaths within 30 days         1998/99   539     0.584    0.229    0       1
                                           1999/00   537     0.576    0.219    0       1
                                           2000/01   542     0.578    0.232    0       1
                                           2001/02   531     0.571    0.217    0       1
                                           2002/03   539     0.573    0.227    0       1
                                           2003/04   534     0.562    0.219    0       1
Percent male of all cases                  1997/98   572     0.506    0.191    0       1
                                           1998/99   572     0.510    0.194    0       1
                                           1999/00   572     0.507    0.197    0       1
                                           2000/01   572     0.508    0.198    0       1
                                           2001/02   572     0.510    0.196    0       1
                                           2002/03   572     0.513    0.195    0       1
                                           2003/04   572     0.514    0.198    0       1

10. Missing Observations: Maintaining Sample Size

In order to maintain as full a sample as possible, all missing observations have been replaced with a representative figure, with the exception of those observations lacking the individual identifier, EPIKEY, or method of admission, ADMIMETH. All missing values for deaths and waits are replaced with the mean for their HRG, for electives and non-electives separately, provided a value for that specific case exists, or else the population mean is applied. Missing values for cost by HRG are replaced by the population mean for electives and non-electives. For the age variable, missing STARTAGE values are replaced with the gender mean for the HRG for electives and non-electives.

C. Measuring the health outcomes secured by treatment

1. Introduction

In section 4 we set out a methodology for an “ideal” index of NHS output. The output produced by NHS activity should be measured as the health gain and other characteristics valued by patients. Construction of such an index requires data on health gain. We explored two main sources from which such information might be obtained:

1.
Clinical trial data, from which it may be possible to extract before and after treatment estimates of health status for patients enrolled in clinical trials;
2. Observational data, from which before-and-after treatment estimates might be derived for patients in routine clinical practice.

2. Defining health effects

In essence, the measure of health outcome included in a productivity index should indicate the ‘value added’ to health as a result of contact with the health system. This has proved difficult to make operational in the health sector, mainly because of measurement difficulties. The fundamental difficulty is that the counterfactual (what health status would have been in the absence of intervention) is rarely observed. Although health status measurement is becoming increasingly routine in many health care settings, this tends to involve comparisons of health states before and (some time) after intervention. The with/without and before/after measures of the value added of treatment are unlikely to be equivalent.

Figure C1 illustrates the problem. The figure plots an individual’s health status, measured on the y axis using a scale where 1 indicates full health and 0 death, with time measured along the x axis. In this illustration the individual suffers a decline in health status until time t0 when the option of treatment is made available. Without treatment, the health profile would follow the dashed line, h^o(t), with the patient dying at time t2. If the patient accepts treatment, the time path h*(t) is followed. Treatment initially reduces health: the patient may be temporarily incapacitated even by successful surgery. After t1, though, the patient begins a recovery and the treatment improves health status and lengthens life from t2 to t3. The net change in health status resulting from treatment can be calculated by subtracting the area where h*(t) < h^o(t) from the area where h*(t) > h^o(t).
This difference is commonly referred to as the total number of quality-adjusted life years (QALYs) resulting from treatment. A QALY therefore captures the effect of treatment on both how long patients live and their quality of life over time. It is thus a more sophisticated measure of the treatment effect than simply focussing on the change in survival, which would capture only the difference between t2 and t3. However, QALYs are considerably more difficult to estimate accurately. There are three major difficulties.

[Figure C1 With and without treatment health profiles. y axis: health status (h = 1 is full health); x axis: time, with t0, t1, t2 and t3 marked; curves: with treatment h*(t), without treatment h^o(t).]

First, the “without treatment” time path, h^o(t), is typically not monitored. Rarely are patients denied any sort of treatment, so this counterfactual is simply unobservable. Some studies have relied on the “expert opinion” of clinicians to estimate expected outcome in the absence of treatment (Williams, 1985), but it is not clear how their opinion can be ratified. In the absence of knowledge about h^o(t), one possibility is to fall back on a before/after measure of value-added. This brings with it the two remaining difficulties.

A major complication in measuring QALYs is that to identify h*(t) it is necessary to measure health status continuously. Typically, though, only snapshot estimates at particular points in time are available. The timing of the snapshots is clearly crucial: the individual’s health status appears quite different if measured at t2 rather than at t1. So the treatment effect can look very different depending on the point of time chosen to measure health status after treatment. We investigate this issue for hip and knee replacements in the empirical section of this annex.

The third and most critical difficulty is that the difference between with/without and before/after measures of value-added will depend on the patient’s underlying health condition.
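To make the distinction between the two measures concrete, the following sketch computes both for a pair of invented health profiles. It assumes that profiles are observed at a common set of time points and are linear in between, which is purely a simplification for illustration; the function names are hypothetical. The with/without gain is the area between h*(t) and h^o(t), while the before/after measure compares two snapshots of the treated patient.

```python
def area(times, values):
    """Area under a piecewise-linear health profile (trapezoidal rule)."""
    return sum(
        (t1 - t0) * (v0 + v1) / 2
        for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:])
    )

def with_without_gain(times, h_treated, h_untreated):
    """QALY gain: area where treatment is better minus area where it is worse,
    which equals the difference between the areas under the two profiles."""
    return area(times, h_treated) - area(times, h_untreated)

def before_after_gain(h_before, h_after):
    """Change between a pre-treatment and a post-treatment snapshot."""
    return h_after - h_before

# Invented example: health is measured yearly for four years.
times = [0, 1, 2, 3, 4]
h_untreated = [0.6, 0.45, 0.3, 0.15, 0.0]   # steady decline to death
h_treated = [0.6, 0.6, 0.6, 0.6, 0.6]        # treatment holds health constant

print(with_without_gain(times, h_treated, h_untreated))
print(before_after_gain(h_treated[0], h_treated[-1]))
```

In this invented case the patient gains health relative to the untreated path in every year, yet the two snapshots of the treated patient are identical, so a before/after comparison registers no change at all.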
For conditions where the without treatment time path h^o(t) is likely to be constant, the with/without and before/after measures of value-added may be in close agreement. But for other conditions, the measures will diverge. Divergence is likely to be most noticeable where people if left untreated would be expected to suffer deterioration in health over time and where the purpose of the intervention is not to improve health but to stabilise the condition. In such cases the before/after measure would suggest little or no change in health status, irrespective of the fact that the intervention might have been highly beneficial.

Sole reliance on before/after measures will systematically favour interventions that lead to health improvements over those that stabilise or slow the deterioration of conditions. Consequently, such measures should not be used to make judgements about the relative value of different types of health care intervention. It is also problematic for the construction of a productivity index because if, over time, there is a change in the mix of services provided by the health care system, the true value of this change may not be captured accurately.

As well as requiring a basis by which the relative value of different interventions is established, accurate measures of health system productivity ought to be able to capture technological improvements that enhance health outcomes. To measure improvements from one year to the next we would need to compare h*(t) in a baseline year (year 0) with h*(t) in a later year (year 1). If technological changes have generated improvements in the quality but not length of life of people suffering from a particular condition, this would lead to an upward shift of the h*(t) time path, as illustrated in figure C2. If both quality and length of life were improved, the year 1 time path would extend beyond t3.

[Figure C2 Improved health profiles over time. y axis: health status (h = 1 is full health); x axis: time from t0 to t3; curves: treated profile in year 0 and the higher treated profile in year 1.]

3. Measurement instruments

The two key requirements for a measure of health status useful for measuring NHS productivity are a means of describing the health profile of an individual along a set of dimensions of health and a mapping from the set of dimensions to an overall health score.

There are two broad groups of health status measure. The first group is of condition-specific or targeted measures designed for use in a single therapeutic area. Even within a single diagnostic area such as depression, it is possible to find multiple measures. These measures often have their origins in clinical practice. An example is the VF14, which is described in box 1. Data collected by BUPA using this instrument are reported later in the annex.

The second, and much smaller, group of measures is generic and designed for use in a wide range of settings. Potentially such measures might be applied in all therapeutic settings. Practical design considerations mean that generic measures trade off complexity and comprehensiveness against reduced administrative burden and simplicity of coding. Since there are no absolute standards against which to judge the design of any measure, such compromises are reached on the basis of the experience of instrument developers and the technical performance of the measures they produce.

All health status measurement systems have two components. First, they require a means of describing health states. This can be done by an explicit formalised classification system (for example, the Karnofsky Performance Scale or Health Utilities Index) or, more often, by implicit combinations of responses to complex questionnaires (for example, Hospital Anxiety and Depression Scale or SF-36). Instruments differ in the dimensions of health they assess and the fineness of the measures employed along the dimensions. Second, all descriptive systems require some form of valuation or weighting system to reduce the volume of captured information to manageable proportions.
Most weighting systems are characterised by arbitrary weights imposed by instrument developers. For example, responses to the 5-point Excellent, Very Good to Poor categorisation of self-rated health status in the original form of SF-36 are recoded to 100, 75, …, 0. The SF-36 is described in box 2, and data derived from this instrument are considered later in this annex.

Following recommendations of the Washington Panel and the Technology Appraisal Guidelines published by NICE, it is widely held that the weights for health states (particularly in the context of cost-effectiveness analysis) should be based on those of the relevant general population. Weights for health states could be elicited from many different sources, such as health care professionals or patients. It is sometimes argued that since patients have the relevant first-hand experience it is they who should be consulted about the issue of weights. However, such weights can be expected to vary in the course of the patient “journey” through ill-health; consequently, there is scope for seeing different values arising from patients with (say) newly diagnosed illness and those whose illness is longer-standing. Similarly, patients who have wholly or partially recovered from an illness might hold different values to those whose response to treatment has been less positive.

The requirement for a standard metric to measure health outcomes at system level rules out the consideration of all condition-specific measures, based as they are on different descriptive systems each with an independent valuation or scoring system. Profile measures (such as the Nottingham Health Profile and SF-36) report generic health status in terms of separate dimensions of health but require a set of weights for the different dimensions to produce a single health index. The set of generic measures which produce a single index is small and includes Health Utilities Index, EQ-5D, 15D and AQLQ.
Of these measures, only EQ-5D is calibrated in terms of social preference weights of a UK population. EQ-5D is described in box 3.

Box 1: VF14

The VF14 is a condition-specific index of functional impairment for patients that suffer cataracts (Steinberg 1994). Visual functioning is measured across fourteen vision-dependent activities, such as the ability to read small print, watch television, or take part in sport. Each activity is scored from 1 (great deal of difficulty) to 4 (no difficulty). A single index (the VF-14) is calculated on a 0-100 scale, by multiplying the average score across each activity by 25. Averaging across activity implies that they are of equal importance to the determination of the overall level of visual functioning.

Box 2: Short Form 36

The SF36 is a general health questionnaire comprising 35 questions (termed “items”) about various dimensions of current health status and 1 question relating current health status to that experienced a year previously. The 35 items can be summarised into two measures, physical health and mental health. This involves a three-stage process, in which items are first scored, collapsed into scales, and then scales are collapsed into the summary measures.

First, items are scored to a 0-100 scale, from the least to most favourable responses for each item. An item with three response categories is coded 0-50-100; an item with six responses is coded as 0-20-40-60-80-100. Hence, a convenient set of cardinal values is imposed on the ordinal responses for each item.

Second, sets of items are allocated to one of eight scales. Thus, for example, the physical functioning scale comprises 10 items; social functioning contains just two. A score for each scale is calculated by averaging the scores across items in each set. This averaging across items assumes all are of equal importance in construction of the scale.
Third, the two summary health measures are constructed by amalgamating the four scales that relate to physical health and the four that relate to mental health. The scores for these summary measures are calculated as the average of the scale scores, without weighting for the number of items that comprise the scale. This implies, for instance, in the summary physical health measure items in the bodily pain scale (comprising two items) carry five times the weight of the items in the physical functioning scale (comprising ten items).

Items   Scale                   Summary measure
10      Physical functioning    Physical Health
4       Role-physical           Physical Health
2       Bodily pain             Physical Health
5       General health          Physical Health
4       Vitality                Mental Health
2       Social functioning      Mental Health
3       Role-emotional          Mental Health
5       Mental health           Mental Health

Box 3: EQ5D

The EQ-5D is a generic health status instrument which measures health along five dimensions. An overall health status index EQ-5D_vas can be derived using a visual analogue scale on a 0-100 scale where the endpoints are labelled “worst imaginable health” and “best imaginable health”. Alternatively the index can be derived by time trade-off methods. A set of weights for the five EQ-5D dimensions can then be derived by analysis of the scores on the five dimensions describing a given health state and the overall summary scores for that health state for each individual. The resulting set of average weights can then be applied to construct a health index EQ-5D_index for all possible health states defined in EQ-5D. Such weights have been periodically determined in UK population studies. The set of weights most widely used in economic evaluation of healthcare are those from the Measurement and Valuation of Health Project. These are the default values adopted in evaluation studies commissioned by NICE.

4. Clinical trial data

The use of published studies to estimate change in health status has a number of precedents (Berndt et al., 2001; Mai 2004) and may be combined with or validated by expert opinion (Berndt et al., 2002).
In principle trial data can yield estimates of health status before treatment (h^b), for those treated (h*) and, in circumstances where a placebo was offered, those not treated (h^o) at the same time point post treatment. We sought to determine whether published studies contain information in the requisite form for use in constructing a productivity index. To do this we have surveyed users of the EQ5D instrument and reviewed studies that have employed the instrument. We identified thirty published studies that had used the EQ-5D, and extracted and summarised the published information in a standard format. The EQ-5D data from this source used for the construction of the specimen index are given in Table C1. The main conclusions are that:

1. EQ-5D has been applied to the evaluation of a wide range of interventions but coverage is a long way from being comprehensive. The studies are neither representative of the general stream of NHS practice, nor are they concentrated on high volume/expenditure activities, nor are they fully reflective of areas where most technological progress has been made.
2. There is a high degree of variability in the summary statistics reported in studies, with differences in what measure of the average effect is used (mean or median) and whether or not variance is reported.
3. There is no common follow-up time across studies.
4. No studies report results according to the year in which the intervention took place so that even if there was more than one study of a particular treatment it would be difficult to use the information to derive estimates of productivity growth.

Added to these observations, there are substantial drawbacks to using data from published clinical trials in constructing a productivity index:

• The study population may not be representative of the patients who receive the intervention in routine practice, given the exclusion criteria for many studies.
• The description of interventions in such studies may not map well to the classification of activities in the index. Usually trial interventions are very precisely defined, while fairly aggregated classifications are used in the construction of productivity indices.
• In most trials alternative interventions are compared, say of a new technology with existing practice. As such, there will be alternative estimates of effects. There are no clear grounds for favouring one estimate over the other in constructing a productivity index.
• The follow-up time in studies is variable, and hence the full treatment effect may not be captured by the estimate.
• Trials are not conducted for all NHS activities and, as such, only partial coverage can be achieved.
• Measurement of productivity change requires information about whether the effect of interventions has changed over time. There are very few examples of published studies that have attempted to ascertain the extent to which the effectiveness of the interventions being compared is determined by the time the intervention was made.
Table C1 Estimates of health status before and after treatment from clinical trials used for specimen index

Study number  Diagnostic group                         Before treatment (h^b_j)  After treatment (h*_j)  h^b_j / h*_j
1             Rheumatoid arthritis                     0.43                      0.63                    0.68
2             Ischaemic stroke                         0.31                      0.62                    0.50
3             Psychiatric disorder                     0.36                      0.41                    0.88
3             Rheumatoid arthritis                     0.38                      0.43                    0.88
4             Liver transplantation                    0.53                      0.59                    0.90
9             Chest pain                               0.63                      0.69                    0.91
13            Spinal pain                              0.59                      0.69                    0.86
13            Spinal pain                              0.62                      0.66                    0.94
15            Acute myocardial infarction              0.68                      0.72                    0.94
24            Low back pain                            0.70                      0.80                    0.88
25            Vaginal standard hysterectomy            0.75                      0.91                    0.82
25            Vaginal laparoscopic hysterectomy        0.76                      0.92                    0.83
25            Abdominal standard hysterectomy          0.72                      0.89                    0.81
25            Abdominal laparoscopic hysterectomy      0.69                      0.87                    0.79
28            Benign prostate hypertrophy              0.81                      0.85                    0.95
29            Acute lower back pain                    0.69                      0.80                    0.86
29            Acute lower back pain                    0.76                      0.76                    1.00
29            Acute lower back pain                    0.69                      1.00                    0.69
30            Varicose veins - PIN stripping           0.73                      1.00                    0.73
30            Varicose veins - conventional stripping  0.80                      1.00                    0.80
mean                                                                                                     0.83
stddev                                                                                                   0.12

5. Observational data

In view of the difficulties associated with relying on clinical trial data, an alternative is to measure health status before and after treatment using surveys of the general stream of NHS patients. In the absence of routinely collected NHS data we explored three existing sources of more limited observational data:

• Health Outcomes Data Repository
• York District Hospital
• BUPA

5.1 Health Outcomes Data Repository

Since June 2002, the Health Outcomes Data Repository (HODaR) (www.cardiffresearch-consortium.co.uk/hodar) has operated a continuous health status survey of all inpatients and outpatients at a single large Welsh Trust. The survey responses are linked to individual-level hospital, primary and community care data. We were provided with the HODaR database, containing 29,541 observations. We analysed the information in order to gain an understanding of the challenges faced in collecting and using observational data.
Data on health status are collected using two generic health survey instruments, SF36 and EQ-5D. Inpatients are sent a postal questionnaire six weeks post discharge. Outpatients are surveyed at the time of their clinic appointment. Because the survey is linked to HES data, it is possible to analyse the health status survey by condition, defined by ICD-10 diagnosis codes, procedure codes or Healthcare Resource Group (HRG). Table C2 presents the mean and standard deviation in EQ5D score by HRG for all 29,541 patients in descending order of HRG frequency (only HRGs containing more than 65 patient surveys are listed). For the full sample, mean EQ5D score is 0.66, implying that the average health state experienced by these patients has 66% of the value of being in full health. The variation in EQ5D score is high, with a standard deviation of 0.32. This is to be expected, given the wide range of conditions included in the sample. However, the variance is high even within each HRG category, suggesting that, although they might be resource homogeneous, HRGs are not particularly "health outcome" homogeneous. This implies caution in the use of the mean EQ5D estimates as precise measures of the health states experienced by patients in each HRG.
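One simple way to see the point about outcome heterogeneity is to compare the standard deviation within an HRG with the standard deviation across the full sample. The sketch below does this for a few rows taken from Table C2; the choice of rows and the helper name are illustrative assumptions, not part of the original analysis.

```python
# (N, mean EQ5D, SD of EQ5D) for a few rows of Table C2.
# The selection of HRGs is illustrative only.
TABLE_C2_SAMPLE = {
    "Total": (29541, 0.66, 0.32),
    "E14": (1141, 0.61, 0.31),
    "N07": (225, 0.89, 0.20),
    "S25": (105, 0.57, 0.41),
}


def sd_relative_to_overall(hrg: str, table=TABLE_C2_SAMPLE) -> float:
    """Within-HRG standard deviation divided by the full-sample one.

    A value close to 1 means that grouping patients by this HRG removes
    very little of the spread in reported health status.
    """
    overall_sd = table["Total"][2]
    return table[hrg][2] / overall_sd


for code in ("E14", "N07", "S25"):
    print(code, round(sd_relative_to_overall(code), 2))
```

Ratios near or above 1 are consistent with the observation that HRGs, designed to group cases by resource use, do little to group patients by health outcome.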
Table C2 EQ5D score by HRG

HRG    N      Mean EQ5D  SD in EQ5D    HRG  N    Mean EQ5D  SD in EQ5D
Total  29541  0.66       0.32          E03  120  0.69       0.29
E14    1141   0.61       0.31          M09  120  0.82       0.30
J37    811    0.76       0.28          Q14  120  0.56       0.33
F06    741    0.69       0.30          F54  118  0.77       0.26
B02    700    0.67       0.29          H37  116  0.63       0.27
M06    631    0.74       0.30          L27  115  0.66       0.32
S22    550    0.63       0.31          L46  114  0.59       0.34
F35    547    0.73       0.29          C22  112  0.72       0.33
M07    515    0.74       0.29          F32  112  0.73       0.26
L21    479    0.69       0.30          B03  110  0.60       0.31
N12    422    0.85       0.23          M11  105  0.85       0.24
J02    387    0.73       0.26          S25  105  0.57       0.41
E36    338    0.63       0.34          F31  103  0.70       0.27
E15    318    0.65       0.32          S01  99   0.70       0.26
F47    315    0.69       0.31          C02  93   0.77       0.29
F16    310    0.67       0.30          Q12  93   0.58       0.31
H10    296    0.59       0.33          R02  92   0.43       0.38
H04    253    0.51       0.30          E10  90   0.61       0.32
M03    243    0.70       0.28          F82  89   0.85       0.25
C24    237    0.70       0.31          H06  89   0.43       0.33
M02    232    0.75       0.28          M01  89   0.81       0.25
F74    228    0.77       0.26          H19  87   0.64       0.35
N07    225    0.89       0.20          J03  87   0.77       0.28
E04    221    0.63       0.27          B06  82   0.71       0.27
L17    217    0.74       0.27          F37  81   0.61       0.37
H02    204    0.53       0.30          F93  81   0.80       0.28
H17    199    0.67       0.32          H40  81   0.74       0.30
E12    196    0.68       0.28          A22  80   0.54       0.33
E33    194    0.58       0.28          H26  78   0.42       0.34
A07    191    0.25       0.34          L26  77   0.73       0.33
G14    190    0.72       0.29          E09  76   0.68       0.29
E34    178    0.57       0.32          F56  76   0.64       0.36
B09    172    0.69       0.30          F42  75   0.61       0.39
B05    164    0.66       0.31          H33  75   0.49       0.31
J35    154    0.77       0.28          C01  74   0.65       0.36
F73    151    0.70       0.26          K16  74   0.65       0.34
M05    149    0.79       0.28          E31  73   0.62       0.32
S16    147    0.55       0.38          K01  72   0.71       0.28
E30    142    0.71       0.29          E18  70   0.53       0.35
H13    142    0.66       0.32          L10  70   0.71       0.32
Q11    142    0.80       0.24          Q10  70   0.71       0.30
D21    140    0.56       0.34          A30  69   0.55       0.37
L19    140    0.70       0.28          D13  68   0.51       0.38
D20    136    0.45       0.34          E32  68   0.71       0.30
E35    133    0.57       0.31          J04  68   0.77       0.24
F46    128    0.60       0.32          L53  68   0.76       0.28
E29    126    0.67       0.24          R03  68   0.31       0.36
E08    125    0.64       0.28          S98  68   0.63       0.31
F36    123    0.52       0.34          D02  67   0.61       0.28
H14    123    0.76       0.28          K04  67   0.67       0.37
C14    121    0.76       0.32          L43  67   0.76       0.26

For 2,587 patients who had two courses of treatment, two EQ5D surveys are
available so that it is possible to measure the change in their health status over time. Although such patients are unlikely to be typical of all patients receiving care, we have concentrated on analysing data for this subset of patients to see what light it sheds on changes in health status. These patients differ along three important dimensions:

1. The elapsed time between the two surveys. This can be calculated because we have information on the date that each survey was completed.
2. Whether the two surveys relate to the same underlying condition. We have assessed whether coding of diagnosis remains similar for the two surveys. Different diagnostic codes may indicate that the surveys relate to different underlying conditions.
3. The ordering of the settings to which the surveys refer. Ordering is determined by the survey identification number. The interpretation placed on the change in health status observed between each survey depends on this ordering. There are four possible scenarios, depicted in figure C3 below.

[Figure C3 Timing scenarios of relationship between surveys. Four panels, Scenario 1 to Scenario 4, each plotting health status (vertical axis, maximum 1.0) against time (horizontal axis), with the survey points t0, t0+6, t1 and t1+6 marked as applicable.]

Scenario 1: Inpatient survey followed by outpatient survey

Under this scenario, the patient is admitted at t0, completes a post-discharge survey at t0+6 and completes an outpatient survey during their clinic visit at time t1. The change in health status is calculated as

\[ \Delta h_1 = EQ5D_{t_1} - EQ5D_{t_0+6} \]

Even if the outpatient visit relates to the same health problem as the earlier admission it is not possible to predict the sign of ∆h1. It may be positive if the patient continues to recover during the time between the surveys. But if the outpatient visit is triggered by deterioration in the patient's condition, ∆h1 will be negative.
Moreover the full treatment effect will not be captured, because any recovery occurring between t0 and t0+6 is not recorded. It may well be that, for many interventions, the most dramatic changes in health status occur during this period.

Scenario 2: Two inpatient surveys

Under this scenario the change in health status is calculated as

\[ \Delta h_2 = EQ5D_{t_1+6} - EQ5D_{t_0+6} \]

There is a high probability that the two inpatient admissions are unrelated, this probability increasing the longer the elapsed time between the two surveys. Irrespective of whether or not the admissions are related, it is not possible to predict the sign of ∆h2.

Scenario 3: Two outpatient surveys

Here the change in health status is calculated as

\[ \Delta h_3 = EQ5D_{t_1} - EQ5D_{t_0} \]

As for the previous scenario it is not possible to predict the sign of ∆h3.

Scenario 4: Outpatient survey followed by inpatient survey

The change in health status is calculated as

\[ \Delta h_4 = EQ5D_{t_1+6} - EQ5D_{t_0} \]

This scenario may represent a before-and-after measurement of the intervention effect. This is more likely if the outpatient visit and admission are related, if the elapsed time between the outpatient visit and admission is short, and if most of the post-treatment recovery occurs in the six-week post-discharge period. The more these conditions prevail, the more likely that ∆h4 will have a positive sign.

5.1.1 Analysis

Table C3 shows the number of patients, change in EQ5D score, and time between the two surveys for each of the above scenarios. The mean change in EQ5D is negative for all four scenarios, although there is a wide range in scores. There are a number of EQ5D health states to which negative values are attached, implying that these are health states considered worse than death. Changes into or out of such health states may lead to ∆h taking values in excess of ±1.
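The four scenarios above can be summarised in a short sketch that takes a patient's two surveys in survey-identification order, assigns the scenario from the pair of settings, and computes the change in EQ5D as second score minus first score. The data layout (a tuple of setting and score) and the function name are assumptions made for illustration; the example scores are invented.

```python
# Scenario numbering follows the text: the key is
# (setting of first survey, setting of second survey).
SCENARIOS = {
    ("inpatient", "outpatient"): 1,
    ("inpatient", "inpatient"): 2,
    ("outpatient", "outpatient"): 3,
    ("outpatient", "inpatient"): 4,
}


def classify_and_diff(first, second):
    """Return (scenario, delta_h) for two surveys given in survey order.

    Each survey is a (setting, eq5d_score) tuple. Inpatient scores are
    understood to be recorded six weeks after discharge, outpatient
    scores at the clinic visit, as described in the text.
    """
    scenario = SCENARIOS[(first[0], second[0])]
    delta_h = second[1] - first[1]
    return scenario, delta_h


# Invented example: outpatient visit followed by an admission.
print(classify_and_diff(("outpatient", 0.52), ("inpatient", 0.76)))

# A move out of a state valued below zero can give a change above 1.
print(classify_and_diff(("inpatient", -0.016), ("inpatient", 1.0)))
```

Only scenario 4 has any claim to being a before-and-after measurement, so in practice the scenario label would be used to filter the sample before any health gain is interpreted.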
Table C3 Change in EQ5D score and time between surveys, by scenario

Scenario                     N     Mean change in EQ5D  Min change in EQ5D  Max change in EQ5D  Mean time between surveys (days)
1 Inpatient to Outpatient    502   -0.0075              -1                  1                   633
2 Inpatient to Inpatient     1612  -0.0015              -1                  1.016               634
3 Outpatient to Outpatient   192   -0.0196              -0.912              0.796               489
4 Outpatient to Inpatient    281   -0.0069              -1                  1                   581

The elapsed time between surveys is often considerable, amounting to around a year and a half on average. This is the case even for scenario four, where the average time between surveys amounts to 581 days. This undermines the case for considering the surveys for this subset of patients as constituting before-and-after measurements. That said, the distribution over time for these patients is highly skewed. The patients can be classified into three groups:

1. 48 patients for whom the dates of the outpatient and inpatient surveys are identical (the ordering of surveys is determined by the survey identification number). This implies that the outpatient survey was completed retrospectively, at the same time as the inpatient survey. There may be problems of recall for these patients.
2. Those for whom the elapsed time is very long. It is doubtful that these surveys constitute before-and-after measurements.
3. Remaining patients, where elapsed time is short enough to suggest they may be before-and-after surveys. Of course, this requires a judgement to be made about what constitutes a 'short enough' time.

The possibility that the two surveys constitute before-and-after measurement is further undermined when considering the diagnosis recorded at each survey. The ICD-10 codes recorded over the two periods are rarely the same, and there is little consistency even in the HRG chapter to which the patient is classified. Table C4 shows how the 281 patients under scenario four are classified to HRG chapters at the first and second survey.
Shaded boxes along the diagonal indicate the number of patients who remain in the same chapter for both surveys; only 43% do so. The off-diagonal cells show the number of patients who moved HRGs between the surveys. As an example, the first row shows where the 7 patients originally classified to Chapter A (Nervous System) were classified in their second survey. Thus, 3 remained in this chapter, one patient was subsequently classified to Chapter C (Mouth, Head, Neck and Ears) and three to Chapter H (Musculoskeletal). The extent of this movement across HRG chapters implies that caution should be exercised in attaching the change in EQ5D score to a specific underlying condition. Moreover, inconsistency in the condition to which patients are classified over time increases data requirements substantially (by the power of the additional classes that need to be considered).

Table C4 Change in HRG chapter classifications between surveys, scenario 4 patients

HRG Chapter (2nd Survey) HRG Chapter (1st Survey) Nervous System A B 3 Eyes and Periorbita Mouth, Head, Neck and Ears D 1 1 1 Cardiac Surgery 1 Digestive System G 6 L 1 2 1 3 2 2 18 4 4 1 5 Urinary Tract and Male Reproductive System Q R S U 1 2 3 2 4 1 2 3 3 2 2 9 1 2 3 1 4 1 1 1 1 1 2 2 1 1 1 1 3 3 2 1 3 5 2 13 4 53 5 52 2 27 2 17 2 7 2 2 21 1 3 25 1 5 5 1 10 1 2 1 1 10 17 4 16 37 281 1 1 11 1 1 1 11 1 1 1 1 1 4 2 Spinal Surgery and Primal Surgery Conditions 1 Haematology, Infectious Diseases 1 Mental health 1 Invalid Primary Diagnosis 2 5 7 8 6 57 1 35 10 8 7 3 1 Total 7 4 Obstetrics and Neonatal Care Total N 1 2 Vascular System M 1 2 1 2 K 1 34 Endocrine and Metabolic System Female Reproductive System J 2 1 1 1 H 1 1 Musculoskeletal System F 3 1 1 Hepato-Biliary and Pancreatic System E 1 Respiratory System Skin, Breast and Burns C 22 1 1 1 1 1 24 20 14 3 3 1 10

Table C5 Change in EQ5D score between surveys, scenario 4 patients

HRG Chapter (2nd Survey) HRG Chapter (1st Survey) Nervous System A B 0.20 Eyes and Periorbita Mouth, Head, Neck and Ears D E F G 0.15 0.59 -0.59 Cardiac Surgery -0.10 Digestive System Hepato-Biliary and Pancreatic System -0.16 Musculoskeletal System -0.07 -0.28 K L -0.06 -0.33 0.06 0.06 0.00 0.04 0.12 0.04 -0.09 -0.30 -0.28 -0.03 -0.28 0.08 -0.06 -0.21 0.08 -0.11 -0.14 0.19 0.00 0.03 0.12 0.15 -0.17 -0.26 0.00 Urinary Tract and Male Reproductive System Q R S 0.11 U Total -0.16 -0.03 -0.15 -0.31 -0.18 -0.10 0.05 -0.07 Obstetrics and Neonatal Care -0.40 0.05 0.75 0.19 0.01 0.18 0.11 0.56 0.06 -0.19 -0.11 0.31 0.03 0.04 0.05 0.13 -0.07 -0.04 0.06 -0.02 0.05 0.00 0.08 -0.07 -0.24 0.00 -0.05 0.00 -0.14 0.00 -0.08 -0.02 1.00 0.04 -0.03 0.00 -0.03 0.00 0.00 -0.12 -0.54 0.00 1.00 -0.12 0.00 0.20 0.00 -0.15 0.35 0.10 0.15 0.00 Haematology, Infectious Diseases -0.66 Mental health 0.10 Invalid Primary Diagnosis 1.00 -0.07 0.35 0.04 -0.15 -0.07 0.06 0.00 -0.07 0.02 -0.11 0.00 -0.06 0.00 -0.01 -0.07 0.00 0.11 Spinal Surgery and Primal Surgery Conditions Total N 0.00 -0.35 Vascular System M -0.11 -0.03 -0.10 -0.04 J -0.84 Endocrine and Metabolic System Female Reproductive System H -0.39 -0.04 -0.10 Respiratory System Skin, Breast and Burns C -0.16 -0.27 -0.25 0.07 0.20 -0.05 0.01 0.28 0.10 0.00 -0.10 0.00 0.05 -0.62 0.49 0.03 0.07 0.02 0.04 0.01 -0.01

Finally, for completeness, Table C5 shows the mean change in EQ5D score (∆h4) for the patients in scenario 4 according to their classifications to HRG chapters across the two surveys. The small numbers in each cell preclude drawing any conclusions from these data.

5.1.2 Conclusions

We have analysed the HODaR set of observational data to ascertain whether the information can be utilised in the construction of outcome weights for a productivity index. We have concluded that the HODaR data are unsuitable for this purpose. The surveys have not been administered with the express intention of collecting before and after information.
Although multiple surveys exist for a subset of patients, it is unlikely that many of these constitute before-and-after measurements. However, the HODaR data do provide an indication of the variation in EQ5D scores for particular conditions (e.g. specific diagnoses or HRGs). This information might be used to assess the sample size requirements for the collection of other observational data.

5.2 York District Hospital

The Orthopaedics and Trauma unit at York District Hospital began collecting SF36 data in March 2001 for patients undergoing hip and knee replacements. Patients completed the SF36 up to December 2002 and the SF12 thereafter at the following times:

• Initial consultation in the outpatients setting
• Pre-operatively upon admission
• Between 3-6 months following their operation
• 12 months post-operatively

Post-operative information was collected predominantly from postal questionnaires, although some patients may have filled in the forms during a follow-up outpatient appointment. The SF36 data are analysed to answer the following questions:

• Does it matter when h^b is measured?
• What are the values of h^b and h*?
• Does it matter when h* is measured?

The SF12 data proved of limited value, and hence analytical results are not reported.

5.2.1 Results

Timing of the measurement of h^b

Patients in the sample completed the SF36 either at initial consultation or at admission. The rationale was to ascertain whether patients experienced any pronounced deterioration in their health status prior to admission. If no such deterioration occurred the pre-treatment measures can be considered equivalent.
• For patients having hip replacements, there were no differences among patients in terms of their age and gender as to when the before treatment SF36 was administered (Table C5).
• Those surveyed at admission reported slightly worse bodily pain than those at initial consultation (23.8 compared to 26.95), but there were no differences for these patients along the other SF36 scales.
• 85 patients waiting for hip replacements were surveyed at both periods (Table C6). These patients reported improved general health between their initial consultation and at admission (58.37 compared to 64.27), but there were no differences across the other SF36 scales.
• For patients having knee operations, a higher proportion of women were surveyed at the initial consultation than on admission (Table C5).
• Mean SF36 scores on some scales were higher at admission than they had been at the initial consultation. These scales were physical functioning, general health and the role-emotional element of mental health.
• The patients surveyed at admission reported higher levels of physical health and of mental health than those surveyed at the initial consultation (physical health: 35.1 compared to 30.93; mental health: 57.67 compared to 52.28).
• 56 patients were surveyed at both times, and these reported improvements in their general health, vitality and mental health (Table C6). Overall, these patients experienced an improvement in their mental health summary score (58.67 compared to 51.03).
• These findings are contrary to the expectation that patients would have experienced deterioration in their health status while waiting for admission.

These results suggest that the timing at which the SF36 is administered may have an impact on the level of health reported.
Table C5 SF36 scores at initial consultation and pre-operatively, all patients

                           Hips                                                 Knees
                           Initial consultation  n    Pre-operative  n    Sig.  Initial consultation  n    Pre-operative  n    Sig.
Males                      41.50%                93   41.90%         101  NS    38.60%                64   49.10%         162  *
Females                    58.50%                131  58.10%         140        61.40%                102  50.90%         168
Age                        68.23                 224  67.78          241  NS    70.11                 166  70.05          328  NS
Physical functioning       23.88                 224  22.83          239  NS    21.65                 165  26.49          328  **
Role-Physical              10.1                  217  10.35          236  NS    12.04                 164  16.12          318  NS
Bodily pain                26.95                 221  23.8           239  *     28.66                 164  31.31          325  NS
General health             63.53                 221  65.69          240  NS    61.49                 163  66.71          322  **
Physical Health (summary)  31.49                 212  30.68          231  NS    30.93                 159  35.1           310  **
Vitality                   43.39                 224  41.62          241  NS    44.59                 164  48.44          327  NS
Social functioning         48.65                 223  46.99          241  NS    52.33                 166  56.4           324  NS
Role-Emotional             45.78                 217  45.53          235  NS    43.57                 158  52.78          318  *
Mental health              69.38                 224  68.45          241  NS    69.38                 165  72.59          328  NS
Mental health (summary)    52.06                 217  50.73          235  NS    52.28                 156  57.67          315  *

Table C6 SF36 scores at initial consultation and pre-operatively, patients surveyed on both occasions only (related observations)

                           Hips                                           Knees
                           Initial consultation  Pre-operative  n   Sig.  Initial consultation  Pre-operative  n   Sig.
Physical functioning       19.08                 17.31          85  NS    22.82                 21.44          56  NS
Role-Physical              5.95                  8.23           84  NS    6.73                  12.5           52  NS
Bodily pain                22.76                 23.18          84  NS    25.81                 29.59          54  NS
General health             58.37                 64.27          84  **    63.9                  70.32          52  *
Physical Health (summary)  26.74                 28.5           82  NS    29.84                 32.86          48  NS
Vitality                   40.67                 41.2           85  NS    41.64                 49.82          55  **
Social functioning         47.21                 45.29          85  NS    51.34                 57.81          56  NS
Role-Emotional             39.23                 43.9           82  NS    39.74                 50             52  NS
Mental health              64.16                 67.89          85  NS    70.91                 76.33          55  **
Mental health (summary)    48.11                 49.95          82  NS    51.03                 58.67          51  **

The values of h^b and h* and timing of the measurement of h*

Tables C7 and C8 compare the mean pre-operative scores by SF36 dimension with the mean scores 3-4 months and 1 year after the operation for those having hip and knee replacements. The main points are that:

• Patients showed improved scores across all dimensions.
• Much of the improvement occurs in the immediate post-treatment period, with health status changing little between the first follow-up period and 1 year after treatment. This implies that a six month follow-up is sufficient to capture the treatment effect for these patients.
• A number of patients completed their first follow-up at 6 months rather than 3-4 months. Some completed the SF36 at both 3-4 months and at 6 months. The scores for these patients are little different between the two periods. This implies that the data across these two periods can be considered equivalent for the purposes of analysis.
• In other words, the h*(t) time path depicted in figure C1 can be considered flat between 3 and 12 months for these two types of patient.

Table C7 SF36 scores for hip replacement, patients with both before and after surveys only
York Trust Hip Replacements SF36 data: comparison of pre-op and follow-up scores, by dimension

                           3/4 month follow-up (139 cases)   1 year follow-up (53 cases)
                           Pre-Op  3/4 months  Sig. Level    Pre-Op  1 yr  Sig. Level
Physical functioning       24.8    61.3        **            23.2    61.9  **
Role-Physical              11.6    44.3        **            8.2     53.8  **
Bodily pain                27.1    65.1        **            26.1    68.8  **
General health             66.1    70.6        **            67.8    70.7  NS
Physical Health (summary)  32.5    60.3        **            31.5    63.8  **
Vitality                   45.3    59.6        **            43      63.7  **
Social functioning         52      79.2        **            53.3    80.9  **
Role-Emotional             51      72.2        **            41.8    74.8  **
Mental health              70.3    78.4        *             66.1    77.7  **
Mental health (summary)    54.6    72.6        **            51.6    74.3  **

* sig at 95% ** sig at 99%

Table C8 SF36 scores for knee replacement, patients with both before and after surveys only
York Trust Knee Replacements SF36 data: comparison of pre-op and follow-up scores, by dimension

                           3/4 month follow-up (134 cases)     1 year follow-up (54 cases)
                           IC/Pre-op  3/4 months  Sig. Level   IC/Pre-op  1 yr  Sig. Level
Physical functioning       27.4       49.2        **           24.5       49.6  **
Role-Physical              13.4       35.8        **           14.2       44.9  **
Bodily pain                31.1       58          **           32.8       57.1  **
General health             69.1       67.4        NS           66.3       66.1  NS
Physical Health (summary)  35.5       52.6        **           34.8       54.4  *
Vitality                   48.4       53.3        **           45.7       54.6  **
Social functioning         57         72.5        **           48.4       71.1  **
Role-Emotional             52.3       62.2        *            40.1       72.5  *
Mental health              74.5       76.4        *            68.6       74.8  **
Mental health (summary)    58.2       66.1        *            51.5       68.3  **

* sig at 95% ** sig at 99%

5.2.2 Conclusions

A number of lessons emanate from the York experience:

• Data entry was undertaken using an optical reader. This was highly deficient, generating a large number of duplicate records and coding inaccuracies. The technology has to improve substantially for this to be a cost-effective alternative to professional data entry.
• For a large proportion of patients originally surveyed there is no information about their post-treatment experience. This implies that effort must be directed at ensuring that patients are followed up. This is not a trivial matter. It requires a system to alert when a follow-up questionnaire is due to be posted and a system for ensuring it is returned.
• For these treatments, there was little difference between the two before treatment surveys, suggesting that patients did not experience a marked deterioration in their condition in the immediate period prior to treatment. Indeed, contrary to expectations, patients waiting for knee replacements reported improvements along some dimensions.
• There was little difference in SF36 scores at the first follow-up and one year post-treatment. This implies that, for these conditions, most of the treatment effect is realised in the first 6 months, and that measurement beyond this period adds little additional information.
• Part way through the data collection exercise a decision was made to change instruments from the SF36 to SF12.
These instruments are not directly comparable, so it was not possible to analyse treatment effects for those patients who originally completed SF36 but who were sent SF12 postoperatively. Particularly if an objective of the exercise is to identify time trends, it is important to collect data in a consistent fashion.

5.3 BUPA

Since 1998 BUPA have been collecting outcomes data (before and after health status) on their patients from 70 private hospitals. BUPA provided CHE/NIESR with a dataset containing 90,000 patient treatment episodes covering the period 1998-2003. Patients fill in a health status questionnaire before treatment and three or four months post treatment depending on whether the instrument is SF-36 or VF-14 (for cataract procedures since 2001). Originally BUPA aimed for comprehensive data collection but, since 2002, efforts have been concentrated on collecting data from 20 "sentinel" high volume procedures. We analyse the data for two purposes:

• To identify temporal trends in health gain that could be attributed to improvements in each procedure rather than (say) changes in the type of patients receiving the procedure. For this analysis we concentrate on sentinel procedures for which more than 1,000 before and after surveys are available.
• To derive estimates of before and after health status for use in the specimen index. Full sample information is exploited for this purpose.

5.3.1 Temporal trends in health gain

Method

The key attraction of the BUPA data is that it is possible to explore whether any technological improvements in the provision of care that took place over the period led to improvements in health status. This is investigated as follows:

• To ascertain whether any trends in health gain are discernible we compare the before and after change in health status over time;
• Changes in health status may be attributable to changes in the characteristics of patients.
We attempt to identify conditional trends by controlling for the age and gender of patients.

We regressed individual health gain, measured by either SF36 or VF14, on a set of covariates and on time dummies. Covariates include age, gender, and hospital dummy variables. For both the SF36 and VF14, health gain, $\Delta h_i$, is calculated as:

\[ \Delta h_i = h_i^{*} - h_i^{b} \]

where $h_i^{*}$ is the post-treatment measure of health; $h_i^{b}$ is the pre-treatment measure; and $i$ indexes patients. We then calculate the time trend, conditioning on various factors that might explain measured health status. The most comprehensive regression model took the following form:

\[ \Delta h_i = \alpha + \beta_1 \mathit{age}_i + \beta_2 \mathit{age}_i^2 + \beta_3 \mathit{age}_i^3 + \beta_4 \mathit{gender}_i + \sum_{j=1}^{J} \delta_j \mathit{provider}_{ij} + \sum_{k=1}^{K} \delta_k \mathit{season}_{ik} + \sum_{t=1}^{T} \gamma_t \mathit{year}_{it} + \varepsilon_i \]

We explored various permutations of this specification and report results for a reduced model of the following form:

\[ \Delta h_i = \alpha + \beta_1 \mathit{age}_i + \beta_2 \mathit{age}_i^2 + \beta_3 \mathit{age}_i^3 + \beta_4 \mathit{gender}_i + \sum_{t=1}^{T} \gamma_t \mathit{year}_{it} + \varepsilon_i \]

Data

Since 2002, BUPA has concentrated on collecting data on sentinel (high volume) procedures. We analyse those for which there are more than 1,000 observations in total but restrict analysis only to those patients who completed both a before (preoperative) and an after (3 months post-operative) survey. When the SF36 instrument is employed this limits the available sample to two-thirds of the original sample, because not all of the patients originally surveyed had completed an "after treatment" survey at the time at which the data were extracted and because there was item nonresponse in a number of surveys. In contrast, we have complete before and after surveys for around 95% of those surveyed using the VF14 instrument.
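The reduced model described under Method can be sketched as an ordinary least squares fit of health gain on a cubic in age, gender and year dummies. The example below runs on synthetic data, not the BUPA data; the variable names, the simulated coefficients, the pooling of 1998 and 1999 into the baseline and the rescaling of age to decades (done only for numerical conditioning) are all assumptions made for illustration. NumPy is assumed to be available.

```python
import numpy as np


def design_matrix(age, gender, year, dummy_years):
    """Columns: intercept, age, age^2, age^3, gender, one dummy per year.

    Years not listed in dummy_years form the omitted baseline.
    Age is rescaled to decades purely for numerical stability.
    """
    a = np.asarray(age, dtype=float) / 10.0
    cols = [np.ones_like(a), a, a**2, a**3, np.asarray(gender, dtype=float)]
    cols += [(np.asarray(year) == y).astype(float) for y in dummy_years]
    return np.column_stack(cols)


def fit_reduced_model(delta_h, age, gender, year, dummy_years):
    """Least squares estimates of (alpha, beta_1..beta_4, gamma_t)."""
    X = design_matrix(age, gender, year, dummy_years)
    coef, *_ = np.linalg.lstsq(X, np.asarray(delta_h, dtype=float), rcond=None)
    return coef


# Synthetic patients with a known upward trend in health gain.
rng = np.random.default_rng(0)
n = 4000
age = rng.uniform(40, 90, n)
gender = rng.integers(0, 2, n)       # male = 1
year = rng.integers(1998, 2004, n)   # 1998..2003 inclusive
dummy_years = [2000, 2001, 2002, 2003]
true_gamma = {2000: 0.5, 2001: 1.0, 2002: 1.5, 2003: 2.0}
trend = np.array([true_gamma.get(int(y), 0.0) for y in year])
delta_h = 10 + 0.2 * (age / 10) + 0.5 * gender + trend + rng.normal(0, 1, n)

coef = fit_reduced_model(delta_h, age, gender, year, dummy_years)
gamma_hat = dict(zip(dummy_years, coef[5:]))
print({y: round(float(g), 2) for y, g in gamma_hat.items()})
```

In this setup, year coefficients that rise steadily relative to the baseline are what a conditional time trend would look like; the full model would add provider and season dummies as further columns in the same way.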
Table C9 Numbers of BUPA patients surveyed, selected conditions

HRG code   BUPA code and description                Instrument  Initial sample  Usable sample
F73 & F74  T2000 - Primary inguinal hernia          SF36        3,172           2,169
G13 & G14  J1830 - Laparoscopic cholecystectomy     SF36        1,322           901
A07        A5210 - Epidural injection               SF36        1,033           633
H23 & H24  M6530 - TURP                             SF36        1,021           617
C14        F0910 - Wisdom Teeth extraction          SF36        1,686           1,254
H02        W3710 - Primary hip replacement          SF36        3,198           1,827
B02 & B03  C7120 & C7180 - Cataract Surgery         SF36        3,441           1,703
B02 & B03  C7120 & C7180 - Cataract Surgery         VF14        2,759           2,613

Results

Before and after health status

The before and after health status of BUPA patients is reported for each SF36 scale and by the SF36 summary measures in Table C10. On average, patients generally experienced a significant positive change in both their physical and mental health status. The exceptions were patients who had wisdom teeth extraction, who experienced no change in their mental health, and patients who underwent transurethral resection of the prostate (TURP), who report no change in either their physical or mental health. The SF36 scales provide an indication of what aspects of health the treatments have most impact upon. For instance, patients who have treatment for a primary inguinal hernia report most improvement in the "role-physical" and "bodily pain" scales of the SF36, but less change along the other SF36 scales. This "drill down" ability of the SF36 instrument appears to be an attractive feature to clinicians.

Table C10 SF36 scores by dimension and condition

F73 & F74 Primary inguinal hernia Pre-Op 3 mths Sig. Level C14 Wisdom teeth extraction Pre-Op 2169 cases 3 mths 80.7 65.9 66.3 75.3 86 79.6 79.2 73 ** ** ** ** 94.6 93.1 78.1 79.4 93.8 94.1 87.6 77.4 Physical Health 72.1 79.4 ** 86.3 88.2 Vitality Social functioning Role-Emotional Mental health 67.4 85.1 90.8 81.9 68.1 90.5 90.5 82.3 69.5 91.5 94.4 79.8 69.1 91.9 91.9 79.6 Mental health 81.3 82.8 83.8 83.1 ** ** H02 Primary hip replacement 3 mths Pre-Op 1254 cases Physical functioning Role-Physical Bodily pain General health Pre-Op Sig. Level G13 & G14 Laparoscopic cholecystectomy Sig. Level 1827 cases 3 mths Sig. Level Pre-Op 3 mths ** ** 84.7 62.2 51.2 70.7 87.2 83 80.2 72.7 ** ** ** ** 79.4 70.4 74.4 70.8 79.1 69.7 77.8 68.4 ** 67.2 80.8 ** 73.7 73.7 53.9 73 84.4 74.5 64.4 87.9 89.2 79.4 ** ** ** ** 63.7 81 88.2 80.3 63.6 84.2 87.4 82 71.4 80.2 ** 78.3 79.3 ** Sig. Level Sig. Level 617 cases 901 cases A07 Epidural injection Pre-Op 3 mths H23 & H24 TURP ** ** ** ** B02 & B03 Cataract surgery Pre-Op 633 cases 3 mths Sig. Level 1703 cases Physical functioning Role-Physical Bodily pain General health 31.8 17.2 29.3 69.41 59.54 38.68 64.31 70.92 ** ** ** ** 48.4 22.8 28.7 61.6 58 39.3 47.4 57.9 ** ** ** ** 72.3 73.4 78.5 69.7 71.4 68.8 75 67.7 ** ** ** ** Physical Health 36.93 58.36 ** 40.4 50.7 ** 73.5 70.7 ** Vitality Social functioning Role-Emotional Mental health 47.68 58.33 72.94 74.91 61.11 76.95 81.66 82.02 ** ** ** ** 44.6 54.7 74.1 68.6 51.1 67.2 74.4 71.7 ** ** ** 64.3 87.1 88.3 81.5 62 85.2 85.1 80.6 ** ** ** ** Mental health 63.46 75.44 ** 60.5 66.1 ** 80.3 78.2 **

Unconditional time trends

With the exception of the final graph, the graphs below plot the mean and 95% confidence intervals for the difference in before and after SF36 physical and mental health summary scores for each of the conditions over time. Time is plotted by season, with the first set of observations corresponding to winter 1998 and the final set to spring 2003.
The final graph provides equivalent information derived from the VF14 applied to patients having cataract surgery. This series is shorter, commencing in winter 2001 and finishing in spring 2004.

• None of the graphs based on the SF36 provide evidence of a time trend that might be attributable to technological progress.
• There does appear to have been a temporal improvement in the change in health status as measured by the VF14. The 144 cataract patients surveyed in winter 2001 reported a 12-point improvement in their VF14 score following treatment. The 184 patients surveyed in spring 2004 reported a 14-point improvement. Generally, patients treated in each successive quarter reported marginal improvements compared to those treated in the preceding quarter.
• The time trend captured by the VF14 instrument is not in evidence when the SF36 is applied to cataract patients. This may be because the VF14 is more sensitive than the SF36 to the specific dimensions of health gain that cataract patients experienced.
• It may be that responses to mental health questions, in particular, are contaminated by seasonal effects. If there is a seasonal effect, it may be revealed in the type of pattern evident in the SF36 mental health summary scores for patients who had hip replacements. In quarters 8 and 12 (both July-September), mean summary scores appear higher than those in other quarters. However, there is no general evidence that these survey responses are seasonally affected.
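The quantity plotted in each graph, the mean before-and-after difference per period with a 95% confidence interval, can be sketched as follows. The record layout, the function name and the use of a normal approximation (z = 1.96) are assumptions for illustration; the report does not state how its intervals were computed, and the scores below are invented.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def gain_ci_by_period(records, z=1.96):
    """Summarise health gain (after - before) for each period.

    records: iterable of (period, before_score, after_score).
    Returns {period: (n, mean_gain, lower, upper)} using a normal
    approximation for the confidence interval.
    """
    gains = defaultdict(list)
    for period, before, after in records:
        gains[period].append(after - before)

    summary = {}
    for period in sorted(gains):
        g = gains[period]
        m = mean(g)
        # A single observation gives no estimate of spread.
        half = z * stdev(g) / sqrt(len(g)) if len(g) > 1 else float("nan")
        summary[period] = (len(g), m, m - half, m + half)
    return summary


# Invented scores for two periods, three patients each.
records = [
    (1, 70, 80), (1, 65, 77), (1, 60, 74),
    (2, 70, 84), (2, 66, 79), (2, 61, 76),
]
summary = gain_ci_by_period(records)
for period, (n, m, lo, hi) in summary.items():
    print(period, n, round(m, 1), round(lo, 1), round(hi, 1))
```

A trend of the kind reported for the VF14 would show up as mean gains that rise from period to period by more than the width of the intervals; overlapping intervals, as in most of the SF36 graphs, give no such evidence.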
[Figures: mean and 95% confidence interval of the change in SF36 physical (DIFFPHYS) and mental (DIFFMENT) summary scores, plotted against TIME (quarters 1 to 18, with the number of patients N shown for each quarter). Panels:
SF36 physical score by time, hernia
SF36 mental score by time, hernia
SF36 physical score by time, wisdom teeth extraction
SF36 mental score by time, wisdom teeth extraction
SF36 physical score by time, laparoscopic cholecystectomy
SF36 mental score by time, laparoscopic cholecystectomy
SF36 physical score by time, TURP
SF36 mental score by time, TURP
SF36 physical score by time, hip replacement
SF36 mental score by time, hip replacement
SF36 physical score by time, epidural injection (quarters 2 to 18)
SF36 mental score by time, epidural injection (quarters 2 to 18)
SF36 physical score by time, cataract surgery
VF14 by time, cataract surgery (DIFFVF14, quarters 1 to 11)]

Conditional time trends

The table below reports coefficients from regressing the change in the SF36 physical and mental health summaries on age, gender (male=1) and year (with 1998 and 1999 being the baseline). The final column of the table reports results for the change in VF14 for cataract patients. Coefficients that are significant at 5% are highlighted. The main messages are:

• Very few variables are statistically significant;
• There is no evidence of year-on-year trends in changes in the SF36 physical or mental health summary scores for any of the conditions;
• For VF14, the coefficients on the year dummies are increasingly positive. This suggests that the time trend identified in the unconditional analysis is not explicable by changes in the age and gender composition of patients having cataract surgery.
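The kind of specification described above (change in health status regressed on age, gender and year dummies, with 1998 and 1999 as the baseline) might be sketched as follows. This is an illustration under stated assumptions, not the estimation code behind the reported table: the higher-order age terms are omitted for brevity, the data are synthetic and generated from known coefficients so that the fit can be checked, and the helper names (`solve_linear`, `design_row`, `ols`) are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def design_row(age, male, year, dummy_years):
    """Constant, age, gender (male=1), then one 0/1 dummy per listed year."""
    dummies = [1.0 if year == y else 0.0 for y in dummy_years]
    return [1.0, float(age), float(male)] + dummies


def ols(rows, ys):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic illustration: 1998 and 1999 are the baseline years, so only
# 2000 and 2001 receive dummies in this small example.
DUMMY_YEARS = [2000, 2001]
TRUE_COEFS = [5.0, 0.1, -2.0, 1.0, 3.0]  # constant, age, male, 2000, 2001
patients = [  # (age, male, year), all hypothetical
    (60, 1, 1999), (70, 0, 1999), (50, 0, 1998), (65, 1, 2000),
    (55, 0, 2000), (72, 1, 2001), (48, 0, 2001), (80, 1, 2001),
]
rows = [design_row(a, g, y, DUMMY_YEARS) for a, g, y in patients]
changes = [sum(c * x for c, x in zip(TRUE_COEFS, row)) for row in rows]

coefs = ols(rows, changes)
print([round(c, 3) for c in coefs])
```

Increasingly positive coefficients on successive year dummies, as reported for the VF14, would show up here as a rising sequence in the last entries of `coefs`.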
Table C11 Regression results: dependent variable “Change in health status”

n G13/G14 J1830 H23/H24 M6530 H02 W3710 A07 A5210 Laparoscopic cholecystectomy TURP primary hip replacement Epidural injection 2168 1253 900 616 1826 B02/B03 B02/B03 C7120/C7180 C7120/C7180 632 Cataract surgery C14 F0910 Cataract surgery F73/F74 T2000 Wisdom teeth extraction HRG code BUPA code Primary inguinal hernia 1702 VF14 SF36 physical health summary score Constant Age Age^2 Age^3 Gender 2000 2001 2002 2003 13.792 -0.033 -0.001 0.000 -3.666 -1.375 -2.481 0.478 -2.816 -1.280 0.134 0.000 0.000 1.149 0.606 0.841 -0.132 -1.461 -3.463 1.420 -0.031 0.000 1.728 3.580 0.813 -0.578 2.504 2617 16.006 -2.091 0.054 0.000 7.616 -0.274 -1.308 2.389 9.888 54.614 -1.778 0.034 0.000 1.534 -1.082 -0.422 0.553 0.328 -4.076 1.207 -0.026 0.000 -2.310 0.677 -1.349 -0.674 -7.845 -21.141 1.185 -0.019 0.000 0.175 -0.106 -0.475 -1.238 -2.595 -40.681 1.361 -0.010 0.000 11.680 -1.398 -2.191 -2.504 -5.879 36.542 -1.319 0.024 0.000 4.775 -1.769 -1.643 -1.147 1.419 -1.476 0.486 -0.008 0.000 -2.442 2.528 1.954 -2.379 -5.339 -25.284 1.279 -0.020 0.000 0.415 0.560 0.009 -2.194 -1.828 -164.629 8.308 -0.126 0.001 1.231 0.265 1.842 3.132 SF36 mental health summary score Constant Age Age^2 Age^3 Gender 2000 2001 2002 2003 6.692 -0.120 0.000 0.000 -1.450 -0.131 -1.455 0.000 -0.223 -2.937 0.037 0.001 0.000 0.899 0.508 -0.530 0.648 -0.248 -11.132 1.347 -0.026 0.000 2.375 1.511 -0.373 -2.599 4.177

coefficients significant at 5% in bold

5.3.2 SF36 data used in the specimen index

Table C12 provides details of the before and after estimates of health status that were used in constructing the specimen index. Attention is drawn to the following:

• The BUPA data are reported at procedural level whereas the specimen index aggregated activities by HRG. We mapped the BUPA procedural codes to their appropriate HRG.
There is a presumption, therefore, that the experience of patients undergoing the BUPA procedures is representative of all those patients classified within the relevant HRG. This presumption is less likely to hold the more heterogeneous the HRG, some of which amalgamate a large number of different types of procedure.
• Only the physical summary health score from the SF36 is used in the specimen index. There is no obvious way to combine the physical and mental health summary scores into a single index.
• Mean before and after treatment health status is estimated from the full sample for whom data were available, not just those who completed both surveys. Comparison of the mean values reported in Table C10 with those reported below shows that the means are similar.

Table C12 Estimates of health status before and after treatment from BUPA and York Trust used for specimen index

BUPA code | HRG | HRG description | Before treatment: n | Before treatment: mean | After treatment: n | After treatment: mean
A5210 - Epidural injection | A07 | Intermediate Pain Procedures | 977 | 0.413 | 909 | 0.518
C7120 - Cataract Surgery | B02 | Phakoemulsification Cataract Extraction with Lens Implant | 2724 | 0.726 | 2524 | 0.688
C7180 - Cataract Surgery | B03 | Other Cataract Extraction with Lens Implant | 336 | 0.699 | 293 | 0.656
F0910 - Wisdom Teeth extraction | C14 | Mouth or Throat Procedures - Category 2 | 1625 | 0.870 | 1599 | 0.862
E0360 - Septoplasty | C22 | Nose Procedures - Category 3 | 758 | 0.830 | 749 | 0.826
F3440 - Adult tonsillectomy | C24 | Mouth or Throat Procedures - Category 3 | 618 | 0.767 | 607 | 0.846
K4100 - CABG | E04 | Coronary Bypass | 747 | 0.500 | 725 | 0.661
K4900 - PTCA | E15 | Percutaneous Transluminal Coronary Angioplasty (PTCA) | 85 | 0.539 | 75 | 0.722
T2000 - Primary inguinal hernia | F73 | Inguinal and Umbilical or Femoral Hernia Repairs >69 or w cc | 776 | 0.644 | 702 | 0.692
T2000 - Primary inguinal hernia | F74 | Inguinal and Umbilical or Femoral Hernia Repairs <70 and w/o cc | 2715 | 0.738 | 2612 | 0.811
J1830 - Laparoscopic cholecystectomy | G13 | Biliary Tract - Major Procedures >69 or w cc | 144 | 0.632 | 135 | 0.659
J1830 - Laparoscopic cholecystectomy | G14 | Biliary Tract - Major Procedures <70 and w/o cc | 1476 | 0.683 | 1408 | 0.808
W3710 - Primary hip replacement | H02 | Primary Hip Replacement | 2991 | 0.371 | 2741 | 0.567
Source: York Trust | H04 | Primary Knee Replacement | 134 | 0.355 | 134 | 0.526
M6530 - TURP | H23 | Soft Tissue Disorders >69 or w cc | 541 | 0.768 | 523 | 0.768
M6530 - TURP | H24 | Soft Tissue Disorders <70 and w/o cc | 413 | 0.715 | 383 | 0.673
B3120 - Augmentation mammoplasty | J01 | Complex Breast Reconstruction using Flaps | 287 | 0.931 | 292 | 0.870
Q0740 - Hysterectomy procedures | M07 | Upper Genital Tract Major Procedures | 1178 | 0.703 | 1143 | 0.723
Q0800 - Hysterectomy procedures | M09 | Threatened or Spontaneous Abortion | 287 | 0.715 | 273 | 0.756
V3310 - Lumbar procedures: posterior decompression on lumbar spine and posterior excision of intervertebral disc | R02 | Surgery for Degenerative Spinal Disorders | 543 | 0.365 | 527 | 0.613
V2530 - Lumbar procedures: posterior decompression on lumbar spine and posterior excision of intervertebral disc | R03 | Spinal Fusion or Decompression Excluding Trauma | 584 | 0.365 | 574 | 0.559
V3410 - Microdiscectomy and Revision disc | R09 | Revisional Spinal Procedures | 40 | 0.321 | 38 | 0.547

Source: BUPA (with exception of H02 from York Trust)

6. Lessons for future routine collection of health outcomes data

The lack of outcome measures is the most obvious barrier to measuring improved performance of the NHS. Outcome measures are required for a number of different purposes:

• Measurement of NHS quality adjusted output and productivity growth.
• Performance management of provider trusts – how are provider trusts doing relative to past performance and to other trusts?
• Performance management of consultant teams – how are they doing relative to their past performance, or relative to other consultant teams doing the same kind of work?
• Cost effective resource allocation – which treatments are a cost effective use of resources given the way care is actually delivered in the NHS?
We conclude by offering some recommendations to guide future collection of health outcomes data.

NHS patient sample. Since the scale of NHS activity is so broad and the potential volume of patients is so large, sampling seems a more sensible strategy than attempting to measure health effects for all NHS patients in a sector. Sample size requirements are likely to vary across different types of patients, and the variances in EQ5D scores reported in Table C2 suggest that samples would need to be large if data are required at HRG level, because these groupings are not particularly “outcome homogeneous”.

There are two ways in which the exercise might be made more manageable. The first would be to start with a small number of salient activities, both elective and emergency. Even if it is not possible to collect health data before patients receive emergency care, the post-treatment health data will still yield useful information about trends in the quality of emergency care and comparisons across teams if allowance is made for case mix. For emergencies where it is not possible to collect no-treatment or pre-treatment health outcome data, it may not be unreasonable to make simple assumptions about the no-treatment health outcome in order to estimate a health effect using the actual post-treatment outcomes. The second would be to undertake a pilot exercise at a handful of Trusts. Even in the longer term there may well be economies of scale from concentrating data collection within a small number of institutions rather than spreading the data collection burden more thinly across all Trusts. The drawback is that the selected Trusts may not be fully representative of the NHS as a whole. CHKS Ltd have approached a number of Trusts offering to facilitate the collection of health outcomes data, and such efforts deserve to be encouraged.

Choice of instrument. It is important to collect patient reported outcome measures (PROMs), using a generic instrument.
Since the outcome is health gain for patients, it is their views on the effect of treatment which matter. A single generic, rather than condition-specific, instrument is required in order to facilitate aggregation across different types of NHS activity in a productivity index. Generic measures also enable comparisons to be made across activities, to compare the cost-effectiveness of different activities and to measure quality adjusted output summed over different activities for performance management of Trusts or for measuring NHS productivity.

Profiles, such as the SF36, provide multiple measures of outcome but are of more limited use for non-clinical purposes since they typically lack the capacity to form a single aggregate index. That said, the disaggregated nature of the SF36 instrument appears to have clinical appeal. In applying the SF36 in this work we focussed on the physical summary score and ignored the mental health summary score, as there is no obvious way of combining the two into a single index. For aggregation and comparative purposes, it is necessary to have a means of combining the health profiles into an overall score. EQ5D has associated with it a set of weights derived from samples of the general population (over 10 years ago) which can be used to aggregate the dimensions to produce an overall score. Methods of producing an overall score from the SF instruments are less well developed.

Clinicians in some areas argue that some aspects of quality are not measured sensitively by generic instruments and that condition-specific instruments are necessary. The VF14 used by BUPA and analysed above provides evidence of the value of such measures. Others may argue that the necessarily relatively short-term follow-up of patients cannot detect longer-term health impacts and that these are best measured by using technical process measures.
If necessary, generic instruments can be supplemented with condition-specific process measures but these must be in addition to, not instead of, generic instruments. It is essential that the same measure be applied across different activities to enable cost-effectiveness comparisons and for aggregate output measurement. Using a condition-specific instrument alongside a generic measure should not add greatly to the administrative burden if a data collection system is already in place. Comparison of the health outcomes assessed using condition-specific and generic measures would also provide firmer evidence on whether the generic instrument is sufficiently sensitive. If it is not, then the generic measures should be amended with additional dimensions or a finer categorisation of existing dimensions.

Timing. The timing of before and after health status measurement may depend on the type of activity (emergency or elective). Measurements from patient self-report are preferable but there are circumstances in which proxy or retrospective data provide the only feasible route (for example, when emergency care is required by patients in an unconscious state). The timing of the administration of the instrument may also depend on diagnostic category or intervention type since different treatments may have an effect over shorter or longer periods. For some activities or conditions it may be necessary to make several post-intervention measurements since the effect on health may vary substantially with the time elapsed since treatment.

Grouping of NHS activities. Given the enormous range of NHS activities it is necessary to group them for data analysis. Thus the main grouping of secondary care activities is by HRG, which attempts to group activities by their costs. But a given HRG may contain a large number of procedures which have very different effects on health.
The availability of patient-level health outcomes data matched to other datasets (such as HES) will make it possible to explore the extent to which health outcomes are related to other routinely collected patient characteristics, such as age, gender, diagnoses and procedures. This information may be used to create homogeneous health outcome groupings which, in turn, may allow refinement of how activities are defined.

Frequency of data collection. If the effect of treatment on health depended only on the state of medical knowledge, and the pace of technological change in medicine were slow enough, it could be argued that collecting data on health effects was an exercise that needed to be undertaken only at intervals of several years. But technological change in medicine and pharmaceuticals is rapid, and the NHS is subject to frequent organisational change which may affect the mix of patients receiving particular treatments and the speed at which new technology spreads. We believe that only a continuous sampling of the NHS patient population will be adequate to capture trends in the effects of the NHS on its patients. Updates may be concentrated on areas subject to most technological change.

Stated preference research might be supported to get valuations of the other dimensions of outcome (waiting times, process experience) in the same units as health effects measured using the generic instruments. The valuation of waiting times in terms of willingness to trade off waiting time for health gains as measured in the generic instrument need not be an annual exercise. The data on waiting times are generated routinely – what is missing is the valuation and this is not likely to change greatly from one year to the next. Some data on patient experience are collected regularly via patient satisfaction questionnaires but there are problems in using the data to make comparisons over time.
Information on trade-offs between the non-health dimensions of patient experience and health gain could also be collected from regular, though not annual, stated preference research.

D. Comparing marginal and average cost

1. Introduction

As we argued in our First Interim Report (section 2.10.3), under certain assumptions marginal costs reflect marginal valuations. In such situations, a case can be made for using marginal costs as a basis for determining the relative values of different activities. Current NHS practice is to use unit (average) costs derived from the Reference Costs. To test whether the use of marginal costs would make a difference to calculated productivity growth, we estimate marginal costs and compare them with Reference Costs.

2. Methods

We compare average cost estimates derived from accounting data and marginal cost estimates derived from regression models.

Accounting Estimates. Collected annually from all NHS Trusts, the Reference Costs comprise average costs for different types of activity, defined predominantly by Healthcare Resource Group. These average costs are accounting costs arrived at by applying national guidance about how to apportion shared input costs to each activity type. We calculated a national weighted average Reference Cost, $C_{ijt}^{RC}$, for each HRG i in Trust j at time t for elective, non-elective and daycase activity respectively. The weights represent the share of activity (FCEs) in each Trust j for each HRG i. Hence for each time period t we calculated:

$$C_{ij}^{RC} = \frac{\sum_{i} FCE_{ij}\,(RC_{ij})}{\sum_{i} FCE_{ij}}$$

where $RC_{ij}$ are the Reference Costs reported by Trust j in HRG i and $FCE_{ij}$ is the number of FCEs for each Trust j in HRG i.

The annual figures for each time period t were summarised into a single estimate $C_{ijt}^{RC}$ covering the full period for which data were available, with the annual activity in each year used to weight the annual mean costs.
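As a worked illustration of the two weighting steps just described, the sketch below computes an activity-weighted average Reference Cost for a single HRG within each year and then pools the annual figures using annual activity as weights. It assumes the within-year sum runs over the Trusts reporting that HRG; the function names and all figures are hypothetical and are not taken from the Reference Cost returns.

```python
def weighted_reference_cost(trust_returns):
    """Activity-weighted average cost for one HRG in one year.

    trust_returns: list of (fces, reference_cost) pairs, one per Trust.
    """
    total_fces = sum(fces for fces, _ in trust_returns)
    weighted_sum = sum(fces * cost for fces, cost in trust_returns)
    return weighted_sum / total_fces


def pooled_reference_cost(returns_by_year):
    """Pool annual weighted averages, weighting each year by its activity."""
    total_activity = 0
    weighted_sum = 0.0
    for trust_returns in returns_by_year.values():
        activity = sum(fces for fces, _ in trust_returns)
        weighted_sum += activity * weighted_reference_cost(trust_returns)
        total_activity += activity
    return weighted_sum / total_activity


# Hypothetical returns for one HRG: {year: [(FCEs, Reference Cost), ...]}
returns_by_year = {
    "1998/99": [(100, 500.0), (300, 600.0)],
    "1999/00": [(200, 650.0)],
}

year_one = weighted_reference_cost(returns_by_year["1998/99"])
pooled = pooled_reference_cost(returns_by_year)
print(year_one, pooled)
```

Because each year is weighted by its own activity, the pooled figure equals total cost over the whole period divided by total FCEs over the whole period.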
In addition to the weighted average Reference Cost figures calculated for elective, non-elective and daycase activity respectively, we also calculated an overall national weighted average Reference Cost for all activity. For this, we weighted the mean Reference Cost for each of the three types of activity by the total activity of that type. These annual figures were then aggregated in the same way as the rest of the data, by weighting according to the annual activity in each year.

Regression Estimates. The alternative approach is to derive marginal costs from a regression in which total costs are regressed on the amount of activity undertaken in each output activity. The parameters from this regression can be interpreted as estimates of marginal cost. Hence, we estimated a regression model of the form:

$$TC_{it} = \alpha + \beta_{1t} x_{1it} + \beta_{2t} x_{2it} + \dots + \beta_{Jt} x_{Jit} + e_{it}$$

where $TC_{it}$ is the total cost of all types of activity in Trust i at time t and $x_{1it}, \dots, x_{Jit}$ are the amounts of activity undertaken in each of the J output activities. Total cost is calculated by multiplying the number of FCEs by the Reference Cost for those FCEs in each HRG and summing across all HRGs. $\beta_{1t}$ can be interpreted as the marginal cost of treating an additional patient in HRG 1 at time t.

We ran regressions for each year separately, but also created a panel dataset combining all six years’ worth of data, running regressions for panels of 6, 5 and 4 years (dropping the first year and the first two years respectively, to test for robustness). We tested functional form by running linear, log-log and log-linear models. We tested the inclusion of a size variable, namely the average number of beds in the Trust, and various non-linear combinations of this variable. We also tested the inclusion of time dummy variables. We ran a number of different estimation procedures including pooled OLS, GLM, GEE, random and fixed effects models.

The objective is to compare $C_{ijt}^{RC}$ with estimates of $\beta_{it}$ from the various specifications of the regression model.

3. Data

We used six available years of Reference Cost data from 1998/99 to 2003/04. These give, by NHS Trust, the number of FCEs, and the average cost, in each HRG for elective, daycase and non-elective (emergency) activity respectively. We excluded PCTs (some of which began to provide acute care from 2001/02) since their production and cost structure may differ from those of NHS hospital Trusts.

In the regression model, we would wish to use the number of FCEs in each HRG as an explanatory variable. With over 500 HRGs, and thus over 500 right-hand-side regressors, this gives rise to serious degrees of freedom problems. To address this limitation, we selected the 60 largest volume HRGs for inclusion in the model. We then created a composite HRG category for all other activity. Volumes were determined by running frequencies for all HRGs across all 6 years.

4. Discussion

This exercise produced some significant and plausible coefficient estimates, but also a few coefficients with negative signs, suggesting a negative marginal cost. Possible reasons for this may be:

1.) issues of scale economies
2.) misspecification in the model
3.) misallocation of costs between elective and daycases and non-electives

5. Results

We present results for daycases, electives, non-electives and then all activity together, showing the change in results from including and then excluding data for 1998/99. We first present results on the proportion of activity represented by the top 60 volume HRGs in each category. Notably, there is relatively little overlap between the top volume HRGs in each type of activity.

Table D1 Proportion of activity explained by top 60 HRGs in each type of activity and number of overlapping HRGs in top 60, 1998-2003

 | Daycase | Elective | Non-elective | All
Daycase | | | |
Elective | 26 | | |
Non-elective | 5 | 7 | |
All | 28 | 25 | 35 |
Prop of activity (%) | 85 | 61 | 63 | 55

Table D2 Estimated marginal cost and average (Reference Cost), Daycases, 1998-2003

HRG | Description | Coeff. | P>[t] | RC
B02 | Phakoemulsification Cataract Extraction with Lens Implant | 523 | 0.008 | 609
C22 | Nose Procedures - Category 3 | 5605 | 0.004 | 556
E14 | Cardiac Catheterisation without Complications | 1144 | 0.001 | 627
F06 | Oesophagus - Diagnostic Procedures | 473 | 0.024 | 279
F44 | General Abdominal - Endoscopic or Intermediate Procedures <70 w/o cc | 829 | 0.000 | 468
F54 | Inflammatory Bowel Disease - Endoscopic or Intermediate Procedures <70 w/o cc | 2208 | 0.029 | 324
H17 | Soft Tissue or Other Bone Procedures - Category 1 <70 w/o cc | 4101 | 0.003 | 501
H20 | Muscle, Tendon or Ligament Procedures - Category 1 | 5055 | 0.046 | 533
H26 | Inflammatory Spine, Joint or Connective Tissue Disorders <70 w/o cc | 2021 | 0.008 | 382
J36 | Minor Skin Procedures - Category 1 w cc | 3969 | 0.039 | 416
L20 | Bladder Minor Endoscopic Procedure w cc | 1393 | 0.014 | 317
L21 | Bladder Minor Endoscopic Procedure w/o cc | 550 | 0.012 | 307
L48 | Renal Replacement Therapy w/o cc | 230 | 0.000 | 158
M06 | Upper Genital Tract Intermediate Procedures | 1074 | 0.010 | 456
M12 | Non-Surgical Treatment of Lower Genital Tract Disorders | 493 | 0.000 | 260

Table D3 Estimated marginal cost and average (Reference Cost), Daycases, 1999-2003

HRG | Description | Coeff. | P>[t] | RC
A07 | Intermediate Pain Procedures | 679 | 0.014 | 392
B02 | Phakoemulsification Cataract Extraction with Lens Implant | 519 | 0.031 | 611
C22 | Nose Procedures - Category 3 | 5476 | 0.010 | 571
E14 | Cardiac Catheterisation without Complications | 1420 | 0.011 | 622
F44 | General Abdominal - Endoscopic or Intermediate Procedures <70 w/o cc | 774 | 0.002 | 476
F54 | Inflammatory Bowel Disease - Endoscopic or Intermediate Procedures <70 w/o cc | 2771 | 0.026 | 326
H17 | Soft Tissue or Other Bone Procedures - Category 1 <70 w/o cc | 3960 | 0.006 | 505
J36 | Minor Skin Procedures - Category 1 w cc | 5267 | 0.015 | 424
L21 | Bladder Minor Endoscopic Procedure w/o cc | 503 | 0.048 | 312
L48 | Renal Replacement Therapy w/o cc | 230 | 0.000 | 158

Table D4 Estimated marginal cost and average (Reference Cost), Electives, 1998-2003

HRG | Description | Coeff. | P>[t] | RC
C24 | Mouth or Throat Procedures - Category 3 | 2311 | 0.025 | 734
E04 | Coronary Bypass | 14351 | 0.000 | 5707
E15 | Percutaneous Transluminal Coronary Angioplasty (PTCA) | 10440 | 0.000 | 2598
H02 | Primary Hip Replacement | 7146 | 0.007 | 4133
N07 | Normal Delivery w/o cc | 1022 | 0.041 | 860
Q14 | Diagnostic Radiology - Arteries or Lymphatics w/o cc | -11103 | 0.003 | 877

Table D5 Estimated marginal cost and average (Reference Cost), Electives, 1999-2003

HRG | Description | Coeff. | P>[t] | RC
E04 | Coronary Bypass | 16058 | 0.000 | 5681
E15 | Percutaneous Transluminal Coronary Angioplasty (PTCA) | 10657 | 0.001 | 2543
H02 | Primary Hip Replacement | 7266 | 0.027 | 4213
N12 | Other Maternity Events | 525 | 0.018 | 457
Q14 | Diagnostic Radiology - Arteries or Lymphatics w/o cc | -13631 | 0.011 | 898

Table D6 Estimated marginal cost and average (Reference Cost), Non-electives, log-log model, 1998-2003

HRG | Description | Coeff. | Transform | P>[t] | RC
A30 | Epilepsy <70 w/o cc | -0.050 | -12901 | 0.032 | 515
D34 | Other Respiratory Diagnoses <70 w/o cc | -0.083 | -23098 | 0.001 | 555
E18 | Heart Failure or Shock >69 or w cc | 0.087 | 8927 | 0.043 | 1360
E31 | Syncope or Collapse >69 or w cc | -0.070 | -10884 | 0.006 | 835
F07 | Disorders of the Oesophagus >69 or w cc | -0.053 | -15792 | 0.040 | 1215
P03 | Upper Respiratory Tract Disorders | -0.042 | -3893 | 0.044 | 438

Table D7 Estimated marginal cost and average (Reference Cost), Non-electives, log-log model, 1999-2003

HRG | Description | Coeff. | Transform | P>[t] | RC
N12 | Other Maternity Events | 0.105 | 1360 | 0.012 | 410
P03 | Upper Respiratory Tract Disorders | -0.054 | -4688 | 0.027 | 440

Table D8 Estimated marginal cost and average (Reference Cost), Non-electives, log-linear model, 1998-2003

HRG | Description | Coeff. | Transform | P>[t] | RC
A28 | Headache or Migraine <70 w/o cc | -0.008 | | 0.008 | 489
D21 | Asthma >49 or w cc | 0.003 | | 0.042 | 1075
F36 | Large Intestinal Disorders >69 or w cc | 0.005 | | 0.044 | 1362
F56 | Inflammatory Bowel Disease <70 w/o cc | -0.005 | | 0.042 | 733
J35 | Minor Skin Procedures - Category 2 w/o cc | -0.006 | | 0.026 | 844
L09 | Kidney or Urinary Tract Infections >69 or w cc | -0.005 | | 0.024 | 1403
M05 | Upper Genital Tract Minor Procedures | 0.003 | | 0.033 | 560

Table D9 Estimated marginal cost and average (Reference Cost), Non-electives, log-linear model, 1999-2003

No results: no significant variables.

Table D10 Estimated marginal cost and average (Reference Cost), All, 1998-2003

HRG | Description | Coeff. | P>[t] | RC
C14 | Mouth or Throat Procedures - Category 2 | -2556 | 0.041 | 506
E12 | Acute Myocardial Infarction w/o cc | 6427 | 0.014 | 920
E14 | Cardiac Catheterisation without Complications | 4941 | 0.002 | 890
E30 | Arrhythmia or Conduction Disorders <70 w/o cc | -13773 | 0.014 | 506
H17 | Soft Tissue or Other Bone Procedures - Category 1 <70 w/o cc | 23926 | 0.000 | 919
H37 | Closed Pelvis or Lower Limb Fractures <70 w/o cc | 21000 | 0.045 | 1776
L20 | Bladder Minor Endoscopic Procedure w cc | 5946 | 0.008 | 682
N11 | Caesarean Section w/o cc | 7750 | 0.007 | 1857
S01 | Haematological Disorders with Minor Procedure | 2593 | 0.001 | 374

Table D11 Estimated marginal cost and average (Reference Cost), All, 1999-2003

HRG | Description | Coeff. | P>[t] | RC
B05 | Other Ophthalmic Procedures - Category 2 | -4396 | 0.031 | 476
C14 | Mouth or Throat Procedures - Category 2 | -2239 | 0.032 | 514
H17 | Soft Tissue or Other Bone Procedures - Category 1 <70 w/o cc | 23612 | 0.000 | 929
H42 | Sprains, Strains, or Minor Open Wounds <70 w/o cc | 10837 | 0.010 | 480
L20 | Bladder Minor Endoscopic Procedure w cc | 5451 | 0.030 | 705
M06 | Upper Genital Tract Intermediate Procedures | 3777 | 0.032 | 531
S01 | Haematological Disorders with Minor Procedure | 2450 | 0.002 | 374

Table D12 Estimated marginal cost and average (Reference Cost), All, log-log model, 1998-2003

HRG | Description | Coeff. | Transform | P>[t] | RC
E31 | Syncope or Collapse >69 or w cc | -0.013 | -3228 | 0.037 | 830
J35 | Minor Skin Procedures - Category 2 w/o cc | -0.014 | -3093 | 0.006 | 687
M05 | Upper Genital Tract Minor Procedures | -0.012 | -1591 | 0.035 | 508
M09 | Threatened or Spontaneous Abortion | -0.010 | -1626 | 0.014 | 341
P15 | Accidental Injury | -0.011 | -2456 | 0.019 | 702

Table D13 Estimated marginal cost and average (Reference Cost), All, log-log model, 1999-2003

HRG | Description | Coeff. | Transform | P>[t] | RC
C22 | Nose Procedures - Category 3 | 0.008 | 1925 | 0.041 | 821
C24 | Mouth or Throat Procedures - Category 3 | -0.022 | -1900 | 0.017 | 708
E31 | Syncope or Collapse >69 or w cc | -0.023 | -5591 | 0.016 | 836
F35 | Large Intestine - Endoscopic or Intermediate Procedures | -0.027 | -1204 | 0.015 | 364
F47 | General Abdominal Disorders <70 w/o cc | -0.023 | -2157 | 0.026 | 593
J37 | Minor Skin Procedures - Category 1 w/o cc | 0.023 | 1169 | 0.003 | 474

As these results suggest, there are large discrepancies between the marginal and average cost estimates produced. The coefficient estimates are also volatile across model specifications (for example, log-log versus log-linear), with different sets of HRGs emerging as significant. As a result of these problems, we did not pursue this approach as a means of obtaining measures of marginal cost.
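The scale of the discrepancy between the two sets of estimates can be made concrete by expressing each estimated marginal cost as a multiple of the corresponding Reference Cost. The short sketch below does this for three rows of Table D2 (daycases, 1998-2003); the choice of rows and the helper name `cost_ratio` are ours, for illustration only.

```python
# (HRG code, estimated marginal cost, Reference Cost), taken from Table D2.
table_d2_rows = [
    ("B02", 523, 609),
    ("C22", 5605, 556),
    ("L48", 230, 158),
]


def cost_ratio(marginal_cost, reference_cost):
    """Estimated marginal cost as a multiple of the average (Reference) cost."""
    return marginal_cost / reference_cost


ratios = {code: round(cost_ratio(mc, rc), 2) for code, mc, rc in table_d2_rows}
print(ratios)
```

For C22 the estimated marginal cost is roughly ten times the Reference Cost, whereas for B02 it is below it, which illustrates how far the regression estimates can depart from the accounting figures in either direction.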