Development of a Chronic Obstructive Pulmonary Disease Severity Classification System using a Japanese Health Insurance Claims Database

Background: Healthcare services provided to patients should vary depending on disease severity. However, disease severity bias, a type of selection bias, is a commonly encountered problem in administrative database studies. Herein, we selected chronic obstructive pulmonary disease (COPD), which commonly affects elderly Japanese citizens, for the development and validation of a severity classification system based on a health insurance claims database. Methods: Patients who received COPD-related diagnostic codes in 2011 were selected from a commercially based health insurance claims database. COPD patients were randomly divided into two groups to develop and validate severity scores. A principal component analysis was used to estimate factor loadings, which served as weights in the calculation of COPD severity scores. Score validity was evaluated using a linear trend test to predict COPD treatment costs and acute exacerbation events. Results: Using records from 880 patients, ten variables were created: acute exacerbation events, emphysema diagnoses, laboratory test and oxygen therapy procedures, prescribed anticholinergic, inhaled corticosteroid (ICS), short-acting beta-agonist, and long-acting beta-agonist (LABA) agents, asthma diagnosis, and patient birth years. Factor loadings from LABA and ICS prescriptions had the strongest impacts on estimated severity scores (0.50 and 0.49, respectively). Among 300 validation group patients, scores were found to be associated with increasing trends in median costs and exacerbation risks (p for trend < 0.05). Conclusions: Estimated severity scores would help to predict COPD-related medical costs and exacerbation events. For further clinical implementation, this classification system should be re-evaluated using clinical lung function information indicative of COPD severity and treatment choices.
DOI: 10.14302/issn.2474-7785.jarh-17-1727
Corresponding author: Manabu Akazawa, Public Health and Epidemiology, Meiji Pharmaceutical University, 2522-1, Noshio, Kiyose, Tokyo 204-8588, Japan. E-mail: makazawa@my-pharm.ac.jp
Affiliations: Public Health and Epidemiology, Meiji Pharmaceutical University.


Introduction
Chronic obstructive pulmonary disease (COPD) is a progressive disease characterized by chronic dyspnea, cough, and sputum production, and is mainly attributed to long-term exposure to tobacco smoke.
COPD is the tenth-most common cause of death in Japan, and the number of associated deaths has exhibited an increasing trend [1]. A previous study estimated that 5.3 million individuals aged ≥40 years were at risk of COPD in 2001 (estimated prevalence rate: 8.6%) [2]. In addition, statistical surveys reported that patients with COPD accounted for expenditures totaling 151 billion yen (approximately 0.4% of total Japanese medical expenditures) in 2011 [3].
Health insurance claims databases, which reflect real-world clinical environments, are important research tools for drug safety monitoring, epidemiology, and health economic studies. These databases include information about provided medical services, including disease diagnoses, procedures, and prescribed medications. Under the universal health insurance coverage system in Japan, patients have equal access to all available services, thus allowing the databases to collect comprehensive information for patients living in Japan [4]. However, important clinical information, such as clinical test results and disease severity, is not included in these databases. Appropriate treatment is provided according to a patient's medical needs, which are determined by the disease condition and/or severity [5]. In the absence of such information, treatment effects estimated through database studies are often biased due to confounding by indication [6]. Thus, when using health insurance claims, summary variables indicative of disease conditions or severity must be created using diagnostic codes and/or prescribed medications. For example, the Charlson comorbidity index was developed to predict mortality [7], and the Elixhauser comorbidity measure was developed to predict health-related outcomes [8]. COPD severity is generally assessed according to the results of respiratory function tests, using the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria [9].
In the present study, we focused on COPD-related disease scores as a method of evaluating disease-specific costs and health service utilization. Various administrative database-focused COPD severity scores have been reported [10][11][12]. From among these, we selected a scoring system developed by Wu and colleagues that corresponded to our research purposes [10]. This score, which was developed in the United States (US) for patients with COPD who experience acute exacerbation events, was used to estimate drug utilization and medical costs related to COPD severity without relying on respiratory function test data [13].
However, the patients evaluated in that study might have had a more severe disease condition compared with the general population of Japanese patients with COPD. Moreover, drug selections for COPD management differ between the US and Japanese clinical environments: for example, the transdermal tulobuterol patch, a long-acting β2-agonist (LABA), has been frequently used in Japan for the long-term management of stable patients with COPD [14]. Therefore, we decided to adapt the severity scoring system developed by Wu and colleagues to the Japanese clinical environment and validate this modification using a Japanese administrative database.

Settings and Data
This study used a health insurance claims database maintained by the Japan Medical Data Center (JMDC, Tokyo, Japan), which contains inpatient, outpatient, and pharmacy prescription records collected for approximately 2 million individuals since January 2005 [4]. A majority of insured persons listed in the database are employees of large companies or their family members [15]. This database is a useful resource

Developing the Severity Score
The first claim involving a COPD diagnosis (ICD-10 codes: J42-J44) in 2011 was identified, and the month of that claim was defined as the index month. Next, information about COPD-related services provided to patients during the 12-month period after the index month was extracted. We created nine variables: acute exacerbation event, emphysema diagnosis, asthma diagnosis with medication, laboratory testing (respiratory function tests, x-ray photography, or CT imaging), oxygen therapy, and prescribed anticholinergic, inhaled corticosteroid (ICS), short-acting beta-agonist (SABA), and LABA agents [10].
As the insurance claims data contain no direct code indicating an acute exacerbation event, we created a proxy variable defined as an outpatient visit with a macrolide antibiotic or oral corticosteroid prescription, or an emergency department visit with a respiratory-related diagnosis, as used in previous studies [11]. For each variable, we counted the number of months in which it appeared in the monthly claims during the 12-month follow-up period (range for each variable: 0-12). In addition, a continuous variable indicating patient age at COPD diagnosis was created using the patients' birth-year data.
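The monthly counting step can be illustrated with a minimal Python sketch. The record layout `(year, month, variable)`, the function names, and the choice to count the index month as the first of the 12 follow-up months are assumptions made for illustration only; they are not taken from the JMDC database specification or from the study's actual code.

```python
from collections import defaultdict

def months_between(start, end):
    """Whole months from start to end, each given as a (year, month) tuple."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])

def count_variable_months(claims, index_month, follow_up=12):
    """Count, per variable, the distinct follow-up months with at least one claim.

    claims: iterable of (year, month, variable) tuples (hypothetical layout).
    index_month: (year, month) of the first COPD claim.
    Assumption: the window covers offsets 0..follow_up-1 from the index month.
    """
    seen = defaultdict(set)
    for year, month, variable in claims:
        offset = months_between(index_month, (year, month))
        if 0 <= offset < follow_up:
            # A set ensures several claims in one month count only once.
            seen[variable].add(offset)
    return {variable: len(months) for variable, months in seen.items()}
```

Because each month is counted at most once per variable, every count stays within the 0-12 range described above.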
A principal component analysis (PCA) was conducted to calculate factor loadings as weights [16]. PCA is a multivariate technique that is often used to reduce the number of variables. Each value of the relevant variables (age plus the nine variables indicated above) was multiplied by its own factor loading on the first principal component; these values were then summed to yield a severity score. For example, if the LABA factor loading was 0.6 and six LABA prescriptions were given in a year, the LABA score would be calculated as 0.6 × 6 = 3.6 points. Other points were similarly calculated and summed to yield each patient's original severity score. Because the variables are measured in different units, these scores were then standardized (mean: 50, standard deviation: 10). The reliability of the estimated scores was assessed using Cronbach's alpha [17]. In addition, we divided the severity scores into quartiles and calculated the mean value in each quartile to confirm increasing trends for each variable.
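The weighting and standardization steps can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions, not the study's actual procedure: the text does not say whether the PCA was run on the correlation or covariance matrix, nor whether "factor loading" means the eigenvector coefficient or a rescaled version of it. The sketch assumes the first eigenvector of the correlation matrix, and the function names are hypothetical.

```python
import numpy as np

def first_component_loadings(X):
    """First-principal-component coefficients of the correlation matrix.

    X: (patients x variables) array, e.g. monthly counts plus age.
    Assumption: the eigenvector coefficients are used directly as weights.
    """
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    loadings = eigvecs[:, -1]                # vector for the largest eigenvalue
    if loadings.sum() < 0:                   # the sign of an eigenvector is arbitrary
        loadings = -loadings
    return loadings

def severity_scores(X, loadings):
    """Weighted sum of the variables, rescaled to mean 50 and SD 10."""
    raw = X @ loadings                       # e.g. 0.6 * 6 = 3.6 points for LABA
    return 50 + 10 * (raw - raw.mean()) / raw.std(ddof=1)
```

In a development/validation design such as the one described here, `first_component_loadings` would be applied to the development group only, and the resulting weights passed to `severity_scores` for the validation group.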

Predictive Performance
COPD severity score validity was confirmed by estimating the annual costs of COPD treatment and probability of an acute exacerbation event. We assumed that the calculated severity scores would be able to predict increased trends in these values. [21,22].
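A trend in exacerbation probability across ordered severity categories can be checked with a sketch like the one below. The text refers only to a "linear trend test" without naming it, so the Cochran-Armitage test used here is an assumption, as are the function name and the equally spaced category scores.

```python
import math

def cochran_armitage_trend(events, totals, scores=None):
    """Cochran-Armitage test for a linear trend in proportions.

    events: number of patients with an exacerbation in each ordered category.
    totals: number of patients in each category.
    scores: category scores (assumed equally spaced: 0, 1, 2, ...).
    Returns (z statistic, two-sided p-value from the normal approximation).
    """
    if scores is None:
        scores = list(range(len(events)))
    n_all = sum(totals)
    p_bar = sum(events) / n_all  # pooled event proportion
    t_stat = sum(s * (x - n * p_bar) for s, x, n in zip(scores, events, totals))
    s1 = sum(n * s for s, n in zip(scores, totals))
    s2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n_all)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A positive z with a small p-value would indicate that the event proportion rises with the severity category. A trend test for median costs would need a different method and is not covered by this sketch.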

Patient Characteristics
We identified  3), and the results indicated increased trends in higher severity score quartiles.

Predictive Performance
Severity

Brief Statement of the Principal Findings
This study developed a COPD severity classification method using a Japanese administrative database and validated the performance of this method.
Score validity was confirmed by estimating COPD treatment costs and acute exacerbation risks, with higher scores indicating worse COPD conditions.
Accordingly, this severity classification system could be used as a risk adjustment factor to control for potential confounders in administrative database studies.

Each variable has a possible value of 0-12 except for age, which is a continuous variable.

colleagues was later used to examine the utilization and cost of medical services according to COPD severity [13], and was validated using another administrative database, although no direct comparison of respiratory function test values was performed [24].

Comparison with Similar Studies
In our study, we added an asthma variable and excluded three variables (hospitalization due to acute exacerbation, pulmonologist visit, and use of oral corticosteroid) from the method described by Wu and colleagues to increase score reliability, for the following reasons. First, asthma is an important risk factor for COPD, and approximately 50% of our study population had an asthma diagnosis. Second, our database included long-term hospitalized patients who required no aggressive treatments. Third, not all patients received COPD services from pulmonologists; some occasionally received services from doctors in other departments. In addition, when patients received COPD services from large hospitals, codes indicative of the doctors' specialties were often missing. Last, the oral corticosteroid variable was already used to define an acute exacerbation event.
In our study, prescriptions of anticholinergic, LABA, and ICS agents were strongly associated with higher severity scores. However, this trend was in contrast to the findings of
In addition, the data included in the JMDC database were collected through the insurance reimbursement process; therefore, information is rarely missing. Moreover, under the Japanese national health insurance program, all services provided to COPD patients should be almost fully covered. For these reasons, our classification method was developed using all records of COPD treatment provided to Japanese patients.
However, this study also included limitations common to administrative database studies [10,12].
Notably, we did not consider smoking history as a risk factor because this variable was not available in our database. Even without these data, our method was capable of describing COPD severity with regard to age and other treatment procedures.
Approximately 50% of patients in our study had comorbid asthma. It is difficult to clinically distinguish COPD from asthma, and therefore these diagnoses often overlap (asthma-COPD overlap syndrome). Previous research showed that the prevalence of asthma-COPD overlap syndrome was 1.8-56.0% [25][26][27][28]. Therefore, we did not remove patients with asthma from the study population.

Implications for Research
PCA often faces problems related to the low reproducibility of factor loadings as the basis of a scoring system.
Reproducibility depends on the treatment patterns in a database. Therefore, when using different data sources, researchers should re-calculate the factor loadings, as demonstrated in our study. Furthermore, additional studies in which our findings are applied to clinical practice are needed. We will compare the performance of COPD severity scores against clinical conditions using electronic health records at a large-scale hospital in Japan. The severity scores calculated from factor loadings in this study are relative values and cannot be used to describe the distribution of COPD severity (i.e., proportions of mild vs. more severe conditions).
Therefore, we will set the cut-off values according to the GOLD criteria. These criteria allow the classification of COPD conditions into four severity categories depending on the values of respiratory function tests. We will evaluate the scores using the c-statistic, positive predictive value, or negative predictive value according to the electronic health records database. These techniques have been used previously to assess model discrimination and validate severity classification methods [29][30][31].
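The evaluation measures named above can be sketched in plain Python. For simplicity the sketch assumes a binary reference standard (for example, severe vs. not severe according to respiratory function tests), whereas GOLD defines four categories; the function names and the "score at or above the cut-off is positive" rule are illustrative assumptions.

```python
def c_statistic(scores, labels):
    """Probability that a randomly chosen positive case outscores a negative one.

    Ties count as one half. labels: 1 for the reference-positive condition.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    concordant = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))

def predictive_values(scores, labels, cutoff):
    """Positive and negative predictive values for a given score cut-off."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and not y)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv
```

A c-statistic of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups; the predictive values depend on both the chosen cut-off and the prevalence of the condition in the evaluated population.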

Conclusion
Severity scores were re-calculated in the validating group (n = 300), using factor loadings from the development step. Scores ranged from 4 to 34, and their distribution was similar between the developing and validating groups, as shown in Fig 2. When severity scores were divided into three categories (mild, moderate, and severe/very severe), the median costs were 79,027 yen, 204,445 yen, and 422,463 yen, respectively, indicating an increasing trend (p for trend < 0.05, Fig 3). In addition, a similar increasing trend was observed for the risk of an acute exacerbation event (48%, 61%, and 83%, respectively; Fig 4).

Few attempts to

Figure 1. Selection criteria for the study population, chronic obstructive pulmonary disease.

Figure 2. Distribution of severity scores in the developing and validating groups.

Figure 3. Total costs (yen) per year for chronic obstructive pulmonary disease (COPD) treatment in the validating group.

Figure 4. Probability of acute exacerbation per year in the validating group.

Table 3. Average number of variables in the developing group. *Short-acting beta-agonist; †Long-acting beta2-agonist; Q1-Q4 indicate the score quartiles.

www.openaccesspub.org | JARH CC-license DOI: 10.14302/issn.2474-7785.jarh-17-1727 Vol-2 Issue2 Pg. no.-7

In this study, a COPD severity classification method based on an administrative database in Japan was developed. This method is able to estimate COPD conditions without requiring laboratory test or clinical symptom data. For clinical implementation, we will confirm the validity of this classification system through comparison with medical information, including laboratory data. This classification method is a very important step in the adjustment of potential outcome risk factors according to administrative databases.