Chapter 5 - Data sources, methods and limitations
5.1 Sources of data on vital status
5.1.1 Client Data Base
5.1.2 The National Death Index and the National Mortality Database
5.1.3 The electoral roll
5.1.4 Medicare
5.1.5 National Cancer Statistics Clearing House
5.2 Quality of Korean War Nominal Roll
5.3 Record linkage between the nominal roll and selected data sources
5.3.1 Matching by DVA
5.3.2 Matching by the AIHW
5.3.3 Matching with the NDI and NCSCH
5.3.4 Matching with the State and Territory BDM
5.3.5 Matching with overseas databases
5.3.6 Matching with the electoral roll
5.3.7 Matching by the Health Insurance Commission
5.4 Results of the matching process
5.4.1 Duplicate records in the nominal roll
5.4.2 Final results of matching
5.5 Summary and Discussion on Determination of Vital Status
5.5.1 Potential reasons for unknown status
5.6 Statistical Methods
5.6.1 Population at risk
5.6.2 Deaths amongst veterans
5.6.3 Expected number of deaths
5.6.4 Cause of death analysis
5.6.5 Mortality analysis by duration of service
5.6.6 Mortality analysis by period of service
5.7 Statistical Power
5.8 Smoking Prevalence
5.8.1 Calculation of estimated cancer mortality rates for varying levels of smoking prevalence
5.9 Statistical software used
References
In the conduct of a mortality study such as this one, the task of determining which Australian veterans of the Korean War have died since their service in Korea is of crucial importance. This information allows for an analysis of patterns of death in the cohort by various characteristics.
One of the key patterns to examine in a mortality study is the distribution of cause of death and how it compares with the Australian community. For the purposes of this study, deaths of veterans were analysed for the period after the servicemen left Korea until 31 December 2000. This study examines veterans of the Korean War, that is, those who returned to Australia from the war. Thus, while recognising the significance of the 349 deaths that occurred in Korea, these are excluded from this study.
Determining vital status (that is, whether alive or dead) was carried out in part using computerised matching of veterans' records with information in large national databases, such as the National Death Index (NDI), the electoral roll, DVA databases and other registers. Primarily, the Nominal Roll was matched against the DVA databases, as these contained information about both living and deceased veterans. Matching of deaths before 1980 was performed manually by the Registrars of Births, Deaths and Marriages.
5.1 Sources of data on vital status
Registration of deaths in Australia is compulsory and is the responsibility of the State and Territory Registrars of Births, Deaths and Marriages (RBDM). All veterans who died in Australia should be registered with the RBDM, but the quality of information (e.g. the lack of computerised records in the early years, changing names of veterans, incomplete date of birth) does not always make it easy to access or confirm the death. Therefore multiple sources of information are needed to maximise coverage and to get the best evidence regarding the vital status of each veteran.
Tables 5-1 and 5-2 summarise the different sources of vital status data used in this study. Table 5-1 shows the period covered for death information and Table 5-2 shows events indicating whether a subject is alive and on what date.
| Table 5-1: Summary of sources of vital status - death | |
|---|---|
| Died when? | Source |
| On active service in Korea | Department of Defence |
| In service, post-Korea | Department of Defence |
| Between 1950 and 1980 | Australian State and Territory Registries of Births, Deaths and Marriages |
| After 1980 | National Death Index |
| Since Korean service | Department of Veterans' Affairs Client Data Base; New Zealand Registry of Births, Deaths and Marriages |
| After 1984 | Health Insurance Commission Medicare database |
| Table 5-2: Summary of sources of vital status - alive | ||
|---|---|---|
| Action indicating the subject is alive | Assumed alive on the date of | Source |
| Receiving a Veterans' Affairs pension | their last payment | Department of Veterans' Affairs Client Data Base |
| Made a Medicare claim | their last claim | Health Insurance Commission Medicare database |
| Enrolled to vote | extraction of the roll | Australian Electoral Commission rolls |
5.1.1 Client Data Base
DVA maintains a Client Data Base, which provides a central source of information about veterans who have registered for any benefit provided by DVA. The Client Data Base record contains information on surname, given name, other initials, date of birth, date of death and some information on military service. It records the service on which a claim was determined, but records any subsequent service inconsistently.
Data quality
Because the personal data, names and pension details on the Client Data Base are regularly used and referred to in correspondence with veterans, these details are believed to be current and accurate. However, details of military service are less reliable and often incomplete, as this database was originally intended for payment management, not military service tracking. For this reason, the Client Data Base was not used as a source of data on service details. Such details were obtained from the relevant Service records office. However, Service numbers, where recorded, provided confirmation of correct matches from other sources. It should be noted that pension-related details were not accessed for the purposes of this study. Many veterans who died prior to the development of computerised databases are recorded as deceased but with no date of death.
DVA has no information on the vital status of Korean War veterans, or their dependants, who have not registered for any benefit provided by the Department.
5.1.2 The National Death Index and the National Mortality Database
The NDI is a database located at the AIHW. It contains identified records of all deaths in Australia registered from 1980 onwards. In excess of 2.5 million records are contained in the database. The Registrars of Births, Deaths and Marriages (RBDMs) in each Australian State and Territory supply the information for this database. As registration of death is a legal requirement, the database is virtually complete for deaths in Australia. The data available for matching in the NDI covered the period from 1980 to 2002 for all States and Territories, and some 2003 data.
Although the NDI identifies each person who dies, it does not record the cause of death in a standardised manner. This standardised cause of death information is available in the National Mortality Database, also located at the AIHW.
The National Mortality Database contains de-identified information on each person's underlying cause of death, coded using the International Statistical Classification of Diseases, Injuries, and Causes of Death (ICD) 1. An NDI record can be linked to its corresponding record in the National Mortality Database, via a common registration number to obtain cause of death information, under Ethics Committee approval.
The data quality of the NDI varies considerably between States and Territories and over time within each State and Territory. Data quality and completeness affected the matching strategy and the results of data matching for this study. The NDI does not have full dates of birth for:
- Queensland for the period 1980-1996 inclusive;
- New South Wales for the period 1980-1992 inclusive; and
- Victoria for the period 1980-1989, inclusive.
In these situations, a year of birth is derived from the date of death and the age at death.
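The derivation of a year of birth from a date of death and an age at death can be illustrated with a minimal sketch. The source does not state which convention the NDI applies, so the function below (a hypothetical helper, not part of the NDI) simply returns the range of birth years consistent with a completed age at death.

```python
from datetime import date


def possible_birth_years(date_of_death: date, age_at_death: int) -> tuple[int, int]:
    """Return (earliest, latest) birth year consistent with a completed age at death.

    Assumption: age_at_death is the age in completed years. The NDI's own
    derivation rule is not described in the source.
    """
    # If the birthday had already occurred in the year of death:
    latest = date_of_death.year - age_at_death
    # If the birthday had not yet occurred in the year of death:
    earliest = latest - 1
    # A death on 31 December always falls on or after that year's birthday.
    if (date_of_death.month, date_of_death.day) == (12, 31):
        earliest = latest
    return earliest, latest


print(possible_birth_years(date(1993, 5, 10), 63))
```

Because the derived year can be out by one, matching on a derived year of birth is usually treated more leniently than matching on a full date of birth.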
Within the NDI there are inconsistencies in the way names are recorded. Data standardising procedures were therefore applied to the NDI in order to reduce inconsistencies. Examples are provided in section 5.3.3.
While personal information is usually provided about the deceased by the next of kin, acquaintance or official of the institution where the death occurred, information on the cause of death is variously supplied by family doctors, hospital resident physicians, pathologists, or the coroner's office. This large range of information sources contributes to the variable quality of cause of death data and a degree of inaccuracy overall. This situation also applies to the data held by the State and Territory and New Zealand Registries of Births, Deaths and Marriages. However, it should be recognised that Australia has one of the best death information systems in the world.
During the period of the study, six different versions of the ICD codes were in use in Australia. Disease classifications changed over time and meaningful observations could not be made for all diseases of interest for the entire study period. For example, chronic obstructive pulmonary disease (COPD) was first identified as a unique entity in the codex in 1979. Prior to this time, this disease was included in a range of other codes. Thus observations about COPD can be made for the period 1979 to 2000 only.
5.1.3 The electoral roll
The electoral roll is maintained by the Australian Electoral Commission (AEC). Matching against the electoral roll is undertaken by the AEC and is subject to its approval in accordance with legal and privacy legislation. Most living Australian citizens will appear on the roll.
Enrolment on the electoral roll is compulsory for all Australian citizens who have attained 18 years of age. However, the following people are not entitled to have their name included or retained on any electoral roll:
- the holder of a temporary visa;
- an unlawful non-citizen under the Migration Act 1958;
- a person of unsound mind;
- a person serving a sentence of five years or longer for an offence against the law of the Commonwealth or of a State or Territory; or
- someone who has been convicted of treason or treachery and has not been pardoned.
Few Korean War veterans would be expected to be ineligible to have their names included or retained on the electoral roll.
Data quality
There are known to be multiple registrations on the electoral roll of persons across States and Territories. This occurs if a person moved between States or Territories of Australia and the previous entry had not been removed from the electoral roll.
Recorded names may not necessarily be legal names and there are persons who have died but their deaths are not known to the AEC.
5.1.4 Medicare
The Health Insurance Commission (HIC) has administered Medicare, Australia's national health insurance scheme, since its introduction on 1 February 1984. The scheme provides free access to hospital services for all Australian residents and subsidises the costs of a range of other medical services 2.
Two databases are maintained by the HIC: one of persons enrolled in the Medicare scheme; and one for claims processing. As at 30 June 2002 there were 10,146,235 males enrolled with Medicare, which is 103.8% of the estimated resident male population of Australia 3. Medicare enrollees include some persons who are not Australian residents (e.g. long term visitors whose stay is greater than 6 months and eligible short term visitors).
When notified, the HIC records the date of death and the date of departure from Australia of persons on its database, but more commonly the record just becomes inactive. During 1992-93, approximately 800,000 records were culled from the enrolment database of those who had not claimed for five or more years 2.
The HIC only keeps records of claims made in the previous five years. Older claims are deleted from the database. As only recent and active records are kept, matching with HIC Medicare data can reliably ascertain that a person is alive provided they have made a claim in the last five years. Conversely, as information on deaths and departures from Australia is only gathered if the information is proffered, the finding of this type of information is less reliable than other sources.
Although many clients of DVA are entitled to free health care for accepted disabilities, most of those services are billed through the HIC.
5.1.5 National Cancer Statistics Clearing House
Cancer is a notifiable disease in all States and Territories. The data are collected by cancer registries and include clinical and demographic information about people with newly diagnosed cancer. This information is obtained from hospitals, pathologists, radiation oncologists, cancer treatment centres, nursing homes and Registrars of Births, Deaths and Marriages.
The AIHW is responsible for the national collection of cancer incidence statistics through the National Cancer Statistics Clearing House (NCSCH). The NCSCH receives data from individual State and Territory cancer registries on cancer diagnosed in residents of Australia. National statistics are available for all years from 1982 to 1999, for the purposes of this study. The database is updated annually.
The NCSCH was used as an additional check to determine the vital status of the Korean War veterans. The important data items for this purpose are names, date of birth and date of diagnosis. Surname was available for all records, first name for 99.9% of the records, second name for 52% (not all persons have a second name), date of birth for 99.9% and date of diagnosis for 99.9%.
5.2 Quality of Korean War nominal roll
Any individual, who was either a member of the ADF or a civilian from an organisation accredited by the ADF, who physically entered the Korean War Operational Area during the qualifying period between July 1950 and April 1956, was included in the nominal roll. The roll provides a list of names, dates of birth and service type of 17,813 male veterans, as well as all deaths of veterans notified to DVA.
Missing or incomplete data items reduced the chances of matching the nominal roll records with the NDI or other databases. Thus, failure to match with the NDI may falsely indicate that the veteran is alive (false negative) or, conversely, an incorrect match may give the false impression that the veteran is dead (false positive). Such errors may arise simply as a result of missing or incomplete data in the source record. Table 5-3 shows that missing and incomplete data were a minor concern for the nominal roll. Practically all first forenames were recorded: for only one serviceman was just an initial recorded. Most second forenames were recorded in full, but for 14% of cases this data item was missing, although the percentage of missing second names compared with those who had no second name to record is unknown. The number of records with missing dates of birth was negligible for the Navy and Army (three and one, respectively) and nil for the Air Force. In all, the quality of the nominal roll was considered good for matching purposes.
| Table 5-3: Frequencies of incomplete and missing data | |||
|---|---|---|---|
| Roll | Initial only for first name | No second name (a) | Missing date of birth |
| Navy | 1 | 745 | 3 |
| Army | 0 | 1,696 | 1 |
| RAAF | 0 | 131 | 0 |
| (a) Not all persons have a second name | |||
5.3 Record linkage between the nominal roll and selected data sources
5.3.1 Matching by DVA
DVA was responsible for identification of potential duplicate records and matching the nominal roll of Australian veterans of the Korean War with information indicative of vital status of veterans available within DVA. This included matching with the Client Data Base records of deaths and the Client Data Base records of payments to veterans. Approximately 81% of all Korean War veterans were identified on the DVA databases.
The nominal roll was matched with the Client Data Base records of veterans receiving payment of a pension or allowance from DVA. If there was a match the veteran was recorded as being alive. It was then matched against records of death. If there was a match, the veteran's date of death and cause of death, if recorded, were entered onto the nominal roll. Although date of death was frequently recorded, cause of death in ICD format had to be obtained from the National Mortality Database. Known deaths, and those remaining of unknown status, were then matched with the NDI, Medicare and electoral roll.
For these matches only an exact match of surname, forenames, day, month and year of birth or an exact match of surname and service number was permitted. These criteria were more stringent than those for matching into the NDI, where a probabilistic approach was taken, and were thus given precedence. Results of matching from external databases were then used to search for further Korean War veterans in the client database, to identify variations in recorded names and dates of birth. Discrepancies in matches were resolved by manual search of DVA and service records. Manual searches of DVA records were also undertaken for known deceased to determine dates and causes of death that were not recorded on the client databases. This was undertaken to increase the certainty of the probabilistic matching by AIHW. For the majority of discrepancies between DVA and external databases, correct matching of Korean War veterans on DVA databases could be confirmed by information located on DVA files, including service number, dates or location of service and branch of Service.
All results of matching on external databases were subject to clerical review to try to resolve discrepancies. A number of discrepancies remained between DVA records and information provided on the initial nominal roll. Over 1,000 individual Service and DVA records were manually searched and a number of transcription errors were located. Many of these were attributed to the quality of handwritten records nearly 50 years old, but a significant number of discrepancies could not be resolved. For example, 470 Korean War veterans had different dates of birth recorded on Service and DVA records. A study nominal roll was finalised which recorded the results of all efforts to resolve discrepancies prior to matching by AIHW for cause of death information.
5.3.2 Matching by the AIHW
The AIHW was responsible for:
- matching with the NDI and the NCSCH; and
- supervising the matching with the State and Territory BDM Registries.
Identification of potential duplicate records and matching with the NDI and the NCSCH were undertaken using the Integrity software 4. Integrity links records that are believed to relate to the same individual. The process is described as 'probabilistic' because for each linkage there is an associated degree of certainty that the records are correctly paired, the same as if the process were carried out manually 5.
The package calculates the likelihood of a correct linkage, i.e. that the records represent the same individual. The higher the likelihood of a correct linkage, the higher the weight accorded the match. Below a designated cut-off value, the weight of the match is too low to be considered a correct linkage and the records linked are considered to be different individuals.
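The idea of a match weight and a cut-off can be sketched with the classic Fellegi-Sunter formulation of probabilistic linkage. The source does not describe the internal workings of the Integrity package, so the field list, the m- and u-probabilities and the cut-off below are all hypothetical values chosen only for illustration.

```python
import math

# Hypothetical per-field probabilities (not taken from the study):
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELDS = {
    "surname": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "birth_year": (0.98, 0.02),
}

CUTOFF = 5.0  # hypothetical cut-off weight


def match_weight(agreements: dict[str, bool]) -> float:
    """Sum log2 likelihood ratios over fields (Fellegi-Sunter style)."""
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if agreements[field]:
            total += math.log2(m / u)              # agreement adds weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts weight
    return total


def is_link(agreements: dict[str, bool]) -> bool:
    """Treat a pair as the same individual only at or above the cut-off."""
    return match_weight(agreements) >= CUTOFF


full = {"surname": True, "first_name": True, "birth_year": True}
partial = {"surname": True, "first_name": False, "birth_year": False}
print(round(match_weight(full), 2), is_link(full))
print(round(match_weight(partial), 2), is_link(partial))
```

In a scheme of this kind, agreement on a rarely shared value (low u) contributes more weight than agreement on a commonly shared one, which is why common surnames tend to produce unresolved multiple matches.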
5.3.3 Matching with the NDI and NCSCH
The nominal roll, the NDI and the NCSCH files were standardised to improve the likelihood of successfully matching veterans' details. This meant that apostrophes, hyphens and other miscellaneous characters were removed from surnames, and dates of birth and dates of death, where available, were presented within valid ranges. Soundex and New York State Intelligence Information System (NYSIIS) coded versions of the standardised surnames were created, which allow for variations in spelling of names (e.g. Smith, Smithe, Smythe). Standard versions of first names were added to all files (e.g. Robert for Bob and Rob) 5. Allowance was made for slight variations and transpositions in dates of birth.
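The standardisation steps described above can be sketched as follows. This is an illustrative implementation of surname cleaning and the standard American Soundex code only; it is not the code used in the study, NYSIIS is omitted for brevity, and the nickname table is a hypothetical two-entry example.

```python
import re

# Soundex digit for each coded consonant; vowels, H, W and Y carry no code.
_CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}

# Hypothetical nickname table; a real one would be far larger.
_STANDARD_FIRST_NAMES = {"BOB": "ROBERT", "ROB": "ROBERT"}


def standardise_surname(raw: str) -> str:
    """Upper-case and drop apostrophes, hyphens, spaces and other non-letters."""
    return re.sub(r"[^A-Z]", "", raw.upper())


def standard_first_name(raw: str) -> str:
    """Map a known short form to its standard version, else return it cleaned."""
    name = standardise_surname(raw)
    return _STANDARD_FIRST_NAMES.get(name, name)


def soundex(surname: str) -> str:
    """Return the four-character American Soundex code of a surname."""
    name = standardise_surname(surname)
    if not name:
        return ""
    digits = []
    prev = _CODES.get(name[0], "")
    for ch in name[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        if ch not in "HW":  # H and W do not separate letters with the same code
            prev = code
    return (name[0] + "".join(digits) + "000")[:4]


print([soundex(n) for n in ("Smith", "Smithe", "Smythe")])
```

All three spelling variants quoted in the text receive the same code, so a comparison on the Soundex field would treat them as agreeing even though the raw surnames differ.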
5.3.4 Matching with the State and Territory BDM
It was considered likely that a significant proportion of the "unknown" group may have been missed because they had died during the period from 1950, when the first veterans returned from Korea, to 1980, immediately prior to the establishment of the NDI. In order to capture these deaths, the "unknown" group was matched against State and Territory death records for the period (except the Northern Territory, where the possible returns were deemed too low). NSW and Tasmanian records were matched in part by electronic means; all other records were matched manually. In some circumstances this meant searching nearly 30 yearbooks for approximately 1,500 names.
The data quality of the Registries' mortality information varies between States and Territories and over time within each State and Territory. Varying storage and indexing methods also influence the results of the data matching carried out for this study. Personnel carrying out the matching were provided with guidelines and encouraged to include doubtful matches which could then be further examined by the AIHW to maximise consistency across States and Territories. The relatively conservative matching criteria adopted for the NDI and NCSCH matching were then applied to the State and Territory BDM Registry results.
These deaths were then processed by the AIHW collaborating unit, the National Centre for Classification in Health, in order to determine the underlying cause of death. These deaths were coded in a way that was consistent with the coding rules used at the year of death, in order to maintain consistency with other deaths at that time.
5.3.5 Matching with overseas databases
Veterans who are living or who have died overseas are also part of this "unknown" group. The group was matched against New Zealand death records for the period from 1950 by the New Zealand Registry of Births, Deaths and Marriages. Matching against UK and USA databases was considered but was deemed too costly and time consuming for limited results.
5.3.6 Matching with the electoral roll
A file was created of Korean War veterans who were not located on the DVA client databases, including those whose files were inactive and hence whose vital status was unknown to DVA. These unmatched records, and those whose vital status remained unknown, were forwarded to the AEC for matching.
Exact matches were accepted as indicating that the veteran was alive. Close matches such as variations in spelling or date of birth resulted in a further search of the DVA client database, and a number of Korean War veterans were subsequently identified by this process. Where clerical review was unable to resolve discrepancies, a veteran's status was assessed as unknown.
5.3.7 Matching by the Health Insurance Commission
The HIC was responsible for matching 1,934 veterans whose vital status was unknown (i.e. no match with the Client Data Base or electoral roll) with its Medicare enrolment database records, then retrieving the most recent claim from the claim database.
For matching with the Medicare enrolment database, a generalised matching program developed by the HIC was used. This program distinguished three levels of match:
- an exact match of surname, given names and the day, month and year of birth;
- an exact match of surname, given names, and the month and year of birth; and
- an exact match of the day, month and year of birth and a phonetic match for the surname and given names.
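The three levels of match listed above can be expressed as a small classification function. This is a sketch only: the HIC program itself is not described in the source, and `Person`, `match_level` and the `crude_key` phonetic stand-in are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass(frozen=True)
class Person:
    surname: str
    given_names: str
    dob: date


def crude_key(name: str) -> str:
    """Hypothetical phonetic stand-in: keep the first letter, drop later vowels."""
    name = name.upper()
    return name[:1] + "".join(c for c in name[1:] if c not in "AEIOUY ")


def match_level(a: Person, b: Person,
                phonetic: Callable[[str], str] = crude_key) -> Optional[int]:
    """Return 1, 2 or 3 for the three levels in the text, or None for no match."""
    same_names = a.surname == b.surname and a.given_names == b.given_names
    if same_names and a.dob == b.dob:
        return 1  # exact names and full date of birth
    if same_names and (a.dob.year, a.dob.month) == (b.dob.year, b.dob.month):
        return 2  # exact names, month and year of birth only
    if (a.dob == b.dob
            and phonetic(a.surname) == phonetic(b.surname)
            and phonetic(a.given_names) == phonetic(b.given_names)):
        return 3  # exact date of birth, phonetic agreement on names
    return None


roll = Person("SMITH", "JOHN", date(1930, 5, 1))
print(match_level(roll, Person("SMYTH", "JOHN", date(1930, 5, 1))))
```

Checking the levels in order means a pair is always reported at the strictest level it satisfies.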
Each matched record was linked to the claim database to determine the date on which the subject last received a medical service. That is, the date they were last known alive was recorded, unless there was a more recent date of death or departure from Australia. The results of matching underwent clerical review to resolve discrepancies.
5.4 Results of the matching process
5.4.1 Duplicate records in the nominal roll
In the initial nominal roll of 17,881 veterans, a number of duplicate records were noted. Three were duplicate entries, while seven were veterans who had served in Korea in more than one Service. The latter were classified by the branch of the Service in which they first served in Korea. The definitive nominal roll thus consisted of 17,871 Korean War veterans. This list, with an additional 470 duplicates, was submitted to the AIHW for matching. The duplicate entries consisted of 467 names with alternative dates of birth and three names with alternative dates of death. These alternative dates of birth and death could not be resolved even after examination of Service and DVA records. After investigation, the appropriate duplicate records were removed from the final roll before conducting any analysis.
5.4.2 Final results of matching
The Korean nominal roll received by AIHW contained 18,341 records. Female veterans (58, comprising 37 Army and 21 Air Force), duplicate records (470) and those veterans who died in the Korean War (349, comprising nine Navy, 302 Army and 38 Air Force) were excluded from the study. The final cohort for analysis was therefore 17,464. The summary results of matching are presented in Table 5-4, which shows that vital status as at 31 December 2000 was determined for 94.9% of the cohort and that 5.1% were 'lost to follow-up'.
| Table 5-4: Summary results of nominal roll matching (a) | ||||
|---|---|---|---|---|
| Group | Alive | Dead | Unknown | Total |
| Navy | 3,256 | 2,271 | 239 | 5,766 |
| Army | 4,976 | 4,929 | 626 | 10,531 |
| Air Force | 638 | 508 | 21 | 1,167 |
| Total | 8,870 | 7,708 | 886 | 17,464 |
| Per cent | ||||
| Navy | 56.5 | 39.4 | 4.1 | 100 |
| Army | 47.3 | 46.8 | 5.9 | 100 |
| Air Force | 54.7 | 43.5 | 1.8 | 100 |
| Total | 50.8 | 44.1 | 5.1 | 100 |
| (a) Status as at 31 December 2000 | ||||
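The cohort size and the percentages in Table 5-4 follow directly from the counts reported in this section. The short check below uses only figures quoted in the text and the table; the variable names are illustrative.

```python
# Records received by AIHW and the three exclusions described in the text.
received = 18_341
excluded = {
    "female veterans": 58,
    "duplicate records": 470,
    "died in the Korean War": 349,
}
cohort = received - sum(excluded.values())

# Counts from Table 5-4 as (alive, dead, unknown).
table = {
    "Navy": (3256, 2271, 239),
    "Army": (4976, 4929, 626),
    "Air Force": (638, 508, 21),
}
totals = tuple(sum(column) for column in zip(*table.values()))


def percentages(row: tuple[int, int, int]) -> tuple[float, ...]:
    """Express each count as a percentage of the row total, to one decimal place."""
    n = sum(row)
    return tuple(round(100 * x / n, 1) for x in row)


print(cohort, totals, percentages(totals))
```

The row totals sum to the same 17,464 obtained by subtracting the exclusions from the records received, so the two derivations of the cohort size agree.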
In this study, the Air Force had the lowest proportion of subjects 'lost to follow-up' (1.8%). The figure was higher for the Navy and the Army at 4.1% and 5.9%, respectively. A possible explanation for these higher percentages in the Army and Navy is that the number of overseas recruits was higher in these groups, a proportion of whom returned to their country of origin after the war, making follow-up more difficult.
The Army had the highest proportion of subjects classified as deceased (46.8%), slightly higher than the Air Force (43.5%). The lowest proportion of deaths (39.4%) was found in those who served in the Navy.
5.5 Summary and discussion on determination of vital status
The prime objective of the matching was to determine the vital status of as many members of the cohort as possible. To achieve this, the study used a variety of sources of vital status data. Some of these data are specific to Korean War veterans while others are general to the whole Australian population.
The cohort was first matched with data held by DVA. This included data on deaths obtained from the Department of Defence and data on deaths and those alive, obtained from the DVA Client Data Base. These sources were not mutually exclusive. Deaths that occurred before 1980 (including combat deaths) were identified from these sources.
Those veterans whose status remained unknown were then matched with the Medicare database, the Australian Electoral Roll and the NDI. Following clerical review, a final roll was prepared for forwarding to AIHW to determine cause of death.
All members of the cohort who were not assessed as being alive were then matched with the NDI to identify deaths in the period 1980-2000 not previously known to DVA and to obtain each underlying cause of death code from the associated National Mortality Database. Cause of death codes were obtained in this way in an effort to minimise potential bias that could have occurred if some of these had been obtained from DVA databases. The whole cohort was concurrently matched with the electoral roll to identify those who were alive. No further matches were obtained.
Overall, 50.8% of the cohort was determined to be alive and 44.1% were accepted as having died as of 31 December 2000, nearly 50 years after the conflict. The remaining 5.1% had a vital status that remains unknown. This latter group is on average 9.5 months older than the rest of the cohort. Of the unknown group, 19.6% would be aged 60-69 in 2000, 67.5% aged 70-79, 11.2% aged 80-89 and 1.7% aged 90 or older.
Some idiosyncrasies with the process described were noted. The DVA Client Data Base and other databases included deaths after 1980 that were not identified in the NDI. Similarly, these databases identified deaths prior to 1980 which were not identified in the BDM Registries. Thus, even with inclusive matching criteria, not all deaths were correctly found in the NDI or with the Registrars. The clerical review of the matching revealed a number of errors in the original nominal roll, DVA databases and also in the NDI. Clerical errors were noted in the transcription of names, dates of birth and dates of death. In addition, a systemic error was detected in the NSW death data between 1980 and 1991 that was foreshadowed in section 5.1.2. About 28 mismatches in dates of death were identified where the NSW death data had assigned a date of death of 30 June. These errors were reported to AIHW.
An inspection of the names of those whose vital status could not be determined revealed that a high proportion of these people have common names, which resulted in multiple matches against the NDI that could not be resolved. Some names were likely to be anglicised and there were also some names that are commonly abridged or replaced by some other name. For example, Australians who are named William are often known as Bill or Will or Wills (such as Bill Hayden) and often the less formal name is entered on death certificates. People named Benedict are often known as Ben (as in Benedict Chifley) but so are people whose more formal name is Benjamin. Other Australians choose to be known by their second name, such as Edward Gough Whitlam and John Malcolm Fraser. The matching software that was used did try to accommodate this, but it is unlikely to have been completely successful. The effect of this missing data would be to under-estimate the number of deaths; in other words, bias towards the norm.
5.5.1 Potential reasons for unknown status
The group of veterans with an unknown vital status will possibly contain subjects who died, most likely before 1 January 1980, the first date for data in the NDI, and who were not captured by any of the DVA databases or the manual searches by the various RBDMs.
Other reasons for 'lost to follow-up' include:
- emigration from Australia since the end of the Korean War;
- change of name since the end of the Korean War;
- living in certain types of institutional care;
- living in Australia but not recorded on the electoral roll;
- multiple matches with the NDI that were unable to be resolved;
- typographical or other errors in data records in nominal roll and/or databases used as sources of vital status information; and
- incorrect birth date recorded on the nominal roll as a result of membership in the 'Unders and Overs Club' in which veterans who were too old or too young for service enlisted for Korea using a false date of birth.
There is no additional information available about this group of veterans and therefore no further analysis was possible to investigate any bias these veterans may introduce.
In summary, from a total cohort of 17,464 male Korean War veterans followed up after approximately 50 years, the vital status of 5.1% remained unknown. This is comparable to the 3.1% unknown in the Vietnam Veterans Mortality Study 6 where 59,036 veterans were followed up after approximately 30 years.
5.6 Statistical methods
The statistical analysis of the Korean War veteran cohort employed standard statistical methods for cohort studies 7. Essentially three components are necessary to undertake the study:
- the population at risk in the veteran cohort by age and calendar period, duration of service and service type;
- the number of deaths in the veteran cohort by age, calendar period, duration of service, service type and cause; and
- the age-specific mortality rates by calendar period and cause for the comparison population, in this case Australian males.
5.6.1 Population at risk
In broad terms, the population at risk was derived by the person-years method, which estimates the length of time each cohort member was out of Korean service and alive during the period of observation, 1950 to 2000. Person-years at risk were estimated for each calendar year and for each five-year age group.
Korean War veterans became part of the population at risk from the time they completed their Korean War service until they died or until the study end date, 31 December 2000. For example, a 23-year-old soldier departing Korea in 1953 and dying in 1993 aged 63 would contribute 40 person-years to the population at risk. Veterans who completed their service before 30 June were allocated to that year's population at risk; veterans who completed their service after 30 June were allocated to the next year's population at risk.
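The allocation rule above can be sketched in a few lines of Python. This is a minimal illustration, not the study's own procedure: the function names are hypothetical, service ending on 30 June itself is assumed to count as "before 30 June", and person-years are assumed to be counted in whole calendar years from the entry year up to, but not including, the year of death (which reproduces the 40 person-years of the example).

```python
from datetime import date
from typing import Optional

STUDY_END_YEAR = 2000  # follow-up ends 31 December 2000


def entry_year(service_end: date) -> int:
    """Calendar year in which a veteran joins the population at risk."""
    # Assumption: a service end date of exactly 30 June counts as "before 30 June".
    cutoff = date(service_end.year, 6, 30)
    return service_end.year if service_end <= cutoff else service_end.year + 1


def person_years(service_end: date, death_year: Optional[int] = None) -> int:
    """Whole person-years contributed between leaving Korea and death or study end."""
    start = entry_year(service_end)
    # Assumption: the year of death itself is not counted. Deaths after 2000
    # are treated as alive at the end of the study.
    if death_year is not None and death_year <= STUDY_END_YEAR:
        stop = death_year
    else:
        stop = STUDY_END_YEAR + 1
    return max(0, stop - start)


# The example from the text: departs Korea in 1953 (a March date is assumed
# here for illustration) and dies in 1993.
print(person_years(date(1953, 3, 1), death_year=1993))
```

In a full analysis each of these person-years would also be assigned to a five-year age group, so that it can be matched to the corresponding reference mortality rate.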
Several small variations to the population at risk, compared with the results of nominal roll matching, needed to be considered: 79 deceased veterans had no recorded date of death or cause of death, and four veterans (two deceased, two of unknown status) had no known date of birth. As no further evidence to clarify the records relating to these veterans was forthcoming, these records were excluded from the overall analysis.
Veterans whose last date of service in Korea is unknown (17 records) were added to the population at risk with a notional date of 31 December 1956. Table 5-5 shows the formation of the final population at risk for the analysis.
Table 5-5: Population at risk

| Roll | Alive | Deceased | Unknown status | Total |
|---|---|---|---|---|
| Number | 8,870 | 7,625 | 886 | 17,381 |
| Percentage | 51.0 | 43.9 | 5.1 | 100.0 |
The 886 veterans with an unknown vital status were not in contact with DVA, and were not found on the Australian Electoral Roll or any other databases interrogated for this study. For these veterans, it was therefore not possible to determine whether they were still alive and residing in Australia or if they had died or moved permanently overseas. This group is referred to as the 'veterans whose vital status is unknown' for the purpose of this study.
In this type of cohort study the size of this unknown vital status group may influence the results and therefore needed to be accounted for in the analysis. This was managed by considering the population at risk using two scenarios:
- Scenario 1 excludes all veterans of unknown vital status. This assumes that the veterans lost to follow-up have the same rate of death as the other Korean War veterans. If the death rate of those lost to follow-up is substantially different, then the Standardised Mortality Ratio (SMR) under this scenario may over- or under-estimate the true situation.
- Scenario 2 includes the veterans of unknown vital status and assumes that all of them are still alive and residing in Australia. This is unlikely, so the analysis under Scenario 2 will produce SMR estimates that are lower than the 'true' values. Comparing the two scenarios allows the effect of excluding the unknowns in Scenario 1 to be assessed.
In presenting the findings from the analysis later in this report, both population Scenarios are provided.
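The direction of the Scenario 2 effect follows from simple arithmetic. As a sketch in assumed notation (the symbols $O$, $E_{\text{known}}$ and $E_{\text{unknown}}$ are introduced here for illustration and are not used elsewhere in the report):

$$
\mathrm{SMR}_1 = \frac{O}{E_{\text{known}}}, \qquad
\mathrm{SMR}_2 = \frac{O}{E_{\text{known}} + E_{\text{unknown}}} \le \mathrm{SMR}_1
$$

where $O$ is the number of observed deaths, $E_{\text{known}}$ is the number of deaths expected from the person-years of veterans with known vital status, and $E_{\text{unknown}} \ge 0$ is the additional number expected from the person-years credited to the veterans of unknown status. Scenario 2 adds person-years but no deaths, so it can only lower the ratio.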
5.6.2 Deaths amongst veterans
The number of deaths in the veteran cohort by age, calendar period, duration of service, Service type and cause was derived from the final matched file described in earlier sections. As part of the veteran follow-up exercise, an additional 531 veterans were found to have died after 31 December 2000, but for the purposes of this study are considered alive.
Australian mortality rates, against which the Korean War veterans' deaths are compared, refer only to deaths on Australian soil. Korean War veterans who died in the Vietnam War (13 veterans) or elsewhere overseas (105 veterans) are therefore not included in this study. The 79 deceased veterans with an unknown date of death and the two deceased veterans with an unknown date of birth are also excluded from the analysis. A total of 7,514 veteran deaths were analysed in this study. The excluded categories are not mutually exclusive. For example, five of the 531 veterans who died after 31 December 2000 died overseas and are included in the 118 overseas deaths. A flow chart for determining the number of veteran deaths for analysis is given in Figure 5-1.
Figure 5-1: Determination of Vital Status and Number of Deaths for Analysis
5.6.3 Expected number of deaths
To determine whether mortality patterns in the veteran cohort differ from those experienced in the Australian population, a standardised mortality ratio (SMR) approach is taken. This method compares the number of deaths amongst veterans with the number expected if Australian mortality rates were applied to the veteran population at risk. The expected number of deaths was computed by multiplying the person-years for each age group and calendar year in the veteran cohort by the corresponding Australian mortality rate and summing the results.
The Australian mortality rate is usually available for the complete study period, 1950-2000, but for some diseases these rates are only available for the later years. In those situations the observed number of deaths is restricted to the reduced time period. Deaths occurring in the earlier years are, however, analysed in the larger disease groupings. For example, head and neck cancer deaths are only counted for the period 1968-2000, but head and neck cancer deaths that occurred before 1968 are included in the 'Neoplasm comparison' as it investigates the 1950-2000 period. The situation is clarified in the tables under the heading 'Period examined'.
The standardised mortality ratio (SMR) is a measure of the relative mortality rate between the cohort and the reference population (in this study, the Australian male population). An SMR greater than one indicates higher death rates in the cohort compared with the Australian male population, adjusted for age and calendar year. An SMR less than one reflects lower death rates in the cohort than in the comparison population.
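The calculation described above can be sketched as follows. This is an illustration only: the function names, the dictionary layout keyed by (age group, calendar year), and all numbers in the example are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple

Cell = Tuple[str, int]  # (five-year age group, calendar year)


def expected_deaths(person_years: Dict[Cell, float],
                    reference_rates: Dict[Cell, float]) -> float:
    """Sum, over all cells, of cohort person-years times the reference death rate."""
    return sum(py * reference_rates[cell] for cell, py in person_years.items())


def smr(observed: int, expected: float) -> float:
    """Standardised mortality ratio: observed deaths divided by expected deaths."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed / expected


# Hypothetical figures, for illustration only.
py = {("40-44", 1970): 5000.0, ("45-49", 1975): 4000.0}
rates = {("40-44", 1970): 0.004, ("45-49", 1975): 0.006}  # deaths per person-year

e = expected_deaths(py, rates)
print(e, smr(55, e))
```

With these made-up inputs, 55 observed deaths against about 44 expected gives an SMR of about 1.25, that is, a death rate roughly 25% above that of the reference population.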
Analysis focused on causes of death identified in the study protocol as being of a priori interest; however, other conditions were also considered. All deaths were coded to ICD-10 and all SMRs for specific causes of death are based on these ICD codes.
SMR estimates based on small numbers of deaths are less certain than estimates based on large numbers of deaths. The variability of an estimate is reflected in its 95% confidence interval (CI): an estimate based on few deaths has a wider 95% CI than one based on many deaths. The term 'statistically significant' is used to describe mortality estimates for veterans that differ from the mortality estimate for the comparison group (Australian males) after allowing for this variability. Where the 95% CI of a standardised mortality ratio does not include one (1.0), the estimate is considered statistically significantly higher (or lower) than that of the comparison group. This corresponds to a two-sided 5% significance test. The method of calculating confidence intervals assumes that the number of observed deaths can be modelled as a random variable with a Poisson distribution.
This chapter restricts itself to explaining the methods of the various analyses. The results and analysis of the mortality patterns for the veterans as a whole and within the various groups can be found in Chapter 6.
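A Poisson-based confidence interval for an SMR can be sketched as below. The report does not say which Poisson-based method was used; this example assumes Byar's approximation to the exact Poisson limits, and the function names and example counts are hypothetical.

```python
import math
from typing import Tuple


def smr_confidence_interval(observed: int, expected: float,
                            z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% CI for an SMR, treating the observed deaths as Poisson.

    Assumption: Byar's approximation is used here; the study may have used
    a different Poisson-based method.
    """
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    if observed == 0:
        lower = 0.0
    else:
        lower = observed * (1 - 1 / (9 * observed)
                            - z / (3 * math.sqrt(observed))) ** 3
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * math.sqrt(o1))) ** 3
    return lower / expected, upper / expected


def is_significant(observed: int, expected: float) -> bool:
    """True when the 95% CI excludes 1.0 (two-sided 5% test)."""
    low, high = smr_confidence_interval(observed, expected)
    return low > 1.0 or high < 1.0


# Same SMR (1.25) from few deaths and from many deaths (hypothetical counts).
print(smr_confidence_interval(55, 44.0), is_significant(55, 44.0))
print(smr_confidence_interval(220, 176.0), is_significant(220, 176.0))
```

The two calls use the same SMR of 1.25. With 55 deaths the interval spans 1.0, whereas with 220 deaths it lies entirely above 1.0, which illustrates why estimates based on few deaths are treated with more caution.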
5.6.4 Cause of death analysis
This study analysed the underlying cause of death of 7,514 veterans. For 262 of these veterans a death was recorded but no cause of death could be obtained. Their dates of death tend to fall in the earlier years of the follow-up period (Table 5-6). To incorporate these deaths, the records were allocated a cause of death on a pro rata basis, using the pattern of causes of death among the other veterans in the same period.
Table 5-6: Analysis of missing cause of death data

| Period | | Navy | Army | Air Force | Total |
|---|---|---|---|---|---|
| Total deaths | | 2,226 | 4,795 | 493 | 7,514 |
| 1950-1967 | Missing causes | 23 | 51 | 14 | 88 |
| | Percentage | 13.0 | 12.3 | 25.0 | 13.6 |
| 1968-1979 | Missing causes | 12 | 65 | 4 | 81 |
| | Percentage | 3.7 | 8.1 | 5.6 | 6.7 |
| 1980-2000 | Missing causes | 18 | 71 | 4 | 93 |
| | Percentage | 1.0 | 2.0 | 1.1 | 1.6 |
| 1950-2000 | Missing causes | 53 | 187 | 22 | 262 |
| | Percentage | 2.4 | 3.9 | 4.5 | 3.5 |
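The pro rata allocation of deaths with no recorded cause can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, fractional deaths are allowed, and the known-cause counts in the example are invented. Only the figure of 88 missing causes for 1950-1967 comes from Table 5-6.

```python
from typing import Dict


def allocate_pro_rata(known_counts: Dict[str, int],
                      n_missing: int) -> Dict[str, float]:
    """Spread deaths with no recorded cause across causes, in proportion
    to the known-cause deaths from the same period."""
    total_known = sum(known_counts.values())
    if total_known == 0:
        raise ValueError("no known causes to base the allocation on")
    return {cause: n + n_missing * n / total_known
            for cause, n in known_counts.items()}


# Hypothetical known-cause counts for one period; 88 is the number of
# missing causes for 1950-1967 in Table 5-6.
known = {"neoplasms": 100, "circulatory": 250, "external": 150}
adjusted = allocate_pro_rata(known, 88)
print(adjusted)
```

The adjusted counts sum to the known deaths plus the reallocated ones, so the total number of deaths in the period is unchanged by the allocation.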
5.6.5 Mortality analysis by duration of service
The deaths of servicemen were analysed by their length of service in Korea. Two approaches were taken in this study:
- Both the Navy and Army personnel were divided into three groups based on their duration of service. The boundaries of these groups were chosen by observing natural breaks in the distribution of the servicemen's length of stay in Korea. The boundaries differ between the two Services, reflecting their different lengths of service, and are not indicative of any natural disease risk (Table 5-7). Two Navy records and 11 Army records had no recorded length of service and were excluded from the analysis. The number of deaths among Air Force personnel was too small for this type of analysis and would have produced unstable mortality estimates.
Table 5-7: Duration of service for Navy and Army personnel

| Service | Duration in days | Average duration (days) | Number of veterans |
|---|---|---|---|
| Navy | 1-174 | 96 | 504 |
| | 175-294 | 223 | 4,231 |
| | 295+ | 434 | 1,005 |
| | Total | 249 | 5,740 |
| Army | 1-345 | 197 | 4,997 |
| | 346-389 | 366 | 4,199 |
| | 390+ | 565 | 1,275 |
| | Total | 311 | 10,471 |
- The study also investigated the relationship between the duration of service as measured using a finer level continuous variable (months of service) and the risk of dying using the Cox Proportional Hazards Model. This investigation was carried out for all three services separately and combined. Fifteen records had no recorded length of service and were excluded from the analysis.
5.6.6 Mortality analysis by period of service
The Korean War had a number of distinct phases, the most notable being the periods before and after 31 December 1951. These varying war experiences were, of course, most noticeable for the Army, and analysis was restricted to these veterans.
The Army veterans were categorised into three groups: those veterans who returned from the war on or before 31 December 1951 (Period 1), those veterans who commenced their war service before 31 December 1951 but returned after that date (Period 2) and those veterans who commenced their war service after 31 December 1951 (Period 3). Table 5-8 gives a breakdown of these three groups.
Table 5-8: Period of service for Army personnel

| Period | Number of veterans | Deceased veterans (number) | Deceased veterans (percentage) |
|---|---|---|---|
| Period 1 | 1,395 | 784 | 56.2 |
| Period 2 | 1,148 | 575 | 50.1 |
| Period 3 | 7,939 | 3,436 | 43.3 |
| Total | 10,482 | 4,795 | 45.7 |
There is an obvious overlap between Periods 1 and 2 and between Periods 2 and 3. Period 1 and Period 3 are mutually exclusive. Consequently, the analysis focuses on the comparison between Period 1 and Period 3.
5.7 Statistical Power
The power of this study was assessed using standardised death rates for males from published Australian Bureau of Statistics (ABS) data for 1997. Table 5-9 shows the calculations for the estimated power of this study. Where a standardised death rate exceeds 30 per 100,000 per year, the study has an 85% chance of detecting a 20% increase in relative risk at the 0.05 level of significance.
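The general shape of such a power calculation can be sketched as follows. This is not a reconstruction of Table 5-9: the method (a one-sided exact Poisson test), the function names and the figure of 600,000 person-years are all assumptions made for illustration, so the result should not be expected to match the 85% quoted above.

```python
import math


def poisson_upper_tail(k: int, mean: float) -> float:
    """P(X >= k) for X ~ Poisson(mean), with terms computed in log space."""
    if k <= 0:
        return 1.0
    log_mean = math.log(mean)
    cdf = sum(math.exp(i * log_mean - mean - math.lgamma(i + 1))
              for i in range(k))
    return max(0.0, 1.0 - cdf)


def power_to_detect(expected: float, relative_risk: float = 1.2,
                    alpha: float = 0.05) -> float:
    """Chance that a one-sided exact Poisson test at level alpha flags an
    excess when true mortality is relative_risk times the reference rate."""
    # Find the smallest death count c whose tail probability under the
    # reference rate is at most alpha. Starting at floor(expected) is safe
    # because the tail there is well above alpha.
    c = int(expected)
    while poisson_upper_tail(c, expected) > alpha:
        c += 1
    return poisson_upper_tail(c, expected * relative_risk)


# Hypothetical inputs: 30 deaths per 100,000 per year (the threshold quoted
# in the text) and an assumed 600,000 person-years of follow-up.
expected = 30 / 100_000 * 600_000
print(round(power_to_detect(expected), 2))
```

The sketch makes the underlying logic visible: power depends on the expected number of deaths, so rarer causes of death (lower rates, fewer expected deaths) give the study less chance of detecting the same relative increase.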
5.8 Smoking prevalence
The study revealed high standardised mortality ratios in smoking-related cancers. An analysis was conducted to examine if smoking could explain all of the elevation in the smoking-related cancer deaths in Korean War veterans.
The prevalence of smoking amongst Korean War veterans, whether during the conflict or afterwards, is unknown. While there is anecdotal evidence of high levels of smoking during the conflict, and the cigarette rations are a matter of record, there was no systematic measurement of smoking rates amongst veterans or the population. This analysis therefore considers a range of hypothetical smoking prevalences from 30% to 100% and generates an expected number of deaths for each, based on the prevalence and on estimates of the attributable risk of cancer death due to smoking.
5.8.1 Calculation of estimated cancer mortality rates for varying levels of smoking prevalence
The method used to estimate mortality rates for hypothetical levels of smoking prevalence uses aetiological fractions, or estimates of attributable risk of dying from specified causes due to smoking, and smoking prevalence estimates for the Australian population for the study period (1950-2000). Smoking prevalence estimates were calculated by Ridolfo & Stevenson (2001) 8 , using a method proposed by Peto et al. (1992) 9 , and subsequently used in the Australian Burden of Disease Study (Mathers et al. 1999) 10. These estimates of smoking prevalence take into account past exposure to tobacco rather than current exposure, and reflect the disease burden from the commencement of smoking.
Estimated mortality rates were calculated by separately deriving mortality rates attributable to smoking and rates not attributable to smoking. The total rate was then calculated by weighting the rate attributable to smoking according to the hypothetical smoking prevalence in the population.
The formula is therefore:
$$\mathit{DR}_h = \mathit{SR} \times h + \mathit{NR}$$
where
- $\mathit{DR}_h$ is the derived mortality rate, assuming the hypothetical prevalence $h$ applies
- $\mathit{SR}$ is the mortality rate due to smoking
- $\mathit{NR}$ is the mortality rate not due to smoking
- $h$ is the hypothetical smoking prevalence
To calculate the smoking rate, SR, it was necessary to first estimate the aetiological fraction, or attributable proportion of the mortality due to smoking. The formula for calculating the aetiological fraction, F is:
$$F = \frac{P(\mathit{RR} - 1)}{P(\mathit{RR} - 1) + 1}$$
where
- $P$ is the actual smoking prevalence
- $\mathit{RR}$ is the ratio of the mortality rate of the cancer among those exposed to smoking to the mortality rate of those not exposed, or the relative risk of the cancer due to smoking.
The mortality rate due to smoking, SR, was then calculated using the formula:
$$\mathit{SR} = \frac{R \times F}{P}$$
where
- $R$ is the actual mortality rate.
The mortality rate for non-smokers is:
$$\mathit{NR} = R \times (1 - F)$$
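The four formulas above can be combined into a short Python sketch. The function names and the example inputs (a rate of 50 per 100,000, an actual prevalence of 40% and a relative risk of 10) are hypothetical and are not taken from the study.

```python
def aetiological_fraction(p: float, rr: float) -> float:
    """F = P(RR - 1) / (P(RR - 1) + 1)."""
    excess = p * (rr - 1.0)
    return excess / (excess + 1.0)


def derived_rate(r: float, p: float, rr: float, h: float) -> float:
    """DRh = SR*h + NR, with SR = R*F/P and NR = R*(1 - F).

    r  : actual mortality rate in the reference population
    p  : actual smoking prevalence in the reference population (0 < p <= 1)
    rr : relative risk of death from the cancer due to smoking
    h  : hypothetical smoking prevalence to evaluate
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("actual prevalence must be in (0, 1]")
    f = aetiological_fraction(p, rr)
    sr = r * f / p          # mortality rate due to smoking
    nr = r * (1.0 - f)      # mortality rate not due to smoking
    return sr * h + nr


# Hypothetical inputs: R = 50 per 100,000, P = 0.4, RR = 10.
for h in (0.3, 0.4, 0.7, 1.0):
    print(h, round(derived_rate(50.0, 0.4, 10.0, h), 1))
```

Two checks show that the formulas are internally consistent: setting $h$ equal to the actual prevalence $P$ returns the actual rate $R$, and the derived rate at $h = 1$ is $\mathit{RR}$ times the derived rate at $h = 0$.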
5.9 Statistical software used
Several statistical packages were used for data management and analysis. Initial processing, such as the calculation of person-years, and the Cox regression were performed in SAS Release 6.12 11. Tables of observed and expected deaths and SMRs were compiled in EXCEL 97 12, and DeltaGraph Version 5.0.1 13 was used to produce the graphs.
Table 5-9: Estimated Power of Korean Veteran Mortality Study
References
1. World Health Organisation. Manual of the international statistical classification of diseases and related health problems, 10th Revision (ICD-10). 1999
2. Health Insurance Commission. Annual Report 2001-02. Canberra: HIC, 2002
3. Health Insurance Commission. Annual Report 2001-02, http://www.hic.gov.au/annualreport_0102/statistics/medicare1a.htm
4. INTEGRITY, Data Re-Engineering Environment [computer program]. Version 3.6. Boston, Massachusetts: Vality Technology Inc, 2000
5. Newcombe HB. Handbook of record linkage: methods for health and statistical studies, administration, and business. Oxford: Oxford University Press, 1988
6. Crane PJ, Barnard DL, Horsley KD, Adena MA. Mortality of Vietnam veterans: the veteran cohort study. A report of the 1996 retrospective study of Australian Vietnam veterans. Canberra: Department of Veterans' Affairs, 1997
7. Breslow NE, Day NE. Statistical methods in cancer research. Volume II - The design and analysis of cohort studies. Lyon: International Agency for Research on Cancer, 1987
8. Ridolfo B & Stevenson C 2001. The quantification of drug-caused mortality and morbidity in Australia, 1998. AIHW cat. no. PHE 29. Canberra: AIHW (Drug Statistics Series no. 7)
9. Peto R, Lopez AD, Boreham J et al. 1992. Mortality from tobacco in developed countries: indirect estimation from national vital statistics. Lancet 339:1268-78
10. Mathers CD, Vos T & Stevenson C 1999. The burden of disease and injury in Australia. Canberra: Australian Institute of Health and Welfare
11. SAS Release 8.2 [computer program]. SAS Institute Inc., 2003
12. Microsoft EXCEL [computer program]. Windows 95 Version 7.0. Microsoft Corporation, 1995
13. DeltaGraph Version 5.0.1 [computer program]. SPSS Inc., 2001

