The most commonly used source of data on Indigenous Australians is the Census of Population and Housing. The five-yearly census allows us to generate reasonably reliable social statistics about Indigenous people as a by-product of the introduction, in 1971, of a question which asked whether people identify as Indigenous. However, the census is a blunt instrument that is designed primarily to count the national population rather than to measure and track changes in complex socioeconomic conditions of population sub-groups. Furthermore, census questions are limited in their number and scope by the exigencies and costs involved in collecting information from the entire population. Surveys are more flexible and cost-effective instruments for collecting a wide range of information, even though the resulting data is subject to sampling error.
In 1991 the Royal Commission into Aboriginal Deaths in Custody recommended a national survey of the Indigenous population. It was the dearth of information with which to inform the Royal Commission that resulted in the first NATSIS in 1994. This survey provided the first nationwide inter-censal estimates of Indigenous socioeconomic status.
The 2002 NATSISS is the second major nationwide survey specifically targeted to collect a large range of information on Indigenous Australians. Carried out between August 2002 and April 2003, it collected information from 9359 individuals aged 15 years and over from 5887 households. Some of the information had never been collected before for the Indigenous population, whereas a number of the questions were broadly comparable to the 1994 NATSIS.
The 2002 survey was also conducted more or less concurrently with the 2002 General Social Survey (GSS) which collected information about the total adult Australian population (the Indigenous and non-Indigenous populations are not separately identifiable in the GSS). Many of the data items in the 2002 NATSISS are comparable with the GSS, but the GSS did not collect information in very remote areas and was limited to individuals 18 years and over.
The ABS has a program (or cycle) of Indigenous household surveys, with the next NATSISS survey scheduled for 2008. Other ABS data collections with significant Indigenous components planned before then are the 2004–05 Indigenous Health Survey, the 2006 Community Housing and Infrastructure Needs Survey (CHINS) and, of course, the 2006 Census of Population and Housing.
The NATSISS survey was designed to ‘enable analysis of the interrelationship of social circumstances and outcomes, including the exploration of multiple disadvantage’ (ABS 2005c: 1). Information is provided across a range of topics:
demographic characteristics of the individuals and household and geographic characteristics of the area in which they live
cultural and language information and the family and community context
health and disability
education participation and achievement
employment
income, housing and financial stress, and
information technology, transport and law and justice.
This chapter seeks to outline a number of the methodological issues concerning the 2002 NATSISS, with a particular focus on helping readers of this monograph understand the remainder of the empirical results and analysis presented. The next section will outline the survey methodology, including the scope, sample selection and survey design and implementation. Section 3 will provide an overview of some of the potential issues one might need to take into account in an empirical analysis of the NATSISS, while section 4 will provide a brief critical analysis of the existing ABS outputs from the 2002 NATSISS. Finally, section 5 will summarise and highlight the main implications of this chapter.
The 2002 NATSISS collected information from Indigenous Australians aged 15 years and older who were usual residents in private dwellings at the time of the survey. In line with standard ABS household survey scope, it excludes visitors to the randomly selected private dwellings. [1] The survey was carried out across Australia with the aim of collecting enough information to draw conclusions at either the State/Territory level or by the Australian Standard Geographical Classification (ASGC) remoteness classification. [2] The coverage of the 2002 NATSISS data is different to the GSS (which only collected information on people aged 18 and over in private dwellings) as well as the 1994 NATSIS (which collected information from people aged 13 years and over in both private and non-private dwellings).
The difference in age structures is reasonably easy to take into account by re-weighting, though it should always be kept in mind during comparative analysis. However, given that the 2002 NATSISS did not collect information on people in non-private dwellings, differential coverage may be more problematic when making comparisons with the 1994 NATSIS or the 2001 Census, or when drawing conclusions about the total Indigenous population. Non-private dwellings include hotels, motels, hostels, hospitals, short-stay caravan parks and—perhaps most importantly—prisons and other correctional facilities. Such dwellings can be identified in both the 1994 NATSIS and the 2001 Census.
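One common way of allowing for different age structures is direct age standardisation: rates are calculated within age groups over a common age scope and then combined using a single reference age distribution. The sketch below illustrates the idea only. It is not necessarily the procedure used by the ABS or by the authors, and the function name, age groups, weights and reference shares are all hypothetical.

```python
def age_standardised_rate(records, reference_shares):
    """Weighted rate of a characteristic, standardised to a reference age structure.

    records: iterable of (age_group, survey_weight, has_characteristic) tuples.
    reference_shares: dict mapping age_group -> population share (shares sum to 1).
    Age groups absent from reference_shares (e.g. 15-17) are treated as out of scope.
    """
    weight_total = {group: 0.0 for group in reference_shares}
    weight_with = {group: 0.0 for group in reference_shares}
    for group, weight, has_characteristic in records:
        if group not in reference_shares:
            continue  # outside the common age scope, so dropped from the comparison
        weight_total[group] += weight
        if has_characteristic:
            weight_with[group] += weight
    empty = [group for group, total in weight_total.items() if total == 0]
    if empty:
        raise ValueError(f"no respondents in age groups: {empty}")
    # Combine the age-specific rates using the reference age distribution.
    return sum(
        share * weight_with[group] / weight_total[group]
        for group, share in reference_shares.items()
    )


# Hypothetical unit records and reference age structure (illustrative only).
records = [
    ("15-17", 120.0, True),
    ("18-34", 200.0, True),
    ("18-34", 200.0, False),
    ("35+", 100.0, True),
    ("35+", 300.0, False),
]
reference_shares = {"18-34": 0.4, "35+": 0.6}

rate = age_standardised_rate(records, reference_shares)
print(f"Age-standardised rate (18+ scope): {rate:.3f}")
```

Restricting both surveys to the ages they have in common (here, 18 years and over) and applying the same reference shares to each removes differences that are due to age structure alone. It does nothing about the difference in dwelling coverage.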
According to the ABS (2005c), at 31 December 2002 there were an estimated 19 320 Indigenous people living in non-private dwellings, or about 4 per cent of the entire Indigenous population. The following discussion gives some information from the 1994 NATSIS on people who were usual residents of non-private dwellings. The sample size is 375 adults.
By definition, all prisoners in the 1994 NATSIS data were in non-private dwellings. Biddle and Hunter (2004) provided a profile of Indigenous people in non-private dwellings using the original weights for the 1994 NATSIS. Residents of non-private dwellings were more likely than Indigenous residents in private dwellings to have been arrested in the last five years. They were also concentrated outside capital cities, and were more likely to be male and young. Also, a higher percentage of respondents from non-private dwellings were taken from their natural families. Without access to accurate sampling errors for the 1994 NATSIS data, it is difficult to definitively claim that the differences between private and non-private dwellings are significant. However, it seems reasonable to assert that the people living in private and non-private dwellings are drawn from different populations.
Therefore, when statements are made regarding data from the 2002 NATSISS, it may not always be appropriate to say ‘the Indigenous population has a given characteristic’ but rather make a more qualified statement about ‘the Indigenous population living in private dwellings’ or, if referring to law and justice data, ‘that portion of the Indigenous population that is not currently in prison or other non-private dwelling’.
Given the substantial growth in the Indigenous population since 1994, it is important that comparisons between the 1994 and 2002 surveys use the re-weighted 1994 data set (when the ABS makes it available as a Confidentialised Unit Record File, or CURF). Customised cross-tabulations provided by the ABS will use the re-weighted data as a matter of course.
The overall sample was spread across States/Territories in order to produce estimates with a relative standard error of no more than 20 per cent for characteristics that are relatively common in the Indigenous population (for example, characteristics that at least 10 per cent of the population would possess). There were, however, two components to the 2002 NATSISS sample design. The first (in parts of Queensland, South Australia, Western Australia and the Northern Territory) was based on a sample of discrete Indigenous communities and the outstations associated with them. This is the Community Area (CA) sample. In the remainder of these four States and Territories, as well as in all of New South Wales, Victoria, Tasmania and the Australian Capital Territory, the survey methodology and sample design were somewhat different. The data from these other areas are described as the Non-Community Area (NCA) sample. Around 30 per cent of the sample came from the CAs and 70 per cent from the NCAs. Those in NCAs were interviewed using Computer Assisted Interviewing, whereas those in CAs were interviewed using a pen and paper interview.
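The relative standard error target can be made concrete with a back-of-envelope calculation. The sketch below assumes simple random sampling, with an optional design-effect multiplier to allow roughly for clustering. Both are simplifying assumptions: the NATSISS used a complex multi-stage design, and this is not the ABS's own calculation. The function names are hypothetical.

```python
import math


def rse_of_proportion(p, n, deff=1.0):
    """Relative standard error of an estimated proportion p based on n respondents.

    deff is an assumed design-effect multiplier (1.0 = simple random sampling).
    """
    if not 0 < p <= 1 or n <= 0 or deff <= 0:
        raise ValueError("need 0 < p <= 1, n > 0 and deff > 0")
    return math.sqrt(deff * (1 - p) / (n * p))


def min_sample_size(p, target_rse, deff=1.0):
    """Smallest n for which the RSE of proportion p is no greater than target_rse."""
    if not 0 < p <= 1 or target_rse <= 0 or deff <= 0:
        raise ValueError("need 0 < p <= 1, target_rse > 0 and deff > 0")
    raw = deff * (1 - p) / (p * target_rse ** 2)
    return math.ceil(raw - 1e-9)  # small offset guards against floating-point noise


# A characteristic held by 10 per cent of the population, with a 20 per cent RSE target.
n_srs = min_sample_size(0.10, 0.20)
n_clustered = min_sample_size(0.10, 0.20, deff=2.0)  # deff of 2 is purely illustrative
print(n_srs, n_clustered)
print(round(rse_of_proportion(0.10, n_srs), 3))
```

Under these assumptions, a few hundred respondents per State or Territory would be enough for characteristics of this prevalence. Rarer characteristics, or a more heavily clustered sample, require more.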
The differences between survey questions and survey technique raise one of the most important issues for people analysing 2002 NATSISS data. Before documenting such issues, it is also necessary to briefly discuss how individuals were selected in each of the two types of areas.
The CA sample was obtained from a random selection of discrete Indigenous communities and outstations. The sample frame used to design the survey was based on both 2001 Census counts, and information collected in the 2001 CHINS (ABS 2005c: 4). Once the communities had been selected, a random selection of dwellings was made.
Dwellings—and therefore individuals—in NCAs were selected using a stratified multi-stage area sample based on the 2001 Census. A random selection of dwellings within selected census Collection Districts (CDs) was then screened to assess their usual residents’ Indigenous status. An insufficient number of households with Indigenous Australians was initially collected, so additional CDs were sampled during February to April 2003.
Before moving on to the CA and NCA survey design, it is important to make clear the difference between the CA sample and the concept of remote areas. The sampling in non-remote areas (i.e. major cities, inner regional and outer regional areas) was carried out entirely under the NCA methodology. This included 5242 of the surveyed individuals.
In remote areas (which include the remote and very-remote Accessibility/Remoteness Index of Australia classifications), both CA and NCA sampling methodology was used. Remote areas that were not identified as ‘discrete communities’ used the same sampling methodology and interviewing techniques as were used in non-remote areas (i.e. the NCA methodology). In remote areas where NCA methods were used, there were 1997 respondents. In discrete communities and outstations, CA sampling was used, with information collected on 2120 individuals. Although the distinction between CAs and remote areas is not entirely clear in the published record to date, according to correspondence with the ABS, the majority of those collected under the CA sample were from very remote areas rather than remote areas.
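The respondent counts quoted above can be tallied by remoteness and by sampling methodology to check how the pieces fit together. The sketch below uses only the unweighted person counts given in the text; the variable names are illustrative.

```python
# Unweighted respondent counts (persons aged 15 years and over) as quoted in the text.
counts = {
    ("non-remote", "NCA"): 5242,
    ("remote", "NCA"): 1997,
    ("remote", "CA"): 2120,
}

total_respondents = sum(counts.values())
remote_respondents = sum(n for (area, _), n in counts.items() if area == "remote")
nca_respondents = sum(n for (_, method), n in counts.items() if method == "NCA")

# Share of remote-area respondents who were surveyed under the CA methodology.
ca_share_of_remote = counts[("remote", "CA")] / remote_respondents

print(f"All respondents:       {total_respondents}")
print(f"Remote respondents:    {remote_respondents}")
print(f"NCA respondents:       {nca_respondents}")
print(f"CA share of remote:    {ca_share_of_remote:.1%}")
```

The three groups sum to the 9359 individuals reported for the survey as a whole, and roughly half of the unweighted remote-area respondents were surveyed under each methodology.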
The questionnaire for the 2002 NATSISS was designed with the assistance of a special advisory group. Although the questions in the CAs and NCAs were broadly similar, there were still differences in what was asked, and the way the data was presented for publication by the ABS. The variables that were affected by such decisions are listed in Table 4.1 and can be classified into three main categories: those that were collected in both CAs and NCAs but that have different output categories; those collected in NCAs only; and those collected—but not released—in remote areas.
Table 4.1. Differences in data collection in CA and NCA areas
| Restriction | Variable |
| --- | --- |
| Collected in both NCAs and CAs, with different categories outputted in remote areas | Main reason for last move; Type of stressor in last 12 months; Type of social activities in last three months; Presence of neighbourhood/community problems; Neighbourhood/community problems; Whether used formal child care in last four weeks; Type of child care used in last four weeks; Main reason for not using (more) formal child care in last four weeks; Type of organisation undertook unpaid voluntary work for in last 12 months; Disability status; Tenure type; Type of major structural problems; All sources of personal income; Principal source of personal income; Where used computer in last 12 months; Where used internet in last 12 months; Modes of transport; Type of legal services used; Attendance at cultural events in last 12 months*; Self-assessed health*; Type of government pension/allowance (auxiliary)*; Whether working telephone at home* |
| Collected in NCAs only | Whether has an education restriction; Whether has an employment restriction; Disability type; Multiple job holder; Cash flow problems; All types of cash flow problems; Number of types of cash flow problems |
| Collected, but not released in remote areas | Whether ever used substances; Type of substances ever used; Whether used substances in last 12 months; Type of substances used in last 12 months |
Source: ABS (2004d)
Notes: An asterisk (*) marks variables where the only difference between the CAI (computer assisted interviewing) and PAPI (pen and paper interviewing) samples is the presence or absence of a ‘not stated’ option.
For those variables in the first part of the table, it is not always clear from the published record why the ABS chose different output categories in remote and non-remote areas. After reading the questionnaires and through correspondence with the ABS, it would appear that for these questions, a reduced set of options was available to interviewees. It also seems that this was done mainly because the ABS, on advice from stakeholders and from its testing processes, felt that the omitted options would not be relevant to those living in CAs, so the benefits of a more streamlined survey could be gained without adversely affecting the quality of the survey. However, the ABS needs to be clearer and more transparent about how it came to the conclusion that these particular variables needed different data outputs, a decision that potentially limits the ability to use such data items with total confidence.
For example, full information on where computers and internet were used in the last 12 months is not available in remote areas. The missing category was internet/cyber cafes. While it could be argued that there are no internet cafes in CAs, this presumes that community residents are not mobile and have not visited urban settings where such cafes may exist. If the ABS has information that this is the case, then it needs to be made clear. Even though it should be acknowledged that the final survey content for CAs was a product of advice from ABS stakeholders and testing which identified items that were either inappropriate or did not work in the household interview environment in CAs, the ABS does not necessarily provide the level of detail that more sophisticated users may require (e.g. ABS 2004c: 20–26).
The second category of variables in Table 4.1 is those that were collected in NCAs, but not in CAs. Obviously, the structure of the survey prevents comparisons for these variables, but it is worth a brief reflection on why these decisions may have been made. For example, it is unlikely that many respondents held more than one job in CAs, so it does not make much difference that there is no relevant data in such areas. With respect to the lack of cash flow data in CAs, it is arguable that the notions of cash flow and poverty differ in remote and non-remote areas, so will differ between CAs and NCAs (Altman & Hunter 1998). Once again, though, it would be useful for the ABS to publish why such decisions were made without requiring analysts to speculate themselves, with incomplete information.
The disability variables warrant special mention, as quite different variables were constructed for the NCA and CA samples. As a result of field testing and consultative processes, data items and questions for a range of topics, including disability, were modified to take account of language and particular circumstances of people living in very remote communities. In addition, Indigenous stakeholders advised that attempts to measure psychological disabilities in remote communities required development of an appropriate instrument sensitive to the circumstances of people in these areas. The interim instrument developed by the ABS and used with Indigenous stakeholder endorsement in the recent 2004–05 National Aboriginal and Torres Strait Islander Health Survey is the initial response to that requirement. In the longer term, a culturally appropriate social and emotional wellbeing question module is being developed by Indigenous stakeholders.
Full disability information was collected in NCAs, and this is comparable to data items in the GSS. A modified set of disability questions that did not include psychological disability was collected in CAs. This modified set was combined with the relevant options from the NCA sample to create a new variable which can be used across both samples (see ABS 2005c), though not in comparative analysis with the GSS. While the different nature of the labour markets in remote and non-remote areas may mean that comparative data on education and employment restrictions caused by a disability may not be very meaningful (that is, given the binding constraints evident in the lack of employment prospects outside the CDEP scheme), this issue must still be of concern for the remote areas using NCA methodology.
The last category in Table 4.1 is the substance use variables for which data was collected but not released in remote areas. The relevant questions in the 2002 NATSISS were based on the National Drug Strategy Household Survey (NDSHS) and had a response rate of over 90 per cent. In NCAs, a voluntary self-enumerated form was used to collect this information, whereas in CAs, respondents were required to respond verbally to questions asked by an interviewer. The low prevalence of substance use reported in CAs has been assumed, by the ABS, to be the result of the use of direct questioning in CAs (ABS 2005c). It is further assumed that this led to a significant adverse effect on both the level of response and the quality of responses to questions on substance use. For this reason, information on substance use in remote areas was considered to be unreliable and has not been released.
Not only were some of the questions different in the CAs and NCAs, so too were the interviewing techniques. As indicated above, interviews in NCAs were conducted using Computer Assisted Interviewing (CAI), in which interviewers used a notebook computer to read the questions and to record the data gathered. If respondents were asked to choose from a range of options, then prompt cards were used. For the substance use questions, a voluntary self-enumerated form was used with a response rate of 90 per cent.
In CAs, the surveying techniques were modified to take into account the cultural and language differences predicted for these areas. Firstly, the interviewing was conducted by more traditional pen and paper interviews. In addition, Community Information Forms (CIFs) were used to collect information about the community from the local council office. In every community, Indigenous facilitators were used to improve the validity of the data. However, not all interviews in CAs were conducted in the presence of facilitators. These facilitators ‘explained the purpose of the survey to respondents, introduced the interviewers, assisted in identifying the usual residents of a household and in locating residents who were not at home, and assisted respondents in understanding questions where necessary’ (ABS 2005c).
While the differential use of facilitators may have introduced potential interviewer bias in the response to some of the questions, accurate records were not kept by the ABS as to when facilitators were used. This means it is not possible to control the analysis of 2002 NATSISS for the effect of the presence of facilitators. However, it is still important to appreciate the relative importance of so-called ‘non-sampling error’ (which includes interviewer bias when facilitators are and are not present) and sampling error (which is present, to a greater or lesser extent, in all survey data).
The issue of remote/non-remote and CA/NCA is a confusing one, especially with regard to analysis of the CURF. The biggest issue is that on the CURF, those 1997 individuals in remote areas collected under the NCA methodology cannot be distinguished from those 2120 who were collected under the CA methodology. This is problematic when the questions asked across the two methodologies are quite different. Consider the question used to obtain information on whether a person was a victim of assault. In the NCA sample, the person was asked, ‘In the last 12 months, did anyone, including persons you know, use physical force or violence against you?’. In the CA sample, on the other hand, the person was asked, ‘In the last year, did anybody start a fight with you or beat you up?’.
Arguments can be made both for and against treating the results from the two questions as comparable. From a methodological point of view, however, the point is that analysts using the CURF data cannot make that decision themselves. To repeat, users of the CURF are unable to identify whether people were surveyed under the NCA or CA sample, only whether they lived in remote or non-remote areas. The ABS does have such information available on the 2002 NATSISS MURF, and it may be possible for analysts to access it through customised tables. Ultimately, it is up to users to decide whether they need to pursue such options to investigate data quality issues in greater detail.
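One way to see what the missing CA/NCA identifier costs is to ask what a pooled remote-area estimate implies about the CA sample alone. If the pooled rate is a mixture of a CA rate and an NCA rate, and nothing is known about the NCA rate, the CA rate can only be bounded. The sketch below illustrates this with a hypothetical pooled rate and the unweighted CA share of remote respondents taken from the counts in the text. Survey weights would change the share, and the function name is an assumption.

```python
def bounds_on_ca_rate(pooled_rate, ca_share):
    """Logical bounds on the CA-sample rate given only a pooled remote-area rate.

    Assumes pooled_rate = ca_share * ca_rate + (1 - ca_share) * nca_rate,
    with the unobserved nca_rate free to lie anywhere between 0 and 1.
    """
    if not 0 <= pooled_rate <= 1 or not 0 < ca_share <= 1:
        raise ValueError("need 0 <= pooled_rate <= 1 and 0 < ca_share <= 1")
    lower = max(0.0, (pooled_rate - (1 - ca_share)) / ca_share)  # nca_rate = 1
    upper = min(1.0, pooled_rate / ca_share)                     # nca_rate = 0
    return lower, upper


# Unweighted CA share of remote respondents, from the counts quoted in the text.
ca_share = 2120 / (2120 + 1997)

# Hypothetical pooled remote-area rate of 25 per cent (illustrative only).
lower, upper = bounds_on_ca_rate(0.25, ca_share)
print(f"CA rate lies between {lower:.1%} and {upper:.1%}")
```

With roughly half of remote respondents in each sample, the bounds are very wide, which is another way of saying that the CURF alone cannot settle whether the two question wordings produce comparable results.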