Proto-Austronesian, the ancestral language from which all Austronesian languages descended, is considered by most scholars to have been spoken on the island of Taiwan some 5000 years ago. This ancestral language is considered to have diverged over time into four major subgroups, represented as follows:

In other words, many scholars consider that three of the four highest-order subgroups of Austronesian are spoken on Taiwan and have been ever since the development of Proto-Austronesian. As discussed above, languages evolve by two different processes: gradual dialect differentiation, and separation. It is likely that the languages of the Atayalic, Tsouic and Paiwanic subgroups have arisen by gradual dialect differentiation from Proto-Austronesian, or from early descendant dialects spoken by the population which stayed behind when the speakers of what became the Malayo-Polynesian languages left the island. This distinction in developmental process is better signalled by the following diagram:

According to this classification all of the Austronesian languages spoken outside Taiwan are descended from Proto-Malayo-Polynesian.
At this point it would be useful to consider Blust’s representation of all of the major subgroups of Austronesian and then to return and consider the evidence upon which they are based, together with some alternative subgroupings currently under consideration by Austronesianists. The full Austronesian family tree devised by Blust (1978) is as follows:

Perhaps a good place to begin is to examine the major pieces of evidence which led scholars to conclude that all Austronesian languages outside Taiwan constitute a single first-order subgroup of Austronesian. Blust (1977, 1982) adduces the following:
PAn *kuSkuS > PMP *kuku ‘nail (of finger, toe)’,
PAn *tuqaS > PMP *tuqa ‘old’,
PAn *CumeS > PMP *tuma ‘clothes louse’.
Dyen (1990) dissents from the view that all of the Austronesian languages outside Taiwan are members of a single Malayo-Polynesian subgroup. Invoking a lexical method called “homomeric lexical classification” whereby “different sets of cognates distributed over exactly the same set of languages are said to be homomerous” (1990:212), Dyen claims that “all the other classifications separate the Philippine languages from the Formosan at, or nearly at, the highest level, whereas the evidence presented here favors regarding the Philippine languages as the closest relatives of the Formosan languages, the latter being considered to form a single subgroup” (1990:224).
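The definition of homomery quoted above can be illustrated with a minimal sketch. It groups cognate sets by the exact set of languages in which they are attested. The function name, the cognate-set labels and the language labels are all hypothetical placeholders, and the sketch covers only the quoted definition, not Dyen's full classification procedure.

```python
from collections import defaultdict


def homomerous_groups(cognate_sets):
    """Group cognate sets that are distributed over exactly the same languages.

    `cognate_sets` maps a cognate-set label to the languages that reflect it.
    Only distributions shared by two or more cognate sets are returned.
    """
    by_distribution = defaultdict(list)
    for label, languages in cognate_sets.items():
        # frozenset ignores the order in which the languages are listed
        by_distribution[frozenset(languages)].append(label)
    return {
        languages: sorted(labels)
        for languages, labels in by_distribution.items()
        if len(labels) > 1
    }


# Hypothetical data: labels stand in for real cognate sets and languages.
cognate_sets = {
    "set_A": ["Lang1", "Lang2", "Lang3"],
    "set_B": ["Lang3", "Lang1", "Lang2"],
    "set_C": ["Lang1", "Lang4"],
    "set_D": ["Lang2", "Lang3"],
}

groups = homomerous_groups(cognate_sets)
for languages, labels in groups.items():
    print(sorted(languages), labels)
```

In this toy data only set_A and set_B are homomerous, since they alone share an identical distribution. On this kind of approach, the more cognate sets share a distribution, the stronger the case for treating those languages as a group.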
In his discussion of problems in Austronesian subgrouping, Ross (1994) judges that within Taiwan there is a fair measure of agreement concerning the lower-order subgroups. Li (1980, 1981, 1985) has carried out detailed comparative work on the Atayalic subgroup, and Tsuchida (1976) has produced a substantial reconstruction of Proto-Tsouic. There is also general agreement on the core members of the Paiwanic subgroup. Beyond this there are disagreements as to subgroup affiliation, especially regarding the position of Rukai. Compare, for instance, the family trees produced in Tsuchida (1976) and Li (1985). Li (1985) proposes three major subgroups within Taiwan: a Northern group, which includes a number of languages attributed by others to Paiwanic; Tsouic; and a reduced Paiwanic group. In spite of subgrouping problems with the Austronesian languages of Taiwan, it appears clear that Proto-Austronesian diversified into a linkage of dialects and/or languages before the speakers of what later became Proto-Malayo-Polynesian (PMP) left Taiwan.
Ross (1994) has suggested that pre-PMP might have departed from the southeast coast of Taiwan, the Amis language area, since this language name appears to derive from PAn *qamis ‘north’. It is possible that the Amis might have been given this name by the Malayo-Polynesian speakers to the south who might have remembered them as their relatives. Indeed, on linguistic grounds, Reid (1982) considers that an Amis-Extra Formosan node is required in the Austronesian family tree, as follows:

We will return to Reid’s assessment of higher level Austronesian subgroups below. First, however, let us return to Blust’s Malayo-Polynesian subgroup and its major constituent subgroups, thus:

Each of the right-hand nodes in the tree diagrams presented here represents the speech of a segment of the population which has migrated from a settled area, so that a new language arose by divergence as a result of separation. Together, the right-hand nodes represent the main migratory path of the Austronesians from Taiwan to Oceania. It has been noted, however, that most of the left-hand branches do not appear to represent a discrete proto-language, since they represent the “stay-at-homes” (Ross 1994). His further comment on the left-branching nodes is worth quoting in full:
It looks as if the settled proto-language had already diversified into a local linkage before separation occurred. In these cases, the dialects or languages of the “stay-at-homes” have no exclusively shared ancestor. Instead they share only an ancestor at the node above, with the language of the departed migrants.
The Western Malayo-Polynesian languages include the languages of the Philippines and western Indonesia, together with Chamorro, Palauan, Chamic and Malagasy. We know little about the subgrouping of the Western Malayo-Polynesian languages, and as Blust (1985) indicates, there is no clear evidence that these languages form a single subgroup of Austronesian. He is not alone in his thinking.
In fact, there is not even any real agreement as to how the Western Malayo-Polynesian languages subgroup among themselves. Ruhlen (1987), basing himself mainly on the work of Blust, assigns the members of the WMP subgroup to eleven divisions, as follows:
Ruhlen provides no justification for these subgroups, other than a geographical one. It should be noted that Chamorro, Palauan and Yapese are spoken in Micronesia to the east of the Philippines. While Chamorro and Palauan are clearly non-Oceanic, the position of Yapese is less clear.
The languages of the Philippines archipelago (including the Batan Islands between Taiwan and the Philippines) and several groups of languages spoken in the northern arm of Sulawesi have generally been believed to belong to a single Philippines subgroup regarded as having descended from Proto-Philippines (Zorc 1977, 1986). This subgrouping has been assumed rather than justified, however. Reid (1982:202) points out that the innovations which Charles (1974) lists as shared by the languages of the Philippines subgroup are based on a number of phonemic contrasts for PAn proposed by Dyen and Dempwolff which “do not stand close scrutiny, and are probably the result of unrecognized borrowing or obscured phonological processes in the history of the languages involved”.
Basing himself on Reid, Ross (1994) suggests another possible scenario, as follows:

Reid (1982:212) was also unhappy about the southern Mindanao languages Blaan and Tboli (which continue to exercise his mind today, see below), quite apart from the higher order subgroups linking the languages of Taiwan and the Philippines. He found that the southern Mindanao languages reflected none of the innovations characteristic of the Malayo-Polynesian languages and may be descended from “a very early migration south of Formosa by an Austronesian-speaking people”. We will return to this point in a moment, for an update on Reid’s current thinking.
Zorc (1986) challenges Reid’s subgrouping and defends the notion of a single Philippines grouping. His claim is based on a large number of lexical innovations shared widely by the languages of the Philippines archipelago. Commentators have remarked that it is difficult to assess Zorc’s position because it is unclear whether his lexical items are genuine innovations or vocabulary retained from PMP but lost in extra-Philippines languages. Reid himself (1982:212) comments: “As one moves south in the Philippines … the degree of influence of one or more of the central Philippines languages becomes more and more pervasive, so that it becomes more and more difficult to separate the strata in the languages”.
Reid’s thinking today has changed a little, but still centres to some extent around the problem posed by the languages of the Central Philippines, which appear to share a number of innovations with the Malayo-Javanic languages, including the formation of a set of ligatures exclusively shared by these two groups. His current position (pers.comm.) may be represented diagrammatically as follows:

Reid is concerned not so much with the higher-level subgroups in this tree diagram as with the lower-level attempt to resolve the problem posed by the languages of the Central Philippines and their obvious connection with the Malayo-Javanic languages, most probably through a southerly migration or series of migrations.
There are a number of other established subgroups in the Western Malayo-Polynesian area. Blust (pers.comm. to Ross) recognizes the following:
There has been further progress made with a number of these proposed lower order subgroups in recent times, as follows:
Nothofer (1990, 1994) has made a number of fresh proposals concerning Western Malayo-Polynesian. His proposal is that much of the WMP region was once occupied by speakers of languages belonging to a group which he calls “Palaeo-Hesperonesian”, and that at a later date much of this area was occupied by speakers of “Hesperonesian” languages, who became culturally dominant in western Indonesia, displacing the Palaeo-Hesperonesian languages. Some of these survive today around the periphery of the WMP region. In Nothofer’s terms, the languages of northern Sulawesi and the central and southern Philippines, together with those of north-west Sumatra and the Barrier Islands, north-west Borneo and central and southern Sulawesi would be Palaeo-Hesperonesian, while the Malayo-Chamic, Java-Bali-Sasak and Barito groups are Hesperonesian. Ross (1994) notes, however, that much of the evidence which Nothofer uses is lexical and so suffers from the same difficulties as Zorc’s use of such evidence in the Philippines. By the same token, however, Nothofer’s hypothesis must be considered seriously.
The Central-Eastern Malayo-Polynesian subgroup of Austronesian is much more substantial. It was first proposed as Eastern Austronesian by Blust (1974), and later rebaptized Central-Eastern Malayo-Polynesian (CEMP) by the same scholar (Blust 1978). The languages which constitute the CEMP subgroup stretch from Bimanese, on the island of Sumbawa, eastward through the Lesser Sunda chain of Indonesia as far as the Aru Islands, and then north-west into the central Moluccas, inclusive of the Sula Archipelago. In addition, several still very poorly known CEMP languages appear to be scattered along the southern coast of Irian. CEMP and its lower order subgroups are as follows:

Blust (1990:2) states that “the evidence for CEMP and for some previously unrecognized subgroups within CMP is considerably stronger than the evidence for CMP itself”. For, as we shall see, CMP is “united” by a number of overlapping innovations which cover many, but not all of the languages in question. This distribution of non-coincident innovations suggests to Blust that at an early stage in the Austronesian settlement of eastern Indonesia the languages now assigned to CMP formed a relatively isolated dialect chain which still shared well over 90 per cent of its basic vocabulary with languages that were not part of that chain.
In terms of culture-historical implications, after its separation from Proto-Western Malayo-Polynesian (PWMP), Proto-Central-Eastern Malayo-Polynesian (PCEMP) developed for some time in a relatively compact geographical area before splitting into Proto-Central Malayo-Polynesian (PCMP) and Proto-Eastern Malayo-Polynesian (PEMP). PEMP and its immediate descendant, Proto-Oceanic (POc) each developed in a relatively compact geographical area before splitting into descendant languages. By contrast, Proto-South Halmahera-West New Guinea (PSHWNG) and PCMP spread rapidly over a considerable distance before much dialect differentiation existed. A large number of linguistic innovations arose and spread through the CMP dialect chain in opposite directions, as they did also in SHWNG. These changes failed to reach the geographical extremes furthest from their respective centres of origin, producing differences of rule ordering in the central diffusion corridor (Blust 1978). The result is a patchy distribution of widely dispersed innovations. On the other hand, Blust maintains that some recurrent changes in the CMP languages may have been independent of contact, hence products of drift. Finally, after the differentiation of the CMP chain into distinct languages, there were limited migrations of small populations from the southern Moluccas in Indonesia to the southern coast of the Bird’s Head Peninsula of New Guinea.
The evidence adduced by Blust for the existence of a CEMP subgroup is quite substantial. It consists of the following:
Blust maintains that there is little to distinguish PMP from PCEMP phonologically. PMP *c and *s apparently merged as *s. But a similar merger occurs in many WMP languages and in all Formosan languages. However, as mentioned above, there is a reduction of hetero-organic consonant clusters in the reflexes of reduplicated monosyllables. All CEMP languages have simplified medial clusters in the reflexes of PMP reduplicated monosyllables (unless the cluster consisted of a nasal followed by a stop or fricative, in which case the nasal assimilated to the place of articulation of the stop, but was not lost). Examples:
PMP *bukbuk > PCEMP *bubuk ‘wood weevil’
PMP *ñamñam > PCEMP *ñañam ‘tasty, delicious’
PMP *mekmek > PCEMP *memek ‘crumbs’
Some WMP languages, for example Malay, have made similar simplifications — but, in Blust’s opinion, the universality of this change in CEMP is best explained as the product of a single innovation in a language ancestral to the whole group.
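The cluster reduction described above can be sketched as a small rule applied to reduplicated CVC monosyllables. This is a minimal illustration under stated assumptions, not Blust's own formalization. The function name, the segment classes and the table of homorganic nasals are assumptions introduced here. The form `tamtam` is a made-up input used only to exercise the nasal-assimilation branch.

```python
import unicodedata

VOWELS = set("aeiouə")
NASALS = set("mnñŋ")
# Assumed place classes: the nasal that matches each stop or fricative.
HOMORGANIC_NASAL = {
    "p": "m", "b": "m",
    "t": "n", "d": "n", "s": "n",
    "k": "ŋ", "g": "ŋ",
}


def simplify_reduplicated(form):
    """Simplify the medial cluster of a reduplicated CVC monosyllable.

    C1 V C2 C1 V C2 becomes C1 V C1 V C2, except that a nasal before a stop
    or fricative assimilates in place instead of being lost.
    Forms of any other shape are returned unchanged.
    """
    form = unicodedata.normalize("NFC", form)  # keep ñ as a single segment
    if len(form) != 6 or form[:3] != form[3:]:
        return form
    c1, v, c2 = form[:3]
    if v not in VOWELS or c1 in VOWELS or c2 in VOWELS:
        return form
    if c2 in NASALS and c1 in HOMORGANIC_NASAL:
        # Nasal + stop/fricative cluster: the nasal assimilates, it is not lost.
        return c1 + v + HOMORGANIC_NASAL[c1] + c1 + v + c2
    # Any other cluster: the first member is dropped.
    return c1 + v + c1 + v + c2


for pmp in ["bukbuk", "ñamñam", "mekmek", "tamtam"]:
    print("*" + pmp, ">", "*" + simplify_reduplicated(pmp))
```

The first three inputs are the PMP forms cited above and yield the PCEMP forms given there. The fourth shows the exception, in which the cluster is kept but made homorganic.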
Further evidence for the CEMP subgroup cited by Blust (1990) is as follows:
PMP *uliq; PCEMP *oliq ‘return, go back’
PMP *i-sai; PCEMP *i-sei ‘who?’
PMP *ma-qitem; PCEMP *ma-qetəm ‘black’
PMP *maRi; PCEMP *mai ‘come’
PMP *tudan; PCEMP *todan ‘sit’

1. PMP *ka-labaw; PCEMP *kanzupay ‘rat’
2. PCEMP *liqə ‘voice’
3. PCEMP *malu ‘loincloth’
4. PMP *dilaq; PCEMP *maya ‘tongue’
5. PMP *surat; PCEMP *tusi ‘scratch, draw a line, etc.’
6. PMP *tawa; PCEMP *malip ‘laugh’
7. PCEMP *saRa ‘sweep, broom’
8. PCEMP *kandoRa ‘cuscus, phalanger’
9. PCEMP *mansar/mansər ‘bandicoot’
10. PCEMP *keRa(nŋ) ‘hawksbill turtle’
11. PMP *amuR; PCEMP *au ‘dew’
12. PCEMP *bai ‘do, make’
13. PMP *paen; PCEMP *bayan/payan ‘bait’
14. PMP *hazani; PCEMP *daŋi ‘near’
15. PCEMP *kese ‘keep to oneself, be different’
16. PMP *dalem; PCEMP *laman ‘deep’
17. PMP *paen; PCEMP *pani(nŋ) ‘bait’
18. PMP *muRmuR; PCEMP *pupuR ‘gargle, rinse the mouth’
19. PMP *kapal; PCEMP *təlu ‘thick (of materials)’
20. PCEMP *qumun ‘earth oven’
21. PMP *lakaw/panaw; PCEMP *ba ‘go’
22. PCEMP *balaŋ ‘side, part’
23. PMP *qa-lima; PCEMP *baRa ‘hand, arm’
24. PCEMP *lama ‘spread over, cover’
25. PCEMP *ŋaRa ‘wild duck’
26. PCEMP *papaR ‘cheek’
27. PCEMP *paRa- ‘reciprocal prefix’
28. PMP *palihi; PCEMP *tambu ‘forbid’
29. PMP *hiup; PCEMP *upi ‘to blow’
30. PCEMP *waŋka ‘canoe’
31. PCEMP *wari ‘sing, song’
32. PMP *ma-esak; PCEMP *madar ‘ripe, overripe’
33. PMP *bahu; PCEMP *mapu ‘unpleasant odour’
Blust (1990) maintains that there are two features which are widely distributed in Eastern Indonesia and Oceania, namely:
However, there is a lack of established cognation in the morphemes used to express formally similar systems — thus a hypothesis of convergent development between the CMP and the Oc proclitics cannot easily be ruled out. Indeed Ross (1988:96ff.) also questions whether there is convincing evidence for an immediate common ancestor of the CMP, SHWNG and Oc subgroups.
1. PMP *apa; PCEMP *sapa ‘what?’
2. PMP *hepat; PCEMP *pat, *pati ‘four’
3. PMP *ma-huab; PCEMP *mawab ‘yawn’
4. PMP *ma-hiaq; PCEMP *mayaq ‘shy, ashamed’
Blust (1990) concludes that the evidence for the existence of the PCEMP subgroup is fairly strong, as individual pieces of evidence are mostly mutually independent. Grimes (1990) has made an independent evaluation of the CEMP evidence and finds that Blust has a good case, even though very few of the lexical innovations which Blust lists are replacement innovations.
With respect to the Central Malayo-Polynesian subgroup (CMP), Blust and others are much less confident. These are the languages of the Lesser Sundas east of the Bima-Sumba group, and those of the southern and central Moluccas. The problems associated with this subgroup are not surprising, as we are again dealing with a “stay-at-home” rather than a migratory group. As suggested above, the most striking feature of the phonological history of the CMP languages is the extent to which similar changes are found in many but not all of the languages. This pattern of innovation suggests that PCMP underwent a short period of development apart from other contemporary Austronesian languages before it began to spread from the Moluccas to the Lesser Sundas. Many of the changes that are now widespread in these languages took place after this geographical dispersal and were the result of diffusion and in some cases drift.
The innovations which distinguish the CMP languages according to Blust (1990) are the following:
PMP *ma-putiq > Kemak (C.Timor), Bonfia (E.Seram) buti, Buru boti, ‘white’.
As with the two previously discussed innovations, however, it appears that postnasal voicing was also a recurrent change.
However, while examples of this irregular development are known from Flores to the Leti-Moa Archipelago, they are apparently not found in the southern and central Moluccas.
Blust lists the following innovations which he claims are exclusively shared by the languages of the Lesser Sundas and the Moluccas:
1. PCMP *balabu ‘see dimly’
2. PCMP *balik ‘mix, blend’
3. PCMP *beta ‘cut wood’
4. PCMP *dada ‘drag’
5. PCMP *dodok ‘pierce, stab’
6. PCMP *letay ‘above’
7. PMP *kawit; PCMP *gae ‘hook’
8. PCMP *kati ‘call a dog’
9. PCMP *ketu ‘pluck, break off’
10. PCMP *lemba ‘carry with a carrying pole’
11. PCMP *lesi ‘excess, overabundance’
12. PCMP *lesu ‘come out, take out’
13. PCMP *letay ‘bridge’
14. PCMP *leu ‘bend’
15. PCMP *liRi ‘sound, voice’
16. PCMP *lolan ‘cut off a piece’
17. PCMP *lunu ‘roll up’
18. PMP *i-nu; PCMP *mpae ‘where?’
19. PCMP *peu ‘bind together in a sheaf’
20. PMP *qasu; PCMP *masu ‘smoke’
21. PCMP *silu ‘lift, raise’
22. PMP *tahiq, *zaqit; PCMP *sora ‘sew’
23. PCMP *sula ‘horn’
24. PCMP *ta ‘no, not’
25. PMP *taliŋa; PCMP *tilu ‘ear’
The major problem with the lexical innovations as proposed here is that again they are not replacement innovations.
Blust (1990) also proposes some morphosyntactic and semantic innovations for the CMP subgroup, but here again the problem is that they are not shared throughout the proposed subgroup. In fact that is the very point which Blust himself makes. Blust asks whether we should assume that the changes he has documented are the product of completely independent innovations, that is, of drift. If so, he says, it is puzzling why the changes in question should be concentrated in the languages of the Lesser Sundas and the southern and central Moluccas.
In Blust’s opinion, diffusion is the most plausible explanation for the distributions he puts forward. It is well known that diffusion can occur across major subgroup boundaries. Thus the widely shared phonological innovations among the languages of the Lesser Sundas, the southern and central Moluccas and the southern part of the Vogelkop Peninsula may simply be the products of contact among Austronesian languages that share no particularly close genetic affinity.
The Central Malayo-Polynesian subgroup of Austronesian, then, is faced with the same kinds of problems as other “stay-at-home” Austronesian groups, and its existence cannot at this stage be taken as proven any more than that of the Western Malayo-Polynesian subgroup. Nobody has really looked at the over-all relationships of the languages of Nusatenggara and Timor with the languages of Maluku. Thus we have no real idea of the first-order nodes within CMP.
The two descendants of the Eastern Malayo-Polynesian subgroup are the South Halmahera-West New Guinea (SHWNG) and Oceanic (Oc) subgroups. The SHWNG subgroup consists of all of the Austronesian languages of Halmahera and its near satellites and the various languages along the north coast of the Vogelkop Peninsula and Cenderawasih Bay, Waropen and all the Austronesian languages of Yapen Island and its satellites. The data available for many of these languages is far from adequate, making subgrouping difficult and at present uncertain. One important problem remaining to be solved concerns the boundary between CMP and SHWNG languages.
Blust (1978) set out the criteria for the SHWNG and Oc subgroups of Austronesian, and these need not be repeated in full here. In summary, Blust considers that the following are the most useful defining innovations for the South Halmahera-West New Guinea subgroup:
With respect to the Oceanic subgroup (Oc), Ross (1988:30) sets out a list of ten phonological innovations which distinguish POc (Proto-Oceanic) from PAn (Proto-Austronesian). However, half of these are also reflected in the South Halmahera-West New Guinea subgroup (SHWNG) and as such are attributable to Proto-Eastern Malayo-Polynesian (PEMP), the immediate ancestor of both SHWNG and POc. There are, however, five innovations which took place between PEMP and POc, as follows:
1. PEMP (m)p, (m)b > POc (m)p, ŋp
2. PEMP (n)s, (n)z > POc (n)s
3. PEMP e, aw > POc o
4. PEMP ay, ey > POc e
5. PEMP m > POc m, ŋm
In terms of phonological innovations between PEMP and POc, then, we are dealing with four mergers and two splits, quite substantial evidence by any standards. There is also lexical and morpho-syntactic evidence for the existence of the Oceanic subgroup presented in Pawley (1972:2-3). The development and dispersal of the Oceanic subgroup of Austronesian is discussed in the following chapter by Pawley and Ross.
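The tally of mergers and splits can be checked with a short sketch. It encodes the five PEMP to POc correspondences as pairs of source and reflex lists. That encoding is an assumption about how the table is to be read: the first two values of rows 1 to 4 are taken as PEMP sources that merge, and a row with more than one POc reflex is taken as a split. The variable names are illustrative only.

```python
# Each entry: (PEMP sources, POc reflexes), following the five rows above.
correspondences = [
    (["(m)p", "(m)b"], ["(m)p", "ŋp"]),
    (["(n)s", "(n)z"], ["(n)s"]),
    (["e", "aw"], ["o"]),
    (["ay", "ey"], ["e"]),
    (["m"], ["m", "ŋm"]),
]

# A merger: more than one PEMP source falls together in POc.
mergers = sum(1 for sources, _ in correspondences if len(sources) > 1)
# A split: the PEMP source(s) have more than one POc reflex.
splits = sum(1 for _, reflexes in correspondences if len(reflexes) > 1)

print(f"mergers: {mergers}, splits: {splits}")
```

Under this reading the count agrees with the four mergers and two splits stated in the text, with row 1 counting as both a merger and a split.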