
Indonesian Accents and Regional Languages

English is well known for having many accents. An accent is the part of a dialect that concerns local pronunciation, and it varies from region to region. English dialects, for instance, differ greatly in how they pronounce open vowels.

Likewise, speakers of Indonesian (the language, Bahasa Indonesia) tend to have certain accents, especially if they come from regions where a regional language is used on a daily basis. As far as we could find, no existing study quantifies the different accents of the Indonesian language. Hence, we aim to explore and get a brief understanding of how Indonesian accents vary and how regional languages influence them.

Info

This blog post is largely inspired by a discussion posted on italki. It raises the question: "Do people from different islands in Indonesia have a distinct accent when they speak Indonesian? If they do, what are characteristics of those accents like?"

Language Phonologies

We will only examine three regional languages, Javanese (jv), Sundanese (su), and Toba Batak (bbc), and compare them with Indonesian (id).

Indonesian

Vowel phonemes

Height      Front   Central   Back
Close       i       -         u
Close-mid   e       ə         o
Open-mid    ɛ       -         ɔ
Open        -       a         -

It is usually understood that there are 6 vowel phonemes in the Indonesian language, namely a, e, i, o, u, and ə. However, newer systems tend to add two more open-mid vowels, ɔ and ɛ. For simplicity, we will ignore the latter two phonemes and treat the first 6 as standard.

Consonant phonemes

Manner                          Labial   Dental/Alveolar   Palatal   Velar   Glottal
Nasal                           m        n                 ɲ         ŋ       -
Plosive/Affricate (voiceless)   p        t̪                 t͡ʃ        k       (ʔ)
Plosive/Affricate (voiced)      b        d                 d͡ʒ        ɡ       -
Fricative (voiceless)           (f)      s                 (ʃ)       (x)     h
Fricative (voiced)              (v)      (z)               -         -       -
Approximant                     w        l                 j         -       -
Trill                           -        r                 -         -       -

On the other hand, it is quite tricky to say exactly how many consonant phonemes there are. This is due to the heavy influence of both foreign languages like Dutch, Arabic, English and Sanskrit, and regional languages such as Balinese, Madurese, Sundanese, and Javanese [1]. Their influence comes in the form of loanwords, whose phonemes are shown in parentheses.

Regardless, if we account for all 24 consonant phonemes and the six vowel phonemes, we will arrive at a total of 30 phonemes. It is these 30 phonemes that serve as phonetic units in g2p ID, an Indonesian Grapheme-to-Phoneme Converter.
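The count above can be checked with a short sketch. The sets are transcribed from the tables in this section, with the parenthesized entries grouped separately; the variable names are our own, and this is not meant to reflect how g2p ID stores its phonemes internally.

```python
# Phoneme inventory transcribed from the tables above.
# Assumption: names and grouping are illustrative, not g2p ID's data structures.
VOWELS = {"a", "e", "i", "o", "u", "ə"}

NATIVE_CONSONANTS = {
    "m", "n", "ɲ", "ŋ",        # nasals
    "p", "t̪", "t͡ʃ", "k",      # voiceless plosives/affricates
    "b", "d", "d͡ʒ", "ɡ",       # voiced plosives/affricates
    "s", "h",                  # fricatives
    "w", "l", "j",             # approximants
    "r",                       # trill
}

# Entries shown in parentheses in the consonant table.
PARENTHESIZED_CONSONANTS = {"ʔ", "f", "ʃ", "x", "v", "z"}

CONSONANTS = NATIVE_CONSONANTS | PARENTHESIZED_CONSONANTS
PHONEMES = VOWELS | CONSONANTS

print(len(CONSONANTS), len(VOWELS), len(PHONEMES))  # 24 6 30
```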

Javanese

Javanese is a difficult language to study. Not only does it vary from region to region (e.g. East versus West Javanese speakers), it has also evolved from Old Javanese (Kawi) to modern Javanese. The tables below are meant to cover the phonemes of Modern Standard Javanese [2, 3].

Vowel phonemes

Height      Front   Central   Back
Close       i       -         u
Close-mid   e       ə         o
Open-mid    (ɛ)     -         (ɔ)
Open        -       a         -

Javanese vowel phonemes are essentially identical to those of Indonesian: the six usual vowel phonemes plus two open-mid ones. There might additionally be phonetic changes, depending on where the speaker is located. For example, in the standard dialect of Surakarta, a is pronounced ɔ in word-final open syllables, and in any open penultimate syllable before such an ɔ.
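The Surakarta rule can be sketched in a few lines. This is a minimal illustration that assumes the word has already been split into syllables by hand; the function name is hypothetical, and the inputs other than la-ra are toy syllable sequences rather than attested words.

```python
def surakarta_a_to_o(syllables: list[str]) -> list[str]:
    """Apply the a -> ɔ rule described above to a hand-syllabified word.

    A syllable that ends in "a" is open by definition, so checking the
    last character covers the "open syllable" condition.
    """
    out = list(syllables)
    if not out:
        return out
    # Word-final open syllable: a -> ɔ
    if out[-1].endswith("a"):
        out[-1] = out[-1][:-1] + "ɔ"
        # Open penultimate syllable before such an ɔ: a -> ɔ
        if len(out) >= 2 and out[-2].endswith("a"):
            out[-2] = out[-2][:-1] + "ɔ"
    return out


print(surakarta_a_to_o(["la", "ra"]))    # ['lɔ', 'rɔ']
print(surakarta_a_to_o(["la", "raŋ"]))   # ['la', 'raŋ'] (final syllable is closed)
print(surakarta_a_to_o(["man", "ka"]))   # ['man', 'kɔ'] (penultimate is closed)
```

Note that the penultimate syllable only changes when the final one has changed, which mirrors the "before such an ɔ" condition.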

Consonant phonemes

Manner                            Labial   Dental/Alveolar   Retroflex   Palatal   Velar   Glottal
Nasal                             m        n                 -           ɲ         ŋ       -
Plosive/Affricate (stiff voice)   p        t̪                 ʈ           tʃ        k       ʔ
Plosive/Affricate (slack voice)   b̥        d̪̥                 ɖ̥           dʒ̊        ɡ̊       -
Fricative                         -        s                 -           -         -       h
Semivowel                         -        -                 -           j         w       -
Liquid (lateral)                  -        l                 -           -         -       -
Liquid (rhotic)                   -        r                 -           -         -       -

Here's where it gets different: Javanese adds 2 new retroflex consonant phonemes ʈ and ɖ̥ (romanized as th and dh, respectively), and eliminates most of the borrowed consonant phonemes that are present in Indonesian phonology. These differences influence the way several Javanese speakers speak Indonesian.

For example, Javanese speakers tend to use the retroflex consonant phonemes in place of their dental/alveolar counterparts. So instead of saying medok, a Javanese speaker may say meɖ̥ok.

On the flip side, the absence of f and v may cause the speaker to interchange them with the phoneme p, in either direction. For instance, instead of saying bərpikir, a Javanese speaker might say bərfikir.

Certainly, these changes differ from region to region and from speaker to speaker, but they explain why Javanese speakers might have a strong accent when they speak Indonesian. Regional language speakers tend to carry their regional language's phonetic system over into Indonesian -- and Javanese speakers aren't the only ones who do that.

Sundanese

Like Javanese, Sundanese has evolved from its Old Sundanese form to a more modern one. Moreover, there are multiple dialects, such as the Western Dialect (or Bantenese), the Northern Dialect, the Southern/Priangan Dialect, and many others.

Vowel phonemes

Height   Front   Central   Back
Close    i       ɨ         u
Mid      ɛ       ə         ɔ
Open     -       a         -

Now, this is where Sundanese phonemes differ from those of Indonesian and Javanese. Aside from the usual 6 phonemes shared with Indonesian, Sundanese introduces a new vowel phoneme ɨ (romanized as eu). Examples of Sundanese words that contain the ɨ phoneme are teu, ieu, haseup, and haseum.

It's not entirely obvious how this would bleed into the way a Sundanese speaker speaks Indonesian. One possibility is that Indonesian words with close Sundanese counterparts (e.g. asam and haseum) get interchanged, which is where the phoneme ɨ might surface.

Consonant phonemes

Manner                          Bilabial   Alveolar   Palatal   Velar   Glottal
Nasal                           m          n          ɲ         ŋ       -
Plosive/Affricate (voiceless)   p          t          tʃ        k       -
Plosive/Affricate (voiced)      b          d          dʒ        ɡ       -
Fricative                       -          s          -         -       h
Lateral                         -          l          -         -       -
Trill                           -          r          -         -       -
Approximant                     w          -          j         -       -

Like Javanese, Sundanese originally did not have borrowed/foreign consonant phonemes and only has 18 consonants in total [4]. However, as foreign words were gradually incorporated into the language, several additional consonants such as f, v, z, ʃ and x have been introduced. And just like in Javanese, these new phonemes tend to be transferred into native consonants, namely:

  • f / v → p
  • ʃ → s
  • z → dʒ
  • x → h
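The transfers listed above amount to a lookup table over phonemes. The sketch below is illustrative only: the names are our own, and words are represented as lists of phoneme strings so that multi-character symbols such as dʒ stay intact.

```python
# Foreign -> native consonant transfers, as listed above.
FOREIGN_TO_NATIVE = {
    "f": "p",
    "v": "p",
    "ʃ": "s",
    "z": "dʒ",
    "x": "h",
}


def nativize(phonemes: list[str]) -> list[str]:
    """Replace each foreign consonant with its native counterpart."""
    return [FOREIGN_TO_NATIVE.get(p, p) for p in phonemes]


print(nativize(["f", "i", "k", "i", "r"]))   # ['p', 'i', 'k', 'i', 'r']
print(nativize(["z", "a", "m", "a", "n"]))   # ['dʒ', 'a', 'm', 'a', 'n']
```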

Toba Batak

Toba Batak is no exception to gradual language evolution and wide variation, and neither are most regional languages of Indonesia. It used to be written in the Batak script, but the Latin script is now preferred.

Vowel phonemes [5]

Height      Front   Central   Back
Close       i       -         u
Close-mid   e       (ə)       o
Open-mid    ɛ       -         ɔ
Open        -       a         -

As shown above, Toba Batak has the same set of vowel phonemes as Indonesian and Javanese. However, unlike the others, the ə phoneme is not native to Toba Batak: it only occurs in loanwords from Indonesian. Native Toba Batak speakers therefore tend to pronounce all e graphemes as e phonemes. Hence, instead of saying məreka baru kəmbali, they might say mereka baru kembali (notice the difference in e's).

This is particularly tricky because Indonesian is sensitive to how the letter e is phonemized. Many Indonesian homographs are distinguished only by how that letter is pronounced, so a change in pronunciation can change the meaning. More on that here.
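To make the homograph problem concrete, here is a small sketch that merges ə into e and reports which pronunciations become indistinguishable as a result. The function names are hypothetical, and the pair təras / teras is used purely as an illustrative homograph; any pair that differs only in ə versus e would behave the same way.

```python
def merge_schwa(phonemes: str) -> str:
    """Pronounce every ə as e, as described above for Toba Batak speakers."""
    return phonemes.replace("ə", "e")


def collisions(pronunciations: list[str]) -> dict[str, list[str]]:
    """Group pronunciations that become identical once ə is merged into e."""
    groups: dict[str, list[str]] = {}
    for p in pronunciations:
        groups.setdefault(merge_schwa(p), []).append(p)
    # Keep only groups where at least two distinct inputs collapsed together.
    return {merged: items for merged, items in groups.items() if len(items) > 1}


print(merge_schwa("məreka baru kəmbali"))        # mereka baru kembali
print(collisions(["təras", "teras", "baru"]))    # {'teras': ['təras', 'teras']}
```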

Consonant phonemes

Manner                          Labial   Dental/Alveolar   (Alveolo-)palatal   Velar   Glottal
Nasal                           m        n                 -                   ŋ       -
Plosive/Affricate (voiceless)   p        t                 t͡ɕ                  k       -
Plosive/Affricate (voiced)      b        d                 d͡ʑ                  ɡ       -
Fricative                       -        s                 -                   -       h
Trill                           -        r                 -                   -       -
Approximant                     w        l                 j                   -       -

Aside from the shift from palatal to alveolo-palatal affricates and the absence of ɲ in the table above, Toba Batak has the same consonant phonemes as Sundanese. Borrowed foreign consonant phonemes are likewise absent, so phonetic transfers might also occur -- converting foreign phonemes (if any) to their closest native counterparts, just like in Sundanese.
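The comparison with Sundanese can be made explicit with set operations. The two sets below are transcribed from the consonant tables in this post; the variable names are our own.

```python
# Consonant inventories as transcribed from the tables in this post.
SUNDANESE = {
    "m", "n", "ɲ", "ŋ",
    "p", "t", "tʃ", "k",
    "b", "d", "dʒ", "ɡ",
    "s", "h", "l", "r", "w", "j",
}
TOBA_BATAK = {
    "m", "n", "ŋ",
    "p", "t", "t͡ɕ", "k",
    "b", "d", "d͡ʑ", "ɡ",
    "s", "h", "r", "w", "l", "j",
}

shared = SUNDANESE & TOBA_BATAK
only_sundanese = SUNDANESE - TOBA_BATAK
only_toba_batak = TOBA_BATAK - SUNDANESE

print(len(shared))               # 15
print(sorted(only_sundanese))    # the palatal affricates and ɲ
print(sorted(only_toba_batak))   # the alveolo-palatal affricates
```

The symbols that fall outside the shared set are exactly the palatal versus alveolo-palatal affricates, plus ɲ, which appears only in the Sundanese table.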

Recap, Takeaways, and Suggestions

A quick recap of the frequent changes that occur across the regional languages discussed above is as follows:

  • t → ʈ (especially Javanese)
  • d → ɖ̥ (especially Javanese)
  • a → ɔ (occasionally Javanese)
  • f / v → p
  • ʃ → s
  • z → dʒ
  • x → h
  • ə → e (especially Toba Batak)

Now, these are only 3 out of hundreds of regional languages in Indonesia. Ultimately, the takeaway from exploring these languages and their impact on the Indonesian accent is to be contextually aware when building speech technology for the language. Although it is widely understood that there is a "standard Indonesian" or "accent-less Indonesian", a large share of its speakers may still carry over their regional phonemization when speaking the lingua franca.

The question to answer when building speech technology with all of this in mind is: should we build a system that accounts for these strong, varying accents? Do we mark them as non-Indonesian, or do we accept them for their differences and still consider them proper Indonesian? Where is the fine line between a homographic mistake and a phonetic difference?

We personally think that the answers depend very much on the case. If you can quickly identify the accent of the speaker, then an accent-aware, stricter model would be ideal. But if that's not feasible, then a more flexible, lenient model might be preferable.
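The difference between a strict and a lenient model can be sketched as two ways of comparing an expected phoneme sequence with an observed one. This is only a toy illustration under our own assumptions: the mapping is the recap list read in reverse, the function names are hypothetical, and a real system would work with acoustic scores rather than exact symbol matches.

```python
# Accent variants mapped back to a canonical phoneme (the recap list, reversed
# where needed so that both sides of a comparison land on the same symbol).
CANONICAL = {
    "ʈ": "t", "ɖ̥": "d", "ɔ": "a",
    "f": "p", "v": "p",
    "ʃ": "s", "z": "dʒ", "x": "h",
    "ə": "e",
}


def normalize(phonemes: list[str]) -> list[str]:
    """Collapse accent variants onto one canonical symbol each."""
    return [CANONICAL.get(p, p) for p in phonemes]


def matches(expected: list[str], observed: list[str], lenient: bool = False) -> bool:
    """Strict mode requires identical phonemes; lenient mode ignores accent variants."""
    if lenient:
        return normalize(expected) == normalize(observed)
    return list(expected) == list(observed)


expected = ["m", "ə", "r", "e", "k", "a"]
observed = ["m", "e", "r", "e", "k", "a"]
print(matches(expected, observed))                 # False
print(matches(expected, observed, lenient=True))   # True
```

The trade-off is visible in the mapping itself: once ə and e are collapsed, the lenient comparison can no longer tell apart homographs that differ only in that vowel.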

Extra: Indonesian English

Singaporeans have their own English accent, Singapore English (en-SG) and Malaysians similarly have Malaysian English (en-MY). Likewise, Indonesians have Indonesian English (en-ID): essentially how Indonesians tend to speak English, i.e. their accent.

To the best of our knowledge, there is yet to be a concrete study investigating en-ID. We suspect that this is due to the fact that English is not Indonesia's official language, unlike its neighboring countries. However, MasteringBahasa noted several interesting characteristics of Indonesian English, which can be summarized as:

  • Rolled R's
  • No silent letters
  • Transfers of English phonemes to native phonemes (f, v, ʃ)
  • Full/whole A's

You can find more details here.


Written by Wilson Wongso. Last updated 14 December 2022.


  1. Poedjosoedarmo, Soepomo (1982). "Javanese influence on Indonesian phonology". Javanese influence on Indonesian. D. Vol. 38. Canberra: Pacific Linguistics. pp. 19–50.

  2. Brown, Keith; Ogilvie, Sarah (2008). Concise encyclopedia of languages of the world. Elsevier. p. 560. ISBN 9780080877747

  3. Suharno, Ignatius (1982). A Descriptive Study of Javanese. Canberra: ANU Asia-Pacific Linguistics / Pacific Linguistics Press. pp. 4–6. doi:10.15144/PL-D45. hdl:1885/145095. ISBN 9780858832589

  4. Müller-Gotama, Franz (2001). Sundanese. Languages of the World. Materials. Vol. 369. Munich: LINCOM Europa. 

  5. Nababan, P. W. J. (1981). A Grammar of Toba-Batak. Pacific Linguistics Series D – No. 37. Canberra: Dept. of Linguistics, Research School of Pacific Studies, The Australian National University. doi:10.15144/pl-d37. hdl:1885/145092