Emotional speech audio files in Odia, an Indic language. Our open-access IEEE paper (cited below) describes in detail the data collection, recording setting, data distribution, and perceptual validation of SITB-OSED.

The final SITB-OSED database contains 12,110 utterances covering all five major dialects (Cuttacki, Baleswari, Berhampuri, Sambalpuri, and Phulbani), spoken by 20 professional native Odia speakers (10 male and 10 female) from the state of Odisha, India. For each dialect, 4 speakers (2 male and 2 female) performed six basic emotions: 1) Anger 2) Surprise 3) Happy 4) Sadness 5) Disgust 6) Fear. Utterance duration ranges from 3.5 s to 8 s. All samples were recorded in .wav format with 2 channels (stereo), 16-bit quantization, and a sample rate of 22.05 kHz.
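As a quick sanity check after downloading, the recording format stated above can be compared against a file's actual header. The following is a minimal sketch using only Python's standard `wave` module; the function names `describe_wav` and `matches_expected_format` are hypothetical helpers introduced here for illustration and are not part of the dataset.

```python
import wave

# Expected recording format, taken from the dataset description above.
EXPECTED_CHANNELS = 2            # stereo
EXPECTED_BITS_PER_SAMPLE = 16    # 16-bit quantization
EXPECTED_SAMPLE_RATE_HZ = 22050  # 22.05 kHz


def describe_wav(path):
    """Read the header of a WAV file and return its basic properties."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,  # bytes -> bits
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def matches_expected_format(info):
    """Return True if the properties match the documented recording format."""
    return (
        info["channels"] == EXPECTED_CHANNELS
        and info["bits_per_sample"] == EXPECTED_BITS_PER_SAMPLE
        and info["sample_rate_hz"] == EXPECTED_SAMPLE_RATE_HZ
    )
```

For example, `matches_expected_format(describe_wav("a1-01-05-17.wav"))` would be expected to return `True` for a correctly downloaded file, and the `duration_s` value can be checked against the stated 3.5 s to 8 s range.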

Download the SITB-OSED dataset

File naming style

Each of the files has a 4-part identifier (e.g., a1-01-05-17.wav).

  • Sample identifier (a1 = angry, sample one).
  • Emotion (01 = angry, 02 = disgust, 03 = fear, 04 = happy, 05 = sad, 06 = surprised).
  • Dialect identifier (01 = Cuttacki, 02 = Baleswari, 03 = Berhampuri, 04 = Sambalpuri, 05 = Phulbani).
  • Actor number (01 to 20; odd numbers denote female actors, even numbers denote male actors).

Example: a1-01-05-17.wav

  1. a1 = angry, sample one
  2. 01 = angry
  3. 05 = Phulbani
  4. 17 = female actor (odd number)
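The naming scheme above can be decoded programmatically when building labels for a classifier. The following is a minimal sketch, not an official loader: the `parse_filename` function and `SampleInfo` type are hypothetical names introduced here, and the regular expression assumes that the first field is always one or more letters followed by digits (only `a1` is documented above).

```python
import re
from typing import NamedTuple

# Code tables copied from the file naming description above.
EMOTIONS = {"01": "angry", "02": "disgust", "03": "fear",
            "04": "happy", "05": "sad", "06": "surprised"}
DIALECTS = {"01": "Cuttacki", "02": "Baleswari", "03": "Berhampuri",
            "04": "Sambalpuri", "05": "Phulbani"}


class SampleInfo(NamedTuple):
    sample_id: str
    emotion: str
    dialect: str
    actor: int
    gender: str


# Assumption: the sample field is letters followed by digits, e.g. "a1".
_PATTERN = re.compile(r"^([a-z]+\d+)-(\d{2})-(\d{2})-(\d{2})\.wav$", re.IGNORECASE)


def parse_filename(name):
    """Split a file name such as 'a1-01-05-17.wav' into its four labeled parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    sample_id, emotion_code, dialect_code, actor_code = match.groups()
    if emotion_code not in EMOTIONS or dialect_code not in DIALECTS:
        raise ValueError(f"unknown emotion or dialect code in {name!r}")
    actor = int(actor_code)
    if not 1 <= actor <= 20:
        raise ValueError(f"actor number out of range in {name!r}")
    # Odd actor numbers are female, even numbers are male.
    gender = "female" if actor % 2 == 1 else "male"
    return SampleInfo(sample_id, EMOTIONS[emotion_code],
                      DIALECTS[dialect_code], actor, gender)


print(parse_filename("a1-01-05-17.wav"))
# SampleInfo(sample_id='a1', emotion='angry', dialect='Phulbani', actor=17, gender='female')
```

Raising `ValueError` on unrecognized names, rather than returning partial results, makes it easier to spot stray or renamed files when iterating over the extracted dataset directory.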

Please cite the paper below if you use SITB-OSED in an academic publication:

B. Maji and M. Swain, “SITB-OSED: An Odia Speech Emotion Dataset,” 2022 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), Gold Coast, Australia, 2022, pp. 1-5, doi: 10.1109/CSDE56538.2022.10089254.