The "Magichub Chinese-English Mixed ASR Challenge", co-sponsored by Magic Data, TAL, Tsinghua University, and the Institute of Acoustics of the Chinese Academy of Sciences, has received more than 30 team registrations from domestic and foreign institutions since it opened on August 12, 2022

2024/12/10 21:03:32

Since registration opened on August 12, 2022, the "Magichub Chinese-English Mixed ASR Challenge", co-sponsored by Magic Data, TAL, Tsinghua University, and the Institute of Acoustics of the Chinese Academy of Sciences, has received registrations from more than 30 teams at domestic and foreign research institutions, well-known enterprises, and universities. These include Litchi FM, Terminus, NetEase Games, China Mobile Online, the Chinese Academy of Sciences, Huazhong University of Science and Technology, the University of Science and Technology of China, Northwestern Polytechnical University, Xiamen University, and Tianjin University. On August 24, the organizers officially released the training and development sets and the baseline system to participating teams.

Registration is in progress

https://magichub.com/join-competition/?id=11627

Training and development sets

The organizer has opened the following training and development data sets:

1. MagicData-RAMC: 351 multi-turn Mandarin conversations with a total duration of 180 hours. The annotations for each conversation include the transcript, voice activity timestamps, speaker information, recording information, and topic information. Speaker information covers gender, age, and region; recording information covers environment and equipment. Check your email for instructions on downloading the data set.

2. TAL_CSASR: a mixed Chinese and English speech data set drawn from TAL English course audio, with a total duration of 587 hours. Each recording has a single speaker, and there are more than 200 speakers in total. Check your email for instructions on downloading the data set.

3. Development set (Dev): 14 speakers, with a total duration of about 6.8 hours.

All participants are expected to abide by the following rules:

1. Data: Only MagicData-RAMC and TAL_CSASR may be used for training. For data augmentation, two noise data sets are allowed: MUSAN (openslr 17) and RIRNoise (openslr 28).

2. It is strictly prohibited to use the test set in any form, including but not limited to using the test data set to fine-tune or train the model.

3. Multi-system fusion is allowed. However, fusing systems that share the same structure is discouraged.

4. All models must be trained on the allowed data sets. In particular, pretrained models that were trained on other data sets (including unlabeled data) are not allowed.

5. The organizer reserves the right of final interpretation.

Baseline system introduction

To help contestants evaluate their systems, the organizer provides baseline results for reference. The baseline uses a Transformer model and is built on the ETEH platform.

For detailed information, please see:

https://github.com/MagicHub-io/CSASR_Challenge

Scoring tool

Scoring uses the open source tool Sclite. The metric is Mixed Error Rate (MER), which counts errors per character for Chinese and per word for English. See the scoring example:

https://github.com/MagicHub-io/CSASR_Challenge/blob/main/dev_scoring_sclite.sh
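To make the metric concrete, here is a minimal Python sketch of a Mixed Error Rate computation. It is not the Sclite-based script linked above and may not match it digit for digit. It assumes the common convention that each Chinese character is one token and each English word is one token, and that English is lowercased before comparison. The function names (`tokenize`, `edit_distance`, `mixed_error_rate`) are hypothetical and chosen only for illustration.

```python
import re

# Assumption: one token per CJK character, one token per English word.
# Punctuation and whitespace are dropped.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9']+")


def tokenize(text):
    """Split mixed Chinese-English text into characters and words."""
    return _TOKEN.findall(text.lower())


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def mixed_error_rate(references, hypotheses):
    """Total edit errors divided by total reference tokens."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = tokenize(ref)
        errors += edit_distance(ref_tokens, tokenize(hyp))
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("reference contains no tokens")
    return errors / total


if __name__ == "__main__":
    refs = ["我想学习 machine learning"]
    hyps = ["我想学 machine learning"]
    print(f"MER = {mixed_error_rate(refs, hyps):.4f}")
```

In this example the reference has six tokens (four characters and two words) and the hypothesis drops one character, so the sketch reports one error out of six. Official results should always come from the organizer's scoring script, since text normalization choices can change the score.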

Baseline system Q&A guide

If you have any questions about the baseline system, visit the following link, where a team of experts will answer them.

Q&A link:

https://github.com/MagicHub-io/CSASR_Challenge#contact

Awards

The competition awards first, second, and third prizes, and winning teams or individuals will be selected in these three groups. Winners will have the opportunity to take part in on-site demonstrations and exchange activities at top international and domestic conferences.

1 first prize: Huawei Watch + Apu fascia gun (worth 3,000 yuan) + award certificate

2 second prizes: Magic Data Koi gift pack + TAL Future & Lingmei joint pen gift box (worth 1,500 yuan) + award certificate

3 third prizes: Magic Data customized gift + Apu weight scale (worth 500 yuan) + award certificate

Schedule setting

Competition organizing committee support team

For questions about the challenge, send an email to [email protected] with the subject "Questions about the Chinese-English Mixed ASR Challenge". Senior technical experts from the organizing committee will provide technical Q&A and guidance. They have worked in the speech field for many years and have extensive research and practical experience, so contestants can expect to benefit from their guidance.

Registration method

Registration address: https://magichub.com/join-competition/?id=11627

Number of participants: Each team may have at most 4 members.

More details: www.magichub.com
