Business Unit
Technology Engineering Group (TEG) is responsible for supporting the company and its business groups on technology and operational platforms, as well as for the construction and operation of R&D management and data centers. TEG also provides users with a full range of customer services.
What the Role Entails
Job Responsibilities:
1. Track the latest research in speech generation algorithms, explore next-generation paradigms for speech/audio generation, and push the boundaries of speech generation capabilities.
2. Investigate cutting-edge multimodal voice foundation model technologies to enhance voice interaction experiences by integrating text, speech, and vision.
3. Lead the technical R&D of voice foundation models, driving model performance improvements and innovative applications.
Who We Look For
Job Requirements:
1. Master’s degree or higher in Computer Science, Artificial Intelligence, Electronic Engineering, Signal Processing, or a related field.
2. Research or development experience in one or more of the following areas: voice foundation models, speech synthesis, speech recognition, audio generation, voice conversion, or speech codecs.
3. Familiarity with mainstream voice-enabled large models (e.g., GPT4o, GLM-4-Voice, Voila). Prior project experience is preferred.
4. Proficiency in deep learning frameworks (e.g., PyTorch). Experience with large-scale model training frameworks (Megatron/DeepSpeed) is a plus.
5. Solid understanding of large model architectures and principles. Experience in large-scale pretraining or post-training is preferred.
Location State(s)
US-Washington-Bellevue
The expected base pay range for this position in the location(s) listed above is $ to $ per year.
Equal Employment Opportunity at Tencent
As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community.
We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.