Applied Scientist Intern (Audio)
Reality Defender
About Reality Defender
Reality Defender provides accurate, multi-modal AI-generated media detection solutions to enable enterprises and governments to identify and prevent fraud, disinformation, and harmful deepfakes in real time. A Y Combinator graduate and Comcast NBCUniversal LIFT Labs alumnus backed by DCVC, Reality Defender is the first company to pioneer multi-modal and multi-model detection of AI-generated media. Our web app and platform-agnostic API, built by our research-forward team, ensure that our customers can swiftly and securely mitigate fraud and cybersecurity risks in real time with a frictionless, robust solution.
YouTube: Reality Defender Wins RSA Most Innovative Startup
Why we stand out:
Our best-in-class accuracy is derived from our sole, research-backed mission and our use of multiple models per modality.
We can detect AI-generated fraud and disinformation in real time or near real time across all modalities, including audio, video, image, and text.
Our platform is designed for ease of use, featuring a versatile API that integrates seamlessly with any system, an intuitive drag-and-drop web application for quick ad hoc analysis, and platform-agnostic real-time audio detection tailored for call center deployments.
We’re privacy first, ensuring the strongest standards of compliance and keeping customer data away from the training of our detection models.
Role and Responsibilities
Explore and conceptualize novel methods to leverage different modalities (e.g., speech, text) for deepfake detection and relevant audio understanding tasks.
Perform fundamental and applied research to advance the current state-of-the-art on audio deepfake detection.
Build models with generalizability to unseen generative methods.
Collaborate with scientists and engineers across the organization.
Summarize, publish, and present research findings
About You
Currently enrolled in a PhD program with specialization in machine learning/deep learning, natural language processing, and/or speech processing.
2+ years of experience with training/fine-tuning large models, especially audio language models, speech foundation models, and multi-modal foundation models.
Experience with the end-to-end model-building pipeline for ML tasks: dataset curation/cleaning, model implementation, benchmarking, and result analysis.
Familiarity with distributed multi-GPU training.
Prior experience with publications in reputable ML/Audio/NLP research venues, e.g., NeurIPS, Interspeech, ICASSP, ACL, EMNLP.