PixelPlayer Official Website Experience Portal - Online AI Audio-Visual Separation Tool
-
PixelPlayer is a system that, by watching large amounts of unlabeled video, learns to locate the image regions that produce sound and to separate the input audio into components representing the sound from each pixel. It exploits the natural synchronization between the visual and auditory modalities to learn models that jointly parse sound and images, with no manual annotation required. After training on a large collection of videos, PixelPlayer can separate the sounds of different instruments in mixed audio, supports exploring the relationship between audio and visual perception, and assigns a distinct audio waveform to each pixel of an input video.
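One way to picture "a sound for each pixel" is as a per-pixel mask applied to the spectrogram of the mixed audio. The sketch below is illustrative only and is not taken from PixelPlayer's code. It assumes a hypothetical feature vector for every pixel (`visual_feats`) and a matching set of audio feature maps (`audio_feats`), combines them with a dot product and a sigmoid, and scales the mixture's magnitude spectrogram by the resulting mask. All names, shapes, and the random inputs are assumptions made for the example.

```python
import numpy as np

def pixel_masks(visual_feats, audio_feats):
    """Return one soft mask per pixel.

    visual_feats: (H, W, K) hypothetical per-pixel visual features
    audio_feats:  (K, F, T) hypothetical audio feature maps over frequency/time
    result:       (H, W, F, T) masks with values strictly between 0 and 1
    """
    # Dot product over the shared feature dimension K for every pixel.
    logits = np.einsum("hwk,kft->hwft", visual_feats, audio_feats)
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid

def pixel_spectrograms(mixture_mag, visual_feats, audio_feats):
    """Scale the mixture magnitude spectrogram (F, T) by each pixel's mask."""
    masks = pixel_masks(visual_feats, audio_feats)
    return masks * mixture_mag  # (F, T) broadcasts over (H, W, F, T)

# Toy inputs standing in for real network outputs (assumption).
rng = np.random.default_rng(0)
H, W, K, F, T = 2, 3, 4, 5, 6
visual = rng.normal(size=(H, W, K))
audio = rng.normal(size=(K, F, T))
mixture = np.abs(rng.normal(size=(F, T)))

per_pixel = pixel_spectrograms(mixture, visual, audio)
print(per_pixel.shape)  # (2, 3, 5, 6)
```

Turning each masked spectrogram back into a waveform would additionally require the mixture's phase and an inverse short-time Fourier transform, which this sketch leaves out.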
PixelPlayer is aimed at users who need unsupervised audio-visual separation or who want to analyze how sound and image relate. It can help researchers, audio engineers, and music enthusiasts understand how the sounds of different instruments are separated from mixed audio, and explore how individual pixel regions contribute to the overall auditory experience. To use PixelPlayer, provide training videos and a monaural (single-channel) audio input. The system then performs audio-visual source separation and localization automatically, splitting the input sound into N sound channels, each corresponding to a different instrument category. Users can try the system in real time through the experience portal on the official website.
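The split of one monaural input into N channels can be sketched with soft masks over a magnitude spectrogram. This is a minimal illustration under stated assumptions, not PixelPlayer's actual implementation: the per-channel `scores` are hypothetical stand-ins for whatever a trained model would predict, and the softmax across channels is a choice made here so that the N outputs add back up to the input.

```python
import numpy as np

def separate_into_channels(mixture_mag, scores):
    """Split a mixture spectrogram into N channels with softmax masks.

    mixture_mag: (F, T) magnitude spectrogram of the monaural input
    scores:      (N, F, T) hypothetical per-channel scores (assumption)
    result:      (N, F, T) channels that sum to mixture_mag over axis 0
    """
    # Softmax over the channel axis, shifted for numerical stability.
    shifted = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    masks = weights / weights.sum(axis=0, keepdims=True)
    return masks * mixture_mag  # broadcast (F, T) across the N masks

# Toy data standing in for a real spectrogram and model output (assumption).
rng = np.random.default_rng(1)
N, F, T = 3, 4, 5
mixture = np.abs(rng.normal(size=(F, T)))
scores = rng.normal(size=(N, F, T))

channels = separate_into_channels(mixture, scores)
print(channels.shape)                               # (3, 4, 5)
print(np.allclose(channels.sum(axis=0), mixture))   # True
```

Because the masks sum to one at every time-frequency bin, no energy from the input is lost or duplicated across the N channels. Other mask designs, such as independent binary or ratio masks, do not have this property.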