Overview
Audio Solutions is a comprehensive technical guidance framework developed by Realtek for the audio domain. This framework encompasses three main components: audio signal processing elements, system-level solutions, and practical application implementations. Its core functional modules include Audio Route, Audio Effect processing, Audio Stream management, Notification, and Voice Activity Detection (VAD). Currently, this solution has been widely implemented in various products such as TWS earphones, smart helmets, voice recorders, soundbars, speakers, and smart glasses.
Audio Signal Processing Components
Codec (Encoder-Decoder)
Realtek's Codec solutions are built around two core product lines: the ALC series (for PC/consumer electronics) and the RTL series (integrated within Bluetooth Audio SoCs). These lines collectively address full-scenario audio demands, ranging from entry-level to flagship applications. Characterized by five core strengths — high integration, low power consumption, exceptional audio fidelity, rich algorithm support, and broad compatibility — these products are widely adopted in global PC motherboards and smart audio devices (such as TWS earbuds, smart glasses, and soundbars). Realtek maintains a long-standing leading position in market share within this sector.


DSP (Digital Signal Processor)
In Bluetooth audio devices (such as headphones, speakers, and TWS earbuds), sound quality and feature performance are largely determined by the built-in Digital Signal Processor (DSP). This processor acts as the "brain" of the audio system and implements core functions including Active Noise Cancellation (ANC), dynamic equalization (EQ), Dynamic Range Control (DRC), audio enhancement, voice call enhancement, spatial audio, and Bluetooth audio codec support. By processing, optimizing, and enhancing digital audio signals in real time, it helps compensate for the limitations of Bluetooth transmission and hardware constraints, and so raises the audio performance the device can deliver.
Realtek's DSP solutions are characterized by core advantages such as high integration, low power consumption, dual-HiFi architecture, intelligent algorithms, and full-scenario compatibility. They are widely integrated into audio Codecs, Bluetooth Audio SoCs, and main control chips for smart devices, covering a diverse range of audio products including PCs, smart glasses, TWS earbuds, and soundbars. With its unique strengths in hardware-software co-optimization and scenario-specific algorithm libraries, this solution has become a mainstream choice in the consumer electronics audio processing domain.

The Realtek RTL8763D series chip is a highly integrated audio platform designed for wired and wireless audio applications featuring Bluetooth Enhanced Data Rate (EDR) and Bluetooth Low Energy (BLE) connectivity. The hardware architecture of this platform includes the following functional modules:
Audio Solution
This chapter introduces the audio solution provided by Realtek. This solution is a dedicated software framework for the audio domain, centered around standardized audio driver abstraction interfaces, virtualized audio stream routing mechanisms, and a set of high-level modular functional components. These components include audio routing, audio effects processing, audio stream control, notification prompts, voice activity detection, and more.
The diagram "Audio Subsystem Architecture" clearly illustrates the component composition of this solution. The entire architecture is divided into two layers by a solid black line: the bottom layer is the Audio Hardware Abstraction Layer, responsible for interacting with specific hardware; the top layer is the Audio Framework, which is built upon the Hardware Abstraction Layer and designed as a platform-independent software component. The Audio Framework can be further subdivided into audio paths, the audio core, and various high-level functional modules.

Audio Route
Audio Route refers to the static configuration of physical data paths, with its core function being to configure the Gateway and logical IO parameters for a specific audio routing path. This module is organized by Audio Category and includes Gateway Configuration, Endpoint Configuration, Logical IO Configuration, and Physical IO Configuration. The diagram below illustrates a simplified Audio Route path between the Codec and the Digital Signal Processor (DSP).
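As a rough illustration of what "static configuration organized by Audio Category" can look like, the sketch below models a route table as a read-only mapping from category to route entries. It is not the Realtek API: the category names, the field values (`I2S0`, `DAC0`, `ADC0`), and the `lookup_route` helper are all hypothetical and chosen only to mirror the four configuration kinds named above.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # Hypothetical category names, for illustration only.
    PLAYBACK = "playback"
    VOICE = "voice"
    RECORD = "record"


@dataclass(frozen=True)
class RouteEntry:
    gateway: str      # link between Codec and DSP (hypothetical value)
    endpoint: str     # stream endpoint reached through that gateway
    logical_io: str   # role of the IO, independent of the actual device
    physical_io: str  # concrete device the logical IO is mapped to


# Static table: built once, never modified at runtime.
ROUTE_TABLE = {
    Category.PLAYBACK: (
        RouteEntry("I2S0", "speaker", "primary_out", "DAC0"),
    ),
    Category.VOICE: (
        RouteEntry("I2S0", "speaker", "primary_out", "DAC0"),
        RouteEntry("I2S0", "mic", "primary_in", "ADC0"),
    ),
}


def lookup_route(category):
    """Return the route entries configured for a category, or raise LookupError."""
    try:
        return ROUTE_TABLE[category]
    except KeyError:
        raise LookupError(f"no route configured for {category.name}") from None
```

Keeping the table immutable and keyed by category reflects the "static" nature of the route: a stream only names its category, and the path is resolved by lookup rather than built on the fly.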

Audio Category
Audio Category is used to classify all streams transmitted between the Host and the Controller. Audio streams that share the same control methods, usage scenarios, and functionalities are grouped under the same category. The primary categories include:

Audio Stream
The Audio Stream component provides the application layer with a suite of abstract, efficient, and flexible functions for controlling and processing audio data. It offers three stream models, each described below: Audio Track, Audio Line, and Audio Pipe.
By configuring the underlying hardware stream routing paths via Audio Route, developers can leverage the APIs offered by these high-level audio stream models to efficiently handle complex audio scenarios.
Audio Track
Audio Track provides a dedicated high-level API for handling playback streams, voice communication streams, and recording streams. Specifically:
The diagram "Audio Track Overview" illustrates the overall architecture of the audio track across different modules:

Audio Line
Audio Line operates across different modules in a manner similar to Audio Track. Each Audio Line instance acquires a dedicated stream from a local input peripheral and transmits it to a local output peripheral. Local input peripherals may include a built-in microphone, an external microphone, an auxiliary input (AUX-In), or a digital audio input (SPDIF-In). Local output peripherals may include a built-in speaker, an external speaker, an auxiliary output (AUX-Out), or a digital audio output (SPDIF-Out). Audio Line supports flexible combinations of input and output peripherals.
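The local-input-to-local-output behavior described above can be pictured with a small sketch. The class and enum names here are hypothetical, and the `read_frame` / `write_frame` callables stand in for whatever peripheral drivers a real system would use; the only point is that a line instance pairs one input with one output and moves data between them while it is running.

```python
from enum import Enum


class LineInput(Enum):
    BUILTIN_MIC = 1
    EXTERNAL_MIC = 2
    AUX_IN = 3
    SPDIF_IN = 4


class LineOutput(Enum):
    BUILTIN_SPEAKER = 1
    EXTERNAL_SPEAKER = 2
    AUX_OUT = 3
    SPDIF_OUT = 4


class AudioLine:
    """Hypothetical model of one line: a single input paired with a single output."""

    def __init__(self, source, sink):
        self.source = source
        self.sink = sink
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def pump(self, read_frame, write_frame):
        """Move one frame from input to output; return True if a frame was moved."""
        if not self.running:
            return False
        write_frame(read_frame())
        return True
```

For example, `AudioLine(LineInput.AUX_IN, LineOutput.BUILTIN_SPEAKER)` would describe an AUX-In to built-in speaker path; no data moves until `start()` is called.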

Audio Pipe
The diagram below briefly illustrates how Audio Pipe operates across different modules: The application layer feeds a data stream that requires codec format conversion into a Source Endpoint. The Digital Signal Processor (DSP) converts the stream from its input codec format into the desired output format, and the result is returned to the application layer through a Sink Endpoint. Audio Pipe supports conversion between different codec types as well as conversion of specific codec attributes within the same type, all configurable as needed by the application. Furthermore, Audio Pipe supports cascaded processing: the output stream from the Sink Endpoint of one Audio Pipe can directly serve as the input stream for the Source Endpoint of the next pipe.
Audio Pipe supports the following codec types: PCM, CVSD, mSBC, SBC, AAC, OPUS, FLAC, MP3, LC3, LDAC, LHDC, G729, LC3plus.
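The cascading rule can be made concrete with a minimal sketch. Everything below is an assumption made for illustration: the `Frame` and `AudioPipe` classes and the `run_cascade` helper are hypothetical, and no transcoding is performed (in the real system that work belongs to the DSP), so the payload is passed through unchanged and only the format tags are updated.

```python
from dataclasses import dataclass

# Codec names taken from the list above.
SUPPORTED_CODECS = frozenset({
    "PCM", "CVSD", "mSBC", "SBC", "AAC", "OPUS", "FLAC",
    "MP3", "LC3", "LDAC", "LHDC", "G729", "LC3plus",
})


@dataclass(frozen=True)
class Frame:
    codec: str
    sample_rate: int
    payload: bytes


class AudioPipe:
    """Hypothetical pipe: accepts frames in one format, emits them in another."""

    def __init__(self, source_codec, sink_codec, sink_rate=None):
        for codec in (source_codec, sink_codec):
            if codec not in SUPPORTED_CODECS:
                raise ValueError(f"unsupported codec: {codec}")
        self.source_codec = source_codec
        self.sink_codec = sink_codec
        self.sink_rate = sink_rate  # None keeps the input sample rate

    def convert(self, frame):
        if frame.codec != self.source_codec:
            raise ValueError(
                f"pipe expects {self.source_codec}, got {frame.codec}")
        rate = self.sink_rate if self.sink_rate is not None else frame.sample_rate
        # Placeholder for DSP transcoding: payload is forwarded untouched.
        return Frame(self.sink_codec, rate, frame.payload)


def run_cascade(pipes, frame):
    """Feed each pipe's sink output into the next pipe's source."""
    for pipe in pipes:
        frame = pipe.convert(frame)
    return frame
```

A cascade such as `[AudioPipe("SBC", "PCM"), AudioPipe("PCM", "LC3", sink_rate=48000)]` shows both cases from the text: a change of codec type, and a change of an attribute (sample rate) along the way. A same-type attribute conversion would be `AudioPipe("PCM", "PCM", sink_rate=48000)`.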

Notification
Notification tones are short, urgent audio messages directed to the user. The Audio Subsystem currently supports the following three types of notification tones:
Both Ringtone and Voice Prompt support three modes: audible mode, mute mode, and volume-fixed mode.

VAD
VAD (Voice Activity Detection) is a core algorithm in audio signal processing that automatically distinguishes "speech signals" from "non-speech signals" (such as silence and ambient noise). It is widely used in Bluetooth audio devices (e.g., earphones, speakers), call systems, and voice assistant applications. Within a Bluetooth audio system, VAD runs as a lightweight algorithm module on the DSP. It is typically enabled only in idle mode and A2DP audio playback mode. In voice/HFP call mode and Line-in input mode, VAD is usually not required, because the former is already dedicated to call voice and the latter handles external audio input. The module analyzes the audio stream captured by the microphone in real time and outputs a detection signal to the MCU indicating the "presence/absence of speech."
VAD is not an independent module but is deeply integrated into the Bluetooth audio subsystem's chain of "microphone capture → ADC sampling → DSP processing → MCU control." Its architecture is closely tied to the hardware design of Bluetooth Audio SoCs (such as Realtek's RTL8763). The overall architecture is illustrated in the diagram below. VAD can be categorized into two types: Software VAD and Hardware VAD, which differ in their implementation approaches.
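To make the "speech present / speech absent" decision and the mode gating tangible, here is a deliberately simple software sketch. It uses a plain frame-energy (RMS) threshold with a short hangover, which is a textbook stand-in and not the algorithm used on the Realtek DSP. The mode names, the threshold value, and the class name are assumptions.

```python
import math

# Modes in which VAD is typically enabled, per the description above.
# The string names themselves are hypothetical.
VAD_ENABLED_MODES = frozenset({"idle", "a2dp_playback"})


def frame_rms(samples):
    """Root-mean-square level of one frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class EnergyVad:
    """Energy-threshold VAD with hangover (illustrative, not Realtek's method)."""

    def __init__(self, threshold=500.0, hangover_frames=3):
        self.threshold = threshold
        self.hangover_frames = hangover_frames
        self._hang = 0  # frames left to keep reporting speech after it drops

    def process(self, samples, mode):
        """Return True if speech is reported for this frame."""
        if mode not in VAD_ENABLED_MODES:
            self._hang = 0
            return False
        if frame_rms(samples) >= self.threshold:
            self._hang = self.hangover_frames
            return True
        if self._hang > 0:
            self._hang -= 1
            return True
        return False
```

The hangover keeps the detection flag asserted for a few frames after the level falls, which avoids chopping the tail of a word. The boolean returned by `process` plays the role of the detection signal sent to the MCU.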

Audio Effect
The diagram below illustrates the binding relationship between Audio Effect and Audio Track: The application can enable, disable, or clear specified effects via the API of the Audio Effect submodule, while simultaneously starting, stopping, or restarting the corresponding data stream via the API of the Stream submodules. To apply a specific audio effect to a data stream, the application must actively invoke the Stream submodule API to establish the binding relationship between them. The underlying Audio Path module will then pass the bound effect information to the Digital Signal Processor (DSP) for execution at the appropriate time.
The Audio Subsystem supports the dynamic binding of effects to data streams. Any audio effect can be associated with any type of data stream, and binding/unbinding can be managed flexibly at runtime, thereby providing the application with greater freedom in audio control. This design, based on the abstract Audio Effect model, decouples the data stream from specific effects, facilitating independent expansion of the subsystem along both the data stream and effect dimensions.
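The decoupling described above can be sketched as two independent objects joined only by an explicit bind step. The `Effect` and `Stream` classes and their method names are hypothetical and do not correspond to actual SDK calls; the sketch only shows that enabling an effect and binding it to a stream are separate actions, and that binding can change at runtime.

```python
class Effect:
    """Hypothetical effect handle: knows nothing about streams."""

    def __init__(self, name):
        self.name = name
        self.enabled = False


class Stream:
    """Hypothetical stream handle: holds only references to bound effects."""

    def __init__(self, name):
        self.name = name
        self.effects = []

    def attach(self, effect):
        # Binding is an explicit application action on the stream side.
        if effect not in self.effects:
            self.effects.append(effect)

    def detach(self, effect):
        if effect in self.effects:
            self.effects.remove(effect)

    def active_effects(self):
        """Names of bound effects that are currently enabled."""
        return [e.name for e in self.effects if e.enabled]
```

Because neither class depends on the other's internals, a new effect type or a new stream type can be added without touching the other side, which is the extensibility argument made in the paragraph above.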

Built-In Effect
Built-in effects include Equalizer (EQ), Noise Reduction and Echo Cancellation (NREC), Wide Dynamic Range Compression (WDRC), Sidetone, and Beamforming. The interaction flow for these effects is largely consistent and is controlled through a common set of lifecycle APIs. The application first calls the relevant functions to create and enable an audio effect, then associates it with the target playback data stream. During data stream playback, the application can dynamically update the effect parameters, temporarily disable the effect at any time, or ultimately release the effect resources.
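The lifecycle above (create, enable, update, disable, release) can be read as a small state machine. The sketch below uses an equalizer with per-band gains as the example; the class name, state names, and parameter layout are assumptions made for illustration and are not the SDK's types.

```python
from enum import Enum, auto


class EffectState(Enum):
    CREATED = auto()
    ENABLED = auto()
    DISABLED = auto()
    RELEASED = auto()


class EqEffect:
    """Hypothetical EQ effect following the create/enable/update/disable/release flow."""

    def __init__(self, gains_db):
        self.gains_db = list(gains_db)  # one gain per band, in dB
        self.state = EffectState.CREATED

    def _require_alive(self):
        if self.state is EffectState.RELEASED:
            raise RuntimeError("effect already released")

    def enable(self):
        self._require_alive()
        self.state = EffectState.ENABLED

    def disable(self):
        self._require_alive()
        self.state = EffectState.DISABLED

    def update(self, gains_db):
        # Parameters may change while the stream keeps playing.
        self._require_alive()
        self.gains_db = list(gains_db)

    def release(self):
        # After release the handle must not be used again.
        self.state = EffectState.RELEASED
```

A typical sequence would be `eq = EqEffect([0, 0, 0])`, `eq.enable()`, later `eq.update([3, 0, -3])` during playback, then `eq.disable()` and finally `eq.release()`.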
Vendor Specific Effect
The interaction flow between Vendor Specific Effects (VSE) and the application differs from that of built-in effects. When integrating Vendor Specific Effects, the Audio Subsystem primarily serves as a transport layer, responsible for transparently relaying information between the application and the vendor's custom algorithm library — the format of this information is defined by the algorithm vendor.
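The "transport layer" role can be illustrated as follows: the subsystem forwards bytes without interpreting them, and only the application and the vendor library agree on the layout. The `VendorEffectTransport` class is hypothetical, and the message format used by `demo_vendor_handler` (one command byte plus a 16-bit little-endian value) is invented purely for this example.

```python
import struct


class VendorEffectTransport:
    """Hypothetical relay: carries opaque payloads to a vendor handler and back."""

    def __init__(self, vendor_handler):
        # vendor_handler: callable taking bytes and returning bytes.
        self._vendor_handler = vendor_handler

    def send(self, payload):
        if not isinstance(payload, (bytes, bytearray)):
            raise TypeError("payload must be bytes")
        # The payload is forwarded as-is; its format is the vendor's business.
        return bytes(self._vendor_handler(bytes(payload)))


def demo_vendor_handler(message):
    """Invented vendor format: <command:uint8><value:uint16 LE>; reply sets bit 7."""
    command, value = struct.unpack("<BH", message)
    return struct.pack("<BH", command | 0x80, value)
```

Note that `VendorEffectTransport` never calls `struct`: all knowledge of the message layout lives in the application and in the vendor handler, which is the separation the paragraph above describes.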
Application Products
Realtek's audio solution is built on System-on-Chip (SoC) designs optimized for audio scenarios, combined with advanced audio processing technologies and high-performance Digital Signal Processors (DSPs). This enables smooth handling of high-quality audio while supporting features such as Active Noise Cancellation (ANC) and echo cancellation, delivering an immersive auditory experience for users. Furthermore, this series of SoCs is compatible with multiple audio format decoding and mainstream audio streaming protocols, helping to expand product applicability and market coverage.
Realtek's Bluetooth SoCs have been widely adopted across various audio devices. Representative products include smart voice recorders, smart helmets, soundbars, smart speakers, Bluetooth hearing aids, smart glasses, and smart charging cases. Their performance is demonstrated in the following areas:
Leveraging these strengths, Realtek's Bluetooth SoC audio solution not only meets the core audio requirements of various smart devices but also establishes high audio quality, low power consumption, and intelligent connectivity as key competitive advantages, providing reliable support for the ongoing evolution of consumer electronics audio experiences.
TWS
The Realtek TWS Earbuds Audio Solution features high integration, low power consumption, noise cancellation across usage scenarios, and high cost-effectiveness, providing complete technical support for TWS earbud products across different market segments.

Voice Recorder
The Realtek Voice Recorder Audio Solution centers on high-fidelity recording and integrates professional-grade noise reduction algorithms, intelligent voice control, low-power design, and flexible connectivity expansion. It covers recording needs from entry-level to professional applications. Whether for lecture notes, business meetings, or professional interviews, it provides corresponding chip platforms and technical support to help manufacturers rapidly develop differentiated recording products.

Helmet
Realtek Smart Helmet Audio Solution is deeply optimized for riding scenarios, comprehensively adapting to the communication and entertainment needs of motorcycle and e-bike riders in terms of connection stability, professional noise cancellation, and low-power design. Centered on professional audio processing, efficient noise reduction algorithms, stable Bluetooth connectivity, and a low-power architecture, this solution provides clear, safe, and convenient audio experiences for smart helmet products across different market segments and assists manufacturers in accelerating time-to-market.

Soundbar
Realtek Soundbar Audio Solution builds upon high-fidelity audio, integrating immersive surround sound technology, professional audio processing, stable low-latency connectivity, and flexible expansion capabilities. It caters to a full range of Soundbar applications, from entry-level to professional-grade. Whether for home theater systems, TV audio enhancement, or gaming and entertainment scenarios, this solution provides suitable chip platforms and technical support to assist manufacturers in rapidly launching differentiated products.

Speaker
Realtek Speaker Audio Solution is built upon high-fidelity sound quality, integrating intelligent audio processing, stable low-latency connectivity, flexible expansion, and multi-room audio capabilities. It comprehensively covers speaker application scenarios from entry-level to professional-grade. Whether for everyday music playback, home theater systems, or smart home integration, this solution provides matching chip platforms and technical support to assist manufacturers in efficiently launching differentiated speaker products.

Glasses
Realtek Smart Glasses Audio Solution is built upon an ultra-thin integrated design, combining open acoustic optimization, AI-enhanced voice processing, LE Audio low-latency connectivity, and ultra-low power consumption for extended battery life. This solution addresses the core requirements of smart glasses in terms of portability, audio quality, and interactive experience. Whether for entry-level voice-enabled glasses or high-end AR glasses, it provides corresponding chip platforms and technical support to assist manufacturers in rapidly launching differentiated products.
