AWE Core 8.D.10 Documentation
This document is meant to be a "Quick-Start" on how to integrate AWE Core. Advanced concepts are left out. Before reading this document, please see the Theory Of Operation document.
The AWE Core™ is a hardware-independent, reconfigurable audio-processing engine. The AWE Core includes over 400 audio-processing Modules, from mixers and filters to compressors and FFTs, that can be combined to easily create complex signal processing flows. The AWE Core operates by passing audio through a user-defined sequence of these Modules, as directed by Configuration-data at run-time.
Term | Definition
---|---
API | Application Programming Interface
AWE Instance | The main AWE Core object
Tuning Interface | The interface by which commands/replies are transferred
Layout | Audio Weaver signal processing flow
Sub-Layout | Clock-divided section of a layout
AWB | Audio Weaver Binary File
AWD | Audio Weaver Design File
AWE | Audio Weaver
AWS | Audio Weaver Script File
BSP | Board Support Package
IPC | Inter-Process Communication
RTOS | Real-time operating system
RT | Real-time
This document goes over the basics of integrating the AWE Core. A Doxygen-generated map of all the available API functions, macros, data structures, etc. is included in the API Doc.
For a brief view of the API, see the 'AWECore_cheatsheet.pdf' in this package.
For a more detailed description of the Tuning Protocol and the various transport methods, please see the Tuning Protocol document here.
Here are the basic steps to integrate AWE Core.
A number of variables must be defined before configuring/initializing the AWE Flash File System (FS) Instance; see FlashFSInstance.h.
// Declare and initialize to zeros
AWEFlashFSInstance g_AWEFlashFSInstance = {0};
// Specify flash memory available for the flash file system
#define FLASH_MEMORY_SIZE_IN_BYTES  0x4000000
#define ERASEABLE_SECTOR_SIZE       0x10000

// 704KB from the beginning is reserved for the application loader file
#define FILE_SYSTEM_START_OFFSET    0xB0000
#define SECTOR_ERASE_TIME_MS        400
Later, in the initialization section, it will become clear that there are several fields of the AWEFlashFSInstance structure that must be configured before it can be used.
The AWE Flash File System requires callbacks; they must be defined before initializing the AWE Flash FS Instance.
///-----------------------------------------------------------------------------
/// @name  BOOL usrInitFlashFileSystem(void)
/// @brief Callback to initialize the flash device
///
/// @retval TRUE  - Initialization succeeded
/// @retval FALSE - Initialization failed
///-----------------------------------------------------------------------------
BOOL usrInitFlashFileSystem(void)
{
    // Implement any flash-specific initialization
    // Return FALSE upon failure
    return 1;
}   // End usrInitFlashFileSystem
///-----------------------------------------------------------------------------
/// @name  BOOL usrReadFlashMemory(UINT32 nAddress, UINT32 * pBuffer, UINT32 nDWordsToRead)
/// @brief Read 4-byte words from flash memory
///
/// @param[in] UINT32 nAddress       - address in flash to start reading
/// @param[in] UINT32 *pBuffer       - buffer to read into
/// @param[in] UINT32 nDWordsToRead  - number of 4-byte elements to read
///
/// @retval TRUE  - read succeeded
/// @retval FALSE - read failed
///-----------------------------------------------------------------------------
BOOL usrReadFlashMemory(UINT32 nAddress, UINT32 * pBuffer, UINT32 nDWordsToRead)
{
    // If the count is zero there is nothing to do
    if (nDWordsToRead == 0)
    {
        return 1;
    }

    // Flash-specific read implementation
    // Return FALSE upon failure
    return 1;
}   // End usrReadFlashMemory
///-----------------------------------------------------------------------------
/// @name  BOOL usrWriteFlashMemory(UINT32 nAddress, UINT32 * pBuffer, UINT32 nDWordsToWrite)
/// @brief Write 4-byte words to flash memory
///
/// @param[in] UINT32 nAddress        - address in flash to start writing
/// @param[in] UINT32 *pBuffer        - buffer containing the data to write
/// @param[in] UINT32 nDWordsToWrite  - number of 4-byte elements to write
///
/// @retval TRUE  - write succeeded
/// @retval FALSE - write failed
///-----------------------------------------------------------------------------
BOOL usrWriteFlashMemory(UINT32 nAddress, UINT32 * pBuffer, UINT32 nDWordsToWrite)
{
    // If the count is zero there is nothing to do
    if (nDWordsToWrite == 0)
    {
        return 1;
    }

    // Flash-device-specific write implementation
    // Return FALSE upon failure
    return 1;
}   // End usrWriteFlashMemory
///-----------------------------------------------------------------------------
/// @name  BOOL usrEraseFlashSector(UINT32 nStartingAddress, UINT32 nNumberOfSectors)
/// @brief Erase flash memory starting at the given address for a number of sectors
///
/// @param[in] UINT32 nStartingAddress - address in flash to start erasing
/// @param[in] UINT32 nNumberOfSectors - number of flash memory sectors to erase
///
/// @retval TRUE  - erase succeeded
/// @retval FALSE - erase failed
///-----------------------------------------------------------------------------
BOOL usrEraseFlashSector(UINT32 nStartingAddress, UINT32 nNumberOfSectors)
{
    UINT32 nSectorAddress, index;

    nSectorAddress = nStartingAddress;

    // Loop through the number of sectors and erase each sector
    for (index = 0; index < nNumberOfSectors; index++)
    {
        // Flash-device-specific sector erase implementation
        // with sector start address 'nSectorAddress'
        // Return FALSE upon failure

        // Go to the next sector start address
        nSectorAddress += ERASEABLE_SECTOR_SIZE;
    }

    return 1;
}   // End usrEraseFlashSector
Now that the required variables have been declared, the AWE Flash FS Instance will be configured by assigning its members and pointers.
First, ensure that the AWEFlashFSInstance is initialized to zeros. If not, explicitly initialize to 0 with memset.
memset(&g_AWEFlashFSInstance, 0, sizeof(AWEFlashFSInstance));
g_AWEFlashFSInstance.cbInit        = &usrInitFlashFileSystem;
g_AWEFlashFSInstance.cbEraseSector = &usrEraseFlashSector;
g_AWEFlashFSInstance.cbFlashWrite  = &usrWriteFlashMemory;
g_AWEFlashFSInstance.cbFlashRead   = &usrReadFlashMemory;
g_AWEFlashFSInstance.flashSizeInBytes              = FLASH_MEMORY_SIZE_IN_BYTES;
g_AWEFlashFSInstance.flashErasableBlockSizeInBytes = ERASEABLE_SECTOR_SIZE;
g_AWEFlashFSInstance.flashStartOffsetInBytes       = FILE_SYSTEM_START_OFFSET;
g_AWEFlashFSInstance.flashEraseTimeInMs            = (INT32)((FLOAT32)((((FLASH_MEMORY_SIZE_IN_BYTES - FILE_SYSTEM_START_OFFSET) / ERASEABLE_SECTOR_SIZE) * SECTOR_ERASE_TIME_MS / 1000) + 0.5f) + 5);
Next, initialize the AWE Flash FS Instance by calling the awe_initFlashFS() function. **This must happen before calling awe_init().** If awe_initFlashFS() is called after awe_init, the initialization will fail.
awe_initFlashFS(&g_AWEInstance, &g_AWEFlashFSInstance);
g_AWEInstance.pFlashFileSystem = &g_AWEFlashFSInstance;
The code above will initialize the AWE Flash FS Instance.
A number of variables must be defined before configuring/initializing the AWE Instance.
// Declare and initialize to zeros
AWEInstance g_AWEInstance = {0};

Later, in the initialization section, it will become clear that there are several fields of the AWEInstance structure that must be configured before it can be used to process audio data.
static IOPinDescriptor aweInputPin;
static IOPinDescriptor aweOutputPin;
// The list of class objects is defined in ModuleList.h
const void* g_module_descriptor_table[] =
{
    LISTOFCLASSOBJECTS
};
UINT32 g_FastHeapA[FAST_HEAP_A_SIZE];
UINT32 g_FastHeapB[FAST_HEAP_B_SIZE];
UINT32 g_SlowHeap[SLOW_HEAP_SIZE];

where FAST_HEAP_A_SIZE, FAST_HEAP_B_SIZE, and SLOW_HEAP_SIZE are chosen appropriately for the system. See the Optimization section for more details.
#define PACKET_BUFFER_SIZE 264
UINT32 AWE_Packet_Buffer[PACKET_BUFFER_SIZE];
UINT32 AWE_Packet_Buffer_Reply[PACKET_BUFFER_SIZE];

Note: The standard packet size is 264. If a different packet size is needed, please contact DSPC Engineering.
Now that the required variables have been declared, the AWE Instance will be configured by assigning its members and pointers.
g_AWEInstance.instanceId = 0;

g_AWEInstance.pInputPin  = &aweInputPin;
g_AWEInstance.pOutputPin = &aweOutputPin;

g_AWEInstance.pPacketBuffer    = AWE_Packet_Buffer;
g_AWEInstance.pReplyBuffer     = AWE_Packet_Buffer_Reply;
g_AWEInstance.packetBufferSize = PACKET_BUFFER_SIZE;

g_AWEInstance.pModuleDescriptorTable = g_module_descriptor_table;

UINT32 module_descriptor_table_size = sizeof(g_module_descriptor_table) / sizeof(g_module_descriptor_table[0]);
g_AWEInstance.numModules = module_descriptor_table_size;

g_AWEInstance.numThreads = 2;    // dual threaded, supports two block sizes

g_AWEInstance.sampleRate = 48000.0f;

#define AWE_BLOCK_SIZE 32
g_AWEInstance.fundamentalBlockSize = AWE_BLOCK_SIZE;

// Set to 0 (NULL) if no flash file system is used; otherwise point to the
// configured AWE Flash FS Instance (see the Flash File System section above)
g_AWEInstance.pFlashFileSystem = 0;

g_AWEInstance.fastHeapASize = FAST_HEAP_A_SIZE;
g_AWEInstance.fastHeapBSize = FAST_HEAP_B_SIZE;
g_AWEInstance.slowHeapSize  = SLOW_HEAP_SIZE;

g_AWEInstance.pFastHeapA = g_FastHeapA;
g_AWEInstance.pFastHeapB = g_FastHeapB;
g_AWEInstance.pSlowHeap  = g_SlowHeap;

g_AWEInstance.coreSpeed    = 10e6f;
g_AWEInstance.profileSpeed = 10e6f;

g_AWEInstance.pName = "mytarget";

// This example represents a date, 06/21/2019 (the leading zero is dropped
// because it would form an invalid octal literal in C)
g_AWEInstance.userVersion = (UINT32)6212019;
g_AWEInstance.cbAudioStart = &usrCallbackAudioStart;
g_AWEInstance.cbAudioStop  = &usrCallbackAudioStop;

///-----------------------------------------------------------------------------
/// METHOD:  INT32 usrCallbackAudioStart(AWEInstance *pAWE)
/// PURPOSE: Start audio processing callback
///-----------------------------------------------------------------------------
INT32 usrCallbackAudioStart(AWEInstance *pAWE)
{
    // Target-specific configuration, if any, before starting the layout pump,
    // such as enabling audio IO interrupts
    return 0;
}   // End usrCallbackAudioStart

///-----------------------------------------------------------------------------
/// METHOD:  INT32 usrCallbackAudioStop(AWEInstance *pAWE)
/// PURPOSE: Stop audio processing callback
///-----------------------------------------------------------------------------
INT32 usrCallbackAudioStop(AWEInstance *pAWE)
{
    // Target-specific configuration, if any, before destroying the layout
    return 0;
}   // End usrCallbackAudioStop
g_AWEInstance.cbCacheInvalidate = &usrCallbackCacheInvalidate;

///----------------------------------------------------------------------------
/// METHOD:  usrCallbackCacheInvalidate
/// PURPOSE: Invalidate cache region
///----------------------------------------------------------------------------
INT32 usrCallbackCacheInvalidate(AWEInstance *pAWE, void* pStartAddr, UINT32 lengthInWords)
{
    // Target-specific cache invalidation logic.
    // Validate the input argument pStartAddr. If it falls in the cached region,
    // invalidate the cache region with start address pStartAddr and end address
    // (pStartAddr + length - 1).
    return 0;
}   // End usrCallbackCacheInvalidate
g_AWEInstance.cbGetLayoutThreadPriority = &usrCallbackGetLayoutThreadPriority;

///----------------------------------------------------------------------------
/// METHOD:  usrCallbackGetLayoutThreadPriority
/// PURPOSE: Return the thread priority corresponding to the layoutNum
///----------------------------------------------------------------------------
INT32 usrCallbackGetLayoutThreadPriority(AWEInstance *pAWE, INT32 layoutNum)
{
    INT32 threadPriority;

    if (layoutNum == 0)
    {
        // Priority of the first layout, where awe_audioPump(&g_AWEInstance, 0); is called
        threadPriority = firstLayoutThreadPriority;
    }
    else if (layoutNum == 1)
    {
        // Priority of the second layout, where awe_audioPump(&g_AWEInstance, 1); is called
        threadPriority = secondLayoutThreadPriority;
    }
    else if (layoutNum == 2)
    {
        // Priority of the third layout, where awe_audioPump(&g_AWEInstance, 2); is called
        threadPriority = thirdLayoutThreadPriority;
    }
    // ... and so on, up to NUM_THREADS

    return threadPriority;
}   // End usrCallbackGetLayoutThreadPriority
The next step is to initialize the IO pins and then the AWE Instance using the corresponding API functions.
Simply call the awe_initPin() function and pass it the input pin, the desired channel count (determined by audio HW), and an optional pin name.
#define TWO_CHANNELS 2
int ret = awe_initPin(&aweInputPin, TWO_CHANNELS, NULL);
The code above would initialize the input pin with 2 channels, and the default name.
Initializing the output pin is the same as the input pin, but the output pin is passed in.
#define SIX_CHANNELS 6
int ret = awe_initPin(&aweOutputPin, SIX_CHANNELS, "outpin");
The code above would initialize the output pin with 6 channels with the name "outpin".
NOTE: Multiple IO pins are not supported. There can only be one input pin and one output pin. Any API that takes a "pinIdx" argument should always be passed 0.
Next, initialize the AWE Instance by calling the awe_init() function. This must happen as the last initialization step. If awe_init() is called before the AWE Instance structure is configured or before the IO pins are initialized, the initialization will fail.
int ret = awe_init(&g_AWEInstance);
The code above will initialize the AWE Instance.
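It is good practice to check the return value before continuing. The sketch below simply assumes the usual convention that a non-zero return value is an error code; see the API doc for the full list of error codes.

// Minimal error-handling sketch (assumes a non-zero return value is an error code)
if (ret != 0)
{
    // awe_init() failed; report the error and abort bring-up, since neither
    // audio nor tuning can run without a valid AWE Instance
    printf("awe_init failed with error code %d\n", ret);
    return ret;
}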
After initialization of the AWE Instance, users can optionally register callbacks supported by the AWE Core. AWE Core supports callbacks for logging functionality (awe_registerLoggingCallback), and for Event module support (awe_registerEventCallbacks).
In both the logging and event callbacks described below, the payload passed to the user callback function is not guaranteed to persist after the user function returns. Any handling or copying of the payload must be completed before the function returns. The user is responsible for returning from these functions quickly; any expensive processing of logging or events should be done in a separate thread to avoid interrupting audio processing.
The AWE Core logging functionality provides log information to BSPs that have registered a callback function. Log level (Error, Warning, Info, Debug), as well as a log type (bitfield), are provided as part of the log messages. The user can configure which log levels and log types to filter out using the awe_registerLoggingCallback function. The register function can be called at any time after awe_init, and the registered log level and log types can be updated with subsequent calls.
To receive callbacks from AWE Core for logging messages, the application needs to define a function matching the signature of cbAweLogging_t. A simple example of a user function that prints a timestamp and the content of the logging payload is shown below. Note that the payload is always an ASCII message with no trailing newline.
#include <stdio.h>
#include <sys/time.h>

void usrLogging(AWEInstance* pAWE, INT32 level, UINT32 type, void* payload, INT32 payloadSizeInBytes)
{
    // Simple logging function: always writes the payload to stdout
    struct timeval ts;
    char timeCh[25];

    // Get the current time for the log entry
    gettimeofday(&ts, NULL);
    snprintf(timeCh, 25, "%ld.%03ld: ", ts.tv_sec, ts.tv_usec / 1000);
    printf("%s%s\n", timeCh, (char *)payload);
}
The memory used for the payload passed to the logging function is not guaranteed to persist after the user logging function returns. Any handling or copying of the payload must be completed before the function returns.
See LinuxApp.c for more details of an example use case.
The AWE Core has an Event Module that can trigger callbacks to the system when events occur in the layout. The meaning of an event is defined by the Event Module user, and the application has to understand how to respond to the different event types. As with the logging feature, event callbacks must be registered with the AWEInstance for the Event Module to be able to trigger any events. The Event Module supports 3 callbacks:
These callbacks must be registered using awe_registerEventCallbacks. See LinuxApp.c for an example use case.
The next step is to implement a tuning interface – arguably the most important component of a successful AWE Core integration as it provides essential debugging capabilities. The tuning interface communicates commands/replies to and from the AWE Instance. For example, these commands can instantiate a layout, or set and query the value of a module parameter.
Different platforms will support different transport protocols for the tuning interface. AWE Server supports USB, TCP/IP (sockets), RS232, SPI, etc. Helper code is available to aid in the development of these different transport layers on the target. It is the responsibility of the integrator to enable the transport protocol for the tuning commands to be passed to and from the platform.
Here are the basic steps for setting up and using a tuning interface with AWE Server.
1) Receive a tuning command packet over the transport into AWE_Packet_Buffer, and then call awe_packetProcess(&g_AWEInstance) on the instance. Remember that this packet buffer has been registered with the AWE Instance, which is why the awe_packetProcess(&g_AWEInstance) function does not need to take an argument for the packet buffer.
2) Send the generated reply, found in AWEInstance.pReplyBuffer, back over the transport. If the same buffer is used for both send and reply, the reply message will overwrite the original command.

For example:

int sizeOfPacket = 264;
readPacket(&AWE_Packet_Buffer, sizeOfPacket);
awe_packetProcess(&g_AWEInstance);
writePacket(&AWE_Packet_Buffer_Reply, sizeOfPacket);
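How these two steps are scheduled is up to the BSP. One common arrangement is a dedicated tuning task, running below the audio priority (see the recommendations at the end of this document), that services one command per iteration. In the sketch below, packetReceived(), readPacket(), and writePacket() are assumed BSP transport helpers rather than AWE Core APIs.

// BSP-owned tuning task sketch: service one tuning command per iteration.
// Only awe_packetProcess() is an AWE Core call; the transport helpers are
// placeholders for the BSP's own UART/TCP/USB/SPI implementation.
void TuningTask(void)
{
    int sizeOfPacket = 264;

    for (;;)
    {
        if (packetReceived())
        {
            // Copy the received command into the registered packet buffer
            readPacket(&AWE_Packet_Buffer, sizeOfPacket);

            // Process the command; the reply is written to the registered reply buffer
            awe_packetProcess(&g_AWEInstance);

            // Send the reply back to AWE Server over the same transport
            writePacket(&AWE_Packet_Buffer_Reply, sizeOfPacket);
        }
    }
}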
The next step is to integrate real-time audio. AWE Core aside, real-time audio can be a tricky topic. Before attempting to integrate real-time audio into an AWE Core system, please ensure that the integrator has a basic understanding of digital audio and real-time audio processing. Here are some helpful links.
http://www.rossbencina.com/code
Giulio Moro - Assessing the stability of audio code for real time low latency embedded processing
Compare the pump mask returned by awe_audioGetPumpMask() against 1 << N, for N = 0 : numThreads-1, to determine which layouts are ready to be pumped. Based on the system's capabilities, each layout should be pumped at its own interrupt level/thread priority. See the Multi-Rate section for more information.

The data type of the input and output audio data is determined by the audio hardware. Typically, digital audio is represented as fixed-point data with a 16, 24, or 32-bit depth. The audio sample formats supported by AWE Core's awe_audioImportSamples() and awe_audioExportSamples() functions, as defined in SampleType in StandardDefs.h, are:
Internally, all inputs and outputs to Audio Weaver layouts are of type Sample32bit, also referred to as fract32 within a layout. This is done to guarantee portability of any signal processing layout to any target system, regardless of the hardware's native audio data type. The awe_audioImportSamples() and awe_audioExportSamples() functions will convert to and from the Sample32bit data as needed based on the integrator's supplied SampleType argument. If the target's native audio sample format is not one of those listed above, then the integrator will have to manually convert the data to one of the supported types before using the import and export functions.
Since some common target systems natively support floating-point audio, helper functions are provided in AWECoreUtils.c (included in the AWE Core package) to convert between float and fract32. Add AWECoreUtils.c to the build project to access the sample-by-sample conversion functions float_to_fract32 and fract32_to_float. See the API doc for AWECoreUtils.h for more info.
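As an illustration, a block of native floating-point samples could be converted before being imported. The sketch below assumes the helper converts one sample at a time (float in, fract32 out, represented here as INT32), and floatInputBuffer is a placeholder for the BSP's own codec buffer.

// Illustrative conversion of one block of native float samples to fract32
// (held in an INT32 buffer) before passing it to awe_audioImportSamples().
// float_to_fract32() is assumed to convert a single sample, as described above;
// floatInputBuffer is a placeholder for the BSP's own input buffer.
INT32 importBuffer[AWE_BLOCK_SIZE];
int i;

for (i = 0; i < AWE_BLOCK_SIZE; i++)
{
    importBuffer[i] = float_to_fract32(floatInputBuffer[i]);
}

// importBuffer now holds Sample32bit data ready for awe_audioImportSamples()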
The AWE Core allows audio to be processed at multiple block rates. For example, consider a system with two processing paths: one that processes time-domain data in blocks of 32 samples and another that processes frequency-domain data in blocks of 64 samples but at 1/2 the rate of the time-domain path. Such a system is shown in the figure below. It uses BufferUp and BufferDown modules to connect the different block size domains. These modules effectively partition the layout into 2 sub-layouts operating at different block rates.
The two paths are executed at different rates and AWE Core’s awe_audioGetPumpMask() API call provides a mechanism to determine when the processing for each path should be initiated. Consider the following pseudocode:
layoutMask = awe_audioGetPumpMask(&g_AWEInstance);

if (layoutMask & 0x1) raise(AWEProcess_HiPrio);
if (layoutMask & 0x2) raise(AWEProcess_LowPrio);

AWEProcess_HiPrio()
{
    awe_audioPump(&g_AWEInstance, 0);   // small block size path
}

AWEProcess_LowPrio()
{
    awe_audioPump(&g_AWEInstance, 1);   // large block size path
}
This code tests which sub-layouts have accumulated enough data to execute. The layoutMask variable contains a 1 in each bit position corresponding to a sub-layout that is ready to execute. For example, if sub-layouts 0 and 1 are ready to execute, layoutMask would be 0x00000003.
Lower numbered sub-layouts correspond to smaller block sizes. In the pseudo-code, a signal is raised for each sub-layout that is ready to be pumped.
Profiling designs on high-level OSes is challenging because layout threads may run on any available core at any given time. An OS-level mechanism to lock a particular thread to a certain core may be available. If so, one can notify AWECore that a layout thread will run on a specific core, and more accurate profiling will be shown in AWE Server. These functions are awe_fwSetLayoutCoreAffinity() and awe_fwGetLayoutCoreAffinity(). By default, all layouts are assumed to run on core 0.
Real-time constraints dictate that if the 64 sample sub-layout is executed in the same context as the 32 sample sub-layout, both will need to complete processing before the next block of 32 samples arrives or real-time will be broken. This imposes an unnecessarily strict constraint on the 64 sample sub-layout – its processing need only be completed before the next block of 64 samples is accumulated. Thus, its processing time can, in principle, be spread over 2 of the 32 sample block times without breaking real-time. In practice, this is achieved by executing the two sub-layouts in separate contexts. The sub-layout with the shorter block size should have higher priority so that it can preempt the processing of the sub-layout with the longer block size. In this way, the real time constraints of each sub-layout can be accommodated. The following figure shows the timing for this example:
In a multi-instance environment where instance 0 is the master with the interface to the audio IO peripherals, it is common for the layout block size on secondary instances to be larger than the layout block size on instance 0. In this situation, it is not good practice to signal the secondary instances at the fundamental block rate. To avoid the unnecessary overhead of signalling secondary instances more often than needed, the user must call the awe_audioIsReadyToPumpMulti() API and signal the secondary instance only if it returns TRUE.
if (awe_audioIsReadyToPumpMulti(&g_AWEInstance, 1))
{
    // Signal the secondary instance to check for the pump mask, through either
    // raise() or any other option depending on the target
}
Note: It is recommended to call awe_audioIsReadyToPumpMulti() before calling awe_audioGetPumpMask() on instance 0. When instance 0 implements a low latency path (i.e. calling the audio pump in the DMA handler), calling awe_audioIsReadyToPumpMulti() at the end may delay the starting time of the secondary instances. To keep the secondary instances' start time aligned with instance 0, call awe_audioIsReadyToPumpMulti() before blocking the DMA handler with the low latency audio pump call.
// Call awe_audioIsReadyToPumpMulti() to trigger secondary instances
if (awe_audioIsReadyToPumpMulti(&g_AWEInstance, 1))
{
    // Signal the secondary instance to check for the pump mask, through either
    // raise() or any other option depending on the target
}

// Repeat for all secondary instances

// Get the pump mask of instance 0
layoutMask = awe_audioGetPumpMask(&g_AWEInstance);
In multi-rate processing with different priorities for each processing thread, it is quite common for low priority thread processing to be preempted by high priority thread processing. Because the clock counter keeps running, this skews the profiling of the low priority thread. By default, AWE Core corrects for the overhead added by high priority thread preemption. Currently the preemption overhead correction is supported on all platforms except Windows (WIN32) and Linux.
In a system with multiple AWE Instances on the same core, the user must call the awe_setInstancesInfo() API to enable the preemption overhead correction in profiling. Call this API as the last step in the init sequence, after all AWE Instances are configured and initialized.
AWEInstance *g_pInstances[NUM_AWE_INSTANCES];

for (int i = 0; i < NUM_AWE_INSTANCES; i++)
{
    g_pInstances[i] = &g_AWEInstance[i];
}

awe_setInstancesInfo(g_pInstances, NUM_AWE_INSTANCES);
In multi-rate or multi-instance systems, when low priority layout processing is preempted by high priority layout processing, the profiling overhead of the high priority layout processing is corrected by default within the AWE Core framework. To also account for overhead due to external events like a DMA ISR, call awe_audioStartPreemption() and awe_audioEndPreemption() as explained below.
DMA ISR()
{
    UINT32 start_time;
    INT32 coreAffinity = 0;
    // For embedded targets, the default core affinity is 0 in AWE Core.
    // For other targets (e.g. Linux based), get the core affinity from which this function is called.

    // Call the start preemption API to get the start time stamp
    start_time = awe_audioStartPreemption(&g_AWEInstance, coreAffinity);

    // All the audio import calls here

    // All the audio export calls here

    // Get pump mask and other code

    // Call the end preemption API to include this ISR overhead in any low priority active layout(s)
    awe_audioEndPreemption(&g_AWEInstance, start_time, coreAffinity);
}
DMA ISR with low latency audio pump:
DMA ISR()
{
    UINT32 start_time;
    INT32 coreAffinity = 0;
    // For embedded targets, the default core affinity is 0 in AWE Core.
    // For other targets (e.g. Linux based), get the core affinity from which this function is called.

    // Call the start preemption API to get the start time stamp
    start_time = awe_audioStartPreemption(&g_AWEInstance, coreAffinity);

    // All the audio import calls here

    // Get pump mask and other code

    // Call the end preemption API to include the overhead of the import calls in any low priority active layout(s)
    awe_audioEndPreemption(&g_AWEInstance, start_time, coreAffinity);

    // Call the low latency audio pump. Overhead due to this pump call in any low
    // priority active layout(s) is corrected by the AWE Core framework
    awe_audioPump(&g_AWEInstance, 0);

    // Call the start preemption API again after the low latency pump call,
    // to include the overhead due to the export calls
    start_time = awe_audioStartPreemption(&g_AWEInstance, coreAffinity);

    // All the audio export calls here

    // Call the end preemption API to include the overhead of the export calls in any low priority active layout(s)
    awe_audioEndPreemption(&g_AWEInstance, start_time, coreAffinity);
}
// Global variable which accumulates overhead from the high priority interrupt (UART ISR in this case)
UINT32 uartOverhead = 0;

DMA ISR()
{
    UINT32 start_time;
    INT32 coreAffinity = 0;
    // For embedded targets, the default core affinity is 0 in AWE Core.
    // For other targets (e.g. Linux based), get the core affinity from which this function is called.

    // Clear the high priority event overhead at the beginning
    uartOverhead = 0;

    // Call the start preemption API to get the start time stamp
    start_time = awe_audioStartPreemption(&g_AWEInstance, coreAffinity);

    // All the audio import calls here

    // All the audio export calls here

    // Get pump mask and other code

    // Call the end preemption API to include this ISR overhead in any low priority active layout(s).
    // Include the high priority event overhead if it occurred during this ISR
    awe_audioEndPreemption(&g_AWEInstance, start_time + uartOverhead, coreAffinity);
}

UART ISR()
{
    UINT32 start_time;
    INT32 coreAffinity = 0;
    // For embedded targets, the default core affinity is 0 in AWE Core.
    // For other targets (e.g. Linux based), get the core affinity from which this function is called.

    // Call the start preemption API to get the start time stamp
    start_time = awe_audioStartPreemption(&g_AWEInstance, coreAffinity);

    // UART packet handling

    // Call the end preemption API to include this ISR overhead in any low priority active layout(s).
    // Accumulate the overhead for the lower priority events (the DMA ISR in this case)
    // to address recursive preemptions within the same DMA block
    uartOverhead += awe_audioEndPreemption(&g_AWEInstance, start_time, coreAffinity);
}
There are certain AWE modules that need to perform time-consuming calculations; for example, when the cutoff frequency of a Second Order Filter module is changed, the filter coefficients need to be recalculated. Performing these calculations in the audio processing context can cause it to overrun. To address this issue, such modules defer those calculations until the firmware calls awe_deferredSetCall().
Note: For module authors, the awe_deferredSetCall() function calls the module's Set function with a mask of 0xFFFFFF00.
An integrator can check whether any deferred processing is required using the return value of awe_audioPump(), which returns TRUE if deferred processing is pending. If so, call awe_deferredSetCall() at a priority that is lower than the audio processing. awe_deferredSetCall() performs the deferred processing for a single module and returns TRUE if more deferred processing is still pending, so it should be called repeatedly until it returns FALSE.
// g_bDeferredProcessingRequired is returned by awe_audioPump()
if (g_bDeferredProcessingRequired || bMoreProcessingRequired)
{
    g_bDeferredProcessingRequired = FALSE;
    bMoreProcessingRequired = awe_deferredSetCall(&g_AWEInstance);
}
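If the deferred work is instead drained from a dedicated low-priority task, the same idea can be written as a small loop. The task structure below is only a sketch; how the task is scheduled (RTOS task, main loop, etc.) is BSP-specific.

// Sketch of a low-priority background task that drains all pending deferred
// work, one module per awe_deferredSetCall() invocation.
// g_bDeferredProcessingRequired is set from the audio context using the
// return value of awe_audioPump().
void DeferredProcessingTask(void)
{
    if (g_bDeferredProcessingRequired)
    {
        BOOL bMoreProcessingRequired;

        g_bDeferredProcessingRequired = FALSE;

        do
        {
            bMoreProcessingRequired = awe_deferredSetCall(&g_AWEInstance);
        } while (bMoreProcessingRequired);
    }
}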
At this point, the integrator should be able to load a layout from Audio Weaver Designer and run it on the target via the Tuning Interface. Once a Layout in Designer has been completed, it is easy to switch to stand-alone operation. Simply ask Audio Weaver to "Generate Target Files", and the Layout's configuration-data will be generated as a C array to be compiled into the system.
From the viewpoint of AWE Core, the signal processing layout is described by a data array of binary Audio Weaver commands. This command array is generated by Audio Weaver Designer using the Tools/Generate Target Files menu item and enabling the [BaseName]_initAWB checkbox. The layout is loaded into an AWE Core instance by making a call to awe_loadAWBfromArray with the command array, its size in 32-bit words, and the AWEInstance as arguments. If an error occurs during loading, the offset of the offending AWB command is returned in the pPos argument.
INT32 err = awe_loadAWBfromArray(&g_AWEInstance, pCommands, arraySize, &nErrOffset);
if (err)
{
    // report the error
    printf("error code %d due to command at position %u\n", err, nErrOffset);

    // handle the error
    ...
}
awe_loadAWBfromArray() will load the entire array of commands and process them locally on the AWE Instance. If an array of commands needs to be loaded on a remote instance, it can be loaded command by command with the awe_getNextAWBCmd() helper function in AWECoreUtils.h. Each command is parsed individually so that it can be routed to the remote instance.
All priority and interrupt issues are managed outside of the AWE Core by the firmware integrator since the AWE Core has no knowledge of the supporting processing environment it is being integrated into. The following processing context may be implemented using interrupt handlers or using OS threads/tasks. The main requirement is that the processing context is implemented at different preemptible priority levels.
A basic Audio Weaver platform has a minimum of five priority levels of processing. From highest priority to lowest priority:
These three actions must be atomic to avoid the possibility of race conditions due to concurrent access of resources.
A control interface lets the system interact with modules in the running layout directly from the firmware. To access the layout from the API, use Designer to generate a control-interface header file for the layout. Then use the following API calls to define the functionality of the control interface:
Get/set a value of a module: awe_ctrlGetValue() and awe_ctrlSetValue()
INT32 awe_ctrlGetValue(const AWEInstance *pAWE, UINT32 handle, void *value, INT32 arrayOffset, UINT32 length)
INT32 awe_ctrlSetValue(const AWEInstance *pAWE, UINT32 handle, const void *value, INT32 arrayOffset, UINT32 length)
Get/Set the status of a module (bypassed, active, inactive, muted): awe_ctrlSetStatus() and awe_ctrlGetStatus()
INT32 awe_ctrlSetStatus(const AWEInstance *pAWE, UINT32 handle, UINT32 status)
INT32 awe_ctrlGetStatus(const AWEInstance *pAWE, UINT32 handle, UINT32 *status)
Check if a module exists, and if so return its ClassID: awe_ctrlGetModuleClass()
INT32 awe_ctrlGetModuleClass(const AWEInstance *pAWE, UINT32 handle, UINT32 *pClassID)
The following functions provide finer grained control over how module variables get set and are for advanced users: awe_ctrlSetValueMask() and awe_ctrlGetValueMask()
INT32 awe_ctrlSetValueMask(const AWEInstance *pAWE, UINT32 handle, const void *value, INT32 arrayOffset, UINT32 length, UINT32 mask)
INT32 awe_ctrlGetValueMask(const AWEInstance *pAWE, UINT32 handle, void *value, INT32 arrayOffset, UINT32 length, UINT32 mask)
Note: In a multi-instance system with a single layout spanning all instances, the control APIs cannot be called on one instance to control modules running on a different instance. To control modules on other instances (running on another core), there are 3 options:
1) In the signal flow, use a ChangeThread module to route control signals to an instance (core) and a ParamSet module on that instance to control modules.
2) In the application firmware, create tuning packets with commands to control the target instance's modules as appropriate, with a dedicated IPC (Source module) on that instance.
3) Do the IPC of the control value separately from AWE in whatever way is supported on the target.
To access a module and control it via the Control Interface, the handle and length arguments are all defined in the generated [BaseName]_ControlInterface.h file. See the API Doc for details about the control functions' arguments and return values. See the following example.
// Does the current AWE model have a SinkInt module with this control object ID?
if (awe_ctrlGetModuleClass(&g_AWEInstance, AWE_SinkInt1_value_HANDLE, &classID) == OBJECT_FOUND)
{
    // Check that the module assigned this object ID is of module class SinkInt
    if (classID == AWE_SinkInt1_classID)
    {
        // SinkInt module (value is an array)
        awe_ctrlGetValue(&g_AWEInstance, AWE_SinkInt1_value_HANDLE, &nValue, 0, 1);
    }
}
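Setting a value follows the same pattern with awe_ctrlSetValue(). The example below is hypothetical: it assumes the layout contains a scaler module named Scaler1, so the AWE_Scaler1_* handle and class ID macros stand in for whatever the generated [BaseName]_ControlInterface.h actually provides.

// Hypothetical example: set the gain of a scaler module named Scaler1.
// AWE_Scaler1_gain_HANDLE and AWE_Scaler1_classID are placeholders for the
// macros generated into [BaseName]_ControlInterface.h for a real layout.
FLOAT32 newGain = 0.5f;
UINT32 classID;

if (awe_ctrlGetModuleClass(&g_AWEInstance, AWE_Scaler1_gain_HANDLE, &classID) == OBJECT_FOUND)
{
    // Check that the module assigned this object ID is of the expected class
    if (classID == AWE_Scaler1_classID)
    {
        // gain is a scalar, so write one element starting at offset 0
        awe_ctrlSetValue(&g_AWEInstance, AWE_Scaler1_gain_HANDLE, &newGain, 0, 1);
    }
}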
The AWE Core integration can be optimized both for heap usage and for image size.
In a typical workflow, the integrator decides on heap size/placement for development and then optimizes for production.
To decide on heap size/placement for board bring-up, set all three heaps to a small number of 1K-word blocks, say (1024 * 5). When the BSP is fully developed, inspect the memory map to determine how much free space is available in the memory sections that have been assigned to the heaps, then adjust the heap sizes to use as much of this memory as is practical. A typical assignment might be processor "tightly coupled memory" for fast heap A, processor RAM for fast heap B, and off-chip SDRAM for the slow heap.
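For example, a development-time starting point might look like the sketch below; the sizes are placeholders to be grown later to fill the memory regions chosen for each heap.

// Development-time heap sizing sketch: start with small heaps, then inspect
// the memory map and grow each heap to fill the region assigned to it.
#define FAST_HEAP_A_SIZE  (1024 * 5)   // e.g. placed in tightly coupled memory
#define FAST_HEAP_B_SIZE  (1024 * 5)   // e.g. placed in on-chip RAM
#define SLOW_HEAP_SIZE    (1024 * 5)   // e.g. placed in off-chip SDRAM

// The g_FastHeapA / g_FastHeapB / g_SlowHeap arrays declared earlier use these sizes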
Audio Weaver Designer can also generate the absolute smallest heap sizes needed for a specific layout via Tools->Generate Target Files and selecting [BaseName]_HeapSize.h. This can greatly reduce the memory footprint and aid in optimizing an AWE Core integration.
The ModuleList.h file that is delivered with AWE Core contains a large set of modules. This is convenient during development to provide a large selection of modules with which to design layouts. To optimize for a specific layout at production time, Audio Weaver Designer can generate a ModuleList.h with only the modules used by that layout using Tools->Generate Target Files->[BaseName]_ModuleList.h. Since most modern linkers will only link those modules referenced, this will significantly reduce the image size.
AWE Core target systems can contain multiple AWE Instances. This is useful on a platform with multiple cores when a system needs to do signal processing on each core, or if separate instances are needed for dedicated signal processing tasks.
AWE Core supports two methods of implementing multiple AWE Instances on a target system. Both methods support multiple AWE Instances on a single core, or across multiple cores on a single SOC.
Note: The Multi Instance feature is currently released as a beta feature, and may change in incompatible ways in future releases.
Before setting up a multi instance tuning interface, we recommend implementing a single instance tuning interface. This will solidify the theory of operations and allow for easier understanding of the multi instance model.
AWE Core packets always contain a prefixed address called an 'instanceID'. On a single instance system, this instanceID is always 0. However, when developing a multi-instance system, commands can be addressed to different instanceIDs.
It is important to note that there is normally only one tuning interface between Server and the system. That single tuning interface will receive the packets for all instances and the BSP integrator must route them to the correct instance.
(Multi Instance only): The instanceID of an AWE command is determined by the instanceID of a module/wire in the design. The instanceID is a propagatable field, and can be modified using the 'ChangeThread' module or, for a source module, by setting the clockDivider field in the build tab of the module properties. The syntax for the clockDivider field is <clockDivider (#)><thread (letter)><instanceID (#)>, so '2B3' will run the source module with a clockDivider of 2, on thread B of instance 3. The 'ChangeThread' module exists to take one or more input wires of data that exist on one instanceID and send them to another, user-specified instanceID.

(Multi Canvas only): When using Designer, the instanceID of an AWE command is determined by which instance is selected in the dropdown window of Designer.
Utilizing multi instance AWE Core requires these steps:
/** The shared heap. */
volatile UINT32 *pSharedHeap;

/** The shared heap size. */
UINT32 sharedHeapSize;

/** The number of audio processing instances of AWECore configured on a single target */
UINT32 numProcessingInstances;
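A minimal assignment sketch is shown below. The symbol names are illustrative (g_sharedBuffer and SHARED_HEAP_SIZE are taken from the troubleshooting example that follows); the essential requirement is that the shared heap resolves to the same shared-memory address on every instance.

// Illustrative multi-instance field assignment (symbol names are examples;
// the shared buffer must be visible at the same address to all instances)
g_AWEInstance.pSharedHeap            = g_sharedBuffer;
g_AWEInstance.sharedHeapSize         = SHARED_HEAP_SIZE;
g_AWEInstance.numProcessingInstances = NUM_AWE_INSTANCES;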
For more details on how to implement multi instance AWE Core, including pseudocode related to all necessary implementation details, read the Multi-Instance AWE Core Integration Guide found at (https://documentation.dspconcepts.com/awe-designer/latest-version/application-notes), or see the example file LinuxAppMulti.c.
Troubleshooting note:
-> If there is trouble connecting Audio Weaver Server to the target, make sure the packet buffer (reply buffer) on all instances points to the same address in shared memory. Similarly, make sure the shared heap on all instances points to the same address in shared memory.
-> For example, consider a multi-instance system where one core is an ARM and another core is a DSP, each built with a different compiler, and the shared buffers are allocated as
section("shared_mem") UINT32 g_packetBuffer[MAX_COMMAND_LEN];
section("shared_mem") UINT32 g_sharedBuffer[SHARED_HEAP_SIZE];
On the DSP, the linker will allocate g_packetBuffer and g_sharedBuffer in the same order as declared above within the "shared_mem" memory segment, whereas on the ARM, the linker may allocate g_sharedBuffer first and then g_packetBuffer in the same "shared_mem" memory segment.
In multi-instance systems, the signal flow for all instances is contained in a single AWD layout file. The AWE Core framework loads the entire AWB so that all instances load their respective layouts before calling PFID_StartAudio, which signals the system to start processing the layout. This strategy of loading all the layouts for every instance before starting processing guarantees initial synchronicity between all the AWE Instances. The diagram below shows the timing of normal AWB loading for a multi-instance system.
If the system has different boot time requirements for each instance (for example, a 3-core system may have 2 cores with short boot times while the third core has significantly longer boot time requirements), it is not possible to meet these early-audio requirements with a single AWB. To overcome this limitation, AWE Core supports progressive loading of multiple AWBs, one per instance, at boot time. The AWE Core will ensure that audio remains synchronous between instances, even when the AWBs are loaded separately. Assigning AWE Instances as standalone only changes the timing of loading AWBs directly on the target; the behavior of instantiating and tuning layouts over the tuning interface (e.g. from Designer) is not changed. The following diagram shows the timing of progressive loading of multi-instance AWBs.
Progressive loading can be applied to a multi-instance system with the following rules and constraints:
Splitting a multi-instance AWB into multiple AWBs to support progressive loading should never be done manually. In order to preserve synchronicity between instances, splitting a multi-instance AWB or AWS must be done using the MATLAB script split_multi_instance_layout, provided as part of the Designer installation. Type help split_multi_instance_layout in MATLAB for usage information.
To enable progressive loading, call awe_setInstanceStandaloneAWBLoad() with the second argument set to 1 (TRUE) on instance 0, before loading the AWB on instance 0 in standalone mode. What is required on the remaining instances depends on the split. If the AWB is split into 2 AWBs, the remaining AWB can be loaded as normal from the tuning master instance. If the split is a separate AWB per instance, then awe_setInstanceStandaloneAWBLoad() must be called on each instance before calling the AWB loading API.
INT32 awe_setInstanceStandaloneAWBLoad(AWEInstance* pAWE, INT32 status);
NOTE:
**When progressive loading is enabled, awe_audioIsReadyToPumpMulti() must be called for all secondary instances, before calling awe_audioGetPumpMask() in instance 0.**
With progressive loading, the AWB loading APIs awe_loadAWBfromArray(), awe_loadAWBfromFile(), and awe_loadAWBfromFlash() return the error code E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED (-119) when trying to load an AWB on an instance whose previous instance has not yet started (see constraint 4 above). For example, when trying to load an AWB on instance(s) > 0 before instance 0 has finished loading its AWB, the awe_load* APIs will return E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED. This error code is provided to help users synchronize sequential AWB loading across multiple instances. Users should check for this error code and retry the AWB load in a loop until it succeeds or returns a different error code.
The following are some example use cases for a 3-instance system where instance 0 is the audio master and instance 2 is the tuning master.
Case 1.
-> Split the 3 instances AWB into 3 parts, Inst0_AWB, Inst1_AWB and Inst2_AWB.
-> On instance 0,
awe_setInstanceStandaloneAWBLoad(&g_AWEInstance, 1);   // Mandatory
ret = awe_loadAWBfromArray(&g_AWEInstance, Inst0_AWB, Inst0_AWB_Len, &pos);
-> On instance 1,
awe_setInstanceStandaloneAWBLoad(&g_AWEInstance, 1);   // Mandatory

while (1)
{
    ret = awe_loadAWBfromArray(&g_AWEInstance, Inst1_AWB, Inst1_AWB_Len, &pos);

    // If the previous instance has not started, awe_loadAWBfromArray() returns the
    // E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED error code. Call the AWB loading function
    // in a loop until the error code is not E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED.
    if (ret != E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED)
    {
        break;
    }
}
-> On instance 2 (the tuning master), the AWB can be loaded in standalone mode or as a regular AWB.
awe_setInstanceStandaloneAWBLoad(&g_AWEInstance, 1);   // Optional

while (1)
{
    ret = awe_loadAWBfromArray(&g_AWEInstance, Inst2_AWB, Inst2_AWB_Len, &pos);

    // If the previous instance has not started, awe_loadAWBfromArray() returns the
    // E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED error code. Call the AWB loading function
    // in a loop until the error code is not E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED.
    if (ret != E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED)
    {
        break;
    }
}
Case 2.
-> Split the 3 instances AWB into 2 parts, Inst0_AWB and Inst1_2_AWB.
-> On instance 0,
awe_setInstanceStandaloneAWBLoad(&g_AWEInstance, 1);   // Mandatory
ret = awe_loadAWBfromArray(&g_AWEInstance, Inst0_AWB, Inst0_AWB_Len, &pos);
-> On instance 1, no changes.
-> On instance 2 (the tuning master), the combined AWB for instances 1 and 2 must be loaded as a regular AWB.
while (1)
{
    ret = awe_loadAWBfromArray(&g_AWEInstance, Inst1_2_AWB, Inst1_2_AWB_Len, &pos);

    // If the previous instance has not started, awe_loadAWBfromArray() returns the
    // E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED error code. Call the AWB loading function
    // in a loop until the error code is not E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED.
    if (ret != E_MULTI_INSTANCE_SPLIT_AWB_NOT_STARTED)
    {
        break;
    }
}
Integrating a multi canvas AWE Core system can be broken into the following basic steps. This assumes that the integrator is able to implement a single instance tuning interface.
// The following instance table represents a two-instance system
UINT32 numInstances = 2;
UINT32 instanceIDs[] = { 0, 16 };

if (opcode == PFID_GetCores2)
{
    GenerateInstanceTableReply(AWE_Packet_Buffer_Reply, numInstances, instanceIDs);
    writePacketToServer(AWE_Packet_Buffer_Reply);
}
The following pseudocode example is for a chip with two signal processing AWE Instances with the first AWE Instance providing the tuning interface.
TuningInterface()
{
    AWE_Packet_Buffer = receivePacketFromServer();

    if (PACKET_OPCODE(AWE_Packet_Buffer) == PFID_GetCores2)
    {
        GenerateInstanceTableReply(AWE_Packet_Buffer, numInstances, instanceIDs);
        writePacketToServer(AWE_Packet_Buffer);
    }
    else if (PACKET_INSTANCEID(AWE_Packet_Buffer) == 0)
    {
        awe_packetProcess(&AWEInstance0);
        writeReplyToServer();
    }
    else if (PACKET_INSTANCEID(AWE_Packet_Buffer) == 16)
    {
        sendToInstance16();        // command is processed on instance 16
        readReplyFromInstance16();
        writeReplyToServer();
    }
}
Below is a pseudocode example of a multi-canvas tuning interface that is implemented on an MCU that has no knowledge of the AWE Core instances. The system has two AWE instances on separate processors, with IDs of 0 and 16.
TuningInterface()
{
    Tuning_PacketBuffer = receivePacketFromServer();

    if (PACKET_OPCODE(Tuning_PacketBuffer) == PFID_GetCores2)
    {
        GenerateInstanceTableReply(Tuning_PacketBuffer, numInstances, instanceIDs);
        writePacketToServer(Tuning_PacketBuffer);
    }
    else if (PACKET_INSTANCEID(Tuning_PacketBuffer) == 0)
    {
        sendPacketToInstance0(Tuning_PacketBuffer);   // Instance 0 gets the packet and calls awe_packetProcess
        receiveReplyPacketFromInstance0();
        writeReplyToServer();
    }
    else if (PACKET_INSTANCEID(Tuning_PacketBuffer) == 16)
    {
        sendPacketToInstance16();                     // Instance 16 gets the packet and calls awe_packetProcess
        receivePacketFromInstance16();
        writeReplyToServer();
    }
}
This section discusses only the latency caused by the AWE Core framework. The signal processing flow, the firmware, and the hardware can all introduce additional latency to the overall system. In this discussion, a "block" of latency is defined by the layout block size and the system's sample rate. For example, with a layout block size of 256 and a sample rate of 48 kHz, the latency through the AWE Core system will be in multiples of blocks of (256 / 48000) = 5.33 ms. AWE Core can introduce up to 2 blocks of latency, but can also be configured to achieve 0 blocks of latency under certain conditions.
There are 3 possible latency situations introduced by the AWECore library:
Which of the three latency paths will be taken depends on some conditions in the implementation of the AWECore API, and the relationship between the system and the signal processing layout.
Note: The following abbreviations may be used for these AWECore API calls:
To achieve 0 blocks of AWE-induced latency, the following three conditions must be met:
AWE Core has an internal double-buffering scheme on the input and output pins to handle situations where a layout's blocksize is a multiple of the AWEInstance's fundamental blocksize. For example, a system with a 64-sample layout blocksize running on an AWEInstance with a 32-sample fundamental blocksize must complete two 32-sample audio callbacks before actually pumping audio through the layout (to satisfy the 64 samples). Double buffering is used in this situation in order to store the next frames of data in one buffer while the processing occurs on the data in the other buffer. Two blocks of latency are introduced by this double buffering of the input and output pins.
However, in a situation where the layout blocksize is the same as the fundamental blocksize, the double-buffering scheme is not required and the two blocks of latency can be avoided. AWECore has an internal mechanism to check whether the layout and fundamental blocksizes are equivalent at layout runtime, and will automatically bypass the double buffering to eliminate the introduced latency. Less memory is also consumed under this condition, as only a single buffer is used at the input and output pins.
At a very high level, 0 blocks of latency can only fundamentally be achieved if all of the audio processing occurs in a single callback. Based on this, the audio processing function must implement the following order of API calls to achieve the lowest possible latency:

1) awe_audioImportSamples()
2) awe_audioGetPumpMask() && awe_audioPump() (can be a signal to lower priority threads to do the actual pumping)
3) awe_audioExportSamples()
Different orders of API calls will still operate correctly from a processing standpoint, but may not achieve the lowest possible latency.
Virtually all systems that integrate the AWE Core will use a thread signaling or interrupt raising scheme to trigger awe_audioPump in separate, lower priority contexts from the main DMA audio callback. While this scheme is required in order to allow for efficient, multi-rate operation of signal processing layouts, it does mean that the processing of the audio signal will not be complete by the end of the audio callback. As mentioned in the section above, a system with 0 blocks of AWECore-induced latency must complete all audio processing within the context of the main DMA audio callback. So in order to achieve minimal latency, the first sublayout (layout index 0) must be executed in place during the audio callback, not in a lower priority context.
The first sublayout consumes the input pin and fills the output pin, so it is only this sublayout that needs to be processed in place. Other, non-low-latency paths (clock-divided sublayouts, layout index > 0) should still be signaled by the callback and pumped in another context, as in the sketch below.
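Putting this together, a minimal low-latency audio callback might be structured as in the sketch below, written in the same pseudocode style as the preemption examples above. The import/export calls are shown schematically, since their arguments depend on the BSP's channel layout.

AudioCallback()   // runs every fundamental block (e.g. from the DMA ISR)
{
    // 1) Import the new block of input samples into the input pin
    //    (awe_audioImportSamples() per channel)

    // 2) Determine which sub-layouts are ready to run
    layoutMask = awe_audioGetPumpMask(&g_AWEInstance);

    // 3) Pump the low latency sub-layout (index 0) in place, in this context
    if (layoutMask & 0x1) awe_audioPump(&g_AWEInstance, 0);

    // 4) Signal lower priority contexts for any clock-divided sub-layouts
    if (layoutMask & 0x2) raise(AWEProcess_LowPrio);

    // 5) Export the processed block from the output pin
    //    (awe_audioExportSamples() per channel)
}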
There are a few common pitfalls when BSP authors are integrating AWE Core. The most common problems are scheduling issues, specifically when the audio thread is not at a high enough priority and is preempted by the packet or deferred processing. RT audio needs to be running at a very high priority, just below the Tuning Interface IO thread.