I would like to implement a Reinforcement Learning (RL) algorithm on an embedded platform, targeting an application that requires continuous control. The goal is to acquire several input signals (4-6) at a relatively fast sampling rate (> 10 MHz) and produce 4-6 control signals at the same rate.
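To make the timing requirement concrete, here is a rough per-sample budget. The 10 MHz figure is the lower bound stated above; the 300 MHz fabric clock is only an assumed value for illustration, not something I have measured on any of the boards.

```python
# Rough per-sample timing budget for the control loop.
# SAMPLE_RATE_HZ comes from the requirement (> 10 MHz, so this is the easy case).
# ASSUMED_FABRIC_CLOCK_HZ is a hypothetical FPGA clock, chosen for illustration.

SAMPLE_RATE_HZ = 10e6
ASSUMED_FABRIC_CLOCK_HZ = 300e6


def cycles_per_sample(sample_rate_hz: float, clock_hz: float) -> float:
    """Number of clock cycles available between two consecutive samples."""
    return clock_hz / sample_rate_hz


# Time between two samples, in nanoseconds.
sample_period_ns = 1e9 / SAMPLE_RATE_HZ
budget = cycles_per_sample(SAMPLE_RATE_HZ, ASSUMED_FABRIC_CLOCK_HZ)

print(f"sample period: {sample_period_ns:.0f} ns")
print(f"cycles per sample at assumed clock: {budget:.0f}")
```

Under these assumptions, each new set of control outputs has to be ready within about 100 ns, i.e. a few tens of clock cycles, unless the design is pipelined so that only throughput, not latency, has to match the sampling rate.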
Given the tight timing constraints, I am leaning toward an FPGA-based device (Xilinx ZCU104, ZCU111, …). These devices are generally used only for inference, and there are many IP cores/libraries dedicated to inference acceleration. Unfortunately, if we are limited to inference, we lose one of the key advantages of online RL models, which seamlessly combine training and prediction. There are also GPU-based devices (e.g., Nvidia Jetson Xavier), but they all seem limited in their ability to acquire and handle such fast signals.
Given the time constraints and the necessity to handle on-device training, do you have any suggestions?