Took a little longer to verify the the actual problem and best solutino
What happened was that the motherboard would send down control requests, 'get report descriptor', but for some reason, the endpoint out transfers were no getting received by the host. So the host would request the same descriptor again, and fail again. Eventually, it would pass, but for some reason, so many transfers to the host would fail. The hosts typically sends down an endpoint reset, but this was after dozens of fails, which is way too late.
The device uses a 512 byte SRAM area in memory for the DMA transfers to the host. Without the endpoint reset, which would reset our endpoint frame back down to the bottom of the 512 SRAM, we would overflow, corrupt our stack pointer, and eventually crash.
Along with some other bug fixes yet to deploy, this version treats the 512 byte DMA area as a circular buffer. So it just wraps around if the hosts never sends down an endpoint reset.
Really hard to debug. All real time and any GDB operations, the host disconnects the device and transfers stop.