@m20k I'd probably start with a #softwaremodel though, keeping the #FPGA limitations in mind. However, you will soon find that filling instructions with independent #µops (TMTA called these molecules and atoms) is really hard, especially for short #basicblocks. TMTA dealt with that in many ways, most significantly by compiling very large #superblocks (multiple exits) and making heavy use of #predication.
I would try something different: make the #dataflow explicit ...
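To get a feel for how hard it is to fill wide instructions from a short basic block, a tiny software model is enough. The sketch below is only an illustration, not how TMTA did it: `Uop`, `pack` and `utilization` are made-up names, every µop is assumed to take one cycle, any slot can hold any µop, and µops are assumed to be listed in program order.

```python
from dataclasses import dataclass


@dataclass
class Uop:
    name: str
    deps: tuple = ()  # names of earlier µops whose results this one needs


def pack(uops, width):
    """Greedy in-order packing into bundles of `width` slots.

    Each µop goes into the earliest bundle that comes after all of its
    producers and still has a free slot (unit latency assumed).
    """
    bundle_of = {}  # µop name -> index of the bundle it was placed in
    bundles = []
    for u in uops:
        # A consumer must sit at least one bundle after each producer.
        i = max((bundle_of[d] + 1 for d in u.deps), default=0)
        # Skip bundles that are already full.
        while i < len(bundles) and len(bundles[i]) >= width:
            i += 1
        # Grow the schedule if we ran past the end.
        while i >= len(bundles):
            bundles.append([])
        bundles[i].append(u.name)
        bundle_of[u.name] = i
    return bundles


def utilization(bundles, width):
    """Fraction of issue slots that actually hold a µop."""
    return sum(len(b) for b in bundles) / (len(bundles) * width)


# A typical short basic block: two loads, an add, a compare, a branch.
block = [
    Uop("ld_a"),
    Uop("ld_b"),
    Uop("add", ("ld_a", "ld_b")),
    Uop("cmp", ("add",)),
    Uop("br", ("cmp",)),
]

bundles = pack(block, width=4)
print(bundles)      # [['ld_a', 'ld_b'], ['add'], ['cmp'], ['br']]
print(utilization(bundles, width=4))
```

With four slots per instruction, this block still needs four instructions because of the dependency chain, so most slots stay empty. Larger regions give the packer more independent µops to choose from, which is the motivation for superblocks.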