Natasha Butt, Blazej Manczak, Auke Wiggers, Corrado Rainone, David W. Zhang, Michaël Defferrard, Taco Cohen:
CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay. CoRR abs/2402.04858 (2024)