OpenAI: Build trace-driven agent improvement loop with Promptfoo evals, HALO-ranked harness changes, and Codex handoff | SignalBreak