Hugging Face: OpenEnv Calendar Gym reveals production reliability gaps for tool-using agents | SignalBreak