OpenAI: OpenAI adopts Proximal Policy Optimization as default reinforcement learning algorithm | SignalBreak