Hugging Face: Data-first guidance for NLP models (BERT, GPT-3): baselines, tokenization checks, and debugging with PyTorch/TensorBoard | SignalBreak