https://arxiv.org/abs/2009.03300

Key Contributions

  1. Behavioral Testing Framework
  2. Testing Matrix
  3. User-Friendly Tooling
  4. Applications

Key Findings

  1. Models Lack Robustness
  2. Behavioral Gaps
  3. Importance of Systematic Testing