Tag
LLM safety
2 articles

Research/May 12
Policy Invariance as a Better LLM Judge Test
This paper argues that accuracy alone is not enough to trust LLM safety judges and proposes policy invariance as a reliability test.

Research/Apr 20
ASMR-Bench Tests Sabotage Detection in ML Code
ASMR-Bench probes whether auditors can spot subtle sabotage in ML research codebases, and the answer so far is: not reliably.