Tag
language models
Language models sit at the core of generative AI, spanning pretraining, token initialization, alignment, and security evaluation. This tag collects work on how LMs learn semantics, how they absorb new vocabulary, and where jailbreak tests expose failure modes.
3 articles

Research/May 7
Do LLMs Learn Grammar Beyond Likelihood?
A probing study finds that hidden layers in language models encode grammaticality better than string probability does, but not plausibility.

Research/Apr 23
AVISE tests AI security with modular jailbreak evals
AVISE is an open-source framework for finding AI vulnerabilities; its 25-case jailbreak suite found all nine tested models vulnerable.

Research/Apr 3
A Better Way to Seed New LM Tokens
GTI grounds new vocabulary tokens before fine-tuning, aiming to preserve distinctions that mean-initialization tends to collapse.