The whale in the machine: reconstructing Moby-Dick with large language models
An experiment testing whether LLMs can reproduce Moby-Dick verbatim, with mixed results.
The color memory test: a quick filter for language model quality
A simple yet effective test that uses eye and hair color descriptions to weed out unreliable LLMs.
Exploring embedding space vulnerabilities through lessons learned from a recent Kaggle competition.