-
Predicting when LLMs fail
An overview of the different paradigms for doing so and their connections with related areas
-
Summary of "From Testing to Evaluation of NLP and LLM Systems"
This work compares academic research in evaluation with practitioners' questions on community forums.
-
Finetuning GPT3
Using the OpenAI API to finetune GPT3 on a custom dataset
-
Generalizing Bayesian Inference
Updating a 250-year-old theorem for the 21st century
-
Sampling from doubly intractable distributions - the ExchangeMCMC algorithm
A quick review