Base-Rate Item Evaluation and Typicality Scoring Using Large Language Models



Documentation for package ‘baserater’ version 0.1.1

Help Pages

download_data Load base-rate database, model typicality matrices, or human validation ratings
evaluate_external_ratings Evaluate how new typicality ratings predict human ratings and compare performance to LLM baselines
extract_base_rate_items Create base-rate items from a groups x descriptions typicality matrix
generate_typicality Generate typicality ratings via an 'Inference Provider' (experimental)