LLM with Conformal Inference
develop new conformal inference methods for obtaining validity guarantees on the output of LLMs
prior work in conformal language modeling identifies a subset of the text that satisfies a high-probability guarantee of correctness
- by filtering claims from the LLM’s original response if a scoring function evaluated on the claim fails to exceed a threshold calibrated via split conformal prediction
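a minimal sketch of this filtering step, assuming a max-over-false-claims nonconformity score (the function names and score construction here are illustrative, not taken from any specific implementation):

```python
import numpy as np

def calibrate_threshold(claim_scores, claim_labels, alpha):
    """Split-conformal cutoff: for each calibration response, the
    nonconformity score is the highest claim score among its *incorrect*
    claims -- filtering above this level removes every false claim.
    (Hypothetical helper; illustrative of the general recipe.)"""
    response_scores = []
    for s, w in zip(claim_scores, claim_labels):
        s, w = np.asarray(s, float), np.asarray(w, bool)
        wrong = s[~w]
        response_scores.append(wrong.max() if wrong.size else -np.inf)
    # conservative empirical quantile with the finite-sample correction
    n = len(response_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.sort(response_scores)[min(k, n) - 1]

def filter_claims(claims, scores, tau):
    """Keep only claims whose score exceeds the calibrated cutoff."""
    return [c for c, s in zip(claims, scores) if s > tau]
```

with probability at least $1-\alpha$ over the calibration draw, every false claim of a new response scores below `tau` and is filtered out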
existing methods in this area suffer from two deficiencies
- guarantee is not conditionally valid
- because the scoring function is imperfect, the filtering step can remove many valuable and accurate claims
address both of these challenges via two new conformal methods
- generalize the conditional conformal procedure in order to adaptively issue weaker guarantees when they are required to preserve the utility of the output
- show how to systematically improve the quality of the scoring function via a novel algorithm for differentiating through the conditional conformal procedure
Introduction
LLMs often confidently hallucinate facts that do not exist, and can generate toxic outputs that may offend or discriminate
while there are many approaches to this problem (papers…), this paper considers conformal inference for black-box uncertainty quantification
- several recent papers have applied conformal inference to define a set of LLM responses that contains at least one factual response with high probability
but prediction sets are not a generalizable output format for the diverse and unstructured tasks faced in real-world deployment.
Mohri and Hashimoto: propose to utilize conformal inference to filter out invalid components of the LLM response
assume the existence of an annotated calibration set of $n$ i.i.d. prompt-response-claim-annotation tuples, ${(P_i, R_i, C_i, W_i)}_{i=1}^n$
- vector $C_i$ is obtained by using an LLM to parse the response into a list of scorable sub-claims
- $W_i$ might correspond to human verification of the underlying factuality of each claim
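the shape of one calibration tuple $(P_i, R_i, C_i, W_i)$ could be sketched as follows (field names are illustrative assumptions, not the paper's notation):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CalibrationExample:
    """One tuple (P_i, R_i, C_i, W_i); field names are illustrative."""
    prompt: str              # P_i: the query posed to the LLM
    response: str            # R_i: the full generated response
    claims: List[str]        # C_i: sub-claims parsed from R_i by an LLM
    annotations: List[bool]  # W_i: factuality annotation per claim
```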
twin goals: improved conditional validity and enhanced quality of filtered outputs
first method: conditional boosting
- allows for the automated discovery of superior claim scoring functions via differentiation through the conditional conformal algorithm
- optimizing through the conditional conformal algorithm is not straightforward; the key technical contributions of the paper are a proof that (under mild assumptions) the cutoff output by the conditional conformal method is differentiable, and a computationally efficient method for computing this derivative
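a toy illustration of why such a cutoff can be differentiable at all: in plain split conformal, the cutoff is an order statistic of the scores, so locally (assuming no ties) it moves exactly as the score of the one calibration point attaining it moves. a sketch with toy scores $s_i(\theta) = \theta x_i$, verified against a finite-difference check (this simplified marginal setting is my assumption; the paper handles the harder conditional case):

```python
import numpy as np

def conformal_cutoff(theta, x, alpha):
    """Split-conformal cutoff for toy scores s_i = theta * x_i."""
    s = theta * x
    n = len(s)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.sort(s)[min(k, n) - 1]

def cutoff_gradient(theta, x, alpha):
    """Locally (no ties) the cutoff equals the score of one calibration
    point j, so d(cutoff)/d(theta) = d s_j / d(theta) = x_j."""
    s = theta * x
    n = len(s)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    j = np.argsort(s)[min(k, n) - 1]
    return x[j]

x = np.array([0.2, 1.0, 0.5, 0.8])
theta, alpha, eps = 2.0, 0.25, 1e-6
g = cutoff_gradient(theta, x, alpha)
fd = (conformal_cutoff(theta + eps, x, alpha)
      - conformal_cutoff(theta - eps, x, alpha)) / (2 * eps)
# g and fd agree: the analytic derivative matches finite differences
```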
second method: level-adaptive conformal prediction
- allows the validity of the conformal output to depend on characteristics of the queried prompt
it is well-known that exact conditional guarantees in conformal inference are impossible to achieve without strong distributional assumptions; instead, the paper presents an interpretable alternative: group-conditional calibration
- e.g., group questions by medical area or data provenance
- data-adaptive level function: $\alpha_{n+1}$
- filtered set of claims $\hat F(C_{n+1})$
the issued probabilities are well-calibrated in the following sense: among similar prompts, outputs that we claim to be factually correct with probability between 70% and 80% will actually be factual between 70% and 80% of the time
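a minimal diagnostic for this calibration property (an illustrative check I am assuming, not the paper's procedure): within each group, bin outputs by their issued probability and compare the empirical factuality rate to the band

```python
import numpy as np

def calibration_check(issued_probs, is_factual, groups, lo=0.7, hi=0.8):
    """For each group, among outputs issued a probability in [lo, hi),
    return the empirical factuality rate; well-calibrated output should
    land inside the same band. (Illustrative diagnostic only.)"""
    issued_probs = np.asarray(issued_probs, float)
    is_factual = np.asarray(is_factual, float)
    groups = np.asarray(groups)
    rates = {}
    for g in np.unique(groups):
        mask = (groups == g) & (issued_probs >= lo) & (issued_probs < hi)
        if mask.sum() > 0:
            rates[g] = is_factual[mask].mean()
    return rates
```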