h2o_sonar.evaluators package

Submodules

h2o_sonar.evaluators.abc_byop_evaluator module

class h2o_sonar.evaluators.abc_byop_evaluator.AbcByopEvaluator

Bases: ABC, Evaluator

Abstract base class for Bring Your Own Prompt (BYOP) evaluators.

class Classes(failure, success)

Bases: tuple

failure

Alias for field number 0

success

Alias for field number 1

IDENTIFIER_ACTUAL_OUTPUT = '{ACTUAL_OUTPUT}'
IDENTIFIER_CONTEXT = '{CONTEXT}'
IDENTIFIER_EXPECTED_OUTPUT = '{EXPECTED_OUTPUT}'
IDENTIFIER_INPUT = '{INPUT}'
KEY_ANSWER: str = 'answer'
KEY_ERROR: str = 'error'
KEY_PARSED_ANSWER: str = 'parsed_answer'
KEY_PROMPT: str = 'prompt'
PARAM_JUDGE_HOST: str = 'judge_host'
PARAM_JUDGE_MODEL: str = 'judge_model'
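
The IDENTIFIER_* constants above are the literal placeholders a BYOP prompt template is expected to contain. Below is a minimal sketch of filling such a template from a test-set row, assuming simple string substitution; the template wording and the row values are illustrative, not the package's actual API:

# Sketch: fill a BYOP judge prompt by replacing the literal placeholders.
# The template text and row values below are hypothetical.
PROMPT_TEMPLATE = (
    "Given the question: {INPUT}\n"
    "and the retrieved context: {CONTEXT}\n"
    "does the answer: {ACTUAL_OUTPUT}\n"
    "match the reference answer: {EXPECTED_OUTPUT}? Reply PASS or FAIL."
)

row = {
    "{INPUT}": "What is the capital of France?",
    "{CONTEXT}": "Paris is the capital and largest city of France.",
    "{ACTUAL_OUTPUT}": "The capital of France is Paris.",
    "{EXPECTED_OUTPUT}": "Paris",
}

prompt = PROMPT_TEMPLATE
for placeholder, value in row.items():
    prompt = prompt.replace(placeholder, value)
print(prompt)  # the filled prompt that would be sent to the judge model
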
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
property judge
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.
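
Every evaluator in this package follows the same documented lifecycle: check_compatibility() gates the run, setup() wires in the model and persistence, evaluate() consumes an LLM test set, and get_result() returns the leaderboard. A hedged sketch of that flow using a concrete subclass; the model, persistence, and test-set objects are stand-ins for values obtained from the surrounding h2o_sonar APIs:

# Hedged lifecycle sketch; applies to any evaluator in this package.
from h2o_sonar.evaluators.bleu_evaluator import BleuEvaluator

model = ...        # models.ExplainableModel (or None for 3rd-party hosts)
persistence = ...  # ExplainerPersistence instance
llm_testset = ...  # LLM test set to be evaluated

evaluator = BleuEvaluator()
if evaluator.check_compatibility():
    evaluator.setup(model=model, persistence=persistence)
    evaluator.evaluate(llm_testset)
    leaderboard = evaluator.get_result()  # LeaderboardResult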

h2o_sonar.evaluators.bleu_evaluator module

class h2o_sonar.evaluators.bleu_evaluator.BleuEvaluator

Bases: Evaluator

METRIC_BLEU_1 = 'bleu_1'
METRIC_BLEU_2 = 'bleu_2'
METRIC_BLEU_3 = 'bleu_3'
METRIC_BLEU_4 = 'bleu_4'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.
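
The four metric constants correspond to BLEU with n-gram orders 1 through 4. A self-contained sketch of computing such scores with NLTK; the library choice and smoothing are assumptions for illustration, not necessarily this evaluator's internals:

# Illustrative BLEU-1..BLEU-4 computation with NLTK.
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

reference = ["the cat sat on the mat".split()]
candidate = "the cat is on the mat".split()
smooth = SmoothingFunction().method1

for n in range(1, 5):
    weights = tuple(1.0 / n for _ in range(n))  # uniform n-gram weights
    score = sentence_bleu(reference, candidate, weights=weights,
                          smoothing_function=smooth)
    print(f"bleu_{n} = {score:.3f}")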

h2o_sonar.evaluators.classification_evaluator module

class h2o_sonar.evaluators.classification_evaluator.ClassificationEvaluator

Bases: Evaluator

METRIC_FN = 'fn'
METRIC_FP = 'fp'
METRIC_TN = 'tn'
METRIC_TP = 'tp'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.
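
METRIC_TP, METRIC_FP, METRIC_TN, and METRIC_FN are the standard binary confusion-matrix counts. A minimal, purely illustrative sketch of deriving them from expected and actual labels:

# Confusion-matrix counts for binary labels (illustrative only).
expected = [1, 0, 1, 1, 0, 0]  # ground-truth labels
actual = [1, 0, 0, 1, 1, 0]    # model predictions

tp = sum(1 for e, a in zip(expected, actual) if e == 1 and a == 1)
fp = sum(1 for e, a in zip(expected, actual) if e == 0 and a == 1)
tn = sum(1 for e, a in zip(expected, actual) if e == 0 and a == 0)
fn = sum(1 for e, a in zip(expected, actual) if e == 1 and a == 0)
print({"tp": tp, "fp": fp, "tn": tn, "fn": fn})  # {'tp': 2, 'fp': 1, 'tn': 2, 'fn': 1}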

h2o_sonar.evaluators.contact_information_byop_evaluator module

class h2o_sonar.evaluators.contact_information_byop_evaluator.ContactInformationByopEvaluator

Bases: AbcByopEvaluator

h2o_sonar.evaluators.fairness_bias_evaluator module

class h2o_sonar.evaluators.fairness_bias_evaluator.FairnessBiasEvaluator

Bases: Evaluator

METRIC_FAIRNESS_BIAS = 'fairness_bias'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.gptscore_evaluator module

class h2o_sonar.evaluators.gptscore_evaluator.GptScoreEvaluator

Bases: ABC, Evaluator

DEFAULT_METRIC_THRESHOLD = inf
PARAM_EVAL_GPT_SCORE_MODEL = 'gpt_score_model'
add_problem_for_row(severity: ProblemSeverity, message: str, row: LlmDatasetRow)
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.gptscore_machine_translation_evaluator module

class h2o_sonar.evaluators.gptscore_machine_translation_evaluator.GptScoreMachineTranslationEvaluator

Bases: GptScoreEvaluator

METRIC_ACCURACY = 'accuracy'
METRIC_FLUENCY = 'fluency'
METRIC_MULTI_QUAL_METRICS = 'multidimensional quality metrics'

h2o_sonar.evaluators.gptscore_question_answering_evaluator module

class h2o_sonar.evaluators.gptscore_question_answering_evaluator.GptScoreQuestionAnsweringEvaluator

Bases: GptScoreEvaluator

METRIC_CORRECTNESS = 'correctness'
METRIC_ENGAGEMENT = 'engagement'
METRIC_FLUENCY = 'fluency'
METRIC_INTEREST = 'interest'
METRIC_RELEVANCE = 'relevance'
METRIC_SEMANTICALLY_APPROPRIATE = 'semantically appropriate'
METRIC_SPECIFIC = 'specific'
METRIC_UNDERSTANDABILITY = 'understandability'

h2o_sonar.evaluators.gptscore_summary_without_reference_evaluator module

class h2o_sonar.evaluators.gptscore_summary_without_reference_evaluator.GptScoreSummaryWithoutReferenceEvaluator

Bases: GptScoreEvaluator

METRIC_COHERENCE = 'coherence'
METRIC_CONSISTENCY = 'consistency'
METRIC_FACTUALITY = 'factuality'
METRIC_FLUENCY = 'fluency'
METRIC_INFORMATIVENESS = 'informativeness'
METRIC_RELEVANCE = 'relevance'
METRIC_SEMANTIC_COVERAGE = 'semantic coverage'

h2o_sonar.evaluators.gptscore_summary_with_reference_evaluator module

class h2o_sonar.evaluators.gptscore_summary_with_reference_evaluator.GptScoreSummaryWithReferenceEvaluator

Bases: GptScoreEvaluator

METRIC_COHERENCE = 'coherence'
METRIC_FACTUALITY = 'factuality'
METRIC_FLUENCY = 'fluency'
METRIC_INFORMATIVENESS = 'informativeness'
METRIC_RELEVANCE = 'relevance'
METRIC_SEMANTIC_COVERAGE = 'semantic coverage'

h2o_sonar.evaluators.language_mismatch_byop_evaluator module

class h2o_sonar.evaluators.language_mismatch_byop_evaluator.LanguageMismatchByopEvaluator

Bases: AbcByopEvaluator

h2o_sonar.evaluators.parameterizable_byop_evaluator module

class h2o_sonar.evaluators.parameterizable_byop_evaluator.ParameterizableByopEvaluator

Bases: AbcByopEvaluator

PROMPT_TEMPLATE_PARAM: str = 'prompt_template'
check_compatibility(params: CommonInterpretationParams | None = None, model: ExplainableModel | None = None, **explainer_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

h2o_sonar.evaluators.perplexity_evaluator module

class h2o_sonar.evaluators.perplexity_evaluator.PerplexityEvaluator

Bases: Evaluator

METRIC_PERPLEXITY = 'perplexity'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.
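
Perplexity is the exponential of the average negative log-likelihood a language model assigns to a text. A self-contained sketch using a Hugging Face causal LM; the model choice is an assumption for illustration, not this evaluator's verified internals:

# Illustrative perplexity: exp of the mean token-level cross-entropy
# under a causal language model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # Passing labels makes the model return the mean cross-entropy loss.
    loss = model(**inputs, labels=inputs["input_ids"]).loss
print(f"perplexity = {torch.exp(loss).item():.2f}")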

h2o_sonar.evaluators.pii_leakage_evaluator module

class h2o_sonar.evaluators.pii_leakage_evaluator.PiiLeakageEvaluator

Bases: Evaluator

DEFAULT_EVAL_RC = True
PARAM_EVAL_RC = 'evaluate_retrieved_context'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

static check_creditcard_leakage(checked_txt: str, failed_constraints: List, fragments: List) Tuple[List, List]
static check_email_leakage(checked_txt: str, failed_constraints: List, fragments: List) Tuple[List, List]

Check for email leakage.

Returns:
Tuple[List, List]

The updated failed-constraints and fragments lists, including any leaked email addresses found; both lists are returned unchanged if none are found.

static check_ssn_leakage(checked_txt: str, failed_constraints: List, fragments: List) Tuple[List, List]
evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.
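
The static check_* methods scan a text for specific PII patterns and return the updated failed-constraints and fragments lists. A hedged illustration of the kind of regex-based email check involved; the pattern and the exact list bookkeeping are assumptions, not the evaluator's actual implementation:

# Illustrative email-leakage scan (pattern and bookkeeping are assumptions).
import re
from typing import List, Tuple

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def check_email_leakage_sketch(
    checked_txt: str, failed_constraints: List, fragments: List
) -> Tuple[List, List]:
    leaked = EMAIL_RE.findall(checked_txt)
    if leaked:
        failed_constraints.append("email leakage")
        fragments.extend(leaked)
    return failed_constraints, fragments

print(check_email_leakage_sketch(
    "Contact jane.doe@example.com for details.", [], []
))  # (['email leakage'], ['jane.doe@example.com'])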

h2o_sonar.evaluators.rag_answer_correctness_evaluator module

class h2o_sonar.evaluators.rag_answer_correctness_evaluator.AnswerCorrectnessEvaluator

Bases: Evaluator

check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.rag_answer_relevancy_evaluator module

class h2o_sonar.evaluators.rag_answer_relevancy_evaluator.AnswerRelevancyEvaluator

Bases: Evaluator

check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.rag_answer_relevancy_no_judge_evaluator module

class h2o_sonar.evaluators.rag_answer_relevancy_no_judge_evaluator.RagAnswerRelevancyNoJudgeEvaluator

Bases: Evaluator

COL_ACTUAL_OUTPUT = 'actual_output'
COL_CONTEXT = 'context'
COL_EXPECTED_OUTPUT = 'expected_output'
COL_INPUT = 'input'
COL_MODEL = 'model'
COL_SCORE = 'score'
METRIC_ANSWER_RELEVANCY = 'answer_relevancy'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

static split_sentences(text: str) List[str]

Split the text into sentences.
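
A judge-free relevancy score can be approximated by comparing embeddings of the question with embeddings of the answer's sentences. A sketch under that assumption using sentence-transformers; the model choice and aggregation are illustrative, not this evaluator's verified scoring formula:

# Hedged sketch: cosine similarity between question and answer-sentence
# embeddings as a judge-free relevancy proxy.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")
question = "What is the capital of France?"
answer_sentences = ["Paris is the capital of France.", "It lies on the Seine."]

q_emb = encoder.encode(question, convert_to_tensor=True)
s_emb = encoder.encode(answer_sentences, convert_to_tensor=True)
scores = util.cos_sim(q_emb, s_emb)[0]  # one similarity per sentence
print(f"answer_relevancy ~ {scores.mean().item():.3f}")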

h2o_sonar.evaluators.rag_answer_similarity_evaluator module

class h2o_sonar.evaluators.rag_answer_similarity_evaluator.AnswerSemanticSimilarityEvaluator

Bases: Evaluator

check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.rag_chunk_relevancy_evaluator module

class h2o_sonar.evaluators.rag_chunk_relevancy_evaluator.ContextChunkRelevancyEvaluator

Bases: Evaluator

COL_ACTUAL_OUTPUT = 'actual_output'
COL_CONTEXT = 'context'
COL_EXPECTED_OUTPUT = 'expected_output'
COL_INPUT = 'input'
COL_MODEL = 'model'
COL_SCORE = 'score'
METRIC_PRECISION_RELEVANCY = 'precision_relevancy'
METRIC_RECALL_RELEVANCY = 'recall_relevancy'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

static split_sentences(text: str) List[str]

Split the text into sentences.

h2o_sonar.evaluators.rag_context_precision_evaluator module

class h2o_sonar.evaluators.rag_context_precision_evaluator.ContextPrecisionEvaluator

Bases: Evaluator

check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.rag_context_recall_evaluator module

class h2o_sonar.evaluators.rag_context_recall_evaluator.ContextRecallEvaluator

Bases: Evaluator

check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.rag_context_relevancy_evaluator module

class h2o_sonar.evaluators.rag_context_relevancy_evaluator.ContextRelevancyEvaluator

Bases: Evaluator

check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.rag_faithfulness_evaluator module

class h2o_sonar.evaluators.rag_faithfulness_evaluator.FaithfulnessEvaluator

Bases: Evaluator

check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.rag_groundedness_evaluator module

class h2o_sonar.evaluators.rag_groundedness_evaluator.RagGroundednessEvaluator

Bases: Evaluator

COL_ACTUAL_OUTPUT = 'actual_output'
COL_CONTEXT = 'context'
COL_EXPECTED_OUTPUT = 'expected_output'
COL_INPUT = 'input'
COL_MODEL = 'model'
COL_SCORE = 'score'
METRIC_GROUNDEDNESS = 'groundedness'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

static split_sentences(text: str) List[str]

Split the text into sentences.

h2o_sonar.evaluators.rag_hallucination_evaluator module

class h2o_sonar.evaluators.rag_hallucination_evaluator.RagHallucinationEvaluator

Bases: Evaluator

COL_ACTUAL_OUTPUT = 'actual_output'
COL_CONTEXT = 'context'
COL_EXPECTED_OUTPUT = 'expected_output'
COL_INPUT = 'input'
COL_MODEL = 'model'
COL_SCORE = 'score'
DEFAULT_METRIC_THRESHOLD = 0.5
METRIC_HALLUCINATION = 'hallucination'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.rag_ragas_evaluator module

class h2o_sonar.evaluators.rag_ragas_evaluator.RagasEvaluator

Bases: Evaluator

KEY_ANSWER = 'answer'
KEY_CONTEXTS = 'contexts'
KEY_GROUND_TRUTHS = 'ground_truths'
KEY_QUESTION = 'question'
METRIC_ANSWER_CORRECTNESS = 'answer_correctness'
METRIC_ANSWER_RELEVANCY = 'answer_relevancy'
METRIC_ANSWER_SIMILARITY = 'answer_similarity'
METRIC_CONTEXT_PRECISION = 'context_precision'
METRIC_CONTEXT_RECALL = 'context_recall'
METRIC_CONTEXT_RELEVANCY = 'context_relevancy'
METRIC_FAITHFULNESS = 'faithfulness'
METRIC_META_ANSWER_CORRECTNESS = <h2o_sonar.lib.api.commons.MetricMeta object>
METRIC_META_ANSWER_RELEVANCY = <h2o_sonar.lib.api.commons.MetricMeta object>
METRIC_META_ANSWER_SIMILARITY = <h2o_sonar.lib.api.commons.MetricMeta object>
METRIC_META_CONTEXT_PRECISION = <h2o_sonar.lib.api.commons.MetricMeta object>
METRIC_META_CONTEXT_RECALL = <h2o_sonar.lib.api.commons.MetricMeta object>
METRIC_META_CONTEXT_RELEVANCY = <h2o_sonar.lib.api.commons.MetricMeta object>
METRIC_META_FAITHFULNESS = <h2o_sonar.lib.api.commons.MetricMeta object>
METRIC_META_RAGAS = <h2o_sonar.lib.api.commons.MetricMeta object>
METRIC_RAGAS = 'ragas'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

eval_custom_metrics(llm_testset, metrics_threshold: float, save_llm_result: bool, custom_eval_judge_cfg_key: str, metrics_to_run: MetricsMeta, evaluator: Evaluator | None = None, nan_tolerance: float = 0.2) List
evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.
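
The KEY_* constants match the column names used by the Ragas library, which suggests this evaluator delegates the metric computation to it. A hedged sketch of standalone Ragas usage over those columns; the imports and evaluate() call follow older Ragas releases and may drift across versions:

# Hedged Ragas sketch; requires a configured judge LLM and may need
# adjusting for the installed Ragas version.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

data = Dataset.from_dict({
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["Paris is the capital and largest city of France."]],
    "ground_truths": [["Paris"]],
})
result = evaluate(data, metrics=[faithfulness, answer_relevancy])
print(result)  # per-metric scores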

h2o_sonar.evaluators.rag_tokens_presence_evaluator module

class h2o_sonar.evaluators.rag_tokens_presence_evaluator.ConditionEvaluator(c: str, logger)

Bases: object

Condition evaluator for the AIP-160 syntax subset.

evaluate(s: str, c_ast: List | None = None, failed_sub_conditions_as_str: bool = False) Tuple[bool, List]

Evaluate the condition.

Parameters:
s: str

The string to be evaluated.

c_ast: Optional[List]

Optional custom condition AST.

failed_sub_conditions_as_str: bool

If True, return the failed sub-conditions as strings, otherwise as ASTs.

Returns:
Tuple[bool, List]

The evaluation result and the list of failed sub-conditions.
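
A hedged usage sketch of ConditionEvaluator: construct it from an AIP-160-style condition string, then evaluate texts against it. The condition grammar shown (quoted terms, AND/NOT) is an illustrative guess at the supported subset:

# Hedged sketch; the exact AIP-160 subset accepted here is an assumption.
import logging
from h2o_sonar.evaluators.rag_tokens_presence_evaluator import ConditionEvaluator

logger = logging.getLogger(__name__)
condition = ConditionEvaluator('"refund" AND NOT "error"', logger)

ok, failed = condition.evaluate("Your refund was processed successfully.")
print(ok, failed)  # expected: True and an empty list of failed sub-conditions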

class h2o_sonar.evaluators.rag_tokens_presence_evaluator.RagStrStrEvaluator

Bases: Evaluator

DEFAULT_EVAL_RC = False
PARAM_EVAL_RC = 'evaluate_retrieved_context'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

static eval_tc_conditions(row: LlmDatasetRow, evaluator: Evaluator, evaluator_id: str, evaluator_display_name: str, eval_results: LlmEvalResults, key_2_evaluated_model: Dict, llm_host: LlmModelHostType, do_eval_rc: bool, logger)
evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.rag_tokens_presence_evaluator.constraints_to_condition(constraint: List | None) str

Convert constraints to a more powerful expression based on the AIP-160 syntax. The main motivation is KISS: use a single evaluator for all types of constraints.

Parameters:
constraint: Optional[List]

Constraints structure to be converted to the condition.

Returns:
str

The condition string.

h2o_sonar.evaluators.rouge_evaluator module

class h2o_sonar.evaluators.rouge_evaluator.RougeEvaluator

Bases: Evaluator

METRIC_ROUGE_1 = 'rouge_1'
METRIC_ROUGE_2 = 'rouge_2'
METRIC_ROUGE_L = 'rouge_l'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.
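
The metric constants map to ROUGE-1, ROUGE-2, and ROUGE-L. A self-contained sketch using the rouge-score package; the tooling choice is an assumption, not necessarily this evaluator's implementation:

# Illustrative ROUGE-1/2/L computation with the rouge-score package.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
scores = scorer.score(
    target="the cat sat on the mat",     # reference / expected output
    prediction="the cat is on the mat",  # model / actual output
)
for name, s in scores.items():
    print(f"{name}: f1={s.fmeasure:.3f}")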

h2o_sonar.evaluators.sensitive_data_leakage_evaluator module

class h2o_sonar.evaluators.sensitive_data_leakage_evaluator.SensitiveDataLeakageEvaluator

Bases: Evaluator

DEFAULT_EVAL_RC = True
PARAM_EVAL_RC = 'evaluate_retrieved_context'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

h2o_sonar.evaluators.sexism_byop_evaluator module

class h2o_sonar.evaluators.sexism_byop_evaluator.SexismByopEvaluator

Bases: AbcByopEvaluator

h2o_sonar.evaluators.stereotype_byop_evaluator module

class h2o_sonar.evaluators.stereotype_byop_evaluator.StereotypeByopEvaluator

Bases: AbcByopEvaluator

h2o_sonar.evaluators.summarization_byop_evaluator module

class h2o_sonar.evaluators.summarization_byop_evaluator.SummarizationByopEvaluator

Bases: AbcByopEvaluator

h2o_sonar.evaluators.summarization_evaluator module

class h2o_sonar.evaluators.summarization_evaluator.SummarizationEvaluator

Bases: Evaluator

KEY_COMPLETENESS = 'completeness'
KEY_FAITHFULNESS_CONV = 'faithfulness_conv'
KEY_FAITHFULNESS_ZS = 'faithfulness_zs'
calculate_scores(inputs: list[str], actual_outputs: list[str]) Tuple[Dict[str, float], Dict]
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.

static split_sentences(text: str) List[str]
summac_faith_score1(summary: str, refs: str) float

Calculate the SummaC convolution (SummaCConv) faithfulness score.

summac_faith_score2(summary: str, refs: str) float

Maximum SummaC/NLI score over individual sentences.

summary_completeness_batch(summaries: list[str], docs: list[str], nearest_neighbors: int = 10, umap_dimension: int = 5) Tuple[List | None, Dict]
h2o_sonar.evaluators.summarization_evaluator.load_summac()
h2o_sonar.evaluators.summarization_evaluator.pairwise_distances_wrapper(points)
h2o_sonar.evaluators.summarization_evaluator.segment_calc(distances: Any, n: int) float
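
KEY_FAITHFULNESS_ZS and KEY_FAITHFULNESS_CONV mirror the two SummaC variants (zero-shot and convolution) that summac_faith_score2() and summac_faith_score1() compute. A hedged sketch of standalone SummaC scoring; the constructor arguments follow the summac package's published examples and may drift across versions:

# Hedged SummaC sketch (zero-shot and convolution faithfulness variants).
from summac.model_summac import SummaCConv, SummaCZS

doc = "Paris is the capital and largest city of France."
summary = "Paris is France's capital."

model_zs = SummaCZS(granularity="sentence", model_name="vitc", device="cpu")
model_conv = SummaCConv(models=["vitc"], granularity="sentence", device="cpu",
                        start_file="default", agg="mean")

print("faithfulness_zs   =", model_zs.score([doc], [summary])["scores"][0])
print("faithfulness_conv =", model_conv.score([doc], [summary])["scores"][0])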

h2o_sonar.evaluators.toxicity_evaluator module

class h2o_sonar.evaluators.toxicity_evaluator.ToxicityEvaluator

Bases: Evaluator

DEFAULT_TOXICITY_METRIC_THRESHOLD = 0.25
METRIC_IDENTITY_ATTACK = 'identity_attack'
METRIC_INSULT = 'insult'
METRIC_OBSCENE = 'obscene'
METRIC_SEVERE_TOXICITY = 'severe_toxicity'
METRIC_THREAT = 'threat'
METRIC_TOXICITY = 'toxicity'
check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool

The explainer's parameter-based check verifying that the explainer will be able to explain the given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. The execution engine may, but does not have to, perform this check.

evaluate(llm_testset, explanations_types=None, **kwargs) List
get_result() LeaderboardResult
setup(model, persistence, **kwargs)

Set all the parameters needed to execute fit() and explain().

Parameters:
model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]

Explainable model with (fit and) score methods (or None if 3rd party).

models

(Explainable) models.

persistence: ExplainerPersistence

Persistence API allowing (controlled) saving and loading of explanations.

key: str

Optional (given) explainer run key (generated otherwise).

params: CommonInterpretationParams

Common explainer parameters specified on the explainer run.

explainer_params_as_str: Optional[str]

Explainer-specific parameters in string representation.

dataset_api: Optional[datasets.DatasetApi]

Dataset API to create custom explainable datasets needed by this explainer.

model_api: Optional[models.ModelApi]

Model API to create custom explainable models needed by this explainer.

logger: Optional[loggers.SonarLogger]

Logger.

explainer_params:

Other explainer runtime parameters, options, and configuration.
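
The six metric names match the outputs of the Detoxify library, which suggests the scores come from a Detoxify-style toxicity classifier. A hedged standalone sketch; the model variant is an assumption:

# Hedged sketch: Detoxify's 'original' model returns scores keyed by the
# metric names above (toxicity, severe_toxicity, obscene, threat, insult,
# identity_attack).
from detoxify import Detoxify

scores = Detoxify("original").predict("You are a wonderful person.")
threshold = 0.25  # cf. DEFAULT_TOXICITY_METRIC_THRESHOLD
flagged = {name: s for name, s in scores.items() if s > threshold}
print(scores)
print("flagged:", flagged or "none")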

Module contents