h2o_sonar.evaluators package
Submodules
h2o_sonar.evaluators.abc_byop_evaluator module
- class h2o_sonar.evaluators.abc_byop_evaluator.AbcByopEvaluator
Bases: ABC, Evaluator
Abstract base class for Bring Your Own Prompt (BYOP) evaluators.
- class Classes(failure, success)
Bases: tuple
- failure
Alias for field number 0
- success
Alias for field number 1
- IDENTIFIER_ACTUAL_OUTPUT = '{ACTUAL_OUTPUT}'
- IDENTIFIER_CONTEXT = '{CONTEXT}'
- IDENTIFIER_EXPECTED_OUTPUT = '{EXPECTED_OUTPUT}'
- IDENTIFIER_INPUT = '{INPUT}'
- KEY_ANSWER: str = 'answer'
- KEY_ERROR: str = 'error'
- KEY_PARSED_ANSWER: str = 'parsed_answer'
- KEY_PROMPT: str = 'prompt'
- PARAM_JUDGE_HOST: str = 'judge_host'
- PARAM_JUDGE_MODEL: str = 'judge_model'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, **kwargs) → List
- get_result() → LeaderboardResult
- property judge
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
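The IDENTIFIER_* constants above are placeholder tokens that a BYOP prompt template can embed. A minimal sketch of how such a template might be rendered against one test-set row — the template text, helper name, and row keys here are illustrative assumptions, not actual h2o_sonar code:

```python
# Hypothetical sketch: filling the documented BYOP placeholder identifiers
# with values from a single test-set row.

BYOP_TEMPLATE = (
    "Given the question: {INPUT}\n"
    "The reference answer: {EXPECTED_OUTPUT}\n"
    "The retrieved context: {CONTEXT}\n"
    "Does the following answer pass? Answer PASS or FAIL.\n"
    "Answer: {ACTUAL_OUTPUT}"
)

def render_byop_prompt(template: str, row: dict) -> str:
    """Substitute the BYOP placeholders with values from one row."""
    return (
        template.replace("{INPUT}", row.get("input", ""))
        .replace("{EXPECTED_OUTPUT}", row.get("expected_output", ""))
        .replace("{CONTEXT}", " ".join(row.get("context", [])))
        .replace("{ACTUAL_OUTPUT}", row.get("actual_output", ""))
    )

prompt = render_byop_prompt(
    BYOP_TEMPLATE,
    {
        "input": "What is H2O?",
        "expected_output": "Water.",
        "context": ["H2O is the chemical formula of water."],
        "actual_output": "It is water.",
    },
)
```

The rendered prompt would then be sent to the judge model (see PARAM_JUDGE_HOST / PARAM_JUDGE_MODEL), and the judge's reply parsed against the Classes(failure, success) pair.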
h2o_sonar.evaluators.bleu_evaluator module
- class h2o_sonar.evaluators.bleu_evaluator.BleuEvaluator
Bases: Evaluator
- METRIC_BLEU_1 = 'bleu_1'
- METRIC_BLEU_2 = 'bleu_2'
- METRIC_BLEU_3 = 'bleu_3'
- METRIC_BLEU_4 = 'bleu_4'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
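The four METRIC_BLEU_* keys correspond to BLEU scores computed over 1- to 4-grams. A simplified, self-contained sketch of the underlying computation (clipped n-gram precision with a brevity penalty); the evaluator itself may use a different implementation:

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu_n(candidate: str, reference: str, max_n: int = 4) -> float:
    """BLEU-N: geometric mean of clipped 1..N-gram precisions times a
    brevity penalty. Illustrative single-reference variant."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        precisions.append(overlap / total if total else 0.0)
    if min(precisions) == 0.0:
        return 0.0
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

scores = {f"bleu_{n}": bleu_n("the cat sat on the mat",
                              "the cat sat on the mat", n)
          for n in range(1, 5)}
# Identical sentences score 1.0 for every bleu_1..bleu_4.
```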
h2o_sonar.evaluators.classification_evaluator module
- class h2o_sonar.evaluators.classification_evaluator.ClassificationEvaluator
Bases: Evaluator
- METRIC_FN = 'fn'
- METRIC_FP = 'fp'
- METRIC_TN = 'tn'
- METRIC_TP = 'tp'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
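The four metric keys ('tp', 'fp', 'tn', 'fn') are the cells of a binary confusion matrix. A small illustrative helper showing how such counts are accumulated — the function name and the 'PASS'/'FAIL' labels are assumptions, not the evaluator's actual API:

```python
def confusion_counts(predicted: list, expected: list,
                     positive: str = "PASS") -> dict:
    """Count true/false positives and negatives for a binary evaluation."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for pred, exp in zip(predicted, expected):
        if pred == positive:
            # Predicted positive: true positive if expected agrees.
            counts["tp" if exp == positive else "fp"] += 1
        else:
            # Predicted negative: false negative if expected was positive.
            counts["fn" if exp == positive else "tn"] += 1
    return counts

counts = confusion_counts(
    predicted=["PASS", "PASS", "FAIL", "FAIL"],
    expected=["PASS", "FAIL", "PASS", "FAIL"],
)
# counts == {"tp": 1, "fp": 1, "tn": 1, "fn": 1}
```

Derived metrics such as precision (tp / (tp + fp)) and recall (tp / (tp + fn)) follow directly from these counts.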
h2o_sonar.evaluators.contact_information_byop_evaluator module
- class h2o_sonar.evaluators.contact_information_byop_evaluator.ContactInformationByopEvaluator
Bases: AbcByopEvaluator
h2o_sonar.evaluators.fairness_bias_evaluator module
- class h2o_sonar.evaluators.fairness_bias_evaluator.FairnessBiasEvaluator
Bases: Evaluator
- METRIC_FAIRNESS_BIAS = 'fairness_bias'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.gptscore_evaluator module
- class h2o_sonar.evaluators.gptscore_evaluator.GptScoreEvaluator
Bases: ABC, Evaluator
- DEFAULT_METRIC_THRESHOLD = inf
- PARAM_EVAL_GPT_SCORE_MODEL = 'gpt_score_model'
- add_problem_for_row(severity: ProblemSeverity, message: str, row: LlmDatasetRow)
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.gptscore_machine_translation_evaluator module
- class h2o_sonar.evaluators.gptscore_machine_translation_evaluator.GptScoreMachineTranslationEvaluator
Bases: GptScoreEvaluator
- METRIC_ACCURACY = 'accuracy'
- METRIC_FLUENCY = 'fluency'
- METRIC_MULTI_QUAL_METRICS = 'multidimensional quality metrics'
h2o_sonar.evaluators.gptscore_question_answering_evaluator module
- class h2o_sonar.evaluators.gptscore_question_answering_evaluator.GptScoreQuestionAnsweringEvaluator
Bases: GptScoreEvaluator
- METRIC_CORRECTNESS = 'correctness'
- METRIC_ENGAGEMENT = 'engagement'
- METRIC_FLUENCY = 'fluency'
- METRIC_INTEREST = 'interest'
- METRIC_RELEVANCE = 'relevance'
- METRIC_SEMANTICALLY_APPROPRIATE = 'semantically appropriate'
- METRIC_SPECIFIC = 'specific'
- METRIC_UNDERSTANDABILITY = 'understandability'
h2o_sonar.evaluators.gptscore_summary_without_reference_evaluator module
- class h2o_sonar.evaluators.gptscore_summary_without_reference_evaluator.GptScoreSummaryWithoutReferenceEvaluator
Bases: GptScoreEvaluator
- METRIC_COHERENCE = 'coherence'
- METRIC_CONSISTENCY = 'consistency'
- METRIC_FACTUALITY = 'factuality'
- METRIC_FLUENCY = 'fluency'
- METRIC_INFORMATIVENESS = 'informativeness'
- METRIC_RELEVANCE = 'relevance'
- METRIC_SEMANTIC_COVERAGE = 'semantic coverage'
h2o_sonar.evaluators.gptscore_summary_with_reference_evaluator module
- class h2o_sonar.evaluators.gptscore_summary_with_reference_evaluator.GptScoreSummaryWithReferenceEvaluator
Bases: GptScoreEvaluator
- METRIC_COHERENCE = 'coherence'
- METRIC_FACTUALITY = 'factuality'
- METRIC_FLUENCY = 'fluency'
- METRIC_INFORMATIVENESS = 'informativeness'
- METRIC_RELEVANCE = 'relevance'
- METRIC_SEMANTIC_COVERAGE = 'semantic coverage'
h2o_sonar.evaluators.language_mismatch_byop_evaluator module
- class h2o_sonar.evaluators.language_mismatch_byop_evaluator.LanguageMismatchByopEvaluator
Bases: AbcByopEvaluator
h2o_sonar.evaluators.parameterizable_byop_evaluator module
- class h2o_sonar.evaluators.parameterizable_byop_evaluator.ParameterizableByopEvaluator
Bases: AbcByopEvaluator
- PROMPT_TEMPLATE_PARAM: str = 'prompt_template'
- check_compatibility(params: CommonInterpretationParams | None = None, model: ExplainableModel | None = None, **explainer_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
h2o_sonar.evaluators.perplexity_evaluator module
- class h2o_sonar.evaluators.perplexity_evaluator.PerplexityEvaluator
Bases: Evaluator
- METRIC_PERPLEXITY = 'perplexity'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
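Perplexity is the exponential of the average negative log-likelihood per token. A minimal sketch, assuming per-token log-probabilities are available (the evaluator's actual inputs and tokenization may differ):

```python
import math

def perplexity(token_logprobs: list) -> float:
    """exp of the mean negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model assigning uniform probability 1/4 to every token has
# perplexity exactly 4, regardless of sequence length.
logps = [math.log(0.25)] * 10
assert abs(perplexity(logps) - 4.0) < 1e-9
```

Lower perplexity means the model found the text more predictable; the 'perplexity' metric key above reports this value per evaluated output.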
h2o_sonar.evaluators.pii_leakage_evaluator module
- class h2o_sonar.evaluators.pii_leakage_evaluator.PiiLeakageEvaluator
Bases: Evaluator
- DEFAULT_EVAL_RC = True
- PARAM_EVAL_RC = 'evaluate_retrieved_context'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- static check_creditcard_leakage(checked_txt: str, failed_constraints: List, fragments: List) → Tuple[List, List]
- static check_email_leakage(checked_txt: str, failed_constraints: List, fragments: List) → Tuple[List, List]
Check email leakage.
- Returns:
- Tuple[List, List]
The updated failed constraints and the list of leaked emails found (an empty list otherwise).
- static check_ssn_leakage(checked_txt: str, failed_constraints: List, fragments: List) → Tuple[List, List]
- evaluate(llm_testset, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
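The static check_*_leakage helpers scan generated text for PII patterns and return the accumulated failure lists. A hypothetical regex-based sketch of the e-mail variant, matching the documented signature but not the actual h2o_sonar implementation:

```python
import re
from typing import List, Tuple

# Simple e-mail pattern; real PII detectors are typically more elaborate.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def check_email_leakage(checked_txt: str,
                        failed_constraints: List,
                        fragments: List) -> Tuple[List, List]:
    """Append any e-mail addresses found in the text to the failure lists."""
    leaked = EMAIL_RE.findall(checked_txt)
    if leaked:
        failed_constraints.append("email leakage")
        fragments.extend(leaked)
    return failed_constraints, fragments

constraints, frags = check_email_leakage(
    "Contact jane.doe@example.com for details.", [], [])
```

With PARAM_EVAL_RC enabled (the default, per DEFAULT_EVAL_RC = True), the same checks would also run over the retrieved context, not just the model's answer.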
h2o_sonar.evaluators.rag_answer_correctness_evaluator module
- class h2o_sonar.evaluators.rag_answer_correctness_evaluator.AnswerCorrectnessEvaluator
Bases: Evaluator
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.rag_answer_relevancy_evaluator module
- class h2o_sonar.evaluators.rag_answer_relevancy_evaluator.AnswerRelevancyEvaluator
Bases: Evaluator
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.rag_answer_relevancy_no_judge_evaluator module
- class h2o_sonar.evaluators.rag_answer_relevancy_no_judge_evaluator.RagAnswerRelevancyNoJudgeEvaluator
Bases: Evaluator
- COL_ACTUAL_OUTPUT = 'actual_output'
- COL_CONTEXT = 'context'
- COL_EXPECTED_OUTPUT = 'expected_output'
- COL_INPUT = 'input'
- COL_MODEL = 'model'
- COL_SCORE = 'score'
- METRIC_ANSWER_RELEVANCY = 'answer_relevancy'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
- static split_sentences(text: str) → List[str]
Split the text into sentences.
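A naive version of split_sentences can be sketched with a punctuation-based regex; this is illustrative only, and the evaluator may use a more robust splitter:

```python
import re
from typing import List

def split_sentences(text: str) -> List[str]:
    """Split text on terminal punctuation (., !, ?) followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

sentences = split_sentences("H2O is water. It boils at 100 C! Is that right?")
# -> ['H2O is water.', 'It boils at 100 C!', 'Is that right?']
```

Sentence-level splitting like this lets a no-judge relevancy evaluator score each answer sentence against the input independently, e.g. via embedding similarity, without calling an LLM judge.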
h2o_sonar.evaluators.rag_answer_similarity_evaluator module
- class h2o_sonar.evaluators.rag_answer_similarity_evaluator.AnswerSemanticSimilarityEvaluator
Bases: Evaluator
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.rag_chunk_relevancy_evaluator module
- class h2o_sonar.evaluators.rag_chunk_relevancy_evaluator.ContextChunkRelevancyEvaluator
Bases: Evaluator
- COL_ACTUAL_OUTPUT = 'actual_output'
- COL_CONTEXT = 'context'
- COL_EXPECTED_OUTPUT = 'expected_output'
- COL_INPUT = 'input'
- COL_MODEL = 'model'
- COL_SCORE = 'score'
- METRIC_PRECISION_RELEVANCY = 'precision_relevancy'
- METRIC_RECALL_RELEVANCY = 'recall_relevancy'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
- static split_sentences(text: str) → List[str]
Split the text into sentences.
h2o_sonar.evaluators.rag_context_precision_evaluator module
- class h2o_sonar.evaluators.rag_context_precision_evaluator.ContextPrecisionEvaluator
Bases: Evaluator
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.rag_context_recall_evaluator module
- class h2o_sonar.evaluators.rag_context_recall_evaluator.ContextRecallEvaluator
Bases: Evaluator
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.rag_context_relevancy_evaluator module
- class h2o_sonar.evaluators.rag_context_relevancy_evaluator.ContextRelevancyEvaluator
Bases: Evaluator
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.rag_faithfulness_evaluator module
- class h2o_sonar.evaluators.rag_faithfulness_evaluator.FaithfulnessEvaluator
Bases: Evaluator
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) → bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) → List
- get_result() → LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- models
(Explainable) models.
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.rag_groundedness_evaluator module
- class h2o_sonar.evaluators.rag_groundedness_evaluator.RagGroundednessEvaluator
Bases:
Evaluator
- COL_ACTUAL_OUTPUT = 'actual_output'
- COL_CONTEXT = 'context'
- COL_EXPECTED_OUTPUT = 'expected_output'
- COL_INPUT = 'input'
- COL_MODEL = 'model'
- COL_SCORE = 'score'
- METRIC_GROUNDEDNESS = 'groundedness'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool
Explainer’s check (based on parameters) verifying that explainer will be able to explain a given model. If this compatibility check returns
False
or raises error, then it will not be run by the engine. This check may, but does not have to be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) List
- get_result() LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
- static split_sentences(text: str) List[str]
Split the text into sentences.
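A naive version of such a sentence splitter might look like this (a sketch only; the library's actual implementation is not shown in this reference and may differ):

```python
import re

def split_sentences(text: str) -> list:
    """Split text into sentences on ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]
```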
h2o_sonar.evaluators.rag_hallucination_evaluator module
- class h2o_sonar.evaluators.rag_hallucination_evaluator.RagHallucinationEvaluator
Bases:
Evaluator
- COL_ACTUAL_OUTPUT = 'actual_output'
- COL_CONTEXT = 'context'
- COL_EXPECTED_OUTPUT = 'expected_output'
- COL_INPUT = 'input'
- COL_MODEL = 'model'
- COL_SCORE = 'score'
- DEFAULT_METRIC_THRESHOLD = 0.5
- METRIC_HALLUCINATION = 'hallucination'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) List
- get_result() LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.rag_ragas_evaluator module
- class h2o_sonar.evaluators.rag_ragas_evaluator.RagasEvaluator
Bases:
Evaluator
- KEY_ANSWER = 'answer'
- KEY_CONTEXTS = 'contexts'
- KEY_GROUND_TRUTHS = 'ground_truths'
- KEY_QUESTION = 'question'
- METRIC_ANSWER_CORRECTNESS = 'answer_correctness'
- METRIC_ANSWER_RELEVANCY = 'answer_relevancy'
- METRIC_ANSWER_SIMILARITY = 'answer_similarity'
- METRIC_CONTEXT_PRECISION = 'context_precision'
- METRIC_CONTEXT_RECALL = 'context_recall'
- METRIC_CONTEXT_RELEVANCY = 'context_relevancy'
- METRIC_FAITHFULNESS = 'faithfulness'
- METRIC_META_ANSWER_CORRECTNESS = <h2o_sonar.lib.api.commons.MetricMeta object>
- METRIC_META_ANSWER_RELEVANCY = <h2o_sonar.lib.api.commons.MetricMeta object>
- METRIC_META_ANSWER_SIMILARITY = <h2o_sonar.lib.api.commons.MetricMeta object>
- METRIC_META_CONTEXT_PRECISION = <h2o_sonar.lib.api.commons.MetricMeta object>
- METRIC_META_CONTEXT_RECALL = <h2o_sonar.lib.api.commons.MetricMeta object>
- METRIC_META_CONTEXT_RELEVANCY = <h2o_sonar.lib.api.commons.MetricMeta object>
- METRIC_META_FAITHFULNESS = <h2o_sonar.lib.api.commons.MetricMeta object>
- METRIC_META_RAGAS = <h2o_sonar.lib.api.commons.MetricMeta object>
- METRIC_RAGAS = 'ragas'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- eval_custom_metrics(llm_testset, metrics_threshold: float, save_llm_result: bool, custom_eval_judge_cfg_key: str, metrics_to_run: MetricsMeta, evaluator: Evaluator | None = None, nan_tolerance: float = 0.2) List
- evaluate(llm_testset, explanations_types=None, **kwargs) List
- get_result() LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
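The KEY_* constants on RagasEvaluator imply the record layout that ragas-style metrics consume: each test case carries a question, the retrieved contexts, the ground truths, and the model answer. A sketch of one such record under that assumption (values invented):

```python
# Field names match the RagasEvaluator KEY_* constants; the
# surrounding evaluator wiring and data loading are assumed.
record = {
    "question": "What is the capital of France?",
    "contexts": ["Paris is the capital and largest city of France."],
    "ground_truths": ["Paris"],
    "answer": "Paris",
}
```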
h2o_sonar.evaluators.rag_tokens_presence_evaluator module
- class h2o_sonar.evaluators.rag_tokens_presence_evaluator.ConditionEvaluator(c: str, logger)
Bases:
object
Condition evaluator for the AIP-160 syntax subset.
- evaluate(s: str, c_ast: List | None = None, failed_sub_conditions_as_str: bool = False) Tuple[bool, List]
Evaluate the condition.
- Parameters:
- s: str
The string to be evaluated.
- c_ast: Optional[List]
Optional custom condition AST.
- failed_sub_conditions_as_str: bool
If True, return the failed sub-conditions as strings, otherwise as ASTs.
- Returns:
- Tuple[bool, List]
The evaluation result and the list of failed sub-conditions.
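The evaluator's contract above — a pass/fail verdict plus the list of failed sub-conditions — can be illustrated with a simplified stand-in that checks substring conditions. The real class parses an AIP-160 syntax subset; this sketch does not.

```python
def evaluate_all(s: str, sub_conditions):
    """Return (passed, failed) where failed lists the sub-conditions
    that do not occur in s; passed is True only if none failed."""
    failed = [condition for condition in sub_conditions if condition not in s]
    return (len(failed) == 0, failed)
```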
- class h2o_sonar.evaluators.rag_tokens_presence_evaluator.RagStrStrEvaluator
Bases:
Evaluator
- DEFAULT_EVAL_RC = False
- PARAM_EVAL_RC = 'evaluate_retrieved_context'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- static eval_tc_conditions(row: LlmDatasetRow, evaluator: Evaluator, evaluator_id: str, evaluator_display_name: str, eval_results: LlmEvalResults, key_2_evaluated_model: Dict, llm_host: LlmModelHostType, do_eval_rc: bool, logger)
- evaluate(llm_testset, **kwargs) List
- get_result() LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
- h2o_sonar.evaluators.rag_tokens_presence_evaluator.constraints_to_condition(constraint: List | None) str
Convert constraints to a more powerful AIP-160 syntax-based expression. The main motivation is KISS: use one evaluator for all types of constraints.
- Parameters:
- constraint: Optional[List]
Constraints structure to be converted to the condition.
- Returns:
- str
The condition string.
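A minimal sketch of the conversion direction described above, assuming simple token constraints joined with AND; the actual grammar the function emits is not documented here, so treat the output format as hypothetical.

```python
def tokens_to_condition(constraint):
    """Join token constraints into one AND-combined condition string.
    Illustrative stand-in, not the library's constraints_to_condition."""
    if not constraint:
        return ""
    return " AND ".join(f'"{token}"' for token in constraint)
```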
h2o_sonar.evaluators.rouge_evaluator module
- class h2o_sonar.evaluators.rouge_evaluator.RougeEvaluator
Bases:
Evaluator
- METRIC_ROUGE_1 = 'rouge_1'
- METRIC_ROUGE_2 = 'rouge_2'
- METRIC_ROUGE_L = 'rouge_l'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, **kwargs) List
- get_result() LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
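The rouge_1, rouge_2, and rouge_l metrics above are n-gram overlap scores between a reference and a candidate text. A self-contained sketch of ROUGE-1 recall follows; unlike typical library implementations it does no stemming or stopword handling, so it only illustrates the metric's idea.

```python
def rouge_1_recall(reference: str, candidate: str) -> float:
    """Fraction of reference unigrams that also appear in the candidate."""
    ref_tokens = reference.lower().split()
    cand_tokens = set(candidate.lower().split())
    if not ref_tokens:
        return 0.0
    return sum(1 for tok in ref_tokens if tok in cand_tokens) / len(ref_tokens)
```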
h2o_sonar.evaluators.sensitive_data_leakage_evaluator module
- class h2o_sonar.evaluators.sensitive_data_leakage_evaluator.SensitiveDataLeakageEvaluator
Bases:
Evaluator
- DEFAULT_EVAL_RC = True
- PARAM_EVAL_RC = 'evaluate_retrieved_context'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, **kwargs) List
- get_result() LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
h2o_sonar.evaluators.sexism_byop_evaluator module
- class h2o_sonar.evaluators.sexism_byop_evaluator.SexismByopEvaluator
Bases:
AbcByopEvaluator
h2o_sonar.evaluators.stereotype_byop_evaluator module
- class h2o_sonar.evaluators.stereotype_byop_evaluator.StereotypeByopEvaluator
Bases:
AbcByopEvaluator
h2o_sonar.evaluators.summarization_byop_evaluator module
- class h2o_sonar.evaluators.summarization_byop_evaluator.SummarizationByopEvaluator
Bases:
AbcByopEvaluator
h2o_sonar.evaluators.summarization_evaluator module
- class h2o_sonar.evaluators.summarization_evaluator.SummarizationEvaluator
Bases:
Evaluator
- KEY_COMPLETENESS = 'completeness'
- KEY_FAITHFULNESS_CONV = 'faithfulness_conv'
- KEY_FAITHFULNESS_ZS = 'faithfulness_zs'
- calculate_scores(inputs: list[str], actual_outputs: list[str]) Tuple[Dict[str, float], Dict]
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, **kwargs) List
- get_result() LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
- static split_sentences(text: str) List[str]
- summac_faith_score1(summary: str, refs: str) float
Calculate the faithfulness score using the SummaC convolution model.
- summac_faith_score2(summary: str, refs: str) float
Maximum SummaC/NLI score over individual sentences.
- summary_completeness_batch(summaries: list[str], docs: list[str], nearest_neighbors: int = 10, umap_dimension: int = 5) Tuple[List | None, Dict]
- h2o_sonar.evaluators.summarization_evaluator.load_summac()
- h2o_sonar.evaluators.summarization_evaluator.pairwise_distances_wrapper(points)
- h2o_sonar.evaluators.summarization_evaluator.segment_calc(distances: Any, n: int) float
h2o_sonar.evaluators.toxicity_evaluator module
- class h2o_sonar.evaluators.toxicity_evaluator.ToxicityEvaluator
Bases:
Evaluator
- DEFAULT_TOXICITY_METRIC_THRESHOLD = 0.25
- METRIC_IDENTITY_ATTACK = 'identity_attack'
- METRIC_INSULT = 'insult'
- METRIC_OBSCENE = 'obscene'
- METRIC_SEVERE_TOXICITY = 'severe_toxicity'
- METRIC_THREAT = 'threat'
- METRIC_TOXICITY = 'toxicity'
- check_compatibility(params: CommonInterpretationParams | None = None, **evaluator_params) bool
Explainer’s check (based on parameters) verifying that the explainer will be able to explain a given model. If this compatibility check returns False or raises an error, the explainer will not be run by the engine. This check may, but does not have to, be performed by the execution engine.
- evaluate(llm_testset, explanations_types=None, **kwargs) List
- get_result() LeaderboardResult
- setup(model, persistence, **kwargs)
Set all the parameters needed to execute fit() and explain().
- Parameters:
- model: Optional[Union[models.ExplainableModel, models.ExplainableModelHandle]]
Explainable model with (fit and) score methods (or None if 3rd party).
- persistence: ExplainerPersistence
Persistence API allowing (controlled) saving and loading of explanations.
- key: str
Optional (given) explainer run key (generated otherwise).
- params: CommonInterpretationParams
Common explainer parameters specified on the explainer run.
- explainer_params_as_str: Optional[str]
Explainer-specific parameters in string representation.
- dataset_api: Optional[datasets.DatasetApi]
Dataset API to create custom explainable datasets needed by this explainer.
- model_api: Optional[models.ModelApi]
Model API to create custom explainable models needed by this explainer.
- logger: Optional[loggers.SonarLogger]
Logger.
- explainer_params:
Other explainer RUNTIME parameters, options, and configuration.
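A sketch of how DEFAULT_TOXICITY_METRIC_THRESHOLD could gate per-metric scores: a response fails a metric when its score exceeds the threshold. The scores below are invented; only the metric names and the 0.25 threshold come from the constants above.

```python
THRESHOLD = 0.25  # DEFAULT_TOXICITY_METRIC_THRESHOLD

# Hypothetical per-metric scores for one model response.
scores = {"toxicity": 0.10, "insult": 0.40, "threat": 0.02}

# Collect the metrics whose score exceeds the threshold.
failed_metrics = [name for name, score in scores.items() if score > THRESHOLD]
```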