Beta version: The platform is in a development stage ahead of the final version and may be less stable than usual. Access to and use of the platform may be limited. For example, the platform might crash, some features might not work properly, or some data might be lost.
The metrics are designed to measure a classification model's performance. The classification metrics are listed below:
Accuracy is the ratio of correctly predicted observations to the total number of observations. It indicates the percentage of correct predictions.
Precision is the ratio of correctly predicted observations of a specific class to the total observations predicted as that class. It measures how many of the positive predictions are actually correct.
Recall / Sensitivity / TPR is the ratio of correctly predicted observations of a specific class to the total actual observations of that class. It measures how many of the actual positives are captured by positive predictions.
FPR is the ratio of actual negatives predicted as positive to the total actual negatives. It is the percentage of negative cases that are classified as positive.
F1 score combines precision and recall into a single value; it is the harmonic mean of precision and recall.
AUROC is the area under the ROC curve, which is plotted between TPR and FPR and measures the ability of a model to distinguish between positive and negative classes. The higher the AUROC, the better the model can distinguish between the classes.
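As an illustration, the following is a minimal sketch of how these metrics could be computed with scikit-learn on a binary classification task; the labels, predictions, and scores below are hypothetical and only for demonstration.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Hypothetical ground-truth labels, hard predictions, and predicted
# probabilities for the positive class.
y_true  = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred  = [1, 0, 1, 0, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 score :", f1_score(y_true, y_pred))
print("AUROC    :", roc_auc_score(y_true, y_score))
```

Note that AUROC is computed from the continuous scores rather than the hard predictions, since the ROC curve is traced by sweeping a decision threshold over TPR and FPR.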
A confusion matrix (CM) represents the results from the model by reporting, for every class, how the predictions match the actual results. It is usually presented as a table or matrix. Normally, a confusion matrix is presented in three ways:
In some machine learning tasks, the target prediction is a sequence of text, such as a sentence. For example, automatic speech recognition (ASR) transcribes audio into a sentence, and optical character recognition (OCR) decodes text from an image. There are three types of errors to consider: insertions, deletions, and substitutions.
Typically, we use two metrics to evaluate the predicted sentence:
Word error rate (WER) calculates the error of a sentence by measuring how many words were deleted, inserted, or substituted relative to the ground truth. The lower the value, the better the model's performance. A WER of 0 is a perfect score.
Character error rate (CER) is similar to the word error rate; the difference is the unit used to calculate the error: WER counts word-level errors, while CER counts character-level errors. The lower the value, the better the model's performance. A CER of 0 is a perfect score.
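As a quick sketch, both metrics can be computed with the third-party jiwer package (an assumption here, not necessarily what the platform uses internally); the reference and hypothesis sentences are hypothetical.

```python
import jiwer

reference  = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog"

print("WER:", jiwer.wer(reference, hypothesis))  # word-level errors
print("CER:", jiwer.cer(reference, hypothesis))  # character-level errors
```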
The edit distance metric is a widely used method in OCR tasks. It calculates the distance in characters needed to change one sentence (prediction) into another (label). Let's examine the example provided below.
Label: Mr. Sawasdee Wanchan
Prediction: Mr Sawasdea Waanchan
In this example, the edit distance is 3, as there are three character-level differences: the period after “Mr” is missing in the prediction, the final “e” in “Sawasdee” is written as “a”, and an extra “a” appears in “Waanchan”.
In other words, it takes 3 character corrections to change the predicted sentence into the label (ground truth) sentence. Some people may use a normalized edit distance instead of the raw edit distance.
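A minimal sketch of the standard Levenshtein (edit distance) computation is shown below; it reproduces the example above, where the distance between the prediction and the label is 3.

```python
def edit_distance(prediction: str, label: str) -> int:
    """Minimum number of character insertions, deletions, and substitutions
    needed to turn `prediction` into `label`."""
    m, n = len(prediction), len(label)
    # dp[i][j] = edit distance between prediction[:i] and label[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if prediction[i - 1] == label[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]

print(edit_distance("Mr Sawasdea Waanchan", "Mr. Sawasdee Wanchan"))  # 3
```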
Diarization error rate (DER) quantifies the overall performance of a system in segmenting an audio recording into individual speaker segments. DER is calculated from the number of errors made by the system in terms of speaker boundaries, speaker identities, and speech/non-speech segmentation. Lower DER values indicate better performance, as they mean fewer errors were made in the diarization process.
$$\text{DER} = \frac{\text{false alarm} + \text{missed detection} + \text{speaker confusion}}{\text{total speech duration}}$$

where false alarm is the duration of non-speech incorrectly labeled as speech, missed detection is the duration of speech incorrectly labeled as non-speech, and speaker confusion is the duration of speech attributed to the wrong speaker.
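As a small sketch of the formula above, the helper below computes DER from the durations of each error type; the durations are hypothetical.

```python
def diarization_error_rate(false_alarm: float, missed: float,
                           confusion: float, total_speech: float) -> float:
    """DER as defined above; all durations are in seconds."""
    return (false_alarm + missed + confusion) / total_speech

# Hypothetical 60-second recording: 2 s of false alarm, 3 s of missed
# speech, and 1 s of speech attributed to the wrong speaker.
print(diarization_error_rate(2.0, 3.0, 1.0, 60.0))  # 0.1
```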
BLEU (BiLingual Evaluation Understudy) is a metric for evaluating a machine-translated sentence against a reference sentence. The scores range from 0 to 100. A score of 100 means a perfect match, while a score of 0 means a complete mismatch.
The steps to calculate the BLEU score are as follows:
$$\text{BLEU} = BP \cdot \exp\left(\sum_{n=1}^{N} w_{n} \log p_{n}\right)$$

where $p_{n}$ is the proportion of n-grams in the translated sentence that match the reference sentence, $w_{n}$ is a positive weight, and $BP$ is the brevity penalty defined below. In our baseline evaluation, we use $N = 4$ and uniform weights $w_{n} = \frac{1}{N}$.
$$BP = \begin{cases} 1 & \text{if } c > r \\ e^{\,1 - r/c} & \text{if } c \le r \end{cases}$$

where $c$ is the number of words in the translated sentence and $r$ is the number of words in the reference sentence.
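As a sketch, NLTK's sentence_bleu implements this calculation with the same default of N = 4 and uniform weights; the sentences below are hypothetical, and the score is multiplied by 100 to match the 0-100 scale used here.

```python
from nltk.translate.bleu_score import sentence_bleu

reference = "the quick brown fox jumps over the lazy dog".split()
candidate = "the quick brown fox jumped over the lazy dog".split()

# Default weights are uniform over 1- to 4-grams (N = 4, w_n = 1/4).
score = sentence_bleu([reference], candidate)
print(round(score * 100, 2))
```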
An n-gram is typically used in text or speech processing. It is a sequence of N consecutive words in a sentence.
For example, for the sentence “How high will the temperature climb?”, the n-grams would be:
| N-gram | Example |
| --- | --- |
| 1-gram (unigram) | “How”, “high”, “will”, “the”, “temperature”, “climb”, “?” |
| 2-gram (bigram) | “How high”, “high will”, “will the”, “the temperature”, “temperature climb”, “climb?” |
| 3-gram (trigram) | “How high will”, “high will the”, “will the temperature”, “the temperature climb”, “temperature climb?” |
| 4-gram | “How high will the”, “high will the temperature”, “will the temperature climb”, “the temperature climb?” |
Precision is a metric that measures how many words in the translated sentence also occur in the reference sentence.
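To tie the two ideas together, the sketch below generates n-grams for a hypothetical candidate and reference sentence and computes the n-gram precision for n = 1 to 4; the counts are clipped so a repeated candidate n-gram is credited at most as often as it appears in the reference (the modified precision used by BLEU).

```python
from collections import Counter

def ngrams(tokens, n):
    """All sequences of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

reference = "How high will the temperature climb ?".split()
candidate = "How high will temperature climb ?".split()

for n in range(1, 5):
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    # Clipped counts: each candidate n-gram is credited at most as many
    # times as it appears in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    precision = matched / max(sum(cand_counts.values()), 1)
    print(f"{n}-gram precision: {precision:.2f}")
```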
In this task, we predict continuous values. The terminology involved in regression is detailed in the table.
The evaluation metrics for a regression model are:
"Mean of absolute errors of expected value"
MAE is calculated using absolute of error, less sensitive to outliers which might be a better choice than MSE.
"Mean of squared errors of expected value"
Due to MSE being calculated using squared of error, large errors will be emphasized, which makes it more sensitive in case of an outlier. In other words, since an outlier causes a high squared of error, the MSE will be high even though other errors are small.
"Square root of MSE"
RMSE is used more commonly than MSE because sometimes MSE value is difficult to compare as it can be too big.
"the average absolute percent error for each period minus actual values divided by actual values"
the mean absolute error of the forecast values, divided by the mean absolute error of the in-sample one-step naive forecast.
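As an illustration, the sketch below computes these metrics with NumPy on hypothetical actual and forecast values; for simplicity, the MASE denominator uses the naive one-step forecast on the same series, whereas it is normally computed on the in-sample (training) data.

```python
import numpy as np

actual   = np.array([112.0, 118.0, 132.0, 129.0, 121.0])
forecast = np.array([110.0, 121.0, 130.0, 133.0, 119.0])

errors = forecast - actual
mae  = np.mean(np.abs(errors))
mse  = np.mean(errors ** 2)
rmse = np.sqrt(mse)
mape = np.mean(np.abs(errors / actual)) * 100  # in percent

# Naive one-step forecast: predict each value with the previous actual value.
naive_errors = actual[1:] - actual[:-1]
mase = mae / np.mean(np.abs(naive_errors))

print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  MAPE={mape:.2f}%  MASE={mase:.3f}")
```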