ISO/TR 27877:2021 Statistical analysis for evaluating the precision of binary measurement methods and their results

3 Terms and definitions, and symbols

3.1 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO 3534-1 and ISO 5725-1 and the following apply.

ISO and IEC maintain terminological databases for use in standardization at the following addresses:

3.1.1

accordance

probability that two binary measured values are identical when they are taken from the same laboratory

Note 1 to entry: The concept corresponds to the definition of “repeatability” in ISO 5725 and was originally proposed by Reference [3].

3.1.2

concordance

probability that two binary measured values are identical when they are taken from different laboratories

Note 1 to entry: The concept corresponds to the definition of “reproducibility” in ISO 5725 and was originally proposed by Reference [3].
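
The two definitions above can be illustrated by counting agreeing pairs directly. The following is a minimal sketch, not the procedure prescribed by this document: it assumes every laboratory performs the same number of repetitions, and the function names and the data are hypothetical.

```python
from itertools import combinations


def accordance(positives, n):
    """Mean, over laboratories, of the share of agreeing pairs drawn
    from the same laboratory (k positives out of n repetitions)."""
    per_lab = [
        (k * (k - 1) + (n - k) * (n - k - 1)) / (n * (n - 1))
        for k in positives
    ]
    return sum(per_lab) / len(per_lab)


def concordance(positives, n):
    """Share of agreeing pairs when the two values come from
    different laboratories."""
    agree = 0
    total = 0
    for a, b in combinations(positives, 2):
        # both positive, or both negative
        agree += a * b + (n - a) * (n - b)
        total += n * n
    return agree / total


# Hypothetical study: 4 laboratories, n = 5 repetitions each,
# listing the number of positive measured values per laboratory.
positives = [5, 4, 5, 2]
n = 5
acc = accordance(positives, n)
con = concordance(positives, n)
print(acc, con)
```

When all laboratories return identical results, both values equal 1. A concordance clearly below the accordance suggests that results differ more between laboratories than within them.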

3.1.3

ORDANOVA

statistical method for evaluating the precision of ordinal-scale measurement methods and their results based on an ordinal dispersion measure

Note 1 to entry: The concept was originally proposed by Reference [4].

3.1.4

true positive

correct measured value in positive results, that is, the case where both the measured and the correct results are positive

3.1.5

true negative

correct measured value in negative results, that is, the case where both the measured and the correct results are negative

3.1.6

false positive

incorrect measured value in positive results, that is, the case where the measured value is positive but the correct one is negative

3.1.7

false negative

incorrect measured value in negative results, that is, the case where the measured value is negative but the correct one is positive

3.1.8

confusion matrix

2×2 matrix showing the numbers of true positives, true negatives, false positives and false negatives
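
As an illustration of entries 3.1.4 to 3.1.8, the four counts can be tallied from paired lists of measured and correct results. This is a minimal sketch under the assumption that positive is coded as 1 and negative as 0; the function name and the data are hypothetical.

```python
def confusion_matrix(measured, correct):
    """Count true/false positives and negatives for 0/1 coded results."""
    if len(measured) != len(correct):
        raise ValueError("measured and correct must have the same length")
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for m, c in zip(measured, correct):
        if m == 1 and c == 1:
            counts["TP"] += 1   # measured positive, correct positive
        elif m == 0 and c == 1:
            counts["FN"] += 1   # measured negative, correct positive
        elif m == 1 and c == 0:
            counts["FP"] += 1   # measured positive, correct negative
        elif m == 0 and c == 0:
            counts["TN"] += 1   # measured negative, correct negative
        else:
            raise ValueError("results must be coded as 0 or 1")
    return counts


# Hypothetical data: eight items measured once each.
measured = [1, 1, 0, 0, 1, 0, 1, 0]
correct = [1, 0, 0, 1, 1, 0, 1, 0]
cm = confusion_matrix(measured, correct)
print(cm)
```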

3.1.9

CM-accuracy

statistic for indicating the capability of two-class classifications, defined by the percentage of correct measured values

Note 1 to entry: The term CM-accuracy is not in general use. It denotes the same statistic as the term accuracy in the machine learning field. This document uses CM-accuracy to keep that statistic distinct from the term accuracy in ISO 5725. The prefix CM stands for confusion matrix, because CM-accuracy can be calculated from a confusion matrix.

3.1.10

sensitivity

statistic for indicating the capability of two-class classifications, defined by the percentage of positive measured values among the cases whose correct result is positive

3.1.11

specificity

statistic for indicating the capability of two-class classifications, defined by the percentage of negative measured values among the cases whose correct result is negative

3.1.12

CM-precision

statistic for indicating the capability of two-class classifications, defined by the percentage of correct measured values in positive measured values

Note 1 to entry: The term CM-precision is not in general use. It denotes the same statistic as the term precision in the machine learning field. This document uses CM-precision to keep that statistic distinct from the term precision in ISO 5725. The prefix CM stands for confusion matrix, because CM-precision can be calculated from a confusion matrix.

3.1.13

F-measure

statistic for indicating the capability of two-class classifications, defined by the harmonic mean between sensitivity and CM-precision
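
The statistics in entries 3.1.9 to 3.1.13 can be sketched from the four counts of a confusion matrix. The function name and the counts below are hypothetical, the values are returned as fractions (multiply by 100 for percentages), and the sketch assumes that no denominator is zero.

```python
def cm_statistics(tp, fn, fp, tn):
    """Confusion-matrix statistics as fractions in [0, 1]."""
    total = tp + fn + fp + tn
    cm_accuracy = (tp + tn) / total     # correct among all measured values
    sensitivity = tp / (tp + fn)        # measured positive among correct positives
    specificity = tn / (tn + fp)        # measured negative among correct negatives
    cm_precision = tp / (tp + fp)       # correct among positive measured values
    # harmonic mean of sensitivity and CM-precision
    f_measure = 2 * sensitivity * cm_precision / (sensitivity + cm_precision)
    return {
        "cm_accuracy": cm_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "cm_precision": cm_precision,
        "f_measure": f_measure,
    }


# Hypothetical counts for 100 measured values.
stats = cm_statistics(tp=40, fn=10, fp=5, tn=45)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```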

3.1.14

kappa coefficient

statistic for indicating the capability of two-class classifications, defined by the ratio of CM-accuracy minus the possibility of the correctness occurring by chance to one minus the possibility of the correctness occurring by chance

Note 1 to entry: The kappa coefficient is an extended statistic of CM-accuracy, originally introduced by Reference [15], which takes into account the possibility of the correctness occurring by chance.
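
The ratio in entry 3.1.14 can be sketched as follows. The chance term is computed in the usual way for Cohen's kappa, from the row and column totals of the confusion matrix. The function name and the counts are hypothetical.

```python
def kappa_coefficient(tp, fn, fp, tn):
    """Cohen's kappa for a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    observed = (tp + tn) / total  # CM-accuracy
    # Agreement expected by chance: product of the marginal shares,
    # summed over the positive and the negative class.
    chance = (
        (tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)
    ) / (total * total)
    return (observed - chance) / (1 - chance)


# Hypothetical counts for 100 measured values.
kappa = kappa_coefficient(tp=40, fn=10, fp=5, tn=45)
print(kappa)
```

With these counts the CM-accuracy is 0.85 and the chance agreement is 0.5, so kappa is (0.85 - 0.5) / (1 - 0.5) = 0.7.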

3.2 Symbols

	number of laboratories participating in a collaborative study
	number of repetitions in each laboratory participating in a collaborative study
	suffix describing a laboratory, and
	suffix describing a repetition, and
	measured value of repetition of laboratory
	sum of the measured values of laboratory for the case where , that is,
	arithmetic mean of of laboratory , that is,
	overall arithmetic mean of , that is,
	number of positive measured values of laboratory
	sum of with respect to , that is,
	-element of a confusion matrix
	general mean (expectation) in the basic model of ISO 5725-2
	laboratory component of bias under repeatability conditions of laboratory in the basic model of ISO 5725-2
	random error occurring in every measurement under repeatability conditions in the basic model of ISO 5725-2
	within-laboratory variance of laboratory
	repeatability variance or within-laboratory variance
	between-laboratory variance
	reproducibility variance, that is,
	within-laboratory variance of laboratory in the ISO 5725-based method
	repeatability variance in the ISO 5725-based method
	between-laboratory variance in the ISO 5725-based method
	reproducibility variance in the ISO 5725-based method, that is,
	ordinal dispersion measure proposed by Reference [14] for the case of binary data assumed to follow a binomial distribution
	within-laboratory variance of laboratory in ORDANOVA
	repeatability variance in ORDANOVA
	between-laboratory variance in ORDANOVA
	reproducibility variance in ORDANOVA, that is,
	probability of obtaining a measured value of laboratory
	arithmetic mean of , that is,
	null hypothesis of a statistical test
	test statistic of a chi-squared test
	accordance of Reference [3] method
	concordance of Reference [3] method
	estimate of
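
To show how the variance entries for the ISO 5725-based method in the list above relate to each other, the following sketch applies the usual one-way analysis of variance of ISO 5725-2 to 0/1 coded data with an equal number of repetitions per laboratory. It is an illustration under stated assumptions, not the procedure prescribed by this document: the variable names (`s_r2`, `s_L2`, `s_R2`) and the data are hypothetical, and a negative between-laboratory estimate is set to zero.

```python
def iso5725_binary_variances(positives, n):
    """Repeatability, between-laboratory and reproducibility variance
    estimates for 0/1 data, given the positive count per laboratory."""
    num_labs = len(positives)
    p = [k / n for k in positives]           # arithmetic mean per laboratory
    p_bar = sum(p) / num_labs                # overall arithmetic mean
    # Within-laboratory sample variance of 0/1 values: n/(n-1) * p(1-p)
    s_i2 = [n / (n - 1) * pi * (1 - pi) for pi in p]
    s_r2 = sum(s_i2) / num_labs              # repeatability variance
    var_means = sum((pi - p_bar) ** 2 for pi in p) / (num_labs - 1)
    s_L2 = max(0.0, var_means - s_r2 / n)    # between-laboratory variance
    s_R2 = s_r2 + s_L2                       # reproducibility variance
    return s_r2, s_L2, s_R2


# Hypothetical study: 4 laboratories, n = 5 repetitions each.
s_r2, s_L2, s_R2 = iso5725_binary_variances([5, 4, 5, 2], n=5)
print(s_r2, s_L2, s_R2)
```

The reproducibility variance is the sum of the repeatability variance and the between-laboratory variance, which is the relation the "that is" entries in the list refer to.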

Bibliography

[1]	ISO 16140 (all parts), Microbiology of the food chain - Method validation
[2]	ISO/TS 27878, Reproducibility of the LOD of binary methods in collaborative and in-house validation studies
[3]	Langton S.D., Chevennement R., Nagelkerke N., Lombard B., Analysing collaborative trials for qualitative microbiological methods: accordance and concordance, Int. J. Food Microbiol. 79 (2002) 175–181. https://doi.org/10.1016/s0168-1605(02)00107-1 .
[4]	Gadrich T., Bashkansky E., ORDANOVA: Analysis of ordinal variation, J. Stat. Plan. Inference. 142 (2012) 3174–3188. https://doi.org/10.1016/j.jspi.2012.06.004 .
[5]	Ashikaga T., Yoshida Y., Hirota M., Yoneyama K., Itagaki H., Sakaguchi H., Miyazawa M., Ito Y., Suzuki H., Toyoda H., Development of an in vitro skin sensitization test using human cell lines: The human Cell Line Activation Test (h-CLAT), Toxicol. In Vitro. 20 (2006) 767–773. https://doi.org/10.1016/j.tiv.2005.10.012 .
[6]	Sakaguchi H., Ashikaga T., Miyazawa M., Yoshida Y., Ito Y., Yoneyama K., Hirota M., Itagaki H., Toyoda H., Suzuki H., Development of an in vitro skin sensitization test using human cell lines; human Cell Line Activation Test (h-CLAT) II. An inter-laboratory study of the h-CLAT, Toxicol. In Vitro. 20 (2006) 774–784. https://doi.org/10.1016/j.tiv.2005.10.014 .
[7]	Sakaguchi H., Ryan C., Ovigne J.-M., Schroeder K.R., Ashikaga T., Predicting skin sensitization potential and inter-laboratory reproducibility of a human Cell Line Activation Test (h-CLAT) in the European Cosmetics Association (COLIPA) ring trials, Toxicol. In Vitro. 24 (2010) 1810–1820. https://doi.org/10.1016/j.tiv.2010.05.012 .
[8]	Driscoll K.E., Costa D.L., Hatch G., Henderson R., Oberdorster G., Salem H., Schlesinger R.B., Intratracheal Instillation as an Exposure Technique for the Evaluation of Respiratory Tract Toxicity: Uses and Limitations, Toxicol. Sci. 55 (2000) 24–35. https://doi.org/10.1093/toxsci/55.1.24 .
[9]	AIST, Annual Report on the Project, “Survey on standardization of intratracheal administration study for nanomaterials and related issues” (2017), (2018). https://www.meti.go.jp/meti_lib/report/H29FY/000102.pdf (accessed March 19, 2020).
[10]	Ghandur-Mnaymneh L., Raub W.A., Sridhar K.S., Albores-Saavedra J., Gould E., Duncan R.C., The accuracy of the histological classification of lung carcinoma and its reproducibility: A study of 75 archival cases of adenosquamous carcinoma, Cancer Invest. 11 (1993) 641–651. https://doi.org/10.3109/07357909309046936 .
[11]	Ashikaga T., Sakaguchi H., Sono S., Kosaka N., Ishikawa M., Nukada Y., Miyazawa M., Ito Y., Nishiyama N., Itagaki H., A Comparative Evaluation of In Vitro Skin Sensitisation Tests: The Human Cell-line Activation Test (h-CLAT) versus the Local Lymph Node Assay (LLNA), Altern. Lab. Anim. 38 (2010) 275–284. https://doi.org/10.1177/026119291003800403 .
[12]	Takeshita J., Nakayama H., Kitsunai Y., Tanabe M., Oki H., Sasaki T., Yoshinari K., Discriminative models using molecular descriptors for predicting increased serum ALT levels in repeated-dose toxicity studies of rats, Comput. Toxicol. 6 (2018) 64–70. https://doi.org/10.1016/j.comtox.2017.05.002 .
[13]	Wilrich P.-Th., The determination of precision of qualitative measurement methods by interlaboratory experiments, Accreditation Qual. Assur. 15 (2010) 439–444. https://doi.org/10.1007/s00769-010-0661-1 .
[14]	Blair J., Lacy M.G., Statistics of ordinal variation, Sociol. Methods Res. 28 (2000) 251–280.
[15]	Cohen J., A coefficient of agreement for nominal scales, Educ. Psychol. Meas. 20 (1960) 37–46. https://doi.org/10.1177/001316446002000104 .
[16]	Suzuki T., Tsutsumi Y., Kawamura H., Viewpoints to characterize precision evaluation methods in binary measurements, Measurement. 46 (2013) 3710–3714. https://doi.org/10.1016/j.measurement.2013.05.032 .
