Table 3 Comparison with state-of-the-art DL-based methods (with a minimum test set size of 500) for a KOA severity assessment task

From: Accurate, automated classification of radiographic knee osteoarthritis severity using a novel method of deep learning: Plug-in modules

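| Year | Method | DL algorithm | Test set size | Accuracy for KLG 0 (%) | Accuracy for KLG 1 (%) | Accuracy for KLG 2 (%) | Accuracy for KLG 3 (%) | Accuracy for KLG 4 (%) | Average class accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|
| 2016 | Reference [24] | CNN | 2686 | 71 | 20 | 56 | 76 | 80 | 59.60 |
| 2017 | Reference [25] | CNN | 4400 | 86.9 | 6.0 | 60.2 | 73.0 | 78.1 | 62.29 |
| 2018 | Reference [26] | Deep Siamese CNN | 5960 | 78 | 45 | 52 | 70 | 88 | 66.70 |
| 2019 | Reference [27] | CNN ensemble | 1890 | Not available | Not available | Not available | Not available | Not available | 69.50 |
| 2019 | Reference [28] | CNN | 1495 | Not available | Not available | Not available | Not available | Not available | 64.3 |
| 2019 | Reference [5] | Modified CNN | 1385 | 89.8 | 55.6 | 82.6 | 36.0 | 100.0 | 74.3 |
| 2019 | Reference [10] | CNN ensemble | 5941 | 83.7 (KLG 0 and 1 combined) | combined with KLG 0 | 70.2 | 68.9 | 86.0 | 78.4 |
| 2020 | Reference [29] | CNN | 1175 | 79 | 52 | 58 | 59 | 85 | 66.0 |
| 2020 | Reference [3] | CNN ensemble | 7599 | 94 | 61 | 90 | 96 | 97 | 87.0 |
| 2020 | Reference [6] | CNN ensemble | 11,743 | 63.0 | 11.0 | 79.8 | 84.8 | 94.9 | 68.0 |
| 2020 | Reference [4] | CNN | 4090 | 86.5 | 27.0 | 66.8 | 80.9 | 85.8 | 71.0 |
| 2022 | Reference [7] | ResNet-50 + AlexNet + TL | 634 | 99.8 | 99.4 | 99.5 | 99.6 | 99.6 | 98.9 |
| 2024 | Ours | PIM ensemble | 17,040 | 85 | 46 | 69 | 82 | 93 | 75.7 |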

  1. The average class accuracy was highlighted in bold
  2. DL, deep learning; KLG, Kellgren–Lawrence grade; CNN, convolutional neural network; TL, transfer learning; PIM, plug-in module