Table 3 Preliminary experiment on RFM(m) models, clustering analysis, and decision algorithms in 16,440 valid datasets obtained after cleaning

From: Adherence predictor variables in AIDS patients: an empirical study using the data mining-based RFM model

| Model type | Clustering type | Number of models constructed | Model quality | Predictor variable importance (clustering analysis) | Predictor variable importance (C5.0) | Predictor variable importance (CHAID) | Predictor variable importance (CART) | Predictor variable importance (QUEST) | Prediction model accuracy (%) (C5.0⁴) | Prediction model accuracy (%) (CHAID) | Prediction model accuracy (%) (CART) | Prediction model accuracy (%) (QUEST) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RFM model¹ | K-means | 1st round | 0.8 | R = 1; M = 1; F = 1 | R = 0.9865; M = 0.0067; F = 0.0067 | R = 0.7402; M = 0.2547; F = 0.0052 | R = 0.9697; M = 0.0152; F = 0.0152 | R = 0.9689; M = 0.0156; F = 0.0156 | 99.96 | 93.55 | 99.77 | 97.07 |
| RFM model¹ | K-means | 2nd round | 0.9 | R = 1; M = 1; F = 1 | M = 1 | M = 0.7039; F = 0.2961 | _ | _ | 99.98 | 99.9 | _ | _ |
| RFM model¹ | K-means | 3rd round | 0.7 | R = 1; M = 1; F = 1 | M = 0.5; F = 0.5 | M = 1 | _ | _ | 99.98 | 99.93 | _ | _ |
| RFM model¹ | Two-step clustering | 1st round | 0.4 | R = 1; M = 1; F = 1 | _ | _ | _ | _ | _ | _ | _ | _ |
| RFM model¹ | Kohonen | 1st round | 0.4 | R = 1; M = 1; F = 1 | _ | _ | _ | _ | _ | _ | _ | _ |
| RFm model² | K-means³ | 1st round | 0.5 | R = 1; m = 1; F = 1 | R = 0.6735; m = 0.017; F = 0.3095 | R = 0.6661; m = 0.0429; F = 0.2910 | R = 0.7172; m = 0.0020; F = 0.2808 | R = 0.7479; m = 0.0017; F = 0.2503 | 99.88 | 90.38 | 97.43 | 95.86 |
| RFm model² | K-means³ | 2nd round | 0.8 | R = 1; m = 1; F = 1 | R = 0.5918; m = 0.4043; F = 0.0039 | R = 0.5938; m = 0.1768; F = 0.2293 | R = 0.5814; m = 0.4129; F = 0.0057 | R = 0.5962; m = 0.3999; F = 0.0039 | 99.96 | 96.66 | 98.06 | 98.33 |
| RFm model² | K-means³ | 3rd round | 0.6 | R = 1; m = 1; F = 0.37 | R = 0.7258; m = 0.0451; F = 0.1841 | R = 0.7708; m = 0.2728; F = 0.0014 | R = 0.971; m = 0.0268; F = 0.0022 | R = 0.6749; m = 0.3245; F = 0.0007 | 99.82 | 97.48 | 97.48 | 98.25 |
| RFm model² | Two-step clustering⁵ | 1st round | 0.7 | R = 1; m = 1; F = 1 | R = 0.6864; m = 0.2090; F = 0.1046 | R = 0.6607; m = 0.2962; F = 0.0431 | R = 0.6283; m = 0.2582; F = 0.1135 | R = 0.5797; m = 0.2624; F = 0.1579 | 99.87 | 96.9 | 98.41 | 97.32 |
| RFm model² | Two-step clustering⁵ | 2nd round | 0.7 | R = 1; m = 0.15; F = 0.01 | R = 0.7058; m = 0.2942 | R = 0.8428; m = 0.1572 | R = 0.7351; m = 0.2649 | R = 0.7398; m = 0.2602 | 98.61 | 95.45 | 98.04 | 97.7 |
| RFm model² | Two-step clustering⁵ | 3rd round | 0.4 | R = 1; m = 1; F = 1 | _ | _ | _ | _ | _ | _ | _ | _ |
| RFm model² | Kohonen | 1st round | 0.4 | R = 1; m = 1; F = 1 | _ | _ | _ | _ | _ | _ | _ | _ |
  1. R: Recency; F: Frequency; M: Monetary. Values lie in the 0–1 range
  2. ¹ The predictor variables of the RFM model were either unstable or could not be used for modeling in the clustering analysis and decision tree algorithms. M: total medical costs
  3. ² The predictor variables of the RFm model in the decision algorithm were stable. m: average medical costs per visit
  4. ³ The K-means clustering model was robust
  5. ⁴ The C5.0 algorithm prediction model had an accuracy of 99%
  6. ⁵ The accuracy of the two-step clustering in the C5.0 algorithm was lower than that of the K-means clustering model, and the quality of the model in the third round was low
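
The difference between M (total medical costs) and m (average medical costs per visit) in the notes above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `rfm_per_patient`, the tuple layout, the toy visit records, and the choice to measure recency as days since the last visit relative to a reference date are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def rfm_per_patient(visits, reference_date):
    """Aggregate visit records into R, F, M and m per patient.

    visits: iterable of (patient_id, visit_date, cost) tuples (hypothetical layout).
    R is taken here as days between the last visit and reference_date (an assumption).
    """
    last_visit = {}
    count = defaultdict(int)
    total = defaultdict(float)
    for pid, day, cost in visits:
        if pid not in last_visit or day > last_visit[pid]:
            last_visit[pid] = day
        count[pid] += 1
        total[pid] += cost
    return {
        pid: {
            "R": (reference_date - last_visit[pid]).days,  # recency
            "F": count[pid],                               # frequency: number of visits
            "M": total[pid],                               # total medical costs
            "m": total[pid] / count[pid],                  # average cost per visit
        }
        for pid in last_visit
    }


# Made-up toy records, not data from the study.
visits = [
    ("p1", date(2015, 1, 10), 120.0),
    ("p1", date(2015, 3, 5), 80.0),
    ("p2", date(2015, 2, 20), 300.0),
]
features = rfm_per_patient(visits, reference_date=date(2015, 3, 31))
```

In this toy input, patient `p1` has M = 200.0 but m = 100.0, so the two variables separate patients who accumulate cost over many visits from those with a single expensive visit.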