Weka --- Association Rule Mining

Published: 2020-12-14 03:03:36 | Category: Big Data | Source: compiled from the web


Algorithm parameter settings:

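1. car: if set to true, class association rules are mined instead of general association rules.
2. classIndex: index of the class attribute. If set to -1, the last attribute is used as the class attribute.
3. delta: the step by which the minimum support is reduced in each iteration. Support keeps decreasing until it reaches the lower bound or the required number of rules has been generated.
4. lowerBoundMinSupport: lower bound for the minimum support.
5. metricType: the metric used to rank the rules. It can be confidence (class association rules can only be mined with confidence), lift, leverage, or conviction. Besides confidence, Weka provides these similar metrics to measure how strongly the two sides of a rule are associated:
   a) Lift: P(A,B) / (P(A)P(B)). Lift = 1 means A and B are independent. The larger the value (> 1), the less likely it is that A and B appear in the same basket by chance, and the stronger the association.
   b) Leverage: P(A,B) - P(A)P(B). Leverage = 0 means A and B are independent; the larger the leverage, the closer the relationship between A and B.
   c) Conviction: P(A)P(!B) / P(A,!B), where !B means that B does not occur. Conviction also measures how far A and B are from independence. Its relationship to lift (negate B, substitute into the lift formula, then take the reciprocal) shows that the larger the value, the stronger the association between A and B.
6. minMetric: minimum value of the chosen metric.
7. numRules: number of rules to find.
8. outputItemSets: if set to true, the itemsets are included in the output.
9. removeAllMissingCols: remove columns in which every value is missing.
10. significanceLevel: significance level for the significance test (confidence metric only).
11. upperBoundMinSupport: upper bound for the minimum support. The iterative reduction of the minimum support starts from this value.
12. verbose: if set to true, the algorithm runs in verbose mode.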
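The metric formulas above can be checked by hand on a small example. The following is a minimal sketch in Python, not Weka's implementation: the function name `rule_metrics` and the counts in the usage lines are hypothetical, and treating a rule that is never violated as having infinite conviction is an illustrative convention. Weka's own code may differ in details such as smoothing.

```python
def rule_metrics(n, count_a, count_b, count_ab):
    """Metrics for a rule A ==> B, computed from raw instance counts.

    n        -- total number of instances
    count_a  -- instances containing A
    count_b  -- instances containing B
    count_ab -- instances containing both A and B
    """
    p_a = count_a / n
    p_b = count_b / n
    p_ab = count_ab / n

    confidence = count_ab / count_a       # P(B | A)
    lift = p_ab / (p_a * p_b)             # 1 when A and B are independent
    leverage = p_ab - p_a * p_b           # 0 when A and B are independent

    # Conviction = P(A)P(!B) / P(A,!B). When the rule is never violated,
    # P(A,!B) is 0; this sketch reports infinity in that case (assumption).
    count_a_not_b = count_a - count_ab
    if count_a_not_b == 0:
        conviction = float("inf")
    else:
        conviction = p_a * (1 - p_b) / (count_a_not_b / n)

    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Hypothetical counts: 1000 baskets, A in 200, B in 250, both in 100.
metrics = rule_metrics(1000, 200, 250, 100)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical counts A and B occur together twice as often as independence would predict, so lift comes out above 1, leverage above 0, and conviction above 1.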

=== Run information ===    % run information

Scheme:       weka.associations.Apriori -I -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.1 -S -1.0 -c -1
% Parameter settings of the algorithm: -I -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.1 -S -1.0 -c -1
% The options mean, in order:
%   -I       output the itemsets (the flag is omitted when outputItemSets is false)
%   -N 10    number of rules is 10
%   -T 0     metric type is confidence (1 = lift, 2 = leverage, 3 = conviction)
%   -C 0.9   minimum value of the metric is 0.9
%   -D 0.05  delta for each iteration is 0.05
%   -U 1.0   upper bound for the minimum support is 1.0
%   -M 0.1   lower bound for the minimum support is 0.1
%   -S -1.0  significance level is -1.0
%   -c -1    class index is -1
% (car, removeAllMissingCols and verbose were left at their default value false, so they do not appear in the scheme; when set to true they appear as -A, -R and -V respectively)
Relation:     mushroom    % name of the data set
Instances:    8124        % number of instances
Attributes:   23          % number of attributes, followed by their names
              cap-shape
              cap-surface
              cap-color
              bruises?
              odor
              gill-attachment
              gill-spacing
              gill-size
              gill-color
              stalk-shape
              stalk-root
              stalk-surface-above-ring
              stalk-surface-below-ring
              stalk-color-above-ring
              stalk-color-below-ring
              veil-type
              veil-color
              ring-number
              ring-type
              spore-print-color
              population
              habitat
              class

=== Associator model (full training set) ===

Apriori
=======

Minimum support: 0.95 (7718 instances)    % minimum support is 0.95, i.e. at least 7718 instances
Minimum metric <confidence>: 0.9          % minimum metric <confidence>: 0.9
Number of cycles performed: 1             % one search cycle was performed

Generated sets of large itemsets:         % the frequent itemsets that were generated

Size of set of large itemsets L(1): 3     % frequent 1-itemsets: 3

Large Itemsets L(1):                      % frequent 1-itemsets (listed because outputItemSets is true)
gill-attachment=f 7914
veil-type=p 8124
veil-color=w 7924

Size of set of large itemsets L(2): 3

Large Itemsets L(2):                      % frequent 2-itemsets
gill-attachment=f veil-type=p 7914
gill-attachment=f veil-color=w 7906
veil-type=p veil-color=w 7924

Size of set of large itemsets L(3): 1

Large Itemsets L(3):                      % frequent 3-itemsets
gill-attachment=f veil-type=p veil-color=w 7906

Best rules found:                         % the best rules

 1. veil-color=w 7924 ==> veil-type=p 7924    conf:(1)
 2. gill-attachment=f 7914 ==> veil-type=p 7914    conf:(1)
 3. gill-attachment=f veil-color=w 7906 ==> veil-type=p 7906    conf:(1)
 4. gill-attachment=f 7914 ==> veil-color=w 7906    conf:(1)
 5. gill-attachment=f veil-type=p 7914 ==> veil-color=w 7906    conf:(1)
 6. gill-attachment=f 7914 ==> veil-type=p veil-color=w 7906    conf:(1)
 7. veil-color=w 7924 ==> gill-attachment=f 7906    conf:(1)
 8. veil-type=p veil-color=w 7924 ==> gill-attachment=f 7906    conf:(1)
 9. veil-color=w 7924 ==> gill-attachment=f veil-type=p 7906    conf:(1)
10. veil-type=p 8124 ==> veil-color=w 7924    conf:(0.98)
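The numbers in this output can be reproduced from the itemset counts alone. The sketch below is illustrative Python, not Weka code: the helper names `min_support_count` and `confidence` are hypothetical, and two things are assumptions chosen because they agree with the reported figures, namely rounding the instance threshold up and taking the first cycle's support as the upper bound minus delta (1.0 - 0.05 = 0.95).

```python
import math

N_INSTANCES = 8124  # "Instances" from the run information

# Counts copied from the "Large Itemsets" sections of the output.
counts = {
    frozenset({"gill-attachment=f"}): 7914,
    frozenset({"veil-type=p"}): 8124,
    frozenset({"veil-color=w"}): 7924,
    frozenset({"gill-attachment=f", "veil-type=p"}): 7914,
    frozenset({"gill-attachment=f", "veil-color=w"}): 7906,
    frozenset({"veil-type=p", "veil-color=w"}): 7924,
    frozenset({"gill-attachment=f", "veil-type=p", "veil-color=w"}): 7906,
}


def min_support_count(fraction, n=N_INSTANCES):
    """Smallest instance count that reaches the given support fraction."""
    return math.ceil(fraction * n)


def confidence(premise, consequence):
    """count(premise and consequence) / count(premise)."""
    premise = frozenset(premise)
    both = premise | frozenset(consequence)
    return counts[both] / counts[premise]


# Assumed first-cycle threshold: upper bound minus delta.
first_cycle_support = 1.0 - 0.05
print(min_support_count(first_cycle_support))

# Rule 10: veil-type=p ==> veil-color=w
print(round(confidence({"veil-type=p"}, {"veil-color=w"}), 2))

# Rule 4: gill-attachment=f ==> veil-color=w (unrounded value)
print(confidence({"gill-attachment=f"}, {"veil-color=w"}))
```

Printing the unrounded value for rule 4 shows that its confidence is slightly below 1 (7906 / 7914), so conf:(1) in the listing is a rounded figure for rules 4 to 9, while rules 1 to 3 hold for every instance that matches their premise.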

(Editor: Li Datong)
