Toy Cancer
Overview
This is referred to as a “Toy Dataset” because it is small and uses made-up data. It is meant to show that the probability of someone having cancer increases if they smoke or have friends who smoke.
Target: cancer
The facts contain information on two labels: friends and smokes.
train_facts
friends(Alice, Bob).
friends(Alice, Fred).
friends(Chuck, Bob).
friends(Chuck, Fred).
friends(Dan, Bob).
friends(Earl, Bob).
friends(Bob, Alice).
friends(Fred, Alice).
friends(Bob, Chuck).
friends(Fred, Chuck).
friends(Bob, Dan).
friends(Bob, Earl).
smokes(Alice).
smokes(Chuck).
smokes(Bob).
train_pos
cancer(Alice).
cancer(Bob).
cancer(Chuck).
cancer(Fred).
train_neg
cancer(Dan).
cancer(Earl).
Download
Download: Toy-Cancer.zip (2.83 KB)
md5sum: fa15b64583f9b1abc7fd78b93025792d
sha256sum: 618d9283caa5459711b01d7b535aa1e91c8c98945ed4085248368a373ce880c2
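The checksums above can be compared against the downloaded archive before unzipping. The following is a minimal Python sketch, assuming the archive was saved as Toy-Cancer.zip in the current directory; the helper names `file_digest` and `verify` are made up for this illustration and are not part of BoostSRL.

```python
import hashlib
from pathlib import Path

# Checksums copied from the Download section above.
EXPECTED_MD5 = "fa15b64583f9b1abc7fd78b93025792d"
EXPECTED_SHA256 = "618d9283caa5459711b01d7b535aa1e91c8c98945ed4085248368a373ce880c2"


def file_digest(path, algorithm):
    """Return the hex digest of the file at `path` (hypothetical helper)."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks so the whole file never has to sit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path, md5=EXPECTED_MD5, sha256=EXPECTED_SHA256):
    """True only if both digests match the expected values."""
    return (file_digest(path, "md5") == md5
            and file_digest(path, "sha256") == sha256)


if __name__ == "__main__":
    archive = Path("Toy-Cancer.zip")  # assumed download location
    if archive.exists():
        print("checksums match" if verify(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found in the current directory")
```

If a mismatch is reported, downloading the archive again is usually the first thing to try.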
Setup
- After downloading, unzip Toy-Cancer.zip:
unzip Toy-Cancer.zip
- If you’re using a jar file, move it into the Toy-Cancer directory:
mv (BoostSRL jar file) Toy-Cancer/
mv (auc jar file) Toy-Cancer/
- Learning:
java -jar BoostSRL.jar -l -train train/ -target cancer -trees 10
- Inference:
java -jar BoostSRL.jar -i -test test/ -model train/models/ -target cancer -aucJarPath . -trees 10
Modes
useStdLogicVariables: true.
setParam: treeDepth=4.
setParam: nodeSize=2.
setParam: numOfClauses=8.
mode: friends(+Person, -Person).
mode: friends(-Person, +Person).
mode: smokes(+Person).
mode: cancer(+Person).
bridger: friends/2.
//precompute1:
num_of_smoking_friends(x, n) :-
friends(x, y), // grounding x first
countUniqueBindings((friends(x,z)^smokes(z)), n).
mode: num_of_smoking_friends(+Person, #Number).
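To make the num_of_smoking_friends rule concrete, here is a small Python sketch of what it appears to compute: for each person x who has at least one friend, the number of distinct friends z who smoke. This is only an illustration over the training facts listed earlier. It is not how BoostSRL evaluates precomputes, and the reading of countUniqueBindings as "count distinct bindings of z" is an assumption.

```python
# Facts copied from train_facts, train_pos and train_neg above.
FRIENDS = {
    ("Alice", "Bob"), ("Alice", "Fred"),
    ("Chuck", "Bob"), ("Chuck", "Fred"),
    ("Dan", "Bob"), ("Earl", "Bob"),
    ("Bob", "Alice"), ("Fred", "Alice"),
    ("Bob", "Chuck"), ("Fred", "Chuck"),
    ("Bob", "Dan"), ("Bob", "Earl"),
}
SMOKES = {"Alice", "Chuck", "Bob"}
POSITIVE = {"Alice", "Bob", "Chuck", "Fred"}
NEGATIVE = {"Dan", "Earl"}


def num_of_smoking_friends(person):
    """Count distinct z such that friends(person, z) and smokes(z) hold."""
    return len({z for (x, z) in FRIENDS if x == person and z in SMOKES})


# Like the friends(x, y) literal in the rule, only consider people who
# appear as the first argument of some friends fact.
people = sorted({x for (x, _) in FRIENDS})
counts = {p: num_of_smoking_friends(p) for p in people}

for p in people:
    smoker = str(p in SMOKES)
    label = "pos" if p in POSITIVE else "neg"
    print(f"{p:6s} smokes={smoker:5s} smoking_friends={counts[p]} cancer={label}")
```

On these facts, Bob and Fred each have two smoking friends and everyone else has one. The two negative examples, Dan and Earl, are the only people who neither smoke nor have two smoking friends, which matches the pattern the dataset is meant to show.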