File Structure

Basic File Structure

Files that BoostSRL operates on are stored in a folder with three things:

background.txt : Modes
train/ folder :
- train_bk.txt : Pointer to the background file.
- train_facts.txt : Facts
- train_pos.txt : Positive examples
- train_neg.txt : Negative examples
test/ folder :
- test_bk.txt : Pointer to the background file.
- test_facts.txt : Facts
- test_pos.txt : Positive examples
- test_neg.txt : Negative examples
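Before running anything, it can help to confirm that a data folder actually matches this layout. The following is a minimal sketch, not part of BoostSRL: the helper names `expected_files` and `missing_files` are hypothetical, and the `background` parameter is an assumption that allows for a differently named background file.

```python
from pathlib import Path

# File-name suffixes taken from the layout listed above.
SPLIT_SUFFIXES = ("bk", "facts", "pos", "neg")


def expected_files(root, background="background.txt"):
    """Return the paths the basic layout is expected to contain (hypothetical helper)."""
    root = Path(root)
    paths = [root / background]
    for split in ("train", "test"):
        for suffix in SPLIT_SUFFIXES:
            paths.append(root / split / f"{split}_{suffix}.txt")
    return paths


def missing_files(root, background="background.txt"):
    """Return the expected files that do not exist under root."""
    return [p for p in expected_files(root, background) if not p.is_file()]
```

Calling `missing_files("Data")` returns an empty list when all nine files are present.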

Example:

File structure for the Cora dataset. Note that the background file is called “cora_bk.txt” in this example.

This is fine as long as train_bk.txt and test_bk.txt both point to it correctly with: import: "../cora_bk.txt".
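To check that a pointer file really leads to the background file, the import line can be read back and resolved relative to the pointer's own folder. This is a sketch under assumptions: `resolve_background` is a hypothetical helper, and the regular expression only assumes the `import: "..."` form shown above, not the full syntax BoostSRL accepts.

```python
import re
from pathlib import Path

# Matches the quoted path in a line such as: import: "../cora_bk.txt"
IMPORT_PATTERN = re.compile(r'import:\s*"([^"]+)"')


def resolve_background(bk_file):
    """Return the resolved path a *_bk.txt pointer file imports, or None if no import is found."""
    bk_file = Path(bk_file)
    match = IMPORT_PATTERN.search(bk_file.read_text())
    if match is None:
        return None
    # The import path is interpreted relative to the folder holding the pointer file.
    return (bk_file.parent / match.group(1)).resolve()
```

If `resolve_background("train/train_bk.txt")` and `resolve_background("test/test_bk.txt")` return the same existing file, both pointers agree.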

FAQ

What does it mean to have positive and negative examples during testing? Isn’t the point of testing that I do not know these labels ahead of time?

Think of it from a classic machine learning perspective. We divide the data into training and test sets, learn from the training set, hide the labels on the test set, and try to predict what is hidden. The positive and negative examples used during testing are treated as hidden, and the goal is to predict which are positive and which are negative based on the learned model and the facts.

If you want to use the model to perform inference on data you do not know the labels for, add your data to either test_pos.txt or test_neg.txt. The regression values in results_(target).db can be roughly interpreted as “What is the probability of this example being true/false?”, respectively.

Advanced File Structure

After training/testing, more files and folders will appear. This advanced guide explains what each of them is, including the contents of the models, dotFiles, bRDNs, and WILLtheories directories.

Not all of these will necessarily appear. For example, CombinedTrees(target).dot only appears when the -combine flag is set.

Data/
├── background.txt
├── BoostSRL.jar
├── test
│   ├── query_(target).db
│   ├── results_(target).db
│   ├── test_bk.txt
│   ├── test_facts.txt
│   ├── test_infer_dribble.txt
│   ├── test_neg.txt
│   └── test_pos.txt
└── train
   ├── models
   │   ├── bRDNs
   │   │   ├── dotFiles
   │   │   │   ├── CombinedTrees(target).dot
   │   │   │   ├── rdn.dot
   │   │   │   ├── WILLTreeFor_(target)0.dot
   │   │   │   ├── ...
   │   │   │   └── WILLTreeFor_(target)9.dot
   │   │   ├── (target).model
   │   │   ├── (target)_testsetStats_pos_neg_Lits1Trees10Skew2.txt
   │   │   ├── old_(target).model
   │   │   ├── predictions_pos_neg_Lits1Trees10Skew2.csv
   │   │   └── Trees
   │   │       ├── CombinedTreesTreeFile(target).tree
   │   │       ├── (target)Tree0.tree
   │   │       ├── ...
   │   │       └── (target)Tree9.tree
   │   └── WILLtheories
   │       ├── (target)_learnedWILLregressionTrees.txt
   │       └── old_(target)_learnedWILLregressionTrees.txt
   ├── schema.db
   ├── train_bk.txt
   ├── train_facts.txt
   ├── train_gleaner.txt
   ├── train_learn_dribble.txt
   ├── train_neg.txt
   └── train_pos.txt

Data/: Directory that contains the data we train/test on.
background.txt: modes file used to guide the search space.
BoostSRL.jar: If you’re using a jar file, you’ll usually keep it at the root of the data directory.
test/: Directory containing testing data.
query_(target).db:
results_(target).db: Results of running inference (testing) on the data.
test_bk.txt: Pointer to the background.txt
test_facts.txt: Predicates described in the background.txt
test_infer_dribble.txt:
test_neg.txt: Negative testing examples.
test_pos.txt: Positive testing examples.
train/: Directory containing training data.
models/:
bRDNs/:
dotFiles/:
CombinedTrees(target).dot: Combined Tree of the target, results from using the -combine flag.
rdn.dot:
WILLTreeFor_(target)0.dot: First of the boosted trees.
...: For each tree, there will be an associated file named WILLTreeFor_(target)#.dot
WILLTreeFor_(target)9.dot: Last of the boosted trees. Its index is one less than the number of trees learned.
(target).model:
(target)_testsetStats_pos_neg_Lits1Trees10Skew2.txt:
old_(target).model:
predictions_pos_neg_Lits1Trees10Skew2.csv:
Trees/:
CombinedTreesTreeFile(target).tree:
(target)Tree0.tree:
...: For each tree, there will be an associated file named (target)Tree#.tree
(target)Tree9.tree:
WILLtheories/:
(target)_learnedWILLregressionTrees.txt:
old_(target)_learnedWILLregressionTrees.txt:
schema.db:
train_bk.txt: Pointer to the background.txt
train_facts.txt: Predicates described in the background.txt
train_gleaner.txt:
train_learn_dribble.txt:
train_neg.txt: Negative training examples.
train_pos.txt: Positive training examples.
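Because the boosted trees are numbered from 0, the number of trees learned can be recovered from the file names in dotFiles/. The sketch below is illustrative only: `boosted_tree_indices` is a hypothetical helper, and it assumes the `WILLTreeFor_(target)#.dot` naming shown above with `(target)` replaced by the literal target name.

```python
import re
from pathlib import Path


def boosted_tree_indices(dot_dir, target):
    """Return the sorted indices N of files named WILLTreeFor_<target>N.dot in dot_dir."""
    pattern = re.compile(rf"WILLTreeFor_{re.escape(target)}(\d+)\.dot")
    indices = []
    for path in Path(dot_dir).iterdir():
        # fullmatch skips rdn.dot and CombinedTrees<target>.dot.
        match = pattern.fullmatch(path.name)
        if match:
            indices.append(int(match.group(1)))
    return sorted(indices)
```

For a run that learned ten trees, the returned list runs from 0 to 9, so the number of trees is the last index plus one.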
