python == 3.8
- torch == 1.5
- hydra-core == 1.0.6
- tensorboard == 2.4.1
- matplotlib == 3.4.1
- scikit-learn == 0.24.1
- transformers == 3.4.0
- jieba == 0.42.1
- deepke
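The pins above can be checked against the active environment before training. The following is a minimal sketch using only the standard library. The helper names (`installed_version`, `matches_pin`, `check_environment`) are hypothetical and not part of DeepKE, and the comparison is a simplified version match that ignores local build tags such as `+cu101`.

```python
from importlib import metadata

# Pins copied from the list above; "deepke" has no pinned version there.
PINNED = {
    "torch": "1.5",
    "hydra-core": "1.0.6",
    "tensorboard": "2.4.1",
    "matplotlib": "3.4.1",
    "scikit-learn": "0.24.1",
    "transformers": "3.4.0",
    "jieba": "0.42.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pinned):
    """True if `installed` equals `pinned` after zero-padding (1.5 == 1.5.0)."""
    if installed is None:
        return False
    base = installed.split("+", 1)[0]  # drop local build tags such as +cu101
    a, b = base.split("."), pinned.split(".")
    width = max(len(a), len(b))
    a += ["0"] * (width - len(a))
    b += ["0"] * (width - len(b))
    return a == b


def check_environment(pins=PINNED, lookup=installed_version):
    """Map each missing or mismatched package to the version that was found."""
    problems = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if not matches_pin(found, pinned):
            problems[name] = found
    return problems


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

Running the script prints one line per package that is missing or at a different version, and nothing if the environment matches the pins.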
git clone https://github.com/zjunlp/DeepKE.git
cd DeepKE/example/re/standard
- Create and enter the Python virtual environment.
- Install dependencies:
  pip install -r requirements.txt
- Dataset
  - Download the dataset to this directory.
    wget 120.27.214.45/Data/re/standard/data.tar.gz
    tar -xzvf data.tar.gz
  - Three data formats are supported: json, xlsx, and csv. The dataset is stored in data/origin:
    - train.csv: Training set
    - valid.csv: Validation set
    - test.csv: Test set
    - relation.csv: Relation labels
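After extracting the archive, it can help to confirm that the four files listed above are in place before starting a run. This is a small sketch, not part of DeepKE. The function name `missing_dataset_files` is hypothetical, and the check assumes the csv variant of the dataset and that the script is run from the example directory.

```python
from pathlib import Path

# File names taken from the dataset description; the directory is relative
# to the example folder (DeepKE/example/re/standard).
EXPECTED_FILES = ("train.csv", "valid.csv", "test.csv", "relation.csv")


def missing_dataset_files(origin_dir="data/origin", expected=EXPECTED_FILES):
    """Return the expected file names that are not present in origin_dir."""
    origin = Path(origin_dir)
    return [name for name in expected if not (origin / name).is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files()
    if missing:
        print("Missing from data/origin:", ", ".join(missing))
    else:
        print("All expected dataset files are present.")
```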
- Training
  - Parameters for training are in the conf folder; users can modify them before training.
  - If using a language model (LM), set lm_file to point to the local model.
  - Logs for training are in the log folder and the trained model is saved in the checkpoints folder.
  - This task supports multi-GPU training. In train.yaml, set use_multi_gpu to True and set gpu_ids to the selected GPUs. The first GPU is the main device for computation and requires slightly more memory.
  - Set show_plot to True to visualize the loss of the current epoch. The default value is False.
  - Start training:
    python run.py
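To illustrate how the multi-GPU settings relate to each other, here is a sketch of one way such values could be interpreted. It is not DeepKE's implementation. The function `select_gpus` is hypothetical, and the accepted formats for gpu_ids (a comma-separated string, a single int, or a list) are assumptions.

```python
def select_gpus(use_multi_gpu, gpu_ids):
    """Return (main_gpu, gpus_in_use) for the given settings.

    gpu_ids may be a comma-separated string such as "0,1", a single int,
    or a list of ints. The exact format expected by the training config
    is an assumption in this sketch.
    """
    if isinstance(gpu_ids, str):
        ids = [int(part) for part in gpu_ids.split(",") if part.strip()]
    elif isinstance(gpu_ids, int):
        ids = [gpu_ids]
    else:
        ids = [int(i) for i in gpu_ids]
    if not ids:
        raise ValueError("gpu_ids must name at least one GPU")
    main = ids[0]  # the first listed GPU acts as the main device
    # Without multi-GPU training, only the main device is used.
    return (main, ids) if use_multi_gpu else (main, [main])
```

For example, `select_gpus(True, "0,1")` selects GPU 0 as the main device and uses both GPUs, while `select_gpus(False, "0,1")` uses GPU 0 alone.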
- Prediction
  - Set fp in predict.yaml to the path of the trained model checkpoint to use for prediction. An absolute path is required, such as xxx/checkpoints/2019-12-03_17-35-30/cnn_epoch21.pth.
  - Run prediction:
    python predict.py
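Since an absolute path is required, a short helper can locate the most recently written checkpoint and print its absolute path for pasting into predict.yaml. This is a sketch under assumptions: the function `latest_checkpoint` is hypothetical, and it assumes checkpoints are .pth files somewhere under a checkpoints directory, as the example path suggests.

```python
from pathlib import Path


def latest_checkpoint(checkpoints_dir="checkpoints", pattern="*.pth"):
    """Return the absolute path of the newest matching file, or None."""
    root = Path(checkpoints_dir)
    if not root.is_dir():
        return None
    # Search all run subdirectories for checkpoint files.
    candidates = [p for p in root.rglob(pattern) if p.is_file()]
    if not candidates:
        return None
    # "Newest" here means the most recent modification time.
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    return str(newest.resolve())


if __name__ == "__main__":
    print(latest_checkpoint())
```

The newest file is not necessarily the best-performing epoch, so the printed path is a starting point rather than a recommendation.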
- CNN
- RNN
- Capsule
- GCN (Based on the paper "Graph Convolution over Pruned Dependency Trees Improves Relation Extraction")
- Transformer
- Pre-trained Model (BERT)
If you only have sentences and entity pairs but no relation labels, you can use our distant-supervision-based relation labeling tools.
Please make sure that:
- You use the triple file we provide or a high-quality custom triple file.
- You have enough source data.
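The idea behind distant supervision can be sketched in a few lines: a sentence that mentions both entities of a known triple is assumed to express that triple's relation. The code below is an illustration of the general technique only, not the DeepKE labeling tool. The function name, the data layout, and the example sentences are all made up for this sketch.

```python
def label_by_distant_supervision(samples, triples, default="None"):
    """Attach a relation label to each (sentence, head, tail) sample.

    triples: iterable of (head, tail, relation) known facts.
    A sample receives a triple's relation when its entity pair matches the
    triple and both entities appear in the sentence; otherwise `default`.
    """
    known = {(head, tail): relation for head, tail, relation in triples}
    labeled = []
    for sentence, head, tail in samples:
        relation = default
        if head in sentence and tail in sentence:
            relation = known.get((head, tail), default)
        labeled.append(
            {"sentence": sentence, "head": head, "tail": tail, "relation": relation}
        )
    return labeled


if __name__ == "__main__":
    # Illustrative data only.
    triples = [("Hangzhou", "Zhejiang", "capital_of")]
    samples = [
        ("Hangzhou is the capital of Zhejiang.", "Hangzhou", "Zhejiang"),
        ("Hangzhou is a large city.", "Hangzhou", "China"),
    ]
    for row in label_by_distant_supervision(samples, triples):
        print(row["head"], row["tail"], row["relation"])
```

Because labels come only from triple lookups, an incomplete or noisy triple file produces missing or wrong labels, and a small corpus yields few matches, which is why both conditions above matter.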