This repository is the official implementation of the paper entitled "A Logical Fallacy-Informed Framework for Argument Generation", by Luca Mouchel, Debjit Paul, Shaobo Cui, Robert West, Antoine Bosselut and Boi Faltings, to be published at NAACL 2025.
The pipeline supports any causal and sequence-to-sequence models from HuggingFace and consists of the following stages:
- Data Collection with ChatGPT
- Supervised Fine-Tuning (SFT)
- Preference Optimization with existing methods (DPO, KTO, CPO, PPO) and our method (FIPO)
- Win-rate and fallacy-rate evaluations with GPT-4
We use a Python environment managed with virtualenv. To create the environment and install the required packages, run:
pip install virtualenv
virtualenv fallacies
source fallacies/bin/activate
pip install -r requirements.txt
The code for this step is in src/augment_arguments.py.
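If it helps to see the shape of this step, below is a minimal, hypothetical sketch of prompting ChatGPT through the openai client. The model name, prompt, and generate_argument helper are illustrative only; the actual logic lives in src/augment_arguments.py.

```python
# Hypothetical sketch of ChatGPT-based argument generation (not the repo's script).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_argument(topic: str, stance: str = "against") -> str:
    """Ask ChatGPT for a short argument on the given topic."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a debater who writes concise arguments."},
            {"role": "user", "content": f"Write a short argument {stance} the topic: {topic}"},
        ],
    )
    return response.choices[0].message.content


print(generate_argument("social media bans for minors"))
```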
To run SFT, use:
python src/preference-optimization/trainer.py --train-using=SFT --train-data=data/sft/train.json --model-name=<HF model-id> --use-peft=True
You can also override additional training arguments, each of which has a default value, including:
python src/preference-optimization/trainer.py ... --n-epochs=<> --batch-size=<> --gradient-accumulation-steps=<> --learning-rate=<> --warmup-steps=<> --weight-decay=<> --logging-steps=<> --save-steps=<> --output-dir=<>
By default, running SFT saves the trained model to models/sft_<model-name>; e.g., if you train with Llama-2-7b (meta-llama/Llama-2-7b-hf), the model is saved as models/sft_Llama-2-7b-hf.
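Once SFT has finished, you can sanity-check the saved model by generating an argument from it. This is a minimal sketch assuming the model was trained with --use-peft=True (so the output directory contains PEFT adapters); the prompt and generation settings are illustrative.

```python
# Minimal sketch: load the saved SFT checkpoint and generate an argument.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

sft_path = "models/sft_Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(sft_path)
model = AutoPeftModelForCausalLM.from_pretrained(
    sft_path, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Generate an argument against the topic: school uniforms."  # illustrative prompt format
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```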
For methods requiring a reference model, you don't have to specify the model-id; instead, you must provide the path to the previously trained reference (SFT) model. The required arguments are the following (you can also add the extra training arguments mentioned above):
python src/preference-optimization/trainer.py --train-using=DPO --beta=<> --ref-model-path=<Path to SFT model> --train-data=data/preference-data/train.json
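For reference, --beta and --ref-model-path map onto the standard DPO objective (general formulation, not code from this repo): $y_w$ is the preferred argument, $y_l$ the dispreferred one, $\pi_{\mathrm{ref}}$ is the SFT reference model, and $\beta$ controls how far the policy may drift from it.

```math
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x,\, y_w,\, y_l)\sim\mathcal{D}}\left[
    \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)
  \right]
```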
For KTO, there are two additional arguments, both of which default to 1.0:
python src/preference-optimization/trainer.py --train-using=KTO --beta=<> --ref-model-path=<Path to SFT model> --desirable-weight=<> --undesirable-weight=<> --train-data=data/preference-data/train.json
PPO requires an explicit reward model. We use a binary fallacy classifier (0: not a fallacy; 1: fallacy) trained on the preferred and dispreferred arguments in the preference dataset. You can train the classifier with:
python src/fallacy_classifier/train.py --language-model=<> --epochs=<> --batch-size=<> --val-batch-size=<> --lr=<> --data-dir=<> --gradient-accumulation=<> --train-data=data/preference-data/train.json
By default, we use the howey/electra-large-mnli language model. The classifier is then saved in the folder models/fallacy_clf/.
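To use the trained classifier directly (e.g., to spot-check an argument), something like the following works. This is a sketch: the checkpoint directory name is hypothetical, so point clf_path at whatever src/fallacy_classifier/train.py wrote under models/fallacy_clf/.

```python
# Sketch: score a single argument with the trained binary fallacy classifier.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

clf_path = "models/fallacy_clf/electra-large-mnli"  # hypothetical checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(clf_path)
model = AutoModelForSequenceClassification.from_pretrained(clf_path)

argument = "Everyone I know agrees with this, so it must be true."
inputs = tokenizer(argument, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 2): [not a fallacy, fallacy]
probs = logits.softmax(dim=-1)
print(f"P(fallacy) = {probs[0, 1].item():.3f}")
```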
You can then run PPO using
python src/preference-optimization/trainer.py --train-using=PPO --ref-model-path=<Path to SFT model> --reward-model-path=models/fallacy_clf/<> --train-data=data/preference-data/train.json
The rewards used during the PPO phase are the "not a fallacy" logits, i.e.:
tokens = reward_tokenizer(generated_responses, return_tensors="pt", padding=True, truncation=True)
logits = reward_model(**tokens).logits  ## the model outputs two logits per sample: [not a fallacy logit, is a fallacy logit]
rewards = logits[:, 0]  ## we use the "not a fallacy" logits as rewards
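For a rough picture of how these rewards feed into PPO, here is a sketch of a single step using the classic trl PPOTrainer interface (trl < 0.12). The ppo_trainer, tokenizer, reward_tokenizer, reward_model, and dataloader are assumed to already exist; the actual training loop lives in src/preference-optimization/trainer.py.

```python
# Sketch of one PPO step: generate, score with the fallacy classifier, and update.
import torch

for batch in ppo_dataloader:
    query_tensors = [q for q in batch["input_ids"]]            # list of prompt tensors
    response_tensors = ppo_trainer.generate(
        query_tensors, return_prompt=False, max_new_tokens=128
    )
    responses = tokenizer.batch_decode(response_tensors, skip_special_tokens=True)

    # Reward = "not a fallacy" logit from the classifier, one scalar tensor per sample.
    tokens = reward_tokenizer(responses, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = reward_model(**tokens).logits                 # (batch, 2)
    rewards = [logit[0] for logit in logits]

    ppo_trainer.step(query_tensors, response_tensors, rewards)
```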
These methods do not need a reference model, so they are simpler to run. Instead of a reference model path, specify the HuggingFace model-id (e.g., meta-llama/Llama-2-7b-hf):
python src/preference-optimization/trainer.py --train-using=CPO --model-name=<HF model id> --beta=<> --train-data=data/preference-data/train.json
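For context, the standard CPO objective (general formulation, not code from this repo) drops the reference model and combines a sigmoid preference term with a negative log-likelihood term on the preferred argument, which is why only the policy model and $\beta$ need to be specified:

```math
\mathcal{L}_{\mathrm{CPO}} =
  -\,\mathbb{E}_{(x,\, y_w,\, y_l)\sim\mathcal{D}}\!\left[
    \log \sigma\big(\beta \log \pi_\theta(y_w \mid x) - \beta \log \pi_\theta(y_l \mid x)\big)
  \right]
  \;-\; \mathbb{E}_{(x,\, y_w)\sim\mathcal{D}}\big[\log \pi_\theta(y_w \mid x)\big]
```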
This is the method we introduce in our paper. It uses CPO as the backbone preference optimization method and adds a classification loss on top; the overall objective takes the form $\mathcal{L}_{\mathrm{FIPO}} = \mathcal{L}_{\mathrm{CPO}} + \lambda \, \mathcal{L}_{\mathrm{clf}}$, where in our case $\mathcal{L}_{\mathrm{clf}}$ is a weighted cross-entropy loss over fallacy types and $\lambda$ is set with --lambda-value.
You can also specify the weighting scheme: either uniform or frequency. The frequency scheme works better, as it gives a larger weight to fallacy types that occur more often, teaching the model to learn more from those types rather than applying the same penalty to all of them. To run FIPO (a sketch of the weighting scheme follows the command below):
python src/preference-optimization/trainer.py --train-using=FIPO --model-name=<HF model id> --lambda-value=<> --weighting-scheme=<frequency or uniform> --beta=<> --train-data=data/preference-data/train.json
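As an illustration of the weighting schemes above, here is a hypothetical sketch of frequency-based class weights plugged into a weighted cross-entropy. The fallacy labels and counts are made up; in FIPO this classification term is scaled by --lambda-value and added to the CPO loss.

```python
# Hypothetical sketch of the frequency weighting scheme: fallacy types that occur
# more often in the training data receive a larger weight in the classification loss.
from collections import Counter
import torch
import torch.nn.functional as F

fallacy_types = ["ad hominem", "appeal to emotion", "false causality", "none"]   # illustrative
counts = Counter({"ad hominem": 120, "appeal to emotion": 300,
                  "false causality": 80, "none": 500})                           # made-up counts

freqs = torch.tensor([counts[t] for t in fallacy_types], dtype=torch.float)
weights = freqs / freqs.sum()                                            # "frequency" scheme
# weights = torch.full((len(fallacy_types),), 1.0 / len(fallacy_types))  # "uniform" scheme

def classification_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Weighted cross-entropy over fallacy types."""
    return F.cross_entropy(logits, labels, weight=weights)

# Combined objective (sketch): loss = cpo_loss + lambda_value * classification_loss(logits, labels)
```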
Here is an example of running the pipeline:
- SFT -
python src/preference-optimization/trainer.py --train-using=SFT --train-data=data/sft/train.json --model-name=meta-llama/Llama-2-7b-hf --use-peft=True
- DPO -
python src/preference-optimization/trainer.py --train-using=DPO --train-data=data/preference-data/train.json --ref-model-path=models/sft_Llama-2-7b-hf
If you find this code or our paper useful for your research, please cite:
@article{mouchel2024logical,
title={A logical fallacy-informed framework for argument generation},
author={Mouchel, Luca and Paul, Debjit and Cui, Shaobo and West, Robert and Bosselut, Antoine and Faltings, Boi},
journal={arXiv preprint arXiv:2408.03618},
year={2024}
}