Commit c6afe3f

add multiple files
1 parent 3b71c4d commit c6afe3f

146 files changed, 439,366 additions, 0 deletions

.DS_Store

6 KB
Binary file not shown.

README.md

+85
@@ -0,0 +1,85 @@
+<p align="center">
+<br>
+<img src="figures/logo.png" width="500"/>
+<br>
+</p>
+<div align="center">
+<a href="">Benchmark</a>,
+<a href="">Technical Report</a>,
+<a href="">Documentation</a>,
+<a href="">Jupyter Notebook Examples</a>,
+<a href="">Blog</a>
+</div>
+
+# DialogStudio: Unified Dialog Datasets and Instruction-Aware Models for Conversational AI
+
+
+### Datasets
+Check [DialogStudio_datasets.csv](https://docs.google.com/spreadsheets/d/10U9I4GoHFTYxl3OlzbbV0gmXerMT9Itn2MZs8t6AIK0/edit#gid=461625820) for all supported datasets.
+
+<p align="center">
+<br>
+<img src="figures/DialogStudio_Stats.png" width="700"/>
+<br>
+</p>
+
+
+Data Structure
+```
+Datasets/
+├── Task-Oriented:
+│   ├── KVRET
+│   ├── MuDoCo
+│   ├── AirDialogue
+│   ├── DuRecDial-2.0
+│   ├── SimJointGEN
+│   ├── BiTOD
+│   ├── DSTC2-Clean
+│   ├── OpenDialKG
+│   ├── Taskmaster1
+│   ├── Taskmaster2
+│   ├── Taskmaster3
+│   ├── CaSiNo
+│   ├── HDSA-Dialog
+│   ├── MetaLWOZ
+│   ├── FRAMES
+│   ├── MULTIWOZ2_2
+│   ├── SalesBot
+│   ├── STAR
+│   ├── ABCD
+│   ├── SGD
+│   ├── WOZ2_0
+│   ├── CraigslistBargains
+│   ├── MulDoGO
+│   ├── SimJointMovie
+│   ├── SimJointRestaurant
+│   └── SimJointGEN
+└── dialog-summarization
+    ├── AMI
+    ├── ConvoSumm
+    ├── DialogSum
+    ├── ICSI
+    ├── MediaSum
+    ├── QMSum
+    ├── SAMSum
+    ├── SummScreen_ForeverDreaming
+    ├── SummScreen_TVMegaSite
+    ├── TweetSumm
+    ├── ECTSum
+    └── CRD3
+
+
+```
+
+# License
+
+Licensing for this project is structured as follows:
+
+1. For all the modified datasets in DialogStudio:
+   - A portion of these datasets is under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+   - Some retain their original licenses even after modification.
+   - For a few datasets that lacked a license, we have cited the relevant papers.
+2. Original dataset licenses: For reference, the original license of each dataset, where available, is included in that dataset's folder.
+3. Code: Our codebase is under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+
+For detailed information, please refer to the specific licenses.

dialogue-summarization/AMI/converted_examples.json

+6,255
Large diffs are not rendered by default.
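
Because the diff for `converted_examples.json` is not rendered here, one way to look at the file is to check it out locally and print a summary of its top-level shape. The sketch below is only an illustration: the path is taken from the file list above, while `summarize_json` is a hypothetical helper name, and nothing is assumed about the JSON's field names since the commit view does not show them.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Describe the top-level structure of a JSON file without assuming a schema."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict):
        # Report the number of keys and a small sample of them.
        return {"type": "dict", "size": len(data), "sample_keys": list(data)[:5]}
    if isinstance(data, list):
        return {"type": "list", "size": len(data), "sample_keys": []}
    # Scalars (string, number, bool, null) have no meaningful size.
    return {"type": type(data).__name__, "size": 1, "sample_keys": []}


if __name__ == "__main__":
    # Path as listed in this commit; adjust it to where the repository is checked out.
    target = Path("dialogue-summarization/AMI/converted_examples.json")
    if target.exists():
        print(summarize_json(target))
    else:
        print(f"{target} not found; run this from the repository root.")
```

The helper reports only the container type, its size, and up to five keys, which is usually enough to decide how to iterate over the examples before writing any dataset-specific loading code.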