# RNNSharp
RNNSharp is a toolkit of deep recurrent neural networks that is widely used for many different kinds of tasks, such as sequence labeling and sequence-to-sequence modeling. It is written in C# and based on .NET Framework 4.6 or above.
This page introduces what RNNSharp is, how it works and how to use it. To get the demo package, you can visit the release page.
## Overview
RNNSharp supports many different types of deep recurrent neural network (aka DeepRNN) structures. In terms of historical memory, it supports BPTT (Backpropagation Through Time) and LSTM (Long Short-Term Memory) structures. In terms of output layer structure, it supports softmax, negative sampling softmax and recurrent CRFs [1]. In addition, RNNSharp supports both forward RNN and bi-directional RNN structures.
BPTT-RNN is usually called a "simple RNN", since the structure of its hidden layer node is very simple; it is not good at preserving long-term historical memory. LSTM-RNN is more complex than BPTT-RNN, since its hidden layer node has an inner structure that preserves very long-term historical memory. In general, LSTM performs better than BPTT on longer sequences.
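For reference, a standard LSTM cell can be summarized by the equations below. This is a generic textbook formulation, not an excerpt from RNNSharp's source; the exact gate variant RNNSharp uses (for example, whether peephole connections are included) may differ.

```latex
% Generic LSTM cell equations (illustrative, not taken from RNNSharp's code)
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)          % input gate
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)          % forget gate
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)          % output gate
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)   % candidate cell state
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t    % cell state carries long-term memory
h_t = o_t \odot \tanh(c_t)                         % hidden output at time t
```

The cell state c_t is the component that carries information across long spans, which is why LSTM tends to handle long sequences better than a simple BPTT-RNN.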
For the output layer, softmax is the traditional type, widely used in online sequence labeling tasks such as speech recognition and auto suggestion. The negative sampling softmax output layer is especially useful for tasks with a large output vocabulary, such as sequence generation (the sequence-to-sequence model). For the recurrent CRF, RNNSharp computes the CRF output for the entire sequence based on the softmax outputs and the tag transitions. Compared with a native RNN, the RNN-CRF performs better on many offline sequence labeling tasks, such as word segmentation and named entity recognition. With a similar feature set, it also performs better than a linear CRF.
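To make the recurrent CRF idea concrete, here is a minimal C# sketch, an illustration only and not RNNSharp's actual code or API, of how one tag path can be scored from the per-token output scores plus tag-transition scores:

```csharp
// Illustration only, not RNNSharp's implementation:
// score a single tag path from per-token emission scores and a tag-transition matrix.
static class CrfScoreSketch
{
    // emissions[t][k]  : score of tag k at token position t (e.g. the softmax layer output)
    // transitions[j,k] : score of moving from tag j to tag k
    public static double ScoreTagPath(double[][] emissions, double[,] transitions, int[] tags)
    {
        double score = emissions[0][tags[0]];
        for (int t = 1; t < tags.Length; t++)
        {
            score += transitions[tags[t - 1], tags[t]]; // transition between adjacent tags
            score += emissions[t][tags[t]];             // per-token output score
        }
        return score;
    }
}
```

At decoding time, the best-scoring path over the whole sequence can be found with a Viterbi-style search over these emission and transition scores.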
For bi-directional RNN, the output combines the results of both the forward RNN and the backward RNN. It usually performs better than a single-directional RNN.
Here is an example of a deep bi-directional RNN-CRF network. It contains 3 hidden layers.
Here is the inner structure of one bi-directional hidden layer.
Here is the neural network for the sequence-to-sequence task. The "TokenN" nodes are from the source sequence, and "ELayerX-Y" are the auto-encoder's hidden layers. The auto-encoder is defined in the feature configuration file. "<s>" is always the beginning of the target sentence, and "DLayerX-Y" are the decoder's hidden layers. The decoder generates one token at a time until "</s>" is generated.
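The generation loop described above can be sketched roughly as follows. This is a conceptual C# sketch; the encode and predict-next-token steps are hypothetical placeholders passed in as delegates, not RNNSharp API calls.

```csharp
using System;
using System.Collections.Generic;

// Conceptual sketch of the encoder-decoder generation loop described above.
// The encode and predictNextToken delegates are hypothetical placeholders.
static class Seq2SeqDecodeSketch
{
    public static List<string> Generate(
        IReadOnlyList<string> sourceTokens,
        Func<IReadOnlyList<string>, object> encode,                    // hypothetical encoder step
        Func<object, IReadOnlyList<string>, string> predictNextToken,  // hypothetical decoder step
        int maxLength = 100)
    {
        object encoderState = encode(sourceTokens);  // run the source sequence through the encoder layers
        var generated = new List<string> { "<s>" };  // target generation always starts from "<s>"
        while (generated.Count < maxLength)
        {
            string next = predictNextToken(encoderState, generated); // generate one token per step
            if (next == "</s>") break;                               // stop once end-of-sentence is produced
            generated.Add(next);
        }
        generated.RemoveAt(0);                       // drop the "<s>" marker from the returned result
        return generated;
    }
}
```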
RNNSharp supports four types of feature sets: template features, context template features, run-time features and word embedding features. These features are controlled by the configuration file, and the following sections introduce how they work.
## Template Features
The training corpus contains many records that describe what the model should learn.
In the training file, each record is represented as a matrix and ends with an empty line. In the matrix, each row describes one token and its features, and each column represents one feature dimension. Across the entire training corpus, the number of columns must be fixed.
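As a rough illustration of this record layout, and not RNNSharp's actual corpus reader, the following C# sketch splits a corpus file into records on empty lines and each row into its feature columns:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Illustrative reader for the record layout described above (not RNNSharp's code):
// records are separated by empty lines, each row is one token, and each
// whitespace-separated column is one feature dimension (the real corpus may use tabs).
static class CorpusReaderSketch
{
    public static IEnumerable<List<string[]>> ReadRecords(string path)
    {
        var record = new List<string[]>();
        foreach (string line in File.ReadLines(path))
        {
            if (string.IsNullOrWhiteSpace(line))      // an empty line ends the current record
            {
                if (record.Count > 0) { yield return record; record = new List<string[]>(); }
                continue;
            }
            record.Add(line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries));
        }
        if (record.Count > 0) yield return record;    // the last record may not end with an empty line
    }
}
```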
Sequence labeling tasks and sequence-to-sequence tasks have different training corpus formats. For sequence labeling tasks, if the column size is N, the first N-1 columns are input features for training, and the Nth column (aka the last column) is the answer for the current token. Here is an example for a named entity recognition task (the full training file is in the release section, where you can download it):
Word | Pos | Tag
-----------|------|----
The named entity type looks like "Position_NamedEntityType". "Position" is the word position within the named entity, and "NamedEntityType" is the type of the named entity.
ORGANIZATION : the name of one organization
LOCATION : the name of one location
For sequence-to-sequence tasks, the training corpus format is different. Each sequence pair has two sections: one is the source sequence, and the other is the target sequence. Here is an example:
Word
--------
What
is
your
name
?

I
am
Zhongkai
Fu
In above example, "What is your name ?" is the source sentence, and "I am Zhongkai Fu" is the target sentence generated by RNNSharp seq-to-seq model. In source sentence, beside word features, any other feautes can be added for it as well such as postag feature in sequence labeling task in above.
## Test File Format
The test file has a similar format to the training file. For sequence labeling tasks, the only difference between them is the last column: in the test file, all columns are features for model decoding. For sequence-to-sequence tasks, the test file only contains the source sequence; the target sentence will be generated by the model.
## Tag (Output Vocabulary) File
For sequence labeling tasks, this file contains the output tag set; for sequence-to-sequence tasks, it is the output vocabulary file. For readability, RNNSharp uses the tag names from the corpus, but for efficiency in encoding and decoding, tag names are mapped to integer values. The file is passed to the console tool via the -tagfile parameter, and each line contains one tag name.
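As an illustration of how such a file can be consumed, here is a hypothetical C# sketch, not RNNSharp's loader, that maps each tag or vocabulary entry to an integer id in file order (one entry per line):

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch, not RNNSharp's loader: map each tag / vocabulary entry
// to an integer id, assuming one entry per line in the tag file.
static class TagFileSketch
{
    public static Dictionary<string, int> Load(string tagFilePath)
    {
        var tagToId = new Dictionary<string, int>();
        foreach (string line in File.ReadLines(tagFilePath))
        {
            string tag = line.Trim();
            if (tag.Length > 0 && !tagToId.ContainsKey(tag))
                tagToId[tag] = tagToId.Count;   // ids are assigned in file order
        }
        return tagToId;
    }
}
```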
This command trains a bi-directional recurrent neural network with CRF output. The network has two BPTT hidden layers and one softmax output layer. The first hidden layer size is 200 and the second hidden layer size is 100.
This command trains a forward-directional sequence-to-sequence LSTM model, and the output layer is negative sampling softmax. The encoder is defined in the [AUTOENCODER_XXX] section of the features_seq2seq.txt file.
### Decode Model
In this mode, the console tool is used to predict the output tags of a given corpus. The usage is as follows: