recurrent neural net to match existing standards so as to merge into master #24

Closed
robertleeplummerjr opened this issue Aug 7, 2016 · 12 comments

@robertleeplummerjr (Contributor)

Investigation started here, but we needed an issue to track the work for merging into master at some hopeful date. The work will continue here until it is (hopefully) merged.

@robertleeplummerjr (Contributor Author)

robertleeplummerjr commented Aug 9, 2016

Inspiration hit yesterday. I had been thinking about how to better iterate through the matrices so as to avoid two specific memory leaks:

The first is creating a new matrix every time a calculation is made.
The second is creating a new closure every time a calculation that needs backpropagation is performed.

This may be considered a premature optimization, but from past experience, allocating like this on every pass becomes a memory leak.
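
To make the allocation pattern concrete, here is a minimal sketch of the problem as I understand it (the Matrix constructor and the backprop stack are simplified stand-ins, not the actual implementation):

function multiply(m1, m2, backprop) {
    //leak 1: a fresh result matrix is allocated on every forward pass
    var product = new Matrix(m1.rows, m2.columns);
    //...fill product.weights from m1 and m2...

    //leak 2: a fresh closure is allocated on every pass for backprop,
    //capturing m1, m2, and product in its scope
    backprop.push(function () {
        //...accumulate gradients into m1 and m2 using product...
    });
    return product;
}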

Consider these lines of code, referenced as "original math":

var h0 = this.multiply(hiddenMatrix.weight, inputVector, this);
var h1 = this.multiply(hiddenMatrix.transition, hiddenPrev, this);
var hiddenD = this.relu(this.add(this.add(h0, h1, this), hiddenMatrix.bias, this), this);

This provides a pleasing visual of what is actually going on, and I'd like to preserve it. At the same time, the relationship of each matrix to the others is fixed; it will never change, so we could potentially create these relationships once, at instantiation. And because the relationships are fixed, we don't need to dynamically reference different matrices through a closure; we can instead create a relationship where everything is explicitly exposed.

Consider this brainstorm in pseudocode:

//pseudo, strongly typed relationship of the above code
hiddenMatrix.weight.connector = new Matrix(hiddenMatrix.weight.rows, inputVector.columns);
hiddenMatrix.weight.next = inputVector;
hiddenMatrix.weight.action = multiply;
hiddenMatrix.weight.backPropagateAction = multiplyBack;

hiddenMatrix.transition.connector = new Matrix(hiddenMatrix.transition.rows, hiddenPrev.columns);
hiddenMatrix.transition.next = hiddenPrev;
hiddenMatrix.transition.action = multiply;
hiddenMatrix.transition.backPropagateAction = multiplyBack;

inputVector.connector = new Matrix(inputVector.rows, hiddenPrev.columns);
inputVector.next = hiddenPrev;
inputVector.action = add;
inputVector.backPropagateAction = addBack;

hiddenPrev.connector = new Matrix(hiddenPrev.rows, hiddenMatrix.bias.columns);
hiddenPrev.next = hiddenMatrix.bias;
hiddenPrev.action = add;
hiddenPrev.backPropagateAction = addBack;

hidden.connector = null;
hidden.next = null;
hidden.action = relu;
hidden.backPropagateAction = reluBack;

If we could use a relationship similar to the above, the complexity drops to something like this:

//pseudo proposal:
function add(m1, m2) {
    m1.action = realAdd;
    m1.next = m2;
    m1.connector = new Matrix(m1.rows, m2.columns);
    m1.backPropagateAction = realAddBack;
}

function multiply(m1, m2) {
    m1.action = realMultiply;
    m1.next = m2;
    m1.connector = new Matrix(m1.rows, m2.columns);
    m1.backPropagateAction = realMultiplyBack;
}

function relu(m) {
    m.action = realRelu;
    m.next = null;
    m.connector = new Matrix(m.columns, 1);
    m.backPropagateAction = realReluBack;
}

//the means of running it forward
function run(matrix) {
    while (matrix.next !== null) {
        matrix.action(matrix, matrix.next, matrix.connector);
        matrix = matrix.next;
    }
    matrix.action(matrix, matrix.connector);
}

//the means of running it backward (backpropagation)
function backpropagate(matrix) {
    while (matrix.next !== null) {
        matrix.backPropagateAction(matrix, matrix.next, matrix.connector);
        matrix = matrix.next;
    }
    matrix.backPropagateAction(matrix, matrix.connector);
}
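
One wrinkle worth flagging in the sketch above: backpropagation conventionally visits operations in the reverse of the forward order, so the backward walk would probably need the chain reversed or doubly linked. Something like this, assuming each node also kept a hypothetical prev reference:

//hypothetical: walk the chain tail-to-head for the backward pass
function backpropagate(tail) {
    var matrix = tail;
    while (matrix !== null) {
        matrix.backPropagateAction(matrix, matrix.next, matrix.connector);
        matrix = matrix.prev;
    }
}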

Then we can create the mathematical flow using the same sort of explicit code as the original math, yet much more performantly:

//connections made at instantiation, rather than on forward, which locks relationships and contexts
var hiddenD = 
    relu(
        add(
            add(
                multiply(
                    hiddenMatrix.weight,
                    inputVector
                ),
                multiply(
                    hiddenMatrix.transition,
                    hiddenPrev
                )
            ),
            hiddenMatrix.bias
        )
    );

One could even argue it may be more visually simple. I feel like this iteration technique closely resembles that of the original toFunction method in brain.js, and it could just as easily be used to output a toFunction method from the recurrent neural net.
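
To sketch how this would pay off at run time (illustrative only; it assumes the wiring functions above return the head matrix of the chain they build):

//the graph is wired once, at instantiation, as shown above...
//...so every subsequent pass reuses the same matrices, connectors,
//and functions, with no per-pass allocations:
run(hiddenMatrix.weight);           //forward pass
backpropagate(hiddenMatrix.weight); //backward pass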

Suggestions welcome.

@robertleeplummerjr (Contributor Author)

As of a few minutes ago, working rnn, lstm, and gru, with surprising efficiency!

#29

As stated there, it's not quite ready. I'd like to get the following methods working for the rnn before release:

  • runInput
  • train
  • trainPattern
  • calculateDeltas
  • adjustWeights
  • formatData
  • test
  • toFunction

They'll need proper testing, where applicable, to match the existing API. They'd need to be implemented here: https://github.com/harthur-org/brain.js/blob/recurrent/src/recurrent/rnn.js#L325
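
As a rough sketch of what that API-parity testing might look like (the mocha/assert style and the require path are my assumptions, not the project's actual test layout):

var assert = require('assert');
var RNN = require('../../src/recurrent/rnn'); //hypothetical path

describe('rnn api parity', function () {
    it('exposes the same methods as the feed-forward net', function () {
        var net = new RNN();
        ['runInput', 'train', 'trainPattern', 'calculateDeltas',
         'adjustWeights', 'formatData', 'test', 'toFunction']
            .forEach(function (method) {
                assert.equal(typeof net[method], 'function');
            });
    });
});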

@robertleeplummerjr (Contributor Author)

I imagine toFunction is going to be the biggest "bear", but with the existing architecture having set up a formula builder, I bet it will be cake.

@robertleeplummerjr (Contributor Author)

Started on toFunction, and we already have toJSON methods for matrices. Ideally, as with the existing implementation, there'd be no non-native function calls; the whole function would just have a bunch of really clear loops where needed. If this is the best course of action, as it would keep closures, added arguments, resources, etc. from being created, what would be the best way to get the matrix math functions as strings?
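
For reference, the toJSON side can stay quite small. A minimal sketch, assuming a Matrix holds rows, columns, and a flat row-major weights array (which is how the generated code below treats it):

Matrix.prototype.toJSON = function () {
    return {
        rows: this.rows,
        columns: this.columns,
        weights: Array.prototype.slice.call(this.weights)
    };
};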

@robertleeplummerjr (Contributor Author)

robertleeplummerjr commented Sep 14, 2016

After brainstorming a bit, and (somewhat reluctantly) using the above method, I came up with something like:

toFunction() {
    let model = this.model;
    let modelAsString = JSON.stringify(this.toJSON());
    function matrixToString(m) {
      if (!m) return 'null';

      for (var i = 0, max = model.hiddenLayers.length; i < max; i++) {
        var hiddenLayer = model.hiddenLayers[i];
        for (var p in hiddenLayer) {
          if (hiddenLayer[p] === m) {
            return `model.hiddenLayer[${ i }].${ p }`;
          }
        }
      }
      if (m === model.input) return `model.input`;
      if (m === model.outputConnector) return `model.outputConnector`;
      if (m === model.output) return `model.output`;
      return `new Matrix(${ m.rows }, ${ m.columns })`;
    }

    function toInner(fnString) {
      //crude, but should be sufficient for now
      //function() { inner.function.string.here; }
      fnString = fnString.toString().split('{');
      fnString.shift();
      // inner.function.string.here; }
      fnString = fnString.join('{');
      fnString = fnString.split('}');
      fnString.pop();
      // inner.function.string.here;
      return fnString.join('}');
    }

    let equation = this.model.equations[0];
    let states = equation.states;
    let statesRaw = [];
    let usedFunctionNames = {};
    let innerFunctionsSwitch = [];
    for (var i = 0, max = states.length; i < max; i++) {
      let state = states[i];
      statesRaw.push(`{
        into: ${ matrixToString(state.into) },
        left: ${ matrixToString(state.left) },
        right: ${ matrixToString(state.right) },
        forwardFnName: '${ state.forwardFn.name }'
      }`);

      if (!usedFunctionNames[state.forwardFn.name]) {
        usedFunctionNames[state.forwardFn.name] = true;
        innerFunctionsSwitch.push(`
        case '${ state.forwardFn.name }':
          // start ${ state.forwardFn.name }
          ${ toInner(state.forwardFn.toString()) }
          // end ${ state.forwardFn.name }
          break;
        `);
      }
    }

    return new Function(`
      var model = ${ modelAsString };
      var states = [${ statesRaw.join(',') }];
      for (var i = 0, max = states.length; i < max; i++) {
        var state = states[i];
        var into = state.into;
        var left = state.left;
        var right = state.right;

        switch (state.forwardFnName) {
          ${ innerFunctionsSwitch.join('\n') }
        }
      }
    `);
  }
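
Usage would presumably look something like this (a sketch; the constructor options and training step are placeholders, not the final API):

var net = new RNN(); //hypothetical setup
//...train net...
var forward = net.toFunction();
console.log(forward.toString()); //prints generated source like the output below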

A rough start, but the output looks surprisingly close to what we'd want:

function anonymous() {

    var model = { /* one huge line*/ };
    var states = [{
        into: new Matrix(6, 1),
        left: model.input,
        right: null,
        forwardFnName: 'rowPluck'
    }, {
        into: new Matrix(20, 1),
        left: model.hiddenLayer[0].weight,
        right: new Matrix(6, 1),
        forwardFnName: 'multiply'
    }, {
        into: new Matrix(20, 1),
        left: model.hiddenLayer[0].transition,
        right: new Matrix(20, 1),
        forwardFnName: 'multiply'
    }, {
        into: new Matrix(20, 1),
        left: new Matrix(20, 1),
        right: new Matrix(20, 1),
        forwardFnName: 'add'
    }, {
        into: new Matrix(20, 1),
        left: new Matrix(20, 1),
        right: model.hiddenLayer[0].bias,
        forwardFnName: 'add'
    }, {
        into: new Matrix(20, 1),
        left: new Matrix(20, 1),
        right: null,
        forwardFnName: 'relu'
    }, {
        into: new Matrix(20, 1),
        left: model.hiddenLayer[1].weight,
        right: new Matrix(20, 1),
        forwardFnName: 'multiply'
    }, {
        into: new Matrix(20, 1),
        left: model.hiddenLayer[1].transition,
        right: new Matrix(20, 1),
        forwardFnName: 'multiply'
    }, {
        into: new Matrix(20, 1),
        left: new Matrix(20, 1),
        right: new Matrix(20, 1),
        forwardFnName: 'add'
    }, {
        into: new Matrix(20, 1),
        left: new Matrix(20, 1),
        right: model.hiddenLayer[1].bias,
        forwardFnName: 'add'
    }, {
        into: new Matrix(20, 1),
        left: new Matrix(20, 1),
        right: null,
        forwardFnName: 'relu'
    }, {
        into: new Matrix(13, 1),
        left: model.outputConnector,
        right: new Matrix(20, 1),
        forwardFnName: 'multiply'
    }, {
        into: new Matrix(13, 1),
        left: new Matrix(13, 1),
        right: model.output,
        forwardFnName: 'add'
    }];
    for (var i = 0, max = states.length; i < max; i++) {
        var state = states[i];
        var into = state.into;
        var left = state.left;
        var right = state.right;

        switch (state.forwardFnName) {

            case 'rowPluck':
                // start rowPluck

                for (var column = 0, columns = m.columns; column < columns; column++) {
                    into.weights[column] = m.weights[columns * rowIndex + column];
                    into.recurrence[column] = 0;
                }

                // end rowPluck
                break;


            case 'multiply':
                // start multiply

                var leftRows = left.rows;
                var leftColumns = left.columns;
                var rightColumns = right.columns;

                // loop over rows of left
                for (var leftRow = 0; leftRow < leftRows; leftRow++) {

                    // loop over cols of right
                    for (var rightColumn = 0; rightColumn < rightColumns; rightColumn++) {

                        // dot product loop
                        var dot = 0;

                        //loop over columns of left
                        for (var leftColumn = 0; leftColumn < leftColumns; leftColumn++) {
                            dot += left.weights[leftColumns * leftRow + leftColumn] * right.weights[rightColumns * leftColumn + rightColumn];
                        }
                        var i = rightColumns * leftRow + rightColumn;
                        into.weights[i] = dot;
                        into.recurrence[i] = 0;
                    }
                }

                // end multiply
                break;


            case 'add':
                // start add

                for (var i = 0, max = left.weights.length; i < max; i++) {
                    into.weights[i] = left.weights[i] + right.weights[i];
                    into.recurrence[i] = 0;
                }

                // end add
                break;


            case 'relu':
                // start relu

                for (var i = 0, max = m.weights.length; i < max; i++) {
                    into.weights[i] = Math.max(0, m.weights[i]); // relu
                    into.recurrence[i] = 0;
                }

                // end relu
                break;

        }
    }

}

Looks like the biggest problem is that too many matrices are being created when linking previous matrices up: an intermediate result that should be shared between states (the into of one state feeding the left or right of the next) is serialized as a fresh new Matrix(...) in each place instead. We'd need better tracking between them, and then Bob's our uncle.
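
One possible fix (a sketch of the idea only): have matrixToString assign each intermediate matrix a stable generated name, emit a single declaration per intermediate ahead of the states literal, and reference that name everywhere the matrix appears:

//hypothetical refinement: intermediates get one declaration each
var intermediates = [];
function matrixToString(m) {
    if (!m) return 'null';
    //...model/hiddenLayer/input/output checks as before...
    var i = intermediates.indexOf(m);
    if (i < 0) {
        i = intermediates.length;
        intermediates.push(m);
    }
    return 'm' + i; //declared once up front as: var m0 = new Matrix(rows, columns);
}

That way the rowPluck result above would serialize once as var m0 = new Matrix(6, 1); and both the state that produces it and the state that consumes it would reference m0.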

@robertleeplummerjr (Contributor Author)

Just thinking out loud here: the above would be an eventual recurrent neural network, in that it would allow a single whole input to be recursed, but how would we achieve a recurrent neural network for more than a single value from a singleton of a function?

@robertleeplummerjr (Contributor Author)

fyi, toFunction now outputs very similarly to how run works. A few open questions:
a: should the network continue to recurse/be recurrent even though it is fully trained?
b: should there be a toPredictionFunction, in addition to toFunction? Seems like it'd be a very nice addition.
c: how could the run/toFunction call be made more useful?

@robertleeplummerjr (Contributor Author)

If the singleton function simply had the weights it uses stored as a property of the function itself, then reusing them for backpropagation would be doable. Don't know why that didn't hit me prior. It may be best to build them inside the function, and on first call assign them to the function itself for reuse afterward.
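
A minimal sketch of that idea (names hypothetical; the deserialization step is elided):

function forward(input) {
    if (!forward.model) {
        //first call: build the matrices once and cache them on the
        //function itself, where backpropagation can reuse them
        forward.model = buildModel(); //hypothetical deserializer
    }
    var model = forward.model; //every later call reuses the same matrices
    //...generated forward pass as above...
}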

@robertleeplummerjr (Contributor Author)

I believe we are nearing the light at the end of the tunnel. The rnn works (rnn, lstm, and gru); we just need to ensure exporting, importing, and toFunction are working properly, and we'll be ready for this PR to be closed.

@robertleeplummerjr (Contributor Author)

fyi: d9dc977

@robertleeplummerjr (Contributor Author)

The moment has come: .toFunction is working great. Unit tests could be improved overall, but that will be a separate effort. I believe we are merge worthy.

@robertleeplummerjr (Contributor Author)

Now in master.
