External module "transcription/transcription_utils"

Utility functions for Onsets and Frames models.

license

Copyright 2018 Google Inc. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Functions

batchInput

  • batchInput(input: number[][], batchLength: number): Tensor<R3>
  • Batches the input, adding padding for the receptive field.

    For batches in the middle (not the first or last), we pad the beginning and end with values from the previous and following batches to cover the receptive field.

    We can't just use zero padding for the first and last batch: a bias is added at each layer, so the padding would become non-zero after the first convolution. That does not match how "same" padding works, where the padding is reset to 0 at each layer. Instead, we treat the first and last batch differently. The first batch has no initial padding, and we include extra padding from the second batch at the end to make its length match. The final batch has no end padding, and we include extra padding from the previous batch at the beginning to make its length match.

    In most cases, the number of batches equals ceil(N / batchLength), where N is the number of rows in input. However, in the rare case where the final batch would be shorter than the receptive field, it is instead appended to the previous batch, reducing the number of batches by 1.

    Parameters

    • input: number[][]

      The 2D input matrix, shaped [N, D].

    • batchLength: number

      The desired batch length (excluding receptive field padding). The final batch may be shorter or slightly longer than this.

    Returns Tensor<R3>

    The 3D batched input, shaped [B, batchLength + RF_PAD * 2, D]
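
The batching layout described above can be sketched without TensorFlow.js by computing which input rows each batch would cover. This is a minimal illustration under stated assumptions, not the library's implementation: planBatches and BatchWindow are hypothetical names, and rfPad is a stand-in parameter for RF_PAD, whose value is not given on this page. The sketch also assumes the input is at least one padded window long whenever more than one batch is produced.

```typescript
// Hypothetical record: the half-open range of input rows copied into one batch.
interface BatchWindow {
  start: number;  // first input row in the batch (inclusive)
  end: number;    // one past the last input row in the batch (exclusive)
}

// rfPad is an assumed stand-in for the library's RF_PAD constant.
function planBatches(
    n: number, batchLength: number, rfPad: number): BatchWindow[] {
  let numBatches = Math.ceil(n / batchLength);
  const remainder = n % batchLength;
  // A final batch shorter than the receptive field padding is folded into
  // the previous batch, so there is one batch fewer.
  if (numBatches > 1 && remainder > 0 && remainder <= rfPad) {
    numBatches -= 1;
  }
  if (numBatches <= 1) {
    return [{start: 0, end: n}];
  }
  const windowLength = batchLength + 2 * rfPad;
  if (n < windowLength) {
    throw new RangeError('input is shorter than one padded window');
  }
  const windows: BatchWindow[] = [];
  // First batch: no initial padding, extra rows from the second batch at the end.
  windows.push({start: 0, end: windowLength});
  // Middle batches: rfPad rows borrowed from each neighbour.
  for (let i = 1; i < numBatches - 1; i++) {
    windows.push(
        {start: i * batchLength - rfPad, end: (i + 1) * batchLength + rfPad});
  }
  // Final batch: no end padding, extra rows from the previous batch at the start.
  windows.push({start: n - windowLength, end: n});
  return windows;
}
```

With n = 92, batchLength = 30 and rfPad = 3, the 2-row remainder is folded into the third batch, giving the windows [0, 36), [27, 63) and [56, 92).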

pianorollToNoteSequence

  • pianorollToNoteSequence(frameProbs: tf.Tensor2D, onsetProbs: tf.Tensor2D, velocityValues: tf.Tensor2D, onsetThreshold?: number, frameThreshold?: number): Promise<NoteSequence>
  • Converts the model predictions to a NoteSequence.

    Parameters

    • frameProbs: tf.Tensor2D

      Probabilities of an active frame, shaped [frame, pitch].

    • onsetProbs: tf.Tensor2D

      Probabilities of an onset, shaped [frame, pitch].

    • velocityValues: tf.Tensor2D

      Predicted velocities in the range [0, 127], shaped [frame, pitch].

    • Default value onsetThreshold: number = 0.5
    • Default value frameThreshold: number = 0.5

    Returns Promise<NoteSequence>

    A NoteSequence containing the transcribed piano performance.
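
One plausible reading of the two thresholds is sketched below on plain arrays: a note starts where the onset probability rises above onsetThreshold and is held while the frame probability stays above frameThreshold. This is an assumption made for illustration, not the library's decoding logic. decodePianoroll and SketchNote are hypothetical names, and the sketch reports frame indices and pitch columns rather than building a NoteSequence.

```typescript
// Hypothetical note record used only by this sketch.
interface SketchNote {
  pitchIndex: number;  // column in the [frame, pitch] matrices
  startFrame: number;  // inclusive
  endFrame: number;    // exclusive
  velocity: number;    // velocity value read at the onset frame
}

function decodePianoroll(
    frameProbs: number[][], onsetProbs: number[][], velocityValues: number[][],
    onsetThreshold = 0.5, frameThreshold = 0.5): SketchNote[] {
  const notes: SketchNote[] = [];
  const numFrames = frameProbs.length;
  const numPitches = numFrames > 0 ? frameProbs[0].length : 0;
  for (let p = 0; p < numPitches; p++) {
    let current: SketchNote | null = null;
    for (let f = 0; f < numFrames; f++) {
      const onsetNow = onsetProbs[f][p] > onsetThreshold;
      const onsetBefore = f > 0 && onsetProbs[f - 1][p] > onsetThreshold;
      const active = frameProbs[f][p] > frameThreshold;
      if (onsetNow && !onsetBefore) {
        // A fresh onset closes any sounding note and starts a new one.
        if (current !== null) {
          current.endFrame = f;
          notes.push(current);
        }
        current = {
          pitchIndex: p,
          startFrame: f,
          endFrame: f + 1,
          velocity: velocityValues[f][p],
        };
      } else if (current !== null && !active && !onsetNow) {
        // The frame is no longer active, so the note ends here.
        current.endFrame = f;
        notes.push(current);
        current = null;
      }
    }
    // A note still sounding at the end is closed at the last frame.
    if (current !== null) {
      current.endFrame = numFrames;
      notes.push(current);
    }
  }
  return notes;
}
```

Raising onsetThreshold in this sketch yields fewer note starts, while raising frameThreshold shortens the notes that are found.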

unbatchOutput

  • unbatchOutput(batches: tf.Tensor3D, batchLength: number, totalLength: number): Tensor<R3>
  • Unbatches the batched output, reversing the procedure of batchInput.

    Parameters

    • batches: tf.Tensor3D

      The batched input matrix.

    • batchLength: number

      The desired batch length (excluding receptive field padding). The final batch may be shorter or slightly longer than this.

    • totalLength: number

      The total length of the original (unbatched) input.

    Returns Tensor<R3>

    The unbatched output, a single rank-3 tensor with the receptive field padding removed, spanning all totalLength frames of the original input.
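
Reversing the batching amounts to dropping the borrowed padding rows from each batch and concatenating what remains. The sketch below shows this on nested arrays rather than tensors. It is an illustration under assumptions, not the library's code: unbatchSketch is a hypothetical name, rfPad stands in for RF_PAD, and the batches are assumed to follow the layout described for batchInput.

```typescript
// batches is [B][window][D]; returns totalLength rows of D values.
// rfPad is an assumed stand-in for the library's RF_PAD constant.
function unbatchSketch(
    batches: number[][][], batchLength: number, totalLength: number,
    rfPad: number): number[][] {
  const numBatches = batches.length;
  if (numBatches === 1) {
    // A single batch carries no borrowed padding.
    return batches[0];
  }
  const rows: number[][] = [];
  // First batch: padding sits only at the end, so keep the leading rows.
  rows.push(...batches[0].slice(0, batchLength));
  // Middle batches: drop rfPad rows from both ends.
  for (let i = 1; i < numBatches - 1; i++) {
    rows.push(...batches[i].slice(rfPad, rfPad + batchLength));
  }
  // Final batch: padding sits only at the start, so keep the trailing rows
  // that have not been emitted yet.
  const finalLength = totalLength - (numBatches - 1) * batchLength;
  const last = batches[numBatches - 1];
  rows.push(...last.slice(last.length - finalLength));
  return rows;
}
```

For a 10-row input split with batchLength = 4 and rfPad = 1, the three 6-row batches contribute 4, 4 and 2 rows respectively, which restores the original 10 rows in order.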
