109 changes: 109 additions & 0 deletions neutone_midi_sdk/README.md
@@ -0,0 +1,109 @@
# Neutone-MIDI SDK

The goal of this SDK is to provide an environment where researchers, musicians, and engineers
can quickly 'wrap' an existing machine-learning model for symbolic music tasks into a format that can be
deployed in a real-time plugin for DAWs.

There are two guides to help with this process:
1. model_training_guide: details the setup you should follow in the training pipeline
   to ensure your model will be compatible with the SDK
2. model_preparation_guide: explains how to export your model

Once your model is trained and serialized following the above guides, the remaining
instructions in this README show how to 'wrap' it for deployment in the Neutone-MIDI plugin.

We have designed the SDK to work in conjunction with [MIDITok](https://github.com/Natooz/MidiTok),
which lets you tokenize an entire collection of MIDI files in a few simple commands. The SDK converts
the MIDI data in the DAW to and from this format, allowing your model to interact with the same data
format it was trained on.
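
For context, here is a minimal sketch of the training-side tokenization with MidiTok. The exact API
differs between MidiTok versions, so treat the calls below (``REMI``, the callable tokenizer,
``save_params``) as assumptions rather than a recipe:

```python
from pathlib import Path
from miditok import REMI  # one of the tokenization formats the SDK supports

# Assumed MidiTok v2-style API: tokenizers are callable on a MIDI file path.
tokenizer = REMI()
tokens = tokenizer(Path("my_file.mid"))  # tokenize a single MIDI file
# Save the vocab/config JSON that the wrapping step below loads.
tokenizer.save_params(Path("tokenizer_config.json"))
```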


# Wrapping your model

Once you have serialized a model trained on a supported tokenization format, it's time to wrap it!

**First, load your vocab and config files:**
```python
import torch
import json
from neutone_midi_sdk import MidiToMidiBase
from neutone_midi_sdk.data_preparation import prepare_token_data
from neutone_midi_sdk.tokenization import TokenData

with open(vocab_file_path, 'r') as fp:
    vocab = json.load(fp)
with open(config_file_path, 'r') as fp:
    config = json.load(fp)
tokenizer_type = config["tokenization"]
```

Load your serialized model:
```python
remi_model = torch.jit.load("path_to_model")
```

Wrap it:
```python
tokenizer_data: TokenData = prepare_token_data(tokenizer_type, vocab, config)
wrapped_model = MidiToMidiBase(model=remi_model,
                               vocab=vocab,
                               tokenizer_type=tokenizer_type,
                               tokenizer_data=tokenizer_data)
scripted_model = torch.jit.script(wrapped_model)
scripted_model.save('REMI_Model.pt')
```
And... that's it! Your model is now ready to deploy in the Neutone-MIDI plugin.


# SDK Components
### Neutone-MIDI SDK

Provides the base wrapper for a MIDI-to-MIDI model, which is saved as a scripted PyTorch `.pt` file.


### Data Preparation
Each tokenization method has a particular set of quantized values that are available,
related to pitch, timing, velocity, etc. Because sequence length often has a large impact
on computational time, each model can use a slightly different granularity. To maintain efficiency,
it is helpful for the scripted model to have lists already identifying these available values.

For example, if a MIDI message comes in with ``velocity=43`` and the available values are
``[20, 40, 60, 80, 100, 120]``, then the tokenizer can quickly round the incoming velocity to the
nearest value of ``40``.

Given the original vocab JSON and the type of tokenization method, the data preparation utility
returns a tuple of dictionaries of lists of the relevant data values. Because this happens during the
wrapping procedure, the plugin does not need to recompute the available values on each forward pass.
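
As a concrete illustration, here is a minimal sketch of this nearest-value rounding; the names are
illustrative, not the SDK's internal ones:

```python
from typing import List

# Hypothetical list of available velocities, as produced during data preparation.
available_velocities: List[float] = [20.0, 40.0, 60.0, 80.0, 100.0, 120.0]

def quantize(value: float, available: List[float]) -> float:
    # Round to the available value with the smallest absolute distance.
    return min(available, key=lambda v: abs(v - value))

assert quantize(43.0, available_velocities) == 40.0
```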


### MIDI Data Format

The input tensor has shape (n, 4), where n is the number of MIDI messages. Each MIDI message is a row of the form:

``{type, value, velocity, timestep}``

Current types:
```
0.0 = note on
1.0 = note off
```

For example, ``{0.0, 64.0, 90.0, 2.5}`` = note on, pitch 64, velocity 90, at beat 2.5.

Every tokenization method expects this format as input and returns it as output.
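
A minimal example of building such an input tensor, with a note on at beat 2.5 and its matching
note off at beat 3.0:

```python
import torch

# Shape (n, 4): one row per MIDI message, columns {type, value, velocity, timestep}.
midi_input = torch.tensor([
    [0.0, 64.0, 90.0, 2.5],  # note on,  pitch 64, velocity 90, at beat 2.5
    [1.0, 64.0,  0.0, 3.0],  # note off, pitch 64,              at beat 3.0
])
assert midi_input.shape == (2, 4)
```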

**Timing**:

Within the C++ environment, timing is always expressed as **PPQ**, a float value measured in quarter
notes. Continuing the example above, '2.5' means an eighth note (.5) after the second quarter note (2).
MIDI can communicate time in a number of formats and resolutions, but the input and output must always
adhere to this one, as it determines where the plugin places MIDI events within the buffer.

If, for example, your model uses a 'ticks-per-beat' system with a resolution of 96 ticks per quarter
note, then it is the job of the tokenizer to convert between the PPQ and tick systems. All included
tokenization methods already handle this conversion.
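
The conversion itself is a single multiplication. Here is a minimal sketch under the
96-ticks-per-quarter assumption above; the bundled tokenization methods already do this internally:

```python
TICKS_PER_QUARTER = 96  # example resolution from the paragraph above

def ppq_to_ticks(ppq: float) -> int:
    # Beat 2.5 -> 240 ticks at 96 ticks per quarter note.
    return round(ppq * TICKS_PER_QUARTER)

def ticks_to_ppq(ticks: int) -> float:
    return ticks / TICKS_PER_QUARTER

assert ppq_to_ticks(2.5) == 240
```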

6 changes: 6 additions & 0 deletions neutone_midi_sdk/__init__.py
@@ -0,0 +1,6 @@
from .core import NeutoneMIDIModel
from .tokenization import *
from .parameter import *
from .data_preparation import *
from .constants import *
from .neutoneMIDI_SDK import *
6 changes: 6 additions & 0 deletions neutone_midi_sdk/constants.py
@@ -0,0 +1,6 @@
SDK_VERSION = "0.1.1"

MAX_N_NUMERICAL_PARAMS = 4
MAX_N_TENSOR_PARAMS = 1
SUPPORTED_TOKENIZATIONS = ["MIDILike", "TSD", "REMI", "HVO", "HVO_taps", "Custom"]
MAX_N_CATEGORICAL_VALUES = 20
199 changes: 199 additions & 0 deletions neutone_midi_sdk/core.py
@@ -0,0 +1,199 @@
import torch as tr
from torch import nn, Tensor
from typing import List, Dict, Tuple, Union
from abc import abstractmethod
from neutone_midi_sdk.tokenization import TokenData
from neutone_midi_sdk.parameter import NeutoneParameter
import neutone_midi_sdk.constants as constants


class NeutoneMIDIModel(tr.nn.Module):
def __init__(self,
model: tr.nn.Module,
vocab: Dict[str, int],
tokenizer_type: str,
tokenizer_data: TokenData):

super().__init__()
self.MAX_N_NUMERICAL_PARAMS = constants.MAX_N_NUMERICAL_PARAMS
self.MAX_N_TENSOR_PARAMS = constants.MAX_N_TENSOR_PARAMS
self.SDK_VERSION = constants.SDK_VERSION
self.n_neutone_parameters = len(self.get_neutone_parameters())

# Allocate default numerical params to prevent dynamic allocations later
numerical_default_param_vals = self._get_numerical_default_param_values()
assert len(numerical_default_param_vals) <= self.MAX_N_NUMERICAL_PARAMS, (
f"Number of default numerical parameter values ({len(numerical_default_param_vals)}) "
f"exceeds the maximum allowed ({self.MAX_N_NUMERICAL_PARAMS})."
)
numerical_default_param_values_t = tr.tensor([v for _, v in numerical_default_param_vals])
# Ensure number of parameters is within the maximum allowed
self.n_numerical_neutone_parameters = len(numerical_default_param_vals)
assert self.n_numerical_neutone_parameters <= self.MAX_N_NUMERICAL_PARAMS
# Ensure parameter names are unique
assert len(set([p.name for p in self.get_neutone_parameters()])) == len(
self.get_neutone_parameters()
)
self.register_buffer("tensor_default_param_values", numerical_default_param_values_t.unsqueeze(-1))

# Allocate default tensor params to prevent dynamic allocations later
tensor_default_param_vals = self._get_tensor_default_param_values()
assert len(tensor_default_param_vals) <= self.MAX_N_TENSOR_PARAMS, (
f"Number of default tensor parameter values ({len(numerical_default_param_vals)}) "
f"exceeds the maximum allowed ({self.MAX_N_TENSOR_PARAMS})."
)
# TODO(nic): this assumes a common dimension for all tensor parameters
tensor_default_param_values_t = tr.cat([v for _, v in tensor_default_param_vals])
self.register_buffer("numerical_default_param_values", tensor_default_param_values_t.unsqueeze(-1))

# Save parameter metadata
        self.neutone_parameters_metadata = {
            p.name: p.to_metadata_dict()
            for p in self.get_neutone_parameters()
        }

# Allocate remapped params dictionary to prevent dynamic allocations later
self.remapped_params = {
name: tr.tensor([val])
for name, val in numerical_default_param_vals
}
self.remapped_params.update(
{
name: val
for name, val in tensor_default_param_vals
}
)
self.default_param_values = self.remapped_params

# Save parameter information
self.neutone_parameter_names = [p.name for p in self.get_neutone_parameters()]
# TODO(nic): remove from here once plugin metadata parsing is implemented
self.neutone_parameter_descriptions = [
p.description for p in self.get_neutone_parameters()
]
self.neutone_parameter_used = [p.used for p in self.get_neutone_parameters()]
self.neutone_parameter_types = [
p.type.value for p in self.get_neutone_parameters()
]

        # Store the model in eval mode
model.eval()
self.model = model

# Setup tokenization methods
assert tokenizer_type in constants.SUPPORTED_TOKENIZATIONS, \
f"{tokenizer_type} not a recognized tokenization format."
tokenizer_data = generate_fake_token_data() if tokenizer_data is None else tokenizer_data
vocab = {"v": 0} if vocab is None else vocab
self.midi_to_token_vocab = vocab
self.token_to_midi_vocab = {v: k for k, v in vocab.items()}
self.tokenizer_type = tokenizer_type
self.tokenizer_data: TokenData = TokenData(tokenizer_data.strings, tokenizer_data.floats, tokenizer_data.ints)

@abstractmethod
def _get_numerical_default_param_values(
self,
) -> List[Tuple[str, Union[float, int]]]:
"""
Returns a list of tuples containing the name and default value of each
numerical (float or int) parameter.
        This should not be overridden by SDK users.
"""
pass

@abstractmethod
def _get_tensor_default_param_values(
self,
    ) -> List[Tuple[str, Tensor]]:
"""
Returns a list of tuples containing the name and default value of each
tensor parameter.
        This should not be overridden by SDK users.
"""
pass

@abstractmethod
def get_model_name(self) -> str:
"""
Set the model name
"""
pass

@abstractmethod
def get_model_authors(self) -> List[str]:
"""
Used to set the model authors. This will be displayed on both the
website and the plugin.

Should reflect the name of the people that developed the wrapper
of the model using the SDK. Can be different from the authors of
the original model.

Maximum of 5 authors.
"""
pass

@abstractmethod
def get_model_short_description(self) -> str:
"""
Used to set the model short description. This will be displayed on both
the website and the plugin.

This is meant to be seen by the audio creators and should give a summary
of what the model does.

Maximum of 150 characters.
"""
pass

def get_neutone_parameters(self) -> List[NeutoneParameter]:
return []

@tr.jit.export
def get_neutone_parameters_metadata(self) -> Dict[str, Dict[str, str]]:
"""
Returns the metadata of the parameters as a string dictionary of string
dictionaries.
"""
return self.neutone_parameters_metadata

@tr.jit.export
def get_default_param_values(self) -> Dict[str, Tensor]:
"""
        Returns the default parameter values as a dictionary mapping each
        parameter name to a tensor of default values.
"""
return self.default_param_values

@tr.jit.export
def get_default_param_names(self) -> List[str]:
# TODO(nic): remove this once plugin metadata parsing is implemented
return self.neutone_parameter_names

@tr.jit.export
def get_default_param_descriptions(self) -> List[str]:
# TODO(nic): remove this once plugin metadata parsing is implemented
return self.neutone_parameter_descriptions

@tr.jit.export
def get_default_param_types(self) -> List[str]:
# TODO(nic): remove this once plugin metadata parsing is implemented
return self.neutone_parameter_types

@tr.jit.export
def get_default_param_used(self) -> List[bool]:
# TODO(nic): remove this once plugin metadata parsing is implemented
return self.neutone_parameter_used

def prepare_for_inference(self) -> None:
self.model.eval()
self.eval()


# TODO: deprecate this method; it is used by the "HVO" format, where no TokenData is necessary.
def generate_fake_token_data():
token_strings: Dict[str, List[str]] = {"value": ["value"]}
token_floats: Dict[str, List[float]] = {"value": [0.0]}
token_ints: Dict[str, List[int]] = {"value": [0]}
token_data: TokenData = TokenData(token_strings, token_floats, token_ints)
return token_data