by Andy Carluccio
The files used in this tutorial are available HERE.
Using Teachable Machine with NDI-Classify for Image Recognition in Isadora
The ability to classify images and the people or objects within them is revolutionizing computer science thanks to advances in machine learning and specialized computing hardware that have significantly accelerated this area of research over the past decade. Despite its scientific roots, the power of image classification is no longer restricted to academics, as projects like Teachable Machine have expanded access to image classification technologies to anyone with a web browser. In this knowledgebase article, I will demonstrate how to leverage real-time image classification with Isadora so that you can explore the implications of this technology in your own artistic practices.
DISCLAIMER: This article assumes that you are already familiar with using Open Sound Control and NewTek NDI with Isadora.
STEP 1: Using Teachable Machine
Teachable Machine is a free, web-based tool from Google that allows you to generate machine learning models online. It has many capabilities and classification systems, but the feature we are interested in is the ability to export TensorFlow models, which are the files that store the trained classifier. We will use the Teachable Machine website to train the classifier and then download the model file to use with NDI-Classify.
Note: You do not need Teachable Machine to generate TensorFlow models. What sets it apart is its ease of use, but if you already have a model you would prefer to use, you can skip this section.
Here are the steps to get to the training area:
- Navigate to https://teachablemachine.withgoogle.com/ in your browser
- Click “Get Started”
- Click “Image Project” (but take note of the exciting other modes you can use!)
- Click “Standard Image Model”
You should now see this page:
This is the area where you will train your image classifier. On the left, you create your different classes. A class is a category you want the model to recognize. For example, if you want a model that can tell you apart from your pet, you might upload images of yourself as Class 1 and images of your pet as Class 2. These images are the “training set” for the model. Once trained, the model will try to determine which class best describes each new image it is given.
Teachable Machine allows you to upload images from your computer or use your camera to define a class. You can also use a virtual camera if you have one available. Feel free to name the classes, though the names will not be retained by NDI-Classify later. As a result, it is critical to remember the numeric order of your classes. For example, you would want to make a note that class 1 is you, class 2 is your dog, and class 3 is your fish. When training a robust model, a variety of reference images will help the model succeed in classifying new images, so keep that in mind when recording or uploading the images.
After setting up your classes, click “Train Model” in the center node. Be sure not to change tabs during training. When the model finishes training, you will be able to preview the results on the right with your webcam or files from your computer. You can see the scores for each class at the bottom of the node. Feel free to redo your model until you are satisfied with the results. When you are ready to proceed, use the following steps:
- Click Export in the right node
- Select the center “Tensorflow” Tab in the top area
- Leave the conversion model type as “Keras”
- Click Download Model
- In a few minutes, the website will provide a model to download (converted_keras.zip)
- Extract the .h5 model file from the zip. This is your model!
STEP 2: Using NDI-Classify
I created NDI-Classify to make it easy to use TensorFlow models with Isadora by utilizing NewTek NDI to receive video feeds and Open Sound Control (OSC) to communicate the classification results from anywhere on the network.
NDI-Classify is an open-source Python application. You can find the project at:
- Mac & Windows - https://github.com/ohglobal/NDI-Classify. The executables are available for download under Releases on the right. The Mac and Windows releases are separate downloads, so please download the most recent release for your platform.
As an aside, if you are a programmer interested in contributing to the project, please open a pull request!
Once you have downloaded the zip file for NDI-Classify, extract the contents. Paste your .h5 file next to the NDI-Classify.exe file, replacing the existing .h5 file. Then, run NDI-Classify.exe.
When prompted, tick the box to allow NDI-Classify to communicate on your private network, then click “Allow Access”.
In a few moments, text will appear in the console that opens with the application. There may be warnings related to TensorFlow, but you can ignore them. You will be prompted to either enter a custom name for the model file or use the default name. Use the default name if you did not rename the .h5 file from Teachable Machine. You are then prompted to select a location to send the OSC data to. By default, the app will send to 127.0.0.1:1234, which works with Isadora's default settings when Isadora is running on the same computer. You can enter your own networking information if you want to send to Isadora on another computer with a private IP address.
Next, the program will display a list of NDI sources available on your network and ask you to select a set of those sources to use with the application. You select video feeds by entering their indices as a list of comma-separated values. For example, if the output is:
3 NDI Sources Detected…
- Isadora Stage Output 1
- NDI Scan Converter
- vMix External Output 1
You could run the model on Isadora and vMix by entering: 0,2
Note: the indices are zero-based, so 0 is the Isadora Stage in this case.
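As a rough illustration of how a comma-separated selection such as 0,2 maps onto the source list, here is a minimal Python sketch. The function name `parse_source_selection` and its validation rules are hypothetical and written for this article; they are not taken from the NDI-Classify source code, which may handle input differently.

```python
def parse_source_selection(text, source_count):
    """Turn input such as "0,2" into a list of zero-based source indices."""
    indices = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "0,,2" or a trailing comma
        index = int(part)  # raises ValueError for non-numeric entries
        if not 0 <= index < source_count:
            raise ValueError(f"index {index} is outside 0..{source_count - 1}")
        indices.append(index)
    return indices


# The example source list from above, in the order it was displayed.
sources = ["Isadora Stage Output 1", "NDI Scan Converter", "vMix External Output 1"]

selected = parse_source_selection("0,2", len(sources))
for index in selected:
    print(f"selected source {index}: {sources[index]}")
```

Because the indices are zero-based, entering 3 for a list of three sources is rejected in this sketch rather than silently ignored.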
NDI-Classify will run the model on each NDI feed and report scores via OSC. The output format for NDI-Classify is:
/ndiClassify/source/SOURCE_ID class_1_weight, class_2_weight, ...
Using the earlier example, if we had two classes in our model, our outputs could be:
/ndiClassify/source/0 0.32 0.68
/ndiClassify/source/1 0.9 0.1
Here, source 0 carries the class weights for Isadora Stage Output 1 and source 1 carries the class weights for vMix External Output 1. These indices follow the order in which you entered the sources (0,2 in this case), not their positions in the NDI source list. The values of these OSC outputs are the weights for each class, essentially the votes that the model makes on each classification. The higher a weight, the more certain the classifier is that the incoming video contains that class.
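If you want to inspect these messages outside of Isadora, the sketch below decodes a raw OSC packet using only the Python standard library. It assumes that NDI-Classify sends each weight as a 32-bit float argument (OSC type tag `f`), which is the usual choice but is not confirmed by this article. The helper names are hypothetical, and the packet is built locally with `encode_osc_floats` purely so the example runs without a network connection.

```python
import struct


def _read_padded_string(data, offset):
    """Read a null-terminated OSC string and skip padding to a 4-byte boundary."""
    end = data.index(b"\x00", offset)
    text = data[offset:end].decode("ascii")
    next_offset = (end + 4) & ~3  # strings end with 1 to 4 null bytes
    return text, next_offset


def decode_osc_floats(packet):
    """Decode one OSC message whose arguments are all float32 (assumed format)."""
    address, offset = _read_padded_string(packet, 0)
    type_tags, offset = _read_padded_string(packet, offset)
    if not type_tags.startswith(","):
        raise ValueError("missing OSC type tag string")
    values = []
    for tag in type_tags[1:]:
        if tag != "f":
            raise ValueError(f"unsupported OSC type tag: {tag}")
        (value,) = struct.unpack_from(">f", packet, offset)  # big-endian float32
        values.append(value)
        offset += 4
    return address, values


def encode_osc_floats(address, values):
    """Build a test packet in the same assumed format."""
    def pad(raw):
        return raw + b"\x00" * (4 - len(raw) % 4)

    type_tags = "," + "f" * len(values)
    body = struct.pack(f">{len(values)}f", *values)
    return pad(address.encode("ascii")) + pad(type_tags.encode("ascii")) + body


# Sources in the order they were entered (0,2), so source ID 1 is vMix.
selected_names = ["Isadora Stage Output 1", "vMix External Output 1"]

packet = encode_osc_floats("/ndiClassify/source/1", [0.9, 0.1])
address, weights = decode_osc_floats(packet)
source_id = int(address.rsplit("/", 1)[1])
print(selected_names[source_id], weights)
```

To read live data instead, you could bind a UDP socket to the port you gave NDI-Classify and pass each received datagram to `decode_osc_floats`, keeping in mind that only one application can normally listen on a given port at a time.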
STEP 3: Connecting to Isadora
The procedure for connecting to Isadora is relatively straightforward, but the specific steps will depend on what you want to do with the end product. As a result, we will explore a specific example for the remainder of the article.
In this example, you will find a project that distinguishes between me and two of my dog’s toys: a pizza slice and a burger. I trained a model on Teachable Machine by uploading reference images of each class following the steps above. The example Isadora file will annotate the video of me holding up the toys with the name of the toy it thinks I am holding, or my name if it primarily sees me. I have intentionally not tuned the model so as to exaggerate any potential issues you might face and to keep the base file simple. A proper implementation would likely involve gates on the values and timers to ensure the classes are not flipping rapidly between frames.
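To make the idea of gates and timers more concrete, here is a minimal Python sketch of one possible approach. It is not part of the example Isadora file. The class name `StableClass`, the threshold of 0.6, and the frame count are illustrative assumptions, and a count of consecutive frames stands in for a real timer. In Isadora itself you would build the equivalent logic from actors.

```python
class StableClass:
    """Report a new class only after it wins `hold_frames` frames in a row
    with a weight at or above `threshold`."""

    def __init__(self, threshold=0.6, hold_frames=5):
        self.threshold = threshold
        self.hold_frames = hold_frames
        self.current = None     # class index currently reported
        self._candidate = None  # class index waiting to take over
        self._streak = 0        # consecutive frames won by the candidate

    def update(self, weights):
        best = max(range(len(weights)), key=lambda i: weights[i])
        if weights[best] < self.threshold or best == self.current:
            # Gate: too uncertain, or nothing changed. Drop any pending candidate.
            self._candidate = None
            self._streak = 0
            return self.current
        if best == self._candidate:
            self._streak += 1
        else:
            self._candidate = best
            self._streak = 1
        if self._streak >= self.hold_frames:
            self.current = best
            self._candidate = None
            self._streak = 0
        return self.current


stable = StableClass(threshold=0.6, hold_frames=3)
frames = (
    [[0.9, 0.05, 0.05]] * 3    # class 0 wins three frames in a row
    + [[0.1, 0.8, 0.1]]        # a single-frame flicker to class 1
    + [[0.9, 0.05, 0.05]] * 2
    + [[0.1, 0.8, 0.1]] * 3    # class 1 now holds for three frames
)
history = [stable.update(frame) for frame in frames]
print(history)
```

In this run the single-frame flicker is ignored, and the reported class changes only once the new class has held for three consecutive frames.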
Extract the attached example and run the Isadora file. The patch will generate an NDI feed on your network because the Stage 1 output has NDI enabled. The movie is a silly video of me holding up my dog’s toys. Under Communications > Stream Setup, note that Isadora has been set to receive OSC from NDI-Classify on channel 10, and that the OSC Multi Listener in the patch has three outputs. The included TensorFlow model will classify the pizza toy as Class 1, the burger toy as Class 2, and me as Class 3. The values then go into my custom User Actor for comparison, which delivers a trigger at the index of the highest value. Those triggers then force the correct label into the Text Draw actor for the annotation.
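The comparison step amounts to finding the position of the largest weight and looking up the matching label. The Python sketch below shows that logic under stated assumptions: the label strings and the function name `label_for` are illustrative and not taken from the patch. Note that Class 1 in Teachable Machine corresponds to index 0 here.

```python
# Labels in the same order as the classes were created in Teachable Machine.
# NDI-Classify does not retain class names, so this list is kept by hand.
CLASS_LABELS = ["Pizza", "Burger", "Andy"]


def label_for(weights, labels=CLASS_LABELS):
    """Return (zero-based index, label) for the highest class weight."""
    if len(weights) != len(labels):
        raise ValueError("expected one weight per class")
    best_index = max(range(len(weights)), key=lambda i: weights[i])
    return best_index, labels[best_index]


index, label = label_for([0.1, 0.7, 0.2])
print(f"Class {index + 1}: {label}")
```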
To run the model after launching Isadora, copy the .h5 file to the NDI-Classify.exe directory, then run the application using the above steps. If running on the same computer as Isadora, the default network settings are fine. Otherwise, input the private IP address where Isadora is running into NDI-Classify when prompted. Be sure to select the NDI feed generated by Isadora; otherwise, the app will not be looking at the correct video. A few seconds after the app begins running the model, OSC will flow into your Isadora patch and begin the annotation logic.
While the example project demonstrates a silly use case of identifying dog toys in Isadora, the implications of machine learning for live performances are staggering. In practice, TensorFlow models can do much more than classify images, including sound analysis and pose recognition. Imagine being able to trigger effects in your Isadora patch based upon pose recognition of a dancer, for example. When you include TensorFlow in your patch, you are teaching Isadora how to “see” the world!