Integrate Azure Cognitive Services in a UWP app

Azure Cognitive Services is an amazing toolset that enables developers to augment the user experience with the power of machine-based intelligence. The API set is powerful and its features are organized in categories: vision, speech, language, knowledge, search, and labs.

In this post we learn how to leverage the Emotion API to detect our user’s mood and set the background of our app accordingly.

Get API key

To get started we need an API key to access the Cognitive Services. So, let’s go to this address (https://azure.microsoft.com/en-us/try/cognitive-services/?api=emotion-api) and click on the Create button next to Emotion API.

After that the website asks us to accept the trial terms of service, and we accept.

After logging in we get access to our keys.

Now we’re done with the Azure website and we can start coding.

Coding

We fire up Visual Studio and create a new UWP project.

To achieve our goal (detect our user’s mood and change the background) we’re going to develop a simple UI where the user presses a button. We then take a picture of the user, send that picture to Azure, and load a background image based on the result.

Before we write code we need to set up our application capabilities in the manifest and download a NuGet package. We double-click the Package.appxmanifest file in Solution Explorer, go to the Capabilities tab, and check the Webcam and Microphone boxes.

Then we download the Microsoft.ProjectOxford.Emotion NuGet package, which contains helper classes for working with Azure Cognitive Services. In Solution Explorer we right-click the project and select Manage NuGet Packages. In the search box we type “Microsoft.ProjectOxford.Emotion” and install the package.

With the following XAML we draw a simple UI with a Button to trigger the camera.

<Page x:Class="IC6.EmotionAPI.MainPage"
      xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
      xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
      xmlns:local="using:IC6.EmotionAPI"
      xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
      xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
      mc:Ignorable="d">

    <Grid Name="myGrid">
        <StackPanel>
            <Button Click="Button_Click">Take a camera picture</Button>

        </StackPanel>
    </Grid>
</Page>

In the handler of the click event we write:


        private async void Button_Click(object sender, RoutedEventArgs e)
        {
            try
            {

                using (var stream = new InMemoryRandomAccessStream())
                {
                    await _mediaCapture.CapturePhotoToStreamAsync(ImageEncodingProperties.CreateJpeg(), stream);

                    stream.Seek(0);

                    var emotion = await MakeRequest(stream.AsStream());

                    if (emotion == null)
                    {
                        await new MessageDialog("No emotions detected.").ShowAsync();

                        return;
                    }

                    var imgBrush = new ImageBrush();

                    // Assumption: the comparison target is Happiness, since the other branch loads happy.jpg
                    if (emotion.Scores.Sadness > emotion.Scores.Happiness)
                    {
                        imgBrush.ImageSource = new BitmapImage(new Uri(@"ms-appx://IC6.EmotionAPI/Assets/sad.jpg"));
                    }
                    else
                    {
                        imgBrush.ImageSource = new BitmapImage(new Uri(@"ms-appx://IC6.EmotionAPI/Assets/happy.jpg"));
                    }

                    myGrid.Background = imgBrush;

                }

            }
            catch (Exception ex)
            {
                await new MessageDialog(ex.Message).ShowAsync();
            }
        }

In this method we’re leveraging the MediaCapture class, which provides functionality for capturing photos, audio, and videos from a capture device such as a webcam. The InitializeAsync method, which initializes the MediaCapture object, must be called before we can start previewing or capturing from the device.

In our exercise we’re going to put the MediaCapture initialization in the OnNavigatedTo method:

        protected async override void OnNavigatedTo(NavigationEventArgs e)
        {
            base.OnNavigatedTo(e);

            if (_mediaCapture == null)
            {
                await InitializeCameraAsync();
            }

        }

InitializeCameraAsync is a helper method we write to search for a camera and try to initialize it if we find one.

        private async Task InitializeCameraAsync()
        {

            // Attempt to get the front camera if one is available, but use any camera device if not
            var cameraDevice = await FindCameraDeviceByPanelAsync(Windows.Devices.Enumeration.Panel.Front);

            if (cameraDevice == null)
            {
                Debug.WriteLine("No camera device found!");
                return;
            }

            // Create MediaCapture and its settings
            _mediaCapture = new MediaCapture();

            var settings = new MediaCaptureInitializationSettings { VideoDeviceId = cameraDevice.Id };

            // Initialize MediaCapture
            try
            {
                await _mediaCapture.InitializeAsync(settings);
            }
            catch (UnauthorizedAccessException)
            {
                Debug.WriteLine("The app was denied access to the camera");
            }
        }

        /// <summary>
        /// Attempts to find and return a device mounted on the panel specified, and on failure to find one it will return the first device listed
        /// </summary>
        /// <param name="desiredPanel">The desired panel on which the returned device should be mounted, if available</param>
        /// <returns>The device on the desired panel, otherwise the first device listed, or null when no camera is available</returns>
        private static async Task<DeviceInformation> FindCameraDeviceByPanelAsync(Windows.Devices.Enumeration.Panel desiredPanel)
        {
            // Get available devices for capturing pictures
            var allVideoDevices = await DeviceInformation.FindAllAsync(DeviceClass.VideoCapture);

            // Get the desired camera by panel
            DeviceInformation desiredDevice = allVideoDevices.FirstOrDefault(x => x.EnclosureLocation != null && x.EnclosureLocation.Panel == desiredPanel);

            // If there is no device mounted on the desired panel, return the first device found
            return desiredDevice ?? allVideoDevices.FirstOrDefault();
        }

Let’s focus on the MakeRequest method we called in the click event handler because here we make use of the Project Oxford library to detect emotions.

        private async Task<Emotion> MakeRequest(Stream stream)
        {
            var apiClient = new Microsoft.ProjectOxford.Emotion.EmotionServiceClient("f1b67ad2720944018881b6f8761dff9a");
            var results = await apiClient.RecognizeAsync(stream);

            if (results == null) return null;

            return results.FirstOrDefault();
        }
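
MakeRequest keeps only the first result, which is fine when a single person sits in front of the camera. If several faces end up in the picture, one possible refinement is to keep the face with the largest bounding box. The sketch below shows that idea in isolation; DetectedFace and FacePicker are hypothetical stand-ins written for this example, not types from the Project Oxford library, so the real result objects would have to be mapped to them first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one detection result; only the face size matters here.
public sealed class DetectedFace
{
    public int Width { get; set; }
    public int Height { get; set; }
    public string Label { get; set; }
}

public static class FacePicker
{
    // Returns the face with the largest bounding-box area,
    // or null when the input is null or empty.
    public static DetectedFace Largest(IEnumerable<DetectedFace> faces)
    {
        if (faces == null) return null;

        return faces
            .OrderByDescending(face => (long)face.Width * face.Height)
            .FirstOrDefault();
    }
}
```

Sorting the whole sequence is more work than strictly needed, but with a handful of faces per picture the simpler code seems like the better trade.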

We need to create an instance of the Microsoft.ProjectOxford.Emotion.EmotionServiceClient class. In the constructor we pass the key obtained from the Azure portal at the beginning of this post. Then, we call the RecognizeAsync method. Here we’re using the overload with the Stream parameter because we have our picture saved in memory. There is also an overload that accepts a URL string. With this call the Azure platform does its magic and soon delivers the result.

RecognizeAsync returns an array of Emotion, one for each detected face. An Emotion is made of a FaceRectangle property and a Scores property. FaceRectangle tells us the coordinates of the detected face, while Scores tells us the confidence for every mood that Azure can detect: anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. Based on this data we can write a few “ifs” to do some fun things like changing the background of our main window.
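
Instead of comparing two scores by hand, we could also pick the mood with the highest confidence and choose the background from that. The following is a minimal sketch under a few assumptions: the scores are first copied into a dictionary keyed by mood name, and MoodPicker, Dominant, and BackgroundFor are hypothetical names introduced only for this example. It is plain C# with no dependency on the Emotion library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MoodPicker
{
    // Returns the name of the mood with the highest confidence,
    // or null when there are no scores to compare.
    public static string Dominant(IReadOnlyDictionary<string, float> scores)
    {
        if (scores == null || scores.Count == 0) return null;

        return scores.OrderByDescending(pair => pair.Value).First().Key;
    }

    // Hypothetical mapping from a mood name to a background asset:
    // the sad picture for sadness, the happy picture for everything else.
    public static string BackgroundFor(string mood)
    {
        return mood == "Sadness" ? "Assets/sad.jpg" : "Assets/happy.jpg";
    }
}

public static class MoodPickerDemo
{
    public static void Main()
    {
        // Made-up confidence values for illustration only.
        var scores = new Dictionary<string, float>
        {
            ["Sadness"] = 0.7f,
            ["Happiness"] = 0.1f,
            ["Neutral"] = 0.2f
        };

        var mood = MoodPicker.Dominant(scores);
        Console.WriteLine(mood + " -> " + MoodPicker.BackgroundFor(mood));
    }
}
```

Note that this behaves slightly differently from the click handler above: the sad background is chosen only when sadness beats every other mood, not just one of them.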

TL;DR

In this post we learned how to detect the current mood of our user. We achieved that by using the front camera to take a picture and then calling the Azure Emotion API to guess whether our user is happy or not. We had to declare the webcam capability in the app manifest so that the OS can ask the user for permission according to the privacy settings.

If you want to learn more about the MediaCapture class, visit MSDN (https://docs.microsoft.com/en-us/uwp/api/windows.media.capture.mediacapture) and the Azure Cognitive Services (https://azure.microsoft.com/en-us/services/cognitive-services/) website. The source code of the app developed in this post is available on my GitHub (https://github.com/phenixita/IC6.EmotionAPI).

Social

If you have questions you can use the comment section below or you can find me on Twitter! If you liked this post, share!
