FluentRealSense – the first steps to a simpler RealSense

Those who know me are aware that I have a long term association (nay, let’s say it for what it is, love affair) with the RealSense platforms from Intel. If you’ve been following developments in that space for any length of time, you’re probably aware that things have moved on a lot and the cameras are available in all sorts of devices and are no longer just limited to the Windows platform. This shift of emphasis means that Intel have moved away from concentrating on the old monolithic RealSense SDK and have moved to supporting a new open source API (available here: https://github.com/IntelRealSense/librealsense).

I have spent a period of time working with this API as it has evolved and have started “wrapping” it to make it a little bit friendlier to use (in my opinion) for people who don’t want to get bogged down with the “nitty gritty” of how to get devices, and so on. This work has been in C++, the language the API is available in, and I am going to cover, over a series of posts, how this library evolved. I’ll show you the code I’ve written at each step so that you can also see how I have gone about relearning modern C++ (bearing in mind it has been 18 years since I last worked with it in anger), so you will definitely find that early iterations of the code are naive, at best. My aim with this library was to make it as fluent as possible, hence its name, FluentRealSense.

An upfront note: early iterations of this codebase have been done entirely on Windows running Visual Studio 2017. I chose this particular environment because a) Visual Studio is, to my mind, the finest IDE around, and b) Visual Studio 2017 supports porting your C++ code directly over to Linux. This lets me combine the rapid turnaround time I’m used to in my dev environment with the ease of deploying to my intended targets later on. Thank you, Microsoft, for keeping this old developer happy.

Let’s start with the first iteration. The first thing I wanted the code to do was to provide me with the ability to iterate over all the cameras on my system and print out some diagnostic information. This meant that my code would be broken down initially into three classes:
information: This class reads information from the API about a single camera and returns it as a string.
camera: This is a single device. Initially, it’s just going to instantiate and provide access to the information class.
cameras: The entry point for consuming applications, this class is responsible for finding out about and managing access to each camera; this class is also enumerable so that the calling code can quickly iterate over each camera.

Let’s start off by looking at the information class. First we’re going to add some includes and namespaces:

#pragma once
#include "hpp/rs_context.hpp"

using namespace std;
using namespace rs2;

class information
{
public:
information() {}
~information() {}
};

That’s all very straightforward and nothing you won’t have seen before. Let’s start fleshing this out.

The first thing we’re going to need is a RealSense device to get information from – this is going to be passed in so let’s add a member to store it and replace our constructor with this:

explicit information(const device device) : device_(device) {}
// ...
private:
device device_;

At this stage, our code looks like this:

#pragma once
#include "hpp/rs_context.hpp"

using namespace std;
using namespace rs2;

class information
{
public:
explicit information(const device device) : device_(device) {}
~information() {}

private:
device device_;
};

I have to say, at this point, I love the improvements to instantiating members that C++ now provides. This is a wonderful little innovation.

Now, in order to get the information out of the API, I’m going to add a handy little private helper method. This makes use of the fact that the API exposes this information via an rs2_camera_info enumeration:

const char* get_info(const rs2_camera_info info) const
{
if (!device_.supports(info))
{
return "Not supported";
}
return device_.get_info(info);
}

This is the first point at which our code adds any value. Whenever we call get_info against the underlying device, we have to check that that particular type of info can actually be retrieved for that device. It’s very easy to write code that assumes every camera supports exactly the same set of properties, in which case the underlying call could throw an exception. To get around this, our code checks whether the device supports that particular rs2_camera_info type and returns “Not supported” if it doesn’t. It is so easy to forget that get_info should always be paired with supports, so this helper method removes the need to remember that.

So, how do we use this? Well, with some public methods like these:

const char* name() const
{
return get_info(RS2_CAMERA_INFO_NAME);
}

const char* serial_number() const
{
return get_info(RS2_CAMERA_INFO_SERIAL_NUMBER);
}

const char* port() const
{
return get_info(RS2_CAMERA_INFO_PHYSICAL_PORT);
}

const char* firmware_version() const
{
return get_info(RS2_CAMERA_INFO_FIRMWARE_VERSION);
}

const char* debug_opCode() const
{
return get_info(RS2_CAMERA_INFO_DEBUG_OP_CODE);
}

const char* advanced_mode() const
{
return get_info(RS2_CAMERA_INFO_ADVANCED_MODE);
}

const char* product_id() const
{
return get_info(RS2_CAMERA_INFO_PRODUCT_ID);
}

const char* camera_locked() const
{
return get_info(RS2_CAMERA_INFO_CAMERA_LOCKED);
}

string dump_diagnostic() const
{
string text = "\nDevice Name: ";
text += name();
text += "\n Serial number: ";
text += serial_number();
text += "\n Port: ";
text += port();
text += "\n Firmware version: ";
text += firmware_version();
text += "\n Debug op code: ";
text += debug_opCode();
text += "\n Advanced Mode: ";
text += advanced_mode();
text += "\n Product id: ";
text += product_id();
text += "\n Camera locked: ";
text += camera_locked();

return text;
}

We now have our complete information class. It looks like this:

#pragma once
#include "hpp/rs_context.hpp"

using namespace std;
using namespace rs2;

class information
{
public:
explicit information(const device device) : device_(device) {}
~information()
{
}

const char* name() const
{
  return get_info(RS2_CAMERA_INFO_NAME);
}

const char* serial_number() const
{
  return get_info(RS2_CAMERA_INFO_SERIAL_NUMBER);
}

const char* port() const
{
  return get_info(RS2_CAMERA_INFO_PHYSICAL_PORT);
}

const char* firmware_version() const
{
  return get_info(RS2_CAMERA_INFO_FIRMWARE_VERSION);
}

const char* debug_opCode() const
{
  return get_info(RS2_CAMERA_INFO_DEBUG_OP_CODE);
}

const char* advanced_mode() const
{
  return get_info(RS2_CAMERA_INFO_ADVANCED_MODE);
}

const char* product_id() const
{
  return get_info(RS2_CAMERA_INFO_PRODUCT_ID);
}

const char* camera_locked() const
{
  return get_info(RS2_CAMERA_INFO_CAMERA_LOCKED);
}

string dump_diagnostic() const
{
  string text = "\nDevice Name: ";
  text += name();
  text += "\n Serial number: ";
  text += serial_number();
  text += "\n Port: ";
  text += port();
  text += "\n Firmware version: ";
  text += firmware_version();
  text += "\n Debug op code: ";
  text += debug_opCode();
  text += "\n Advanced Mode: ";
  text += advanced_mode();
  text += "\n Product id: ";
  text += product_id();
  text += "\n Camera locked: ";
  text += camera_locked();
  return text;
}

private:
  const char* get_info(const rs2_camera_info info) const
  {
    if (!device_.supports(info))
    {
      return "Not supported";
    }
    return device_.get_info(info);
  }
  device device_;
};

The next thing we have to do is write a class that represents a single RealSense camera. There’s not much to this class, at the moment, so let’s look at it in its entirety.

#pragma once
#include "hpp/rs_context.hpp"
#include "information.h"

using namespace std;
using namespace rs2;

class camera
{
public:
  explicit camera(const device dev) : information_(make_shared<information>(dev)) {}

  ~camera()
  {
  }

  shared_ptr<information> get_information() const
  {
    return information_;
  }

private:
  shared_ptr<information> information_;
};

I did say this class was pretty light at the moment. The class accepts a single RealSense device which is used to instantiate the information class. We provide one method which is used to get the instance of the information class. That’s it so far.

Finally, we come to the entry point of our code, the cameras class. This class lets us enumerate all of the cameras on our system and access the functions inside. As usual, we’ll start off with the definition:

#pragma once
#include <memory>
#include <vector>
#include "camera.h"
#include "hpp/rs_context.hpp"

using namespace std;
using namespace rs2;

class cameras
{
public:
  cameras() {}
  ~cameras() {}
};

As you will remember, I said that I wanted the cameras to be enumerable so I need to do some upfront declarations:

using cameras_t = vector<shared_ptr<camera>>;
using iterator = cameras_t::iterator;

using const_iterator = cameras_t::const_iterator;

With these in place, I can now start to add the ability to enumerate over the camera instances. Before I do that, though, it’s time to introduce something new. In the preceding code, we saw that the RealSense camera was represented as a device that we passed into the relevant constructors. The question is, how did we get that device in the first place? Well, that’s down to the API providing a context that allows us to access these devices. So, let’s add a member to store a vector of camera instances and then build in the method to get the list of devices.

cameras() : cameras_(make_shared<cameras_t>())
{
  context context;
  // Iterate over the devices;
  auto devices = context.query_devices();
  for (const auto dev : devices)
  {
    const auto cam = make_shared<camera>(dev);
    cameras_->push_back(cam);
  }
}

private:
  shared_ptr<cameras_t> cameras_;

There’s nothing complicated in that code. We get the devices from the context using query_devices and iterate over each one, adding a new camera instance to our vector. We have reached the point where we can now add the ability to enumerate over our vector. All the scaffolding is in place so let’s add that capability.

int capacity() const
{
  return cameras_->capacity();
}

iterator begin() { return cameras_->begin(); }
iterator end() { return cameras_->end(); }

const_iterator begin() const { return cameras_->begin(); }
const_iterator end() const { return cameras_->end(); }
const_iterator cbegin() const { return cameras_->cbegin(); }
const_iterator cend() const { return cameras_->cend(); }

That’s it. We now have the ability to build and iterate over the devices on our system. Let’s see how we would go about using that. To test it, I created a little console application that I used to call my code. It’s as easy as this:

#include "stdafx.h"
#include "cameras.h"
#include <iostream>

int main()
{
  const auto devices = std::make_shared<cameras>();
  for (auto &dev : *devices)
  {
    cout << dev->get_information()->dump_diagnostic();
  }
  return 0;
}

This is what it looks like on my system (running a web camera and a separate RealSense camera).

[Image: diagnostic output from both cameras]


Sensing the future with WPF

This post is a look into a new library that I’m writing that’s intended to make life easier for WPF developers working with Intel RealSense devices. As many of you may know, I’ve been involved with the RealSense platform for a couple of years now (back from when it was called the Perceptual Computing SDK). When I develop samples with it, I tend to use WPF as my default development experience, and the idea of hooking up the Natural User Interface capabilities of RealSense devices with the NUI power of WPF in an easy to use package is just too good to resist. On top of this, I still strongly believe in WPF and it will take a lot to remove me from developing desktop applications with it because it is just so powerful.

To this end, I have started developing a library called RealSenseLight that will enable WPF developers to easily leverage the power of RealSense without having to worry about the implementation details. While it’s primarily aimed at WPF developers, the functionality available will be usable from other C# (Windows Desktop) applications, so hooking into Console applications will certainly be possible.

One of the many decisions I’ve taken is to allow configuration of features to be set via a fluent interface, so it’s possible to do things like this:

RealSenseApplication.Uses(new EmotionDetectionConfiguration())
  .Uses(SpeechRecognition.DefaultConfiguration().ChangePitch(0.8))
  .Start();

ViewModels will be able to hook into RealSense using convenient interfaces that abstract the underlying implementations. There’s no need to call Enable… to enable a RealSense capability. The simple fact of integrating a concrete implementation means that the feature is automatically available. The following example demonstrates what an IoC resolved implementation looks like:

public class EmotionViewModel : ViewModelBase
{
  private readonly IEmotion _emotion;
  public EmotionViewModel(IEmotion emotion)
  {
    _emotion = emotion;
    _emotion.OnUserHappy(user => System.Diagnostics.Debug.WriteLine("{0} is happy", user.DetectedUser));
  }
}

The library will provide the ability to do things such as pausing and resuming individual RealSense capabilities, and identifying and choosing from the RealSense compatible devices attached to the system. Choosing a device does require you to identify up front which aspects you’re interested in, because the library uses those requirements to evaluate which devices meet them; the sketch below illustrates the idea.
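To make that a little more concrete, here’s a minimal, self-contained sketch of the capability-matching idea. RealSenseLight’s device-selection API hasn’t been published yet, so every name below (Capability, DeviceDescriptor, DeviceSelector) is an illustrative assumption rather than the library’s actual surface:

using System.Collections.Generic;
using System.Linq;

// Hypothetical types only - RealSenseLight's real API may look quite different.
public enum Capability { HandTracking, FaceTracking, SpeechRecognition, EmotionDetection }

public class DeviceDescriptor
{
  public string Name { get; set; }
  public HashSet<Capability> Capabilities { get; set; }
}

public static class DeviceSelector
{
  // Returns only the devices that support every capability the caller asked for up front.
  public static IEnumerable<DeviceDescriptor> Matching(
    IEnumerable<DeviceDescriptor> devices, params Capability[] required)
  {
    return devices.Where(d => required.All(c => d.Capabilities.Contains(c)));
  }
}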

I’m still fleshing out what the whole interface will look like, so all of the features haven’t been determined yet, but I will keep posting my designs and a link to the repo once I have it in a state where it’s ready for an initial commit.

Getting a RealSense of my status

Long time readers will have realised that I have been spending a lot of time with the technology that was formerly known as Perceptual Computing (PerC). You may also know that this technology is now known as RealSense and that it will be rolling out to a device near you soon. What you might not know is that I’m currently writing a course on this technology for Pluralsight. As part of writing this course, I’ve been creating a few little wrapper utilities that will make your life easier when developing apps with the SDK.

In this post, I’m going to show you a handy little method for working with API methods. Pretty much every RealSense API method returns a status code to indicate whether or not it was successful. Now, it can get pretty tedious writing code that looks like this:

pxcmStatus status = Session.CreateImpl<PXCMVoiceRecognition>
  (PXCMVoiceRecognition.CUID, out _voiceRecognition);
if (status < pxcmStatus.PXCM_STATUS_NO_ERROR)
{
  throw new InvalidStatusException("Could not create session", status);
}
status = _voiceRecognition.QueryProfile(out pInfo);
if (status < pxcmStatus.PXCM_STATUS_NO_ERROR)
{
  throw new InvalidStatusException("Could not query profile", status);
}

As you can imagine, the more calls you make, the more status checks you have to do. Well, I like to log information about what I’m attempting to do and what I have successfully managed to do, so this simple method really helps to write information about the methods being invoked, and to throw an exception if things go wrong.

public void PipelineInvoke(Func<pxcmStatus> pipelineMethod, string loggingInfo = "")
{
  if (!string.IsNullOrWhiteSpace(loggingInfo))
  {
    Debug.WriteLine("Start " + loggingInfo);
  }
  pxcmStatus status = pipelineMethod();
  if (status < pxcmStatus.PXCM_STATUS_NO_ERROR)
  {
    throw new InvalidStatusException(loggingInfo, status);
  }
  if (!string.IsNullOrWhiteSpace(loggingInfo))
  {
    Debug.WriteLine("Finished " + loggingInfo);
  }
}

This makes it easier to work with the API and gives us code that looks like this:

PipelineInvoke(() => 
  Session.CreateImpl<PXCMVoiceRecognition>(PXCMVoiceRecognition.CUID, 
  out _voiceRecognition), "creating voice recognition module");

And this is what InvalidStatusException looks like:

public class InvalidStatusException : Exception
{
  public InvalidStatusException(string alertMessage, pxcmStatus status) : base(alertMessage)
  {
    Status = status;
  }
  public pxcmStatus Status { get; private set; }
}

Over the course of the next couple of months, I’ll share more posts with you showing the little tricks and techniques that I use to make working with the RealSense SDK a real joy.

Haswell – Intel SDP Unit (Software Developer Preview) – a death in the family

Okay, that’s possibly a bit too melodramatic a title, but that’s almost what it felt like when my Ultrabook decided to pack in and shuffle off to the great gig in the sky. This post will contain no screenshots for reasons that will soon become abundantly clear.

Some context first: I’ve been using the Ultrabook pretty much continuously since I got it. It was my default, go-to, day to day development box. Yup, that’s a lot of different ways of saying that I was using the Ultrabook very heavily. So heavily, in fact, that I was using it as the workhorse for developing my Synxthasia project for Intel; a Theremin(ish) type of music application which translated gestures in 3D space into sound and visuals – it even took your picture at intervals and worked that into the visuals. I was particularly proud of the fact that it gave you the ability to alter the “shape” of the sound to simulate a wah effect just by using your mouth. The SDP really is an excellent device for development tasks like this.

So, about a week before I was due to submit this to Intel, with the app heavily in the polishing stages, the Ultrabook died on me – taking with it 8 days of code that hadn’t been committed back to source control (I have no excuses really, as I should have been committing as I went along). Remember that I said it had the habit of giving the low battery warning and then switching off? Well, this is exactly what happened. Anyhoo, I plugged it in and started it back up – all was fine and dandy, but I noticed that Windows was reporting that there was no battery present. I couldn’t leave the Ultrabook permanently plugged in and I needed to go to a client site, so I unplugged the unit and set off. When I tried to power it back on later, it refused to start – all that happened was the power button LED flashed blue and that was it.

I got in touch with contacts in Intel and they provided some first line support options which included opening the unit and disconnecting the batteries. Unfortunately, this hasn’t worked – the unit still has all the vitality of the Norwegian Blue. Intel has offered to replace the device but as I have been away from the physical unit for a while, I’ve been unable to take them up on their kind offer. Once I get back home, I will avail myself of their help because the device itself really is that good.

What has become clear to me over the last year or so, is just how much Intel cares about the feedback it gets. I cannot stress enough how they have listened to reviewers and developers like myself, and how they have sought to incorporate that feedback into their products. As a developer, it’s a genuine pleasure for me to deal with them as they really do seem to understand what my pain points are, and they provide a heck of a lot of features to help me get past those problems. Why do I say this? Well, I was lucky enough to have an SDP from last year to use and while it was a good little device, there were one or two niggles that really got to me as I used it – the biggest problem, of course, being the keyboard. The first SDP just didn’t have a great keyboard. The second issue was that the first SDP also felt flimsy – I always felt that opening the screen was a little fragile and that I could do damage if I opened it too vigorously. Well, not this time round – the updated version of the SDP has a fantastic keyboard, and feels as robust as any laptop/Ultrabook that I’ve used, and all this in a slimmer device. You see, that’s what I mean by Intel listening. I spend a lot of time at the keyboard and I need to feel that it’s responsive – and this unit certainly is. The thing is, Intel aren’t selling these units – they are giving them away for developers to try out for free. It would be perfectly understandable if they cut corners to save costs, but it’s patently apparent that they haven’t – they really have tried to give us a commercial level device. Yes, I’ve been unlucky that the Haswell died on me, but given the care from Intel this hasn’t tarnished my opinion of it.

So, does the death of the device put me off the Haswell? Of course it doesn’t. For the period I have been using it, it has been a superb workhorse. It combines good looks with performance with great battery life. It’s been absolutely outstanding and I wouldn’t hesitate to recommend it. Needless to say, I’m delighted with the help and support that I’ve had from Intel – it certainly hasn’t been any fault of theirs that I’ve been unable to return the device for them to replace. I’d also like to thank Rick Puckett at Intel for his help and patience, as well as Carrie Davis-Sydor and Iman Saad at DeveloperMedia for helping hook me up with the unit in the first place and their help when the Haswell shuffled off the mortal coil.

Haswell – Intel SDP Unit (Software Developer Preview) – The Keyboard fights back

Well, since my initial review of the Ultrabook, I’ve been using it for pretty much all my computing needs. From writing C++ and C# code, through standard surfing and the like, to heavier duty 3D work, the Haswell just keeps on going.

First of all, let me just say that the keyboard on this generation of Intel SDP is a whole lot better than the one that shipped with last year’s SDP. This one actually feels solid and responsive, and it features a nice little backlight. Intel really has upped the game for what are effectively demo units here. My only, admittedly minor, niggle is the fact that the keyboard is designed for the US market. When I change it to the UK settings, the \ key is missing.

Writing code using this Ultrabook is a pleasant experience. The screen resolution is good, and the screen itself is bright and very pleasant to use. Unfortunately, the 4GB of memory means that there’s sometimes a slowdown when compiling a C++ application from scratch using Visual Studio 2012. Once you’re past the first compilation though, the compile process is responsive enough, so this isn’t a huge issue. As I’m a heavy user of Visual Studio and WPF, I’m pleased to see that the responsiveness of the WPF designer window is nice and snappy – there’s no hold up in the system making me wish I was using a different machine. Even the dodgy upper case menus in Visual Studio don’t distract me while I’m looking at this glorious screen.

Visual Studio running on the Ultrabook.

One thing I’ve tried to do is keep the settings of the machine as default as possible (except for the keyboard layout – that’s a compromise too far). This has allowed me to test the claims that the Haswell gives me long battery life. Well, it’s true – using it for a mixture of DirectX development, 3D work and WPF development has allowed me to get just over 7 hours on a single charge. It’s not quite the potential 12 hours, but my day to day usage is pretty atypical, so I wouldn’t be unduly worried about the battery life if I were looking to use this on a really long journey. The reason I mention the defaults is that Windows 8 suddenly tells me that I’ve got about 5% battery life left and then shuts down – there’s not enough time for me to get the unit plugged in.

Now Pete – surely it can’t be all sweetness and light, can it? Where are the niggles that must exist here, or are you playing nicey-nice with Intel here? Well, there is one thing that really gets me annoyed and that’s the Trackpad. If I click on the left hand side of the Trackpad (anywhere on the left), I can trigger a click – and that’s exactly what I’d expect. Clicking or tapping on the right hand side doesn’t trigger anything. This has caused me quite a bit of frustration. It’s my issue to deal with though, as the Trackpad is a multi touch unit so it’s something I’m going to have to work on.

You may wonder what applications I typically have running on the Ultrabook, to see how this compares to what you would use. Well, on a day to day basis, I typically have up to 4 instances of Visual Studio 2012 running simultaneously. I also have a couple of instances of Expression Blend running, as well as Chrome with a minimum of 10 to 12 fairly heavy web pages open (if you’re really interested, these pages are Facebook, Code Project – several instances covering different forums and articles – Gmail, Twitter, The Daily Telegraph and nufcblog.org). I also usually have Word, Excel, Photoshop, Huda and Cinema 4D running, along with Intel’s Perceptual Computing SDK and Creative Gesture camera. As you can see, that’s a lot of heavy duty processing going on there, and this machine copes with it admirably.

Okay, to whet your appetite – I’ve been using the Ultrabook, off the charger, just editing documents and web pages for the last two hours, and I still have 91% battery life left. That’s absolutely incredible.

[Image: battery indicator]

 

Haswell – Intel SDP Unit (Software Developer Preview) – 1st impressions

The reveal

Bonk, bonk bonk bonk bong….

That’s the sound that greeted me when I opened the box from Intel™ featuring a developer preview Haswell Ultrabook™, and there’s no geek in the world who wouldn’t want to repeat that sound, so after opening and closing the case for ten minutes or so, I decided to actually get the Ultrabook™ out of the box.

WP_20130801_001

Okay, this is the Ultrabook™ packaging (outside of the cardboard it was shipped in). Little did I know the simple joy I was to receive upon opening this box.

WP_20130801_002

I’ve opened up the packaging, and this is the sight that greets me. I’m keen to start the unit, but I want to see what other goodies the box has inside.

WP_20130801_003

Is that a carry case for the Ultrabook™? Nope, it’s just packing material to keep the unit from getting scratched.

WP_20130801_004

Ahh, that’s more like it. Cables and a USB drive (containing the Windows reinstallation media). The reflection in this image is from the plastic covering that protects this layer.

WP_20130801_005

Okay, the plastic is off and I can see the quick start guide as well as the power supply and USB drive. That PSU is tiny. Consider me seriously impressed.

WP_20130801_006

What have we here? Ah, Mini HDMI to Full, to VGA and a USB to Ethernet. Thoughtful of them to supply this.

WP_20130801_007

Well, there’s the keyboard. Nice.

After starting the machine up, it’s a simple matter to configure Windows™ 8. Windows really suits a machine like this.

Day to day

Well, the first thing to do with a machine like this is to take it out for a drive. In my case, this means installing Visual Studio Ultimate 2012. As you might expect, I have several computers, so I have a good reference point for how long this actually takes to install. On my old i7 with a spindle hard drive, it took up to an hour to install. The question, then, is how long it would take to install on this machine with the same feature set. Start to finish, the whole process took 8 minutes. Yup, you read that right, 8 minutes start to finish. I couldn’t ask for a better initial impression than that.

The good

This unit is fast. Oh, it’s so very fast. Okay, the boot up time is a tad slower than my i7 Lenovo™ Yoga 13 (by less than half a second), but once you actually get going, the machine is very responsive. While it’s a fairly entry level i5, it is more than capable of coping with my day to day development tasks. Opening up a XAML page in Visual Studio 2012 and showing it with both the designer and code view open, the code view updated Intellisense as I typed, and the designer refreshed itself snappily – those of you who develop XAML based apps know how slow this can be – it was an absolute joy to behold.

The unit is running Windows 8 Pro (Build 9200), and Windows works well on it. The screen is responsive to the touch, and Windows feels snappy and responsive.

The machine is light. Maybe it’s not Mac Air light, but it’s heading in the right direction, and it’s thin. If you were only going to use this on your desktop, this wouldn’t matter but if, like me, you travel a lot, you’ll really appreciate both.

That battery life. Oh boy, 6 hours spent doing a lot of fairly intensive development and the battery still has plenty of life to give yet. This really has hit the sweet spot with regards to providing me with portability. I can easily see myself whiling away journeys using this.

The niggles

This is going to sound churlish considering I’ve received this unit as a “favour”, but I do wish it had more than 4GB of RAM and a bigger hard drive than it has (a 128GB SSD).

The shift keys are on the large side, but I have no doubt I will soon get used to them.

There’s a lot of real estate lost around the screen, but I’ve no doubt that commercial units will take care of this, using all the available space.

It’s hard to put my finger on what the exact problem is, but the keys don’t feel quite right to me. There’s a bit of play to them that doesn’t make them feel that stable. Obviously, as this is not a commercial Ultrabook™, I’m not unduly worried – I would expect a commercial Ultrabook™ to have a more solid feel to the keys.

So, where are we at?

I’ve been asked to write the reviews as unbiased as I possibly can be. This review is my initial impression only. My next review will cover using this machine on a daily basis. As I do a lot of Ultrabook™ related work as part of my daily development routine, this will be my primary device, so I’ll let you know how this compares.

Disclosure of Material Connection:

I received one or more of the products or services mentioned above for free in hope that I would mention it on my blog. Regardless, I only recommend products or services I use personally and believe my readers will enjoy. I am disclosing this in accordance with the Federal Trade Commission’s 16 CFR, Part 255: “Guides Concerning the Use of Endorsements and Testimonials in Advertising.”

Ultimate Coder – Ma il Mio Mistero è Chiuso In Me

This is a copy of the post I made on the Intel site here. For the duration of the contest, I am posting a weekly blog digest of my progress with using the Perceptual Computing items. This week’s post shows how Huda has evolved from the application that was created at the end of the third week. My fellow competitors are planning to build some amazing applications, so I’d suggest that you head on over to the main Ultimate Coder site and follow along with their progress as well. I’d like to take this opportunity to wish them good luck.

Executive summary

The judges will be pleased to know that this blog entry is a lot less verbose than previous ones. This isn’t because there’s not a lot to talk about, but rather because I’m concentrating on writing a lot of code this week, and I’ve hit a real roadblock with regards to the gesture camera. Code that was working perfectly well last week has suddenly stopped working. This means that I’ve got some serious debugging to do here to find out why the gesture code is no longer displaying the cursor on the screen – this doesn’t happen all the time, but it’s happened often enough for me to be concerned that there’s an underlying problem there that needs to be sorted out.

As part of my remit is to help coders learn to write code for the Perceptual SDK, I will be continuing writing about the coding process, so I apologise to the judges in advance – there’s nothing I can do to alleviate the detail of explanation. If they wish to skip ahead, the summary talks about where we are today and what my plans are for next week, and here are some screen shots. Unlike my previous blogs, I’m not going to be breaking this one down on a day by day basis. There’s a lot that needs to be done, and it doesn’t break down neatly into days.

For those who want to know the result, don’t look away now. I found and fixed the issue with the gesture code. It’s all a matter of patient debugging and narrowing down areas where there can possibly be problems.

I’ve been wondering how best to train people in how to use Huda, and this week I’ve hit on the solution: I’m going to let Huda teach you. By the end of the week, I had added in voice synthesis, and I think I can use this to help provide context. The type of thing I have in mind is: say you bring up the filters, Huda will tell you that a swipe left or right will cycle through the filters, and a thumbs up will add the filter to the currently selected image. The beauty is, now that I have the code in place to do the voice synthesis, I can easily add this type of context.
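The post doesn’t pin down which synthesis API Huda uses for this, but just to illustrate the kind of contextual prompt I’m describing, a minimal sketch with the stock .NET synthesizer might look like this (ContextualHelp is an assumed name, not Huda’s actual class):

using System.Speech.Synthesis; // requires a reference to the System.Speech assembly

public class ContextualHelp
{
  private readonly SpeechSynthesizer synthesizer = new SpeechSynthesizer();

  // Called when the filter carousel is shown, so the app can explain the gestures in context.
  public void AnnounceFilterHelp()
  {
    synthesizer.SpeakAsync(
      "Swipe left or right to cycle through the filters. " +
      "Give a thumbs up to apply the filter to the selected image.");
  }
}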

Voice recognition is an area of the Perceptual SDK that has a long way to go. It just isn’t accurate enough and, for that reason, I’m dropping support for it from Huda. I had the commands “Open Filter” and “Close Filter” in the application, and no matter what I tried, it kept turning “Open Filter” into “Close Filter”. Yes, it recognises a few words, but given that the accuracy is so hit and miss, I can’t leave it in there. So, I apologise to you right now – voice control is out. I had to take this decision this week because there are certain things I was using it for that I now have to accomplish in other ways, and I really don’t have the time to keep mucking around with this.

If you don’t want to read the rest of the post, this is what Huda looks like now:

Woe to the interface

It’s going to be a busy week this week. The more I play with the interface, the more I find it clunky and unpolished, and this is incredibly frustrating. While I know that I’m a long way from polishing the interface, I can’t help but think that I’m missing a trick or two here. The biggest thing that I have issues with is the concept of showing the folders and selecting the images. So, the first thing I’m going to do is to unify the two parts; this should make it obvious that the picture selection and folder selection actually belong together. Part of my thinking here is that I want it to be apparent to anyone who’s used a photo editing application before that this is still a photo editing application. While it’s fine to play about with an interface, there still has to be a certain level of familiarity; there has to be a feeling of comfort in using the application otherwise the user will end up being overwhelmed and will give up on the application. While the contest is about working with Perceptual computing and features of the ultrabook, it’s important that I don’t lose sight of the fact that this must be an application that people can use.

Once I’ve got this in place, I want to address the selection of “things to do to the photograph”. By this, I mean that I want to put some form of toolbar into place to trigger options. Don’t worry that I’m throwing sections of the UI together – I’m not. What I will be doing here is giving the user obvious trigger points – I can use this to bridge the gap between traditional photo editing applications and Huda. There has to be some familiarity for the user here, so I’ll be providing that today.

You may think that this is going to take a lot of work, and in other languages you may well be right, but I’m using MVVM and WPF. As the different parts of the interface are loosely coupled controls, it’s a simple matter for me to move the controls into other containers, and they will still work. The only change I’m going to make is to keep the images showing regardless of whether or not the user has opened a folder.

With the basics of the “toolbar” in place, I now have something that feels more natural for using gestures with. I’m going to hook the filter selection into both touch and gestures, and do something a little bit fancy for displaying the list of filters. What I’ll do is move the filters into a carousel, and use the swipe left/swipe right to move backwards and forwards through the list.

I’m concerned about the speed of loading of various items when things are selected, so I will be moving that work out onto background threads.
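I’m not going to show Huda’s actual loading code here, but as a rough sketch of the idea (assuming .NET 4.5 async/await; FilterPreviewLoader and LoadPreviews are my names, not Huda’s members):

using System.Collections.Generic;
using System.Threading.Tasks;

public class FilterPreviewLoader
{
  // Stands in for the slow, CPU-bound work of generating filter previews for a photo.
  private IList<string> LoadPreviews(string photoPath)
  {
    return new List<string> { photoPath + " (blur)", photoPath + " (sepia)" };
  }

  // Awaiting Task.Run keeps the UI thread free while the previews are generated;
  // execution resumes back on the UI thread once the work completes.
  public async Task<IList<string>> LoadPreviewsAsync(string photoPath)
  {
    return await Task.Run(() => LoadPreviews(photoPath));
  }
}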

The interface is really starting to firm up now. It’s starting to feel that little bit more polished to me, but there is more that I can do with it.

Don’t worry that all the filters have the same name and appearance right now. This is purely a visual that will be changed soon – I need to come up with some good iconography for these. It’s strangely satisfying swiping your hand left and right to rotate through this list. My final action on this end will be to introduce a gesture that actually applies the filter. The thumbs up seems oddly appropriate to me.

The big reveal

After a frenzy of activity on the actual interface, I want to address one of the central concepts of Huda; namely, the ability to filter the image, and not touch the underlying image. This is where I give up the secret of Huda – it’s all a massive con. If you remember, I promised that the underlying image would not be touched. I’ve had people get in touch with all sorts of wild theories on how I was going to do this – from the person who thought I was going to use steganography to hide this information in the picture to someone who thought I was going to convert the file into a custom format and put something in place to fool image editors into thinking that the type was still the original file format.

If I were to tell you that the solution to this problem was incredibly straightforward, and blindingly obvious, would this surprise you? Well, here it is. Here’s how it’s done…. Did you catch that? Yes, it’s all smoke and mirrors. What Huda does is actually not modify the original image at all – instead, we maintain an index of all the files that we’ve edited in Huda, along with the details of all the edits that have been performed, and these can be played again and again and again. In other words, when you open a file up in Huda, it goes to the index first to see if you’ve previously opened it. If you have, it gets the image transformations and reapplies them there and then. It’s as simple as that. Of course, this does rely on you not moving photos around, but I do have some plans in place for a post contest version of Huda to take care of that as well. Anyone for hashing?
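If you want to picture how that index might hang together, here’s a heavily simplified, in-memory sketch using the IImageTransform interface that I’ll introduce in the next section. Huda’s real index is persisted and tracks rather more than this; EditIndex is just an illustrative name:

using System.Collections.Generic;
using System.Windows.Media.Imaging;

public class EditIndex
{
    // Maps the full path of a photo to the ordered list of edits applied to it.
    private readonly Dictionary<string, List<IImageTransform>> edits =
        new Dictionary<string, List<IImageTransform>>();

    public void Record(string path, IImageTransform transform)
    {
        List<IImageTransform> list;
        if (!edits.TryGetValue(path, out list))
        {
            list = new List<IImageTransform>();
            edits[path] = list;
        }
        list.Add(transform);
    }

    // Replays every recorded edit against the untouched original, in order.
    public BitmapSource Apply(string path, BitmapSource original)
    {
        List<IImageTransform> list;
        if (!edits.TryGetValue(path, out list))
        {
            return original;
        }
        var result = original;
        foreach (var transform in list)
        {
            result = transform.Transform(result);
        }
        return result;
    }
}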

Transformers – Robert’s in disguise

A judgely warning – this section is going to be intense. I’m sorry, but there’s no way around the fact that we need to write some code, and there needs to be some explanation of it. It’s going to be quite dry, so I won’t blame you if you want to skip to the bit at the end.

As a first pass for the transformations that I want to apply, I decided that I needed to declare a common interface that all my transformation classes would implement. It’s a very simple interface, but as I want all my transformations to be serializable, I’ve made sure that it inherits from ISerializable by default:

using System.Windows.Media.Imaging;
using System.Runtime.Serialization;
namespace Huda.Transformations
{
    public interface IImageTransform : ISerializable
    {
        int Version { get; }
        BitmapSource Transform(BitmapSource source);
    }
}

At this point it’s merely a matter of creating some simple transformations that use this functionality. Being the nice chap that I am, and to give Nicole some coding kudos that she can drop into casual conversation, I’ll show you all the classes I’m using for transformation. First of all, here’s the image crop routine:

using System.Windows.Media.Imaging;
using System.Windows;
using System.Runtime.Serialization;
using System;
namespace Huda.Transformations
{
    [Serializable]
    public class CropImage : IImageTransform
    {
        public CropImage()
        {
            Version = 1;
        }
        protected CropImage(SerializationInfo info, StreamingContext context) : this()
        {
            Version = info.GetInt32(Constants.Version);
            Left = info.GetInt32(Constants.Left);
            Top = info.GetInt32(Constants.Top);
            Width = info.GetInt32(Constants.Width);
            Height = info.GetInt32(Constants.Height);
        }
        public int Version
        {
            get;
            private set;
        }
        public int Left { get; set; }
        public int Top { get; set; }
        public int Width { get; set; }
        public int Height { get; set; }
        public BitmapSource Transform(BitmapSource source)
        {
            Int32Rect rect = new Int32Rect(Left, Top, Width, Height);
            source = new CroppedBitmap(source, rect);
            return source;
        }
        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            info.AddValue(Constants.Version, Version);
            info.AddValue(Constants.Left, Left);
            info.AddValue(Constants.Top, Top);
            info.AddValue(Constants.Width, Width);
            info.AddValue(Constants.Height, Height);
        }
    }
}

There’s a lot of code in here that you’ll see in other classes, so I’ll just explain it once and you should easily be able to follow how I’m using it in other locations. The first thing to notice is that I’ve marked this class as Serializable – you need to do that if you want .NET to do its mojo when you save things. You’ll see that there are two things inside the class that say CropImage; these are the constructors that are used to create the object. The second one is the special serialization constructor, and is the reason that I had IImageTransform implement ISerializable. .NET knows, when it sees this interface, that you want to read and write out the values in the class yourself. There are many, many reasons that I want to do this, but the main one is what happens if you just let the runtime try to figure this stuff out itself – it’s not very performant because of the way it has to map things together. By taking control of this myself, I make the job easy. If you see how I use info.GetInt32 (there are many other methods for reading other types of data), it’s apparent what type of data each property is – we must always make sure we get it right, otherwise unpredictable and nasty things can happen.

At the bottom of the code, there’s a matching method called GetObjectData that writes the items out to the serialization stream. It’s important to use the same keys here as the ones you read in the constructor; the property names don’t have to match those keys, but the keys used for reading and writing must match each other.

The Version currently does nothing, but if I have to add features to any of these classes, I can use this to work out which properties should be present, so the saving and reading of the data carries on seamlessly. It’s always a good idea to design your serialization with this firmly in mind.
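To make that a little more concrete, this is the kind of thing the serialization constructor could do if, say, a hypothetical KeepAspectRatio property were added in version 2 of CropImage (this isn’t in Huda today; it’s purely an illustration of the versioning idea):

protected CropImage(SerializationInfo info, StreamingContext context) : this()
{
    Version = info.GetInt32(Constants.Version);
    Left = info.GetInt32(Constants.Left);
    Top = info.GetInt32(Constants.Top);
    Width = info.GetInt32(Constants.Width);
    Height = info.GetInt32(Constants.Height);
    if (Version >= 2)
    {
        // Only present in data written by version 2 or later, so files saved
        // by version 1 still deserialize without throwing.
        KeepAspectRatio = info.GetBoolean("KeepAspectRatio");
    }
}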

The Left, Top, Width and Height properties must be set before we attempt to call Transform. If we forget to set them, all that will happen is that the rectangle that’s created will be 0 pixels in size, and we wouldn’t want a zero pixel crop.

The Transform method is where the “clever” stuff actually happens. I bet you thought there would be more to it. All it does is create a rectangle based on the size and location set in the properties, and then it uses the inbuilt CroppedBitmap class to actually perform the crop.

See! I told you that was easy. Next, I’ll show you what the FlipImage transformation looks like:

using System;
using System.Windows.Media.Imaging;
using System.Windows.Media;
using System.Runtime.Serialization;
namespace Huda.Transformations
{
    [Serializable]
    public class FlipImage : IImageTransform
    {
        public FlipImage()
        {
            Version = 1;
        }
        protected FlipImage(SerializationInfo info, StreamingContext context)
        {
            Version = info.GetInt32(Constants.Version);
            HorizontalFlip = info.GetBoolean(Constants.Horizontal);
        }
        public int Version { get; private set; }
        public bool HorizontalFlip { get; set; }
        public BitmapSource Transform(BitmapSource source)
        {
            source = new TransformedBitmap(source, new ScaleTransform(HorizontalFlip ? -1 : 1, HorizontalFlip ? 1 : -1));
            return source;
        }
        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            info.AddValue(Constants.Version, Version);
            info.AddValue(Constants.Horizontal, HorizontalFlip);
        }
    }
}

I won’t explain this code, as it’s largely the same as the CropImage code. If you’re interested why I didn’t put things like the Version and Transform into an abstract base class, please let me know and I’ll attempt to answer it in the comments. At this point, the judges are screaming for mercy and begging me to stop, so I’ll try not to go too far into philosophical architectural debates.

The ResizeImage transformation is an interesting one (well, interesting if you’re into that type of thing), and the Transform code is more complex than in the other transformations. To save you having to fight with this yourself, here it is:

using System;
using System.Windows.Media.Imaging;
using System.Windows.Media;
using System.Windows;
using System.Runtime.Serialization;
namespace Huda.Transformations
{
    [Serializable]
    public class ResizeImage : IImageTransform
    {
        public ResizeImage()
        {
            Version = 1;
        }
        protected ResizeImage(SerializationInfo info, StreamingContext context)
            : this()
        {
            Version = info.GetInt32(Constants.Version);
            Width = info.GetInt32(Constants.Width);
            Height = info.GetInt32(Constants.Height);
        }
        public int Version
        {
            get;
            private set;
        }
        public int Width { get; set; }
        public int Height { get; set; }
        public BitmapSource Transform(BitmapSource source)
        {
            Rect rect = new Rect(0, 0, Width, Height);
            DrawingVisual visual = new DrawingVisual();
            using (DrawingContext ctx = visual.RenderOpen())
            {
                ctx.DrawImage(source, rect);
            }
            RenderTargetBitmap resized = new RenderTargetBitmap(
                Width, Height, 96, 96, PixelFormats.Default
                );
            resized.Render(visual);
            return resized;
        }
        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            info.AddValue(Constants.Version, Version);
            info.AddValue(Constants.Width, Width);
            info.AddValue(Constants.Height, Height);
        }
    }
}

The constants we’d use:

namespace Huda.Transformations
{
    public class Constants
    {
        public const string Left = "Left";
        public const string Right = "Right";
        public const string Top = "Top";
        public const string Bottom = "Bottom";
        public const string Width = "Width";
        public const string Height = "Height";
        public const string Horizontal = "Horizontal";
        public const string Version = "Version";
    }
}

I have been asked how the gesture hookup behaves. Well, imagine you were tracking mouse moves to see which items were under the cursor; you might create a Blend Behavior that looked something like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows;
using System.Windows.Interactivity;
using System.Windows.Input;
using System.Windows.Controls;
using System.Diagnostics;
using System.Timers;
using System.Windows.Media;
using LinqToVisualTree;
using System.Windows.Threading;

namespace Huda.Behaviors
{
    public class GestureMovementBehavior : Behavior<TreeView>
    {
        private bool inBounds;
        private Timer selectedTimer = new Timer();
        private TreeViewItem selectedTreeview;
        private TreeViewItem previousTreeView;

        protected override void OnAttached()
        {
            Window.GetWindow(AssociatedObject).MouseMove += GestureMovementBehavior_MouseMove;
            selectedTimer.Interval = 2000;
            selectedTimer.AutoReset = true;
            selectedTimer.Elapsed += selectedTimer_Elapsed;
            base.OnAttached();
        }

        void selectedTimer_Elapsed(object sender, ElapsedEventArgs e)
        {
            selectedTimer.Stop();
            Dispatcher.Invoke(DispatcherPriority.Normal,
                (Action)delegate()
                {
                    if (previousTreeView != null)
                    {
                        previousTreeView.IsSelected = false;
                    }
                    previousTreeView = selectedTreeview;
                    selectedTreeview.IsSelected = true;
                });
        }

        protected override void OnDetaching()
        {
            Window.GetWindow(AssociatedObject).MouseMove -= GestureMovementBehavior_MouseMove;
            base.OnDetaching();
        }

        void GestureMovementBehavior_MouseMove(object sender, MouseEventArgs e)
        {
            // Are we over the treeview?
            Point pt = e.GetPosition(Window.GetWindow(AssociatedObject));
            Rect rect = new Rect();
            rect = AssociatedObject.RenderTransform.TransformBounds(
                new Rect(0, 0, AssociatedObject.ActualWidth, AssociatedObject.ActualHeight));
            if (rect.Contains(pt))
            {
                if (!inBounds)
                {
                    inBounds = true;
                }
                // Now, let's test to see if we are interested in this coordinate.
                if (selectedRectangle == null || !selectedRectangle.Value.Contains(pt))
                {
                    GetElement(pt);
                }
            }
            else
            {
                if (inBounds)
                {
                    selectedTimer.Stop();
                    inBounds = false;
                }
            }
        }

        private Rect? selectedRectangle;
        private void GetElement(Point pt)
        {
            IInputElement element = AssociatedObject.InputHitTest(pt);
            if (element == null) return;

            TreeViewItem t = FindUpVisualTree<TreeViewItem>((DependencyObject)element);
            if (t != null)
            {
                // Get the bounds of t.
                if (selectedTreeview != t)
                {
                    selectedTimer.Stop();
                    // This is a different item.
                    selectedTreeview = t;
                    selectedRectangle = selectedTreeview.RenderTransform.TransformBounds(
                        new Rect(0, 0, t.ActualWidth, t.ActualHeight));
                    selectedTimer.Start();
                }
            }
            else
            {
                // Stop processing and drop this out.
                selectedTimer.Stop();
                selectedTreeview = null;
            }
        }

        private T FindUpVisualTree<T>(DependencyObject initial) where T : DependencyObject
        {
            DependencyObject current = initial;

            while (current != null && current.GetType() != typeof(T))
            {
                current = VisualTreeHelper.GetParent(current);
            }
            return current as T;
        }
    }
}

With only a minimal amount of change, that’s exactly the Behavior that you hook up to a treeview.
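For completeness, attaching it is a one-liner. In Huda it would most likely be declared in XAML, but the code-behind equivalent looks something like this (filterTreeView is an assumed control name, not one from Huda):

using System.Windows.Controls;
using System.Windows.Interactivity;
using Huda.Behaviors;

public static class BehaviorSetup
{
    public static void Attach(TreeView filterTreeView)
    {
        // Adding to the element's behavior collection calls OnAttached for us.
        Interaction.GetBehaviors(filterTreeView).Add(new GestureMovementBehavior());
    }
}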

 

And there you have it – code to do some basic transformations. The other filters that I apply will all follow this pattern, so it’s an incredibly trivial thing for me to put in place – and, more importantly, this provides the core functionality for me to save out the filters along with details of the files they apply to. All of a sudden, this application is starting to come together.
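Just to show how little effort another filter takes, here’s what a hypothetical RotateImage could look like, following exactly the same pattern (this one isn’t part of Huda, and the Angle key isn’t in the Constants class, so I’ve used a literal string):

using System;
using System.Runtime.Serialization;
using System.Windows.Media;
using System.Windows.Media.Imaging;
namespace Huda.Transformations
{
    [Serializable]
    public class RotateImage : IImageTransform
    {
        public RotateImage()
        {
            Version = 1;
        }
        protected RotateImage(SerializationInfo info, StreamingContext context) : this()
        {
            Version = info.GetInt32(Constants.Version);
            Angle = info.GetDouble("Angle");
        }
        public int Version { get; private set; }
        // Rotation in degrees, clockwise.
        public double Angle { get; set; }
        public BitmapSource Transform(BitmapSource source)
        {
            return new TransformedBitmap(source, new RotateTransform(Angle));
        }
        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            info.AddValue(Constants.Version, Version);
            info.AddValue("Angle", Angle);
        }
    }
}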

This week’s music

  • Deep Purple – Purpendicular
  • Elbow – Build A Rocket Boys
  • David Garrett – Rock Symphonies
  • Crash Test Dummies – God Shuffled His Feet
  • Nickelback – All The Right Reasons
  • Jethro Tull – Aqualung Live
  • Creedence Clearwater Revival – Creedence Clearwater Revival
  • Deep Purple – The Battle Rages On
  • Caitlin – The Show Must Go On

Status update

I’ve reached the end of the week and it’s worth taking stock of where I’ve got to. The harsh reality right now is that I won’t be delivering a fully-featured photo editing application at the end. Before the alarm bells start ringing, this is exactly what I expected. The aim of this competition is not to release the next Photoshop at the end of week 7. That would be a physical impossibility, given that I’m incorporating non-Photoshop features. I am confident, however, that I will deliver what I was aiming at from the very start – a non-destructive, gesture and touch driven photo editing application. For this reason, I am only going to incorporate a limited number of filters. Don’t worry though, if you want to use Huda, I will continue developing it once the contest has finished. Your investment in using it will not be wasted.

Right now, Huda is limited to only editing a single image at a time. If I have time, before the end of the contest, I will give Huda the ability to edit multiple photos. Over the next week, I’m going to be tackling the reordering of items in the list, fleshing out that toolbar and also having the interface behave a little bit differently when it’s in tablet or desktop mode. Right now, the design has all been based on a tablet experience – but let’s see what we can do for the desktop users. Oh, and I should also have the first draft of the voice context in place as well.

Ultimate Coder – Week 3: Blog posting

This is a copy of the post I made on the Intel site here. For the duration of the contest, I am posting a weekly blog digest of my progress with using the Perceptual Computing items. This week’s post shows how Huda has evolved from the application that was created at the end of the first week. My fellow competitors are planning to build some amazing applications, so I’d suggest that you head on over to the main Ultimate Coder site and follow along with their progress as well. I’d like to take this opportunity to wish them good luck.

Week 2

Well, it’s week 2 now and development is carrying on apace. I’ve managed to get a fair bit into Huda in the first week, so the question has to be asked “Will Peter manage to maintain this awesome pace?” We all know that was the question you wanted to ask, but were far too nice to ask, so we’ll take it as read, shall we?

Monday

Today, I’m going to be working on adding some filters to the pictures that I’m opening. In the short term, all I’m going to do is hardcode the application to apply blur and Sepia filters to any picture that opens. This isn’t a huge task, and it’s a great way to get something in place that I can use to style the open filters window.

As I want to use filters from many places, I’m going to create a central location for all my filter code. I’ll treat this as a service and use something that’s known rather nerdily as Service Location to get a single reference to my filter code wherever it’s needed. Ultimately, this is going to be the master model for the whole approach of managing and manipulating the filters, including the “save” method. One thing I want to do with this approach is get away from the concept of the user having to save their work. It’s no longer 1990, so why should the user have to remember to save their work? This is probably going to be the most controversial area for those who are used to traditional desktop applications but, given that we are building the next generation of application for people who are used to working with tablets and phones, I think this is an anachronism that we can dispense with quite happily.
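I’m not showing Huda’s actual filter model here, and the post doesn’t say which locator library is used, but the shape of the service-location idea is roughly this (IFilterService and ServiceLocator are illustrative names only):

using System;
using System.Collections.Generic;

public interface IFilterService
{
    // Records a filter against a photo; the central model owns what that means.
    void ApplyFilter(string photoPath, IImageTransform filter);
}

public static class ServiceLocator
{
    private static readonly Dictionary<Type, object> services = new Dictionary<Type, object>();

    public static void Register<T>(T service)
    {
        services[typeof(T)] = service;
    }

    public static T Resolve<T>()
    {
        // Every caller that resolves IFilterService gets the same single instance.
        return (T)services[typeof(T)];
    }
}

Any view model that needs the filter model then calls ServiceLocator.Resolve<IFilterService>() and gets back exactly the same instance as everyone else.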

Right, that’s a central filter model in place, so all I need to do is hook up the photo selection code and *bang* we have our filters being applied. As that was so easy to do, why don’t we get some feedback on the photograph as well? Let’s calculate the RGB histograms and display them to the user. We’ll put them at the bottom of the display, and add some animation to them so that we can start to get some feedback from them. Again, because I’m working in WPF, and I’m using the MVVM design pattern, the actual screens and animation are incredibly trivial to implement.
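I’m using AForge.NET for the heavy image work (more on that below), but just to illustrate where the histogram numbers come from, here’s a minimal WPF-only sketch that counts per-channel values straight out of a BitmapSource (HistogramCalculator is an assumed name, not Huda’s class):

using System.Windows.Media;
using System.Windows.Media.Imaging;

public static class HistogramCalculator
{
    // Returns 256-bin histograms for the red, green and blue channels of an image.
    public static int[][] Calculate(BitmapSource source)
    {
        // Normalise to a known 32-bit BGRA layout so the byte offsets below are valid.
        var bgra = new FormatConvertedBitmap(source, PixelFormats.Bgra32, null, 0);
        int stride = bgra.PixelWidth * 4;
        var pixels = new byte[stride * bgra.PixelHeight];
        bgra.CopyPixels(pixels, stride, 0);

        int[] red = new int[256], green = new int[256], blue = new int[256];
        for (int i = 0; i < pixels.Length; i += 4)
        {
            blue[pixels[i]]++;      // B
            green[pixels[i + 1]]++; // G
            red[pixels[i + 2]]++;   // R
        }
        return new[] { red, green, blue };
    }
}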

As an aside, here, you may be wondering what I’m using in terms of external libraries other than the Perceptual SDK. While I developed my own MVVM framework, I decided for this implementation to go with my good friend Laurent Bugnion’s MVVM Light, as there’s a good chance that the WPFers who end up downloading and playing with the source will have had exposure to that. I contemplated using Code Project uber developer Sacha Barber’s excellent Cinch implementation, but the fact that Laurent had converted MVVM Light over to .NET 4.5 was the winning vote for me here. For the actual photo manipulation, I’ve gone with AForge.NET. This is a superb library that does all that I need, and so much more. Again, I could have gone with home rolled code, but the benefit of doing so is far outweighed by the tight timescales.

Tuesday

It’s Tuesday, now, and Intel have arranged a Google hangout for today. This is our opportunity to speak directly to the judges and to give feedback on where we are, and what’s going on. I’ve left the office earlier than I normally would, so as to be in plenty of time for the hangout. All I need is a link from our contacts and I am good to go.

Okay, the hangout has started and I’ve still had no link – nothing. I can see the other parties, but I can’t get in. I’ll fire off an email and hope that someone opens it to let me in. Robert and Wendy are incredibly clued in, so I’m hopeful that I should be able to join soon.

The hangout has ended, and I’m feeling pretty despondent now. Nothing came through, so I haven’t had a chance to talk things through with the others, or to talk to the judges. I’d hoped that they would be online, as well, so we could get some feedback from them. Still, this did give me time to noodle around on a guitar for an hour or so – something that I haven’t had any time for since the competition started – so that was good. For those who are interested, I was practising the Steve Morse trick of alternate picking arpeggios that you would normally sweep pick; talk about a challenge. (Note – I was playing this on an Ibanez Prestige RG550XX – a very smooth guitar to play on).

Anyway, I don’t have time to let this get in the way of the coding. As all the other competitors noted, time is tight. Time is so tight that at least one team has allocated a project manager to help them prioritise and manage this as a deliverable. I can’t do this, unfortunately, but at least I won’t have to fill in status reports on my progress other than through the blog. One of my fellow solo coders, Lee, made some bold claims in his blog posting, and what I’ve seen from his blog video shows that those bold claims are backed up by bold ability – he’s definitely one to watch here; his project really could revolutionise online webinars and Skype-like activities.

Today’s code is doing some animation on the histogram views, and then hooking that animation up to the gesture camera. This is where things start to get serious. There are, effectively, two approaches I can take here. The first approach is to take the coordinates I’m getting back from the camera and use the VisualTreeHelper.HitTest method in WPF to identify the topmost element that the coordinates fall over. This is fairly trivial to implement, but it’s not the method I’m going to take, for several reasons, the top ones being:

  1. This returns the topmost item at that point, regardless of whether it’s something I could actually be interested in. This means I’ll get continuous reports of things like the position being over a layout Grid or Border.
  2. The hit test is slow. What it would be doing, on every frame from the camera, is looking through the VisualTree for the topmost item at that point. That’s an expensive computational operation.
  3. Once I’ve identified that the topmost item is actually something I’m interested in, the code would have to trigger some operation there. This means that I would have to put knowledge of the views I’m showing into the code behind the main window. This is something that I want to avoid at all costs, because this hard coupling can lead to problems down the line.

The approach I’m taking is for each control that I want to respond to the gesture camera to subscribe to the gesture events coming out of it. What this means in code is that each control has to hook into the same gesture camera instance – I’ve created what’s known as a singleton in my Perceptual SDK that serves up a single instance of each perceptual pipeline that I’m interested in. In real terms, this currently intersperses the gesture logic in a few locations, because I want the code that displays the finger tracking image handled in the main window, but the code that works out whether or not the finger is over a control needs to live in the control itself. Today, I’m just going to get this working in one control, but the principle should allow me to generalise it out into functionality you can subscribe to if you want this handled for you.
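To make the singleton part concrete, the shared access point looks something like this – GesturePipeline is the class I show later in this post, but the PipelineManager wrapper and its naming are just a sketch of the approach rather than the actual Huda code:

using System;

public static class PipelineManager
{
    // One lazily-created pipeline instance, shared by every subscriber.
    private static readonly Lazy<GesturePipeline> gesturePipeline =
        new Lazy<GesturePipeline>(() => new GesturePipeline());

    public static GesturePipeline Gesture
    {
        get { return gesturePipeline.Value; }
    }
}

// The main window and any interested control all hook the same instance:
//   PipelineManager.Gesture.HandMoved += OnHandMoved;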

As I’ve added an animation to the histogram view, this seems to be a great place to start. What I’ll do here is have one animation triggered when the gesture moves into the control, and a second animation triggered when the gesture moves out of it. I now have enough of a requirement to start coding. The first thing I have to do is hook into the gesture event inside the Loaded event for the control; doing it there avoids any problems with view start-up order, where we could otherwise end up trying to hook into the event before the pipeline has been initialised.

Okay, now that I’ve got that event in place, I can work out whether or not the X and Y coordinates fall inside my control. Should be simple – I just need to translate the coordinates of my control into screen coordinates and as I’m just working with rectangular shapes, it’s a simple rectangle test. So, that’s what I code – half a dozen lines of boilerplate code and a call to trigger the animation.
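Those half a dozen lines boil down to something like the sketch below, living in the control’s code behind. Note that it quietly assumes the incoming point is already in window coordinates – which, as you’re about to see, is exactly the assumption that bites me:

// Sketch only: is the gesture point inside this control?
private bool IsPointOverControl(FrameworkElement control, Point windowPoint)
{
    // Translate the control's top-left corner into window coordinates.
    Point topLeft = control.TransformToAncestor(Application.Current.MainWindow)
                           .Transform(new Point(0, 0));

    var bounds = new Rect(topLeft, new Size(control.ActualWidth, control.ActualHeight));
    return bounds.Contains(windowPoint);
}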

Time to run the application. Okay, I’m waving my hand and the blue spot is following along nicely. Right, what happens when I move my hand over the control I want to animate? Absolutely nothing at all. I’ll put a breakpoint in the event handler just to ensure the event is firing. Yup. I hit the breakpoint no problem, but no matter what I do, the code doesn’t trigger the animation. My rectangle bounds checking is completely failing.

A cup of coffee later, and the answer hits me – and it’s blindingly obvious when it does. I’m returning the raw gesture camera coordinates to Huda, and I’m converting them into display coordinates when I display the blue blob. What I completely failed to do, however, was perform the same transformation in the user control. In order to test my theory, I’ll just copy the transformation code into the control and test it there. This isn’t going to be a long term fix, it’s just to prove my thinking is right. Good programming practice says you shouldn’t repeat yourself (we even give this the rather snappy acronym of DRY – Don’t Repeat Yourself), so I’ll move this code somewhere logical and let it be handled once. The logical thing for me to do is to return the transformed coordinates in the first place, so that’s what I’ll do.
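The transformation itself is nothing clever. A simplified sketch of the “do it once, in the pipeline” version looks like this – the camera frame size here is an assumption, purely for illustration:

// Map raw gesture-camera coordinates into display coordinates before the
// event is raised, so every subscriber sees the same coordinate space.
private const double CameraWidth = 320;   // assumed camera frame size
private const double CameraHeight = 240;

private Point ToDisplayCoordinates(float cameraX, float cameraY, double displayWidth, double displayHeight)
{
    return new Point(
        cameraX / CameraWidth * displayWidth,
        cameraY / CameraHeight * displayHeight);
}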

Run the application again and perform the same tests. Success. Now I have the histogram showing when I move my hand over it, and disappearing when I move my hand away from the control. Job done, and time to turn this into a control that my user controls can inherit from. When I do this, I’m going to create GestureEnter and GestureLeave events that follow a similar pattern to the Mouse and Touch events WPF developers are already familiar with.

One thing that I didn’t really get time to do last week was to start adding touch support beyond the standard “press the screen to open a photo”. That’s something I’m going to start remedying now, by adding pan and zoom to the photograph based on touch events; the same logic I use here will apply to gestures, so I’ll encapsulate it and just call it where I need to. Once I’ve got that in place, I think I’ll call it a night. WPF makes this astonishingly simple, so the code I’ll be using looks a lot like this:

<ScrollViewer HorizontalScrollBarVisibility="Auto" VerticalScrollBarVisibility="Auto">
  <Image
    x:Name="scaleImage"
    Source="{Binding PreviewImage}"
    HorizontalAlignment="Stretch"
    VerticalAlignment="Stretch"
    Stretch="Fill"
    IsManipulationEnabled="True"
    ManipulationStarting="scaleImage_ManipulationStarting_1"
    ManipulationDelta="scaleImage_ManipulationDelta_1"
    RenderTransformOrigin="0.5,0.5">
    <Image.RenderTransform>
      <MatrixTransform />
    </Image.RenderTransform>
  </Image>
</ScrollViewer>

Now, some people think that MVVM requires you to remove all code from the code behind. Far be it from me to say that they are wrong but, well, they are wrong. MVVM is about removing the code that doesn’t belong in the view, from the view. In other words, as the image panning and zooming relates purely to the view, it’s perfectly fine to put the logic into the code behind. So, let’s take advantage of the ManipulationDelta and ManipulationStarting events, which give us the ability to apply that ol’ pan and zoom mojo. It goes something like this:

private void scaleImage_ManipulationDelta_1(object sender, ManipulationDeltaEventArgs e)
{
    // Work against the image's current transform matrix.
    Matrix rectsMatrix = ((MatrixTransform)scaleImage.RenderTransform).Matrix;
    ManipulationDelta manipDelta = e.DeltaManipulation;
    Point rectManipOrigin = rectsMatrix.Transform(new Point(scaleImage.ActualWidth / 2, scaleImage.ActualHeight / 2));

    // Zoom about the centre of the image, then pan by the touch delta.
    rectsMatrix.ScaleAt(manipDelta.Scale.X, manipDelta.Scale.Y, rectManipOrigin.X, rectManipOrigin.Y);
    rectsMatrix.Translate(manipDelta.Translation.X, manipDelta.Translation.Y);
    ((MatrixTransform)scaleImage.RenderTransform).Matrix = rectsMatrix;
    e.Handled = true;
}

private void scaleImage_ManipulationStarting_1(object sender, ManipulationStartingEventArgs e)
{
    // Report manipulation deltas relative to the window itself.
    e.ManipulationContainer = this;
    e.Handled = true;
}

I’m not going to dissect this code too much but, suffice it to say the ScaleAt code is responsible for zooming the photo and the Translate code is responsible for panning it. Note that I could have easily added rotation if I’d wanted to, but free style rotation isn’t something I’m planning here.
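For completeness, if I did want free rotation it would only take one extra call – DeltaManipulation.Rotation is already in degrees, which is what Matrix.RotateAt expects. A hypothetical variant of the delta handler would look like this:

private void scaleImage_ManipulationDelta_WithRotation(object sender, ManipulationDeltaEventArgs e)
{
    Matrix matrix = ((MatrixTransform)scaleImage.RenderTransform).Matrix;
    ManipulationDelta delta = e.DeltaManipulation;
    Point origin = matrix.Transform(new Point(scaleImage.ActualWidth / 2, scaleImage.ActualHeight / 2));

    matrix.ScaleAt(delta.Scale.X, delta.Scale.Y, origin.X, origin.Y);
    matrix.RotateAt(delta.Rotation, origin.X, origin.Y);   // the extra line
    matrix.Translate(delta.Translation.X, delta.Translation.Y);

    ((MatrixTransform)scaleImage.RenderTransform).Matrix = matrix;
    e.Handled = true;
}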

Wednesday

Well, it’s a new day, and one of the features I want is to be able to retrieve the item that a finger is pointing at, based on its X/Y coordinates. As WPF doesn’t provide this method by default, it’s something that I’ll have to write myself. Fortunately, this isn’t that complicated a task and the following code should do nicely.

public static int GetIndexPoint(this ListBox listBox, Point point)
{
    // The point is expected in the same coordinate space as the ListBox's parent.
    int index = -1;
    for (int i = 0; i < listBox.Items.Count; ++i)
    {
        // Use the item container generator so this works whether the ListBox
        // holds ListBoxItems directly or is data bound.
        ListBoxItem item = listBox.ItemContainerGenerator.ContainerFromIndex(i) as ListBoxItem;
        if (item == null) continue;

        Point xform = item.TransformToVisual((Visual)listBox.Parent).Transform(new Point());
        Rect bounds = VisualTreeHelper.GetDescendantBounds(item);
        bounds.Offset(xform.X, xform.Y);
        if (bounds.Contains(point))
        {
            index = i;
            break;
        }
    }
    return index;
}

Tracking the mouse move, for instance, would look like this:

int lastItem = 0;
private Timer saveTimer; // System.Timers.Timer - Elapsed fires on a thread-pool thread

public MainWindow()
{
    InitializeComponent();
    this.MouseMove += new MouseEventHandler(MainWindow_MouseMove);
    saveTimer = new Timer();
    saveTimer.Interval = 2000;
}

void saveTimer_Elapsed(object sender, ElapsedEventArgs e)
{
    saveTimer.Stop();
    saveTimer.Elapsed -= saveTimer_Elapsed;
    // Elapsed fires off the UI thread, so marshal the selection change back.
    Dispatcher.BeginInvoke((Action)(() =>
    {
        selectyThing.SelectedIndex = lastItem;
    }), DispatcherPriority.Normal);
}

void MainWindow_MouseMove(object sender, MouseEventArgs e)
{
    int index = selectyThing.GetIndexPoint(e.GetPosition(this));
    if (index != lastItem)
    {
        // The pointer has settled on a new item; restart the two-second countdown.
        lastItem = index;
        saveTimer.Stop();
        saveTimer.Elapsed -= saveTimer_Elapsed;
        if (lastItem != -1)
        {
            saveTimer.Elapsed += saveTimer_Elapsed;
            saveTimer.Start();
        }
    }
}

Thursday

Today I’ve really been motoring, and managed to pull slightly ahead of my schedule, but I’ve also hit a bit of a gesture road block. So, I’ve updated the gesture code to retrieve multiple finger positions, as well as a single node. This actually works really well from a code point of view, and it’s great to see my “fingers” moving around on the screen. The problem is that one finger “wobbles” less than a hand, plus the item selection actually works better based on a single finger. At least I’ve got options now – and the gesture code is working well – plus, I’ve managed to hook up single selection of a list item based on my finger position. This allows the gesture code to mimic the touch code, something I’m going to explore more in week 3. My gesture selection code has now grown to this:

using Goldlight.Perceptual.Sdk.Events;
using Goldlight.Windows8.Mvvm;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace Goldlight.Perceptual.Sdk
{
    public class GesturePipeline : AsyncPipelineBase
    {
        private WeakEvent<GesturePositionEventArgs> gestureRaised = new WeakEvent<GesturePositionEventArgs>();
        private WeakEvent<MultipleGestureEventArgs> multipleGesturesRaised = new WeakEvent<MultipleGestureEventArgs>();
        private WeakEvent<GestureRecognizedEventArgs> gestureRecognised = new WeakEvent<GestureRecognizedEventArgs>();

        public event EventHandler<GesturePositionEventArgs> HandMoved
        {
            add { gestureRaised.Add(value); }
            remove { gestureRaised.Remove(value); }
        }

        public event EventHandler<MultipleGestureEventArgs> FingersMoved
        {
            add { multipleGesturesRaised.Add(value); }
            remove { multipleGesturesRaised.Remove(value); }
        }

        public event EventHandler<GestureRecognizedEventArgs> GestureRecognized
        {
            add { gestureRecognised.Add(value); }
            remove { gestureRecognised.Remove(value); }
        }

        public GesturePipeline()
        {
            EnableGesture();

            searchPattern = GetSearchPattern();
        }

        private int ScreenWidth = 1024;
        private int ScreenHeight = 960;

        public void SetBounds(int width, int height)
        {
            this.ScreenWidth = width;
            this.ScreenHeight = height;
        }

        public override void OnGestureSetup(ref PXCMGesture.ProfileInfo pinfo)
        {
            // Limit how close we have to get.
            pinfo.activationDistance = 75;
            base.OnGestureSetup(ref pinfo);
        }

        public override void OnGesture(ref PXCMGesture.Gesture gesture)
        {
            if (gesture.active)
            {
                var handler = gestureRecognised;
                if (handler != null)
                {
                    handler.Invoke(new GestureRecognizedEventArgs(gesture.label.ToString()));
                }
            }
            base.OnGesture(ref gesture);
        }

        public override bool OnNewFrame()
        {
            // We only query the gesture if we are connected. If not, we shouldn't
            // attempt to query the gesture.
            try
            {
                if (!IsDisconnected())
                {
                    var gesture = QueryGesture();
                    PXCMGesture.GeoNode[] nodeData = new PXCMGesture.GeoNode[6];
                    PXCMGesture.GeoNode singleNode;
                    searchPattern = PXCMGesture.GeoNode.Label.LABEL_BODY_HAND_PRIMARY;
                    var status = gesture.QueryNodeData(0, searchPattern, nodeData);

                    if (status >= pxcmStatus.PXCM_STATUS_NO_ERROR)
                    {
                        var handler = multipleGesturesRaised;

                        if (handler != null)
                        {
                            List<GestureItem> gestures = new List<GestureItem>();
                            foreach (var node in nodeData)
                            {
                                float x = node.positionImage.x - 85;
                                float y = node.positionImage.y - 75;

                                GestureItem item = new GestureItem(x, y, node.body.ToString(), ScreenWidth, ScreenHeight);
                                gestures.Add(item);
                            }

                            // Raise the event once, with the full set of finger positions.
                            handler.Invoke(new MultipleGestureEventArgs(gestures));
                        }
                    }

                    status = gesture.QueryNodeData(0, GetSearchPattern(), out singleNode);
                    if (status >= pxcmStatus.PXCM_STATUS_NO_ERROR)
                    {
                        var handler = gestureRaised;
                        if (handler != null)
                        {
                            handler.Invoke(new GesturePositionEventArgs(singleNode.positionImage.x,
                                singleNode.positionImage.y, singleNode.body.ToString(), ScreenWidth, ScreenHeight));
                        }
                    }
                }
                Sleep(20);
            }
            catch (Exception)
            {
                // Error handling here...
            }
            return true;
        }

        private readonly object SyncLock = new object();

        private void Sleep(int time)
        {
            lock (SyncLock)
            {
                Monitor.Wait(SyncLock, time);
            }
        }

        private PXCMGesture.GeoNode.Label searchPattern;

        private PXCMGesture.GeoNode.Label GetSearchPattern()
        {
            return PXCMGesture.GeoNode.Label.LABEL_BODY_HAND_PRIMARY |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_INDEX |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_MIDDLE |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_PINKY |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_RING |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_THUMB |
                PXCMGesture.GeoNode.Label.LABEL_HAND_FINGERTIP;
        }
    }
}

I’ve hooked into the various events from here to handle the type of interactions we are used to seeing from mouse and touch. Again, there’s a bit of repeated code in here, but I’ll be converting this into WPF behaviors which you will be able to use in your own projects. I’ve said it before, I love writing Blend Behaviors – they really make your life easy. As a personal plea to Microsoft; if you want to get people to write for WinRT, give us the features we’re used to in XAML right now. The fact that I can’t use Blend Behaviors in Windows 8 “Modern” applications is one more barrier in the way of me actually writing for the app store.
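To give you an idea of the shape that Behavior will take, here’s a skeleton sketch – the class name and wiring are illustrative rather than the finished code, but the Behavior<T> plumbing is the standard System.Windows.Interactivity pattern:

using System.Windows;
using System.Windows.Interactivity;

public class GestureOverBehavior : Behavior<FrameworkElement>
{
    protected override void OnAttached()
    {
        base.OnAttached();
        // Wait for Loaded so the shared pipeline is ready before we subscribe.
        AssociatedObject.Loaded += OnLoaded;
    }

    protected override void OnDetaching()
    {
        AssociatedObject.Loaded -= OnLoaded;
        base.OnDetaching();
    }

    private void OnLoaded(object sender, RoutedEventArgs e)
    {
        // Hook the shared gesture pipeline here and translate its events into
        // GestureEnter/GestureLeave-style notifications for the attached element.
    }
}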

Friday

Following the big strides forward I made with the gesture code yesterday, I’m going to have an easier day today and hook up the filter code. A lot of the time here will just be putting the plumbing in place to actually display the filters and select them. Initially, I’m only going to offer a few filters – I’ll add more next week. The key thing here is to prove that I can do this easily. I’m not spending too much time on the UI for the filters right now, but I will be adding to it in the near future. The point is that, while the interface is quite spartan right now, beefing it up isn’t a major task for a seasoned WPF developer. The key thing I have in place is that all the main interface areas are solid black, with a glowy border around them. Once I’ve added the filters in and hooked them up, I think I’ll post a picture or two to show the before and after filter behaviour of Huda.

This screenshot is taken immediately after the very posh sheep photo was loaded:

If you look carefully at the bottom of the image, you’ll see the very faint Histogram view that I was talking about earlier. I’ve made this part of the interface transparent – gesture over it, or touch it and it becomes opaque. I’ll be doing something similar with other operations as well – the photo should be the view point, not the chrome such as the available filters.

I’ll apply the channel filter and blur like this:

Well, they have applied quite nicely. I’ll add other filters and the ability to set their values next week.

This weeks music
  • Deep Purple – Purpendicular
  • AC/DC – Back in Black
  • Andrea Bocelli – Sacred Arias
  • Muse – Black Holes and Revelations
  • Chickenfoot – Chickenfoot III
  • Jethro Tull – Aqualung and Thick As A Brick

End of week 2

Well here we are. Another week of developing and blogging. Where do I think I’m at right now? In some ways, I’m ahead of schedule but in other ways I feel I’m behind schedule. Oh, the code will be finished on time, and the application will do what I set out to do, but when I look at what my competitors are doing I can’t help feeling that I could do so much more with this. The big problem, of course, is that it’s easy to get carried away with “cool” features but if they don’t add anything to the application, then they are worthless. While the things that the other teams are doing make sense in the problem domains they are using, some of these features really don’t make sense in a photo editor. Perhaps I could have been more ambitious with what I wanted to deliver at the start, but overall I’m happy with what I’m going to deliver and what is coming up.

As a corollary to this, have you checked out what my fellow contestants are planning? Please do so and comment on their posts. They are all doing some amazing things, and I would take it as a personal favour if you would take the time to offer your encouragement to them.

By the end of week 3, I fully intend to have the whole central premise linked in. If you remember, my statement for Huda was that edits should be non-destructive. Well, they aren’t destructive, but they aren’t saved anywhere right now either. That’s what I need to hook in next week. The edits must be saved away, and the user must be able to edit them. If I haven’t delivered this, then I’ve failed to deliver on the promise of the application. Fortunately, I already know how I want to go about the save, and it doesn’t touch too much of the code that’s already in place, so it should be a nice and natural upgrade.

While I’ve thanked my fellow competitors, I haven’t taken the time to thank the judges and our support team (primarily the wonderful Bob Duffy and Wendy Boswell; what does the X stand for in your name?). There’s a real sense of camaraderie amongst the competitors, more so than I would ever have expected in a contest. We are all prepared to give our time and our thoughts to help the other teams – ultimately, we were all picked because of our willingness to share with a wider audience, so there’s a natural symbiosis in us sharing with each other. Personalities are starting to come to the fore, and that is wonderful. Anyway, back to the judges (sorry for the digression). Come Wednesday, I hit refresh on the browser so many times waiting for their comments. Surprisingly enough, for those who know me, this is not an ego issue. I’m genuinely interested to read what the judges have to say. This is partly because I want to see whether they think the application is still worth their interest (and yes Sascha, I know precisely why Ctrl-Z is so important in photo apps, given the number of disastrous edits I’ve made in the past). The main reason, though, is that I want to see that I’m pitching the blog posts at the right level. As someone who is used to writing articles that explain code listings, I want to make sure that I don’t make you nod off if you aren’t a coder, but at the same time I have a duty of care towards my fellow developers to try and educate them. So, to the judges, as well as Bob and Wendy: thank you for your words and support.

You may have noticed that I haven’t posted any YouTube videos this week. There’s a simple reason for that. Right now, Huda is going through rapid changes every day. It’s hard to pick a point where I can say “yes, stopping and videoing at this point makes sense”, because I know that if I wait until the following day the changes to the application would make as much, or even more, sense. My gut feeling, right now, is that the right time to make the next video is when we can start reordering the filters.

One thing that was picked up on by the judges was my posting the music I was listening to. You may be wondering why I did that. There was a reason for it, and it wasn’t just “a bit of daft” on my part. Right now, this contest is a large part of my life. I’m devoting a lot of time to it – this isn’t a complaint, as I love writing code – but you can go just the slightest bit insane if you don’t have some background noise. I can’t work with the TV on, but I can code to music, so I thought it might be interesting for the non-coders to know how at least one coder hangs on to that last shred of what he laughingly calls his sanity. Last week, I didn’t manage to pick up my guitar, so I am grateful that I managed to get 40 minutes or so to just noodle around this week. My down time is devoted to my family, including our newest addition Harvey, and that is the time I don’t even look at a computer. Expect to see more of Harvey over the next few weeks.

Harvey in all his glory.

So, thanks for reading this. I hope you reached this point without nodding off, and the magic word for this week is Wibble. We’ll see which judges manage to get it into their posts – that’ll show who’s paying attention. Until next week, bye.

Ultimate Coder: Going Perceptual – Week 1 Blog Posting.

This is a copy of the post I made on the Intel site here. For the duration of the contest, I am posting a weekly blog digest of my progress with using the Perceptual Computing items. The first week’s post is really a scene setter where I explain how I got to this point, and details bits and pieces about the app I intend to build. My fellow competitors are planning to build some amazing applications, so I’d suggest that you head on over to the main Ultimate Coder site and follow along with their progress as well. I’d like to take this opportunity to wish them good luck.

A couple of months ago, I was lucky enough to be asked if I would like to participate in an upcoming Intel® challenge, known at the time as Ultimate Coder 2. Having followed the original Ultimate Coder competition, I was highly chuffed to even be considered. I had to submit a proposal for an application that would work on a convertible Ultrabook™ and would make use of something called Perceptual Computing. Fortunately for me, I’d been inspired a couple of days earlier to write an application and describe how it was developed on CodeProject – my regular hangout for writing about things that interest me and that haven’t really been covered much by others. This seemed to me to be too good an opportunity to resist; I did some research on what Perceptual Computing was and decided I’d write the application to incorporate the features that I thought would be a good match. As a bonus, I’d get to write about this and, as I like giving code away, I’d publish the source to the actual Perceptual Computing side as a CodeProject article at the end of the competition.

Okay, at this point, you’re probably wondering what the application is going to be. It’s a photo editing application, called Huda, and I bet your eyes just glazed over at that point because there are a bazillion and one photo editing applications out there and I said this was going to be original. Right now you’re probably wondering if I’ve ever heard of Photoshop® or Paint Shop Pro®, and you possibly think I’ve lost my mind. Bear with me though, I did say it would be different and I do like to try and surprise people. 

A slight sidebar here. I apologise in advance if my assault on the conventions of the English language becomes too much for you. Over the years I’ve developed a chatty style of writing, and I will slip backwards and forwards between the first and third person as needed to illustrate a point – when I get into the meat of the code that you need to write to use the frameworks, I will use the third person.

So what’s so different about Huda? Why do I think it’s worth writing articles about? In traditional photo editing applications, when you change the image and save it, the original image is lost (I call this a destructive edit) – Fireworks did offer something of this ability, but only if you worked in .png format. In Huda, the original image isn’t lost because the edits aren’t applied to it – instead, the edits are effectively kept as a series of commands that can be replayed against the image, which gives the user the chance to come back to a photo months after they last edited it and do things like insert filters between others, or rearrange and delete filters. The bottom line is, whatever edits you apply, you can still open your original image. Huda will, however, provide the ability to export the edited image so that everyone can appreciate your editing genius.
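If you like to think in code, the heart of that idea is nothing more exotic than a replayable list of edit commands. Something along these lines – the names here are illustrative, not Huda’s actual types:

using System.Collections.Generic;
using System.Drawing;

public interface IEditCommand
{
    Bitmap Apply(Bitmap source);
}

public class EditHistory
{
    private readonly List<IEditCommand> edits = new List<IEditCommand>();

    public void Add(IEditCommand edit) { edits.Add(edit); }
    public void InsertAt(int index, IEditCommand edit) { edits.Insert(index, edit); }
    public void RemoveAt(int index) { edits.RemoveAt(index); }

    // The original file is never touched; the edits are simply replayed over it.
    public Bitmap Render(Bitmap original)
    {
        Bitmap current = original;
        foreach (var edit in edits)
        {
            current = edit.Apply(current);
        }
        return current;
    }
}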

At this stage, you should be able to see where I’m going with Huda (BTW it’s pronounced Hooda), but what really excited me was the ability to offer alternative editing capabilities to users. This, to me, has the potential to really open up the whole photo editing experience for people, and to make it accessible beyond the traditional mouse/keyboard/digitizer inputs. After all, we now have the type of hardware available to us that we used to laugh at in Hollywood films, so let’s develop the types of applications that we used to laugh at. In fact, I’ve turned to Hollywood for ideas because users have been exposed to these ideas already and this should help to make it a less daunting experience for users.

Why is this learning curve so important? Well, to answer this, we really need to understand what I think Perceptual Computing will bring to Huda. You might be thinking that Perceptual Computing is a buzz phrase, or marketing gibberish, but I really believe that it is the next big thing for users. We have seen the first wave of devices that can do this with the Wii and then the XBox/Kinect combination, and people really responded to this, but these stopped short of what we can achieve with the next generation of devices and technologies. I’ll talk about some of the features that I will be fitting into Huda over the next few weeks and we should see why I’m so excited about the potential and, more importantly, what I think the challenges will be.

Touch computing. Touch is an important feature that people are used to already, and while this isn’t being touted in the Perceptual Computing SDK, I do feel that it will play a vital part in the experience for the user. As an example, when the user wants to crop an image, they’ll just touch the screen where they want to crop to – more on this in a minute because this ties into another of the features we’ll use. Now this is all well and good but we can do more, perhaps we can drag those edits round that we were talking about to reorder them. But wait, didn’t we say we want our application to be more Hollywoody? Well, how about we drag the whole interface around? Why do we have to be constrained for it to look like a traditional desktop application? Let’s throw away the rulebook here and have some fun.

Gestures. Well, touch is one level of input, but gestures take us to a whole new level. Whatever you can do with touch, you really should be able to do with gesture, so Huda will mimic touch with gestures, but that’s not enough. Touch is 2D, and gestures are 3D, so we really should be able to use that to our advantage. As an example of what I’ll be doing with this – you’ll reach towards the screen to zoom in, and pull back to zoom out. The big challenge with gestures will be to provide visual cues and prompts to help the user, and to cope with the fact that gestures are a bit less accurate. Gestures are the area that really excite me – I really want to get that whole Minority Report feel and have the user drag the interface through the air. Add some cool glow effects to represent the finger tips and you’re well on the way to creating a compelling user experience.

Voice. Voice systems aren’t new. They’ve been around for quite a while now, but their potential has remained largely unrealised. Who can forget Scotty, in Star Trek, picking up a mouse and talking to it? Well, voice recognition should play a large part in any Perceptual system. In the crop example, I talked about using touch, or gestures, to mark the cropping points; well, at this point your hands are otherwise occupied, so how do you actually perform the crop? With a Perceptual system, you merely need to say “Crop” and the image will be cropped to the crop points. In Huda, we’ll have the ability to add a photographic filter merely by issuing a command like “Add Sepia”. In playing round with the voice code, I have found that while it’s incredibly easy to use this, the trick is to really make the commands intuitive and memorable. There are two ways an application can use voice; either dictation or command mode. Huda is making use of command mode because that’s a good fit. Interestingly enough, my accent causes problems with the recognition code, so I’ll have to make sure that it can cope with different accents. If I’d been speaking with a posh accent, I might have missed this.

A feature that I’ll be evaluating for usefulness is the use of facial recognition. An idea that’s bubbling around in my mind is having facial recognition provide things like different UI configurations and personalising the most recently used photos depending on who’s using the application. The UI will be fluid, in any case, because it’s going to cope with running as a standard desktop, and then work in tablet mode – one of the many features that makes Ultrabooks™ cool.

So, how much of Huda is currently built? Well, in order to keep a level playing field, I only started writing Huda on the Friday at the start of the competition. Intel were kind enough to supply a Lenovo® Yoga 13 and a Gesture Camera to play around with, and I’ve spent the last couple of weeks getting up to speed with the Perceptual SDK. Huda is being written in WPF because this is a framework that I’m very comfortable in and I believe that there’s still a place for desktop applications, plus it’s really easy to develop different types of user interfaces, which is going to be really important for the application. My aim here is to show you how much you can accomplish in a short space of time, and to provide you with the same functionality at the end as I have available. This, after all, is what I like doing best. I want you to learn from my code and experiences, and really push this forward to the next level. Huda isn’t the end point. Huda is the starting point for something much, much bigger.

Final thoughts. Gesture applications shouldn’t be hard to use, and the experience of using them should be easily discoverable. I want the application to let the user know what’s right, and to be intuitive enough to use without having to watch a 20 minute getting started video. It should be familiar and new at the same time. Hopefully, by the end of the challenge, we’ll be in a much better position to create compelling Perceptual applications, and I’d like to thank Intel® for giving me the chance to try and help people with this journey. And to repay that favour, I’m making the offer that you will get access to all the perceptual library code I write.

Altering my perception

My apologies for not posting for a while; it’s been a pretty crazy couple of months and it’s just about to get a whole lot crazier. For those who aren’t aware, Intel® have started running coder challenges where they get together people who are incredibly talented and very, very certifiable and issue them with a challenge. Last year they ran something called the Ultimate Coder which looked to find, well, the ultimate coder for creating showcase Ultrabook™ applications. This competition proved so successful, and sparked such interest from developers that Intel® are doing it again, only crazier.

So, Ultimate Coder 2 is about to kick off, and like The Wrath of Khan, it proves that sequels can be even better than the original. The challenge this time is to create applications that make use of the next generation of Ultrabook™ features to cope with going “full tablet”, and as if that wasn’t enough, the contestants are being challenged to create perceptual applications. Right now I bet two questions are going through your mind; first of all, why are you writing about this Pete, and secondly, what’s perceptual computing?

The answer to the first question lies in the fact that Intel® have very kindly agreed to accept me as a charity case developer in the second Ultimate Coder challenge (see, I can do humble – most of the time I just choose not to). The second part is a whole lot more fun – suppose you want to create applications that respond to touch, gestures, voice, waving your hands in the air, moving your hands in and out to signify zoom in and out, or basically just about the wildest UI fantasies you’ve seen coming out of Hollywood over the last 30 years – that’s perceptual computing.

So, we’ve got Lenovo Yoga 13 Ultrabooks™ to develop the apps on, and we’ve got Perceptual Camera and SDK to show off. We’ve also got 7 weeks to create our applications, so it’s going to be one wild ride.

It wouldn’t be a Pete post without some source code though, so here’s a little taster of how to write voice recognition code in C# with the SDK.

public class VoicePipeline : UtilMPipeline
{
    private List<string> cmds = new List<string>();

    public event EventHandler<VoiceEventArgs> VoiceRecognized;

    public VoicePipeline() : base()
    {
        EnableVoiceRecognition();

        // The commands we want the SDK to listen for (command mode).
        cmds.Add("Filter");
        cmds.Add("Save");
        cmds.Add("Load");
        SetVoiceCommands(cmds.ToArray());
    }

    public override void OnRecognized(ref PXCMVoiceRecognition.Recognition data)
    {
        var handler = VoiceRecognized;
        if (data.label >= 0 && handler != null)
        {
            // data.label is the index of the matched command.
            handler(this, new VoiceEventArgs(cmds[data.label]));
        }
        base.OnRecognized(ref data);
    }

    public async void Run()
    {
        // Pump the pipeline on a background thread so the UI stays responsive.
        await Task.Run(() => { this.LoopFrames(); });
        this.Dispose();
    }
}

As the contest progresses, I will be posting both here on my blog, and a weekly report on the status of my application on the Intel® site. It’s going to be one wild ride.