
Archive for the ‘Ultimate Coder’ Category

Ultimate Coder – Ma il Mio Mistero è Chiuso In Me

March 12, 2013

This is a copy of the post I made on the Intel site here. For the duration of the contest, I am posting a weekly blog digest of my progress with using the Perceptual Computing items. This week's post shows how Huda has evolved from the application that was created at the end of the third week. My fellow competitors are planning to build some amazing applications, so I'd suggest that you head on over to the main Ultimate Coder site and follow along with their progress as well. I'd like to take this opportunity to wish them good luck.

Executive summary

The judges will be pleased to know that this blog entry is a lot less verbose than previous ones. This isn’t because there’s not a lot to talk about, but rather because I’m concentrating on writing a lot of code this week, and I’ve hit a real roadblock with regards to the gesture camera. Code that was working perfectly well last week has suddenly stopped working. This means that I’ve got some serious debugging to do here to find out why the gesture code is no longer displaying the cursor on the screen – this doesn’t happen all the time, but it’s happened often enough for me to be concerned that there’s an underlying problem there that needs to be sorted out.

As part of my remit is to help coders learn to write code for the Perceptual SDK, I will be continuing writing about the coding process, so I apologise to the judges in advance – there’s nothing I can do to alleviate the detail of explanation. If they wish to skip ahead, the summary talks about where we are today and what my plans are for next week, and here are some screen shots. Unlike my previous blogs, I’m not going to be breaking this one down on a day by day basis. There’s a lot that needs to be done, and it doesn’t break down neatly into days.

For those who want to know the result, don’t look away now. I found and fixed the issue with the gesture code. It’s all a matter of patient debugging and narrowing down areas where there can possibly be problems.

I’ve been wondering how best to train people in how to use Huda, and this week I’ve hit on the solution. I’m going to let Huda teach you. By the end of the week, I added in voice synthesis, and I think I can use this to help provide context. The type of thing I have in mind is, say you bring up the filters, Huda will tell you that a swipe left or right will cycle through the filters, and the thumbs up will add the filter to the currently selected image. The beauty is, now that I have the code in place to do the voice synthesis, I can easily add this type of context.

Voice recognition is an area of the Perceptual SDK that has a long way to go. It just isn't accurate enough and, for that reason, I'm dropping support for it from Huda. I had the commands "Open Filter" and "Close Filter" in the application, and no matter what I tried, it kept turning Open Filter into Close Filter. Yes, it recognises a few words, but given that the accuracy is so hit and miss, I can't leave it in there. So, I apologise to you right now – voice control is out. I had to take this decision this week because there are certain things I was using it for that I now have to accomplish in other ways, and I really don't have the time to keep mucking around with this.

If you don’t want to read the rest of the post, this is what Huda looks like now:

Woe to the interface

It’s going to be a busy week this week. The more I play with the interface, the more I find it clunky and unpolished, and this is incredibly frustrating. While I know that I’m a long way from polishing the interface, I can’t help but think that I’m missing a trick or two here. The biggest thing that I have issues with is the concept of showing the folders and selecting the images. So, the first thing I’m going to do is to unify the two parts; this should make it obvious that the picture selection and folder selection actually belong together. Part of my thinking here is that I want it to be apparent to anyone who’s used a photo editing application before that this is still a photo editing application. While it’s fine to play about with an interface, there still has to be a certain level of familiarity; there has to be a feeling of comfort in using the application otherwise the user will end up being overwhelmed and will give up on the application. While the contest is about working with Perceptual computing and features of the ultrabook, it’s important that I don’t lose sight of the fact that this must be an application that people can use.

Once I’ve got this in place, I want to address the selection of “things to do to the photograph”. By this, I mean that I want to put some form of toolbar into place to trigger options. Don’t worry that I’m throwing sections of the UI together – I’m not. What I will be doing here is giving the user obvious trigger points – I can use this to bridge the gap between traditional photo editing applications and Huda. There has to be some familiarity for the user here, so I’ll be providing that today.

You may think that this is going to take a lot of work, and in other languages you may well be right, but I’m using MVVM and WPF. As the different parts of the interface are loosely coupled controls, it’s a simple matter for me to move the controls into other containers, and they will still work. The only change I’m going to make is to keep the images showing regardless of whether or not the user has opened a folder.

With the basics of the “toolbar” in place, I now have something that feels more natural for using gestures with. I’m going to hook the filter selection into both touch and gestures, and do something a little bit fancy for displaying the list of filters. What I’ll do is move the filters into a carousel, and use the swipe left/swipe right to move backwards and forwards through the list.
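To give an idea of what that hookup looks like, here's a rough sketch of the kind of handler I have in mind, using the GestureRecognized event you'll see later in this post. The property name on the event args, the exact gesture label strings, and the SelectedFilterIndex/Filters/ApplySelectedFilter members are assumptions for illustration:

// Rough sketch: map recognised gestures onto the filter carousel. The gesture label
// values and the Gesture property name are illustrative assumptions.
private void GesturePipeline_GestureRecognized(object sender, GestureRecognizedEventArgs e)
{
    Dispatcher.BeginInvoke((Action)(() =>
    {
        if (e.Gesture.Contains("SWIPE_LEFT"))
        {
            // Move forwards through the carousel, wrapping at the end.
            SelectedFilterIndex = (SelectedFilterIndex + 1) % Filters.Count;
        }
        else if (e.Gesture.Contains("SWIPE_RIGHT"))
        {
            // Move backwards, wrapping at the start.
            SelectedFilterIndex = (SelectedFilterIndex - 1 + Filters.Count) % Filters.Count;
        }
        else if (e.Gesture.Contains("THUMB_UP"))
        {
            ApplySelectedFilter();
        }
    }));
}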

I’m concerned about the speed of loading of various items when thingsare selected, so I will be moving things out onto background threads.

The interface is really starting to firm up now. It’s starting to feel that little bit more polished to me, but there is more that I can do with it.

Don’t worry that all the filters have the same name and appearance right now. This is purely a visual that will be changed soon – I need to come up with some good iconography for these. It’s strangely satisfying swiping your hand left and right to rotate through this list. My final action on this end will be to introduce a gesture that actually applies the filter. The thumbs up seems oddly appropriate to me.

The big reveal

After a frenzy of activity on the actual interface, I want to address one of the central concepts of Huda; namely, the ability to filter the image, and not touch the underlying image. This is where I give up the secret of Huda – it’s all a massive con. If you remember, I promised that the underlying image would not be touched. I’ve had people get in touch with all sorts of wild theories on how I was going to do this – from the person who thought I was going to use steganography to hide this information in the picture to someone who thought I was going to convert the file into a custom format and put something in place to fool image editors into thinking that the type was still the original file format.

If I were to tell you that the solution to this problem was incredibly straightforward, and blindingly obvious, would this surprise you? Well, here it is. Here’s how it’s done…. Did you catch that? Yes, it’s all smoke and mirrors. What Huda does is actually not modify the original image at all – instead, we maintain an index of all the files that we’ve edited in Huda, along with the details of all the edits that have been performed, and these can be played again and again and again. In other words, when you open a file up in Huda, it goes to the index first to see if you’ve previously opened it. If you have, it gets the image transformations and reapplies them there and then. It’s as simple as that. Of course, this does rely on you not moving photos around, but I do have some plans in place for a post contest version of Huda to take care of that as well. Anyone for hashing?
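Stripped right back, the index is little more than a dictionary keyed on the file path, holding the ordered list of edits. Something along these lines gives the idea – this is a simplified illustration rather than the exact Huda code, using the IImageTransform interface you'll meet in the next section:

using System.Collections.Generic;
using System.Windows.Media.Imaging;

namespace Huda.Transformations
{
    // Simplified illustration of the edit index: the original file is never touched;
    // we just replay the recorded transformations whenever the photo is opened.
    public class EditIndex
    {
        private readonly Dictionary<string, List<IImageTransform>> edits =
            new Dictionary<string, List<IImageTransform>>();

        public void Record(string path, IImageTransform transform)
        {
            List<IImageTransform> list;
            if (!edits.TryGetValue(path, out list))
            {
                list = new List<IImageTransform>();
                edits[path] = list;
            }
            list.Add(transform);
        }

        // Reapply every recorded edit, in order, to the freshly loaded image.
        public BitmapSource Replay(string path, BitmapSource original)
        {
            List<IImageTransform> list;
            if (!edits.TryGetValue(path, out list)) return original;

            BitmapSource current = original;
            foreach (var transform in list)
            {
                current = transform.Transform(current);
            }
            return current;
        }
    }
}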

Transformers – Robert’s in disguise

A judgely warning – this section is going to be intense. I’m sorry, but there’s no way around the fact that we need to write some code, and there needs to be some explanation of it. It’s going to be quite dry, so I won’t blame you if you want to skip to the bit at the end.

As a first pass for the transformations that I want to apply, I decided that I needed to declare a common interface that all my transformation classes would implement. It's a very simple interface, but as I want all my transformations to be serializable, I've made sure that it inherits from ISerializable by default:

using System.Windows.Media.Imaging;
using System.Runtime.Serialization;
namespace Huda.Transformations
{
    public interface IImageTransform : ISerializable
    {
        int Version { get; }
        BitmapSource Transform(BitmapSource source);
    }
}

At this point it’s merely a matter of creating some simple transformations that use this functionality. Being the nice chap that I am, and to give Nicole some coding kudos that she can drop into casual conversation, I’ll show you all the classes I’m using for transformation. First of all, here’s the image crop routine:

using System.Windows.Media.Imaging;
using System.Windows;
using System.Runtime.Serialization;
using System;
namespace Huda.Transformations
{
    [Serializable]
    public class CropImage : IImageTransform
    {
        public CropImage()
        {
            Version = 1;
        }
        protected CropImage(SerializationInfo info, StreamingContext context) : this()
        {
            Version = info.GetInt32(Constants.Version);
            Left = info.GetInt32(Constants.Left);
            Top = info.GetInt32(Constants.Top);
            Width = info.GetInt32(Constants.Width);
            Height = info.GetInt32(Constants.Height);
        }
        public int Version
        {
            get;
            private set;
        }
        public int Left { get; set; }
        public int Top { get; set; }
        public int Width { get; set; }
        public int Height { get; set; }
        public BitmapSource Transform(BitmapSource source)
        {
            Int32Rect rect = new Int32Rect(Left, Top, Width, Height);
            source = new CroppedBitmap(source, rect);
            return source;
        }
        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            info.AddValue(Constants.Version, Version);
            info.AddValue(Constants.Left, Left);
            info.AddValue(Constants.Top, Top);
            info.AddValue(Constants.Width, Width);
            info.AddValue(Constants.Height, Height);
        }
    }
}

There’s a lot of code in here that you’ll see in other classes, so I’ll just explain it once and you should easily be able to follow how I’m using it in other locations. The first thing to notice is that I’ve marked this class as Serializable – you need to do that if you want .NET to do it’s mojo when you save things. You’ll see that there are two things inside the class that say CropImage; these are the constructors that are used to create the object. The second one is the special constructor, and is the reason that I had IImageTransform implemenent ISerializable. .NET knows, when it sees this interface, that you want to read and write out the values in the class yourself. There are many, many reasons that I want to do this, but the main reason is because of what happens if you just let the runtime try and figure this stuff out itself – it’s not very performant because of the way it has to map things together. By taking control of this myself, I make the job easy. If you see how I use info.GetInt32 (there are many other methods for reading other types of data), it’s apparent what type of data the property is – we must always make sure we get it right, otherwise unpredictable and nasty things can happen.

At the bottom of the code, there's a matching method called GetObjectData that just writes the items to the serialization stream. It's important to make sure that you use the same names for the items in the constructor that you did in this method. The property names don't have to match these serialization names, but the names used in the constructor and in GetObjectData do have to match each other.

The Version currently does nothing, but if I have to add features to any of these classes, I can use this to work out which properties should be present, so the saving and reading of the data carries on seamlessly. It’s always a good idea to design your serialization with this firmly in mind.
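As a purely hypothetical example, if a version 2 of CropImage gained an Angle property, the deserialization constructor could cope with both old and new saves like this:

protected CropImage(SerializationInfo info, StreamingContext context) : this()
{
    Version = info.GetInt32(Constants.Version);
    Left = info.GetInt32(Constants.Left);
    Top = info.GetInt32(Constants.Top);
    Width = info.GetInt32(Constants.Width);
    Height = info.GetInt32(Constants.Height);
    // A version 1 stream simply won't contain the newer value, so only read it
    // when the stored version tells us it is there. Angle is hypothetical here.
    if (Version >= 2)
    {
        Angle = info.GetDouble("Angle");
    }
}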

The Left, Top, Width and Height properties must be set before we attempt to call Transform. If we forget to set them, all that will happen is that the rectangle that’s created will be 0 pixels in size, and we wouldn’t want a zero pixel crop.

The Transform method is where the “clever” stuff actually happens. I bet you thought there would be more to it. All it does is create a rectangle based on the size and location set in the properties, and then it uses the inbuilt CroppedBitmap class to actually perform the crop.

See! I told you that was easy. Next, I’ll show you what the FlipImage transformation looks like:

using System;
using System.Windows.Media.Imaging;
using System.Windows.Media;
using System.Runtime.Serialization;
namespace Huda.Transformations
{
    [Serializable]
    public class FlipImage : IImageTransform
    {
        public FlipImage()
        {
            Version = 1;
        }
        protected FlipImage(SerializationInfo info, StreamingContext context)
        {
            Version = info.GetInt32(Constants.Version);
            HorizontalFlip = info.GetBoolean(Constants.Horizontal);
        }
        public int Version { get; private set; }
        public bool HorizontalFlip { get; set; }
        public BitmapSource Transform(BitmapSource source)
        {
            source = new TransformedBitmap(source, new ScaleTransform(HorizontalFlip ? -1 : 1, HorizontalFlip ? 1 : -1));
            return source;
        }
        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            info.AddValue(Constants.Version, Version);
            info.AddValue(Constants.Horizontal, HorizontalFlip);
        }
    }
}

I won’t explain this code, as it’s largely the same as the CropImage code. If you’re interested why I didn’t put things like the Version and Transform into an abstract base class, please let me know and I’ll cover attempt to answer it in the comments. At this point, the judges are screaming for mercy and begging me to stop, so I’ll try not go too far into philosophical architectural debates.

The ResizeImage transformation is an interesting one (well, interesting if you're into that type of thing), and the Transform code is more complex than the other transformations. To save you having to fight with this yourself, here it is:

using System;
using System.Windows.Media.Imaging;
using System.Windows.Media;
using System.Windows;
using System.Runtime.Serialization;
namespace Huda.Transformations
{
    [Serializable]
    public class ResizeImage : IImageTransform
    {
        public ResizeImage()
        {
            Version = 1;
        }
        protected ResizeImage(SerializationInfo info, StreamingContext context)
            : this()
        {
            Version = info.GetInt32(Constants.Version);
            Width = info.GetInt32(Constants.Width);
            Height = info.GetInt32(Constants.Height);
        }
        public int Version
        {
            get;
            private set;
        }
        public int Width { get; set; }
        public int Height { get; set; }
        public BitmapSource Transform(BitmapSource source)
        {
            Rect rect = new Rect(0, 0, Width, Height);
            DrawingVisual visual = new DrawingVisual();
            using (DrawingContext ctx = visual.RenderOpen())
            {
                ctx.DrawImage(source, rect);
            }
            RenderTargetBitmap resized = new RenderTargetBitmap(
                Width, Height, 96, 96, PixelFormats.Default
                );
            resized.Render(visual);
            return resized;
        }
        public void GetObjectData(SerializationInfo info, StreamingContext context)
        {
            info.AddValue(Constants.Version, Version);
            info.AddValue(Constants.Width, Width);
            info.AddValue(Constants.Height, Height);
        }
    }
}

The constants we’d use

namespace Huda.Transformations
{
    public class Constants
    {
        public const string Left = "Left";
        public const string Right = "Right";
        public const string Top = "Top";
        public const string Bottom = "Bottom";
        public const string Width = "Width";
        public const string Height = "Height";
        public const string Horizontal = "Horizontal";
        public const string Version = "Version";
    }
}

I have been asked how the gesture hookup behaves. Well, imagine you were tracking mouse moves to see which items were under the cursor, you might want to create a Blend Behavior that looked something like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows;
using System.Windows.Interactivity;
using System.Windows.Input;
using System.Windows.Controls;
using System.Diagnostics;
using System.Timers;
using System.Windows.Media;
using LinqToVisualTree;
using System.Windows.Threading;

namespace Huda.Behaviors
{
    public class GestureMovementBehavior : Behavior<TreeView>
    {
        private bool inBounds;
        private Timer selectedTimer = new Timer();
        private TreeViewItem selectedTreeview;
        private TreeViewItem previousTreeView;

        protected override void OnAttached()
        {
            Window.GetWindow(AssociatedObject).MouseMove += GestureMovementBehavior_MouseMove;
            selectedTimer.Interval = 2000;
            selectedTimer.AutoReset = true;
            selectedTimer.Elapsed += selectedTimer_Elapsed;
            base.OnAttached();
        }

        void selectedTimer_Elapsed(object sender, ElapsedEventArgs e)
        {
            selectedTimer.Stop();
            Dispatcher.Invoke(DispatcherPriority.Normal,
                (Action)delegate()
                {
                    if (previousTreeView != null)
                    {
                        previousTreeView.IsSelected = false;
                    }
                    previousTreeView = selectedTreeview;
                    selectedTreeview.IsSelected = true;
                });
        }

        protected override void OnDetaching()
        {
            Window.GetWindow(AssociatedObject).MouseMove -= GestureMovementBehavior_MouseMove;
            base.OnDetaching();
        }

        void GestureMovementBehavior_MouseMove(object sender, MouseEventArgs e)
        {
            // Are we over the treeview?
            Point pt = e.GetPosition(Window.GetWindow(AssociatedObject));
            Rect rect = new Rect();
            rect = AssociatedObject.RenderTransform.TransformBounds(
                new Rect(0, 0, AssociatedObject.ActualWidth, AssociatedObject.ActualHeight));
            if (rect.Contains(pt))
            {
                if (!inBounds)
                {
                    inBounds = true;
                }
                // Now, let's test to see if we are interested in this coordinate.
                if (selectedRectangle == null || !selectedRectangle.Value.Contains(pt))
                {
                    GetElement(pt);
                }
            }
            else
            {
                if (inBounds)
                {
                    selectedTimer.Stop();
                    inBounds = false;
                }
            }
        }

        // Nullable so we can tell whether we've latched onto an item yet.
        private Rect? selectedRectangle;
        private void GetElement(Point pt)
        {
            IInputElement element = AssociatedObject.InputHitTest(pt);
            if (element == null) return;

            TreeViewItem t = FindUpVisualTree<TreeViewItem>((DependencyObject)element);
            if (t != null)
            {
                // Get the bounds of t.
                if (selectedTreeview != t)
                {
                    selectedTimer.Stop();
                    // This is a different item.
                    selectedTreeview = t;
                    selectedRectangle = selectedTreeview.RenderTransform.TransformBounds(
                        new Rect(0, 0, t.ActualWidth, t.ActualHeight));
                    selectedTimer.Start();
                }
            }
            else
            {
                // Stop processing and drop this out.
                selectedTimer.Stop();
                selectedTreeview = null;
            }
        }

        private T FindUpVisualTree<T>(DependencyObject initial) where T : DependencyObject
        {
            DependencyObject current = initial;

            while (current != null && current.GetType() != typeof(T))
            {
                current = VisualTreeHelper.GetParent(current);
            }
            return current as T;
        }
    }
}

With only a minimal amount of change, that’s exactly the Behavior that you hook up to a treeview.
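If you haven't used Blend Behaviors before, attaching one from code is a one-liner (photoTreeView is just an illustrative control name; the Interaction class lives in System.Windows.Interactivity):

// Attach the gesture behaviour to a TreeView from code-behind.
Interaction.GetBehaviors(photoTreeView).Add(new GestureMovementBehavior());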


And there you have it – code to do some basic transformations. The other filters that I apply will all follow this pattern, so it’s an incredibly trivial thing for me to put in place – and, more importantly, this provides the core functionality for me to save out the filters along with details of the files they apply to. All of a sudden, this application is starting to come together.

This week's music

  • Deep Purple – Purpendicular
  • Elbow – Build A Rocket Boys
  • David Garrett – Rock Symphonies
  • Crash Test Dummies – God Shuffled His Feet
  • Nickelback – All The Right Reasons
  • Jethro Tull – Aqualung Live
  • Creedence Clearwater Revival – Creedence Clearwater Revival
  • Deep Purple – The Battle Rages On
  • Caitlin – The Show Must Go On

Status update

I’ve reached the end of the week and it’s worth taking stock of where I’ve reached. The harsh reality right now is that I won’t be delivering a fully-featured photo editing application at the end. Before the alarm bells start ringing, this is exactly what I expected. The aim for this competition is not to release the next Photoshop at the end of week 7. This would be a physical impossibility, given how I’m incorporating none Photoshop features. I am confident, however, that I will deliver what I was aiming at from the very start – a none destructive gesture and touch driven photo editing application. For this reason, I am going to only going to incorporate a limited number of filters. Don’t worry though, if you want to use Huda, I will continue developing it once the contest has finished. Your investment in using it will not be wasted.

Right now, Huda is limited to only editing a single image at a time. If I have time, before the end of the contest, I will give Huda the ability to edit multiple photos. Over the next week, I’m going to be tackling the reordering of items in the list, fleshing out that toolbar and also having the interface behave a little bit differently when it’s in tablet or desktop mode. Right now, the design has all been based on a tablet experience – but let’s see what we can do for the desktop users. Oh, and I should also have the first draft of the voice context in place as well.

Ultimate Coder – Week 3: Blog posting

This is a copy of the post I made on the Intel site here. For the duration of the contest, I am posting a weekly blog digest of my progress with using the Perceptual Computing items. This week's post shows how Huda has evolved from the application that was created at the end of the first week. My fellow competitors are planning to build some amazing applications, so I'd suggest that you head on over to the main Ultimate Coder site and follow along with their progress as well. I'd like to take this opportunity to wish them good luck.

Week 2

Well, it’s week 2 now and development is carrying on apace. I’ve managed to get a fair bit into Huda in the first week, so the question has to be asked “Will Peter manage to maintain this awesome pace?” We all know that was the question you wanted to ask, but were far too nice to ask, so we’ll take it as read, shall we?

Monday

Today, I’m going to be working on adding some filters to the pictures that I’m opening. In the short term, all I’m going to do is hardcode the application to apply blur and Sepia filters to any picture that opens. This isn’t a huge task, and it’s a great way to get something in place that I can use to style the open filters window.

As I want to use filters from many places, I'm going to create a central location for all my filter code. I'll treat this as a service and use something that's known rather nerdily as Service Location to get a single reference to my filter code where it's needed. Ultimately, this is going to be the master model for the whole approach of managing and manipulating the filters, including the "save" method. One thing I want to do with this approach is get away from the concept of the user having to save work. It's no longer 1990, so why should the user have to remember to save their work? This is probably going to be the most controversial area for those who are used to traditional desktop applications but, given that we are building the next generation of application for people who are used to working with tablets and phones, I think this is an anachronism that we can dispense with quite happily.
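Conceptually, the service location piece is tiny. The sketch below illustrates the idea with a hand-rolled locator – IFilterManager and FilterManager are stand-in names, and in practice an existing container (such as the one that ships with MVVM Light) would do the same job:

using System;
using System.Collections.Generic;

namespace Huda.Services
{
    // A deliberately tiny service locator, purely to illustrate the idea: register the
    // single filter manager at startup, then resolve that same instance anywhere it's needed.
    public static class ServiceLocator
    {
        private static readonly Dictionary<Type, object> services = new Dictionary<Type, object>();

        public static void Register<T>(T instance)
        {
            services[typeof(T)] = instance;
        }

        public static T Resolve<T>()
        {
            return (T)services[typeof(T)];
        }
    }
}

At startup I'd call ServiceLocator.Register<IFilterManager>(new FilterManager()), and then any view model can call ServiceLocator.Resolve<IFilterManager>() to get at the same filter model.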

Right, that’s a central filter model in place, so all I need to do is hook up the photo selection code and *bang* we have our filters being applied in place. As that was so easy to do, why don’t we get some feedback on the photograph as well? Let’s have the RGB histograms and display them to the user. We’ll put them at the bottom of the display, and add some animation to them so that we can start to get some feedback from them. Again, because I’m working in WPF, and I’m using the MVVM design pattern, the actual screens and animation are incredibly trivial to implement.

As an aside, here, you may be wondering what I'm using in terms of external libraries other than the Perceptual SDK. While I developed my own MVVM framework, I decided for this implementation to go with my good friend Laurent Bugnion's MVVM Light, as there's a good chance that the WPFers who end up downloading and playing with the source will have had exposure to that. I contemplated using Code Project uber developer Sacha Barber's excellent Cinch implementation, but the fact that Laurent had converted MVVM Light over to .NET 4.5 was the winning vote for me here. For the actual photo manipulation, I've gone with AForge.NET. This is a superb library that does all that I need, and so much more. Again, I could have gone with home rolled code, but the benefit of doing so is far outweighed by the tight timescales.

Tuesday

It’s Tuesday, now, and Intel have arranged a Google hangout for today. This is our opportunity to speak directly to the judges and to give feedback on where we are, and what’s going on. I’ve left the office earlier than I normally would, so as to be in plenty of time for the hangout. All I need is a link from our contacts and I am good to go.

Okay, the hangout has started and I've still had no link – nothing. I can see the other parties, but I can't get in. I'll fire off an email and hope that someone opens it to let me in. Robert and Wendy are incredibly clued in, so I'm hopeful that I should be able to join soon.

The hangout has ended, and I'm feeling pretty despondent now. Nothing came through, so I haven't had a chance to talk things through with the others, or to talk to the judges. I'd hoped that they would be online as well, so we could get some feedback from them. Still, this did give me time to noodle around on a guitar for an hour or so – something that I haven't had any time for since the competition started – so that was good. For those that are interested, I was practicing the Steve Morse trick of alternate picking arpeggios that you would normally sweep pick; talk about a challenge. (Note – I was playing this on an Ibanez Prestige RG550XX – a very smooth guitar to play on).

Anyway, I don’t have time to let this get in the way of the coding. As all the other competitors noted, time is tight. Time is so tight that at least one team has allocated a project manager to help them prioritise and manage this as a deliverable. I can’t do this unfortunately, but at least I won’t have to fill in status reports on my progress other than through the blog. One of my fellow solo coders, Lee, made some bold claims in his blog posting, and what I’ve seen from his blog video shows that his bold claims are backed up by bold ability – he’s definitely one to watch here; his project really could revolutionise online webinars and skype like activities.

Today's code is doing some animation on the histogram views, and then hooking that animation up to the gesture camera. This is where things start to get serious. There are, effectively, two approaches I can take here. The first approach is to take the coordinates I'm getting back from the camera and use the VisualTreeHelper.HitTest method in WPF to identify the topmost element that the coordinates match with (there's a sketch of what that looks like just after this list). This is fairly trivial to implement, but it's not the method I'm going to take for several reasons, the top ones being:

  1. This returns the topmost item, regardless of where the mouse is. In other words, it will return the top item of the window, even if I’m not over something that I could be interested in. This means I’ll get continuous reports of things like the position being over a layout Grid or Border.
  2. The hit test is slow. What it would be doing, on every frame from the camera, is looking through the VisualTree for the topmost item at that point. That’s an expensive computational operation.
  3. Once I’ve identified that the topmost item is actually something I’m interested in, the code would have to trigger some operation there. This means that I would have to put knowledge of the views I’m showing into the code behind the main window. This is something that I want to avoid at all costs, because this hard coupling can lead to problems down the line.

The approach I’m taking is for each control that I want to interact with the gesture camera to subscribe to the gesture events coming out of the gesture camera. What this means to the code is that each control has to hook into the same gesture camera instance – I’ve created what’s known as a singleton in my Perceptual SDK that serves up a single instance of each perceptual pipeline that I’m interested in. In real terms, this currently intersperses the gesture logic in a few locations, because I want the code that’s displaying the finger tracking image handled in the main window, but the code that’s actually working out whether or not the finger is over the control needs to be in the control. Today, I’m just going to get this working in one control, but the principle should allow me to then generalise this out into a functionality you can subscribe to if you want to have this handled for you.

As I’ve added an animation to the the histogram view, this seems to be a great place to start. What I’ll do here is have the animation triggered when the gesture moves into the control, and have a second animation triggered when the gesture moves out of the control. I now have enough of a requirement to start coding. The first thing I have to do here is to hook into the gesture event inside the Loaded event for the control; this is done here so that we don’t end up having problems with the order that views are started meaning that we try to hook into the event when the pipeline hasn’t been initialised.

Okay, now that I’ve got that event in place, I can work out whether or not the X and Y coordinates fall inside my control. Should be simple – I just need to translate the coordinates of my control into screen coordinates and as I’m just working with rectangular shapes, it’s a simple rectangle test. So, that’s what I code – half a dozen lines of boilerplate code and a call to trigger the animation.

Time to run the application. Okay, I’m waving my hand and the blue spot is following along nicely. Right, what happens when I move my hand over the control I want to animate? Absolutely nothing at all. I’ll put a breakpoint in the event handler just to ensure the event is firing. Yup. I hit the breakpoint no problem, but no matter what I do, the code doesn’t trigger the animation. My rectangle bounds checking is completely failing.

A cup of coffee later, and the answer hits me, and it’s blindingly obvious when it does. I’m returning the raw gesture camera coordinates to Huda, and I’m converting it into display coordinates when I’m displaying the blue blob. What I completely failed to do, however, was remember to perform the same transformation in the user control. In order to test my theory, I’ll just copy the transformation code in to the control and test it there. This isn’t going to be a long term fix, it’s just to prove I’m right with my thinking. Good programming practice indicates that you shouldn’t repeat yourself (we even give this the rather snappy acronym of DRY – or Don’t Repeat Yourself), so I’ll move this code somewhere logical, and just let it be handled once. The logical thing for me to do is to actually return the transformed coordinates, so that’s what I’ll do.
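The transformation itself is nothing more exotic than scaling the camera's image coordinates up to the size of the window – something along these lines, now living in one place rather than being repeated in every control:

// Map a raw camera image coordinate onto the application window. The 320x240 camera
// image size is an illustrative assumption - the real code uses the dimensions the SDK
// reports - and, depending on the camera, the X axis may also need mirroring.
private static Point ToScreenCoordinates(float cameraX, float cameraY, double screenWidth, double screenHeight)
{
    const double cameraWidth = 320.0;
    const double cameraHeight = 240.0;

    return new Point(
        cameraX / cameraWidth * screenWidth,
        cameraY / cameraHeight * screenHeight);
}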

Run the application again and perform the same tests. Success. Now I have the histogram showing when I move my hand over it, and disappearing when I move my hand away from the control. Job done, and time to turn this into a control that my user controls can inherit from. When I do this, I’m going to create GestureEnter and GestureLeave events that follow a similar pattern that WPF developers are familiar with for items such as Mouse and Touch events.

One thing that I didn’t really get time to do last week was to start adding touch support in, beyond the standard “press the screen to open a photo”. That’s something I’m going to start remedying now. I’m just about to add the code for zooming and panning the photo based on touch. Once I’ve got that in place, I think I’ll call it a night. I’ve decided that I’m going to add pan and zoom to the photograph based on touch events; the same logic I use here will apply to gestures, so I will encapsulate this logic and just call it where I need to. WPF makes this astonishingly simple, so the code I will be using looks a lot like this:

<ScrollViewer HorizontalScrollBarVisibility="Auto" VerticalScrollBarVisibility="Auto">
    <Image
        Source="{Binding PreviewImage}"
        HorizontalAlignment="Stretch"
        x:Name="scaleImage"
        ManipulationDelta="scaleImage_ManipulationDelta_1"
        ManipulationStarting="scaleImage_ManipulationStarting_1"
        VerticalAlignment="Stretch"
        IsManipulationEnabled="True"
        RenderTransformOrigin="0.5,0.5" Stretch="Fill">
        <Image.RenderTransform>
            <MatrixTransform>
            </MatrixTransform>
        </Image.RenderTransform>
    </Image>
</ScrollViewer>

Now, some people think that MVVM requires you to remove all code from the code behind. Far be it from me to say that they are wrong, but they are wrong. MVVM is about removing the code that doesn't belong in the view from the view. In other words, as the image panning and zooming relates purely to the view, it's perfectly fine to put the logic into the code behind. So, let's take advantage of the ManipulationDelta and ManipulationStarting events, which give us the ability to apply that ol' pan and zoom mojo. It goes something like this:

private void scaleImage_ManipulationDelta_1(object sender, ManipulationDeltaEventArgs e)
{
    Matrix rectsMatrix = ((MatrixTransform)scaleImage.RenderTransform).Matrix;
    ManipulationDelta manipDelta = e.DeltaManipulation;
    Point rectManipOrigin = rectsMatrix.Transform(new Point(scaleImage.ActualWidth / 2, scaleImage.ActualHeight / 2));

    rectsMatrix.ScaleAt(manipDelta.Scale.X, manipDelta.Scale.Y, rectManipOrigin.X, rectManipOrigin.Y);
    rectsMatrix.Translate(manipDelta.Translation.X, manipDelta.Translation.Y);
    ((MatrixTransform)scaleImage.RenderTransform).Matrix = rectsMatrix;
    e.Handled = true;
}

private void scaleImage_ManipulationStarting_1(object sender, ManipulationStartingEventArgs e)
{
    e.ManipulationContainer = this;
    e.Handled = true;
}

I’m not going to dissect this code too much but, suffice it to say the ScaleAt code is responsible for zooming the photo and the Translate code is responsible for panning it. Note that I could have easily added rotation if I’d wanted to, but free style rotation isn’t something I’m planning here.

Wednesday

Well, it’s a new day and one of the features I want to be able to do is to retrieve the item that a finger is pointed at, based on it’s X/Y coordinates. As WPF doesn’t provide this method by default, it’s something that I’ll have to write myself. Fortunately, this isn’t that complicated a task and the following code should do nicely.

public static int GetIndexPoint(this ListBox listBox, Point point)
{
    int index = -1;
    for (int i = 0; i < listBox.Items.Count; ++i)
    {
        // Use the item container rather than the data item, so this works when the
        // ListBox is data bound as well as when ListBoxItems are added directly.
        ListBoxItem item = listBox.ItemContainerGenerator.ContainerFromIndex(i) as ListBoxItem;
        if (item == null) continue;
        Point xform = item.TransformToVisual((Visual)listBox.Parent).Transform(new Point());
        Rect bounds = VisualTreeHelper.GetDescendantBounds(item);
        bounds.Offset(xform.X, xform.Y);
        if (bounds.Contains(point))
        {
            index = i;
            break;
        }
    }
    return index;
}

Tracking the mouse move, for instance, would look like this:

int lastItem = 0;
private Timer saveTimer;

public MainWindow()
{
    InitializeComponent();
    this.MouseMove += new MouseEventHandler(MainWindow_MouseMove);
    saveTimer = new Timer();
    saveTimer.Interval = 2000;
}

void saveTimer_Elapsed(object sender, ElapsedEventArgs e)
{
    saveTimer.Stop();
    saveTimer.Elapsed -= saveTimer_Elapsed;
    Dispatcher.BeginInvoke((Action)(() =>
    {
        selectyThing.SelectedIndex = lastItem;
    }), DispatcherPriority.Normal);
}

void MainWindow_MouseMove(object sender, MouseEventArgs e)
{
    int index = selectyThing.GetIndexPoint(e.GetPosition(this));
    if (index != lastItem)
    {
        lastItem = index;
        saveTimer.Stop();
        saveTimer.Elapsed -= saveTimer_Elapsed;
        if (lastItem != -1)
        {
            saveTimer.Elapsed += saveTimer_Elapsed;
            saveTimer.Start();
        }
    }
}

Thursday

Today I’ve really been motoring, and managed to pull slightly ahead of my schedule, but I’ve also hit a bit of a gesture road block. So, I’ve updated the gesture code to retrieve multiple finger positions, as well as to retrieve a single node. This actually works really well from a code point of view, and it’s great to see my “fingers” moving around on the screen. The problem is that one finger “wobbles” less than a hand, plus the item selection actually works better based on a single finger. At least I’ve got options now – and the gesture code is working well – plus, I’ve managed to hook single selection of a list item based on my finger position. This allows the gesture code to mimic the touch code, something I’m going to explore more in week 3. My gesture selection code has now grown to this:

using Goldlight.Perceptual.Sdk.Events;
using Goldlight.Windows8.Mvvm;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace Goldlight.Perceptual.Sdk
{
    public class GesturePipeline : AsyncPipelineBase
    {
        private WeakEvent<GesturePositionEventArgs> gestureRaised = new WeakEvent<GesturePositionEventArgs>();
        private WeakEvent<MultipleGestureEventArgs> multipleGesturesRaised = new WeakEvent<MultipleGestureEventArgs>();
        private WeakEvent<GestureRecognizedEventArgs> gestureRecognised = new WeakEvent<GestureRecognizedEventArgs>();

        public event EventHandler<GesturePositionEventArgs> HandMoved
        {
            add { gestureRaised.Add(value); }
            remove { gestureRaised.Remove(value); }
        }

        public event EventHandler<MultipleGestureEventArgs> FingersMoved
        {
            add { multipleGesturesRaised.Add(value); }
            remove { multipleGesturesRaised.Remove(value); }
        }

        public event EventHandler<GestureRecognizedEventArgs> GestureRecognized
        {
            add { gestureRecognised.Add(value); }
            remove { gestureRecognised.Remove(value); }
        }

        public GesturePipeline()
        {
            EnableGesture();

            searchPattern = GetSearchPattern();
        }

        private int ScreenWidth = 1024;
        private int ScreenHeight = 960;

        public void SetBounds(int width, int height)
        {
            this.ScreenWidth = width;
            this.ScreenHeight = height;
        }

        public override void OnGestureSetup(ref PXCMGesture.ProfileInfo pinfo)
        {
            // Limit how close we have to get.
            pinfo.activationDistance = 75;
            base.OnGestureSetup(ref pinfo);
        }

        public override void OnGesture(ref PXCMGesture.Gesture gesture)
        {
            if (gesture.active)
            {
                var handler = gestureRecognised;
                if (handler != null)
                {
                    handler.Invoke(new GestureRecognizedEventArgs(gesture.label.ToString()));
                }
            }
            base.OnGesture(ref gesture);
        }

        public override bool OnNewFrame()
        {
            // We only query the gesture if we are connected. If not, we shouldn't
            // attempt to query the gesture.
            try
            {
                if (!IsDisconnected())
                {
                    var gesture = QueryGesture();
                    PXCMGesture.GeoNode[] nodeData = new PXCMGesture.GeoNode[6];
                    PXCMGesture.GeoNode singleNode;
                    searchPattern = PXCMGesture.GeoNode.Label.LABEL_BODY_HAND_PRIMARY;
                    var status = gesture.QueryNodeData(0, searchPattern, nodeData);

                    if (status >= pxcmStatus.PXCM_STATUS_NO_ERROR)
                    {
                        var handler = multipleGesturesRaised;

                        if (handler != null)
                        {
                            List<GestureItem> gestures = new List<GestureItem>();
                            foreach (var node in nodeData)
                            {
                                float x = node.positionImage.x - 85;
                                float y = node.positionImage.y - 75;

                                GestureItem item = new GestureItem(x, y, node.body.ToString(), ScreenWidth, ScreenHeight);
                                gestures.Add(item);

                                handler.Invoke(new MultipleGestureEventArgs(gestures));
                            }
                        }
                    }

                    status = gesture.QueryNodeData(0, GetSearchPattern(), out singleNode);
                    if (status >= pxcmStatus.PXCM_STATUS_NO_ERROR)
                    {
                        var handler = gestureRaised;
                        if (handler != null)
                        {
                            handler.Invoke(new GesturePositionEventArgs(singleNode.positionImage.x,
                                singleNode.positionImage.y, singleNode.body.ToString(), ScreenWidth, ScreenHeight));
                        }
                    }
                }
                Sleep(20);
            }
            catch (Exception ex)
            {
                // Error handling here...
            }
            return true;
        }

        private readonly object SyncLock = new object();

        private void Sleep(int time)
        {
            lock (SyncLock)
            {
                Monitor.Wait(SyncLock, time);
            }
        }

        private PXCMGesture.GeoNode.Label searchPattern;

        private PXCMGesture.GeoNode.Label GetSearchPattern()
        {
            return PXCMGesture.GeoNode.Label.LABEL_BODY_HAND_PRIMARY |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_INDEX |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_MIDDLE |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_PINKY |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_RING |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_THUMB |
                PXCMGesture.GeoNode.Label.LABEL_HAND_FINGERTIP;
        }
    }
}

I’ve hooked into the various events from here to handle the type of interactions we are used to seeing from mouse and touch. Again, there’s a bit of repeated code in here, but I’ll be converting this into WPF behaviors which you will be able to use in your own projects. I’ve said it before, I love writing Blend Behaviors – they really make your life easy. As a personal plea to Microsoft; if you want to get people to write for WinRT, give us the features we’re used to in XAML right now. The fact that I can’t use Blend Behaviors in Windows 8 “Modern” applications is one more barrier in the way of me actually writing for the app store.

Friday

Following the big strides forwards I made with the gesture code yesterday, I’m going to have an easy one today and hook up the filter code. A lot of the time here will just be putting the plumbing in place to actually display the filters and select them. Initially, I’m only going to offer a few filters – I’ll add more next week. The key thing here is to prove that I can do this easily. I’m not spending too much time on the UI for the filters right now, but I will be adding to it in the near future. The point is that, while the interface is quite spartan right now, for a seasoned WPF developer, actually beefing it up isn’t a major task. The key thing I have in place is that all the main interface areas are solid black, with a glowy border around them. Once I’ve added the filters in and hooked them up, I think I’ll post a picture or two to show the before and after filter behaviour of Huda.

This screenshot is taken immediately after the very posh sheep photo was loaded:

If you look carefully at the bottom of the image, you’ll see the very faint Histogram view that I was talking about earlier. I’ve made this part of the interface transparent – gesture over it, or touch it and it becomes opaque. I’ll be doing something similar with other operations as well – the photo should be the view point, not the chrome such as the available filters.

I’ll apply the channel filter and blur like this:

Well, they have applied quite nicely. I’ll add other filters and the ability to set their values next week.

This week's music
  • Deep Purple – Purpendicular
  • AC/DC – Back in Black
  • Andrea Bocelli – Sacred Arias
  • Muse – Black Holes and Revelations
  • Chickenfoot – Chickenfoot III
  • Jethro Tull – Aqualung and Thick As A Brick

End of week 2

Well here we are. Another week of developing and blogging. Where do I think I’m at right now? In some ways, I’m ahead of schedule but in other ways I feel I’m behind schedule. Oh, the code will be finished on time, and the application will do what I set out to do, but when I look at what my competitors are doing I can’t help feeling that I could do so much more with this. The big problem, of course, is that it’s easy to get carried away with “cool” features but if they don’t add anything to the application, then they are worthless. While the things that the other teams are doing make sense in the problem domains they are using, some of these features really don’t make sense in a photo editor. Perhaps I could have been more ambitious with what I wanted to deliver at the start, but overall I’m happy with what I’m going to deliver and what is coming up.

As a corollary to this, have you checked out what my fellow contestants are planning? Please do so and comment on their posts. They are all doing some amazing things, and I would take it as a personal favour if you would take the time to offer your encouragement to them.

By the end of week 3, I fully intend to have the whole central premise linked in. If you remember, my statement for Huda was that edits should be non-destructive. Well, they aren't destructive, but they aren't saved anywhere right now. That's what I need to hook in next week. The edits must be saved away, and the user must be able to edit them. If I haven't delivered this, then I've failed to deliver on the promise of the application. Fortunately, I already know how I want to go about the save and it doesn't touch on too much code that's already in place, so it should be a nice and natural upgrade.

While I’ve thanked my fellow competitors, I haven’t taken the time to thank the judges and to our support team, (primarily the wonderful Bob Duffy and Wendy Boswell; what does the X stand for in your name?) There’s a real sense of camaraderie amongst the competitors, more so than I would ever have expected in a contest. We are all prepared to give our time and our thoughts to help the other teams – ultimately, we were all picked because of our willingness to share with the wider audience so there’s a natural symbiosis of us sharing with each other. Personalities are starting to come to the fore, and that is wonderful. Anyway, back to the judges (sorry for the digression). Come Wednesday, I hit refresh on the browser so many times waiting for their comments. Surprisingly enough, for those who know me, this is not an ego issue. I’m genuinely interested to read what the judges have to say. This is partly because I want to see that they think the application is still worth keeping their interest (and yes Sascha, I know precisely why Ctrl-Z is so important in photo apps given the number of disastrous edits I’ve made in the past). The main reason, though, is that I want to see that I’m writing the blog post at the write level. As someone who is used to writing articles that explain code listings, I want to make sure that I don’t make you nod off if you aren’t a coder, but at the same time I have a duty of care towards my fellow developers to try and educate them. So, to the judges, as well as Bob and Wendy; I thank you for your words and support.

You may have noticed that I haven't posted any YouTube videos this week. There's a simple reason for that. Right now, Huda is going through rapid changes every day. It's hard to choose a point at which I want to say "yes, stopping and videoing at this point makes sense", because I know that if I wait until the following day, the changes to the application at that point would make as much, or even more, sense. My gut feeling, right now, is that the right time to make the next video is when we can start reordering the filters.

One thing that was picked up on by the judges was my posting the music I was listening to. You may be wondering why I did that. There was a reason for it, and it wasn't just "a bit of daft" on my part. Right now, this contest is a large part of my life. I'm devoting a lot of time to it – this isn't a complaint, as I love writing code – but you can go just the slightest bit insane if you don't have some background noise. I can't work with the TV on, but I can code to music, so I thought it might be interesting for the non-coders to know how at least one coder hangs on to that last shred of what he laughingly calls his sanity. Last week, I didn't manage to pick up my guitar, so I am grateful that I managed to get 40 minutes or so to just noodle around this week. My down time is devoted to my family, including our newest addition Harvey, and this is the time I don't even look at a computer. Expect to see more of Harvey over the next few weeks.

Harvey in all his glory.

So, thanks for reading this. I hope you reached this point without nodding off, and the magic word for this week is Wibble. We’ll see which judges manage to get it into their posts – that’ll see who’s paying attention. Until next week, bye.

Ultimate Coder – Week 2: Blog posting

February 26, 2013

This is a copy of the post I made on the Intel site here. For the duration of the contest, I am posting a weekly blog digest of my progress with using the Perceptual Computing items. The first week's post is really a scene setter where I explain how I got to this point, and details bits and pieces about the app I intend to build. My fellow competitors are planning to build some amazing applications, so I'd suggest that you head on over to the main Ultimate Coder site and follow along with their progress as well. I'd like to take this opportunity to wish them good luck.

Week 1

Well, this is the first week of coding for the Perceptual Computing challenge, and I thought it might be interesting for you to know how I’m approaching developing the application, what I see the challenges as being, and any roadblocks that I hit on the way. I must say, up front, that this is going to be a long post precisely because there’s so much to put in here. I’ll be rambling on about decisions I make, and I’ll even post some code in for you to have a look at if you’re interested.

As I’m not just blogging for developers here, writing these posts is certainly going to be interesting because I don’t want to bog you down with technical details that you aren’t interested in if you just want to know about my thought processes intead, but I don’t want to leave you wondering how I did something if you are interested in it. Please let me know if there’s anything that you’d like clarification on, but also, please let me know if the article weighs in too heavily on the technical side.

Day 1.

Well, what a day I’ve had with Huda. A lot of what I want to do with Huda is sitting in my head, so I thought I’d start out by roughing out a very, very basic layout of what I wanted to put into place. Armed with my trusty copy of Expression Blend, I mocked out a rough interface which allowed me to get a feel for sizing and positioning. What I really wanted to get the feel of was, would Huda really fit into a layout that was going to allow panels to fly backwards and forwards, and yet still allow the user to see the underlying photo. I want the “chrome” to be unobtrusive, but stylish.

As you can see, this representation is spartan at best, and if this was the end goal of what I was going to put into Huda, I would hang my head in shame, but as it’s a mockup, it’s good enough for my purposes. I’ve divided the screen into three rough areas at the moment. At the right, we have a list of all the filters that have been applied to the image, in the order they were applied. The user is going to be able to drag things around in this list using a variety of inputs, so the text is going to be large enough to cope with a less accurate touch point than from a mouse alone.

The middle part of the picture represents the pictures that are available for editing in the currently selected folder. When the user clicks on a folder in the left hand panel, this rearranges to show that folder at the top, along with all its children – and the pictures will appear in the centre of the screen. The user will click on a picture to open it for editing. I've taken this approach, rather than just using a standard Open File dialog, because I want the user to be able to use non-keyboard/mouse input, and the standard dialogs aren't optimised for things like touch. This does have the advantage of allowing me to really play with the styling and provide a completely unified experience across the different areas of the application.

Well, now that I’ve finished roughing out the first part of the interface, it’s time for me to actually write some code. I’ve decided that the initial codebase is going to be broken down into four projects – I’m using WPF, C#, .NET 4.5 and Visual Studio Ultimate 2012 on Windows 8 Professional for those who care about such things – and it looks like this:

  • Goldlight.Common provides common items such as interfaces that are used in the different projects, and definitions for things like WeakEvents.
  • Goldlight.Perceptual.Sdk is the actual meat of the SDK code.  Initially this will be kept simple, but I will expand and enhance this as we go through the weeks.
  • Goldlight.Windows8 contains the plumbing necessary to use Ultrabook™ features such as flipping the display into tablet mode, and it isolates the UI from having to know about all the plumbing that has to be put in place to use the WinRT libraries.
  • Huda is the actual application, so I’m going to spend most of this week and next week deep in this part, with some time spent in Goldlight.Perceptual.Sdk.

When I start writing a UI, I tend to just rough-block things in as a first draft. So that’s what I did today. I’ve created a basic page and removed the standard Windows chrome. I’m doing this because I want to have fine grained control of the interface when it transitions between desktop and tablet mode. The styling is incredibly easy to apply, so that’s where I started.

A quick note if you’re not familiar with WPF development. When styling WPF applications, it’s generally a good idea to put the styling in something called a ResourceDictionary. If you’re familiar with CSS, this is WPF’s equivalent of a separate stylesheet file. I won’t bore you with what this file actually looks like at this point, but please let me know if you would like more information. Once I’ve fine tuned some of the styling, I’ll make this file available and show how it can be used to transition the interface over – this will play a large part when we convert our application from desktop to tablet mode, so it makes sense to put the plumbing in place right at the start.

My first pass on the UI involved creating a couple of basic usercontrols that animate when the user brings the mouse over them or touches them; giving a visual cue that there’s something of interest in this area. I’ve deliberately created them to be ugly – this is a large part of my WPF workflow – concentrate on putting the basics in place and then refine them. I work almost exclusively with a development pattern called MVVM (Model View ViewModel), which basically allows me to decouple the view from the underlying application logic. This is a standard pattern for WPF zealots like myself, and I find that it really shines in situations like this, where I just quickly want to get some logic in place.

The first usercontrol I put in place is just an empty shell that will hold the filters that the user has added. As I need to get an image on the screen before I can add any filters, I don’t want to spend too much time on this just yet – I just needed to have it in the UI as a placeholder, primarily so that I can see how gesture code will affect the UI.

The second control is more interesting. This control represents the currently selected folder and all its children. My first pass was just to put a ListBox in place in this control, and to have the control expand and contract as the user interacts with it. The ListBox holds the children of the current folder, so I put a button in place to allow the user to display the images for the top level folder. When I run the application, it quickly becomes apparent that this doesn't work as a UI design, so I will revisit it with alternative ideas.

I could have left the application here, happy that I had the beginnings of a user interface in place. Granted, it doesn't do much right now – it displays the child folders associated with My Pictures, and that's about it – but it does work. However, what's the point of doing this development if I don't bring in gesture and voice control? In the following video, once Huda has started, I'm controlling the interface entirely with gestures and voice recognition (when I say "Filter", a message box is displayed saying filter – not the most startling achievement, but it's pretty cool how easy it is to do). Because I'm going to issue a voice command for this first demonstration, I decided not to do a voice over – it would just sound weird if I paused halfway through a sentence to say "Filter".

 As you can see – that interface is ugly, but it’s really useful to get an idea of what works and what doesn’t. As I’m not happy with the folder view, that’s what I’ll work on tidying up in day 2.

Note: I’m not going to publish videos every day, and I’m not going to publish a day by day account of the development. The first couple of days are really important in terms of starting the process off and these are the points where I can really make quick wins, but later on, when I start really being finicky with the styling, you really aren’t going to be interested in knowing that I changed a TextBlock.FontSize from 13.333 to 18.666 – at least I hope you’re not.

The important thing for me, at the end of day 1, is that I have something to see from both sides. I have a basic UI in place; there's a long way to go with it yet, but it's on the screen, and there's actual Perceptual work going on behind it – it's actually pretty easy to get the basics in place. More importantly, my initial experiments have shown that the gestures are quite jerky, and getting any form of fine grained control is going to take quite a bit of ingenuity. Unfortunately, while I can get the voice recognition to work, it appears to be competing for processing time with the gesture code – both of which are running as background tasks.

One of the tasks I'll have to undertake is to profile the application to see where the hold-up is – I have suspicions that the weak events might be partly to blame, but I'll need to verify this. Basically, a weak event is a convenience that allows a developer to write code that isn't going to be adversely affected if they forget to release an event. While this is a standard pattern in the WPF world, it does have an overhead, and that might not be best when we need to eke out the last drop of performance. I might have to put the onus on any consumer of this library to remember to unhook any events that they have hooked up.
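For anyone wondering what a weak event actually looks like, here's a minimal sketch of the idea. I'm assuming the same Add/Remove/Invoke shape as the WeakEvent class used in the code below, but this is an illustration of the pattern, not the actual Goldlight implementation – rather than holding the delegate itself (which would keep the subscriber alive), it holds a weak reference to the subscriber and the method to call on it.

using System;
using System.Collections.Generic;
using System.Reflection;

public class WeakEvent
{
    private class Entry
    {
        public WeakReference Target;   // The subscriber instance (null target for static handlers).
        public MethodInfo Method;      // The handler method to invoke on it.
    }

    private readonly List<Entry> entries = new List<Entry>();

    public void Add(EventHandler handler)
    {
        lock (entries)
        {
            entries.Add(new Entry
            {
                Target = new WeakReference(handler.Target),
                Method = handler.Method
            });
        }
    }

    public void Remove(EventHandler handler)
    {
        lock (entries)
        {
            entries.RemoveAll(e => e.Method == handler.Method &&
                                   Equals(e.Target.Target, handler.Target));
        }
    }

    public void Invoke(EventArgs args)
    {
        List<Entry> snapshot;
        lock (entries)
        {
            // Prune subscribers that have been garbage collected.
            entries.RemoveAll(e => !e.Method.IsStatic && e.Target.Target == null);
            snapshot = new List<Entry>(entries);
        }

        foreach (var entry in snapshot)
        {
            object target = entry.Method.IsStatic ? null : entry.Target.Target;
            if (entry.Method.IsStatic || target != null)
            {
                // EventHandler signature is (object sender, EventArgs e).
                entry.Method.Invoke(target, new object[] { null, args });
            }
        }
    }
}

The reflection call in Invoke is exactly the sort of overhead I'm worried about, which is why profiling is on the list.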

Here’s the gesture recognition code that I put in place today, I know it’s not perfect and there’s a lot needs doing to it to make it production level code, but as it’s the end of the first day I’m pretty happy with it:

using Goldlight.Perceptual.Sdk.Events;
using Goldlight.Windows8.Mvvm;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
namespace Goldlight.Perceptual.Sdk
{
    public class GesturePipeline : AsyncPipelineBase
    {
        private WeakEvent gestureRaised = new WeakEvent();

        public event EventHandler HandMoved
        {
            add { gestureRaised.Add(value); }
            remove { gestureRaised.Remove(value); }
        }

        public GesturePipeline()
        {
            EnableGesture();
        }

        public override void OnGestureSetup(ref PXCMGesture.ProfileInfo pinfo)
        {
            // Limit how close we have to get.
            pinfo.activationDistance = 75;
            base.OnGestureSetup(ref pinfo);
        }

        public override bool OnNewFrame()
        {
            // We only query the gesture if we are connected. If not, we shouldn't
            // attempt to query the gesture.
            try
            {
                if (!IsDisconnected())
                {
                    var gesture = QueryGesture();
                    PXCMGesture.GeoNode nodeData;
                    var status = gesture.QueryNodeData(0, GetSearchPattern(), out nodeData);
                    if (status >= pxcmStatus.PXCM_STATUS_NO_ERROR)
                    {
                        var handler = gestureRaised;
                        if (handler != null)
                        {
                            handler.Invoke(new GestureEventArgs(
                                nodeData.positionImage.x,
                                nodeData.positionImage.y,
                                nodeData.body.ToString()));
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                // Error handling to go here...
                Debug.WriteLine(ex.ToString());
            }
            return base.OnNewFrame();
        }

        private PXCMGesture.GeoNode.Label GetSearchPattern()
        {
            return PXCMGesture.GeoNode.Label.LABEL_BODY_HAND_PRIMARY |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_INDEX |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_MIDDLE |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_PINKY |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_RING |
                PXCMGesture.GeoNode.Label.LABEL_FINGER_THUMB |
                PXCMGesture.GeoNode.Label.LABEL_HAND_FINGERTIP;
        }
    } 
}

At this stage, I’d just like to offer my thanks to Grasshopper.iics for the idea of tying the hand gesture to the mouse. As a quick way to demonstrate that the gesture was working, it’s a great idea. As I need to track individual fingers, it’s not a viable long term solution, but as a way to say “oh yes, that is working”, it’s invaluable.

Day 2

I’ve had a night to think about the folder display, and as I said yesterday, I’m really not happy with the whole button/list approach to the interface. What had started off as an attempt to try to steer clear of the whole file system as logical tree metaphor just feels too alien to me, and I suspect that I would end up having to rework a lot of the styling there to make this appear to be a logical tree. Plus, I really need to hook something up in the UI so that I can select folders and trigger the reload of the selected folder along with child folders. We’ll attend to the styling first.

As I’ve stated that we are going to present this part of the interface as though it’s a tree, it makes sense for us to actually use a tree, so I’m going to rip out the list and button, and replace them with a simple tree. As I’m using MVVM, the only thing I have to update is my UI code (known as the View). My internal logic is going to remain exactly the same – this is why I love MVVM. More importantly, this highlights why I start off with rough blocks. I like the fact that I can quickly get a feel for what’s working before I invest too much time in it. If you’re a developer using a tech stack that you’re comfortable with, and you have a technology that allows you to take this rapid iterative approach, I cannot recommend this quick, rough prototyping enough. It’s saved me from a lot of pain any number of times, and I suspect that it will do the same for you.

The second thing I’m going to do is hook selecting a tree node up to actually doing the refresh. Again, I put most of the plumbing in place for this yesterday – all I need to do today is actually hook the tree selection to the underlying logic. 

Now, I really want to play around with the styling a little bit. I'm going to restyle the folder tree so that it looks a bit more attractive, and I'm going to change the filter control and the folder view so that the user can drag them around the interface – including using touch, as I feel this really helps to make the Ultrabook™ stand out from other form factors. Having played around with the code, I've now got this in place:

I’ve added a lot of plumbing code to support the touch drag here. I’m leaving this part of the code “open” and unfinished right now because I want to add support into this for dragging through gestures. By doing this, I won’t have to touch the UI elements code to make them work, they should just work because they are responding to commands from this code.

Day 3+

I’ve really been pushing to get the application to the point where the user can select a photo from my own folder browser and picture selector combo, but the first thing I thought I would address was the voice control. When I really sat down with the code I’d put in place, I realised that I was doing more than I really needed to. Basically, in my architecture, I was creating a set of commands that the application would use as a sort of command and control option and while that seemed to me to be a logical choice when I put it in, sobre reflection pointed out to me that I was overcomplicating things. Basically, the only thing that needs to know about the commands is the application itself, so why was I supplying this to the perceptual part? If I let the Perceptual SDK just let me know about all the voice data it receives, the different parts of Huda could cherry pick as they saw fit. Two minutes of tidy up code, and it’s responding nicely.

As a quick aside here: the voice recognition doesn't send you a stream of words as you speak. It waits until you've paused and then sends you a phrase. This means that you have to be a little bit patient; you can't expect to say "Filter" in Huda and have the filters pop up a millisecond later, because the voice recognition is waiting to see whether you've finished the phrase.

Fortunately, this means that my voice code is currently insanely simple:

using Goldlight.Perceptual.Sdk.Events;
using Goldlight.Windows8.Mvvm;
using System;
namespace Goldlight.Perceptual.Sdk
{
    /// <summary>
    /// Manages the whole voice pipeline.
    /// </summary>
    public class VoicePipeline : AsyncPipelineBase
    {
        private WeakEvent voiceRecognized = new WeakEvent();

        /// <summary>
        /// Event raised when the voice data has been recognized.
        /// </summary>
        public event EventHandler VoiceRecognized
        {
            add { voiceRecognized.Add(value); }
            remove { voiceRecognized.Remove(value); }
        }

        /// <summary>
        /// Instantiates a new instance of <see cref="VoicePipeline"/>.
        /// </summary>
        public VoicePipeline() : base()
        {
            EnableVoiceRecognition();
        }

        public override void OnRecognized(ref PXCMVoiceRecognition.Recognition data)
        {
            var handler = voiceRecognized;

            if (handler != null)
            {
                handler.Invoke(new VoiceEventArgs(data.dictation));
            }
            base.OnRecognized(ref data);
        }
    } 
}

The call to EnableVoiceRecognition lets the SDK know that this piece of functionality is interested in handling voice recognition (cunningly enough – I love descriptive method names). I could have put this functionality into the same class as I'm using for the gesture recognition, but there are a number of reasons I've chosen not to, the top two being:

  • I develop in an Object Oriented language, so I’d be violating best practices by “mixing concerns” here. This basically means that a class should be responsible for doing one thing, and one thing only. If it has to do more than one thing, then you need more than one class.
  • I want to be able to pick and mix what I use and where I use it in my main application. If I have more than one piece of functionality in one class then I have to start putting unnecessarily complicated logic in there to separate the different parts out into areas that I want to use.

The OnRecognized piece of code lets me pick up the phrases as they are recognized, and I just forward those phrases on to Huda. As Huda is going to have to decide what to do when it gets a command, I’m going to let it see all of them and just choose the ones it wants to deal with. This is an incredibly trivial operation. 
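On the Huda side, consuming those phrases is just a case of subscribing and matching on the text. This is an illustrative sketch – the Phrase property name and the ShowFilters call are placeholders rather than Huda's actual members.

using System;

public class VoiceCommandHandler
{
    public VoiceCommandHandler(VoicePipeline pipeline)
    {
        pipeline.VoiceRecognized += OnVoiceRecognized;
    }

    private void OnVoiceRecognized(object sender, EventArgs args)
    {
        var voiceArgs = args as VoiceEventArgs;
        if (voiceArgs == null) return;

        // The pipeline forwards everything it hears; the application decides
        // which phrases it actually cares about.
        var phrase = voiceArgs.Phrase ?? string.Empty;
        if (phrase.IndexOf("filter", StringComparison.OrdinalIgnoreCase) >= 0)
        {
            ShowFilters();
        }
    }

    private void ShowFilters()
    {
        // Hand off to whichever part of the UI owns the filter panel.
    }
}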

“Wow Pete, that’s a lot of text to say not a lot” I hear you say. Well, I would if you were actually talking to me. It could be the voices in my head supplying your dialogue here. The bottom line here is that Huda now has the ability to recognise commands much more readily than it did at the start of the week, and it recognizes them while I’m waving my arms about moving the cursor around the screen. That’s exciting. Not dangerous exciting. Just exciting in the way that it means that I don’t have to sacrifice part of my desired feature set in the first week. So far so good on the voice recognition front.

By the end of the week, Huda is in the position where it displays images for the user to select, based on whether or not there are pictures in a particular folder. This is real progress, and I'm happy that we can use it to push forwards in week 2. Better still, the user can select one of those pictures and it opens up in the window in the background.

I’m not quite happy with the look of the back button in the folders – it’s still too disconnected, so I’ve changed it. Next week, I’ll add folder representations so that it’s apparent what these actually are, but as that’s just a minor template change, I’m going to leave it for now. Here’s a sample of Huda in action, opening up folders and choosing a picture to view.

Keeping my head together

So, what do I do to keep my attention on the project? How do I keep focussed? Music and a lot of cola. So, for your delectation, this week's playlist included:

  • AC/DC – For those about to rock (one of my favourites)
  • David Lee Roth – Eat ‘em and Smile/Skyscraper
  • Andrea Bocelli – Romanza
  • Black Veil Brides – Set the world on fire
  • Deep Purple – Purpendicular
  • Herb Ellis and Joe Pass – Two for the road
  • The Angels – Two minute warning
  • Chickenfoot – Chickenfoot III

Each week, I’ll let you know what I’ve been listening to, and let’s see if you can judge what part of the application goes with what album. Who knows, there may be a correlation on something or other in there – someone may even end up getting a grant out of this.

Final thoughts for week 1

This has been a busy week. Huda has reached a stage where I can start adding the real meat of the application – the filters. This is going to be part of my push next week; getting the filter management and photo saving into place. The photo is there for the user to see, but that’s nowhere near enough so we’ll be looking at bringing the different parts together, and really making that UI pop out. By the end of next week, we should have all the filters in place – including screens for things like saturation filters. This is where we’ll start to see the benefits of the Ultrabook because we’ll offer touch optimised and keyboard/mouse/touch optimised screens depending on how the Ultrabook is being used.

Right now, the Perceptual features aren’t too refined, and I’ve not even begun to scratch the surface of what I want to do with the Ultrabook. Sure, I can drag things around and select them with touch, but that’s not really utilising the features in a way that I’d like. Next week, I’m incorporating some user interface elements that morph depending on whether you are using the Ultrabook in a desktop mode, or as a tablet. For example, I’ll be adding some colour adjustment filters where you can adjust values using text boxes in desktop mode, but I’ll be using another mechanism for adjusting these values when it’s purely tablet (again, I don’t want to spoil the surprise here, but I think there is a pretty cool alternative way of doing this).
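Without spoiling the surprise, one way the desktop/tablet switch can be detected is via the convertible slate mode metric that Windows 8 exposes through GetSystemMetrics. This is a sketch of one possible approach rather than necessarily the mechanism Huda will use; the constant value comes from the Windows 8 SDK headers.

using System.Runtime.InteropServices;

public static class ConvertibleMode
{
    // SM_CONVERTIBLESLATEMODE from the Windows 8 SDK headers.
    private const int SM_CONVERTIBLESLATEMODE = 0x2003;

    [DllImport("user32.dll")]
    private static extern int GetSystemMetrics(int index);

    // GetSystemMetrics returns 0 when the convertible is in its slate/tablet form.
    public static bool IsInTabletMode()
    {
        return GetSystemMetrics(SM_CONVERTIBLESLATEMODE) == 0;
    }
}

The interesting part, of course, is reacting to the change – which is where the morphing UI elements come in.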

The big challenge this week has been putting together this blog entry. This is the area where the solo contestants have the biggest disadvantage – time blogging is time we aren’t coding, so there’s a fine line we have to tread here.

One thing I haven’t articulated is why I’m using WPF over WinRT/Metro for the application development. As I’ve hinted, I’ve a long history with WPF, and I’m a huge fan of it for developing applications. On the surface (no pun intended), WinRT XAML apps would appear to be a no brainer as a choice, but there are things that I can do quickly in WPF that will take me longer to achieve with WinRT XAML, simply because WPF is feature rich and XAML support in Windows 8 has a way to go to match this. That’s not to say that I won’t port this support across to WinRT at some point, but as speed is of the essence here, I want to be able to use what I know well, and what I have a huge backlog of code that I can draw on as and when I need to.

I’d like to thank the judges, and my fellow contestants for their support, ideas and comments this week. No matter what happens at the end of this contest, I can’t help think that we are all winners. Some of the ideas that the other teams have are so way out there, that I can’t help but want to incorporate more. Whether or not I get the time to add extra is something that is up for grabs, but right now I think that I want to try and bring gaze into the mix as an input, possibly to help with people with accessibility issues – an area that I haven’t really explored with the gesture SDK yet.

Ultimate Coder: Going Perceptual – Week 1 Blog Posting.

February 19, 2013 1 comment

This is a copy of the post I made on the Intel site here. For the duration of the contest, I am posting a weekly blog digest of my progress with using the Perceptual Computing items. The first week's post is really a scene-setter where I explain how I got to this point, and details bits and pieces about the app I intend to build. My fellow competitors are planning to build some amazing applications, so I'd suggest that you head on over to the main Ultimate Coder site and follow along with their progress as well. I'd like to take this opportunity to wish them good luck.

A couple of months ago, I was lucky enough to be asked if I would like to participate in an upcoming Intel® challenge, known at the time as Ultimate Coder 2. Having followed the original Ultimate Coder competition, I was highly chuffed to even be considered. I had to submit a proposal for an application that would work on a convertible Ultrabook™ and would make use of something called Perceptual Computing. Fortunately for me, I'd been inspired a couple of days earlier to write an application and describe how it was developed on CodeProject – my regular hangout for writing about things that interest me and that haven't really been covered much by others. This seemed to me to be too good an opportunity to resist; I did some research on what Perceptual Computing was, and decided I'd write the application to incorporate features that I thought would be a good match. As a bonus, I'd get to write about this and, as I like giving code away, I'd publish the source to the actual Perceptual Computing side as a CodeProject article at the end of the competition.

Okay, at this point, you’re probably wondering what the application is going to be. It’s a photo editing application, called Huda, and I bet your eyes just glazed over at that point because there are a bazillion and one photo editing applications out there and I said this was going to be original. Right now you’re probably wondering if I’ve ever heard of Photoshop® or Paint Shop Pro®, and you possibly think I’ve lost my mind. Bear with me though, I did say it would be different and I do like to try and surprise people. 

A slight sidebar here. I apologise in advance if my assault on the conventions of the English language becomes too much for you. Over the years I've developed a chatty style of writing and I will slip backwards and forwards between the first and third person as needed to illustrate a point – when I get into the meat of the code that you need to write to use the frameworks, I will use the third person.

So what’s so different about Huda? Why do I think it’s worth writing articles about? In traditional photo editing applications, when you change the image and save it, the original image is lost (I call this a destuctive edit) – Fireworks did offer something of this ability, but only if you work in .png format. In Huda, the original image isn’t lost because the edits aren’t applied to it – instead, the edits are effectively kept as a series of commands that can be replayed against the image, which gives the user the chance to come back to a photo months after they last edited it and do things like insert filters between others, or possibly rearrange and delete filters. The bottom line is, whatever edit you want to apply, you can still open your original image. Huda will, however, provide the ability to export the edited image so that everyone can appreciate your editing genius.

At this stage, you should be able to see where I’m going with Huda (BTW it’s pronounced Hooda), but what really excited me was the ability to offer alternative editing capabilities to users. This, to me, has the potential to really open up the whole photo editing experience for people, and to make it accessible beyond the traditional mouse/keyboard/digitizer inputs. After all, we now have the type of hardware available to us that we used to laugh at in Hollywood films, so let’s develop the types of applications that we used to laugh at. In fact, I’ve turned to Hollywood for ideas because users have been exposed to these ideas already and this should help to make it a less daunting experience for users.

Why is this learning curve so important? Well, to answer this, we really need to understand what I think Perceptual Computing will bring to Huda. You might be thinking that Perceptual Computing is a buzz phrase, or marketing gibberish, but I really believe that it is the next big thing for users. We have seen the first wave of devices that can do this with the Wii and then the XBox/Kinect combination, and people really responded to this, but these stopped short of what we can achieve with the next generation of devices and technologies. I’ll talk about some of the features that I will be fitting into Huda over the next few weeks and we should see why I’m so excited about the potential and, more importantly, what I think the challenges will be.

Touch computing. Touch is an important feature that people are used to already, and while this isn't being touted in the Perceptual Computing SDK, I do feel that it will play a vital part in the experience for the user. As an example, when the user wants to crop an image, they'll just touch the screen where they want to crop to – more on this in a minute because this ties into another of the features we'll use. Now this is all well and good, but we can do more; perhaps we can drag those edits we were talking about around to reorder them. But wait, didn't we say we want our application to be more Hollywoody? Well, how about we drag the whole interface around? Why do we have to be constrained to look like a traditional desktop application? Let's throw away the rulebook here and have some fun.

Gestures. Well, touch is one level of input, but gestures take us to a whole new level. Whatever you can do with touch, you really should be able to do with gesture, so Huda will mimic touch with gestures, but that’s not enough. Touch is 2D, and gestures are 3D, so we really should be able to use that to our advantage. As an example of what I’ll be doing with this – you’ll reach towards the screen to zoom in, and pull back to zoom out. The big challenge with gestures will be to provide visual cues and prompts to help the user, and to cope with the fact that gestures are a bit less accurate. Gestures are the area that really excite me – I really want to get that whole Minority Report feel and have the user drag the interface through the air. Add some cool glow effects to represent the finger tips and you’re well on the way to creating a compelling user experience.

Voice. Voice systems aren’t new. They’ve been around for quite a while now, but their potential has remained largely unrealised. Who can forget Scotty, in Star Trek, picking up a mouse and talking to it? Well, voice recognition should play a large part in any Perceptual system. In the crop example, I talked about using touch, or gestures, to mark the cropping points; well, at this point your hands are otherwise occupied, so how do you actually perform the crop? With a Perceptual system, you merely need to say “Crop” and the image will be cropped to the crop points. In Huda, we’ll have the ability to add a photographic filter merely by issuing a command like “Add Sepia”. In playing round with the voice code, I have found that while it’s incredibly easy to use this, the trick is to really make the commands intuitive and memorable. There are two ways an application can use voice; either dictation or command mode. Huda is making use of command mode because that’s a good fit. Interestingly enough, my accent causes problems with the recognition code, so I’ll have to make sure that it can cope with different accents. If I’d been speaking with a posh accent, I might have missed this.

A feature that I’ll be evaluating for usefulness is the use of facial recognition. An idea that’s bubbling around in my mind is having facial recognition provide things like different UI configurations and personalising the most recently used photos depending on who’s using the application. The UI will be fluid, in any case, because it’s going to cope with running as a standard desktop, and then work in tablet mode – one of the many features that makes Ultrabooks™ cool.

So, how much of Huda is currently built? Well, in order to keep a level playing field, I only started writing Huda on the Friday at the start of the competition. Intel were kind enough to supply a Lenovo® Yoga 13 and a Gesture Camera to play around with, and I've spent the last couple of weeks getting up to speed with the Perceptual SDK. Huda is being written in WPF because this is a framework that I'm very comfortable in and I believe that there's still a place for desktop applications, plus it's really easy to develop different types of user interfaces, which is going to be really important for the application. My aim here is to show you how much you can accomplish in a short space of time, and to provide you with the same functionality at the end as I have available. This, after all, is what I like doing best. I want you to learn from my code and experiences, and really push this forward to the next level. Huda isn't the end point. Huda is the starting point for something much, much bigger.

Final thoughts. Gesture applications shouldn't be hard to use, and the experience of using them should be easily discoverable. I want the application to let the user know what's right, and to be intuitive enough to use without having to watch a 20 minute getting-started video. It should be familiar and new at the same time. Hopefully, by the end of the challenge, we'll be in a much better position to create compelling Perceptual applications, and I'd like to thank Intel® for giving me the chance to try and help people with this journey. And to repay that favour, I'm making the offer that you will get access to all the perceptual library code I write.

Altering my perception

February 12, 2013 7 comments

My apologies for not posting for a while; it’s been a pretty crazy couple of months and it’s just about to get a whole lot crazier. For those who aren’t aware, Intel® have started running coder challenges where they get together people who are incredibly talented and very, very certifiable and issue them with a challenge. Last year they ran something called the Ultimate Coder which looked to find, well, the ultimate coder for creating showcase Ultrabook™ applications. This competition proved so successful, and sparked such interest from developers that Intel® are doing it again, only crazier.

So, Ultimate Coder 2 is about to kick off, and like The Wrath of Khan, it proves that sequels can be even better than the original. The challenge this time is to create applications that make use of the next generation of Ultrabook™ features to cope with going "full tablet", and as if that wasn't enough, the contestants are being challenged to create perceptual applications. Right now I bet two questions are going through your mind; first of all, why are you writing about this Pete, and secondly, what's perceptual computing?

The answer to the first question lies in the fact that Intel® have very kindly agreed to accept me as a charity case developer in the second Ultimate Coder challenge (see, I can do humble – most of the time I just choose not to). The second part is a whole lot more fun – suppose you want to create applications that respond to touch, gestures, voice, waving your hands in the air, moving your hands in and out to signify zoom in and out, or basically just about the wildest UI fantasies you’ve seen coming out of Hollywood over the last 30 years – that’s perceptual computing.

So, we’ve got Lenovo Yoga 13 Ultrabooks™ to develop the apps on, and we’ve got Perceptual Camera and SDK to show off. We’ve also got 7 weeks to create our applications, so it’s going to be one wild ride.

It wouldn’t be a Pete post without some source code though, so here’s a little taster of how to write voice recognition code in C# with the SDK.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class VoicePipeline : UtilMPipeline
{
    private List<string> cmds = new List<string>();

    public event EventHandler<VoiceEventArgs> VoiceRecognized;

    public VoicePipeline() : base()
    {
        EnableVoiceRecognition();

        // The simple command vocabulary we want the SDK to listen for.
        cmds.Add("Filter");
        cmds.Add("Save");
        cmds.Add("Load");
        SetVoiceCommands(cmds.ToArray());
    }

    public override void OnRecognized(ref PXCMVoiceRecognition.Recognition data)
    {
        var handler = VoiceRecognized;
        if (data.label >= 0 && handler != null)
        {
            handler(this, new VoiceEventArgs(cmds[data.label]));
        }
        base.OnRecognized(ref data);
    }

    public async void Run()
    {
        // LoopFrames blocks until the pipeline closes, so run it on a background task.
        await Task.Run(() => { this.LoopFrames(); });
        this.Dispose();
    }
}

As the contest progresses, I will be posting both here on my blog, and a weekly report on the status of my application on the Intel® site. It’s going to be one wild ride.
