Monday 25 February 2013

Lee Going Perceptual : Part Two

So what does Lee have in store for you this week? You may wonder, but you won't have to wonder for long, as I have provided a nice big video for those who don't have the stomach for the reams of text to follow.



The Promised Meat

I thought I would give you a day-in-the-life account to track the progress of the prototype as it was being built. This part of the blog gets highly fluffy and very techie, so I advise readers to skip it and jump past all the date stamps. If you absolutely must drill down into the basement, read on:

01:48 Dug Out A DBP Module and Cuppa

I have decided for the sake of expedience to 'graft' the PerC stuff onto the Basic3D module, the central 3D command set of DBP. I know it's hacky and not very responsible, but there is method to the madness.

I will be able to share my modifications publicly through Google Code when I have finished my PerC stuff, so other DBP users can benefit from them. It also means the contaminating code does not affect the new VS2010 build of the modules, which I intend to overhaul before digging into the guts of DBP for another development.

02:44 Two New Commands and a Blob

I've created the first of what may be many DBP apps for this project, and added two new hacky commands called MAKE OBJECT PERCBLOB and UPDATE OBJECT PERCBLOB. I have added code into the module to create and modify the vertices of a basic sphere so I can see this on screen.


Now that I know everything is running fine, and that I can see and manipulate the 3D, I just have to replace the mesh with something that represents the depth data in some reasonable way. It means if anything goes wrong, it's nothing to do with my 3D, just the data and the code that finds the data. Clever, huh!

02:58 Ambitious Way

I was just about to create a basic vertices-only grid, then realised that later on I would have to change four vertices for every coordinate in the depth buffer that changed. It is times like this that you realise taking the slightly longer route during early development will save headaches later. I am going to use an index buffer and keep to one vertex per depth reading, which will give the 3D object a smaller footprint and make changing it MUCH easier.
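
To make the idea concrete, here is a minimal C++ sketch of the plan, with my own illustrative names rather than anything from the Basic3D module: one vertex per depth sample, and six indices per grid cell, so only the vertex depths need touching each frame.

#include <vector>
#include <cstdint>

struct Vertex { float x, y, z; };

// build a W x H grid with one vertex per depth reading and an index
// buffer describing two triangles (six indices) per grid cell
void BuildGrid ( int iW, int iH, std::vector<Vertex>& verts, std::vector<uint32_t>& indices )
{
   verts.resize ( (size_t)iW * iH );
   for ( int y = 0; y < iH; y++ )
      for ( int x = 0; x < iW; x++ )
         verts [ (size_t)y*iW + x ] = { (float)x, (float)y, 0.0f };

   for ( int y = 0; y < iH-1; y++ )
   {
      for ( int x = 0; x < iW-1; x++ )
      {
         uint32_t i = (uint32_t)(y*iW + x), w = (uint32_t)iW;
         // when a new depth frame arrives only each vertex's Z changes;
         // the index buffer never needs rebuilding
         indices.insert ( indices.end(), { i, i+1, i+w, i+1, i+w+1, i+w } );
      }
   }
}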

03:16 Who'd Be A Programmer

It turns out a 320x240 grid at six indices per face works out at over 400K indices, and a 16-bit index buffer has a maximum value of 65,535. The current DBP uses WORD index values (designed many moons ago when WORDs were faster and more memory efficient than gorging on 32-bit index buffers). The natural step is to go back to vertex-only, where I can have a very large mesh. That said, I don't think I will need (or want) to use the entire 320x240 depth area for the final 3D object (unless the subject is extremely fat and wide). As I am both, if I can squeeze myself into a single index buffer it should be good for most users. I can have 10,922 faces, which works out at roughly a 45x40 capture area; that is okay to start with. If I had more time, I would just bite the bullet, step up to DirectX 11 and get mucky with tessellation and geometry shading tricks, but I don't have the luxury of time here. I will proceed with my little 45x40 grid and basically detect the best place to grab the depth data from, as it's probably not far from the actual area I am interested in.
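
For anyone checking my sums, the arithmetic pans out like this (a compile-time sanity check; the constant names are mine):

constexpr int kGridW   = 320, kGridH = 240;
constexpr int kCells   = (kGridW-1) * (kGridH-1);   // 76,241 quad faces
constexpr int kIndices = kCells * 6;                // 457,446 indices - the "over 400K"
constexpr int kVertexOnlyFaces = 65536 / 6;         // 10,922 faces at six vertices each
static_assert ( kIndices > 65535, "a WORD index buffer cannot address this grid" );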

03:29 Smug Mode Times Two

I adjusted the size, and my entire vertex+index construction code compiled and ran first time perfectly (almost). When I ran the app there was no 3D object. Thanks to the fact I had added a CONTROL CAMERA USING ARROWKEYS command to my DBP program, I was able to go for a short walk to view the other side of my object. And presto, there it was. The face culling winding order was reversed, that's all. I could have been hammering at that for ages wondering where my 3D went, but after over ten years of messing with 3D graphics I know that old chestnut!
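
For fellow 3D tinkerers, the two classic escape routes are to swap any two indices of each triangle (which flips the facing), or to switch culling off while you investigate. A minimal Direct3D 9 sketch of the latter, assuming you have a device pointer to hand:

#include <d3d9.h>

// quick diagnostic: stop Direct3D 9 culling anything, so a mesh with
// reversed winding still shows up (the default mode is D3DCULL_CCW)
void DisableCulling ( IDirect3DDevice9* pDevice )
{
   pDevice->SetRenderState ( D3DRS_CULLMODE, D3DCULL_NONE );
}

The DBP equivalent is the SET OBJECT CULL command you will spot in the final code listing below.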

03:44 Make Short Video

I've just made a short video of the current state of the DBP app, with the wibbly wobbly 3D grid, ready to have depth data added onto it. 



Adding the depth data is a tense moment, as I will be adding lots of new headers and dependencies from the SDK sample code, and many wonderfully bad things can happen. Fingers crossed; except mine, as I need them for typing.

04:24 HOT TIP : Warning For PerC SDK C++ Users

Been banging my head against a wall wondering why my cut-and-paste code was not linking properly, so I decided to compare the decorated name my project was creating with the one used by the PerC SDK Util Library. It turns out you cannot link a project that switches off wchar_t as a built-in type against a library that keeps it on, as the decorated names no longer match and this confuses the linker. The option lives under Project Properties > C/C++ > Language > "Treat WChar_t As Built-in Type", and can also be changed via the /Zc:wchar_t compiler switch.
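
To illustrate the mismatch (UtilFunc is a made-up stand-in for the SDK's utility functions):

// the same declaration decorates two different ways depending on the
// /Zc:wchar_t setting, and the linker will not pair them up:
void UtilFunc ( const wchar_t* pszName );
// /Zc:wchar_t  (built-in type ON)  : the parameter mangles as the distinct wchar_t type
// /Zc:wchar_t- (built-in type OFF) : wchar_t is a typedef for unsigned short,
//                                    so the import never matches the library's export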

07:13 Way Too Much Fun

I just want to report that I became absorbed with the 3D version of myself. Once I got the basic representation of depth working in 3D, I just kept going: adding normalisation, averaging vertices, tweaking depth scope and scale, all sorts of tweaks. I have to stop though, as I need to produce a blog video.
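
For the curious, the guts of that tinkering looks something like this sketch (my own simplified names, not the SDK's): scale the raw depth readings into the object's Z range, and calm the noise with a little neighbour averaging.

#include <cstdint>

// map raw depth samples onto the grid's vertex Z values, smoothing each
// sample with its four neighbours to stop the mesh shimmering
void DepthToVertices ( const uint16_t* pDepth, int iW, int iH, float* pVertexZ, float fScale )
{
   for ( int y = 1; y < iH-1; y++ )
   {
      for ( int x = 1; x < iW-1; x++ )
      {
         int i = y*iW + x;
         int iSum = pDepth[i] + pDepth[i-1] + pDepth[i+1] + pDepth[i-iW] + pDepth[i+iW];
         pVertexZ[i] = ( iSum / 5.0f ) * fScale;
      }
   }
}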

The Final DBP Code So Far

Rem Project: PercA.exe

Rem Created: Saturday, February 23, 2013
`
rem App Init
sync on : sync rate 0 : color backdrop rgb(255,128,0)
`
rem Make 3D floor
make matrix 1,1000,1000,100,100
position matrix 1,-500,-55,-500
`
rem Make 3D Object
load image "brick.png",1
make object percblob 1,50,50,50
set object cull 1,0
`
rem Place camera
position camera 0,0,75
point camera 0,0,0
`
rem Add lights
make light 1 : set directional light 1,0,0.9,-0.1 : color light 1,512,255,0
make light 2 : set directional light 2,0,-0.9,-0.1 : color light 2,0,255,512
make light 3 : set directional light 3,-0.5,0.0,-0.2 : color light 3,-100,255,-100
`
rem Main loop
do
 `
 rem Move camera around
 control camera using arrowkeys 0,1,2
 set point light 0,camera position x(),camera position y(),camera position z()
 set light range 0,200
 `
 rem Each cycle refresh 3D data in object
 update object percblob 1,0,0,0
 `
 rem Prompts
 set cursor 0,0
 print screen fps();"fps"
 if spacekey()=1 then texture object 1,1
 if returnkey()=1 then texture object 1,0
 `
 rem Update screen
 sync
 `
rem End loop
loop

The Progress In Brief

So what you have seen is the depth data from the Perceptual Camera used to generate a 3D construct in a prototype that allows me to move around the object and view it from different angles.

I discovered that there is enough fidelity in the depth data to create a good face shape and with further work I can produce more striking 3D elements. I am also happy with the speed of everything so far, and will probably stick to detecting my own gestures and intents using this raw data too.

The technical side of getting the SDK into the DBP module was painless, apart from a sticky moment with the wchar_t type. I now have a good foundation on which I can add DBP BASIC and C++ willy-nilly to solve the various challenges ahead.

The Prototype Binary

If you have a Perceptual Camera all set up, you are welcome to try the prototype yourself. Find it at the end of this link:


I will offer the disclaimer that this has only been tested on one machine and is not guaranteed to work on any other. If it does work, please comment and let me know, as I need all the testers I can get.

Signing Off

The next step will be to round off the 3D object so it's a real head instead of a rubber wall with protrusions. I also want to get it textured and transmitted to another client sooner rather than later. It's always a good idea to get the main chunks of your functionality in early so you know what general shape your app is going to be. I am happy with the shape of the app at week two, and am looking forward to seeing my Perceptual 3D gubbins evolve.

Monday 18 February 2013

Lee Going Perceptual : Part One

Welcome to the start of another exciting adventure into the weird and wonderful world of cutting edge software development. Beyond these walls lurk strange unfathomable creatures with unusual names, ready to tear the limbs off any unsuspecting coder careless enough to stray from the path.



For me, this is a journey for off-piste adventurers, reckless pioneers and unconventional mavericks. We're off the chain and aiming high so prepare yourself, it's gonna get messy!

A Bit About Me

My name is Lee Bamber, co-founder of The Game Creators (circa 1999). I have been programming since the age of nine, and thanks to my long tenure as a coder my credits now include DarkBASIC, The 3D Gamemaker, FPS Creator, App Game Kit and, soon to be developed, FPSC-Reloaded. I have been invited by Intel to participate in a challenge to exploit cutting edge technologies, which, as we say in the UK, is right up my street.


The Challenge



The basic premise of the challenge is to put six elite developers (and me) into a 'room' and ask them to create an app in seven weeks that takes full advantage of a convertible Ultrabook, a gesture camera, or both. We will be judged on our coding choreography by a panel of hardened industry experts who expect everything and forgive nothing. At the end, one coder will be crowned the 'Ultimate Coder' and the rest will be consigned to obscurity.



The Video Blog

I will be posting video versions of my blog every week to communicate with expansive arm gestures and silly accents what I cannot describe in text. Hopefully amusing, sometimes informative, certainly low-budget.




The Real Blog


I do like to blog, but a video of me can be too much for the mind to cope with, so I have provided copious amounts of text as a safer alternative. In these pages you will learn about the technical details of the project, any useful discoveries I make and the dangers to be avoided.


The Idea

For this challenge I want to create a new kind of Web Cam software, perhaps even the sort of app you would find bundled with the hardware when you buy the product. Hardware manufacturers only bundle software that has mass-market appeal, showing off the best of what their device has to offer. Rather than shoe-horn the technology into something I was already doing, or come up with crazy ideas around what I could do with these wonderful new toys, I wanted to produce a relevant app. An app that users want; something that relates this new hardware to the needs of the human user, not the other way around. If my app can fix an existing problem, improve a situation or open a new door, then I will have created a good app.


The Perceptual Computing Myth

Forget the movies! That scene out of such-and-such was not designed with good computer interaction in mind; it was created to entertain. We all know large physical keyboards are better for writing blogs than virtual keyboards or voice dictation. Simple fact. Ask Hollywood for a futuristic keyboard and they'd replace it with a super-intelligent robot, writing the blog for you and correcting your metaphors.

In the real world, we like stuff that 'just works'. The better it works for us, the more we like it. The keyboard works so well we've been using it for over 140 years, but only for writing text. You would not, for example, use it to peel potatoes. Similarly, we would not use Perceptual Interfaces to write a blog, nor would we use them to point at something right in front of our nose; we'd just reach out and touch it.

Context is king, and just as you would not chop tomatoes on your touch tablet, there will be many scenarios where you would not employ Perceptual Computing. What those scenarios are, and to what degree this new technology will improve our lives, remains to be seen. What I do know is that app developers are very much on the front line and the world is watching!


The Development Setup

To create my masterpiece, I have a few tools at my disposal.  My proverbial hammer will be the programming language Dark Basic Professional. It was designed for rapid software development and has all the commands I need to create anything I can dream up.

I will be using an Ivy Bridge-based desktop PC running at 4.4GHz for the main development, and a Creative Gesture Camera device for camera and depth capture.




The Gesture Camera & SDKs

I have created a quick un-boxing video of the Perceptual device I will be using, which comes with a good-sized USB cable and a handy mounting arm that sits very nicely on my ageing Sony LCD.



The SDKs used will be the Intel Perceptual Computing SDK Beta 3 and the accompanying Nuance Dragon voice SDK.


The Convertible Ultrabook

To test my app for final deployment and for usage scenarios, I will be using the new Lenovo IdeaPad Yoga 13. This huge yet slim 13-inch Ultrabook converts into a super-fast touch tablet, and it will be interesting to see how many useful postures I can bend the Ultrabook into over the course of this competition. Here is a full un-boxing video of the device.



I also continued playing with the Yoga 13 after the un-boxing and had a great time with the tablet posture. I made a quick video so you can see how smooth and responsive this form factor is. Very neat.






The State Of Play


As I write this, there is no app, no design and no code. I have a blank sheet of paper and a few blog videos. The six developers I am competing against are established, seasoned and look extremely dangerous. My chances of success are laughable, so given this humorous outcome, I'm just going to close my eyes and start typing. When I open them in seven weeks, I'll either have an amazing app or an amazing lemon.



My Amazing Lemon

Allow me now, with much ado, to get to the point. The app I am going to create for you today will be heralded as the next generation of Web Cam software. Once complete, other webcam software will appear flat and slow by comparison. It will revolutionise remote communication, and set the standard for all web camera software.

The basic premise will be to take the depth information captured from the Gesture Camera and convert it into a real-time 3D mesh. The colour information from the regular camera output will then be used to create a texture for the 3D mesh. These assets are streamed to a client app running on another computer, where the virtual simulation is recreated. By controlling the quantity of data streamed, a reliable visual connection can be maintained where equivalent video streaming techniques would fail. Additionally, such 3D constructs can be used to produce an augmented virtual environment for the protagonists, effectively extracting the participants from the real world and placing them in artificial environments.
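
To give a flavour of the "controlling the quantity" part (an illustrative sketch only, not a committed design): the simplest throttle is to stream every Nth depth sample and let the client fill in the gaps, so doubling the step divides the payload by four.

#include <vector>
#include <cstdint>

// pack a reduced depth frame for streaming; iStep=2 sends a quarter of
// the samples, iStep=4 a sixteenth, trading detail for bandwidth
std::vector<uint16_t> PackDepthForStream ( const uint16_t* pDepth, int iW, int iH, int iStep )
{
   std::vector<uint16_t> packet;
   packet.reserve ( (size_t)(iW/iStep) * (iH/iStep) );
   for ( int y = 0; y < iH; y += iStep )
      for ( int x = 0; x < iW; x += iStep )
         packet.push_back ( pDepth [ y*iW + x ] );
   return packet;
}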

Such environments could include boardrooms for serious teleconferencing or schoolrooms for remote teaching. Such measures also protect privacy by allowing you to control the degree to which you replace the actual video footage, from pseudo-realistic 3D to completely artificial. You could even use voice recognition to capture your voice and submit the transcript to those watching your webcam feed, protecting your identity further.

And that's just the start. With real-time access to the depth information of the caster, you can use facial tracking to work out which part of the 3D scene the speaker is interested in. The software would then rotate the camera to focus in on that area, much as you would in real life. Your hand position and gestures could be used to call up pre-prepared material for the web cast, such as images, bullet points and video footage, without having to click or hunt for the files. Using voice recognition, you could bring likely material to the foreground as you speak, and use gestures to throw that item into the meeting for the rest of the group to see.

Current web cam and web casting technologies use the camera in an entirely passive way. All interaction is done with keyboard and mouse. In the real world you don't communicate with other humans by pressing their buttons and rolling them around on the floor (er, most of the time). You stand at arm's length and you just talk; you move your arms and you exchange ideas. This is how humans want things to work.

By using Perceptual Computing technology to enable this elevated form of information exchange, we get closer to bridging the gap between how humans communicate through computers and how they communicate face to face.




Signing Off

Note to judges: quality development is one part inspiration and ten parts iteration. If you feel my blog is too long, too short, too wordy, too nerdy or too silly, or my app is too ugly, too confusing, too broken or too irrelevant, I insist you comment and give me your most candid response. I work with a team who prize brutal honesty, with extra brutal. Not only can I handle criticism, I can fold it like origami into a pleasing shape.

Congratulations! You have reached the end of my blog post.  Do pass go, do collect £200 and do come back next week to hear more tantalising tales of turbulence and triumph as I trek through trails of tremendous technology.

NOTE: This blog is also published officially on the IDZ site at: http://software.intel.com/en-us/blogs/2013/02/17/ultimate-coder-challenge-ii-lee-going-perceptual-part-one