Thursday, March 22, 2018

AI researchers, wake up!

Picture from Jeff Bezos' Tweet
"Taking my new dog for a walk"
I have wanted to write this post for a while but kept putting it off due to other priorities. With the recent whistle-blowing about Cambridge Analytica and my MVP friend James Ashley starting an AI Ethics blogging initiative, it's about time.

I would say I'm fairly knowledgeable in the AI field, staying up to date with the research and currently exploring WinML in Mixed Reality. I also developed my own Deep Learning engine, aka Deep Neural Network (DNN), from scratch 15 years ago and applied it to OCR. It was just a simple Multi-Layer Perceptron, but hey, we trained it with a genetic algorithm, an approach that seems to be having a renaissance in the Deep Learning field as Deep Neuroevolution.
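In case you are curious what training a network with a genetic algorithm instead of backpropagation roughly looks like, here is a minimal, purely illustrative sketch; the toy task, tiny network and GA parameters are all made up for this post and have nothing to do with my old OCR engine:

```python
import numpy as np

# Illustrative sketch of neuroevolution: evolve the weight vector of a tiny
# network with a genetic algorithm instead of backpropagation.
# Task, network size and GA parameters are made up for demonstration only.

rng = np.random.default_rng(0)

# Toy task: approximate y = x1 AND x2 with a single sigmoid neuron.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

def fitness(weights):
    # Higher fitness = lower prediction error on the training set.
    logits = X @ weights[:2] + weights[2]
    preds = 1.0 / (1.0 + np.exp(-logits))
    return -np.mean((preds - y) ** 2)

# Start with a random population of weight vectors (2 weights + 1 bias).
population = rng.normal(size=(50, 3))

for generation in range(200):
    scores = np.array([fitness(w) for w in population])
    # Selection: keep the 10 fittest individuals as parents.
    parents = population[np.argsort(scores)[-10:]]
    # Reproduction: copy each parent 5 times and apply random mutations.
    population = np.repeat(parents, 5, axis=0) + rng.normal(scale=0.1, size=(50, 3))

best = population[np.argmax([fitness(w) for w in population])]
print("evolved weights:", best)
```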

If you are familiar with Deep Learning, skip this paragraph. If not: a DNN is, simply put, a simulation of the human brain with neurons connected by synapses. Artificial neural networks are basically lots of matrix computations where the learned knowledge is stored as synapse weight vectors and tuned via lots of parameters like neuron activation functions, network structure, etc.
These networks can be trained in different ways, but the most common these days is supervised learning: large training data sets are run through the DNN, the desired output is compared with the actual output, and the error is backpropagated until the actual output is good enough. Once the DNN is trained, it enters the inference phase, where it is fed previously unseen data; if it was trained correctly, it will generalize and provide the right output for unknown inputs (see the little sketch below).
The basic techniques are quite old and were gathering dust, but for a few years now there has been a renaissance, since lots of training data became available, huge computing power arrived in the form of powerful GPUs and new specialized accelerators, and AI researchers discovered GitHub and open source.
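To make the training and inference phases a bit more tangible, here is a minimal, purely illustrative sketch of a tiny Multi-Layer Perceptron trained with backpropagation; the toy XOR data, network size and number of epochs are arbitrary demo choices, not production settings:

```python
import numpy as np

# Minimal sketch of supervised training for a tiny Multi-Layer Perceptron.
# The data, network size and hyperparameters are made up for illustration.

rng = np.random.default_rng(42)

# Toy training set: 4 samples with 2 features each, labels are XOR of the inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Synapse weights: 2 inputs -> 4 hidden neurons -> 1 output neuron.
W1 = rng.normal(size=(2, 4))
W2 = rng.normal(size=(4, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(10000):
    # Forward pass: matrix multiplications plus activation functions.
    hidden = sigmoid(X @ W1)
    output = sigmoid(hidden @ W2)

    # Compare the desired output with the actual output...
    error = y - output

    # ...and backpropagate the error to adjust the synapse weights.
    d_output = error * output * (1 - output)
    d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)
    W2 += hidden.T @ d_output
    W1 += X.T @ d_hidden

# Inference phase: feed an input through the trained network.
print(sigmoid(sigmoid(np.array([[1, 0]]) @ W1) @ W2))  # should be close to 1
```

Real DNNs work the same way in principle, just with many more layers, millions of weights and far larger training sets.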

AI researchers, the question has to be: Should we do it? Not: Can we do it?

We live in the era of neural network AI and it has only just begun. Current AI systems are very specific and targeted at certain tasks, but Artificial General Intelligence (AGI) is becoming more and more interesting with advances in Reinforcement Learning (RL) like Google DeepMind's AlphaGo challenge.

RL is just in its infancy but already quite scary if you look at some research from game development innovators like SEED, who are literally using military training camp simulations to train self-learning agents.

They also applied it to Battlefield 1, with a couple of things still to improve, but nevertheless achieved some impressive results there. Does everyone else see an army of AIs in this video, or is it just me?

I'm sure the developers involved don't have bad goals in mind, and I can see the tech being nicely suited for computer game AI. But it's not just the age of Deep Learning; real-time ray tracing is moving forward as well. So why not render photorealistic scenes, use them as input to train your RL agents, and then deploy those agents to a real-world war machine?
Below is another video showing real-time ray tracing and self-learning agents. The video looks cute and all fun, but think about what happens if you replace the assets with a different scene.
Don't get me wrong, it's super impressive and I'm all in for advancements in tech, but we have to think about the implications in a broader context.

Oh wait, there's actually already a photorealistic simulation framework available to train autonomous drones and other vehicles; they just need to add real-time ray tracing now, but I guess that's already in the works. I was getting a little worried, but it's good that drones are only used in civilian scenarios.

Now look at the recent Boston Dynamics robots like Atlas or the SpotMini robot dog, which inspired Black Mirror for good reason.

But hey isn't it cute how Jeff Bezos is just taking his robot dog out for a walk?

We don't even have to look at the future of RL, AGI, scary robots and the fear of the singularity; we already have quite amazing achievements, especially with LSTM- and CNN-type neural networks, some of which already outperform humans. LSTMs are used for time-based information like speech recognition and synthesis, and CNNs for computer vision tasks.
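If you want to see how little code it takes to define these two network types nowadays, here is a minimal sketch using the Keras API (which comes up again later in this post); the layer sizes and input shapes are arbitrary examples, not taken from any of the systems mentioned above:

```python
# Illustrative sketch of the two network types mentioned above, using the
# Keras API. Layer sizes and input shapes are arbitrary demo choices.
from tensorflow import keras
from tensorflow.keras import layers

# CNN: typical for computer vision tasks, e.g. classifying 28x28 grayscale images.
cnn = keras.Sequential([
    layers.Conv2D(16, kernel_size=3, activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])

# LSTM: typical for time-based information, e.g. 100 timesteps with 40
# audio features per step, classified into 10 classes.
lstm = keras.Sequential([
    layers.LSTM(64, input_shape=(100, 40)),
    layers.Dense(10, activation="softmax"),
])

cnn.summary()
lstm.summary()
```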

The interesting and scary part is that artificial neural networks are almost a black box once trained and can draw conclusions the developer is not aware of. There's always that uncertainty with DNNs, at least for now, and it can have huge implications for racial, gender and religious profiling even if that was not the intent of the developer/researcher.

AI researchers, wake up.

Think twice before working on the next cool thing that raises your reputation in the research community. There's more at stake than your research and work silo: it's all of humanity. You have great power, and with that comes even greater responsibility.
Ask yourself constantly: What are the implications? Should we do it?

Let's look at some more amazing examples and their implications 

I could post tons of recent examples which are super cool if you wear the geek hat but super scary if you put on the ethics hat and think a bit about their implications.

Autonomous vehicles are the future, and I'm sure lots of us can't wait until we can relax during a long drive in our self-driving car, but the technology might not be ready yet for tests out in the wild, as Uber's deadly accident shows. No doubt human drivers are worse even today, but the perception is that AI is much better, and indeed it should be. My guess is that there's not enough training data available for all the edge cases yet: the object movement detection should have triggered a full stop, but it might have been interpreted as a moving light shadow, and even then the radar sensor should have caught it. We will see what the investigations find out, but I still think it's too early for these kinds of real-world tests.

NVIDIA has some amazing research ongoing with unsupervised image-to-image translation. Just watch the demo video below and then think about whether we can be sure that dashcam footage was really recorded during the day or at night.

Google's WaveNet and even more so Baidu's DeepVoice show impressive results for speech synthesis, taking samples of humans and then synthesizing their voice patterns. The amount of sample data needed to fake a person's voice is getting smaller and smaller, so not just public figures with lots of openly available samples but basically everyone can be imitated using text-to-speech.

It doesn't stop with audio synthesis. Researchers from the University of Washington made great progress with video synthesis. Play the embedded video and think about the implications this tech could have in the wrong hands.

You might have heard about deepfake videos being used mainly to generate fake celebrity porn, but even worse things like this were created.

There's light!

It's not all dark; there are of course just as many good examples that leverage modern AI for a good cause with few dark implications.

Impressive results have been achieved with AI lipreading that beats professional human lip readers, which can help many people live a better life.

Huge advances are also being made in the medical field, especially in computer vision tasks that automatically analyze radiology images like breast cancer mammograms or improve noisy MRI data.

Prominent companies like Google's DeepMind are also beginning to realize the implications of their work for humanity and have started ethics initiatives.

And then there's Facebook!

It's crazy how small a role ethical questions play for large companies like Facebook, which is hoarding billions of data sets from around the world that can be used for training. We provide them not just the input data sets but also the training output with our likes and clicks, and even just our scrolling behavior while reading. Add to that the huge investments in Facebook's AI Research group, which is hiring and growing some of the best AI talent in the industry.
Just look at some of FB's research areas like DeepText, which is super impressive and aims at "better understanding people's interests". Now ask yourself: for what? Ads? What kind of ads? What is an ad? Is a behavior-changing FB feed an ad?
And then you have companies like Cambridge Analytica, which crawled/acquired the data and abuse it to sell their information-warfare mercenary services to anyone, changing human behavior and altering elections.
Real-world war machines might not even be needed anymore; big data + Deep Learning + behavioral psychology is a dangerous weapon, if not the most dangerous one.
It's good to see Mark Zuckerberg apologizing for the issues, and we can only hope it will have real consequences and that the situation is actually still controllable at all.

And then there's YOU!

It should not just be the Elon Musks of the world warning about the impact of unethical AI; we as developers and researchers at the forefront of technology have a responsibility too and need to speak up. It's about time, so please think about it, share your thoughts and raise your concerns.

The creator of Keras, a popular Deep Learning framework, and a real expert in the field, shared his thoughts about Facebook and the implications of its massive AI research investments.
I couldn't agree more, so let me finish this post here with his must-read Twitter thread:

Make sure to read the whole thread here.
There's a little twist to it since François works for Google but I expect he sticks to his own principles.

AI researchers, wake up! Say NO to unethical AI! 

Wednesday, January 10, 2018

Big in Vegas! - HoloBeam at CES

Last year was exciting for our Immersive Experiences team, and we reached some nice milestones and coverage with our proprietary 3D telepresence technology HoloBeam.

I wrote a post for the Valorem blog with more details about the new version of HoloBeam we are showing at Microsoft's Experience Center (IoT Showcase) at CES 2018 in Las Vegas.
You can read it here:

Friday, November 10, 2017

Øredev is over - Content for Advanced Mixed Reality Development Talk

Just a few hours ago I finished my presentation at Øredev in Malmö. I really enjoyed the conference and the city with nice people, great vibes and very good talks. 
The title of my talk was "HoloLens 301: Advanced Mixed Reality Development and Best Practices" and it covered advanced Mixed Reality / HoloLens development topics, including a live demo, best practices and lessons we learned developing for the HoloLens since 2015. The room was full and I received positive feedback.

The slide deck can be viewed and downloaded here, but the main content is in the talk itself. The presentation was recorded and the video is here and embedded below.
The source code of the demo mentioned is here.

HoloLens 301: Advanced Mixed Reality Development and Best Practices

This was my last conference talk of 2017. See you in 2018 at another conference with new content! 

Tuesday, October 24, 2017

AWE - Recap of Augmented World Expo Europe 2017

Last week I had the pleasure of giving a talk at AWE Europe and was also able to attend it along with nearly 1,500 other attendees, 100+ speakers and 90+ exhibitors.
AWE was a great conference and allowed me to connect with old and new contacts but also to experience lots of new hardware, software and attend some very good sessions.
This blog post gives a little recap of the event and the things I experienced, using photos I took. It starts with the keynotes, then highlights a few great sessions, and finally covers a collection of new devices and experiences I saw at the expo. Make sure to read till the end and check out the great summary on the Valorem blog.
The content for my talk can be found here


AWE kicked off with a keynote by Ori Inbar, who was sporting a jacket with illuminated AWE letters. Definitely a fun way to start a conference!
All sessions were recorded and will be posted on the AWE YouTube channel.

A couple of gold and silver AWE sponsors were also invited on stage in a "press conference" session to announce their new products. 

emteq's new facial-capture glasses were particularly interesting:
Emteq’s unique OCOsense™ smartglasses contain many tiny sensors in and around the glasses frame and use AI/Machine Learning to read and interpret the wearer’s facial movements and expressions, without cameras, wires or restrictive headgear. OCOsense™ will be available in 2018 and will combine both real-time facial expression tracking with a cloud-based analytics engine, facilitating emotional response analysis for scientific and market research.

After the keynotes and press conference the different tracks started.
Ed from Scape kicked off the Creators/Developers track talking about the city-scale location tech he and his team are developing in London. He didn't share a lot of details as they are in semi-stealth mode, but the topic reminded me of the AR Cloud and how important large-scale location AR is.

Shortly after Ed, it was my turn with my take on Massive Mixed Reality.

Harald Wuest from VisionLib (a spin-off from Fraunhofer IGD) talked about their CAD-model-based tracking, which is similar to Vuforia's Model Targets.
Kudos to Harald for squeezing a live demo into his very short presentation time.

I was also able to try it out myself at VisionLib's booth and was impressed with the quality and how well it worked even under varying lighting conditions. At the booth I also got a paper test-target model, which I have already folded, and I'm eager to run some tests with their HoloLens API.

Alessandro Terenzi from AR-media presented a great overview of the history of AR tracking, starting with markers and going up to the current state of the art in object tracking with feature-map and model-based approaches.

pmd talked about their Time-of-Flight depth sensor, which is used in the Meta 2, some Google Tango devices and others. It can also be acquired as a standalone sensor.
The ambient-light invariance was really impressive, allowing the sensor to work outdoors as well.

A gentleman from Wayfair talked about their approach to bringing AR to retail with WebAR, currently leveraging experimental Chromium releases.

One of my favorite talks was given by Khaled Sarayeddine from optinvent. He broke down the optics used in different state-of-the-art AR see-through devices. I have never heard such a comprehensive and detailed overview of AR see-through tech.

Khaled Sarayeddine is one of the few internationally recognized optics experts and overall a very nice guy. I was able to catch up with him after his talk and he shared more interesting industry insights.

AWE Europe was held in Munich in southern Germany, home to lots of (automotive) industry, so there were also informative sessions covering the Digital Transformation of that industry and how it relates to AR/VR.

Adam Somlai-Fischer from prezi showed a preview of their exciting upcoming AR features.
Pretty cool stuff, and maybe we will be giving presentations via AR in the future.

I also attended a few interesting panels covering the VR game market, the art of storytelling and how VR/AR is being used as a new medium by artists.

Gartner presented their Top 5 Predictions about Digital Transformation, which were insightful as usual. Gartner estimates that by 2020 more than 100 million consumers will shop in AR and that mobile AR (which I call Classic AR) will be dominant at least until the end of 2018.


The exhibition area at AWE was packed with lots of interesting booths. In fact, there was so much amazing tech that the two days were not even enough to explore everything.

Smart Glasses

Classic AR will drive mass adoption for a while, but there's little doubt that HMDs are the real future of AR. I was able to try a few new smart glasses which are available on the market.
Disclaimer: This post is not a review, just a subjective write-up of my experience. Keep in mind that every device is on the market for a reason and fulfills a certain demand.

Smart Glasses without Tracking
These devices show a screen in front of one or both eyes but don't track the environment, so the overlays are not spatially integrated. The form factor is very lightweight.
Some of these devices use different light-guide optics (Optinvent, Epson, Google Glass), while others, like the Vuzix, use just a simple, small display but are highly modular and configurable.

The prototype from Fraunhofer FEP in Dresden uses beam splitting similar to the Meta 2's display tech. The Fraunhofer prototype is super low-power, and the screens also act as cameras performing eye tracking, so I was able to scroll the UI with my eye movements.

Smart Glasses with Tracking
The ODG R-9 provides a wide field of view, and the new DAQRI Smart Glasses also had a decent FoV. The DAQRI Smart Glasses have a nice form factor, running the computing unit in an external box that can be clipped to the belt. The rendering quality was great with the DAQRI as well.
Both devices' inside-out tracking capabilities still lag quite a bit behind the HoloLens, which is no surprise considering the HoloLens has a dedicated HPU and went through intense calibration.

Misc Artists 

More impressions from AWE and some interesting things I spotted like a few very immersive VR games and prototype shoes to walk in VR.

The artist Sutu performing live painting with the HTC Vive and TiltBrush.

A full-blown VR welding simulator using AprilTags to track real welding equipment. The helmet has a light array and two cameras outside and a screen inside showing the Augmented Virtuality experience. The metallic material simulation and the real hardware integration were very impressive.

The German startup Holo-Light showed a prototype of a pen they are developing called Holo-Stylus, which is supposed to provide 3D pen input. The device leverages optical infrared tracking using an IR camera bar mounted on top of the HoloLens. The prototype used Wi-Fi to connect with the HoloLens, but the final product is supposed to work via Bluetooth.
It's an interesting concept and I'm curious to try it out once it's more mature and works flawlessly. 

InsiderNavigation showcased their software solution for indoor navigation using SLAM-based point cloud mapping and AR overlays. 

Last but not least I was pleasantly surprised that Vuforia showcased our very own Tire Explorer HoloLens app at their booth as an example for Model Targets.