To Slam or not to Slam: That Is the Question - AR CLOUD EDITION

If you've been reading the augmented reality news, new startups have been building the AR Cloud every month since Ori Inbar at Super Ventures coined the term. And those are just the funded ones, so there must be at least 10 times more (us being one of them). Then Google announced that they are also working on the AR Cloud, which probably just put an expiry date on a bunch of them. The core of the AR Cloud gold rush is about SLAM (Simultaneous Localization and Mapping) again, and getting into this is a sexy but dangerous path.

Before Apple's ARKit and Google's ARCore were announced in 2017, the big thing for core-technology startups to work on was persistent AR using SLAM: a digital asset such as a 3D Mickey Mouse could be "anchored" to a specific location for a solo user, and if you returned to that location, it would still be there. I remember sitting down with our team back in 2014 and deciding not to get into mono-cam SLAM (we had the product plan for it) because we assumed the internet oligopolies would solve it. Yet we couldn't resist the temptation to try, again with a hybrid SLAM/Google Tango solution. (See the following video of the prototype we made back in 2017, where we could create a runtime 3D map of descriptors. This would enable a single user to re-localize or loop-close on their own map. The challenge was to optimize it to run more efficiently, as demonstrated by its lag.)

But as soon as Apple and Google made their announcement in 2017, startups who had spent millions building out their mobile SLAM tech were pretty much wiped out a month or two later, not because their tech was worse, but because they couldn't match the distribution and billions of dollars in future development that would be required.


Shapetrace: Runtime Descriptors and Relocalization, first attempt. This was done back on August 2, 2017 with Google Tango, since it didn't have runtime ADF creation.

So, the natural next step was to extend AR persistence to multiple users. This is basically the AR Cloud, which stores all the maps and the digital assets anchored to them. This time, even our own team got enamored AGAIN with this enormous technical challenge and tremendous opportunity, and we were actively charting a path. We even revisited creating another hybrid SLAM built on top of ARKit/ARCore. With our blinders on, we thought we could gain a competitive advantage in this space, forgetting that the ability to store spatial information in the AR Cloud is built by breaking out SLAM components and putting their outputs on the cloud for collaboration. This means creating a 3D map of point clouds, descriptors, or voxels, then storing it in the cloud for others to access. This would enable others to re-localize or loop-close on someone else's map.
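That last step — re-localizing against someone else's stored descriptor map — boils down to matching the descriptors seen in the current camera frame against those in the shared map, then feeding the resulting 2D-3D correspondences to a pose solver. Here's a minimal, purely illustrative sketch in Python/NumPy (toy random descriptors standing in for a real feature pipeline; none of this is a real AR Cloud API), showing nearest-neighbour matching with Lowe's ratio test:

```python
import numpy as np

def match_descriptors(query, map_desc, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test.
    query:    (Q, D) descriptors from the current frame.
    map_desc: (M, D) descriptors stored in the shared cloud map.
    Returns a list of (query_idx, map_idx) putative matches."""
    matches = []
    for qi, d in enumerate(query):
        dists = np.linalg.norm(map_desc - d, axis=1)
        best, second = np.argsort(dists)[:2]
        # Accept only matches clearly better than the runner-up.
        if dists[best] < ratio * dists[second]:
            matches.append((qi, int(best)))
    return matches

# Toy "cloud map": 50 landmarks, each with a 3D point and one descriptor.
rng = np.random.default_rng(0)
map_desc = rng.normal(size=(50, 32))
map_points = rng.normal(size=(50, 3))

# A query frame that re-observes map landmarks 3, 10, and 42 (with noise).
query = map_desc[[3, 10, 42]] + rng.normal(scale=0.01, size=(3, 32))
matches = match_descriptors(query, map_desc)
# Each match pairs a query descriptor with a 3D map point; those 2D-3D
# correspondences would then feed a PnP solver to recover the camera pose.
```

In a real system the descriptors would come from a feature extractor such as ORB and the matching would use an approximate index rather than brute force, but the shape of the problem — match against a map you didn't build, then solve for pose — is the same.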

Duh on me.

Google announced Google Cloud Anchors at their 2018 I/O conference, which is their first cut at the AR Cloud. Hindsight now tells me that of course Google and Apple already have these capabilities, because they have whole divisions dedicated to SLAM. They are simply holding back parts to refine them for mass distribution. So I'm relieved we didn't execute on this, but this time it was due to a lack of resources, as opposed to some brilliant business tactic. I was ready to go all-in on a hybrid SLAM engine (again) for cloud collaboration.

On reflection, the AR Cloud and the core technology challenges it has to solve for augmented reality are a billion-dollar problem. Ori Inbar predicted that the AR Cloud would likely be solved by the internet oligopolies. So this gets us thinking: what should startups do to survive?

Off the top of my head, here's a list of AR Cloud problems that I think will need to be solved. Startups will have to decide on the ones that Google and Apple can't do as quickly (or as well), so that they become acquisition targets. I'm not particularly confident that anyone else can displace the oligopolies' tech at this point:

  • Image recognition with relative pose re-computation upon trigger (ARKit/ARCore does this)
  • Pre-map a space, then share it with others: this is where a lot of startups are playing (Google Cloud Anchors does this)
  • Do actual loop closures on a map: (I don't think ARKit/ARCore do this at the moment due to the tax on the hardware... yet)
  • Runtime collaborative mapping of a space for loop closure: the maps from multiple users get uploaded to the cloud, where the 3D map is reconstructed and broadcast in real time back to the users
  • Real-time occlusions
  • Mediated reality
  • A robust way to deal with different lighting conditions (might be an ML integration?)
  • A whole bunch of software and hardware shit related to smartglasses and even mobile phones...
  • This list will grow...
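To make the "runtime collaborative mapping" item above concrete, here's a toy Python sketch (all names hypothetical; this resembles no real AR Cloud service) of a server that merges point maps uploaded by multiple users, de-duplicating observations of the same landmark by snapping points to a voxel grid:

```python
import numpy as np

class CloudMapServer:
    """Toy stand-in for an AR Cloud mapping service: each user uploads
    locally-mapped 3D points, the server merges them into one shared map
    by de-duplicating points that land in the same voxel, and every
    client can then download the merged map to localize against."""

    def __init__(self, voxel_size=0.1):
        self.voxel_size = voxel_size
        self.voxels = {}  # voxel index (3-tuple) -> representative 3D point

    def upload(self, user_id, points):
        for p in points:
            key = tuple(np.floor(p / self.voxel_size).astype(int))
            self.voxels.setdefault(key, p)  # first observation wins

    def download(self):
        return np.array(list(self.voxels.values()))

server = CloudMapServer(voxel_size=0.1)
# Two users map overlapping parts of the same room.
server.upload("alice", np.array([[0.02, 0.0, 0.0], [1.02, 0.0, 0.0]]))
server.upload("bob",   np.array([[1.04, 0.0, 0.0], [2.02, 0.0, 0.0]]))
merged = server.download()
# Bob's first point falls in the same 10 cm voxel as Alice's second,
# so the merged map holds 3 points, not 4.
```

A production system would of course merge descriptors and keyframe poses (and run bundle adjustment) rather than bare points, and push updates back to clients over the network, but the core idea — many partial maps in, one consistent map out — is what the bullet describes.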

So what's a startup to do otherwise?

I had a brief chat with Neil Mathews from Vertical, who make Placenotes (an awesome AR Cloud implementation, so you should try it), and he felt startups should go after niche markets that the oligopolies would never enter, which means integrating the AR Cloud into a consumer or enterprise workflow for a specific application.

Else, there is always room for an open-source AR Cloud, similar to Linux vs. Windows or Mapbox vs. Google Maps.