visionOYes
I felt, well in advance of Tim Cook unveiling the Apple Vision Pro at WWDC, that it was nearly impossible for it to be a product that appealed to me. Other people knew well in advance that they absolutely wanted whatever it was. It could have been an Apple-branded ViewMaster and they’d want it. I don’t seek to tell those enthusiastic people that they’re wrong, far from it, but I’ll explain why I don’t want anything on my face, and I feel like my explanation might be applicable to other people as well.
After working on stereoscopic “3D” movies for many years, I know this well. We would sit at our desks, with our active shutter glasses, and work for hours. We would go to the small screening room, equipped with a projector and polarized glasses, and we would try to get final approval there because it was better than the active shutter glasses. It’s not fun to wear stuff for extended periods of time, even the uncomfortable active shutter glasses, which are featherweight compared to the world-building power of the Vision Pro. Thus I was unable to envision anything I’d wear on my face when the rumors were circulating. It’s not for me, and I suspect it’s not for some other people either. There are ways to shave down the device here or there over time, or redistribute the weight, like the top strap only visible in one shot of the WWDC keynote. But it will always require pressing something to your face because that’s how it has to function. Even the electrostatic paper masks we’ve all used leave unpleasant creases on our faces, or pinch in the wrong spot (and I gladly masked the fuck up out of necessity).
What I was absolutely enamored with, though, was visionOS. Not from the WWDC keynote presentation, which just made it seem like a movie computer interface, but from the WWDC developer videos released after. I highly recommend watching those regardless of your level of skepticism about the hardware. Functionally, it seems like such a natural and organic extension of interaction metaphors we’re already using, while at the same time being adapted to inputs in space. What was unclear in the keynote was that your eyes are your “cursor,” which is natural because they are your focus. Your hands are at your side like you’d have them for a touchpad or mouse. The array of cameras and sensors monitoring your hands and eyes makes this all possible.
It made me want to use visionOS … just not with a Vision Pro.
I know that might sound a little contradictory, and silly, but I’d rather sit at my two UHD monitors with a camera pointed at my face, and move windows around inside the confines of those monitors, than wear anything. With all the complaining people have done about not having touchscreen MacBooks, imagine not having to press anything on a MacBook just to scroll a web page. Hell, what if (not to go all Gene Munster) they shipped a HomePod/AppleTV that had a series of projectors and those hated passive glasses for people to use exclusively in dark rooms?
I mean, that’s not going to happen, but that’s where my mind went. Apple does carry its effort on one platform over to its other platforms, in some scenarios, so some cross-pollination might be possible, but that really is wishcasting.
With Apple’s focus on building a headset empire, I guess I’ll return to critiquing that product and how Apple currently pitches using it:
- 2D and 2.5D windows arranged in space to do office work and web browsing.
- Teleconferencing.
- 2D and stereoscopic theatrical experiences.
- 3D family photos.
- Immersive locations.
Notably absent was gaming. Everyone expected Hideo Kojima’s presence in Cupertino to be tied to this headset, but it was for porting old games. At this point we should really all know better than to expect anything significant with gaming. It’s for the best that they don’t either, because Apple doesn’t believe in making game controllers. There still isn’t one for the Apple TV, and it doesn’t matter how many times Apple says you can bring your own controller to use, it’s not the same thing. Controllers are a shortcoming of competing VR headsets because you have to use them, but the benefit of having them is mostly physical feedback. With the Vision Pro, no physical response of any kind is present. Functionally, every interaction people have seems to be at more than arm’s length.
Let’s talk about those arm’s-length interactions:
Cocooned in Email and Spreadsheets
Windows in virtual spaces are nothing new, and honestly I’m happy Apple didn’t try to do some bizarre 3D application interface. Emails should just look like emails, and spreadsheets should just look like spreadsheets.
That’s not to say that I have any idea why anyone would want to work on their email or spreadsheets with a headset on. That seems like something for ardent futurists and not practical for large groups, let alone office environments. Even virtualizing screen real estate doesn’t seem to be a tremendous boon if you’re going to get eyestrain from lower text resolution. (WWDC videos note that body text weight should be increased to be legible in the headset, if that helps you visualize how the displays in the headset aren’t exactly like having multiple real monitors.)
Safari seems like a better use case, because that is laid-back, on-the-couch stuff. You’re shopping, or reading sub-par blogs like this one, and it’s more about consumption than work.
Calls From Creeps
A poorly received part of the WWDC keynote was the teleconferencing story. Apple doesn’t want you to feel cut off from people, which is why they have the creepy eyes on the outside, but making a full creepy avatar of a person to have calls with isn’t helping. That persona, as they call it, has more in common with a Sim than a human being. Not just in terms of performance (seriously, look at that mouth move) but in terms of the qualities we expect in a video conference call.
99% of people are bad at video calls, but they’re still fundamentally people. We see their messy rooms, not particle-cloud voids. We see their cats, their dogs, their kids, what they’re drinking, what they’re wearing. For people that don’t want to participate in that, we have this amazing technology where the camera doesn’t turn on.
It’s also pretty telling that Apple doesn’t offer up Continuity Camera so the people you’re on the call with can see you with your Daft Punk headset and creepy eyes, given that Apple considers that to be a real-world solution for talking to flesh and blood.
The real teleconferencing solution is that you just take off your headset.
Best Seat in the House
Apple didn’t invent virtual movie theaters in VR either. They seem to have made it very nice though, by virtue of their displays being better than competitors’. It doesn’t seem like it’s a great social experience, though. I know that there’s SharePlay, but I mean social in terms of in your own home. This is designed for an audience of one. Which is a valid movie watching experience to have, some of the time, for some people.
What’s particularly interesting is an emphasis on stereoscopic media, which is almost entirely movies from that window of time when “3D” was being used as a way to charge more for tickets, and get people into theaters for experiences they couldn’t have with their HD TVs, or projectors. Then the HD TVs and projectors started to build it in whether you wanted it or not.
Companies realized that it was very expensive to make these stereoscopic movies, so they tried to reduce labor costs, and the quality of the stereo movies notoriously went down. Most filmmakers had very little to do with stereo, and it was an afterthought for someone else, a requirement of making the thing that didn’t have anything to do with them.
This is notably why James Cameron’s most recent Avatar movie was used in demonstrations for the press, because he spans that time period from the original Avatar until now as an ardent proponent of stereoscopic movies.
So, as you might imagine, that makes it a little bit of a niche use case and people will mostly be watching good ol’ fashioned 2D on a really big virtual screen.
Also, if I hear one more person say that Apple TV+ shows might all be “3D” now because machine learning can generate depth, I will jettison them from this planet.
Family Photos
This is a really interesting use case, just not with the headset. The most chilling moment in the whole keynote was when that guy took a photo of the kids with the headset. There are no “we just aren’t used to it yet” arguments I will accept, nor does the analogy to the VHS camcorders of yore work. This is an inhuman scenario and I’m perplexed that no one working on this presentation had a similar reaction.
What I will accept is some kind of volumetric capture coming to iPhones. The demonstration seemed to indicate that everything dithered into a stochastically pleasant point cloud as it got further away from the subject, so that doesn’t seem like it’s going to be stereoscopic capture of two images. Some combination of the depth data being captured, along with more than one camera, might get somewhere in the ballpark.
Why would people with iPhones take 3D photos if they don’t have a headset and aren’t planning on buying one? I would imagine that there would be that shadowbox-like tray view of 3D, where as you move your iPhone the parallax in the image shifts as if there’s depth. It’s one of the Apple design team’s favorite iOS interface gags. Or perhaps just good ol’ fashioned wiggle-grams, where stereoscopic images just toggle back and forth. There would be plenty of ways to execute it. It just seems likely they’ll do at least one of them, because people who own iPhones will be the likely people to buy the headset, and giving them some material in their libraries that they can look at would be more compelling than starting them with nothing and setting them loose on children’s birthday parties.
Location Location Location
The immersive location stuff is interesting, except it’s never depicted as immersive location with motion. It all seems very QuickTimeVR, where you have a nodal view of a place. It works well as a backdrop to other interface activities, but you don’t seem to do much with the location itself. That’s fine, I happen to love QuickTimeVR. I downloaded the crappy Babylon 5 QTVR files from AOL back in the day, and I had Star Trek “Captain’s Chair” where you could see all the Star Trek bridges. It’s technology that we use today for selling homes on Redfin.
I don’t object to it, but it’s interesting how it’s just a “desktop background” of sorts.
Eye Look Forward To More
I wholly reject this particular hardware, but I’m absolutely fascinated by the software and what it could mean when it’s applied in different contexts. I wonder what the developer story will ultimately wind up being, because adding another dimension doesn’t reduce labor costs, and it doesn’t reduce the cut Apple wants to take from developers for expending all this effort on a niche platform. Those sorts of financial things are beyond the scope of my analysis, but are a very real issue. Meta, led by Mark Zuckerberg, has been having big financial problems because this whole metaverse thing isn’t working out for him. While what Apple is doing is different from what Meta is doing, it’s not so different when it comes to development costs for 3D experiences.
The best part of the Apple Vision Pro might very well be Mark Zuckerberg freaking out and announcing his Meta Quest 3 early, and a Meta Quest 2 price cut, ahead of WWDC, and no one caring. Kudos to Tim Cook for that.