Q&A: Why the Metaverse Needs to Be Open - IEEE Spectrum



Making virtual worlds as interconnected as the internet will be tough

A screenshot from 'Unreal for All Creators'

A vast digital mirror world accessible through virtual reality is just around the corner, if you believe the latest pronouncements coming out of Silicon Valley. This so-called metaverse has captured the tech zeitgeist and is attracting billions in investments. Last month, Facebook CEO Mark Zuckerberg even announced plans to turn his trillion-dollar social network into "a metaverse company".

The name comes from author Neal Stephenson's dystopian sci-fi classic Snow Crash, which envisaged a gigantic 3D playground where millions of people escaped the drudgery of the real world. To its proponents, the metaverse is the natural evolution of the internet. They envisage a sprawling network of interconnected virtual worlds for such diverse activities as gaming, watching live entertainment, buying real-estate, and collaborating on design projects.

How to achieve that vision is an open question. Online video games like Fortnite and Roblox are pitching themselves as precursors to the metaverse because of their large 3D environments, where millions gather every day. They've been adding social features, character personalization, and the ability to create new environments, and have even been holding concerts to tempt people to spend ever more of their time immersed in these virtual worlds.

But the key elements of today's internet are its openness, connectivity, and interoperability, qualities that these early metaverse-like experiences have yet to replicate. While the web has the common languages of HTML and JavaScript and established protocols to ensure seamless browsing, there are no shared standards for building or connecting the virtual worlds that are supposed to populate the metaverse.

That's why early pioneers came together last week at leading computer graphics conference SIGGRAPH 2021 to chart a path forward. In a session titled "Building the Open Metaverse," representatives from Fortnite developer Epic Games, Roblox, chipmaker NVIDIA, and 3D-development platform Unity outlined their work and the challenges ahead.

They discussed agreeing on the 3D equivalent of the JPEG; finding ways to share not only 3D objects, but also their behavior between virtual worlds; dealing with the vast data requirements of 3D environments; and making it possible to create 3D content without your own design studio. The consensus was that there are some promising developments, but many problems are currently unsolved.

To find out more, we spoke with one of the session organizers, Marc Petit of Epic Games, who oversees the company's Unreal Engine, a 3D creation platform. The conversation has been edited for length and clarity.

IEEE Spectrum: What is the metaverse?

Marc Petit: It's a hard question. For me, it's the evolution of the internet as the fabric for our personal and professional lives. With the internet right now, you keep on scrolling through video. We believe that the metaverse will bring in the era of interactive content, where you have agency over the content and you can control it, you can interact with it, and you can ultimately immerse yourself into the content and not just look at it through a small window. The foundation is real-time 3D media. Whether those elements get integrated into your existing environment through augmented reality, or you immerse yourself with a [virtual reality] headset into the content, all of that is a consequence of the content being 3D and interactive.

Spectrum: Why is openness going to be an important feature for any future metaverse? And what does openness mean, in this context?

Petit: I think the openness is mandated for two reasons. One, for technical reasons, because the internet was based on the ability for things to communicate among themselves. If we have a lot of virtual worlds where you cannot go from one to the next, or they are incompatible with each other, that's not going to be a good experience. So I think that the very nature of a connected system requires that there is openness.

And then there is the societal aspect of it. I think it's an opportunity to create business models and rules that are more fair for people, [protect] their privacy, and make sure creators get their fair share for the content they create. Because, hopefully, we're going towards an economy of creation, where people who make the money are the people who created the content, not the people who own the platforms. We want everybody to become a consumer and a creator and so we need the platform and the economy that allows participation for everybody.


Spectrum: Lots of companies have developed virtual worlds of their own. How far have they got with linking them together?

Petit: The metaverse is not here. There are huge technical challenges that we need to solve as an industry. Let me be clear about this: we are at the first baby steps. It's about eating an elephant one mouthful at a time. Right now we have to solve the problems of exchanging objects. There's glTF and USD [two leading 3D graphics file formats], but we still can't agree on a material representation today. We're getting there and progress has been made, but it's still not a slam dunk. And some proprietary technology is still important and hasn't really been moving to the open space.
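For context on what a shared representation looks like in practice, the glTF 2.0 format Petit mentions does standardize a physically based (metallic-roughness) material model. The sketch below, which is an illustration rather than anything from the interview, builds a minimal glTF material fragment in Python; the material name and color values are invented:

```python
import json

# Minimal glTF 2.0 fragment with one material in the spec's standard
# pbrMetallicRoughness model. All specific values here are illustrative.
material = {
    "name": "PaintedMetal",
    "pbrMetallicRoughness": {
        "baseColorFactor": [0.8, 0.1, 0.1, 1.0],  # RGBA base color
        "metallicFactor": 1.0,                     # fully metallic surface
        "roughnessFactor": 0.4,                    # moderately glossy
    },
}

gltf_fragment = {"asset": {"version": "2.0"}, "materials": [material]}
print(json.dumps(gltf_fragment, indent=2))
```

Because the model is a small set of named numeric factors, any engine that implements the spec can interpret the material the same way, which is exactly the kind of agreement the industry is still working toward for behaviors.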

Then it's going to be all about behavior, and we have two kinds of behavior. Procedural behaviors, what we call rigging: so how a character moves, how a vehicle drives itself. And then we have the simulation aspects. We all live in the same world with the same laws of physics. So you would hope that we could harmonize physics to each simulation. But we all have different ways to represent logic and behaviors. That's one of the big things we need to tackle.

A screenshot from 'Fortnite'. Epic Games

Spectrum: Are there any obvious routes for making things like behavior and attributes interoperable?

Petit: No. When you start thinking about persistent worlds, the technological paradigm is going to be very different. You have to maintain a living and breathing world and do in-flight edits. It's not like a game, where you can stop the game, change the game, and restart the game. We're talking about a persistent world that's always on. There's a lot of room for innovation in terms of how you handle interactivity and scripting in an environment where you're always live and you have millions of users. Some of the models that we have today are not adequate and need to evolve.

Spectrum: How do you think the industry will come together to achieve this goal?

Petit: All the companies cooperate and acknowledge the value and the need for open source systems and foundations. I think standards will play a big role. But the standards don't drive innovation, they tend to drive commoditization. So we are managing this duality. We support the standards, but with Unreal Engine 5 we are pushing our engine, our data representation, to become the benchmark and the most advanced solution to create and playback high-fidelity, fully simulated worlds.

It takes a few geniuses, like we have on our staff, to invent those kinds of technologies and prove that they work. Then, when it's proven, it becomes shared and open. I know some people say open source can do innovation, and it can happen. But I think it's following the innovation that's done by some of those private groups.

Spectrum: Assuming we solve these problems and create this open system of interlocking virtual worlds, what will that make possible?

Petit: Seamless integration. Being able to aggregate and consume information from everywhere, and letting data from multiple platforms coexist in a single representation. We already have that in the web with microservice architecture and connectivity, so we can see how this could evolve. Once we have got this shared representation challenge figured out, I think we can define services so that things can interoperate. Being able to use your digital assets and your digital properties is probably the best example. If I buy a Ferrari to play Fortnite I'd love to use it on Roblox.


Spectrum: A common theme during the SIGGRAPH session was that for the metaverse to be truly open, it needs to be easier for users to make their own 3D content. How far off is that?

Petit: We are investing proactively in making sure that happens. For example, our Twinmotion product is the easiest way that you can create 3D content right now. We are proving that you can use game mechanics to make things simple. In Twinmotion, if you want snow on your scene, you push a button and the snow falls and accumulates.

For content we have Megascans [a library of 3D scans of real world objects], where our mandate is to scan enough of the world so that we can recreate the entirety of the world using proceduralism and machine learning techniques. And then there are all of the techniques around AI-assisted creation and AI-assisted artistry. So all of those efforts and investments by Epic are making sure that anybody can create content that other people want to see because it meets the bar in terms of quality. Because nobody's watching bad video, and this will be the same for 3D. Nobody wants crappy content.

Spectrum: A 3D version of the internet involves a lot more data than today's 2D one. How big of a challenge is it going to be to actually serve this to millions of people?

Petit: Well, it's interesting, because there's a lot of data right now that is represented in video that could be represented more efficiently in 3D. But you're right, the data will be big. I think you'll see software architectures adjust with that reality, with server side systems doing the heavy lifting and [3D] glasses doing something much lighter weight. We're starting to see the evolution of that computing infrastructure, with GPUs moving to the edge and the clients being more capable. Do we have enough? No, never. There's never enough compute!

Spectrum: You've made it clear that there are still fundamental challenges to overcome. Which solutions do you think are close at hand, and which problems are going to be harder to solve?

Petit: I think we are cracking the code of creating content that people want to watch. That's around the corner. And building this critical mass of technology where anybody can actually create their own virtual worlds with as much efficiency as possible. Everybody's at the stage of integrating real time 3D as a medium into their workflow. The more difficult thing is going to be the connection of those workflows and the emergence of platforms that embody the values that we discussed earlier. That's going to be what takes the most time, because platforms are hard to build.

Edd Gent is a freelance science and technology writer based in Bangalore, India. His writing focuses on emerging technologies across computing, engineering, energy and bioscience. He's on Twitter at @EddytheGent and email at edd dot gent at outlook dot com. His PGP fingerprint is ABB8 6BB3 3E69 C4A7 EC91 611B 5C12 193D 5DFC C01B. His public key is here. DM for Signal info.

When transistors can’t get any smaller, the only direction is up

Perhaps the most far-reaching technological achievement of the last 50 years has been the steady march toward ever smaller transistors, fitting them more tightly together, and reducing their power consumption. Ever since the two of us started our careers at Intel more than 20 years ago, we've been hearing alarms that the descent into the infinitesimal was about to end. Yet year after year, brilliant new innovations continue to propel the semiconductor industry further.

Along this journey, we engineers had to change the transistor’s architecture as we continued to scale down area and power consumption while boosting performance. The “planar” transistor designs that took us through the last half of the 20th century gave way to 3D fin-shaped devices by the first half of the 2010s. Now, these too have an end date in sight, with a new gate-all-around (GAA) structure rolling into production soon. But we have to look even further ahead because our ability to scale down even this new transistor architecture, which we call RibbonFET, has its limits.

So where will we turn for future scaling? We will continue to look to the third dimension. We’ve created experimental devices that stack atop each other, delivering logic that is 30 to 50 percent smaller. Crucially, the top and bottom devices are of the two complementary types, NMOS and PMOS, that are the foundation of all the logic circuits of the last several decades. We believe this 3D-stacked complementary metal-oxide semiconductor (CMOS), or CFET (complementary field-effect transistor), will be the key to extending Moore’s Law into the next decade.

Continuous innovation is an essential underpinning of Moore’s Law, but each improvement comes with trade-offs. To understand these trade-offs and how they’re leading us inevitably toward 3D-stacked CMOS, you need a bit of background on transistor operation.

Every metal-oxide-semiconductor field-effect transistor, or MOSFET, has the same set of basic parts: the gate stack, the channel region, the source, and the drain. The source and drain are chemically doped to make them both either rich in mobile electrons (n-type) or deficient in them (p-type). The channel region has the opposite doping to the source and drain.

In the planar version in use in advanced microprocessors up to 2011, the MOSFET’s gate stack is situated just above the channel region and is designed to project an electric field into the channel region. Applying a large enough voltage to the gate (relative to the source) creates a layer of mobile charge carriers in the channel region that allows current to flow between the source and drain.
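The gate-controlled conduction just described is often approximated by the textbook long-channel "square law." The sketch below is a pedagogical model, not a production device model, and the threshold voltage and gain values are illustrative:

```python
# Idealized long-channel NMOS square-law model. Vth is the threshold
# voltage; k stands in for mu_n * Cox * (W / L). Values are illustrative.

def nmos_drain_current(vgs, vds, vth=0.5, k=1e-3):
    """Return drain current (A) for an idealized long-channel NMOS."""
    vov = vgs - vth              # overdrive voltage
    if vov <= 0:
        return 0.0               # cutoff: the gate has not formed a channel
    if vds < vov:
        # triode (linear) region: the channel behaves like a gated resistor
        return k * (vov * vds - 0.5 * vds ** 2)
    # saturation: the channel pinches off near the drain
    return 0.5 * k * vov ** 2

# Below threshold there is no conduction; above it, current rises
# with gate drive -- the "layer of mobile charge carriers" in action.
assert nmos_drain_current(0.3, 1.0) == 0.0
assert nmos_drain_current(1.0, 1.0) > nmos_drain_current(0.7, 1.0)
```

The short-channel effects discussed next are precisely the regime where this idealized model breaks down: at very small source-to-drain distances, the off-state current no longer drops to zero.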

As we scaled down the classic planar transistors, what device physicists call short-channel effects took center stage. Basically, the distance between the source and drain became so small that current would leak across the channel when it wasn’t supposed to, because the gate electrode struggled to deplete the channel of charge carriers. To address this, the industry moved to an entirely different transistor architecture called a FinFET. It wrapped the gate around the channel on three sides to provide better electrostatic control.

The shift from a planar transistor architecture [left] to the FinFET [right] provided greater control of the channel [covered by blue box], resulting in a reduction in power consumption of 50 percent and an increase in performance of 37 percent.

Intel introduced its FinFETs in 2011, at the 22-nanometer node, with the third-generation Core processor, and the device architecture has been the workhorse of Moore’s Law ever since. With FinFETs, we could operate at a lower voltage and still have less leakage, reducing power consumption by some 50 percent at the same performance level as the previous-generation planar architecture. FinFETs also switched faster, boosting performance by 37 percent. And because conduction occurs on both vertical sides of the “fin,” the device can drive more current through a given area of silicon than can a planar device, which only conducts along one surface.

However, we did lose something in moving to FinFETs. In planar devices, the width of a transistor was defined by lithography and was therefore a highly flexible parameter. But in FinFETs, the transistor width comes in discrete increments, adding one fin at a time, a characteristic often referred to as fin quantization. However flexible the FinFET may otherwise be, fin quantization remains a significant design constraint. The design rules around it and the desire to add more fins to boost performance increase the overall area of logic cells and complicate the stack of interconnects that turn individual transistors into complete logic circuits. It also increases the transistor's capacitance, thereby sapping some of its switching speed. So, while the FinFET has served us well as the industry's workhorse, a new, more refined approach is needed. And it's that approach that led us to the 3D transistors we're introducing soon.
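Fin quantization can be made concrete with a little arithmetic. In the sketch below, each fin contributes its two vertical sidewalls plus its top to the effective conducting width, so width comes only in whole-fin steps; the fin dimensions are invented for illustration, not actual process parameters:

```python
# Illustration of fin quantization: effective device width comes in
# discrete whole-fin increments. Dimensions in nanometers, illustrative only.

FIN_HEIGHT = 50.0      # height of one fin sidewall
FIN_THICKNESS = 7.0    # width of the fin's top surface

def finfet_width(n_fins):
    """Effective width of an n-fin device: two sidewalls plus the top, per fin."""
    return n_fins * (2 * FIN_HEIGHT + FIN_THICKNESS)

def fins_needed(target_width):
    """Smallest whole number of fins meeting a target effective width."""
    per_fin = 2 * FIN_HEIGHT + FIN_THICKNESS
    n = 1
    while finfet_width(n) < target_width:
        n += 1
    return n

# A designer wanting ~150 nm of drive width cannot get it exactly:
# one fin gives 107 nm, so the design must round up to two fins (214 nm).
assert fins_needed(150) == 2
```

A nanoribbon device removes this constraint: with ribbon width set directly by lithography, the 150 nm target above could be hit exactly rather than rounded up to the next fin.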

In the RibbonFET, the gate wraps around the transistor channel region to enhance control of charge carriers. The new structure also enables better performance and more refined optimization. Emily Cooper

This advance, the RibbonFET, is our first new transistor architecture since the FinFET’s debut 11 years ago. In it, the gate fully surrounds the channel, providing even tighter control of charge carriers within channels that are now formed by nanometer-scale ribbons of silicon. With these nanoribbons (also called nanosheets), we can again vary the width of a transistor as needed using lithography.

With the quantization constraint removed, we can produce the appropriately sized width for the application. That lets us balance power, performance, and cost. What’s more, with the ribbons stacked and operating in parallel, the device can drive more current, boosting performance without increasing the area of the device.

We see RibbonFETs as the best option for higher performance at reasonable power, and we will be introducing them in 2024 along with other innovations, such as PowerVia, our version of backside power delivery, with the Intel 20A fabrication process.

One commonality of planar, FinFET, and RibbonFET transistors is that they all use CMOS technology, which, as mentioned, consists of n-type (NMOS) and p-type (PMOS) transistors. CMOS logic became mainstream in the 1980s because it draws significantly less current than do the alternative technologies, notably NMOS-only circuits. Less current also led to greater operating frequencies and higher transistor densities.

To date, all CMOS technologies place the standard NMOS and PMOS transistor pair side by side. But in a keynote at the IEEE International Electron Devices Meeting (IEDM) in 2019, we introduced the concept of a 3D-stacked transistor that places the NMOS transistor on top of the PMOS transistor. The following year, at IEDM 2020, we presented the design for the first logic circuit using this 3D technique, an inverter. Combined with appropriate interconnects, the 3D-stacked CMOS approach effectively cuts the inverter footprint in half, doubling the area density and further pushing the limits of Moore’s Law.

3D-stacked CMOS puts a PMOS device on top of an NMOS device in the same footprint a single RibbonFET would occupy. The NMOS and PMOS gates use different metals. Emily Cooper

Taking advantage of the potential benefits of 3D stacking means solving a number of process integration challenges, some of which will stretch the limits of CMOS fabrication.

We built the 3D-stacked CMOS inverter using what is known as a self-aligned process, in which both transistors are constructed in one manufacturing step. This means constructing both n-type and p-type sources and drains by epitaxy—crystal deposition—and adding different metal gates for the two transistors. By combining the source-drain and dual-metal-gate processes, we are able to create different conductive types of silicon nanoribbons (p-type and n-type) to make up the stacked CMOS transistor pairs. It also allows us to adjust the device’s threshold voltage—the voltage at which a transistor begins to switch—separately for the top and bottom nanoribbons.

In CMOS logic, NMOS and PMOS devices usually sit side by side on chips. An early prototype has NMOS devices stacked on top of PMOS devices, compressing circuit sizes. Intel

How do we do all that? The self-aligned 3D CMOS fabrication begins with a silicon wafer. On this wafer, we deposit repeating layers of silicon and silicon germanium, a structure called a superlattice. We then use lithographic patterning to cut away parts of the superlattice and leave a finlike structure. The superlattice crystal provides a strong support structure for what comes later.

Next, we deposit a block of “dummy” polycrystalline silicon atop the part of the superlattice where the device gates will go, protecting them from the next step in the procedure. That step, called the vertically stacked dual source/drain process, grows phosphorous-doped silicon on both ends of the top nanoribbons (the future NMOS device) while also selectively growing boron-doped silicon germanium on the bottom nanoribbons (the future PMOS device). After this, we deposit dielectric around the sources and drains to electrically isolate them from one another. The latter step requires that we then polish the wafer down to perfect flatness.

An edge-on view of the 3D stacked inverter shows how complicated its connections are. Emily Cooper

By stacking NMOS on top of PMOS transistors, 3D stacking effectively doubles CMOS transistor density per square millimeter, though the real density depends on the complexity of the logic cell involved. The inverter cells are shown from above indicating source and drain interconnects [red], gate interconnects [blue], and vertical connections [green].

Finally, we construct the gate. First, we remove that dummy gate we’d put in place earlier, exposing the silicon nanoribbons. We next etch away only the silicon germanium, releasing a stack of parallel silicon nanoribbons, which will be the channel regions of the transistors. We then coat the nanoribbons on all sides with a vanishingly thin layer of an insulator that has a high dielectric constant. The nanoribbon channels are so small and positioned in such a way that we can’t effectively dope them chemically as we would with a planar transistor. Instead, we use a property of the metal gates called the work function to impart the same effect. We surround the bottom nanoribbons with one metal to make a p-doped channel and the top ones with another to form an n-doped channel. Thus, the gate stacks are finished off and the two transistors are complete.

The process might seem complex, but it’s better than the alternative—a technology called sequential 3D-stacked CMOS. With that method, the NMOS devices and the PMOS devices are built on separate wafers, the two are bonded, and the PMOS layer is transferred to the NMOS wafer. In comparison, the self-aligned 3D process takes fewer manufacturing steps and keeps a tighter rein on manufacturing cost, something we demonstrated in research and reported at IEDM 2019.

Importantly, the self-aligned method also circumvents the problem of misalignment that can occur when bonding two wafers. Still, sequential 3D stacking is being explored to facilitate integration of silicon with nonsilicon channel materials, such as germanium and III-V semiconductor materials. These approaches and materials may become relevant as we look to tightly integrate optoelectronics and other functions on a single chip.

Making all the needed connections to 3D-stacked CMOS is a challenge. Power connections will need to be made from below the device stack. In this design, the NMOS device [top] and PMOS device [bottom] have separate source/drain contacts, but both devices have a gate in common. Emily Cooper

The new self-aligned CMOS process, and the 3D-stacked CMOS it creates, work well and appear to have substantial room for further miniaturization. At this early stage, that’s highly encouraging. Devices having a gate length of 75 nm demonstrated both the low leakage that comes with excellent device scalability and a high on-state current. Another promising sign: We’ve made wafers where the smallest distance between two sets of stacked devices is only 55 nm. While the device performance results we achieved are not records in and of themselves, they do compare well with individual nonstacked control devices built on the same wafer with the same processing.

In parallel with the process integration and experimental work, we have many ongoing theoretical, simulation, and design studies underway looking to provide insight into how best to use 3D CMOS. Through these, we’ve found some of the key considerations in the design of our transistors. Notably, we now know that we need to optimize the vertical spacing between the NMOS and PMOS—if it’s too short it will increase parasitic capacitance, and if it’s too long it will increase the resistance of the interconnects between the two devices. Either extreme results in slower circuits that consume more power.

Many design studies, such as one by TEL Research Center America presented at IEDM 2021, focus on providing all the necessary interconnects in the 3D CMOS's limited space and doing so without significantly increasing the area of the logic cells they make up. The TEL research showed that there are many opportunities for innovation in finding the best interconnect options. That research also highlights that 3D-stacked CMOS will need to have interconnects both above and below the devices. This scheme, called buried power rails, takes the interconnects that provide power to logic cells but don't carry data and removes them to the silicon below the transistors. Intel's PowerVia technology, which does just that and is scheduled for introduction in 2024, will therefore play a key role in making 3D-stacked CMOS a commercial reality.

With RibbonFETs and 3D CMOS, we have a clear path to extend Moore’s Law beyond 2024. In a 2005 interview in which he was asked to reflect on what became his law, Gordon Moore admitted to being “periodically amazed at how we’re able to make progress. Several times along the way, I thought we reached the end of the line, things taper off, and our creative engineers come up with ways around them.”

With the move to FinFETs, the ensuing optimizations, and now the development of RibbonFETs and eventually 3D-stacked CMOS, supported by the myriad packaging enhancements around them, we’d like to think Mr. Moore will be amazed yet again.