In the gaming world, finding the balance between art and science is essential to getting to the desired outcome: fun. At Unity, we’ve been creating and operating real-time 3D (RT3D) content in gaming for more than 15 years. Our proprietary platform connects game players and game creators, letting developers build games that fans will enjoy. Now, we’re helping creators in other industries make better data-driven decisions using artificial intelligence (AI), machine learning (ML), and synthetic data (data that is derived from simulations based on real-world data).
Understanding synthetic data’s role in an already data-rich world
Getting the right data into the right place is what most companies struggle with when they get started with ML and AI. It can be difficult for a smaller company to generate or gather the amount of data needed to make accurate predictions. This is where synthetic data comes in, and why Unity has gained so much expertise in this area since our inception.
My team’s work started four years ago. Our goal was to explore how else we could use the Unity engine and our expertise in synthetic data. Since I’d worked at Uber, we started with the idea that synthetic data could speed up software development for self-driving cars. The prevailing wisdom had been that you needed to log thousands of testing hours to create a reliable self-driving car. But approximately 98% of the time when a human is driving, nothing interesting is happening. The same goes for autonomous vehicle test drives, which produce hours of uneventful footage that doesn’t offer any real value.
Plus, it’s highly risky to put software that’s a work in progress on the road.
But when you run your software in a simulation (in Unity’s case, a simulation running in an environment built on our game engine), you can test-drive millions of miles every 24 hours, across thousands of servers, and create scenarios that would rarely occur in the real world. So whether you want to build a self-driving car that has experienced thousands of hours of possible events, a vacuum that can avoid bumping into furniture, or a robot that is able to do surgery, the Unity engine is a great proxy for the real world.
Since that first self-driving car project, we’ve explored many ways to help other creators use ML modeling and predictive analytics as easily as game creators do. We’ve expanded Unity into new areas such as retail spaces, public spaces, and transportation hubs, as well as robotics. As robots move away from doing repetitive tasks in manufacturing facilities and transition into labs and households, they need very different skills. For example, smart vacuum cleaners now have cameras and other sensors that help them understand the layout of a room. Developers working on robots might use Unity AI/ML to build a simulated version of the robot and run tens, hundreds, or thousands of simulated scenarios before ever running the physical robot in a physical space, saving an incredible amount of time.
The gaming technology we’re bringing to other markets
A game player is constantly generating simulated data based on situations and movements. That type of spatial simulation is incredibly valuable for other scenarios, such as predicting what might happen under complex conditions. For example, Unity’s 3D rendering capabilities can be combined with simulation scaled on the cloud to holistically study large and uncertain systems.
This simulation allows big-picture observations and what-if studies, ultimately leading to a conceptual understanding of real-world situations that experts can use to inform challenging policy decisions.
The possibilities are boundless. We see retailers use data simulations to choose the best layout for clothing displays and to consider factors like whether shoppers are shopping for themselves or for a family member. Designing an airport terminal, or deciding where in the terminal to place a particular store, is easier and better informed with a tool like the Unity engine. Shop owners can run simulations using characters to find the optimal location for the store, taking into account factors like where and how many people could stand in line during peak times. In addition, it’s becoming easier to overlay location or other real-time datasets on simulations for even more specific testing.
All of this requires a powerful cloud back end to achieve the required scale and volume. Our customers need to run a large number of instances on demand, and usage is very spiky. So we built a cloud-based version of the Unity engine so that it’s easy to run on many devices, and we offer it as a managed service using Google Cloud. Customers get the data they need without having to manage the back end.
All of our ML and data analytics at Unity runs on Google Cloud, using Compute Engine infrastructure and BigQuery for analytics.
Related: See why Gartner named Google Cloud a leader in the 2020 Magic Quadrant for both Cloud Database Management Systems and for Cloud Infrastructure and Platform Services.
Too much of a good thing is… overwhelming
Through all the work I’ve been lucky enough to do in this constantly changing industry, I’ve encountered a few common challenges among those getting started with AI and ML. Here are two quick tips to remember as you’re getting started on your AI/ML journey.
1. Consider how much data you truly need
More data is better, generally, but there’s a point of diminishing returns. Think about the data you’ll generate with simulations. In 24 hours, it’s possible to generate a thousand years’ worth of video at 30 frames per second. But how are you going to check what you generated? Figuring out whether you generated the right data, or just got the same thing over and over again, is very time intensive and doesn’t offer a lot of value.
You’ll find the right amount of data for your situation by evaluating and testing often. Work in an iterative loop: generate data, train the model, verify the model against real-world data, and see how it performs. Constantly measure the results, then define what “good enough” means for your situation. Then, create predictive models. That’s also the point where you’re likely establishing a strong data culture, where your internal users trust the data and depend on it to make better decisions.
2. Simulated data is more fair than real-world data
The real world is complicated, and it’s not always fair. For a lot of data scientists or ML teams, it’s easy to collect and use real-world data to train systems. But when that data represents a world where 80% of software engineers are male, the real-world bias can easily make its way into ML models. For example, a model built to identify engineers in photos will learn from real-world data that 80% of software engineers are male, and will then prefer male engineers over female engineers. You have to take responsibility for generating data that represents the world as you would like to see it. With simulated data, you can generate equal numbers of people of each gender. Make the system equally good at recognizing children and adults.
Generate different types of hair, a whole range of skin colors, and varying physical abilities.
As an engineer, you have the power to generate data that reflects a more balanced, diverse, and fair world. I’ve seen people wash their hands of this responsibility by simply saying that the data returned these results, as in the example above with male vs. female engineers. But the system produces its results from human inputs. If you use unfair data, that unfairness will be amplified and can end up harming your brand.
It’s also important to note that personal data isn’t necessary to do innovative work with ML, AI, and data analytics. For example, in our engine, gender, age, and other personal details aren’t relevant to gaming. All that matters is how you play the game. This principle can, and should, apply to other applications.
Building a future with simulated data
We’re already seeing some really impressive results in our work with customers, and there’s so much more potential as these new technologies come together. When you use reinforcement learning along with spatial simulations, you might get a robot with vision capabilities that learns on the fly and can do what used to be a human’s busy work. We’re seeing smarter indoor environments, such as cashierless grocery stores, and there’s lots more to come as we explore new industries and applications. Exploring what’s possible and connecting creators with the data they need to make the right decisions continue to drive Unity, no matter the industry.
Continue to explore what’s possible: Assess where you are in the AI journey and get a framework for creating and evolving AI capabilities within your organization. Download Google Cloud’s AI adoption framework.
About the author
Danny Lange is Senior Vice President of Artificial Intelligence at Unity, where he leads the company’s initiatives in the field of applied Artificial Intelligence.
Prior to his role at Unity, Danny was the head of Machine Learning at Uber, where he led the development of the company’s Machine Learning platform. Previously, he was General Manager for Machine Learning at Amazon, where he managed Amazon’s internal Machine Learning platform and launched the first AI product for Amazon Web Services (AWS), known as Amazon Machine Learning. Danny has also led Machine Learning efforts at Microsoft and started his career building autonomous agents as a Computer Scientist at IBM Research.