With the advent of cloud computing, the amount of data generated every moment reached an unprecedented scale. The discipline of data science flourishes in this environment, deriving knowledge and insights from massive amounts of data. As data science becomes critical to business, its processes must be treated with the same rigor as other components of business IT. For example, software engineering teams today embrace DevOps to develop and operate services with 99.99999% availability guarantees. Data engineering brings a similar rigor to data science, so data-centric processes run reliably, smoothly, and in a compliant way.
For the past few years, I’ve had the privilege of being a software architect for Microsoft’s Customer Growth and Analytics team. Our team’s motto is “Using Azure to understand Azure.” We connect many datapoints across the Microsoft business to better understand our customers and to empower teams across the company. Privacy is important to us, so we never look at our customers’ data, but we do have access to telemetry from Azure, commercial transactions, and other operational pipelines. This gives us a unique perspective on Azure in understanding how customers can get the most value from our offerings.