Talk 228 – AI for Earth Interview Transcript

Sam Charrington: Today’s episode is part of a series of shows on the topic of AI for the benefit of society, that we’re excited to have partnered with Microsoft to produce. In this show, we’re joined by Lucas Joppa and Zack Parisa. Lucas is the chief environmental officer at Microsoft, spearheading the company’s five-year, $50 million AI for Earth commitment, which seeks to apply machine learning and artificial intelligence across four key environmental areas: agriculture, water, biodiversity, and climate change. Zack is co-founder and president of SilviaTerra, a Microsoft AI for Earth grantee, whose mission is to help people use modern data sources to better manage forest habitats and ecosystems. In our conversation we discussed the ways that machine learning and AI can be used to advance our understanding of forests and other ecosystems and support conservation efforts. We discuss how SilviaTerra uses computer vision and data from a wide array of sensors like LiDAR, combined with AI, to yield more detailed small area estimates of the various species in our forests. And we also discuss another AI for Earth project, WildMe, a computer vision-based wildlife conservation project that we discussed with Jason Holmberg back in episode 166.

Before diving in, I’d like to thank Microsoft for their support of the show and their sponsorship of this series. Microsoft is committed to ensuring the responsible development and use of AI and is empowering people around the world with this intelligent technology to help solve previously intractable societal challenges spanning sustainability, accessibility, and humanitarian action. Learn more about their plan at Microsoft.ai.

Enjoy the show.

Sam Charrington: [00:02:17] All right, everyone. I am here with Lucas Joppa and Zack Parisa. Lucas is the CEO of Microsoft, no, not that CEO, but the Chief Environmental Officer. Zack is the Co-Founder and President of SilviaTerra. Lucas and Zack, welcome to This Week in Machine Learning and AI.

Lucas Joppa: [00:00:22] Thanks for having us here. It’s a huge pleasure.

Zack Parisa: [00:00:24] Great to be here.

Sam Charrington: [00:00:25] Awesome. Let’s dive right in. We’ll be talking about Microsoft’s AI for Earth initiative, but before we jump into that, Lucas, as the CEO of Microsoft... I think I’m going to run with this one all day. Tell me a little bit about your background and how you came to be the CEO of Microsoft.

Lucas Joppa: [00:00:48] Yeah, sure. I would say I never dreamed of being the CEO of anything, that’s for sure. Particularly in the standard context of it, much less what it means in my specific title, which is the Chief Environmental Officer. I mean, I grew up in far northern rural Wisconsin, and I was obsessed with being outside. My approach to school and life in general was, how can I get done with anything that I need to get done with so I can go play out in the woods? I think I thought I was going to grow up to be a game warden or something similar to that.

Technology was not a big factor in my life either. I mean, I never had a computer growing up, or a TV or anything else. I eventually found my way into university, started discovering that I was really interested in thinking about a career in environmental science, and studied Wildlife Ecology. Again, not the traditional career path for somebody at Microsoft. I went off and spent a little time in the United States Peace Corps in Malawi, working for the Department of National Parks and Wildlife, and then came back and did my PhD in Ecology.

It was really then that I started to put together the two kind of incredible ages that I think we’re alive in today and the way I see our world. Which is that we’re doing business here at the intersection of the information age and this also incredible age of negative human impacts on earth’s natural systems. It was during my PhD that I was really struggling with what’s the right way to do science in a way that scales with the scale of the problem. That’s when computing, programming, Machine Learning all kind of came flooding into my life at the same time. I ended up at Microsoft, in Microsoft Research, leading programs in environmental and computer science, and then things just progressed from there.

Sam Charrington: [00:02:41] You’re actively involved in academic research and a number of organizations. Can you share a little bit about that? We talked about it a bit earlier.

Lucas Joppa: [00:02:51] Sure. I mean, once you live long enough in the academic world, you keep the Pavlovian response toward some of the rewards that that environment instills. I mean, I’m not proud to say it, but since I’m not proud, I should just say it. I am still that academic that checks their citations every day when I wake up, over breakfast.

While I definitely have a much larger and more expanded purview of roles and responsibilities here at Microsoft, I still think science is important. Science is what drives all of the environmental sustainability decisions that we make here at this company. It’s what ultimately led to why we invested in this program, AI for Earth. I firmly believe that you have to understand the details if you’re going to try to lead an organization somewhere with a big picture vision. If you don’t understand the details, if you don’t understand the science, then it’s difficult to do that. Just the way my brain works, the easiest way to understand the details is to get your hands dirty and be in there with the rest of the world trying to build the solutions of the future. That’s where the academic research for me comes in.

It’s just that opportunity to actually go really deep and work on both sides of the equation. I still publish in the environmental science literature. I still publish in the computer science literature, and the most depressing thing about that is how few of us there are that do both of those things. One of the things that I spend a lot of my time every day doing is just trying to bring those two worlds together, and publishing is a fantastic way to do that.

Sam Charrington: [00:04:35] Zack, you’re a forester.

Zack Parisa: [00:04:37] Yeah, yeah.

Sam Charrington: [00:04:38] I didn’t know that was a thing beyond the Subaru.

Zack Parisa: [00:04:40] Right, right, sure enough. It’s absolutely a thing, and an exciting one. I think there’s a rebirth in forestry now. I’m hoping that it’ll become a more broadly known thing here before too long.

Sam Charrington: [00:04:56] Tell us about your background and about SilviaTerra.

Zack Parisa: [00:04:59] Yeah, sure. The start of my story actually isn’t terribly dissimilar from Lucas’s. I grew up in North Alabama, though, not Wisconsin, in this funny place: North Alabama is covered in woods, but it also has a NASA installation in Huntsville, Alabama. My youth was basically just spent in the woods. When I was in first grade, I wanted to be an entomologist. When I was in third grade, I wanted to be a zoologist. I went through geology and so on and so forth, until I finally met somebody who was a forester.

Until you meet somebody and you have somebody walk you through what that is, it’s an obscure field. What that is to me is the confluence of economics and ecology. It was this brilliant opportunity at the time, and that’s the way that I saw it, because it brought together everything that I cared about. From the ecology side: insects and soils, geology, the interconnected nature of all of those systems. But also the economic side: not only what the forest is, but also what we want it to be and how we value that as a society, and how we mean to take it from where we find it today to where we want it to be and what we believe we need. That was my entrance into it.

I believed I would carry that out. I would live and work as a forester by managing some tract of land for some owner, whether that’s public or private, but I would be focused on that landscape. Going through undergrad, what I became really interested in, oddly and to my surprise, was the quantitative aspects of certain problems, like insects in a forest.

When I first got into forestry, my freshman year, there was a massive outbreak of southern pine beetle in the U.S. South, and it was killing lots of pine trees. That was a really compelling problem to me because it relates so much not only to the trees themselves and the beetle, but also to how we’ve managed them historically and how that impacts local economies and that type of thing.

I started into pheromone plume modeling, of all things, in a forested system, trying to take measurements of concentrations of pheromones in locations and backtrack to where that originated from in the winter, to try and deal with these beetles more effectively. What I learned from that, or what I gathered, was that there’s this incredible ability to scale up my interests. To still focus on the things that I loved the most, but to look at them with a different lens and to potentially effect change in a different way than I had conceived of before.

I wound up doing work in Brazil; I was really interested in tropical forestry. I took some time off from undergrad to do that, and worked in other areas of South America, like Bolivia. There I got to see situations where people were dependent on different aspects of land, in different and more direct ways than I think I was familiar with from my youth in the U.S. South. They were herding animals, they were collecting nuts, fruits, things like that. They were collecting fuel wood to stay warm, to cook. They were also wanting to sell wood into a market, and to develop as communities.

Forestry is about trade-offs. There are a lot of things that we can do, and there are a lot of potential futures that we have before us, but we have to address the complexity of those systems in more comprehensive ways than we have in the past. There’s far more than just a timber market now; there’s far more than just a concern for delivery of wood to build houses. We spoke about this just a little bit before, but that was experienced very acutely here in the Pacific Northwest, when people were confronting the issue of whether we had enough spotted owl habitat, or spotted owls themselves, or not. Whether we had managed appropriately in the past to accommodate those and everything that’s related to that species, or the habitats and other species that are related, or whether we hadn’t, whether we’d failed, and whether we needed to go back and reconsider the ways that we make decisions.

That was a really fraught conversation. It brought people to boiling points, and that was before my time really, before I entered into the profession in any meaningful way. That type of conversation goes on now and it’s even more complicated, and there are more issues and more dimensions that we have to consider than there were then. To have constructive conversations, we have to have information to inform those discussions and to facilitate the communication that yields solutions that people can live with.

Sam Charrington: [00:10:40] I’m presuming that that need is what led you to found SilviaTerra?

Zack Parisa: [00:10:45] It is. Yeah. Absolutely.

Sam Charrington: [00:10:47] What is SilviaTerra? What is the company?

Zack Parisa: [00:10:48] Right, what do we do here? I’m failing to answer your questions here. At SilviaTerra, we provide information, just like what I was speaking about there. Our objective is to help people use modern data sources, like remotely sensed information from satellites, from aerial platforms, and from UAVs, along with modern modeling techniques, to get more resolution, accuracy, and precision on information. Not only just about trees, but about habitats and beyond. That’s the focus of our company.

We’ve been at this for about nine years. A lot of the folks that we work with are timber companies; we also work with non environmental NGOs, and we work with government agencies. All of them have effectively the same questions and very similar needs. Up until now we’ve been providing data project-to-project to help them answer those critical questions that they confront on a regular basis. I guess the reason I’m in this room with you all here today is that we were able to start working with Microsoft AI for Earth to begin to scale and expand that work, to build a foundational data set that we can start to use to answer these questions and to build on, to improve our ability to manage for the future.

Sam Charrington: [00:12:21] This may be a good segue to taking a step back. Lucas, what is AI for Earth?

Lucas Joppa: [00:12:29] Sure. Well, I think in the context of this conversation, you can think about it this way. What is AI for Earth? It’s why a reformed forester, who’s now the co-founder of a startup, and a reformed wildlife ecologist, who’s now the Chief Environmental Officer at Microsoft, are at a table talking with you on TWIML.

Sam Charrington: [00:12:44] I feel like we’re in this recursive.

Lucas Joppa: [00:12:46] That’s right. I know exactly, I can’t even see you guys anymore. I’m just staring at myself in an infinity mirror here. What AI for Earth is, is as of Tuesday of this week, a one-year-old program.

Sam Charrington: [00:13:00] Happy birthday.

Lucas Joppa: [00:13:01] Thank you. Thank you. It was fantastic. We spent it celebrating with our colleagues at National Geographic in Washington, D.C.

Sam Charrington: [00:13:08] In the woods?

Lucas Joppa: [00:13:10] Unfortunately no, but at the founders table of one of the most iconic and exploration-driven organizations in the world. It was an incredible time. What AI for Earth is, is a five-year, $50 million commitment on behalf of Microsoft to deploy our 35 years, actually a little bit more than 35 years, of fundamental research in the core fields of AI and Machine Learning, and to deploy those to effect change in these four key areas of environment that we care deeply about, which are agriculture, water, biodiversity, and climate change.

The reason that we’re doing that is because we recognize at Microsoft, and I already spoke about this tale of two ages, that this is a time of the information age and a time of incredible negative impacts of human activities on earth’s natural systems. You look and you realize that as a society we’re facing an almost unprecedented challenge. We somehow have to figure out how to mitigate and adapt to changing climates, ensure resilient water supplies, and sustainably feed a human population rapidly growing to 10 billion people. All while stemming this ongoing and catastrophic loss of biodiversity that we see around the world. We’ve got to do that while ensuring that the human experience continues to improve all around the world for everybody, that economic growth and prosperity continue to grow. That’s why I say it’s an unprecedented challenge.

I mean, the scope and the scale are just incredible. If you look at the scope and scale of the problem and you step back and you ask yourself the same question as a company that I asked during my PhD, which is, “Well, what are the things that are growing in the same exponential fashion as the scale and complexity of that challenge of our environmental challenge?” Well, pretty much the only trends that are happening in an analogous fashion, are in the tech sector and particularly in the broader field of AI and the more narrow Machine Learning approaches that are getting a lot of attention today.

That’s when we decided to put together this program to actually say, “Hey, we’ve been investing as a company for over a decade at the intersection of environmental science and computer science.” I led research programs in our blue sky research division, called Microsoft Research, for a fair number of years on that. But then the technology reached a point, and the criticality of the societal challenge, I think, reached a point, that it was time for a company like Microsoft to step in and actually start to deploy some of those resources. Deploy them in ways that ensure that we ultimately change the way that we monitor, model, and then ultimately manage earth’s natural systems in a way that we’ve never been able to before.

We started out, as I said, a year ago with basically nothing but aspiration. We looked back this past Tuesday, at this event that we had at National Geographic where we inducted a new set of grantees into our portfolio, and realized that in that short year we’d set up relationships with organizations all over the world. Over 200 organizations all over the world, each dedicated to taking a Machine Learning first approach to solving challenges in these four domain areas that we focus on. They’re working on all seven continents now, in over 50 countries in the world, and in 34 states here in the United States. Today, we get the opportunity to sit down with one of the grantees, right? To hear a little bit more about just their particular experience, and talk about the ways that Machine Learning in particular can fundamentally change our ability to understand what’s going on on planet earth.

Because I think that most people don’t take the time to step back and realize, when they hear terms like information age, just how narcissistic that really is, that almost every bit of information that we’ve been collecting is about ourselves, right? It’s about where the nearest Starbucks is, it’s about what people who searched for this also searched for, right? It’s at the peril of ignoring the rest of life on earth and the ways that it supports us and our economies. What SilviaTerra, I think, is so focused on is using vast amounts of data and new approaches in Machine Learning to actually just ask simple questions like, where are all the trees in the United States? We don’t know answers to things like that. I mean, that just blows my mind, and so that’s where a lot of this came from. It’s just a fundamental desire to change our ability to monitor and model life on earth. I guess that isn’t all that simple, but-

I also think it’s completely and totally doable, right? I mean, look at where we’ve come from, from an information processing capacity over the past 25 years to where we are today. I mean, if you would’ve tried to predict every little bit of it, it would have been impossible, but it seems preordained now that you look back at it.

Sam Charrington: [00:18:38] When I think about the types of systems that we’ve been talking about thus far, both the economic systems and political systems as well as the biological systems, it jumps out at me that there’s a tremendous amount of complexity in those systems, and Machine Learning, deep learning in particular, has this great ability to pick out patterns and abstract away from complexity, which kind of says to me, “Oh, it’s a no-brainer to apply Machine Learning to this.”

We’re still very early on in our ability to put Machine Learning to work. I guess I’m curious, maybe for you, Zack, where you think the opportunity is with applying Machine Learning and AI for the types of problems that concern you, in particular with regard to forests?

Zack Parisa: [00:19:43] Yeah, yeah, absolutely. I guess, listening to Lucas there, one thing that jumps out at me from when you first spoke, in your response to the second question, is that there are lots of people that are very interested in natural resources and there are lots of people that are very interested in Machine Learning and AI, but it is a very small community of people who do both. I think it’s uncommon to start out believing you’re going to spend all your time outside and then find yourself curled up in front of some code.

The first thing, I think there’s a lot of opportunity for people to make that leap and just to begin to see that as a more natural thing, because the questions are very complex. Again, just like Lucas said, most of our focus has been on how to market to somebody to buy a cup of coffee here versus there. How to think about social networks and how to think about marketing networks and transportation networks. I think, it’s exciting to see that begin to percolate down and transition to the story behind how all of those materials come into our world and life.

The fact is, and I think the surprising fact is, that everything around us, every little bit of technology and everything that built this room that we’re in or that your listeners are in, those things were either grown or mined. Every piece of that, every little bit has some geographic story, some physical story, some environmental story. If we were to be confronted with all of those stories, just from one day of our consumption, one day of us interacting as we normally do, it would take us years to even sift through all of those stories. There’s no way, but those stories all amass to have a very large impact on how we all live.

To me, that is the huge opportunity here. We, with Microsoft AI for Earth, have worked on this data set for the continental U.S. at high resolution to inform, down to species and diameters, where trees are, what those structures and compositions are, and, moving forward, what they could be. The fact that we are all consumers is not going to stop, and while we have a conservation need, we also have a consumptive need. I think there’s so much opportunity to begin to investigate how we balance that and how we feel about that, and to engage in a meaningful conversation at multiple levels in society about how that can best be done.

You asked about opportunities. I mean, I was never excited about AI or stats or Machine Learning for its own sake. I mean, it is awesome, I now understand that, and I do get jazzed up about exciting advances there, but it’s about what it can answer. I mean, that’s what drew me out of the woods and put me in front of a computer: it was the ability to start to even think about those big questions, and distill it all to something simple and right in front of us. That’s the opportunity. It allows us to know more about our world and ourselves and to create a better world and a better image of ourselves.

Sam Charrington: [00:23:34] Can we maybe dig into a little bit more detail on either the data set that you just mentioned or another project, and talk through the process through which SilviaTerra uses Machine Learning and the challenges that you run into? Maybe walk us through a scenario.

Zack Parisa: [00:23:54] Sure. Absolutely. I’ll just briefly tell you where we’re coming from. People have been managing forests for a couple hundred years, and in the U.S. for about 100-plus. They needed information then, as they do now, but to get that they would do a statistical survey: they would go and put measurements in, work up an average, and make a plan based on that average. That has been effective, and it’s what people use a lot still today, but what we’re focused on doing is bringing imagery to bear, along with model-assisted and model-based methods, to yield small area estimates.
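The contrast between a plot-only average and a model-assisted estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the textbook difference estimator (mean of model predictions over every pixel, corrected by the mean residual at the field plots), not SilviaTerra's actual method, and the function names and numbers are hypothetical.

```python
from statistics import mean


def plot_only_estimate(plot_observations):
    """Classic survey approach: average the field plot measurements."""
    return mean(plot_observations)


def model_assisted_estimate(pixel_predictions, plot_observations, plot_predictions):
    """Difference estimator (an assumed, textbook choice).

    Start from the mean of the imagery-based model predictions over every
    pixel in the area, then correct it by the average model error seen at
    the field plots (observed minus predicted).
    """
    residuals = [obs - pred for obs, pred in zip(plot_observations, plot_predictions)]
    return mean(pixel_predictions) + mean(residuals)


# Hypothetical numbers: predicted stems per acre for five pixels in a small area,
# two of which (the first and third) also have field plots.
pixel_predictions = [100.0, 120.0, 80.0, 140.0, 60.0]
plot_observations = [110.0, 90.0]   # measured on the ground
plot_predictions = [100.0, 80.0]    # model output at those same plots

print(plot_only_estimate(plot_observations))
print(model_assisted_estimate(pixel_predictions, plot_observations, plot_predictions))
```

The design point is that the imagery-based model supplies coverage of every pixel, while the comparatively few field plots are used to correct the model's average error rather than to carry the whole estimate.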

For us it’s at a 15 meter resolution, and for a 15 meter pixel, what we’re predicting is the number of stems, their sizes, and species. When I say size, I mean the diameter of the trunk of the tree at four and a half feet off the ground. From there, in a hierarchical context, we can predict maybe the height of the tree, or the ratio of crown to clear bole at the bottom. From there, since we can infer or predict the light conditions under that forest, maybe how much herbaceous plant matter there may be. Carrying that forward: how many herbivores could that support? Scaling that up: how many large carnivores could that support?

For now, the primary piece, this foundational data set that we’ve worked with Microsoft on is that tree list information for each one of those pixels, which hasn’t existed before, but that opens up so many doors for what we can begin to build onto and model further down the line.
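To make the idea of a per-pixel tree list concrete, here is a minimal Python sketch, assuming a hypothetical record layout (one entry per stem with a species name and a diameter in inches measured at four and a half feet). The class and function names and the example values are illustrative assumptions, not SilviaTerra's data format; the basal area formula is just the cross-sectional area of a circle.

```python
import math
from dataclasses import dataclass

PIXEL_SIDE_M = 15.0              # pixel resolution mentioned in the conversation
SQ_M_PER_ACRE = 4046.8564224     # exact definition of the international acre


@dataclass(frozen=True)
class Stem:
    species: str     # hypothetical species label
    dbh_in: float    # trunk diameter in inches at 4.5 ft above the ground


def basal_area_sqft(dbh_in):
    """Cross-sectional trunk area in square feet for a diameter in inches."""
    radius_ft = dbh_in / 24.0    # inches -> feet, then diameter -> radius
    return math.pi * radius_ft ** 2


def summarize_pixel(stems):
    """Turn one pixel's tree list into per-acre quantities."""
    pixel_acres = PIXEL_SIDE_M ** 2 / SQ_M_PER_ACRE
    per_acre = 1.0 / pixel_acres          # expansion factor from pixel to acre
    stems_by_species = {}
    for stem in stems:
        stems_by_species[stem.species] = stems_by_species.get(stem.species, 0) + 1
    return {
        "stems": len(stems),
        "stems_per_acre": len(stems) * per_acre,
        "basal_area_sqft_per_acre": sum(basal_area_sqft(s.dbh_in) for s in stems) * per_acre,
        "stems_by_species": stems_by_species,
    }


# Hypothetical tree list for a single 15 m pixel.
pixel = [Stem("loblolly pine", 12.0), Stem("loblolly pine", 8.0), Stem("white oak", 20.0)]
summary = summarize_pixel(pixel)
print(summary)
```

Keeping the stem-level list, rather than a single forest or non-forest label, is what lets quantities such as stems per acre, basal area, or species mix be derived later for whatever question is being asked.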

Sam Charrington: [00:25:46] At a resolution of 15 meters, a single pixel might contain how many trees?

Zack Parisa: [00:25:54] It could contain an awful lot. Easily, and this is the tricky thing, because the tree could be as small as a seedling, or it can be as large as a sequoia. You could have less than one, right? Or it could have 300 small, tiny little trees packed in tight. To me, this is the fundamental difference between what we’re working on here and where we’re coming from. Which is, we need to transition away from the binary or basically qualitative classifications: forest, non-forest. That’s not actually that informative about what habitat that forest can provide, or what maybe we need to do or not do to ensure that it’s the type of forest that’s going to continue providing the things we care about. Clean water, carbon out of the atmosphere, wood to build this table. Those are the types of things. Beginning to quantify those aspects is very important.

When I began working with this, everything was on the table. I mean, there was a potential to use LiDAR and neural nets to try and clarify discrete trees. We do not do that for various reasons, largely bias in results. For us, parting out species became a massive problem. If you have, let’s say, 40 trees of multiple species in one pixel, how do you begin to differentiate those when you’re looking at one pixel of data from lots of imagery sources? That was a technical challenge.

Lucas Joppa: [00:27:40] One of the things that I think is interesting about this is, you’re talking about forestry, right? Whether or not people know it’s a profession, it’s an extremely old one. You know some people are going to … you don’t think that you’re going to be talking about Machine Learning. You also don’t think that you’re necessarily going to be talking about philosophy or existential questions, but you asked a question about 15 meter resolution, right? When you work with organizations like SilviaTerra that are looking down at the world and asking what is there, you end up having these existential conversations about what is a thing, right? At what level should we be taking data points to be able to feed into these Machine Learning algorithms? Because when you incorporate the zed dimension or the Z dimension or whatever you want to call it, whatever part of this planet earth we’re from, you can be looking down at a multitude of different objects, right?

Depending on what sensor you’re using, you may only see one of them or you may see many of them, if you’re using something like LiDAR and you’re able to get your laser sensors to see enough of those things. You start struggling with all of these questions that are actually fairly unarticulated in the modern Machine Learning literature, quite frankly. All the standard libraries take in a 300 by 300 pixel image and they all have these harsh expectations, and sure, maybe we think we all left the world of frequentist statistics behind, but we still carry over the ghosts of a lot of those harsh binary classification results.

It’s just fascinating, I think, to think about not just what’s hard in the forestry space and how modern Machine Learning techniques can help transform that, but also what the problems and the applications of an organization like SilviaTerra, and the rest of our AI for Earth grantees, bring to the Machine Learning community, which is: what’s hard here? Why can’t we just take all the deep neural network advances that we’ve made and just voila, we’ve solved all the world’s problems, right?

It’s because, as you said, we’re still in the infancy of a lot of what we hope to achieve in Machine Learning. We just also recognize the severely short amount of time that we have to answer some of these bigger environmental questions. We have got to take everything that we have at our disposal and start to deploy it.

Sam Charrington: [00:30:18] You mentioned sensors and LiDAR, so a very specific curiosity question. I’ve always associated LiDAR with a local, very short-range sensing mechanism. Is that not the case? Can you do LiDAR from satellites?

Lucas Joppa: [00:30:34] Yes, yes.

Sam Charrington: [00:30:35] Talking about satellites or planes-

Lucas Joppa: [00:30:36] Planes.

Sam Charrington: [00:30:37] What are all the sensors that come into play here?

Zack Parisa: [00:30:38] A new sensor was just launched a couple weeks ago.

Lucas Joppa: [00:30:42] Something like that.

Zack Parisa: [00:30:43] There’s the GEDI sensor, it’s called GEDI [pronounced “Jedi”]. I’m used to it now.

Lucas Joppa: [00:30:48] I was going to say it.

Sam Charrington: [00:30:49] Use the LiDAR?

Zack Parisa: [00:30:50] Use the LiDAR, Lucas.

Lucas Joppa: [00:30:52] GEDI, here’s a …

Zack Parisa: [00:30:55] Well, it’s worth [crosstalk 00:30:56]. They’re strapping this thing onto the space station. It’s going to be pulsing down, not at the poles, but basically everything between. I think it’s full-waveform LiDAR, and so absolutely. Even historically there was ICESat, which was a satellite-based LiDAR sensor. Moreover, and more commonly in forestry, and even in a lot of urban areas, they’re collecting LiDAR information from airplanes at different altitudes and different point densities. A common one might be like 12 or 24 points per square meter.

When you see that over a forest canopy, some of those pulses reach the ground. The best elevation models that you see in the U.S. right now are LiDAR-derived elevation models. That’s the source of a lot of the information that we’re getting. You see it in a lot of flood plain areas, the Mississippi Delta area, so that we can better understand how flooding may occur or may not occur in certain areas.

Lucas Joppa: [00:32:02] One more thing that I’m always struck by, when you start thinking about remote sensing and just sensing in general as applied to environmental systems, is that as we start to take a more digital or computational approach to sensing, we almost by definition have got to start taking a more Machine Learning approach to driving insights. Because, what computers are able to do, and I don’t know, maybe I’m just missing the conversation or maybe the conversation isn’t as fully articulated as it could be, but computers are able to sense the world in so many more dimensions than people are. Why do we model? Well, we model because we need a simplifying function to help us understand an already complex world.

What was already complex according to our five senses has now become exponentially more complicated with things like hyperspectral resolution monitoring, where you’re getting thousands of bands of imagery back, plus things like LiDAR that are getting 24 points per square meter. You can’t, humans don’t even know … It’s interesting, people always complain that they don’t understand what the layers in a deep neural network do. We also have no idea how to even interpret most of the signals that are coming back from the most advanced sensors in the world, because they don’t correspond to the dimensionality that we live in.

Sam Charrington: [00:33:22] I was just going to ask that. I’ve talked to folks that are using LiDAR in the context of self-driving vehicles, where this whole idea of sensor fusion comes into play, making sense of all these disparate data sources. Those examples are very local, and now we’re talking about global data sources, or at least much larger scale, with overlapping tiles and capabilities. There’s a ton of complexity. Is that type of complexity some of the complexity that your company is working on managing, or do you count on upstream providers to sort a lot of that out for you?

Zack Parisa: [00:34:09] That’s exactly the type of complexity that we deal with. I mean, there is an enormous pool of potential data sources that exist, and they all have potentially very useful attributes. Some of them less so. They have different timestamps associated with them, and one very nice thing about measuring forests is that, as long as you don’t mess with them, they tend not to move too much. Trees, they’re pretty willing subjects just to be measured, but they are always changing. There’s growth associated, there’s naturally occurring disturbance, there is human-caused disturbance, and both of those we want to keep track of.

What I see our role right now as being is taking that massive pool of potential sources of remotely sensed data, and the very small and often underappreciated pool of field measurements, the things that we actually might care about, and translating between those things and creating something that is more highly resolved, more accurate, more precise and more useful than what could otherwise be achieved. So, yeah, draw the signal out of the noise, the classic tale.

Lucas Joppa: [00:35:24] If I look at kind of the full portfolio of AI for Earth grantees, well over 200, you see that, at least in my mind, Silvia Terra is, as an organization, one of the most mature, right? They’re actually out of the lab, they’re a startup with a business model, et cetera, et cetera. When I think about why that is in the context of Machine Learning, why they’re able to take advantage of that, it’s because of one thing that we just heard, which is they’re taking advantage of these ground-based data points that they can use to train their models, right? That’s because forestry is something that is so inherently tied to our broader economy that we have, here in the United States and all around the world, a history of going out boots on the ground and putting a tape measure around a tree and a GPS signal next to it and saying, “This tree is here, it’s this height and it’s of this species.” That’s so rare in the broader environmental space.

One of the reasons, I think, that organizations like Silvia Terra are unfortunately standing alone in many respects is that there are so few data sets. It’s called Machine Learning because we’re teaching computers, right? To teach, you have to be taught or to be taught, you need to be shown examples. It’s why we’ve seen such significant advances in some fields of Machine Learning but not in others. There are just so few annotations in our space that when you come into a forestry space, where the U.S. government has paid money for the past hundred years to go out and figure all this out, companies like Silvia Terra can stand on top of that and really just kind of zoom off ahead. But they are in many ways the exception to the rule, which is unfortunate, I think.

Sam Charrington: [00:37:18] Do you find that the kind of work that you’re doing, we talked about the sensing and pulling all that information together, does this put you at the research frontier of using Machine Learning techniques, or are you able to use off-the-shelf types of models? Where does your work fall in the spectrum of complexity?

Zack Parisa: [00:37:45] Boy.

Sam Charrington: [00:37:46] Or maybe complexity is not the right word. Just in terms of the innovation cycle, are you able to apply things that people are doing in other fields pretty readily? Or are you having to push the limits and pull right out of academic research or things like that?

Zack Parisa: [00:38:05] It’s a little bit of both. I mean, our core algorithm has matured over the last nine years of doing the work that we have, and we’re a small team, we’re 10 people effectively. I guess, when I got into this, when I thought this quant path was something that really resonated with me, that I connected with, and that I saw value in, I originally thought I was going to be a professor, I would be a researcher somewhere. I would be putting papers out, because that must be how change happens.

My path changed when I went around to people that I’d worked with in industry and asked them what papers they were reading to effect, to change the way that they worked. What were the most influential journals that they were reading? The answer was that they weren’t reading the journals, they were busy managing land, and that they wanted a tool, not a publication.

I mean, that was a little eye-opening, and what Max, my other, Max Nova, my Co-Founder, and I set about to do is build tools. I don’t really accept, like, a full dichotomy between, is it research or is it just off-the-shelf type stuff? I mean, we pride ourselves in our ability not only to understand the systems that we’re working in, but also to be abreast of what’s happening in modern computational techniques and modeling efforts, your modeling tools. Which I imagine everybody would probably say, right? Like everybody would tell you, “No, we’re right on the edge.”

The funny thing that I learned when I got into this, I’m on the applied side. I mean, I talk with people that are trying to figure out wildfire modeling and how to pick which communities to allocate funds and efforts to help manage a forest to prevent catastrophic fires. I work with people that are trying to figure out how to manage for forest carbon. I work with people that try and figure out how to manage forests to deliver wood to a mill to make paper.

What’s, I guess, striking to me from where I started to now is I thought that what people needed to see was the math. I thought I would show up at their offices and be like, “Good news. We figured it out. Check this new method out. We pipe in this data. We put in these measurements from the ground. We’re able to model this more effectively now.” What I learned is that if I can’t communicate effectively about what we’ve done, if it really truly seems like magic, then it is by definition, it’s incredible in the truest sense of the word, it is not credible, and credibility counts.

In some cases, when we’re working with people, we may not use the most fantastic new thing. We may use something that is slightly more costly in terms of the input data that it requires, or costly in terms of model fit, but that is more easily understood and explained and more robust to, like, the boot test. You go out and it just makes sense.

Sam Charrington: [00:41:36] Lucas, does that experience ring true for the other grantees that you work with, or is there a spectrum of experiences there in terms of where they are and applying?

Lucas Joppa: [00:41:47] Some of our grantees are using almost commodity services at this moment. I mean, Microsoft for instance has a service called Custom Vision AI, sorry, Custom Vision API. They want to do, some of our grantees want to do, simple image recognition tasks and the service works for them. They literally just drag in a whole bunch of photos of one type and a whole bunch of photos of another type and the system learns it and produces a result for them, and that’s fine. Right? That’s pretty far on the one side of just, like, commoditized services.

Then there are other grantees that are out there creating exceptionally custom algorithms for their work. I think, we’ve got a grantee called Wild Me that does basically facial recognition for species, so that they can provide better wildlife population estimates of species like giraffe and zebra, things that they can. Everybody knows a giraffe, or everybody has heard that every giraffe’s pattern is unique, but look at a couple of photos of giraffes and you realize just how hard it is for the human eye to spot those differences. Right?

They’re building algorithms to differentiate any particular zebra or giraffe and then plug those into statistical models for estimating populations. There’s nothing off the shelf that does that. In fact, most of the main libraries, they have to go back and modify the core code of, so it’s a full, full spectrum. We’re willing to support all of it, right? Because, what we’re trying to get people to understand is, well, first and foremost, we’re just trying to break down the access barrier, right? We want to ensure that budget isn’t a barrier to getting this stuff done. Because, as I’m sure you and many of your listeners are aware, sometimes the latest Machine Learning approaches can be fairly expensive. If not, it might be an open source library, but somebody needs 1000 GPUs to run this thing on, right?

We make sure that the infrastructure gets in the hands of folks, et cetera, but it’s also just awareness that you could be thinking about this, you don’t have to be. We want the world’s leading Machine Learning scientists to be thinking about what they could be doing, but we don’t want the rest of the world to think that they have to be one of the world’s Machine Learning experts to have a crack at this, right? That there’s software and services that can help them as well.

We see the full spectrum and I think it’s super healthy. We also see the full spectrum of, if I would encapsulate what Zack was saying there in just two words, interest in what we would call Explainable AI, right? Do people really care why an algorithm said that this was a giraffe and that was a zebra? Not really. You don’t have to explain that to them. Right? Do they want to understand why some decision support algorithm, like land, like a spatial optimization algorithm that assigns this part of the country or this part of the county into protected land and this part into industrial use and this part into urban growth and expansion? How that works and why people thought that this was the better policy than that?

Sam Charrington: [00:45:14] Probably so.

Lucas Joppa: [00:45:15] Yes, they do. I think, there’s a lot of hand-wringing and angst right now around conversations like Explainable AI and whatever. I think, it’s no different than the conversation we’ve always had about modeling, which is why it’s a model of a complex system. Why are you building it? If it’s being built to just do a simple classification task and it’s easy for a human to go and check the accuracy left or right, then great. You can use some really advanced statistical techniques. If that model instead is a model of, for instance, a human decision process, then I think the onus on kind of explainability is much higher.

Sam Charrington: [00:46:03] Along those lines, we’ve used computation to understand the environment and climate for a very long time. Weather, for example, has been a great focus of high performance computing. Taking a step back from the fact that we’re all really excited about AI, where do you think AI offers unique opportunities relative to the things that we’ve done for a long time?

Lucas Joppa: [00:46:31] Sure. Well, I think that the answer to that will be super complex, I’ll try to make it simple, you mentioned weather. I think sure, there’s no question that statistics, and math and then the computational platforms that started to support them over the recent decades have been used for environmental monitoring. I mean, Fisher was, it goes all the way back to some of these guys were biologists. Right?

The bigger question is why are we excited about this today? For me it really is the full broad definition of what we mean by AI. It’s the recognition that we’re finally deploying computing systems that can collect unprecedented amounts of data and not just amounts, but we were talking about the full crazy dimensionality of the data that we’re starting to take on. We’ve got this breakthrough in data, we’ve got this breakthrough in infrastructure, where you can … I made a joke about needing 1000 GPUs. Well, if you need one, 1000, 10,000, you just got to turn a knob these days and get access to it.

Sam Charrington: [00:47:43] Wherever you are on the knob, it’s still a lot cheaper than a supercomputer.

Lucas Joppa: [00:47:47] Extremely. We have made crazy advances in just a whole plethora of algorithms, but for a lot of the most important ones, we’ve directly accelerated the compute, through the perspective of those algorithms. For the first time, and then of course we’ve made it so easy to deploy these algorithms as web-based services, as APIs, right? Then, of course the software infrastructure stack and all of that is incredible. We’ve made it commodity level infrastructure, anybody can get access to this stuff. You hear this term Democratizing AI, what we mean by that is bringing it all into a stack that anybody can use. You don’t need access to a government-run supercomputer anymore, that’s all one side of it.

The other thing is from weather, as a great example here, where traditional weather forecasting was strong numerical simulation. That’s one type of math, right? But, there wasn’t a lot of learning in real time about what was going on. We took a physical process, we built a model that we thought strongly corresponded with it, and then we ran numerical simulations of it. Fast forward and yeah, just for the simulation perspective, you need a lot of compute. The question is, but all sorts of crazy things happen when we do that, that we don’t quite understand. Right? Little eddy fluxes happen in some atmospheric layer or whatever and we don’t really know why.

Then the weather community started using Machine Learning to not necessarily learn why, but to be able to predict for one reason or another when those things were going to come, and weather forecasting got a lot better. Same thing is happening now in climate modeling as well. We know there are things that we just can’t do from our traditional approach to climate modeling. There’s a whole new group that just spun out, that’s taking a purely Machine Learning-first approach to building a new climate model for the world and not positioning themselves as better, but positioning themselves as complementary.

I think that there’s a lot of work that’s just happened in commoditizing all of this stuff, as well as recognizing that while we’ve taken a hugely mathematical, statistical and computational approach to doing some of the stuff in the past, Machine Learning is a different approach, right? It’s a data-driven approach, and that can be very complementary, and we’ve seen it accelerate extremely economically important things like weather forecasting, forestry, agriculture, and on and on.

Sam Charrington: [00:50:31] As we wind up. Zack, can you share something that you’re particularly excited about, looking forward in terms of the application of AI to Forestry?

Zack Parisa: [00:50:42] Yeah, absolutely. I mean, obviously we’re excited to be releasing this data set, but it’s really about what it enables. We’re excited to see more nuanced and reactive markets around environmental services like species, habitat, carbon, water, be informed by these types of data and to play a part in that process to integrate these concerns into ongoing management decisions. That’s the biggest piece. It’s what you can do with this information, as you even move it from data to information to decisions.

Sam Charrington: [00:51:29] Lucas, how about from your perspective, as you look at this from both a very technical and research perspective, but also as managing and interacting with this portfolio of innovators that are working in this space. What are you excited about?

Lucas Joppa: [00:51:48] Well, ultimately the future I see, and the way that we’ve structured the whole program, is we think what the world fundamentally needs, or what society needs, is the ability to query the planet by X, Y, and T. We need to be able to ask questions just like we ask some potentially-

Sam Charrington: [00:52:10] No zed?

Lucas Joppa: [00:52:10] What’s that?

Sam Charrington: [00:52:11] No zed?

Lucas Joppa: [00:52:12] No zed. Well, I was actually speaking with my team the other day and I had sent a slide that said X, Y, T. Apostrophe Z and I was like, I said, “Stretch goal.” So, yeah, we get the zed dimension then I can retire. But no, I think, ultimately that’s where we need to go, we need to be able to allow people to ask for any particular piece of land or water, what was there? What’s there now? What could be there? Empower policy makers to figure out what should be there. We’re far from that.

Now, Microsoft has always had an “empowering an ecosystem of customers and partners” approach. We don’t look at the world and say, “Oh, say we buy into my X, Y, T vision.” We don’t see that as some fantastical crystal ball that the world spins around and taps on, we see it as a constellation of services and products and solutions brought by all sectors. What we’re looking to do is engage with the Silvia Terras of the world. Unfortunately, there are far too few at the moment.

Engage with those that are there, bring up the next generation and the next and the next, until eventually there’s a self-supporting community of Machine Learning. We talk about born digital. I think about born Machine Learning, these organizations where it’s just baked into their DNA, but the organization doesn’t exist because of Machine Learning. It exists because of the challenges that we face in the environmental space. They just are capable of ingesting Machine Learning approaches natively and efficiently and treat space and time as first-class data citizens in this world of Machine Learning.

Sam Charrington: [00:54:07] Fantastic. Well, Lucas and Zack, thanks so much for taking the time to chat with me.

Lucas Joppa: [00:54:13] Thank you. It was a pleasure.

Zack Parisa: [00:54:14] Yeah. Thanks Sam. Appreciate it.