So, we all know the story of Isaac Newton, where he went and sat under a tree, and an apple fell, hit him on the head, and led to that aha moment, which led to the discovery of gravity. But what if, when Isaac sat under that tree and was hit with the apple, he didn’t have that insight? Instead, what if he thought, man, I need to go find a different tree? The reason I bring this up is because I wonder if sometimes we actually haven’t learned from Newton, and we tend to spend our time designing special apple helmets rather than seeking to understand what makes the apples drop in the first place. And this all starts with asking why. Today I wanna talk about why it’s important to ask why, why we’re not doing as good a job of asking why as we could be, and I want to give you a framework for asking why that will result in learnings that are easy to use and yield a bounty of insight, rather than a single bite. So we are wired to ask why. If anyone here has been around a toddler, at a certain age, that is all they can do. Then think about our nearest relative, the chimpanzee, or the bonobo, depending on your theory. When they’re the same age as a toddler, they aren’t asking why, they’re thinking about bananas. We are the only animal that we know of that asks why. And in some ways you could say that asking why is what makes us human. And yet, we grow up and we join the workforce, join a team of data scientists, or experimentation, or product marketers, and we stop asking why and we start asking what. Or just talking about what. Let me give you an example. You probably can’t read this, but this is some analysis that I pulled from a recent report that my team did. And so it has some results, something along the lines of deeper engagement rate increased by 5% overall, 24% for eligible visits. Those are what we could call the outcomes, those are what happened.
Then we move into the insight section. And it says return visits engage more deeply than new visits. When I read that I think, is that an insight? No, that’s really another way of saying what happened. As an executive this irritates me, because I really want to know what is going on with tests. Why did return visits engage more deeply than new visits in this case? So, how did this come to be? There are a lot of good reasons that we don’t ask why as analysts in experimentation programs. First is that, you know, you’ve all seen the Which Test Won challenges, where you’re supposed to predict which is gonna win, and you of course rarely get it right, and it all reinforces this idea that we’re not our customer, and so we shouldn’t even bother trying to anticipate what they need. The next is that we already have a huge amount of data, especially as personalization becomes a thing. And it’s already really complex, and difficult to parse. And that’s just what’s going on. When we start adding why this is happening, there are usually many whys to every what. And so now it’s exponential complexity, and that feels overwhelming. And then another reason is that it feels irresponsible to over-interpret your data. You certainly don’t want to come to a conclusion that the data really did not drive towards. And then there’s all the other things like, well, is it really our job to do this? Should it be a marketer’s job? And if it worked, who cares why it works? And we really don’t have time for this, we have a whole bunch of tests in the backlog right behind us that we need to focus on, so let’s move on. So, my belief is that we really need to start talking about why, because we’re leaving a huge opportunity on the table. For one, executives crave insights. That’s actually one of the first things I learned when I was starting my company in 2003, and trying to pick up clients.
My mentor said, you know, what you need to do to earn any kind of business is to tell your client something about their customers that they don’t know. And that always stuck with me. And it’s the same thing that we, as employees of large organizations, also need to deliver to our executives. Telling them something they don’t know about our customers. Another reason that we need to start asking why is that whoever owns the narrative wins. As an experimentation team, we have access to all the data, we understand the data more deeply than anybody else, we deliver a report, and then we walk away. And the person we gave the report to now has the opportunity to own the narrative. We should be the ones that deliver that, and it will be great for everybody. And a third reason is that it makes your experiments more durable. If we were only trying to understand what works, like do blue buttons work, does this navigation work, does this experience work, then we’d redesign the site again a year later, and all that context has changed, what we thought worked doesn’t work anymore, and so you have to throw away all of that thinking. If you’re thinking about why this works, why does the customer have this need that led to this behavior, your customer doesn’t change nearly as fast as the rest of your site, and that will therefore make all of that work far more durable. So, the way that we start to think about why is using the concept of altitude. So let’s pretend we’re going on a hike. This is actually a couple of photos from a hike I did in Yellowstone last year. So you start at the bottom of the mountain, and there’s lots of trees around you. You don’t really see much beyond the trees. Then you start working your way up the mountain, the trees thin, and you start to see some of the horizon in the distance. Then you get to the top of the mountain, and it is like gorgeous.
This is actually, I grew up in Alaska, and this is where I used to ski all the time, Mount Alyeska. So you get to the top and this is like the whole point of your hike. And now you can see where you came from, you can see the detail around you, and you can see in the distance the mountains and the lakes. Then let’s say we continue up in altitude, and now we’re flying over the mountain in a plane. You can see all the trees and the lakes, but you can’t see the detail anymore. And then let’s say we keep going up, and now we are in orbit, and you feel you can see everything, but actually you can really see nothing. You see just really the horizon. So this is a helpful concept when we’re asking why, because we’re looking for the right altitude, where it’s not so high that it’s not useful, and not so low that it is not transferable. So it’s looking for things that are useful and transferable. So today, I want to introduce an insight altitude framework that we’ve been developing over at Brooks Bell for the last few months. We want it to help you think through all these altitudes, and help guide you to the right level that makes your why statements more applicable. So the way you start to use this framework is, when you get your result from your test, you sit down and say, okay, why did this happen? And the first thing you do is start thinking about your customer. Who are they? And let’s say you think maybe they’re not tech savvy, they’re an older audience, what does that really imply about their behavior? How would that affect how they might approach life and your site? Then you take it a level lower, and you say, what is their mindset while they’re shopping today? Are they anxious? Are they just killing time, are they bored, or do they know what they wanna buy? And then you start writing down your possible explanations. And these may not be right, they may be completely wrong, but it starts giving you empathy.
As Brian just mentioned, being in an analytical mindset makes it very difficult to be empathetic. So we’re trying to pull out of the analytical mindset into the empathetic mindset. So an example of a customer theory is that toy shoppers on mobile already know what toy they want to buy. And so maybe that’s why these shorter descriptions worked. When it comes to mortgage shopping, the only thing that matters to customers is the rate, which is maybe why, when we changed the image from an illustration to a photo, it didn’t really make a difference. So you can start pulling in these ideas about the customer that may explain what happened. Then once you have your customer theories, you can start to categorize them by altitude. I’m not going to go into details on these, but there are five levels going from the outcome ground level through to the UI and design platform, into summit, which is about state of mind, into jetstream, which is about personas, and ultimately to satellite, which is about behavioral economic principles. So you kind of map it to that, and I wanna give you a hint. The one that is the most useful and transferable in general is the summit level. This is when you’re at the top of the mountain, and you can see just the right amount of distance. And that is about customer state of mind. What are they thinking or feeling today as they’re working through your site? And think about it also in terms of different segments that you can track. So let’s look at this in action. Simple test from a beauty site. Basically, they were testing chat. They already had chat, but we had one version where we expanded it, and it said our online experts will help you find custom fit solutions, and that was it. So chat won, the expanded one won, lifted RPV by something like 5%.
And so normally, the old way, your report would be something along the lines of a much more complex and detailed version of saying that increased RPV by 5%, your insight is that chat works, and your recommendation is let’s put chat on other beauty sites. It’s pretty straightforward. The new way is that we should still be saying okay, the chat increased it by 5%, but now why? What are three to five reasons that could explain why it worked? In order to get those explanations, what we have to do is aggressively put ourselves into the customer’s shoes. So in thinking about beauty, I start thinking about buying makeup myself. You know, who has bought makeup in a department store? Every woman in here. And so you’re sitting in the chair, getting a makeover, and on one hand it’s a really lovely experience, you know, the makeup artist is telling you you have great skin, and she’s fawning all over you, and you know, it just feels really nice. On the other hand, it also feels super awkward, ’cause you’re sitting in a very busy area, it’s public, it’s bright lights, and you’re kind of having this intimate moment with this artist in public, and it’s just weird. But it ultimately results in you walking away with this hope and belief that just buying that $40 tiny pot of colored dust is going to make you a far better person. So that is the state of mind that I think of when I really think about buying makeup. And then we go back and ask, how does this type of experience apply to online? Is it different, should it be different? Is it a different mindset? And really, what parts of that department store experience apply online? And we’re just thinking about chat. Did chat work, why did it work? So once we’re in that state of mind ourselves, it’s a lot easier to come up with the reasons.
Maybe from the highest level, behavioral economics, it’s just about loss aversion, which is a standard principle. We’re all afraid of making the wrong decision and losing money by buying the wrong color of foundation. But going a level down, maybe it’s about being beautiful, that the target audience are women, and there is a department store experience that we’re thinking about, is chat more like the department store experience? Maybe that’s why it worked. Maybe a level down, it’s about being anxious, about buying the wrong makeup, it’s the feeling that you’re feeling while you’re on the site, and so they need to be reassured by an expert. So maybe the insight is about reassurance. And then maybe they work just because now we can see it, it’s easier to see the box once it’s expanded. So I think the one that’s most useful and transferrable is about being reassured. And this beauty site has actually gone on and transferred this into other site experiences. They have one, it’s a live virtual try on for the lipsticks. You’re saying obviously you’re afraid of, you wanna see what the lipstick looks like on you. They also have a whole page that says many things that reassure you. We’ve got your match, guaranteed, it said there’s more chat, another way to chat there, and a whole foundation finder. They also talk about free shipping and free returns, and it talks about how easy it is to return it. And so they’ve clearly taken that one idea and amplified it across the board, and clearly it’s driving even more and more wins. So that leads me to my last point. And once you have customer theories, you’re creating tons of them, and you’re starting to find ones that are transferrable, and you start attaching them to future tests and past tests that support this, now you’re emerging with a key insight. And I define a key insight as something that has at least five experiments that support it. 
And key insights are, you know, you should have 10 to 20 of these at the end of every year that are CEO worthy, or CMO worthy, that you can start sharing with the rest of the organization. They are data driven, due to the fact that you’ve iterated on them and attached them to so many different tests. And so this is a framework to help us take data and apply some psychology in a standardized and usable format without getting completely overwhelmed. And again, I see key insights as the opportunity. Again, executives crave customer insights. That’s what they really want, and they also want to make a lot of money, but you know, finding out how much money you made is great, but what you really want is to know why we made that much money, so we can make more of it. And whoever owns the narrative wins, and that could be us, and it will ultimately make our program far more durable, which is I think the goal of any program, whether it’s experimentation or marketing. So back to Newton, what if he never asked why? If there was ever a key insight, it’s gravity. So my employees suggested I drop the mic here, but instead I’ll hand it over to Reid. Thanks.
– Thanks Brooks. Remind me never to go after you, or at least if you drop the mic just to leave it there. You set a high bar. So what I wanted to do here is take a few minutes to discuss what it means to me to think like a data scientist. I wanted to do that under kind of two separate lenses. First, I think we’re moving away from a traditional AB testing environment and more to an algorithmically driven AB testing environment and if that’s true, the second thing we need to understand is if algorithmically driven testing is the new normal, what do we need to be doing to lean into the questions of why, to understanding customer insights to really drive the most value from our testing program. Alright, so I wanted to spend just a minute describing what a typical data scientist thinks about. And let’s focus on digitally focused experimentation based analyst or data scientist. So first they spend time with the business, understanding the question they were trying to answer. They collect a host of data, they clean and enrich that data, ultimately they build statistical models that represent truth in their mind, and really where a data scientist is most focused is understanding the model performance. So data scientists want to minimize errors. I think that’s very very important for experimentation for our data community. I think everyone knows there are four possible outcomes from any given AB test or personalization campaign. You want to focus on maximizing true positives and true negatives, you want to find every winner that exists, and avoid falsely identifying any experience that is not a winner. So you wanna avoid the false positives and avoid the false negatives. It’s a really hard click here. 
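The four possible outcomes Reid lists can be written down as a small tally. This is only an illustrative sketch, not code from the talk: the function names `classify_outcome` and `tally_outcomes` are hypothetical, and it assumes you somehow know afterwards whether each challenger was truly better, which in practice you rarely do.

```python
from collections import Counter

def classify_outcome(declared_winner: bool, truly_better: bool) -> str:
    """Map one test decision onto the four possible outcomes."""
    if declared_winner and truly_better:
        return "true positive"    # found a real winner
    if declared_winner and not truly_better:
        return "false positive"   # called a winner that is not one
    if not declared_winner and truly_better:
        return "false negative"   # missed a real winner
    return "true negative"        # correctly declared no winner

def tally_outcomes(decisions):
    """Count outcomes over (declared_winner, truly_better) pairs."""
    return Counter(classify_outcome(d, t) for d, t in decisions)

# Hypothetical decisions from five tests, for illustration only.
example = [(True, True), (True, False), (False, True), (False, False), (True, True)]
counts = tally_outcomes(example)
```

Minimizing errors, in these terms, means keeping the false positive and false negative counts as low as possible.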
So this was my world back in 2014. I started with Brooks Bell, I was hired on as a data scientist, and I spent a lot of time heads down in our studio, so this was kind of my life here, over here in the left-hand portion of the screen, and I just wrote a bunch of code to reflect the environment. So I was focused on minimizing errors upfront, by making sure the campaigns that we ran were properly powered, that we didn’t miss any winners, and we also focused time writing code to solve the analysis portion. So to understand what we are recommending, and minimizing errors in that way. So from that perspective we wanted to write non-parametric models to handle metrics like revenue per visitor, that typically aren’t represented well in a simple t-test, for you statisticians out there. And we also wanted to address the multiple comparison problem, which manifests itself as you continue to add on challenger after challenger after challenger: your false positive rate actually increases. So we wanted to address that. And we wanted to address it because it simply meant that we made better recommendations. And so I was feeling pretty good about ourselves and our team and the environment that we created, but I was only feeling good about that because it matched what the landscape looked like. What the lifecycle of an AB test was. And so this was a traditional AB test experimentation lifecycle. So the analyst would involve him or herself upfront, in the data strategy, they would start the test, and they would end the test, and so there was like a discrete ending to an AB test. And then you know, the lights would come on, the music would sound, and everyone would look at the analyst for the recommendation. So that’s kind of the value, I thought, an analyst had, during that portion of the lifecycle where they would make the recommendation. And so we built code that solved that problem. But is that really everything we need to be solving for?
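The multiple comparison problem mentioned above can be made concrete with a little arithmetic. This is a minimal sketch, assuming each challenger is compared to control independently at the same significance level; the talk does not say which correction the team used, so the Bonferroni adjustment shown here is just one common option, and the function names are hypothetical.

```python
def family_wise_error_rate(alpha: float, challengers: int) -> float:
    """Chance of at least one false positive across independent comparisons.

    Assumes every challenger is tested against control at level `alpha`
    and that the comparisons are independent (a simplification).
    """
    return 1.0 - (1.0 - alpha) ** challengers

def bonferroni_alpha(alpha: float, challengers: int) -> float:
    """Per-comparison threshold that keeps the family-wise rate at or below `alpha`."""
    return alpha / challengers

# With alpha = 0.05, the false positive risk grows with each added challenger.
for k in (1, 3, 5):
    print(k, round(family_wise_error_rate(0.05, k), 4))
```

Under these assumptions, one challenger carries a 5% false positive risk, while five challengers push the chance of at least one false winner above 22%.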
And the reason I think I was laser focused on that was, when I started back in 2014, this was our mission, vision statement. It was to eradicate bullshit in marketing. And you know, I think, we’ve... Cops are coming, I cussed. So I think, you know, we as an agency worked to do that. But I think the broader community was busy doing this as well, I mean there’s 150 of us here at an experimentation conference. And so I think everyone recognizes the value of measuring directly over a control. So if we solve that problem, where do we go from here? So we sat down in 2017 as a senior management team, and tried to redefine our mission, vision statement. And it was around discovering the people behind the data. So Brooks started to encourage us to ask questions like why, and find the right altitude for that insight. And honestly, as a data scientist, that made me uncomfortable because it was squishy, there wasn’t this discrete recommendation that I was charged with. And so I was kind of at an impasse. Like, what do I do? Do I listen to the CEO and founder, and say this is what we need to do, or do I push against that? And ultimately I came to the conclusion that as data scientists we do need to be answering the question of why. And it’s not simply because that’s what Brooks said, it’s because the landscape is changing, and we need to adapt how we interact with that landscape. So this is what I think the AB testing and personalization world will look like going forward. You’ll notice the main difference is there is not a discrete end to a test. So what will happen is an analyst will advise a creative team to come up with maybe a dozen creative assets. The test will be launched, and I think going forward, the machine is gonna match the right customer experience to the right user.
And so what you’ll notice there is there is potentially no discrete ending to that test. If the test launches and it’s performing well, and the machine is continuing to self optimize, there is not that moment where the lights come on and the music starts, and the analyst can come in the room and make that recommendation. So what’s the value of the analyst in this new paradigm? I think it’s in the strategy upfront, and the analysis at the back end, specifically aligned around the point of iteration. So to make this a little bit more concrete, I wanted to walk through a quick use case with you all, and describe what answering the why looks like in this new paradigm. And to do that, I wanted to talk about a case study or a test we ran for a baby retailer. They weren’t selling babies, they were selling accessories and clothes for babies, I wanted to make that clear. And we were testing on the registry homepage, focused on the KPI of increasing registry adds. And so this was a collaborative effort between us and our clients. On their side, their analytics team did a lot of work on the front end of that strategy. This was an authenticated environment, couples would log on, they would enter their due date, and when that couple would come back to the website, we would know how far along they were in their pregnancy. And so we realized that there were certain products that were added to a registry at different times, if you look at that from a time series perspective. So the value in the analytics was upfront in the strategy there. And so I wanted to walk through what that looks like and how an analyst can influence that strategy by focusing on the state of mind. So upfront this was the first trimester, and this isn’t like a rendering issue, what we wanted to point out was the idea of restraint. So one of the toughest things for a data scientist to do is to actually show some restraint.
I think there are a lot of really current issues with Zuckerberg and Congress, and Cambridge Analytica and all that, that would point to the need for restraint. We felt it was a little bit odd: during a first trimester you may not have even told your closest family and your friends that you’re pregnant, so why should a machine recognize and try to target you based on your due date if your closest friends and family don’t know? So targeting started early in the second trimester, and it started with a message like this. So this was week 16 through 18. If you logged on when you were at that point in your pregnancy, this is the hero you would see. And it’s very inspirational, dream it, design it, there’s a nice white crib, where your baby is gonna obviously sleep peacefully through the night, and probably not, spoken as a dad of two kids, not gettin’ any sleep in that crib. So this was the state of mind that the analyst was able to advise on. They were able to say that crib was purchased. And they were able to guide the creative team in making that asset. As you moved towards the end of the second trimester, things were starting to get a little bit more practical, so you’ll see this one titled let’s go baby. It’s focused on strollers and car seats. You realize that, you know, you’re gonna have this kid, but you’re gonna have to integrate it into your life, so there are very practical things that you need at this stage of your pregnancy. And this is the end of the third trimester. I’m only gonna highlight three of these heroes, there are actually nine. And this was when things were starting to get real real. So, you’re like okay, D-day is imminent, and so we’re gonna have to deliver this baby, we’re gonna have to bring him or her home, so we need to take classes. CPR classes, delivery classes, things of that nature. So you have a picture of a nice couple here strolling through classes, a very sanitary environment. I suggested this hero image.
So that’s me in a class about delivery about four years ago, before the birth of our first kiddo. And so we’re all holding signs that say cesarean, catheter, forceps, none of the stuff you wanna think about, so I don’t blame the marketing department for going with this particular image. I blame it a little bit, I wanted some level of internet fame. So I wanted to focus on, you know, ultimately for experimentation we’re still outcome based. So we can ask the questions of why and we can lean into that, but ultimately our success comes down to measurement. And if we look at click-throughs for the A spot banner, you’ll notice they’re down 70%, and that’s huge. So on the surface I would say that’s a huge fail. This personalization test was a huge fail. But if we start to ask the question of why, and what’s the customer state of mind here, and look at what we’re comparing against, we realize that the control was very promotion driven. So it was look at X product and save Y dollars. So it was this discrete hard call to action. And we were trying to, in a very subtle way, influence the customer state of mind by saying hey, we know you, we have a relationship with you, and here’s what you should be thinking about at this stage of your pregnancy. So it was really like an uphill battle. The nice thing is that engagement is rarely the end all be all. So we looked at the primary KPI, which was registry adds per visitor. And you’ll notice that we actually saw a 4% lift. And that was to me tremendous, because usually there’s a really strong positive correlation between engagement and that final KPI. But what we were able to ascertain is that people were clicking through just to try to chase the dollars off, but they actually didn’t want the products that were being offered.
So by suddenly understanding the consumer state of mind, working with the creative team to design assets that spoke to him or her, we were able to actually drive the KPI that the business was interested in. It was a big win. So I mentioned the analyst is gonna be involved in two things, and everything we’ve been talking about up until now is on a strategic level, before the test actually launches. I wanted to point out one thing over here: there was only one hero image that performed worse than the control, and it was that very first hero image that I showed you, the dream it, design it, with the nursery, week 16 to 18. So that’s where I think the value of the analyst can come in even if the tests are algorithmically driven. So an analyst would sit down, analyze the test while it’s running, come to the conclusion that that particular hero image is performing poorly, and make recommendations on how to change that. So I think that that’s really, in my mind, what it means to think like a data scientist in this new algorithmically driven world. And you know, I think the key idea here is not to neglect our expertise. We don’t have to put away our clustering algorithms, we don’t have to put away our propensity models. We just can’t apply them in the same way that we had been in the past. It’s not just focused on analysis and recommendations. We have to lean in to the why. We have to apply our expertise to influence others, and as Brooks said, we actually have to own the narrative, and there is no better person to help own the narrative than someone that has data on their side. And we need to step out onto that ledge, embrace the discomfort, and start doing that to the betterment of our clients and their customers. Thanks.
Brooks Bell herself, along with Reid Bryant, VP of Analytics & Data Science, explain the importance of data geeks getting out from behind the monitor and working with design to craft messages that encourage consumers to take action.