THE KNOWLEDGE PANEL Episode #9: How to use Big Data to scale SEO

It’s one thing to be able to SEO everything you can touch manually, but what happens when you manage a website that has millions of web pages? What happens to your SEO if you have millions of data points and it’s impossible to process and implement your SEO strategy manually?

In Episode 9 of the Knowledge Panel we discuss how to use big data to scale SEO with several of the world’s leading experts in harnessing big data to power SEO activities – joining Dixon Jones are Wil Reynolds from Seer, Rachel Hildebrand from MoneyGeek and Laurence O’Toole from Authoritas.

Want to Read Instead? Here is the Transcript

Dixon: Hello, everyone, and welcome to the “Knowledge Panel,” Episode 9. And today, we’re talking about big data sources and how you can use big data to leverage your SEO. I think it’s gonna be a great session. Certainly, it’s a great panel, so if it’s not a great session, I’m gonna blame them. But let’s start by introducing ’em. Thanks, everyone so much to the panel for coming on. Rachel, why don’t we start with you first? And tell us about yourself, where you’re coming from, you know, what you do?

Rachel: Hey, I’m tuning in down here from Orlando in my log cabin as you can see in my background. I’ve been in the SEO industry for about 10 years now. I’ve kind of done multiple things. Started out doing [inaudible 00:00:44] and backlinking, moved on to Disney, and then also tried my luck as a…the agency life working with Stone Temple, now Perficient. And then after that, yeah, falling in love with some of the clients I worked for there, I jumped into working in addiction and behavioral health. So, getting into a lot of the YMYL heavy EAT-focused niches, and now I’m testing my luck working at MoneyGeek doing credit cards and really learning what the true meaning of big data is.

Dixon: Yeah, a big competitive marketplace to be in as well. So, yeah. So, YMYL, Your Money and Your Life, and authority, expertise, and trust, being EAT, and YMYL. Okay. Laurence, why don’t you go? How are you?

Laurence: Well, hi, everyone. I’m great, thank you very much, Dixon. Good to see some friendly faces.

Dixon: That was [inaudible 00:01:38].

Laurence: Yes, indeed. So, I’m Laurence O’Toole. I am the CEO of Authoritas. I’ve been doing SEO, well, since I had dark hair, which was some time ago, probably a year or two [crosstalk 00:01:54].

Dixon: Yeah. Okay, I remember. I remember you interviewing me with dark hair. That was…

Laurence: Thanks very much. Don’t rub it in. Yeah. So, I started out, I think, in-house hired…I used to run the digital business of a large Yellow Pages company, and we hired an agency back in the, I don’t know, late ’90s and 2000s. They’re now a client of mine, which is nice. And, yeah, I started then, but quickly started out on my own in sort of 2009 building tools and playing with data, and haven’t really looked back. Most people don’t want a spreadsheet of, you know, two million rows, but I’m quite happy playing with big data. I know I shouldn’t. And, yeah, I’m really here in such illustrious company, hopefully to contribute some pearls of wisdom, but if not, learn a lot myself.

Dixon: That’s brilliant. Authoritas is a great tool. I mean, I remember it when it was… For those who don’t remember, or do rather, I know you’re supposed to not say a tool, really, but it used to be called AuthorityLabs, but you can’t remember. Anyway. But anyway, it’s now Authoritas.

Laurence: Analytics SEO.

Dixon: Analytics SEO. Sorry. Yeah.

Laurence: Yeah. It’s all right. No worries. That’s another one of your old customers.

Dixon: Yeah, yeah. Sorry, sorry, sorry. Yeah. Okay. Analytics SEO, yes. Yeah, yeah. And then you bought out Linkdex on the way, as well, didn’t you?

Laurence: Oh, yeah, we did. Yeah.

Dixon: Wil, long time no see, mate.

Wil: What’s…?

Dixon: How are you?

Wil: I’m good. How are you?

Dixon: I’m great. Tell us about yourself and Seer and everything.

Wil: Oh, geez. So, I started in search August of 1999. Been at it for a little bit. Somewhere along that road, I started Seer. That was 2002. We’re somehow still here, you know, still standing. We do SEO, paid, social, and analytics; those are basically, you know, our areas of expertise.

Dixon: Excellent. Okay. Guys, thanks everyone so much for coming on. If anybody’s listening out there, we’re streaming on Facebook and YouTube and Twitter. Feel free to tell anyone else that, you know, we got Wil and Rachel on the call, you know, and you can say Laurence as well, if you like. And I’m pinning a link. And if you’ve got any questions, feel free to ask them. We’ve also got the luxury of having some production in the background. David, where are you? Do you wanna come in and say hello and [crosstalk 00:04:20]?

David: Good day, everyone. So, I’m just looking forward to a wonderful conversation. How to use big data to scale SEO. Three wonderful panelists we’ve got here. I’ll tell everyone a little bit about where else we’ll be broadcasting at other times, and also where you can get the podcast just at the end of the show as well.

Dixon: That’s amazing. Okay. So, guys, of course, this thing is all sponsored by InLinks and all put together by InLinks. If you haven’t tried InLinks, okay, well, that is a data source, a big data source. We’ve built our own knowledge graph, and give it a try. It’s free. Well, it starts free anyway. Obviously, we want your money in the end. So, let’s get onto the questions and see what pearls of wisdom we can get out of people. I’m gonna start with you, Wil, and then ask everyone, really. Well, firstly, I wanna say, what does big data mean to you, really? I mean, how do you define big data?

Wil: Oh, geez. That’s tough, but I think it’s when the spreadsheets start to fail to meet. You know, it’s, like, when you start realizing that the time that you’re spending manipulating data multiple times in a spreadsheet, they just start to fail. And that’s when you have to find a new way to manage that much data.

Dixon: Yeah. Okay. That’s fair enough. Particularly, the pivot tables, they seem to take the effort, really. So, yeah. Anyone else wanna stick up a different view of big data? Or shall I just dive into things I…?

Laurence: I’d just say for me, it’s about context more than anything else. It’s when you want to just stretch and go beyond, you know, the basic report. So, basically, look beyond, I don’t know, what keywords your pages are ranking for and how they fluctuated over time to properly understand your competitive environment. You know, every key page on your website’s not competing with the same set of competitors, for example, for most sites. So, it’s when you just need to stretch yourself and go beyond what you can do in a basic Excel or a basic, you know, Google Sheet, and need more data, and, you know, more analysis or interpretation of that data. You need to slice and dice it in different ways. So for me, it’s about context and going beyond.

Rachel: Yeah, I agree with that. And kind of, like, building off that, we use big data a lot to, like, find how we could differentiate. So, kind of, like, building off what Laurence just said, using multiple data sources, not just looking at the basic SEO ranking factors, sometimes even, like, building your own data sources to figure out how you can be competitive in markets that are more about things other than just keywords and backlinks. We had to use that a lot in my previous experience.

Dixon: I think that’s an interesting and an important take on it, is that oftentimes, big data is about blending other data sources, really. So, data sources that are out of your grasp perhaps for the whole data set. But ultimately, it’s the blending of different APIs and different data sources, and I think it’s an excellent definition of data manipulation and how SEOs use it. And I think, for me, I’ve been quite lucky in that I’ve been working with…well, previously, previous life, for Majestic, and now InLinks. Basically, yes, data sources that are too big to fit in a spreadsheet. And then, finding ways to have an output for those so people can then use and manipulate, you know, sort of from server side. It’s probably my approach or my angle on that. So, what, you know, for SEO purpose, or for any other purpose, what are your favorite big data sources to jump into? So, I’ll start with Laurence because I know what his big favorite data source is gonna be.

Laurence: I’m not actually…

Dixon: We might as well get it out. You [crosstalk 00:08:06].

Laurence: I’m not gonna say Authoritas. I’m not. No. You know, obviously, we’ve been building our tools for years, but yes, we build our own data, and, yeah, we capture our own keyword ranking data at scale. So, for me, the biggest source of data, you know, there is Google. You know, we are querying Google and other search engines heavily, as is the whole industry, to try and get the insights that they won’t, you know, give us via API. I mean, if they gave us a nice, handy API, I’m sure we’d rather use that instead. But, so for me, yeah, the search engines first and foremost, and then people’s websites, you know, second. You know, obviously, we are crawling websites, crawling competitors’ websites trying to get insights. So, they are the two data sources. We turn those into a service of course, but, you know, we come third in the queue there.

Dixon: Yeah. No, I understand. Yeah. Wil, what about you? What’s your favorite place to glean data?

Wil: For me, for SEO specifically, it’s paid search data. You know, I think one of the things I think SEOs have always struggled with is articulating our value. And one of the best ways to articulate your value in my opinion is to be able to say, “Hey, if I have all of your paid data, I can see how much you’ve spent for these words, which helps me to make a market price on what you’re currently willing to spend for this basket of words.” And then I apply that to, you know, large-scale scrapes of Google, trying to understand, well, who’s ranking for these different words. But there’s something to be said for… I get conversions, right? I mean, monthly search volume to me is just like… You know, I get conversions, you know, inside of paid, and then I also get how much you spent, which means I can take that and use that as a, “Hey, if you’re willing to spend this much to get these conversions for this word, then blah, blah, blah, on the SEO side.” So, I just love taking paid data and joining it in as just one of the many data sets I enjoy joining to my SEO data set.

Dixon: Okay. I’d like to come back to that in a little bit, but Rachel, what would you go with as your favorite data source?

Rachel: My favorite data source to bring in? Oh, that’s a tough one. But I really…even though, like, I…

Dixon: You can have more than one. It’s okay, you know.

Rachel: Yeah. For me, like, especially working at Advanced Recovery Systems, I got, like, heavy into the local market. So, I spent a lot of time using, like, local SEO data to kind of help influence some of our strategies. So, going from that front, there’s just, like, kind of, like, a not so well-known tool called persuaded.io built by this guy named Zach Todd. He uses a lot of data sources, a lot of it from Google Local and then some other searches, and also the UPS index, to find out if some of these actual websites are legit or not. Like, he can find the spam, which keywords have the most volatility, and things like that. So that really helped us, like, kind of, like, scale some of our local SEO efforts, which was new to me and something that I had never really thought of.

Dixon: Okay. So, I’ve not heard about persuaded.io, so let’s come back in a little bit on that as well and maybe you can dive into [crosstalk 00:11:14].

Rachel: Yeah. It’s very similar to, like, BrightLocal, but a little deeper.

Dixon: Well, okay. We’ll dive into that one a little bit now. So, persuaded.io, but is that largely U.S.-focused? Because I get the feeling that local results are so much more of a big thing in the U.S. than they are in the UK. And Laurence will know because I’m sure he tracks ’em both. But is all of the data there U.S.-centric, would you say, Rachel?

Rachel: I would assume so. But it’s just, like, how he’s building it. It’s definitely U.S.-based, and I’ve only used it U.S.-based for primarily the really competitive markets in Florida, California, and New York.

Dixon: Okay. So…

Rachel: Yeah. He has a lot of big data.

Dixon: Laurence, so am I right in saying that, you know, the U.S. just is so much more into the local results than the…? I mean, I see all the SEO gurus talking about local results so much, and yet in the UK, I don’t see that quite as…you know? Amazon can get [crosstalk 00:12:19] gurus.

Laurence: Yeah. I mean, we’re just a small, little island, aren’t we? What, I don’t know, a fraction the size of Texas or something?

Dixon: Very much a fraction. Yeah.

Laurence: So, yeah, I would say, anecdotally, what we’re seeing from our platform is our U.S. clients, I mean, you know, some of them will have hundreds of different locations they’re tracking, or, you know, a dozen or more major cities. So, for them, hyperlocal rank tracking, you know, is really important. And, you know, it still is important in the UK, but possibly less so. So, we see a greater demand in the U.S., certainly.

Dixon: Well, in those paid tools you were talking about, there’s Google’s Ad Planner and those kind of technologies that you’re talking about. I remember…it must have been quite a long time back, but all of a sudden, I got to an SMX in New York, and Google had decided to take away the Ad Planner APIs, or the local APIs. Oh, we got William Rock in there. Hello, William. They seem to take away all the tools for SEOs and left it for PPC people. And there was a real problem for a while with…SEO companies have been using, you know, those ad-planning tools in their technologies and stuff. Has that calmed down now? Or is that, because you’re such a big PPC player, then they’re not gonna take it away from you? Or is that something you always worry about? Do you get angry about that?

Wil: Yeah, I get pissed regularly because the way that I manage it is I’m actually literally taking what you spent your money on and not your clicks on. Like, I don’t want any estimates, because, I mean, once you actually, like, join the data… So, once you take a tool that says, “Here’s what the monthly search volume is,” especially for people that have any kind of long tail, and then you join that to your actual paid conversion data, you realize that, you know, the average tool that’s telling you a word has no monthly search volume, that for one of our clients, it was 80% of their conversions. Like, 80% of their conversions came from keywords that all the tools out there were saying have no search volume, which means everybody’s ignoring them.

So, for me, I’m literally taking, like, “How many clicks did you get on this keyword last month, or this search term in paid?” Then I’m saying, “Okay, let me run an analysis on each one of those keywords.” So, at any given point, we’re analyzing, you know, five, six million keywords trying to understand what’s happening specifically on the spend. Now, what Google’s done, is for some crazy reason, they’ve said, “Oh, we’re not gonna show you all the words that you actually paid to get a click on.” And, you know, that kind of sucks.
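The join Wil describes can be sketched in a few lines of Python. This is a minimal illustration, not Seer's actual pipeline: the function name, the field names (`search_term`, `conversions`), and the sample numbers are all invented here, and at the scale he mentions the same join would more likely run in a warehouse such as BigQuery than in memory.

```python
def zero_volume_conversion_share(paid_rows, tool_volume):
    """Share of paid conversions from terms a keyword tool reports as zero volume.

    paid_rows: list of dicts with hypothetical keys "search_term" and "conversions".
    tool_volume: dict mapping search term -> estimated monthly searches.
    Terms missing from tool_volume are treated as zero volume.
    """
    total = sum(row["conversions"] for row in paid_rows)
    if total == 0:
        return 0.0
    zero_volume = sum(
        row["conversions"]
        for row in paid_rows
        if tool_volume.get(row["search_term"], 0) == 0
    )
    return zero_volume / total


# Invented sample data, for illustration only.
paid_rows = [
    {"search_term": "long tail term a", "conversions": 40},
    {"search_term": "long tail term b", "conversions": 40},
    {"search_term": "head term", "conversions": 20},
]
tool_volume = {"head term": 500000}  # the tool has no estimate for the long-tail terms

share = zero_volume_conversion_share(paid_rows, tool_volume)
print(f"{share:.0%} of conversions came from 'zero volume' terms")
```

The point of joining on the actual search term, rather than filtering by the tool's estimate first, is that terms the tool knows nothing about still show up in the result.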

Dixon: So, they’ve taken away… The Google not provided has now extended to paid as well?

Wil: Yeah, but, you know, it was different with not provided because they were giving you free traffic. They’re, like, “We don’t owe you anything.” You know, I literally just found a client that was bidding on the word “Things.” Things, and spent, like, $20,000, right? And, you know, when you start taking that information away from marketers and saying, “Trust the machines. They’ll optimize your shit,” it’s like, you know, yeah, I could have paid a thousand dollars for this mic. I don’t want somebody looking at my bank account before they choose how much they’re gonna charge me for it. And with Google, it’s, like, “Hey, if you said you wanna come in at this CPA, you came in under.” It’s, like, “Well, just because I came in under doesn’t mean I wanna pay $20 grand a year for the word things, dude.”

So, what we’re doing is we’re pivoting and we’re starting to offset the Google data with the Bing data to help us to better find, because Bing will give you impressions. Whereas Google won’t give you what search terms you showed up for until they get a click, Bing will show you what search terms you showed up for at the impression level, which means I’m actually getting the data faster even though it’s a smaller data set because Google was hiding so much of that before. So, that’s one of the areas we’re going.

Dixon: Can you…?

Laurence: Can I ask you, Wil? [crosstalk 00:16:08].

Dixon: Go on, Laurence.

Laurence: Do you use Google Search Console data at all, or is it just too limited for you to actually, you know, do anything with the kind of clients you work with?

Wil: No. So, I think there is value in Google Search Console data. The biggest valuable area that I find, and others’ mileage may vary, is the click-through rate. Well, I think when you work with big data and you work trying to join data, your whole mentality, to me at least, is 100% focused on what dataset gives me something that no other dataset gives me that I can join to my other data. And for me, Search Console gives me that click-through rate by position, which helps me to do all other kinds of interesting things. So that’s the thing that I like most out of Search Console, but I don’t use it for much just yet.
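One way to get the click-through rate by position that Wil mentions is to aggregate a Search Console performance export. The following is a rough sketch, assuming each row carries `clicks`, `impressions`, and an average `position`; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def ctr_by_position(rows):
    """Aggregate click-through rate per rounded average position.

    rows: list of dicts with keys "clicks", "impressions", "position".
    Positions are rounded half up, so 1.2 and 1.4 both land in bucket 1.
    """
    clicks = defaultdict(int)
    impressions = defaultdict(int)
    for row in rows:
        bucket = int(row["position"] + 0.5)
        clicks[bucket] += row["clicks"]
        impressions[bucket] += row["impressions"]
    # Skip buckets with no impressions to avoid dividing by zero.
    return {
        pos: clicks[pos] / impressions[pos]
        for pos in sorted(impressions)
        if impressions[pos] > 0
    }


# Invented sample rows, for illustration only.
rows = [
    {"clicks": 30, "impressions": 100, "position": 1.2},
    {"clicks": 20, "impressions": 100, "position": 1.4},
    {"clicks": 5, "impressions": 100, "position": 3.6},
]
curve = ctr_by_position(rows)
```

Summing clicks and impressions before dividing weights each query by its impressions, instead of averaging per-query rates. The resulting curve is what would then be joined to other data sets.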

Dixon: Okay. And I’m sorry, I’ll come back to Rachel and Laurence in just a second. But I saw a video very recently. It’s kind of a promo for Traject. Anyway, you were saying, there’s so much traffic coming through for a client that you say 200,000, 300,000, where I’m only getting one click on a keyword in a year. But, you know, that’s 300,000 clicks of really long-tail, really good converting traffic. And that’s the stuff that you value. So, you’re still very much of the feeling that, you know, long-tail is where it’s at for you in terms of conversions, Wil?

Wil: Well, you know what? I have the data to say where it is and isn’t. That’s the freaking beauty of big data is you stop using your experience as the way that you’re gonna make a recommendation. Instead, like, I can go into BigQuery and find out for every client, what percentage of their conversions are coming on keywords that had less than 10, less than 20, less than 50. So now, I’m not coming out with some blanket, you know, like, “Hey, everybody, look at your long-tail.” I’m like, “No. Bring your data somewhere you can join multiple data sets together, analyze all that data, and then now you can just see what kind of strategies might work for which kind of clients.” If anything, I think that’s the power of big data, is for so long, we had to use our experience. It’s like, “What have I worked on, you know?” A thousand websites? There’s, like, a billion of them. That’s a horrible rate for me to go out and say, “Hey, here’s what I think you should do.” You know, so…
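The per-client breakdown Wil describes might look something like the following. He runs this kind of analysis in BigQuery; this plain-Python version is only a sketch, and the function name, the field names (`client`, `monthly_volume`, `conversions`), and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict


def conversion_share_below(rows, thresholds=(10, 20, 50)):
    """Per client, the share of conversions from keywords under each volume threshold.

    rows: list of dicts with hypothetical keys "client", "monthly_volume", "conversions".
    Returns {client: {threshold: share}}; clients with no conversions are omitted.
    """
    totals = defaultdict(int)
    below = defaultdict(lambda: defaultdict(int))
    for row in rows:
        totals[row["client"]] += row["conversions"]
        for threshold in thresholds:
            if row["monthly_volume"] < threshold:
                below[row["client"]][threshold] += row["conversions"]
    return {
        client: {t: below[client][t] / total for t in thresholds}
        for client, total in totals.items()
        if total > 0
    }


# Invented sample rows, for illustration only.
rows = [
    {"client": "A", "monthly_volume": 5, "conversions": 10},
    {"client": "A", "monthly_volume": 15, "conversions": 10},
    {"client": "A", "monthly_volume": 100, "conversions": 20},
]
shares = conversion_share_below(rows)
```

Because the thresholds are cumulative, a keyword with a volume of 5 counts toward the under-10, under-20, and under-50 shares alike.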

Dixon: “I know what I’m doing. I don’t know.”

Wil: Yeah. I have no freaking clue.

Dixon: Yeah. Okay. Rachel, I mean, you’re dealing with data sets all the time. I mean, that sort of long-tail kind of low volume, high-value traffic. Are you finding that harder to analyze these days? Or, you know, how do you find that end of the spectrum?

Rachel: It’s harder to analyze, but I feel like, especially in the industries I’ve worked in, like, that’s, like, my competitive edge. Like, now, especially with credit cards, like, a lot of the competitors aren’t focusing or putting their effort into focusing on those long-tail keywords. And with some of the data sources that we’re able to pull in, which we use a lot of like you said earlier, like a lot of APIs to pull in data sources from government websites, like at Advanced Recovery Systems, CDC, at MoneyGeek, the financial government bureau site, and pulling in those data sources to kind of see where competitors are missing out on, where our competitive edge can be on. So, at least kind of doing the whole run, walk method, ranking for those longer-tail, more intent-focused keywords, and then hoping to rank for those head terms later on.

Dixon: All right. So, using those data sources aren’t necessarily in the front end of your product, they’re to find out where you’ve got gaps in your product portfolio and using that to fill the product portfolio, right?

Rachel: Yeah.

Dixon: That’s kind of interesting.

Rachel: Luckily, I’ve always been on the front end, so I have used it to, like, do my planning. But there have been times, especially at Advanced Recovery Systems, the strategy was kind of already built by the time I got there, but we had to really redefine that after the dreaded Medic update. So, that’s another time where we had to, like, kind of go back, look at all our data, and then even do a little crafty ways to bring in our own, I guess, like, big data sources from things that weren’t really quantitative before but we made them quantitative in a way.

Dixon: So, I got a question for you, guys, and this is maybe me going out a little bit on a limb. But I had a demo today where somebody was coming in for the InLinks product. And the new InLinks product, for those that don’t know, most of my audience will, but it’s a knowledge graph. So, we’ve taken our own knowledge graph, we’ve created our own semantic connections between entities and ideas, and everything else is built off of that, really.

But we had someone coming in who wanted it for PPC. And it rings that… Wil was sitting there saying that Google now are sort of taking away that keyword granularity that they used to have within the paid search data stuff. And he was saying that he thought that was largely because Google’s paid products are gonna start moving towards topic-based systems as well and move away from keyword systems at the front end as well. Do you think that… And I don’t know if he’s right or I don’t know if he’s wrong, really. I mean, you know, we’re guessing out there. But it does seem to me that this whole entity-based approach to anything that Google does now is starting to become more important. And I’m wondering if it’s gonna start coming into the paid data sources as well, the paid search data sources as well. Any thoughts on that, Wil, or Laurence, or Rachel?

Laurence: Well, I’m definitely not the right person to talk about paid. It was about, I don’t know, year 2000 when I actually built a paid PPC platform. But it does worry me that, you know, it’s like an extension of Google’s dominant market position. Access to data from, you know, a monopoly provider. They need to provide access to data on terms that are fair and reasonable. And I’d say, you know, it’s essential for website owners to understand demand, to build websites, to launch campaigns. And if you are totally reliant on advertising on Google to actually get the data you need, then, you know, clearly that really just strengthens their position even more. So, the specter of them…you know, you can see where they’re going, obfuscating keyword data more. And we’ve seen it on the SEO side with trying to get search volume data out, and then them with the close variants and returning the same search volumes for close variants.

It does look like it’s heading that way. I mean, Wil might know more, or Rachel might know more. They might do more paid campaigns than us. But, you know, it’s a bleak picture you’re painting of them, you know, reinforcing their dominant position and making it harder and harder for us to get the data we need to make decisions.

Dixon: Okay.

Rachel: Yeah. I feel like Wil’s gonna have a lot more to say, so I’ll make my point quick on this. But I feel like it’s heading that direction just with things, like, kind of, like, what Laurence said, things that we’re just seeing in the SEO industry in general. Like, even with Question Hub, everything’s entity-based. Like, you gotta get really specific to find a good question, but from the high level, it’s all entity-based. So, I feel like everything’s…like, Google’s heading that way, so, like, why would they not include paid?

Dixon: Wil, do you wanna jump in with [inaudible 00:23:33] thoughts?

Wil: I mean, the sad part is as long as Google is willing to let people spend $20,000 on the word “things,” they need to show us every freaking piece of data that they have. When I got matched… You know, like, Google’s not using entities for paid because it would kill their business. So, like, I mean, if they use it, right? So, here’s an example. I was bidding on the word exact match GA 360, right? Google Analytics 360. If you type in “GA 360” right now into your browser, you’re gonna get…all their entity work on organic tells them that that’s a Google Analytics 360 query, right?

So why was I getting clicks for words like Georgia 360 on their variant matches? And I’m paying $20 bucks a click for Georgia 360? It’s like, “No, you’re using your entities to build a better organic search engine, but you won’t flip those same entities over on the paid side, because if you did and you stopped showing ads on those, people wouldn’t be spending $20 grand a year on words like ‘things.’” And you know what, then you’re gonna just start to remove the data so we can’t see it as easily, which, you know, you’ll say it’s in my best interest with your machine learning. So, you know, I couldn’t run a business that way, you know? It’s a shitty way to run a business if you ask me.

Dixon: I’m gonna take a different angle just because there’s no point in us just all agreeing on the whole thing, and I wanna bring it a little bit back to organic, I suppose, as well. But there’s other ways in which entities are starting to come out in Google’s products. And a good one, I don’t even have an Android, but I know on Androids, your phones, you have Google Discover. And so, these things are popping up with ideas, and they’re all topic-based. They’re very much based on, “Here’s some interesting pages about hitchhiking,” or…sorry, “about hiking or swimming or whatever,” because you’ve already shown some interest in that topic.

So, when they’re flipping around to entities, I agree they shouldn’t take away the keyword stuff, but by flipping around into entities, they’ve got this other avenue of traffic that they can start monetizing by understanding the underlying topics that people are interested in. And then, you know, I guess at some point, they’re gonna say, “Well, as long as you’re thinking in terms of topics and ideas on your advertising campaigns, then you’re gonna get good traffic coming back as long as you’re matching hiking to hiking.”

Wil: I think the nasty thing is that they just control both sides of the market, right? So that’s the nasty part. Like, “Oh, trust us,” you know? “This is in your best interest.” And it’s like, “Hey, can I get my data to make sure it’s in my best interest?” “No, we’re not gonna give you all of it.” It’s like, “Well, then who’s gonna police you.” “Oh, we will.” And it’s like, “Well, I don’t know if that’s an idea.”

Dixon: That’s a very fair point. It’s a…

Rachel: It’s even worse in local. Like, oh. Like, Local search is so bad. Kinda, like, what Wil’s saying, there’s so many things Google can do to make it better, but it’s in their money’s or their pocket’s best interest to just leave it as is and just leave the independent SEOs to just submit redressal forms to fight spam or people that are, like, bidding. It’s [crosstalk 00:27:01].

Wil: I mean, it’s the worst. Like, yeah, I don’t even want to talk about it anymore because I get so frustrated because it’s frustrating.

Dixon: We’ll move on. We’ll move on. We’ll move on. But William put a point up and I didn’t get a chance to read it out. So, William said, “Google Ads is difficult, especially since they are forcing enhanced bidding and smart campaigns are a waste of money for high-dollar keywords.” And that’s the point, isn’t it? It’s the high-dollar ones where you…as exact match it as you possibly can in the new Google world, and you don’t want Google to learn because Google’s gonna just go worse from your exact match campaign there, which is not great, I will admit.

Okay. Let’s get back to the organic stuff. I’m sorry for sending Wil off on a little heart attack there and, you know, raising the blood pressure, right? So, I apologize for that. So, you talked about BigQuery. You know, getting these data sources is one thing, mixing them and matching them and visualizing stuff and putting those different data sources together is a completely different kettle of fish, really. What tools do you use to do that? And what’s worked for you in trying to put together different disparate bits of information and matching ’em up? What do you use? Who’s going?

Laurence: Oh, I can…

Dixon: Rachel?

Laurence: Okay.

Rachel: I can go. If the data source is small enough, and depending on, like, how many resources you have, I really like Data Studio. It’s free. It can be slow and clunky, but if your data source is small enough, you can do a lot with it at least just to get your point across. But if you have a lot of data… I think somebody mentioned in the comments, too. I got lucky enough to have the access to use Microsoft Power BI, which I know there’s a ton of other tools that do the same as that, but that was a tool that was super easy and not as expensive as, like, other sources like Tableau. And I’m sure there’s cheaper and better ones, but Power BI was, like, easy enough for me not to know what I was really doing to get a lot of data in a way.

Dixon: And is Power BI…is that Microsoft’s one?

Rachel: Yeah.

Wil: Mm-hmm.

Dixon: Yeah. Yeah. Okay. And so, Power BI, I guess is competing with BigQuery, Google’s BigQuery, is that correct or…? No? Okay. What’s, you know…?

Wil: No, I would say, well, Google bought Looker to compete with Power BI. And because Data Studio wasn’t robust enough, like, you know, to Rachel’s point, it was… You know, if you’re dealing with smaller data sets, it’s cool. And they’ve made a lot of improvements, but, you know, I feel like Data Studio is chat to Slack, right? It’s, like, it gets you by and it’s good enough, but when you want to do the real kind of stuff, you know that there’s a better product out there. But, yeah. No. So, you know, usually for most of us, like, I know at least for our team now, is we have a ton of data engineers now just constantly dumping this data into BigQuery, making sure it’s clean so that people like me can easily connect into it and join data.

And the other thing I love about Power BI is, you know, if I wanna mess around with, you know, medicare.gov data, I can just go download a CSV and then join it to all of the data that I’m pulling from BigQuery. So, like, I can do that on my desktop. And I think there’s something to be said for empowering your team to be able to have a clean set of data coming in, and then the tools you’re using being open enough for you to take data from other places and join them into the main data set that you have.

Dixon: Yeah. I think that’s key. So, yeah. And so, Supermetrics is out there, but that’s kind of spreadsheet based as well, bigmetrics.io as well. But these ideas are all really about how to connect those different tools. And then, in the middle of all that is stuff like Zapier, which is kinda, like, the connecting piece, really. I think if I’ve got a data source, whenever I do have a data source, one of the first things I try and get my dev team to do is to get Zapier endpoints so that, you know… I can’t expect SEOs to really go too far. I mean, unless they’re gonna take my API and program it and build their own connections. But a Zapier connection just means that loads of other tools can theoretically just pull in what we built once. And it may not be the cheapest way of doing it, but I don’t have to develop every single time a tool wants some stuff. What about you, Laurence? How do you visualize data?

Laurence: So, similar comments, really. We use Google Data Studio for clients. Find it a bit flaky. Sometimes, it would just, you know… You just have to refresh it a couple of times to get a perfectly good data source to show. It’s a bit frustrating. But sending data into BigQuery and then visualizing it in Data Studio is a much better way to go. And obviously, once you send the data into BigQuery, to Wil’s point, you can send it anywhere. So Power BI, Tableau, any tool there, you can really sort of visualize this data. And we do for clients. I would probably just mention some different tools, just, you know…

Dixon: [crosstalk 00:32:10].

Laurence: I like graphs and looking at graph data, and we use ranking data in graphs to try and give insights on the whole market. So, we have our own tools for that. But there’s some free tools out there which I’ve played around with from time to time and really find valuable, things like kumu.io. I’m probably not pronouncing it right, K-U-M-U.io. And Graph Commons. They’re two very similar online free graph tools where you can just upload or connect directly to Google Sheets and spreadsheets of data, and you can then, you know, annotate your nodes and your edges with metadata, and you can build a graph on the fly.

And they’ve got some really nice clustering. And I use it for finding frequently asked questions that are central to a theme. So, I look at all my ranking data, I look at all my pages. I can pull all my competitor data in. I throw all that in with all the “People Also Ask.” And I can go, “Hang on a second.” You know, I can put up a graph and it’s, like, a parallel distribution of questions that are central to a couple of topics that I’m interested in, like, I don’t know, keyword ranking APIs or whatever. And, you know that those questions, really, you’ve gotta be answering on this website. This piece, this page has gotta be answering those questions. So, that’s…

Dixon: I think you’ve used those tools to pull out some pretty good blog posts as well, case studies in the past, I think.

Laurence: Well, yeah, we did a… This is going back a few years. We did a graph of a whole…I did try to graph the whole of the UK with all our ranking data we have, which was a bit foolhardy. And after trying to analyze it for five weeks, we gave up. But it was fun. That was a few years ago. So, then we focused on my… CTO’s tearing his hair out.

But anyway. So, [inaudible 00:33:59] must be able to do it. Now, that reminds me actually, remind me later of some crazy harebrained idea I’ve got about big data one day. But anyway, we’ll get to that. But, yeah. No, we did an analysis for PriceMinister, a big French eCommerce site. They’re doing okay. They’re ranked for 900,000 keywords on the first three pages of Google. And then, what we did was build a graph, but rather than just look at ordinary ranking data in a sort of SQL type of way, we took every single ranking page and all their ranking keywords, went out and found all the ranking competitors’ pages and all their ranking keywords, and just built out that graph, and then ran a clustering algorithm just to…a community detection algorithm. It was 4.2 million keywords, and we found all the different clusters.
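Editor's sketch: the graph Laurence describes links ranking pages to the keywords they rank for, then groups keywords by how they connect. The snippet below is a deliberately simplified stand-in, not the Authoritas method: it uses connected components (via union-find) rather than a true community detection algorithm, which would split large components further. The function name, URLs, and keywords are hypothetical.

```python
from collections import defaultdict

def cluster_keywords(rankings):
    """Group keywords that are connected through shared ranking pages.

    rankings: iterable of (page_url, keyword) pairs.
    Returns a list of keyword sets, one per connected component.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    keywords = set()
    for page, keyword in rankings:
        # Prefix nodes so a page URL can never collide with a keyword string.
        union(("page", page), ("kw", keyword))
        keywords.add(keyword)

    clusters = defaultdict(set)
    for keyword in keywords:
        clusters[find(("kw", keyword))].add(keyword)
    return list(clusters.values())

# Hypothetical ranking data: your pages plus competitors' pages.
rankings = [
    ("a.example/bikes", "road bikes"),
    ("a.example/bikes", "mountain bikes"),
    ("b.example/mtb", "mountain bikes"),
    ("b.example/mtb", "bike helmets"),
    ("c.example/drills", "cordless drill"),
]
clusters = cluster_keywords(rankings)
```

Here the three bike-related keywords end up in one cluster because the two bike pages share "mountain bikes", while "cordless drill" stays on its own. At millions of keywords, a dedicated graph library and a proper community detection algorithm would be the more realistic choice.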

And then, I’m afraid to Wil’s point, we had to use search volume. We didn’t have any other data. But we could use search volume, and we used Majestic link data to help them understand in the clusters against their top 100 competitors how dominant they were, how, you know…so how big the clusters were, so what’s the opportunity? And then based on whether they’re in the cluster or not, you can say, “Okay, well, you’re ranking okay. Out of all these clusters, which ones have the best potential for you and where are you dominant?” You can end up with a matrix. So, you kind of got high potential, high strength, quick wins. High potential, low strength, build authority. You know, reasonable potential, fairly strong, maintenance. And low-low, don’t bother, you’ll never get there. So that’s a great way of looking at sort of SEO…
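Editor's sketch: the potential-versus-strength matrix Laurence outlines maps naturally onto a small classification function. This is a minimal illustration only; the 0-to-1 scores, the 0.5 cutoffs, and the label wording are assumptions, not values from the PriceMinister analysis.

```python
def classify_cluster(potential, strength, potential_cutoff=0.5, strength_cutoff=0.5):
    """Place a keyword cluster into one of four quadrants.

    potential and strength are assumed to be scores normalized to 0..1,
    for example from search volume and link data respectively.
    """
    high_potential = potential >= potential_cutoff
    high_strength = strength >= strength_cutoff
    if high_potential and high_strength:
        return "quick win"
    if high_potential:
        return "build authority"
    if high_strength:
        return "maintain"
    return "don't bother"

# Hypothetical clusters with made-up (potential, strength) scores.
scores = {
    "road bikes": (0.9, 0.8),
    "bike helmets": (0.7, 0.2),
    "bike bells": (0.3, 0.9),
    "cordless drill": (0.1, 0.1),
}
matrix = {name: classify_cluster(p, s) for name, (p, s) in scores.items()}
```

How the two scores are built is the hard part; the quadrant logic itself stays this simple.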

Dixon: I love how you use the word just as you went through that development process. You know, it’s not as if you did it in the morning, is it really, you know?

Laurence: No. No, no. It took a long, long time, but you can now. You know, we have a clustering tool. But you can take all that data and you can aggregate and you can cluster that data and you can run these community detection algorithms yourself and draw insights out. And to Wil’s, yeah, and Rachel’s mentioned earlier, you can factor in… We were using analytics data in eCommerce transactions. And then you go, “Well, all these pages have got a 10X potential. But hang on a second. These 5, these 50, we make 3 times as much per organic visitor than these, so let’s focus on those.”

And then you’ve actually… That, to me, is big data in a context like I said in the beginning. I’ve taken, you know, a site that’s doing well and I found a market of 4.2 million keywords. And I’m not gonna do that manually. And I just wanna press a button and get some insights. And the real skill, I think, for everyone here, and whether we’re using our brains or software or whatever, is using all that big data and distilling the insights into some clear deliverables that can be communicated, you know, up and across the organization and acted upon. And that is the art of it, and we certainly haven’t mastered it yet. But we’re taking baby steps in the right direction, I feel.

Dixon: There’s lots of nods from Wil and Rachel. Anything you wanna add on that? So…

Wil: No. I mean, like, he’s pretty much spot on there.

Dixon: I think that…

Rachel: Yeah, I did. Everything.

Dixon: Get it, yeah. Get it. Finding the needle in the haystack is what we’re, you know… You could have just said that, Laurence, you know?

Laurence: Sorry. One thing I would say, I feel when…and we’re guilty of this. And, you know, lots of old tools out there are much better known than us, Semrush, Ahrefs, everybody, Searchmetrics, etc. They’ve all got visibility tools that analyze you against the competition. And every time I see a graph, it’s like, “I’m an eCommerce brand. I’m competing with Amazon and eBay.” And I see that sort of visibility graph and that competition, or Venn diagrams. I really don’t like Venn diagrams, right? My pet bugbear. And I go, “People aren’t doing competitive analysis properly because they’re not using big data.” And, you know, if you’re an eCommerce site like diy.com, or someone like Lowe’s in the states, and you’re competing, yes, with your traditional competitors, but you’re competing with Amazon and you’re competing with the eBays of this world. And, you know, if you were to ask a question, “I got 9% growth in organic traffic last year. Is that good or bad?”

Well, it depends on how the market grows. You know, what happens in the market? And put it in context. And when you compare a graph of your visibility against all of Amazon’s keywords, it’s irrelevant. You’ve got so much noise in there that’s not relevant to you. If I sell bikes, I wanna understand what the gap analysis is and the opportunity analysis against the keywords that Amazon and eBay are ranking for in bikes. And that, to me, is a really good use of big data to try and give you a better context.

Wil: Hey, Laurence, one of the things that we’ve just started working on…so it’s so in its infancy. But going back, I think it was you, or maybe it was Dixon that asked about Google Search Console. So, when I was talking about using data for where it has such a unique piece of data that you can’t get anywhere else, that click-through rate for Amazon, we’re starting to look at, like, when Amazon shows up in what position do we see that your click-through rate is way different when Amazon’s above you sitting right around you, right? So then all of a sudden, you might say to a client, “Hey, if Amazon’s two positions above you in almost every product category, it’s not worth it because your click-through rate gets crushed.” So instead, let’s look where Amazon’s not in the top, let’s look at the click-through rate as a cluster for that group and say, “Wow, look, your click-through rate actually is 3X higher when Amazon doesn’t show up.”

And then, when you’re running your monthly rank checkers, if Amazon comes in, you’re like, “Let’s go hands off for a little while.” And if Amazon drops out, you might say, “Hey, let’s go in and try to win for a little while.” So that’s an example of how we’re hypothesizing right now on how to use Search Console data to help us to make better decisions. The other part that we are now… It’s funny. Everything with big data becomes a Pandora’s box, doesn’t it? So, the minute we say, “Oh, Amazon’s ranking. What does that do to our click-through rate?” But then it’s like, “Well, now we have to go scrape Amazon to see whether or not it’s our page that’s showing up. Are we listed?” Because then, the money’s still going into the same bank, so then the client’s not really as concerned. So, you know, like, it just becomes this Pandora’s box. But that’s the fucking fun of it, right? Like, that’s the fun of it. It’s opening that…

Laurence: It’s the soul of the business.

Wil: …Pandora’s box up and being like, “Oh.” And then getting it to your point Laurence, to the point where you click refresh one time and you’ve engineered all the data to come in and surface those insights, you know? So, that’s exactly how I’m trying to use Google Search Console data for something that could give us, you know, a little bit of a leg up. I think Rachel was talking about that as well. I think, honestly, these days, in my opinion, like, SEO is like, “Where can you find that little thing that if you do it, gives you a wedge that’s broken between you and your competition?” Because everybody’s gonna have Semrush, everybody’s gonna have Ahrefs, everybody’s gonna have these tools. And the thing is, it’s like, “How can I use their data, joined to other people’s data, or even their data in a different way, that tells a better story or a different story?” And that’s where I think most SEOs are gonna create value these days.
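Editor's sketch: the hypothesis Wil describes (click-through rate changes when Amazon ranks above you) comes down to splitting Search Console rows by whether the competitor is present and comparing aggregate CTR. The snippet below is a minimal sketch under assumptions: the `competitor_above` flag is hypothetical and would have to come from joining rank-tracking data to Search Console data, and all numbers are made up.

```python
def ctr_by_competitor_presence(rows):
    """Aggregate CTR separately for queries with and without the competitor above you.

    rows: dicts with 'clicks', 'impressions', and a boolean 'competitor_above'.
    Returns {True: ctr_or_None, False: ctr_or_None}.
    """
    totals = {True: [0, 0], False: [0, 0]}
    for row in rows:
        bucket = totals[row["competitor_above"]]
        bucket[0] += row["clicks"]
        bucket[1] += row["impressions"]
    # Sum clicks and impressions first, then divide, so large queries weigh more.
    return {
        present: (clicks / impressions if impressions else None)
        for present, (clicks, impressions) in totals.items()
    }

# Hypothetical joined rows (Search Console metrics plus a rank-tracker flag).
rows = [
    {"query": "road bikes", "clicks": 10, "impressions": 1000, "competitor_above": True},
    {"query": "bike helmets", "clicks": 5, "impressions": 500, "competitor_above": True},
    {"query": "bike bells", "clicks": 30, "impressions": 1000, "competitor_above": False},
    {"query": "bike racks", "clicks": 15, "impressions": 500, "competitor_above": False},
]
ctr = ctr_by_competitor_presence(rows)
```

With these made-up rows, the group without the competitor has three times the CTR of the group with it. Real data would also need to control for position, since lower rankings depress CTR regardless of who sits above.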

Laurence: Yeah.

Dixon: That’s kind of interesting. And William Rock thinks that you’ve got a killer idea, the Amazon [crosstalk 00:40:57].

Wil: And we’re working on it right now. I just don’t have as much time to play with it as I wish I did, but we have a team that’s working on that literally right now.

Dixon: Do you also look at click-through rates? We’ve got something that does above-the-fold analysis on ranking data. So, you can look at where competitors are bidding on your brand term or brand-related terms, and then you can obviously see that impact on your CTR as well. So, if you then drop below the fold on mobile or desktop, and you go, “Hang on a second,” and someone’s bidding here, then that obviously leads you to one set of actions, as opposed to, “Actually, it’s fine. No one’s bidding right now. Perhaps I don’t need to spend as much bidding on my own brand term.”

Wil: Yeah. We’re doing a little bit of that, but that gives me some inspiration to do a little bit more. If it’s also what you’re talking about, you do above and below the fold. What I like about that…and it’s funny how this wasn’t, like, my world before. Like, it’s crazy that once you decide the spreadsheet is the wrong place to win, how you start learning skills you didn’t think you would have to learn, right?

So, for me, you know, when you do above and below the fold, what I like is you’re aggregating that data into two groups instead of position one, 1.1, 1.5, 1 point this, because then when you go to look at the data, it becomes so disparate that you can’t really see a trend. So, one of the things I’ve learned for those of you that are gonna start off newer in big data, you start learning real quick. You start slicing that data super small, and you’re sitting there like, “[vocalization] What am I gonna do with this?” So, sometimes rolling that up…

Dixon: It’s where you started, really. You might as well click…

Wil: …sometimes rolling that up into two big groups is like, “Hey, above and below the fold is a good start because now my 10,000 words are only spread into two categories instead of spread across 50 categories of all these 1, 1.5, 2, 2.5, 3.” And it can drive you nuts.
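Editor's sketch: rolling fractional average positions up into two buckets, as Wil suggests, takes very little code. One loud assumption: a true above-the-fold check depends on pixel position and device, so the `fold_cutoff` of 3.0 used here is only a hypothetical proxy, and the positions are made up.

```python
from collections import Counter

def fold_bucket(avg_position, fold_cutoff=3.0):
    """Collapse a fractional average position into one of two groups."""
    return "above the fold" if avg_position <= fold_cutoff else "below the fold"

# Hypothetical average positions for a handful of keywords.
positions = [1.0, 1.1, 1.5, 2.5, 3.0, 4.2, 7.8]
bucket_counts = Counter(fold_bucket(p) for p in positions)
```

Seven distinct position values become two groups, which makes a trend in clicks or CTR much easier to see than a chart with one thin slice per decimal position.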

Dixon: That’s great. That’s some good thoughts there. So, I didn’t get an opportunity, and we’re nearly at the end, really, but I wanted to…just because InLinks has got its own knowledge graph, you know, the whole idea of Google moving towards entity-based algorithms in organic search, moving towards an entity approach, you know, how do you think a system like… I know you don’t necessarily know anything about InLinks, guys. I’m not asking you to know about it, but how do you think we, as a data source, could make our data accessible? We’ve got our own knowledge graph. We know how things are related to each other. So we’ve got it categorized so we can sit there and say, “Right. If you want to know about the concept of, you know, history, then we know that if you’re gonna talk about history, you might also wanna talk about geography. You might wanna talk about these whole school subjects. Or within the context of economics, you might wanna talk about these things, like, you know, the South Sea Bubble,” or whatever.

So, you’ve got different topics related to each other, and being able to throw that back out to people. You know, do you think we gotta make that accessible to people to put into data sets like into Google Sheets? Or can we just give people CSV downloads so they can take that data and move from there? How much effort should a data source that isn’t yet commonly available, I suppose, how much energy should they put into making it available to be picked up by anybody else in any other form?

Rachel: Part of me, selfishly, if it’s a really crazy data set, just leave it as a CSV file and don’t let that many people know about it. So the people that are crafty can find it and use it, [crosstalk 00:44:42] a competitive edge.

Dixon: [crosstalk 00:44:43].

Rachel: But I love if I go to a site, especially all the government sites and I see that it’s just a CSV file of data, I am so happy at that moment, because, yeah, it’s gonna take me a little bit more time to splice in and there’s probably better ways to get it, but it’s still like, “Heck, yeah, I have something I can work with at least, so…” [crosstalk 00:45:03].

Wil: Rachel, do you have access to data engineering in your organization?

Rachel: At this one? Yes. It’s a smaller team than what I had at Advanced Recovery Systems. We had a bigger team there that…like, I would basically give it to them and they would do a lot more with it in Power BI, and then make it easily updateable for me to go in there to slice and splice it and just find my little wins when I could.

Wil: Yeah. That’s why I was asking. You know, it’s like, I find that, to not waste my data engineer’s time, I’m just like you. I’m like, “Get me a CSV, ASAP,” right? One of the things I love about Power BI, and I don’t know about other tools because I don’t have time to research them all, but one of the sources you can make is a folder. So, anything that can email you a CSV every day, you can just basically backdoor it as an API. You just have to dump the file out of your Gmail and drag it into a folder, and you just hit refresh and it updates all your visualizations.

So, for me, a CSV is usually good enough. And then, when the CSV starts to show a lot of value, my data engineers are really good at being like, “Wil, don’t even come to us with this idea until you run it through CSVs four or five times, showing it to four or five clients, and it’s all created something that they haven’t seen that they see value in. Then maybe we’ll engineer the data in a way to just bring it right into our BigQuery instance.” Because early on, I was just wasting people’s time on my team by being like, “Ooh, there’s an API. Let’s attach to it.” And then they would, and I’d be like, “Well, now that it’s here,” I’m like, “That is not what I thought it was gonna be.” So, I just love playing with CSVs until I find the real value and then I throw it over to the real developers.
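Editor's sketch: the "folder as a data source" idea Wil mentions is a Power BI feature, but the same pattern can be approximated in plain Python by re-reading every CSV in a folder each time you refresh. This is an analogue, not Power BI's implementation; the function name, file names, and columns are hypothetical, and a temporary folder stands in for the real export folder.

```python
import csv
import tempfile
from pathlib import Path

def load_csv_folder(folder):
    """Read every .csv file in a folder into one list of dict rows."""
    rows = []
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # Record which export each row came from.
                row["source_file"] = path.name
                rows.append(row)
    return rows

# Demo with a throwaway folder standing in for the daily-export folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "2021-03-01.csv").write_text(
        "keyword,clicks\nroad bikes,12\n", encoding="utf-8"
    )
    Path(folder, "2021-03-02.csv").write_text(
        "keyword,clicks\nroad bikes,15\n", encoding="utf-8"
    )
    all_rows = load_csv_folder(folder)
```

Dropping a new export into the folder and calling `load_csv_folder` again plays the role of hitting refresh: no API integration is needed until the data has proven its value.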

Dixon: Okay. That’s really useful for me to know, actually. So, Ammon was coming in with some stuff there. Can you bring back the last couple of things, David? So, what was the thing Ammon said before? He had another one before that. “So, accessibility is often about lowering the floor, providing a ramp. Helping identify useful opportunity is always in that ballpark.” And then goes on to say, “So, perhaps identifying very easily and very clearly where a lever is.” I guess above the fold or below the fold is a really good one. That’s a lever, really. “Where ambiguity exists and can be exploited or fixed.” So, if you don’t know Ammon, find him on… He used to be called Black Knight, but I think he probably doesn’t tell anybody that since those days. But that’s where people have been in the industry for many, many moons.

Guys, we’re pretty much near the end of our time. Is there anything massively left unsaid that I really need to bring in here? If not, then David, what are we gonna be talking about next time? And when’s the next show and how do people get onto it? Because we’ve changed our whole system.

David: Well, we’re actually still open in terms of subject. So, we’re working on the subject at the moment, but I can tell you that the next show is gonna be on Monday the 19th of April at 4:00 p.m. BST, that’s 11:00 a.m. Eastern Daylight Time. So, make sure that you sign up at theknowledgepanelshow.com to get alerted for that one. We’ve had a great crowd watching us live on YouTube, especially for this one. Thanks for your interaction, Ammon, William Rock, especially. We’ve got some great likes on Facebook. Izzy Wright, Chris Wright, got Tim, Kim Tomfrey [SP], as well. So, thank you so much for your interaction there as well. Of course, you can listen to the show afterwards in a podcast as well. We’ll tell you all about that over at theknowledgepanelshow.com.

Dixon: Okay. And the podcast is pretty much anywhere, Spotify, iTunes, you know, those places as well.

David: Apple Podcasts, Google. Exactly. Yeah.

Dixon: So, guys, before we go, it’s time for me to say thank you very much for coming on. I know, you know, taking time out of your day is a big thing, but it’s really great to have experts like yourselves come in and chat about…deep, deep dive onto something like this, a topic that’s there. How do people find out about you? Where do they go to get more information about you? What message do you wanna leave people with? Laurence?

Laurence: Well, I’m easy to find on LinkedIn. So hopefully, you find me there or find the Authoritas website.

Dixon: Yeah. But don’t spell Laurence O’Toole like that.

Laurence: Well, I said, “If Google can have 10 O’s, then why can’t I?” And, yeah, just last thought just on…I think Wil sort of touched on it. Just, you never know how far you can go until you really push something. It’s so easy to play with this data, so, yeah, connect a Google Sheet to BigQuery and, you know, start playing around with it in Data Studio. And from there, you can get more and more advanced. Some of the tools I mentioned, like, you know, kumu.io and Graph Commons, it’s so easy to build yourself a graph. And actually, you know, then it helps you understand the potential of tools like InLinks even more. So just get started, get your feet wet, and you’ll probably be swimming before you know it.

Dixon: Rachel?

Rachel: Yeah, if you wanna follow me for, like, SEO content, I’m normally on Twitter with that. If you like animals, that’s Instagram, which is also linked from my Twitter. And I guess, like, a little word of advice before leaving off. Just, like, always keep digging. Like, one benefit to SEO is, like, you don’t have the limits to some job, so you can take a little time to dig a little deeper in the data. And I feel like when you do, you just find something that you didn’t see with all the other standard SEO tools that will either make your content different, better, or even, like, rank for keywords that you didn’t even know how to rank for. So, yeah, that would be my typical advice leaving off.

Dixon: Your Twitter handle is something a little odd, so what was your Twitter handle, Rachel?

Rachel: I think my Twitter is, just, like, @rachelh_SEO.

Dixon: Okay. That’s it. Yeah. Okay. Wil, how do we find you? What do you wanna leave us with?

Wil: Just google me, you’ll find me. Blabbing somewhere.

Dixon: Just use one L. William Rock. Use one L in Wil Reynolds.

Wil: Or find some other guy. You know, Google will auto-correct it. They’re smart. No. You know, if anything, Dixon, I would just say, thank you for bringing us together. You know, I think honestly, like, this kind of marketing search, whatever, I don’t think there’s a lot of people doing it. It’s not the norm. Let’s just put it that way. And to be able to sit in a room with Laurence and yourself and Rachel today, it’s, like, I think for SEOs out there trying to do more of this work, even if you’re in paid or Analytics, it doesn’t matter. When you’re trying to join this data, it can be a little bit of a lonely place. So, my recommendation is, you know, find your little tribe. I’m really glad to hear from those folks today because I’m like, “Okay, now I got some other people I can ping,” and be like, “Hey, this is something that I’m working on we didn’t get to talk about.” And vice versa. So, thanks for bringing us together, man. I really appreciate you doing that.

Rachel: Yeah, this is awesome.

Dixon: Well, it was fantastic for us, so…

Wil: Oh, oh. And one thing I would be remiss to not share this. I started messing around in Power BI, like, five years ago to join SEO and PPC data. And I have, like, a library of how to join all your data on YouTube. I must have, like, 30 videos on, like, how to join your SEO and PPC data…

Dixon: Oh, cool.

Wil: …then look at competitors. How to join your SEO and PPC data, and then look at “People Also Ask.” So it’s a huge library out there of literal step-by-step from the CSV level. So, if you don’t have BigQuery, you don’t have all that, we’ve got a bunch of content out there people should watch if they’re interested.

Dixon: Amazing. Guys, that’s absolutely fantastic. I think David, I have said everything I need to say. Just make sure I haven’t missed anything that’s really important. So, it’s just, I’ll just leave to say, thanks to you, David, for making sure I don’t mess everything up again. And I’ll see you all next month, and thanks all of you, guys, and cheers. Thanks for coming to the “Knowledge Panel Show.”
