Joost de Valk on improving the web, schema, and the CMS market
Duration: 38:40
In this Tech Bound podcast, I speak with legendary founder Joost de Valk about the importance of Schema for the web, the CMS landscape, and SEO Fitness. You probably know him from…
Ladies and gentlemen, I'm proud to present a Tech Bound podcast interview with Joost de Valk. Joost needs no introduction, but for the few people who don't know him: Joost is the founder and chief product officer of Yoast, the well-known SEO plugin that he started in 2010 and that employs more than a hundred people by now. Joost is also an investor, together with his wife Marieke, in companies like Peerby, Student.com, and WordProof. In this excellent conversation, Joost and I speak about improving the web, the status quo of CMS market share, and the importance of schema markup. Don't skip a beat, make sure to listen to the end, give me five stars, and enjoy this outstanding conversation with Mr. Joost de Valk. Three, two, one.

Mr. Joost, welcome to the show.

Thank you, thanks for having me.

It is a delight to have you on. Joost, you are running a semi-annual CMS market share analysis. Give us a little bit of context: why do you do that, and what are the bigger trends that you have seen lately?

So I do that because it's very hard to get good numbers on who is using what, how these CMSs are doing, and how the web is progressing, what's happening out there. There are a couple of tools out there that analyze which website is built with what software, and basically what you want is an overview of what's changing and who's doing what. How big is the market share for WordPress? How big is the market share for Shopify, for Wix, for Squarespace, for all these companies that are doing well, in a way? I mean, these are all companies or ecosystems that are growing, but when those are growing, others are shrinking, and why, and what's happening in that regard? So I started looking into this when I was gearing up to do more marketing for WordPress itself, so WordPress.org. But I just kept on doing it because it was actually very insightful for us and for our business as well to see, like,
hey, what are the trends, which systems are doing well? And when you go into that, you figure out that in reality there's only a couple of systems really doing well, like four or five of them doing reasonably well, and a lot of them doing very poorly. Which in a way is good news, because let's face it, when I started on the web we probably had more CMSs than we had people on the web, and I don't know that that is necessarily a good thing. I think it's a good thing that the web is slowly moving towards a few large systems in many ways, whereby "a few" can still be a hundred different systems. I'd be fine with that, but not tens of thousands, because I don't think that's good for anyone. You can see that progress. And of course I'm mostly a WordPress guy myself, and WordPress is doing incredibly well in those stats. But admittedly, the stats also have a tendency to lag, and to show only the biggest sites on the web, and with that not be a true reflection of the entire web.

Yeah, I think that's fair to say. And the progress of WordPress is really stunning. You probably have the most up-to-date numbers, but it's more than 33 percent of the web running on WordPress at this point.

Yeah, it's even closer to 42 percent, which in my mind is just mind-boggling. But this is also where it becomes problematic, because I'm fairly certain it's not 42 percent of all websites. It is 42 percent of the top 10 million websites, which are two very distinctly different things, because there are so many websites out there. Nobody really knows how many websites there really are, so that makes all of this a bit harder to guesstimate. But yeah, I think we're at 41.5 percent at the moment.

Why do you think WordPress has been so successful? Why are they dominating so hard?

I think it's a combination of relatively low cost of ownership, wide availability of developers, versatility, and ease of use. In the end, I think
any system that wants to compete on the web for anything, whether it's a CMS or anything else, ease of use is probably the biggest thing.

Yeah, absolutely true. I'm a WordPress user myself, my site runs on WordPress, and I think ease of use is certainly one factor.

Modularity is another factor, right? It's a very moldable product. You can really make it into whatever you need, whether it's a web shop or a CMS or just a blog or even a small little web app. You can run a lot of things on WordPress.

Right, I totally agree with you. I think also the rise of products like Squarespace, Wix, and probably even Shopify, right, they all speak to almost this no-code trend, where you make it very easy for somebody to set up a site or even a business without them knowing how to code. Because let's face it, most people are probably not as technical as you and me.

No, and I think that's also the danger of WordPress, in a way: you need to be fairly technical to maintain a WordPress site properly. That's why some of the SaaS offerings are doing so well, and why people are switching to them sometimes. Because if I compare running a Shopify site to a WooCommerce site, or running a simple marketing site on Squarespace versus running that on WordPress, it's a lot simpler to do it wrong on WordPress than it is on those other platforms. So it's a lot easier to shoot yourself in the foot. And that is the challenge that WordPress has in how people use it: how do you tell them that they're actually breaking things or not doing it right, and how can we help prevent them from doing that? But yeah, if you look at the market share, WordPress is the number one in the market with 41.5 percent, and if you look at those stats it's really staggering, because Shopify is doing incredibly well as the number two in the market, but it has 3.5 percent of that market versus the 41.5 percent that WordPress has. So it's such an incredible
difference in size.

Absolutely. And you already mentioned a couple of things, like shooting yourself in the foot and the challenges that some webmasters face. I think you built a company with Yoast that fills a huge gap in the market and really solves one of the biggest pain points for people on WordPress, and other CMSs as well, right? Which is SEO, and making SEO a bit more actionable and tangible. Talk to me a little bit about the biggest challenges that you see webmasters face. We already talked about the fact that you need to be fairly technical to maintain the site. What else do you see as the main challenges that webmasters have today?

Well, I think the biggest challenge depends a bit on what type of webmaster you are. But if you've had a website for, say, a decade now, and you've been on the web the entire time, you've been writing, et cetera, then your website is now probably becoming old and a bit sluggish, and you need to start cleaning up. You need to do some spring cleaning. For older sites you see more of this happening because, as the web grows, more and more sites reach that age. At the same time, you also still have a lot of people coming onto the web and building new sites, and they're just wondering what they need to do. So you have a WordPress site, you've made it look like something that you are reasonably okay with, or your web developer has done that for you, and then what? If you at that point don't start actually doing stuff online, marketing and promoting and writing content, responding on other people's websites or Twitter feeds or Facebook or whatever, if you don't seek out the interaction and you're not urged to do that, then it becomes brochureware. Then it's literally your company brochure on the web, and that's it. I think the biggest challenge is to actually keep people working on their site and keep on improving it, to actually make
this whole process into something that is a bit more than doing it once. So we've started calling this SEO fitness, where we tell people, hey, this is not something you can do only once. This is something that you actually need to do every week or every month. And like fitness, if you stop doing it for six months and then start doing it, it's going to hurt a little in the beginning, because it's actual work. You need to put time into this. No site is ever going to be good if you don't put time into it. So I think that's the biggest challenge: to tell people, hey, your website is just like your shop. You need to keep on cleaning it, you need to make sure that you paint the outside, you need to do all that maintenance.

Now you're getting me excited, Joost. When we talk about fitness, I'm getting excited as an epic fitness enthusiast. What does a fitness program, or even a diet, look like for websites? What do you recommend webmasters do on an ongoing basis to keep their site in shape?

So we're in the middle of this at the moment within Yoast, actually defining a whole lot of these things. At Yoast we used to have, pre-COVID, a very nice office culture, and I hope we'll get back to that relatively soon, where people come into the office and we actually have a personal trainer that trains with all of our staff and helps them get fit, et cetera. In many ways, what you want is to adapt personal training plans to every website. So you want to look at a website and go, hey, what does this website need? We call that personal, but of course even a personal trainer, when he looks at you, just grabs from experience, from the 20 possibilities that he has in his head on how to combine things. And that's what we're building as well. We're building workflows to help people work out these things. I think some of the most important stuff that we can teach
people to do, we already have a lot of features for within Yoast SEO. That's internal linking improvements, so literally telling people, hey, you have this post that has no internal links, or very few internal links. Should you improve it, or should that post be deleted, or should that post maybe be merged with something else? Then there's fixing duplicate content, which usually starts to accrue as your website grows up, because you write a post about a topic, and then three years later that topic comes up again in the news cycle and you want to write an article about it. Do you rewrite your old article, or do you write a new one and then suddenly have two articles trying to rank for the same term? The latter is what most people do; the former is what you probably should do. There's a lot of work like that. Internal linking improvements are often the thing that a lot of SEOs do when they come into a site. You start looking for the easy wins, right? You go into a site and you go, okay, what can I fix? Where are the very glaring 404s that I should probably take a look at? What are the biggest errors that I can fix easily, and how can I make this site better? What you usually end up doing is going, okay, which terms do we want to rank for, and how do I improve the internal linking for those terms? I think of it a bit like fitness programs, where you take people through these steps: okay, let's first improve the internal linking for your top 20 terms. Then let's start fixing all the internal 404s that you have and all the errors that we can see. Work on your site speed if that's necessary; if it's too slow, let's make sure your site is a bit faster. And then you slowly work through these things.

I love that. I think it's such an underrated topic, and just in general, right, this idea of SEO fitness or hygiene or maintenance, it's huge. Actually, there's a study that just came out from the New York Times
where they looked at over 550,000 articles that contained over 2.2 million links to external websites, and I'll post a link to that study in the show notes. What they found is that articles from 2018 already contained six percent dead links, meaning links that pointed to pages or sites that no longer existed. When you look at the articles from 1998, 72 percent of the links were broken, right? So they call it, I think, link rot. You can call it link decay, whatever you want to call it, right? But I think a reality that we have to face is that websites, quote unquote, decay over time. Internal linking is one example of that, but content is another one. We saw all these studies or case studies over the last couple of years that show, hey, when you go back and you update content, you see these big returns. And I think that's something that too many SEOs don't really do. They don't go back. And it's not just updating; as you said, it's consolidating, questioning: is this still relevant? Should we rewrite this? Should we take a different angle? Because the truth is that most information out there is not steady, right? It evolves, it progresses, things change. So I think this is such an important point that you mentioned, and it is so underrated.

Yeah, I mean, we have a team doing this the entire time on yoast.com. So we literally have a team constantly going through our blog and just looking at, hey, what do we need to write about? Do we have something about this already? Do we use the old one? Do we bring it back? Do we delete the old one? What do we do? This is one of the reasons why we acquired the Duplicate Post plugin last year, because I wanted to make the whole rewrite-and-republish workflow a lot easier, which is what we did with that. We released that a couple of months ago, and you can now just click on a post and then create a new draft from that old post, change the entire thing in a
normal editorial process, and then when you hit publish, it'll publish over the old existing post with the new date and the old URL. So it actually makes the entire workflow of republishing old content very easy within WordPress. It's something that I wanted to have for our team, because I saw them do this and it was entirely cumbersome to do by hand the whole time. But it's also workflows, and thinking of it like that, that I really want to encourage with SEOs. I mean, SEO is of course still technical, and there's a lot of technical stuff that people need to do on their sites if they're not running WordPress with Yoast SEO or another SEO plugin. But I think that for all that technical work, there need not be tens of thousands of people that know how all of that works. A couple hundred of us technical SEOs that fix that for everyone else should be good enough. But we need thousands of good writers that are continuously updating those websites and looking at those links and, well, fixing those outbound links, because all those links that you have that point to nowhere are a waste of your visitors' time, and you should, well, you sound a bit better.

100 percent. Of course, I'll link to the Duplicate Post plugin in the show notes so everybody can check that out.

Yeah, it's free.

It's free, and I'm not pushing you to any paid plugins now, of course. But it's such an interesting point that you mentioned here about technical SEO. I've lately perceived a conversation about technical SEO from two different camps. One camp says, hey, Google has gotten so good at understanding technical issues that we don't really need to invest that much in technical SEO anymore. And the other camp says that exactly the opposite is the case, that we need to invest more in technical SEO, that, quote unquote, sparing Google's resources and making it easier for Google pays off more. What is your opinion? How important do you think technical SEO is in 2021?

I think it's incredibly important.
I do also think that a lot of the people that are doing it don't necessarily get what they're doing. I come at this from a weird perspective, right? I don't do SEO at website scale. I usually do SEO at WordPress scale, so you do it for very large parts of the web, and that means that it's not about having site A compete with site B as much as it is about how we make this more efficient, because that consumes less bandwidth on Google's side but also on the host's side, and just makes the web work better. So my perspective on these things is different. I also try to get the big social networks to all use the same metadata so that we don't have to output six different types of things, because it's ridiculous. There's not a whole lot of people doing that work, and there should be more people doing that sort of work, because I think that would actually improve the web. Looking at crawl rate, and looking at how many pages on my site Google can crawl, et cetera, is interesting, and it's important if you're working on the very, very large sites of the world. So for you at Shopify, for me back in the day when I was at The Guardian or at eBay, on those sites, yes, it makes sense. There are probably only a couple of thousand web properties where that really is the case. There are tens of thousands of people pretending like they need to think about this, but if your site has a thousand pages, then this is not your problem, and you should not be thinking about this as much. You would probably get more return from writing one good article than from thinking about this at length. I think there's a whole lot of work that technical SEOs need to do to make the web better. I think there's a whole lot of technical SEO work being done that's not necessarily all that useful.

Yep, I very much agree with you. I think it's a very elegant way to phrase it, because there's so much context and nuance, right? Like, with scale, the
importance of technical SEO grows. One field of SEO that's often grouped into technical SEO is everything around rich snippets and schema, basically augmenting your site's code, but also augmenting the search results on Google with more rich snippets, right? So this is, I think, a field that's obviously growing in popularity and growing in importance. And you, with Yoast, play an important part in this story, because your plugin makes it so much easier for webmasters to create rich snippets. So where do you see this whole space going? Where do you see this augmentation of the search results on Google going? I'm going to give you a bit of context: I was lucky to look into bigger data sets from a couple of tool vendors, Semrush and RankRanger, over the last three years in total, 2018, 2019, and 2020. I saw that in 2018 and 2019 there was a huge ramp-up of SERP features, right? Google was showing a lot more, and then 2020 was almost stagnating. Why that is, is a different story, right? But I'm curious about your point of view. How important are rich snippets, where do you see this going, and how should webmasters think about that?

So I look at that from two perspectives. One, I just want to add more and better structured schema to as many websites on the web as possible, because it actually makes crawling the web easier, and makes it easier for someone to come up with a competitor to Google, because it actually helps everyone digest that data more efficiently. So I very much look at schema as a replacement for Facebook Open Graph, Twitter Cards, all of that stuff. That's why I want to use it in as many places, and as well done, as possible. I think Google can build great stuff upon that, and we as a society still have to figure out how far we're willing to let them go. Does that mean that I would put less schema on a page because Google does stupid stuff with it? No,
but I do think that we have to figure out at which point Google shows too much of a recipe in its search results, and at which point this is really not in the interest of the website owner anymore. Honestly, in a lot of cases what Google is doing is taking out the intermediary. With the whole new car section that Google launched a while ago, I mean, that takes out an entire space of car sites all over the world that literally only existed to bring cars that are in dealerships and in people's parking lots to other people on the other side. And while Google says, hey, we can take a share of this market, and we can do that, if they could actually fill that with schema, I'd be fine with that. Because the way that traffic usually went was: people searched on Google for a specific type of car, they'd find one of those car sites, then go to that car site and maybe buy that car, and that car site would get paid. That's Google's traffic that is getting those people paid. To a certain extent, I think that what Google is doing there is fine. When Google shows the entire recipe or an entire news article in the search results, without ads next to it or something that pays the original creator, that's a bit more problematic. But I think those are things that we need to figure out as a society, and our governments will need to answer at some point. I think that the overall thing that we can do with schema is to just connect the web a lot more and to really go through to that web 3.0, where it's not just entities, but we tie actions to all of those things, and we actually help interactions between websites and apps and all these things become a bit more fluent and easy. I think we still have a way to go there, and I think that schema plays a very important role in that, which is why we're also very active in actually setting the standards. So not just implementing the standards in code, but also helping to define how
schema, well, defines a webinar, for instance, or things like that. How do you define it? How do you put that online? We have regular discussions on that with, for instance, Dan Brickley on Google's side, who maintains schema.org. And it's just fun to do that, and it's fun to help make the web a better place.

I find that thought very interesting, or that vision, better said, of a fully augmented web with the help of schema. Can you paint us a vision or a picture of what it would look like if we could understand the schema-indicated actions or the references between sites, and how schema plays a role in that?

So let's say I'm looking at a Shopify site and I want to buy a product. That product has markup next to it that says what the EAN, or whatever identification code for that product, is. If that code is in there and I have some app installed from a review site, I could very easily find reviews for that product on a trusted third-party resource instead of on the site that is selling me that thing. I find this one of the weirdest things in the world in the web e-commerce setup, that we look at reviews on the site that is selling us the thing. I mean, how would that ever be perceived as reliable? It's those kinds of connections, very simple, where it's product metadata with a simple GTIN or EAN or whatever number that you can tie to another website and say, hey, I'm looking for this product. So I'm waiting for the product review sites that actually allow me to throw a Shopify URL into their search box and give me that product. That's the kind of interaction I mean. I'm expecting sites in the end to scrape each other, because the users want to connect one to the other, and you'll probably at that point figure out better interactions to do that. I think that we can tie all these things together. I think that we can say, hey, okay, so that goes there, and then, okay,
I have these shipping options, I've just bought this, and the shipping data should go straight into my parcel tracker, which keeps track of which packages I have coming to me. Because honestly, we've all noticed this in this COVID period: if you order a lot online, you get a lot of packages, and keeping track of your parcels becomes a lot of work. I've got four kids, so in my case it might be even more packages that I need to keep track of than the average person out there. And all these things already have schema metadata that we could tie to each other and that websites could do far more with. Right now they're all publishers of schema, but none of them are consumers of schema. None of them use the schema from other sites to really do cool stuff with. I think Pinterest is one of the few out there that does this really well. When you share your URL on Pinterest, it grabs all the stuff from there and creates these rich pins, and it does this fairly well. Now, they also still support Facebook Open Graph, because not everybody has implemented schema. So I think there's a lot of these interactions where you could go, hey, I've seen this on Pinterest, click, let me go to the store that sells that. Oh, let me get some reviews for that on site X. I think all of these together can make for a better web. And then if we go one step further, I see a world where we actually know for sure that the person who published that was truly that person. Marieke, my wife, and I invested in a startup called WordProof that actually timestamps content, so you can prove with WordProof that you put something online at a certain point in time. Later on, as these things develop, you can actually also prove that a certain person wrote that content. And there are now already also systems where you can prove online, without showing even your name, that you are, for instance, a medical doctor. And then you can tie all these things together in a much better way, and then
the E-A-T system that Google comes up with suddenly sounds very stupid, because, well, we can just prove these entire chains, and I can just tie all of this together into a world where all this connected data shows me that what I'm reading is written by a medical doctor who actually knows what he's writing about, if it's about corona, and it's not some weird thing that somebody somewhere came up with because he felt like it. I think that all of this together has a very high chance of actually improving the web and bringing back some of the trust in the web that's been missing in the last decade or so.

Such important work. I was born and raised in Germany and have been living in the US for seven years now, and I saw firsthand how trust eroded since the 2016 election. Fake news, this whole conversation, obviously pushed Google to also invest more into that. And now, since I would say one or two years, we're seeing the rise of alternative currencies, right? More Bitcoin, blockchain, NFTs, which are also something like the timestamp that you mentioned.

Yeah, the timestamp actually happens on the blockchain, so yeah, it's all tied.

There we go. And now, finally, because I was looking for a connection between blockchain and SEO and marketing, like, how can we actually do this, right? And now you finally give me a path there, because I think that kind of stamp of approval, literally, will become really important also in the grand scheme of marketing, right? If Google needs to figure out this whole trust issue situation and you provide them something, then I could see Google rewarding that.

Yeah, absolutely. In news and product search, there are several areas of Google where this is even more important than others. I mean, it's why I happily invested in WordProof. It's built by a Dutch friend of ours. They timestamp on the blockchain. You can just verify that you wrote that piece of
content, and you can reproduce the hash that we store in the blockchain at that point, given our methods. We add that blockchain timestamp as schema.org metadata into that article, so that ties it all together and it makes it very simple. Even for simple things, right, like your terms of service: you can just timestamp them at the time of a transaction and prove that at the time of the transaction these were the terms of service that applied to your transaction. I think there's a whole lot of these things where, over time, those things become more important. And then, as we can later on also tie people to those timestamps, I can say, hey, this was created by Joost at that point in time. That has true value. I mean, that actually shows that it wasn't just someone with an opinion on SEO, that it was me. And then it's up to you to decide whether I have the credentials to be able to say something that you want to trust or not. But at least you can verify that that person is truly that person, without a doubt.

The schema property that you mentioned for the timestamp, is that already part of the official dictionary?

We have a proposal open for that. It's being heavily discussed on the schema.org GitHub right now. What a lot of SEOs don't seem to realize is that schema.org is on GitHub. I mean, you can literally just go in there and be in discussions with other SEOs and other people on how to improve schema. There's like four SEOs in there. I sometimes want to scream, like, why aren't you there? Why aren't you helping us improve Product schema? Why aren't you helping us improve all the stuff that we all are fighting with? But yeah, so we're discussing it right there right now. There are a couple of standards that are very similar, so we're trying to make it in such a way that they all become compatible, because the hard part in this is that there's always a couple of people with similar ideas at the same,
around the same time. You want to end up with one standard that everybody agrees with, and not a fifth new standard that competes with the four other ones. That's the challenge here. We're not entirely there yet, but we'll get there.

And of course I'll also include a link in the show notes to the GitHub repo for schema so that more SEOs get engaged there.

Yeah, it is good fun, it really is. You should read it, and then sometimes you go, huh, why is Google pushing for this? And then three months later or six months later you see a new Google thing launching and you go, oh wait, they were thinking about this in advance. So yeah, I think it's great for SEOs to actually interact there and see what's happening.

That's great advice, thank you, Joost. We're nearing the end of our conversation, but I wanted to touch on one or two more questions, if that's okay with you. We talked a lot about Yoast the company and the role that it plays in the CMS ecosystem and the SEO ecosystem. What's the biggest challenge running a company like Yoast?

I think the biggest challenge is to scale it in a good way, because we're rather big now. I mean, Yoast SEO runs on 12 million websites, we have almost 150 people working for us, and scaling is just hard. This is my first company of this size, so I've never done this before, and our board consists mostly of people that have never done this before, so that is hard. And also to stay true to what you think is important, because it's very easy to make money your most important metric, and it certainly is not. I mean, money is something we need to run this business and to grow it and to do ever more cool stuff. One of the things that you really need to challenge yourself on the whole time is: is what we're doing here now really going to improve the web? Consistently doing that is hard, but it is also super rewarding, so that's why we're still doing it. You run into a lot of these
questions. Like, hey, a couple of weeks ago we celebrated that our readability analysis in Yoast SEO was five years old. Five years ago we started working on readability, and nobody in the SEO space did that. We were getting ready to release it, and we fought a bit about putting it in our premium version of Yoast SEO. And then we said, you know what, if we put this in the free version and a couple of percentage points of our users start writing more readable text because of this, then the positive impact of that is so much better than anything else we can do. So we decided to put it in the free version. There's years and years of research in that in terms of people hours, so there's a lot of development and cost, but the better decision for the web at that time was to put it in the free version of Yoast SEO, and I'm very happy that we did that. It's funny now for me to look at the SEO space and see all these people talk about readability, and all these tool providers provide those metrics, and think, like, none of you had this five years ago.

It's a perfect segue, I think, to the almost last question. To bring us home: where do you want to take Yoast in the future? What is your vision for the company?

There's a couple of things combined, I think, at the moment. On the one hand, it's what we talked about: building a better, schema-fied web where we make metadata simpler, not harder, and where everyone can build a website and be found in the search results without needing to understand the technicalities. I think for us to do that on just one platform is not enough, so that's why we're slowly expanding to more platforms. And that's our true mission: SEO for everyone. In many ways it's an extension of the WordPress motto of democratizing publishing. I think that if you want to democratize publishing, you also need to democratize being found, because otherwise the publishing is pretty much in
vain. Mostly, I just want to improve the web and see if we can help get the web some of that vibe back that it had in the beginning. I remember when I was building my first website in 1994. I don't think you were even born then, not entirely sure. But yeah, at that point the web was very simple in a way, and we were all blissfully ignorant of the stupidities that we were creating. But you see over time, as the web becomes more commercial, that there are some errors that we have in the underlying fabric. If we can play a role, even a tiny one, in improving that, then I am happy.

Way to bring us home, Joost, way to bring us home. Before I let you go, where can people find and follow you?

My Twitter handle is jdevalk, J-D-E-V-A-L-K. That's probably the best place to follow me personally. And then yoast.com, and Yoast on Twitter or Facebook or, well, wherever you want to find us. You will probably be able to Google us as well.

Thanks so much for being generous with your time and coming on. This was a fantastic, fabulous conversation. I'm looking forward to talking to you again in the future.

Thank you, thanks for having me.
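
The "consumer of schema" idea from the conversation, where a review site reads a product's GTIN or EAN out of a store page and uses it to look up that product, can be sketched roughly as below. This is only a minimal Python illustration under stated assumptions, not anything Yoast, Shopify, or a review site is known to ship: it assumes the page embeds its schema.org markup as JSON-LD in a `script` tag, and the sample page, product name, and GTIN value are made up for the example.

```python
import json
from html.parser import HTMLParser

# schema.org Product properties that can carry a GTIN/EAN-style identifier.
GTIN_KEYS = ("gtin", "gtin8", "gtin12", "gtin13", "gtin14")


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every JSON-LD script tag on a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed markup instead of failing the whole page


def product_gtins(html):
    """Return every GTIN-like identifier found on Product nodes in the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    found = []
    for item in parser.items:
        if isinstance(item, list):
            nodes = item
        elif isinstance(item, dict):
            # Some sites wrap all their nodes in an "@graph" list.
            nodes = item.get("@graph", [item])
        else:
            nodes = []
        for node in nodes:
            if not isinstance(node, dict):
                continue
            types = node.get("@type", [])
            if isinstance(types, str):
                types = [types]
            if "Product" not in types:
                continue
            for key in GTIN_KEYS:
                if key in node:
                    found.append(str(node[key]))
    return found


# Hypothetical store page; the product and its GTIN are placeholders.
SAMPLE_PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Example kettle", "gtin13": "0123456789012"}
</script>
</head><body>Example kettle</body></html>
"""

if __name__ == "__main__":
    print(product_gtins(SAMPLE_PAGE))
```

A review site could take the identifiers returned by `product_gtins` and match them against its own catalogue, which is the kind of site-to-site connection described in the interview; how such a site would fetch the page and store its reviews is left out here.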