On-Page SEO - More Than a Primer

Ryuzaki

On-Page SEO
Getting Up To Speed in Today's Landscape

Long ago, in an internet that time forgot, processing power and time were far more limited than they are now. Search engine algorithms were much more primitive. The current and historical data in storage was laughable. Directories, web rings, and even print encyclopedias of websites still existed.


What It Was

Back then, you pretty much had to shove not just your topic, but the precise keyword down a search engine's throat to let them know what you wanted to rank for. Managing your On-Page optimization NOW requires knowledge of how the game used to be played.


If you want to know how to make your website walk like, talk like, and quack just like an SEO'd site, then all you have to do is follow SEO blogger advice on On-Page SEO. They are about five years behind in this regard, and most are even further. The most popular WordPress plugins for on-page lead you down this road as well. This is how you used to perform on-page optimization, and take note, because it's still relevant in some aspects:

If you wanted to rank highly for a specific term back in the day, just make sure you put your term in the following spots in your HTML...
  • <title>
  • <h1>
  • <h2>
  • <h3>
  • <b> or <strong>
  • <i> or <em>
  • <img alt="" />
  • 5-10% keyword density
  • <meta name="description" content="">
  • <meta name="keywords" content="">
That was pretty much it. You were 50% of the way to #1, if not 75%, if you did this. Search engines weren't able to do topical analysis in terms of latent semantic indexing, topical flow, popularity flows, etc. Google, with its revolutionary PageRank, was still very primitive. It was meant to measure popularity, while TrustRank flowing from seed sites was meant to measure trust and authority. But still, this didn't help out with understanding the topic, the niche, the concept of the page itself.
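To make it concrete, here's a rough sketch of what one of those old-school "optimized" pages looked like. The keyword ("blue widgets") and everything else here is made up for illustration:

<html>
<head>
  <title>Blue Widgets - Best Blue Widgets - Buy Blue Widgets Online</title>
  <meta name="description" content="Blue widgets, cheap blue widgets, buy blue widgets today.">
  <meta name="keywords" content="blue widgets, blue widget, widgets blue">
</head>
<body>
  <h1>Blue Widgets</h1>
  <h2>Why Our Blue Widgets Beat Other Blue Widgets</h2>
  <p>Looking for <strong>blue widgets</strong>? Our <em>blue widgets</em> are the best blue widgets around...</p>
  <img src="widget.jpg" alt="blue widgets" />
</body>
</html>

Keyword in every tag on the checklist, density pushed sky high, and back then it worked.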



What It Is Now

The game has changed. The only search engine to concern yourself with is Google and they are concerned with three things:
  1. Psychological Warfare - using propaganda to scare people into not being SEOs
  2. Friendly Fire - getting idiots to negative SEO each other, use the disavow tool, and report each other's networks
  3. Cutting Off Heads - letting you stand too tall above the crowd, and then chopping you down with filters
This thread has to do with #3, and only in the on-page arena. We can venture out into off-page in further posts if the right questions are asked.

So what's one way to stand too tall and have your head poking out? Over-optimizing On-Page SEO.

The funniest thing with Google is that they will literally put out guides and teach SEOs how to optimize for Google. Then a few years down the line they will call it manipulation and start chopping heads. And that's what's going on now. If you do it the way it used to be done, as WordPress plugins would have you do it, and as SEO bloggers will preach, then you're popping your head above the crowd.

The game now is camouflage. You want to look like a normal internet user. And do they even know what all of those HTML tags above are? No. So they definitely aren't going to end up with their keyword in all of those tags or likely even touch HTML any more. It's very simple for Google to run an analysis throughout the index and find out what level of optimization is average for a niche and then to filter out, devalue ranking power, dampen link flows, or whatever on any page that's over-optimized. Do they do this? You bet your newbie ass they do.




Don't Over-Optimize

So step one of doing proper on-page SEO to your money sites is to not over-optimize.

The question then becomes... well which HTML tags still carry the most weight? If I'm going to limit the number of places I put my keyword, then which should I use?

I'll tell you. This is my gift to you, the newbie. Use the <title> and the <h1>. That's it. You don't even have to use your keyword in the content of the page if you don't want to. I would use it at least once, though.

Google is getting a grip on topical analysis now. So you can use LSA terms (latent semantic analysis), meaning that you need to discover which n-grams (another term you need to look up and understand) usually appear together on a page. And use THOSE instead of your main keyword in other spots. Now you're optimized for a keyword without overdoing it, and you're getting optimized for a topic as well.
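A rough sketch of what that more restrained approach might look like. Again, the keyword ("blue widgets") and the related terms in the subheadings are invented stand-ins for whatever your own n-gram research turns up:

<head>
  <title>Blue Widgets: How to Choose the Right One</title>
</head>
<body>
  <h1>How to Choose Blue Widgets</h1>
  <h2>Materials, Sizing, and Durability</h2>  <!-- related terms, not the keyword -->
  <p>The keyword itself might appear once in the body copy, and that's about it.</p>
  <h2>Installation and Maintenance Costs</h2>  <!-- more topically related n-grams -->
</body>

Keyword in the title and h1, topic everywhere else.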

This goes for the page level and the entire domain level. This is how true authority sites rise. Have you ever looked at Webmaster Tools and seen the "content word usage" section or whatever it's called, where they are tracking which terms get used the most in your content? Yep. They are just showing you the tip of the iceberg.

(Hint: If you're creating your own contextual backlinks, do on-page optimization in that content as well to manipulate topical flow so your page and your domain become more authoritative.)




Quality Considerations

So you think Google can handle topical flows, latent semantic analysis, n-gram analysis, term frequency-inverse document frequency (TF-IDF... dropping more hints on you), and all they are worried about is whether a keyword appears in certain HTML tags? They taught an entire industry of suckers how to optimize their pages and out themselves. They chopped and are still chopping those guys while finding better ways of telling what's not only relevant but also most authoritative and popular.

But they aren't just concerned with authority and popularity. They want to know if those pages are of high quality and easy accessibility too.

I'm making this far too easy for you. You need to understand how the Flesch-Kincaid Reading Ease and Grade Level algorithms work and figure that Google has one of their own similar to it. Read up on how the Gunning Fog Index, Coleman-Liau Index, ARI, and others work as well.
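For reference, these are the standard published formulas (the textbook versions; whatever Google runs internally is anyone's guess):

Flesch Reading Ease = 206.835 - 1.015 * (total words / total sentences) - 84.6 * (total syllables / total words)
Flesch-Kincaid Grade Level = 0.39 * (total words / total sentences) + 11.8 * (total syllables / total words) - 15.59

Shorter sentences and shorter words push the Reading Ease score up and the Grade Level down.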


Now consider your niche. Are you writing about macaroni noodle crafts? Yes? Then why are you writing those articles with a doctorate-level vocabulary and level of syntax? Dumb it down to what's appropriate for your niche and search term. Are you writing about quantum physics? Then maybe it should score highly on the Flesch-Kincaid Grade Level and use academic formatting like APA or MLA styling...

Most of us are going to be dealing with niches where high-school level readability is desirable. Even in niches that are more sophisticated, nobody wants to expend effort. They want to consume content quickly and Google knows this. So write at the high school level.

What else increases readability? Lists.

What else increases time-on-page and decreases bounce-rate (both quality signals)? Videos, Images, Quizzes, Length of Content.

Quizzes work so well. If you're a Facebook user, you know this very well.


Does Google say "Hey, a video is present, that's a quality signal!" or do they look at the consequences of the video, which would be keeping your viewer around longer than your competitors? Likely both. Include several types of content on your page to ensnare your reader and tell the search engines that you're the best choice for their traffic.

Some Types of Content To Consider

How about a Table of Contents that links to sub-sections of your page? #introduction, #chapter1, #chapter2. I bet the search engines think that makes for good user experience.
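If you've never built one, it's nothing more than ordinary anchor links pointing at IDs further down the page (section names made up):

<ul>
  <li><a href="#introduction">Introduction</a></li>
  <li><a href="#chapter1">Chapter 1</a></li>
  <li><a href="#chapter2">Chapter 2</a></li>
</ul>
...
<h2 id="chapter1">Chapter 1</h2>

Google has also been known to show "jump to" links in the snippet from these anchors, which doesn't hurt either.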

What about supplementary content? Yes, the stuff you cram in your sidebar matters. The stuff you cram in your 4 column footer matters. Human quality raters look at every bit of this and report back to a computer which then attempts to simulate their discernment abilities. Make every bit of it count.

What about being a link hub? Do you point your readers to different resources that Google also considers high quality? You definitely want to interlink within your own site as well.

Does your content appear above the fold, before advertisements? Webmaster intention plays a role in the determination of whether or not a page is of high quality. Over-monetization is a negative ranking signal.

About, Contact, Social Network... what else says legit, trustworthy, authoritative, easy to use, and high quality?

This is how complicated the on-page game is now. Google is hiring the top statisticians and computer engineers on the planet. They have global positioning satellites that direct driverless cars on the interstate. You better not be stuffing keywords in HTML tags.



Page Speed
This is that "meta" realm, and I'm not talking about meta keywords and descriptions. Most people concern themselves with what is on the page, but forget about the page itself. You could have the finest fountain pen in existence which writes with the ink of the gods, and it doesn't matter at all if you're writing on a parchment that will crumble and turn to dust as soon as someone picks it up to read it.

The online equivalent is the page speed. You don't want your server buckling under the load.

You want your content to be served up immediately to the user. Google is getting better and better at simulating this on their end too, now rendering JavaScript and AJAX and checking what loads above the fold. So how do you serve your content to the user as fast as possible? Look at these numbers CCarter posted in another thread to get an idea of how far you can take this. This is from Pingdom's page speed tool.

[Screenshot: Pingdom page speed results posted by CCarter]

You want to consider the following relationships between your server and your user:
  • What resources can the server allocate for the user?
  • How much data is it trying to shuffle to the user?
  • How fast can it send this data to the user's browser?
  • And ultimately, how quickly does the page fully render?
The first bottleneck you need to worry about is server resources. If you're on a quality shared server and you're not getting railed with traffic, you're probably okay if and only if the rest of your page is in order. But if you're really wanting to serve some fast traffic, you'll at least want a Virtual Private Server (VPS). This means that you are getting some portion of a full server partitioned off just for your own usage. It's not shared. So if some other site on the server starts getting a ton of traffic, they can't pilfer your resources like on a shared server. Your CPU power and RAM are yours and yours alone.

The second question is how much data is being moved around. Everyone is tempted to create the most graphically impressive websites, but minimalism is key here. You want to use only the images necessary and then compress those images down to the smallest file size possible. If you have a ton of small images, you'll want to create a sprite sheet that loads all of them at once and then pull from that sheet using Cartesian coordinate positioning. A way to still look good while not having a ton of images is to use CSS to create some of the effects you want.
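A minimal sketch of the sprite technique, with a made-up sprite.png where each icon is 32px wide: every icon lives in one image (one HTTP request), and background-position slides the right one into view.

<style>
.icon {
  background-image: url('sprite.png');  /* one combined image, one request */
  width: 32px;
  height: 32px;
  display: inline-block;
}
.icon-search { background-position: 0 0; }      /* first icon in the sheet */
.icon-cart   { background-position: -32px 0; }  /* second icon, shifted 32px */
</style>

<span class="icon icon-cart"></span>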


Part of the reason you want to use the least amount of images possible and combine them into a sprite sheet is to reduce the number of HTTP requests. Generally these pieces of data are going to be loaded serially, not in parallel. So the fewer of them you have, the faster everything can get loaded and rendered. This means having fewer images, combining CSS stylesheets, combining JavaScript files, and calling JavaScript last so everything else has a chance to load first before the script is executed. If you have a lot of images, there are scripts to load only those above the fold first so you get a faster loading score. The rest will load as the page is scrolled.
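In template terms that boils down to something like this (file names are placeholders): one combined stylesheet in the head, scripts merged and pushed to the bottom of the body so they don't block rendering.

<head>
  <link rel="stylesheet" href="combined.min.css">  <!-- all stylesheets merged into one request -->
</head>
<body>
  <!-- content loads and renders first -->
  ...
  <script src="combined.min.js"></script>  <!-- all scripts merged, called last -->
</body>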

Now, you CAN load in parallel instead of serially by using a content delivery network (CDN). What this does is store your static content all around the globe so that it can be delivered to your user faster, instead of having to skip and hop through many nodes halfway around the world. If you're thinking that electricity moves at the speed of light, you're right. But it's not a direct line to your user. CDNs can be magic, and they can offer you parallel loading. If you're using a CDN, then you need to be using server caching as well, along with browser caching. You should be doing this anyways. We can't get into all of it, but you can research those possibilities.

Also, though it's not always practical these days with content management systems (CMS), you can forgo a database altogether. You can go flat-file HTML and load much faster. This will also improve your site security a million fold. But it's not always desirable. So at least keep your database clean and don't store 100,000,000 spam comments and 10,000 draft revisions.




Conclusion
This post has become far too long. Perhaps I've missed some things as I typed this post. If you feel I have, please share any other methods available to help bring everyone up to speed in this modern SEO landscape. Please ask questions, criticize, or anything else. Join the discussion.


<< TRANSMISSION OVER />>
 
This looks great. I started to read it but I have to run now, looking forward to continuing it later. Will update with my thoughts.
 
I saw in the intro section that there are a few people who have been doing this for around 20 years, really curious to hear their thoughts on this since they've seen a lot over the years.

Solid post, Ryu!
 
Oh no... I think I'm a bit over my head here... I'm going to print this out and highlight all the words I don't understand, then I'm gonna look 'em up and try to learn everything that's mentioned in here. As good a starting point as any. Told you guys I was a serious fella.
 
"You could have the finest fountain pen in existence which writes with the ink of the gods, and it doesn't matter at all if you're writing on a parchment that will crumble and turn to dust as soon as someone picks it up to read it."

Not sure if this is a forum post or an ancient tome of wisdom passed down for thousands of years.

Very interesting, you raise a lot of practical points and some more abstract ideas as well. If we had a rep system, I would give you some.
 
Something that could be mentioned is site organization, hierarchy, and silo-ing.

Site Hierarchy

Most CMSs attempt to take care of this for you but can also create quite a bit of confusion, lots of duplicate content being indexed, and more. Tag systems, category systems, and the like are very useful if used properly.

The idea behind this is summed up in the silo philosophy, which says to create hubs within your website with a certain linking pattern. Very specific long-tail articles would link up to the main post on the shorter-tail term and sideways to other similar long-tail articles. People used to say not to interlink silos and to try to push the PageRank juice upwards, but I don't think that's necessary any more, with Google's understanding of Breadcrumbs and Categories now.

But using those two items (crumbs and cats) can help assure the search engines that you are providing proper navigation for your users. And interlinking everywhere can help increase crawl depth and boost rankings site wide if you want. Or you can strategically interlink to pages you want to boost a bit.
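As a rough sketch, a long-tail article inside a silo might carry something like this (category and page names invented): a breadcrumb exposing the hierarchy, plus contextual links up to the hub and sideways to a sibling article.

<!-- breadcrumb showing the hierarchy -->
<p><a href="/">Home</a> &raquo; <a href="/blue-widgets/">Blue Widgets</a> &raquo; Cleaning Blue Widgets</p>

<!-- contextual links: up to the hub page, sideways to a sibling long-tail article -->
<p>Before cleaning, make sure you've picked the right model from our
<a href="/blue-widgets/">blue widget guide</a>, and see
<a href="/blue-widgets/storage/">how to store blue widgets</a> afterwards.</p>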
 
Whoa...

I've been one who just gets "close enough" with on-page and then tries to make up the difference with links.

This really makes me wonder how much harder of a time I'm making for myself. I've never even measured my page speed before. Or really added any more content types other than a picture.

I'm going to choose a few pages on one of my sites and add these features and see if they rise while the others stay stable. I guess it'll take a long time to get the results but I'll report back. I won't aim any links at them either.
 
To add on to what CherryPit said about tags and categories, if you want to use both, you can always set the tag pages to noindex so that users can still use them but they won't result in duplicate content and low-quality pages being auto-generated out of excerpts. The only problem with this is that they are PageRank juice leaks, as a tag cloud creates a lot of links that are usually sitewide. It's a give and take. Sometimes, to help the users, you have to ignore the robots.
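For anyone who hasn't done it, that's just a robots meta tag in the head of the tag archive templates (most SEO plugins expose a checkbox for it):

<meta name="robots" content="noindex, follow">

The "follow" part means the links on those tag pages still get crawled even though the pages themselves stay out of the index.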
 
I'm going to implement some of the things I didn't know to my site and see what happens to the rankings. How long do you think it'll take before it fully reflects the changes?
 
That depends on how often your site is crawled, Sam. But expect G to update your site's cache in their database in 7-10 days and then another 7-14 days for the SERPs to reflect it.
 
Great post.

I'm spending a lot of my own time at the moment testing out on-page factors.

I'm liking what you're saying about LSI and that's something that I'm implementing where I can. Luckily Google gives us a good idea of what terms they expect to see on a page for each SERP with their 'Searches related to...' data.

Even implementing that long-term will give you an edge on competitors, plus those long-tails man!

I do at this point feel as though some people are perhaps too conservative with the old school stuff like keyword densities, but it's far more important how you treat your 'hub' as you put it.

Whether that's going to be a hallway page or whatever, it's a good idea to link back up the funnel from pieces of content that are relevant. I'm doing this with exact match and it hasn't hurt any of my own sites, or other test sites thus far.

It's all signals: signals from relevant pieces of content tell Google which page holds the most authority. If you have a hub page that's targeting the main keyword, you want to tell Google, from the other pages that contain the same keyword, which one is the true authority page, and internal linking is the primary way to do that. Everything else is still a secondary consideration in my opinion... It does help to use LSI terms around those links, as we all know.

We know that when it comes to rankings, links are what matter, and I think that goes for on-page too, even though there are a tonne of other factors as well, just as there are with regular off-site SEO.

If you've ever had a partial or exact match niche site and one of the internal pages is ranking for the main term, not the homepage, then you need to look at those on-page signals and TELL Google which page is most relevant for that term.

This is why silos are SO effective.

LSI again is all about signals, and all that is on-page comes back to giving Google the relevant signals.
 
If you've ever had a partial or exact match niche site and one of the internal pages is ranking for the main term, not the homepage, then you need to look at those on-page signals and TELL Google which page is most relevant for that term.

I think this is a good point. I've seen people say that this is happening because the homepage is over-optimized, largely due to the partial or exact match domain. So then they remove even MORE of the on-page signals from the homepage and it never ranks.

But I also think Google doesn't want to rank homepages for much beyond brand terms right now too. They are focusing on inner pages as being the informative ones. But I'm seeing plenty of EMDs ranking too, so who really knows.

I agree with your sentiment that you can tell them which page is more relevant, especially with internal linking.
 

Yeah, tricky to say... The thing is that the homepage on a PMD/EMD usually has the density there, and it'll certainly have the most external inbound links with exact match anchors pointing to it, and yet so often internal pages pop up for those terms instead.

Way too difficult to say definitively, and usually I've found this is more common on blogs.

When you think about it the homepage has most of the link juice and if it's a blog it's pointing off to various other pages/posts which probably contain the keyword in the title as well - if the niche is fairly generic. So the homepage is basically passing most of its authority, relevancy and link juice away to other pages. If the site is an EMD/PMD then perhaps blogs are making it worse on themselves when trying to rank the homepage.

Contrived Example:
URL: keywordmoniker.com
Internal: keywordmoniker.com/keyword-generic-text-here/

The internal page already has two instances of the keyword in the URL. If you go by the school of thought that Google doesn't take keywords before the last trailing slash into consideration except for relevancy, then that's a clue. Contrived, as I said, and far from the factors that are really important, but at least it's a rudimentary example of how internal pages or posts (in the case of blogs) are pipping the homepage to the post.

If that post is permanently sitting on the homepage as well, in the case of a rarely updated blog, we're getting an internal link outside of the header and footer (which, as we know, carry less link weight), with a partial match anchor, straight from the page with the most link juice on the site.

Honestly you seem like you probably know more than me on this and I'm still getting started with testing a lot of my theories so I'm going to have to get back to you on this when I have more data.

- RF
 
Awesome post, Ryuzaki (I like the name, too). When you put it like that it seems so logical, well, because it is. Yet for some reason I never considered that side of things. I must admit, some of the LSA math is over my head, but I understand enough of it to get the general idea.

http://www.hemingwayapp.com/ is a cool free tool I heard about a while back which is aimed at improving writing in general. It has a grade level slider and a character/word counter which should cover all the basics.
 
What most of us miss here is that we've only reached partially semantic search engines, so you still need metadata or meta tags to specify what your content is on each page. With a fully semantic web you would get more specific and detailed results and deeper information classification. For example, if I ask a search engine for information on my hot, sexy neighbor "Anna", it should give me every detail I need. One of the problems is that when I type her name, the search engine doesn't understand which Anna I mean, so I need to give it some detail. A fully semantic engine wouldn't need that, but it would need access to your private information: geolocation, family tree, friendship relations, social data (IDs, phone numbers...) and much more. So when you ask for Anna, the engine wouldn't ask you for more information; it would give you the most intimate information it has. Facebook's knowledge graph is an example of a partially semantic search engine.

So this is why companies like Google try so hard to get your personal data, not to mention selling your info to the three-letter agencies!

I didn't explain the theoretical part of this because it involves some math and terms like fuzzy retrieval, adversarial information retrieval, subject indexing, latent semantic analysis, machine-learned ranking...
 

Dublin Core Metadata is something worth looking at.

Isn't Fuzzy Retrieval based on Boolean logic? Just asking, because as you know there are so many IR methods about and I'm not too clued in on this one.

I was under the impression that Google use an inverted secondary index as a boolean based index would grow unfathomably large.

I'm sure I've got it wrong, got any reference links you could point us to? Or do you mind explaining a little more?

I really think the subject of IR is fascinating and extremely important from a technical SEO standpoint, so I'm sure nobody would mind it if you had some cool reading material to share! I hope we're not the only IR weirdos in the township!

- RF
 
What I mostly know is that Google likes to have more control over their IR, so they use semi-automatic "machine-learned ranking". Other than that, I don't know which modules specifically they use, so I'm like you right now: I don't know much!

Isn't Fuzzy Retrieval based on Boolean logic?
Yes! Fuzzy Retrieval, or Fuzzy Information Retrieval, is a method developed from extended Boolean logic. The primary problem it solves is deciding whether an element belongs to a set: instead of a plain true/false membership, it assigns an intermediate truth value using a membership function that maps to the real interval [0,1]...

A bunch of theory we don't need to know :smile:


I really think the subject of IR is fascinating and extremely important from a technical SEO standpoint...

It's extremely hard to know all the methods and all the theories; sometimes my mind hurts. But you can find research white papers that include many types of this information. Fuzzy theory has been developed a lot, so we can't know exactly which method Google uses, but if you check their knowledge graph patents you can find some interesting information.

I don't know which part of my last post you didn't understand; I simplified it as much as I could and it doesn't include anything complex.
 

Hah, no, I think you misunderstood my reply. I was simply saying that my understanding is that it's been confirmed the secondary index is an inverted index. Why this matters is because an inverted index is used instead of a Boolean index, not together with it.

So, since I was right that Fuzzy Retrieval is Boolean-based, where did you get the impression that this is what is being used?

I regularly read through various granted patent filings and information retrieval method blogs, and you are right: the existence of a method or patent doesn't indicate whether it's being used or not. So there's no reason to just throw down IR / indexing method names without explaining them, or without even stating explicitly that they may or may not be used.

Most on this forum do their homework and will then go and read about those, assuming they're worth reading about or are actually being used by Google, because let's face it, that's the main search engine we care about here...

So I do find it a little funny that you say you've simplified your post when actually you've just dropped a number of theories that, as you now say, may not even be used. You haven't gone into any detail about why these are used, or anything along those lines, which can be done without getting into the mathematics.

By doing this you've made it the very opposite of simplified, making it more confusing to those who would like to know about these kinds of subjects. In fact, it's potentially misleading, since you never explicitly stated that they may or may not be used in your first post.

In regard to your follow-up about Fuzzy Retrieval: I can tell you right now that determining the relationships, context, and relevancy of terms in a document is a lot more complicated than simply determining a yes/no value. Fuzzy Retrieval may or may not be used as part of the weighting of relationships, but from what I've seen there are far more efficient and precise methods of doing this. So yes, I don't know much about Fuzzy Retrieval, and you may be right that its importance is understated, but if so, please give us an explanation or a source for where you're getting that information.

This isn't a personal attack, as you may have been led astray yourself. This is why these topics are so important: we need to discern myth from fact.

In order to do this we need to verify information, back it up with explanations and sources and that's all I'm saying here. I am totally open for you to teach me something, but if you want to say these things you really have to back it up.

- RF
 

Oh snap.

I didn't actually mean my reply to come across as scathing. The thing is, I get really passionate about this topic, and to go back to what I said before, it's not meant to be personal at all; I just think that explanations or sources are in order when something different is put on the table compared to what we already know.

The guy may be right, but how will we know?

- RF
 
Nah, man. I didn't read it as scathing at all. I saw the part where you specifically said it's not a personal attack; we just need empirical evidence to make sure we're all on the same page and not spreading rumors.
 

I talked to @MoneyStalker and we discussed Fuzzy Retrieval and he's right about it being important.

Important over any other method, though? Not so much. But he cited the following link as the source for what he was trying to explain, with these methods being used as part of what's going on here: http://dejanseo.com.au/thin-content-update/

As with many other methods, it's important as part of the whole and is just one method used in conjunction with many others, which is where I felt he may have had a point. I was unsure whether he was trying to say FR was used over the others, which he wasn't, and to be honest, now that I understand FR more, that would be ridiculous to ever assume anyway. It was good to talk to him, and that's what I like about BuSo... everyone's willing to have a chat and explain what they mean.

I guess we all tend to be overly busy and just try to simplify stuff which can lead to confusion and even he admitted himself that I had a point there, so yeah it's all good.

He even sent me an epic book on IR etc etc he has copies of so that's cool :smile:
 
What bothers me is the mention of Dublin Core.

No one is using Dublin Core, and I am not even kidding. It is one (of several) humongous library metadata frameworks that ended up not being used at all, or used in an incomplete / faulty / very customized fashion by libraries / online repositories.

Look at Schema.org to see where things are heading... it actually has the big players behind it.

Q: Why are Google, Bing, Yandex and Yahoo! collaborating? Aren't you competitors?

Currently, there are many standards and schemas for marking up different types of information on web pages. As a result, it is difficult for webmasters to decide on the most relevant and supported markup standards to use.
Creating a schema supported by all the major search engines makes it easier for webmasters to add markup, which makes it easier for search engines to create rich search features for users.

Source: https://schema.org/docs/faq.html

(Actually, the whole FAQ is interesting)
Even that has a very abandoned feel about it - my guess is that it was the start and now everyone is expanding on it on their own.
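For anyone who hasn't poked at it, the markup itself is just attributes layered onto your existing HTML. A minimal microdata sketch (the article details are invented):

<article itemscope itemtype="http://schema.org/Article">
  <h1 itemprop="headline">How to Choose Blue Widgets</h1>
  <span itemprop="author">Jane Doe</span>
  <time itemprop="datePublished" datetime="2014-09-03">September 3, 2014</time>
</article>

The same idea works with JSON-LD if you'd rather keep it out of the visible markup.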

::emp::

P.S. If you meant to point to Dublin Core as just an example, then all is good. Just don't think it holds any water nowadays.
 