Monday, January 13, 2014

On Page SEO Part 2: An Introduction To Signals of Quality


In the previous tutorial we looked at some basic on page factors, including the alt attribute. It was suggested that every img tag should have an alt attribute, even if the image is entirely decorative. These changes might at first seem a bit pedantic, but they make for better accessibility and standards compliant HTML.
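
As a rough illustration of how the alt attribute advice could be checked automatically, here is a minimal Python sketch using only the standard library. The function name `find_images_missing_alt` and the sample markup are made up for this example. It flags only img tags with no alt attribute at all, on the assumption that an empty alt="" is acceptable for decorative images.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects <img> tags that have no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src values of images lacking an alt attribute

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # alt="" is treated as valid (decorative image), so only a
        # completely missing attribute is reported.
        if "alt" not in attr_map:
            self.missing.append(attr_map.get("src", "(no src)"))


def find_images_missing_alt(html):
    """Return the src of every <img> in `html` that lacks an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


# Hypothetical sample markup for illustration
sample = """
<img src="logo.png" alt="Company logo">
<img src="divider.gif" alt="">
<img src="photo.jpg">
"""
print(find_images_missing_alt(sample))
```

Running a check like this over a site's templates before launch is usually far less work than fixing every page after the site has gone live.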


Ensuring pages are accessible and standards compliant can cause a lot of work for webmasters trying to rectify things after a site has gone live, especially if every page contains multiple HTML errors. So is it worth all the bother? The simple fact is that accessible sites are generally more search engine friendly and can be viewed on a wider selection of devices and browsers.
Making sure that every piece of HTML code on every page validates and meets current accessibility standards is a signal that a business cares about every single visitor to its website. Spammers using ‘throwaway domains’ are more likely to shy away from this type of work because of the labor, time and expense involved.
Signals of quality are rarely about relevance. For example, it’s easy to understand why allowing a page to go live as an ‘untitled document’ would harm relevancy, but it’s not so obvious why including a telephone number would increase search engine rankings.
There is a distinct difference between quality and relevance, and search engines must balance both in order to deliver the best results. The task of identifying quality is becoming increasingly important due to the amount of low-quality content that is uploaded to the web every day.

Bayesian Filters

Bayesian filtering is used by most modern mail clients to weed out spam from legitimate email. Search engines use it to categorize documents, and Google uses it to deliver relevant AdSense ads. How do Bayesian filters work? The process starts with a list of sites that have been classified as high quality and another list that has been classified as low quality. The filter looks at both and analyzes the characteristics common to each type of site.
Once the filter has been seeded and the initial analysis completed, it can be used to analyze every page on the web. The clever thing about Bayesian filters is that they continue to spot new characteristics and get smarter over time. Before we delve into any great detail on how Bayesian filters work, here are a couple of quotes from Matt Cutts regarding signals of quality that clearly show Google is addressing the problems caused by low-quality, mass-generated content.
“Within Google, we have seen a lot of feedback from people saying, Yeah, there’s not as much web spam, but there is this sort of low-quality, mass-generated content . . . where it’s a bunch of people being paid a very small amount of money. So we have started projects within the search quality group to sort of spot stuff that’s higher quality and rank it higher, you know, and that’s the flip side of having stuff that’s lower-quality not rank as high.”
“You definitely want to write algorithms that will find the signals of good sites. You know, the sorts of things like original content rather than just scraping someone, or rephrasing what someone else has said. And if you can find enough of those signals—and there are definitely a lot of them out there—then you can say, OK, find the people who break the story, or who produce the original content, or who produce the impact on the Web, and try to rank those a little higher. . . .”
Signals of quality have been mentioned in Google patents, and some specifics have been discussed by Google engineers, so hopefully the days of article mills and article spinners are numbered.

How Bayesian Filtering Works

Although it is known that search engines use Bayesian filtering, the exact algorithm is of course proprietary and unlikely to be made public. However, the general behavior of Bayesian filters is well understood, so let's start by looking at how Bayesian filtering works.
To begin, a large sample or white list of known good documents (authoritative, highly trusted pages) and a large sample of known bad documents (pages from splogs, scraper sites etc.) are analyzed and the characteristics of each page compared. When a large corpus of documents is compared programmatically, patterns or ‘signals’ emerge that were hitherto invisible. These signals can then be used to provide a numeric value (or percentage likelihood) of whether the characteristics of other pages lean towards those from the original sample of good documents or those from the original sample of bad documents.
A simple example of this would be to compare the words in the good documents with those in the bad documents. If it is discovered that many low-quality pages use terms like ‘buy cheap Viagra’ or have a section on each page for ‘sponsored links’, then other pages that do the same might also be of low quality. Conversely, if it is discovered that high-quality pages often contain a link to a Privacy Policy or display a contact telephone number, then other pages that do the same might also be high-quality pages.
As the process continues, more signals are uncovered. In this way the filter learns to recognize other traits and whether they are good or bad. There are likely to be many signals of quality measured, each one adding to or subtracting from an overall score of a page's quality.
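
The process described above can be sketched as a very small word-based classifier. This is a simplified naive Bayes illustration, not the algorithm any search engine actually uses. The function names, the add-one smoothing, and the tiny training samples are all assumptions made for the example. It scores only the words present on a page, and each word adds to or subtracts from a running log-odds score.

```python
import math
from collections import Counter


def train(good_docs, bad_docs):
    """Count in how many documents of each class every word appears."""
    good_counts = Counter(w for doc in good_docs for w in set(doc.lower().split()))
    bad_counts = Counter(w for doc in bad_docs for w in set(doc.lower().split()))
    return good_counts, bad_counts, len(good_docs), len(bad_docs)


def quality_probability(text, model):
    """Estimate the probability that `text` resembles the good sample."""
    good_counts, bad_counts, n_good, n_bad = model
    # Start from the prior odds of good vs bad in the training samples.
    log_odds = math.log(n_good / n_bad)
    for word in set(text.lower().split()):
        # Add-one smoothing so an unseen word never yields a zero probability.
        p_good = (good_counts[word] + 1) / (n_good + 2)
        p_bad = (bad_counts[word] + 1) / (n_bad + 2)
        # Each word is one signal that raises or lowers the overall score.
        log_odds += math.log(p_good / p_bad)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical seed samples, invented for illustration only
good_sample = [
    "contact us by telephone privacy policy",
    "our privacy policy and telephone number",
    "original research and contact details",
]
bad_sample = [
    "buy cheap viagra sponsored links",
    "cheap pills sponsored links buy now",
    "buy cheap links",
]

model = train(good_sample, bad_sample)
print(quality_probability("buy cheap viagra", model))          # leans towards the bad sample
print(quality_probability("privacy policy telephone", model))  # leans towards the good sample
```

A real filter would work from millions of documents and many kinds of signals beyond words, but the principle of scoring a page against seeded good and bad samples is the same.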
This means that SEOs, web designers and webmasters need to adopt a holistic approach that takes into account information architecture, relevancy, accessibility, usability, quality, hosting and user experience.

The Link Structure of The Web

Although links will be covered in future tutorials, it makes sense to discuss some of the implications of recent changes in the link structure of the web now. Once upon a time reciprocal links were all that was needed to achieve top search engine rankings. Because reciprocal links were easy to acquire, they made it easy to promote sites of lesser quality so that they outranked quality sites. Search engines therefore stepped in and devalued reciprocal links along with PageRank.
One way links were now the way to go, so a new market in selling one way links emerged. Search engines again viewed this as a way to game the system and paid links, if detected, were devalued so that they passed no value whatsoever. The nofollow attribute was implemented so that, amongst other reasons, links could be sold without penalty. The nofollow attribute has also been adopted for other reasons and is used on millions of blogs and some of the most popular social sites.
URL shortening is also popular and again is used by some of the most popular sites on the web. The upshot of all this is that although the web continues to grow, the ability of many millions of pages to link out and cast a vote for other pages has been removed. Of course you still get the traffic, which can be substantial if you make the front page of Digg. Because the link graph of the entire web is essentially in recession, search engines are again reevaluating the way they calculate rankings, and quality has many discernible signals.

The Need To Discern Quality

According to a study carried out by WebmasterWorld, the top 15 doorway domains are a haven for spam. The study analyzed popular search terms and discovered that more than 50% of the results were spam. 77% of the results from blogspot.com were found to be spam. The following list shows the level of spam found on the top 15 doorway domains:
Doorway Domain       Spam %
sitegr.com           100
blog.hix.com         100
blogstudio.com       99
torospace.com        95
home.aol.com         95
blogsharing.com      93
hometown.aol.de      91
usaid.gov            85
hometown.aol.com     84
maxpages.com         81
oas.org              78
blogspot.com         77
xoomer.alice.it      77
netscape.com         74
freewebs.com         52
The study shows that on the keywords tested some of these blogs are used exclusively by spammers, while others had a very high percentage. The reason for this is that these sites provide free blog space which is a magnet for spammers who need to generate links to low quality splogs or scraper sites quickly.
The next list compares the percentage of spam sites by top-level domain (TLD):
TLD      Spam %
.info    68
.biz     53
.net     12
.org     11
.com     4


This research highlights the incredible amount of spam that exists on the web, but it would be unfair to penalize every .info domain, for example, just because a high percentage of .info domains are used by spammers.
Conversely, it would be unwise to trust every .com even though in general they seem to be comparatively spam free. To discern quality, many signals have to be considered, covering every aspect of a website.
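
To make the point concrete, here is a small Python sketch that treats the TLD spam rate from the table above as nothing more than a starting estimate, then adjusts it with other signals using Bayes' rule in odds form. The signal names and their likelihood ratios are invented purely for illustration. They are not measured values and do not come from the study.

```python
# Spam rates by TLD, taken from the table above.
TLD_SPAM_RATE = {".info": 0.68, ".biz": 0.53, ".net": 0.12, ".org": 0.11, ".com": 0.04}

# Hypothetical likelihood ratios: P(signal | spam) / P(signal | not spam).
# Values below 1 suggest quality, values above 1 suggest spam.
# These numbers are assumptions for illustration only.
SIGNAL_RATIO = {
    "has_privacy_policy": 0.2,
    "has_phone_number": 0.25,
    "scraped_content": 20.0,
    "sponsored_links_block": 5.0,
}


def spam_probability(tld, signals):
    """Combine the TLD prior with each observed signal, assuming independence."""
    prior = TLD_SPAM_RATE[tld]
    odds = prior / (1 - prior)  # convert the prior probability to odds
    for signal in signals:
        odds *= SIGNAL_RATIO[signal]  # each signal scales the odds up or down
    return odds / (1 + odds)  # convert the odds back to a probability


info_site = spam_probability(".info", ["has_privacy_policy", "has_phone_number"])
com_site = spam_probability(".com", ["scraped_content", "sponsored_links_block"])
print(f".info site with good signals: {info_site:.2f}")
print(f".com site with bad signals:   {com_site:.2f}")
```

Under these made-up ratios, a .info site showing good signals ends up looking far less like spam than a .com site showing bad ones, which is why no single signal should decide the outcome on its own.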
The next tutorial in this series will look at on page signals of quality and why quality score is the new PageRank.
