Sep 22, 2011

Search engines and PDFs

People are always curious about how search engines treat various aspects of a website. Some rely on myths, while a few experiment and draw their own conclusions. Millions of websites use PDFs for genuine reasons; a company might, for example, want to publish its magazine on its website as a PDF. This raises a common question: how does Google treat PDFs, and do they rank in search results?

The Google Webmaster Central Help Forum has been a one-stop source for resolving such queries. One thread there provides a lot of information on how Google treats PDF files on a website.

Google employee John Mueller also contributed his input to the thread, which helps settle the question. We can conclude from the forum post that Google can read a PDF if its content is available as text within the file and uses proper character sets (a simple test is to select some of the text and try to copy it into a text editor). Still, it is recommended to create HTML versions of those PDFs. Web users may prefer to read the content before opting to download and view a PDF file, and many visitors don't appreciate PDFs. Duplicate content is not a concern if you mirror the content of PDFs on HTML pages, as search engines would show only one of them in the results (most probably the HTML version would rank above the PDF version).

Follow the Google Webmaster Central Help Forum for such queries.

Sep 21, 2011

Russian Internet Web Search Engine Enters Turkish Market

Yandex is a Russian web search engine, and it is planning to enter the Turkish market. The plan includes launching a Turkish version of the search engine, along with widgets and services aimed at Turkish web users.


Yandex's chief executive, Arkady Volozh, said that Turkey has a dynamic population and a promising market, so it was the obvious choice as the 5th country in which Yandex would start operations.


Google is already one of the leading international search engines in Turkey, and Yandex plans to capture roughly 10 to 20% of the Turkish market in the next few years.

Jul 20, 2011

Google Removes .co.cc domains from natural listings


A few weeks back, Google reported that it had deleted around 11 million .co.cc subdomains from its search listings. The issue with these sites was spam, which led the search giant to reconsider listing them alongside authentic and genuine domains.

Various companies offered these .co.cc domains for free or at lower prices than other domains. As a result, many spammy websites were ranking above genuine domains. The companies offering .co.cc domains are classed as 'freehosts' by Google, which therefore exercised its right to block these domains. The .co.cc domains were registered in the thousands, with a Korean company offering 15,000 subdomains for about $1,000. That amounts to about 5,741,810 different accounts, with a total of 11,403,223 domains registered to date.

The Anti-Phishing Working Group also reported some 5,000 phishing attacks from .co.cc domains in the last six months of 2010 alone. You can check this at the Google Online Security Blog post. This search engine news is all over the internet, and many .co.cc domains have been affected. Site owners are advised to contact the bulk subdomain provider if Google Safe Browsing shows a warning for a website they own.

Jun 22, 2011

Google to remove I'm Feeling Lucky Button?

Google is testing a few changes in its new search interface. Inspired by Bing, Google is testing the removal of the "I'm Feeling Lucky" button from the home page. Few users click this button these days, which is why Google is considering dropping it. Another change, according to various search engine news sources, is that Google is making its search button more prominent; however, most users might not find it useful in Instant mode.



The Cached and Similar buttons are also not used by many users. Google is therefore planning to move them into the Instant Preview box, which makes them more difficult to find. These changes are being tested on Google Finland, and if Google finds the results interesting, it might roll them out on a global scale.

Jun 6, 2011

Schema.org - Google, Bing & Yahoo Join for Content Markup

Schema.org is a joint effort by search engine giants Google, Bing and Yahoo! to improve search results. Many sites covering search engine news and updates are talking about this trending topic. The main reason the rivals have joined hands is to create a common set of HTML tags so that search engines can identify, crawl and index websites in the same way.
Schema.org provides a collection of schemas from different domains that webmasters can use to mark up their pages in a way these search engines understand. The complete hierarchy of Schema items is organized into categories, so you can tag parts of your website according to relevance (like book, movie, author, person, place etc.)
Marking up as much of the website's content as possible helps search engines use your information and present your page to users in the most useful way. The Schema FAQs state that one should mark up the content which is visible to visitors of the website.
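
As a rough illustration of what such markup looks like, the sketch below builds a small microdata snippet using the `itemscope`, `itemtype` and `itemprop` attributes that Schema.org relies on. The helper name `microdata_item` and the sample movie values are only assumptions made for this example; it is not an official tool from any of the search engines.

```python
from html import escape


def microdata_item(item_type, properties, tag="div"):
    """Return an HTML snippet that marks up visible text with schema.org microdata.

    item_type  -- a schema.org type name such as "Movie" or "Book"
    properties -- mapping of property name to the visible text for it
    """
    lines = [f'<{tag} itemscope itemtype="http://schema.org/{escape(item_type)}">']
    for prop, value in properties.items():
        # Each property wraps text that the visitor can actually see on the page.
        lines.append(f'  <span itemprop="{escape(prop)}">{escape(value)}</span>')
    lines.append(f"</{tag}>")
    return "\n".join(lines)


# Sample values chosen purely for illustration.
snippet = microdata_item(
    "Movie",
    {"name": "Avatar", "director": "James Cameron", "genre": "Science fiction"},
)
print(snippet)
```

The generated snippet keeps the text readable for visitors while the attributes tell the search engines which part is the name, the director and the genre.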

You can learn more about Schema by checking the Schema Getting Started Guide, or check the FAQs for more information.

Jun 2, 2011

Google Architecture

The Google search engine relies on its own algorithm to present results for every search query. The algorithms are kept secret, but certain rules are followed, and they need to be understood with website optimization in mind. The following is a generalized view of how Google’s search engine works.


Google Architecture: Web crawling is carried out by several distributed crawlers.




Indexing is done in three fundamental steps.

1. Parsing
2. Indexing documents into Barrels
3. Sorting
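
The three steps above can be pictured with a toy sketch. It is only an illustration under simple assumptions, not Google's actual code: the number of barrels, the rule that assigns a word to a barrel and the sample documents are all made up for this example.

```python
import re
from collections import defaultdict

NUM_BARRELS = 2  # hypothetical value, chosen only for illustration


def parse(doc_id, text):
    """Step 1 (parsing): turn a document into (word, doc_id, position) hits."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [(word, doc_id, pos) for pos, word in enumerate(words)]


def barrel_for(word):
    """Pick a barrel for a word (assumed rule: based on its first character)."""
    return ord(word[0]) % NUM_BARRELS


def index_into_barrels(docs):
    """Step 2 (indexing): distribute every hit into one of the barrels."""
    barrels = [[] for _ in range(NUM_BARRELS)]
    for doc_id, text in docs.items():
        for hit in parse(doc_id, text):
            barrels[barrel_for(hit[0])].append(hit)
    return barrels


def sort_barrels(barrels):
    """Step 3 (sorting): order hits by word to get word -> [(doc_id, position)]."""
    inverted = defaultdict(list)
    for barrel in barrels:
        for word, doc_id, pos in sorted(barrel):
            inverted[word].append((doc_id, pos))
    return dict(inverted)


# Two made-up documents.
docs = {1: "Google ranks PDF files", 2: "PDF files and HTML pages"}
inverted_index = sort_barrels(index_into_barrels(docs))
print(inverted_index["pdf"])  # [(1, 2), (2, 0)]
```

After sorting, looking up a word returns every document and position where it occurs, which is what makes answering a query fast.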

Then the ranking system comes into play. Google maintains much more information than typical search engines such as Yahoo & Bing. Every hit list contains position, font and capitalization information. Google’s ranking system is designed so that no particular factor is given too much weight. For a single-word query, Google looks at the hit list for that keyword. The hits are classified into different types (title, anchor, URL, plain text large font, plain text small font, ...), each with its own type-weight.
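
To make the idea of type-weights concrete, here is a minimal sketch of scoring a single-word query. The weight values, the cap on repeated hits and the sample hit lists are all assumptions for illustration; Google's real numbers are not public.

```python
from collections import Counter

# Hypothetical type-weights; the real values are not public.
TYPE_WEIGHTS = {
    "title": 5.0,
    "anchor": 4.0,
    "url": 3.0,
    "plain_large": 2.0,
    "plain_small": 1.0,
}
# Assumed cap: beyond this many hits of one type, extra hits add nothing,
# so that no single factor can dominate the score.
MAX_COUNT = 3


def score(hit_list):
    """Score one document for a single-word query from its hit list."""
    counts = Counter(hit["type"] for hit in hit_list)
    return sum(TYPE_WEIGHTS[t] * min(c, MAX_COUNT) for t, c in counts.items())


# Each hit records its type, position and capitalization.
page_a = [
    {"type": "title", "position": 0, "capitalized": True},
    {"type": "plain_small", "position": 12, "capitalized": False},
]
# A page that simply repeats the keyword ten times in small plain text.
page_b = [
    {"type": "plain_small", "position": p, "capitalized": False} for p in range(10)
]

print(score(page_a))  # 6.0
print(score(page_b))  # 3.0
```

In this sketch the page with one title hit outscores the page that repeats the keyword ten times in small text, which matches the point that no single factor is given too much weight.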

Google succeeds in giving the most relevant results for different search queries because of such control measures and filtering processes. So, according to Google search engine news, keep your website simple and original to rank better in its search results.


May 24, 2011

Google Buys Sparkbuy

Google has recently acquired the shopping website Sparkbuy, a product comparison and shopping search engine. Although there has been no official announcement from Google, Sparkbuy has posted news of the acquisition on its website and has also tweeted about it.


It is said that Google will use Sparkbuy's expertise and technology to make Google shopping results more efficient and user friendly. The Sparkbuy team will be joining Google’s Kirkland, WA office.
Sparkbuy was founded by Dan Shapiro in 2010, and the company successfully raised about 1 million USD for its development and comparison software. No search engine news source has yet disclosed the amount of the deal.

The Sparkbuy service is now closed as a result of the acquisition.