Archive for the ‘Google’ Category

Adobe fights fire with fire

Tuesday, July 1st, 2008

Recently Adobe has had to deal with a massive force attacking its main domain of dominance; we can call this domain the highly interactive web, or RIA. I’m not referring to Microsoft SilverLight, which is supposed to compete with Adobe Flash on the same ground, but to the brutal MS marketing machine. This machine can make every boy and girl blindly recite fallacious facts and numbly say things like “Yeah, but SilverLight is search engine optimized”.

It took Adobe some time to understand what it was dealing with, and I think I’ve noticed a change in its PR lately: it has become more brutal, generating big PR out of small things.

This last SEO announcement from Adobe, which claims that Flash will be more searchable by search engines, might have some substance to it, as opposed to the similar one from Microsoft, but it’s still mainly a marketing battle. I just hope it doesn’t take too many resources away from the real development of the products.

Google was probably working on its own humanoid crawler, one with a broader vision than just the Flash Player, which can work with any RIA application even if it’s written in AJAX or SilverLight. Apparently searching and indexing RIA is not an easy thing to achieve, and it doesn’t seem that even Google has managed to do it yet.

The main problem in indexing Flash websites, or any other RIA website, is understanding the context of the data and then linking to it directly, aka deep linking. The fact that Google can now read the text inside Flash even better than it did before doesn’t yet solve that problem.

Even so, it doesn’t mean that we shouldn’t be optimistic, and there is a possibility that this will improve the indexing of Flash content. We’ll have to wait and see.

My blog has been hacked

Monday, June 16th, 2008

The first part of a hacker’s job is to gather some information about her target: the server, the technology and the software that runs on it. With WordPress, all that is needed is viewing the HTML source to see the “<meta>” tag that describes which version of WordPress is currently running, and hence how vulnerable it is. Attackers scan/google for this automatically, along with other parameters, to see which blogs they are likely to want to hack.
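To get a feel for how little effort this takes, here is a minimal Python sketch that pulls the generator tag out of a page’s HTML. The sample page and its version string are made up for illustration, and real scanners surely look at more than this one tag.

```python
from html.parser import HTMLParser


class GeneratorTagFinder(HTMLParser):
    """Collects the content of every <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "generator":
            self.generators.append(attributes.get("content") or "")


def find_generators(html_source):
    """Return the generator strings advertised in the given HTML source."""
    finder = GeneratorTagFinder()
    finder.feed(html_source)
    return finder.generators


# Hypothetical page source; the version number is only an example.
sample_page = (
    '<html><head><title>My blog</title>'
    '<meta name="generator" content="WordPress 2.3" />'
    '</head><body>Hello</body></html>'
)
print(find_generators(sample_page))
```

Run it against your own blog’s source to see exactly what you are advertising to anyone who bothers to look.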

I have always seen the updates in the WordPress dashboard and always stupidly ignored them, thinking, who would want to hack my blog?! I should have known that a PR of 7 is very appealing to spammers. But even if you don’t have any PR or have very low traffic, it doesn’t mean that you’re safe from being hacked; it’s been reported that very new and unpopular blogs have been hacked as well.

The attackers managed to use an old exploit in my blog, a very old one, and polluted my blog with thousands of spam pages, all hidden in some obscure folders. One of the first things I noticed was some strange traffic coming into my blog, mostly from unrelated blogs which showed no indication of linking to me. Only when looking inside their HTML source did I see the hidden links to me. I realized that I was part of a zombie network of hacked blogs and splogs, all for the sake of generating spam money. I informed some websites that they were probably hacked as well, and I still find new websites that have hidden links to my blog and are probably being controlled as part of this spammer network. This is an indication that the attackers’ work is far from perfect and probably not fully automatic, as they still don’t know I’m out of it and still link to me.

Servers these days have become (relatively) very secure. Securing them has become mostly plug and play: you plug in your firewall, you plug in your security software suite and you’re almost done. (I don’t wanna disregard any IT people and their hard work, but you get the point.) Attack vectors needed to change into exploiting the developer’s code and the end user, as these are the most error-prone areas these days. As such, it became the developer’s responsibility not only to write code that compiles but also to write secure code. As for the users, not much should be expected of them and they should still be allowed to be very dumb. It’s not clear yet whether developers can be expected to always produce safe code; WordPress is created by highly talented developers, and still all of its security flaws were due to insecure coding. I’ve heard this being compared with an old development problem, producing optimized code, a problem that was never completely solved. Currently developers don’t have sufficient tools and resources to overcome these problems. One can only hope that, in the same way that viruses have lost their strength over the years, the same will happen to these kinds of attacks. We can only wonder what the next generation of attackers will be; maybe the end users will become the only reasonable target.

The first lesson here is to always upgrade your blog. Although this can be a tiring process, with updates coming all the time, it must be done. The WordPress update process itself is very easy and fast, and I really encourage you to do it the minute a new version is available. You might want to be assisted by this auto upgrade plug-in.

What is described here is mostly about the WordPress blog platform, but it is far from being the only massively used and attacked open-source web application.

Finally, I would like to try and coin a new phrase. In the same way that we were introduced to the developer who can also be a designer, named the Devigner, I think it’s time to introduce the Safeloper. The Safeloper is a developer who has the tools and knowledge to produce secure programs. ;)

I guess we should always expect to be hacked and always back up.

How to find out if you’ve been hacked:

In old-school Internet hacking, the attacker’s main goal was to make a name for herself, so she wanted the attack to be known and published. In this new kind of hacking, the attackers’ main goal is to make money through spam, and as such the last thing they want is for the owner of the hacked website to have any clue that she’s been compromised. You might get a weird increase or decrease in traffic and your Google PR might drop a bit, but you won’t see anything completely different unless you look for it.

Simple as that: view source and search for spam words like cars, mortgage, pharmaceutical, etc.

Look at the traffic to your blog - If you see some strangely unrelated blogs linking to you, there is a good chance you’ve been hacked and used as a splog. Go to the suspect blog and view its source for hidden spam links to you.

Look at the Google search traffic to your blog - The latest exploit, also known as the anyresult.net hack, is a way to steal the Google results of your blog. Clear all cookies and search for yourself in Google; if a link to your blog redirects to another website, then you’ve been hacked. Clear your cookies again and do this a few times to be sure.

Make Sure Your WordPress is Not Hacked - some more info.
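If viewing the source by hand gets tedious, the spam word check can be roughly automated. This is only a sketch: the word list and the sample snippet are my own assumptions, and a clean result proves nothing, since spammers use plenty of other words.

```python
import re

# Illustrative word list; extend it with whatever spam topics you run into.
SPAM_WORDS = ["cars", "pharmaceutical"]


def count_spam_words(html_source, words=SPAM_WORDS):
    """Return {word: occurrences} for each word found as a whole word, ignoring case."""
    hits = {}
    for word in words:
        pattern = r"\b" + re.escape(word) + r"\b"
        found = len(re.findall(pattern, html_source, flags=re.IGNORECASE))
        if found:
            hits[word] = found
    return hits


# Made-up snippet of the kind of hidden block a spammer might inject.
sample_source = (
    '<div style="display:none"><a href="http://example.com/">cheap cars</a> '
    'Cars and pharmaceutical deals</div>'
)
print(count_spam_words(sample_source))
```

Feed it the saved HTML of a few of your pages; any hit for a word you never wrote about is worth a closer look.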

What to do if you’ve been hacked

I would suggest backing up everything from your blog, including all the file folders and the database, and then doing a fresh install of the latest WordPress (currently 2.5.1). To back up the folders use an FTP client; the DB backup is generally done from the website’s control panel or from the WP admin. Only after the fresh install, start adding all the customized stuff like themes and plug-ins, checking each and every one as you add it; you should even check the images. When it comes to the plug-ins, you’re better off re-downloading them.

Change your blog password and the passwords of all the blog’s registered users, and make sure all the users are valid and not created by some hacker. It’s better not to use WP for user registration, as this was the source of a lot of the previous exploits.

How to prevent your blog from future hacks

Always install updates - It’s fast and easy

Remove the Generator Meta tag - WP shows its version number inside the HTML. If the tag is there, it helps the hacker know how vulnerable you are.

Put empty index.html files inside the WP plugins folder and any other folder that doesn’t have an index file. It won’t stop anyone, but it will give the attacker a harder time understanding the structure of your blog and which plug-ins you have installed.

Monitor your files for changes or use some kind of script firewalls

Install only trusted plug-ins
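For the file monitoring tip, the basic idea can be sketched in a few lines of Python: take a snapshot of file hashes, take another one later, and compare the two. The function names here are my own, and this is not how any of the tools listed below actually work; it only shows the principle.

```python
import hashlib
import os


def snapshot(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for folder, _subfolders, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(folder, filename)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(path, root)] = digest
    return digests


def compare(old, new):
    """Return sorted lists of added, removed and modified paths."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, modified
```

You would call snapshot() on a local copy of your blog folder right after a clean install, save the result, and compare it with a fresh snapshot every now and then. Files that show up as added or modified without you touching them deserve a look.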

More Resources:

Did your WordPress site get hacked? - More info about the structure of the WordPress attacks and how to prevent them, written by one of the WordPress people.

Patching the WordPress AnyResults.Net Hack - Describes how to fix the latest WordPress exploit, which is found in WP 2.5 or earlier. It was fixed in WP 2.5.1, but updated blogs aren’t automatically fixed if they were already exploited. This exploit redirects search engine results for your website to anyresult.net. More, more and more.

File change notifications for your WordPress blog on Linux - A good explanation of how to monitor file changes on your blog. This way you’ll know when a hacker has managed to change or add files. The problem is that it’s recommended not to monitor the cache folder, because it’s constantly being written to by WordPress. Hackers are also aware that this folder is difficult to monitor, and it’s where they put their malicious files.

Firewallscript Wordpress Firewall - A commercial ($85) firewall that runs at the PHP script level without needing to be installed on the server itself, and hence good for shared hosting. It’ll monitor files for changes and more.

Munin A PHP application firewall - The same as above just free and open-source.

Wordpress exploit: we been hit by hidden spam link injection - More information on how to deal with hidden spam link injection

Won’t publish posts anymore - A less common hack that prevents you from publishing to your own blog.

How to Protect Your WordPress Site

9 easy ways to secure your WordPress blog

10 Ways to Secure your Wordpress Install

Almost Perfect htaccess File for WordPress Blogs

When Patches are the Problem - Apparently automatic security updates aren’t a perfect solution either.

Security through visibility: The secrets of open source security - WordPress is open source; does that really make it less secure?

RIA on the mobile phones and small devices

Monday, March 31st, 2008

Flash, SilverLight, Android, JavaFX, QT and the iPhone. It seems that everyone wants to redefine our mobile phone, the ultimate device/gadget of all time. I’ve written a summary of the latest advancements in the area of rich mobile applications.

Read it here.

OSE instead of SEO

Saturday, March 15th, 2008

Google’s promise to have a human-like understanding of the Internet it crawls has yet to become reality. My point is that we should start expecting Optimized Search Engines (OSE) instead of painfully optimizing our content for them (SEO). Currently search engines can’t understand RIA (Rich Internet Applications), that is, websites written in Ajax, Flash and SilverLight, and the authors of these websites need to invest a lot of resources to make them search engine friendly. As RIA becomes a bigger and more significant part of the Internet every day, what use is a search engine that can’t understand it? It’s the age of obscurity all over again, the age before Google.

This clip (02:22) reminded me of the old promise that Google would see and understand the web the same way we humans do, a promise which wasn’t really fulfilled. I know there is a big technological challenge in that (hey, even Google can’t do it yet), but the one that does it best might be the next Google.

The search engine game might be open again, for the first time since the late 90s.

The greatest SilverLight lie

Tuesday, March 4th, 2008

I’ve been to a few SilverLight events and read about it on the web, I’ve even played with it a little, and I think it’s very interesting. But one thing I’ve learned from all of these experiences, besides the fact that the average dot.net developer feels awed when he sees how to create a rectangle with a gradient fill, is that Microsoft is pumping the fallacious fact that SilverLight is SEO (Search Engine Optimized) because it uses external XAML files, which are basically plain XML files. At the last event I was at, the presenter repeated this "fact" with such determination that it made me jump out of my seat with rage. Well, not really rage :), I just explained to him nicely why it’s not true. He was more modest with his assertions afterwards.

Even though it’s a known fact that SilverLight isn’t just SEO out of the box, I still see this being repeated all over the web. You should question authority, and shouldn’t believe everything you’re being told, even if it’s Microsoft.

Currently, search engines don’t even bother looking at XAML files, and IMHO they won’t start parsing them any time soon. In the same way that Google doesn’t parse dynamically loaded XML files, since it can’t do much with them, you can’t get much out of a parsed XAML file unless you’re looking for a Rectangle that is positioned at x=0.1232 and y=33.4355.

My blog jumped to a pagerank of 7

Monday, November 5th, 2007

You’ve probably heard about Google’s last house cleaning. In short, Google changed the way they give pagerank to websites; apparently it is related to advertisements on the website.
I thought my blog had stayed at the comfortable pagerank of 6, but after checking and rechecking today it seems that my blog has a PR of 7 again …WooHoo.

I say “again” because my blog had this rank before. A short time after I started it, it jumped straight to 7. I believe the steep jump was due to my blog being linked from the MXNA, which has a PR of 10, and fullasagoog.com, which back then had a PR of 9. fullasagoog.com hasn’t survived the last Google change and, sadly, dropped to a PR of 5.
After a few months without new posts (and also because it didn’t deserve it, but don’t tell anyone ;) ), my blog dropped to a still impressive PR of 6. This was a lesson not to neglect your blog.

I know, I know, Google’s pagerank isn’t everything in life, but it’s sure nice to have, at least until the next change.

Can you make me first in google ???

Thursday, March 16th, 2006

Every now and then a person comes to me and nonchalantly asks “How can I make my website first in google?”. And she expects a simple answer too, something like: change your website title to “best site in the world” and the background color to bluish grey. Well, not really; as you might know, Google optimization is a complete science.

I won’t go into details about page ranking in general, at least not in this post, but rather about Flash-related ‘Search Engine Optimization’, buzzily known as SEO. There are a few different methods to improve the indexing of your Flash by se (search engines) and I’ll try to cover them all. First you need to understand that your Flash .swf will be looked at, by the se, as an independent page. This means that if your Flash ranks high for some phrase, then your .swf file will be linked to directly. This is a problem because most swf files aren’t designed to be used as a full individual page but as part of an html page. It is possible to design your Flash to look the same as a standalone page and as part of an html page, with some limitations, but this is also beyond the scope of this post. If you want se to link directly to the html page holding the Flash, then also use the “alternative content” method described below.

 

Flash 8 supports Meta Tags - The simple way

It was a different century when html meta tags were considered important for seo. Search engines these days don’t really care about your MetaData (the info inside the meta tags). In the old days you could write some description of your site, for ex: “best site in the world”, and some power words or short descriptions about it too, for ex: “best”, “bluish”, “Flash”, “actionscript”, “best site”, etc. And that was the main basis on which your site would have been categorized and ranked. Clearly a faulty technique. But it is possible to assume that Flash MetaData will be more influential, since it’s hard to index Flash anyway. Anyhow, it’s still unclear how search engines look at these Flash meta tags, if they care about them at all. IMHO they do care now and they will care more in the future. Since it’s so easy to insert MetaData into your Flash, there is no excuse for not doing it. Click on Modify -> Document and enter your title and description. That’s it! Now when you publish your Flash it’ll have MetaData.


 

Using Alternative Content

Alternative content means that your html page, the one with the Flash content in it, will also have some html content. Se love html, it’s their mother tongue, so the site gets indexed and linked directly to the html and not to the swf. But when users come to the site, as opposed to the se, they will see the Flash and not the html. There are a few ways you can use alternative content; here I’ll specify two, one I’ll recommend and the other I’ll recommend against.

The easiest but not the best way is to put some hidden html inside your page (for ex: in a non-visible div). The user won’t see the html, but it’s there for the se to index. The reason I recommend against it is that se don’t like this way; they feel cheated. This method is widely used by spammers and cheaters and you’ll be at risk of being counted as one, so your site might get penalized.

The best way to do this is to use javascript. As of now, se don’t run the javascript they find in a page. This enables us to swap the html content for the Flash one whenever a user enters the page. It also enables us to check for Flash availability in the user’s browser. In this way we basically create two different pages on the same page:

  1. Html page - for Search engines and users without the right flash player.
  2. Flash page - for users that have flash.

This guy has written an in-depth explanation of this subject and I suggest you read it.

He is also the author of this fine tool called flashObject that will help you to achieve this easily.

Alternative content - Reversed

Imagine the se spider as he crawls the web and suddenly bumps into your .SWF file. He then extends his feelers and tries to probe your Flash file. A few months (maybe years) ago the Flash file would have looked like nothing to him, a big chunk of nothing. These days the spider will probably be able to extract some information from your Flash .SWF file (check the next topic for further info).

Since se love html so much, why not give them html inside the Flash? Put a TextField outside the visible stage with some html text, for ex:

<html><head><title>Best Site</title></head><body>some html…</body></html>

The search engine will be able to see the html text although it isn’t on the visible stage, as long as it has an initial value assigned at author time.

 

Macromedia’s Search Engine SDK - See what you look like

Some time ago Macromedia released their ‘Search Engine SDK’. This tool gives search engines the ability to parse and index content (links and text) from inside a swf (Flash) file. Google, for their part, said they had already developed their own swf parser and that they got better results than with the Macromedia SDK. I greatly hope so, coz Macromedia’s one has some issues. The main problem with it is that it’ll only extract text from a TextField with an initial value assigned, which means that search engines won’t know about your dynamically loaded content. But still, it is good to know that your Flash can be indexed in some way. You can download the ‘Search Engine SDK’ from Macromedia and check your own swf, so you’ll have a clue how it is seen by the search engines. This is an excerpt from the FAQ: “The SDK includes an application named ‘swf2html’. Swf2html extracts text and links from a Macromedia Flash .SWF file, and returns the data to an HTML document.”

To download and use this tool go here and enter your email address (you need to be registered; registration is simple and free for everyone). After you sign in with your email, Macromedia will send you the URL for downloading the tool. Extract the zip and look for swf2html.exe in \flash_search_sdk\windows. Now put any .swf you wanna check in the same directory as swf2html.exe and run this line from a CMD prompt:

swf2html myMovie.swf -o myMovie.html

This will generate an HTML file named ‘myMovie.html’ with the extracted links and text from the flash movie ‘myMovie.swf’.

If the last step looks obscure to you, here is a more detailed step-by-step process:

  1. Download ‘flash_search_sdk.zip’ from Macromedia and extract it.
  2. Create a new directory on your hard drive, for example: “C:\flash_check”
  3. Copy the file ‘swf2html.exe’ from ‘extracted-location\flash_search_sdk\windows’ to the new directory (in this example: “C:\flash_check”).
  4. Copy any .SWF file you want to check to the same directory (for ex: myMovie.swf).
  5. Click on “START” -> “Run” and in the Run prompt type “CMD” without the quotes.
  6. When the CMD window opens, its default location is probably something like (on winXP): “C:\Documents and Settings\user>”
  7. Just type this line without the quotes: “CD C:\flash_check” and press ENTER.
  8. And this line also: swf2html myMovie.swf -o myMovie.html and ENTER again (duh)
  9. Look for myMovie.html in the same directory.
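If you have a whole folder of .swf files to check, typing that line for each one gets old quickly. Here is a small Python sketch that repeats the same command for every .swf in a directory. It only uses the swf2html form shown above (input file, then -o and the output file); the helper names are mine, and it assumes swf2html.exe can be found from wherever you run the script.

```python
import subprocess
from pathlib import Path


def swf2html_command(swf_path):
    """Build the swf2html command line for one .swf, naming the report after it."""
    swf_path = Path(swf_path)
    return ["swf2html", str(swf_path), "-o", str(swf_path.with_suffix(".html"))]


def check_all(directory):
    """Run swf2html on every .swf file found directly inside the directory."""
    for swf_path in sorted(Path(directory).glob("*.swf")):
        # Assumes the swf2html executable is reachable (current directory or PATH).
        subprocess.run(swf2html_command(swf_path), check=True)
```

Calling check_all on the directory from the steps above should leave one .html report next to each .swf, the same as running the command by hand for each file.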

Download swf2html_exaples

Easier way to see what you look like…

This site is probably implementing the Macromedia SDK or something similar. It gives an easy way to check how your online Flash .swf files are perceived by se. Paste in the path to your Flash and get a look at yourself.

 

Summary

Search engines work hard to get an objective, real view of the websites they crawl, as similar as they can to the way the user will see and use them. Since Flash is a big part of the internet, se can’t allow themselves to ignore it. Se (mainly Google for now) are already indexing Flash content and their efficiency will improve in the future. Prepare your Flash and html with the methods described, as there is no excuse anymore.

 

200,000 Flash games in one link

Thursday, March 16th, 2006

Google can now search and parse what’s inside Flash swf files, which means that you can now search Flash content directly. For example, search for this phrase in Google to get approximately 200,000 links to games:

“game” filetype:swf

or follow this link.
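If you want to build such a search link yourself, for other words or file types, a tiny Python sketch will do. It assumes the usual Google search address with a q parameter; the function name is my own.

```python
from urllib.parse import urlencode


def google_search_url(query):
    """Return a Google search URL for the given query, properly escaped."""
    return "https://www.google.com/search?" + urlencode({"q": query})


print(google_search_url('"game" filetype:swf'))
```

Swap the word in quotes for anything else to hunt for other kinds of Flash content.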
