Software-as-a-service, or SaaS, is becoming increasingly important as a low-cost way of incorporating third-party, best-of-breed software components (or even components you develop yourself that you want to reuse) into larger software applications. This trend has really taken off with "recent" (well, if a few years counts as recent) developments in client-side JavaScript and Ajax in particular.
Alongside this trend has been a similarly growing trend to pay more and more attention to search engine optimization (SEO). Businesses on the web need a way to generate traffic for their websites, and a good organic search strategy is going to be part of any serious effort to generate that traffic.
From a technical perspective it's important for application developers to understand some ways in which these two trends interact. Certain SaaS integration techniques have a negative impact on SEO. And as it turns out, the more convenient techniques are the ones that can "break" SEO.
Here are three common integration techniques and some gotchas to keep in mind.
Iframe-based integration. This technique is nice because it is so easy to implement. Just add an iframe to your application page, point it at the service you want to incorporate, and you're in business. While there are some annoyances here, such as the iframe having a fixed size (a nuisance when the target service doesn't occupy a predetermined amount of screen real estate), it's easy and it avoids cross-site issues that can arise with Ajax-based integrations.
From an SEO perspective, however, this approach is a loser if you want the content associated with the target service to contribute to your PageRank. (I'll just say PageRank, like Frisbee or Kleenex.) A good example is user-generated comments. My website features articles that I write in part because I like to write, and in part because they help generate traffic for my software consulting business. Something nice about allowing users to post comments on your articles is that the comments themselves become very useful as a means for improving PageRank by attaching the right keywords to your page, making the content more dynamic, and so forth. (Brief digression: One humorous example would be misspelled keywords. Back in the day, you had to include all of those misspellings in the article itself, or as hidden text, or as page metadata, etc. But with comments, users may sometimes misspell words in a way that really helps your PageRank on common misspellings, such as the user who put "cfx" in a comment on my Apache CXF article.)
So a few weeks ago I got the bright idea to convert my inline comment engine to an iframe-based engine that I could reuse in other applications, share with other people, etc. The following graph of my keyword traffic, while probably statistically inconclusive, at least suggests that this was a bad move from an SEO perspective:
[Graph: keyword traffic]
In my case the effect, if there really is one, isn't large. But I can imagine that with shorter pages, less keyword-rich pages, or more user comments, the effect might be more pronounced. Incidentally, this is a great reason to hook up Google Analytics and pay attention to the impact of the changes you make.
Ajax-based integration. If you can solve the cross-site issues (for example, by using a service proxy, or by deploying the service itself in the same domain as the client applications), then Ajax-based integration can be a convenient way to go. It solves the fixed-size problem associated with iframes, even though it introduces its own annoyances; for example, content suddenly appearing on your screen just as you were trying to click something else, thus taking you to the wrong page. In the case of my comment service, this annoyance wasn't an issue, since the comments appear at the bottom of the page (and so they don't displace anything).
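To make the service proxy idea a bit more concrete, here is a minimal sketch in Python. It assumes the comment service exposes a JSON endpoint at a made-up address (UPSTREAM) and that your own site can mount a small WSGI app at something like /proxy/comments; the names make_proxy_app and fetch_upstream and the article parameter are all hypothetical, not part of any real service. The browser's Ajax call goes to your own domain, and the server makes the cross-site request on its behalf.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import urlopen

# Hypothetical address of the third-party comment service.
UPSTREAM = "https://comments.example.com/api/comments"


def fetch_upstream(url):
    """Default fetcher: a plain server-to-server HTTP GET."""
    with urlopen(url, timeout=5) as response:
        return response.read()


def make_proxy_app(fetch=fetch_upstream):
    """Build a WSGI app intended to be mounted on your own domain."""

    def app(environ, start_response):
        params = parse_qs(environ.get("QUERY_STRING", ""))
        article = params.get("article", [""])[0]
        if not article:
            start_response("400 Bad Request", [("Content-Type", "text/plain")])
            return [b"missing article parameter"]
        # Forward only the parameter we expect, never a caller-supplied URL.
        body = fetch(UPSTREAM + "?" + urlencode({"article": article}))
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]

    return app
```

The fetch argument is injectable so the proxy can be exercised without a network connection. A real deployment would also want error handling for upstream failures, which is left out here to keep the sketch short.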
Unfortunately, from an SEO perspective this approach suffers from the same problem as the iframe approach. The search engines don't pull the Ajax content into the page, and thus the associated content won't help you with PageRank.
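One way to see the problem is to look at a page the way a simple crawler might. The sketch below assumes a crawler that neither runs scripts nor follows iframe sources, which matches the behavior described here but may not match every search engine. The three sample pages, the loadComments function, and the URLs are invented for illustration.

```python
from html.parser import HTMLParser


class StaticTextExtractor(HTMLParser):
    """Collects the text visible in the raw HTML, ignoring anything
    inside script, style, or iframe elements."""

    SKIPPED = {"script", "style", "iframe"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the text our assumed crawler could index for this page."""
    parser = StaticTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Comments live in another document, reachable only through the iframe.
IFRAME_PAGE = (
    "<h1>Apache CXF article</h1>"
    '<iframe src="https://comments.example.com/cxf"></iframe>'
)
# Comments arrive later, only if a browser runs the script.
AJAX_PAGE = (
    "<h1>Apache CXF article</h1>"
    '<div id="comments"></div>'
    '<script>loadComments("/proxy/comments?article=cxf");</script>'
)
# Comments are part of the HTML the server sends.
INLINE_PAGE = (
    "<h1>Apache CXF article</h1>"
    "<ul><li>Great cfx tutorial</li></ul>"
)
```

Under these assumptions, indexable_text returns only the article heading for IFRAME_PAGE and AJAX_PAGE, while for INLINE_PAGE it also includes the comment text, misspelled keyword and all.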
Integration via web services. Of the three techniques described, this is the most heavyweight from an integration point of view. You have to actually get into the application code itself instead of just adding some JavaScript to the HTML. But there are some important benefits, such as better control over the way the data or content are presented. And for our purpose, the SEO benefit is that search engines see the content as just being part of the page itself. That means that the service-generated content (like our user comments) is indexed just like the rest of the page is.
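As a rough illustration of what getting into the application code can look like, here is a small Python sketch. It assumes the comment service returns a JSON array of objects with author and text fields; that response shape and the function names render_comments and render_article_page are assumptions made for the example, not a description of any particular service. The application fetches the JSON on the server, and then merges it into the page before the page is sent out.

```python
import json
from html import escape


def render_comments(comments_json):
    """Turn the service's JSON response into an HTML fragment."""
    comments = json.loads(comments_json)
    if not comments:
        return "<p>No comments yet.</p>"
    items = "".join(
        "<li><strong>{}</strong>: {}</li>".format(
            escape(comment["author"]), escape(comment["text"])
        )
        for comment in comments
    )
    return '<ul class="comments">' + items + "</ul>"


def render_article_page(title, body_html, comments_json):
    """Build the full page with the comments inlined as ordinary HTML."""
    return (
        "<html><body>"
        "<h1>" + escape(title) + "</h1>"
        + body_html
        + render_comments(comments_json)
        + "</body></html>"
    )
```

Because the comments end up in the same HTML document as the article, a crawler needs no script execution or iframe traversal to find them. Escaping the user-supplied fields matters here, since the content is now written straight into your own markup.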
This isn't to say that integrations via web services are always the preferred way to go. That simply isn't the case. For example, if you're developing an application where you don't care about SEO (such as internally-facing business applications), then any of the options is probably feasible. Or if the service you're including isn't content-rich (like a voting widget), then again it would be perfectly sensible to look at iframes or Ajax.
And it may even be that the service is content-rich, but for whatever reason you don't want it to interfere with the primary content. Here's a good example. In the upper right-hand corner of my article pages, there is a list of recent articles that I've written. The article titles are keyword-rich, and so when users do a Google Custom Search for one of the associated keywords on my site (I currently have it disabled but I'll turn it back on), pretty much every article page shows up in the search results, which is not what I want at all. This may well be a candidate for using Ajax to actually suppress the offending content.
At any rate, now you know some of the pros and cons of the approaches described above. As always in engineering there's no one right answer; it's an issue of understanding your requirements and making the right tradeoffs. Have fun!