August 23, 1999
Designing a growing back end
Veterans of high-volume Web sites weigh in on how to plan to
expand
By Lynda Radosevich
These days, if your business strategy does not
include running a dot-com with hypergrowth traffic rates, you are
considered passé. But increased traffic is destined to make Web sites
operate more slowly and crash -- unless IT professionals build the
behind-the-scenes servers, software, and networks to handle heavy-duty
use before the crowds come.
In a recent panel discussion, four technology officers at companies
with dynamic, high-volume Web sites discussed what they have learned
about scaling back-end systems. Back-end systems comprise the core
server components -- including applications, databases, and supporting
technologies, such as load-balancing software -- rather than static HTML
Web servers.
The panel included Marc Hansen, chief technology officer (CTO) at
iEngineer.com, a San Francisco-based engineering portal, and former vice
president of systems architecture at J. Crew; Rick Lamb, chief operating
officer at Bottle Rocket, a New York-based online game site; Jon Tulk,
CTO at HealthAxis.com, an online health insurance company in Norristown,
Penn., and former CTO at music site N2K; and Dan Woods, CTO at
TheStreet.com, a New York-based financial news site. InfoWorld
Deputy Features Editor Lynda Radosevich moderated the forum.
The panelists described the lessons and best practices they picked up
while preparing their companies' Web sites for high-volume traffic.
Expect Web site traffic demands and bottlenecks to be moving
targets
Jon Tulk: "The traffic goes up, and you see that you're
starting to get a little slow in one area. You figure out where [the
bottleneck is] coming from, whether it's a network issue, whether it's
CPU capacity, [or whether it's related to the] memory disc, database, or
whatever. You go in and resolve that problem; then the next day, the
system's getting slow again. You're almost never chasing the same
problem twice."
Rick Lamb: Referring to a former job at an online game
company: "We had no idea what the spikes and surges would be against our
system."
Focus on
the back-end database and overall architecture
Marc Hansen: "Performance isn't any one thing; it's a
good fundamental architecture, and then it's hundreds of things. We
fine-tuned our cache on our database so well -- using a main cache and
gigabytes of memory -- [that] our disk activity was extraordinarily low,
except for the disk log. Eventually our bottleneck became the rotation
to the hard disk. As time goes on, the bottlenecks will change ... but
if you keep all the historical data, you can go back to it and predict
what's going to happen."
Dan Woods: "Segment your architecture so you have lots
of smaller machines ... handling one application."
Marc Hansen: "Think of [structuring your architecture]
like constructing a building. We could order a big pile of lumber and
start nailing it together, or we can create a clear set of plans before
we start. When it comes to creating the plans, there are home builders
and there are architects who work on large-scale buildings. There is a
continuum of skills available. Matching the skills to the size of the
project is important. Choosing the correct initial architecture is also
important. If you start with a wood-frame house, it will be difficult to
grow into a 40-story office building."
Select a high-quality application server, and buy scalable
database servers
Jon Tulk: "[The application server] should provide
application-server load balancing, session-state management,
database-connection pooling, thread pooling, data or content caching,
and, ideally, some monitoring, measurement, and management facilities.
We have selected Bluestone Sapphire/Web, which is an excellent product."
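As a rough illustration of one item on that list -- database-connection
pooling -- the minimal Python sketch below keeps a small set of open
connections and hands them out to request handlers. It is not Bluestone
Sapphire/Web code; sqlite3 stands in for a real database server, and the
class and file names are invented.

    # Minimal, generic database-connection pool (illustrative only;
    # sqlite3 stands in for a production database server).
    import queue
    import sqlite3
    from contextlib import contextmanager

    class ConnectionPool:
        """Keep a fixed set of open connections and reuse them across requests."""
        def __init__(self, db_path, size=5):
            self._pool = queue.Queue(maxsize=size)
            for _ in range(size):
                # check_same_thread=False lets pooled connections be shared by threads
                self._pool.put(sqlite3.connect(db_path, check_same_thread=False))

        @contextmanager
        def connection(self):
            conn = self._pool.get()      # block until a connection is free
            try:
                yield conn
            finally:
                self._pool.put(conn)     # return it to the pool for reuse

    # Usage: many request-handling threads share a handful of connections
    # instead of opening a new one per page view.
    pool = ConnectionPool("site.db", size=5)
    with pool.connection() as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS hits (page TEXT)")
        conn.execute("INSERT INTO hits VALUES ('home')")
        conn.commit()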
Marc Hansen: "We're talking about real databases here,
not PCs with [Microsoft] SQL Server on them. Put in place all the
properties you would put in place for a mainframe. ... In fact, I would
recommend you hire yourself an IBM data center manager."
Rick Lamb: "[This] is absolutely true for
application-server and database-server layers: At the Web-server layer,
PCs running Linux can be a perfectly acceptable solution and slightly
more cost-effective."
Marc Hansen: "I agree with Rick that Linux works fine
for HTTP servers, but I would differ in that I believe that the salary
cost of maintaining skills on another OS [such as Linux] and hardware
cost more than the slight savings in hardware [that Linux can
provide]."
Build a good data model up front
Jon Tulk: "The data model we had was not designed that
well up front, basically because those of us who first built the site
didn't have database backgrounds. We didn't realize ... how hard it is
to change a data model once you're on your way. And we carried that
history with us for the next five years. It affected the performance of
the site. It affected the ease of integrating new features and adding
data fields and columns to the data tables. It affected the ability to
replicate the database, and that affected the ability to do
distribution."
Rick Lamb: "Design your database into subject areas and
entities -- don't rush the design. Review all application requirements
and loads prior to committing to a database implementation. Test the
system on paper, in diagrams. Find and productively employ experienced
data modelers."
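As a paper-stage sketch of what designing "into subject areas and
entities" can look like, the hypothetical model below groups invented
entities by business area before any tables, indexes, or replication are
committed to.

    # Hypothetical entity model reviewed "on paper"; all names are invented.
    from dataclasses import dataclass

    # Subject area: customers
    @dataclass
    class Customer:
        customer_id: int
        name: str
        email: str

    # Subject area: orders
    @dataclass
    class Order:
        order_id: int
        customer_id: int   # relationship back to the customers subject area
        placed_on: str

    @dataclass
    class OrderLine:
        order_id: int      # each line belongs to exactly one order
        sku: str
        quantity: int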
Get good database administrators, and pay them well
Jon Tulk: "With help from a couple of consulting DBAs
[database administrators], we added a lot of indexes [to the database],
and we saw a huge difference in performance. The load on our server
dropped by almost half even as our traffic nearly tripled --
this was going into last Christmas season -- and it was all due to
optimizations. It's a lot easier to fix this up front by finding
good data modelers and starting with a good design."
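The indexing work Tulk describes can be illustrated with a small,
hypothetical example; the table, column, and index names below are
invented, and sqlite3 stands in for the production database.

    # Before-and-after sketch of adding an index to speed a common lookup.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                     [(i, i % 1000, 9.99) for i in range(100000)])

    # Without an index, finding one customer's orders scans the whole table.
    print(conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall())

    # With an index, the same query becomes a direct lookup -- the kind of
    # change that can cut server load sharply as traffic grows.
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    print(conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall())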
Marc Hansen: "We were shocked to discover the market
price [for a DBA], which was about double what we budgeted -- about six
figures." However, "It turned out to be the best investment we ever made
in terms of solving scalability problems. A good DBA pays [a company
back in hardware savings]."
Dan Woods: "If you can find really smart people, I
would pay them above market rates. You should never have them thinking
about their compensation. They get 15 to 20 headhunter calls a month,
and [compensation] should be a nonissue.
"I've got a playbook on [hiring good DBAs]. First, the person must be
personally secure. They must not brag about how much they know and what
they did. They must be interested in your business and what you want to
do. Second, they must understand the role of technology -- to help
business, not to be an end in itself. Third is, are they smart? And the
last is whether they have experience. I find if you run that playbook
over and over, you get an environment where you have really smart people
who attract other really smart people and nobody is a bully or
fearmonger."
Cache data and pieces of static content that will be used to
construct a page
Dan Woods: "People get drunk on dynamic content with
unproven business value, such as push and personalization. ... Keep
everything, if possible, on static pages; avoid dynamic content as much
as possible. The way you handle that for a large-scale Web site is, you
have a high-speed templating agent that is working against application
components that are cached."
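A generic sketch of that pattern follows: a templating step assembles
each page from cached component output rather than recomputing everything
per request. The function names and cache policy are illustrative, not
drawn from any product mentioned here.

    # Fragment caching: expensive components are rendered once and reused.
    import time

    _cache = {}   # component name -> (rendered HTML, timestamp)
    TTL = 60      # seconds a cached fragment stays fresh

    def render_component(name):
        """Stand-in for an expensive call into an application component."""
        time.sleep(0.1)   # pretend this hits a database or application server
        return f"<div class='{name}'>...{name} content...</div>"

    def cached_component(name):
        html, stamp = _cache.get(name, (None, 0))
        if html is None or time.time() - stamp > TTL:
            html = render_component(name)        # recompute only when stale
            _cache[name] = (html, time.time())
        return html

    def render_page():
        # The "template" here is simple concatenation; the point is that the
        # expensive pieces come from the cache on almost every request.
        return "\n".join([cached_component("header"),
                          cached_component("story_list"),
                          cached_component("footer")])

    print(render_page())   # first call is slow; later calls hit the cache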
Marc Hansen: "Dan's comments about static content are
true, but there are some kinds of Web businesses that don't lend
themselves to static content, like e-commerce. For example, at J. Crew,
when a customer wanted to buy something, we actually verified inventory
in real time in our inventory management system, which ran on CICS on [an
IBM] System/390."
Dan Woods: "I would much rather do something like take
the screen of updates from your real-time inventory and update all the
static pages in the background, then run the site using Thunderstone
[dynamic publishing software], so you're delivering flat pages all the
time. Then you've got the maximum scalable system."
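A minimal sketch of that background-publishing approach follows, under
the assumption of a simple inventory snapshot and flat-file layout (both
invented here; Thunderstone's actual tools are not shown).

    # Regenerate one flat HTML page per product from an inventory snapshot,
    # so Web servers only ever serve static files.
    from pathlib import Path

    def publish_static_pages(inventory, out_dir="static"):
        Path(out_dir).mkdir(exist_ok=True)
        for sku, in_stock in inventory.items():
            status = "In stock" if in_stock else "Sold out"
            page = f"<html><body><h1>{sku}</h1><p>{status}</p></body></html>"
            Path(out_dir, f"{sku}.html").write_text(page)

    # A scheduled job would call this with the latest inventory snapshot;
    # the live site never queries inventory per request.
    publish_static_pages({"rollneck-sweater": True, "barn-jacket": False})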
Do capacity planning for traffic spikes and ongoing
growth
Jon Tulk: "Traffic planning is an absolute must, and
it's hard to do when you start, because you don't have enough data to
predict off of. After building some data, you can use a spreadsheet to
create a simple traffic prediction model based on the historical data.
Get more sophisticated later as the need demands and time permits. Also,
it is good to choose some performance metrics that you will measure over
time. One useful metric is the number of Web pages [served] per minute,
per CPU. This can be used to predict hardware requirements against the
traffic model or to monitor system performance changes over time."
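A spreadsheet-style version of that calculation might look like the
sketch below; the traffic history, growth projection, and per-CPU
throughput are made-up inputs, not figures from the panel.

    # Naive traffic projection plus the pages-per-minute-per-CPU metric.
    import math

    monthly_page_views = [9.0e6, 10.5e6, 12.5e6, 14.8e6]      # historical data
    growth = monthly_page_views[-1] / monthly_page_views[-2]   # month-over-month ratio
    next_month = monthly_page_views[-1] * growth               # simple projection

    # Assume the peak minute carries 3x the average minute's traffic.
    peak_pages_per_minute = next_month / (30 * 24 * 60) * 3

    pages_per_minute_per_cpu = 400    # measured from the running system
    cpus_needed = math.ceil(peak_pages_per_minute / pages_per_minute_per_cpu)
    print(f"Projected peak: {peak_pages_per_minute:.0f} pages/min -> {cpus_needed} CPUs")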
Dan Woods: "We figure that [building capacity for] six
to eight times the average load is what we need to not have to turn
anybody away. ... Our big days are around a million page views. We think
that on the biggest day, we'll probably get 2.5 million to 3 million
[page views]."
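Worked as rough arithmetic, using the six-to-eight-times multiplier from
Woods's rule and his 3 million page-view upper estimate for the biggest
day; the average-day figure below is an assumption, since his quote does
not state the average directly.

    # Headroom rule of thumb: provision for 6x-8x the average load.
    average_day_page_views = 500_000    # assumed average; not stated in the article
    expected_peak = 3_000_000           # Woods's upper estimate for the biggest day

    for factor in (6, 8):
        capacity = factor * average_day_page_views
        print(f"{factor}x average -> {capacity:,} page views of capacity, "
              f"{capacity / expected_peak:.1f}x the expected biggest day")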
Marc Hansen: "Measure everything that's going on, and
keep the data. Because your business people will decide to run a
promotion, and they'll say, 'Remember that promotion we did last week?
We're going to run one like it but with double the traffic.' Now you can
go back and look at all your instrumented data and see what's going to
happen."
Use outsourcing to host sites more reliably and less expensively
than you can in-house, but develop and prototype new applications
in-house, where you can make changes more rapidly
Rick Lamb: "You simply can't take upon yourself the
complete responsibility of putting together a bulletproof design,
reacting to what are usually pretty aggressive schedule demands. We use
consultants on the design side. We use consultants on the program
management side. ... There is tremendous value to capturing and
preserving the intelligence from consultants. ... It's often difficult
to enable a high degree of cooperation between outsource consultants and
in-house staff, but you've got to overcome that with human management."
Dan Woods: "It is very difficult to duplicate
internally what you can attain from the first-tier colocation providers.
I mean, these organizations have separate power feeds from different
grids, multiple UPSes [uninterruptible power supplies], multiple
generators, and full-time staff to maintain the power and environment
infrastructure. It would never be cost-effective for us to do that on an
individual basis."
Jon Tulk: "At N2K, we sourced our own machines ...
because we already had a computer facility. And we preferred having the
system right there in the next room. Likewise, at HealthAxis, we're
bringing the site home, because for the next few months, while we're
really building the thing and changing hardware, architecture, and
software, it's going to be advantageous to have really tight access to
[the system]."
If your company is going public, establish the focus of the
business and the stability of the Web site before the initial public
offering
Dan Woods: "Once you become public, you have a whole
new set of people talking to your senior management. The Wall Street
analysts are going to ask, 'What are you guys doing about disaster
recovery?' ... A lot of hard questions are being asked about
environments."
Deputy Features Editor Lynda Radosevich thinks proper planning can
keep growing back ends from adding too much fat to the IT
department.