Our eCommerce server infrastructure

Published on 02 June 2009 by Chris in Blog, Development


In order for our eCommerce site to run smoothly and have good load times, we need to start with a good infrastructure to work with.

We will be using our own hosting solution, which will be hosted in a data centre in Newcastle upon Tyne, UK. The hosting infrastructure comprises the following:

  • Mail Server: This is a standalone mail server which is shared across all our clients who use email. It runs CentOS 5.2 Linux and uses Postfix for mail.
  • Load Balancer: This is a load balancing server which also acts as one of our name servers. It will be used to distribute the traffic visiting our site between our two web servers. It runs CentOS 5.2 Linux and uses BIND for the name server and Pound for load balancing.
  • Web Server 1: This is the master web server and runs CentOS 5.2 with a LAMP stack. It has two quad-core 2.5GHz Intel processors, 4GB of DDR2 RAM and 3×160GB hard disks in RAID 5.
  • Web Server 2: This is the slave web server and runs CentOS 5.2 with a LAMP stack. It has two quad-core 2.5GHz Intel processors, 4GB of DDR2 RAM and 3×160GB hard disks in RAID 5. This server replicates all files, databases etc. from web server 1 using rsync.

This server infrastructure is robust because it provides us with a complete failover system: if web server one were to fail (touch wood it doesn’t), the load balancer means web server two should be able to take over where it left off. Our site will be replicated onto web server two throughout the day, and every night we make an offsite backup to our office server via rsync, which is kept for a week.
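
The failover behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Pound is implemented or configured: the `Balancer` class, its `pick` method and the `is_healthy` callback are hypothetical names, and `www1`/`www2` simply stand in for the two web servers.

```python
class Balancer:
    """Round-robin over backends, skipping any that fail a health check."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._turn = 0  # index of the backend to try first on the next request

    def pick(self, is_healthy):
        """Return the next healthy backend, or None if every backend is down.

        is_healthy is a caller-supplied function (an assumption of this
        sketch) that takes a backend name and returns True or False.
        """
        n = len(self.backends)
        for offset in range(n):
            candidate = self.backends[(self._turn + offset) % n]
            if is_healthy(candidate):
                # Start the next request from the backend after this one.
                self._turn = (self._turn + offset + 1) % n
                return candidate
        return None
```

With both servers healthy, successive calls alternate between `www1` and `www2`. If the health check starts failing for `www1`, every call returns `www2` until `www1` recovers.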

9 Responses to “Our eCommerce server infrastructure”

  1. [...] This post was Twitted by gavinelliott – Real-url.org [...]

  2. Ryan Tomlinson says:

    Hi Chris,

    Nice post, it’s good to hear how other companies’ infrastructure is set up. Just a few questions, meant as considerations; I’m not trying to poke any holes here.

    Firstly, what about backup power and power failures? Do you have a UPS and/or separate power distribution for your redundant server, i.e. does your failover have its own power supply?

    Secondly, how can you ensure network connectivity? We are currently looking at failing over our infrastructure to a separate data center. At present though we are doing this on a smaller scale, by having our DNS point to a hosted solution provider. Far from ideal, I know, but at least if our servers go down our application doesn’t.

    Also you don’t mention databases here (afaik). How do you replicate your data between database servers, e.g. log shipping or full-blown replication? Or do you mean your databases are on your web servers?

    Again I’m just interested, especially as we have an infrastructure audit this week ;) .

    Ryan Tomlinson

    • Chris says:

      Hi Ryan,

      First of all, thanks for the great question.

      The datacenter provides power backup in our case. They have UPS backup should the power fail and a generator backup should both the UPS and main power fail. So short of a nuclear attack the power shouldn’t be a problem.

      Two of our servers (mail and load balancer) run BIND DNS in a master/slave configuration, which gives us our two name servers. Ideally we would co-locate the whole infrastructure in another datacenter, but obviously this starts to become more and more costly.
      For the databases, both www1 and www2 run MySQL servers, which act in a master/slave configuration using replication so that we have an almost current copy of all the databases at any one time. We also have a file server locally in our office, which downloads a full copy of the databases every night using rsync and keeps these for a week.
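
As a rough illustration of the "keep nightly copies for a week" policy, the pruning step could look something like the sketch below. The function name `expired_backups` and the `keep_days` parameter are hypothetical; the thread does not say how the office file server actually rotates its copies.

```python
from datetime import date, timedelta


def expired_backups(backup_dates, today, keep_days=7):
    """Return the backup dates that fall outside the retention window.

    A backup is kept if it was taken within the last keep_days days,
    counting today, so nightly backups leave keep_days copies on disk.
    """
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in backup_dates if d <= cutoff)


# Example: ten nightly backups, checked on 10 June 2009.
nightly = [date(2009, 6, day) for day in range(1, 11)]
to_delete = expired_backups(nightly, today=date(2009, 6, 10))
```

Here `to_delete` holds the copies from 1 to 3 June, leaving the seven most recent nights in place.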

      As you say, we currently store all our databases on the web servers and long term we would look to use dedicated MySQL servers to spread the load.

      I think that when it comes to Server Infrastructure, it can become a case of “how far do you go?” We have built our infrastructure in such a way that we can ensure we have a backup of all our files and databases for up to a week and we have failovers in place for the web servers so if worst comes to the worst, the other will just take over.

      There is a fine line between cost and risk. There will always be some risk involved, as a system is never 100% safe: your servers might fail and so could your failover. It could potentially be never ending, as you would be forever buying servers to use as failovers just in case the worst happens, and you’ll spend a fortune chasing the perfect system.

      As long as you take a methodical approach and ensure you have covered for the most likely scenarios, then you can start to sleep safely at night.

      • RAID in case of a hard disk failure
      • Replication to a failover in case your server fails
      • Secure source of power to prevent power loss.
      • Secure source of network connectivity to prevent loss of connectivity.
      • Backup locally to a physical medium that can be removed and stored elsewhere in case of virus/hacker/complete failure/corruption
      • Antivirus and Firewall to protect against malware and hackers.

      Then as your company progresses and you need to scale up your infrastructure then you can look to build in more enhancements to the infrastructure to better cater for your needs.

      Cheers

      Chris

  3. Hi Chris, just discovered your post via a tweet from Ryan; how is the Postfix server working out for you? I’m a Ruby on Rails guy, and many people in the Rails community start an app using Gmail for SMTP and migrate to something else if the need arises (Postfix, or a hosted solution, being one option).

    Postfix is an option for the apps I’m running (I have two apps running on SliceHost, and they have a great tutorial on how to install Postfix on their slices) but I’ve heard that running your own Postfix can be a bit tricky when it comes to inbox spam filters and keeping it all whitelisted. Have you had any difficulties in this regard, or has it all been plain sailing for you?

    Neil

    • Chris says:

      Hi Neil,

      I’ve found that on the whole Postfix seems to work well with our systems. Initially we were getting absolutely slaughtered by SPAM, but we’ve since managed to reduce it by about 95% using a combination of other software.

      We use SpamAssassin and greylisting for our mail server, but rather than integrating SpamAssassin directly with Postfix we use MailScanner, which provides a very fine degree of control over SpamAssassin. It can also provide virus scanning, dangerous content scanning and phishing/fraud detection.

      Unattended, SpamAssassin can block about 80% of SPAM and you can use Bayesian learning to improve this, but on a multi-domain mail server like ours, it can become extremely tricky very quickly. SpamAssassin is also quite a system hog.

      Greylisting works by initially rejecting mail from unknown mail servers, forcing them to wait for 5 minutes before it accepts mail for delivery. The idea is that spam mail servers generally do not attempt to resend, whereas legitimate mail servers do.

      The downside to this is that it can cause a delay of 5-30 minutes for mail from new senders to come through (once it learns a sender is legitimate, their mail comes straight through).
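
The greylisting logic described above can be sketched as follows. This is a simplified model rather than the configuration of any particular greylisting package: the `Greylist` class and its `check` method are hypothetical, and what exactly identifies a sender (the `key`) is left to the caller.

```python
class Greylist:
    """Defer mail from unknown senders until they retry after a delay."""

    def __init__(self, delay_seconds=300):  # 5 minutes, as described above
        self.delay = delay_seconds
        self.first_seen = {}  # sender key -> time of first delivery attempt
        self.known = set()    # senders that have already passed the delay

    def check(self, key, now):
        """Return "accept" or "defer" for a delivery attempt at time now.

        key identifies the sending server (an assumption of this sketch);
        now is a timestamp in seconds.
        """
        if key in self.known:
            return "accept"  # learnt senders come straight through
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay:
            self.known.add(key)
            return "accept"  # the sender retried after the waiting period
        return "defer"       # temporary rejection; a real server will retry
```

A sender that never retries stays in `first_seen` and is never accepted, which is how most of the SPAM gets dropped before the content scanner has to look at it.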

      Our SPAM has thus been reduced by about 95% and can be tweaked even further to push it to 97-98% on a good day. The main benefit of greylisting is that it hugely reduces the load that SpamAssassin puts on the server.

      Configuring MailScanner can be a major headache as there are lots of options available, but the degree of control it gives you over what to scan, how to scan it and what to do with it afterwards is excellent.

      I’ve found that there is a very thin line to tread when it comes to dealing with SPAM, as it can very quickly become an absolute nightmare. Even the setup we use has to be carefully thought out, as each of MailScanner’s features can result in false positives, so the settings need to start really low and then you can tweak them from there until you are satisfied with the results.

      Cheers

      Chris

  4. Ryan Tomlinson says:

    Neil and Chris,

    Another option that you should consider for email delivery is to use an email service provider with an open API. The reason for this over a self-hosted solution is two-fold.

    There are plenty of companies out there that are dedicated to email deliverability, and companies such as Campaign Monitor, for example, have the infrastructure in place to deliver email on your behalf. They use expensive and performant MTAs and typically have the infrastructure to match. The additional benefit to this is that you have less hardware to maintain.

    Perhaps more importantly, these companies have alliances with ISPs to ensure deliverability (note here that I am not saying guarantee delivery, this is impossible to do). If you do go down this route, ensuring that the company is Sender Score Certified with Return Path is key. Return Path are an intermediary between email delivery companies and the ISPs (Google, Microsoft, Yahoo, etc). Again, don’t confuse this with whitelisting and blacklisting, because ultimately it isn’t as black and white as this. If your data and campaigns are crap, it’s more likely delivery rates will drop. Once this occurs your IP is rated poorly and it takes a long time to improve IP reputation.

    Two popular ones are:
    http://www.mailchimp.com/api/
    http://www.campaignmonitor.com/api/

    Other options, such as the cloud-based Email Cloud from Rozmic, will deal with the SPAM side of things.

    Having worked for a large email marketing company I know it takes a lot to ensure deliverability. Where feasible I would suggest email integration rather than having the problems that result from self-hosting.

    Ryan

    • Chris says:

      Hi Ryan,

      You’re exactly right; using an email service provider is an excellent solution for ensuring a much higher rate of deliverability, especially when sending out emails to mailing lists. We actually use an external email service provider for sending out such emails, which I believe Gavin is writing a post for at the moment.

      The reason we have a self hosted email solution is to provide simple email accounts for ourselves and our customers to use every day. We have actually used Email Cloud from Rozmic in the past and it is an excellent piece of software that works really well and is perfectly suited for enterprise level companies. For smaller businesses a good alternative would be to use Google Mail via Google Apps to host your mail. It is a free service provided by Google and they will host your emails for your own domains for free. This means you get the benefit of using their email filtering system to fight the SPAM for you.

      We chose to run our own email solution because we wanted to have our emails centralised to make it easier to manage the accounts for our clients. We also wanted more control over things like reliability. If we had used Google, we would be at their mercy should a problem occur, and you can’t just call them up and find out what the problem is if something goes wrong. An example here would be that recently there were a few Gmail outages that lasted for half a day or longer. In comparison, by running our own system we can log in within seconds and find out what the problem is immediately, or alternatively go directly to the data centre, find the problem and fix it.

      Cheers

      Chris

  5. Ross Cooney says:

    Thank you everybody for your kind words about emailcloud.

    Running a successful infrastructure project demands that you focus on your objectives. In the case of this ecommerce project the goal is to provide a secure, stable and reliable user experience for the visitor of the ecommerce shop. Often this means that you need to identify areas of the infrastructure that are not core, or are not within your skill set. In this case it is good to outsource that area. In some cases that can be the firewall management or backup policy, or the spam prevention policy.

    emailcloud is simply an outsourced spam and virus protection system.

  6. Chris, thanks for all the insights on SpamAssassin, etc. I know that’ll come in useful in the not too distant future; there’s been talk of configuring a Postfix server. Preventing incoming spam, and ensuring that outgoing email isn’t considered as spam, were my two main concerns. I hadn’t even heard of ‘greylisting’ before.

    Ryan, I’ve been following Campaign Monitor and Mail Chimp, they look like very useful services. Mail Chimp’s integration with Google Analytics is something I want to try out soon (I watched all their use-case vids) – I just don’t have a newsletter to apply them to yet!

    Have you used any analytics services in your past experiences with email marketing?

    Ross, I’m with you on that – there definitely are processes within web applications that can be cut out into verticals and left to specialists to handle (leaving you free to create value around your core business proposition). EmailCloud is something I’ll be investigating as soon as the need arises.
