Message Queuing & Segregation: Lessons from the Airline Industry

Note: This also appears at http://www.messagesystems.com/wordpress/?p=38.

Many email marketers are unaware of the importance of message queuing to the successful delivery of their email. As a component of their messaging infrastructure, queuing is something that marketers typically defer to their IT department to manage. Yet, the reality is that queuing and the segregation of message streams can make the critical difference between the success and failure of a company’s messaging programs, and therefore, should be of concern to both the IT and marketing departments.

Effective queuing really comes down to the choice of messaging infrastructure. When using a technologically advanced messaging platform, companies can efficiently manage parallel queues with messages assigned into multiple streams to ensure that each stream flows at an appropriate rate, providing efficient delivery of all classes of traffic. Unfortunately, those that rely on legacy MTAs have no such options. They’re left trying to manage the bottlenecks and slowdowns that result from poor architecture with complicated priority schemes within a single queue.

Recently, one legacy MTA provider suggested that their routine for queue prioritization was the answer for reaching high-value customers first. While the business need is certainly legitimate, trying to prioritize messages within a single queue is both outmoded and a solution to a problem that should not exist in the first place. There are better ways to satisfy this need that are both simpler and more powerful at the same time. To illustrate my point, allow me to provide an analogy that should be familiar to my fellow business travelers.

One of the most common headaches for the air traveler is the security checkpoint; you get your ID checked, get in line, get your ID checked again, get in another line, empty your bags, take off your shoes and belt, get in yet another line and then get radiated in the name of public safety. During the highest traffic times these lines can become so long that people start missing their flights because there are a very limited number of security checkpoints and the airports were architected in a time that predated the need for such extensive security. This fundamental flaw in the architecture of the airports means that the current needs of travelers for additional parallel security screening checkpoints cannot be met, and everyone has to wait in a queue to get to their plane, sometimes with unacceptable results, leading to additional costs for all involved and potential lost business.

This is handled in a variety of ways, including performing the security check at every gate area (creating a very parallel security screening system) and by using priority security lines. Imagine for a moment that instead of this solution, the airport chose instead to assign priority on an individual basis to every single passenger and then tried to sort the individual travelers in the security line. As you can imagine such a solution would require additional work to make the individual assignments and then keep track of who ranks where in the line, with a risk that low-priority passengers would find themselves significantly delayed as they were repeatedly bumped. In all my travels I have never seen such an approach, primarily because the airports already have an approach that works.

Find a particularly efficient airport and what you’ll see is a fairly consistent set of practices:

  • Separation of passengers into queues based on their fitting into a certain profile.
  • A large number of parallel checkpoints.
  • Efficient handling of passengers.
  • Intelligent queue management that can modify queuing on the fly to meet circumstances.

Look at a particularly efficient airport and you will see multiple queues, including queues for:

  • Frequent / First Class travelers – The most frequent travelers and those who sit in First Class. These people bring a lot of value to the airlines and receive a lot of value in return.
  • Expert travelers – A new lane starting to appear in some airports, for those who are experienced in getting through security and unlikely to cause delays.
  • Family / Special Assistance – A slow lane, these groups will take longer to get through security.
  • Casual Travelers – A lane for those who move at an average speed through security.
  • Staff / Crew – While in some airports workers and air crew jump to the front of the line, the most efficient airports avoid this disruption by maintaining a separate checkpoint for those who work at the airport, minimizing disruption and ensuring that staff can get to work on time.
  • Specialty – From time to time I’ve seen the airport create a special temporary queue for unique groups such as chartered planes by opening a checkpoint and redirecting the group to it.

Not only has the separation of travelers into queues according to their profiles (including their priority as a group) proven sufficient to make prioritization by individual traveler unnecessary, it is much easier to manage.

While separating passengers into a number of queues can certainly benefit airports, the most efficient airports are also architected to operate a large number of parallel checkpoints, preventing a situation where every passenger in the airport needs to be funneled through the same metal detector. Imagine an airport trying to service millions of passengers a year on only one or two metal detectors and a single x-ray machine.

In addition to having many checkpoints, the best airports will also have efficient checkpoints, maximizing the flow of passengers through any given checkpoint through better design of the checkpoints and better training of the staff, all without compromising safety.

Perhaps most important to the smooth operation of an airport is intelligent management. I’ve seen airports where there were several queues open but empty because the people managing things weren’t flexible enough to reassign lines and adjust the queues to ensure well-balanced passenger flow. The best airports will change the designation of queues, move staff around and even redirect passengers to alternate checkpoints that are less busy, all in the interests of moving the highest number of passengers per hour.

Senders can follow these same principles to get maximum throughput and deliverability in their own environment:

  • Segment mail by profile.
  • Choose a sending solution that supports highly parallel sending.
  • Choose a sending solution that provides sufficient throughput.
  • Choose a sending solution that is intelligent.

When sending, remember that segmentation is not just for who to send to or what to send them, but for deciding how to send a message and with what priority. You will want to create queues for high-priority messages to satisfy your most valuable customers; queues for high-reputation traffic that delivers without issues; queues for traffic that you expect to deliver slowly (one example is traffic that results in human interactions, which you may need to slow to prevent overloading your call centers); queues for test and administrative traffic that needs to go out as soon as possible; and queues for transactional traffic that should not queue up behind bulk sends. This is a common practice among ESPs, who often add specialized segmenting for scenarios such as new customers and customers with specific SLAs.
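As a rough illustration of segmenting by profile, the sketch below routes each message to a named stream instead of ranking individual messages inside a single queue. The stream names, the message fields (`kind`, `customer_tier`, `expects_replies`) and both function names are hypothetical, chosen only to mirror the categories described above; a real sending platform will have its own way of configuring this.

```python
from collections import defaultdict, deque


def assign_stream(message: dict) -> str:
    """Pick a stream from the message's profile (hypothetical field names)."""
    kind = message.get("kind", "bulk")
    if kind in ("test", "admin"):
        return "administrative"   # should go out as soon as possible
    if kind == "transactional":
        return "transactional"    # must not wait behind bulk sends
    if message.get("expects_replies"):
        return "slow"             # deliberately paced to protect call centers
    if message.get("customer_tier") == "vip":
        return "high_priority"
    return "bulk"


def enqueue_all(messages):
    """Group messages into one FIFO queue per stream."""
    queues = defaultdict(deque)
    for message in messages:
        queues[assign_stream(message)].append(message)
    return queues


if __name__ == "__main__":
    sample = [
        {"kind": "transactional", "to": "a@example.com"},
        {"kind": "bulk", "customer_tier": "vip", "to": "b@example.com"},
        {"kind": "bulk", "expects_replies": True, "to": "c@example.com"},
        {"kind": "test", "to": "d@example.com"},
    ]
    for name, queue in enqueue_all(sample).items():
        print(name, len(queue))
```

Each stream can then be drained at its own rate, which is the email equivalent of running separate security lanes.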

As with the airports, you need to architect your environment to be able to handle more traffic in parallel. This can be accomplished by adding more injectors and more messaging infrastructure, or by adding better infrastructure. Look at your existing solution: how many IPs can it send from? How many messages per hour can you send on a single machine and how many concurrent connections can it handle? Most Open Source solutions can handle one IP address, send 100,000 messages per hour at most, and open fewer than one hundred connections. Low-end commercial solutions can often do over a hundred IPs, send close to a million messages per hour and handle a few hundred to a couple of thousand connections. On the high end you have carrier-grade systems designed for the enterprise such as Momentum by Message Systems, which can utilize thousands of IPs to send millions of messages per hour across tens of thousands of concurrent connections (while you will never open more than a few connections to a given ISP on a given IP address, lesser solutions will fall short when sending across hundreds of IPs to thousands of ISPs, defaulting to ISP prioritization as a workaround).

Finally consider the intelligence of your infrastructure:

  • Can your infrastructure send across all servers simultaneously and fail-over in the event of an outage?
  • Can your infrastructure adjust throttles on the fly based on responses from the ISPs and bounce and feedback loop data?
  • Does your infrastructure handle queues so efficiently that performance is the same with thousands of messages in the queues as it is with millions of messages in the queues?
  • Can your infrastructure dynamically change from email to other protocols such as SMS and MMS based on subscriber preferences and ISP responses?

If not, we should talk.

Technical Considerations For IP Warmup

Introduction

In response to the recent news regarding Goodmail closing its doors, Tom Sather at Return Path published a blog entry regarding IP warmup and the difference it can make for inbox placement.

Tom sums up the need for IP warmup well:

If you had talked to any email marketer 10 years ago and asked them how they dealt with blocks on their IP addresses, the answer would probably be the same: “We just switched IPs.” Not only was this an unfortunate, albeit effective, way to deal with blocks, it also became a common method used by spammers. They would simply send from one IP address for a very short time and then move on to another, either with IPs they owned or through hijacked computers controlled by botnets. Because of spammers’ behaviors, ISPs and email providers respond by temporarily blocking and limiting the amount of email a new IP address could send. ISPs now treat any new sending IP address like a dog on a short leash, and only extend the leash when the senders’ reputation is proven.

I personally have seen what can happen when senders try to send too much, too soon, with senders trying to send millions of messages on their first day using new IP addresses and finding themselves blacklisted in short order. For a reputable sender the key is to start sending slowly and gradually increase volume on new IP addresses until a proper sending reputation has been established.

A Clarification

Before I get into some technical advice I’d like to clarify one thing from the Return Path article. Regarding the shutdown of Goodmail Tom has this to say:

There are a couple of reasons you still might have to send from a new IP address, such as moving to a new ESP, moving to a new data provider, or moving off of Goodmail. Goodmail had a unique way of tokenizing their customers’ mail by relaying mail through their own IP addresses, and consequently their reputation. Therefore, once you stopped using Goodmail, your traffic now goes through your IPs, which hasn’t had any traffic in awhile, which means you’ll need to work on building up your sending reputation again.

This statement applies to any customers of Goodmail’s hosted imprinting service but does not apply to in-house senders using products such as Momentum by Message Systems that had a built-in Goodmail Imprinter. For such users the shutdown of Goodmail involved shutting off the Goodmail Imprinter component of their infrastructure, but IP warmup is not required since those users were already sending from their own IP addresses.

Tom’s Advice

Assuming you didn’t get a chance to read the link, here are the five points of advice provided by the article:

  1. Sign up for all feedback loops. Suppress from future mailings.
  2. Authenticate. Use SPF, SenderID and DKIM.
  3. Segment and mail your active subscribers. Put your best foot forward.
  4. Monitor. Use seedlists such as Mailbox Monitor and watch your IP’s Sender Score.
  5. Get Certified. Get your new IP Sender Score Certified.

Some of these warrant additional discussion from a technical point of view.

Feedback Loops

In order to be effective, Feedback Loop message handling needs to be automatic. Message Systems customers should already be aware that we have provided built-in Feedback Loop processing as of our 3.0 release in 2008. In addition to automatically unsubscribing recipients that trigger a Feedback Loop message, you should also take the volume of Feedback Loop hits as a metric to show the effectiveness of your mailings. Feedback Loop hits should also be used as a factor when determining traffic shaping rules, especially when an IP address is new. If you see a lot of FBL hits, you should throttle back on the sending IP address.
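To make the idea of using FBL hits as a traffic shaping input concrete, here is a minimal sketch. The 0.1% complaint threshold, the halving step and the stricter treatment of new IPs are my own illustrative assumptions, not published ISP limits, and the function names are hypothetical.

```python
def complaint_rate(fbl_hits: int, delivered: int) -> float:
    """Fraction of delivered messages that triggered a feedback loop report."""
    return fbl_hits / delivered if delivered else 0.0


def adjust_hourly_limit(current_limit: int, fbl_hits: int, delivered: int,
                        max_rate: float = 0.001, new_ip: bool = False) -> int:
    """Halve the hourly limit when complaints exceed the threshold.

    max_rate (0.1%) is an assumed threshold; new IPs get half of it,
    reflecting the advice to be more cautious while warming up.
    """
    threshold = max_rate / 2 if new_ip else max_rate
    if complaint_rate(fbl_hits, delivered) > threshold:
        return max(1, current_limit // 2)
    return current_limit


if __name__ == "__main__":
    # 8 complaints on 10,000 delivered: fine for an aged IP, too many for a new one.
    print(adjust_hourly_limit(10_000, 8, 10_000))
    print(adjust_hourly_limit(10_000, 8, 10_000, new_ip=True))
```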

Monitoring

With regards to monitoring, seedlist monitoring is a good indicator of how ISPs are treating your mail, but it provides only part of the overall picture. To get a complete view of your deliverability you also need to monitor what happens before the ISP accepts a given message, taking into account the temporary (aka transient or 4xx) failures and permanent (aka 5xx) failures that occur as you try to send. When monitoring permanent failures, keep in mind that permanent failures can occur both during delivery (synchronous or in-band) and post-delivery (asynchronous or out-of-band) through the ISP sending back a DSN (delivery status notification) message to the return path (aka envelope sender or envelope from) address of the original message. You should be tracking and trending all failures, especially when sending on new IP addresses.
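A simple way to start tracking in-band failures is to bucket each SMTP reply by the first digit of its three-digit code. The sketch below does only that; the bucket names and function names are hypothetical, and out-of-band DSN bounces would need separate handling.

```python
from collections import Counter


def classify_reply(reply: str) -> str:
    """Classify an SMTP reply line by the first digit of its reply code."""
    code = reply.strip()[:3]
    if len(code) != 3 or not code.isdigit():
        return "unknown"
    buckets = {
        "2": "delivered",
        "4": "temporary_failure",   # transient, worth retrying later
        "5": "permanent_failure",   # do not retry as-is
    }
    return buckets.get(code[0], "other")


def tally(replies) -> Counter:
    """Count replies per bucket so the totals can be trended over time."""
    return Counter(classify_reply(reply) for reply in replies)


if __name__ == "__main__":
    log = [
        "250 2.0.0 OK",
        "421 4.7.0 Try again later",
        "550 5.1.1 User unknown",
    ]
    print(tally(log))
```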

Additional Technical Considerations

Keep in mind that just like in life, you are judged not just by what you say, but by how you say it (and whom you say it to). With regards to deliverability this comes down to content and sending practices. From a technical standpoint we focus less on content (what you say), but it does have a significant impact on IP warmup. You should avoid sending riskier content on new IP addresses, in terms of both overall content and wording (I’ve seen deliverability dip just for using the word “sexy” in a mailing, even when the overall message was not sexual in nature).

Sending Throttles

Before you ever start sending from a new IP it is vital that you pre-configure your sending software to comply with as many published ISP recommendations as possible. A convenient resource is this page provided by Word to the Wise: ISP Summary Information. Pay particular attention to the Connection Limits and Sending Limits columns.

Sending Volume

When warming up IP addresses it is important to start slowly; ISPs do not trust new IPs and will not respond well to new IPs coming online and immediately bursting out large amounts of traffic. While there are no published limits online, one of the recommendations I have heard is to avoid sending more than 10,000 messages per day to the major ISPs (Yahoo!, Gmail, Hotmail, AOL, etc.) when first sending, and I’d say it would be best to send fewer than 1,000 messages per day to any smaller ISPs. By reviewing your temporary and permanent failure messages you will be able to get a feel for whether your reputation is sufficient to increase volume, and after increasing volume you should pay particularly close attention to your failure metrics to make sure that the change has not had an adverse effect on deliverability. I generally recommend not increasing volume by more than 2x at a time and not more than once every day or so. Don’t hesitate to revert to a lower volume if you start seeing an increase in temporary and permanent failures.
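The ramp described above can be sketched as a simple schedule of daily caps. This is only an illustration: the starting volume and 2x factor come from the rules of thumb in this section, while the target volume, the function names and the one-step-back rule are assumptions of mine.

```python
def warmup_schedule(start: int, target: int, factor: float = 2.0) -> list:
    """Return daily volume caps, growing by at most `factor` per step."""
    if start <= 0 or target < start or factor <= 1:
        raise ValueError("need 0 < start <= target and factor > 1")
    schedule = []
    volume = start
    while volume < target:
        schedule.append(volume)
        # Always make progress, but never exceed the target.
        volume = min(target, max(volume + 1, int(volume * factor)))
    schedule.append(target)
    return schedule


def next_volume(schedule: list, step: int, failures_rising: bool) -> int:
    """Advance one step, or fall back one step when failures are rising."""
    if failures_rising:
        return schedule[max(0, step - 1)]
    return schedule[min(len(schedule) - 1, step + 1)]


if __name__ == "__main__":
    caps = warmup_schedule(10_000, 100_000)   # hypothetical target for one major ISP
    print(caps)
    print(next_volume(caps, 2, failures_rising=True))
```

In practice each step would last at least a day, and the decision to advance would be driven by the failure metrics rather than the calendar alone.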

Suppress Bounces

When sending, make sure to quickly and automatically suppress any recipients that the ISPs identify as being invalid. When first sending you should assume that you are being watched closely, and one aspect of that is your practices regarding bounce processing. If you repeatedly send to someone that an ISP identifies as invalid through a bounce message you will be penalized for it by the ISP, and that punishment can potentially come faster when an IP address is new due to the lower starting reputation.
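A minimal sketch of automatic suppression follows. It assumes that the enhanced status code 5.1.1 (bad destination mailbox address) is the signal for an invalid recipient; real ISP replies vary and some omit enhanced codes entirely, so treat the code set, the class name and the method names as hypothetical starting points.

```python
import re

# Enhanced status codes look like "5.1.1"; this pattern skips the plain reply code.
ENHANCED_CODE = re.compile(r"\b[245]\.\d{1,3}\.\d{1,3}\b")

# Assumption: only 5.1.1 is treated as "recipient does not exist".
INVALID_RECIPIENT_CODES = {"5.1.1"}


class SuppressionList:
    def __init__(self):
        self._addresses = set()

    def record_bounce(self, address: str, reply: str) -> bool:
        """Suppress the address if the reply marks it as invalid."""
        match = ENHANCED_CODE.search(reply)
        if match and match.group(0) in INVALID_RECIPIENT_CODES:
            self._addresses.add(address.lower())
            return True
        return False

    def filter(self, recipients):
        """Drop suppressed addresses before the next send."""
        return [r for r in recipients if r.lower() not in self._addresses]


if __name__ == "__main__":
    suppressed = SuppressionList()
    suppressed.record_bounce("Bob@Example.com", "550 5.1.1 User unknown")
    suppressed.record_bounce("amy@example.com", "452 4.2.2 Mailbox full")
    print(suppressed.filter(["bob@example.com", "amy@example.com"]))
```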

Watch Out For Deferrals

There is a specific class of temporary and permanent failure responses that you need to keep a particular eye out for: the deferral messages. A deferral message from an ISP indicates that you need to quickly and decisively change your sending practices, as these are the warning messages you receive from the ISPs prior to being blacklisted. You can get examples of deferral messages from Yahoo here and many ISPs will list examples of their deferral messages on their postmaster page. As an example, here is a Hotmail deferral message:

421 4.16.55 [TS01] Messages from x.x.x.x temporarily deferred due to excessive user complaints

When you see such messages, you need to review your content and throttles and pause sending for a couple of hours to allow things to cool off while you determine what changes you need to make.
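Spotting these messages can be automated with simple pattern matching. The sketch below looks for the word "deferred" in a 4xx or 5xx reply and computes when sending might resume; the keyword, the two-hour pause and the function names are assumptions for illustration, and each ISP's actual wording should be taken from its postmaster pages.

```python
import re
from datetime import datetime, timedelta

# Assumption: deferral replies contain the word "deferred".
DEFERRAL_PATTERN = re.compile(r"\bdeferred\b", re.IGNORECASE)


def is_deferral(reply: str) -> bool:
    """True for a 4xx/5xx reply whose text looks like a deferral warning."""
    reply = reply.strip()
    return reply[:1] in ("4", "5") and bool(DEFERRAL_PATTERN.search(reply))


def resume_time(reply: str, now: datetime, pause_hours: int = 2):
    """Return when to resume sending, or None if no pause is needed."""
    if is_deferral(reply):
        return now + timedelta(hours=pause_hours)
    return None


if __name__ == "__main__":
    line = ("421 4.16.55 [TS01] Messages from x.x.x.x temporarily deferred "
            "due to excessive user complaints")
    print(resume_time(line, datetime(2011, 3, 1, 9, 0)))
```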

Automating IP Warmup

In his article Tom Sather advised:

If this looks like a lot of work, then you’re right. To be successful, you need to plan appropriately, be patient, send smarter, and constantly monitor.

Tom is absolutely correct: warming up new IP addresses requires research, preparation and diligence.

The good news for Message Systems customers is that we’ve taken care of this for you with our Adaptive Delivery module. Adaptive Delivery will automatically identify new IP addresses, set initial throttles and gradually increase volume as the IPs age, monitoring ISP responses to ensure that the ISPs are responding positively. If at any point the Adaptive Delivery module identifies a negative ISP response, it adjusts throttles in real time and monitors for additional negative responses. If an ISP replies with a deferral response, Adaptive Delivery will suspend delivery, throttle back and send you an alert so that you can check the content being sent. All of this is built using intelligence that is constantly reviewed and improved by a full-time, in-house deliverability specialist. In addition, our bounce processing system now supports live updates, allowing us to improve classifications thanks to automated feedback from customer systems. If you’re not taking advantage of these new capabilities, contact Message Systems and we’ll help.

How To Send One Billion Email Marketing Messages Per Month

One... *billion* emails!

One *Billion* Emails

In email marketing there are senders of all shapes and sizes, from small businesses using self-serve ESPs to the largest web properties self-sending to massive user bases. While only a few senders will reach or exceed volumes of one billion messages per month, the tools and practices needed to achieve such a volume level are applicable to all senders who want to succeed in email marketing.

Who Am I?

My name is Mike Hillyer. I manage a team of Sales Engineers for Message Systems, a leading provider of digital messaging solutions for both senders and receivers. In my work over the last several years I have helped a number of clients reach the billion messages per month level and even more clients successfully deploy email marketing solutions ranging in scale from hundreds of thousands to millions of messages per month.

Who Needs To Send This Much Mail?

Contrary to what the image above implies, there’s nothing inherently evil about sending a billion messages a month. Some of the businesses that move a billion messages a month include ESPs, social networks (some move more than a billion a day for that matter), social gaming sites and large online retailers.

Any time you have a fairly large number of users (5-20 million) who receive multiple messages per day, or a really large number of users (40-50 million) receiving one message per day you are heading into the billion messages per month territory.

What Are The Numbers?

So exactly how much mail are we talking about here? That will depend on sending patterns:

In a lot of high-volume environments the sending will be to a world-wide audience, resulting in round-the-clock sending with no significant bursts of traffic. In such an environment the hourly volume will be 1,000,000,000 messages divided by 30 days divided by 24 hours equaling 1,388,889 messages per hour (386 messages per second), assuming 30 days in a month.
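The arithmetic for the steady, round-the-clock case can be checked in a few lines. The function names are mine; the 30-day month is the same assumption used in the text.

```python
def hourly_rate(monthly_messages: int, days_in_month: int = 30) -> float:
    """Average messages per hour for evenly spread, round-the-clock sending."""
    return monthly_messages / days_in_month / 24


def per_second(messages_per_hour: float) -> float:
    """Convert an hourly rate to messages per second."""
    return messages_per_hour / 3600


if __name__ == "__main__":
    hourly = hourly_rate(1_000_000_000)
    print(round(hourly))              # about 1.39 million per hour
    print(round(per_second(hourly)))  # about 386 per second
```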

In an environment with inconsistent hourly volumes, we have to allow for both an average hourly volume and a maximum hourly volume and then design our solution to address the maximum hourly volume.

We need to look at seasonal factors: Does your social network move a lot of extra messages around Mother’s Day? Does your dating site move a lot of extra messages around Valentine’s Day? Does your web shopping portal do a lot of extra business around Christmas?

We need to look at growth: If you are sending a billion messages a month it is very likely due to successful growth of your user base, something which you certainly have no intention of slowing. Look at how you have grown your email volume so far and extrapolate it out for the next year or two (especially if you only get budget for your infrastructure every two years).

Let’s assume for the sake of this article that you have an average volume of one million messages per hour with a peak volume of two and a half million messages per hour during your busiest season. You expect to double your user base each year for the next two years. At the end of two years you expect your peak to have quadrupled to ten million messages per hour, a rate that works out to 7.2 billion messages per month if sustained (I’ve seen just this kind of growth several times with customers and prospects).
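The growth projection is simple compounding, sketched below with hypothetical function names. It assumes message volume scales directly with the user base and that the monthly figure is the peak rate sustained around the clock for 30 days, which is the capacity you would size for.

```python
def projected_peak(peak_per_hour: float, years: int,
                   growth_per_year: float = 2.0) -> float:
    """Peak hourly volume after compounding growth for a number of years."""
    return peak_per_hour * growth_per_year ** years


def monthly_capacity(messages_per_hour: float, days_in_month: int = 30) -> float:
    """Messages per month if the hourly rate were sustained all month."""
    return messages_per_hour * 24 * days_in_month


if __name__ == "__main__":
    peak = projected_peak(2_500_000, years=2)
    print(peak)                    # 10 million per hour
    print(monthly_capacity(peak))  # 7.2 billion per month
```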

You Will Need to Send In-House

A lot of senders start by using an Email Service Provider (ESP) for their sending and should do so: an ESP provides infrastructure and expertise to handle the details of sending email marketing messages for their clients at a good price, allowing companies to focus on their business. In addition, the costs of installing and maintaining proper sending infrastructure and practices are not justifiable for most low-volume senders.

That said, if you are aiming for a billion email marketing messages a month and are using an ESP it’s time to plan your move to in-house sending. Assuming a $1.00 CPM (Cost Per Mille with Mille being Latin for thousand, so cost per thousand) you are looking at paying an ESP a million dollars a month to handle this kind of volume. Naturally you can probably secure a better rate than $1.00 CPM at these volume levels but regardless of the discount at this volume level you will pay less to buy the infrastructure and hire the people needed to do this yourself, gaining the control you need when sending at these volume levels.

Start With a Good Reputation

In order to hit the volume levels we’re talking about it is going to be vital that you have a solid sending reputation. This means you need to follow best practices for list acquisition, list hygiene, segmentation and relevancy. There’s a wealth of information online and an excellent catalog of it at Email Marketing Reports. This article will focus primarily on the technical aspects of sending one billion email messages per month but keep in mind that reaching one billion messages a month without a solid reputation on your domain and sending IPs is very difficult. A number of tools for checking the reputation of your IP addresses can be found at Word to the Wise. At this volume level email is key to your business and a solid reputation is going to be essential.

From a technical perspective there are a number of bases we need to cover regarding authentication, whitelisting, bounce processing and complaint handling.

Authentication

As a reputable sender you will want to associate your IP addresses with your domain using the authentication standards available to you. These include SPF, SenderID, DomainKeys (DK) and DomainKeys Identified Mail (DKIM). There are indications that SPF (and SenderID by association) is ineffective but given the low effort required to implement it I would recommend doing so anyway. While SPF and SenderID are purely DNS-based, DK and DKIM require an implementation either during message creation or during relay by the MTA and as a result will impact the maximum throughput of your infrastructure (more about this later).

DomainKeys is quickly being superseded by DomainKeys Identified Mail but with most solutions supporting both DK and DKIM it is simple enough to use both when sending to an ISP that supports one standard or the other. Implementation details will vary based on your sending solution. While some recommend selectively signing DK and DKIM for only messages sent to ISPs that are known to check authentication (in order to lower the impact signing has on throughput on a solution that takes a significant performance hit from signing), I recommend signing all messages; you never know who is checking for authentication without announcing it.
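Because SPF is purely DNS-based, publishing it comes down to adding a TXT record that lists the IP addresses allowed to send for your domain. The sketch below assembles such a record from a list of sending IPs; the function name is hypothetical, the addresses are from ranges reserved for documentation, and the choice of a strict "-all" default is an assumption you should review before publishing anything.

```python
import ipaddress

ALLOWED_POLICIES = ("-all", "~all", "?all", "+all")


def build_spf_record(sending_ips, policy: str = "-all") -> str:
    """Build the text of an SPF TXT record from IPs or CIDR ranges."""
    if policy not in ALLOWED_POLICIES:
        raise ValueError("unknown SPF policy: " + policy)
    mechanisms = []
    for entry in sending_ips:
        network = ipaddress.ip_network(entry, strict=False)
        prefix = "ip4" if network.version == 4 else "ip6"
        if network.num_addresses == 1:
            value = str(network.network_address)   # single host, no mask needed
        else:
            value = str(network)                   # keep the CIDR mask
        mechanisms.append(prefix + ":" + value)
    return " ".join(["v=spf1"] + mechanisms + [policy])


if __name__ == "__main__":
    print(build_spf_record(["192.0.2.10", "198.51.100.0/24"]))
```

DK and DKIM are different in kind: they require a signing step in the injector or MTA, which is why they affect throughput and SPF does not.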

Whitelisting

One benefit of getting on the various whitelists provided by ISPs and reputation providers is that in some cases you can send higher volumes on whitelisted IP addresses than would otherwise be possible. Keep in mind that in most situations whitelisting is something that comes after sending has already begun in order to allow the provider of the whitelist to examine your sending patterns as part of the whitelisting process, so put your best foot forward (and follow it up with consistent behavior).

Bounce Processing

One quick way to lose reputation is to repeatedly send mail to recipients that do not exist. The ISPs will track how many non-existing addresses you send to and throttle you accordingly. Even more seriously, ISPs will occasionally take inactive email addresses and re-activate them as spam traps; any mail sent to the address will immediately get the source classified as a spam source and filtered accordingly.

To prevent this it is necessary to capture and act on the responses sent by the ISPs and unsubscribe those addresses identified as non-existent or inactive, while retaining those with responses that identify users on vacation and other non-fatal errors. Commercial sending solutions will perform this automatically with varying levels of effectiveness while other platforms will require a third-party solution such as Boogie Tools. Keep in mind that the more you send, the more you receive back in the form of automated responses and bounce notifications. As your reply addresses reach more and more users the flow of notifications will become contaminated with spam and virus-carrying messages, requiring the implementation of Anti-Virus/Anti-Spam solutions for your incoming mail stream.

Complaint Handling

In an effort to help senders improve their practices, a number of ISPs have implemented ARF-formatted Feedback Loop programs. When a user on a supported ISP clicks the “This is Spam” button, an automated message is sent to an address you define in advance (when signing up with the ISP for the Feedback Loop program). By processing these messages and unsubscribing the relevant users, you prevent further reputation damage that may result when sending them future messages.

The ARF format used by the ISPs makes it relatively straightforward to process Feedback Loop messages and use them to unsubscribe the users who have complained about your messages. There are tools available to process ARF-formatted messages and some sending solutions will handle FBL messages natively.
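As a rough sketch of what that processing involves, the code below uses Python's standard email package to pull the machine-readable fields out of an ARF report. The sample report and the function name are made up for illustration. Note that Original-Rcpt-To is an optional field and some ISPs redact it, so in practice you may need to recover the recipient from the attached original message instead.

```python
import email

SAMPLE_REPORT = "\n".join([
    "From: fbl@isp.example",
    "To: fbl-handler@sender.example",
    "Subject: FW: Weekly offers",
    "MIME-Version: 1.0",
    'Content-Type: multipart/report; report-type=feedback-report; boundary="ARF"',
    "",
    "--ARF",
    "Content-Type: text/plain",
    "",
    "This is an email abuse report.",
    "--ARF",
    "Content-Type: message/feedback-report",
    "",
    "Feedback-Type: abuse",
    "User-Agent: ExampleFBL/1.0",
    "Version: 1",
    "Original-Rcpt-To: subscriber@isp.example",
    "",
    "--ARF",
    "Content-Type: message/rfc822",
    "",
    "From: news@sender.example",
    "To: subscriber@isp.example",
    "Subject: Weekly offers",
    "",
    "Hello",
    "--ARF--",
    "",
])


def parse_feedback_report(raw: str) -> dict:
    """Return the feedback-report fields as a dict keyed by lowercase name."""
    message = email.message_from_string(raw)
    for part in message.walk():
        if part.get_content_type() != "message/feedback-report":
            continue
        payload = part.get_payload()
        # The parser may expose the fields as a nested message or as raw text.
        if isinstance(payload, list):
            fields = payload[0]
        else:
            fields = email.message_from_string(payload)
        return {name.lower(): value for name, value in fields.items()}
    return {}


if __name__ == "__main__":
    report = parse_feedback_report(SAMPLE_REPORT)
    print(report.get("feedback-type"), report.get("original-rcpt-to"))
```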

Infrastructure Considerations

There are a number of architectural components that come into play to make it possible to send email marketing messages at volume levels of one billion email messages per month (or more) including network connectivity, server hardware and software.

Connectivity

Most professional sending operations are based in rented datacenters, simplifying the provisioning of network connectivity. In our initial example of a maximum throughput of 2.5 million messages per hour we’ll use a sample message size of 50 kilobytes (51,200 bytes), meaning that we need to send at a rate of 2,500,000 * 51,200 = 128,000,000,000 bytes per hour. That works out to about 271 megabits per second if a megabit is counted as 2^20 bits, or about 284 megabits per second in the decimal units that network links are normally rated in.
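The bandwidth figure depends on whether a megabit is counted as 1,000,000 bits or as 2^20 bits, so the sketch below computes both. It covers message payload only; SMTP, TCP and retry overhead are ignored, so treat the result as a floor. The function name is hypothetical.

```python
def required_megabits_per_second(messages_per_hour: int, message_bytes: int,
                                 bits_per_megabit: int = 1_000_000) -> float:
    """Payload bandwidth needed to sustain the given hourly message rate."""
    bits_per_hour = messages_per_hour * message_bytes * 8
    return bits_per_hour / 3600 / bits_per_megabit


if __name__ == "__main__":
    decimal = required_megabits_per_second(2_500_000, 51_200)
    binary = required_megabits_per_second(2_500_000, 51_200, bits_per_megabit=2 ** 20)
    print(round(decimal, 1))  # decimal megabits, as links are usually rated
    print(round(binary, 1))   # megabits counted as 2**20 bits
```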

With the throughput we’re talking about we certainly need to use gigabit speed networking within the datacenter and, more importantly, need backbone connectivity that can support not only a sustained throughput of 271 megabits per second but that can also handle our future needs of 7.2 billion messages per month. You need to look at a datacenter that will be able to provide sustained gigabit speeds to the backbone.

Keep in mind that when you are sending a billion messages per month it means that email has significant impact on your bottom line and you won’t be able to tolerate extended outages. You need to not only make sure that the datacenter you choose has redundant power and backbone connections, you also need to consider using redundant datacenters.

Server Hardware

Moving over a million messages per hour does not require the purchase of custom server hardware but it does require making a proper investment in hardware. Generally speaking you will be using an infrastructure similar to the following:

An example of a basic sending infrastructure

The Message Injector queries the database and uses the results to assemble one or more messages which it relays to the Outbound Mail Server. The Outbound Mail Server queues the message, performs any necessary manipulations on the message and then sends it via the Internet to the destination server. In the event of a delivery failure message or a feedback loop message, the incoming message arrives via the Internet at the Inbound Mail Server. The Inbound Mail Server performs anti-virus/anti-spam scanning and then, in the case of a legitimate message, processes the message and updates the subscriber information in the database (not all server solutions can perform this processing in-stream; when using such solutions, an intermediate server will be needed to accept the clean message from the Inbound Mail Server and process it using custom code).

In a production deployment there can be several variations on this example, typically with multiple servers used on the outbound and inbound roles, with multiple message injectors pushing to the outbound machines and often specialty servers on the inbound side dedicated to processing incoming feedback loop and bounce messages.

I generally recommend mail servers similar to the following:

  • 2x multi-core, 64-bit processors
  • 16-32 GB of RAM
  • 8x 15K RPM hard-disks
  • Battery-backed RAID-10 controller

The specific details of your hardware selection will depend on the ability of your specific software to leverage the resources provided. A large number of fast disks in a RAID-10 array is recommended for the message spool as standards-compliant mail servers must write messages to disk before accepting them for delivery, placing significant demands on storage resources.

Software

Since I work for a leading software provider for high-volume senders, you would rightly expect me to recommend a commercial solution, and specifically my company’s solution. I’d like to take a moment to point out why:

Performance

We need to send at a rate in excess of one million messages per hour. I’ve dealt with a number of solutions and my experience has shown that most Open Source MTAs such as Postfix and Sendmail are limited to around 100,000 messages per hour. Commercial sending solutions typically show real-world performance ranging from 500,000 messages per hour to over two million messages per hour.

I have helped several companies that were operating dozens of Open Source servers to consolidate down to one-tenth as many servers running a commercial solution.
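The arithmetic behind these figures is easy to check. The sketch below assumes a 30-day month of round-the-clock sending and uses the per-server rates quoted above; it ignores headroom, redundancy and peak sending windows, all of which push the real server count higher.

```python
import math

HOURS_PER_MONTH = 30 * 24  # assumption: 30-day month, sending 24 hours a day


def required_hourly_rate(messages_per_month):
    """Average messages per hour needed to hit a monthly volume."""
    return messages_per_month / HOURS_PER_MONTH


def servers_needed(hourly_rate, per_server_rate):
    """Minimum number of servers, with no allowance for spare capacity."""
    return math.ceil(hourly_rate / per_server_rate)


rate = required_hourly_rate(1_000_000_000)
print(round(rate))                      # 1388889 messages per hour
print(servers_needed(rate, 100_000))    # 14 servers at 100,000 per hour
print(servers_needed(rate, 500_000))    # 3 servers at 500,000 per hour
print(servers_needed(rate, 2_000_000))  # 1 server at two million per hour
```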

Segregation

In addition to limited throughput, Open Source MTAs are usually limited to sending through a single IP address, meaning that to send through ten IP addresses you need ten separate server instances. Commercial solutions support sending through multiple IP addresses simultaneously.
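To show what sending different streams through different IP addresses means at the connection level, here is a small Python sketch. The stream names and the addresses (taken from the documentation range 192.0.2.0/24) are hypothetical, and this is a client-side illustration only, not a description of how any MTA implements it.

```python
import smtplib

# Hypothetical mapping of message streams to local source addresses.
STREAM_IPS = {
    "transactional": "192.0.2.10",
    "marketing": "192.0.2.20",
}


def source_ip_for(stream):
    """Look up the local IP address a given stream should send from."""
    return STREAM_IPS[stream]


def open_connection(stream, host, port=25):
    """Open an SMTP connection bound to the stream's source address.

    Port 0 lets the operating system choose the local port. The address
    must be configured on the sending machine; not called in this sketch.
    """
    return smtplib.SMTP(host, port, source_address=(source_ip_for(stream), 0))


print(source_ip_for("marketing"))
```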

Advanced Functionality

Commercial solutions go beyond basic message queuing and sending, providing the additional functionality required for a high-volume sender. This includes features such as APIs, bounce classification, feedback loop processing, internal scripting, automated throttling, and database integration.

Availability

If you’re sending a billion emails a month, you absolutely need a solution that provides high availability out of the box. If a server goes down you can’t afford to be frantically activating a warm spare, only to find out that it too has some issue. You need an active-active solution that reacts automatically to server failures and keeps the mail flowing.

Manageability

You need a solution that can be easily managed on your terms, whether you prefer editing configuration files or using a web interface. In addition, you need something that grows with you, providing centralized management of an entire cluster of servers. Commercial solutions will provide easier, centralized management.

Reporting

One key to successful sending at high volumes is keeping tabs on how your server is performing and how your mailings are doing. You need to know what is passing through your server, how quickly messages are moving, whether queues are backed up, how the various ISPs are treating your traffic, all with the ability to drill down on specific source IPs and specific destination ISPs. You need to be able to see all of this in real-time and across your entire infrastructure. A good commercial solution provides all of this out of the box.

Time is Money

On multiple occasions I’ve seen organizations choose a free or low-cost solution and then spend countless hours building workarounds to the weaknesses of their chosen platform, writing scripts to automate administration, reporting tools to fill their needs, failover scripts to provide redundancy, etc.

While a lot of this work was impressive, it required time to implement and time to maintain. Time spent creating tools that are already provided in an alternative solution is time (and money) wasted. You are always better off using your time to create your “special sauce”: that which makes your business unique and gives you a competitive advantage.

The Price of Success is Continued Vigilance

When sending at a rate of one billion messages a month (or more), you can’t just use a ‘fire and forget’ mentality. You are going to have to have people around to keep a constant eye on what is happening in your environment, monitoring multiple key factors to ensure you can continue to successfully send.

Reputation Monitoring

Remember earlier when I said you need to start with a good reputation? You also need to keep a good reputation, and the only way to do that is to know what your reputation is. You will need to take advantage of reputation monitoring tools provided by companies like Return Path and Pivotal Veracity, as well as keep a close eye on the reporting produced by your sending solution (remember when I said you need good reporting?).

You need to watch things such as bounce rates, FBL hit rates, blacklist hits, transient failures and response rates.
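As a rough illustration of watching these numbers, the Python sketch below turns raw counts into rates and flags anything above a limit. The field names and the threshold values are placeholders I made up for the example, not industry standards; choose limits based on your own history and what each ISP tolerates.

```python
from dataclasses import dataclass


@dataclass
class MailingStats:
    sent: int
    bounced: int
    fbl_complaints: int  # feedback loop (FBL) hits
    deferred: int        # transient failures


def rate(count, total):
    """Fraction of the total, guarding against an empty mailing."""
    return count / total if total else 0.0


def flag_problems(stats, max_bounce=0.05, max_complaint=0.001, max_deferred=0.10):
    """Return the names of metrics that exceed their (placeholder) limits."""
    problems = []
    if rate(stats.bounced, stats.sent) > max_bounce:
        problems.append("bounce rate")
    if rate(stats.fbl_complaints, stats.sent) > max_complaint:
        problems.append("complaint rate")
    if rate(stats.deferred, stats.sent) > max_deferred:
        problems.append("transient failure rate")
    return problems


stats = MailingStats(sent=100_000, bounced=7_000, fbl_complaints=50, deferred=2_000)
print(flag_problems(stats))  # ['bounce rate']
```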

Infrastructure Monitoring

If email is your company’s lifeblood (and if you’re sending a billion messages a month it certainly is) then you need to make sure to keep it flowing no matter what happens, and that means making sure your email infrastructure stays online. I spoke earlier of the need for high availability; active monitoring goes hand-in-hand with this need. You will need to monitor the health of the servers that support your infrastructure, the network components that carry your messages and the software that creates and relays your messages.

There are a number of monitoring solutions available to accommodate any platform and budget. Be sure to implement one that meets your needs and get monitoring. Make sure to test simulated failures to confirm that monitoring is working successfully. Consider setting up a simulated mailing that runs on a regular basis using your full infrastructure stack: a monitoring script can check an inbox and if the test message fails to appear, something is potentially wrong in your sending infrastructure. This approach can help identify issues that may pass by other monitoring systems unnoticed (and can be integrated into some monitoring solutions directly).
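The simulated mailing check can be sketched in a few lines of Python. This is a minimal outline under assumptions: `send_test_message` and `inbox_subjects` are hypothetical callables that you would implement against your own stack (for example, with `smtplib` to inject the message and `imaplib` to read the test inbox), and the timeout values are arbitrary.

```python
import time
import uuid


def run_probe(send_test_message, inbox_subjects,
              timeout=300, poll_interval=15, sleep=time.sleep):
    """Send a uniquely tagged test message and wait for it to arrive.

    send_test_message(token) injects a message whose subject contains token.
    inbox_subjects() returns the subject lines currently in the test inbox.
    Returns True if the message shows up within timeout seconds.
    """
    token = f"probe-{uuid.uuid4()}"  # unique per run, so old mail never matches
    send_test_message(token)
    waited = 0
    while True:
        if any(token in subject for subject in inbox_subjects()):
            return True
        if waited >= timeout:
            return False
        sleep(poll_interval)
        waited += poll_interval


# Demonstration with in-memory stand-ins for the real send and inbox calls.
delivered = []
ok = run_probe(delivered.append, lambda: list(delivered), sleep=lambda s: None)
print("probe delivered" if ok else "probe missing: raise an alert")
```

Run it from your scheduler of choice and raise an alert whenever it returns False. Passing the send and inbox functions in as arguments keeps the check itself easy to test without touching a live server.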

Response Monitoring

Keep in mind that you get what you monitor for; if you focus too much on one metric it may improve without helping the big picture. In addition to making sure all the underlying pieces are in place, don’t forget to keep an eye on things where the rubber meets the road. You may be sending at phenomenal rates with great metrics but failing to generate customer actions that lead to revenue.

Conclusion

While by no means an exhaustive list, I hope this gives you some idea as to the scope of sending in a high-volume environment of one billion email messages per month. Watch this space over the coming weeks for deeper dives into some of the subjects covered here.

Questions? Did I miss something? Let me know in the comments!

Disclaimer

The opinions and information in this post are my own and do not necessarily reflect those of my employer.

Congratulations, You Can Mail-Merge!

I had this exact thought the other day and need to share this great article by Morgan Stewart. Quick quote:

I’ve received over 500 emails so far this year in which the subject line included my first name. “Morgan, Book Now & Save on Top Travel Deals” “Morgan – Congratulations! Your Nomination to Cambridge Who’s Who!” “Morgan, Get Dad a 58″ Samsung Widescreen.”

When I see an email with my name in the subject line, my first thought is not “Phew! These guys know my name!” No, it’s become a red flag for spam.

He’s spot-on: these days seeing my name in the subject line is almost a sure sign of spam. Read the entire post here.

Good to know I share good company on the Cambridge Who’s Who list.

MailChimp Makes Project Omnivore Public

Pretty impressive blog post at MailChimp today, in which they make public the details of their Project Omnivore: http://www.mailchimp.com/blog/project-omnivore-declassified.

In short:

Omnivore is a program that runs in the background and analyzes email campaign and user account data. Non-stop.

When it finds anything suspicious about a MailChimp user or his campaigns, it’ll do one of two things:

  1. Send the user a warning for something that looks problematic.
  2. Suspend a user’s account for something bad, send them a warning, and alert our abuse team to investigate the account.

The long version is at the link and the impressive part is how much data and prediction they are incorporating into the tool to help them avoid sending campaigns that will damage their sending reputation in the long run. It’s not just about filtering the mail stream to make sure it’s not going to trip filters, but watching the list management practices of their customers.

This is the kind of thing that all ESPs are going to have to start moving toward: using internal systems to ensure that the right message is being sent to the right people, combined with tools like Adaptive Delivery and real-time bounce and feedback loop processing to make sure that the messages are being sent in the right way.

Some days I’d just like to see what a company like MailChimp would do with our toolset. Give them a messaging server with internal scripting that can hook into their data sources and I’d wager some very cool things would come out.