Based in Calgary, Canada, I've managed the Enterprise Sales Engineering team at Message Systems since 2006, helping senders such as Facebook, Salesforce and Match.com get the message out to their customers.
Previously I was a Technical Writer, Trainer and Speaker for MySQL AB, the makers of the world's most popular Open Source RDBMS.
I greatly admire Pixar and its people, and one of the people I admire greatly is Ed Catmull, the Pixar founder. His personal contributions to computer science and computer graphics are phenomenal, but he’s also an excellent leader and businessman. The following video from an Economist conference provides a good example of his wisdom:
And here’s another, older example:
It’s wisdom like this that puts books like this on my desk:
There are a number of sessions I’m looking forward to attending, but I’d like to invite you to attend the sessions I’ll be delivering next month (read to the end to save on conference admission):
What the Convergence of Data Security & Privacy Concerns Will Mean to Companies
The barrage of news stories about data breaches and privacy violations is taking a toll on consumer confidence.
What You’ll Learn:
Why data security and privacy issues are converging and how an erosion of consumer confidence can jeopardize data availability for communication and commerce.
How security and privacy are connected to Message Convergence and why they should now be of concern to all ecosystem players and at all levels, Marketing as well as IT.
What principles companies should embrace to address security and privacy in their own environments.
How companies can safeguard their customer data and messaging streams.
New Directions in Email Deliverability
Our panel of industry experts will explore the ongoing evolution of deliverability management and new technology advances, such as adaptive delivery, that will make it easier.
What You’ll Learn:
How deliverability is a tactical companion to Message Convergence – getting messages delivered, read and acted on.
How new advances in technology can improve deliverability management effectiveness and remove the hassles for all stakeholders.
Building Multi-Channel Apps
This session will introduce participants to the whys and wherefores of multi-channel messaging applications: how they deliver business value, and how to construct them. You’ll gain both an understanding of the business strategy behind multi-channel apps, and a nuts-and-bolts working knowledge of the tools and techniques required to design, build and deploy them. Topics will include how to access multiple data sources on the fly and how to make routing determinations. For instance, once you’ve made a judgment on content, context and preference, we’ll cover how to go about actually getting a message routed to its ultimate destination. We’ll go in depth on the subject of multi-channel message type (MCMT), a proprietary content container format that makes it possible to inject messages into the delivery stream with content alternatives dependent on the preferred message channel.
Target Audience: Product and program managers, developers, line of business owners.
What you’ll learn:
How multi-channel messaging delivers business value across any number of industry verticals.
The messaging and data systems/architectures needed to deploy multi-channel messaging.
Introduction to MCMT.
How to configure Momentum, Mobile Momentum and Message Central for multi-channel apps.
Understanding and acting on customer preference data.
Advanced Momentum & Message Scope
This session will extend the sessions on “Introduction to Lua” and “Momentum Essentials and Message Scope” by taking participants through advanced, Lua-based message parsing APIs. Advanced policy scripts for database-driven binding assignment and DKIM signing will be demonstrated. Participants will see practical, but advanced remediation list usage with Message Scope and learn how to create custom remediation actions.
Target Audience: System administrators, operations and support personnel and developers.
What You’ll Learn:
Various parsing techniques using Lua API functionality.
Write Lua policy scripts that implement database-driven binding assignment and DKIM signing.
How to integrate Momentum bounce information with an external database.
How to integrate Message Scope with 3rd-party data feeds.
How to create custom remediation actions with Message Scope.
It’s going to be a great conference and I look forward to meeting everyone. To make it even more appealing, register now and use discount code VIP2S2 to save $250!
Anyone who follows the email marketing industry news is no doubt aware of the increasing number of well-publicized data breaches that have been occurring at the various major ESPs. In addition to the major ESPs, there are no doubt a number of less-publicized or even non-publicized data breaches occurring all the time at both smaller ESPs and in-house enterprise senders. The days when most of us in the email industry could watch from the sidelines and shake our heads have surely passed. Henceforth we should all operate on the assumption that we’re either now under attack as well, or will be shortly.
Email marketers have two valuable resources that malicious parties want to capture and exploit: information and infrastructure. Attackers want to access the information you hold, including email addresses, personally identifiable information (PII) and affiliation information (which organizations send to which recipients). Using this information the attackers can send spam or phishing messages and (in an unlikely worst-case scenario) even perform identity theft.
It’s time for all email marketers, whether sending themselves or through service providers, to make security a fundamental principle in their operations. The Online Trust Alliance (http://otalliance.org) recently published a set of guidelines (https://otalliance.org/resources/securitybydesign.html) that I highly recommend reviewing and following. I’d like to make a few additional recommendations of my own.
All the security technologies in the world can often be defeated by a simple phone call or a few dollars. There are multiple cases where attackers have been able to get into a system through social engineering: calling up someone in the target company and presenting themselves as a trusted co-worker and asking for the unsuspecting employee’s login credentials. In other cases a simple offer of cash in exchange for information or access can bypass any number of security measures.
Whether they are acting innocently or maliciously, your own employees (and customers) can easily be your downfall. There are a number of security measures that can help alleviate this:
Educate your employees and users. Make sure they understand what social engineering attacks are and how to identify and prevent them. Teach them to never disclose their usernames and passwords. Enforce a policy of never asking customers for their credentials, and make it clear to your customers that you will never do so.
Do your homework. Employ best practices in your HR department. That includes performing background checks on your employees (at least the ones with access to sensitive customer information), including credit checks. Keep in mind that people in positions with access to sensitive data could be susceptible to enticement – this is particularly true if you’ve made it easy for them to act on that temptation.
Apply the ‘need to know’ rule. Consider who really needs to be able to see customer information, and how much needs to be visible. Does someone who manages message templates really need access to your recipient list? Does someone who manages segmentation really need to be able to see both the user and domain portion of an email address? Perhaps they can do their job with access to the domain information only. Do customer service reps really need to be able to see lists of recipients or do they just need to be able to look up a specific recipient to do their job? There will always be people who need to access sensitive information, but not as many as you might initially think, and few need access to absolutely all the information rather than just a subset of it.
There are a number of best practices around securing the data you store, but I want to share a few ideas about what to store, to be used in combination with data security best practices.
Store as little as possible. An attacker cannot steal information that you don’t possess. Do not ask for information you do not need or can’t use. Marketers tend to err on the side of over-collection because it ‘might’ come in handy. (Example: are you asking for a physical address when you do not send anything by regular mail?)
Use encryption where possible. Consider a suppression list to prevent sending to people who have unsubscribed (hopefully you followed step one and purged everything but their address when they unsubscribed); you need to have their address in order to prevent sending to it, but you can store their address as a one-way hash and compare a one-way hash of recipient addresses to identify if a recipient should be suppressed. I’ve worked with senders who encrypt the user portion of every recipient address (firstname.lastname@example.org would be stored as email@example.com as an example) in the database, with a custom Lua script in the messaging server decrypting the user portion of the address on the fly just before sending. With this approach, they can still do domain reporting and segmentation, while making it much more difficult for attackers to extract useful information.
Purge data as soon as possible. Again, you cannot lose what you do not have. Purge information as soon as feasible, both customer data and the various logs that can contain customer information. If you need a piece of information for a specific mailing, purge the data once the mailing is complete.
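The hashed suppression list described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the class and function names are hypothetical, the choice of SHA-256 is an assumption (the text only says "one-way hash"), and lowercasing the whole address before hashing is a simplifying assumption about how addresses are normalized.

```python
import hashlib


def address_digest(address):
    """Return a one-way SHA-256 digest of a normalized email address.

    Normalizing (trim + lowercase) is an assumption made so that the same
    recipient always produces the same digest.
    """
    normalized = address.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SuppressionList:
    """Hypothetical suppression list that stores digests, never plaintext."""

    def __init__(self):
        self._digests = set()

    def add(self, address):
        self._digests.add(address_digest(address))

    def is_suppressed(self, address):
        # Hash the candidate recipient and compare digests.
        return address_digest(address) in self._digests


suppressed = SuppressionList()
suppressed.add("Unsubscribed.User@example.com")

print(suppressed.is_suppressed("unsubscribed.user@example.com"))  # True
print(suppressed.is_suppressed("active.user@example.com"))        # False
```

One design note: a plain, unkeyed hash of an email address can still be guessed by an attacker who hashes a list of likely addresses, so a keyed hash (for example HMAC with a secret stored separately from the database) is worth considering as a stronger variant of the same idea.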
While I have no reports to date of email infrastructure as an attack vector, there are still some steps you can take to better secure your email infrastructure.
Secure the server. Implement security at the operating system level as well as at the network level. Restrict access to the web UI to internal machines only (use a VPN for external access). Strongly consider using two-factor authentication including password-protected SSH key-based authentication.
Secure your logs. Remember that your logs will often contain address information, so you need to secure your logs with the same vigilance that you secure your database. Ensure that your file system permissions are properly set and that you retain your logs for no longer than necessary.
Customize your logs. If your system supports customizable logging, consider trimming your logs down to the bare minimum data required for your purposes. Instead of storing the recipient email address, store a customer identifier that you can use to lookup the customer address (high-end solutions will let you store just the domain portion of the address so you can still do reporting on domain volumes and deliverability).
Secure against being an open relay. Grant permission to inject mail on a per-IP basis if possible; use TLS and authentication if you need to allow relaying to external hosts.
Scan your outbound mail streams. An effective way to mitigate infrastructure attacks is to filter all traffic as it leaves the server to prevent sending mail that contains viruses, spam or malware. The incident at CheetahMail I mentioned at the start of this entry could have been prevented with outbound traffic filtering. Keep in mind I’m speaking about AV/AS filtering on a per-message basis. It’s not enough to send a test message to a preview tool if you’re trying to protect your infrastructure; you need a messaging server that can filter traffic on egress.
Implement Feedback Loops. While this may not seem like a security tool, I’ve worked with senders who were able to use a spike in incoming FBL messages to identify an unusual sending pattern coming from their servers, leading them in turn to identify that their network had been compromised and a malicious attacker was using their system to send mail.
Implement authentication tools such as DK/DKIM/SPF/SenderID. Again, this does not directly secure your data, but if a list is compromised it will be harder for a malicious party to deliver mail from their own servers and make it appear to come from you (especially when making phishing attempts with your data).
Monitor Block Activity. As with spam complaints, a sudden burst in rolling blocks could be a red flag that an infrastructure breach has occurred. Set up an alerting system for blocks and automated suspension processes to catch and shut down malicious mail streams before serious damage is done. The good news, if you’re running Momentum, is that our Adaptive Delivery product does this for you automatically.
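To make the "Customize your logs" recommendation above more concrete, here is a minimal Python sketch of a log line that records a customer identifier and the domain portion of the address instead of the full recipient address. The function names and the log line layout are hypothetical; a real messaging server would expose its own logging configuration.

```python
def domain_only(address):
    """Keep the domain for reporting and drop the user portion.

    Returns "unknown" when there is no "@" so that a malformed value is
    never written to the log in full.
    """
    if "@" not in address:
        return "unknown"
    _, _, domain = address.rpartition("@")
    return domain.lower()


def log_record(customer_id, address, status):
    """Build a log line that carries no user portion of the address."""
    return "customer={} domain={} status={}".format(
        customer_id, domain_only(address), status
    )


print(log_record("c-1042", "jane.doe@Example.com", "delivered"))
# customer=c-1042 domain=example.com status=delivered
```

With a line like this you can still report on domain volumes and deliverability, while anyone who obtains the log file has to go back to the (better protected) customer database to recover an address.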
The latest security breaches in the email marketing industry have reinforced that an attack is a matter of when, not if, and senders need to plan accordingly. The recommendations of the OTA, combined with the recommendations above (and constant vigilance) should provide a good start at avoiding (and minimizing the impact of) a malicious attack.
Many email marketers are unaware of the importance of message queuing to the successful delivery of their email. As a component of their messaging infrastructure, queuing is something that marketers typically defer to their IT department to manage. Yet, the reality is that queuing and the segregation of message streams can make the critical difference between the success and failure of a company’s messaging programs, and therefore, should be of concern to both the IT and marketing departments.
Effective queuing really comes down to the choice of messaging infrastructure. When using a technologically advanced messaging platform, companies can efficiently manage parallel queues with messages assigned into multiple streams to ensure that each stream flows at an appropriate rate, providing efficient delivery of all classes of traffic. Unfortunately, those that rely on legacy MTAs have no such options. They’re left trying to manage the bottlenecks and slowdowns that result from poor architecture with complicated priority schemes within a single queue.
Recently, one legacy MTA provider suggested that their routine for queue prioritization was the answer for reaching high-value customers first. While the business need is certainly legitimate, trying to prioritize messages within a single queue is both outmoded and a solution to a problem that should not exist in the first place. There are better ways to satisfy this need that are both simpler and more powerful at the same time. To illustrate my point, allow me to provide an analogy that should be familiar to my fellow business travelers.
One of the most common headaches for the air traveler is the security checkpoint; you get your ID checked, get in line, get your ID checked again, get in another line, empty your bags, take off your shoes and belt, get in yet another line and then get radiated in the name of public safety. During the highest traffic times these lines can become so long that people start missing their flights because there are a very limited number of security checkpoints and the airports were architected in a time that predated the need for such extensive security. This fundamental flaw in the architecture of the airports means that the current needs of travelers for additional parallel security screening checkpoints cannot be met, and everyone has to wait in a queue to get to their plane, sometimes with unacceptable results, leading to additional costs for all involved and potential lost business.
This is handled in a variety of ways, including performing the security check at every gate area (creating a very parallel security screening system) and by using priority security lines. Imagine for a moment that instead of this solution, the airport chose instead to assign priority on an individual basis to every single passenger and then tried to sort the individual travelers in the security line. As you can imagine such a solution would require additional work to make the individual assignments and then keep track of who ranks where in the line, with a risk that low-priority passengers would find themselves significantly delayed as they were repeatedly bumped. In all my travels I have never seen such an approach, primarily because the airports already have an approach that works.
Find a particularly efficient airport and what you’ll see is a fairly consistent set of practices:
Separation of passengers into queues based on their fitting into a certain profile.
A large number of parallel checkpoints.
Efficient handling of passengers.
Intelligent queue management that can modify queuing on the fly to meet circumstances.
Look at a particularly efficient airport and you will see multiple queues, including queues for:
Frequent / First Class travelers – The most frequent travelers and those who sit in First Class. These people bring a lot of value to the airlines and receive a lot of value in return.
Expert travelers – A new lane starting to appear in some airports, for those who are experienced in getting through security and unlikely to cause delays.
Family / Special Assistance – A slow lane, these groups will take longer to get through security.
Casual Travelers – A lane for those who move at an average speed through security.
Staff / Crew – While in some airports workers and air crew jump to the front of the line, the most efficient airports avoid this disruption by maintaining a separate checkpoint for those who work at the airport, minimizing disruption and ensuring that staff can get to work on time.
Specialty – From time to time I’ve seen the airport create a special temporary queue for unique groups such as chartered planes by opening a checkpoint and redirecting the group to the specialty queue.
Not only has the separation of travelers into queues according to their profiles (including their priority as a group) proven sufficient to make prioritization by individual traveler unnecessary, it is much easier to manage.
While separating passengers into a number of queues can certainly benefit airports, the most efficient airports are also architected to operate a large number of parallel checkpoints, preventing a situation where every passenger in the airport needs to be funneled through the same metal detector. Imagine an airport trying to service millions of passengers a year on only one or two metal detectors and a single x-ray machine.
In addition to having many checkpoints, the best airports will also have efficient checkpoints, maximizing the flow of passengers through any given checkpoint through better design of the checkpoints and better training of the staff, all without compromising safety.
Perhaps most important to the smooth operation of an airport is intelligent management. I’ve seen airports where there were several queues open but empty because the people managing things weren’t flexible enough to reassign lines and adjust the queues to ensure well balanced passenger flow. The best airports will change the designation of queues, move staff around and even redirect passengers to alternate checkpoints that are less busy, all in the interests of moving the highest number of passengers per hour.
Senders can follow these same principles to get maximum throughput and deliverability in their own environment:
Segment mail by profile.
Choose a sending solution that supports highly parallel sending.
Choose a sending solution that provides sufficient throughput.
Choose a sending solution that is intelligent.
When sending, remember that segmentation is not just for who to send to or what to send them, but for deciding how to send a message and with what priority. You will want to create queues for high priority messages to satisfy your most valuable customers, queues for high-reputation traffic that delivers without issues as well as for traffic that you expect to deliver slowly (one example is traffic that results in human interactions; you may need to slow this to prevent overloading your call centers), test and administrative traffic that needs to go out as soon as possible and transactional traffic that should not queue up behind bulk sends. This is a common practice among ESPs, who often add specialized segmenting for scenarios such as new customers and customers with specific SLAs.
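As a rough illustration of segmenting by profile rather than ranking individual messages, the Python sketch below assigns each message to one of several parallel queues. The queue names, the message fields and the order of the checks are all assumptions made for the example; they are not taken from any particular messaging product.

```python
from collections import defaultdict, deque

# One independent queue per traffic profile (names are hypothetical).
queues = defaultdict(deque)


def choose_queue(message):
    """Pick a queue from the message's profile, not an individual rank."""
    if message.get("transactional"):
        return "transactional"   # must not wait behind bulk sends
    if message.get("test") or message.get("administrative"):
        return "admin"           # needs to go out as soon as possible
    if message.get("high_value_customer"):
        return "priority"
    if message.get("drives_call_center"):
        return "slow"            # deliberately throttled traffic
    return "bulk"


def enqueue(message):
    """Append the message to the queue for its profile."""
    name = choose_queue(message)
    queues[name].append(message)
    return name


enqueue({"id": 1, "transactional": True})
enqueue({"id": 2, "high_value_customer": True})
enqueue({"id": 3})
print({name: len(q) for name, q in queues.items()})
```

Because each profile has its own queue, a large bulk send only lengthens the "bulk" queue; a password reset placed in "transactional" is not stuck behind it, which is the point of the airport analogy.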
As with the airports, you need to architect your environment to be able to handle more traffic in parallel. This can be accomplished by adding more injectors and more messaging infrastructure, or by adding better infrastructure. Look at your existing solution: how many IPs can it send from? How many messages per hour can you send on a single machine and how many concurrent connections can it handle? Most Open Source solutions can handle one IP address, send 100,000 messages per hour at most, and can open fewer than one hundred connections. Low-end commercial solutions can often do over a hundred IPs, send close to a million messages per hour and can handle a few hundred to a couple of thousand connections. On the high end you have carrier-grade systems designed for the enterprise such as Momentum by Message Systems, which can utilize thousands of IPs to send millions of messages per hour across tens of thousands of concurrent connections (while you will never open more than a few connections to a given ISP on a given IP address, lesser solutions will fall short when sending across hundreds of IPs to thousands of ISPs, defaulting to ISP prioritization as a workaround).
Finally consider the intelligence of your infrastructure:
Can your infrastructure send across all servers simultaneously and fail-over in the event of an outage?
Can your infrastructure adjust throttles on the fly based on responses from the ISPs and bounce and feedback loop data?
Does your infrastructure handle queues so efficiently that performance is the same with thousands of messages in the queues as it is with millions of messages in the queues?
Can your infrastructure dynamically change from email to other protocols such as SMS and MMS based on subscriber preferences and ISP responses?
If you had talked to any email marketer 10 years ago and asked them how they dealt with blocks on their IP addresses, the answer would probably be the same: “We just switched IPs.” Not only was this an unfortunate, albeit effective, way to deal with blocks, it also became a common method used by spammers. They would simply send from one IP address for a very short time and then move on to another, either with IPs they owned or through hijacked computers controlled by botnets. Because of spammers’ behaviors, ISPs and email providers responded by temporarily blocking and limiting the amount of email a new IP address could send. ISPs now treat any new sending IP address like a dog on a short leash, and only extend the leash when the sender’s reputation is proven.
I personally have seen what can happen when senders try to send too much, too soon, with senders trying to send millions of messages on their first day using new IP addresses and finding themselves blacklisted in short order. For a reputable sender the key is to start sending slowly and gradually increase volume on new IP addresses until a proper sending reputation has been established.
Before I get into some technical advice I’d like to clarify one thing from the Return Path article. Regarding the shutdown of Goodmail Tom has this to say:
There are a couple of reasons you still might have to send from a new IP address, such as moving to a new ESP, moving to a new data provider, or moving off of Goodmail. Goodmail had a unique way of tokenizing their customers’ mail by relaying mail through their own IP addresses, and consequently their reputation. Therefore, once you stopped using Goodmail, your traffic now goes through your IPs, which hasn’t had any traffic in awhile, which means you’ll need to work on building up your sending reputation again.
This statement applies to any customers of Goodmail’s hosted imprinting service but does not apply to in-house senders using products such as Momentum by Message Systems that had a built-in Goodmail Imprinter. For such users the shutting down of Goodmail involved shutting off the Goodmail Imprinter component of their infrastructure, but IP warmup will not be required since those users were already sending using their own IP addresses.
Assuming you didn’t get a chance to read the link, here are the five points of advice provided by the article:
Sign up for all feedback loops. Suppress from future mailings.
Authenticate. Use SPF, SenderID and DKIM.
Segment and mail your active subscribers. Put your best foot forward.
Monitor. Use seedlists such as Mailbox Monitor and watch your IP’s Sender Score.
Get Certified. Get your new IP Sender Score Certified.
Some of these warrant additional discussion from a technical point of view.
In order to be effective, Feedback Loop message handling needs to be automatic. Message Systems customers should already be aware that we have provided built-in Feedback Loop processing as of our 3.0 release in 2008. In addition to automatically unsubscribing recipients that trigger a Feedback Loop message, you should also take the volume of feedback loop hits as a metric to show the effectiveness of your mailings. Feedback Loop hits should also be used as a factor when determining traffic shaping rules, especially when an IP address is new. If you see a lot of FBL hits, you should throttle back on the sending IP address.
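One way to turn FBL volume into a traffic shaping input is to compare complaints against delivered volume and cut the send rate when the ratio gets too high. The Python sketch below is only an illustration: the 0.1% threshold, the halving step and the floor of 100 messages per hour are assumed values chosen for the example, not figures published by any ISP or used by any specific product.

```python
def adjusted_rate(current_rate, delivered, complaints,
                  threshold=0.001, floor=100):
    """Return a new hourly send rate for one IP address.

    Halves the rate when the FBL complaint rate exceeds `threshold`
    (an assumed value), never dropping below `floor`.
    """
    if delivered == 0:
        return current_rate          # nothing delivered yet, nothing to judge
    complaint_rate = complaints / delivered
    if complaint_rate > threshold:
        return max(floor, current_rate // 2)
    return current_rate


print(adjusted_rate(10000, delivered=5000, complaints=10))  # 5000
print(adjusted_rate(10000, delivered=5000, complaints=2))   # 10000
```

In practice you would evaluate this per sending IP and per ISP, and use a lower threshold or a bigger cut while the IP address is still new.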
With regard to monitoring, seedlist monitoring is a good indicator of how ISPs are treating your mail, but it provides only part of the overall picture. To get a complete view of your deliverability you also need to monitor what happens before the ISP accepts a given message, taking into account the temporary (aka transient or 4xx) failures and permanent (aka 5xx) failures that occur as you try to send. When monitoring permanent failures, keep in mind that permanent failures can occur both during delivery (synchronous or in-band) and post-delivery (asynchronous or out-of-band) through the ISP sending back a DSN (delivery status notification) message to the return path (aka envelope sender or envelope from) address of the original message. You should be tracking and trending all failures, especially when sending on new IP addresses.
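Tracking temporary and permanent failures starts with sorting SMTP replies by their first digit: 2xx means accepted, 4xx is a temporary failure and 5xx is a permanent failure. The short Python sketch below tallies in-band replies that way. The function name and the sample reply strings are made up for illustration; out-of-band DSN bounces would need separate parsing that is not shown here.

```python
from collections import Counter


def classify_reply(reply):
    """Classify an SMTP reply line by the first digit of its 3-digit code."""
    code = reply.strip()[:3]
    if len(code) != 3 or not code.isdigit():
        return "unknown"
    if code[0] == "2":
        return "delivered"
    if code[0] == "4":
        return "temporary"
    if code[0] == "5":
        return "permanent"
    return "unknown"


# Illustrative replies only, not captured from a real ISP.
replies = [
    "250 2.0.0 OK",
    "421 4.7.0 Try again later",
    "550 5.1.1 User unknown",
    "451 4.3.0 Temporary system problem",
]

counts = Counter(classify_reply(r) for r in replies)
print(counts["delivered"], counts["temporary"], counts["permanent"])  # 1 2 1
```

Storing these counts per IP address, per ISP and per hour gives you the trend lines you need to judge whether a new IP address is ready for more volume.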
Additional Technical Considerations
Keep in mind that just like in life, you are judged not just by what you say, but by how you say it (and whom you say it to). With regards to deliverability this comes down to content and sending practices. From a technical standpoint we focus less on content (what you say), but it does have a significant impact on IP warmup. You should avoid sending riskier content on new IP addresses both on overall content and wording (I’ve seen deliverability dip just for using the word “sexy” in a mailing, even when the overall message was not sexual in nature).
Before you ever start sending from a new IP it is vital that you pre-configure your sending software to comply with as many published ISP recommendations as possible. A convenient resource is this page provided by Word to the Wise: ISP Summary Information. Pay particular attention to the Connection Limits and Sending Limits columns.
When warming up IP addresses it is important to start slowly; ISPs do not trust new IPs and will not respond well to new IPs coming online and immediately bursting out large amounts of traffic. While there are no published limits online, one of the recommendations I have heard is to avoid sending more than 10,000 messages per day to the major ISPs (Yahoo!, Gmail, Hotmail, AOL, etc.) when first sending, and I’d say it would be best to send fewer than 1,000 messages per day to any smaller ISPs. By reviewing your temporary and permanent failure messages you will be able to get a feel for whether your reputation is sufficient to increase volume, and after increasing volume you should pay particularly close attention to your failure metrics to make sure that the change has not had an adverse effect on deliverability. I generally recommend not increasing volume by more than 2x at a time and not more than once every day or so. Don’t hesitate to revert to a lower volume if you start seeing an increase in temporary and permanent failures.
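The volume guidance above (start around 10,000 messages per day at a major ISP, increase by no more than 2x, no more than about once a day) can be turned into a simple planning table. The Python sketch below does only the arithmetic; the function name and the 100,000-per-day target are assumptions for the example, and the plan is a best case that assumes failure metrics stay healthy at every step.

```python
from typing import List


def warmup_schedule(start: int, target: int, factor: float = 2.0) -> List[int]:
    """Return daily volume caps for one ISP on a new IP address.

    Each day's cap is at most `factor` times the previous day's cap,
    and the final entry is the target volume.
    """
    if start <= 0 or not (1 < factor <= 2):
        raise ValueError("start must be positive and factor in (1, 2]")
    schedule = []
    volume = start
    while volume < target:
        schedule.append(volume)
        # Always move forward by at least one message to guarantee progress.
        volume = max(volume + 1, int(volume * factor))
    schedule.append(target)
    return schedule


print(warmup_schedule(10_000, 100_000))
# [10000, 20000, 40000, 80000, 100000]
```

Treat each entry as a ceiling rather than a commitment: if temporary or permanent failures rise after an increase, drop back to the previous entry and hold there before trying again.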
When sending, make sure to quickly and automatically suppress any recipients that the ISPs identify as being invalid. When first sending you should assume that you are being watched closely, and one aspect of that is your practices regarding bounce processing. If you repeatedly send to someone that an ISP identifies as invalid through a bounce message you will be penalized for it by the ISP, and that punishment can potentially come faster when an IP address is new due to the lower starting reputation.
Watch Out For Deferrals
There is a specific class of temporary and permanent failure responses that you need to keep a particular eye out for: deferral messages. A deferral message from an ISP indicates that you need to quickly and decisively change your sending practices, as deferrals are the warning messages you receive from the ISPs prior to being blacklisted. You can get examples of deferral messages from Yahoo here and many ISPs will list examples of their deferral messages on their postmaster page. As an example, here is a Hotmail deferral message:
421 4.16.55 [TS01] Messages from x.x.x.x temporarily deferred due to excessive user complaints
When you see such messages, you need to review your content and throttles and pause sending for a couple of hours to allow things to cool off while you determine what changes you need to make.
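Spotting a deferral and pausing can be automated. The Python sketch below matches the wording of the single example quoted above and computes when sending may resume. The pattern, the function names and the two-hour pause constant are assumptions for illustration; real ISPs word their deferrals differently, so a working system would need a maintained list of patterns taken from each ISP's postmaster pages.

```python
import re

# Assumed pattern: matches the word "deferred" as in the quoted example.
DEFERRAL_PATTERN = re.compile(r"\bdeferred\b", re.IGNORECASE)

# "A couple of hours", expressed in seconds.
PAUSE_SECONDS = 2 * 60 * 60


def is_deferral(reply):
    """Return True if the SMTP reply text looks like a deferral warning."""
    return bool(DEFERRAL_PATTERN.search(reply))


def resume_time(reply, now):
    """Return the timestamp at which sending may resume, or None if no pause."""
    if is_deferral(reply):
        return now + PAUSE_SECONDS
    return None


example = ("421 4.16.55 [TS01] Messages from x.x.x.x temporarily deferred "
           "due to excessive user complaints")
print(is_deferral(example))            # True
print(resume_time(example, now=0))     # 7200
print(resume_time("250 2.0.0 OK", 0))  # None
```

The pause only buys time; the review of content and throttles described above still has to happen before sending resumes.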
Automating IP Warmup
In his article Tom Sather advised:
If this looks like a lot of work, then you’re right. To be successful, you need to plan appropriately, be patient, send smarter, and constantly monitor.
Tom is absolutely correct: warming up new IP addresses requires research, preparation and diligence.
The good news for Message Systems customers is that we’ve taken care of this for you with our Adaptive Delivery module. Adaptive Delivery will automatically identify new IP addresses, set initial throttles and gradually increase volume as the IPs age, monitoring ISP responses to ensure that the ISPs are responding positively. If at any point the Adaptive Delivery module identifies a negative ISP response, it adjusts throttles in real time and monitors for additional negative responses. If an ISP replies with a deferral response, Adaptive Delivery will suspend delivery, throttle back and send you an alert so that you can check the content being sent. All of this is built using intelligence that is constantly reviewed and improved by a full-time, in-house deliverability specialist. In addition, our bounce processing system now supports live updates, allowing us to improve classifications thanks to automated feedback from customer systems. If you’re not taking advantage of these new capabilities, contact Message Systems and we’ll help.
My employer Message Systems is constantly growing and I’d like to share our latest career opportunities. I do want to call out an opening on my team specifically for a new Sales Engineer:
Message Systems, the market leader in Advanced Message Management Solutions, is looking for an energetic Sales Engineer to support our sales in the Financial and Healthcare verticals.
This is a full-time position.
Provide exemplary pre-sales technical expertise through technical and product presentations, product demonstrations, pilot implementations, beta program administration, consistent communication, and on-going technical consultation.
Install and configure trial and demonstration systems and train customers on their use.
Translate complex technical problems for non-technical clients as well as translating non-technical specifications into precise technical requirements.
Meet with clients to evaluate their current systems and needs and make recommendations for software and hardware and integration.
Travel approximately 30% in support of sales and customer activities.
Respond to RFIs and RFPs, and serve as liaison between the sales, technical and support teams.
Play a pro-active “Technical Account Management” role within strategic accounts including relationship and business development activities.
Bachelor’s degree in IT-related field or relevant experience.
2 – 5 years of experience in a software pre-sales, post-sales or related role.
Keen desire and enthusiasm to assist prospects in understanding the value proposition of the technology and helping customers improve their business processes.
2 – 5 years experience administrating Linux systems (Solaris experience a plus)
Experience programming in Lua a plus
Experience in the Financial and Healthcare verticals a plus
Strong inter-personal, oral and written communication skills a must.
Experience with large enterprise software a plus.
Experience in the email marketing industry a plus.
In addition, we’re also hiring the following roles:
Director of Product Management / Alternative Channels
Enterprise Software Sales Executive – Mid-Atlantic Region
Over the course of the past five weeks, spam campaigns have been aimed at the staff members of over 100 ESPs and gambling sites. These targets have received emails typically with content that mentions the staffer by name, and purports to be from a couple, presumably friends or co-workers.
The phish message has been sent numerous times, over several different systems, including using the facility of some ESPs, using online greeting card sites, and by way of a botnet. Sources confirm the list of addresses is very small (less than 3,000 addresses) and aimed 100% at staff responsible for email operations.
The message links to a site that contains a particularly nasty payload. I received one myself and deleted it, thinking it was harmless spam, so the attack is going after email infrastructure providers in addition to ESPs.
In email marketing there are senders of all shapes and sizes, from small businesses using self-serve ESPs to the largest web properties self-sending to massive user bases. While only a few senders will reach or exceed volumes of one billion messages per month, the tools and practices needed to achieve such a volume level are applicable to all senders who want to succeed in email marketing.
Who Am I?
My name is Mike Hillyer (click here for bio and social links). I manage a team of Sales Engineers for Message Systems, a leading provider of digital messaging solutions for both senders and receivers. In my work over the last several years I have helped a number of clients reach the billion messages per month level and even more clients successfully deploy email marketing solutions ranging in scale from hundreds of thousands to millions of messages per month.
Who Needs To Send This Much Mail?
Contrary to what the image above implies, there’s nothing inherently evil about sending a billion messages a month. Some of the businesses that move a billion messages a month include ESPs, social networks (some move more than a billion a day for that matter), social gaming sites and large online retailers.
Any time you have a fairly large number of users (5-20 million) who receive multiple messages per day, or a really large number of users (40-50 million) receiving one message per day you are heading into the billion messages per month territory.
What Are The Numbers?
So exactly how much mail are we talking about here? That will depend on sending patterns:
In a lot of high-volume environments the sending will be to a world-wide audience, resulting in round-the-clock sending with no significant bursts of traffic. In such an environment the hourly volume will be 1,000,000,000 messages divided by 30 days divided by 24 hours equaling 1,388,889 messages per hour (386 messages per second), assuming 30 days in a month.
In an environment with inconsistent hourly volumes, we have to allow for both an average hourly volume and a maximum hourly volume and then design our solution to address the maximum hourly volume.
We need to look at seasonal factors: Does your social network move a lot of extra messages around Mother’s Day? Does your dating site move a lot of extra messages around Valentine’s Day? Does your web shopping portal do a lot of extra business around Christmas?
We need to look at growth: If you are sending a billion messages a month it is very likely due to successful growth of your user base, something which you certainly have no intention of slowing. Look at how you have grown your email volume so far and extrapolate it out for the next year or two (especially if you only get budget for your infrastructure every two years).
Let’s assume for the sake of this article that you have an average volume of one million messages per hour with a peak volume of two and a half million messages per hour during your busiest season. You expect to double your user base each year for the next two years. At the end of two years your peak will have quadrupled to ten million messages per hour, which if sustained around the clock is 7.2 billion messages per month (I’ve seen just this kind of growth several times with customers and prospects).
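The arithmetic behind these figures is easy to reproduce. The sketch below uses the numbers from this article; the function names are my own and the 30-day month is the same simplification used above.

```python
def hourly_rate(messages_per_month: float, days_in_month: int = 30) -> float:
    """Average messages per hour for evenly spread, round-the-clock sending."""
    return messages_per_month / (days_in_month * 24)


def monthly_volume(messages_per_hour: float, days_in_month: int = 30) -> float:
    """Messages per month if an hourly rate is sustained around the clock."""
    return messages_per_hour * 24 * days_in_month


def projected_peak(peak_per_hour: float, yearly_growth: float, years: int) -> float:
    """Peak hourly volume after compounding growth for a number of years."""
    return peak_per_hour * yearly_growth ** years


steady = hourly_rate(1_000_000_000)
print(round(steady))          # 1388889 messages per hour
print(round(steady / 3600))   # 386 messages per second

peak = projected_peak(2_500_000, yearly_growth=2, years=2)
print(peak)                   # 10000000 messages per hour
print(monthly_volume(peak))   # 7200000000 messages per month
```

Swap in your own peak and growth rate to see what your infrastructure needs to handle by the time your next budget cycle comes around.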
You Will Need to Send In-House
A lot of senders start by using an Email Service Provider (ESP) for their sending and should do so: an ESP provides infrastructure and expertise to handle the details of sending email marketing messages for their clients at a good price, allowing companies to focus on their business. In addition, the costs of installing and maintaining proper sending infrastructure and practices are not justifiable for most low-volume senders.
That said, if you are aiming for a billion email marketing messages a month and are using an ESP it’s time to plan your move to in-house sending. Assuming a $1.00 CPM (Cost Per Mille with Mille being Latin for thousand, so cost per thousand) you are looking at paying an ESP a million dollars a month to handle this kind of volume. Naturally you can probably secure a better rate than $1.00 CPM at these volume levels but regardless of the discount at this volume level you will pay less to buy the infrastructure and hire the people needed to do this yourself, gaining the control you need when sending at these volume levels.
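To play with the ESP cost math yourself, here is a small sketch. The $1.00 CPM comes from the example above; the $250,000 in-house monthly cost is a made-up placeholder, not a figure from any real deployment.

```python
def esp_monthly_cost(messages_per_month: int, cpm_dollars: float) -> float:
    """Monthly ESP bill: CPM is the price per thousand messages."""
    return messages_per_month / 1000 * cpm_dollars


def break_even_cpm(in_house_monthly_cost: float, messages_per_month: int) -> float:
    """CPM at which an ESP would cost the same as running it yourself."""
    return in_house_monthly_cost / (messages_per_month / 1000)


print(esp_monthly_cost(1_000_000_000, 1.00))   # 1000000.0 dollars per month

# Hypothetical in-house cost (hardware, software, staff) of $250,000 a month:
print(break_even_cpm(250_000, 1_000_000_000))  # 0.25 dollars per thousand
```

In this hypothetical, an ESP would have to charge less than $0.25 CPM before staying outsourced made financial sense.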
Start With a Good Reputation
In order to hit the volume levels we’re talking about it is going to be vital that you have a solid sending reputation. This means you need to follow best practices for list acquisition, list hygiene, segmentation and relevancy. There’s a wealth of information online and an excellent catalog of it at http://www.emailisnotdead.com/. This article will focus primarily on the technical aspects of sending one billion email messages per month but keep in mind that reaching one billion messages a month without a solid reputation on your domain and sending IPs is very difficult. A number of tools for checking the reputation of your IP addresses can be found at Word to the Wise. At this volume level email is key to your business and a solid reputation is going to be essential.
From a technical perspective there’s a number of bases we need to cover regarding authentication, whitelisting, bounce processing and complaint handling.
As a reputable sender you will want to associate your IP addresses with your domain using the authentication standards available to you. These include SPF, SenderID, DomainKeys (DK) and DomainKeys Identified Mail (DKIM). There are indications that SPF (and SenderID by association) is ineffective but given the low effort required to implement it I would recommend doing so anyway. While SPF and SenderID are purely DNS-based, DK and DKIM require an implementation either during message creation or during relay by the MTA and as a result will impact the maximum throughput of your infrastructure (more about this later).
DomainKeys is quickly being superseded by DomainKeys Identified Mail but with most solutions supporting both DK and DKIM it is simple enough to use both when sending to an ISP that supports one standard or the other. Implementation details will vary based on your sending solution. While some recommend selectively signing DK and DKIM for only messages sent to ISPs that are known to check authentication (in order to lower the impact signing has on throughput on a solution that takes a significant performance hit from signing), I recommend signing all messages; you never know who is checking for authentication without announcing it.
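To show how little effort SPF takes, here is a sketch that builds the text of an SPF TXT record from a list of sending IPs. The helper name and the example addresses (taken from the ranges reserved for documentation) are assumptions; you would publish the resulting string as a TXT record on your sending domain.

```python
from ipaddress import ip_network
from typing import Iterable

VALID_ALL = {"-all", "~all", "?all", "+all"}


def spf_record(networks: Iterable[str], all_mechanism: str = "-all") -> str:
    """Build an SPF record listing the IP ranges allowed to send for a domain."""
    if all_mechanism not in VALID_ALL:
        raise ValueError(f"unexpected 'all' mechanism: {all_mechanism}")
    mechanisms = []
    for net in networks:
        # strict=False accepts a host address with a prefix length as well.
        parsed = ip_network(net, strict=False)
        tag = "ip4" if parsed.version == 4 else "ip6"
        mechanisms.append(f"{tag}:{parsed}")
    return " ".join(["v=spf1", *mechanisms, all_mechanism])


print(spf_record(["192.0.2.0/28", "198.51.100.25"]))
# v=spf1 ip4:192.0.2.0/28 ip4:198.51.100.25/32 -all
```

DK and DKIM signing are a different story, since they need private keys and per-message work in your injector or MTA, so I have left them out of this sketch.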
One benefit of getting on the various whitelists provided by ISPs and reputation providers is that in some cases you can send higher volumes on whitelisted IP addresses than would otherwise be possible. Keep in mind that in most situations whitelisting is something that comes after sending has already begun in order to allow the provider of the whitelist to examine your sending patterns as part of the whitelisting process, so put your best foot forward (and follow it up with consistent behavior).
One quick way to lose reputation is to repeatedly send mail to recipients that do not exist. The ISPs will track how many non-existing addresses you send to and throttle you accordingly. Even more seriously, ISPs will occasionally take inactive email addresses and re-activate them as spam traps; any mail sent to the address will immediately get the source classified as a spam source and filtered accordingly.
To prevent this it is necessary to capture and act on the responses sent by the ISPs and unsubscribe those addresses identified as non-existent or inactive, while retaining those with responses that identify users on vacation and other non-fatal errors. Commercial sending solutions will perform this automatically with varying levels of effectiveness while other platforms will require a third-party solution such as Boogie Tools. Keep in mind that the more you send, the more you receive back in the form of automated responses and bounce notifications. As your reply addresses reach more and more users the flow of notifications will become contaminated with spam and virus-carrying messages, requiring the implementation of Anti-Virus/Anti-Spam solutions for your incoming mail stream.
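As a very rough illustration of bounce classification, the sketch below sorts SMTP responses by their reply code and, where present, their enhanced status code. The category names are my own, and it is deliberately conservative: real bounce processors also match the free text of the response, because ISPs are not consistent about the codes they return.

```python
import re

# Reply code, then an optional enhanced status code such as 5.1.1.
RESPONSE = re.compile(
    r"^\s*(?P<basic>[245]\d\d)[ -]"
    r"(?:(?P<cls>[245])\.(?P<subject>\d{1,3})\.(?P<detail>\d{1,3})\b)?"
)


def classify_bounce(smtp_response: str) -> str:
    """Return 'delivered', 'transient', 'bad_address', 'permanent_other' or 'unknown'."""
    match = RESPONSE.match(smtp_response)
    if not match:
        return "unknown"
    # Prefer the class digit of the enhanced code, else the reply code's first digit.
    cls = match.group("cls") or match.group("basic")[0]
    if cls == "2":
        return "delivered"
    if cls == "4":
        return "transient"          # retry later, keep the subscriber
    if match.group("subject") == "1":
        return "bad_address"        # 5.1.x: addressing problem, unsubscribe
    return "permanent_other"        # permanent, but not clearly a bad address


def should_unsubscribe(smtp_response: str) -> bool:
    return classify_bounce(smtp_response) == "bad_address"


print(classify_bounce("550 5.1.1 User unknown"))   # bad_address
print(classify_bounce("452 4.2.2 Mailbox full"))   # transient
```

A response like "550 User unknown" with no enhanced code lands in permanent_other here, which is exactly the kind of case where text matching earns its keep.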
In an effort to help senders improve their practices, a number of ISPs have implemented ARF formatted Feedback Loop programs. When a user on a supported ISP clicks the “This is Spam” button, an automated message is sent to an address you define in advance (when signing up with the ISP for the Feedback Loop program). By processing these messages and un-subscribing the relevant users, you prevent further reputation damage that may result when sending them future messages.
The ARF format used by the ISPs makes it relatively straightforward to process Feedback Loop messages and use them to unsubscribe the users who have complained about your messages. There are tools available to process ARF formatted messages and some sending solutions will handle FBL messages natively.
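To give an idea of how straightforward this is, here is a sketch that pulls the complaint type and the complaining recipient out of an ARF message using Python's standard email package. The sample report, addresses and function name are invented for illustration, and real reports vary: some ISPs redact the recipient, in which case you need your own identifier embedded in the original message.

```python
import email

SAMPLE_ARF = "\n".join([
    "From: fbl@isp.example",
    "To: fbl-handler@sender.example",
    "Subject: FW: Weekly deals",
    "MIME-Version: 1.0",
    'Content-Type: multipart/report; report-type=feedback-report; boundary="arf"',
    "",
    "--arf",
    "Content-Type: text/plain",
    "",
    "This is an abuse report for a message received from your IP.",
    "",
    "--arf",
    "Content-Type: message/feedback-report",
    "",
    "Feedback-Type: abuse",
    "User-Agent: ExampleFBL/1.0",
    "Version: 1",
    "",
    "--arf",
    "Content-Type: message/rfc822",
    "",
    "From: deals@sender.example",
    "To: subscriber@isp.example",
    "Subject: Weekly deals",
    "",
    "Body of the original message.",
    "",
    "--arf--",
    "",
])


def parse_arf(raw: str) -> dict:
    """Extract the feedback type and the recipient to unsubscribe."""
    msg = email.message_from_string(raw)
    result = {"feedback_type": None, "recipient": None}
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype == "message/feedback-report":
            # The report fields are parsed as the headers of a nested message.
            fields = part.get_payload(0)
            result["feedback_type"] = fields.get("Feedback-Type")
            if fields.get("Original-Rcpt-To"):
                result["recipient"] = fields.get("Original-Rcpt-To")
        elif ctype == "message/rfc822" and result["recipient"] is None:
            # Fall back to the To header of the returned original message.
            original = part.get_payload(0)
            result["recipient"] = original.get("To")
    return result


print(parse_arf(SAMPLE_ARF))
```

From there it is a matter of looking up the recipient in your subscriber database and flagging them as unsubscribed.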
There are a number of architectural components that come into play to make it possible to send email marketing messages at volume levels of one billion email messages per month (or more) including network connectivity, server hardware and software.
Most professional sending operations are based in rented datacenters, simplifying the provisioning of network connectivity. In our initial example of a maximum throughput of 2.5 million messages per hour we’ll use a sample message size of 50 kilobytes (51,200 bytes), meaning that we need to send at a rate of 2,500,000 * 51,200 = 128,000,000,000 bytes per hour, or roughly 271 megabits per second when a megabit is counted as 1,048,576 bits (about 284 megabits per second in decimal units).
With the throughput we’re talking about we certainly need to use gigabit speed networking within the datacenter and, more importantly, need backbone connectivity that can support not only a sustained throughput of 271 megabits per second but that can handle our future needs of 7.2 billion messages per month. You need to look at a datacenter that will be able to provide sustained gigabit speeds to the backbone.
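The bandwidth figure can be checked with a few lines of Python. This sketch only counts message payload; SMTP and TCP overhead, retries and inbound traffic are ignored, so treat the result as a floor rather than a sizing recommendation.

```python
def required_bits_per_second(messages_per_hour: int, message_bytes: int) -> float:
    """Payload bandwidth needed to sustain an hourly message rate."""
    return messages_per_hour * message_bytes * 8 / 3600


today = required_bits_per_second(2_500_000, 51_200)
print(round(today / 1_000_000, 1))   # 284.4 megabits per second (decimal)
print(round(today / 2**20, 1))       # 271.3 when a megabit is 1,048,576 bits

future = required_bits_per_second(10_000_000, 51_200)
print(round(future / 1_000_000_000, 2))  # 1.14 gigabits per second
```

Note that the projected peak of ten million messages per hour works out to slightly more than one gigabit per second of payload alone, which is worth keeping in mind when talking to your datacenter about backbone capacity.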
Keep in mind that when you are sending a billion messages per month it means that email has significant impact on your bottom line and you won’t be able to tolerate extended outages. You need to not only make sure that the datacenter you choose has redundant power and backbone connections, you also need to consider using redundant datacenters.
Moving over a million messages per hour does not require the purchase of custom server hardware but it does require making a proper investment in hardware. Generally speaking you will be using an infrastructure similar to the following:
The Message Injector queries the database and uses the results to assemble one or more messages which it relays to the Outbound Mail Server. The Outbound Mail Server queues the message, performs any necessary manipulations on the message and then sends it via the Internet to the destination server. In the event of a delivery failure message or a feedback loop message, the incoming message arrives via the Internet to the Inbound Mail Server. The Inbound Mail Server performs anti-virus/anti-spam scanning and then, in the case of a legitimate message, processes the message and updates the subscriber information in the database (not all server solutions can perform this processing in-stream; when using such solutions an intermediate server will be needed to accept the clean message from the Inbound Mail Server and process it using custom code).
In a production deployment there can be several variations on this example, typically with multiple servers used on the outbound and inbound roles, with multiple message injectors pushing to the outbound machines and often specialty servers on the inbound side dedicated to processing incoming feedback loop and bounce messages.
I generally recommend mail servers similar to the following:
2x multi-core, 64-bit processors
16-32 GB of RAM
8x 15K RPM hard-disks
Battery-backed RAID-10 controller
The specific details of your hardware selection will depend on the ability of your specific software to leverage the resources provided. A large number of fast disks in a RAID-10 array is recommended for the message spool as standards-compliant mail servers must write messages to disk before accepting them for delivery, placing significant demands on storage resources.
As an employee of a leading software provider for high-volume senders you would rightly expect me to recommend a commercial solution, and specifically my company’s solution. I’d like to take a moment to point out why:
We need to send at a rate in excess of one million messages per hour. I’ve dealt with a number of solutions and my experience has shown that most Open Source MTAs such as Postfix and Sendmail are limited to around 100,000 messages per hour. Commercial sending solutions typically show real-world performance ranging from 500,000 messages per hour to over two million messages per hour.
I have helped several companies that were operating dozens of Open Source servers to consolidate down to one-tenth as many servers running a commercial solution.
In addition to limited throughput, Open Source MTAs are usually limited to sending through a single IP address, meaning that to send through ten IP addresses you need ten separate server instances. Commercial solutions support sending through multiple IP addresses simultaneously.
Commercial solutions go beyond basic message queuing and sending, providing the additional functionality required for a high-volume sender. This includes features such as APIs, bounce classification, feedback loop processing, internal scripting, automated throttling, and database integration.
If you’re sending a billion emails a month, you absolutely need a solution that provides high availability out of the box. If a server goes down you can’t afford to be frantically activating a warm spare, just to find out that it too has some issue. You need an active-active solution that reacts automatically to server failures and keeps the mail flowing.
You need a solution that can be easily managed on your terms, whether you prefer editing configuration files or using a web interface. In addition, you need something that grows with you, providing centralized management of an entire cluster of servers. Commercial solutions will provide easier, centralized management.
One key to successful sending at high volumes is keeping tabs on how your server is performing and how your mailings are doing. You need to know what is passing through your server, how quickly messages are moving, whether queues are backed up, how the various ISPs are treating your traffic, all with the ability to drill down on specific source IPs and specific destination ISPs. You need to be able to see all of this in real-time and across your entire infrastructure. A good commercial solution provides all of this out of the box.
Time is Money
On multiple occasions I’ve seen organizations choose a free or low-cost solution and then spend countless hours building workarounds to the weaknesses of their chosen platform, writing scripts to automate administration, reporting tools to fill their needs, failover scripts to provide redundancy, etc.
While a lot of this work was impressive, it required time to implement and time to maintain. Time spent creating tools that are already provided in an alternative solution is time (and money) wasted. You are always better off using your time to create your “special sauce”; that which makes your business unique and gives you a competitive advantage.
The Price of Success is Continued Vigilance
When sending at a rate of one billion messages a month (or more), you can’t just use a ‘fire and forget’ mentality. You are going to have to have people around to keep a constant eye on what is happening in your environment, monitoring multiple key factors to ensure you can continue to successfully send.
Remember earlier when I said you need to start with a good reputation? You also need to keep a good reputation, and the only way to do that is to know what your reputation is. You will need to take advantage of reputation monitoring tools provided by companies like Return Path and Pivotal Veracity as well as keep a close eye on the reporting produced by your sending solution (remember when I said you need good reporting?).
You need to watch things such as bounce rates, FBL hit rates, blacklist hits, transient failures and response rates.
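As a starting point, a check like the following can turn raw counts into rates and flag the ones that look unhealthy. The thresholds here are placeholders I made up for the example; the right limits depend on your ISPs, your list and your history.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping


@dataclass
class MailingStats:
    sent: int
    hard_bounces: int
    complaints: int           # feedback loop hits
    transient_failures: int


# Hypothetical alert thresholds, for illustration only.
HYPOTHETICAL_LIMITS = {
    "hard_bounce_rate": 0.02,
    "complaint_rate": 0.001,
    "transient_failure_rate": 0.10,
}


def rates(stats: MailingStats) -> Dict[str, float]:
    """Convert raw counts into rates relative to messages sent."""
    if stats.sent <= 0:
        raise ValueError("sent must be positive")
    return {
        "hard_bounce_rate": stats.hard_bounces / stats.sent,
        "complaint_rate": stats.complaints / stats.sent,
        "transient_failure_rate": stats.transient_failures / stats.sent,
    }


def breached(stats: MailingStats,
             limits: Mapping[str, float] = HYPOTHETICAL_LIMITS) -> List[str]:
    """Names of the metrics that exceed their limit."""
    current = rates(stats)
    return sorted(name for name, limit in limits.items() if current[name] > limit)


stats = MailingStats(sent=1_000_000, hard_bounces=30_000,
                     complaints=500, transient_failures=40_000)
print(breached(stats))  # ['hard_bounce_rate']
```

You would want to run this per source IP and per destination ISP, since a problem at one ISP is easily hidden in the overall average.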
If email is your company’s lifeblood (and if you’re sending a billion messages a month it certainly is) then you need to make sure to keep it flowing no matter what happens, and that means making sure your email infrastructure stays online. I spoke earlier of the need for high availability; active monitoring goes hand-in-hand with this need. You will need to monitor the health of the servers that support your infrastructure, the network components that carry your messages and the software that creates and relays your messages.
There are a number of monitoring solutions available to accommodate any platform and budget; be sure to implement one that meets your needs and get monitoring. Make sure to test simulated failures to confirm that monitoring is working successfully. Consider setting up a simulated mailing that runs on a regular basis using your full infrastructure stack: a monitoring script can check an inbox and if the test message fails to appear, something is potentially wrong in your sending infrastructure. This approach can help identify issues that may pass by other monitoring systems unnoticed (and can be integrated into some monitoring solutions directly).
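Here is a minimal sketch of the simulated mailing idea. The sending and inbox-checking steps are passed in as functions, so the same loop works whether you inject through your real stack or test with stand-ins; the names, timeout and polling interval are assumptions. In production the two callables would typically wrap your message injector and an IMAP or POP3 check of the test mailbox.

```python
import time
import uuid
from typing import Callable, Iterable


def run_canary(send: Callable[[str], None],
               list_subjects: Callable[[], Iterable[str]],
               timeout_s: float = 300,
               poll_s: float = 15,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Send a uniquely tagged test message and wait for it to arrive.

    Returns True if a subject containing the token shows up in the inbox
    before the timeout, False otherwise.
    """
    token = f"canary-{uuid.uuid4().hex}"
    send(token)
    deadline = clock() + timeout_s
    while True:
        if any(token in subject for subject in list_subjects()):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Stand-in example: "sending" drops the message straight into a fake inbox.
inbox = []
ok = run_canary(lambda token: inbox.append(f"Test mailing {token}"),
                lambda: inbox, timeout_s=1, poll_s=0)
print(ok)  # True
```

Hook the False case into whatever alerting you already use, and run the check often enough that a stalled queue is noticed before your users notice it.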
Keep in mind that you get what you monitor for; if you focus too much on one metric it may improve without helping the big picture. In addition to making sure all the underlying pieces are in place, don’t forget to keep an eye on things where the rubber meets the road. You may be sending at phenomenal rates with great metrics but failing to generate customer actions that lead to revenue.
While by no means an exhaustive list, I hope this gives you some idea as to the scope of sending in a high-volume environment of one billion email messages per month. Watch this space over the coming weeks for deeper dives into some of the subjects covered here.
Questions? Did I miss something? Let me know in the comments!
The opinions and information in this post are my own and do not necessarily reflect those of my employer.
This gets to the heart of what companies need to do: focus more on the customer; their situation, their needs, their challenges. Even I’ve been guilty of spending too little time on them and too much time on me: what I am selling, what I have to offer, what I need from them.